vinh q. tran (@vqctran)'s Twitter Profile
vinh q. tran

@vqctran

research scientist @GoogleDeepMind, all thoughts my own, he/him

ID: 974097637564665856

Link: http://vqtran.github.io · Joined: 15-03-2018 01:39:09

129 Tweets

1.1K Followers

322 Following

vinh q. tran (@vqctran):

idk who needs to hear this but span corruption != bidirectional attention, you could have one, both, or neither. addendum: ul2 could be implemented completely with a causal decoder!!!!
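The distinction above can be sketched in a few lines: span corruption is purely an input/target formatting choice, while the attention pattern is a separate mask. The helper names below (`span_corrupt`, `causal_mask`) are hypothetical, not from any released codebase — a minimal sketch assuming T5-style sentinel tokens.

```python
import numpy as np

def span_corrupt(tokens, start, length, sentinel="<X>"):
    """Mask one contiguous span, returning (input, target) pairs.
    The objective only defines what inputs and targets look like;
    it says nothing about how attention is masked."""
    inp = tokens[:start] + [sentinel] + tokens[start + length:]
    tgt = [sentinel] + tokens[start:start + length]
    return inp, tgt

def causal_mask(n):
    """Lower-triangular mask: position i attends only to positions <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

# Span corruption with a causal decoder: concatenate input and target
# into one sequence and apply a plain causal mask over the whole thing.
tokens = ["the", "quick", "brown", "fox", "jumps"]
inp, tgt = span_corrupt(tokens, start=2, length=2)
sequence = inp + tgt              # decoder-only formatting
mask = causal_mask(len(sequence))
```

Swapping `causal_mask` for a mask that is bidirectional over the input prefix (and causal over the target) gives a prefix-LM; the span-corruption objective is unchanged either way, which is the tweet's point.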

lmarena.ai (formerly lmsys.org) (@lmarena_ai):

Exciting News from Chatbot Arena!

<a href="/GoogleDeepMind/">Google DeepMind</a>'s new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes.

For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive

Hritik Bansal (@hbxnov):

New paper📢 LLM folks have been supervised finetuning their models with data from large and expensive models (e.g., Gemini Pro).
However, we achieve better perf. by finetuning on the samples from the smaller and weaker LLMs (e.g., Flash)!
w/<a href="/kazemi_sm/">Mehran Kazemi</a> <a href="/arianTBD/">Arian Hosseini</a> <a href="/agarwl_/">Rishabh Agarwal</a> <a href="/vqctran/">vinh q. tran</a>

vinh q. tran (@vqctran):

just like many lessons in LLMs past: take compute-matched comparisons seriously and you will prosper. check out Hritik's excellent internship work!

Ibrahim Alabdulmohsin | إبراهيم العبدالمحسن (@ibomohsin):

Have you wondered why next-token prediction can be such a powerful training objective? Come visit our poster to talk about language and fractals and how to predict downstream performance in LLMs better. Poster #3105, Fri 13 Dec 4:30-7:30pm x.com/ibomohsin/stat… See you there!

foam shazeer (@foamshazeer):

they don't realize it yet but legacy brain and legacy deepmind are secretly the yin and yang of AI research -- forced together in GDM they are differing, chasing, yet necessary to push Gemini beyond the frontier

rohan anil (@_arohan_):

Prediction: People say pretraining will end, and I think everyone will be surprised how many multipliers we can squeeze from existing data through all kinds of algorithms.

vinh q. tran (@vqctran):

Take BIG-Bench Hard but make it EVEN HARDER!! Check out this cool new benchmark that really shows how much further our models still have to go on general reasoning!

vinh q. tran (@vqctran):

even deeper BTS: semantic ids in DSI were thought of and implemented almost completely from the hospital since I was in poor health back then (kidney failure, all better now!) -- sometimes doing interesting research is a great distraction from other more difficult things going on

vinh q. tran (@vqctran):

Congrats to everyone on this remarkable milestone! Incredible to see the outrageous power of general models and training methods.