Ilia Kulikov (@uralik1)'s Twitter Profile
Ilia Kulikov

@uralik1

ramming llms @MetaAI

ID: 977013150871638016

Website: https://iliakulikov.ru · Joined: 23-03-2018 02:44:22

118 Tweets

511 Followers

270 Following

Ilia Kulikov (@uralik1):

haha they got me laughing when I saw this new iPad translation feature in WWDC video. At least looks like they are not cherry-picking there!! (the word жаренсезонных did not really exist until today 😂)

Sean Welleck (@wellecks):

"Mode recovery in neural autoregressive sequence modeling" We study mismatches between the most probable sequences of each stage of the "learning chain": ground-truth, data-collection, learning, decoding Led by Ilia Kulikov w/ Kyunghyun Cho, SPNLP talk on Friday arxiv.org/abs/2106.05459

"Mode recovery in neural autoregressive sequence modeling"

We study mismatches between the most probable sequences of each stage of the "learning chain": ground-truth, data-collection, learning, decoding

Led by <a href="/uralik1/">Ilia Kulikov</a> w/ <a href="/kchonyc/">Kyunghyun Cho</a>, SPNLP talk on Friday

arxiv.org/abs/2106.05459
Ilia Kulikov (@uralik1):

Apparently ancestral sampling yields high-quality translations if we sample enough times, but how do we choose one of them in the end? Bryan Eikema shows how to scale utility computations over large hypothesis spaces efficiently! Very cool
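The "choose one of them" step the tweet refers to is typically done with Minimum Bayes Risk (MBR) selection: score each sampled translation by its expected utility against the other samples and keep the best one. A minimal sketch, using a toy unigram-F1 utility (the actual utility function and scaling tricks in the referenced work are assumptions here, not reproduced):

```python
# Toy MBR selection over a set of sampled translations.
# The utility below is illustrative, not the one from the paper.

def utility(hyp: str, ref: str) -> float:
    """Toy utility: unigram F1 overlap between two strings."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = len(set(h) & set(r))
    p, q = overlap / len(h), overlap / len(r)
    return 0.0 if p + q == 0 else 2 * p * q / (p + q)

def mbr_select(samples: list[str]) -> str:
    """Pick the sample with the highest total utility against all samples."""
    return max(samples, key=lambda h: sum(utility(h, r) for r in samples))

samples = ["the cat sat", "a cat sat down", "the cat sat down"]
print(mbr_select(samples))  # → "the cat sat down"
```

The quadratic pairwise loop is exactly what becomes expensive for large hypothesis sets, which is the scaling problem the tweet says the talk addresses.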

Ilia Kulikov (@uralik1):

Interested in LLM inference algorithms? Please come and watch our tutorial next week! cmu-l3.github.io/neurips2024-in…

Ilia Kulikov (@uralik1):

We are using fairseq2 for LLM post-training research in our team. This release comes with decent documentation (facebookresearch.github.io/fairseq2/stabl…) 😅 My favorite feature of the lib is the runtime extension support: you can develop research code without forking the entire lib repo!

Jason Weston (@jaseweston):

🥥🌪️ Introducing CoCoMix - a LLM pretraining framework that predicts concepts and mixes them into its hidden state to improve next token prediction. 📈 More sample-efficient and outperforms next token prediction, knowledge distillation, and inserting pause tokens. 🔬Boosts

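The core CoCoMix idea as described in the tweet — predict concepts from the hidden state and mix them back in before next-token prediction — can be sketched as a toy NumPy computation. All shapes, names, and the mixing rule below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

# Hypothetical sketch: predict a concept distribution from the hidden
# state, embed it, and mix the result back into the hidden state.
rng = np.random.default_rng(0)
d_model, n_concepts = 8, 4

W_concept = rng.standard_normal((d_model, n_concepts))  # concept predictor
W_embed = rng.standard_normal((n_concepts, d_model))    # concept embeddings

def mix_concepts(hidden: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Predict concept probabilities, embed them, and add to the hidden state."""
    logits = hidden @ W_concept
    e = np.exp(logits - logits.max())
    probs = e / e.sum()                 # softmax over concepts
    concept_vec = probs @ W_embed       # weighted concept embedding
    return hidden + alpha * concept_vec # mixed hidden state, same shape

h = rng.standard_normal(d_model)
print(mix_concepts(h).shape)  # → (8,)
```

In the real framework the concept targets would come from an auxiliary prediction loss during pretraining; here they are just a softmax over random weights.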