Ilia Kulikov (@uralik1)'s Twitter Profile
Ilia Kulikov

@uralik1

ramming llms @MetaAI

ID: 977013150871638016

Website: https://iliakulikov.ru · Joined: 23-03-2018 02:44:22

118 Tweets

511 Followers

270 Following

Ilia Kulikov (@uralik1):

haha they got me laughing when I saw this new iPad translation feature in WWDC video. At least looks like they are not cherry-picking there!! (the word жаренсезонных did not really exist until today 😂)

Sean Welleck (@wellecks):

"Mode recovery in neural autoregressive sequence modeling" We study mismatches between the most probable sequences of each stage of the "learning chain": ground-truth, data-collection, learning, decoding Led by Ilia Kulikov w/ Kyunghyun Cho, SPNLP talk on Friday arxiv.org/abs/2106.05459

"Mode recovery in neural autoregressive sequence modeling"

We study mismatches between the most probable sequences of each stage of the "learning chain": ground-truth, data-collection, learning, decoding

Led by <a href="/uralik1/">Ilia Kulikov</a> w/ <a href="/kchonyc/">Kyunghyun Cho</a>, SPNLP talk on Friday

arxiv.org/abs/2106.05459
Ilia Kulikov (@uralik1):

Apparently ancestral sampling yields high-quality translations if we sample enough times, but how do we choose one of them in the end? Bryan Eikema shows how to scale utility computations over large hypothesis spaces efficiently! Very cool
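The "choose one of them" step the tweet refers to is typically done with Minimum Bayes Risk (MBR) selection: score each sampled translation by its expected utility against the other samples and keep the best one. A minimal sketch, using a toy unigram-F1 utility (the actual utility function and scaling tricks in the referenced work are assumptions here, not reproduced):

```python
# Toy MBR selection over a set of sampled translations.
# The utility below is illustrative, not the one from the paper.

def utility(hyp: str, ref: str) -> float:
    """Toy utility: unigram F1 overlap between two strings."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = len(set(h) & set(r))
    p, q = overlap / len(h), overlap / len(r)
    return 0.0 if p + q == 0 else 2 * p * q / (p + q)

def mbr_select(samples: list[str]) -> str:
    """Pick the sample with the highest total utility against all samples."""
    return max(samples, key=lambda h: sum(utility(h, r) for r in samples))

samples = ["the cat sat", "a cat sat down", "the cat sat down"]
print(mbr_select(samples))  # → "the cat sat down"
```

The quadratic pairwise loop is exactly what becomes expensive for large hypothesis sets, which is the scaling problem the tweet says the talk addresses.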

Ilia Kulikov (@uralik1):

Interested in LLM inference algorithms? Please come and watch our tutorial next week! cmu-l3.github.io/neurips2024-in…

Ilia Kulikov (@uralik1):

We are using fairseq2 for LLM post-training research in our team. This release comes with decent documentation (facebookresearch.github.io/fairseq2/stabl…) 😅 My favorite feature of the lib is the runtime extension support: you can develop research code without forking the entire lib repo!

Jason Weston (@jaseweston):

🥥🌪️ Introducing CoCoMix - a LLM pretraining framework that predicts concepts and mixes them into its hidden state to improve next token prediction. 📈 More sample-efficient and outperforms next token prediction, knowledge distillation, and inserting pause tokens. 🔬Boosts

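The core CoCoMix idea as described in the tweet — predict concepts from the hidden state and mix them back in before next-token prediction — can be sketched as a toy NumPy computation. All shapes, names, and the mixing rule below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

# Hypothetical sketch: predict a concept distribution from the hidden
# state, embed it, and mix the result back into the hidden state.
rng = np.random.default_rng(0)
d_model, n_concepts = 8, 4

W_concept = rng.standard_normal((d_model, n_concepts))  # concept predictor
W_embed = rng.standard_normal((n_concepts, d_model))    # concept embeddings

def mix_concepts(hidden: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Predict concept probabilities, embed them, and add to the hidden state."""
    logits = hidden @ W_concept
    e = np.exp(logits - logits.max())
    probs = e / e.sum()                 # softmax over concepts
    concept_vec = probs @ W_embed       # weighted concept embedding
    return hidden + alpha * concept_vec # mixed hidden state, same shape

h = rng.standard_normal(d_model)
print(mix_concepts(h).shape)  # → (8,)
```

In the real framework the concept targets would come from an auxiliary prediction loss during pretraining; here they are just a softmax over random weights.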