Felix Kreuk (@felixkreuk) 's Twitter Profile
Felix Kreuk

@felixkreuk

AI Research @ FAIR, Meta AI. CS PhD from BIU. Opinions are my own.

ID: 38631957

Link: http://felixkreuk.github.io · Joined: 08-05-2009 08:35:51

285 Tweets

2.2K Followers

1.1K Following

Konstantin Mishchenko (@konstmish) 's Twitter Profile Photo

Learning rate schedulers used to be a big mystery. Now you can just take a guarantee for *convex non-smooth* problems (from arxiv.org/abs/2310.07831), and it gives you *precisely* what you see in training large models.
See this empirical study:
arxiv.org/abs/2501.18965
1/3
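A minimal sketch of the kind of schedule this line of work analyses: linear warmup followed by linear decay to zero. This is an illustrative implementation, not the papers' code; the peak LR, step counts, and function name are made up for the example.

```python
# Hedged sketch: linear-warmup + linear-decay learning-rate schedule.
# The linear-decay shape is the schedule family the cited theory covers;
# all hyperparameters below are illustrative.

def linear_warmup_decay(step: int, total_steps: int, peak_lr: float,
                        warmup_steps: int) -> float:
    """Return the learning rate at `step` (0-indexed)."""
    if step < warmup_steps:
        # Ramp linearly from ~0 up to peak_lr over the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    # Decay linearly from peak_lr down to 0 over the remaining steps.
    remaining = total_steps - warmup_steps
    progress = (step - warmup_steps) / max(1, remaining)
    return peak_lr * max(0.0, 1.0 - progress)

# Example: peak LR 3e-4, 1000 total steps, 100-step warmup.
lrs = [linear_warmup_decay(s, 1000, 3e-4, 100) for s in range(1000)]
```

The schedule peaks exactly at the end of warmup and then falls off linearly, which is the shape the empirical study above observes working well in large-model training.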
Hila Chefer (@hila_chefer) 's Twitter Profile Photo

VideoJAM is our new framework for improved motion generation from AI at Meta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵
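The core idea above, appearance dominating dynamics in the loss, can be sketched as adding an explicit motion term alongside the appearance term. This is a toy illustration of that balance, not VideoJAM's actual formulation; the MSE targets and the `motion_weight` parameter are assumptions.

```python
# Hedged sketch: a joint appearance + motion training objective, so the
# motion signal is not drowned out by the appearance signal. Inputs are
# flattened vectors for illustration only.

def joint_loss(pred_frames, target_frames, pred_motion, target_motion,
               motion_weight=1.0):
    """Mean-squared appearance error plus weighted mean-squared motion error."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    appearance = mse(pred_frames, target_frames)
    motion = mse(pred_motion, target_motion)
    return appearance + motion_weight * motion

# Toy example with flattened frame/motion vectors.
print(joint_loss([1.0, 2.0], [1.0, 1.0], [0.5], [0.0], motion_weight=2.0))
```

With `motion_weight > 0` the optimizer can no longer minimize the objective on appearance alone, which is the failure mode the tweet describes.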

Max 📟 (@maxnordau) 's Twitter Profile Photo

Eli Sharabi is free.

Palestine murdered his wife.
Palestine murdered his daughters.
Palestine murdered his brother.
Palestine kidnapped him.
Palestine starved him.
Palestine tortured him.

Palestine is a failed experiment.
Marina Medvin 🇺🇸 (@marinamedvin) 's Twitter Profile Photo

Palestinians killed his wife, both of his daughters, his brother, and even his dog. They starved and tortured him for 491 days. They paraded and humiliated him at their last opportunity, forcing him to read lines in some sadistic show they put on.

Eli Sharabi didn’t know what
Kilian Lieret @ICLR (@klieret) 's Twitter Profile Photo

SWE-agent 1.0 is the open-source SOTA on SWE-bench Lite! Tons of new features: massively parallel runs; cloud-based deployment; extensive configurability with tool bundles; new command line interface & utilities.

Visegrád 24 (@visegrad24) 's Twitter Profile Photo

BREAKING: Spokesman of the Israeli Army Daniel Hagari just announced that the forensic examination of the bodies of 9-month-old Kfir Bibas and his 4-year-old brother Ariel showed they were beaten to death by Hamas. They weren’t killed in an airstrike; they were brutally murdered

Eylon Levy (@eylonalevy) 's Twitter Profile Photo

Hamas made two hostages beg for their lives, mere meters behind ICRC vans at yesterday’s hostage release parade. Hamas wants the world to know it is evil. Israel wants the world to know Hamas is evil. On this, we can agree.

Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

🗣️🧠 Speech Language Models require lots of compute to train, right? 
In our new paper, we test whether it is possible to train an SLM on a single A5000 GPU in 24 hours.
The results may surprise you (they even surprised us)!
Tips, open source resources, full paper 👇🏻
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends?

In our new paper, we train several dozen SLMs and show it matters quite a lot! So there is room for optimism 😊

Key insights, code, models, full paper 👇🏻
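To make "speech–text interleaving" concrete: training sequences alternate aligned spans of text tokens and speech-unit tokens. The sketch below is only an illustration of that data layout; the token values, span boundaries, and `<u…>` unit notation are invented for the example.

```python
# Hedged sketch of speech-text interleaving: merge parallel lists of aligned
# text spans and speech-unit spans into one training sequence, alternating
# text -> speech. All tokens here are made up for illustration.

def interleave(text_spans, speech_spans):
    """Build one interleaved token sequence from aligned text/speech spans."""
    seq = []
    for t, s in zip(text_spans, speech_spans):
        seq.extend(t)   # text tokens for this span
        seq.extend(s)   # speech units aligned to the same span
    return seq

text = [["the", "cat"], ["sat", "down"]]
speech = [["<u17>", "<u3>"], ["<u88>", "<u5>", "<u5>"]]
print(interleave(text, speech))
# ['the', 'cat', '<u17>', '<u3>', 'sat', 'down', '<u88>', '<u5>', '<u5>']
```

Because the model sees both modalities in one stream, knowledge from the text distribution can transfer to the speech tokens, which is what makes the scaling picture more optimistic.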
Michael Hassid (@michaelhassid) 's Twitter Profile Photo

The longer a reasoning LLM thinks, the more likely it is to be correct, right?

Apparently not.

Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”.

Link: arxiv.org/abs/2505.17813

1/n
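The "prefer shorter thinking chains" idea can be sketched as: sample several reasoning chains and keep the answer from the shortest one, rather than voting over all of them. The sketch below is an assumption-laden illustration; `generate` is a stand-in for a real LLM call, and its chain lengths are random for the demo.

```python
# Hedged sketch: shortest-of-k selection over sampled thinking chains.
# `generate` is a hypothetical stand-in for an LLM sampling call.

import random

def generate(prompt: str) -> tuple[str, str]:
    """Hypothetical LLM call returning (chain_of_thought, answer)."""
    n = random.randint(20, 200)          # chain length, in mock "steps"
    return ("step " * n, f"answer after {n} steps")

def shortest_of_k(prompt: str, k: int = 5) -> str:
    """Sample k chains and return the answer from the shortest chain."""
    chains = [generate(prompt) for _ in range(k)]
    chain, answer = min(chains, key=lambda ca: len(ca[0]))
    return answer

random.seed(0)
print(shortest_of_k("What is 2+2?"))
```

The point of the paper's finding is that this selection rule can both cut inference cost and improve accuracy, since overly long chains correlate with errors.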
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

🎵💬 If you are interested in Audio Tokenisers, you should check out our new work!
We empirically analysed existing tokenisers from every angle: reconstruction, downstream tasks, LMs and more.

Grab yourself a ☕/🍺 and sit down for a read!
Itai Gat (@itai_gat) 's Twitter Profile Photo

Excited to share our recent work on corrector sampling in language models! A new sampling method that mitigates error accumulation by iteratively revisiting tokens in a window of previously generated text.
With: Neta Shaul, Uriel Singer, Yaron Lipman
Link: arxiv.org/abs/2506.06215
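A toy sketch of the corrector-sampling loop described above: after emitting each new token, revisit a small window of recent tokens and resample any the model now assigns low probability. `next_token` and `token_prob` are stand-ins for real model calls, and the window size and threshold are illustrative, not the paper's values.

```python
# Hedged sketch of corrector-style sampling over a sliding window.
# Model calls are mocked with random choices for illustration.

import random

VOCAB = list("abcdef")

def next_token(context):
    return random.choice(VOCAB)      # stand-in for model sampling

def token_prob(context, token):
    return random.random()           # stand-in for model likelihood

def corrector_sample(steps=20, window=4, threshold=0.1, seed=0):
    random.seed(seed)
    seq = []
    for _ in range(steps):
        seq.append(next_token(seq))
        # Revisit the last `window` tokens; resample low-probability ones.
        start = max(0, len(seq) - window)
        for i in range(start, len(seq)):
            if token_prob(seq[:i], seq[i]) < threshold:
                seq[i] = next_token(seq[:i])
    return "".join(seq)

print(corrector_sample())
```

The key design choice is that only a bounded window is revisited per step, so the correction pass adds constant overhead instead of re-decoding the whole sequence.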
Yaron Lipman (@lipmanya) 's Twitter Profile Photo

A new paper: We finetune an LLM to rethink and resample previously generated tokens, which reduces sampling errors and improves performance.
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

🎉Thrilled that our paper on "scaling analysis of interleaved speech-text LMs" was accepted to #CoLM2025
It gives room for optimism when scaling SpeechLMs *right* - with large TextLMs (in place of more data), interleaving, and synthetic training data 💪