Felix Kreuk (@felixkreuk) 's Twitter Profile
Felix Kreuk

@felixkreuk

AI Research @ FAIR, Meta AI. CS PhD from BIU. Opinions are my own.

ID: 38631957

Link: http://felixkreuk.github.io · Joined: 08-05-2009 08:35:51

285 Tweets

2.2K Followers

1.1K Following

Konstantin Mishchenko (@konstmish) 's Twitter Profile Photo

Learning rate schedulers used to be a big mystery. Now you can just take a guarantee for *convex non-smooth* problems (from arxiv.org/abs/2310.07831), and it gives you *precisely* what you see in training large models.
See this empirical study:
arxiv.org/abs/2501.18965
1/3
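A minimal sketch of the kind of schedule this line of work analyses: linear warmup followed by linear decay to zero. This is an illustrative implementation, not the papers' code; the peak LR, step counts, and function name are made up for the example.

```python
# Hedged sketch: linear-warmup + linear-decay learning-rate schedule.
# The linear-decay shape is the schedule family the cited theory covers;
# all hyperparameters below are illustrative.

def linear_warmup_decay(step: int, total_steps: int, peak_lr: float,
                        warmup_steps: int) -> float:
    """Return the learning rate at `step` (0-indexed)."""
    if step < warmup_steps:
        # Ramp linearly from ~0 up to peak_lr over the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    # Decay linearly from peak_lr down to 0 over the remaining steps.
    remaining = total_steps - warmup_steps
    progress = (step - warmup_steps) / max(1, remaining)
    return peak_lr * max(0.0, 1.0 - progress)

# Example: peak LR 3e-4, 1000 total steps, 100-step warmup.
lrs = [linear_warmup_decay(s, 1000, 3e-4, 100) for s in range(1000)]
```

The schedule peaks exactly at the end of warmup and then falls off linearly, which is the shape the empirical study above observes working well in large-model training.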
Hila Chefer (@hila_chefer) 's Twitter Profile Photo

VideoJAM is our new framework for improved motion generation from AI at Meta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵
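The core idea above, appearance dominating dynamics in the loss, can be sketched as adding an explicit motion term alongside the appearance term. This is a toy illustration of that balance, not VideoJAM's actual formulation; the MSE targets and the `motion_weight` parameter are assumptions.

```python
# Hedged sketch: a joint appearance + motion training objective, so the
# motion signal is not drowned out by the appearance signal. Inputs are
# flattened vectors for illustration only.

def joint_loss(pred_frames, target_frames, pred_motion, target_motion,
               motion_weight=1.0):
    """Mean-squared appearance error plus weighted mean-squared motion error."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    appearance = mse(pred_frames, target_frames)
    motion = mse(pred_motion, target_motion)
    return appearance + motion_weight * motion

# Toy example with flattened frame/motion vectors.
print(joint_loss([1.0, 2.0], [1.0, 1.0], [0.5], [0.0], motion_weight=2.0))
```

With `motion_weight > 0` the optimizer can no longer minimize the objective on appearance alone, which is the failure mode the tweet describes.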

Max 📟 (@maxnordau) 's Twitter Profile Photo

Eli Sharabi is free.

Palestine murdered his wife.
Palestine murdered his daughters.
Palestine murdered his brother.
Palestine kidnapped him.
Palestine starved him.
Palestine tortured him.

Palestine is a failed experiment.
Marina Medvin 🇺🇸 (@marinamedvin) 's Twitter Profile Photo

Palestinians killed his wife, both of his daughters, his brother, and even his dog. They starved and tortured him for 491 days. They paraded and humiliated him at their last opportunity, forcing him to read lines in some sadistic show they put on.

Eli Sharabi didn’t know what
Kilian Lieret @ICLR (@klieret) 's Twitter Profile Photo

SWE-agent 1.0 is the open-source SOTA on SWE-bench Lite! Tons of new features: massively parallel runs; cloud-based deployment; extensive configurability with tool bundles; new command line interface & utilities.

Visegrád 24 (@visegrad24) 's Twitter Profile Photo

BREAKING: Spokesman of the Israeli Army Daniel Hagari just announced that the forensic examination of the bodies of 9-month-old Kfir Bibas and his 4-year-old brother Ariel showed they were beaten to death by Hamas. They weren’t killed in an airstrike; they were brutally murdered

Eylon Levy (@eylonalevy) 's Twitter Profile Photo

Hamas made two hostages beg for their lives, mere meters behind ICRC vans at yesterday’s hostage release parade. Hamas wants the world to know it is evil. Israel wants the world to know Hamas is evil. On this, we can agree.

Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

🗣️🧠 Speech Language Models require lots of compute to train, right? 
In our new paper, we test whether it is possible to train an SLM on a single A5000 GPU in 24 hours.
The results may surprise you (they even surprised us)!
Tips, open source resources, full paper 👇🏻
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends?

In our new paper, we train several dozen SLMs and show it matters quite a lot! So there is room for optimism 😊

Key insights, code, models, full paper 👇🏻
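To make "speech–text interleaving" concrete: training sequences alternate aligned spans of text tokens and speech-unit tokens. The sketch below is only an illustration of that data layout; the token values, span boundaries, and `<u…>` unit notation are invented for the example.

```python
# Hedged sketch of speech-text interleaving: merge parallel lists of aligned
# text spans and speech-unit spans into one training sequence, alternating
# text -> speech. All tokens here are made up for illustration.

def interleave(text_spans, speech_spans):
    """Build one interleaved token sequence from aligned text/speech spans."""
    seq = []
    for t, s in zip(text_spans, speech_spans):
        seq.extend(t)   # text tokens for this span
        seq.extend(s)   # speech units aligned to the same span
    return seq

text = [["the", "cat"], ["sat", "down"]]
speech = [["<u17>", "<u3>"], ["<u88>", "<u5>", "<u5>"]]
print(interleave(text, speech))
# ['the', 'cat', '<u17>', '<u3>', 'sat', 'down', '<u88>', '<u5>', '<u5>']
```

Because the model sees both modalities in one stream, knowledge from the text distribution can transfer to the speech tokens, which is what makes the scaling picture more optimistic.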
Michael Hassid (@michaelhassid) 's Twitter Profile Photo

The longer a reasoning LLM thinks, the more likely it is to be correct, right?

Apparently not.

Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”.

Link: arxiv.org/abs/2505.17813

1/n
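The "prefer shorter thinking chains" idea can be sketched as: sample several reasoning chains and keep the answer from the shortest one, rather than voting over all of them. The sketch below is an assumption-laden illustration; `generate` is a stand-in for a real LLM call, and its chain lengths are random for the demo.

```python
# Hedged sketch: shortest-of-k selection over sampled thinking chains.
# `generate` is a hypothetical stand-in for an LLM sampling call.

import random

def generate(prompt: str) -> tuple[str, str]:
    """Hypothetical LLM call returning (chain_of_thought, answer)."""
    n = random.randint(20, 200)          # chain length, in mock "steps"
    return ("step " * n, f"answer after {n} steps")

def shortest_of_k(prompt: str, k: int = 5) -> str:
    """Sample k chains and return the answer from the shortest chain."""
    chains = [generate(prompt) for _ in range(k)]
    chain, answer = min(chains, key=lambda ca: len(ca[0]))
    return answer

random.seed(0)
print(shortest_of_k("What is 2+2?"))
```

The point of the paper's finding is that this selection rule can both cut inference cost and improve accuracy, since overly long chains correlate with errors.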
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

🎵💬 If you are interested in Audio Tokenisers, you should check out our new work!
We empirically analysed existing tokenisers from every angle: reconstruction, downstream tasks, LMs and more.

Grab yourself a ☕/🍺 and sit down for a read!
Itai Gat (@itai_gat) 's Twitter Profile Photo

Excited to share our recent work on corrector sampling in language models! A new sampling method that mitigates error accumulation by iteratively revisiting tokens in a window of previously generated text.
With: Neta Shaul, Uriel Singer, Yaron Lipman
Link: arxiv.org/abs/2506.06215
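A toy sketch of the corrector-sampling loop described above: after emitting each new token, revisit a small window of recent tokens and resample any the model now assigns low probability. `next_token` and `token_prob` are stand-ins for real model calls, and the window size and threshold are illustrative, not the paper's values.

```python
# Hedged sketch of corrector-style sampling over a sliding window.
# Model calls are mocked with random choices for illustration.

import random

VOCAB = list("abcdef")

def next_token(context):
    return random.choice(VOCAB)      # stand-in for model sampling

def token_prob(context, token):
    return random.random()           # stand-in for model likelihood

def corrector_sample(steps=20, window=4, threshold=0.1, seed=0):
    random.seed(seed)
    seq = []
    for _ in range(steps):
        seq.append(next_token(seq))
        # Revisit the last `window` tokens; resample low-probability ones.
        start = max(0, len(seq) - window)
        for i in range(start, len(seq)):
            if token_prob(seq[:i], seq[i]) < threshold:
                seq[i] = next_token(seq[:i])
    return "".join(seq)

print(corrector_sample())
```

The key design choice is that only a bounded window is revisited per step, so the correction pass adds constant overhead instead of re-decoding the whole sequence.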
Yaron Lipman (@lipmanya) 's Twitter Profile Photo

A new paper: We finetune an LLM to rethink and resample previously generated tokens, which reduces sampling errors and improves performance.
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

🎉Thrilled that our paper on "scaling analysis of interleaved speech-text LMs" was accepted to #CoLM2025
It gives room for optimism when scaling SpeechLMs *right* - with large TextLMs (in place of more data), interleaving, and synthetic training data 💪