Ori Yoran (@oriyoran)'s Twitter Profile
Ori Yoran

@oriyoran

NLP researcher / Ph.D. candidate (Tel Aviv University)

ID: 1375041744459468800

Joined: 25-03-2021 11:07:59

165 Tweets

606 Followers

543 Following

Pierre Chambon (@pierrechambon6) 's Twitter Profile Photo

Does your LLM truly comprehend the complexity of the code it generates? 🥰

Introducing our new non-saturated (for at least the coming week? 😉) benchmark:

✨BigO(Bench)✨ - Can LLMs Generate Code with Controlled Time and Space Complexity?

Check out the details below! 👇
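A rough sketch of the kind of check such a benchmark implies (my illustration with invented helper names, not the authors' harness): time a candidate function on inputs of growing size and estimate its empirical complexity from how the runtime scales.

    # Hypothetical sketch: empirically estimate the time complexity of a
    # model-generated function by timing it on inputs of increasing size.
    # This is NOT the BigO(Bench) harness, just an illustration of the idea.
    import math
    import time

    def generated_sort(xs):          # stand-in for LLM-generated code
        return sorted(xs)

    def empirical_complexity(fn, sizes=(1_000, 10_000, 100_000)):
        times = []
        for n in sizes:
            data = list(range(n, 0, -1))
            start = time.perf_counter()
            fn(data)
            times.append(time.perf_counter() - start)
        # Slope of log(time) vs log(n) approximates the polynomial degree
        # of the runtime: ~1 suggests O(n)/O(n log n), ~2 suggests O(n^2).
        return (math.log(times[-1]) - math.log(times[0])) / (
            math.log(sizes[-1]) - math.log(sizes[0]))

    print(empirical_complexity(generated_sort))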
Noam Razin (@noamrazin) 's Twitter Profile Photo

The success of RLHF depends heavily on the quality of the reward model (RM), but how should we measure this quality?

📰 We study what makes a good RM from an optimization perspective. Among other results, we formalize why more accurate RMs are not necessarily better teachers! 🧵
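One way to picture how accuracy and teaching quality can come apart (a toy numerical illustration of the tweet's claim, not the paper's formal construction): a reward model can rank outputs almost perfectly while assigning them nearly identical scores, so the reward signal the policy actually sees is essentially flat.

    # Toy sketch (my own illustration): RM A ranks all outputs correctly but
    # is nearly flat on them, so the optimization signal it induces has tiny
    # variance; RM B makes a ranking error yet spreads rewards out more.
    import statistics

    true_quality = {"resp_1": 0.1, "resp_2": 0.4, "resp_3": 0.6, "resp_4": 0.9}

    rm_accurate_but_flat = {o: 0.50 + 0.001 * q for o, q in true_quality.items()}
    rm_noisier_but_spread = {"resp_1": 0.2, "resp_2": 0.9, "resp_3": 0.5, "resp_4": 1.0}

    for name, rm in [("accurate/flat", rm_accurate_but_flat),
                     ("less accurate/spread", rm_noisier_but_spread)]:
        rewards = [rm[o] for o in true_quality]   # rewards under a uniform policy
        print(name, "reward variance:", round(statistics.variance(rewards), 6))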
Gallil Maimon (@gallilmaimon) 's Twitter Profile Photo

Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends?

In our new paper, we train several dozen SLMs, and show - quite a lot! So there is room for optimism 😊

Key insights, code, models, full paper 👇🏻
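For readers unfamiliar with the setup, a minimal sketch of what speech-text interleaving looks like at the data level; the token IDs and marker tokens below are invented for illustration, not taken from the paper.

    # Hypothetical sketch of speech-text interleaving for SLM training:
    # alternate spans of text tokens and discrete speech units in a single
    # training sequence, separated by modality markers.
    text_spans = [[101, 57, 882], [14, 9]]             # tokenized text chunks
    speech_spans = [[3001, 3002, 3005], [3010, 3003]]  # discrete speech units

    BOS, TEXT, SPEECH = 0, 1, 2                        # made-up special tokens

    def interleave(text_spans, speech_spans):
        seq = [BOS]
        for txt, sp in zip(text_spans, speech_spans):
            seq += [TEXT] + txt + [SPEECH] + sp        # alternate modalities
        return seq

    print(interleave(text_spans, speech_spans))
    # [0, 1, 101, 57, 882, 2, 3001, 3002, 3005, 1, 14, 9, 2, 3010, 3003]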
Michael Hassid (@michaelhassid) 's Twitter Profile Photo

The longer a reasoning LLM thinks, the more likely it is to be correct, right?

Apparently not.

Presenting our paper: "Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning".

Link: arxiv.org/abs/2505.17813

1/n
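A minimal sketch of the "prefer shorter chains" idea, assuming a sampling-based setup; the paper's exact selection rule may differ, and generate_chain is a hypothetical helper returning a (reasoning, answer) pair.

    # Sketch: sample several reasoning chains and take the answer from the
    # shortest completed ones, instead of majority-voting over all of them.
    def pick_answer_from_shortest(prompt, generate_chain, k=5, m=2):
        chains = [generate_chain(prompt) for _ in range(k)]  # (reasoning, answer)
        chains.sort(key=lambda c: len(c[0]))                 # shortest reasoning first
        answers = [ans for _, ans in chains[:m]]
        return max(set(answers), key=answers.count)          # majority among the m shortest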
Alex Zhang (@a1zhang) 's Twitter Profile Photo

Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? VideoGameBench evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! 🧵👇
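Schematically, the evaluation loop such a benchmark implies looks like the sketch below; the emulator hooks are passed in as callables and are hypothetical, not VideoGameBench's actual API.

    # Generic VLM-plays-a-game loop; callers supply the emulator hooks.
    def play_episode(capture_screen, choose_action, press, game_completed,
                     max_steps=1000):
        history = []
        for _ in range(max_steps):
            frame = capture_screen()                 # raw screen pixels only
            action = choose_action(frame, history)   # VLM maps image -> button press
            press(action)                            # forward action to the emulator
            history.append(action)
            if game_completed():
                return True                          # episode solved within budget
        return False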

Yoav Gur Arieh (@guryoav) 's Twitter Profile Photo

Can we precisely erase conceptual knowledge from LLM parameters?
Most methods are shallow, coarse, or overreaching, adversely affecting related or general knowledge.

We introduce PISCES, a general framework for Precise In-parameter Concept EraSure. 🧵 1/
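As a generic illustration of what "in-parameter" erasure can mean (not necessarily the PISCES procedure itself): identify a direction that encodes the concept and project it out of the weights that write it, rather than filtering outputs at inference time.

    # Toy illustration of editing parameters to erase a concept direction
    # from an MLP output projection. NumPy only, toy dimensions.
    import numpy as np

    rng = np.random.default_rng(0)
    W_out = rng.normal(size=(64, 16))        # toy MLP output projection
    concept = rng.normal(size=64)            # direction encoding the concept
    concept /= np.linalg.norm(concept)

    # Remove the component of W_out that writes along the concept direction.
    W_edited = W_out - np.outer(concept, concept @ W_out)

    print(np.abs(concept @ W_edited).max())  # ~0: the layer no longer writes the concept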
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

Corrector Sampling in Language Models

"Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by
Ricky T. Q. Chen (@rickytqchen) 's Twitter Profile Photo

Padding in our non-AR sequence models? Yuck. 🙅 👉 Instead of unmasking, our new work *Edit Flows* performs iterative refinements via position-relative inserts and deletes, operations naturally suited for variable-length sequence generation. Easily better than using mask tokens.
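A toy illustration of what position-relative insert/delete operations look like on a token sequence (just the edit mechanics, not the Edit Flows model itself):

    # Refining a sequence with inserts and deletes instead of unmasking
    # fixed positions; the sequence length changes freely, so no padding.
    def apply_edits(tokens, edits):
        out = list(tokens)
        # Apply right-to-left so earlier positions stay valid as we mutate.
        for op, pos, tok in sorted(edits, key=lambda e: e[1], reverse=True):
            if op == "insert":
                out.insert(pos, tok)        # grow the sequence at `pos`
            elif op == "delete":
                out.pop(pos)                # shrink it
        return out

    seq = ["the", "cat", "cat", "sat"]
    edits = [("delete", 2, None), ("insert", 4, "down")]
    print(apply_edits(seq, edits))          # ['the', 'cat', 'sat', 'down']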

Yijia Shao (@echoshao8899) 's Twitter Profile Photo

🚨 70 million US workers are about to face their biggest workplace transformation due to AI agents. But nobody asks them what they want.

While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce. 🧵
Mor Geva (@megamor2) 's Twitter Profile Photo

✨ MLP layers have just become more interpretable than ever ✨

In a new paper:
* We show a simple method for decomposing MLP activations into interpretable features
* Our method uncovers hidden concept hierarchies, where sparse neuron combinations form increasingly abstract ideas
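As a generic sketch of the kind of decomposition described (the paper's actual method may differ), one can factor MLP activations into sparse combinations of a small dictionary of directions and then inspect those directions individually:

    # Sparse decomposition of MLP activations via dictionary learning; a
    # stand-in for whatever decomposition the paper actually uses.
    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    rng = np.random.default_rng(0)
    acts = rng.normal(size=(500, 64))        # toy: 500 activation vectors, 64 neurons

    dl = DictionaryLearning(n_components=16, alpha=1.0, max_iter=200, random_state=0)
    codes = dl.fit_transform(acts)           # sparse coefficients per example
    features = dl.components_                # 16 candidate feature directions

    print(codes.shape, features.shape)       # (500, 16) (16, 64)
    print((codes != 0).mean())               # fraction of active features per example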
Neta Shaul (@shaulneta) 's Twitter Profile Photo

[1/n] New paper alert! 🚀 Excited to introduce Transition Matching (TM)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model 🤯, achieving SOTA text-2-image generation! Uriel Singer Itai Gat Yaron Lipman
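Reading the tweet literally, the contrast is between a fixed short-timestep update and a learned model producing each intermediate state; the schematic below is my own sketch under that reading, not the paper's algorithm.

    # Schematic contrast: a standard flow/diffusion sampler advances with a
    # fixed short-timestep kernel, whereas a transition-model sampler lets a
    # learned generative model sample the next intermediate state directly.
    def euler_sampler(x0, velocity, n_steps=100):
        x, dt = x0, 1.0 / n_steps
        for i in range(n_steps):
            x = x + dt * velocity(x, i * dt)      # fixed deterministic kernel
        return x

    def transition_sampler(x0, transition_model, n_steps=8):
        x = x0
        for i in range(n_steps):
            x = transition_model(x, i / n_steps)  # learned model produces next state
        return x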

Itay Itzhak (@itay_itzhak_) 's Twitter Profile Photo

🚨 New paper alert 🚨

🧠 Instruction-tuned LLMs show amplified cognitive biases. But are these new behaviors, or pretraining ghosts resurfacing?

Excited to share our new paper, accepted to CoLM 2025! 🎉
See thread below 👇
#BiasInAI #LLMs #MachineLearning #NLProc