Alessandro Suglia (@ale_suglia) Twitter Tweets • TwiCopy

Alessandro Suglia

3 months ago

"LLMs can play games" is a fashionable trend. As we demonstrate in our paper, training on a specific set of games yields higher performance on those games alone. However, this doesn't help models to play unseen games, showcasing their limitations in true instruction following! ↓

Likes: 8 · Replies: 0 · Reposts: 3

Francesco Capuano

@_fracapuano

3 months ago

Robotics models are increasingly bulky and difficult to run directly on robots. With Remi Cadene and the team LeRobot and Hugging Face we’re changing that. Introducing SmolVLA, a sub-500M VLA designed for efficient training and inference. A thread 🧵

Likes: 190 · Replies: 6 · Reposts: 35

Remi Cadene

@remicadene

3 months ago

🚨 5 DAYS TO GO! The world’s biggest AI Robotics Hackathon is almost here! 2,000+ builders, coders, dreamers are joining June 14–15. One rule: build, learn and have fun together! Find your local hackathon & team up. Register now: forms.gle/NP22nZ9knKCB2K…

Likes: 112 · Replies: 2 · Reposts: 31

Alessandro Suglia

@ale_suglia

3 months ago

Super cool opportunity to work with us on implementing novel Embodied #GenAI to enable human-robot collaboration! Reach out if you want to know more!

Likes: 6 · Replies: 0 · Reposts: 4

Alessandro Suglia

@ale_suglia

3 months ago

This is beautiful. Well done Rajat and the rest of the team. Truly inspiring!

Likes: 1 · Replies: 1 · Reposts: 0

Manos Zaranis

@manoszaranis

2 months ago

🚨Meet MF²: Movie Facts & Fibs: a new benchmark for long-movie understanding! 🤔Do you think your model understands movies? Unlike existing benchmarks, MF² targets memorable events, emotional arcs 💔, and causal chains 🔗 — things humans recall easily, but even top models like

Likes: 55 · Replies: 2 · Reposts: 23

Naomi Saphra hiring a lab 🧈🪰

@nsaphra

2 months ago

Reasoning is about variable binding. It’s not about information retrieval. If a model cannot do variable binding, it is not good at grounded reasoning, and there’s evidence accruing that large scale can make LLMs worse at in-context grounded reasoning. 🧵

Likes: 117 · Replies: 3 · Reposts: 12

Judd Rosenblatt — d/acc

@juddrosenblatt

2 months ago

Current AI “alignment” is just a mask. Our findings in The Wall Street Journal explore the limitations of today’s alignment techniques and what’s needed to get AI right 🧵

Likes: 9.9K · Replies: 352 · Reposts: 1.1K

Alessandro Suglia

@ale_suglia

2 months ago

Any updates on the status of Open Review? I really need to submit my NeurIPS Conference reviews :(

Likes: 0 · Replies: 1 · Reposts: 0

Alessandro Suglia

@ale_suglia

2 months ago

I hope somebody mentioned pixel-based models: arxiv.org/abs/2401.03321 antonio vergari ⚔️ not at #ICML2025

Likes: 7 · Replies: 2 · Reposts: 2

Aryo Pradipta Gema

@aryopg

2 months ago

New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵

Likes: 989 · Replies: 52 · Reposts: 133

Pasquale Minervini is hiring postdocs! 🚀

@pminervini

a month ago

The amazing folks at EdinburghNLP will be presenting a few papers at ACL 2025 (@aclmeeting); if you're in Vienna, touch base with them! Here are the papers in the main track 🧵

Likes: 71 · Replies: 1 · Reposts: 17

Verena Rieser

@verena_rieser

a month ago

Looking forward to kicking off day 2 at #ACL2025NLP with my keynote! We'll be tackling new frontiers of AI alignment. 🗓️ Tuesday, 9:00 AM 🗣️ "Who's Gold? Re-imagining Alignment for Truly Beneficial AI" Here's a sneak peek of the talk. #AI #AIAlignment #NLProc #ACL2025NLP

Likes: 93 · Replies: 2 · Reposts: 17

Agostina Calabrese 🦋

@agostina_cal

a month ago

At #ACL2025NLP and on the job market (NLP + AI Safety) 💼 It's great to see growing interest in safety/alignment, but we often miss the social context. Come to our Workshop on Online Abuse and Harms Friday to dive deeper into safe safety research! A quiet token from the biggest ACL 2025 ⬇️

Likes: 41 · Replies: 0 · Reposts: 9

Alessandro Suglia

@ale_suglia

a month ago

Wish I could be doing this. Check this proposal from Mario!

Likes: 1 · Replies: 0 · Reposts: 0

Tejas Kulkarni

@tejasdkulkarni

a month ago

Special thanks to Google DeepMind for inviting me to try out Genie 3. I'm excited to share my thoughts on this early research prototype and also some of my live recordings below: I spent the whole day playing with the system and when it works, it is truly mind blowing🤯. It is

Likes: 599 · Replies: 21 · Reposts: 76

Alessandro Suglia

@ale_suglia

a month ago

This looks amazing. I wonder how robust it is, but it definitely seems the way to go for setting up a truly open-ended learning regime where the learning agent is constantly challenged with novel and learnable experiences. Congrats Genie team!

Likes: 0 · Replies: 0 · Reposts: 0

clem 🤗

@clementdelangue

a month ago

When Sam Altman told me at the AI summit in Paris that they were serious about releasing open-source models & asked what would be useful, I couldn’t believe it. But six months of collaboration later, here it is: Welcome to OSS-GPT on Hugging Face! It comes in two sizes, for both

Likes: 2.2K · Replies: 91 · Reposts: 263