Jad Kabbara (@jad_kabbara) 's Twitter Profile
Jad Kabbara

@jad_kabbara

NLP Postdoc @MIT Center for Constructive Communication (CCC). PhD from McGill University @rllabmcgill & @Mila_Quebec. @AUB_Lebanon alum.

ID: 847162226914013185

Link: http://www.mit.edu/~jkabbara/ Joined: 29-03-2017 19:03:09

1.1K Tweets

1.1K Followers

745 Following

Hussein Mozannar (@hsseinmzannar) 's Twitter Profile Photo

Excited to release my first lead project Magentic-UI at Microsoft Research, an OS web agent application designed for efficient human-agent interaction. CUA agents are cool but they're not so useful yet; Magentic-UI helps us study how to get value from them. github.com/microsoft/mage…

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

🚨 Lucie-Aimée Kaffee and I are looking for a junior collaborator to research the Open Model Ecosystem! 🤖 Ideally, someone w/ AI/ML background, who can help w/ annotation pipeline + analysis. docs.google.com/forms/d/e/1FAI…

Eric (@eric_chamoun) 's Twitter Profile Photo

What are NLP papers really saying about the purpose and use of their models/datasets? 🤔 Who are they for? What problems do they solve? How are they used? We built a framework + tool to: (1) analyze framing trends across papers (2) help authors reflect on their own framing 🧡

EleutherAI (@aieleuther) 's Twitter Profile Photo

Can you train a performant language model without using unlicensed text? We are thrilled to announce the Common Pile v0.1, an 8TB dataset of openly licensed and public domain text. We train 7B models for 1T and 2T tokens and match the performance of similar models like LLaMA 1&2

Ziling Cheng (@ziling_cheng) 's Twitter Profile Photo

Do LLMs hallucinate randomly? Not quite. Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode, revealing how LLMs generalize using abstract classes + context cues, albeit unreliably. 📎 Paper: arxiv.org/abs/2505.22630 1/n

Benno Krojer (@benno_krojer) 's Twitter Profile Photo

Excited to share the results of my internship research with AI at Meta, as part of a larger world modeling release! What subtle shortcuts are VideoLLMs taking on spatio-temporal questions? And how can we instead curate shortcut-robust examples at a large scale? Details 👇🔬

Victor Sanh (@sanhestpasmoi) 's Twitter Profile Photo

🔥 Big exciting news - I've started a new company! 🚀 We are building AI agents that take actions in the real world by orchestrating the movement of physical goods. We're working with our first partners and are now growing the founding engineering team. We're building in NYC,

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

Thrilled to collaborate on the launch of 📚 CommonPile v0.1 📚! Introducing the largest openly-licensed LLM pretraining corpus (8 TB), led by Nikhil Kandpal, Brian Lester, and Colin Raffel. 📜: arxiv.org/pdf/2506.05209 📚🤖 Data & models: huggingface.co/common-pile 1/

Hope Schroeder (@schropes) 's Twitter Profile Photo

1) Thrilled to be at #Facct2025 for the first time this week, representing a meta-research paper on positionality statements at FAccT from 2018-2024, in collaboration with Solon Barocas and Akshansh Pareek.

Andrei Lupu (@_andreilupu) 's Twitter Profile Photo

Theory of Mind (ToM) is crucial for next gen LLM Agents, yet current benchmarks suffer from multiple shortcomings. Enter 💽 Decrypto, an interactive benchmark for multi-agent reasoning and ToM in LLMs! Work done with Timon Willi & Jakob Foerster at AI at Meta & Foerster Lab for AI Research 🧵👇

Cesare Spinoso-Di Piano (@cesare_spinoso) 's Twitter Profile Photo

A blizzard is raging in Montreal when your friend says “Wow, the weather is amazing!” Humans easily interpret irony, while LLMs struggle at it. We propose a rhetorical-strategy-aware probabilistic framework as a solution. arxiv.org/abs/2506.09301 @ #acl2025

Michiel Bakker (@bakkermichiel) 's Twitter Profile Photo

🚨🚨 Excited to share a new paper led by Haiwen Li with the Community Notes team! LLMs will reshape the information ecosystem. Community Notes offers a promising model for keeping human judgment central but it's an open question how to best integrate LLMs. Thread 👇

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

Existing AI Agent benchmarks are broken 🤖💔 Great work by Yuxuan Zhu and Daniel Kang to identify + fix issues, and establish rigorous best practices for Agentic AI benchmarks! Check out the blog: ddkang.substack.com/p/ai-agent-ben…

Joey Bose (@bose_joey) 's Twitter Profile Photo

🎉 Personal update: I'm thrilled to announce that I'm joining Imperial College London as an Assistant Professor of Computing starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

Copyrighted 🚧, private 🛑, and sensitive ☒️ data remain major challenges for AI. FlexOlmo introduces an architectural mechanism to flexibly opt-in/opt-out segments of data in the training weights, **at inference time**. (Prior common solutions were to filter your data once

Nicholas Meade (@ncmeade) 's Twitter Profile Photo

I'll be at #ICML2025 this week presenting SafeArena (Wednesday 11AM - 1:30PM in East Exhibition Hall E-701). Come by to chat with me about web agent safety (or anything else safety-related)!

Siva Reddy (@sivareddyg) 's Twitter Profile Photo

I am speaking at 10 am PT on a slightly different topic than I usually talk about 🙂: "Simple Ideas Can Have Mighty Effects: Don't Take LLM Fundamentals for Granted". Check it out if you're around. #ICML2025

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

Excited to present our AI Flaw Disclosure paper at #ICML2025 in Vancouver! 🌲🌊🏔️ Swing by our poster session in East Exhibition Halls A-B E-606!