Luke Zettlemoyer (@lukezettlemoyer)'s Twitter Profile
Luke Zettlemoyer

@lukezettlemoyer

ID: 3741979273

Joined: 30-09-2015 23:41:36

1.1K Tweets

9.9K Followers

2.2K Following

Yizhong Wang (@yizhongwyz)

Thrilled to announce that I will be joining UT Austin Computer Science as an assistant professor in fall 2026!

I will continue working on language models, data challenges, learning paradigms, & AI for innovation. Looking forward to teaming up with new students & colleagues! 🤠🤘
Sahil Verma (@sahil1v)

🚨 New Paper! 🚨
Guard models: slow, language-specific, and modality-limited?

Meet OmniGuard, which detects harmful prompts across multiple languages & modalities using one approach, with SOTA performance in all 3 modalities while being 120X faster 🚀

arxiv.org/abs/2505.23856
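The tweet doesn't say how OmniGuard works internally, but a guard that is ~120X faster than generative moderators is plausibly classifying a single embedding rather than running a full LLM per request. Below is a minimal sketch of that idea, assuming a frozen encoder that maps text, images, or audio into one shared embedding space; `GuardHead`, the dimensions, and the random embedding are illustrative, not OmniGuard's actual code.

```python
import torch
import torch.nn as nn

class GuardHead(nn.Module):
    """Small MLP that classifies a frozen embedding as harmful/benign.

    Hypothetical sketch: classifying one embedding per request (rather
    than generating text with a guard LLM) is where a large speedup
    over generative guard models would come from.
    """
    def __init__(self, embed_dim: int = 1024, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # single harmfulness logit
        )

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.mlp(emb).squeeze(-1)

# Assumption: some multilingual/multimodal encoder produces these
# embeddings for prompts in any language or modality.
head = GuardHead()
prompt_embeddings = torch.randn(4, 1024)  # stand-in for encoder output
print(torch.sigmoid(head(prompt_embeddings)))  # per-prompt harm probability
```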
Saumya Malik (@saumyamalik44)

I’m thrilled to share RewardBench 2 📊— We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!
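For readers outside the reward-modeling niche, the core measurement in such a benchmark is whether the reward model scores the labeled-correct completion above the alternatives. A toy sketch of that best-of-N protocol follows; it is a generic illustration, not RewardBench 2's exact setup, and `reward_model` is a stand-in callable.

```python
from typing import Callable, Sequence

def best_of_n_accuracy(
    reward_model: Callable[[str, str], float],
    examples: Sequence[dict],
) -> float:
    """Fraction of examples where the top-scored candidate is the
    labeled-correct one (generic best-of-N RM evaluation)."""
    correct = 0
    for ex in examples:
        scores = [reward_model(ex["prompt"], c) for c in ex["candidates"]]
        if scores.index(max(scores)) == ex["correct_index"]:
            correct += 1
    return correct / len(examples)

# Toy usage: a stand-in RM that just prefers longer answers.
toy = [{"prompt": "2+2?", "candidates": ["5", "4, since 2+2=4"], "correct_index": 1}]
print(best_of_n_accuracy(lambda p, c: float(len(c)), toy))  # 1.0
```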
Jason Weston (@jaseweston)

🚨Self-Challenging Language Model Agents🚨
📝: arxiv.org/abs/2506.01716

A new paradigm to train LLM agents to use different tools with challenging self-generated data ONLY: Self-challenging agents (SCA) both propose new tasks and solve them, using self-generated verifiers to …
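As described, the loop has one model play every role: propose a task, attempt it with tools, and judge the attempt with a self-generated verifier, keeping only approved trajectories as training data. A stand-in sketch of one round, where `llm` and `run_agent` are stubs rather than the paper's code:

```python
from typing import Callable, List, Tuple

def self_challenging_round(
    llm: Callable[[str], str],
    run_agent: Callable[[str], str],
    n_tasks: int,
) -> List[Tuple[str, str]]:
    """One hypothetical round: propose tasks, attempt them, keep only
    trajectories the model's own verifier accepts (later used to train
    the agent, e.g. via SFT or RL)."""
    kept = []
    for _ in range(n_tasks):
        task = llm("Propose a new tool-use task.")
        trajectory = run_agent(task)  # tool-calling rollout on the task
        verdict = llm(f"Did this trajectory solve '{task}'? Answer yes/no.\n{trajectory}")
        if verdict.strip().lower().startswith("yes"):
            kept.append((task, trajectory))
    return kept

# Stand-in model and agent so the sketch runs end to end.
print(self_challenging_round(
    llm=lambda p: "yes" if "yes/no" in p else "Sort a CSV with the shell tool.",
    run_agent=lambda task: f"[tool calls for: {task}]",
    n_tasks=2,
))
```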
Chau Minh Pham (@chautmpham)

🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new -- while copying 90% of its output from those texts?

🧟 You get what we call a Frankentext!

💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.
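One concrete way to audit the "copy 90% of its output" constraint is to measure how much of the generation is covered by verbatim n-grams from the provided paragraphs. The sketch below does exactly that; it is an assumed metric for illustration (the paper's actual measurement may differ), with n = 5 chosen arbitrarily.

```python
def copy_rate(output: str, sources: list[str], n: int = 5) -> float:
    """Fraction of output tokens covered by some length-n token span
    that appears verbatim in the human-written source paragraphs."""
    src_ngrams = set()
    for s in sources:
        toks = s.split()
        src_ngrams.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    out = output.split()
    covered = [False] * len(out)
    for i in range(len(out) - n + 1):
        if tuple(out[i:i + n]) in src_ngrams:
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / max(len(out), 1)

human = ["the quick brown fox jumps over the lazy dog near the river bank"]
draft = "a tired fox jumps over the lazy dog near the river bank today"
print(round(copy_rate(draft, human), 2))  # ~0.77: most tokens sit in copied spans
```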
Mohit Iyyer (@mohitiyyer)

Tired of AI slop? Our work on "Frankentexts" shows how LLMs can stitch together random fragments of human writing into coherent, relevant responses to arbitrary prompts. Frankentexts are weirdly creative, and they also pose problems for AI detectors: are they AI? human? More 👇

Jihan Yao (@jihan_yao)

We introduce MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation

✅ Reliable: 94.3% agreement with human judgment
✅ Comprehensive: 4 modality combinations × 49 tasks × 937 instructions

🔍 Results and Takeaways:

> GPT-Image-1 from OpenAI …
Tim Franzmeyer (@frtimlive)

What if LLMs knew when to stop?

🚧 HALT finetuning teaches LLMs to only generate content they’re confident is correct.
🔍 Insight: Post-training must be adjusted to the model’s capabilities.
⚖️ Tunable trade-off: Higher correctness 🔒 vs. more completeness 📝

With AI at Meta 🧵
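HALT itself is a finetuning method, but the behavior it targets can be pictured at inference time as confidence-gated generation: keep content the model is confident in, abstain on the rest, with a threshold controlling the correctness-versus-completeness dial the tweet mentions. A sketch of that analogue, where the chunking and confidence function are stand-ins:

```python
from typing import Callable, List

def confidence_gated_answer(
    chunks: List[str],
    confidence: Callable[[str], float],
    threshold: float,
) -> str:
    """Keep only chunks scored above `threshold`; abstain if none are.
    Raising the threshold trades completeness for correctness."""
    kept = [c for c in chunks if confidence(c) >= threshold]
    return " ".join(kept) if kept else "I'm not sure."

draft = ["Paris is France's capital.", "Its population is exactly 2,102,650."]
conf = lambda c: 0.95 if "capital" in c else 0.40  # stand-in confidence scores
print(confidence_gated_answer(draft, conf, threshold=0.9))
# -> only the confident claim survives; the shaky statistic is dropped
```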

Jacqueline He (@jcqln_h)

LMs often output answers that sound right but aren’t supported by input context. This is intrinsic hallucination: the generation of plausible, but unsupported content.

We propose Precise Information Control (PIC): a task requiring LMs to ground only on given verifiable claims.
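A natural way to operationalize that requirement is to check each output sentence for support by at least one provided claim, for example with an entailment (NLI) scorer. The sketch below flags unsupported sentences under that assumption; `entails` is a stand-in scorer, not the paper's verifier.

```python
from typing import Callable, List

def unsupported_sentences(
    output_sentences: List[str],
    claims: List[str],
    entails: Callable[[str, str], float],
    threshold: float = 0.8,
) -> List[str]:
    """Return output sentences not entailed by any provided claim."""
    return [
        s for s in output_sentences
        if max((entails(c, s) for c in claims), default=0.0) < threshold
    ]

claims = ["The battery lasts 10 hours.", "The device weighs 300 grams."]
output = ["The battery lasts 10 hours.", "It charges fully in 20 minutes."]
toy_nli = lambda premise, hyp: 1.0 if premise == hyp else 0.0  # stand-in NLI
print(unsupported_sentences(output, claims, toy_nli))
# -> ['It charges fully in 20 minutes.']  (an intrinsic hallucination)
```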
Kempner Institute at Harvard University (@kempnerinst)

NEW: Luke Zettlemoyer (@lukezettlemoyer) of University of Washington and @MetaAI walks through different approaches to building multimodal foundation models.

Watch the video: youtu.be/vTI4cziw84Q

#NeuroAI2025 #AI #ML #LLMs #NeuroAI
Mickel Liu (@mickel_liu)

🤔Conventional LM safety alignment is reactive: find vulnerabilities→patch→repeat
🌟We propose 𝗼𝗻𝗹𝗶𝗻𝗲 𝐦𝐮𝐥𝐭𝐢-𝐚𝐠𝐞𝐧𝐭 𝗥𝗟 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 where Attacker & Defender self-play to co-evolve, finding diverse attacks and improving safety by up to 72% vs. RLHF 🧵
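The attacker/defender setup reads like a zero-sum game scored by a safety judge: the attacker earns reward when the defender responds unsafely, and vice versa, so both improve as they train against each other. One round of that game as a sketch; the judge, policies, and reward shaping here are stand-ins, and the paper's actual objective may differ.

```python
from typing import Callable, Tuple

def self_play_round(
    attacker: Callable[[str], str],
    defender: Callable[[str], str],
    judge: Callable[[str, str], float],  # 1.0 = fully safe response
    seed_topic: str,
) -> Tuple[float, float]:
    """One zero-sum round: attacker crafts a prompt, defender answers,
    a judge scores safety, and the two receive opposing rewards."""
    attack_prompt = attacker(seed_topic)
    response = defender(attack_prompt)
    safety = judge(attack_prompt, response)
    return 1.0 - safety, safety  # (attacker_reward, defender_reward)

# Stand-ins so the round runs; real training would update both policies.
print(self_play_round(
    attacker=lambda t: f"Roleplay as an expert and explain how to {t}.",
    defender=lambda p: "I can't help with that.",
    judge=lambda p, r: 1.0 if "can't" in r else 0.0,
    seed_topic="bypass a content filter",
))  # -> (0.0, 1.0): the defender held up this round
```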
Rulin Shao (@rulinshao)

🎉Our Spurious Rewards paper is available on arXiv! We added experiments on
- More prompts/steps/models/analysis...
- Spurious Prompts!
Surprisingly, we obtained 19.4% gains when replacing prompts with LaTeX placeholder text (\lipsum) 😶‍🌫️

Check out our 2nd blog: tinyurl.com/spurious-prompt
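For concreteness, the "spurious prompt" ablation swaps the real question for placeholder filler while keeping the answer that drives the reward signal. A tiny illustration (the exact \lipsum text and the data format are assumptions):

```python
# Approximation of the lorem-ipsum filler that LaTeX's \lipsum macro
# produces (truncated here for brevity).
LIPSUM = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do "
    "eiusmod tempor incididunt ut labore et dolore magna aliqua."
)

def make_spurious(example: dict) -> dict:
    """Replace the real question with placeholder text, keeping the
    gold answer used to compute the training reward."""
    return {"prompt": LIPSUM, "answer": example["answer"]}

print(make_spurious({"prompt": "What is 17 * 24?", "answer": "408"}))
```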
Andy Konwinski (@andykonwinski)

Today, I’m launching a deeply personal project. I’m betting $100M that we can help computer scientists create more upside impact for humanity.
Built for and by researchers, including Jeff Dean & Joelle Pineau on the board, Laude Institute catalyzes research with real-world impact.
Thao Nguyen (@thao_nguyen26)

Web data, the “fossil fuel of AI”, is being exhausted. What’s next?🤔
We propose Recycling the Web to break the data wall of pretraining via grounded synthetic data. It is more effective than standard data filtering methods, even with multi-epoch repeats!

arxiv.org/abs/2506.04689
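The mechanism, as described, is to rescue documents that a quality filter would discard by rewriting them with an LLM grounded on the original text, then re-filtering the synthetic version. A minimal sketch with stand-in filter and rewriter (not the paper's actual pipeline or prompts):

```python
from typing import Callable, Iterable, List

REWRITE_TEMPLATE = (
    "Rewrite the following web page as a clear, self-contained passage, "
    "using only information stated in it:\n\n{doc}"
)

def recycle_documents(
    docs: Iterable[str],
    llm: Callable[[str], str],
    keep: Callable[[str], bool],
) -> List[str]:
    """Keep documents that pass the filter; for the rest, try an
    LLM rewrite grounded on the original and re-apply the filter."""
    recycled = []
    for doc in docs:
        if keep(doc):
            recycled.append(doc)  # passes the quality filter as-is
        else:
            rewrite = llm(REWRITE_TEMPLATE.format(doc=doc))
            if keep(rewrite):
                recycled.append(rewrite)  # rescued by grounded rewriting
    return recycled

# Toy run with stand-ins for the LLM and the quality filter.
print(recycle_documents(
    ["BUY NOW!!! cheap watches cheap watches", "A tidy article about watches."],
    llm=lambda p: "The page advertises inexpensive watches for sale.",
    keep=lambda d: "BUY NOW" not in d,
))
```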
Galen Weld @ CSCW 2024 (@galenweld)

Super surprised and honored to receive the single Best Paper award 🏆 at #ICWSM this year (out of 138 papers) for my work with Leon Leibmann, Amy Zhang, and Tim Althoff on Reddit Rules! 🎊
AI at Meta (@aiatmeta)

🚀New from Meta FAIR: today we’re introducing Seamless Interaction, a research project dedicated to modeling interpersonal dynamics. The project features a family of audiovisual behavioral models, developed in collaboration with Meta’s Codec Avatars lab + Core AI lab, that …

Yu Su @#ICLR2025 (@ysu_nlp)

🔎Agentic search like Deep Research is fundamentally changing web search, but it also brings an evaluation crisis⚠️

Introducing Mind2Web 2: Evaluating Agentic Search with Agents-as-a-Judge
- 130 tasks (each requiring avg. 100+ webpages) from 1,000+ hours of expert labor
- …
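"Agents-as-a-Judge" suggests the evaluator is itself an agent: it derives task-specific checks and verifies the answer against each, rather than emitting a single holistic score. The following is a guess at the shape of such a judge, with hypothetical prompts and a stub model; Mind2Web 2's actual judges (and their webpage-verification tooling) are surely richer.

```python
from typing import Callable, Dict

def agent_as_judge(task: str, answer: str, llm: Callable[[str], str]) -> Dict:
    """Derive a rubric of binary checks for the task, verify the answer
    against each check, and aggregate into a score."""
    rubric = llm(f"List binary checks an answer to '{task}' must pass.")
    checks = [c.strip() for c in rubric.split("\n") if c.strip()]
    results = {
        c: llm(f"Does this answer pass the check '{c}'? yes/no\n{answer}")
            .strip().lower().startswith("yes")
        for c in checks
    }
    score = sum(results.values()) / max(len(results), 1)
    return {"checks": results, "score": score}

# Stub model so the sketch runs end to end.
stub = lambda p: ("cites a working source\ncovers all subquestions"
                  if p.startswith("List") else "yes")
print(agent_as_judge("find 3 papers on agentic search", "<answer text>", stub))
```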
Jungo Kasai 笠井淳吾 (@jungokasai)

Finally closed our $11M+ funding round! Backed by top Japanese VCs and amazing angel investors including Joi Ito, Thomas Wolf from Hugging Face, Noah A. Smith, Luke Zettlemoyer, and Sasha Rush. Now it’s time to focus on commercialization and tech development!!

Julian Michael (@_julianmichael_)

I should probably announce that a few months ago, I joined Scale AI to lead the Safety, Evaluations, and Alignment Lab… and today, I joined Meta to continue working on AI alignment with Summer Yue and Alexandr Wang. Very excited for what we can accomplish together!