Brandon Trabucco @ ICLR (@brandontrabucco)'s Twitter Profile
Brandon Trabucco @ ICLR

@brandontrabucco

AI/ML PhD Student at @mldcmu advised by @rsalakhu, Deep Learning, recipient of the @NDSEG Fellowship, musician soundcloud.com/brandontrabucco

ID: 2801069142

Link: https://btrabuc.co | Joined: 10-09-2014 04:42:30

289 Tweets

645 Followers

298 Following

Brandon Trabucco @ ICLR (@brandontrabucco):

With the success of LLM agents like OpenAI Operator, we are entering a new scaling era, but how do we train these agent models? We present InSTA, the largest training environment for LLM agents, containing live web navigation tasks for 150k diverse websites in multiple

Sachin Goyal (@goyalsachin007):

Think your LLM loves endless pre-training? 🚨 Think again! Plot twist ahead! 🎢 While it (obviously) gives better base models, it might not necessarily give a better starting point for all the fancy post-training we do these days (instruction FT, multimodal FT, etc.). 👀

Bowen Wang (@bowenwangnlp):

🎮 Computer Use Agent Arena is LIVE! 🚀 🔥 Easiest way to test computer-use agents in the wild without any setup 🌟 Compare top VLMs: OpenAI Operator, Claude 3.7, Gemini 2.5 Pro, Qwen 2.5 VL, and more 🕹️ Test agents on 100+ real apps & websites with one-click config 🔒 Safe & free

Agentica Project (@agentica_):

Introducing DeepCoder-14B-Preview - our fully open-sourced reasoning model reaching o1 and o3-mini level on coding and math. The best part is, we're releasing everything: not just the model, but the dataset, code, and training recipe—so you can train it yourself! 🔥 Links below:

Brandon Trabucco @ ICLR (@brandontrabucco):

๐ŸŒ Building web-scale agents, and tired of Math and Coding tasks? Come chat with us at ICLR in Singapore. We are presenting InSTA at the DATA-FM workshop in the second Oral session, April 28th 2:30pm. InSTA is the largest environment for training agents, spanning 150k live

Christina Baek (@_christinabaek):

When we train models to do QA, are we robustly improving context dependency? No! In our ICLR Oral (Fri 11 AM), we show that if the base model knows the facts already, it shortcuts and learns to ignore the context completely! Visit us to learn more about knowledge conflicts 😀

Pratyush Maini (@pratyushmaini):

Join me & @hbxnov at #ICLR2025 for our very purple poster on risks of LLM evals by private companies! 🕒 Today, 10am | 🪧 #219 Beyond Llama drama, LMSYS incorporation & ARC-AGI train/test fiasco, we discuss irreducible biases—even when firms act in good faith. Come say hi! 💜

Brandon Trabucco @ ICLR (@brandontrabucco):

Building LLM Agents? Come to my talk at the #ICLR DATA-FM workshop today at 2:30pm, Hall 4, Section 4. I'll be presenting InSTA, our work building the largest environment for agents on the live internet. arxiv.org/abs/2502.06776 #Agents #LLM

Tianhao Wang ("Jiachen") @ICLR (@jiachenwang97):

It was challenging to organize the workshop as the sole in-person organizer, and I'm deeply grateful to everyone for their incredible support in making it a great success. Danqi Chen Peter Henderson Kyle Lo Vahab Mirrokni Bryan Kian Hsiang Low Xinran Gu Brandon Trabucco Zheng Xu, Edward Yeo,

MIT Media Lab (@medialab):

30+ years of Media Lab students, alumni, and postdocs at CHI 2025 in Yokohama! Photo courtesy of Professor Pattie Maes. #chi2025

Stefano Ermon (@stefanoermon):

They're here. 🔥 Inception's diffusion LLMs — lightning fast, state-of-the-art, and now public. Go build the future → platform.inceptionlabs.ai #GenAI #dLLMs #diffusion

Xin Eric Wang @ ICLR 2025 (@xwang_lk):

๐˜๐˜ถ๐˜ฎ๐˜ข๐˜ฏ๐˜ด ๐˜ต๐˜ฉ๐˜ช๐˜ฏ๐˜ฌ ๐˜ง๐˜ญ๐˜ถ๐˜ช๐˜ฅ๐˜ญ๐˜บโ€”๐˜ฏ๐˜ข๐˜ท๐˜ช๐˜จ๐˜ข๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ข๐˜ฃ๐˜ด๐˜ต๐˜ณ๐˜ข๐˜ค๐˜ต ๐˜ค๐˜ฐ๐˜ฏ๐˜ค๐˜ฆ๐˜ฑ๐˜ต๐˜ด ๐˜ฆ๐˜ง๐˜ง๐˜ฐ๐˜ณ๐˜ต๐˜ญ๐˜ฆ๐˜ด๐˜ด๐˜ญ๐˜บ, ๐˜ง๐˜ณ๐˜ฆ๐˜ฆ ๐˜ง๐˜ณ๐˜ฐ๐˜ฎ ๐˜ณ๐˜ช๐˜จ๐˜ช๐˜ฅ ๐˜ญ๐˜ช๐˜ฏ๐˜จ๐˜ถ๐˜ช๐˜ด๐˜ต๐˜ช๐˜ค ๐˜ฃ๐˜ฐ๐˜ถ๐˜ฏ๐˜ฅ๐˜ข๐˜ณ๐˜ช๐˜ฆ๐˜ด. But current reasoning models remain constrained by discrete tokens, limiting their full

๐˜๐˜ถ๐˜ฎ๐˜ข๐˜ฏ๐˜ด ๐˜ต๐˜ฉ๐˜ช๐˜ฏ๐˜ฌ ๐˜ง๐˜ญ๐˜ถ๐˜ช๐˜ฅ๐˜ญ๐˜บโ€”๐˜ฏ๐˜ข๐˜ท๐˜ช๐˜จ๐˜ข๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ข๐˜ฃ๐˜ด๐˜ต๐˜ณ๐˜ข๐˜ค๐˜ต ๐˜ค๐˜ฐ๐˜ฏ๐˜ค๐˜ฆ๐˜ฑ๐˜ต๐˜ด ๐˜ฆ๐˜ง๐˜ง๐˜ฐ๐˜ณ๐˜ต๐˜ญ๐˜ฆ๐˜ด๐˜ด๐˜ญ๐˜บ, ๐˜ง๐˜ณ๐˜ฆ๐˜ฆ ๐˜ง๐˜ณ๐˜ฐ๐˜ฎ ๐˜ณ๐˜ช๐˜จ๐˜ช๐˜ฅ ๐˜ญ๐˜ช๐˜ฏ๐˜จ๐˜ถ๐˜ช๐˜ด๐˜ต๐˜ช๐˜ค ๐˜ฃ๐˜ฐ๐˜ถ๐˜ฏ๐˜ฅ๐˜ข๐˜ณ๐˜ช๐˜ฆ๐˜ด. But current reasoning models remain constrained by discrete tokens, limiting their full
David Bau (@davidbau):

Dear MAGA friends, I have been worrying about STEM in the US a lot, because right now the Senate is writing new laws that cut 75% of the STEM budget in the US. Sorry for the long post, but the issue is really important, and I want to share what I know about it. The entire

Jason Weston (@jaseweston):

🚨 Self-Challenging Language Model Agents 🚨 📝: arxiv.org/abs/2506.01716 A new paradigm to train LLM agents to use different tools with challenging self-generated data ONLY: Self-challenging agents (SCA) both propose new tasks and solve them, using self-generated verifiers to

Chhavi Yadav (@chhaviyadav_):

Upon graduation, I paused to reflect on what my PhD had truly taught me. Was it just how to write papers, respond to brutal reviewer comments, and survive without much sleep? Or did it leave a deeper imprint on me—beyond the metrics and milestones? Turns out, it did. A

Gokul Swamy (@g_k_swamy):

Say ahoy to SAILOR ⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! SAILOR ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x data!