Brian Bartoldson (@bartoldson)'s Twitter Profile
Brian Bartoldson

@bartoldson

ML researcher

ID: 783700916742524929

Link: http://brianbartoldson.wordpress.com
Joined: 05-10-2016 16:10:33

225 Tweets

290 Followers

461 Following

Aleksander Madry (@aleks_madry)

GSM8K has been a cornerstone benchmark for LLMs, but performance seemed stuck around 95%. Why?

Turns out, the benchmark itself was noisy. We fixed that, and found that it significantly affects evals.

Introducing GSM8K-Platinum!

w/ Eddie Vendrow, Josh Vendrow, Sara Beery
𝚐𝔪𝟾𝚡𝚡𝟾 (@gm8xx8)

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

TBA is a scalable RL system for LLM post-training that uses off-policy data and replay buffers with Trajectory Balance. It decouples training from search, improving speed
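For intuition, here is a minimal sketch in Python of the two pieces that description names: a replay buffer filled asynchronously by searcher processes and drained off-policy by the trainer, plus a Trajectory Balance objective. The names (ReplayBuffer, tb_loss) and the exact loss arguments are illustrative assumptions, not the paper's actual code.

    import random
    import torch

    class ReplayBuffer:
        """Searcher processes add trajectories asynchronously; the trainer
        samples from the buffer off-policy, so exploration (generation)
        and learning (gradient steps) no longer wait on each other."""
        def __init__(self, capacity=10_000):
            self.items, self.capacity = [], capacity

        def add(self, item):  # item: (prompt, response, log_reward)
            self.items.append(item)
            self.items = self.items[-self.capacity:]

        def sample(self, k):
            return random.sample(self.items, min(k, len(self.items)))

    def tb_loss(log_z, policy_logprob, log_reward):
        """Trajectory Balance: (log Z + log P_F(response) - log R)^2, where
        log Z is a learned scalar and policy_logprob is the current policy's
        log-probability of the replayed response (torch tensors assumed)."""
        return ((log_z + policy_logprob - log_reward) ** 2).mean()

Because Trajectory Balance only needs the current policy's log-probability of a stored response, stale samples from the buffer remain usable, which is what lets training proceed off-policy while search runs elsewhere.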
Bhavya Kailkhura (@bkailkhu)

At Lawrence Livermore National Laboratory, we are using AI to:
⚛️ Solve nuclear fusion
🧪 Discover critical materials
🧠 Red-team vulnerabilities

All to push science forward and protect national security 🌎

Post-training LLMs at scale can unlock these advances. But even with El Capitan—the world’s
Yangjun Ruan (@yangjunr)

New paper on synthetic pretraining!

We show LMs can synthesize their own thoughts for more data-efficient pretraining, bootstrapping their capabilities on limited, task-agnostic data. We call this new paradigm “reasoning to learn”.
arxiv.org/abs/2503.18866

Here’s how it works🧵
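As a rough sketch of what "synthesizing thoughts" could look like in practice (assuming a generic generate() text-completion call; the prompt wording and helper names here are illustrative, not the paper's):

    def synthesize_thought(model, chunk):
        # Ask the model to articulate the latent reasoning behind a raw
        # text chunk before that chunk is used for pretraining.
        prompt = f"Think step by step about what this text says and why:\n{chunk}"
        return model.generate(prompt)

    def augment_corpus(model, chunks):
        # Each example becomes (synthetic thought + original chunk), so a
        # pass over the limited, task-agnostic corpus also trains on the
        # reasoning that explains it, bootstrapping data efficiency.
        return [synthesize_thought(model, c) + "\n" + c for c in chunks]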
fly51fly (@fly51fly)

[LG] Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
B R. Bartoldson, S Venkatraman, J Diffenderfer, M Jain... [Lawrence Livermore National Laboratory & Mila] (2025)
arxiv.org/abs/2503.18929
Cihang Xie (@cihangxie)

🚨 Interested in adopting Large Reasoning Models (LRMs) but concerned about safety risks? 🚨

Meet STAR-1 🌟 – A compact, high-quality safety dataset (just 1K samples!) boosting LRMs' safety by 40% with only a minimal (~1.1%) reasoning drop! 🚀

How we built STAR-1 in just 3
Cihang Xie (@cihangxie)

🚨Concerned about visual jailbreaking attacks holding back Vision-Language Model (VLM) deployment?

🌟 Excited to announce our latest research: Double Visual Defense!

TL;DR: We introduce ΔCLIP and Δ²LLaVA — the first to reconcile robust adversarial performance with
Brian Bartoldson (@bartoldson)

🚀 The code for LLM post-training with TBA is now available! Try out Trajectory Balance with Asynchrony via github.com/bbartoldson/TBA. x.com/bartoldson/sta…

Johan S. Obando 👍🏽 (@johanobandoc)

🥳Come chat with Brian Bartoldson and Moksh Jain at our TBA poster at the #ICLR25 workshop on Open Science for Foundation Models (SCI-FM). The workshop will be held in EXPO Hall 4 #5 on Monday, April 28th.
EleutherAI (@aieleuther)

Can you train a performant language model without using unlicensed text?

We are thrilled to announce the Common Pile v0.1, an 8TB dataset of openly licensed and public domain text. We train 7B models for 1T and 2T tokens and match the performance of similar models like LLaMA 1&2
Brian Bartoldson (@bartoldson)

Here's a free/gift link to the Washington Post article about training LLMs on openly licensed text: wapo.st/3T94IfQ. x.com/AiEleuther/sta…

Infini-AI-Lab (@infiniailab)

🚀 Excited to introduce our latest work GRESO: a method that identifies and skips millions of uninformative prompts before rollout and achieves up to 2.0x wall-clock time speedup in training.

More rollouts lead to better model performance, but they’re also a major bottleneck in
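One way to picture the pre-rollout filter described above (a hedged sketch; the reward-variance heuristic and names are assumptions for illustration, not GRESO's exact selection rule):

    from collections import defaultdict
    import statistics

    reward_history = defaultdict(list)  # prompt_id -> rewards from past rollouts

    def is_informative(prompt_id, window=8, eps=1e-6):
        # If recent rollouts of a prompt all earned (near-)identical rewards,
        # a group-relative advantage is ~0 and the rollout teaches nothing.
        recent = reward_history[prompt_id][-window:]
        if len(recent) < window:
            return True  # too little evidence yet; keep this prompt
        return statistics.pvariance(recent) > eps

    def select_prompts(prompt_ids):
        # Filtering happens before generation, the expensive step, which is
        # where the wall-clock savings would come from.
        return [p for p in prompt_ids if is_informative(p)]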