Ben Athiwaratkun @ ICLR (@ben_athi)

Leading Turbo Team @ Together AI. Prev: @awscloud, @MSFTResearch; @Cornell PhD.

ID: 2613894511

Link: http://benathi.github.io · Joined: 09-07-2014 16:57:43

360 Tweets

820 Followers

694 Following

Linda He (@lindahe49140661)

Excited to share our work on scaling LLMs to handle million-token contexts! Training models for ultra-long sequences is challenging due to data scarcity. We introduce a novel hierarchical synthetic data generation pipeline to overcome this. Thrilled this will be presented at ICLR

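For intuition only: one way such a hierarchical pipeline could work is to summarize documents level by level and then synthesize a question against the root summary, so that answering it requires the whole context. The `ask_llm` helper below is a hypothetical placeholder, not the paper's code:

```python
# Hypothetical sketch of hierarchical synthetic long-context data generation.
# `ask_llm` stands in for any chat-completion call; not the paper's pipeline.
from typing import List

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred LLM API here")

def build_long_context_example(docs: List[str], fanout: int = 4) -> dict:
    # Level 0: per-document summaries.
    level = [ask_llm(f"Summarize:\n{d}") for d in docs]
    # Higher levels: summarize groups of summaries until one root remains.
    while len(level) > 1:
        level = [ask_llm("Summarize together:\n" + "\n".join(level[i:i + fanout]))
                 for i in range(0, len(level), fanout)]
    root = level[0]
    # A question about the root forces attention across the full context.
    question = ask_llm(f"Write a question whose answer requires all of:\n{root}")
    return {"context": "\n\n".join(docs), "question": question}
```
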
Junlin Wang (@junlinwang3)

Excited to share work from my Together AI internship—a deep dive into inference‑time scaling methods 🧠

We rigorously evaluated verifier‑free inference-time scaling methods across both reasoning and non‑reasoning LLMs. Some key findings:

🔑 Even with huge rollout budgets,
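
A classic verifier-free method in this space is self-consistency: spend the rollout budget on independent samples and majority-vote the final answer. A minimal sketch, with `sample_answer` as a placeholder for one stochastic model call:

```python
# Minimal self-consistency sketch: a classic verifier-free scaling method.
# `sample_answer` is a placeholder for one stochastic rollout of your model.
from collections import Counter

def sample_answer(question: str) -> str:
    raise NotImplementedError("call your LLM with temperature > 0 here")

def self_consistency(question: str, budget: int = 16) -> str:
    # Draw `budget` independent rollouts and return the most common answer.
    answers = [sample_answer(question) for _ in range(budget)]
    return Counter(answers).most_common(1)[0][0]
```
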
Ben Athiwaratkun @ ICLR (@ben_athi)

If you're at ICLR and passionate about optimizing language models for speed and efficiency, swing by the Together AI booth for a chat.

Together AI (@togethercompute)

🔔 New blog post on how we can attain large speedups for our inference customers using custom speculators! 🚀

Key benefits of customization:
✅ ~1.3x faster inference
✅ ~25% cost reduction
✅ Gets better as you generate more responses
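
For background: a speculator is a small draft model whose proposed tokens the large model verifies in a single forward pass, so the speedup grows with how often the draft agrees with the target. A toy greedy-verification step, with both model calls stubbed out (placeholders, not Together's actual API):

```python
# Toy speculative decoding step with greedy verification.
from typing import List

def draft_next(ctx: List[int]) -> int:
    raise NotImplementedError("small draft model: next-token prediction")

def target_argmax_batch(ctx: List[int], proposal: List[int]) -> List[int]:
    raise NotImplementedError("big model: score all proposed positions at once")

def speculative_step(ctx: List[int], k: int = 4) -> List[int]:
    # 1) Draft model proposes k tokens autoregressively (cheap).
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft_next(ctx + proposal))
    # 2) Target model checks all k positions in ONE forward pass.
    target = target_argmax_batch(ctx, proposal)
    # 3) Keep the agreed prefix; on the first mismatch, take the target's
    #    token and stop. More agreement => more tokens per expensive pass.
    out = list(ctx)
    for p, t in zip(proposal, target):
        out.append(t)
        if p != t:
            break
    return out
```
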
Together AI (@togethercompute)

🚀 New research: YAQA — Yet Another Quantization Algorithm (yes, pronounced like yaca/jackfruit 🥭)

Led by Albert Tseng, YAQA minimizes the KL divergence to the original model during quantization, cutting it by >30% vs. prior methods and outperforming even QAT on Gemma 3.

👇
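
The objective is simple to state: rather than minimizing per-layer weight error, choose quantized weights that keep the model's output distribution close to the original's. A numpy sketch of the per-token KL term in question (illustrative only, not YAQA's algorithm):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_to_original(orig_logits: np.ndarray, quant_logits: np.ndarray) -> float:
    # KL(P_orig || P_quant), averaged over tokens: the quantity a
    # YAQA-style objective drives down during quantization.
    p, q = softmax(orig_logits), softmax(quant_logits)
    eps = 1e-12  # guard against log(0) under float underflow
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))
```
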
Vipul Ved Prakash (@vipulved)

Together AI is building 2 gigawatts of AI factories (~100,000 GPUs) in the EU over the next 4 years, with the first phase live in H2 2025. AI compute is at <1% saturation relative to our 2035 forecast, and we are starting early to build a large-scale, sustainable AI cloud.

Hassan (@nutlope)

Our open deep research app is launching in 24 hours! Generate reports about any topic using OSS LLMs. 100% free & open source.

Jon Saad-Falcon (@jonsaadfalcon)

How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? 
🧵 Introducing Weaver: a framework that combines multiple weak verifiers (reward models + LM judges) to achieve o3-mini-level accuracy with much cheaper non-reasoning
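
The gist, sketched below with hypothetical hand-set weights rather than Weaver's learned combination: score each candidate answer with several imperfect verifiers and pick the candidate with the best weighted score.

```python
# Minimal weak-verifier aggregation sketch (weights and verifiers are
# placeholders, not Weaver's actual learned combination).
from typing import Callable, List, Sequence

Verifier = Callable[[str, str], float]  # (question, answer) -> score in [0, 1]

def select_answer(question: str,
                  candidates: Sequence[str],
                  verifiers: List[Verifier],
                  weights: List[float]) -> str:
    # Combine the weak signals linearly and return the top-scoring candidate.
    def combined(ans: str) -> float:
        return sum(w * v(question, ans) for v, w in zip(verifiers, weights))
    return max(candidates, key=combined)
```
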
Together AI (@togethercompute)

🔓⚡ FLUX.1 Kontext [dev] just landed on Together AI

First open-weight model w/ proprietary-level image editing:

🎨 Perfect character consistency
🏆 Beats Gemini Flash + competitors
🛠️ Full model weights for customization

Enterprise-quality editing, open weights💎
Together AI (@togethercompute)

Announcing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models.

Built in
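
For reference, Pass@1 is the k=1 case of the standard unbiased pass@k estimator: with n samples per task of which c are correct, pass@k = 1 - C(n-c, k) / C(n, k). In code:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator (from the HumanEval/Codex paper):
    # probability that at least one of k sampled attempts is correct,
    # given that c of n total attempts passed.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```
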
Ben Athiwaratkun @ ICLR (@ben_athi)

Come check out our poster on speeding up LLMs, happening now until 1:30 pm. TL;DR — we show that we can hide the latency of all-reduce operations in a tensor-parallel setting by modifying the residual architecture to overlap the MLP and attention.
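
The idea resembles a "parallel block": if attention and the MLP both read the same normalized residual input, their tensor-parallel all-reduces no longer serialize and can be overlapped with compute. A PyTorch-flavored illustration of such a block (an assumption about the general shape, not the poster's exact architecture):

```python
import torch
import torch.nn as nn

class ParallelBlock(nn.Module):
    # Attention and MLP both consume the same normalized input, so in a
    # tensor-parallel deployment their all-reduces can be overlapped.
    # Illustrative only; not the architecture from the poster.
    def __init__(self, d: int, n_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        return x + attn_out + self.mlp(h)  # two branches, one residual add
```
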

Ben Athiwaratkun @ ICLR (@ben_athi)

TL;DR - one way to push the quality-efficiency frontier: obtain high-quality generations via a collection of LLMs -> distill them into a smaller model -> get a higher-quality small model that is more inference-efficient than the original collection of models. Poster session
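
In sketch form, the data side of that recipe: pool generations from several teacher models, keep the best one per prompt under some quality score, and fine-tune the small model on the result. All names below are placeholders:

```python
# Placeholder sketch of ensemble-to-student distillation data prep.
from typing import Callable, List, Sequence

def build_distillation_set(prompts: Sequence[str],
                           teachers: List[Callable[[str], str]],
                           score: Callable[[str, str], float]) -> List[dict]:
    data = []
    for p in prompts:
        generations = [t(p) for t in teachers]          # one sample per teacher
        best = max(generations, key=lambda g: score(p, g))
        data.append({"prompt": p, "completion": best})  # SFT pair for the student
    return data
```
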

Together AI (@togethercompute)

🤖 OpenAI's open models are here. gpt-oss models just landed on Together AI. They achieve near-parity with o4-mini and were trained using o3 techniques. Build anything, deploy anywhere 🔥

Ben Athiwaratkun @ ICLR (@ben_athi)

Most speculative decoding research focuses on algorithms. But we know that data matters a ton! (E.g., no matter how good the spec algorithm is, if it's trained on bad, misaligned data, the speed will be poor.) What if we build on algorithms that make data really shine?! In
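
One concrete reading of "data matters": a speculator's speedup is governed by its token-level acceptance rate on the target model's real traffic, which is easy to probe. A hypothetical sketch with both model calls stubbed out:

```python
# Hypothetical acceptance-rate probe: how often the draft model's greedy
# token matches the target's on in-domain contexts. Model calls are stubs.
from typing import List, Sequence

def draft_greedy(ctx: List[int]) -> int:
    raise NotImplementedError("draft model: greedy next token")

def target_greedy(ctx: List[int]) -> int:
    raise NotImplementedError("target model: greedy next token")

def acceptance_rate(contexts: Sequence[List[int]]) -> float:
    # Higher agreement on representative data => more accepted tokens
    # per verification pass => larger end-to-end speedup.
    hits = sum(draft_greedy(c) == target_greedy(c) for c in contexts)
    return hits / len(contexts)
```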