Eric Frankel (@esfrankel)'s Twitter Profile
Eric Frankel

@esfrankel

f-divergence enthusiast | phd @uwcse, mlr @apple | prev. math + stats @stanford

ID: 1334248878158176258

Joined: 02-12-2020 21:32:00

122 Tweets

258 Followers

704 Following

Gavin Brown (@gavinrbrown1)'s Twitter Profile Photo

It doesn't matter whether MMD stands for "max mean discrepancy" or "mean max discrepancy," as the meany-max theorem implies they are equivalent.
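
For reference, the quantity behind the joke: MMD is usually expanded as "maximum mean discrepancy," a maximum over test functions of a difference of means, stated here for a function class F such as the unit ball of an RKHS (which is why the word order carries the structure):

```latex
\mathrm{MMD}(P, Q; \mathcal{F})
  = \sup_{f \in \mathcal{F}}
    \left( \mathbb{E}_{X \sim P}[f(X)] - \mathbb{E}_{Y \sim Q}[f(Y)] \right)
```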

Anshul Nasery (@anshulnasery)'s Twitter Profile Photo

Model merging is a great way to combine multiple models' abilities, however, existing methods only work with models fine-tuned from the same initialization, and produce models of the same size. Our new work - PLeaS (at #CVPR2025) aims to resolve both these issues 🧵.

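For context on what PLeaS generalizes, here is a minimal sketch of the same-initialization baseline it improves on: plain parameter averaging. This is illustrative only; PLeaS itself adds permutation alignment between differently initialized models and control over the merged model's size, neither of which this sketch does.

```python
import torch

def average_merge(state_dicts, weights=None):
    """Baseline merge: a weighted elementwise average of parameters.

    Only sensible when every model shares one architecture and was
    fine-tuned from the same initialization -- exactly the
    restriction PLeaS is designed to lift.
    """
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    return {
        name: sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }
```
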
Zhiyuan Zeng (@zhiyuanzeng_)'s Twitter Profile Photo

Is a single accuracy number all we can get from model evals?🤔
🚨Does NOT tell where the model fails
🚨Does NOT tell how to improve it
Introducing EvalTree🌳
🔍identifying LM weaknesses in natural language
🚀weaknesses serve as actionable guidance
(paper&demo 🔗in🧵) [1/n]
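
One way to picture the output (an illustrative aggregation only; EvalTree builds the category hierarchy automatically from natural language, which this toy does not):

```python
from collections import defaultdict

def weakness_report(results, min_count=5, top_k=5):
    """Roll per-example eval results up a category tree and surface
    the weakest subtrees. `results` holds (category_path, correct)
    pairs, e.g. (("math", "algebra", "word problems"), False); the
    hypothetical paths stand in for EvalTree's learned hierarchy.
    """
    stats = defaultdict(lambda: [0, 0])  # path prefix -> [num correct, total]
    for path, correct in results:
        for depth in range(1, len(path) + 1):
            node = stats[path[:depth]]
            node[0] += int(correct)
            node[1] += 1
    # Lowest-accuracy nodes with enough support = candidate weaknesses.
    scored = [(c / t, " > ".join(p)) for p, (c, t) in stats.items() if t >= min_count]
    return sorted(scored)[:top_k]
```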

Teknium (e/λ) (@teknium1)'s Twitter Profile Photo

Mistral AI just released a new version of their 24B model - this time it is multimodal and has 128K context - exactly what we wanted! This enables reasoning models to be fully exploited on both long reasoning and vision tasks. They also gave DeepHermes a shoutout!

Joel Jang (@jang_yoel)'s Twitter Profile Photo

Excited to release GR00T N1! While this robot foundation model already stands out as the first open-source foundation model for humanoids and for its utilization of 540k simulation trajectories during pretraining, I want to highlight two other key innovations that truly set it …

Alisa Liu (@alisawuffles)'s Twitter Profile Photo

We created SuperBPE🚀, a *superword* tokenizer that includes tokens spanning multiple words. When pretraining at 8B scale, SuperBPE models consistently outperform the BPE baseline on 30 downstream tasks (+8% MMLU), while also being 27% more efficient at inference time.🧵

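The core change is easy to state: standard BPE pretokenizes on whitespace, so no merge can ever cross a word boundary, while a superword tokenizer drops that restriction. A toy sketch of that one difference (not the paper's actual algorithm or training curriculum):

```python
from collections import Counter

def toy_bpe(text, num_merges, allow_superwords=False):
    """Toy BPE: repeatedly merge the most frequent adjacent pair.

    With allow_superwords=False, any pair touching a space is
    skipped, mimicking whitespace pretokenization; with True,
    spaces can be absorbed, yielding multi-word "superword" tokens.
    """
    seq = list(text)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for a, b in zip(seq, seq[1:]):
            if not allow_superwords and (" " in a or " " in b):
                continue  # forbid merges across word boundaries
            pairs[(a, b)] += 1
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges.append(a + b)
        out, i = [], 0
        while i < len(seq):  # apply the new merge greedily, left to right
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == (a, b):
                out.append(a + b)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return merges
```
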
Sewoong Oh (@sewoong79)'s Twitter Profile Photo

We are releasing OpenDeepSearch (ODS), an open-source search agent that works with any LLM. When paired with DeepSeek-R1, ODS outperforms OpenAI’s specialized model for web search, GPT-4o-Search, on the challenging, multi-hop FRAMES benchmark from DeepMind (+9.7% accuracy).

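The "works with any LLM" claim is about the agent scaffolding rather than the model. A generic sketch of such a search loop, with all prompts and function names assumed for illustration (this is not the OpenDeepSearch API):

```python
def search_agent(question, llm, search, max_steps=5):
    """Minimal model-agnostic search agent: at each step the LLM
    either issues a search query or commits to an answer.
    `llm` maps a prompt string to a reply; `search` maps a query
    to result snippets. Both are assumed interfaces.
    """
    notes = []
    for _ in range(max_steps):
        prompt = (
            f"Question: {question}\n"
            f"Notes so far: {notes}\n"
            "Reply with 'SEARCH: <query>' or 'ANSWER: <answer>'."
        )
        reply = llm(prompt).strip()
        if reply.startswith("ANSWER:"):
            return reply.removeprefix("ANSWER:").strip()
        if reply.startswith("SEARCH:"):
            notes.append(search(reply.removeprefix("SEARCH:").strip()))
    return llm(f"Question: {question}\nNotes: {notes}\nGive your best answer.")
```
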
Etash Guha @ ICLR (@etash_guha)'s Twitter Profile Photo

Turns out, it’s possible to outperform DeepSeekR1-32B with only SFT on open data and no RL: Announcing OpenThinker2-32B and OpenThinker2-7B. We also release the data, OpenThoughts2-1M, curated by selecting quality instructions from diverse sources. 🧵 (1/n)

Gonçalo Faria (@goncalorafaria)'s Twitter Profile Photo

Introducing 𝗤𝗔𝗹𝗶𝗴𝗻🚀, a 𝘁𝗲𝘀𝘁-𝘁𝗶𝗺𝗲 𝗮𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁 𝗺𝗲𝘁𝗵𝗼𝗱 that improves language model performance using Markov chain Monte Carlo. With no model retraining, 𝗤𝗔𝗹𝗶𝗴𝗻 outperforms DPO-tuned models even when allowed to match inference compute, and achieves …

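To make the MCMC idea concrete, here is one classical way to do test-time alignment with a Markov chain: an independence Metropolis sampler whose stationary distribution tilts the base model toward high reward. The chain, proposal, and names below are assumptions for illustration, not QAlign's actual algorithm.

```python
import math
import random

def mcmc_align(sample_fn, reward_fn, beta=1.0, steps=50):
    """Independence Metropolis over whole completions, targeting
    p(y) proportional to p0(y) * exp(r(y) / beta). Proposals are
    fresh samples from the base model p0, so the p0 terms cancel
    and acceptance depends only on the reward gap. Sketch only.
    """
    y = sample_fn()          # initial completion from the base model
    r = reward_fn(y)
    for _ in range(steps):
        y_new = sample_fn()
        r_new = reward_fn(y_new)
        # Accept with probability min(1, exp((r_new - r) / beta)).
        accept_logprob = min(0.0, (r_new - r) / beta)
        if random.random() < math.exp(accept_logprob):
            y, r = y_new, r_new
    return y
```
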
Ai2 (@allen_ai)'s Twitter Profile Photo

Ever wonder how LLM developers choose their pretraining data? It’s not guesswork: all AI labs create small-scale models as experiments, but the models and their data are rarely shared. DataDecide opens up the process: 1,050 models, 30k checkpoints, 25 datasets & 10 benchmarks 🧵

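The decision procedure being opened up is, at heart, a ranking experiment: train small proxy models on each candidate corpus and extrapolate. A sketch of that rule, with all function names assumed (DataDecide studies how reliably such small-scale rankings transfer to larger scales):

```python
def pick_pretraining_corpus(candidates, train_small, evaluate):
    """Rank candidate pretraining corpora by small-scale proxies:
    train a small model on each corpus, score it on benchmarks,
    and pick the best. `train_small` and `evaluate` are assumed
    helpers; returns the winning corpus name plus all scores.
    """
    scores = {
        name: evaluate(train_small(corpus))  # e.g. mean benchmark accuracy
        for name, corpus in candidates.items()
    }
    return max(scores, key=scores.get), scores
```
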
Ian Magnusson (@ianmagnusson)'s Twitter Profile Photo

🔭 Science relies on shared artifacts collected for the common good.
🛰 So we asked: what's missing in open language modeling?
🪐 DataDecide 🌌 charts the cosmos of pretraining—across scales and corpora—at a resolution beyond any public suite of models that has come before.

Rohan Baijal (@rohanbaijal)'s Twitter Profile Photo

Long Range Navigator (LRN) 🧭— an approach to extend planning horizons for off-road navigation given no prior maps. Using vision, LRN makes longer-range decisions by spotting navigation frontiers far beyond the range of metric maps. personalrobotics.github.io/lrn/

Kunal Jha (@kjha02)'s Twitter Profile Photo

Our new paper (first one of my PhD!) on cooperative AI reveals a surprising insight: Environment Diversity > Partner Diversity.

Agents trained in self-play across many environments learn cooperative norms that transfer to humans on novel tasks.

shorturl.at/fqsNN 🧵
Rui Xin (@rui_xin31)'s Twitter Profile Photo

Think PII scrubbing ensures privacy? 🤔Think again‼️ In our paper, for the first time on unstructured text, we show that you can re-identify over 70% of private information *after* scrubbing! It’s time to move beyond surface-level anonymization. #Privacy #NLProc 🔗🧵

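A toy illustration of why surface-level scrubbing falls short (the patterns, names, and example below are invented; the paper measures this leakage at scale on real unstructured text):

```python
import re

def scrub(text):
    """Surface-level PII scrubbing: mask emails and two-word
    capitalized names. Deliberately naive, for illustration."""
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)
    text = re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", "[NAME]", text)
    return text

note = "Jane Miller (jane@ex.com) is a glassblower in Tacoma who won the 2019 state fair."
print(scrub(note))
# -> "[NAME] ([EMAIL]) is a glassblower in Tacoma who won the 2019 state fair."
# The direct identifiers are masked, but the surviving quasi-identifiers
# (rare occupation + city + specific award) can still single out one
# person -- the contextual leakage that defeats surface anonymization.
```
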
Tong Chen @ ICLR (@tomchen0)'s Twitter Profile Photo

LLMs naturally memorize some of their pre-training data verbatim. We study whether post-training can be an effective way to mitigate unintentional reproduction of pre-training data.
🛠️ No changes to pre-training or decoding
🔥 Training models to latently distinguish between memorized …
Siting Li (@sitingli627)'s Twitter Profile Photo

Excited to share that our paper "Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder" is accepted to #ACL2025! Preprint: arxiv.org/pdf/2411.05195 Thanks so much to Simon Shaolei Du and Pang Wei Koh for your support and guidance throughout the journey!

Stella Li (@stellalisy)'s Twitter Profile Photo

🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…
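
For concreteness, the three reward variants the thread compares, written as a generic verifier-style reward (a sketch of the setup as described, with an assumed answer parser; not the authors' code):

```python
import random

def extract_answer(response: str) -> str:
    """Assumed helper: naively take the final token as the answer."""
    return response.strip().split()[-1]

def rlvr_reward(response, gold_answer, variant="ground_truth"):
    """Verifier-style reward for RLVR in the three variants compared:
    ground-truth, deliberately incorrect, and fully random."""
    correct = extract_answer(response) == gold_answer
    if variant == "ground_truth":
        return 1.0 if correct else 0.0        # standard RLVR signal
    if variant == "incorrect":
        return 0.0 if correct else 1.0        # rewards wrong answers
    if variant == "random":
        return float(random.random() < 0.5)   # ignores the response
    raise ValueError(f"unknown variant: {variant}")
```
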
Yizhong Wang (@yizhongwyz)'s Twitter Profile Photo

Thrilled to announce that I will be joining UT Austin Computer Science as an assistant professor in fall 2026!

I will continue working on language models, data challenges, learning paradigms, & AI for innovation. Looking forward to teaming up with new students & colleagues! 🤠🤘