Tim Vieira (@xtimv)'s Twitter Profile
Tim Vieira

@xtimv

machine learning, reinforcement learning, programming languages, handstands (he/him)

ID: 47864778

Link: http://timvieira.github.io/blog · Joined: 17-06-2009 05:15:39

2.2K Tweets

3.3K Followers

999 Following

Alisa Liu (@alisawuffles)'s Twitter Profile Photo

What do BPE tokenizers reveal about their training data?🧐 We develop an attack🗡️ that uncovers the training data mixtures📊 of commercial LLM tokenizers (incl. GPT-4o), using their ordered merge lists! Co-1⃣st Jonathan Hayase arxiv.org/abs/2407.16607 🧵⬇️
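The signal the attack exploits can be illustrated with a toy BPE trainer (a hedged sketch, not the paper's actual method; the corpora and counts below are invented): BPE greedily records merges in frequency order, so the ordered merge list reflects the training data mixture.

```python
# Toy illustration: BPE learns merges most-frequent-first, so the ordered
# merge list encodes which kind of text dominated training.
from collections import Counter

def train_bpe(corpus, num_merges):
    """Learn an ordered BPE merge list from a list of words."""
    words = [tuple(w) for w in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            for a, b in zip(w, w[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        merged = best[0] + best[1]
        new_words = []
        for w in words:
            out, i = [], 0
            while i < len(w):
                if i + 1 < len(w) and (w[i], w[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            new_words.append(tuple(out))
        words = new_words
    return merges

# Two invented "domains": English-heavy vs. code-heavy word frequencies.
english = ["the"] * 8 + ["code"] * 2
codeish = ["the"] * 2 + ["code"] * 8

# Different mixtures yield different merge orders:
print(train_bpe(english * 3 + codeish * 1, 3))  # English-dominated mixture
print(train_bpe(english * 1 + codeish * 3, 3))  # code-dominated mixture
```

Reading the merge list in order is what lets the attack run this logic in reverse: early merges reveal which data source contributed the most frequent pairs.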

Hanna Wallach (@hannawallach.bsky.social) (@hannawallach)'s Twitter Profile Photo

Super excited to announce that Microsoft Research's FATE group, Sociotechnical Alignment Center, and friends have several workshop papers at next week's NeurIPS Conference. A short thread about (some of) these papers below... #NeurIPS2024

Hanna Wallach (@hannawallach.bsky.social) (@hannawallach)'s Twitter Profile Photo

Super excited for the Evaluating Evaluations workshop at NeurIPS Conference today!!! evaleval.github.io #NeurIPS2024 Microsoft Research's FATE group, Sociotechnical Alignment Center, and friends will be presenting several papers there. See below for details...

Hanna Wallach (@hannawallach.bsky.social) (@hannawallach)'s Twitter Profile Photo

I'll be giving a short talk on "Evaluating GenAI Systems is a Social Science Measurement Challenge" (arxiv.org/abs/2411.10939) in the 2:30–3pm oral session.

Alex Lew (@alexanderklew)'s Twitter Profile Photo

Tim Vieira and I were just discussing this interesting comment in the DeepSeek paper introducing GRPO: a different way of setting up the KL loss. It's a little hard to reason about what this does to the objective. 1/
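For context, the comment concerns GRPO's KL term: it estimates KL(π_θ ‖ π_ref) inside the loss with r − log r − 1, where r = π_ref(x)/π_θ(x). A toy sketch (the distributions below are invented stand-ins, not anything from the paper) comparing it against the naive log-ratio estimator:

```python
# Both estimators are unbiased for KL(pi_theta || pi_ref) under samples from
# pi_theta; the GRPO-style "k3" term is nonnegative pointwise, while the
# naive log-ratio can go negative on individual samples.
import math, random

random.seed(0)
p = [0.6, 0.3, 0.1]   # stand-in for pi_theta over a 3-token vocabulary
q = [0.4, 0.4, 0.2]   # stand-in for pi_ref

true_kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

xs = random.choices(range(3), weights=p, k=100_000)
k1 = [math.log(p[x] / q[x]) for x in xs]                     # naive estimator
k3 = [q[x] / p[x] - math.log(q[x] / p[x]) - 1 for x in xs]   # GRPO-style term

mean = lambda v: sum(v) / len(v)
print(true_kl, mean(k1), mean(k3))
assert all(v >= 0 for v in k3) and any(v < 0 for v in k1)
```

Because r − log r − 1 ≥ 0 for all r, the penalty never pushes the loss in the "wrong" direction on a single sample, which is part of what makes its effect on the objective subtle to reason about when it sits inside the loss rather than the reward.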

Ben Lipkin (@ben_lipkin)'s Twitter Profile Photo

New preprint on controlled generation from LMs!
I'll be presenting at NENLP tomorrow, 12:50–2:00pm.
Longer thread coming soon :)

Ġabe Ġrand (@gabe_grand)'s Twitter Profile Photo

Tackling complex problems with LMs requires search/planning, but how should test-time compute be structured? Introducing Self-Steering, a new meta-reasoning framework where LMs coordinate their own inference procedures by writing code!

MIT CSAIL (@mit_csail)'s Twitter Profile Photo

A new technique from MIT can make AI-generated code adhere to whatever programming language or other format is being used, while remaining error-free: bit.ly/43U2Pua

João Loula (@joaoloula)'s Twitter Profile Photo

#ICLR2025 Oral
How can we control LMs using diverse signals such as static analyses, test cases, and simulations?
In our paper “Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo” we:
Cast controlled generation as an inference problem, with the LM
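A minimal sketch of the SMC idea (toy stand-ins throughout; `lm_step` and `potential` below are invented, not the paper's system): particles are extended token-by-token, reweighted by a constraint potential, and resampled so compute concentrates on constraint-satisfying continuations.

```python
# Toy sequential Monte Carlo for constrained generation: propose from a
# stand-in "LM", weight each particle by a Boolean constraint potential,
# resample, repeat.
import random

random.seed(1)
VOCAB = ["a", "b"]

def lm_step(prefix):
    """Stand-in LM next-token distribution (ignores the prefix here)."""
    return {"a": 0.4, "b": 0.6}

def potential(prefix):
    """Boolean constraint: forbid the substring 'bb'."""
    return 0.0 if "bb" in "".join(prefix) else 1.0

def smc(n_particles=200, length=6):
    particles = [[] for _ in range(n_particles)]
    for _ in range(length):
        # Propose: sample one next token per particle from the LM.
        proposals = []
        for p in particles:
            dist = lm_step(p)
            tok = random.choices(VOCAB, weights=[dist[v] for v in VOCAB])[0]
            proposals.append(p + [tok])
        # Reweight by the constraint, then resample with replacement.
        weights = [potential(p) for p in proposals]
        if sum(weights) == 0:
            return []  # every particle violated the constraint
        particles = random.choices(proposals, weights=weights, k=n_particles)
    return particles

samples = smc()
assert all("bb" not in "".join(p) for p in samples)
```

With hard 0/1 potentials this behaves like incremental rejection with recycling; the paper's framing also covers soft signals (static analyses, test outcomes) as non-Boolean weights.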

Ġabe Ġrand (@gabe_grand)'s Twitter Profile Photo

Excited to rep the team behind “Syntactic and Semantic Control of LLMs via Sequential Monte Carlo” at ICLR 2025 #ICLR2025!🎲🎛️
Stop by our poster #634 from 10:00am-12:30pm today to chat with co-authors João Loula, Ben LeBrun, Alex Lew, Tim Vieira, Ryan Cotterell & more!

Afra Amini (@afra_amini)'s Twitter Profile Photo

Current KL estimation practices in RLHF can generate high variance and even negative values! We propose a provably better estimator that only takes a few lines of code to implement.🧵👇
w/ Tim Vieira and Ryan Cotterell
paper: arxiv.org/pdf/2504.10637
code: github.com/rycolab/kl-rb
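As I read the abstract, the improvement is a Rao-Blackwellization: replace the sampled-token log-ratio with the exact per-step KL over the whole vocabulary at each sampled prefix. A hedged toy sketch (fixed stand-in distributions, so the variance reduction is exaggerated; in a real LM each step's distributions depend on the sampled prefix, so the Rao-Blackwellized sum is still random, just lower-variance):

```python
# Compare a naive per-token Monte Carlo KL estimate against summing the
# exact per-step KL at each sampled position ("Rao-Blackwellized").
import math, random

random.seed(0)
V = 3
def p_dist(t): return [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]][t % 2]  # stand-in policy
def q_dist(t): return [[0.4, 0.4, 0.2], [0.3, 0.4, 0.3]][t % 2]  # stand-in reference

T = 4
mc_ests, rb_ests = [], []
for _ in range(20_000):
    mc, rb = 0.0, 0.0
    for t in range(T):
        p, q = p_dist(t), q_dist(t)
        y = random.choices(range(V), weights=p)[0]
        mc += math.log(p[y] / q[y])                                # sampled-token term
        rb += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))  # exact step KL
    mc_ests.append(mc)
    rb_ests.append(rb)

mean = lambda v: sum(v) / len(v)
var = lambda v: sum((x - mean(v)) ** 2 for x in v) / len(v)
print(mean(mc_ests), mean(rb_ests), var(mc_ests), var(rb_ests))
```

Each exact step KL is nonnegative, so the Rao-Blackwellized estimate can never go negative, unlike the naive per-token sum.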

Ben Lipkin (@ben_lipkin)'s Twitter Profile Photo

Many LM applications may be formulated as targeting some (Boolean) constraint. Generate a…
- Python program that passes a test suite
- PDDL plan that satisfies a goal
- CoT trajectory that yields a positive reward
The list goes on… How can we efficiently satisfy these? 🧵👇
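The baseline these constrained-generation methods improve on is rejection sampling, whose expected cost is the inverse of the constraint's probability under the LM. A toy sketch (the "LM" and constraint here are invented stand-ins):

```python
# Rejection sampling for a Boolean constraint: draw from the LM, keep only
# samples that satisfy the check. Expected tries = 1 / P(constraint).
import random

random.seed(0)

def sample_lm(length=8):
    """Stand-in LM: i.i.d. uniform bits."""
    return [random.randint(0, 1) for _ in range(length)]

def constraint(x):
    """Boolean check: sequence must sum to exactly 4."""
    return sum(x) == 4

def rejection_sample(max_tries=10_000):
    for tries in range(1, max_tries + 1):
        x = sample_lm()
        if constraint(x):
            return x, tries
    raise RuntimeError("no accepted sample within the budget")

x, tries = rejection_sample()
assert constraint(x)
```

Here the acceptance rate is a comfortable C(8,4)/2⁸ ≈ 27%, but for a test suite or a PDDL goal it can be astronomically small, which is exactly the efficiency problem the thread is pointing at.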

Hanna Wallach (@hannawallach.bsky.social) (@hannawallach)'s Twitter Profile Photo

Alright, people, let's be honest: GenAI systems are everywhere, and figuring out whether they're any good is a total mess. Should we use them? Where? How? Do they need a total overhaul?

Hanna Wallach (@hannawallach.bsky.social) (@hannawallach)'s Twitter Profile Photo

📣 "Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems" is forthcoming at #ACL2025NLP---and you can read it now on arXiv! 🔗: arxiv.org/pdf/2506.04482 🧵: ⬇️

Pushpendre Rastogi (@pushpendre89)'s Twitter Profile Photo

Has anyone tried running AI models (CNNs/LLMs, ViTs/Diffusion) on weird chips?
Edge: Qualcomm AR1, Ambarella, Tenstorrent
Cloud: Trainium, Inferentia, AMD
Or even just porting Ampere → Hopper → Blackwell?
Curious: how painful was it? Did it kill your project before it started?