Ben Lipkin (@ben_lipkin)'s Twitter Profile
Ben Lipkin

@ben_lipkin

phd @mit. cogsci, probml, nlp. he/him.

ID: 565036478

Link: http://benlipkin.github.io Joined: 28-04-2012 00:50:33

177 Tweets

628 Followers

1.1K Following

Afra Amini (@afra_amini)

Current KL estimation practices in RLHF can generate high variance and even negative values! We propose a provably better estimator that only takes a few lines of code to implement.🧵👇 w/ Tim Vieira and Ryan Cotterell paper: arxiv.org/pdf/2504.10637 code: github.com/rycolab/kl-rb

Ben Lipkin (@ben_lipkin)

Life news: I moved to SF for the next few months. Excited to connect with old friends and meet new ones. Get in touch if you're around these days :)

Ahmad Beirami @ ICLR 2025 (@abeirami)

As we go through a lot of excitement about RL recently with lots of cool work/results, here is a reminder that RL with a reverse KL-regularizer to the base model cannot learn new skills that were not already present in the base model. It can only amplify the existing weak skills.

Morph (@morph_labs)

We are excited to announce Trinity, an autoformalization system for verified superintelligence that we have developed at Morph. We have used it to automatically formalize in Lean a classical result of de Bruijn that the abc conjecture is true almost always.

Kevin Ellis (@ellisk_kellis)

New paper: World models + Program synthesis by Wasu Top Piriyakulkij
1. World modeling on-the-fly by synthesizing programs w/ 4000+ lines of code
2. Learns new environments from minutes of experience
3. Positive score on Montezuma's Revenge
4. Compositional generalization to new environments

Jacob Andreas (@jacobandreas)

👉 New preprint on a new family of Transformer-type models whose depth scales logarithmically with sequence length. Enables:
- fast training
- fast decoding
- large memory capacity in associative recall
- strong length generalization on state tracking