Vivek Myers (@vivek_myers) Twitter Tweets • TwiCopy

Chongyi Zheng

5 months ago

1/ How should RL agents prepare to solve new tasks? While prior methods often learn a model that predicts the immediate next observation, we build a model that predicts many steps into the future, conditioning on different user intentions: chongyi-zheng.github.io/infom.

Seohong Park

@seohong_park

5 months ago

New paper on unsupervised pre-training for RL! The idea is to learn a flow-based future prediction model for each "intention" in the dataset. We can then use these models to estimate values for fine-tuning.

Seohong Park

@seohong_park

5 months ago

Q-learning is not yet scalable seohong.me/blog/q-learnin… I wrote a blog post about my thoughts on scalable RL algorithms. To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).

Siddharth Karamcheti

@siddkaramcheti

5 months ago

Thrilled to share that I'll be starting as an Assistant Professor at Georgia Tech (Georgia Tech School of Interactive Computing / Robotics@GT / Machine Learning at Georgia Tech) in Fall 2026. My lab will tackle problems in robot learning, multimodal ML, and interaction. I'm recruiting PhD students this next cycle – please apply/reach out!

Andrew Wagenmaker

@ajwagenmaker

5 months ago

Diffusion policies have demonstrated impressive performance in robot control, yet are difficult to improve online when 0-shot performance isn’t enough. To address this challenge, we introduce DSRL: Diffusion Steering via Reinforcement Learning. (1/n) diffusion-steering.github.io

Qiyang Li

@qiyang_li

4 months ago

Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N

Andrew Wagenmaker

@ajwagenmaker

4 months ago

How can we train a foundation model to internalize what it means to “explore”? Come check out our work on “behavioral exploration” at ICML25 to find out!

Eric Frankel

@esfrankel

4 months ago

Tomorrow, I'm excited to present "Finite-Time Convergence Rates in Stochastic Stackelberg Games with Smooth Algorithmic Agents", which addresses how a principal can influence the behavior of competitive learning agents! #ICML2025 📍West Exhibition Hall, W-817, 11:00 - 1:30 🧵👇

Data On the Brain & Mind Workshop @NeurIPS2025

@dataonbrainmind

3 months ago

🚨 Excited to announce our #NeurIPS2025 Workshop: Data on the Brain & Mind 📣 Call for: Findings (4- or 8-page) + Tutorials tracks 🎙️ Speakers include FieteGroup Daniel Yamins Cengiz Pehlevan Rajesh P. N. Rao Laura Gwilliams 🌐 Learn more: data-brain-mind.github.io

Daniel Yamins

@dyamins

3 months ago

This looks like it will be a cool workshop: data-brain-mind.github.io

Joey Hejna

@joeyhejna

3 months ago

We're hosting the 1st workshop on Making Sense of Data in Robotics at Conference on Robot Learning this year! We'll investigate what makes robot learning data "good" by discussing: 🧩 Data Composition 🧹 Data Curation 💡 Data Interpretability Paper submissions are due 8/22/2025! 🧵(1/3)

Catherine Glossop

@catglossop

3 months ago

Inherent biases and imbalances in robot data can make training steerable VLA policies challenging. We introduce CAST, a method to augment datasets with counterfactuals to induce better language following cast-vla.github.io ← paper, code, data, and more available here! 🧵

Data On the Brain & Mind Workshop @NeurIPS2025

@dataonbrainmind

3 months ago

📢 10 days left to submit to the Data on the Brain & Mind Workshop at #NeurIPS2025 📝 Call for: • Findings • Tutorials. Perfect if you’re prepping for ICLR or already in NeurIPS: show how to use a cog neuro dataset by submitting to our tutorial track! 🔗 data-brain-mind.github.io

Alicja Ziarko

@ziarkoalicja

2 months ago

Can complex reasoning emerge directly from learned representations? In our new work, we study representations that capture both perceptual and temporal structure, enabling agents to reason without explicit planning. princeton-rl.github.io/CRTR/

Data On the Brain & Mind Workshop @NeurIPS2025

@dataonbrainmind

2 months ago

🧠 Working with neuro datasets? You can submit a notebook that shows how to work with the dataset—explaining how to process it, and potential applications—as a tutorial to the Data on the Brain and Mind workshop. 📅 Extended deadline: Sept 8, 2025 🔗 data-brain-mind.github.io

Data On the Brain & Mind Workshop @NeurIPS2025

@dataonbrainmind

2 months ago

🚨 Tutorial Track deadline extended! Now Sept 12 (AoE) Working with neuro datasets? Submit a notebook that: • Shows how to process the data • Explains potential applications OpenReview: openreview.net/group?id=NeurI… More Information: data-brain-mind.github.io

Kevin Zakka

@kevin_zakka

a month ago

Meet mjlab. Powered by MuJoCo Warp. Drops Monday.

Chongyi Zheng

@chongyiz1

25 days ago

1/ How can we model the future rewards (returns) for RL agents? While prior methods round the returns into discrete bins or predict a finite number of quantiles, we use flexible models to predict the fine-grained structure of the full return distribution: pd-perry.github.io/value-flows.
