Meng Song (@meng_song) 's Twitter Profile
Meng Song

@meng_song

Independent RL researcher | Prev. PhD@UCSD, MS@CMU_Robotics

ID: 310904767

Joined: 04-06-2011 15:13:24

1.1K Tweets

67 Followers

245 Following

Kuan Fang (@kuanfang) 's Twitter Profile Photo

We introduce APT-Gen to procedurally generate tasks of rich variations as curricula for reinforcement learning in hard-exploration problems. Webpage: kuanfang.github.io/apt-gen/ Paper: arxiv.org/abs/2007.00350 w/ Yuke Zhu Silvio Savarese Fei-Fei Li

Kuan Fang (@kuanfang) 's Twitter Profile Photo

We are organizing the #RSS2021 Workshop on Visual Learning and Reasoning for Robotics (VLRR)!

Submissions are now open! (Deadline: June 20th)

Website: rssvlrr.github.io
Oriol Vinyals (@oriolvinyalsml) 's Twitter Profile Photo

Compared to AlphaGo, MuZero removed the need for a simulator in MBRL. VQ Models for Planning generalize further, to partially observable & stochastic environments. How?

1. Discretize states w/ VQVAE
2. Train an LM over states
3. Plan w/ MCTS using the LM

Led by Yazhe Li & Sherjil Ozair
arxiv.org/abs/2106.04615
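The three steps above can be sketched end to end at toy scale. Everything here is an illustrative assumption rather than the paper's setup: a random codebook stands in for a trained VQ-VAE encoder, a bigram count model stands in for the transformer LM, and a depth-2 exhaustive lookahead stands in for MCTS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: discretize states by nearest-neighbour lookup in a codebook.
# (In the paper the codebook comes from a trained VQ-VAE; this one is random.)
codebook = rng.normal(size=(8, 4))          # 8 discrete codes, 4-d states

def quantize(state):
    """Map a continuous state to the index of its nearest code vector."""
    return int(np.argmin(np.linalg.norm(codebook - state, axis=1)))

# Step 2: fit a "language model" over code sequences. A bigram count model
# replaces the paper's transformer here.
counts = np.ones((8, 8))                    # Laplace-smoothed transition counts
trajectory = [quantize(rng.normal(size=4)) for _ in range(200)]
for a, b in zip(trajectory, trajectory[1:]):
    counts[a, b] += 1
lm = counts / counts.sum(axis=1, keepdims=True)

# Step 3: plan over the LM's predicted code sequences. Full MCTS is replaced
# by a short exhaustive lookahead for brevity.
def plan(code, reward, depth=2):
    if depth == 0:
        return reward[code], [code]
    best_v, best_seq = -np.inf, None
    for nxt in range(8):
        v, seq = plan(nxt, reward, depth - 1)
        v = reward[code] + lm[code, nxt] * v
        if v > best_v:
            best_v, best_seq = v, [code] + seq
    return best_v, best_seq

reward = rng.uniform(size=8)                # hypothetical per-code reward
value, seq = plan(quantize(rng.normal(size=4)), reward)
print(value, seq)
```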
Michael Bronstein @ICLR2025 🇸🇬 (@mmbronstein) 's Twitter Profile Photo

4K version of my ICLR keynote on #geometricdeeplearning is now on YouTube: 

youtube.com/watch?v=w6Pw4M…

Accompanying paper: arxiv.org/abs/2104.13478

Blog post: towardsdatascience.com/geometric-foun…
Yevgen Chebotar (@yevgenchebotar) 's Twitter Profile Photo

Excited to present our work on Actionable Models at #ICML! Find the camera-ready version at arxiv.org/abs/2104.07749

In this work, we learn a functional understanding of the world through goal-conditioned Q-learning and use it for reaching visual goals or learning downstream tasks.
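A tabular sketch of the goal-conditioned Q-learning idea. The paper works with images and a conservative, goal-chained variant; the chain environment, sparse reward scheme, and relabel-against-every-goal loop below are illustrative assumptions, not the paper's method.

```python
import numpy as np

n_states, n_goals, n_actions = 5, 5, 2

# Q[s, g, a]: expected return for reaching goal g from state s via action a.
Q = np.zeros((n_states, n_goals, n_actions))
alpha, gamma = 0.5, 0.9

def update(s, a, s_next, g):
    """One goal-conditioned Q-learning step: reward 1 on reaching g, else 0."""
    r = 1.0 if s_next == g else 0.0
    done = s_next == g
    target = r + (0.0 if done else gamma * Q[s_next, g].max())
    Q[s, g, a] += alpha * (target - Q[s, g, a])

# Relabel each transition against every goal (hindsight-style), so a single
# trajectory teaches the agent about reaching many goals at once.
trajectory = [(0, 1, 1), (1, 0, 2), (2, 1, 3)]  # (s, a, s_next) on a chain
for s, a, s_next in trajectory:
    for g in range(n_goals):
        update(s, a, s_next, g)

print(Q[2, 3])  # action values for reaching goal 3 from state 2
```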
hardmaru (@hardmaru) 's Twitter Profile Photo

Thinking Like Transformers

RNNs have direct parallels in finite state machines, but Transformers have no such familiar parallel. This paper aims to change that. They propose a computational model for the Transformer in the form of a programming language.

arxiv.org/abs/2106.06981
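The flavor of that programming language can be sketched with two attention-like primitives. Names and signatures here are a loose sketch in the spirit of the paper, not its exact definitions: `select` builds a boolean attention pattern from a key/query predicate, and `aggregate` averages values wherever the pattern attends.

```python
import numpy as np

def select(keys, queries, predicate):
    """Boolean attention pattern: rows are queries, columns are keys."""
    return np.array([[predicate(k, q) for k in keys] for q in queries])

def aggregate(pattern, values):
    """Average `values` over the positions each row attends to."""
    values = np.asarray(values, dtype=float)
    out = []
    for row in pattern:
        chosen = values[row.astype(bool)]
        out.append(chosen.mean() if chosen.size else 0.0)
    return np.array(out)

# Example program: at every position, compute the fraction of tokens in the
# sequence equal to 'l' by attending everywhere and averaging an indicator.
tokens = list("hello")
attend_all = select(tokens, tokens, lambda k, q: True)
frac_l = aggregate(attend_all, [t == "l" for t in tokens])
print(frac_l)   # [0.4 0.4 0.4 0.4 0.4]
```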
Yann LeCun (@ylecun) 's Twitter Profile Photo

Using a diffusion kernel to represent the neighborhood structure in a graph for processing by a transformer. Cool stuff.
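A minimal sketch of a graph diffusion kernel, plus one hypothetical way to hand it to a transformer. The tweet does not specify the mechanism, so the attention-logit bias at the end is an assumption for illustration.

```python
import numpy as np

# Diffusion kernel K = exp(-beta * L), computed via the eigendecomposition
# of the (symmetric) graph Laplacian L = D - A.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # a 4-node path graph
L = np.diag(A.sum(1)) - A
w, V = np.linalg.eigh(L)
beta = 1.0
K = V @ np.diag(np.exp(-beta * w)) @ V.T

# K[i, j] decays with graph distance, so it encodes soft neighbourhood
# structure. One hypothetical use: add it as a bias to attention logits.
logits = np.zeros_like(K)                    # stand-in attention scores
biased = logits + np.log(K + 1e-9)
attn = np.exp(biased) / np.exp(biased).sum(1, keepdims=True)
print(np.round(K, 3))
```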

Sergey Levine (@svlevine) 's Twitter Profile Photo

A kind of GAN for system ID: Train simulator parameters so that the simulated *trajectories* are indistinguishable from real trajectories, treating the simulator as a "policy" that tries to replicate long-horizon behavior seen in the real data. arxiv.org/abs/2101.06005
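The adversarial objective can be sketched on a 1-d toy system. Two simplifications are assumptions, not the paper's method: a fixed trajectory statistic stands in for a learned discriminator, and common random numbers plus finite differences stand in for the actual policy-gradient-style update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real system: x_{t+1} = 0.9 * x_t + noise. The "simulator" has the same
# form with an unknown parameter theta we want to identify. Fixed noise
# draws (common random numbers) keep the toy objective deterministic.
x0 = rng.normal(size=200)
eps = 0.1 * rng.normal(size=(200, 10))

def rollout(theta):
    x, traj = x0, [x0]
    for t in range(10):
        x = theta * x + eps[:, t]
        traj.append(x)
    return np.stack(traj, axis=1)            # (n_traj, T+1)

real = rollout(0.9)

def feature(traj):
    # one-step autocorrelation: a trajectory-level statistic a
    # discriminator could separate sim from real on
    return (traj[:, :-1] * traj[:, 1:]).mean()

# Stand-in for the adversarial game: minimize the gap between sim and real
# trajectory statistics, updating theta (the simulator-as-"policy") by
# finite-difference gradient descent.
def gap(theta):
    return (feature(rollout(theta)) - feature(real)) ** 2

theta = 0.2
for _ in range(300):
    g = (gap(theta + 1e-3) - gap(theta - 1e-3)) / 2e-3
    theta -= 0.1 * g
print(theta)   # moves toward the real system's 0.9
```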

Nando de Freitas (@nandodf) 's Twitter Profile Photo

Brilliant! Differentiation strikes again. Question/suggestion: same trick for the selection in evolutionary algorithms like PBT? ie differentiable natural selection?!

Pascal Notin (@notinpascal) 's Twitter Profile Photo

Very pleased to share a new pre-print with Jose Miguel Hernández-Lobato and Yarin (OATML_Oxford)
“Improving black-box optimization in VAE latent space using decoder uncertainty”
arxiv.org/abs/2107.00096 (1/)
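The core trade-off can be sketched in one dimension: penalize the acquisition score by the decoder's uncertainty so the optimizer prefers latents the decoder can actually reconstruct. The surrogate, the uncertainty curve, and the additive-penalty form below are toy assumptions, not the paper's estimator (which derives uncertainty from the decoder itself, e.g. via sampling).

```python
import numpy as np

# Toy stand-ins over a 1-d latent z: a surrogate objective and a decoder
# uncertainty that grows away from the training data (centered at 0).
def surrogate(z):
    return -(z - 2.0) ** 2                  # objective peak at z = 2

def decoder_uncertainty(z):
    return 0.1 * z ** 2                     # confident near 0, unsure far out

# Uncertainty-penalized acquisition: trade objective value against
# decoder reliability.
lam = 1.0
zs = np.linspace(-4, 4, 401)
acq = surrogate(zs) - lam * decoder_uncertainty(zs)
best = zs[np.argmax(acq)]
print(best)   # pulled below 2, toward the low-uncertainty region
```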
Pulkit Agrawal (@pulkitology) 's Twitter Profile Photo

Models reduce the need for data in decision-making but often pick up on spurious features. With Xiang, Ge, and Tommi Jaakkola, we propose TIA that relies on co-operative reconstruction for identifying important features in our #ICML2021 paper, xiangfu.co/tia

Animesh Garg (@animesh_garg) 's Twitter Profile Photo

New #RSS2021 paper on self-supervised affordance discovery! GIFT: Generalizable Interaction-aware Functional Tool Affordances without Labels arxiv.org/abs/2106.14973 pair.toronto.edu/blog/2021/gift… w/ D. Turpin, L. Wang, Stavros Tsogkas, Sven Dickinson (Vector Institute, University of Toronto Robotics Institute)

Andrew Gordon Wilson (@andrewgwils) 's Twitter Profile Photo

We need more papers like this. Simple and thought provoking on the foundations. I love the direction of performing adaptive computation at test time, and the analogy with human thinking. Should all models be recurrent?

Deepak Pathak (@pathak2206) 's Twitter Profile Photo

Super excited to share something we have been working on for the last 1.5yrs. Check out our #RSS2021 paper on Rapid Motor Adaptation. RMA allows a legged robot trained *fully* in simulation to *adapt* online to diverse real-world terrains in real-time! ashish-kmr.github.io/rma-legged-rob…

Emtiyaz Khan (@emtiyazkhan) 's Twitter Profile Photo

Our new paper on "The Bayesian Learning Rule" is now on arXiv, where we provide a common learning principle behind a variety of learning algorithms (optimization, deep learning, and graphical models).
arxiv.org/abs/2107.04562
Guess what, the principle is Bayesian. A very long🧵
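The central update can be summarized as a natural-gradient step on the parameters λ of a candidate distribution q_λ(θ), sketched here from the paper's framing (see the paper for the precise conditions and notation):

```latex
% Bayesian learning rule: natural-gradient descent on the parameters
% \lambda of a candidate distribution q_\lambda(\theta), trading off
% expected loss against entropy.
\lambda_{t+1} \;=\; \lambda_t \;-\; \rho_t\, \widetilde{\nabla}_{\lambda}
  \Big[\, \mathbb{E}_{q_{\lambda}}\!\big[\ell(\theta)\big] \;-\; \mathcal{H}(q_{\lambda}) \,\Big]
```

Here ∇̃ denotes the natural gradient and H the entropy; different choices of the family q_λ and different approximations of this step are what recover the various known algorithms.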
Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

Are off-policy methods more sample efficient than on-policy in MARL? We show that PPO + centralized value functions is a strong baseline on a variety of cooperative MARL tasks!
Paper: arxiv.org/abs/2103.01955
Blog post: bair.berkeley.edu/blog/2021/07/1…
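The structural ingredient the tweet highlights can be sketched as follows: decentralized policies acting on local observations, plus one centralized value function that sees all agents' observations. Linear heads stand in for networks, and the PPO update itself is omitted; all shapes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, obs_dim = 3, 4

# Decentralized policies: each agent acts only on its own observation.
policy_w = rng.normal(size=(n_agents, obs_dim, 2))    # 2 actions per agent

def act(obs):
    logits = np.einsum('ao,aok->ak', obs, policy_w)
    p = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
    return np.array([rng.choice(2, p=pi) for pi in p])

# Centralized value function: one critic conditions on the concatenation of
# all agents' observations, which is the key difference from independent PPO.
critic_w = rng.normal(size=n_agents * obs_dim)

def value(all_obs):
    return float(critic_w @ all_obs.reshape(-1))

obs = rng.normal(size=(n_agents, obs_dim))
print(act(obs), value(obs))
```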
John Langford (@johnclangford) 's Twitter Profile Photo

This paper: arxiv.org/pdf/2403.11940… is making an important point related to the tutorial (bit.ly/agentstate) that Alex and I did at ICML. You really need forward dynamics on top of a multistep inverse model to fully capture an agent's endogenous dynamics.
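The combined objective being argued for can be sketched as a sum of two losses on an encoded state: a multistep inverse loss (predict a_t from the encodings of o_t and o_{t+k}) plus a forward-dynamics loss (predict the next encoding from the current one and a_t). Linear stand-ins are used for the encoder and both heads; every name and shape here is illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, z_dim = 6, 2

phi = rng.normal(size=(z_dim, obs_dim))         # encoder (to train)
inv_head = rng.normal(size=(2 * z_dim,))        # a_t from (z_t, z_{t+k})
fwd = rng.normal(size=(z_dim, z_dim + 1))       # z_{t+1} from (z_t, a_t)

def total_loss(o_t, a_t, o_t1, o_tk):
    """Multistep inverse loss + forward-dynamics loss in encoded space.

    The tweet's point: the inverse loss alone can leave the encoder
    underdetermined; adding the forward loss pins down the agent's
    endogenous dynamics.
    """
    z_t, z_t1, z_tk = phi @ o_t, phi @ o_t1, phi @ o_tk
    inv_loss = (inv_head @ np.concatenate([z_t, z_tk]) - a_t) ** 2
    fwd_loss = ((fwd @ np.append(z_t, a_t) - z_t1) ** 2).sum()
    return inv_loss + fwd_loss

o_t, o_t1, o_tk = (rng.normal(size=obs_dim) for _ in range(3))
print(total_loss(o_t, 1.0, o_t1, o_tk))
```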