Meng Song (@meng_song) 's Twitter Profile
Meng Song

@meng_song

Independent RL researcher | Prev. PhD@UCSD, MS@CMU_Robotics

ID: 310904767

Joined: 04-06-2011 15:13:24

1.1K Tweets

67 Followers

245 Following

Kuan Fang (@kuanfang) 's Twitter Profile Photo

We introduce APT-Gen to procedurally generate tasks of rich variations as curricula for reinforcement learning in hard-exploration problems. Webpage: kuanfang.github.io/apt-gen/ Paper: arxiv.org/abs/2007.00350 w/ Yuke Zhu Silvio Savarese Fei-Fei Li

Kuan Fang (@kuanfang) 's Twitter Profile Photo

We are organizing the #RSS2021 Workshop on Visual Learning and Reasoning for Robotics (VLRR)!

Submissions are now open! (Deadline: June 20th)

Website: rssvlrr.github.io
Oriol Vinyals (@oriolvinyalsml) 's Twitter Profile Photo

Compared to AlphaGo, MuZero removed the need for a simulator in MBRL. VQ Models for Planning generalize further, to partially observable & stochastic environments. How?

1. Discretize states w/ VQVAE
2. Train an LM over states
3. Plan w/ MCTS using the LM

Led by Yazhe Li & Sherjil Ozair
arxiv.org/abs/2106.04615
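The three steps above can be sketched end to end at toy scale. Everything here is an illustrative assumption rather than the paper's setup: a random codebook stands in for a trained VQ-VAE encoder, a bigram count model stands in for the transformer LM, and a depth-2 exhaustive lookahead stands in for MCTS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: discretize states by nearest-neighbour lookup in a codebook.
# (In the paper the codebook comes from a trained VQ-VAE; this one is random.)
codebook = rng.normal(size=(8, 4))          # 8 discrete codes, 4-d states

def quantize(state):
    """Map a continuous state to the index of its nearest code vector."""
    return int(np.argmin(np.linalg.norm(codebook - state, axis=1)))

# Step 2: fit a "language model" over code sequences. A bigram count model
# replaces the paper's transformer here.
counts = np.ones((8, 8))                    # Laplace-smoothed transition counts
trajectory = [quantize(rng.normal(size=4)) for _ in range(200)]
for a, b in zip(trajectory, trajectory[1:]):
    counts[a, b] += 1
lm = counts / counts.sum(axis=1, keepdims=True)

# Step 3: plan over the LM's predicted code sequences. Full MCTS is replaced
# by a short exhaustive lookahead for brevity.
def plan(code, reward, depth=2):
    if depth == 0:
        return reward[code], [code]
    best_v, best_seq = -np.inf, None
    for nxt in range(8):
        v, seq = plan(nxt, reward, depth - 1)
        v = reward[code] + lm[code, nxt] * v
        if v > best_v:
            best_v, best_seq = v, [code] + seq
    return best_v, best_seq

reward = rng.uniform(size=8)                # hypothetical per-code reward
value, seq = plan(quantize(rng.normal(size=4)), reward)
print(value, seq)
```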
Michael Bronstein @ICLR2025 🇸🇬 (@mmbronstein) 's Twitter Profile Photo

4K version of my ICLR keynote on #geometricdeeplearning is now on YouTube: 

youtube.com/watch?v=w6Pw4M…

Accompanying paper: arxiv.org/abs/2104.13478

Blog post: towardsdatascience.com/geometric-foun…
Yevgen Chebotar (@yevgenchebotar) 's Twitter Profile Photo

Excited to present our work on Actionable Models at #ICML! Find the camera-ready version at arxiv.org/abs/2104.07749

In this work, we learn a functional understanding of the world through goal-conditioned Q-learning and use it for reaching visual goals or learning downstream tasks.
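A tabular sketch of the goal-conditioned Q-learning idea. The paper works with images and a conservative, goal-chained variant; the chain environment, sparse reward scheme, and relabel-against-every-goal loop below are illustrative assumptions, not the paper's method.

```python
import numpy as np

n_states, n_goals, n_actions = 5, 5, 2

# Q[s, g, a]: expected return for reaching goal g from state s via action a.
Q = np.zeros((n_states, n_goals, n_actions))
alpha, gamma = 0.5, 0.9

def update(s, a, s_next, g):
    """One goal-conditioned Q-learning step: reward 1 on reaching g, else 0."""
    r = 1.0 if s_next == g else 0.0
    done = s_next == g
    target = r + (0.0 if done else gamma * Q[s_next, g].max())
    Q[s, g, a] += alpha * (target - Q[s, g, a])

# Relabel each transition against every goal (hindsight-style), so a single
# trajectory teaches the agent about reaching many goals at once.
trajectory = [(0, 1, 1), (1, 0, 2), (2, 1, 3)]  # (s, a, s_next) on a chain
for s, a, s_next in trajectory:
    for g in range(n_goals):
        update(s, a, s_next, g)

print(Q[2, 3])  # action values for reaching goal 3 from state 2
```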
hardmaru (@hardmaru) 's Twitter Profile Photo

Thinking Like Transformers

RNNs have direct parallels in finite state machines, but Transformers have no such familiar parallel. This paper aims to change that. They propose a computational model for the Transformer in the form of a programming language.

arxiv.org/abs/2106.06981
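The flavor of that programming language can be sketched with two attention-like primitives. Names and signatures here are a loose sketch in the spirit of the paper, not its exact definitions: `select` builds a boolean attention pattern from a key/query predicate, and `aggregate` averages values wherever the pattern attends.

```python
import numpy as np

def select(keys, queries, predicate):
    """Boolean attention pattern: rows are queries, columns are keys."""
    return np.array([[predicate(k, q) for k in keys] for q in queries])

def aggregate(pattern, values):
    """Average `values` over the positions each row attends to."""
    values = np.asarray(values, dtype=float)
    out = []
    for row in pattern:
        chosen = values[row.astype(bool)]
        out.append(chosen.mean() if chosen.size else 0.0)
    return np.array(out)

# Example program: at every position, compute the fraction of tokens in the
# sequence equal to 'l' by attending everywhere and averaging an indicator.
tokens = list("hello")
attend_all = select(tokens, tokens, lambda k, q: True)
frac_l = aggregate(attend_all, [t == "l" for t in tokens])
print(frac_l)   # [0.4 0.4 0.4 0.4 0.4]
```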
Yann LeCun (@ylecun) 's Twitter Profile Photo

Using a diffusion kernel to represent the neighborhood structure in a graph for processing by a transformer. Cool stuff.
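A minimal sketch of a graph diffusion kernel, plus one hypothetical way to hand it to a transformer. The tweet does not specify the mechanism, so the attention-logit bias at the end is an assumption for illustration.

```python
import numpy as np

# Diffusion kernel K = exp(-beta * L), computed via the eigendecomposition
# of the (symmetric) graph Laplacian L = D - A.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # a 4-node path graph
L = np.diag(A.sum(1)) - A
w, V = np.linalg.eigh(L)
beta = 1.0
K = V @ np.diag(np.exp(-beta * w)) @ V.T

# K[i, j] decays with graph distance, so it encodes soft neighbourhood
# structure. One hypothetical use: add it as a bias to attention logits.
logits = np.zeros_like(K)                    # stand-in attention scores
biased = logits + np.log(K + 1e-9)
attn = np.exp(biased) / np.exp(biased).sum(1, keepdims=True)
print(np.round(K, 3))
```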

Sergey Levine (@svlevine) 's Twitter Profile Photo

A kind of GAN for system ID: Train simulator parameters so that the simulated *trajectories* are indistinguishable from real trajectories, treating the simulator as a "policy" that tries to replicate long-horizon behavior seen in the real data. arxiv.org/abs/2101.06005
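The adversarial objective can be sketched on a 1-d toy system. Two simplifications are assumptions, not the paper's method: a fixed trajectory statistic stands in for a learned discriminator, and common random numbers plus finite differences stand in for the actual policy-gradient-style update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real system: x_{t+1} = 0.9 * x_t + noise. The "simulator" has the same
# form with an unknown parameter theta we want to identify. Fixed noise
# draws (common random numbers) keep the toy objective deterministic.
x0 = rng.normal(size=200)
eps = 0.1 * rng.normal(size=(200, 10))

def rollout(theta):
    x, traj = x0, [x0]
    for t in range(10):
        x = theta * x + eps[:, t]
        traj.append(x)
    return np.stack(traj, axis=1)            # (n_traj, T+1)

real = rollout(0.9)

def feature(traj):
    # one-step autocorrelation: a trajectory-level statistic a
    # discriminator could separate sim from real on
    return (traj[:, :-1] * traj[:, 1:]).mean()

# Stand-in for the adversarial game: minimize the gap between sim and real
# trajectory statistics, updating theta (the simulator-as-"policy") by
# finite-difference gradient descent.
def gap(theta):
    return (feature(rollout(theta)) - feature(real)) ** 2

theta = 0.2
for _ in range(300):
    g = (gap(theta + 1e-3) - gap(theta - 1e-3)) / 2e-3
    theta -= 0.1 * g
print(theta)   # moves toward the real system's 0.9
```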

Nando de Freitas (@nandodf) 's Twitter Profile Photo

Brilliant! Differentiation strikes again. Question/suggestion: same trick for the selection in evolutionary algorithms like PBT? ie differentiable natural selection?!

Pascal Notin (@notinpascal) 's Twitter Profile Photo

Very pleased to share a new pre-print with Jose Miguel Hernández-Lobato and Yarin (OATML_Oxford)
“Improving black-box optimization in VAE latent space using decoder uncertainty”
arxiv.org/abs/2107.00096 (1/)
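The core trade-off can be sketched in one dimension: penalize the acquisition score by the decoder's uncertainty so the optimizer prefers latents the decoder can actually reconstruct. The surrogate, the uncertainty curve, and the additive-penalty form below are toy assumptions, not the paper's estimator (which derives uncertainty from the decoder itself, e.g. via sampling).

```python
import numpy as np

# Toy stand-ins over a 1-d latent z: a surrogate objective and a decoder
# uncertainty that grows away from the training data (centered at 0).
def surrogate(z):
    return -(z - 2.0) ** 2                  # objective peak at z = 2

def decoder_uncertainty(z):
    return 0.1 * z ** 2                     # confident near 0, unsure far out

# Uncertainty-penalized acquisition: trade objective value against
# decoder reliability.
lam = 1.0
zs = np.linspace(-4, 4, 401)
acq = surrogate(zs) - lam * decoder_uncertainty(zs)
best = zs[np.argmax(acq)]
print(best)   # pulled below 2, toward the low-uncertainty region
```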
Pulkit Agrawal (@pulkitology) 's Twitter Profile Photo

Models reduce the need for data in decision-making but often pick up on spurious features. With Xiang, Ge, and Tommi Jaakkola, we propose TIA that relies on co-operative reconstruction for identifying important features in our #ICML2021 paper, xiangfu.co/tia

Animesh Garg (@animesh_garg) 's Twitter Profile Photo

New #RSS2021 paper on self-supervised affordance discovery! GIFT: Generalizable Interaction-aware Functional Tool Affordances without Labels arxiv.org/abs/2106.14973 pair.toronto.edu/blog/2021/gift… w/ D. Turpin, L. Wang, Stavros Tsogkas, Sven Dickinson (Vector Institute, University of Toronto Robotics Institute)

Andrew Gordon Wilson (@andrewgwils) 's Twitter Profile Photo

We need more papers like this. Simple and thought provoking on the foundations. I love the direction of performing adaptive computation at test time, and the analogy with human thinking. Should all models be recurrent?

Deepak Pathak (@pathak2206) 's Twitter Profile Photo

Super excited to share something we have been working on for the last 1.5yrs. Check out our #RSS2021 paper on Rapid Motor Adaptation. RMA allows a legged robot trained *fully* in simulation to *adapt* online to diverse real-world terrains in real-time! ashish-kmr.github.io/rma-legged-rob…

Emtiyaz Khan (@emtiyazkhan) 's Twitter Profile Photo

Our new paper on "The Bayesian Learning Rule" is now on arXiv, where we provide a common learning principle behind a variety of learning algorithms (optimization, deep learning, and graphical models).
arxiv.org/abs/2107.04562
Guess what, the principle is Bayesian. A very long🧵
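The central update can be summarized as a natural-gradient step on the parameters λ of a candidate distribution q_λ(θ), sketched here from the paper's framing (see the paper for the precise conditions and notation):

```latex
% Bayesian learning rule: natural-gradient descent on the parameters
% \lambda of a candidate distribution q_\lambda(\theta), trading off
% expected loss against entropy.
\lambda_{t+1} \;=\; \lambda_t \;-\; \rho_t\, \widetilde{\nabla}_{\lambda}
  \Big[\, \mathbb{E}_{q_{\lambda}}\!\big[\ell(\theta)\big] \;-\; \mathcal{H}(q_{\lambda}) \,\Big]
```

Here ∇̃ denotes the natural gradient and H the entropy; different choices of the family q_λ and different approximations of this step are what recover the various known algorithms.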
Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

Are off-policy methods more sample efficient than on-policy in MARL? We show that PPO + centralized value functions is a strong baseline on a variety of cooperative MARL tasks!
Paper: arxiv.org/abs/2103.01955
Blog post: bair.berkeley.edu/blog/2021/07/1…
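The structural ingredient the tweet highlights can be sketched as follows: decentralized policies acting on local observations, plus one centralized value function that sees all agents' observations. Linear heads stand in for networks, and the PPO update itself is omitted; all shapes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, obs_dim = 3, 4

# Decentralized policies: each agent acts only on its own observation.
policy_w = rng.normal(size=(n_agents, obs_dim, 2))    # 2 actions per agent

def act(obs):
    logits = np.einsum('ao,aok->ak', obs, policy_w)
    p = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
    return np.array([rng.choice(2, p=pi) for pi in p])

# Centralized value function: one critic conditions on the concatenation of
# all agents' observations, which is the key difference from independent PPO.
critic_w = rng.normal(size=n_agents * obs_dim)

def value(all_obs):
    return float(critic_w @ all_obs.reshape(-1))

obs = rng.normal(size=(n_agents, obs_dim))
print(act(obs), value(obs))
```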
John Langford (@johnclangford) 's Twitter Profile Photo

This paper: arxiv.org/pdf/2403.11940… is making an important point related to the tutorial (bit.ly/agentstate) that Alex and I did at ICML. You really need forward dynamics on top of a multistep inverse model to fully capture an agent's endogenous dynamics.
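The combined objective being argued for can be sketched as a sum of two losses on an encoded state: a multistep inverse loss (predict a_t from the encodings of o_t and o_{t+k}) plus a forward-dynamics loss (predict the next encoding from the current one and a_t). Linear stand-ins are used for the encoder and both heads; every name and shape here is illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, z_dim = 6, 2

phi = rng.normal(size=(z_dim, obs_dim))         # encoder (to train)
inv_head = rng.normal(size=(2 * z_dim,))        # a_t from (z_t, z_{t+k})
fwd = rng.normal(size=(z_dim, z_dim + 1))       # z_{t+1} from (z_t, a_t)

def total_loss(o_t, a_t, o_t1, o_tk):
    """Multistep inverse loss + forward-dynamics loss in encoded space.

    The tweet's point: the inverse loss alone can leave the encoder
    underdetermined; adding the forward loss pins down the agent's
    endogenous dynamics.
    """
    z_t, z_t1, z_tk = phi @ o_t, phi @ o_t1, phi @ o_tk
    inv_loss = (inv_head @ np.concatenate([z_t, z_tk]) - a_t) ** 2
    fwd_loss = ((fwd @ np.append(z_t, a_t) - z_t1) ** 2).sum()
    return inv_loss + fwd_loss

o_t, o_t1, o_tk = (rng.normal(size=obs_dim) for _ in range(3))
print(total_loss(o_t, 1.0, o_t1, o_tk))
```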