David Tao (@taodav)'s Twitter Profile
David Tao

@taodav

PhD student in the @BrownBigAI lab. MSc from the @rlai_lab.

ID: 102545961

Link: http://taodav.cc
Joined: 07-01-2010 02:07:03

161 Tweets

379 Followers

173 Following

Pablo Samuel Castro (@pcastr)

Last year we showed that deep RL can successfully control stratospheric balloons in the real world go.nature.com/3dtvrPr Today we're announcing a β-release of the BLE, a high-fidelity simulator of this complex real-world decision making problem. bit.ly/3y1i2aK 1/🎈

yobibyte (@y0b1byte)

A lot of people complain that RL doesn't work and RL researchers are still playing games. While this criticism is true to some extent, there's been a new trend of applying RL for real-life problems. This is a thread of notable papers split by the topic. 1/n

David Tao (@taodav)

I thought it might be helpful to put together a (not that short) blog post about lessons I've learnt from applying for a (CS) PhD program: link.medium.com/B1k51lBJMpb

David Tao (@taodav)

Check out our new work! It's an empirical study of the different input augmentations popular in real-world RL. This was work done during my time at the Reinforcement Learning and Artificial Intelligence lab with two of the best advisors a grad student could ask for: Marlos C. Machado and Adam White.

Accepted papers at TMLR (@tmlrpub)

Agent-State Construction with Auxiliary Inputs. Ruo Yu Tao, Adam White, Marlos C. Machado. Action editor: Dinesh Jayaraman. openreview.net/forum?id=RLYky… #reinforcement #recurrent #summarizes

Marlos C. Machado (@marloscmachado)

Our paper, which was led by David Tao when he was at the University of Alberta, is now officially published by TMLR 😁 In short: what do most successes in deep RL look like in terms of inputs to the network? Why is that so effective? I tweeted about it a while ago: x.com/marloscmachado…

Richard Sutton (@richardssutton)

We finally have a version of our paper on loss of plasticity and continual backprop that is polished and submitted to a journal. Good work led by my PhD student Shibhansh Dohare. arxiv.org/abs/2306.13812…

RL_Conference (@rl_conference)

Thrilled to announce the first annual Reinforcement Learning Conference (@RL_Conference), which will be held at UMass Amherst August 9-12! RLC is the first strongly peer-reviewed RL venue with proceedings, and our call for papers is now available: rl-conference.cc.

Jascha Sohl-Dickstein (@jaschasd)

Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Blueish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.

Philipp Schoenegger (@schoeneggerphil)

‼️New preprint (with Tuminauskaite, @Dr_Park_Phd, and Philip E. Tetlock) on the ‘Wisdom of the Silicon Crowd’ in forecasting! We find that: 🔴 LLM crowd forecasting accuracy matches that of a human crowd 🔴 LLM predictions are improved by human input More details and paper below!

Cam Allen (@camall3n)

We also trained a probe to reconstruct the PacMan dots from the agent’s memory. Guess which agent had an easier time with this… Yep! The λ-discrepancy agent knows where it has been, while the normal RNN agent basically has no idea. 9/n

Dennis Farrell (@dennisfarrell)

Imagine being held at gunpoint (bear with me) by a literate animal, and the only hope of rescue is (BEAR WITH ME) tweeting a coded message