David Tao (@taodav) Twitter Tweets • TwiCopy

Pablo Samuel Castro

4 years ago

Last year we showed that deep RL can successfully control stratospheric balloons in the real world go.nature.com/3dtvrPr Today we're announcing a β-release of the BLE, a high-fidelity simulator of this complex real-world decision making problem. bit.ly/3y1i2aK 1/🎈

Likes: 102 | Replies: 1 | Retweets: 26

David Tao

@taodav

4 years ago

Life update: I'll be joining the wonderful people in the Intelligent Robot Lab at Brown University this Fall for my PhD!!

Likes: 72 | Replies: 5 | Retweets: 1

yobibyte

@y0b1byte

4 years ago

A lot of people complain that RL doesn't work and RL researchers are still playing games. While this criticism is true to some extent, there's been a new trend of applying RL for real-life problems. This is a thread of notable papers split by the topic. 1/n

Likes: 763 | Replies: 27 | Retweets: 141

yobibyte

@y0b1byte

4 years ago

I'm only 50 pages in, but I totally recommend this book. One of the best textbooks I've ever read.

Likes: 1.1K | Replies: 17 | Retweets: 121

David Tao

@taodav

4 years ago

I thought it might be helpful to put together a (not that short) blog post about lessons I've learnt from applying for a (CS) PhD program: link.medium.com/B1k51lBJMpb

Likes: 4 | Replies: 0 | Retweets: 0

Rosanne Liu

@savvyrl

4 years ago

Your NeurIPS submission Number: 12*** 😐

Likes: 115 | Replies: 10 | Retweets: 3

David Tao

@taodav

3 years ago

Check out our new work! It's an empirical study of the different input augmentations popular in real-world RL. This was work done during my time at the Reinforcement Learning and Artificial Intelligence with two of the best advisors a grad student could ask for - Marlos C. Machado and Adam White.

Likes: 14 | Replies: 0 | Retweets: 1

Marlos C. Machado

@marloscmachado

3 years ago

Our paper has been accepted to TMLR (@tmlrpub)! This work was led by David Tao while he was a student at University of Alberta with Adam White and I. openreview.net/forum?id=RLYky…

Likes: 17 | Replies: 0 | Retweets: 3

Accepted papers at TMLR

@tmlrpub

3 years ago

Agent-State Construction with Auxiliary Inputs Ruo Yu Tao, Adam White, Marlos C. Machado. Action editor: Dinesh Jayaraman. openreview.net/forum?id=RLYky… #reinforcement #recurrent #summarizes

Likes: 10 | Replies: 0 | Retweets: 5

Marlos C. Machado

@marloscmachado

3 years ago

Our paper, which was led by David Tao when he was at University of Alberta, is now officially published by TMLR 😁 Shortly: what do most successes in deep RL look like in terms of inputs to the network? Why is that so effective? I tweeted about it a while ago: x.com/marloscmachado…

Likes: 17 | Replies: 0 | Retweets: 4

Richard Sutton

@richardssutton

2 years ago

We finally have a version of our paper on loss of plasticity and continual backprop that is polished and submitted to a journal. Good work led by my PhD student Shibhansh Dohare. arxiv.org/abs/2306.13812…

Likes: 301 | Replies: 6 | Retweets: 52

RL_Conference

@rl_conference

2 years ago

Thrilled to announce the first annual Reinforcement Learning Conference (@RL_Conference), which will be held at UMass Amherst August 9-12! RLC is the first strongly peer-reviewed RL venue with proceedings, and our call for papers is now available: rl-conference.cc.

Likes: 228 | Replies: 3 | Retweets: 87

meowbooks

@untitled01ipynb

2 years ago

For every N likes this post gets, I'll make this Sloth Tensorflow developer more frustrated by the documentation.

Likes: 515 | Replies: 9 | Retweets: 41

Jascha Sohl-Dickstein

@jaschasd

2 years ago

Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Blueish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.

Likes: 9.9K | Replies: 275 | Retweets: 1.1K

Philipp Schoenegger

@schoeneggerphil

2 years ago

‼️New preprint (with Tuminauskaite, @Dr_Park_Phd, and Philip E. Tetlock) on the ‘Wisdom of the Silicon Crowd’ in forecasting! We find that: 🔴 LLM crowd forecasting accuracy matches that of a human crowd 🔴 LLM predictions are improved by human input More details and paper below!

Likes: 59 | Replies: 4 | Retweets: 23

Cam Allen

@camall3n

a year ago

We also trained a probe to reconstruct the PacMan dots from the agent’s memory. Guess which agent had an easier time with this… Yep! The λ-discrepancy agent knows where it has been, while the normal RNN agent basically has no idea. 9/n