Edward Hu (@edward_s_hu) 's Twitter Profile
Edward Hu

@edward_s_hu

cs phd @penn, student researcher @MSFTResearch. investigating ai / rl / intelligence.

ID: 4583386580

Link: http://www.edwardshu.com · Joined: 17-12-2015 09:58:06

145 Tweets

745 Followers

304 Following

Edward Hu (@edward_s_hu) 's Twitter Profile Photo

just coded an RL env for Suika. try it out and let me know if your agent can get the watermelon! github.com/edwhu/suika_rl
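For context, the linked repo provides a reinforcement-learning environment for the Suika (watermelon) game. Below is a minimal sketch of a random-agent loop against a Gymnasium-style environment; the `SuikaEnv` class name, import path, and constructor are illustrative assumptions, not the confirmed API of github.com/edwhu/suika_rl.

```python
# Minimal sketch: drive an assumed Gymnasium-compatible Suika env with a
# random agent. Names below are hypothetical; check the repo for the real API.
from suika_rl import SuikaEnv  # hypothetical import path

env = SuikaEnv()  # assumed constructor

obs, info = env.reset(seed=0)
total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()  # replace with your agent's policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"episode return: {total_reward}")
env.close()
```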

Nina Singh, MD (@nina_singh_) 's Twitter Profile Photo

So excited and grateful to share that I matched UC San Francisco for internal medicine residency today! Thank you to my mentors, family, friends, and fiancé Edward Hu for all of your support during my med school journey! Couldn’t have done it without you ❤️

Edward Hu (@edward_s_hu) 's Twitter Profile Photo

Pi0 really did work for us on the first try. No camera calibration, controller tuning, etc. The failure cases: missed grasps and risk-averse "hedging" behavior. Excited to see how the robotics community improves on this. At the very least, it will be a good baseline.

Edward Hu (@edward_s_hu) 's Twitter Profile Photo

I'll make a tweet before ICLR'25, but this thread captures the essence well. Predicting the next token with transformers is great; but predicting 2 tokens = provably sufficient representation for planning. Let's do that and see what happens.

Edward Hu (@edward_s_hu) 's Twitter Profile Photo

Sometimes, it's expensive to reset in RL (e.g. robotics). It turns out world models do pretty well here. Why?
• learn reset policy in imagination for free
• policy training in world model is resistant to distribution shift
Check out Zhao's TMLR accepted project!

Edward Hu (@edward_s_hu) 's Twitter Profile Photo

Well deserved. Dv3 made my last RL paper super easy to tune for new tasks: just vary model size and UTD ratio. The downside is code complexity. In PPO, I need to tune clipping, entropy, lr, and batch size per new task, but the code is simple. Use the right tool for the job!
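To make the tuning contrast concrete, here is a hedged sketch of what per-task search spaces in that spirit might look like; the knob names and values are illustrative assumptions, not the actual configurations used in the paper.

```python
# Hypothetical per-task tuning surfaces (illustrative only).
# DreamerV3-style: mostly scale-related knobs.
dreamer_v3_sweep = {
    "model_size": ["S", "M", "L"],  # preset network width/depth
    "utd_ratio": [1, 2, 4],         # update-to-data (replay) ratio
}

# PPO-style: several task-sensitive knobs to retune.
ppo_sweep = {
    "clip_range": [0.1, 0.2, 0.3],
    "ent_coef": [0.0, 0.01, 0.05],
    "learning_rate": [3e-4, 1e-4],
    "batch_size": [64, 256],
}
```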

Jason Ma (@jasonma2020) 's Twitter Profile Photo

Introducing Dynamism v1 (DYNA-1) by Dyna Robotics – the first robot foundation model built for round-the-clock, high-throughput dexterous autonomy. Here is a time-lapse video of our model autonomously folding 850+ napkins in a span of 24 hours with • 99.4% success rate — zero

Overleaf (@overleaf) 's Twitter Profile Photo

⚠️ Attention: The site is currently down. Our engineering team is investigating. We will update as soon as possible. You can track progress here: status.overleaf.com Sorry for any inconvenience.

Junyao Shi (@junyaoshi) 's Twitter Profile Photo

On my way to Atlanta to present ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos at IEEE ICRA! Stay tuned for an in-depth post about how ZeroMimic distills zero-shot policies from web human videos. 🌐 Project site: zeromimic.github.io