Kaiwen Wang (@kaiwenw_ai) Twitter Tweets • TwiCopy

Kaiwen Wang

@kaiwenw_ai


RL Research @Cornell_Tech. @Google PhD Fellow.

ID: 1233566427505778688

Website: https://kaiwenw.github.io/

Joined: 29-02-2020 01:35:37

56 Tweets

304 Followers

479 Following

Jason Wei

@_jasonwei

a year ago

2022: I never wrote a RL paper or worked with a RL researcher. I didn’t think RL was crucial for AGI. Now: I think about RL every day. My code is optimized for RL. The data I create is designed just for RL. I even view life through the lens of RL. Crazy how quickly life changes

1.1K likes · 38 replies · 94 retweets

Kaiwen Wang

@kaiwenw_ai

a year ago

Making inferences robust to distribution shifts and hidden confounders is paramount for decision making under uncertainty. At the upcoming NeurIPS Conference, I’m excited to present our efficient and sharp algorithm for off-policy evaluation in robust Markov decision processes. Many


27 likes · 0 replies · 7 retweets

Kaiwen Wang

@kaiwenw_ai

a year ago

Join us at the Pluralistic Alignment Workshop at #NeurIPS to learn more about CLP! 🗓️ Sat, 14 Dec, 2024 🕙 10:40-11:40am PST 📍 West Meeting Room 116, 117 🔗 arxiv.org/abs/2407.15762 x.com/kaiwenw_ai/sta…

8 likes · 0 replies · 2 retweets

Jason Gauci

@neuralnets4life

9 months ago

I've made FANG billions of $ with reinforcement learning, so this episode is a long time coming :-). Episode 180: Reinforcement Learning, drops on Monday! patreon.com/posts/180-lear…

3 likes · 0 replies · 2 retweets

Jon Richens

@jonathanrichens

7 months ago

Are world models necessary to achieve human-level agents, or is there a model-free short-cut? Our new #ICML2025 paper tackles this question from first principles, and finds a surprising answer, agents _are_ world models… 🧵

1.1K likes · 33 replies · 170 retweets

Wen Sun

@wensun1

5 months ago

How can small LLMs match or even surpass frontier models like DeepSeek R1 and o3 Mini in math competition (AIME & HMMT) reasoning? Prior work seems to suggest that ideas like PRMs do not really work or scale well for long context reasoning. Kaiwen Wang will reveal how a novel

23 likes · 0 replies · 8 retweets

Jin Zhou

@jinpzhou

5 months ago

This captures something fundamental we're seeing in AI right now! The shift from just scaling pre-training to scaling test-time compute is huge. Our Q# + VGS work shows how value-based methods can guide models through the vast implicit graphs of reasoning possibilities.

6 likes · 0 replies · 2 retweets

AI for Math Workshop @ ICML 2025

@ai4mathworkshop

5 months ago

It's happening today! 📍Location: West Ballroom C, Vancouver Convention Center ⌚️Time: 8:30 am - 6:00 pm 🎥 Livestream: icml.cc/virtual/2025/w… #ICML2025 #icml25 #icml #aiformath #ai4math #workshop

20 likes · 0 replies · 11 retweets

Kaiwen Wang

@kaiwenw_ai

5 months ago

Correction re the time: my posters on Q# and VGS at AI for Math Workshop @ ICML 2025 are happening today from 10:50 am to 12:20 pm. Hope to see you there! x.com/kaiwenw_ai/sta…

3 likes · 0 replies · 1 retweet