Wanqiao Xu (@wanqiao_xu) Twitter Tweets • TwiCopy

Wanqiao Xu

@wanqiao_xu

PhD student @stanford RL Group 🌲formerly @UMich Math 〽️ interested in RL and Finetuning LLM | Previously intern @MetaAI @MSFTResearch

ID: 943661690993885184

Website: http://wanqiaox.github.io · Joined: 21-12-2017 01:57:34

77 Tweets

211 Followers

401 Following

So excited to announce that I am going to Stanford for a Management Science & Engineering PhD!!! Really looking forward to my five years of scholarship ahead and the great people I am about to meet there!

Likes: 2 · Replies: 0 · Retweets: 0

Nick Arnosti

@nickarnosti

3 years ago

Europeans have discovered many keys to living life well. But one area where America absolutely has it right is water in restaurants. Why can't I drink tap water? And why does water come in tiny 33 cl bottles? Apparently, Europeans have adapted to a perpetual state of dehydration.

Likes: 125 · Replies: 13 · Retweets: 7

Hamsa Bastani

@hamsabastani

2 years ago

The exploration-exploitation tradeoff in #RL raises concerns about exploration -- who does it impact and how much? In a recent #AISTATS paper, we show how to effectively "spread out" exploration across episodes (individuals) w/ only a small cost to regret: arxiv.org/abs/2110.13060

Likes: 35 · Replies: 1 · Retweets: 8

Yi Ma

@yimatweets

2 years ago

The best part of research is realizing you have understood something that no one else in the world has. That feeling is absolutely surreal, and the most addictive.

Likes: 782 · Replies: 21 · Retweets: 100

Tengyu Ma

@tengyuma

2 years ago

Adam, a 9-yr old optimizer, is the go-to for training LLMs (eg, GPT-3, OPT, LLAMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). arxiv.org/abs/2305.14342 🧵⬇️

Likes: 3.3K · Replies: 96 · Retweets: 621

Rahul Sethi

@eigenknight

2 years ago

Did P. Enflo just solve the Invariant Subspace Problem? arxiv.org/abs/2305.15442

Likes: 191 · Replies: 3 · Retweets: 40

Tri Dao

@tri_dao

2 years ago

Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4x faster, and it is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/

Likes: 3.3K · Replies: 39 · Retweets: 665

Jim Fan

@drjimfan

2 years ago

You'll soon see lots of "Llama just dethroned ChatGPT" or "OpenAI is so done" posts on Twitter. Before your timeline gets flooded, I'll share my notes: ▸ Llama-2 likely costs $20M+ to train. Meta has done an incredible service to the community by releasing the model with a

Likes: 5.5K · Replies: 166 · Retweets: 1.1K

Dilip Arumugam

@dilip_arumugam

2 years ago

Prof. Anima Anandkumar For a simple data-generating process (not natural language), we've seen that RLHF moves between collapsing down to a most-preferred response or sticking to the supervised pre-training response distribution, depending on the KL regularization strength arxiv.org/abs/2305.11455

Likes: 33 · Replies: 2 · Retweets: 2

Dilip Arumugam

@dilip_arumugam

2 years ago

So exciting to see information-theoretic Bayesian RL facilitating insights into how rate-limited learners accumulate knowledge across generations! A 🧵 to accompany @BenPrystawski 's upcoming awesome talk

Likes: 9 · Replies: 1 · Retweets: 2

Wanqiao Xu

@wanqiao_xu

2 years ago

I don’t understand most of this tweet, but feel the need to document potential history

Likes: 4 · Replies: 0 · Retweets: 0

Wanqiao Xu

@wanqiao_xu

2 years ago

A new open-source RL framework that you won’t want to miss!

Likes: 4 · Replies: 1 · Retweets: 0

Ian Osband

@ianosband

2 years ago

Excited to (finally) present our work on Epistemic Neural Networks as a spotlight for #NeurIPS23 "Get better uncertainty than an ensemble size=100 at cost less than 2x base models" Poster 1924 arxiv.org/abs/2107.08924

Likes: 99 · Replies: 5 · Retweets: 12

Wanqiao Xu

@wanqiao_xu

a year ago

Thanks for the recognition and organization of this thoroughly enjoyable conference!

Likes: 21 · Replies: 3 · Retweets: 1

Wanqiao Xu

@wanqiao_xu

5 months ago

Congratulations! What an era to be doing research in!

Likes: 2 · Replies: 0 · Retweets: 0

Linqi (Alex) Zhou

@linqi_zhou

5 months ago

SO excited to finally share my work at Luma! We introduce Inductive Moment Matching, a new generative paradigm that can be trained stably with a single model and single objective from scratch, achieving 1.99 FID on ImageNet-256x256 in 8 steps and 1.98 FID on CIFAR-10 in 2 steps.

Likes: 112 · Replies: 7 · Retweets: 19

Wanqiao Xu

@wanqiao_xu

a month ago

Very proud of this work done at MSR, featuring sequential decision-making from language feedback!

Likes: 6 · Replies: 1 · Retweets: 1