Wanqiao Xu (@wanqiao_xu)'s Twitter Profile
Wanqiao Xu

@wanqiao_xu

PhD student @stanford RL Group 🌲 formerly @UMich Math 〽️ interested in RL and fine-tuning LLMs | Previously intern @MetaAI @MSFTResearch

ID: 943661690993885184

Link: http://wanqiaox.github.io | Joined: 21-12-2017 01:57:34

77 Tweets

211 Followers

401 Following

Wanqiao Xu (@wanqiao_xu)'s Twitter Profile Photo

So excited to announce that I am going to Stanford for a Management Science & Engineering PhD!!! Really looking forward to my five years of scholarship ahead and the great people I am about to meet there!

Nick Arnosti (@nickarnosti)'s Twitter Profile Photo

Europeans have discovered many keys to living life well. But one area where America absolutely has it right is water in restaurants. Why can't I drink tap water? And why does water come in tiny 33 cl bottles? Apparently, Europeans have adapted to a perpetual state of dehydration.

Hamsa Bastani (@hamsabastani)'s Twitter Profile Photo

The exploration-exploitation tradeoff in #RL raises concerns about exploration -- who does it impact and how much? In a recent #AISTATS paper, we show how to effectively "spread out" exploration across episodes (individuals) w/ only a small cost to regret: arxiv.org/abs/2110.13060
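
The paper has its own algorithm and guarantees; purely as a toy illustration of the "spread out exploration" idea (the numbers and the decaying-ε baseline below are my own assumptions, not the paper's), compare concentrating an ε-greedy exploration budget on early episodes with spreading the same budget evenly across episodes, where each episode is one individual:

```python
# Toy sketch (not the paper's method): two ways to allocate the same total
# exploration budget across T episodes, where each episode is one individual.
import numpy as np

T = 1000
# Decaying epsilon: the earliest individuals absorb almost all of the exploration.
front_loaded = np.minimum(1.0, 5.0 / np.arange(1, T + 1))
budget = front_loaded.sum()              # total expected number of exploratory pulls
# Same budget, spread uniformly, so no single individual bears much of it.
spread = np.full(T, budget / T)

for name, eps in [("front-loaded", front_loaded), ("spread", spread)]:
    print(f"{name:>12}: budget={eps.sum():.1f}, "
          f"worst individual's exploration prob={eps.max():.3f}")
# The paper quantifies the regret cost of such spreading; this toy only shows
# how the exploration burden on the worst-off individual changes.
```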

Yi Ma (@yimatweets)'s Twitter Profile Photo

The best part of research is realizing you have understood something that no one else in the world has. That feeling is absolutely surreal, and the most addictive.

Tengyu Ma (@tengyuma)'s Twitter Profile Photo

Adam, a 9-yr old optimizer, is the go-to for training LLMs (eg, GPT-3, OPT, LLAMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). arxiv.org/abs/2305.14342 🧵⬇️

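For intuition only, here is a rough sketch of a Sophia-style step: momentum preconditioned by an EMA of a diagonal Hessian estimate, with an element-wise clip. The hyperparameter names/defaults and passing the Hessian-diagonal estimate in externally are my simplifications; see the paper for the actual algorithm and its estimators (Hutchinson / Gauss-Newton-Bartlett).

```python
# Simplified sketch of a Sophia-like update (not the authors' reference implementation).
import numpy as np

def sophia_like_step(param, grad, m, h, hess_diag_est=None,
                     lr=1e-4, beta1=0.96, beta2=0.99, rho=1.0, eps=1e-12):
    """m: EMA of gradients; h: EMA of a diagonal Hessian estimate.
    hess_diag_est is refreshed only every few steps in the real algorithm."""
    m = beta1 * m + (1 - beta1) * grad
    if hess_diag_est is not None:
        h = beta2 * h + (1 - beta2) * hess_diag_est
    # Precondition by curvature, then clip element-wise so flat or poorly
    # estimated coordinates cannot take arbitrarily large steps.
    update = np.clip(m / np.maximum(h, eps), -rho, rho)
    return param - lr * update, m, h
```
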
Tri Dao (@tri_dao)'s Twitter Profile Photo

Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4x faster, and it is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/

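FlashAttention is a fused kernel rather than a new interface, so at the call site it is mostly a drop-in. A minimal sketch, assuming a recent PyTorch: naive attention that materializes the full attention matrix next to torch.nn.functional.scaled_dot_product_attention, which can dispatch to FlashAttention-style fused kernels on supported GPUs (whether it picks FlashAttention-2 specifically depends on the PyTorch version, dtype, and hardware).

```python
# Naive attention vs. PyTorch's fused scaled_dot_product_attention (illustrative, not a benchmark).
import math
import torch
import torch.nn.functional as F

B, H, S, D = 2, 8, 1024, 64           # batch, heads, sequence length, head dim
q, k, v = (torch.randn(B, H, S, D) for _ in range(3))

# Naive: materializes the S x S attention matrix, the O(S^2) memory traffic
# that FlashAttention avoids by tiling and recomputing in on-chip SRAM.
scores = (q @ k.transpose(-2, -1)) / math.sqrt(D)
naive_out = torch.softmax(scores, dim=-1) @ v

# Fused: same math, no materialized S x S matrix.
fused_out = F.scaled_dot_product_attention(q, k, v)

print((naive_out - fused_out).abs().max())  # should be tiny (numerical noise only)
```
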
Jim Fan (@drjimfan)'s Twitter Profile Photo

You'll soon see lots of "Llama just dethroned ChatGPT" or "OpenAI is so done" posts on Twitter. Before your timeline gets flooded, I'll share my notes: ▸ Llama-2 likely costs $20M+ to train. Meta has done an incredible service to the community by releasing the model with a

Dilip Arumugam (@dilip_arumugam)'s Twitter Profile Photo

Prof. Anima Anandkumar: For a simple data-generating process (not natural language), we've seen that RLHF moves between collapsing down to a most-preferred response and sticking to the supervised pre-training response distribution, depending on the KL regularization strength. arxiv.org/abs/2305.11455
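
One hedged way to see the role of the KL strength: for the standard KL-regularized objective max_π E[r] − β·KL(π‖π_ref), the optimal policy has the closed form π*(y|x) ∝ π_ref(y|x)·exp(r(x,y)/β). The toy categorical example below (my own numbers, not the paper's setup) shows both extremes:

```python
# Toy categorical "LM": pi*(y) ∝ pi_ref(y) * exp(r(y) / beta).
# Small beta -> collapse onto the most-preferred response; large beta -> stay near pi_ref.
import numpy as np

pi_ref = np.array([0.4, 0.3, 0.2, 0.1])   # supervised pre-training distribution
reward = np.array([0.0, 1.0, 0.2, 2.0])   # the last response is most preferred

for beta in [0.05, 0.5, 5.0]:
    logits = np.log(pi_ref) + reward / beta
    pi_star = np.exp(logits - logits.max())
    pi_star /= pi_star.sum()
    print(f"beta={beta:<4}: {np.round(pi_star, 3)}")
```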

Dilip Arumugam (@dilip_arumugam)'s Twitter Profile Photo

So exciting to see information-theoretic Bayesian RL facilitating insights into how rate-limited learners accumulate knowledge across generations! A 🧵 to accompany @BenPrystawski's upcoming awesome talk

Ian Osband (@ianosband)'s Twitter Profile Photo

Excited to (finally) present our work on Epistemic Neural Networks as a spotlight at #NeurIPS23: "Get better uncertainty than an ensemble of size 100 at less than 2x the cost of a base model." Poster 1924. arxiv.org/abs/2107.08924
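
A minimal sketch of the epinet idea behind this line of work, under my own simplifications (layer sizes are made up and the fixed prior network from the papers is omitted): a small network consumes stop-gradient features of the base model together with a random "epistemic index" z, so sampling a few z's at inference gives ensemble-like disagreement at far less than 100x the cost.

```python
# Simplified epinet-style ENN sketch (omits the fixed prior network of the actual papers).
import torch
import torch.nn as nn

class TinyEpinetENN(nn.Module):
    def __init__(self, in_dim, hidden=64, out_dim=10, index_dim=8):
        super().__init__()
        self.index_dim = index_dim
        self.features = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.base_head = nn.Linear(hidden, out_dim)
        # Small epinet: detached base features concatenated with the epistemic index z.
        self.epinet = nn.Sequential(
            nn.Linear(hidden + index_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x, z):
        phi = self.features(x)
        eps = self.epinet(torch.cat([phi.detach(), z], dim=-1))
        return self.base_head(phi) + eps

model = TinyEpinetENN(in_dim=32)
x = torch.randn(4, 32)
# Disagreement across sampled indices plays the role an ensemble's disagreement would.
preds = torch.stack([model(x, torch.randn(4, model.index_dim)) for _ in range(5)])
print(preds.std(dim=0).mean())
```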

Linqi (Alex) Zhou (@linqi_zhou)'s Twitter Profile Photo

SO excited to finally share my work at Luma! We introduce Inductive Moment Matching, a new generative paradigm that can be trained stably with a single model and single objective from scratch, achieving 1.99 FID on ImageNet-256x256 in 8 steps and 1.98 FID on CIFAR-10 in 2 steps.