Xiaozhe Yao (@xiaozheyao)'s Twitter Profile
Xiaozhe Yao

@xiaozheyao

Doctoral student in Computer Science @ETH_en.
love, passion and devotion

ID: 899883061596127232

Link: https://about.yao.sh · Joined: 22-08-2017 06:36:55

135 Tweets

224 Followers

871 Following

Xiaozhe Yao (@xiaozheyao)'s Twitter Profile Photo

When fine-tuning LLMs, do you usually use LoRA (or one of its variants) or full fine-tuning? Why? (I have read some reports, just curious.)
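For context on the question, here is a minimal sketch of LoRA-style fine-tuning next to full fine-tuning, assuming the Hugging Face PEFT library and a small stand-in model (GPT-2); the specific settings are illustrative, not anything from the tweet.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Small stand-in model; the same pattern applies to larger LLMs.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Full fine-tuning: every parameter is trainable (and needs optimizer state).
full_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"full fine-tuning trains {full_trainable:,} parameters")

# LoRA: freeze the base weights and train low-rank adapters on chosen modules.
lora_cfg = LoraConfig(
    r=8,                        # adapter rank (illustrative value)
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
lora_model = get_peft_model(model, lora_cfg)
lora_model.print_trainable_parameters()  # typically well under 1% of the total
```

The trade-off usually reported: LoRA trains far fewer parameters and saves optimizer memory, while full fine-tuning can still be preferable for larger domain shifts.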

Tim Dettmers (@tim_dettmers)'s Twitter Profile Photo

Just to clarify this benchmark: this is an apples-to-oranges comparison.
- Cerebras is fast for batch size 1 but slow for batch size n.
- GPUs are slow for batch size 1 but fast for batch size n.
I get >800 tok/s on 8x H100 for a 405B model at batch size = n. Cerebras' system
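A toy back-of-the-envelope model of the batch-size point (my own illustration, not Tim's numbers): if each decode step is roughly memory-bandwidth bound, step time barely grows with batch size, so aggregate tokens/s grows almost linearly with the batch.

```python
# Toy model of batched decoding throughput (illustrative numbers only).
def decode_throughput(batch_size: int, step_latency_s: float = 0.05) -> float:
    # Assume one forward pass takes ~step_latency_s regardless of batch size
    # (a simplification of the memory-bandwidth-bound regime).
    tokens_per_step = batch_size  # one new token per sequence per step
    return tokens_per_step / step_latency_s

for bs in (1, 8, 64, 256):
    print(f"batch={bs:4d}  ~{decode_throughput(bs):8.0f} tok/s aggregate")
# Batch 1 optimizes per-request latency; large batches optimize aggregate
# throughput, which is the apples-to-oranges distinction in the tweet.
```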

Ahmad Beirami @ ICLR 2025 (@abeirami)'s Twitter Profile Photo

The question that a reviewer should ask themselves is: Does this paper take a gradient step in a promising direction? Is the community better off with this paper published? If the answer is yes, then the recommendation should be to accept.

Berivan Isik (@berivanisik)'s Twitter Profile Photo

I’ll be hosting an intern at Google AI in 2025 to work on the value of data for LLMs. If you’re interested, please email me your CV and a brief summary of your background. I won’t be checking DMs.

Anne Ouyang (@anneouyang)'s Twitter Profile Photo

Kernels are the kernel of deep learning. 🙃...but writing kernels sucks. Can LLMs help? 🤔 Introducing 🌽 KernelBench (Preview), a new coding benchmark designed to evaluate the ability of LLMs to generate ⚡️efficient💨 GPU kernels for optimizing neural network performance.
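The evaluation idea behind a benchmark like this, as a minimal sketch under my own assumptions (not the actual KernelBench harness): check that a generated kernel matches a PyTorch reference and measure its speedup.

```python
import time
import torch

def reference(x):
    return torch.nn.functional.gelu(x)  # baseline PyTorch operator

def candidate(x):
    # Stand-in for an LLM-generated implementation (here the tanh approximation of GELU).
    return 0.5 * x * (1.0 + torch.tanh(0.79788456 * (x + 0.044715 * x ** 3)))

def evaluate(fn, ref, x, iters=100):
    ok = torch.allclose(fn(x), ref(x), atol=1e-2)    # correctness check
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    elapsed = (time.perf_counter() - start) / iters  # rough mean latency
    return ok, elapsed

x = torch.randn(1 << 20)
ok, t_candidate = evaluate(candidate, reference, x)
_, t_reference = evaluate(reference, reference, x)
print(f"correct={ok}  speedup={t_reference / t_candidate:.2f}x")
```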

Pika (@pika_labs)'s Twitter Profile Photo

A giant pre-holiday gift from the Pika Team: We’re giving EVERYONE free, unlimited access to Pika 2.0. From today until December 22nd, anyone on any plan can generate as many videos as they want, using all the Scene Ingredients they want. It’s a 4-day Free-For-All, so get it

Maximilian Böther (@maxiboether)'s Twitter Profile Photo

📊 Are you training LLMs and managing your training data via a DFS? Do you spend a lot of time writing data wrangling/mixing scripts? ⌛
We just posted a preprint on Mixtera, our data plane for LLM/VLM training 🎉
🔗 github.com/eth-easl/mixte…
🔗 arxiv.org/abs/2502.19790
Read more 👇
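For a sense of the problem it targets, a minimal sketch of weighted data mixing (my own illustration, not Mixtera's actual API): interleave examples from several sources according to mixture weights instead of hand-writing wrangling scripts.

```python
import random

def mixed_stream(sources, weights, seed=0):
    """Yield (source_name, example) pairs sampled according to mixture weights."""
    rng = random.Random(seed)
    names = list(sources)
    while True:
        name = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        try:
            yield name, next(sources[name])
        except StopIteration:
            return  # stop when any source is exhausted (a simplification)

sources = {
    "web":  iter(f"web_doc_{i}" for i in range(1000)),
    "code": iter(f"code_doc_{i}" for i in range(1000)),
}
stream = mixed_stream(sources, {"web": 0.7, "code": 0.3})
batch = [next(stream) for _ in range(8)]
print(batch)
```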

Ana Klimovic (@anaklimovic)'s Twitter Profile Photo

Very excited to host Jeff Dean at ETH Zürich for a Distinguished Colloquium and research discussions next Monday! Info about his talk, "Important Trends in AI: How Did We Get Here, What Can We Do Now and How Can We Shape AI’s Future?", from the ETH CS Department: inf.ethz.ch/news-and-event…

Gautam Kamath (@thegautamkamath)'s Twitter Profile Photo

System is so broken:
- researchers write papers no one reads
- reviewers don't have time to review, shunt it to coauthors, use LLMs instead of reading
- authors try to fool said LLMs with prompt injection
- researchers are evaluated on # of papers (no time to read)
Dystopic.