
Deqing Fu
@deqingfu
CS PhD Student @CSatUSC. Alum @UChicago, B.S. '20, M.S. '22. Interpretability of LLMs; DL Theory; NLP | prev research intern @MetaAI, @Google
ID: 1327029576430718976
http://deqingfu.github.io 12-11-2020 23:24:49
143 Tweets
737 Followers
844 Following





How to make SAEs useful beyond interpretability and steering? Shangshang Wang's work Resa shows: 🧐SAEs can capture reasoning features (as an interpretability tool) 🤔SAEs can further elicit strong reasoning abilities via SAE-tuning the model (a stronger claim than steering, imho)





I’ll be at ACL 2025 next week, where my group has papers on evaluating evaluation metrics, watermarking training data, and mechanistic interpretability. I’ll also be co-organizing the first Workshop on Large Language Model Memorization on Friday. Hope to see lots of folks there!