Sharon Y. Li (@sharonyixuanli)'s Twitter Profile
Sharon Y. Li

@sharonyixuanli

Assistant Professor @WisconsinCS. Formerly postdoc @Stanford, Ph.D. @Cornell. Making AI safe and reliable for the open world.

ID: 1107711818997395458

Website: https://pages.cs.wisc.edu/~sharonli · Joined: 18-03-2019 18:34:12

707 Tweets

9.9K Followers

757 Following

Sean Xuefeng Du (@xuefeng_du)'s Twitter Profile Photo

📣 Announcing two calls for postdocs and research assistants / interns in my lab at NTU Singapore!  

1. The NTU AI-for-X Postdoctoral Fellowship is accepting applications from postdocs who will be jointly supervised by AI faculty and a project mentor in their own research field (X) at NTU. It
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Multi-Agent Debate (MAD) has been hyped as a collaborative reasoning paradigm — but let me drop the bomb: majority voting, without any debate, often performs on par with MAD.

This is what we formally prove in our #NeurIPS2025 Spotlight paper: “Debate or Vote: Which Yields
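
For a concrete picture of the no-debate baseline, here is a minimal sketch (my own illustration, not the paper's code; query_model is a hypothetical stand-in for any LLM call): majority voting simply samples several independent answers and keeps the plurality.

from collections import Counter

def majority_vote(answers):
    # Pick the most common answer among independent agent outputs;
    # ties are broken by Counter's internal ordering.
    return Counter(answers).most_common(1)[0][0]

# Hypothetical usage: query the same question n_agents times, with no
# debate rounds between agents, then aggregate by simple plurality.
# answers = [query_model(question) for _ in range(n_agents)]
# final = majority_vote(answers)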
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Excited to share our #NeurIPS2025 paper: Visual Instruction Bottleneck Tuning (Vittle)

Multimodal LLMs do great in-distribution, but often break in the wild. Scaling data or models helps, but it’s costly.

💡 Our work is inspired by the Information Bottleneck (IB) principle,
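
For reference, the classical IB objective the tweet alludes to (the general principle only; Vittle's exact formulation may differ) seeks a representation Z of the input X that stays predictive of the target Y while discarding everything else:

\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)

where I(·;·) is mutual information and β trades off compression of X against prediction of Y.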
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

I will be giving a talk at UPenn CIS Seminar next Tuesday, October 7.

More info below
events.seas.upenn.edu/event/14856/

thanks Weijie Su for hosting!
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Collecting large human preference data is expensive—the biggest bottleneck in reward modeling.

In our #NeurIPS2025 paper, we introduce latent-space synthesis for preference data, which is 18× faster and uses a network that’s 16,000× smaller (0.5M vs 8B parameters) than
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Your LVLM says: “There’s a cat on the table.”
But… there’s no cat in the image. Not even a whisker.

This is object hallucination — one of the most persistent reliability failures in multi-modal language models. 

Our new #NeurIPS2025 paper introduces GLSim, a simple but
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

We hear increasing discussion about aligning LLMs with “diverse human values.”
But what’s the actual price of pluralism? 🧮

In our #NeurIPS2025 paper (with Shawn Im), we move this debate from the philosophical to the measurable — presenting the first theoretical scaling law
Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Check out our recent work led by Leitian Tao with the AI at Meta team on using hybrid RL for mathematical reasoning tasks. 🔥 Hybrid RL offers a promising way to go beyond purely verifiable rewards — combining the reliability of verifier signals with the richness of learned feedback.
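
A rough sketch of what such a hybrid reward could look like (my own illustration under assumed names, not the paper's method; alpha is a hypothetical mixing weight):

def hybrid_reward(verifier_correct, rm_score, alpha=0.5):
    # Verifiable signal: 1.0 if the final answer passes an automatic check
    # (e.g., exact match or a symbolic verifier), else 0.0.
    r_verify = 1.0 if verifier_correct else 0.0
    # Blend the sparse-but-reliable verifier signal with the denser
    # learned reward-model score.
    return alpha * r_verify + (1.0 - alpha) * rm_score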

Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Took on the challenge of putting together three different keynote talks for the upcoming #ICCV2025 workshops... and here are the titles:

🔍 Explainability Meets Reliability in Large Vision-Language Models — eXCV Workshop (excv-workshop.github.io), October 19, 10:15–10:45, Honolulu

Sharon Y. Li (@sharonyixuanli)'s Twitter Profile Photo

Human preference data is noisy: inconsistent labels, annotator bias, etc. No matter how fancy the post-training algorithm is, bad data can sink your model. 

🔥 Min Hsuan (Samuel) Yeh and I are thrilled to release PrefCleanBench — a systematic benchmark for evaluating data cleaning
Hugo Larochelle (@hugo_larochelle)'s Twitter Profile Photo

We at TMLR are proud to announce that selected papers will now be eligible for an opportunity to present at the joint NeurIPS/ICML/ICLR Journal-to-Conference (J2C) Track: medium.com/@TmlrOrg/tmlr-…