Chulin Xie (@chulinxie)'s Twitter Profile
Chulin Xie

@chulinxie

CS PhD student at UIUC; IBM PhD Fellowship 2024; prev. intern @GoogleAI @MSFTResearch @NvidiaAI

ID: 1109845260874579969

Link: https://alphapav.github.io/
Joined: 24-03-2019 15:51:44

85 Tweets

1.1K Followers

842 Following

Bill Yuchen Lin (@billyuchenlin)

Is an LLM's reasoning ability solely based on its powerful memorization skills? We conducted an in-depth empirical study to explore this question and uncovered some fascinating findings. Check out Chulin Xie's threads for more details!

Yangsibo Huang (@yangsibohuang)

Probing results are my fav in our paper (Sec 4.2)!! 1. LLMs clearly develop reasoning skills through direct DT (i.e., w/o CoT). 2. Harder tasks demand more internal computation to solve. 3. Probing accuracy peaks in the middle layers, not the final layer.

Victor Reis (@vetohaze)

Our Algorithms group at Microsoft Research is hiring interns in differential privacy, reasoning abilities of LLMs, and theory: jobs.careers.microsoft.com/global/en/job/… jobs.careers.microsoft.com/global/en/job/… jobs.careers.microsoft.com/global/en/job/…

Tian Li (@litian0331)

I am taking new Ph.D. students from UChicagoCS and Data Science Institute in the 2024-2025 cycle! If you are interested in distributed optimization, data sharing, and trustworthy ML, please feel free to apply! More info on our research: litian96.github.io

Chulin Xie (@chulinxie)

Exciting internship opportunity on privacy & foundation models with the amazing Zinan Lin at MSR! Zinan is an incredibly insightful and supportive mentor!

Maya Varma (@mayavarma23)

(1/4) Excited to share RaVL, which is appearing this week at #NeurIPS2024! RaVL discovers and mitigates spurious correlations in fine-tuned vision-language models (VLMs). 📄 Paper: arxiv.org/abs/2411.04097 💻 GitHub: github.com/Stanford-AIMI/…

Dawn Song (@dawnsongtweets)

🎉 Deeply honored that our paper "Decoding Trust: Comprehensive Assessment of Trustworthiness in GPT Models", which was awarded Outstanding Paper at NeurIPS 2023, has just been awarded Best Scientific Cybersecurity Paper of 2024, in collaboration with Bo Li Sanmi Koyejo

Chulin Xie (@chulinxie)

💻 Are Code Agents Safe? #NeurIPS2024 In RedCode, we evaluate the risks of code execution and generation in 19 code agents within real system environments. 🗓️ Thu 12 Dec | 4:30 PM – 7:30 PM PST 📍: West Ballroom A-D #5300 🔗: redcode-agent.github.io Stop by the RedCode poster

Zinan Lin (@lin_zinan)

🚀 Image AR models (VAR & LlamaGen) can be distilled to ONE step (up to 218x faster) for the first time! See Distilled Decoding ↓ Website: imagination-research.github.io/distilled-deco… Paper: arxiv.org/abs/2412.17153 huggingface.co/papers/2412.17… (1/n)

Xiang Yue@ICLR2025🇸🇬 (@xiangyue96)

Demystifying Long CoT Reasoning in LLMs arxiv.org/pdf/2502.03373 Reasoning models like R1 / O1 / O3 have gained massive attention, but their training dynamics remain a mystery. We're taking a first deep dive into understanding long CoT reasoning in LLMs! 11 Major

Hejie Cui (@hennyjiecc)

We build MedHELM ✨: a comprehensive benchmark evaluating AI on realistic clinical tasks that healthcare professionals perform daily instead of just medical exams. 👩‍⚕️⚕️ • Stanford HAI Blog: hai.stanford.edu/news/holistic-… • Leaderboard: crfm.stanford.edu/helm/medhelm/l…

Virtue AI (@virtueai_co)

We've raised $30M in Seed + Series A funding led by Lightspeed and Walden Catalyst Ventures, with participation from Prosperity7 Ventures, Factory, Osage University Partners (OUP), Lip-Bu Tan, Chris Re, and more. Virtue AI is the first unified platform for securing AI across

Prateek Mittal (@prateekmittal_)

Last week, I shared two #ICLR2025 papers that were recognized by their Award committee. Reflecting on the outcome, I thought it might be interesting to share that both papers were previously rejected by #NeurIPS2024. I found the dramatic difference in reviewer perception of