Fan Nie (@fannie1208)'s Twitter Profile
Fan Nie

@fannie1208

BS @SJTU1886 | MS @Stanford | Prev. Exchange & Intern @EPFL | Research in Reliable AI & Generative AI

ID: 1765238885247066112

06-03-2024 04:52:09

22 Tweets

69 Followers

198 Following

Foundation Models in the Wild @ ICLR 2025 (@fm_in_wild)

🤩 It's happening today! Join us at the 2nd Workshop on Foundation Models in the Wild — Hall 4, #6, Singapore EXPO! 🔥 10 amazing invited talks 🔥 12 exciting oral presentations 🔥 Cutting-edge ideas and lively discussions 🚀 Don't miss it — come say hi and explore the future

Pan Lu (@lupantech)

🎉 Thrilled to share that OctoTools won the Best Paper Award at KnowledgeNLP Workshop @NAACL 2025 #NAACL! 🏆 OctoTools is a flexible, easy-to-use framework that equips LLMs with diverse tools for complex reasoning—just customize your agent by mixing modular “tool cards” like building with Lego 🧩

James Zou (@james_y_zou)

Our new #ICML2025 paper formulates #LLM hallucination as hypothesis testing to provide statistical guarantees on factuality. #FactTest is a distribution-free and model-agnostic approach to improve LLM accuracy. Great job Fan Nie, Xiaotian Hou, Shuhang Lin, Huaxiu Yao

James Zou (@james_y_zou)

>700K people die each year due to S. aureus infection. Today we show that our AI designed new molecule, synthecin, stops drug-resistant S. aureus MRSA in mouse model💊 We created synthecin w/ SyntheMol-RL, our new RL generative AI. All open source biorxiv.org/content/10.110…

Sheng Liu (@shengliu_)

🚨 Call for submission at our workshop on Computer Vision for Automated Medical Diagnosis #ICCV2025 in 🌴 Honolulu, Hawaii! Join us at #ICCV2025 to present your work on 🤖multimodal LLMs, 🧠 agents, & ⚖️ fair, reliable AI for healthcare and medicine. 🗓️ Deadline: June 21,

Yilun Du (@du_yilun)

Excited to share work on using classical search approaches to scale inference in diffusion models! We show how global graph search algorithms (BFS, DFS) and local search can be used to improve generation performance across domains such as image generation, planning, and RL!

Dongfu Jiang (@dongfujiang)

Introducing VerlTool - a unified and easy-to-extend tool agent training framework based on verl. Recently, there's been a growing trend toward training tool agents with reinforcement learning algorithms like GRPO and PPO. Representative works include SearchR1, ToRL, ReTool, and

Pan Lu (@lupantech)

Do LLMs truly understand math proofs, or just guess? 🤔Our new study on #IneqMath dives deep into Olympiad-level inequality proofs & reveals a critical gap: LLMs are often good at finding answers, but struggle with rigorous, sound proofs. ➡️ ineqmath.github.io To tackle

Shirley Wu (@shirleyyxwu)

Even the smartest LLMs can fail at basic multiturn communication Ask for grocery help → without asking where you live 🤦‍♀️ Ask to write articles → assumes your preferences 🤷🏻‍♀️ ⭐️CollabLLM (top 1%; oral ICML Conference) transforms LLMs from passive responders into active collaborators.

Percy Liang (@percyliang)

Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team: Tatsunori Hashimoto, Marcel Rød, Neil Band, and Rohith Kuditipudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:

Hanlin Zhang (@_hanlin_zhang_)

[1/n] Discussions about LM reasoning and post-training have gained momentum. We identify several missing pieces: ✏️Post-training based on off-the-shelf base models without transparent pre-training data components and scale. ✏️Intermediate checkpoints with incomplete learning