Yilun Zhao (@yilunzhao_nlp) 's Twitter Profile
Yilun Zhao

@yilunzhao_nlp

PhD student @yalenlp

ID: 1535739979955060736

Website: http://yilunzhao.github.io

Joined: 11-06-2022 21:45:26

61 Tweets

235 Followers

326 Following

Yilun Zhao (@yilunzhao_nlp) 's Twitter Profile Photo

🔥 The ICLR 2025 LLM Reasoning & Planning Workshop is committed to supporting early-career researchers & fostering diversity, equity, and inclusion! 🌍

Arman Cohan (@armancohan) 's Twitter Profile Photo

We release TESS2, a new diffusion LLM! 🚀Some highlights: ⚪ A general instruction-following LLM! ⚪ We show a working recipe for continual training of AR LLMs with diffusion. ⚪ Use reward guidance, which can use any reward model to steer model generations at test time. ⚪

Ai2 (@allen_ai) 's Twitter Profile Photo

We’re excited to share some updates to Ai2 ScholarQA:
🗂️ You can now sign in via Google to save your query history across devices and browsers.
📚 We added 108M+ paper abstracts to our corpus - expect to get even better responses!
✨ The backbone model has been updated to the
Rob Tang (@xiangrutang) 's Twitter Profile Photo

Happy to announce our paper has been accepted at #ICLR2025! 🎉
"ChemAgent: Self-updating Library in LLMs Improves Chemical Reasoning" 

📈 Significant improvements across multiple LLMs:

- GPT-4: +37% avg accuracy, up to +46% on CHEMMC (SciBench)
- Llama3-70B: +13% improvement
-
Simeng (Sophia) Han (@hansineng) 's Twitter Profile Photo

We’re thrilled to welcome you to New England NLP 2025 at Yale University on April 11th in New Haven, CT 🎉 nenlp.github.io/spr2025/! 
Join us for a full day of exciting talks and sparkling discussions with NLP researchers across the New England region and beyond. 
👉 Register now
Rob Tang (@xiangrutang) 's Twitter Profile Photo

🧠 Excited to share our latest work: "MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning"! We've curated a challenging hard subset from existing medical QA datasets. We select questions where fewer than 50% of the LLMs (incl. GPT-4o,
Simeng (Sophia) Han (@hansineng) 's Twitter Profile Photo

Excited to announce that a Best Paper Prize 🏆 will be awarded at New England NLP 2025 nenlp.github.io/spr2025/! Submit your COLM/ARR/ICLR paper 📑now to be considered! Deadline is approaching 👀, don’t miss out. Registration and submission 🔗: docs.google.com/forms/d/e/1FAI…. #NLP

Asaf Yehudai (@asafyehudai) 's Twitter Profile Photo

Survey on Evaluation of LLM-based Agents 🤖

Our paper is the first to provide a comprehensive overview of LLM-based agent evaluation 📜

Paper: arxiv.org/pdf/2503.16416
Simeng (Sophia) Han (@hansineng) 's Twitter Profile Photo

⌛️ Time’s ticking! Don’t miss your chance to submit your CoLM/ARR/ICLR/ongoing work to New England NLP 2025.

nenlp.github.io/spr2025/

And join us with an exciting lineup of speakers 🥳.
Zhaojian Yu (@yfngnin4) 's Twitter Profile Photo

Shifted Thinking is interesting! 
Z1 can perform simple or complex reasoning on different problems (No extra prompt is provided).
Check out our repo: github.com/efficientscali…
Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Small Language Models struggle on knowledge-intensive tasks due to reliance on internal knowledge and suboptimal integration of external facts via standard RAG.

MCTS-RAG solves this by integrating Monte Carlo Tree Search (MCTS) for structured reasoning exploration with adaptive
Simeng (Sophia) Han (@hansineng) 's Twitter Profile Photo

We wrapped up New England NLP 2025 at Yale Engineering (@YaleEngineering) 🎉!

215 registrations from 37 institutions ✅
An amazing lineup of speakers 🎤
A heated panel 💬
86 posters and 5 oral presentations 🗣️

Attaching some sparkling moments captured by Chuhan Li (@_Chuhan_Li)!
Chuhan Li @ICLR2025 (@_chuhan_li) 's Twitter Profile Photo

Me and Ziyao Shangguan (@ZiyaoShangguan) will be at ICLR 2025 (@iclr_conf) in Singapore this week to present our work 🍅 TOMATO for evaluating video understanding capabilities of MFMs.

🎇Poster: 10 am -- 12:30pm at Hall 3 + Hall 2B #73, Apr 24. 

If you are interested in MFMs and multimodal reasoning, let's
Zhiyuan (@zhiyuancs) 's Twitter Profile Photo

🚀 Beyond “aha”: toward Meta‑Abilities Alignment!
Zero human annotation enables LRMs to master strong reasoning abilities, rather than relying on an emergent “aha”, and to generalize across math ⚙️, code 💻, science 🔬.

Meta‑ability alignment lifts the ceiling of further domain‑RL—7B → 32B
Simeng (Sophia) Han (@hansineng) 's Twitter Profile Photo

Excited to see more investigation into LLM creativity. We have some pioneering work on this topic as well: Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models. arxiv.org/pdf/2505.10844.