Yilun Zhao (@yilunzhao_nlp) Twitter Tweets • TwiCopy

Yilun Zhao

10 months ago

🔥 The ICLR 2025 LLM Reasoning & Planning Workshop is committed to supporting early-career researchers & fostering diversity, equity, and inclusion! 🌍

Likes: 15 | Replies: 0 | Retweets: 3

We release TESS2, a new diffusion LLM! 🚀Some highlights: ⚪ A general instruction-following LLM! ⚪ We show a working recipe for continual training of AR LLMs with diffusion. ⚪ Use reward guidance, which can use any reward model to steer model generations at test time. ⚪

Likes: 26 | Replies: 0 | Retweets: 6

Ai2

@allen_ai

10 months ago

We’re excited to share some updates to Ai2 ScholarQA: 🗂️ You can now sign in via Google to save your query history across devices and browsers. 📚 We added 108M+ paper abstracts to our corpus - expect to get even better responses! ✨ The backbone model has been updated to the

Likes: 166 | Replies: 3 | Retweets: 37

Rob Tang

@xiangrutang

10 months ago

Happy to announce our paper has been accepted at #ICLR2025! 🎉 "ChemAgent: Self-updating Library in LLMs Improves Chemical Reasoning" 📈 Significant improvements across multiple LLMs: - GPT-4: +37% avg accuracy, up to +46% on CHEMMC (SciBench) - Llama3-70B: +13% improvement -

Likes: 70 | Replies: 5 | Retweets: 18

Simeng (Sophia) Han

@hansineng

9 months ago

We’re thrilled to welcome you to New England NLP 2025 at Yale University on April 11th in New Haven, CT 🎉 nenlp.github.io/spr2025/! Join us for a full day of exciting talks and sparkling discussions with NLP researchers across the New England region and beyond. 👉 Register now

Likes: 105 | Replies: 4 | Retweets: 24

Rob Tang

@xiangrutang

9 months ago

🧠 Excited to share our latest work: "MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning"! We've curated a challenging hard subset from existing medical QA datasets. We select questions where fewer than 50% of the LLMs (incl. GPT-4o,

Likes: 59 | Replies: 1 | Retweets: 18

Simeng (Sophia) Han

@hansineng

9 months ago

Excited to announce that a Best Paper Prize 🏆 will be awarded at New England NLP 2025 nenlp.github.io/spr2025/! Submit your COLM/ARR/ICLR paper 📑now to be considered! Deadline is approaching 👀, don’t miss out. Registration and submission 🔗: docs.google.com/forms/d/e/1FAI…. #NLP

Likes: 50 | Replies: 2 | Retweets: 11

Asaf Yehudai

@asafyehudai

9 months ago

Survey on Evaluation of LLM-based Agents 🤖 Our paper is the first to provide a comprehensive overview of LLM-based agent evaluation 📜 Paper: arxiv.org/pdf/2503.16416

Likes: 334 | Replies: 4 | Retweets: 83

Simeng (Sophia) Han

@hansineng

9 months ago

⌛️ Time’s ticking! Don’t miss your chance to submit your COLM/ARR/ICLR/ongoing work to New England NLP 2025. nenlp.github.io/spr2025/ And join us for an exciting lineup of speakers 🥳.

Likes: 48 | Replies: 1 | Retweets: 10

AK

@_akhaliq

9 months ago

Z1 just dropped on Hugging Face: Efficient Test-time Scaling with Code

Likes: 340 | Replies: 4 | Retweets: 56

Zhaojian Yu

@yfngnin4

9 months ago

Shifted Thinking is interesting! Z1 can perform simple or complex reasoning on different problems (no extra prompt is provided). Check out our repo: github.com/efficientscali…

Likes: 17 | Replies: 0 | Retweets: 6

Arman Cohan

@armancohan

8 months ago

Join us at Yale this Friday! We have an amazing lineup of speakers, an exciting panel, and tons of fascinating posters!

Likes: 24 | Replies: 0 | Retweets: 3

Rohan Paul

@rohanpaul_ai

8 months ago

Small Language Models struggle on knowledge-intensive tasks due to reliance on internal knowledge and suboptimal integration of external facts via standard RAG. MCTS-RAG solves this by integrating Monte Carlo Tree Search (MCTS) for structured reasoning exploration with adaptive

Likes: 209 | Replies: 1 | Retweets: 42

Simeng (Sophia) Han

@hansineng

8 months ago

We wrapped up New England NLP 2025 at Yale Engineering 🎉! 215 registrations from 37 institutions ✅ An amazing lineup of speakers 🎤 A heated panel 💬 86 posters and 5 oral presentations 🗣️ Attaching some sparkling moments captured by Chuhan Li!

Likes: 39 | Replies: 0 | Retweets: 7

Chuhan Li @ICLR2025

@_chuhan_li

8 months ago

Me and Ziyao Shangguan will be at ICLR 2025 in Singapore this week to present our work 🍅 TOMATO for evaluating video understanding capabilities of MFMs. 🎇 Poster: 10 am -- 12:30 pm at Hall 3 + Hall 2B #73, Apr 24. If you are interested in MFMs and multimodal reasoning, let's

Likes: 22 | Replies: 2 | Retweets: 6

Xin Eric Wang @ ICLR 2025

@xwang_lk

8 months ago

Welcome to talk to Chuhan Li, who will join my group at UCSB this fall! How exciting!

Likes: 7 | Replies: 0 | Retweets: 1

Zhiyuan

@zhiyuancs

7 months ago

🚀 Beyond “aha”: toward Meta‑Abilities Alignment! Zero human annotation enables LRMs to master strong reasoning abilities, rather than relying on an emergent “aha”, and to generalize across math ⚙️, code 💻, science 🔬. Meta‑ability alignment lifts the ceiling of further domain‑RL—7B → 32B

Likes: 97 | Replies: 2 | Retweets: 18

siyue.zhang

@siyue_zhang_sg

7 months ago

[1/8] How do embeddings from Text Diffusion Models ✨ compare to those from LLMs 🦙? Check out our work “Diffusion vs. AR Language Models: A Text Embedding Perspective”! We introduce a new diffusion-based embedding model that excels in long-document and reasoning-intensive retrieval.

Likes: 21 | Replies: 1 | Retweets: 5

Simeng (Sophia) Han

@hansineng

6 months ago

Excited to see more investigation into LLM creativity. We have some pioneering work on this topic as well: Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models. arxiv.org/pdf/2505.10844.

Likes: 17 | Replies: 0 | Retweets: 6