Seongyun Lee (@sylee_ai)'s Twitter Profile
Seongyun Lee

@sylee_ai

Research intern @LG_AI_Research | M.S. student @kaist_ai | Researching the reasoning & planning abilities of LLMs/VLMs #NLProc

ID: 1695420388384366594

Joined: 26-08-2023 12:58:29

466 Tweets

261 Followers

305 Following

Seungone Kim @ NAACL2025 (@seungonekim):

#NLProc 
New paper on "evaluation-time scaling", a new dimension to leverage test-time compute!

We replicate the test-time scaling behaviors observed in generators (e.g., o1, r1, s1) with evaluators by forcing them to generate additional reasoning tokens.

arxiv.org/abs/2503.19877
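The tweet sketches the idea only at a high level. Purely as an illustration (not the paper's actual method), a minimal sketch of forcing an LLM evaluator to spend extra reasoning tokens before it commits to a score might look like the following; the `generate` call, the prompt wording, and the word-count budget are all hypothetical placeholders.

```python
# Toy sketch of "evaluation-time scaling": keep the LLM judge reasoning until a
# rough budget is met, then ask for its verdict. Not the paper's implementation.

def generate(prompt: str, max_tokens: int) -> str:
    """Placeholder for a real LLM call (API or local model) - an assumption, not the paper's code."""
    raise NotImplementedError

def evaluate_with_budget(response: str, rubric: str, min_reasoning_tokens: int) -> str:
    prompt = (
        f"Rubric:\n{rubric}\n\nResponse to judge:\n{response}\n\n"
        "Think step by step before scoring.\nReasoning:"
    )
    reasoning = ""
    # Keep asking the evaluator to continue until a rough token budget is met,
    # mirroring the idea of forcing additional reasoning tokens at evaluation time.
    while len(reasoning.split()) < min_reasoning_tokens:
        reasoning += "\n" + generate(prompt + reasoning + "\nContinue reasoning:", max_tokens=256)
    verdict = generate(prompt + reasoning + "\n\nFinal score (1-5):", max_tokens=8)
    return verdict.strip()
```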
Geewook Kim (@geewookkim):

Presenting our poster at #ICLR2025 today (Fri, Apr 25, 15:00) — Hall 3 + Hall 2B #264! 

We explored safety issues when extending LLMs to vision and how to address them. Come by and let’s chat—always happy to discuss ideas! 🤗
Jinheon Baek (@jinheonbaek):

So excited to drop PaperCoder, a multi-agent LLM system that turns ML papers into full codebases. It looks like this: 📄 (papers) → 🧠 (planning) → 🛠️ (full repos), all powered by 🤖.

Big thanks to AK for the shoutout!

Paper: arxiv.org/abs/2504.17192
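The tweet only names the stages. As a loose illustration of the 📄 → 🧠 → 🛠️ flow it describes (not PaperCoder's actual agents or prompts), a toy single-pass pipeline could look like this; `llm`, the prompt strings, and the file-listing format are hypothetical stand-ins.

```python
# Toy papers -> planning -> repo pipeline; a sketch, not PaperCoder's implementation.
from pathlib import Path

def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call (hypothetical)."""
    raise NotImplementedError

def paper_to_repo(paper_text: str, out_dir: str) -> None:
    # Stage 1 (planning): turn the paper into an implementation plan.
    plan = llm(f"Summarize this paper as a step-by-step implementation plan:\n{paper_text}")
    # Stage 2: decide which source files the repository needs.
    layout = llm(f"Given this plan, list the source files to create, one per line:\n{plan}")
    # Stage 3 (full repo): draft each file from the plan and write it to disk.
    for filename in layout.splitlines():
        filename = filename.strip()
        if not filename:
            continue
        code = llm(f"Plan:\n{plan}\n\nWrite the contents of {filename}:")
        path = Path(out_dir) / filename
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(code)
```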

Seungone Kim @ NAACL2025 (@seungonekim):

🏆 Glad to share that our BiGGen Bench paper has received the best paper award at NAACL HLT 2025! x.com/naaclmeeting/s…

📅 Ballroom A, Session I: Thursday May 1st, 16:00-17:30 (MDT)
📅 Session M (Plenary Session): Friday May 2nd, 15:30-16:30 (MDT)
📅 Virtual Conference: Tuesday

jiyeon kim (@jiyeonkimd):

Presenting ✨Knowledge Entropy✨ at #ICLR2025 today in Oral 5C (Garnet 216-218) at 10:30 AM and in Poster 6 (#251) from 3:00 PM.

We investigated how changes in a model's tendency to integrate its parametric knowledge during pretraining affect knowledge acquisition and forgetting.
elvis (@omarsar0):

The CoT Encyclopedia

How to predict and steer the reasoning strategies of LLMs that use chain-of-thought (CoT)?

More below:
fly51fly (@fly51fly):

[CL] The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think
S Lee, S Kim, M Seo, Y Jo... [KAIST AI & CMU] (2025)
arxiv.org/abs/2505.10185
The AI Timeline (@theaitimeline):

🚨This week's top AI/ML research papers:

- AlphaEvolve
- Qwen3 Technical Report
- Insights into DeepSeek-V3
- Seed1.5-VL Technical Report
- BLIP3-o
- Parallel Scaling Law for LMs
- HealthBench
- Learning Dynamics in Continual Pre-Training for LLMs
- Learning to Think
- Beyond
Minki Kang (@mkkang_1133):

🚨 New preprint!

Can small language models (sLMs) solve complex problems like LLMs? We show how to go beyond cloning reasoning and distill tool-using agent behavior into sLMs as tiny as 0.5B.

Meet Agent Distillation: 📄 huggingface.co/papers/2505.17…

Here are the details 🧵👇:

Changdae Oh (@changdae_oh):

Does anyone want to dig deeper into the robustness of Multimodal LLMs (MLLMs) beyond empirical observations?

Happy to address exactly that with our new #ICML2025 paper "Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach"!
Hyeonbin Hwang (@ronalhwang):

🚨 New Paper co-led with byeongguk jeon (@bkjeon1211) 🚨

Q. Can we adapt Language Models, trained to predict the next token, to reason at the sentence level?

I think LMs operating at a higher level of abstraction would be a promising path towards advancing their reasoning, and I am excited to share our
hyunji amy lee (@hyunji_amy_lee):

🚨 Want models to better utilize and ground their responses in the provided knowledge? We introduce Context-INformed Grounding Supervision (CINGS)! Training LLMs with CINGS significantly boosts grounding abilities in both text and vision-language models compared to standard instruction tuning.
Sakana AI (@sakanaailabs):

We’re excited to introduce AB-MCTS!

Our new inference-time scaling algorithm enables collective intelligence for AI by allowing multiple frontier models (like Gemini 2.5 Pro, o4-mini, DeepSeek-R1-0528) to cooperate.

Blog: sakana.ai/ab-mcts
Paper: arxiv.org/abs/2503.04412
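As a very loose illustration of the collective-intelligence idea in the tweet (not the AB-MCTS algorithm itself; see the linked paper for the actual search procedure), a toy loop in which several models alternately propose fresh answers and refine the current best one might look like this; `ask_model`, `score`, the model names, and the 50/50 widen-vs-refine choice are all hypothetical.

```python
# Toy multi-model inference-time search; a sketch of the cooperation idea only,
# not AB-MCTS. All names and calls below are placeholders.
import random

MODELS = ["model-a", "model-b", "model-c"]  # stand-ins for the frontier models in the tweet

def ask_model(model: str, prompt: str) -> str:
    """Placeholder for calling one of the cooperating models."""
    raise NotImplementedError

def score(question: str, answer: str) -> float:
    """Placeholder for a verifier or reward signal."""
    raise NotImplementedError

def cooperative_search(question: str, budget: int) -> str:
    best_answer, best_score = "", float("-inf")
    for _ in range(budget):
        model = random.choice(MODELS)
        if best_answer and random.random() < 0.5:
            # Refine the current best answer ("go deeper").
            candidate = ask_model(model, f"{question}\nImprove this answer:\n{best_answer}")
        else:
            # Sample a fresh answer ("go wider").
            candidate = ask_model(model, question)
        s = score(question, candidate)
        if s > best_score:
            best_answer, best_score = candidate, s
    return best_answer
```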