Dongkeun Yoon (@dongkeun_yoon)'s Twitter Profile
Dongkeun Yoon

@dongkeun_yoon

PhD student @kaist_ai, research intern @LG_AI_Research. Researching multilinguality in LLMs.

ID: 1504088658135040008

Link: https://mattyoon.github.io/ · Joined: 16-03-2022 13:34:20

127 Tweets

324 Followers

239 Following

Dongkeun Yoon (@dongkeun_yoon):

🙁 LLMs are overconfident even when they are dead wrong.

🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”?

❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
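A claim like "my answer is only 60% likely to be correct" is testable: compare stated confidence against whether the answer turned out to be right. A minimal sketch of one standard way to score this (the Brier score; this is illustrative only, not the paper's evaluation, and the numbers are invented):

```python
def brier_score(pairs):
    """Mean squared gap between stated confidence and outcome (0 or 1).
    Lower is better; a perfectly calibrated, perfectly sharp model scores 0."""
    return sum((conf - correct) ** 2 for conf, correct in pairs) / len(pairs)

# (verbalized confidence, was the answer correct?)
overconfident = [(0.95, 1), (0.95, 0), (0.95, 0), (0.95, 1)]  # always "95% sure"
calibrated    = [(0.90, 1), (0.60, 1), (0.30, 0), (0.20, 0)]  # hedges when unsure

print(brier_score(overconfident))  # 0.4525
print(brier_score(calibrated))     # 0.075
```

The overconfident model is penalized heavily on the questions it gets wrong while claiming 95% certainty, so the hedging model scores far better despite answering fewer questions correctly with high confidence.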
Sohee Yang (@soheeyang_):

Reasoning models are quite verbose in their thinking process. Is that any good? We find that it enables reasoning models to be more accurate in telling what they know and don’t know (confidence)! Even non-reasoning models can do it better if they mimic the verbose reasoning! 👀

Seungone Kim @ NAACL2025 (@seungonekim):

Turns out that reasoning models not only excel at solving problems but are also excellent confidence estimators - an unexpected side effect of long CoTs! This reminds me that smart people are good at determining what they know & don't know 👀 Check out Dongkeun Yoon's post!

fly51fly (@fly51fly):

[CL] Reasoning Models Better Express Their Confidence
D Yoon, S Kim, S Yang, S Kim... [KAIST & CMU & UCL] (2025)
arxiv.org/abs/2505.14489
Smells Like ML (@smellslikeml):

Dongkeun Yoon, congrats to the team for this fantastic work! Had a chance to try the code on my reasoning VLM and found consistent results. x.com/smellslikeml/s…

Chaeeun Kim (@chaechaek1214):

❓What if your RAG didn’t need a separate retrieval model at all?

We present 🧊FREESON, a new framework for retriever-FREE retrieval-augmented reasoning.

With FREESON, a single LRM acts as both generator and retriever, shifting the focus from seq2seq matching to locating
arlo_son (@gson_ai):

Imagine you’re collaborating with an AI co-scientist: you ask it to proofread your manuscript and flag any errors. Which LLM would you choose? 🤔

We evaluated the new Claude 4 models on SPOT. It looks like o3 is still the best model for this.
Hoyeon Chang (@hoyeon_chang):

New preprint 📄 (with Jinho Park)

Can neural nets really reason compositionally, or just match patterns?  
We present the Coverage Principle: a data-centric framework that predicts when pattern-matching models will generalize (validated on Transformers). 🧵👇
Hyeonbin Hwang (@ronalhwang):

🚨 New Paper co-led with byeongguk jeon 🚨

Q. Can we adapt Language Models, trained to predict the next token, to reason at the sentence level?

I think LMs operating at a higher level of abstraction would be a promising path toward advancing their reasoning, and I am excited to share our
Sheikh Shafayat ✈️ ICLR'25 🇸🇬 (@shafayat_sheikh):

Check out our latest work on self-improving LLMs, where we try to see if LLMs can utilize their internal self-consistency as a reward signal to bootstrap themselves using RL.

TL;DR: it can, to some extent, but then ends up reward hacking the self-consistency objective. We try to see
Dayoon Ko (@dayoon12161):

🚨 Excited to share that our paper was accepted to #ACL2025 Findings 🎉 "When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR" Huge thanks to my amazing collaborators, Jinyoung Kim and Sohyeon Kim! 🙌 We propose

Sohee Yang (@soheeyang_):

🚨 New Paper 🧵
How effectively do reasoning models reevaluate their thoughts? We find that:
- Models excel at identifying unhelpful thoughts but struggle to recover from them
- Smaller models can be more robust
- Self-reevaluation ability is far from true meta-cognitive awareness
hyunji amy lee (@hyunji_amy_lee):

🚨 Want models to better utilize and ground on the provided knowledge? We introduce Context-INformed Grounding Supervision (CINGS)! Training LLM with CINGS significantly boosts grounding abilities in both text and vision-language models compared to standard instruction tuning.

Ricardo Rei (@ricardorei7):

🚀 Tower+: our latest model in the Tower family — sets a new standard for open-weight multilingual models!
We show how to go beyond sentence-level translation, striking a balance between translation quality and general multilingual capabilities.
1/5

arxiv.org/pdf/2506.17080
José Maria Pombal (@zmprcp):

Check out the latest iteration of Tower models, Tower+. Ideal for translation tasks and beyond, and available at three different scales: 2B, 9B, and 72B, all on huggingface: huggingface.co/collections/Un… Kudos to everyone involved!

hyunji amy lee (@hyunji_amy_lee):

🥳 Excited to share that I’ll be joining UNC Computer Science as a postdoc this fall. Looking forward to working with Mohit Bansal & amazing students at UNC AI. I'll continue working on retrieval, aligning knowledge modules with LLMs' parametric knowledge, and expanding to various modalities.
