Jimin Mun (@jiminmun_) Twitter Tweets • TwiCopy

Clara Na

a year ago

Building/customizing your own LLM? You'll want to curate training data for it, but how do you know what makes the data good? You can try out recipes👩‍🍳 iterate on vibes✨ but we can't actually test all possible combos of tweaks,,, right?? 🙅‍♂️WRONG! arxiv.org/abs/2410.15661 (1/n) 🧵

Likes: 171 · Replies: 3 · Reposts: 35

Jocelyn Shen

@jocelynjshen

a year ago

Will be presenting our work next week at #EMNLP2024 in Computational Social Science + Cultural Analytics session 1 (Nov 12)!! Come say hello ☺️🌴

Likes: 79 · Replies: 1 · Reposts: 9

Shuyan Zhou

@shuyanzhxyc

a year ago

My lab at Duke has multiple Ph.D. openings! Our mission is to augment human decision-making by advancing the reasoning, comprehension, and autonomy of modern AI systems. I am attending #emnlp2024, happy to chat about PhD applications, LLM agents, evaluation etc etc!

Likes: 252 · Replies: 8 · Reposts: 57

Simran Khanuja

@simi_97k

a year ago

Thank you so much EMNLP 2025 for this wonderful recognition! I’m so honored and humbled 💕 Thanks Graham Neubig for your support throughout! We’ve been working on this for 1.5 years and everyone who has spoken with me in the recent past knows how passionately I feel about this

Likes: 444 · Replies: 77 · Reposts: 30

So Yeon (Tiffany) Min on Industry Job Market

@soyeontiffmin

10 months ago

🚨🚨 Preprint Alert 🚨🚨 🚀🚀 As AIs become agents 🤖, how can we reliably delegate tasks to them if they cannot communicate their limitations😭 or ask for help or test-time compute 🧑‍🚒 when needed? We present our new preprint **Self-Regulation and Requesting Interventions**

Likes: 108 · Replies: 1 · Reposts: 39

Jimin Mun

@jiminmun_

10 months ago

Check out our work on improving LLMs' ability to seek information by asking better questions! 💫

Likes: 31 · Replies: 0 · Reposts: 7

Chan Young Park

@chan_young_park

9 months ago

Can AI achieve political neutrality? Check out our new position paper! (spoiler: no, it can’t, but there are other ways)

Likes: 14 · Replies: 0 · Reposts: 4

Chan Young Park

@chan_young_park

9 months ago

⭐️Looking for a PhD Intern⭐️ Join me this summer at MSR to work on personal AI agents! We're developing innovative models to enhance personalized MS Copilot experiences. I'm seeking candidates with strong modeling skills and experience with LLM (multi-)agents/preference learning

Likes: 80 · Replies: 2 · Reposts: 18

Akhila Yerukola

@akhila_yerukola

9 months ago

Did you know? Gestures to express universal concepts—like wishing for luck—vary WIDELY across cultures? 🤞means luck in US but deeply offensive in Vietnam 🚨 📣We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal cues 📜: arxiv.org/abs/2502.17710

Likes: 50 · Replies: 2 · Reposts: 15

Santiago Cortés-Gómez

@sancortes_95

9 months ago

Throwback to our work on Decision-Aware Uncertainty Quantification! Excited that it will be presented at ICLR 2025! If you missed it, check it out here: arxiv.org/abs/2410.01767 x.com/sancortes_95/s…

Likes: 11 · Replies: 0 · Reposts: 2

Danny To Eun Kim (@teknology.bsky.social)

@teknologyy

9 months ago

🚨New Breakthrough in Tip-of-the-Tongue (TOT) Retrieval Research! We address data limitations and offer a fresh evaluation method for the TOT complex queries. Curious how TREC TOT track test queries are created? Check out this thread🧵 and our paper📄: arxiv.org/abs/2502.17776

Likes: 29 · Replies: 2 · Reposts: 9

Seungone Kim @ NAACL2025

@seungonekim

8 months ago

#NLProc New paper on "evaluation-time scaling", a new dimension to leverage test-time compute! We replicate the test-time scaling behaviors observed in generators (e.g., o1, r1, s1) with evaluators by forcing them to generate additional reasoning tokens. arxiv.org/abs/2503.19877

Likes: 171 · Replies: 2 · Reposts: 37

Hyunwoo Kim

@hyunw_kim

7 months ago

Humans backtrack where we should've made a better decision. How do we do this? We search and simulate alternative paths that might have led to better outcomes. Our🌈RETRO-Search mimics this process, empowering models to achieve SOTA performance AND efficient reasoning in math🌟

Likes: 33 · Replies: 0 · Reposts: 4

Omar Shaikh

@oshaikh13

7 months ago

Hi! I'm gonna be presenting this at #ICLR2025 during the Thursday poster session (4/24; 3 p.m - 5:30 p.m, Hall 3 + Hall 2B #208). Come by if you want to talk about making ice cream!! (and also human-computer grounding, interacting with LMs, user models, etc.)

Likes: 98 · Replies: 0 · Reposts: 14

Chan Young Park

@chan_young_park

7 months ago

🚀 Excited to share our #NAACL2025 paper on Language Model Personalization! arxiv.org/abs/2410.16027 Current RLHF methods often overlook *whose* preferences are being optimized. This can cause conflicting signals and models that mainly cater to the “average” or most dominant users

Likes: 84 · Replies: 2 · Reposts: 15

Valentina Pyatkin

@valentina__py

7 months ago

📢 The SoLaR workshop will be collocated with COLM (Conference on Language Modeling)! SoLaR is a collaborative forum for researchers working on responsible development, deployment and use of language models. We welcome both technical and sociotechnical submissions, deadline July 5th!

Likes: 85 · Replies: 1 · Reposts: 13

Valentina Pyatkin

@valentina__py

7 months ago

📄 Website: solar-colm.github.io/call/ ✨ Organizers: Usman Anwar, Liwei Jiang, Valentina Pyatkin, Sharon Levy, Daniel Tan, Akhila Yerukola, Jimin Mun, Ruth Appel, Sumeet Motwani, David Krueger, Sheila McIlraith, Maarten Sap (he/him)

Likes: 11 · Replies: 0 · Reposts: 2

Seungone Kim @ NAACL2025

@seungonekim

7 months ago

Glad to share that our AgoraBench paper has been accepted at ACL 2025 (main)! Special thanks to our coauthors JuYoung Suk, Xiang Yue, Vijay V., Seongyun Lee, Yizhong Wang, Kiril Gashteovski, Carolin, Sean Welleck, Graham Neubig! A belief I hold more firmly now than when I started this project

Likes: 65 · Replies: 1 · Reposts: 10

Myra Cheng

@chengmyra1

7 months ago

Dear ChatGPT, Am I the Asshole? While Reddit users might say yes, your favorite LLM probably won’t. We present Social Sycophancy: a new way to understand and measure sycophancy as how LLMs overly preserve users' self-image.

Likes: 303 · Replies: 12 · Reposts: 33

Stella Li

@stellalisy

6 months ago

🤯 We cracked RLVR with... Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵 Blogpost: tinyurl.com/spurious-rewar…

Likes: 1.1K · Replies: 69 · Reposts: 322