Zora Wang (@zhiruow) Twitter Tweets • TwiCopy

Jaemin Cho (on faculty job market)

5 months ago

Sharing some personal updates 🥳:
- I've completed my PhD at UNC Computer Science! 🎓
- Starting Fall 2026, I'll be joining the Computer Science dept. at Johns Hopkins University (JHU Computer Science) as an Assistant Professor 💙
- Currently exploring options + finalizing the plan for my gap year (Aug

Likes: 395 · Replies: 65 · Reposts: 45

Stella Li

@stellalisy

5 months ago

🤯 We cracked RLVR with... Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…

Likes: 1.1K · Replies: 69 · Reposts: 322

Graham Neubig

@gneubig

5 months ago

I'd like to announce that the CMU FLAME center (cmu.edu/flame/) has a new cluster! It is 256 H100 GPUs, which we'll use to perform larger experiments, build more useful artifacts, and continue our tradition of open research. Expect to see more like this in the future👇

Likes: 149 · Replies: 3 · Reposts: 10

Yizhong Wang

@yizhongwyz

5 months ago

Thrilled to announce that I will be joining UT Austin Computer Science at UT Austin as an assistant professor in fall 2026!

I will continue working on language models, data challenges, learning paradigms, & AI for innovation. Looking forward to teaming up with new students & colleagues! 🤠🤘

Likes: 620 · Replies: 98 · Reposts: 48

Apurva Gandhi

@apurvasgandhi

5 months ago

New preprint on web agents🚨
Go-Browse: Training Web Agents with Structured Exploration
Problem: LLMs lack prior understanding of the websites that web agents will be deployed on.
Solution: Go-Browse is an unsupervised method for automatically collecting diverse and realistic

Likes: 37 · Replies: 3 · Reposts: 6

Lindia Tjuatja

@lltjuatja

4 months ago

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9

Likes: 85 · Replies: 1 · Reposts: 18

Omar Shaikh

@oshaikh13

4 months ago

What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵

Likes: 181 · Replies: 12 · Reposts: 57

Michael Ryan

@michaelryan207

4 months ago

New #ACL2025NLP Paper! 🎉 Curious what AI thinks about YOU? We interact with AI every day, offering all kinds of feedback, both implicit ✏️ and explicit 👍. What if we used this feedback to personalize your AI assistant to you? Introducing SynthesizeMe! An approach for

Likes: 135 · Replies: 7 · Reposts: 35

Junhong Shen

@junhongshen1

4 months ago

🔥Unlocking New Paradigm for Test-Time Scaling of Agents! We introduce Test-Time Interaction (TTI), which scales the number of interaction steps beyond thinking tokens per step. Our agents learn to act longer ➡️ richer exploration ➡️ better success
Paper: arxiv.org/abs/2506.07976

Likes: 154 · Replies: 7 · Reposts: 36

Saining Xie

@sainingxie

4 months ago

Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…

Likes: 347 · Replies: 17 · Reposts: 60

Yijia Shao

@echoshao8899

4 months ago

🚨 70 million US workers are about to face their biggest workplace transition due to AI agents. But nobody asks them what they want. While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce.🧵

Likes: 280 · Replies: 6 · Reposts: 47

Diyi Yang

@diyi_yang

4 months ago

AI agents are transforming the workforce! We mapped how AI agents could #automate vs. #augment jobs across the U.S. workforce, with a worker-first look at the future of work👇🧵

Likes: 8 · Replies: 1 · Reposts: 5

Sanjay Subramanian

@sanjayssub

4 months ago

Also be sure to check out this awesome work on automated slide generation led by Jiaxin Ge and Zora Wang on Friday at Poster Session 1 - ExHall D #262. x.com/aomaru_21490/s…

Likes: 3 · Replies: 1 · Reposts: 1

Jiaxin Ge

@aomaru_21490

4 months ago

Excited to be at CVPR! I’ll be presenting AutoPresent on Friday at Poster Session 1 - ExHall D #262. Looking forward to meeting old and new friends and chatting about the future of agents for visual generation!

Likes: 62 · Replies: 0 · Reposts: 4

evanthebouncy

@evanthebouncy

4 months ago

new multi-turn instruction grounding dataset with Will McCarthy and Saujas Vaduguru

- multi-modal instruction : drawing + txt
- verifiable execution : 2D CAD gym env
- easy eval : API → score
- baselines : human vs VLMs
- large : 15,163 inst-exe rounds

github.com/AutodeskAILab/…
[1/n]

Likes: 27 · Replies: 1 · Reposts: 8

Graham Neubig

@gneubig

4 months ago

We just updated the leaderboard of TheAgentCompany, a benchmark of tasks like real-world work.
- In December 2024, 24% of the tasks could be solved
- In June 2025, 33% of the tasks could be solved
I'm interested to see when we'll be at 50%.

Likes: 90 · Replies: 2 · Reposts: 17

Jiayi Geng

@jiayiigeng

4 months ago

I'm thrilled to share that I've moved to Pittsburgh and joined NeuLab at CMU as a research intern this summer, advised by Graham Neubig! I'll also start my PhD at the Language Technologies Institute | @CarnegieMellon this fall. Feel free to reach out if you're interested in chatting about multi-agent systems, LLMs for scientific

Likes: 369 · Replies: 11 · Reposts: 13

jack morris

@jxmnop

4 months ago

seems big AI labs are hyperfixating on reasoning when they should focus on *memory* instead

normal people won't use models that can think for hours to solve hard math problems

people want models that learn over time, remember details, adapt and interact like a person would

Likes: 1.1K · Replies: 108 · Reposts: 68

CLS

@chengleisi

4 months ago

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.

Likes: 553 · Replies: 10 · Reposts: 162

Graham Neubig

@gneubig

4 months ago

What will software development look like in 2026? With coding agents rapidly improving, dev roles may look quite different. My current workflow has changed a lot:
- Work in github, not IDEs
- Agents in parallel
- Write English, not code
- More code review
Thoughts + a video👇

Likes: 119 · Replies: 3 · Reposts: 16