Zora Wang (@zhiruow)'s Twitter Profile
Zora Wang

@zhiruow

PhD student @LTIatCMU | prev @Amazon Alexa AI, @Microsoft Research, Asia | fun 👩🏻‍💻 🐈 💃 🪴 🎶

ID: 1427604853212303408

Link: https://zorazrw.github.io/
Joined: 17-08-2021 12:15:06

198 Tweets

1.1K Followers

325 Following

Jaemin Cho (on faculty job market) (@jmin__cho)

Sharing some personal updates 🥳:
- I've completed my PhD at UNC Computer Science! 🎓
- Starting Fall 2026, I'll be joining the Computer Science dept. at Johns Hopkins University (JHU Computer Science) as an Assistant Professor 💙
- Currently exploring options + finalizing the plan for my gap year (Aug

Stella Li (@stellalisy)

🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…

Graham Neubig (@gneubig)

I'd like to announce that the CMU FLAME center (cmu.edu/flame/) has a new cluster! It is 256 H100 GPUs, which we'll use to perform larger experiments, build more useful artifacts, and continue our tradition of open research. Expect to see more like this in the future👇

Yizhong Wang (@yizhongwyz)

Thrilled to announce that I will be joining UT Austin Computer Science at UT Austin as an assistant professor in fall 2026! I will continue working on language models, data challenges, learning paradigms, & AI for innovation. Looking forward to teaming up with new students & colleagues! 🤠🤘

Apurva Gandhi (@apurvasgandhi)

New preprint on web agents🚨 Go-Browse: Training Web Agents with Structured Exploration Problem: LLMs lack prior understanding of the websites that web agents will be deployed on. Solution: Go-Browse is an unsupervised method for automatically collecting diverse and realistic

Lindia Tjuatja (@lltjuatja)

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9

Omar Shaikh (@oshaikh13)

What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵

Michael Ryan (@michaelryan207)

New #ACL2025NLP Paper! 🎉 Curious what AI thinks about YOU? We interact with AI every day, offering all kinds of feedback, both implicit ✏️ and explicit 👍.  What if we used this feedback to personalize your AI assistant to you? Introducing SynthesizeMe! An approach for

Junhong Shen (@junhongshen1)

🔥Unlocking New Paradigm for Test-Time Scaling of Agents! We introduce Test-Time Interaction (TTI), which scales the number of interaction steps beyond thinking tokens per step. Our agents learn to act longer➡️richer exploration➡️better success Paper: arxiv.org/abs/2506.07976

Saining Xie (@sainingxie)

Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…

Yijia Shao (@echoshao8899)

🚨 70 million US workers are about to face their biggest workplace transition due to AI agents. But nobody asks them what they want. While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce.🧵

Diyi Yang (@diyi_yang)

AI agents are transforming the workforce! We mapped how AI agents could #automate vs. #augment jobs across the U.S. workforce With a worker-first look of the future of work👇🧵

Sanjay Subramanian (@sanjayssub)

Also be sure to check out this awesome work on automated slide generation led by Jiaxin Ge and Zora Wang on Friday at Poster Session 1 - ExHall D #262. x.com/aomaru_21490/s…

Jiaxin Ge (@aomaru_21490)

Excited to be at CVPR! I’ll be presenting AutoPresent on Friday at Poster Session 1 - ExHall D #262. Looking forward to meeting old and new friends and chatting about the future of agents for visual generation!

evanthebouncy (@evanthebouncy)

new multi-turn instruction grounding dataset with Will McCarthy and Saujas Vaduguru

- multi-modal instruction : drawing + txt
- verifiable execution : 2D CAD gym env
- easy eval : API → score
- baselines : human vs VLMs
- large : 15,163 inst-exe rounds

github.com/AutodeskAILab/…
[1/n]

Graham Neubig (@gneubig)

We just updated the leaderboard of TheAgentCompany, a benchmark of tasks like real-world work.

- In December 2024, 24% of the tasks could be solved
- In June 2025, 33% of the tasks could be solved

I'm interested to see when we'll be at 50%.

Jiayi Geng (@jiayiigeng)

I'm thrilled to share that I've moved to Pittsburgh and joined NeuLab at CMU as a research intern this summer, advised by Graham Neubig! I'll also start my PhD Language Technologies Institute | @CarnegieMellon this fall. Feel free to reach out if you're interested in chatting about multi-agent systems, LLMs for scientific

jack morris (@jxmnop)

seems big AI labs are hyperfixating on reasoning when they should focus on *memory* instead normal people won't use models that can think for hours to solve hard math problems people want models that learn over time, remember details, adapt and interact like a person would

CLS (@chengleisi)

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.

Graham Neubig (@gneubig)

What will software development look like in 2026? With coding agents rapidly improving, dev roles may look quite different. My current workflow has changed a lot:

- Work in github, not IDEs
- Agents in parallel
- Write English, not code
- More code review

Thoughts + a video👇