Valerie Chen (@valeriechen_) Twitter Tweets • TwiCopy

Valerie Chen

@valeriechen_

phd student @mldcmu @SCSatCMU + visitor @NYUDataScience | building @CopilotArena | previously @MSFTResearch @yale @CMU_Robotics @IBMResearch

ID: 1374055043230535685

Link: https://valeriechen.github.io/ | Joined: 22-03-2021 17:47:10

279 Tweets

1.1K Followers

480 Following

Valerie Chen

@valeriechen_

4 months ago

Presenting this work at GenAICHI workshop today!!

Likes: 20 | Replies: 0 | Retweets: 2

Valerie Chen

@valeriechen_

4 months ago

The presentation is happening today at the Programming and Software Use session (G401)! More details about the paper below👇

Likes: 32 | Replies: 0 | Retweets: 3

Can we use LLMs to generate high-quality *and* original text for creative tasks? We explore where existing models fall on these two axes and try to understand what techniques can push the frontier of novel LLM outputs. Check out Vishakh Padmakumar's thread for more details 👇

Likes: 6 | Replies: 0 | Retweets: 2

Valerie Chen

@valeriechen_

4 months ago

Accepted to #ICML2025! We’ll see you in Vancouver 🥳

Likes: 117 | Replies: 4 | Retweets: 10

Wayne Chi

@iamwaynechi

4 months ago

Accepted to #ICML2025! My first physical conference in a while... Excited to see you all in Vancouver!

Likes: 49 | Replies: 1 | Retweets: 8

Valerie Chen

@valeriechen_

4 months ago

hot off the press 🔥 Wayne Chi and me appearing in The Wall Street Journal article on this latest OpenAI release wsj.com/articles/opena…

Likes: 26 | Replies: 0 | Retweets: 1

Wayne Chi

@iamwaynechi

4 months ago

Got interviewed by The Wall Street Journal about coding and OpenAI ... then this drops 😲 w/ Valerie Chen

Likes: 8 | Replies: 1 | Retweets: 1

Valerie Chen

@valeriechen_

4 months ago

Who is winning the race to claim the LLMs for SWE market? We share our thoughts based on our Copilot Arena work. See article below for current sentiments and what lies ahead 👇

Likes: 20 | Replies: 0 | Retweets: 2

Hussein Mozannar

@hsseinmzannar

3 months ago

Excited to release my first lead project Magentic-UI at Microsoft Research, an OS web agent application designed for efficient human-agent interaction. CUA agents are cool but they're not so useful yet, Magentic-UI helps us study how to get value from them. github.com/microsoft/mage…

Likes: 55 | Replies: 1 | Retweets: 9

Ameet Talwalkar

@atalwalkar

3 months ago

I’m excited to share new work from Datadog AI Research! We just released Toto, a new SOTA (by a wide margin!) time series foundation model, and BOOM, the largest benchmark of observability metrics. Both are available under the Apache 2.0 license. 🧵

Likes: 241 | Replies: 4 | Retweets: 53

NYU Center for Data Science

@nyudatascience

3 months ago

CDS PhD student Vishakh Padmakumar, with co-authors John (Yueh-Han) Chen, Jane Pan, Valerie Chen, and CDS Associate Professor He He, has published new research on the trade-off between originality and quality in LLM outputs. Read more: nyudatascience.medium.com/in-ai-generate…

Likes: 19 | Replies: 2 | Retweets: 4

Copilot Arena

@copilotarena

3 months ago

New result: Qwen-2.5-Coder jumps from 13th to joint 1st place with fill-in-the-middle (FiM)! Congrats to Qwen 🥳 Also check out lmarena.ai 's new UI 🖥️✨

Likes: 7 | Replies: 0 | Retweets: 4

elvis

@omarsar0

3 months ago

Coding Agents 🤝 Multimodal Browsing
Can AI agents generalize beyond their intended scope? Great paper on how you can build generalist agents with superior performance over specialized agents. What models and tools work the best? Here are my notes:

Likes: 209 | Replies: 4 | Retweets: 57

Valerie Chen

@valeriechen_

3 months ago

Exciting new work led by Aditya Soni showing how a few tools can enable agents to solve diverse tasks — from software engineering 🧑‍💻 to information seeking 🔍. Even more exciting to see some of these contributions integrated into OpenHands👐! Check out 🧵for more details✨

Likes: 10 | Replies: 0 | Retweets: 0

All Hands AI

@allhands_ai

3 months ago

The paper about this versatile agent, OpenHands-Versa, was led by Aditya Soni at CMU, and you can read much more about the methodology:
- His summary: x.com/Aditya_Soni_8/…
- The paper: arxiv.org/abs/2506.03011
- Our blog: all-hands.dev/blog/building-…

Likes: 10 | Replies: 1 | Retweets: 2

Graham Neubig

@gneubig

3 months ago

Huge shout-out to Aditya Soni at CMU, whose amazing work on his paper laid the foundation for accuracy improvements on many tasks: x.com/Aditya_Soni_8/… And Juan at All Hands AI, who set up VersaBench to do such a diverse variety of benchmarking.

Likes: 10 | Replies: 2 | Retweets: 1

Xingyao Wang

@xingyaow_

3 months ago

Very excited about OpenHands Versa! With it, OpenHands just got even more versatile — I asked it today to update my website with this paper: "Can you add this to my paper list for this year? arxiv.org/abs/2506.03011" Details and prompts in 🧵

Likes: 73 | Replies: 3 | Retweets: 19

Aditya Soni

@aditya_soni_8

3 months ago

Excited about the results! OpenHands-Versa ranks #1 both in terms of accuracy and cost 🚀 The cost savings are primarily due to context condensation in OpenHands-Versa: it suffices to retain the most recent browsing observation instead of all previous browsing observations.

Likes: 3 | Replies: 0 | Retweets: 1
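
The context condensation idea in the tweet above (keep only the most recent browsing observation rather than every earlier one) can be illustrated with a minimal sketch. This is not the OpenHands-Versa implementation: the `Event` type, the `"browse_observation"` label, the `PLACEHOLDER` string, and the `condense_browsing` function are all hypothetical names chosen for illustration, and replacing older observations with a short placeholder (rather than deleting them) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    kind: str     # hypothetical labels, e.g. "action" or "browse_observation"
    content: str  # raw text shown to the model for this step


# Assumed stand-in text for browsing observations that are no longer kept in full.
PLACEHOLDER = "[earlier browsing observation omitted]"


def condense_browsing(history: List[Event]) -> List[Event]:
    """Return a copy of the history in which only the most recent
    browsing observation keeps its full content."""
    # Index of the latest browsing observation, or -1 if there is none.
    last_idx = max(
        (i for i, e in enumerate(history) if e.kind == "browse_observation"),
        default=-1,
    )
    condensed = []
    for i, event in enumerate(history):
        if event.kind == "browse_observation" and i != last_idx:
            # Older page contents are replaced by a short placeholder.
            condensed.append(Event(event.kind, PLACEHOLDER))
        else:
            condensed.append(event)
    return condensed


history = [
    Event("action", "open page A"),
    Event("browse_observation", "<page A html>"),
    Event("action", "click link"),
    Event("browse_observation", "<page B html>"),
]
condensed = condense_browsing(history)
```

Because page observations tend to be much longer than actions, shortening all but the latest one is where a prompt-size (and therefore cost) reduction of this kind would come from.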