Tom Hartvigsen (@tom_hartvigsen)'s Twitter Profile
Tom Hartvigsen

@tom_hartvigsen

Assistant Professor of Data Science @UVA developing responsible machine learning and NLP methods for ever-changing environments. Recruiting PhD students!

ID: 774042053655355392

Website: https://tomhartvigsen.com

Joined: 09-09-2016 00:29:41

291 Tweets

1.1K Followers

747 Following

Sasha Luccioni, PhD 🦋🌎✨🤗 (@sashamtl)

I keep getting asked about my take on these CO2 estimates for the o3 model by the press and members of the community, so I'll interrupt my vacation to comment 🤓 TL;DR- any kind of estimate is a proxy, and instead of wasting our time and energy, we should demand accountability.

Maarten Sap (he/him) (@maartensap)

CMU LTI is hosting predoc interns this summer, centered around "Language Technologies for All"! Please apply and circulate! lti.cs.cmu.edu/news-and-event…

Liam McCoy, MD MSc (@liamgmccoy)

What do we want to know about LLMs in research? Out today in Nature Medicine, we take a stab at developing a comprehensive, living taxonomy of LLM use-cases and reporting.

Hannah Kerner (@hannah_kerner)

#ICML2025 includes a new track on Application-Driven Machine Learning (innovative ML techniques, problems, and datasets driven by the needs of end-users in real-world)! If this fits your work, consider submitting to ICML (dl: Jan 30) and checking the ADML box ✅ in OpenReview ⬇️

Antonios Mamalakis (@antoniosmamala2)

Dear Climate and AI community! We are hiring 😀 a postdoc to join UVA Environmental Institute at UVA and work with Chirag Agarwal and myself, on using multimodal AI models and explainable AI to attribute extreme precipitation events! Fascinating stuff! Link below. Please RT!

Tom Hartvigsen (@tom_hartvigsen)

Excited to share that our work on keeping LLMs up-to-date by composing multiple post-training interventions was accepted to #ICLR2025! Great work led by Arinbjörn and Kyle O'Brien!

Emily Alsentzer (@emily_alsentzer)

Medical licensing exams are convenient LLM benchmarks, but they don’t reflect real-world clinical tasks. With LLMs already in EHRs, we need benchmarks that match real-world needs. Let’s partner with hospitals piloting these tools to develop diverse, task-specific evaluations.

Shan Chen (@shan23chen)

More SAE papers coming! We dived deeper, looking into what is the best way to gather the SAE features for downstream classifications and also what are the potential benefits 🧐.

Tom Hartvigsen (@tom_hartvigsen)

I'm honored to have received a research award from Capital One to support our work developing models that reason about time series data! Thank you! Many exciting new results in this area coming soon :)

Tom Hartvigsen (@tom_hartvigsen)

Excited to share new works on knowledge editing for LLMs! Many recent papers find cases where editing LLMs breaks them quickly, but we find the commonly-studied editing methods are needlessly destructive. With some easy-to-use tweaks, we avoid model degradation for WAY longer!

Akshat Gupta (@akshatgupta57)

Our work on knowledge editing got an "Outstanding Paper Award"🏆🏆 at the AAAI KnowFM Workshop!! #AAAI2025 🥳🥳🥳 Congratulations to my amazing co-authors Tom Hartvigsen Ahmed Alaa Gopala Anumanchipalli

Tom Hartvigsen (@tom_hartvigsen)

New #ICLR2025 paper to be presented by Sujay Nagaraj! It's the first method to capture how label noise can change over time in sequential classification tasks. Many cool implications, one being towards better models of labelers' behavior over time, even for non time series tasks🎉

Priyanshu Kumar (@kpriyanshu256)

Need a multilingual safety detector? 🚨Introducing PolyGuard🚨 ⚙️ supports 17 languages ⚙️ generates structured output for prompt safety, response safety, and model refusal 🚀 outperforms existing SOTA open and commercial safety detectors by 5.5% 📜 arxiv.org/abs/2504.04377🧵

Prithviraj (Raj) Ammanabrolu (@rajammanabrolu)

Introducing TALES - Text Adventure Learning Environment Suite. A benchmark of a few hundred text envs: science experiments and embodied cooking to solving murder mysteries. We test over 30 of the best LLM agents and pinpoint failure modes + how to improve. 👨‍💻 pip install tale-suite

Tom Hartvigsen (@tom_hartvigsen)

Excited we have some papers accepted to ICML Conference in collaborations with some tremendous folks 🎉 Looking forward to Vancouver to discuss model editing for LLMs/VLMs and improving medical benchmarking!

Explainable Machine Learning (@explainableml)

🚨Happy to announce that one paper, "Understanding the Limits of Lifelong Knowledge Editing in LLMs", is accepted at #icml2025! Congrats to the wonderful authors Lukas Thede, Karsten Roth, Matthias Bethge, Zeynep Akata, and Tom Hartvigsen. 👇 Highlights in the thread

Shan Chen (@shan23chen)

Designing a hard but useful benchmark has always been a passion of mine. Here we present MedBrowseComp, a deep research + computer use benchmark that is easy to verify (like BrowseComp from OpenAI) but still very expandable 💊! Project page: moreirap12.github.io/mbc-browse-app/ 1/n

Akshat Gupta (@akshatgupta57)

Just did a major revision to our paper on Lifelong Knowledge Editing!🔍 Key takeaway (+ our new title) - "Lifelong Knowledge Editing requires Better Regularization" Fixing this leads to consistent downstream performance! Tom Hartvigsen Ahmed Alaa Gopala Anumanchipalli Berkeley AI Research
