Michael Oberst (@michaeloberst)'s Twitter Profile
Michael Oberst

@michaeloberst

Assistant Professor of CS at @JohnsHopkins, Part-time Visiting Scientist @AbridgeHQ. Previously: Postdoc at @CarnegieMellon. PhD from @MIT_CSAIL.

ID: 360876248

Link: https://www.michaelkoberst.com
Joined: 23-08-2011 22:17:57

234 Tweets

2.2K Followers

977 Following

Yisong Yue (@yisongyue)

Just updated my Tips for CS Faculty Applications. Best of luck to everyone applying! yisongyue.medium.com/checklist-of-t…

Daniel P Jeong (@danielpjeong)

🧵 Are "medical" LLMs/VLMs *adapted* from general-domain models, always better at answering medical questions than the original models? In our oral presentation at #EMNLP2024 today (2:30pm in Tuttle), we'll show that surprisingly, the answer is "no". arxiv.org/abs/2411.04118

Monica Agrawal (@monicanagrawal)

Excited to be here at #ICML2025 to present our paper on 'pragmatic misalignment' in (deployed!) RAG systems: narrowly "accurate" responses that can be profoundly misinterpreted by readers. It's especially dangerous for consequential domains like medicine! arxiv.org/pdf/2502.14898

Niloofar (on faculty job market!) (@niloofar_mire)

🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

Daniel P Jeong (@danielpjeong)

Excited to talk about our work on evaluating *medically adapted* LLMs and VLMs, covering additional results since our oral presentation at EMNLP 2024. If you ever wondered whether Med-* LLMs/VLMs are really better than their general-domain counterparts, come check it out on 8/4!

Divya Shanmugam (@dmshanmugam)

New #NeurIPS2025 paper: how should we evaluate machine learning models without a large, labeled dataset? We introduce Semi-Supervised Model Evaluation (SSME), which uses labeled and unlabeled data to estimate performance! We find SSME is far more accurate than standard methods.

Danielle Bitterman, MD (@dbittermanmd)

LLMs tend to prioritize helpfulness > reason. We show that safety-aware, compute-efficient fine-tuning helps models reason more critically in healthcare domain, and generalizes to improved safety alignment across other domains. nature.com/articles/s4174…
