Thomas Buckley (@tabuckley_) 's Twitter Profile
Thomas Buckley

@tabuckley_

PhD Student at @HarvardDBMI

ID: 1677706430781046784

Joined: 08-07-2023 15:49:38

18 Tweets

72 Followers

146 Following

Arjun (Raj) Manrai (@arjunmanrai) 's Twitter Profile Photo

In our new preprint, we evaluated GPT-4V on 934 challenging <a href="/NEJM/">NEJM</a> medical image cases and 69 clinicopathological conferences. GPT-4V outperformed human respondents overall and across difficulty levels, skin tones, and image types except radiology, where it matched humans. GPT-4V
Adam Rodman (@adamrodmanmd) 's Twitter Profile Photo

Do large language models have a probabilistic understanding of disease states? And what does this mean for the future of diagnosis and clinical reasoning? I explore this with Thomas Buckley, Arjun (Raj) Manrai, and Dan Morgan in our new paper in JAMA Network Open. A brief 🧵⬇️

Marinka Zitnik (@marinkazitnik) 's Twitter Profile Photo

(1/4) Excited to introduce ProCyon: a multimodal foundation model to model, generate, and predict protein phenotypes, led by stellar <a href="/oq_35/">Owen Queen</a>, <a href="/YepHuang/">Yepeng</a>, Robert Calef, and <a href="/valegiunca/">Valentina Giunchiglia</a>

👉 biorxiv.org/content/10.110…

ProCyon is an 11B parameter multimodal model that integrates protein
Isaac Kohane (@zakkohane) 's Twitter Profile Photo

Is a well-placed word worth a thousand pictures for an AI? "the provision of images either reduced or had no effect on model performance when text is already highly informative." arxiv.org/abs/2311.05591 Kudos Thomas dbmi.hms.harvard.edu/people/thomas-…

Adam Rodman (@adamrodmanmd) 's Twitter Profile Photo

Preprint out today that tests o1-preview's medical reasoning against a baseline of hundreds of clinicians. In this case the title says it all: Superhuman performance of a large language model on the reasoning tasks of a physician. Link: arxiv.org/abs/2412.10849 A 🧵⬇️

Deedy (@deedydas) 's Twitter Profile Photo

o1-preview is far superior to doctors on reasoning tasks and it's not even close, according to OpenAI's latest paper.

AI does ~80% vs ~30% on the 143 hard NEJM CPC diagnoses.

It's dangerous now to trust your doctor and NOT consult an AI model.

Here are some actual tasks:

1/5
Ethan Mollick (@emollick) 's Twitter Profile Photo

‼️"o1-preview demonstrates superhuman performance in differential diagnosis, diagnostic clinical reasoning, and management reasoning, superior in multiple domains compared to prior model generations and human physicians." And this is using vignettes, not multiple choice.

Arjun (Raj) Manrai (@arjunmanrai) 's Twitter Profile Photo

How does o1-preview compare to hundreds of clinicians in medical reasoning? We explore this in our new preprint, led by stars Thomas Buckley of DBMI at Harvard Med and Peter Brodeur of BIDMC. Full text: arxiv.org/abs/2412.10849 Great thread by co-conspirator Adam Rodman below 👇

Ayush Noori (@ayushnoori) 's Twitter Profile Photo

Thanks for the sneak peek of this last month, <a href="/AdamRodmanMD/">Adam Rodman</a>. This work from Adam and <a href="/arjunmanrai/">Arjun (Raj) Manrai</a> (co-led by the inimitable <a href="/tabuckley_/">Thomas Buckley</a>) is really worth a read.

The most astounding result highlighted in the thread below: 👇🏽
Eric Horvitz (@erichorvitz) 's Twitter Profile Photo

Excited to share recent results on the impressive performance of o1-preview on medical diagnosis & management challenges: arxiv.org/abs/2412.10849 Additional reflections on these developments in my LinkedIn article: tinyurl.com/za3nj5yw

Greg Brockman (@gdb) 's Twitter Profile Photo

promising results of o1-preview for medical reasoning (though still lots of work to figure out how to integrate with the healthcare system)

Arjun (Raj) Manrai (@arjunmanrai) 's Twitter Profile Photo

Open-source LLMs have narrowed the gap with leading proprietary models surprisingly fast. But how well do they do on tough clinical cases?

Excited to share our new study out today in <a href="/JAMAHealthForum/">JAMA Health Forum</a> led by <a href="/tabuckley_/">Thomas Buckley</a> with amazing physician coauthors. Detailed thread to come
JAMA Health Forum (@jamahealthforum) 's Twitter Profile Photo

An open-source LLM performed comparably to GPT-4 in generating differential diagnoses for complex cases, indicating that high-performing, locally-deployable custom models can enhance clinical decision support while ensuring data privacy and flexibility. ja.ma/4bGTyHx

Adam Rodman (@adamrodmanmd) 's Twitter Profile Photo

New research letter on the diagnostic abilities of open-source models, led by superstar <a href="/tabuckley_/">Thomas Buckley</a> -- short letter (and something that I think most researchers already know) but big implications for medicine.

A 🧵⬇️
Adam Rodman (@adamrodmanmd) 's Twitter Profile Photo

Huge update to our preprint today on the superhuman performance of reasoning models in medical diagnosis! TL;DR – they don't just surpass humans in meaningful benchmarks, but in actual medical care from unstructured clinical data: A 🧵⬇️: x.com/AdamRodmanMD/s…

Arjun (Raj) Manrai (@arjunmanrai) 's Twitter Profile Photo

We just added a major new experiment to our o1 study, comparing o1 and attending physicians at key diagnostic touchpoints on REAL cases from the BIDMC ER. Stellar work led by Thomas Buckley and Peter Brodeur as part of a rapidly growing DBMI at Harvard Med and BIDMC collab 🚀

Ethan Mollick (@emollick) 's Twitter Profile Photo

Updated paper by physicians at Harvard, Stanford, and other academic medical centers testing o1-preview for medical reasoning & diagnosis tasks: “In all experiments—both vignettes and emergency room second opinions—the LLM displayed superhuman diagnostic and reasoning abilities.”
