Dean Carignan (@deancarignan) 's Twitter Profile
Dean Carignan

@deancarignan

Chief of Staff for @Microsoft's Chief Scientific Officer; exploring responsible practices in AI, Data Science, and ML Ops. Ex: @MSFTResearch, @McKinsey, @WorldBank

ID: 77415138

Joined: 26-09-2009 06:35:47

396 Tweets

1.1K Followers

1.1K Following

Sebastian Bordt (@sbordt) 's Twitter Profile Photo

Should we trust LLM evaluations on publicly available benchmarks?🤔 Our latest work studies the overfitting of few-shot learning with GPT-4. with Harsha Nori Vanessa Rodrigues Besmira Nushi 💙💛 and Rich Caruana Paper: arxiv.org/abs/2404.06209 More details👇 [1/N]

Eric Horvitz (@erichorvitz) 's Twitter Profile Photo

In machine learning, it is important to recognize that patterns of error can change with model updates: new errors can appear even when overall model accuracy increases. Besmira Nushi 💙💛 Microsoft Research

Dean Carignan (@deancarignan) 's Twitter Profile Photo

Fully agree with Yu Su here. Access to a model’s internals can unlock a range of new capabilities that are only now being understood and utilized. Exciting work on reranking in the paper.

Dean Carignan (@deancarignan) 's Twitter Profile Photo

Why do leadership teams fail to react even when disruption is imminent? This article deconstructs the reasons for paralysis and points out useful solutions. Particularly important is the discussion on managing leadership attention. The attention of the top leaders is like the…

Dean Carignan (@deancarignan) 's Twitter Profile Photo

A key limitation for mechanistic interpretation of LLMs is the high level of technical knowledge required. This demo shared by Kevin Meng of Transluce looks ahead to an interface that could allow non-technical users to both interpret and steer LLMs. Exciting work!

Dean Carignan (@deancarignan) 's Twitter Profile Photo

Important work by my Microsoft colleagues. A year ago, we showed that prompting techniques (Medprompt) could enable a generalist model (GPT-4) to outperform a specialist (MedPaLM-2) on medical challenge questions. This latest work suggests that o1 may have internalized aspects…

Dean Carignan (@deancarignan) 's Twitter Profile Photo

My INSEAD Tech Talk on AI co-innovation with Peter Zemsky and Siddhartha Chaturvedi is now live. I enjoyed this deep dive into how cross-disciplinary collaboration can unlock massive value in the Gen AI era. youtu.be/gzn9MyUjH2w

Dean Carignan (@deancarignan) 's Twitter Profile Photo

Responsible AI in Practice: I'm excited to share my conversation with Matthew DeMello on Responsible AI practices. Rather than a technical discussion, we focused on practical steps that companies can take today to make AI safer. A deeper exploration is found in my forthcoming…

Mark Fortier, Fortier Public Relations (@bizbookpr) 's Twitter Profile Photo

Apply the strategies of billion-dollar companies to your own business & life. Ed Mylett interviews #Microsoft Chief #Innovation Officers JoAnn Garbin and Dean Carignan, coauthors of the new book The Insider's Guide to #InnovationatMicrosoft Post Hill Press podcasts.apple.com/us/podcast/the…

Mark Fortier, Fortier Public Relations (@bizbookpr) 's Twitter Profile Photo

How “Pasteur’s quadrant” illuminates the #invention-#innovation challenge. Big Think excerpts the new book The Insider's Guide to #InnovationatMicrosoft by #Microsoft Chief #Innovation Officers JoAnn Garbin and Dean Carignan from Post Hill Press bigthink.com/business/how-p…

Saleema Amershi (@saleemaamershi) 's Twitter Profile Photo

Check out our new tutorial on Magentic-UI by Maya Murad Learn about Magentic's human-in-the-loop features including: 🧑‍🤝‍🧑 Co-planning 🤝 Co-tasking 🛡️ Action Guards 🧠 Plan Learning 🔀 Parallel Task Execution 👇

Saibo-Creator (@saibogeng) 's Twitter Profile Photo

🚀 Excited to share our latest work at ICML 2025 — zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression! Sessions: 📅 Fri 18 Jul  - Tokenization Workshop 📅 Sat 19 Jul  - Workshop on Efficient Systems for Foundation Models (Oral 5/145)