Nora Kassner (@kassnernora) 's Twitter Profile
Nora Kassner

@kassnernora

Research Scientist in NLP

ID: 953683086113492993

Link: http://norakassner.github.io

Joined: 17-01-2018 17:39:01

95 Tweets

1.1K Followers

553 Following

AI for Global Goals (@globalgoalsai) 's Twitter Profile Photo

📢❗ After the resounding success of #OxML summer schools, we proudly present a pioneering new course on Generative AI at The London School of Economics and Political Science (LSE), this month!🎓💡 This intensive program will equip participants with cutting-edge knowledge

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Our paper "Do Large Language Models Latently Perform Multi-Hop Reasoning?" will be presented at #ACL2024 today. 📍 Mon 14:00-15:30 Poster Session 2 (Conv. Center A1) Please visit our poster if you are interested, and catch me to chat about the latent reasoning ability of LLMs!

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

🚨 New Paper 🚨 Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts not seen together in training or guessing the answer, but success greatly depends on the type of the bridge entity (80%+ for

Sebastian Riedel (@riedelcastro@sigmoid.social) (@riedelcastro) 's Twitter Profile Photo

Frontier models can do this stuff, but also not! Opinions differ on how much we even want this (CC Geoffrey Irving), but understanding the patterns will be critical regardless. Been a pleasure to work with Latent Reasoning Dream Team Sohee Yang Mor Geva Nora Kassner!

Aida Nematzadeh 🦋 (@aidanematzadeh) 's Twitter Profile Photo

I am hiring for RS/RE positions! If you are interested in language-flavored multimodal learning, evaluation, or post-training apply here 🦎 boards.greenhouse.io/deepmind/jobs/… I will also be #NeurIPS2024 so come say hi! (Please email me to find time to chat)

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

Welcome to the world, Gemini 2.0 ✨ our most capable AI model yet. We're first releasing an experimental version of 2.0 Flash ⚡ It has better performance, new multimodal output, Google tool use - and paves the way for new agentic experiences. 🧵 goo.gle/gemini-2

Shrestha Basu Mallick (@shresbm) 's Twitter Profile Photo

The Gemini 2.0 era begins with 2.0 Flash Experimental release ⚡️ 📈 2.0 Flash beats 1.5 Pro across factuality, reasoning, coding, math. 📳 More modalities - image and audio out (in EAP) 🔧 Native tool use for Google Search, code execution and 3P functions 🆕 a new multimodal,

Alexandra Chronopoulou (@alexandraxron) 's Twitter Profile Photo

We are organizing Repl4NLP 2025 along with Freda Shi Giorgos Vernikos Vaibhav Adlakha Xiang Lorraine Li Bodhisattwa Majumder. The workshop will be co-located with NAACL 2025 in Albuquerque, New Mexico and we plan to have a great panel of speakers. Consider submitting your coolest work!

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Excited to share that the code and datasets for our papers on latent multi-hop reasoning are finally available on GitHub: github.com/google-deepmin… We hope these resources support further research in this area. Thanks for your patience as we worked through the release process!

Sian Gooding (@siangooding) 's Twitter Profile Photo

Google DeepMind Edward Grefenstette 🥳 We have had a lot of interest in the role and are now asking potential candidates to fill out this form. If we are going forward with a referral, you will hear from us! forms.gle/8Y4oEvdGLZmmo1… Thanks again!

Sundar Pichai (@sundarpichai) 's Twitter Profile Photo

Our latest Gemini 2.5 Pro update is now in preview. It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads lmarena.ai with a 24pt Elo score jump since the previous version. We also

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts? "We show that models are effective at identifying most unhelpful thoughts but struggle to recover from the same thoughts when these are injected into their thinking process, causing significant

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

🚨 New Paper 🧵
How effectively do reasoning models reevaluate their thought? We find that:
- Models excel at identifying unhelpful thoughts but struggle to recover from them
- Smaller models can be more robust
- Self-reevaluation ability is far from true meta-cognitive awareness
Partha Talukdar (@partha_p_t) 's Twitter Profile Photo

Google DeepMind India 🇮🇳 & Japan 🇯🇵 are looking for strong candidates in multilinguality, multicultural, & multimodality areas. RS Bangalore: job-boards.greenhouse.io/deepmind/jobs/… RS Tokyo: job-boards.greenhouse.io/deepmind/jobs/… RE Tokyo: job-boards.greenhouse.io/deepmind/jobs/…

Mor Geva (@megamor2) 's Twitter Profile Photo

📍 2025-07-28 18:00 - 19:30 Hall 4/5 (and GEM workshop) Sohee Yang will present the results of our investigation at Google DeepMind on whether LLMs can perform latent multi-hop reasoning without exploiting shortcuts x.com/soheeyang_/sta… Nora Kassner Elena Gribovskaya Sebastian Riedel (@riedelcastro@sigmoid.social)

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Our paper "Do Large Language Models Perform Latent Multi-Hop Reasoning without exploiting shortcuts?" will be presented at #ACL2025 today. 📍 Mon 18:00-19:30 Findings Posters (Hall X4 X5) Please visit our poster if you are interested! ✨

Yanai Elazar (@yanaiela) 's Twitter Profile Photo

Organizing a workshop? Checkout our compiled material for organizing one: bigpictureworkshop.com/open-workshop (and hopefully we'll be back for another iteration of the Big Picture next year Allyson Ettinger, Nora Kassner, Sebastian Ruder @ ACL)