Max Bartolo (@max_nlp) Twitter Tweets • TwiCopy

Max Bartolo

@max_nlp

I lead the Command modelling team at @Cohere and co-chair the @DynabenchAI @MLCommons working group. Prev @DeepMind, @MetaAI / FAIR & @BloomsburyAI.

ID: 794224315315224576

Website: http://maxbartolo.com

Joined: 03-11-2016 17:06:47

781 Tweets

2.2K Followers

759 Following

Sara Hooker

@sarahookr

8 months ago

Very proud to introduce Kaleidoscope ✨🌿 🌍 18 languages (Bengali → Spanish) 📚 14 subjects (Humanities → STEM) 📸 55% requiring image understanding! A very important open science collaboration — which extends in-language evaluation for vision models to many more languages.

131 likes · 3 replies · 31 reposts

Matthias Gallé

@mgalle

8 months ago

A year ago we released LBPP, a drop-in replacement for HumanEval that was more challenging and less leaked. Internally we have been using the multilingual version of this for benchmarking, and as code is not only Python we decided to release that as well: huggingface.co/datasets/Coher…

57 likes · 0 replies · 13 reposts

Arduin Findeis @ ICLR2025

@arduinfindeis

8 months ago

How exactly was the initial Chatbot Arena version of Llama 4 Maverick different from the public HuggingFace version?🕵️ I used our Feedback Forensics app to quantitatively analyse how exactly these two models differ. An overview…👇🧵

23 likes · 3 replies · 6 reposts

Max Bartolo

@max_nlp

8 months ago

Really enjoyed giving this talk. Thanks for hosting and for the great questions! Tom Hosking you might recognise this slide 😅

15 likes · 1 reply · 0 reposts

Eugene Choi

@221eugene

8 months ago

Attending #ICLR2025 and interested in #LLM, #Alignment, or #SelfImprovement? Then come by and check out our work from cohere: "Self-Improving Robust Preference Optimization" - a new alignment method that unlocks self-refinement in LLMs! 📍 Poster Session 4 — Friday, 3–5:30 PM

36 likes · 0 replies · 11 reposts

Max Bartolo

@max_nlp

8 months ago

If you want to learn more about how LLMs pick up reasoning abilities from procedural knowledge in pretraining, visit poster #208 in Hall 3 at 3pm today @iclr_conf #ICLR #ICLR25 #ICLR2025

33 likes · 0 replies · 5 reposts

Edward Grefenstette

@egrefen

8 months ago

At #ICLR2025? Come and see Laura Ruis present these amazing results on how LLMs exploit data in different ways to learn facts vs capabilities. Happening now at poster 208 in Hall 3! 🚀

125 likes · 0 replies · 12 reposts

Max Bartolo

@max_nlp

8 months ago

Recently overheard at @iclr_conf: influence functions for LLMs are useless. Poster #208 disagrees 🤔

52 likes · 1 reply · 3 reposts

Cohere Labs

@cohere_labs

8 months ago

Congrats to our Cohere colleagues for their paper “Improving Reward Models with Synthetic Critiques” being presented at NAACL this week! 🎉 Read the paper: arxiv.org/pdf/2405.20850 Work led by Daniella Ye, Fraser, Max Bartolo, Phil Blunsom, Jon Ander Campos and Matthias Gallé

21 likes · 1 reply · 2 reposts

Cohere Labs

@cohere_labs

7 months ago

Join us to mark the end of Expedition Aya, our six-week global open-build challenge designed to accelerate ML research progress in multilingual, multimodal and efficiency✨ Top teams will present their key findings and innovations and our judges will select 5 winning projects🏆

13 likes · 1 reply · 1 repost

Max Bartolo

@max_nlp

7 months ago

Massive congrats team Afri-Aya, really great work! 🤩

18 likes · 1 reply · 1 repost

xjdr

@_xjdr

7 months ago

the command-a paper is one of my top 5 papers of the year for sure cohere.com/research/paper…

316 likes · 5 replies · 18 reposts

Moritz Laurer

@moritzlaurer

6 months ago

Kudos to cohere for releasing 6 proper research papers in May alone, while publications of other western labs increasingly read like advertisements! I recently read the Command A technical report and it contains much more detail than other model reports. Looking at recent

150 likes · 2 replies · 13 reposts

Max Bartolo

@max_nlp

6 months ago

Looking forward to sharing some of our recent research contributions at Machine Learning Street Talk's first London AI meetup 🤩

19 likes · 0 replies · 3 reposts

Maximilian Mozes

@maximilianmozes

6 months ago

We’re looking for a Research Engineer / Scientist with a focus on Data Analysis and Evaluation to join the post-training team at Cohere! More details and application here: jobs.ashbyhq.com/cohere/6170371… Feel free to reach out if you'd like to know more!

113 likes · 3 replies · 20 reposts

Laura Ruis

@lauraruis

6 months ago

LLMs can be programmed by backprop 🔎 In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.

314 likes · 4 replies · 50 reposts

Max Bartolo

@max_nlp

5 months ago

Really enjoyed discussing the state of AI benchmarking alongside Prof Mark Bishop, Timothy Nguyen, Enzo Blindow & Tim Scarfe at Machine Learning Street Talk's first in-person event in London yesterday. Looking forward to many more!

16 likes · 0 replies · 3 reposts

Tokenization Workshop (TokShop) @ICML2025

@tokshop2025

5 months ago

🎤 Meet our expert panelists! Join Albert Gu, Alisa Liu, Kris Cao, Sander Land, and Yuval Pinter as they discuss the Future of Tokenization on July 18 at 3:30 PM at TokShop at #ICML2025.

19 likes · 0 replies · 7 reposts