Seraphina Goldfarb-Tarrant (@seraphinagt) Twitter Tweets • TwiCopy

Seraphina Goldfarb-Tarrant

@seraphinagt

Head of AI Safety @cohere. PhD from @EdinburghNLP @InfAtED.
If you don't recognise me it's cause I am invisible dl.acm.org/doi/10.1145/25…

ID: 85239409

Link: https://seraphinatarrant.github.io · Joined: 26-10-2009 04:41:41

267 Tweets

1.1K Followers

382 Following

Preethi Seshadri

@preethi__s_

10 months ago

📢 New paper from my internship at Cohere with Seraphina Goldfarb-Tarrant @ FAccT 🇬🇷 ‼️ Are you interested in investigating the fairness of LLMs in hiring contexts? Take a look at our work 🧵 arxiv.org/abs/2501.04316

Likes: 92 · Replies: 3 · Retweets: 17

Seraphina Goldfarb-Tarrant

@seraphinagt

10 months ago

In 🇫🇷 for the AI Action Summit! Say hi if you're here!

Likes: 16 · Replies: 0 · Retweets: 1

Seraphina Goldfarb-Tarrant

@seraphinagt

9 months ago

Radar charts flame 🔥 reloaded, Gemma3 report edition. Axes/radials are rescaled for each capability, so the distance between points varies, not just between charts, but *within* a chart 😱. It would be challenging to make a bar plot this misleading, if trying. 🧪 👮 are coming.

Likes: 10 · Replies: 0 · Retweets: 1

Tom Hosking

@tomhosking

9 months ago

GPT-4o level performance ✅ 75% faster ✅ 256k context length ✅ Open weight ✅ Fits on 2xH100s ✅ Add to basket

Likes: 83 · Replies: 1 · Retweets: 14

Seraphina Goldfarb-Tarrant

@seraphinagt

9 months ago

🔥 new model 🔥 Our approach to safety is pretty unique ✨. We have 3 focuses: controllability for context-dependent safety (risks differ across use cases), enterprise fairness (equal treatment in tasks on human data), and open weights safety (baseline safety for everything).

Likes: 54 · Replies: 0 · Retweets: 9

Seraphina Goldfarb-Tarrant

@seraphinagt

8 months ago

223 (!!!) people worked on this tech report, it was definitely the biggest coordination effort of a scientific artifact that I've ever done. Totally worth it.

Likes: 38 · Replies: 0 · Retweets: 1

Seraphina Goldfarb-Tarrant

@seraphinagt

8 months ago

Everything you need to know about how to explain scientific concepts like merging clearly and concisely with just a couple of colours is in Figures 2-4 in this report, courtesy of Jay Alammar

Likes: 18 · Replies: 0 · Retweets: 3

Hongyu (Charlie) Chen

@hongyucharlie

8 months ago

This work was integral to Command A's development process described in the tech report (cohere.com/research/paper…), and enabled higher quality auto evaluations and fast iterations. Grateful to have been part of this at Cohere and mentored by Seraphina Goldfarb-Tarrant @ ACL 🇦🇹!

Likes: 12 · Replies: 0 · Retweets: 1

Seraphina Goldfarb-Tarrant

@seraphinagt

8 months ago

Charlie did this work with the safety team at Cohere on the robustness of LLM evaluators and found them to be surprisingly vulnerable to basic perturbations. This was really important research that was a prereq for creating evals at Cohere that are robust to spurious correlations!

Likes: 12 · Replies: 0 · Retweets: 0

Hadas Orgad

@orgadhadas

8 months ago

🎉 Our Actionable Interpretability workshop has been accepted to #ICML2025! 🎉
>> Follow Actionable Interpretability Workshop ICML2025
Tal Haklay, Anja Reusch, Marius Mosbach, Sarah Wiegreffe, Ian Tenney (@iftenney@sigmoid.social), Mor Geva
Paper submission deadline: May 9th!

Likes: 127 · Replies: 1 · Retweets: 25

Seraphina Goldfarb-Tarrant

@seraphinagt

5 months ago

I will be in 🇬🇷 for ACM FAccT next Mon-Thurs, the first time I've been to one in 5 years! ☕️ chat with me about research, internships, collabs, etc -- I am largely there to talk to people :).

Likes: 16 · Replies: 1 · Retweets: 1