Seraphina Goldfarb-Tarrant (@seraphinagt)'s Twitter Profile
Seraphina Goldfarb-Tarrant

@seraphinagt

Head of AI Safety @cohere. PhD from @EdinburghNLP @InfAtED.
If you don't recognise me it's cause I am invisible dl.acm.org/doi/10.1145/25…

ID: 85239409

Link: https://seraphinatarrant.github.io
Joined: 26-10-2009 04:41:41

267 Tweets

1.1K Followers

382 Following

Preethi Seshadri (@preethi__s_)

📢 New paper from my internship at cohere with Seraphina Goldfarb-Tarrant @ FAccT 🇬🇷 ‼️ Are you interested in investigating the fairness of LLMs in hiring contexts? Take a look at our work 🧵 arxiv.org/abs/2501.04316

Seraphina Goldfarb-Tarrant (@seraphinagt)

Radar charts flame 🔥 reloaded, Gemma3 report edition. Axes/radials are rescaled for each capability, so the distance between points varies, not just between charts, but *within* a chart 😱. It would be challenging to make a bar plot this misleading, if trying. 🧪 👮 are coming.

Tom Hosking (@tomhosking)

GPT-4o level performance ✅ 75% faster ✅ 256k context length ✅ Open weight ✅ Fits on 2xH100s ✅ Add to basket

Seraphina Goldfarb-Tarrant (@seraphinagt)

🔥 new model 🔥 Our approach to safety is pretty unique ✨. We have 3 focuses: controllability for context dependent safety (risks differ across use cases), enterprise fairness (equal treatment in tasks on human data), and open weights safety (baseline safety for everything).

Seraphina Goldfarb-Tarrant (@seraphinagt)

223 (!!!) people worked on this tech report, it was definitely the biggest coordination effort of a scientific artifact that I've ever done. Totally worth it.

Seraphina Goldfarb-Tarrant (@seraphinagt)

Everything you need to know about how to explain scientific concepts like merging clearly and concisely with just a couple of colours is in Figures 2-4 in this report, courtesy of Jay Alammar

Hongyu (Charlie) Chen (@hongyucharlie)

This work was integral to Command A's development process described in the tech report (cohere.com/research/paper…), and enabled higher quality auto evaluations and fast iterations. Grateful to have been part of this cohere and mentored by Seraphina Goldfarb-Tarrant @ ACL 🇦🇹!

Seraphina Goldfarb-Tarrant (@seraphinagt)

Charlie did this work with the safety team cohere on the robustness of LLM evaluators and found them to be surprisingly vulnerable to basic perturbations. This was really important research that was a prereq for creating evals at Cohere that are robust to spurious correlations!

Seraphina Goldfarb-Tarrant (@seraphinagt)

I will be in 🇬🇷 for ACM FAccT next Mon-Thurs, the first time I've been to one in 5 years! ☕️ chat with me about research, internships, collabs, etc -- I am largely there to talk to people :).