
Fangyu Liu
@hardy_qr
Research Scientist @GoogleDeepMind working on Gemini♊ pretraining. PhD @CambridgeLTL. BMath @UWaterloo. From 成都🐼.
Opinions my own.
ID: 4923880075
http://fangyuliu.me/about 18-02-2016 04:12:15
229 Tweets
1.1K Followers
1.1K Following

Massive News from Chatbot Arena🔥 Google DeepMind's latest Gemini (Exp 1114), tested with 6K+ community votes over the past week, now ranks joint #1 overall with an impressive 40+ score leap — matching 4o-latest in and surpassing o1-preview! It also claims #1 on Vision










Appreciate Aidan McLaughlin looking into the thinking model results. Originally scores looked weak as the response was plucked from the thought content versus output. We are looking into ways of making thinking output less confusing for people running evals. This is why we 🚢, to




