Fangyu Liu (@hardy_qr)'s Twitter Profile
Fangyu Liu

@hardy_qr

Research Scientist @GoogleDeepMind working on Gemini♊ pretraining. PhD @CambridgeLTL. BMath @UWaterloo. From 成都🐼.
Opinions my own.

ID: 4923880075

Link: http://fangyuliu.me/about
Joined: 18-02-2016 04:12:15

229 Tweets

1.1K Followers

1.1K Following

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

Massive News from Chatbot Arena🔥 Google DeepMind's latest Gemini (Exp 1114), tested with 6K+ community votes over the past week, now ranks joint #1 overall with an impressive 40+ score leap — matching 4o-latest in and surpassing o1-preview! It also claims #1 on Vision

Kaushik Shivakumar (@19kaushiks)

Super excited for native image out to be released. Had the opportunity to work with a brilliant team to take this from idea to product over the past year. First going to early access partners, then more widely in early 2025. We'll be sharing some cool demos throughout the day

Fangyu Liu (@hardy_qr)

It's cool to see capabilities compounding. Progress on one front eventually accelerates progress on other fronts: ultra long-context, MM-in/out, reasoning/planning, agency, ... And it's all just one model!

Robert Riachi (@robertriachi)

A simple yet powerful example of the new Gemini 2.0 Flash's native multimodal input + output. Precise conversational editing & reasoning! Next step, Chess!

Jeff Dean (@jeffdean)

Introducing Gemini 2.0 Flash Thinking, an experimental model that explicitly shows its thoughts. Built on 2.0 Flash’s speed and performance, this model is trained to use thoughts to strengthen its reasoning. And we see promising results when we increase inference time

Jack Rae (@jack_w_rae)

Appreciate Aidan McLaughlin looking into the thinking model results. Originally scores looked weak as the response was plucked from the thought content versus output. We are looking into ways of making thinking output less confusing for people running evals. This is why we 🚢, to

Fangyu Liu (@hardy_qr)

Happy to see people like our hyperfitting paper. We are presenting it at ICLR 2025 in Singapore later this year 🇸🇬

Machine Learning Street Talk (@mlstreettalk)

Coding using Cursor 0.45 with the Google DeepMind (new) gemini-2.0-flash-thinking-exp model seems like the biggest step up in genai coding since Claude Sonnet 3.5 came out last June. This is unreal... forget about R1 folks - check out this new Gemini model! 🤯

Mostafa Dehghani (@m__dehghani)

Anyone who has been in this room knows that it’s never just another day in here! This space has seen the extremes of chaos and genius! ...and we ship! developers.googleblog.com/en/experiment-… Happy Wednesday everyone!

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆 Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer
