JB Alayrac (@jalayrac)'s Twitter Profile
JB Alayrac

@jalayrac

🦩, ♊ - Research Scientist at Google DeepMind

ID: 807487617478643712

Link: https://www.jbalayrac.com/ Joined: 10-12-2016 07:30:25

279 Tweets

1.1K Followers

247 Following

Demis Hassabis (@demishassabis)

The Gemini team cooked hard with Gemini 2.5 Pro, it's an awesome model that continues to lead lmarena.ai - huge congrats to the team! Try it for yourself in the Google Gemini App now. Can't wait for you all to see what else we've been cooking 👀

Vlad Feinberg (@feinbergvlad)

Recently had the pleasure of lecturing back at Princeton in a grad seminar. I took the opportunity to cover how scaling laws have evolved since their inception, leaning heavily on great external content from my colleagues Sebastian Borgeaud, JB Alayrac, and Jacob Austin. Content in thread.

Ani Baddepudi (@anibaddepudi)

although the vision leaderboard doesn't capture every vision use case, 60+ elo points reflects the significant step in core vision capabilities like transcription, spatial understanding, reading charts/diagrams & many more. Still a lot more to do, but 2.5 Pro is the best vision

Demis Hassabis (@demishassabis)

Gemini 2.5 Pro is incredible at video understanding, try posting a YouTube link into AI studio ai.dev and asking it questions about the video. You will be amazed!

JB Alayrac (@jalayrac)

A lot of work went into making Gemini 2.5 SOTA at video understanding, check out this 🧵 for more details! Looking back at where we were a year ago, the progress really feels phenomenal! So many things to unlock and enable from video 🎥 and we are only getting started!

Tobias Weyand (@0xtob)

Gemini 2.5 Pro sets the state of the art on our newly released Minerva video reasoning benchmark by scoring 63.5%. 📜 Paper: arxiv.org/abs/2505.00681… 📊 Dataset: github.com/google-deepmin…

Ani Baddepudi (@anibaddepudi)

The Gemini 2.5 models are magical for analyzing sports video.

We asked Gemini to find Draymond's defensive plays from a highlights reel, which requires the model to:
- reason “over pixels” to identify defensive plays
- identify players in the video using its world knowledge
-
Fei Xia (@xf1280)

Excited that our work on Gemini Robotics and Gemini spatial understanding has just been featured on #GoogleIO stage! I believe that a frontier model possessing strong real-world understanding capabilities represents the ultimate path to embodied AGI, and we are making rapid

Visual Geometry Group (VGG) (@oxford_vgg)

Many Congratulations to Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht and David Novotny for winning the Best Paper Award @CVPR for "VGGT: Visual Geometry Grounded Transformer" 🥇🎉 🙌🙌 #CVPR2025!!!!!!

Antoine Yang (@antoineyang2)

The newly generally available Gemini 2.5 Flash and Pro are even better at video understanding than the versions we shared in the blog a month ago, see more details in the tech report 😀

Ani Baddepudi (@anibaddepudi)

You can now sample at higher frame rates (default 1 FPS), and specify start and end times for videos in the Gemini API! We’ve been blown away by all the ways developers are using Gemini to process videos, and see a ton of devs manually clipping and slowing down videos to use

Logan Kilpatrick (@officiallogank)

We just shipped video FPS support in the Gemini API, so you can dynamically customize how many frames per second you want the model to see, unlocking lots of interesting new video use cases! 📹

Ani Baddepudi (@anibaddepudi)

gemini's still the only frontier model that supports native video input (and is amazing at it!) incredible amount of real-world utility given how much of the world's information is increasingly in video
