iislucas (Lucas Dixon) (@iislucas) 's Twitter Profile
iislucas (Lucas Dixon)

@iislucas

machines learn, graphs reason, identity is a non-identity, incompetence over conspiracy, evil by association is evil, expression is never free, stay curious

ID: 101337016

Link: http://pair.withgoogle.com

Joined: 02-01-2010 23:08:40

245 Tweets

376 Followers

205 Following

Ian Tenney (@iftenney@sigmoid.social) (@iftenney) 's Twitter Profile Photo

🧵(1/6): Excited to announce the v1.0 release of the Google AI Learning Interpretability Tool (🔥LIT), an interactive platform to debug, validate, and understand ML model behavior. This release brings exciting new features and a simplified Python API. pair-code.github.io/lit

Peter Hase (@peterbhase) 's Twitter Profile Photo

Happy to share that this paper was accepted with a Spotlight at #NeurIPS2023! We updated the arXiv with results showing the disconnect between knowledge localization and editing success across different neuron ablations, editing methods, editing metrics, models, and datasets.⬇️

Jeff Dean (@jeffdean) 's Twitter Profile Photo

I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,

Dan Friedman (@danfriedman0) 's Twitter Profile Photo

We often interpret neural nets by studying simplified representations (e.g. low-dim visualization). But how faithful are these simplifications to the original model? In our new preprint, we found some surprising "interpretability illusions"... 1/6

Asma Ghandeharioun (@ghandeharioun) 's Twitter Profile Photo

🧵Can we “ask” an LLM to “translate” its own hidden representations into natural language? We propose 🩺Patchscopes, a new framework for decoding specific information from a representation by “patching” it into a separate inference pass, independently of its original context. 1/9

Geoffrey Cideron (@cdrgeo) 's Twitter Profile Photo

Happy to introduce our paper MusicRL, the first music generation system finetuned with human preferences. Paper link: arxiv.org/abs/2402.04229

Ian Tenney (@iftenney@sigmoid.social) (@iftenney) 's Twitter Profile Photo

Super excited for the Gemma model release, and with it a new debugging tool we built on 🔥LIT - use gradient-based salience to debug and refine complex LLM prompts! ai.google.dev/responsible/mo…

Adam Roberts (@ada_rob) 's Twitter Profile Photo

I love music most when it’s live, in the moment, and expressing something personal. This is why I’m psyched about the new “DJ mode” we developed for MusicFX: aitestkitchen.withgoogle.com/tools/music-fx… It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,

Google AI (@googleai) 's Twitter Profile Photo

Being able to interpret an #ML model’s hidden representations is key to understanding its behavior. Today we introduce Patchscopes, an approach that trains #LLMs to provide natural language explanations of their own hidden representations. Learn more → goo.gle/4aS5epd

Armand Joulin (@armandjoulin) 's Twitter Profile Photo

Gemma 2 27B is now the best open model while being 2.5x smaller than alternatives! This validates the work done by the team and Gemini. This is just the beginning 💙♊️

Google AI (@googleai) 's Twitter Profile Photo

Can large language models (LLMs) explain their internal mechanisms? Check out the latest AI Explorable on Patchscopes, an inspection framework that uses LLMs to explain the hidden representations of LLMs. Learn more → goo.gle/patchscopes

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

We’re welcoming a new 2 billion parameter model to the Gemma 2 family. 🛠️ It offers best-in-class performance for its size and can run efficiently on a wide range of hardware. Developers can get started with 2B today → dpmd.ai/4d0MKEH

Asma Ghandeharioun (@ghandeharioun) 's Twitter Profile Photo

🧵Responses to adversarial queries can still remain latent in a safety-tuned model. Why are they revealed sometimes, but not others? And what are the mechanics of this latent misalignment? Does it matter *who* the user is? (1/n)

Adam Roberts (@ada_rob) 's Twitter Profile Photo

I’m so proud of the updated version of #MusicFXDJ we developed in collaboration with Jacob Collier, available today at labs.google/musicfx. Over the past year I’ve spent countless hours experimenting with our real-time music models, and it feels like I’ve learned to play a

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

🚨 New Paper 🚨 Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts not seen together in training or guessing the answer, but success greatly depends on the type of the bridge entity (80%+ for

Jeff Dean (@jeffdean) 's Twitter Profile Photo

What a way to celebrate one year of incredible Gemini progress -- #1🥇across the board on overall ranking, as well as on hard prompts, coding, math, instruction following, and more, including with style control on. Thanks to the hard work of everyone in the Gemini team and

Tyler Chang (@tylerachang) 's Twitter Profile Photo

We scaled training data attribution (TDA) methods ~1000x to find influential pretraining examples for thousands of queries in an 8B-parameter LLM over the entire 160B-token C4 corpus! medium.com/people-ai-rese…

Arthur Conmy (@arthurconmy) 's Twitter Profile Photo

We are hiring Applied Interpretability researchers on the GDM Mech Interp Team!🧵 If interpretability is ever going to be useful, we need it to be applied at the frontier. Come work with Neel Nanda, the Google DeepMind AGI Safety team, and me: apply by 28th February as a

Alexander Chen (@alexanderchen) 's Twitter Profile Photo

Veo holograms 🦝⚡️ Visualizing animal superpowers! Just discovered Veo 3's amazing ability to render 3d holograms. Virtual interfaces within the simulated world. 🔊 Prompts in 🧵