Jack Merullo (@jack_merullo_) 's Twitter Profile
Jack Merullo

@jack_merullo_

Interpretability @GoodfireAI. PhD student @BrownUniversity. Previously @allen_ai and @GoogleAI

ID: 1547342006128586752

Website: https://jmerullo.github.io/ | Joined: 13-07-2022 22:08:02

118 Tweets

673 Followers

294 Following

Goodfire (@goodfireai) 's Twitter Profile Photo

We are excited to announce our collaboration with Arc Institute on their state-of-the-art biological foundation model, Evo 2. Our work reveals how models like Evo 2 process biological information - from DNA to proteins - in ways we can now decode.

William Merrill (@lambdaviking) 's Twitter Profile Photo

How does the depth of a transformer affect reasoning capabilities? New preprint by myself and Ashish Sabharwal shows that a little depth goes a long way to increase transformers’ expressive power. We take this as encouraging for further research on looped transformers! 🧵

Apoorv Khandelwal (@apoorvkh) 's Twitter Profile Photo

We made a library (torchrunx) to make multi-GPU / multi-node PyTorch easier, more robust, and more modular! 🧵 github.com/apoorvkh/torch… Docs: torchrun.xyz `(uv) pip install torchrunx` today! (w/ the very talented Peter Curtin, Brown CS '25)
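
As a rough, hypothetical sketch of what launching a distributed training function through torchrunx might look like: the entry point and parameter names below (`torchrunx.launch`, `hostnames`, `workers_per_host`) are assumptions rather than confirmed API, so check the actual docs at torchrun.xyz before using.

```python
# Hypothetical sketch only -- the torchrunx calls shown here are assumed,
# not copied from the official docs (torchrun.xyz).
import os

import torch
import torch.distributed as dist
import torchrunx  # (uv) pip install torchrunx


def train():
    # Each worker process runs this function; torchrun-style env vars
    # (RANK, LOCAL_RANK, MASTER_ADDR, ...) are assumed to be set by the launcher.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    device = torch.device(f"cuda:{os.environ.get('LOCAL_RANK', rank)}")
    model = torch.nn.Linear(10, 10).to(device)
    model = torch.nn.parallel.DistributedDataParallel(model)
    # ... training loop would go here ...
    dist.destroy_process_group()
    return rank


if __name__ == "__main__":
    # Assumed entry point: launch `train` across hosts/GPUs from plain Python
    # instead of shelling out to `torchrun`.
    torchrunx.launch(
        func=train,
        hostnames=["localhost"],  # assumed parameter name
        workers_per_host=2,       # assumed parameter name (e.g., GPUs per node)
    )
```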

Thariq (@trq212) 's Twitter Profile Photo

✨ New AI Interfaces powered by Interpretability: I'm excited to share LatentLit, the result of my applied AI research fellowship with Goodfire. Mechanistic interpretability isn’t just important for AI safety, it also gives us new ways to steer and interact with LLMs.

Tom McGrath (@banburismus_) 's Twitter Profile Photo

I’m a bit confused by this - perhaps due to differences of opinion in what ‘fundamental SAE research’ is and what interpretability is for. This is why I prefer to talk about interpreter models rather than SAEs - we’re attached to the end goal, not the details of methodology. The

David Bau (@davidbau) 's Twitter Profile Photo

GDM's AGI safety document is great and worth a read. But their focus on *robustness* of technology neglects *resilience* of the larger ecosystem. To build resilience, we need to empower people, and build a third way between open and closed models. resilience.baulab.info/docs/NDIF-resi…

Jack Merullo (@jack_merullo_) 's Twitter Profile Photo

I joined Goodfire a little over a month ago to do interpretability! I am really excited to extend my work beyond just LMs. I think interp has a lot to offer to e.g., scientific models. Understanding them might actually teach us something new about the world 🌎

Goodfire (@goodfireai) 's Twitter Profile Photo

What goes on inside the mind of a reasoning model? Today we're releasing the first open-source sparse autoencoders (SAEs) trained on DeepSeek's 671B parameter reasoning model, R1—giving us new tools to understand and steer model thinking. Why does this matter?

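For context on what an SAE is: a sparse autoencoder reconstructs a model's internal activations through a wide, sparsity-penalized bottleneck, so that individual latents tend to align with interpretable features. Below is a minimal, generic PyTorch sketch with made-up dimensions; it is not the architecture or training recipe of the released R1 SAEs.

```python
# Generic sparse autoencoder (SAE) sketch -- illustrative only, with
# assumed/toy dimensions; not the released R1 SAEs.
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        # Overcomplete dictionary: d_latent >> d_model.
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)

    def forward(self, activations: torch.Tensor):
        # Sparse, non-negative feature activations.
        latents = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(latents)
        return reconstruction, latents


def sae_loss(reconstruction, activations, latents, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparsity.
    mse = torch.mean((reconstruction - activations) ** 2)
    sparsity = l1_coeff * latents.abs().mean()
    return mse + sparsity


if __name__ == "__main__":
    # Stand-in for residual-stream activations collected from an LLM.
    d_model, d_latent = 512, 8192  # toy sizes, assumed
    acts = torch.randn(64, d_model)

    sae = SparseAutoencoder(d_model, d_latent)
    recon, latents = sae(acts)
    loss = sae_loss(recon, acts, latents)
    loss.backward()
    print(loss.item())
```
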
Suraj Anand ICLR 2025 (@surajk610) 's Twitter Profile Photo

Excited to be at #ICLR2025 in a few days to present this work with Michael Lepori! Interested in chatting about training dynamics, mechinterp, memory-efficient training, info theory or anything else! Please dm me.

Benjamin Spiegel (@superspeeg) 's Twitter Profile Photo

Why did only humans invent graphical systems like writing? 🧠✍️ In our new paper at CogSci Society, we explore how agents learn to communicate using a model of pictographic signification similar to human proto-writing. 🧵👇

Aaron Mueller (@amuuueller) 's Twitter Profile Photo

Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work? We propose 😎 𝗠𝗜𝗕: a Mechanistic Interpretability Benchmark!

Goodfire (@goodfireai) 's Twitter Profile Photo

We created a canvas that plugs into an image model’s brain. You can use it to generate images in real-time by painting with the latent concepts the model has learned. Try out Paint with Ember for yourself 👇
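
As a rough illustration of "painting with latent concepts": one common interpretability-based steering recipe is to add a learned concept direction to a model's internal activations, optionally only inside a user-painted spatial mask. The sketch below is a generic, hypothetical version of that idea (all tensor shapes, names, and scales are made up); it is not the actual Paint with Ember implementation.

```python
# Hypothetical sketch of concept-based steering for an image model.
# All shapes, names, and scales are made up; not the Ember implementation.
import torch


def paint_concept(activations: torch.Tensor,
                  concept_direction: torch.Tensor,
                  mask: torch.Tensor,
                  strength: float = 4.0) -> torch.Tensor:
    """Add a concept direction to spatial activations inside a painted mask.

    activations:       (channels, height, width) feature map from the image model
    concept_direction: (channels,) vector for a learned concept
    mask:              (height, width) values in [0, 1] from the user's brush
    """
    direction = concept_direction / concept_direction.norm()
    # Broadcast the direction over the painted region only.
    return activations + strength * mask.unsqueeze(0) * direction.view(-1, 1, 1)


if __name__ == "__main__":
    c, h, w = 64, 32, 32          # toy feature-map size
    acts = torch.randn(c, h, w)   # stand-in for an image model's activations
    concept = torch.randn(c)      # stand-in for a learned concept direction
    brush = torch.zeros(h, w)
    brush[8:24, 8:24] = 1.0       # "paint" a square region
    out = paint_concept(acts, concept, brush)
    print(out.shape)
```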