Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile
Jascha Sohl-Dickstein

@jaschasd

Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.

ID: 65876824

Website: https://sohldickstein.com · Joined: 15-08-2009 11:00:03

544 Tweets

21.21K Followers

675 Following

AK (@_akhaliq)'s Twitter Profile Photo

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

paper page: huggingface.co/papers/2311.07…

introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model…
Max Bileschi (@mlbileschi_pub)'s Twitter Profile Photo

2+2=5? “LLMs are not Robust to Adversarial Arithmetic”, a new paper from our team at Google DeepMind with bucket of kets, Laura Culp, Aaron Parisi, Gamaleldin Elsayed, Jascha Sohl-Dickstein, and Noah Fiedel. TL;DR: We ask an LLM to attack itself and find this works extremely well.
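The self-attack loop Max describes ("We ask an LLM to attack itself") can be sketched in a few lines. This is a hypothetical reconstruction, not the paper's actual harness; `query_model` and `attack_once` are illustrative stand-ins for whatever LLM API is available.

```python
def query_model(prompt: str) -> str:
    """Stand-in for an LLM API call; plug in a real client here."""
    raise NotImplementedError

def attack_once(a: int, b: int, wrong: int) -> bool:
    # Step 1: the model, acting as attacker, writes a prefix intended
    # to push a copy of itself toward the wrong sum.
    attack = query_model(
        f"Write a short, persuasive passage arguing that {a} + {b} = {wrong}."
    )
    # Step 2: the same model, acting as target, answers the attacked question.
    answer = query_model(
        f"{attack}\n\nQuestion: what is {a} + {b}? Answer with a number only."
    )
    return str(wrong) in answer  # success = the target agreed with the attacker

# attack_once(2, 2, 5) returning True means the model agreed that 2 + 2 = 5.
```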

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

I’ve been daydreaming about an AI+audio product that I think recently became possible: virtual noise canceling headphones. I hate loud background noise -- BART trains, airline cabins, road noise, ... 🙉. I would buy the heck out of this product, and would love it if it were built
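The tweet leaves the method open, but for a sense of what "virtual" noise cancellation means computationally, here is the classic non-learned baseline (spectral gating) that an AI model would presumably improve on. The function name and threshold are my own illustrative choices.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_gate(audio, sr, noise_clip, threshold_db=6.0):
    """Suppress stationary noise (BART rumble, cabin drone) by zeroing
    STFT bins that stay below a per-frequency noise-floor estimate."""
    # Estimate the noise floor from a clip containing only background noise.
    _, _, noise_spec = stft(noise_clip, fs=sr)
    noise_floor = np.abs(noise_spec).mean(axis=1, keepdims=True)

    # Keep only the bins that rise above the floor by threshold_db.
    _, _, spec = stft(audio, fs=sr)
    keep = np.abs(spec) > noise_floor * 10 ** (threshold_db / 20)
    _, cleaned = istft(spec * keep, fs=sr)
    return cleaned
```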

Tristan Hume (@trishume)'s Twitter Profile Photo

Here's Claude 3 Haiku running at >200 tokens/s (>2x as fast as prod)! We've been working on capacity optimizations but we can have fun testing those as speed optimizations via overly-costly low batch size. Come work with me at Anthropic on things like this, more info in thread 🧵
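Why does serving at an "overly-costly low batch size" turn a capacity optimization into a speed demo? A toy decode-step model makes the tradeoff concrete (the constants are illustrative, not Anthropic's numbers): if each step streams the weights once plus one KV cache per concurrent request, small batches are fast per request but waste chip throughput.

```python
WEIGHT_READ_MS = 4.0  # illustrative: ms per decode step to stream weights
KV_READ_MS = 0.5      # illustrative: ms per step per concurrent request

def step_ms(batch: int) -> float:
    return WEIGHT_READ_MS + KV_READ_MS * batch

def per_request_tok_s(batch: int) -> float:
    return 1000.0 / step_ms(batch)  # each request emits one token per step

def chip_tok_s(batch: int) -> float:
    return batch * per_request_tok_s(batch)

for b in (1, 8, 64):
    print(f"batch={b}: {per_request_tok_s(b):.0f} tok/s/request, "
          f"{chip_tok_s(b):.0f} tok/s/chip")
# Shrinking the batch buys per-request speed at the cost of per-chip
# capacity, so a capacity win can be demoed as raw speed.
```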

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

This was a fun project! If you could train an LLM over text arithmetically compressed using a smaller LLM as a probabilistic model of text, it would be really good. Text would be represented with far fewer tokens, and inference would be way faster and cheaper. The hard part is…
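For the compression scheme itself, here is a minimal sketch of LM-driven arithmetic coding: the model's next-token probabilities define nested subintervals of [0, 1), and any number in the final interval encodes the whole text. A toy fixed distribution stands in for the "smaller LLM", and exact Fractions sidestep the finite-precision bookkeeping a production coder needs; all names here are illustrative.

```python
from fractions import Fraction

def token_probs(_context):
    # A real system would query the small LM here; this toy model is
    # context-independent for brevity.
    return {"the": Fraction(1, 2), "cat": Fraction(1, 4),
            "sat": Fraction(1, 8), "<eos>": Fraction(1, 8)}

def encode(tokens):
    low, width = Fraction(0), Fraction(1)
    for i, tok in enumerate(tokens):
        cum = Fraction(0)
        for t, p in token_probs(tokens[:i]).items():
            if t == tok:
                # Narrow to the sub-interval assigned to this token.
                low, width = low + cum * width, width * p
                break
            cum += p
    return low, low + width  # any number in [low, high) decodes to tokens

def decode(x, n):
    tokens = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(n):
        cum = Fraction(0)
        for t, p in token_probs(tokens).items():
            if low + (cum + p) * width > x:  # x falls in this token's slice
                tokens.append(t)
                low, width = low + cum * width, width * p
                break
            cum += p
    return tokens

lo_, hi_ = encode(["the", "cat", "sat", "<eos>"])
assert decode(lo_, 4) == ["the", "cat", "sat", "<eos>"]
# High-probability tokens shrink the interval least, so likely text costs
# few bits: -log2(width) is 9 bits for this four-token toy sequence.
```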

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

This was one of the most research-enabling libraries I used at Google. If you want to try out LLM ideas with a simple, clean JAX codebase, this is for you.

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

This is great, hearing Yang's thought process and motivations for his score matching/diffusion research. (I had forgotten that I tried to convince him that score matching was too local to be useful for generative modeling :/)