Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile
Jascha Sohl-Dickstein

@jaschasd

Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.

ID: 65876824

Website: https://sohldickstein.com · Joined: 15-08-2009 11:00:03

544 Tweets

21.21K Followers

675 Following

AK (@_akhaliq)'s Twitter Profile Photo

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

paper page: huggingface.co/papers/2311.07…

introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model…
Max Bileschi (@mlbileschi_pub)'s Twitter Profile Photo

2+2=5? “LLMs are not Robust to Adversarial Arithmetic”, a new paper from our team at Google DeepMind with bucket of kets, Laura Culp, Aaron Parisi, Gamaleldin Elsayed, Jascha Sohl-Dickstein, and Noah Fiedel. TL;DR: We ask an LLM to attack itself and find this works extremely well.
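The self-attack loop Max describes ("We ask an LLM to attack itself") can be sketched in a few lines. This is a hypothetical reconstruction, not the paper's actual harness; `query_model` and `attack_once` are illustrative stand-ins for whatever LLM API is available.

```python
def query_model(prompt: str) -> str:
    """Stand-in for an LLM API call; plug in a real client here."""
    raise NotImplementedError

def attack_once(a: int, b: int, wrong: int) -> bool:
    # Step 1: the model, acting as attacker, writes a prefix intended
    # to push a copy of itself toward the wrong sum.
    attack = query_model(
        f"Write a short, persuasive passage arguing that {a} + {b} = {wrong}."
    )
    # Step 2: the same model, acting as target, answers the attacked question.
    answer = query_model(
        f"{attack}\n\nQuestion: what is {a} + {b}? Answer with a number only."
    )
    return str(wrong) in answer  # success = the target agreed with the attacker

# attack_once(2, 2, 5) returning True means the model agreed that 2 + 2 = 5.
```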

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

I’ve been daydreaming about an AI+audio product that I think recently became possible: virtual noise canceling headphones. I hate loud background noise -- BART trains, airline cabins, road noise, ... 🙉. I would buy the heck out of this product, and would love it if it were built
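The tweet leaves the method open, but for a sense of what "virtual" noise cancellation means computationally, here is the classic non-learned baseline (spectral gating) that an AI model would presumably improve on. The function name and threshold are my own illustrative choices.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_gate(audio, sr, noise_clip, threshold_db=6.0):
    """Suppress stationary noise (BART rumble, cabin drone) by zeroing
    STFT bins that stay below a per-frequency noise-floor estimate."""
    # Estimate the noise floor from a clip containing only background noise.
    _, _, noise_spec = stft(noise_clip, fs=sr)
    noise_floor = np.abs(noise_spec).mean(axis=1, keepdims=True)

    # Keep only the bins that rise above the floor by threshold_db.
    _, _, spec = stft(audio, fs=sr)
    keep = np.abs(spec) > noise_floor * 10 ** (threshold_db / 20)
    _, cleaned = istft(spec * keep, fs=sr)
    return cleaned
```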

Tristan Hume (@trishume)'s Twitter Profile Photo

Here's Claude 3 Haiku running at >200 tokens/s (>2x as fast as prod)! We've been working on capacity optimizations but we can have fun testing those as speed optimizations via overly-costly low batch size. Come work with me at Anthropic on things like this, more info in thread 🧵
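Why does serving at an "overly-costly low batch size" turn a capacity optimization into a speed demo? A toy decode-step model makes the tradeoff concrete (the constants are illustrative, not Anthropic's numbers): if each step streams the weights once plus one KV cache per concurrent request, small batches are fast per request but waste chip throughput.

```python
WEIGHT_READ_MS = 4.0  # illustrative: ms per decode step to stream weights
KV_READ_MS = 0.5      # illustrative: ms per step per concurrent request

def step_ms(batch: int) -> float:
    return WEIGHT_READ_MS + KV_READ_MS * batch

def per_request_tok_s(batch: int) -> float:
    return 1000.0 / step_ms(batch)  # each request emits one token per step

def chip_tok_s(batch: int) -> float:
    return batch * per_request_tok_s(batch)

for b in (1, 8, 64):
    print(f"batch={b}: {per_request_tok_s(b):.0f} tok/s/request, "
          f"{chip_tok_s(b):.0f} tok/s/chip")
# Shrinking the batch buys per-request speed at the cost of per-chip
# capacity, so a capacity win can be demoed as raw speed.
```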

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

This was a fun project! If you could train an LLM over text arithmetically compressed using a smaller LLM as a probabilistic model of text, it would be really good. Text would be represented with far fewer tokens, and inference would be way faster and cheaper. The hard part is…
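For the compression scheme itself, here is a minimal sketch of LM-driven arithmetic coding: the model's next-token probabilities define nested subintervals of [0, 1), and any number in the final interval encodes the whole text. A toy fixed distribution stands in for the "smaller LLM", and exact Fractions sidestep the finite-precision bookkeeping a production coder needs; all names here are illustrative.

```python
from fractions import Fraction

def token_probs(_context):
    # A real system would query the small LM here; this toy model is
    # context-independent for brevity.
    return {"the": Fraction(1, 2), "cat": Fraction(1, 4),
            "sat": Fraction(1, 8), "<eos>": Fraction(1, 8)}

def encode(tokens):
    low, width = Fraction(0), Fraction(1)
    for i, tok in enumerate(tokens):
        cum = Fraction(0)
        for t, p in token_probs(tokens[:i]).items():
            if t == tok:
                # Narrow to the sub-interval assigned to this token.
                low, width = low + cum * width, width * p
                break
            cum += p
    return low, low + width  # any number in [low, high) decodes to tokens

def decode(x, n):
    tokens = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(n):
        cum = Fraction(0)
        for t, p in token_probs(tokens).items():
            if low + (cum + p) * width > x:  # x falls in this token's slice
                tokens.append(t)
                low, width = low + cum * width, width * p
                break
            cum += p
    return tokens

lo_, hi_ = encode(["the", "cat", "sat", "<eos>"])
assert decode(lo_, 4) == ["the", "cat", "sat", "<eos>"]
# High-probability tokens shrink the interval least, so likely text costs
# few bits: -log2(width) is 9 bits for this four-token toy sequence.
```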

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

This was one of the most research-enabling libraries I used at Google. If you want to try out LLM ideas with a simple, clean JAX codebase, this is for you.

Jascha Sohl-Dickstein (@jaschasd)'s Twitter Profile Photo

This is great, hearing Yang's thought process and motivations for his score matching/diffusion research. (I had forgotten that I tried to convince him that score matching was too local to be useful for generative modeling :/)