Giannis Daras (@giannis_daras)'s Twitter Profile
Giannis Daras

@giannis_daras

MIT CSAIL Postdoc 👨‍🎓 Ph.D. Computer Science @UTAustin 👨‍💻
Ex: @nvidia, @google, @explosion_ai, @ntua

ID: 1047227002388959232

Link: http://giannisdaras.github.io
Joined: 02-10-2018 20:49:08

1.1K Tweets

4.4K Followers

472 Following

Biology+AI Daily (@biologyaidaily)

Ambient Proteins: Training Diffusion Models on Low Quality Structures 1. A new framework, Ambient Protein Diffusion, revolutionizes protein structure generation by leveraging low-confidence AlphaFold structures as valuable, corrupted training data instead of discarding them.

Giannis Daras (@giannis_daras)

Announcing Ambient Protein Diffusion, a state-of-the-art 17M-parameter generative model for protein structures. Diversity improves by 91% and designability by 26% over the previous 200M-parameter SOTA model for long proteins. The trick? Treat low-pLDDT AlphaFold predictions as low-quality data.

Giannis Daras (@giannis_daras)

Ambient Protein Diffusion treats low-pLDDT AlphaFold (AF) structures as low-quality data. Instead of filtering them out (as done in prior work), we use them for a subset of the diffusion times. Enough noise "erases" the AF mistakes, and we can still learn from those structures.
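
The idea of using a low-confidence structure only for a subset of the diffusion times can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the pLDDT threshold of 80, the linear ramp, and the uniform time sampling are all assumptions made for the example. The tweet only states that low-pLDDT structures are used at some diffusion times (the noisier ones) rather than being filtered out.

```python
import random


def min_diffusion_time(plddt, clean_threshold=80.0, t_max=1.0):
    """Hypothetical mapping from a pLDDT score (0-100) to the smallest
    diffusion time at which a structure may be used for training.

    Structures at or above the threshold are usable at every time
    (returns 0.0); lower-confidence ones only at noisier times.
    """
    if plddt >= clean_threshold:
        return 0.0
    # Assumed linear ramp: the lower the confidence, the more noise is required.
    return t_max * (clean_threshold - plddt) / clean_threshold


def sample_training_time(plddt, rng, t_max=1.0):
    """Draw a diffusion time uniformly from the range allowed for this structure."""
    return rng.uniform(min_diffusion_time(plddt, t_max=t_max), t_max)


rng = random.Random(0)
for plddt in (95.0, 70.0, 40.0):
    t_min = min_diffusion_time(plddt)
    t = sample_training_time(plddt, rng)
    print(f"pLDDT={plddt:5.1f}  allowed t in [{t_min:.3f}, 1.000]  sampled t={t:.3f}")
```

In this sketch a high-confidence structure contributes at every noise level, while a low-confidence one is only ever seen at times where, per the tweet, the added noise is large enough to "erase" the prediction errors.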

Giannis Daras (@giannis_daras)

The results are quite strong. Ambient Protein Diffusion substantially outperforms previous baselines in short and long protein generation. For short proteins, we dominate the Pareto frontier between designability and diversity, using a ~13x smaller model than previous SOTA.

Giannis Daras (@giannis_daras)

Joint work with the amazing Jeffrey Ouyang-Zhang (equal contribution) and with wonderful people: Danny Diaz, K. Ravishankar, W. Daspit, A. Klivans, Constantinos Daskalakis, Q. Liu. It's also my first paper in the proteins space, so show it some love!

Danny Diaz (@aiproteins)

Had a lot of fun learning diffusion and addressing key issues in protein diffusion with Giannis Daras and Jeffrey Ouyang-Zhang. TLDR: a few protein structure insights inspired us to design a new diffusion loss, training regime, and dataset, resulting in significant performance improvements.

yi (@agihippo)

GPT3: scale compute by 10x to get a good model
Grok-4: scale RL compute by 10x to get a good model
Llama-5: scale employee comp by 10x to get a good model.

Kirill Neklyudov (@k_neklyudov)

1/ Where do Probabilistic Models, Sampling, Deep Learning, and Natural Sciences meet? 🤔 The workshop we’re organizing at #NeurIPS2025! 📢 FPI@NeurIPS 2025: Frontiers in Probabilistic Inference – Learning meets Sampling Learn more and submit → fpiworkshop.org