Chenchen Han (@chenchenha42849)'s Twitter Profile
Chenchen Han

@chenchenha42849

ID: 1730417643587567616

01-12-2023 02:45:05

30 Tweets

19 Followers

78 Following

fajie yuan (@duguyuan)

Exciting highlights:
1️⃣ Training is super easy: no ML or coding expertise needed!
2️⃣ Biologists can share models on our community store for others to use or retrain.
3️⃣ Join OPMC as a paper author! More contributions are welcome!
FAQs: github.com/westlake-repl/… Colaboratory #OPMC

Leo Zang (@leotz03)

ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning
- Aligns sequence-structure, sequence-function, and structure-function pairs by ESM, BERT, and Foldseek
- Leverages max-inner product search for rapid retrieval
preprint: biorxiv.org/content/10.110…
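The retrieval step mentioned in this tweet, max-inner product search, can be illustrated with a minimal sketch. This is not ProTrek's code: the function name `top_k_inner_product`, the toy 2-D embeddings, and the brute-force NumPy scan are all assumptions made for illustration, and a real system would likely use an approximate index over much larger embeddings.

```python
import numpy as np

def top_k_inner_product(query, database, k=3):
    """Return (indices, scores) of the k database rows with the largest
    inner product with the query. Hypothetical helper, brute-force scan."""
    scores = database @ query                  # one inner product per row
    k = max(1, min(k, scores.shape[0]))        # keep k within valid bounds
    # argpartition selects the k best rows without sorting everything
    top = np.argpartition(-scores, k - 1)[:k]
    # order the selected rows by descending score
    top = top[np.argsort(-scores[top])]
    return top, scores[top]

# Toy embeddings (assumed values, not real protein embeddings)
database = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
    [-1.0, 0.0],
])
query = np.array([1.0, 0.2])

indices, scores = top_k_inner_product(query, database, k=2)
print(indices, scores)
```

In a tri-modal setting, the query and database embeddings could come from different encoders (for example a text query against sequence embeddings), provided contrastive training has placed them in a shared space.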

fajie yuan (@duguyuan)

Excited to share ProTrek, a fast & accurate protein search tool!
- 30x/60x better seq-func/func-seq retrieval
- 100x faster than Foldseek & MMseqs2
- 9 tasks: seq-struc, seq-func, struc-func, etc.
- Beats ESM2 in 9/11 tasks
Thanks to Sergey Ovchinnikov chentongwang biorxiv.org/content/10.110…

fajie yuan (@duguyuan)

Introducing ProTrek, a 3-modal PLM for protein seq, struc, and func:
✨ Trained on 40M protein-text pairs, 100x larger than ProteinCLIP, ProtST, ProteinCLAP
🚀 30x/60x better accuracy than ProtST, ProteinCLAP
⚡ 100x faster than Foldseek, MMseqs2 for similar function searches

Jin Su (@ltenjoy)

We are thrilled to release ProTrek, a tri-modal PLM modeling protein sequence, structure and function!
ProTrek supports both retrieval (9 tasks) and downstream fine-tuning!
👉Paper: biorxiv.org/content/10.110…
👉Github: github.com/westlake-repl/…
👉Demo: huggingface.co/spaces/westlak…

fajie yuan (@duguyuan)

My student Jin Su evaluated ESM3 (v1) on the inverse folding task. The results look great! Waiting for more results. Also check out SaprotHub, which has no license limitations: biorxiv.org/content/10.110… Contributions to SaprotHub are welcome, and contributors can become authors! github.com/westlake-repl/…

fajie yuan (@duguyuan)

Zhikai uploaded a 6-min tutorial for SaprotHub! 🚀 Biologists can now easily train & share their protein language models. Join us, be a SaprotHub author! #Bioinformatics #ProteinModeling Jin Su Sergey Ovchinnikov
Paper: biorxiv.org/content/10.110…
Video: youtube.com/watch?v=r42z1h…

fajie yuan (@duguyuan)

Great news: a wet lab submitted an EYFP fluorescence fitness model to SaprotHub with a Spearman ρ of 0.94, close to wet-lab accuracy for double/triple-site mutations. Trained on 100K variants, it's a great 🔧 tool for biologists! Boston Protein Design and Modeling Club Machine learning for protein engineering seminar Sergey Ovchinnikov Jin Su
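For readers unfamiliar with the metric quoted above, Spearman ρ is the Pearson correlation computed on ranks rather than raw values. The sketch below is illustrative only: the fitness values are made up, `spearman_rho` is a hypothetical helper, and the rank computation assumes there are no tied values.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation, assuming no ties in x or y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # argsort of argsort gives each element's rank (0 = smallest)
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    # Pearson correlation of the two rank vectors
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Made-up measured vs. predicted fitness values for five variants
measured = [0.10, 0.35, 0.50, 0.80, 0.95]
predicted = [0.20, 0.30, 0.70, 0.60, 0.90]

rho = spearman_rho(measured, predicted)
print(round(rho, 3))
```

Because only the ordering matters, a model can reach a high ρ even when its predicted values are on a different scale from the measured fluorescence.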

fajie yuan (@duguyuan)

Recruited 12 bio students, no coding exp, to use ColabSaprot for re-training, zero-shot mutation, & protein design. They matched AI experts w/o hyper-parameter tuning!
With SaprotHub, any biologist can train protein models! Sergey Ovchinnikov Jin Su
biorxiv.org/content/10.110…

Jin Su (@ltenjoy)

We believe in ProTrek's potential for searching large databases for proteins with functions of interest. Any suggestions for our evaluation would be appreciated!
ProTrek demo: huggingface.co/spaces/westlak…
Paper: biorxiv.org/content/10.110…

Jin Su (@ltenjoy)

Used SaprotHub to predict mutations for eTDG, a uracil-N-glycosylase variant. 🧬
Lab results: 17 out of top 20 mutations had higher T-to-G editing efficiency than wild type (marked as red), with 3 showing nearly 2x improvement! 🚀

fajie yuan (@duguyuan)

Pinal demonstrates impressive performance when evaluated using GT-TMscore and ProTrek CLIP score, outperforming ESM-3 on dry-experiment metrics when keywords are used as the prompt. We plan to validate these results with wet experiments.

fajie yuan (@duguyuan)

We've released ColabProTrek, the successor to ColabSaprot.
🔬 Try it out: colab.research.google.com/drive/1On2xQU0…
🆕 We've also expanded ProTrek's search capabilities with additional databases including UniRef50 and PDB.
🧬 Explore: huggingface.co/spaces/westlak…
Paper: biorxiv.org/content/10.110…

fajie yuan (@duguyuan)

🚀 New Update. The latest version of ProTrek is now available on bioRxiv. 🧬
📑 Read it here: biorxiv.org/content/10.110…
• Service: huggingface.co/spaces/westlak…
• Try it on Colab: colab.research.google.com/drive/1On2xQU0…

fajie yuan (@duguyuan)

Excited to share our AI+cryo-EM work! 🧬
🔬 Cryo-IEF: Foundation model trained on 65M particles
🤖 CryoWizard: automated structure pipeline
🎯 Making cryo-EM accessible to more labs
Preprint: biorxiv.org/content/10.110…
Code: github.com/westlake-repl/…
#CryoEM #AI #StructuralBiology

DailyHealthcareAI (@aipulserx)

How can we effectively decode and understand the complex molecular language of proteins to unlock their functional secrets at scale? bioRxiv Westlake University
"Decoding the Molecular Language of Proteins with Evola"
• Scientists have developed Evola, an 80 billion

fajie yuan (@duguyuan)

We've released our protein ChatGPT, Evola! 🌟 chat-protein.com
Evola comes in two versions: 10B & 80B. The 80B model has a 1.3B Saprot encoder & a 70B LLaMA3 decoder.
Trained on 546M protein question-text pairs with 150 billion word tokens! 💡🔬
biorxiv.org/content/10.110…

Biology+AI Daily (@biologyaidaily)

Decoding the Molecular Language of Proteins with Evola
1. Evola introduces an 80-billion-parameter multimodal protein-language model to decode protein functions, leveraging protein sequences, structures, and user queries.
2. A key innovation is its unprecedented training
