Chenchen Han (@chenchenha42849) Twitter Tweets • TwiCopy

fajie yuan

2 years ago

Exciting highlights: 1️⃣ Training is super easy: no ML or coding expertise needed! 2️⃣ Biologists can share models on our community store for others to use or retrain. 3️⃣ Join OPMC as a paper author! More contributions are welcome! FAQs: github.com/westlake-repl/… Colaboratory #OPMC

48 likes, 1 reply, 19 reposts

Leo Zang

@leotz03

2 years ago

ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning
- Aligns sequence-structure, sequence-function, and structure-function pairs using ESM, BERT, and Foldseek
- Leverages maximum inner product search for rapid retrieval
Preprint: biorxiv.org/content/10.110…

86 likes, 1 reply, 19 reposts

fajie yuan

@duguyuan

2 years ago

Excited to share ProTrek, a fast & accurate protein search tool! 30x/60x better seq-func/func-seq retrieval. 100x faster than Foldseek & MMseqs2. 9 tasks: seq-struc, seq-func, struc-func, etc. Beats ESM2 in 9/11 tasks. Thanks to Sergey Ovchinnikov and chentongwang. biorxiv.org/content/10.110…

59 likes, 0 replies, 16 reposts

fajie yuan

@duguyuan

2 years ago

Introducing ProTrek, a 3-modal PLM for protein seq, struc, and func:
✨ Trained on 40M protein-text pairs, 100x larger than ProteinCLIP, ProtST, ProteinCLAP
🚀 30x/60x better accuracy than ProtST, ProteinCLAP
⚡ 100x faster than Foldseek, MMseqs2 for similar function searches

84 likes, 7 replies, 30 reposts

Jin Su

@ltenjoy

2 years ago

We are thrilled to release ProTrek, a tri-modal PLM modeling protein sequence, structure and function! ProTrek supports both retrieval (9 tasks) and downstream fine-tuning! 👉Paper: biorxiv.org/content/10.110… 👉Github: github.com/westlake-repl/… 👉Demo: huggingface.co/spaces/westlak…

82 likes, 1 reply, 16 reposts

fajie yuan

@duguyuan

a year ago

My student Jin Su evaluated ESM3 (v1) on the inverse folding task. The results look great! Waiting for more results. Also check out SaprotHub, which has no license limitations: biorxiv.org/content/10.110… Contributions to SaprotHub are welcome; join us as an author! github.com/westlake-repl/…

5 likes, 0 replies, 3 reposts

fajie yuan

@duguyuan

a year ago

Zhikai uploaded a 6-min tutorial for SaprotHub! 🚀 Biologists can now easily train & share their protein language models. Join us, be a SaprotHub author! #Bioinformatics #ProteinModeling Jin Su Sergey Ovchinnikov Paper: biorxiv.org/content/10.110… Video: youtube.com/watch?v=r42z1h…

36 likes, 5 replies, 11 reposts

fajie yuan

@duguyuan

a year ago

Great news: a wet lab submitted an EYFP fluorescence fitness model to SaprotHub with a Spearman ρ of 0.94, close to wet lab accuracy for double/triple-site mutations. Trained on 100K variants, it's a great 🔧 tool for biologists! Boston Protein Design and Modeling Club, Machine learning for protein engineering seminar, Sergey Ovchinnikov, Jin Su

33 likes, 2 replies, 5 reposts

fajie yuan

@duguyuan

a year ago

Recruited 12 bio students, no coding exp, to use ColabSaprot for re-training, zero-shot mutation, & protein design. They matched AI experts w/o hyper-parameter tuning! With SaprotHub, any biologist can train protein models! Sergey Ovchinnikov Jin Su biorxiv.org/content/10.110…

217 likes, 7 replies, 33 reposts

fajie yuan

@duguyuan

a year ago

Training video: youtube.com/watch?v=r42z1h… Prediction video: youtube.com/watch?v=N5VMBw…

4 likes, 0 replies, 1 repost

Jin Su

@ltenjoy

a year ago

We believe in ProTrek's potential for searching large databases for proteins with functions of interest. Any suggestions for our evaluation would be appreciated! ProTrek demo: huggingface.co/spaces/westlak… Paper: biorxiv.org/content/10.110…

11 likes, 2 replies, 4 reposts

Jin Su

@ltenjoy

a year ago

Used SaprotHub to predict mutations for eTDG, a uracil-N-glycosylase variant. 🧬 Lab results: 17 out of top 20 mutations had higher T-to-G editing efficiency than wild type (marked as red), with 3 showing nearly 2x improvement! 🚀

21 likes, 0 replies, 4 reposts

fajie yuan

@duguyuan

a year ago

Pinal demonstrates impressive performance when evaluated using GT-TMscore and ProTrek CLIP score, outperforming ESM-3 on dry-experiment metrics when keywords are used as the prompt. We plan to validate these results with wet experiments.

5 likes, 0 replies, 2 reposts

fajie yuan

@duguyuan

a year ago

We've released ColabProTrek, the successor to ColabSaprot. 🔬 Try it out: colab.research.google.com/drive/1On2xQU0… 🆕 We've also expanded ProTrek's search capabilities with additional databases including UniRef50 and PDB. 🧬 Explore: huggingface.co/spaces/westlak… Paper: biorxiv.org/content/10.110…

60 likes, 1 reply, 20 reposts

fajie yuan

@duguyuan

a year ago

🚀 New Update. The latest version of ProTrek is now available on bioRxiv. 🧬 📑 Read it here: biorxiv.org/content/10.110… • Service: huggingface.co/spaces/westlak… • Try it on Colab: colab.research.google.com/drive/1On2xQU0…

55 likes, 1 reply, 8 reposts

fajie yuan

@duguyuan

a year ago

Excited to share our AI+cryo-EM work! 🧬 🔬 Cryo-IEF: Foundation model trained on 65M particles 🤖 CryoWizard: automated structure pipeline 🎯 Making cryo-EM accessible to more labs Preprint: biorxiv.org/content/10.110… Code: github.com/westlake-repl/… #CryoEM #AI #StructuralBiology

105 likes, 5 replies, 29 reposts

bioRxiv Bioinfo

@biorxiv_bioinfo

a year ago

Decoding the Molecular Language of Proteins with Evola biorxiv.org/cgi/content/sh… #biorxiv_bioinfo

39 likes, 0 replies, 10 reposts

DailyHealthcareAI

@aipulserx

a year ago

How can we effectively decode and understand the complex molecular language of proteins to unlock their functional secrets at scale? bioRxiv, Westlake University: "Decoding the Molecular Language of Proteins with Evola" • Scientists have developed Evola, an 80 billion

83 likes, 2 replies, 24 reposts

fajie yuan

@duguyuan

a year ago

We're releasing our protein ChatGPT, Evola! 🌟 chat-protein.com Evola comes in two versions: 10B & 80B. The 80B model has a 1.3B Saprot encoder & a 70B LLaMA3 decoder. Trained on 546M protein question-text pairs with 150 billion word tokens! 💡🔬 biorxiv.org/content/10.110…

610 likes, 20 replies, 141 reposts

Biology+AI Daily

@biologyaidaily

a year ago

Decoding the Molecular Language of Proteins with Evola 1. Evola introduces an 80-billion-parameter multimodal protein-language model to decode protein functions, leveraging protein sequences, structures, and user queries. 2. A key innovation is its unprecedented training

80 likes, 0 replies, 13 reposts