Orion Weller @ ICLR 2025 (@orionweller) 's Twitter Profile
Orion Weller @ ICLR 2025

@orionweller

PhD student @jhuclsp. Previously: @samaya_ai, @allen_ai. #NLProc and #IR research

ID: 3057771000

Link: http://orionweller.github.io · Joined: 02-03-2015 19:11:31

412 Tweets

1.1K Followers

886 Following

Raphaël Sourty (@raphaelsrty) 's Twitter Profile Photo

To anyone wondering what's the difference between encoders and decoders on downstream tasks when both models are trained the same way, this blog post is made for you. Very interesting resource and new models available, impressive work 🙌

tomaarsen (@tomaarsen) 's Twitter Profile Photo

I'm very excited to see more Ettin-based embedding models being trained. It would be really solid to see training recipes applied on all 6 sizes. The 17M encoder should allow for a model that outperforms all-MiniLM-L6-v2 with roughly the same size, I think
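
For anyone who wants to try this, here is a minimal sketch (not an official recipe) of fine-tuning one of the small Ettin encoders into an embedding model with Sentence Transformers. The repo ID jhu-clsp/ettin-encoder-17m and the example pair dataset are assumptions; check the Ettin collection on the Hugging Face Hub for exact names.

```python
# Hedged sketch: turning the 17M Ettin encoder into a sentence-embedding
# model with Sentence Transformers. The model repo ID and the training
# dataset below are assumptions, not part of the original announcement.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Wrapping a plain encoder checkpoint adds a mean-pooling head by default.
model = SentenceTransformer("jhu-clsp/ettin-encoder-17m")  # assumed repo ID

# Any (anchor, positive) pair dataset works with this loss; AllNLI pairs
# are used here purely as an example.
train_dataset = load_dataset("sentence-transformers/all-nli", "pair", split="train[:10000]")

# In-batch negatives: other positives in the batch serve as negatives.
loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
model.save_pretrained("ettin-17m-embed")
```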

search founder (@n0riskn0r3ward) 's Twitter Profile Photo

Finally got the chance to read about Ettin (huggingface.co/blog/ettin). Good stuff, encoders are better. Makes sense. But practically, there are all kinds of Apache 2 decoders to work with trained on 15T+ tokens and I'm pretty focused on retrieval...

Antoine Chaffin (@antoine_chaffin) 's Twitter Profile Photo

If you missed it because you were at a conference: last week we released SOTA encoders and decoders across various sizes, alongside public data to reproduce them. I already had nice feedback from people on the small models; can’t wait to see what the community will build!

Knowledgator (@knowledgator) 's Twitter Profile Photo

🧠 The models are based on DeBERTa, ModernBERT, and the Ettin small model for edge-device use cases. Variants include:
– gliclass-edge-v3.0: ultra-efficient
– gliclass-large-v3.0: high accuracy
– gliclass-x-base: robust multilingual zero-shot
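
A hedged sketch of zero-shot classification with one of these variants follows, based on the usage pattern documented in the knowledgator/GLiClass repository; the model ID, pipeline arguments, and threshold are assumptions, so consult the model cards for the exact API.

```python
# Hedged sketch of zero-shot classification with a gliclass-v3.0 variant.
# The model ID and pipeline arguments are assumptions drawn from the
# GLiClass repo's documented pattern, not from this announcement.
from gliclass import GLiClassModel, ZeroShotClassificationPipeline
from transformers import AutoTokenizer

model_id = "knowledgator/gliclass-edge-v3.0"  # assumed ID: ultra-efficient variant
model = GLiClassModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# multi-label: each candidate label is scored independently.
pipeline = ZeroShotClassificationPipeline(
    model, tokenizer, classification_type="multi-label", device="cpu"
)

text = "The new encoder beats much larger models on retrieval benchmarks."
labels = ["machine learning", "sports", "politics"]
results = pipeline(text, labels, threshold=0.5)[0]  # list of {label, score}
for result in results:
    print(result["label"], "=>", result["score"])
```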

Orion Weller @ ICLR 2025 (@orionweller) 's Twitter Profile Photo

Does anyone have these stats for ICLR/NeurIPS, etc.? Wondering if there’s a US trend to avoid *CL confs (my personal experience, sadly) or if this is the case at all conferences.

Antoine Chaffin (@antoine_chaffin) 's Twitter Profile Photo

Obviously it was caught by Sumit before the official announcement! 😁 I am very happy to announce that PyLate now has an associated paper, and it has been accepted to CIKM! Very happy to share this milestone with my dear co-creator Raphaël Sourty 🫶
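
For context, PyLate is a library for training and serving late-interaction (ColBERT-style) retrieval models. Below is a minimal indexing-and-retrieval sketch following its documented pattern; the example checkpoint, toy documents, and index settings are assumptions, so see the PyLate docs for the exact API.

```python
# Hedged sketch of multi-vector (late-interaction) retrieval with PyLate.
# The model ID, documents, and index settings are illustrative assumptions.
from pylate import indexes, models, retrieve

model = models.ColBERT(model_name_or_path="lightonai/colbertv2.0")  # example checkpoint

# Build a Voyager (HNSW) index over per-token document embeddings.
index = indexes.Voyager(index_folder="pylate-index", index_name="demo", override=True)
retriever = retrieve.ColBERT(index=index)

documents_ids = ["1", "2"]
documents = ["PyLate trains and serves ColBERT models.", "Encoders excel at retrieval."]

documents_embeddings = model.encode(documents, is_query=False)
index.add_documents(documents_ids=documents_ids, documents_embeddings=documents_embeddings)

# Queries are encoded separately, then scored token-by-token (MaxSim).
queries_embeddings = model.encode(["what is PyLate?"], is_query=True)
scores = retriever.retrieve(queries_embeddings=queries_embeddings, k=2)
print(scores)
```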

Hamel Husain (@hamelhusain) 's Twitter Profile Photo

TOC for the open book "Beyond Naive RAG: Practical Advanced Methods" from our RAG series. This condenses 5 hours of instruction into something you can read in ~30 minutes. Link: maven.com/p/945082/beyon… Ben Clavié Nandan Thakur Orion Weller Antoine Chaffin Bryan Bischof fka Dr. Donut

Fred Jonsson (@enginoid) 's Twitter Profile Photo

so many great artifacts of this work:
- an open-data recipe for ModernBERT that exceeds ModernBERT in performance
- tons of checkpoints (17m, 32m, 68m, 150m, 400m, 1b)
- direct comparison of same training recipe, data & model shape with masked vs. causal LM
- open data +
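
To make the masked-vs-causal comparison concrete: because each pair shares a recipe, data, and shape, loading the two objectives side by side is a one-liner each. A minimal sketch, assuming the repo IDs follow the jhu-clsp/ettin-encoder-*/ettin-decoder-* naming implied in this thread; check the Ettin collection on the Hugging Face Hub for the exact names.

```python
# Minimal sketch of loading one matched encoder/decoder pair. The repo IDs
# are assumptions inferred from the naming in this thread.
from transformers import AutoModelForCausalLM, AutoModelForMaskedLM, AutoTokenizer

encoder_id = "jhu-clsp/ettin-encoder-150m"  # assumed repo ID
decoder_id = "jhu-clsp/ettin-decoder-150m"  # assumed repo ID

# Same recipe/data/shape, different objectives:
# masked LM = bidirectional attention, predicts [MASK] tokens;
encoder = AutoModelForMaskedLM.from_pretrained(encoder_id)
# causal LM = left-to-right attention, predicts the next token.
decoder = AutoModelForCausalLM.from_pretrained(decoder_id)

tokenizer = AutoTokenizer.from_pretrained(encoder_id)
```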