OLMo is here! And it’s 100% open.
It’s a state-of-the-art LLM and we are releasing it with all pre-training data and code. Let’s get to work on understanding the science behind LLMs. Learn more about the framework and how to access it here:
blog.allenai.org/olmo-open-lang…
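To try the model right away, here is a minimal sketch of loading OLMo through Hugging Face `transformers`. The Hub ID `allenai/OLMo-7B` and the `trust_remote_code` flag are assumptions; see the blog post above for the exact access instructions.

```python
# Minimal sketch: loading OLMo via Hugging Face transformers.
# The Hub ID "allenai/OLMo-7B" and the trust_remote_code flag are
# assumptions; see the blog post above for exact access instructions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Language modeling is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```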
Allen AI presents Dolma
an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
paper page: huggingface.co/papers/2402.00…
dataset: huggingface.co/datasets/allen…
We release Dolma, a three-trillion-token English corpus built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials.
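For anyone who wants to poke at the corpus, here is a minimal sketch of streaming it with the `datasets` library. The dataset ID `allenai/dolma` and the per-document `text` field are assumptions; check the dataset card linked above.

```python
# Minimal sketch: streaming Dolma with the `datasets` library.
# The dataset ID "allenai/dolma" and the "text" field are assumptions;
# streaming avoids downloading the full three-trillion-token corpus.
from datasets import load_dataset

dolma = load_dataset("allenai/dolma", split="train", streaming=True)
for i, doc in enumerate(dolma):
    print(doc["text"][:200])  # first 200 characters of each document
    if i == 2:  # peek at just a few documents
        break
```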
1/5 🧠 Excited to share our latest paper focusing on the heart of LLM training: data curation! We train a 7B LLM achieving 64% on 5-shot MMLU, using only 2.6T tokens. The key to this performance? Exceptional data curation. #LLM #DataCuration
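For readers unfamiliar with the setup, "5-shot MMLU" means five worked question-answer pairs are prepended before each test question. Below is an illustrative sketch of building such a prompt; the input structures (`dev_examples` and friends) are hypothetical placeholders, not the paper's evaluation harness.

```python
# Illustrative sketch of 5-shot prompting as used in MMLU-style evaluation:
# five solved examples precede the test question, and the model is scored
# on its choice among A/B/C/D. The input structures here are hypothetical.
def format_example(question, choices, answer=None):
    block = question + "\n"
    for letter, choice in zip("ABCD", choices):
        block += f"{letter}. {choice}\n"
    block += "Answer:" + (f" {answer}\n\n" if answer else "")
    return block

def build_5shot_prompt(dev_examples, test_question, test_choices):
    # dev_examples: list of (question, choices, answer) tuples
    prompt = ""
    for question, choices, answer in dev_examples[:5]:  # the five "shots"
        prompt += format_example(question, choices, answer)
    prompt += format_example(test_question, test_choices)  # model completes this
    return prompt
```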
1/7 🚨 What do LLMs do when they are uncertain? We found that the stronger the LLM, the more it hallucinates and the less it loops! This pattern extends to sampling methods and instruction tuning. 🧵👇
Mor Geva, Jonathan Berant, Ori Yoran
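Since the thread contrasts hallucination with looping, here is a hedged sketch of one simple way to flag looping (degenerate repetition) in generated text. The n-gram size and thresholds are illustrative choices, not the paper's detection method.

```python
# Hedged sketch: flag "looping" by counting repeated n-grams near the end
# of a generated token sequence. Parameters are illustrative defaults,
# not the detection criterion used in the paper.
from collections import Counter

def is_looping(tokens, n=4, window=64, max_repeats=3):
    tail = tokens[-window:]  # only inspect the most recent tokens
    ngrams = [tuple(tail[i:i + n]) for i in range(len(tail) - n + 1)]
    if not ngrams:
        return False
    return max(Counter(ngrams).values()) > max_repeats

print(is_looping(["the", "cat", "sat"] * 20))   # True: heavy repetition
print(is_looping(list("abcdefghijklmnop")))     # False: no repeats
```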
Working on a new web agent? AssistantBench, our benchmark of realistic and challenging web tasks, just got an update:
* Our SeePlanAct Agent with Sonnet 3.5 achieved a new SoTA of 26.4%.
* We just open-sourced our agent.
* Accepted to #EMNLP2024!
I will be presenting SUPER next week at EMNLP, Tuesday 4pm. Stop by to talk about evaluating agents on running research experiments and code in the wild!
New #ICLR2025 paper!
The KoLMogorov Test: can CodeLMs compress data by code generation?
The optimal compression for a sequence is the shortest program that generates it. Empirically, LMs struggle even on simple sequences, but can be trained to outperform current methods!
🧵1/7
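To make the setup concrete, here is a hedged sketch of the scoring idea: a candidate program must reproduce the target sequence exactly, and shorter programs compress better. The `gen()` convention and the length ratio are illustrative, not the paper's exact metric.

```python
# Hedged sketch of compression-as-code-generation: accept a candidate
# program only if it reproduces the data exactly, then score it by its
# length relative to the data. The gen() convention is illustrative.
def compression_score(program_src, target):
    namespace = {}
    exec(program_src, namespace)   # candidate must define gen()
    if namespace["gen"]() != target:
        return None                # failed to reproduce the sequence
    return len(program_src) / len(str(target))  # < 1 means compression

target = [i * i for i in range(100)]  # a simple sequence: the first 100 squares
candidate = "def gen():\n    return [i * i for i in range(100)]"
print(compression_score(candidate, target))
```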
🔭 Science relies on shared artifacts collected for the common good.
🛰 So we asked: what's missing in open language modeling?
🪐 DataDecide 🌌 charts the cosmos of pretraining—across scales and corpora—at a resolution beyond any public suite of models that has come before.
We released a massive suite of 30K checkpoints to facilitate research into pretraining data decisions! We include insights on which evaluation choices (metrics, benchmarks) best track progress, with comparisons to existing methods.
Check out DataDecide! 🔮🥇🥈🥉
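One way to use a suite like this: check whether small-scale results predict which corpus wins at a larger scale. Below is a hedged sketch using Spearman rank correlation; the scores are made-up placeholders, not DataDecide results.

```python
# Hedged sketch: does a benchmark at small scale rank pretraining corpora
# the same way a larger scale does? Scores are made-up placeholders,
# not DataDecide results.
from scipy.stats import spearmanr

corpora = ["corpus_a", "corpus_b", "corpus_c", "corpus_d"]
small_scale_acc = [0.31, 0.28, 0.35, 0.30]  # e.g. small proxy models
large_scale_acc = [0.52, 0.47, 0.58, 0.50]  # e.g. target-scale models

rho, pval = spearmanr(small_scale_acc, large_scale_acc)
print(f"Rank correlation across corpora: {rho:.2f} (p={pval:.2f})")
```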