Luca Soldaini ✈️ ICLR 25 (@soldni) Twitter Tweets • TwiCopy

Luca Soldaini ✈️ ICLR 25

@soldni

I like tokens! I lead the OLMo data team at @allen_ai w/ @kylelostat. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot

ID: 1865461842

https://soldaini.net · 15-09-2013 00:09:49

19,19K Tweet

9,9K Followers

1,1K Following

We enabled OLMoTrace for Tülu 3 models! 🤠 Matched spans are shorter than for OLMo models, bc we can only search in Tülu's post-training data (base model is Llama). Yet we thought it'd still bring some value. Try it yourself on the Ai2 playground -- playground.allenai.org

42 likes · 2 replies · 12 retweets

Luca Soldaini ✈️ ICLR 25

@soldni

a month ago

Only a fraction of the data needed for LLMs comes with identifiable licenses. But if you curate it all, can you train a model on it? We release Common Pile, a 1T-token dataset, and train a 7B model on it! Results are on par with open weights models trained on eq FLOPS

72 likes · 2 replies · 6 retweets

Luca Soldaini ✈️ ICLR 25

@soldni

a month ago

does anyone **actually** read the output of research assistant or do you also immediately go to references

9 likes · 1 reply · 0 retweets

Luca Soldaini ✈️ ICLR 25

@soldni

a month ago

one hard lesson i’ve learned from working on large-scale ML systems: no change is too small to be ablated. every hypothesis consumes a non-negligible amount of bandwidth, and there are simply Too Many Things to try. instead of arguing whether something should work, just try it

120 likes · 2 replies · 3 retweets

Luca Soldaini ✈️ ICLR 25

@soldni

a month ago

Google Cloud somewhat borked, and has brought WandB down with it. Might as well pack and go home?

25 likes · 3 replies · 0 retweets

Luca Soldaini ✈️ ICLR 25

@soldni

a month ago

probably the greatest paper title in history

12 likes · 1 reply · 0 retweets

Luca Soldaini ✈️ ICLR 25

@soldni

a month ago

So glad to see Molmo recognized. Amazing work led by Chris, Matt Deitke, and Ani Kembhavi; I feel so fortunate to have at least helped a lil bit!

14 likes · 0 replies · 0 retweets

Kaiser Sun

@kaiserwholearns

a month ago

What happens when an LLM is asked to use information that contradicts its knowledge? We explore knowledge conflict in a new preprint📑 TLDR: Performance drops, and this could affect the overall performance of LLMs in model-based evaluation.📑🧵⬇️ 1/8 #NLProc #LLM #AIResearch