Luca Soldaini ✈️ ICLR 25 (@soldni) 's Twitter Profile
@soldni

I like tokens! I lead the OLMo data team at @allen_ai w/ @kylelostat. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot

ID: 1865461842

Website: https://soldaini.net · Joined: 15-09-2013 00:09:49

19,19K Tweet

9,9K Followers

1,1K Following

Jiacheng Liu (@liujc1998)

We enabled OLMoTrace for Tülu 3 models! 🤠 Matched spans are shorter than for OLMo models, bc we can only search in Tülu's post-training data (base model is Llama). Yet we thought it'd still bring some value. Try yourself on the Ai2 playground -- playground.allenai.org

Luca Soldaini ✈️ ICLR 25 (@soldni)

Only a fraction of data needed for LLMs comes with identifiable licenses. But if you curate it all, can you train a model on it? We release Common Pile, a 1T-token dataset, and train a 7B model on it! Results are on par with open weights models trained on eq FLOPS

Luca Soldaini ✈️ ICLR 25 (@soldni)

one hard lesson i’ve learned working on large-scale ML systems: no change is too small to be ablated. every hypothesis consumes a non-negligible amount of bandwidth, and there are simply Too Many Things to try. instead of arguing whether something should work, just try it

Kaiser Sun (@kaiserwholearns)

What happens when an LLM is asked to use information that contradicts its knowledge? We explore knowledge conflict in a new preprint📑 TLDR: Performance drops, and this could affect the overall performance of LLMs in model-based evaluation.📑🧵⬇️ 1/8 #NLProc #LLM #AIResearch
