Ola Piktus (@olapiktus) 's Twitter Profile
Ola Piktus

@olapiktus

ID: 1102704797403238400

Joined: 04-03-2019 22:58:05

70 Tweets

1.1K Followers

405 Following

WikiResearch (@wikiresearch) 's Twitter Profile Photo

"Improving Wikipedia Verifiability with AI " a system that identifies Wikipedia citations that are unlikely to support their claims, and subsequently recommend better ones from the web. (Petroni et al, 2022) openreview.net/forum?id=qfTqR…

"Improving Wikipedia Verifiability with AI " a system that identifies Wikipedia citations that are unlikely to support their claims, and subsequently recommend better ones from the web.

(Petroni et al., 2022)

openreview.net/forum?id=qfTqR…
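
The system described above pairs a verifier, which scores how well a cited source supports a claim, with a retrieval engine that proposes alternatives from the web. A minimal sketch of that score-and-replace loop, using an off-the-shelf cross-encoder as a stand-in for the paper's trained verifier (the model name and example data below are illustrative assumptions, not the paper's setup):

```python
from sentence_transformers import CrossEncoder

# Minimal sketch of the claim-verification idea (not the paper's actual
# system): score how well each passage supports a claim with an
# off-the-shelf cross-encoder, and flag the citation if some web
# candidate supports the claim better than the currently cited source.
claim = "The Eiffel Tower was completed in 1889."
cited_passage = "The tower hosts a popular restaurant on its first floor."
web_candidates = [
    "Construction of the Eiffel Tower finished in March 1889.",
    "Paris is the capital of France.",
]

# Any passage-ranking cross-encoder works for the sketch; this public
# MS MARCO model is a stand-in for the verifier trained in the paper.
scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

current_score = scorer.predict([(claim, cited_passage)])[0]
candidate_scores = scorer.predict([(claim, p) for p in web_candidates])

best = candidate_scores.argmax()
if candidate_scores[best] > current_score:
    print("Citation unlikely to support the claim; better candidate found:")
    print(web_candidates[best])
```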
Will Held (@williambarrheld) 's Twitter Profile Photo

The Roots Search Tool from Hugging Face allowed me to quickly find XNLI examples in the BLOOM pre-training data - probably true for other benchmarks too.

This 🔥 tool really highlights open-data as key to making the study of LLM capabilities a science.

huggingface.co/spaces/bigscie…
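
The check behind this is simple in principle: does a benchmark example appear verbatim inside a pretraining document? A toy, in-memory version of that membership test (the real tool queries an index over the full ROOTS corpus; all data below is made up for illustration):

```python
# Toy version of the membership test behind the contamination check:
# does a benchmark example appear verbatim in any pretraining document?
# The real tool queries an index over the full ROOTS corpus; the data
# below is made up for illustration.
benchmark_examples = [
    "A soccer game with multiple males playing.",  # an NLI-style premise
    "The man is sleeping.",
]
pretraining_docs = [
    "... A soccer game with multiple males playing. Some men are ...",
    "An unrelated document about something else entirely.",
]

def normalize(text: str) -> str:
    # Light normalization so whitespace/case differences don't hide a hit.
    return " ".join(text.lower().split())

for example in benchmark_examples:
    hits = [i for i, doc in enumerate(pretraining_docs)
            if normalize(example) in normalize(doc)]
    if hits:
        print(f"Possible contamination: {example!r} appears in docs {hits}")
```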
Anna Rogers (@annargrs) 's Twitter Profile Photo

📢 New blog post: the attribution problem with generative AI #NLProc AI & Society TLDR: Some argue that publicly available data is fair game for commercial models bc human text/art also has sources. But unlike models, we know when attribution is due... hackingsemantics.xyz/2022/attributi…

Ola Piktus (@olapiktus) 's Twitter Profile Photo

ROOTS Search Tool now shinier and better with both exact search and BM25-based sparse retrieval ☀️🌸 Check out Anna Rogers's thread for details 🚀
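
For context, BM25 ranks documents by term-frequency and inverse-document-frequency statistics rather than exact string match, so it finds near matches that exact search misses. A minimal sketch of BM25-based sparse retrieval using the rank_bm25 package on a toy corpus (illustration only, not the tool's actual index):

```python
from rank_bm25 import BM25Okapi  # pip install rank-bm25

# Toy BM25 sparse retrieval; the real tool indexes the full ROOTS
# corpus, so treat this purely as illustration.
corpus = [
    "BLOOM is a multilingual open-access language model.",
    "The ROOTS corpus was used to pretrain BLOOM.",
    "Paris is the capital of France.",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

query = "what data was BLOOM trained on".lower().split()
scores = bm25.get_scores(query)

# Rank documents by BM25 score, highest first.
for idx in sorted(range(len(corpus)), key=scores.__getitem__, reverse=True):
    print(f"{scores[idx]:.2f}  {corpus[idx]}")
```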

Anna Rogers (@annargrs) 's Twitter Profile Photo

ROOTS search tool for BLOOM🌸 training data will be presented at #ACL2023 as a demo paper! Really proud to be part of this important precedent.

Niklas Muennighoff (@muennighoff) 's Twitter Profile Photo

How to keep scaling Large Language Models when data runs out? 🎢

We train 400 models with up to 9B params & 900B tokens to create an extension of Chinchilla scaling laws for repeated data. Results are interesting… 🧐

📜:  arxiv.org/abs/2305.16264

1/7
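
The paper's core device is an "effective data" term: tokens seen again on later epochs are worth less than fresh tokens, with their value decaying roughly exponentially in the number of repetitions. A small sketch of that idea, where the decay constant is an assumed illustrative value rather than the paper's fitted one:

```python
import math

# Sketch of the "effective data" idea for repeated tokens. The
# exponential-decay shape follows the paper; the decay constant R_STAR
# below is an assumed illustrative value, not the fitted one.
R_STAR = 15.0

def effective_tokens(unique_tokens: float, epochs: float) -> float:
    """Effective token count when `unique_tokens` are trained on for `epochs`."""
    repeats = epochs - 1  # extra passes beyond the first epoch
    return unique_tokens * (1 + R_STAR * (1 - math.exp(-repeats / R_STAR)))

# Diminishing returns from repeating 100B unique tokens:
for epochs in (1, 2, 4, 16, 64):
    d_eff = effective_tokens(100e9, epochs)
    print(f"{epochs:>3} epochs -> {d_eff / 1e9:7.1f}B effective tokens")
```

At one epoch this reduces to the unique token count itself, and as repetitions grow the effective count plateaus, which is the sense in which repeating data eventually stops helping.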
EleutherAI (@aieleuther) 's Twitter Profile Photo

Releasing data is amazing, but tools like these that help people make sense of the data are arguably an even more important step forward for data transparency. We're thrilled to see our community continue to lead by example when it comes to transparent releases.

Fabio Petroni (@fabio_petroni) 's Twitter Profile Photo

Every researcher's path is a rollercoaster filled with numerous lows. With unspeakable joy, today I want to share a high point: our paper, “Improving Wikipedia Verifiability with AI,” has been published in the prestigious Nature Portfolio. Dive in nature.com/articles/s4225… [1/4]
Ola Piktus (@olapiktus) 's Twitter Profile Photo

Last time in New Orleans was fun and it was not even for NeurIPS. If you wanna chat about RAG, god or data hit me up this week. If you wanna tell me about LLM UX please ping me too 🙌 #NeurIPS2023
Patrick Lewis (@psh_lewis) 's Twitter Profile Photo

New paper from our team, led by @pat_verga 
Are you:
* Doing evaluation with LLMs? 
* Using a huge model?
* Worried about self-recognition?

Try an ensemble of smaller LLMs.
Use a PoLL: less biased, faster, 7x cheaper. Works great on QA & Arena-hard evals
arxiv.org/abs/2404.18796
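
The PoLL recipe itself is straightforward: collect verdicts from several small judge models and pool them, rather than trusting one large judge. A minimal sketch with majority-vote pooling; the judge functions here are hypothetical stand-ins for whatever model APIs you would wrap:

```python
from collections import Counter
from typing import Callable, Sequence

# Minimal sketch of a Panel of LLM evaluators (PoLL): gather verdicts
# from several small judge models and pool them by majority vote. Each
# judge is a hypothetical callable wrapping a model API; it returns
# "correct" or "incorrect" for a (question, reference, answer) triple.
Judge = Callable[[str, str, str], str]

def poll_verdict(judges: Sequence[Judge], question: str,
                 reference: str, answer: str) -> str:
    votes = Counter(judge(question, reference, answer) for judge in judges)
    # Majority vote is one pooling choice; averaging is the natural
    # analogue when judges return scalar scores instead of labels.
    return votes.most_common(1)[0][0]

# Toy stand-ins for small judge models.
def lenient_judge(q: str, ref: str, ans: str) -> str:
    return "correct" if ref.lower() in ans.lower() else "incorrect"

def strict_judge(q: str, ref: str, ans: str) -> str:
    return "correct" if ans.strip().lower() == ref.lower() else "incorrect"

panel = [lenient_judge, strict_judge, lenient_judge]
print(poll_verdict(panel, "Capital of France?", "Paris", "It is Paris."))
# -> "correct" (2 of 3 judges agree)
```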