Katelyn Mei (@meikatelyn)'s Twitter Profile
Katelyn Mei

@meikatelyn

PhDing in Information Science @UW | Human-AI Interaction, AI-assisted Decision-making | Psychology & Mathematics @Middlebury ’22

ID: 894191917327581186

Website: http://www.katelynmei.com | Joined: 06-08-2017 13:42:21

118 Tweets

149 Followers

944 Following

Katelyn Mei (@meikatelyn)'s Twitter Profile Photo

This was my first time at FAccT and I had a great time meeting new friends and mentors from the community. Really appreciated all the wonderful conversations and I look forward to the next FAccT!

Bloomberg (@business)'s Twitter Profile Photo

Water-guzzling data centers are facing new scrutiny as drought and extreme heat turn H2O into a more precious resource trib.al/jF5mWcZ

Ari Holtzman (@universeinanegg)'s Twitter Profile Photo

Incredibly excited to announce that Chenhao, Mina, and I are taking on the challenge of investigating the interplay of communication and intelligence in this new era of non-human language users!

Quan Ze Chen (@cquanze)'s Twitter Profile Photo

Hi Twitter/X folks! Excited to announce I am going on the job market this cycle! (industry & academia) I work on building uncertainty-aware tools and workflows that support capturing and defining socially-constructed concepts at scale. Here are some examples of my work: (1/n)

Ari Holtzman (@universeinanegg)'s Twitter Profile Photo

If you want a respite from OpenAI drama, how about joining academia? I'm starting Conceptualization Lab, recruiting PhDs & Postdocs! We need new abstractions to understand LLMs. Conceptualization is the act of building abstractions to see something new. conceptualization.ai

Jiacheng Liu (@liujc1998)'s Twitter Profile Photo

It’s year 2024, and n-gram LMs are making a comeback!! We develop infini-gram, an engine that efficiently processes n-gram queries with unbounded n and trillion-token corpora. It takes merely 20 milliseconds to count the frequency of an arbitrarily long n-gram in RedPajama (1.4T

Allison Koenecke (@allisonkoe)'s Twitter Profile Photo

🎷Excited to present our paper, “Careless Whisper: Speech-to-text Hallucination Harms” at ACM FAccT! 🎷We assess Whisper (OpenAI’s speech recognition tool) for transcribed hallucinations that don’t appear in audio input. Paper link: arxiv.org/abs/2402.08021, thread 👇

Rulin Shao (@rulinshao)'s Twitter Profile Photo

🔥We release the first open-source 1.4T-token RAG datastore and present a scaling study for RAG on perplexity and downstream tasks! We show LM+RAG scales better than LM alone, with better performance for the same training compute (pretraining+indexing) retrievalscaling.github.io 🧵
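The RAG setup being scaled here is simple at its core: score datastore documents against the query, keep the top-k, and prepend them to the LM's prompt. A minimal sketch using bag-of-words cosine similarity as a stand-in for the dense retriever a real system would use (all names here are illustrative):

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding'; real RAG systems use a dense encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, datastore, k=2):
    """Score every document against the query and return the top-k."""
    q = embed(query)
    return sorted(datastore, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

datastore = [
    "n-gram models count token sequences in a corpus",
    "retrieval augments a language model with external documents",
    "speech recognition transcribes audio into text",
]
query = "how does retrieval help a language model"
context = retrieve(query, datastore, k=1)
prompt = "Context: " + " ".join(context) + "\nQuestion: " + query
```

The scaling result is about growing `datastore`: indexing cost is paid once at "pretraining+indexing" time, so a larger datastore improves quality without more LM training compute.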

Bingbing Wen (@bingbingwen1)'s Twitter Profile Photo

🤔💭To answer or not to answer? We survey research on when language models should abstain in our new paper, "The Art of Refusal." Thread below! 🧵⬇️ arxiv.org/abs/2407.18418 Joint w/ Jihan Yao Shangbin Feng Chenjun Xu tsvetshop Bill Howe Lucy Lu Wang UW iSchool Allen School #nlproc

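One common abstention policy the survey's framing covers is thresholding the model's own confidence: answer only when the generated tokens were assigned high probability. A minimal sketch of that single policy (the threshold value and function names are illustrative, not from the paper):

```python
import math

def answer_or_abstain(token_logprobs, threshold=0.5):
    """Abstain when average per-token probability falls below a threshold.

    `token_logprobs` holds the log-probability the model assigned to each
    token it generated; many LM APIs expose these. This is just one simple
    abstention policy among the many the survey covers.
    """
    avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    return "answer" if avg_prob >= threshold else "abstain"

confident = [-0.05, -0.1, -0.02]  # near-certain tokens
unsure = [-1.5, -2.0, -1.2]       # low-probability tokens
print(answer_or_abstain(confident))  # answer
print(answer_or_abstain(unsure))     # abstain
```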
Rulin Shao (@rulinshao)'s Twitter Profile Photo

Happy to share our work on RAG scaling is accepted by NeurIPS Conference 🥳 Some new thoughts on this work: (1) Retrieving from a web-scale datastore is another way to do test-time scaling. It doesn't add much to the training cost, leading to better compute-optimal scaling curves. 🔎🧵

Akari Asai (@akariasai)'s Twitter Profile Photo

3/ 🔍 What is OpenScholar? It's a retrieval-augmented LM with 1️⃣ a datastore of 45M+ open-access papers 2️⃣ a specialized retriever and reranker to search the datastore 3️⃣ an 8B Llama fine-tuned LM trained on high-quality synthetic data 4️⃣ a self-feedback generation pipeline

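The four components above compose into a single loop: retrieve from the datastore, rerank, generate, then use the draft itself to fetch more evidence and regenerate. A sketch of that composition with trivial stand-in functions (every function body here is a hypothetical stub, not OpenScholar's code):

```python
def retrieve(query, datastore, k=3):
    """Stub retriever: keyword overlap (real system: dense retriever over 45M+ papers)."""
    score = lambda d: len(set(query.lower().split()) & set(d.lower().split()))
    return sorted(datastore, key=score, reverse=True)[:k]

def rerank(query, passages, k=2):
    """Stub reranker: keep the shortest passages (real system: a learned reranker)."""
    return sorted(passages, key=len)[:k]

def generate(query, passages):
    """Stub generator (real system: the fine-tuned 8B Llama)."""
    return f"Answer to '{query}' citing {len(passages)} passage(s)."

def self_feedback(query, datastore, rounds=2):
    """Draft an answer, then refine it with retrieval conditioned on the draft,
    mimicking the self-feedback generation pipeline."""
    draft = generate(query, rerank(query, retrieve(query, datastore)))
    for _ in range(rounds):
        extra = retrieve(draft, datastore, k=1)
        draft = generate(query, rerank(query, extra + retrieve(query, datastore)))
    return draft
```

The design point is that each stage is swappable: the same loop works whether the stubs are replaced by a dense retriever, a cross-encoder reranker, or a larger generator.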
Rulin Shao (@rulinshao)'s Twitter Profile Photo

Looking for an AI assistant to synthesize literature for your cutting-edge research? Don't miss out on 👩‍🔬OpenScholar-8B, led by Akari Asai! Our model can answer questions with up-to-date citations. Everything is open-sourced! Try out our demo: open-scholar.allen.ai