David Wadden (@davidjwadden) 's Twitter Profile
David Wadden

@davidjwadden

Graduate student at @uwcse studying NLP.

ID: 1256342994237583360

Link: https://dwadden.github.io
Joined: 01-05-2020 22:01:45

53 Tweets

389 Followers

101 Following

Ai2 (@allen_ai) 's Twitter Profile Photo

Today we're thrilled to announce our new undertaking to collaboratively build the best open language model in the world: AI2 OLMo. Uniquely open, 70B parameters, coming early 2024 – join us! blog.allenai.org/announcing-ai2…

Yizhong Wang (@yizhongwyz) 's Twitter Profile Photo

🦙🐪🐫 So many instruction tuning datasets came out recently! How valuable are they, and how far are open models really from proprietary ones like ChatGPT?

🧐We did a systematic exploration, and built Tülu---a suite of LLaMa-tuned models up to 65B!

📜arxiv.org/abs/2306.04751
Yanai Elazar (@yanaiela) 's Twitter Profile Photo

Does arXiving have a causal effect on acceptance?

The answer is nuanced, and depends on what assumptions you are willing to make, but arguably more importantly, we observe no difference in acceptance for different groups.

arxiv.org/abs/2306.13891
Ashish Sharma (@sharma_ashish_2) 's Twitter Profile Photo

Absolutely thrilled🎉 to receive the ACL 2023 #ACL2023NLP 🏆Outstanding Paper Award🏆 for our work on cognitive reframing of negative thoughts! A huge shoutout to the diverse team behind this work: Allen School, UW NLP, Mental Health America, and Stanford Health Care 💖

Daniel Weld (@dsweld) 's Twitter Profile Photo

Interested in a better way to explore #VLDB2023 papers? Try exp-sum.apps.allenai.org for an LLM-powered way to probe those papers…
* Ask questions w/ a single click
* Explore answer provenance
* Dive deep w/ recursive questions
Powered by Semantic Scholar

Orion Weller @ ICLR 2025 (@orionweller) 's Twitter Profile Photo

Using LLMs for query or document expansion in retrieval (e.g. HyDE and Doc2Query) has scores going 📈

But do these approaches work for all IR models and for different types of distribution shifts? Turns out it's actually more 📉 🚨

📝 (arxiv soon): orionweller.github.io/assets/pdf/LLM…
Hamish Ivison (@hamishivi) 's Twitter Profile Photo

Check out the Tulu 2 suite 🐪, a set of Llama-2 models finetuned+DPO-trained on a mixture of publicly available datasets! Our best-performing models are competitive with SoTA open models on a range of benchmarks incl. AlpacaEval and MT-Bench.
📜Paper: arxiv.org/abs/2311.10702
Semantic Scholar (@semanticscholar) 's Twitter Profile Photo

New feature alert 🚨 On each paper page, scroll down to find AI-generated Topic pages related to the paper, which include topic definitions, papers most cited for the topic, and more! Now available for Computer Science fields. Here’s an example: semanticscholar.org/paper/SPECTER%…

Yanai Elazar (@yanaiela) 's Twitter Profile Photo

This is fantastic news!! Somewhat of a coincidence, but our paper studying the effect of early arXiving on acceptance, which suggested the effect is small and does not serve its purpose, was accepted to CLeaR (Causal Learning and Reasoning) 2024 x.com/yanaiela/statu…

Ai2 (@allen_ai) 's Twitter Profile Photo

OLMo is here! And it’s 100% open. It’s a state-of-the-art LLM and we are releasing it with all pre-training data and code. Let’s get to work on understanding the science behind LLMs. Learn more about the framework and how to access it here: blog.allenai.org/olmo-open-lang…

Semantic Scholar Research @ AI2 (@ai2_s2research) 's Twitter Profile Photo

📣 Job opportunities at Semantic Scholar Research @ the Allen Institute for AI (AI2) for post-doctoral & pre-doctoral researchers starting in 2024! 📣

Our team works on NLP and HCI research with a focus on open LLMs and LLM-powered research support tools and assistants.
Fangyuan Xu (@brunchavecmoi) 's Twitter Profile Photo

Instruction-following capabilities of LLMs are a prerequisite to AI ✒️ writing assistance. How good are current LLMs at this task?

We present 🥝 𝗞𝗜𝗪𝗜, a dataset with instructions for knowledge-intensive, document-grounded writing for long-form answers to research questions.
Nathan Lambert (@natolambert) 's Twitter Profile Photo

Excited to share something that we've needed since the early open RLHF days: RewardBench, the first benchmark for reward models.
1. We evaluated 30+ of the currently available RMs (w/ DPO too).
2. We created new datasets covering chat, safety, code, math, etc. We learned a lot.
Hanna Hajishirzi (@hannahajishirzi) 's Twitter Profile Photo

Introducing our best OLMo yet. OLMo 1.7-7B outperforms LLaMa2-7B, approaching LLaMa2-13B at MMLU and GSM8k. High-quality data and staged training are key. 

I am so proud of our team for making such a significant improvement in a short period after our first release.
Ai2 (@allen_ai) 's Twitter Profile Photo

Looking for a dataset to enhance language model instruction-following over scientific literature? Introducing SciRIFF, a dataset of 137K expert-written demonstrations spanning 5 essential task categories for literature understanding: information extraction, summarization,

Kejian Shi (@shi_kejian) 's Twitter Profile Photo

Introducing SciRIFF, a toolkit to enhance LLM instruction-following over scientific literature. 137k expert demonstrations in 5 categories: IE, summarization, QA, entailment, and classification; models up to 70b and code to science-tune your checkpoints included! Read more in 🧵:

Yuling Gu (@gu_yuling) 's Twitter Profile Photo

LLMs are evaluated on the same tasks in so many different ways! 🤯

✨ We introduce OLMES –  a standard for reproducible LLM evaluations that is open, practical, completely documented, and can be applied to current leaderboards & eval code bases! ✨

📜 arxiv.org/abs/2406.08446
1/
Kyle Lo (@kylelostat) 's Twitter Profile Photo

Luca Soldaini 🎀 and I are arriving at #ACL2024 🇹🇭 today!

come find us at our talks & poster sessions for our OLMo & Dolma projects with Ai2 & frens 🤩

also don't miss our poster on KIWI 🥝 for interactive science QA w/ our intern Fangyuan Xu & mentors Eunsol Choi David Wadden