BIU NLP (@biunlp)'s Twitter Profile

BIU NLP
@biunlp

The Bar-Ilan University Natural Language Processing group.

ID: 347798643
Link: https://biu-nlp.github.io/
Joined: 03-08-2011 11:21:06

192 Tweets · 740 Followers · 103 Following

Eran Hirsch (@hirscheran):

🚨 Introducing LAQuer, accepted to #ACL2025 (main conf)!

LAQuer provides more granular attribution for LLM generations: users can just highlight any output fact (top) and get the attributing input snippet (bottom). This reduces the amount of text the user has to read by 2…
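
To make the interaction concrete, here is a minimal sketch of highlight-to-source attribution. It is not LAQuer's actual method: the lexical-overlap scorer and the function names are stand-ins for the model-based alignment the paper studies.

```python
# Hypothetical sketch: the user highlights an output fact, and we return
# the source sentence that best supports it. Plain token overlap keeps
# the example self-contained; real systems use model-based alignment.

def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between lowercased token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def attribute_fact(fact: str, source_sentences: list[str]) -> str:
    """Return the source sentence most similar to the highlighted fact."""
    return max(source_sentences, key=lambda s: token_overlap(fact, s))

source = [
    "The committee approved the new budget on Monday",
    "The plan allocates two million dollars to road repairs",
]
print(attribute_fact("the budget was approved on Monday", source))
```
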
Elias Stengel-Eskin (on the faculty job market) (@eliaseskin):

Attribution is key to auditing an LLM's generations, but it is only useful when it is high-precision. LAQuer requires models to attribute arbitrary spans in the output to fine-grained source spans, making checking the output for factuality faster and easier on…

David Wan (@meetdavidwan):

Verifying LLM-generated facts can be a slog through lengthy citations. LAQuer introduces a novel framework, allowing users to simply highlight an output fact and get precise, snippet-level attribution! This massively cuts down the reading needed for verification, by up to 2…

Elias Stengel-Eskin (on the faculty job market) (@eliaseskin):

🚨 CLATTER treats entailment as a reasoning process, guiding models to follow concrete steps (decomposition, attribution/entailment, and aggregation). CLATTER improves hallucination detection via NLI, with gains on ClaimVerify, LFQA, and TofuEval, especially on long-reasoning…

Eran Hirsch (@hirscheran):

🚨 New preprint! We propose a reasoning process for hallucination detection:
1️⃣ Decompose the output
2️⃣ Generate fine-grained attribution (if possible), and accordingly make local entailment decisions
3️⃣ Aggregate all into a final decision
We also introduce metrics to evaluate…
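
A minimal sketch of this three-step recipe, assuming hypothetical helpers: decompose, attribute, and entails stand in for the LLM and NLI calls, and the "any unsupported claim flags the output" aggregation is one illustrative choice, not necessarily the paper's.

```python
from typing import Callable, Optional

def detect_hallucination(
    source: str,
    output: str,
    decompose: Callable[[str], list[str]],           # output -> atomic claims
    attribute: Callable[[str, str], Optional[str]],  # (claim, source) -> snippet or None
    entails: Callable[[str, str], bool],             # (premise, claim) -> verdict
) -> bool:
    """Return True if any atomic claim in `output` is unsupported by `source`."""
    for claim in decompose(output):                  # 1. decompose the output
        snippet = attribute(claim, source)           # 2a. fine-grained attribution
        premise = snippet if snippet is not None else source
        if not entails(premise, claim):              # 2b. local entailment decision
            return True                              # 3. aggregate: one unsupported
    return False                                     #    claim flags a hallucination
```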

David Wan (@meetdavidwan):

Excited to share GenerationPrograms! 🚀

How do we get LLMs to cite their sources? GenerationPrograms is attributable by design, producing a program that executes to text w/ a trace of how the text was generated! Gains of up to +39 Attribution F1, and it eliminates uncited sentences…
David Wan (@meetdavidwan):

🎉 Our paper, GenerationPrograms, which proposes a modular framework for attributable text generation, has been accepted to the Conference on Language Modeling (COLM)! GenerationPrograms produces a program that executes to text, providing an auditable trace of how the text was generated and major gains on…
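
A toy sketch of the "program that executes to text" idea: the program is a sequence of operations over numbered source sentences, so every output sentence carries a trace of exactly which inputs produced it. The operation names and the join-based fusion are illustrative placeholders, not the paper's actual modules.

```python
SOURCES = {
    1: "The committee approved the budget on Monday.",
    2: "The budget allocates two million dollars to road repairs.",
}

def copy_op(ids: list[int]) -> str:
    return " ".join(SOURCES[i] for i in ids)

def fuse_op(ids: list[int]) -> str:
    # A real system would call an LLM fusion module here; joining the
    # sentences keeps the sketch runnable without one.
    return " ".join(SOURCES[i] for i in ids)

OPS = {"copy": copy_op, "fuse": fuse_op}
PROGRAM = [("copy", [1]), ("fuse", [1, 2])]  # the "generation program"

for op_name, ids in PROGRAM:
    text = OPS[op_name](ids)
    # The (operation, source ids) pair is the auditable trace.
    print(f"{text}  [trace: {op_name}{ids}]")
```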

Tzuf - צוף (@tzuf6):

With BIU NLP, Itai Mondshine, and Reut Tsarfaty. Our paper "Beyond N-Grams: Rethinking Evaluation Metrics and Strategies for Multilingual Abstractive Summarization": arxiv.org/pdf/2507.08342

Amir David Nissan cohen (@amirdnc):

Led by Aviya Maimon, our new paper redefines how we evaluate LLMs. Instead of one flat leaderboard score, we uncover the latent skills—reasoning, comprehension, ethics, precision & more—that really shape LLM ability. Think: psychometrics meets AI. link: arxiv.org/pdf/2507.20208

Shauli Ravfogel (@ravfogel):

1/8 Happy to share our new paper—“IQ Test for LLMs”—co-authored with Aviya Maimon, Amir David Nissan cohen, Gal Vishne @neurogal.bsky.social and Reut Tsarfaty. We propose to rethink how language models are evaluated by focusing on the latent capabilities that explain benchmark results. Arxiv: arxiv.org/pdf/2507.20208

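To illustrate the psychometric framing, here is a minimal sketch that treats a models-by-benchmarks score matrix like test-taker data and extracts latent factors. The synthetic data and the choice of plain factor analysis are assumptions for the example; the paper's actual skill set and estimation method may differ.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
scores = rng.random((20, 8))  # 20 models x 8 benchmark scores (synthetic)

fa = FactorAnalysis(n_components=3, random_state=0)
abilities = fa.fit_transform(scores)  # per-model latent "skill" estimates
loadings = fa.components_             # how each benchmark loads on each skill

print(abilities.shape, loadings.shape)  # (20, 3) (3, 8)
```
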
Avshalom Manevich (@avshalomm):

We introduce CoCI, which improves fine-grained visual discrimination in LVLMs using contrast images. It shows up to a 98.9% improvement on NaturalBench across three different supervision regimes. With Reut Tsarfaty. 📄 aclanthology.org/anthology-file…
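
As a rough sketch of how a contrast image can sharpen discrimination (one reading of the idea, not necessarily CoCI's exact procedure): prefer the caption the model supports on the original image but not on a minimally different contrast image. The score function is a hypothetical stand-in for an LVLM's image-text likelihood.

```python
from typing import Callable

def pick_caption(
    score: Callable[[object, str], float],  # hypothetical LVLM image-text scorer
    image: object,
    contrast_image: object,                 # minimally different counterpart
    captions: list[str],
) -> str:
    """Prefer the caption whose support drops most on the contrast image:
    a caption that fits both images does not discriminate the difference."""
    return max(captions, key=lambda c: score(image, c) - score(contrast_image, c))
```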