Dennis Aumiller (@d_aumiller) Twitter Tweets • TwiCopy

Dennis Aumiller

@d_aumiller

Getting paid to complain about LLM evaluation @cohere. PhD on summarization from @UniHeidelberg. Previously: @AmazonScience, @sap. Find me on Stackoverflow!

ID: 964558325001211904

Link: https://dennis-aumiller.de

Joined: 16-02-2018 17:53:20

804 Tweets

684 Followers

722 Following

Jan Trienes

@jantrienes

10 months ago

Do you want to know what information LLMs prioritize in text synthesis tasks? Here's a short 🧵 about our new paper: an interpretable framework for salience analysis in LLMs. First of all, information salience is a fuzzy concept. So how can we even measure it?

Likes: 25 · Replies: 1 · Retweets: 9

Cohere Labs

@cohere_labs

9 months ago

Introducing ✨ Aya Vision ✨ - an open-weights model to connect our world through language and vision. Aya Vision adds breakthrough multimodal capabilities to our state-of-the-art multilingual 8B and 32B models. 🌿

Likes: 465 · Replies: 19 · Retweets: 126

Maya Moritz

@mayarmoritz

9 months ago

Are you studying, working in, or utilizing #forensics? We're looking for expert opinions in a short, anonymous survey! Message or email me with any questions or for the link! #Science #DNA #pathology

Likes: 2 · Replies: 1 · Retweets: 1

Kyle Duffy

@kyduffy

9 months ago

My team recently launched a best-in-class LLM specializing in English and Arabic. We just published a tech report explaining our methods. Check it out on arxiv: arxiv.org/abs/2503.14603

Likes: 67 · Replies: 4 · Retweets: 11

Matthias Gallé

@mgalle

8 months ago

A year ago we released LBBP - a drop-in replacement of HumanEval that was more challenging and less leaked. Internally we have been using the multilingual version of this for benchmarking, and as code is not only Python we decided to release that as well: huggingface.co/datasets/Coher…

Likes: 57 · Replies: 0 · Retweets: 13

Nils Reimers

@nils_reimers

8 months ago

𝐂𝐨𝐡𝐞𝐫𝐞 𝐄𝐦𝐛𝐞𝐝 𝐯𝟒 - 𝐒𝐭𝐚𝐭𝐞-𝐨𝐟-𝐭𝐡𝐞-𝐚𝐫𝐭 𝐭𝐞𝐱𝐭 & 𝐢𝐦𝐚𝐠𝐞 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 Today we are releasing Embed v4, unlocking so many cool new features for retrieval. 🇺🇳 100+ languages 🖼️ Text & Image capabilities 📜 128k context length

Likes: 223 · Replies: 8 · Retweets: 29

Arnold Ventures

@arnold_ventures

7 months ago

AV's latest #BRIDGEseries convening brought together researchers, public officials, and industry experts to better understand the impact and prevalence of retail theft, and what can be done to effectively prevent it.

Likes: 8 · Replies: 1 · Retweets: 2

cohere

@cohere

7 months ago

Command A, our state-of-the-art generative model, is now the highest-scoring generalist LLM on the Bird Bench leaderboard for SQL! It outperforms other systems that rely on extensive scaffolding to tackle these SQL benchmarks, and instead delivers these results out-of-the-box,

Likes: 173 · Replies: 4 · Retweets: 18

Dennis Aumiller

@d_aumiller

6 months ago

It's my first time area chairing for the ACLRollingReview May cycle! And it will also be the first time asking for availability of emergency reviewers 😅 If you (or somebody you know) have availability for reviews in the Resources and Languages track, I have two papers missing reviews.

Likes: 6 · Replies: 0 · Retweets: 0

Dennis Aumiller

@d_aumiller

5 months ago

Probably a good time to mention that I will be in Vienna attending ACL in two weeks. If you're tired of attending session after session, come and talk to me about LLM evaluation instead (I won't tell on you for skipping sessions🤫)! DMs are open if you want to set something up :)

Likes: 18 · Replies: 0 · Retweets: 0

Dennis Aumiller

@d_aumiller

4 months ago

Genuine question: how do people in tech with imposter syndrome survive the bay area?? It's bad enough elsewhere, but with the talent density there (plus supposedly being more openly bragging), it seems like death

Likes: 12 · Replies: 0 · Retweets: 0

Nick Frosst

@nickfrosst

4 months ago

👁️👁️ Cohere has a vision model now cohere.com/blog/command-a…

Likes: 108 · Replies: 2 · Retweets: 18

Dennis Aumiller

@d_aumiller

4 months ago

No secret to anyone who works with Pierre (and his team), but they are super cracked. Seeing (pun intended) this model come to life was amazing! Please try it out and let us know what you think 😎

Likes: 29 · Replies: 0 · Retweets: 5