Giovanni Monea (@giomonea)'s Twitter Profile
Giovanni Monea

@giomonea

🤖 NLP PhD student @cornell @cornell_tech | Previously @Apple, @amazon, @EPFL_en, @polimi

ID: 1681411154524946446

Website: http://giovannimonea.com · Joined: 18-07-2023 21:10:41

41 Tweets

196 Followers

212 Following

Mustafa Omer Gul (@momergul_)'s Twitter Profile Photo

New paper! Models that learn from feedback train on their own outputs, so you see performance 📈 but language diversity 📉. We show that if you couple comprehension and generation you learn faster 🏎️ AND get richer language! arxiv.org/abs/2408.15992 Demo and video ⬇ + in EMNLP!
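
The coupling claim is concrete enough to sketch. Below is a minimal, hedged illustration of the idea as described in the tweet, not the paper's training code (arxiv.org/abs/2408.15992); all function and tensor names here are made up for the example.

```python
import torch

# Minimal sketch of the coupling idea, NOT the paper's training code:
# one reward on the model's own utterance drives both a generation term
# (produce the utterance) and a comprehension term (recover the context
# from that same utterance).

def coupled_loss(gen_logprob: torch.Tensor,
                 comp_logprob: torch.Tensor,
                 reward: float) -> torch.Tensor:
    # gen_logprob:  log P(utterance | context) under the model
    # comp_logprob: log P(context | utterance) under the model
    # reward:       scalar feedback on the model's own utterance
    return -reward * (gen_logprob + comp_logprob)

# Toy usage with dummy log-probs standing in for real model outputs:
gen_lp = torch.tensor(-2.3, requires_grad=True)
comp_lp = torch.tensor(-4.1, requires_grad=True)
coupled_loss(gen_lp, comp_lp, reward=1.0).backward()
# Positive feedback pushes both log-probs up; the comprehension term is
# what keeps the learned language grounded (and, per the tweet, diverse).
```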

Julian Minder (@jkminder)'s Twitter Profile Photo

Can we understand and control how language models balance context and prior knowledge? Our latest paper shows it’s all about a 1D knob! 🎛️ arxiv.org/abs/2411.07404 Co-led with Kevin Du, as well as Niklas Stoehr, Giovanni Monea, Chris Wendler, Bob West & Ryan Cotterell.
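
A hedged sketch of what a "1D knob" can look like in practice, not the authors' code (arxiv.org/abs/2411.07404): shift one layer's residual stream along a single direction. The GPT-2-style module path `model.transformer.h[layer]` is an assumption, and `direction` would be identified empirically as in the paper.

```python
import torch

def attach_knob(model, layer: int, direction: torch.Tensor, alpha: float):
    """Nudge one layer's hidden states along a single unit-norm axis."""
    direction = direction / direction.norm()  # the knob axis

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * direction.to(hidden)
        if isinstance(output, tuple):
            return (hidden,) + output[1:]
        return hidden

    # The sign of alpha selects a side of the knob, e.g. the in-context
    # answer vs. the parametric prior; which sign is which is empirical.
    return model.transformer.h[layer].register_forward_hook(hook)
```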

Mustafa Omer Gul (@momergul_)'s Twitter Profile Photo

This still feels very surreal! I would like to thank EMNLP 2025 for this great honor, Yoav Artzi and my labmates for all of their support, and the many crowdworkers who played with and provided the feedback for our models!

Kianté Brantley (@xkianteb)'s Twitter Profile Photo

I am recruiting PhD students to join my lab at Harvard in Fall 2025! (deadline Dec 15) If you are interested in solving problems at the intersection of reinforcement learning, imitation learning, and NLP, pls consider applying (bit.ly/4fnficx)! Harvard SEAS Kempner Institute at Harvard University

Oreva Ahia (@orevaahia)'s Twitter Profile Photo

I am excited to be presenting MAGNET 🧲 at NeurIPS 2024 next week. Subword tokenizers have been shown to overly segment text in non-Latin script languages. Our work presents an approach to train tokenizer-free multilingual LMs via efficient byte-level modeling. 1/n

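A tiny sketch of the tokenizer-free premise, covering only the input side (MAGNET's contribution is the efficient byte-level architecture on top): raw UTF-8 bytes give every language the same 256-symbol vocabulary, so no subword vocabulary can over-segment non-Latin scripts.

```python
# Raw UTF-8 bytes as token ids: a shared 256-symbol vocabulary for all
# languages, with no learned subword segmentation to go wrong.

def bytes_to_ids(text: str) -> list[int]:
    return list(text.encode("utf-8"))

print(bytes_to_ids("cat"))   # [99, 97, 116] -> 3 ids
print(bytes_to_ids("ネコ"))  # 6 ids: each katakana is 3 UTF-8 bytes
```
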
Chris Wendler (@wendlerch)'s Twitter Profile Photo

SAEs pick up on abstract grammatical concepts that LLMs share across a diverse set of languages, even languages in which these grammatical concepts manifest in wildly different forms 🐍 🐈‍⬛ 🐮. See thread below 👇
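
The test the tweet describes is easy to sketch: check whether one sparse-autoencoder feature fires on translations that express the same grammatical concept. Everything below is a toy stand-in, not the thread's trained models; in practice the activations would come from a real LM layer and the encoder weights from a trained SAE.

```python
import torch

torch.manual_seed(0)
d_model, n_features = 16, 64

# Toy stand-ins, NOT trained models.
W_enc = torch.randn(n_features, d_model)

def sae_encode(acts: torch.Tensor) -> torch.Tensor:
    # Standard SAE encoder: sparse codes via ReLU of a linear map.
    return torch.relu(acts @ W_enc.T)

def fake_lm_activations(text: str) -> torch.Tensor:
    return torch.randn(len(text.split()), d_model)  # [tokens, d_model]

# The cross-lingual test: does one feature fire across translations?
feature_id = 7
for sent in ["the cats sleep", "les chats dorment", "die Katzen schlafen"]:
    codes = sae_encode(fake_lm_activations(sent))
    print(sent, "->", round(codes[:, feature_id].max().item(), 3))
```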

Yoav Artzi (@yoavartzi)'s Twitter Profile Photo

I am looking for a postdoc. A serious-looking call is coming soon, but this is to get it going. Topics include (but are not limited to): LLMs (🫢!), multimodal LLMs, interaction+learning, RL, intersection with cogsci, ... see our work to get an idea: yoavartzi.com/pubs Plz RT 🙏

Yoav Artzi (@yoavartzi)'s Twitter Profile Photo

We now have a form for postdoc applications: forms.gle/tiydAChgV1wLcQ… I am looking at candidates on a rolling basis, so while there's no deadline, there's an advantage to throwing your name in the ring earlier rather than later.

Yoav Artzi (@yoavartzi)'s Twitter Profile Photo

We recently pushed an update to this paper. Usually, updates don't justify a post, but this one is exceptionally contentful -> 🧵 tldr: all the findings are stronger, and the behaviors are super cool! arxiv.org/abs/2410.05362

Conference on Language Modeling (@colm_conf)'s Twitter Profile Photo

A bit of a mess around the conflict of COLM with the ARR (and to a lesser degree ICML) reviews release. We feel this is creating a lot of pressure and uncertainty. So, we are pushing our deadlines: Abstracts due March 22 AoE (+48hr), Full papers due March 28 AoE (+24hr). Plz RT 🙏

Anthropic (@anthropicai)'s Twitter Profile Photo

How does Claude understand different languages? We find shared circuitry underlying the same concepts in multiple languages, implying that Claude "thinks" using universal concepts even before converting those thoughts into language.

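A crude proxy for this claim that anyone can run, not Anthropic's circuit analysis (and using an open model, xlm-roberta-base, chosen just for the example): if concepts are shared across languages, translations should embed near each other. Note that both cosine scores will be high in absolute terms (encoder states are anisotropic); it is the comparison that matters.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

def embed(text: str) -> torch.Tensor:
    batch = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # [1, tokens, d]
    return hidden.mean(dim=1).squeeze(0)           # mean-pool tokens

cos = torch.nn.functional.cosine_similarity
cat_en, cat_fr, dog_en = embed("the cat"), embed("le chat"), embed("the dog")
print(cos(cat_en, cat_fr, dim=0))  # same concept, different language ...
print(cos(cat_en, dog_en, dim=0))  # ... expected higher than this one
```
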
Veniamin Veselovsky (@vminvsky)'s Twitter Profile Photo

New paper: Language models have “universal” concept representation – but can they capture cultural nuance? 🌏 If someone from Japan asks an LLM what color a pumpkin is, will it correctly say green (as they are in Japan)? Or does cultural nuance require more than just language?

Rishi Jha (@rishi_d_jha)'s Twitter Profile Photo

I’m stoked to share our new paper: “Harnessing the Universal Geometry of Embeddings” with jack morris, Collin Zhang, and Vitaly Shmatikov. We present the first method to translate text embeddings across different spaces without any paired data or encoders. Here's why we're excited: 🧵👇🏾

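For contrast with what the paper claims, here is the classical baseline, not the paper's method: orthogonal Procrustes maps one embedding space onto another, but it requires paired rows (X[i] and Y[i] must embed the same text). The paper's contribution is doing the translation with no such pairs and no access to the encoders.

```python
import numpy as np

# Classical paired-data baseline, NOT the paper's method.
def procrustes(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Orthogonal W minimizing ||X @ W - Y||_F, given paired rows."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))                  # embeddings in space A
R = np.linalg.qr(rng.normal(size=(32, 32)))[0]  # a hidden rotation
Y = X @ R                                       # same texts in space B
print(np.allclose(procrustes(X, Y), R))         # recovers R: True
```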