Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile
Chenyang Lyu 吕晨阳

@chenyang_lyu

Researcher at @MBZUAI, PhD from @ml_labs_irl and @dcucomputing @dcu interested in Natural Language Processing, mainly Large Language Models (LLMs).

ID: 1138273563293769728

Website: http://lyuchenyang.github.io · Joined: 11-06-2019 02:35:40

1.1K Tweets

990 Followers

750 Following

Songlin Yang (@songlinyang4) 's Twitter Profile Photo

📢 (1/16) Introducing PaTH 🛣️ — a RoPE-free contextualized position encoding scheme, built for stronger state tracking, better extrapolation, and hardware-efficient training. PaTH outperforms RoPE across short and long language modeling benchmarks arxiv.org/abs/2505.16381

steven (@tu7uruu) 's Twitter Profile Photo

🚨 Don’t trust your ASR benchmarks. LibriSpeech and Common Voice are contaminated. Your SpeechLLM might be cheating—because it already saw the test answers during pretraining. Stop evaluating on leaked data. Here's why 1/n
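The contamination claim above hinges on detecting overlap between pretraining data and benchmark test sets. As a rough illustration (not the thread's actual method; the function names and the n-gram length are assumptions), leakage can be flagged by checking for long verbatim n-gram overlaps between a training corpus and benchmark transcripts:

```python
# Hypothetical sketch: flag test transcripts that share long verbatim
# n-grams with a pretraining corpus. Threshold n=8 is illustrative.

def ngrams(tokens, n):
    """Set of all n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(train_texts, test_texts, n=8):
    """Fraction of test samples sharing at least one n-gram with training data."""
    train_grams = set()
    for text in train_texts:
        train_grams |= ngrams(text.lower().split(), n)
    flagged = 0
    for text in test_texts:
        if ngrams(text.lower().split(), n) & train_grams:
            flagged += 1
    return flagged / max(len(test_texts), 1)
```

In practice one would use normalized transcripts and suffix-array or Bloom-filter lookups for corpus-scale data; the set-based version only shows the idea.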

Yapei Chang (@yapeichang) 's Twitter Profile Photo

Qwen benefits from random rewards on math 🧮 but that doesn't hold for general instruction following (IF). That said, noisy rewards aren't useless. In [🫐 arxiv.org/pdf/2505.11080], we show that simple string-matching metrics like BLEU can work surprisingly well for general IF!
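The BLEU-as-reward idea can be sketched in a few lines. This is an illustrative simplified sentence-level BLEU used as a scalar reward, not the paper's implementation (smoothing is omitted and all names are assumptions):

```python
# Illustrative sketch: a BLEU-style string-matching score as a scalar
# reward signal for instruction following. Not the paper's code.
from collections import Counter
import math

def bleu_reward(candidate, reference, max_n=4):
    """Geometric mean of modified n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped counts
        total = sum(cand_ngrams.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing, for brevity
        log_precisions.append(math.log(overlap / total))
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))  # brevity penalty
    return bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 1.0, a disjoint response scores 0.0, and partial overlap lands in between, which is what makes it usable as a dense-ish reward.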

CLS (@chengleisi) 's Twitter Profile Photo

This year, there have been various pieces of evidence that AI agents are starting to be able to conduct scientific research and produce papers end-to-end, at a level where some of these generated papers were already accepted by top-tier conferences/workshops. Intology’s

Slator (@slatornews) 's Twitter Profile Photo

Alibaba Group introduces TransBench, a benchmark designed to evaluate 🔍 how well #AI #translation systems perform in real-world industry settings — starting with international e-commerce 🛍️🌐 #xl8 #t9n #ecommerce Minghao Wu Chenyang Lyu 吕晨阳 Longyue Wang slator.com/alibaba-launch…

Yu Zhang (@yuz9yuz) 's Twitter Profile Photo

"Is ACL an AI conference?" A possible perspective is to examine: "Which ACL topics resonate beyond academia and capture public attention?" Our #ACL2025 main conf paper delves into this issue. (1/n)

Yiqing Liang (@yiqingliang2) 's Twitter Profile Photo

🧵 MoDoMoDo: smarter data mixing → stronger reasoning for multimodal LLMs 🚀 New preprint! We show how the right mix of multi-domain data can supercharge multimodal LLM RL reasoning. 🌐 Site: modomodo-rl.github.io 📄 Paper: arxiv.org/abs/2505.24871 details in thread. 👇

Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile Photo

Ah, this is interesting: language models are easily contaminated by data already included in the training process. I'm curious what would happen if this were extended to a multilingual ASR setting, especially with some correlation analysis against the training data.

James Barry (@jamesarbarry) 's Twitter Profile Photo

If any Irish speakers are interested in helping annotate some of the BLEnD examples to Irish, please let me know (can be as little as a few examples). We aim to have Irish included in the next release. huggingface.co/datasets/nayeo…

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

The Diffusion Duality "The arg max operation transforms Gaussian diffusion into Uniform-state diffusion" Adapts consistency distillation to diffusion language models, unlocking few-step generation by accelerating sampling by two orders of magnitude. Introduces a curriculum

Audio and Speech Processing Papers (@audioandspeech) 's Twitter Profile Photo

PMF-CEC: Phoneme-augmented Multimodal Fusion for Context-aware ASR Error Correction with Error-specific Selective Decoding. arxiv.org/abs/2506.11064

MiniMax (official) (@minimax__ai) 's Twitter Profile Photo

Day 1/5 of #MiniMaxWeek: We’re open-sourcing MiniMax-M1, our latest LLM — setting new standards in long-context reasoning. - World’s longest context window: 1M-token input, 80k-token output - State-of-the-art agentic use among open-source models - RL at unmatched efficiency:

Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile Photo

Wow, to see "Open"AI in second-to-last position — and surprised to see IBM is a major contributor, bravo! Qwen is the leading effort from China, same for DeepSeek. Hope open science grows even stronger.

Tianyu Gao (@gaotianyu1350) 's Twitter Profile Photo

Check out our work on fair comparison among KV cache reduction methods and PruLong, one of the most effective, easiest-to-use memory-reduction methods for long-context LMs!

Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile Photo

Good workshop, but why use this bad AI-generated picture with lots of meaningless symbols? It destroys people's taste and sense of art and design.