Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile
Chenyang Lyu 吕晨阳

@chenyang_lyu

Researcher at @MBZUAI, PhD from @ml_labs_irl and @dcucomputing @dcu interested in Natural Language Processing, mainly Large Language Models (LLMs).

ID: 1138273563293769728

Website: http://lyuchenyang.github.io · Joined: 11-06-2019 02:35:40

1.1K Tweets

990 Followers

750 Following

Songlin Yang (@songlinyang4) 's Twitter Profile Photo

📢 (1/16) Introducing PaTH 🛣️ — a RoPE-free contextualized position encoding scheme, built for stronger state tracking, better extrapolation, and hardware-efficient training. PaTH outperforms RoPE across short and long language modeling benchmarks arxiv.org/abs/2505.16381

steven (@tu7uruu) 's Twitter Profile Photo

🚨 Don’t trust your ASR benchmarks. LibriSpeech and Common Voice are contaminated. Your SpeechLLM might be cheating—because it already saw the test answers during pretraining. Stop evaluating on leaked data. Here's why 1/n
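The contamination claim above hinges on detecting overlap between pretraining data and benchmark test sets. As a rough illustration (not the thread's actual method; the function names and the n-gram length are assumptions), leakage can be flagged by checking for long verbatim n-gram overlaps between a training corpus and benchmark transcripts:

```python
# Hypothetical sketch: flag test transcripts that share long verbatim
# n-grams with a pretraining corpus. Threshold n=8 is illustrative.

def ngrams(tokens, n):
    """Set of all n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(train_texts, test_texts, n=8):
    """Fraction of test samples sharing at least one n-gram with training data."""
    train_grams = set()
    for text in train_texts:
        train_grams |= ngrams(text.lower().split(), n)
    flagged = 0
    for text in test_texts:
        if ngrams(text.lower().split(), n) & train_grams:
            flagged += 1
    return flagged / max(len(test_texts), 1)
```

In practice one would use normalized transcripts and suffix-array or Bloom-filter lookups for corpus-scale data; the set-based version only shows the idea.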

Yapei Chang (@yapeichang) 's Twitter Profile Photo

Qwen benefits from random rewards on math 🧮 but that doesn't hold for general instruction following (IF). That said, noisy rewards aren't useless. In [🫐 arxiv.org/pdf/2505.11080], we show that simple string-matching metrics like BLEU can work surprisingly well for general IF!
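The BLEU-as-reward idea can be sketched in a few lines. This is an illustrative simplified sentence-level BLEU used as a scalar reward, not the paper's implementation (smoothing is omitted and all names are assumptions):

```python
# Illustrative sketch: a BLEU-style string-matching score as a scalar
# reward signal for instruction following. Not the paper's code.
from collections import Counter
import math

def bleu_reward(candidate, reference, max_n=4):
    """Geometric mean of modified n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped counts
        total = sum(cand_ngrams.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing, for brevity
        log_precisions.append(math.log(overlap / total))
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))  # brevity penalty
    return bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 1.0, a disjoint response scores 0.0, and partial overlap lands in between, which is what makes it usable as a dense-ish reward.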

CLS (@chengleisi) 's Twitter Profile Photo

This year, there have been various pieces of evidence that AI agents are starting to be able to conduct scientific research and produce papers end-to-end, at a level where some of these generated papers were already accepted by top-tier conferences/workshops. Intology’s

Slator (@slatornews) 's Twitter Profile Photo

Alibaba Group introduces TransBench, a benchmark designed to evaluate 🔍 how well #AI #translation systems perform in real-world industry settings — starting with international e-commerce 🛍️🌐 #xl8 #t9n #ecommerce Minghao Wu Chenyang Lyu 吕晨阳 Longyue Wang slator.com/alibaba-launch…

Yu Zhang (@yuz9yuz) 's Twitter Profile Photo

"Is ACL an AI conference?" A possible perspective is to examine: "Which ACL topics resonate beyond academia and capture public attention?" Our #ACL2025 main conf paper delves into this issue. (1/n)

Yiqing Liang (@yiqingliang2) 's Twitter Profile Photo

🧵 MoDoMoDo: smarter data mixing → stronger reasoning for multimodal LLMs 🚀 New preprint! We show how the right mix of multi-domain data can supercharge multimodal LLM RL reasoning. 🌐 Site: modomodo-rl.github.io 📄 Paper: arxiv.org/abs/2505.24871 details in thread. 👇

Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile Photo

Ah, this is interesting: language models are easily contaminated by data already included in the training process. I'm curious what would happen if this were extended to a multilingual ASR setting, especially with some correlation analysis against the training data.

James Barry (@jamesarbarry) 's Twitter Profile Photo

If any Irish speakers are interested in helping annotate some of the BLEnD examples to Irish, please let me know (can be as little as a few examples). We aim to have Irish included in the next release. huggingface.co/datasets/nayeo…

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

The Diffusion Duality "The arg max operation transforms Gaussian diffusion into Uniform-state diffusion" Adapts consistency distillation to diffusion language models, unlocking few-step generation by accelerating sampling by two orders of magnitude. Introduces a curriculum

Audio and Speech Processing Papers (@audioandspeech) 's Twitter Profile Photo

PMF-CEC: Phoneme-augmented Multimodal Fusion for Context-aware ASR Error Correction with Error-specific Selective Decoding. arxiv.org/abs/2506.11064

MiniMax (official) (@minimax__ai) 's Twitter Profile Photo

Day 1/5 of #MiniMaxWeek: We’re open-sourcing MiniMax-M1, our latest LLM — setting new standards in long-context reasoning. - World’s longest context window: 1M-token input, 80k-token output - State-of-the-art agentic use among open-source models - RL at unmatched efficiency:

Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile Photo

Wow, to see "Open"AI in second-to-last position — and surprised to see IBM is a major contributor, bravo! Qwen is the leading effort from China, same for DeepSeek. Hope open science grows even stronger.

Tianyu Gao (@gaotianyu1350) 's Twitter Profile Photo

Check out our work on fair comparison among KV cache reduction methods and PruLong, one of the most effective, easiest-to-use memory-reduction methods for long-context LMs!

Chenyang Lyu 吕晨阳 (@chenyang_lyu) 's Twitter Profile Photo

Good workshop, but why use this bad AI-generated picture with lots of meaningless symbols? It destroys people's taste and sense of art and design.