Yangyi Chen (on job market) (@yangyichen6666) 's Twitter Profile
Yangyi Chen (on job market)

@yangyichen6666

CS Ph.D. student at UIUC @IllinoisCS, focusing on scalable foundation models. I’m on the industry job market, seeking full-time research scientist positions!

ID: 1430592529322287108

Website: https://yangyi-chen.github.io/
Joined: 25-08-2021 18:07:09

397 Tweets

922 Followers

289 Following

Shizhe Diao (@shizhediao) 's Twitter Profile Photo

Thrilled to share my first project at NVIDIA! ✨

Today’s language models are pre-trained on vast and chaotic Internet texts, but these texts are unstructured and poorly understood. We propose CLIMB — Clustering-based Iterative Data Mixture Bootstrapping — a fully automated
Shizhe Diao (@shizhediao) 's Twitter Profile Photo

[Approach] 
➤ Embeds and clusters web-scale data semantically.
➤ Searches, iteratively and efficiently, for optimal data mixtures using a lightweight proxy model + predictor loop.
➤ Learns how different domains interact, and how the right mix can unlock downstream performance
Dylan (@dylan_works_) 's Twitter Profile Photo

Decision:
Tweet

Comment:
Okay, here is the summary of this Summary:

Summary:
Besides this picture, this message hallucinates the full name of our approach based on the acronym, which includes 2 words that appeared ZERO times in the entire paper.
ICML Conference
Xiusi Chen (@xiusi_chen) 's Twitter Profile Photo

🚀 Can we cast reward modeling as a reasoning task?

📖 Introducing our new paper: 
RM-R1: Reward Modeling as Reasoning

📑 Paper: arxiv.org/pdf/2505.02387
💻 Code: github.com/RM-R1-UIUC/RM-…

Inspired by recent advances of long chain-of-thought (CoT) on reasoning-intensive tasks, we
Heng Ji (@hengjinlp) 's Twitter Profile Photo

We are extremely excited to announce mCLM, a Modular Chemical Language Model that is friendly to automatable block-based chemistry and mimics bilingual speakers by “code-switching” between functional molecular modules and natural language descriptions of the functions. 1/2
Hyeonjeong Ha (@hyeonjeong_ai) 's Twitter Profile Photo

Thrilled to share that our paper has been accepted to #ACL2025 Main 🇦🇹

Huge thanks to my amazing collaborators and my advisor Heng Ji 🙃
📄arxiv.org/abs/2502.17793

Happy to chat about our work as well as MLLM research projects 🙌
Wei Ping (@_weiping) 's Twitter Profile Photo

Introducing AceReason-Nemotron: Advancing math and code reasoning through reinforcement learning (RL)

We propose conducting RL on math-only prompts first, then on code-only prompts. 
Our key findings include:
- Math-only RL significantly boosts both math and code benchmarks!
-
Shizhe Diao (@shizhediao) 's Twitter Profile Photo

Does RL truly expand a model’s reasoning🧠capabilities? Contrary to recent claims, the answer is yes—if you push RL training long enough!

Introducing ProRL 😎, a novel training recipe that scales RL to >2k steps, empowering the world’s leading 1.5B reasoning model💥and offering
Yang Chen (@ychennlp) 's Twitter Profile Photo

📢We conduct a systematic study to demystify the synergy between SFT and RL for reasoning models.

The result? We trained a 7B model - AceReason-Nemotron-1.1, significantly improved from version 1.0 on math and coding benchmarks.

✅AIME2025 (math): 53.6% -> 64.8%
✅LiveCodeBench
Martin Ziqiao Ma (@ziqiao_ma) 's Twitter Profile Photo

Can we scale 4D pretraining to learn general space-time representations that reconstruct an object from a few views at any time to any view at any other time? Introducing 4D-LRM: a Large Space-Time Reconstruction Model that ... 🔹 Predicts 4D Gaussian primitives directly from

May Fung (@may_f1_) 's Twitter Profile Photo

🧠 How can AI evolve from statically 𝘵𝘩𝘪𝘯𝘬𝘪𝘯𝘨 𝘢𝘣𝘰𝘶𝘵 𝘪𝘮𝘢𝘨𝘦𝘴 → dynamically 𝘵𝘩𝘪𝘯𝘬𝘪𝘯𝘨 𝘸𝘪𝘵𝘩 𝘪𝘮𝘢𝘨𝘦𝘴 as cognitive workspaces, similar to the human mental sketchpad?

🔍 What’s the 𝗿𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝗿𝗼𝗮𝗱𝗺𝗮𝗽 from tool-use → programmatic
Zhenhailong Wang (@zhenhailongw) 's Twitter Profile Photo

Learning to perceive while learning to reason!
We introduce PAPO: Perception-Aware Policy Optimization, a direct upgrade to GRPO for multimodal reasoning. PAPO relies on internal supervision signals. No extra annotations, reward models, or teacher models needed. 🧵1/3