Liyuan Liu (Lucas) (@liyuanlucas) Twitter Tweets • TwiCopy

Yufan Zhuang

2 months ago

Can LLMs reason beyond context limits? 🤔 Introducing Knowledge Flow, a training-free method that helped gpt-oss-120b & Qwen3-235B achieve 100% on the AIME-25, no tools. How? Like human deliberation, for LLMs. 📝 Blog: yufanzhuang.notion.site/knowledge-flow 💻 Code: github.com/EvanZhuang/kno…

Likes: 207 · Replies: 6 · Retweets: 35

Dinghuai Zhang 张鼎怀

@zdhnarsil

2 months ago

Check our Knowledge Flow blog: We develop a new axis of test-time scaling by doing iterative refinement on a "knowledge" list for reasoning tasks! Notably, we find that updating what is wrong is more effective than recording what is right. Great job led by Yufan Zhuang!

Likes: 69 · Replies: 0 · Retweets: 5

Nan Jiang

@nanjiang_cs

a month ago

got confused by something basic and went down a rabbit hole, so I just wrote a blogpost about it. "Is Density vs. Feature Coverage That Different?" nanjiang.cs.illinois.edu/2025/10/24/cov…

Likes: 47 · Replies: 0 · Retweets: 7

Thinking Machines

@thinkymachines

a month ago

Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other

Likes: 2.2K · Replies: 60 · Retweets: 381

Thinking Machines

@thinkymachines

a month ago

Today we’re announcing research and teaching grants for Tinker: credits for scholars and students to fine-tune and experiment with open-weight LLMs. Read more and apply at: thinkingmachines.ai/blog/tinker-re…

Likes: 944 · Replies: 17 · Retweets: 108

Liyuan Liu (Lucas)

@liyuanlucas

a month ago

This would be a very good starting point for learning / prototyping. Many times, people interested in learning ML/DL/LLM are intimidated by the sys/compute complexity.

Likes: 8 · Replies: 0 · Retweets: 0

Penghui Qi

@qphutu

a month ago

🚀Excited to share our new work! 💊Problem: The BF16 precision causes a large training-inference mismatch, leading to unstable RL training. 💡Solution: Just switch to FP16. 🎯That's it. 📰Paper: arxiv.org/pdf/2510.26788 ⭐️Code: github.com/sail-sg/Precis…

Likes: 591 · Replies: 18 · Retweets: 92
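
As a rough illustration of why the number format matters here (this is not the paper's code, only a sketch using general facts about the two formats): BF16 stores 7 explicit mantissa bits while FP16 stores 10, so the same value is rounded more coarsely in BF16. The helper names `to_fp16` and `to_bf16` below are hypothetical and emulate each format with the Python standard library.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision (10 explicit mantissa bits) and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Round x to bfloat16 (7 explicit mantissa bits), round-to-nearest-even.

    bfloat16 is the upper 16 bits of a float32 bit pattern, so we round the
    float32 representation and clear its lower 16 bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias of 0x7FFF plus the lowest kept bit implements round-half-to-even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", rounded & 0xFFFFFFFF))[0]


if __name__ == "__main__":
    # Illustrative values only (e.g. token probabilities); not data from the paper.
    for value in (0.1234567, 0.7654321, 1.0 + 2 ** -8):
        err_fp16 = abs(to_fp16(value) - value)
        err_bf16 = abs(to_bf16(value) - value)
        print(f"{value!r}: fp16 error={err_fp16:.3e}, bf16 error={err_bf16:.3e}")
```

The trade-off is dynamic range: FP16 has a much narrower exponent range than BF16, so values that are very large or very small can overflow or underflow in FP16.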

Yingru Li

@richardyrli

a month ago

Daniel Han, glad you liked the post! You're spot on to suspect lower-level implementation issues. That's exactly what we found in the original blog. The disable_cascade_attn finding (Sec 4.2.4) was the symptom, but the root cause was that silent FlashAttention-2 kernel bug

Likes: 341 · Replies: 8 · Retweets: 24

Penghui Qi

@qphutu

a month ago

Yingru Li Daniel Han Hi Yingru Li, I tried this disable_cascade_attn many times, including with the latest vLLM version. But unfortunately it made no difference in our experiments. So I guess it really depends on the setting.

Likes: 25 · Replies: 1 · Retweets: 1

Zongyi Li

@zongyilicaltech

a month ago

Life update: I will join NYU Courant Math (CAOS) and Center of Data Science as an assistant professor in Fall 2026. If you are interested in doing a PhD with me please let me know! zongyi-li.github.io

Likes: 611 · Replies: 17 · Retweets: 63

Shekswess

@shekswess

a month ago

It’s frustrating how labs like Kimi.ai, Qwen, DeepSeek, Ai2, Hugging Face... share their research, pipelines, and lessons openly, only for closed-source labs to quietly use that knowledge to build better models without ever giving back.

Likes: 1.1K · Replies: 77 · Retweets: 125

Liyuan Liu (Lucas)

@liyuanlucas

a month ago

Science is better when shared!

Likes: 30 · Replies: 2 · Retweets: 0

Liyuan Liu (Lucas)

@liyuanlucas

a month ago

We do! It's indeed better & very interesting! Check x.com/yufan_zhuang/s…

Likes: 2 · Replies: 0 · Retweets: 0

Ashwinee Panda

@pandaashwinee

a month ago

"Dense Backpropagation Improves Pretraining for sMoEs" is accepted at NeurIPS Conference! We show that we can proxy inactive experts with a cheap estimator, and that doing this in pretraining improves performance without requiring HPO or compute overhead.

Likes: 107 · Replies: 3 · Retweets: 20

Zhiyuan Zeng

@zhiyuanzeng_

a month ago

RL is bounded by finite data😣? Introducing RLVE: RL with Adaptive Verifiable Environments. We scale RL with data procedurally generated from 400 envs dynamically adapting to the trained model 💡find supervision signals right at the LM capability frontier + scale them 🔗in🧵

Likes: 446 · Replies: 11 · Retweets: 109

vLLM

@vllm_project

a month ago

🚀 No More Train–Inference Mismatch! We demonstrate bitwise consistent on-policy RL with TorchTitan (training) + vLLM (inference) — the first open-source run where training and inference numerics match exactly. It only takes 3 steps: 1️⃣ Make vLLM batch-invariant (same seq →

Likes: 520 · Replies: 9 · Retweets: 62
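
One general reason numerics can differ between two runs of the same model is that floating-point addition is not associative, so a reduction that groups its terms differently can return a different result. The sketch below shows only that general fact in plain Python; it is not the TorchTitan or vLLM code, and the function names `sequential_sum` and `split_sum` are hypothetical.

```python
def sequential_sum(values):
    """Add the values strictly left to right with a single accumulator."""
    total = 0.0
    for v in values:
        total += v
    return total


def split_sum(values, split):
    """Reduce the two halves independently, then combine the partial sums.

    This mimics (very loosely) a kernel that tiles a reduction differently,
    which changes how the additions are grouped.
    """
    return sequential_sum(values[:split]) + sequential_sum(values[split:])


if __name__ == "__main__":
    values = [0.1, 0.2, 0.3]
    a = sequential_sum(values)   # (0.1 + 0.2) + 0.3
    b = split_sum(values, 1)     # 0.1 + (0.2 + 0.3)
    print(a, b, a == b)
```

If the grouping of a reduction depends on how requests are batched, the same sequence can produce slightly different values from run to run, which is the kind of discrepancy that matching training and inference numerics has to eliminate.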

Satya Nadella

@satyanadella

24 days ago

I’ve been thinking a lot about what the net benefit of the AI platform wave is. The real question is how to empower every company out there to get more out of this platform shift and build their own AI native capabilities and enterprise value (vs inadvertently just transfer their

Likes: 5.5K · Replies: 744 · Retweets: 806