Lang Feng (@langfengq)'s Twitter Profile
Lang Feng

@langfengq

PhD student @ Nanyang Technological University | LLM, RL, LLM Agent

ID: 1924493225492103168

Website: https://langfengq.github.io/
Date: 19-05-2025 15:52:06

18 Tweets

41 Followers

35 Following

Lang Feng (@langfengq)

🚀 Introducing GiGPO: a new RL algorithm for training LLM agents. Built on GRPO, but now with fine-grained credit assignment. 🔹ALFWorld (+12%); WebShop (+9%); Sokoban (+10%) 🔹No extra GPU or rollout cost 📄 arxiv.org/abs/2505.10978 💻 github.com/langfengQ/verl… #LLM #AIagent #DeepSeek

Lang Feng (@langfengq)

Introducing our new #ICML2025 paper: CoSo, an entropy-based RL method for LLM/VLM agent training. An agent's textual actions → exponentially large exploration space💥, making rollouts inefficient. CoSo targets action-critical tokens → faster learning & stronger agents. arxiv.org/abs/2505.03792

Agentica Project (@agentica_)

🚀 Introducing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. 💪DeepSWE

Paul Zhou (@zhiyuan_zhou_)

We tested WSRL (Warm-start RL) on a Franka Robot, and it leads to really efficient online RL fine-tuning in the real world! WSRL learned the peg insertion task perfectly with only 11 minutes of warmup and *7 minutes* of online RL interactions 👇🧵

verl project (@verl_project)

If you're in Singapore on 7/11, do not miss this meetup! Talks from the verl community:
- LLMs to optimize code performance on real-world repos & verl project updates (Qian Liu)
- Long-horizon LLM agent training with verl-agent (Lang Feng)
Link: lu.ma/e498qhsi

Jason Wei (@_jasonwei)

Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life. One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s