Lang Feng (@langfengq) Twitter Tweets • TwiCopy

Lang Feng

@langfengq


PhD student @ Nanyang Technological University | LLM, RL, LLM Agent

ID: 1924493225492103168

https://langfengq.github.io/ · 19-05-2025 15:52:06

18 Tweets

41 Followers

35 Following

🚀 Introducing GiGPO: a new RL algorithm for training LLM agents. Built on GRPO, but now with fine-grained credit assignment. 🔹 ALFWorld (+12%); WebShop (+9%); Sokoban (+10%) 🔹 No extra GPU & rollout cost 📄 arxiv.org/abs/2505.10978 💻 github.com/langfengQ/verl… #LLM #AIagent #DeepSeek

Likes: 3 · Replies: 1 · Retweets: 0

Lang Feng

@langfengq

2 months ago

Introducing our new #ICML2025 paper: CoSo, an entropy-based RL method for LLM/VLM agent training. An agent's textual actions → exponentially large exploration space 💥, making rollouts inefficient. CoSo targets action-critical tokens → faster learning & stronger agents. arxiv.org/abs/2505.03792

Likes: 2 · Replies: 0 · Retweets: 0

Agentica Project

@agentica_

23 days ago

🚀 Introducing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. 💪DeepSWE

Likes: 345 · Replies: 15 · Retweets: 65

Lang Feng

@langfengq

22 days ago

Memory really matters for agents, especially when dealing with long-horizon interactions.

Likes: 1 · Replies: 0 · Retweets: 0

Paul Zhou

@zhiyuan_zhou_

22 days ago

We tested WSRL (Warm-start RL) on a Franka Robot, and it leads to really efficient online RL fine-tuning in the real world! WSRL learned the peg insertion task perfectly with only 11 minutes of warmup and *7 minutes* of online RL interactions 👇🧵

Likes: 254 · Replies: 9 · Retweets: 37

verl project

@verl_project

20 days ago

If you're in Singapore on 7/11, do not miss this meetup! Talks from the verl community:
- LLMs to optimize code performance on real-world repos & verl project updates (Qian Liu)
- Long-horizon LLM agent training with verl-agent (Lang Feng)
Link: lu.ma/e498qhsi

Likes: 15 · Replies: 0 · Retweets: 4

Jason Wei

@_jasonwei

10 days ago

Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life. One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s

Likes: 1.1K · Replies: 74 · Retweets: 129