Ting-En Lin (@tnlin_tw) 's Twitter Profile
Ting-En Lin

@tnlin_tw

Research scientist at Alibaba Tongyi Lab, focusing on self-evolving LLMs aimed at AGI. @Tsinghua_Uni alum.

ID: 1331994524

Link: https://tnlin.github.io/ · Joined: 06-04-2013 16:35:26

186 Tweets

216 Followers

450 Following

Yujia Qin@ICLR2025 (@tsingyoga) 's Twitter Profile Photo

How far are we from the ideal AutoGPT? [1/4]

AutoGPT [1] already has 163k stars, and its developers have spent more than a year polishing it, yet it is still stuck at the demo stage and can hardly be called a product (even one aimed at developers). That is a far cry from the trajectory of traditional open-source software, and the core reason is that an agent's ceiling is determined by its base model.
Shunyu Yao (@shunyuyao12) 's Twitter Profile Photo

Excited to share what I did at Sierra with Noah Shinn, Pedram, and Karthik Narasimhan! 𝜏-bench evaluates critical agent capabilities omitted by current benchmarks: robustness, complex rule following, and human interaction skills. Try it out!
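
As described here, τ-bench pairs an agent with a simulated user and a set of domain tools, and judges success by the final state and by rule compliance. Below is a minimal sketch of that kind of evaluation loop; the `agent_step`, `simulated_user`, `tools`, and `rules_satisfied` callables are hypothetical stand-ins, not the actual τ-bench API.

```python
# Hypothetical sketch of a tau-bench-style evaluation loop (not the real API).
from dataclasses import dataclass, field

@dataclass
class EpisodeState:
    db: dict                                  # mutable domain state (e.g. bookings)
    transcript: list = field(default_factory=list)

def run_episode(agent_step, simulated_user, tools, state, max_turns=30):
    """Alternate simulated-user and agent turns; the agent may call tools in between."""
    state.transcript.append(("user", simulated_user(state.transcript)))
    for _ in range(max_turns):
        action = agent_step(state.transcript)  # hypothetical dict: {"type": ..., ...}
        state.transcript.append(("agent", action))
        if action["type"] == "tool_call":      # agent invoked a domain tool
            result = tools[action["name"]](state.db, **action["args"])
            state.transcript.append(("tool", result))
        elif action["type"] == "message":      # agent spoke; the simulated user replies
            state.transcript.append(("user", simulated_user(state.transcript)))
        else:                                  # "end": agent closed the conversation
            break
    return state

def score(state, goal_db, rules_satisfied):
    """Success = final database matches the goal AND no domain rule was violated."""
    return state.db == goal_db and rules_satisfied(state.transcript, state.db)
```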

Keming (Luke) Lu @ ICLR2025 (@keminglu612) 's Twitter Profile Photo

How can we improve instruction-following abilities without manual efforts? 🤔️

We present Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models from kabi!

Paper: arxiv.org/pdf/2406.13542
More⬇️
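
The core idea described here (execution feedback) is that the model writes an executable verifier for each instruction, and only responses that pass the verifier are kept as training data. The sketch below shows that filtering step, assuming the verifier source has already been generated by the model; it paraphrases the idea rather than reproducing the paper's pipeline.

```python
# Minimal sketch of execution-feedback filtering (a paraphrase of the idea; the
# generate-verifier prompt and data layout are assumptions, not the paper's code).

def build_verifier(verifier_source: str):
    """Compile a model-written `def check(response) -> bool` verifier from source code."""
    namespace = {}
    exec(verifier_source, namespace)   # in practice, run untrusted code in a sandbox
    return namespace["check"]

def filter_by_execution(instruction, candidate_responses, verifier_source):
    """Keep responses the executable verifier accepts; rejected ones can serve as negatives."""
    check = build_verifier(verifier_source)
    accepted, rejected = [], []
    for resp in candidate_responses:
        try:
            (accepted if check(resp) else rejected).append(resp)
        except Exception:
            rejected.append(resp)       # a crashing verifier counts as a failure
    return accepted, rejected

# Example: instruction "answer in exactly three words"
verifier_src = "def check(response):\n    return len(response.split()) == 3"
kept, dropped = filter_by_execution("Answer in exactly three words.",
                                    ["I am fine", "I am doing fine"], verifier_src)
```
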
Amrith Setlur (@setlur_amrith) 's Twitter Profile Photo

🚨 Interested in synthetic data and LLM reasoning? Our new work studies scaling laws for synthetic data and RL for math reasoning.
TLDR: Step-level RL (per-step DPO in fig) on self-generated answers improves sample efficiency of synthetic data by 8x! arxiv.org/abs/2406.14532

1/🧵
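
One way to read "step-level RL on self-generated answers" is that preference pairs are built at the level of individual reasoning steps rather than whole solutions. The sketch below shows a simplified shared-prefix construction of such per-step DPO pairs; it is an illustration under that assumption, not the paper's exact recipe.

```python
# Hedged sketch: building step-level preference pairs from self-generated solutions.
# Simplified shared-prefix pairing for illustration only.

def step_preference_pairs(problem, correct_steps, incorrect_steps):
    """Align a correct and an incorrect solution to the same problem and emit a
    (prompt, chosen_step, rejected_step) pair at the first point they diverge."""
    shared = []
    for good, bad in zip(correct_steps, incorrect_steps):
        if good == bad:
            shared.append(good)            # both solutions still agree on this step
            continue
        prompt = problem + "\n" + "\n".join(shared)
        return [(prompt, good, bad)]       # first divergent step: a per-step DPO pair
    return []                              # no divergence within the shared length

pairs = step_preference_pairs(
    "Compute 3 * (2 + 5).",
    ["2 + 5 = 7", "3 * 7 = 21"],
    ["2 + 5 = 7", "3 * 7 = 24"],
)
```
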
theblackat102 (@zraytam) 's Twitter Profile Photo

A bit bored, so I decided to spend some time writing up this new paper my team recently published. If you are working on real-world deployment of LLM services, you should check it out: arxiv.org/abs/2406.08747

Yujia Qin@ICLR2025 (@tsingyoga) 's Twitter Profile Photo

What is the vision-language model (VLM) field currently working on? 🧐

VLM is a field that has been developing rapidly since late last year. There are still plenty of "gold mines" for researchers to uncover, current exploration is still very preliminary, and it is relatively easy for newcomers to large models to get started 🥰

Here are some recommended articles to help you quickly get up to speed on where the VLM field stands today 📰:

1.
Cheng Han Chiang (姜成翰) (@dcml0714) 's Twitter Profile Photo

❗ New Paper❗
📄 In '23, we proposed LLM-as-judge for NLP research
🤔 Any real-world applications?
💯 Now, we use an LLM as an automatic assignment evaluator in a course with 1000+ students at National Taiwan University, led by Hung-yi Lee (李宏毅) with me as a TA
🔗 arxiv.org/abs/2407.05216
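
A minimal sketch of what an LLM-as-judge assignment grader can look like: send a rubric plus the student answer to a chat model and parse a numeric score. The rubric, score scale, and model name below are illustrative assumptions, not the course's actual setup.

```python
# Hedged sketch of an LLM-as-judge assignment grader (illustrative rubric and model).
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RUBRIC = """You are grading a student assignment.
Score it from 1 (poor) to 5 (excellent) for correctness and clarity.
Reply with 'Score: <number>' followed by one sentence of justification."""

def grade(assignment_text: str, student_answer: str, model: str = "gpt-4o-mini") -> int:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user",
             "content": f"Assignment:\n{assignment_text}\n\nAnswer:\n{student_answer}"},
        ],
        temperature=0,                          # deterministic grading
    )
    text = response.choices[0].message.content
    match = re.search(r"Score:\s*([1-5])", text)
    return int(match.group(1)) if match else 0  # 0 = unparseable, flag for human review
```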
Ting-En Lin (@tnlin_tw) 's Twitter Profile Photo

I’ll be at ACL 2024 in Thailand from August 11-16, presenting Tongyi CoAI at our booth. Come join us and let’s connect! 🎉🌟 #ACL2024 #NLProc @aclmeeting

Ting-En Lin (@tnlin_tw) 's Twitter Profile Photo

I will be sharing about "Tongyi CoAI: Your Personalized Conversational Agent for Complex Applications" at our ACL 2024 booth this afternoon (Aug 12, 15:30). Come join us and let’s connect! #ACL2024 #NLProc ACL 2025

I will be sharing about "Tongyi CoAI: Your Personalized Conversational Agent for Complex Applications" at our ACL 2024 booth this afternoon (Aug  12, 15:30). Come join us and let’s connect!  #ACL2024 #NLProc <a href="/aclmeeting/">ACL 2025</a>
Yingwei Ma (@yingweim98560) 's Twitter Profile Photo

🚀 The first open-source model to autonomously resolve over 30% of real GitHub issues (on SWE-bench Verified)

🌟 Announcing Lingma SWE-GPT: an open, development-process-centric LLM for automated software improvement!

📄 Paper: arxiv.org/abs/2411.00622
💻 Code: github.com/LingmaTongyi/L…
OpenAI (@openai) 's Twitter Profile Photo

GPT-4o got an update 🎉 The model’s creative writing ability has leveled up: more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded files, providing deeper insights & more thorough responses.

Jason Wei (@_jasonwei) 's Twitter Profile Photo

Nice paper from DeepMind takes a fresh angle on factuality: arxiv.org/abs/2501.03200 While most existing factuality datasets focus on public world knowledge, this paper evaluates whether responses are consistent with a provided document as context. This is an elegant and …

Haibin (@eric_haibin_lin) 's Twitter Profile Photo

Recent updates on the verl project (an RL library for LLMs):

Engine:
- Megatron Qwen & GRPO support, v0.11 upgrade
- vLLM v0.7 integration with v1 mode
- experimental SGLang integration

Algorithm & recipes:
- vision-language reasoning with Qwen2.5-VL
- PRIME, RLOO, ReMax, math-verify
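
Since GRPO support is one of the items above, here is a small standalone illustration of the group-relative advantage that GRPO is built around (this is not verl's code): sample a group of responses for one prompt, then standardize each response's reward against the group.

```python
# Illustration of GRPO's group-relative advantage; standalone sketch, not verl code.
import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """For G responses sampled from the same prompt, the advantage of each response
    is its reward standardized against the group: A_i = (r_i - mean(r)) / (std(r) + eps)."""
    mean, std = group_rewards.mean(), group_rewards.std()
    return (group_rewards - mean) / (std + eps)

# Example: four sampled answers to one math prompt, rewarded 1 if correct else 0.
print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))  # -> roughly [ 1. -1. -1.  1.]
```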