Beibin Li (@beibin79)'s Twitter Profile
Beibin Li

@beibin79

ML & Optim @MSFTResearch.
PhD from @uwcse. Previous: @seattlechildren @YaleMed @UMich

Optimization, LLM, eye tracking, autism, medical imaging.

ID: 157905985

Link: http://beibinli.com · Joined: 21-06-2010 06:18:35

192 Tweets

182 Followers

276 Following

Sebastien Bubeck (@sebastienbubeck)

Amazing work on these new benchmarks, keep them coming!!! And notice our little phi-3-mini (3.8B) ahead of 34B models :-). Quite curious to see where phi-3-medium (14B) lands!

Microsoft Research (@msftresearch)

Chi Wang, Gagan Bansal, Beibin Li, Ahmed Awadallah, Ryen White, Doug Burger, Jieyu Zhang, Eric Zhu, Li Jiang, Xiaoyun Zhang & coauthors won Best Paper at #ICLR2024 LLM Agents Workshop for "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." msft.it/6014YbsYu

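For readers who haven't used it, the multi-agent conversation pattern the AutoGen paper describes comes down to agents exchanging messages until a task is done. Below is a minimal sketch in the style of the pyautogen v0.2 API; the model name, API key handling, and task message are placeholders, not taken from the paper or the tweet.

```python
# Minimal sketch of an AutoGen two-agent conversation (pyautogen v0.2-style API).
# The model name, API key, and task below are placeholders.
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

# The assistant proposes answers and code; the user proxy executes code and relays results.
assistant = AssistantAgent(name="assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",  # fully automated back-and-forth
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# Kick off the conversation with a task message; the agents alternate until done.
user_proxy.initiate_chat(assistant, message="Plot a sine wave and save it to sine.png.")
```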
Qingyun Wu (@qingyun_wu)

Thrilled to be part of this incredible new course on AI Agentic Design Patterns with AutoGen! 🚀 Join Chi Wang and me as we dive into the world of multi-agent collaboration, reflection, tool use, and more using AutoGen. Can't wait to show you how to leverage these to build cool
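
As a concrete illustration of the tool-use pattern mentioned in the course description above, pyautogen lets you register a plain Python function so that one agent proposes calls to it and another executes them. This is only a rough sketch against the v0.2-style API; the get_weather tool is a made-up placeholder, not something from the course.

```python
# Sketch of AutoGen tool use (pyautogen v0.2-style API).
# `get_weather` is a hypothetical placeholder tool, not from the course or tweet.
from autogen import AssistantAgent, UserProxyAgent, register_function


def get_weather(city: str) -> str:
    """Toy tool: return a canned weather string for a city."""
    return f"It is sunny in {city} today."


llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}
assistant = AssistantAgent(name="assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,  # no code execution; only the registered tool
)

# The assistant sees the tool schema and proposes calls; the user proxy runs them.
register_function(
    get_weather,
    caller=assistant,
    executor=user_proxy,
    description="Get today's weather for a city.",
)

user_proxy.initiate_chat(assistant, message="What's the weather in Seattle?")
```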

Chi Wang (@chi_wang_)

I often get asked two common questions:
1. What's an agent?
2. What are the pros and cons of multi vs. single agent?
This blog collects my thoughts from several interviews and recent learnings.
microsoft.github.io/autogen/blog/2…
Andrej Karpathy (@karpathy)

Actually, really liked the Apple Intelligence announcement. It must be a very exciting time at Apple as they layer AI on top of the entire OS. A few of the major themes:

Step 1: Multimodal I/O. Enable text/audio/image/video capability, both read and write. These are the native

Runlong Zhou (@vectorzhou)

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs

#acl2024  paper: arxiv.org/abs/2402.12621

Code: github.com/zhourunlong/Re…

Reflect-RL allows two LM players (a frozen reflection agent + a trainable policy agent) to learn from interactions in a decision-making environment.
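
The authors' actual implementation is in the linked repo; purely as a hypothetical illustration of the two-player setup described above, an interaction loop might look like the sketch below. Every name here is invented for exposition and is not the repo's API.

```python
# Hypothetical sketch of a Reflect-RL-style interaction loop (invented names, not
# the released code): a frozen reflection LM critiques the trajectory so far, and
# a trainable policy LM conditions on that critique to pick the next action.

def run_episode(env, reflection_lm, policy_lm):
    obs = env.reset()
    trajectory = []
    done = False
    while not done:
        # Frozen reflection agent: comments on past steps (e.g., points out mistakes).
        reflection = reflection_lm.generate(history=trajectory, observation=obs)
        # Trainable policy agent: chooses the next action given obs + reflection.
        action = policy_lm.generate(observation=obs, reflection=reflection)
        obs, reward, done, info = env.step(action)
        trajectory.append((obs, reflection, action, reward))
    return trajectory

# Online RL fine-tuning then updates only the policy LM (e.g., a policy-gradient
# step on episode rewards); the reflection LM's weights stay frozen.
```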
Runlong Zhou (@vectorzhou)

Even though modern LMs have been trained on nearly all public text data, their ability to make multi-step decisions is still limited. To improve, LMs should engage in self-driven exploration, reflection, and learning within interactive environments.

Runlong Zhou (@vectorzhou)

Key findings: (4/4) + A multi-agent design facilitates easier agent training, expediting reflective learning processes. + Curriculum learning improves the RL training convergence in complex environments.

Runlong Zhou (@vectorzhou)

Key findings: (3/4) + Adding a mechanism for reflection, particularly learning from mistakes, can substantially enhance success rates. This finding confirms the strengths of ReAct, CoT, and related prompting techniques.

Runlong Zhou (@vectorzhou)

Key findings: (2/4) - Reinforcement Learning (RL) struggles in large action spaces, as the vast exploration space complicates the process of acquiring valuable knowledge. + Combining SFT with online RL provides some improvement in convergence.

Runlong Zhou (@vectorzhou)

Key findings: (1/4) - The success rate of current open-source 7B LMs in multi-step decision-making is similar to a random walk. - Supervised fine-tuning (SFT), such as imitation learning, falls short of significant alignment due to the lack of online interactions.

Runlong Zhou (@vectorzhou)

Future Directions: 1. Analyze the convergence rates of one player vs. two players vs. multiple players. 2. Integrate RL to advance LMs' reasoning and cognitive abilities. 3. Apply Reflect-RL to other applications of LMs such as scientific exploration and discovery.

Lianhui Qin (@lianhuiq)

💡Divergent thinking💡 is a hallmark of human creativity and problem-solving 🤖Can LLMs also do divergent reasoning to generate diverse solutions🤔? Introducing Flow-of-Reasoning (FoR) 🌊, a data-efficient way of training an LLM policy to generate diverse, high-quality reasoning

Lior⚡ (@lioronai)

This might be the biggest moment for Open-Source AI. Meta just released Llama 3.1, including a 405 billion parameter model, the most sophisticated open model ever released. It already outperforms GPT-4o on several benchmarks.

Lianhui Qin (@lianhuiq)

📢Our amazing team are presenting two papers at #ICML2024. Join them to explore LLMs for chemistry reasoning and Controllable Jailbreaking LLMs   🌟

I'll miss being there, but hope everyone enjoys the conference! 🔥
Chi Wang (@chi_wang_)

Join us for the invited talk "Language Agent Tree Search" at the AutoGen community meetup (do not miss!)
Time: Monday, August 26, 9am PT.
Event link: discord.com/events/1153072…
Abstract: ⬇️

Qingyun Wu (@qingyun_wu)

Just can't believe this happened at NeurIPS and, ironically, from an invited keynote speaker talking about ethics! Removing racial bias from humans is so much harder than removing it from LLMs. So proud of the Chinese student who spoke up on the spot, pointing out the racist

AI at Meta (@aiatmeta)

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick —  our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model
Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

Phase 1 of Physics of Language Models code release
✅our Part 3.1 + 4.1 = all you need to pretrain strong 8B base model in 42k GPU-hours
✅Canon layers = strong, scalable gains
✅Real open-source (data/train/weights)
✅Apache 2.0 license (commercial ok!)
🔗github.com/facebookresear…