Long Lian (@longtonylian) 's Twitter Profile
Long Lian

@longtonylian

EECS PhD student at UC Berkeley | My research primarily focuses on developing LLMs/VLMs with reasoning capabilities through RL.

ID: 1543413137033945088

Link: https://tonylian.com | Joined: 03-07-2022 01:55:53

89 Tweets

278 Followers

340 Following

Xiuyu Li (@xiuyu_l) 's Twitter Profile Photo

Scale smarter, not harder! Long CoT reasoning is powerful, but its sequential nature limits how efficiently and easily it can scale. We incentivize LMs to divide and conquer subtasks in parallel, selectively gathering only the highest-quality explorations
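
As a rough illustration of that "explore subtasks in parallel, keep only the best" pattern (not the paper's implementation), the skeleton is: fan out several independent explorations of a subtask, score them, and gather only the top ones. sample_subtask_trace() and score_trace() below are hypothetical stand-ins for an LM call and a quality estimator.

```python
# Rough sketch of parallel exploration with selective gathering;
# sample_subtask_trace() and score_trace() are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor

def sample_subtask_trace(subtask: str, seed: int) -> str:
    # Placeholder for one independent LM exploration of a subtask.
    return f"candidate reasoning for {subtask!r} (seed={seed})"

def score_trace(trace: str) -> float:
    # Placeholder quality estimate used to select explorations.
    return float(len(trace))

def explore_in_parallel(subtask: str, n: int = 4, keep: int = 1) -> list[str]:
    # Fan out n explorations concurrently, then keep only the top-scoring.
    with ThreadPoolExecutor(max_workers=n) as pool:
        traces = list(pool.map(lambda s: sample_subtask_trace(subtask, s), range(n)))
    return sorted(traces, key=score_trace, reverse=True)[:keep]

print(explore_in_parallel("factor 3721 into primes"))
```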

Yifei Zhou (@yifeizhou02) 's Twitter Profile Photo

It’s a really fun project to be involved in. It’s like giving the LLM a tool to call itself recursively, and it scales surprisingly well when given larger token budgets for reasoning!

马东锡 NLP 🇸🇪 (@dongxi_nlp) 's Twitter Profile Photo

「Reasoning, Agent」 paper

Learning Adaptive Parallel Reasoning with Language Models

What if the prompt becomes a launch kernel? APR lets the LLM learn when to fork into multiple threads and when to fold back into serial execution, freeing reasoning from its linear constraints.

Why "parallel reasoning"?
Serial CoT writes out its thinking step by step -> long token sequences both slow down inference and blow up the context.
𝚐𝔪𝟾𝚡𝚡𝟾 (@gm8xx8) 's Twitter Profile Photo

Learning Adaptive Parallel Reasoning with Language Models

APR:
- mixes serialized & parallel CoT via spawn() / join()
- trained end-to-end with RL—no fixed search structure
- parent-child threads jointly optimized for success
- runs efficiently via multi-threaded batching on
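
A minimal sketch of the spawn()/join() control flow these bullets describe, assuming plain Python threads; the child reasoner is a stub, not APR's released training or inference code, and in APR the model itself decides when to spawn and what to explore.

```python
# Hedged sketch of spawn()/join() parallel reasoning with Python threads.
# The child "reasoner" is a stub standing in for an LM call.
import queue
import threading

def spawn(subgoal: str, results: "queue.Queue[tuple[str, str]]") -> threading.Thread:
    # spawn(): a child thread explores one subgoal in its own (shorter)
    # context and reports a summary back to the parent.
    def child() -> None:
        results.put((subgoal, f"partial result for {subgoal!r}"))  # stub LM call
    t = threading.Thread(target=child)
    t.start()
    return t

def parent_reason(problem: str) -> str:
    results: "queue.Queue[tuple[str, str]]" = queue.Queue()
    subgoals = [f"{problem} / branch {i}" for i in range(3)]  # model-chosen in APR
    children = [spawn(g, results) for g in subgoals]
    for t in children:  # join(): block until every child returns, then
        t.join()        # fold their summaries back into the serial CoT
    gathered = [results.get() for _ in children]
    return f"continue serial reasoning over {len(gathered)} child summaries"

print(parent_reason("24-game: 3 3 8 8"))
```
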
Long Lian (@longtonylian) 's Twitter Profile Photo

Thank you for your appreciation of our work! Thanks for sharing it! We believe that solving truly hard problems cannot rely on single-threaded CoT alone; it requires different threads dividing the work and cooperating, just as cracking a hard research problem usually takes a team. Looking forward to exchanging ideas with everyone!

Arthur Allshire (@arthurallshire) 's Twitter Profile Photo

our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ Hongsuk Benjamin Choi, Junyi Zhang, David McAllister)

Long Lian (@longtonylian) 's Twitter Profile Photo

As we all know, collecting data for robotics is very costly. This is why I’m very impressed by this work: it generates a huge amount of data for different robots without any teleoperation.

Yifei Zhou (@yifeizhou02) 's Twitter Profile Photo

Given previous research in multimodal models and agents, I believe the only truly useful multimodal agent before 2027 is multimodal co-creation in structured formats. Sharing my first blog post, because I do not quite see this point of view around, yet it can be quite impactful to society.

Baifeng (@baifeng_shi) 's Twitter Profile Photo

Finally! We just released the models and code for PS3 & VILA-HD, a vision encoder **pre-trained at 4K resolution** and the resulting MLLM! PS3 & VILA-HD models: huggingface.co/collections/nv… PS3 code: github.com/NVlabs/PS3 VILA-HD code: github.com/NVlabs/VILA/tr… Demo:

David Chan (@_dmchan) 's Twitter Profile Photo

🚀 Call for Papers! 🚀 Excited to help organize the 4th Workshop on What is Next in Multimodal Foundation Models? at ICCV in Honolulu, Hawai'i 🌺 Submit work on vision, language, audio & more! 🗓️ Deadline: July 1, 2025 🔗 sites.google.com/view/mmfm4thwo… #MMFM4 #ICCV2025 #AI #multimodal

Xinyu Yang (@xinyu2ml) 's Twitter Profile Photo

🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware efficiency. What excites me most is realizing that, beyond optimizing existing models, we can discover better model architectures by embracing system-level

Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo

Gemini 2.5 paper TL;DR. Technical part in thread.

Contributors: ~1k
2.5 Pro timed out counting after 600s
2.5 Flash counts 1228 in 60s
o3 counts 919 "after dedup" in 4m9s

No grouping or "leads", just one long list. I guess too much infighting or poaching from this in the past?
Long Lian (@longtonylian) 's Twitter Profile Photo

Excited to share that Describe Anything has been accepted at ICCV 2025! 🎉 Describe Anything Model (DAM) is a powerful Multimodal LLM that generates detailed descriptions for user-specified regions in images or videos using points, boxes, scribbles, or masks. Open-source code,

Shreya Shekhar (@_shreya_s) 's Twitter Profile Photo

Excited to be partnering with Henry Yin and Naomi - AGI House Ventures from AGI House to host a deep dive session on some of the most topical recent research in RL. We’ll have amazing researchers: Jiayi Pan talking about his recent work on Adaptive Parallel Reasoning, and

XuDong Wang (@xdwang101) 's Twitter Profile Photo

🎉 Excited to share RecA: Reconstruction Alignment Improves Unified Multimodal Models

🔥 Post-train w/ RecA: 8k images & 4 hours (8 GPUs) → SOTA UMMs:

GenEval 0.73→0.90 | DPGBench 80.93→88.15 | ImgEdit 3.38→3.75

Code: github.com/HorizonWind200…

1/n
Xinyan Hu (@xyvickyhu) 's Twitter Profile Photo

3 -> 5, 4 -> 6, 9 -> 11, 7 -> ?
LLMs solve this via In-Context Learning (ICL); but how is ICL represented and transmitted in LLMs? We build new tools identifying “extractor” and “aggregator” subspaces for ICL, and use them to understand ICL addition tasks like above. Come to
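
For concreteness, the addition task in the tweet looks like this as an ICL prompt: every demonstration applies the same "+2" rule, and the model must extract that rule from the examples and aggregate it onto the final query. query_lm() is a hypothetical stand-in for any completion API, so the call is left commented out.

```python
# Toy ICL prompt for the "+2" task above. Each demonstration applies the
# same rule; the model must infer it and apply it to the final query.
prompt = (
    "3 -> 5\n"
    "4 -> 6\n"
    "9 -> 11\n"
    "7 -> "
)
# answer = query_lm(prompt)  # hypothetical API; a model that infers the rule answers "9"
print(prompt + "9")  # expected completion under the +2 rule
```
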
Long Lian (@longtonylian) 's Twitter Profile Photo

Super excited to see our work Adaptive Parallel Reasoning featured in the State of AI Report 2025! So glad that there is more and more interest in parallel reasoning!

Jiaxin Ge (@aomaru_21490) 's Twitter Profile Photo

✨Introducing ECHO, the newest in-the-wild image generation benchmark! You’ve seen new image models and new use cases discussed on social media, but old benchmarks don’t test them! We distilled this qualitative discussion into a structured benchmark. 🔗 echo-bench.github.io