Zhen Wang (@zhenwang9102) Twitter Tweets • TwiCopy

Maitrix.org

7 months ago

🤖Thrilled to introduce _ReasonerAgent_ - A fully open source, ready-to-run agent that does research🧐 in a web browser and answers your queries. Use ReasonerAgent to help you: ✈️search for flights, 🛍️compile shopping options, 🗞️research news coverage, etc. 📘Check out more

144 likes · 5 replies · 37 retweets

Huan Sun (OSU)

@hhsun1

5 months ago

⚖️Towards rigorously benchmarking the progress of agents📏: Wondering whether frontier web agents are genuinely as good as reported? 🤔Are they truly reaching nearly 90% task success rates on real-world tasks and websites? Check out our more comprehensive and rigorous

35 likes · 1 reply · 8 retweets

Boshi Wang

@boshiwang2

5 months ago

LLMs exhibit the Reversal Curse, a basic generalization failure where they struggle to learn reversible factual associations (e.g., "A is B" -> "B is A"). But why? Our new work uncovers that it's a symptom of the long-standing binding problem in AI, and shows that a model design

872 likes · 24 replies · 127 retweets

Huan Sun (OSU)

@hhsun1

5 months ago

It's a great honor to give a keynote at the Molecule Maker Lab Institute symposium at UIUC! Many thanks to Prof. Heng Ji and Prof. Jiawei Han for the invitation. The symposium’s theme this year is “AI scientist? What would it take?”, which I hold close to heart, and I gave a talk titled “Language

70 likes · 2 replies · 19 retweets

Maitrix.org

@maitrixorg

4 months ago

Voila! 🤗🔥 Super excited to open-source Voila -- new unified Voice-Language Foundation Models for real-time conversations, audio-in audio-out. Voila lets you build a voice-based character-ai👩‍👩‍👧‍👧 instantly, with over one million voice personas! The Voila unified model supports:

12 likes · 2 replies · 9 retweets

Zhiting Hu

@zhitinghu

4 months ago

I was kidding -- this video was entirely simulated by the _world model_ we're building. 😀 It's mind-blowing how it produces high-fidelity simulations, lasting several minutes, to complete non-trivial tasks. This showcases the potential for infinite data & experience in

407 likes · 10 replies · 43 retweets

Zhiting Hu

@zhitinghu

4 months ago

A humanoid robot dancing with agility and flair💃 ... in a world _interactively_ simulated by a world model. Here’s the choreography we told the model to simulate, step by step: 💃Wave both arms and start jumping 👋 💃Dance dance dance‼️ 💃Stand still and put left arm

133 likes · 6 replies · 29 retweets

Huan Sun (OSU)

@hhsun1

4 months ago

Super excited to get funded by Schmidt Sciences to study computer-use agents (CUAs) under adversarial attacks. Many thanks to the student leads including Zeyi Liao, Jaylen Jones, Linxi Jiang, and amazing co-PIs Yu Su and Zhiqiang Lin. As the capabilities of CUAs improve,

94 likes · 3 replies · 9 retweets

Tianmin Shu

@tianminshu

3 months ago

🚀 Excited to introduce SimWorld: an embodied simulator for infinite photorealistic world generation 🏙️ populated with diverse agents 🤖 If you are at #CVPR2025, come check out the live demo 👇 Jun 14, 12:00-1:00 pm at JHU booth, ExHall B; Jun 15, 10:30 am-12:30 pm, #7, ExHall B

195 likes · 5 replies · 37 retweets

Zhiting Hu

@zhitinghu

3 months ago

🔥Reinforcement learning for LLM reasoning is emerging—but many questions remain🧐🧐 ❓ Does RL teach new reasoning, or just elicit what’s already in the base LLM? ❓ Do long chains of thought truly emerge from RL? ❓ Most RL work has been focusing on math and coding. But how do

103 likes · 2 replies · 25 retweets

Qiyue Gao

@qiyuegao123

2 months ago

🤔 Have OpenAI o3, Gemini 2.5, Claude 3.7 formed an internal world model to understand the physical world, or just align pixels with words? We introduce WM-ABench, the first systematic evaluation of VLMs as world models. Using a cognitively-inspired framework, we test 15 SOTA

206 likes · 3 replies · 44 retweets

Zhiting Hu

@zhitinghu

2 months ago

🚨Do frontier VLMs (o3, Gemini 2.5, Claude 3.5, Qwen…) actually learn an internal world model🌍? Surprisingly, the answer appears to be a hard NO—as revealed by our WM Atomic Benchmark⚛️. Even o3 struggles with the most basic, atomic-level questions: ❌Confuse triangles📐 with

152 likes · 4 replies · 29 retweets

Dynamics Lab

@dynamicslab_ai

2 months ago

💥💥BANG! Experience the future of gaming with our real-time world model for video games!🕹️🕹️ Not just PLAY—but CREATE! Introducing Mirage, the world’s first AI-native UGC game engine. Now featuring real-time playable demos of two games: 🏙️ GTA-style urban chaos 🏎️ Forza

686 likes · 28 replies · 142 retweets

Eric Xing

@ericxing

2 months ago

I have long been arguing that a world model is NOT about generating videos, but IS about simulating all possibilities of the world to serve as a sandbox for general-purpose reasoning via thought-experiments. This paper proposes an architecture toward that: arxiv.org/abs/2507.05169

512 likes · 7 replies · 87 retweets

LAW Workshop@NeurIPS 2025

@law2025_neurips

a month ago

📢 Thrilled to announce the LAW 2025 workshop, Bridging Language, Agent, and World Models, at #NeurIPS2025 this December in San Diego! 🌴🏖️ 🎉 Join us in exploring the exciting intersection of #LLMs, #Agents, #WorldModels! 🧠🤖🌍 🔗 sites.google.com/view/law-2025 #ML #AI #GenerativeAI 1/

16 likes · 1 reply · 7 retweets

Zhen Wang

@zhenwang9102

a month ago

Huge thanks to Lambda for sponsoring awards for ALL accepted papers at #LAW2025! #NeurIPS2025 The deadline is approaching fast. Let's build a great program together🤗 ✍️Submit your work here: openreview.net/group?id=NeurI… 🧐Join our program committee: docs.google.com/forms/d/e/1FAI…👇

2 likes · 0 replies · 0 retweets