Yana Wei (@yanawei_) Twitter Tweets • TwiCopy

Yana Wei

@yanawei_

PhD student @ Johns Hopkins University; Multimodal Understanding, Embodied Agent, Image Editing

ID: 1708838979628388352

Joined: 02-10-2023 13:39:11

10 Tweets

80 Followers

50 Following

Yasmine

@cyousakura

4 months ago

We are excited to introduce Open Vision Reasoner (OVR) 🚀: transferring linguistic cognitive behavior to unlock advanced visual reasoning!
💡 Two-stage recipe
• Massive linguistic cold-start on Qwen-2.5-VL-7B sparks “mental imagery”
• ~1k-step multimodal RL refines & scales

Likes: 135

Replies: 3

Retweets: 28

Vishal Patel

@vishalm_patel

4 months ago

🚀 Open Vision Reasoner (OVR) Transferring linguistic cognitive behaviors to visual reasoning via large-scale multimodal RL. SOTA on MATH500 (95.3%), MathVision, and MathVerse. 💻 Code: github.com/Open-Reasoner-… 🌐 Project: weiyana.github.io/Open-Vision-Re… #LLM yana wei Johns Hopkins Engineering

Likes: 21

Replies: 0

Retweets: 5

Vishal Patel

@vishalm_patel

4 months ago

🪞 We'll present Perception in Reflection at ICML this week! We introduce RePer, a dual-model framework that improves visual understanding through reflection. Better captions, fewer hallucinations, stronger alignment. 📄 arxiv.org/pdf/2504.07165 #ICML2025 Yana Wei JHU Computer Science

Likes: 14

Replies: 0

Retweets: 2

AK

@_akhaliq

4 months ago

Kimi k2 + groq in anycoder
Vibe coding 500+ loc Three.js mobile game in seconds

Likes: 164

Replies: 8

Retweets: 25

Yana Wei

@yanawei_

4 months ago

🥳 Thanks AK for featuring our OVR! We’re continuously iterating on both models and data to release even more 🚀 powerful versions—stay tuned! Check out the nearly 1K-step multimodal RL and in-depth cognitive behavior analysis here: ✍️ ArXiv: arxiv.org/abs/2507.05255 🐼

Likes: 8

Replies: 0

Retweets: 0

Yana Wei

@yanawei_

4 months ago

Just three words and one link—beautiful project pages in surprising styles! ✨ Try magic Anycoder by AK here 👉 huggingface.co/spaces/akhaliq… Original pages have totally different vibes! (weiyana.github.io/Perception-in-…)

Likes: 13

Replies: 1

Retweets: 1

Yana Wei

@yanawei_

2 months ago

🚀 Open Vision Reasoner and Perception-R1 are both accepted at #NeurIPS2025! Also releasing OVR’s Cold Start data with rich reasoning here: 🔗 huggingface.co/datasets/Kangh…

Likes: 15

Replies: 0

Retweets: 0

Jieneng Chen

@jieneng_chen

21 days ago

🤯 Think better visuals mean better world models? Think again. 💥 Surprise: Agents don’t need eye candy— they need wins. Meet World-in-World, the first open benchmark that ranks world models by closed-loop task success, not pixels. We uncover 3 shocks: 1️⃣ Visuals ≠ utility 2️⃣

Likes: 142

Replies: 2

Retweets: 39
