Yana Wei (@yanawei_)'s Twitter Profile
Yana Wei

@yanawei_

PhD student@Johns Hopkins University; Multimodal Understanding, Embodied Agent, Image Editing

ID: 1708838979628388352

Joined: 02-10-2023 13:39:11

10 Tweets

80 Followers

50 Following

Yasmine (@cyousakura)

We are excited to introduce Open Vision Reasoner (OVR) 🚀 — transferring linguistic cognitive behavior to unlock advanced visual reasoning!

💡 Two-stage recipe
• Massive linguistic cold-start on Qwen-2.5-VL-7B sparks “mental imagery”
• ~1k-step multimodal RL refines & scales

Vishal Patel (@vishalm_patel)

🚀 Open Vision Reasoner (OVR): transferring linguistic cognitive behaviors to visual reasoning via large-scale multimodal RL. SOTA on MATH500 (95.3%), MathVision, and MathVerse.
💻 Code: github.com/Open-Reasoner-…
🌐 Project: weiyana.github.io/Open-Vision-Re…
#LLM Yana Wei, Johns Hopkins Engineering

Vishal Patel (@vishalm_patel)

🪞 We'll present Perception in Reflection at ICML this week! We introduce RePer, a dual-model framework that improves visual understanding through reflection. Better captions, fewer hallucinations, stronger alignment.
📄 arxiv.org/pdf/2504.07165
#ICML2025 Yana Wei, JHU Computer Science

Yana Wei (@yanawei_)

🥳 Thanks AK for featuring our OVR! We’re continuously iterating on both models and data to release even more 🚀 powerful versions—stay tuned! Check out the nearly 1K-step multimodal RL and in-depth cognitive behavior analysis here: ✍️ ArXiv: arxiv.org/abs/2507.05255 🐼

Yana Wei (@yanawei_)

Just three words and one link—beautiful project pages in surprising styles! ✨ Try magic Anycoder by AK here 👉 huggingface.co/spaces/akhaliq… Original pages have totally different vibes! (weiyana.github.io/Perception-in-…)

Yana Wei (@yanawei_)

🚀 Open Vision Reasoner and Perception-R1 are both accepted at #NeurIPS2025! Also releasing OVR’s Cold Start data with rich reasoning here: 🔗 huggingface.co/datasets/Kangh…

Jieneng Chen (@jieneng_chen)

🤯 Think better visuals mean better world models? Think again.
💥 Surprise: Agents don’t need eye candy; they need wins.

Meet World-in-World, the first open benchmark that ranks world models by closed-loop task success, not pixels.

We uncover 3 shocks:
1️⃣ Visuals ≠ utility
2️⃣