Saaket Agashe @ NAACL 2025 (@saa1605) Twitter Tweets • TwiCopy

Yue Fan

a year ago

🚀🚀🚀 Excited to share our latest breakthrough: Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding! 📍 Click ANYWHERE on the screen, and our Tree-of-Lens (ToL) agent will tell you what's there and where it's located. 🌟 As shown in the video,

Likes: 43 · Replies: 1 · Retweets: 16

Xin Eric Wang @ ICLR 2025

@xwang_lk

a year ago

🚀 Exciting news! Agent S will appear at #ICLR2025 in Singapore! 🌏 After 3 months post-release, it remains the SOTA open-source OS agent, now supporting Mac, Linux, Windows, and web browsers (integrated into our Simular Browser: simular.ai)! 🌐✨ Get started in

Likes: 43 · Replies: 0 · Retweets: 9

Xin Eric Wang @ ICLR 2025

@xwang_lk

a year ago

Congrats to Saaket Agashe @ NAACL 2025 for his LLM-Coordination paper being accepted to #NAACL2025 Findings! What a day for a new PhD student who just completed his first year! Two first-author papers being accepted at ICLR and NAACL, with hundreds of GitHub stars & citations already!

Likes: 22 · Replies: 0 · Retweets: 5

Yue Fan

@yfan_ucsc

10 months ago

Tired of GUI grounding models failing in new apps? 🤔 We introduce GUI-Bee 🐝 with RL-driven exploration (covering 51% more unique scenes compared to baselines) to help your GUI action grounding models conquer NOVEL environments! 🚀 Key Highlights: ✅ We pioneer aligning GUI

Likes: 39 · Replies: 1 · Retweets: 12

Kaiwen Zhou

@kaiwenzhou9

10 months ago

🛡️ R1 Safety Paper Alert! 📰 How safe are large reasoning models like R1? What is their safety behavior? Does their enhanced capability introduce greater risks? We present a comprehensive safety analysis on large reasoning models: 🔥 Key Findings: 1️⃣ Open-source R1 models lag

Likes: 102 · Replies: 10 · Retweets: 23

Qianqi "Jackie" Yan

@qianqi_yan

10 months ago

New Paper Alert: Multimodal Inconsistency Reasoning (MMIR)! ✨ Ever visited a webpage where the text says “IKEA desk” yet images and descriptions elsewhere show a totally different brand? Or read a slide that shows “50% growth” in the text but the accompanying chart looks flat?

Likes: 30 · Replies: 1 · Retweets: 9

Saaket Agashe @ NAACL 2025

@saa1605

8 months ago

📢 Excited to present our poster at #ICLR2025! Agent S: An Open Agentic Framework that Uses Computers Like a Human. Come explore how Agent S leverages Experience Augmented Planning to interact with computers like humans do! 📍Hall 3 + Hall 2B, Poster #408 🗓️ April 26th, 10 AM

Likes: 26 · Replies: 3 · Retweets: 8

Xin Eric Wang @ ICLR 2025

@xwang_lk

8 months ago

Our Agent S paper won the Best Paper Award at #ICLR2025 Agentic AI for Science Workshop! 🎉 Congrats to Simular Research team (Saaket Agashe, Jiuzhou Han, Ang Li). This is the most hands-on and committed project I’ve led since I started my faculty career. We’re just getting

Likes: 206 · Replies: 11 · Retweets: 17

Saaket Agashe @ NAACL 2025

@saa1605

8 months ago

I’ll be presenting our poster: “LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models” tomorrow at #NAACL2025! ⏰ 11 AM 📍 Hall 3 Drop by to chat about applications of LLMs for Multi-Agent Coordination! #MultiAgentAI #LLMs

Likes: 19 · Replies: 0 · Retweets: 4

Qianqi "Jackie" Yan

@qianqi_yan

6 months ago

🚀 New paper out! “Hidden in Plain Sight: Probing Implicit Reasoning in Multimodal Language Models” Real life is messy: 🔹 “Make a cup of apple juice” - but no apples are in sight 🔹 “Say hi to my friend” - yet two people are in the frame 🔹 “Tell me the brand of the lipstick” -

Likes: 18 · Replies: 1 · Retweets: 5

Qianqi "Jackie" Yan

@qianqi_yan

4 months ago

We’re thrilled to launch the MMIR Challenge at the #ICCV2025 CLVL Workshop! 🧠 🖼️ Task: Detect inconsistencies in multimodal artifacts (webpages, slides, posters) 🏆 Top submissions invited to present in the non-archival track at CLVL 🔗 Compete now → kaggle.com/competitions/m…

Likes: 6 · Replies: 1 · Retweets: 7

Kabir

@kabirahuja004

3 months ago

How does GPT-5 do on FlawedFictions? 🍩 On short stories, it reaches SoTA with CE-Eval = 0.70 (max 1), even above est. human performance. On long stories (FlawedFictionsLong), it still struggles at 0.47. We’ll present FlawedFictions at Conference on Language Modeling (Poster Session 2 Tuesday).

Likes: 16 · Replies: 0 · Retweets: 7

Xin Eric Wang @ ICLR 2025

@xwang_lk

3 months ago

🚀 Introducing Agent S3, the most advanced computer-use agent, now approaching human-level performance 🧠💻 Just one year ago, Agent S scored ~20% on OSWorld: SOTA then, but far from human 72%. Today, Agent S3 reaches 69.9% (⬆10% over

Likes: 1.1K · Replies: 67 · Retweets: 251

Simular

@simularai

2 months ago

🚀 Simular at COLM 2025 — Presenting Agent S2 in Montréal! 🇨🇦 We’re excited to share that our research team — Xin Eric Wang, Vincent and Kyle — presented Agent S2: A Compositional Generalist–Specialist Framework for Computer Use Agents at COLM 2025 in Montréal! Agent S2

Likes: 14 · Replies: 2 · Retweets: 4