Robert Brennan (@rbren_dev) Twitter Tweets • TwiCopy

All Hands AI

8 months ago

We created a new state-of-the-art agent on the SWE-Bench Verified leaderboard, at a 66.4% resolve rate! It is based on:
1. A strong base agent (using Claude-3.7 Sonnet).
2. A specially-trained "critic model" that can distinguish good solutions from bad ones.

415 likes, 11 replies, 44 reposts

Robert Brennan

@rbren_dev

8 months ago

This was a really fun conversation with Practical AI practicalai.fm/310

5 likes, 0 replies, 2 reposts

Graham Neubig

@gneubig

8 months ago

How can we vibe code while still maintaining code quality? Over the past year, I've shifted 95% of my development from manually writing code to using coding agents. I wrote this blog on some tricks I learned to work successfully with agents: all-hands.dev/blog/vibe-codi…

176 likes, 4 replies, 36 reposts

All Hands AI

@allhands_ai

7 months ago

Want to programmatically use AI agents to perform coding and maintenance tasks in the background with a single function call? Now you can do so with the OpenHands API! 🧵 about use cases and how to get started below.

73 likes, 4 replies, 17 reposts

All Hands AI

@allhands_ai

7 months ago

The SWE-Bench Verified leaderboard has been updated, and OpenHands is both number one overall and the only open source agent in the top 10! swebench.com Read more about our approach with the OpenHands critic here: all-hands.dev/blog/sota-on-s…

138 likes, 3 replies, 25 reposts

All Hands AI

@allhands_ai

7 months ago

OpenAI released a new coding agent today, Codex. It's an exciting development in the coding agent space! openai.com/index/introduc… We're going to go through some of the key points of interest in this thread.

68 likes, 2 replies, 8 reposts

Robert Brennan

@rbren_dev

7 months ago

Getting excited!

2 likes, 0 replies, 0 reposts

Rohit Malhotra

@rohit_malh5

7 months ago

Used OpenHands to build a speculative spatial tracking museum experience in Unity, despite having no Unity experience. It was pure "vibe coding" with no tests or reviews. Challenges: adding features without breaking existing ones and managing a messy codebase. Review your code, folks!

8 likes, 2 replies, 2 reposts

All Hands AI

@allhands_ai

7 months ago

Some upcoming presentations in SF June 5th by people from All Hands!
- "Software Development Agents: What works and what doesn’t" by Robert Brennan at the AI Engineer World's Fair
- Participation in the luminaries panel at Snowflake Dev Day by Graham Neubig
See you there!

18 likes, 1 reply, 1 repost

All Hands AI

@allhands_ai

6 months ago

What if we could have *trustworthy* agents that don't just write code, but also do research, understand multimodal content, and perform many practically useful tasks? Today at OpenHands, we released a new agent that gets SOTA or competitive performance on 8 diverse tasks.

174 likes, 5 replies, 27 reposts

Robert Brennan

@rbren_dev

6 months ago

🏆 🙌 yoav.blog/2025/06/10/com…

6 likes, 0 replies, 1 repost

All Hands AI

@allhands_ai

6 months ago

Introducing the OpenHands CLI, a new coding CLI that:
- Has top accuracy (similar to Claude Code)
- Is completely open source, MIT licensed
- Is model agnostic: use an API or bring your own
- Is simple to install and run: `pip install openhands-ai` and `openhands` (no Docker!)

1.1K likes, 35 replies, 269 reposts

Robert Brennan

@rbren_dev

5 months ago

The new Devstral is incredible!

6 likes, 0 replies, 0 reposts

Graham Neubig

@gneubig

5 months ago

6000 PRs! I knew a lot of people were using OpenHands, but this honestly exceeded my expectations a bit. And we're just getting started, hoping to have some changes soon that'll make it even easier to develop with OpenHands and increase the count even more 👀

32 likes, 0 replies, 1 repost

All Hands AI

@allhands_ai

5 months ago

OpenHands is live on TerminalBench and gets 41.3% with claude-4-sonnet, 6 points better than Claude Code! If you want to use an agent that can use the terminal, in your terminal -- try out the OpenHands CLI.

495 likes, 10 replies, 40 reposts

Robert Brennan

@rbren_dev

5 months ago

Congrats Qwen team! This is huge.

8 likes, 1 reply, 0 reposts

Robert Brennan

@rbren_dev

5 months ago

Nothing more frustrating than seeing "private scaffold" on public benchmark results.
I love that model providers like Qwen and Mistral are now reporting their results specifically using OpenHands as the scaffold--feels like we're becoming a standard here x.com/Alibaba_Qwen/s…

94 likes, 2 replies, 7 reposts

All Hands AI

@allhands_ai

4 months ago

‼️MIT: 95% of in‑house GenAI pilots fail to lift revenue/productivity. ⚠️ 🧵👇

21 likes, 1 reply, 3 reposts

All Hands AI

@allhands_ai

4 months ago

We built OpenHands in the open (~60K ⭐️ on GitHub). Now we’re giving back to the OSS ecosystem. Announcing the OpenHands Cloud OSS Credit Program → $100–$500 credits for maintainers. 👉 Learn how to apply!

77 likes, 1 reply, 7 reposts

Robert Brennan

@rbren_dev

4 months ago

A lot of agents out there are over-optimizing for SWE-bench. We've been very careful to ensure OpenHands generalizes to a wide variety of eng-related tasks.
Great to see that work pay off as we hit #1 on SWT-Bench! 🏆

19 likes, 0 replies, 3 reposts