Justin Bullock (@justinbullock14)'s Twitter Profile

VP of Policy for @americans4ri; Senior Fellow with Convergence Analysis; Advocate of Love, Intelligence, & Freedom

ID: 2933754365

Link: http://governingwithAI.com
Joined: 20-12-2014 15:27:34

3.3K Tweets

1.1K Followers

1.1K Following

Ethan Mollick (@emollick)

Current agents only do 30% of complex real company tasks in this paper. Though note benchmarks are a floor, not a ceiling, if: 1) More recent models show improvement in the benchmark, suggesting future models may do it 2) Better prompting/tools would make the AI perform better.

Justin Bullock (@justinbullock14)

So, um, is it really the case that the headline AI agents COMPLETED 30%-48% of real, real-world professional office tasks? That’s, um, *checks notes* a lot, right?

Justin Bullock (@justinbullock14)

My sample is biased (obviously), but ChatGPT use (and, rapidly, others as well) is beginning to feel ambient. The oracles are running amok amongst us, well maybe not quite amok, but quickly thriving towards ends unknown for sure.

Ketan Ramakrishnan (@ketanr)

So what to do? We say: focus on the handful of large AI developers truly at the frontier. That's where the most distinctive potential risks of frontier AI development are most likely to arise, and where the need for transparency, evidence, and understanding are most pressing.

Loquacious Bibliophilia ⏸️ (@locbibliophilia)

Ketan Ramakrishnan Dean W. Ball Carnegie Endowment I did not expect to agree with Dean W. Ball but yes, this is the path forward. We do need governance on frontier companies. "One of the chief tasks of a frontier AI regulatory regime—arguably the chief task, at least for now—is to put society in a position to reduce such

Justin Bullock (@justinbullock14)

Keep an eye out for these projects to be published! Deric and I were very impressed with the quality of these fellows and their work. There’s not nearly enough work being done to understand what components are needed for an AGI Social Contract. With this incredible team of fellows

Justin Bullock (@justinbullock14)

Kudos to Anthropic for their work on transparency! This is a big step in the right direction. Happy to see it! Let’s have the public conversation about this incredibly important piece of AI policy! “Frontier AI development needs greater transparency to ensure public safety and

Joe Carlsmith (@jkcarlsmith)

I'm giving a public talk Tuesday July 8th, 7:30 pm at Mox in SF. Title: "Can goodness compete?". It's about long-term equilibrium outcomes post-AGI. More info at link in thread.

Justin Bullock (@justinbullock14)

I know I’m new to DC, because when I look around, I see opportunity everywhere. And, even more than opportunity, there’s maybe even a hint of a touch of momentum.