Jan Hendrik Kirchner (@janhkirchner)'s Twitter Profile
Jan Hendrik Kirchner

@janhkirchner

formerly comp neuroscience @ mpi brain research frankfurt ➡️ small verifier

ID: 972038953586057216

Link: http://universalprior.substack.com

Joined: 09-03-2018 09:18:40

447 Tweets

1.1K Followers

527 Following

T. Greer (@scholars_stage)

It will be hard to keep the commanding heights in (western) liberal hands while simultaneously handing the technology over to the governments of the developing world. These goals are probably not compatible—it will take a lot of work to make them so.

Anthropic (@anthropicai)

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.

Alex Mallen (@alextmallen)

New paper! How should we make trade-offs between the quantity and quality of labels used for eliciting knowledge from capable AI systems?

Jan Hendrik Kirchner (@janhkirchner)

Someone recommended The Goal by Goldratt to me and I have to say, there's nothing in there that Factorio hasn't taught me already

Anthropic (@anthropicai)

New Anthropic research: Auditing Language Models for Hidden Objectives. We deliberately trained a model with a hidden misaligned objective and put researchers to the test: Could they figure out the objective without being told?

Samuel Marks (@saprmarks)

New paper with Johannes Treutlein, Evan Hubinger, and many other coauthors! We train a model with a hidden misaligned objective and use it to run an auditing game: Can other teams of researchers uncover the model’s objective? x.com/AnthropicAI/st…

Anthropic (@anthropicai)

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.

Jacob Hilton (@jacobhhilton)

There is still an opportunity for OpenAI to live up to its founding promises, instead of abandoning them. Here I explain what this could look like.
