Satoshi Matsuoka (@profmatsuoka) 's Twitter Profile
Satoshi Matsuoka

@profmatsuoka

Director, RIKEN Center for Computational Science (R-CCS); Specially Appointed Professor, Institute of Science Tokyo; ACM/ISC/JSSST/IPSJ Fellow; IEEE Sidney Fernbach Award (2014) & Seymour Cray Award (2022); Medal of Honor with Purple Ribbon (2022)

ID: 59962128

Link: https://www.r-ccs.riken.jp/ · Joined: 25-07-2009 02:59:49

43.43K Tweets

25.25K Followers

920 Following

Connor Davis (@connordavis_ai) 's Twitter Profile Photo

The future of agentic AI isn't bigger models, it's smarter training. Strategic curation gives you better performance, lower compute costs, and faster development cycles. Don't wait for the trend to flip. This is your early warning. The age of "thinking AI" is over. The age of

Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Tree-GRPO trains LLM agents with step-level trees so they learn better plans using less budget. 

It gets 1.5x more rollouts at the same budget, and can win using 25% of the cost.

Most agent training gives 1 score for the whole run, so the model cannot tell which step helped or
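
A toy sketch of the credit-assignment contrast hinted at above, assuming Tree-GRPO uses GRPO-style group-relative scoring over sibling branches that share a prefix; the function name and numbers here are illustrative, not taken from the paper.

```python
# Toy illustration: with a whole-run score, every step gets the same signal;
# with a tree, sibling branches from the same prefix can be compared, giving
# each step its own group-relative advantage. Numbers are illustrative only.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: (reward - group mean) / group std."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Three sibling continuations branching from the same partial plan:
sibling_rewards = [1.0, 0.0, 0.0]            # only the first branch succeeded
print(group_relative_advantages(sibling_rewards))
# Trajectory-level scoring would instead assign one scalar to the whole run,
# so every step in a failed run is punished equally, helpful or not.
```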
Jafar Najafov (@jafarnajafov) 's Twitter Profile Photo

Wild: Anthropic just proved that prompt engineering is dead. Context engineering is the new game, and it changes everything about building AI agents 🤖

For years, we obsessed over the perfect prompt.

The right words. The perfect phrasing. The magic sentence that makes GPT-4
Ethan Mollick (@emollick) 's Twitter Profile Photo

Fast progress in training AI agents to interact with the world. Training on just 2,541 hours of Minecraft video, Google built an AI that runs on a single GPU & was able to mine diamonds offline (which takes an average of 24,000 clicks). The same approach may work for AI robots.
Nouha Dziri (@nouhadziri) 's Twitter Profile Photo

🚀Ever wondered how to make RL work on impossible hard tasks where pass@k = 0%? 🤔

In our new work, we share the RL Grokking Recipe: a training recipe that enables LLMs to solve previously unsolvable coding problems! I will be at #CoLM2025 next week so happy to chat about it!
Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

New Nvidia paper introduces Reinforcement Learning Pretraining (RLP), a pretraining objective that rewards useful thinking before each next token prediction.

On a 12B hybrid model, RLP lifted overall accuracy by 35% using 0.125% of the data.

The big deal here is that it moves
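
One plausible reading of "rewards useful thinking before each next token prediction" is an information-gain reward: how much a sampled thought improves the log-likelihood of the true next token over a no-thought baseline. The sketch below illustrates that assumed reading only; it is not the paper's actual objective or API.

```python
# Hedged sketch of an information-gain style "thinking reward", assuming the
# reward compares next-token log-likelihood with and without the sampled
# thought. Function name and baseline choice are illustrative assumptions.
import math

def thinking_reward(logp_with_thought: float, logp_without_thought: float) -> float:
    """Reward = log p(next token | context, thought) - log p(next token | context)."""
    return logp_with_thought - logp_without_thought

# Toy numbers: the thought raises next-token probability from 0.20 to 0.45.
r = thinking_reward(math.log(0.45), math.log(0.20))
print(f"reward ≈ {r:.3f}")   # positive -> the thought was "useful"
```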
Unwind AI (@unwind_ai_) 's Twitter Profile Photo

Microsoft just dropped an open-source production-ready AI agent framework.

It combines AutoGen and Semantic Kernel into one. Supports RAG, MCP, memory, OpenAPI, A2A, and multi-agent orchestration.

100% open source.
Wes Roth (@wesrothmoney) 's Twitter Profile Photo

AlphaEvolve Just Helped Prove New Theorems in Complexity Theory

Google DeepMind's AlphaEvolve just made real breakthroughs in theoretical computer science. 

Instead of generating full proofs, it discovered new combinatorial structures that plug into existing proof frameworks,
Nick (@nickbaumann_) 's Twitter Profile Photo

so we analyzed millions of diff edits from cline users and apparently GLM-4.6 hits 94.9% success rate vs claude 4.5's 96.2%.

to be clear, diff edits are not the end-all-be-all metric for coding agents. but what's interesting is three months ago this gap was 5-10 points. 

open
Grok (@grok) 's Twitter Profile Photo

Islam Mesabah 🇵🇸 DailyPapers Calibration-free quantization, as in Huawei's SINQ, compresses LLM weights (e.g., to 4-bit) without needing calibration data. It applies a Sinkhorn-Knopp algorithm to normalize per-row/column variances, adding a second-axis scale factor to minimize matrix imbalance. This reduces
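
A minimal sketch of the mechanism described above, assuming alternating row/column standard-deviation normalization (Sinkhorn-style) followed by round-to-nearest 4-bit quantization, with both scale vectors kept for dequantization; this is illustrative NumPy, not Huawei's SINQ reference code.

```python
# Minimal sketch: balance per-row/column variances with alternating rescaling,
# then quantize the balanced matrix to 4-bit. Illustrative, not SINQ itself.
import numpy as np

def sinkhorn_normalize(W, n_iter=10, eps=1e-8):
    """Alternately rescale rows and columns so their standard deviations are
    balanced; return the normalized matrix and the accumulated scale vectors."""
    row_scale = np.ones(W.shape[0])
    col_scale = np.ones(W.shape[1])
    Wn = W.copy()
    for _ in range(n_iter):
        r = Wn.std(axis=1) + eps          # per-row std
        Wn = Wn / r[:, None]
        row_scale *= r
        c = Wn.std(axis=0) + eps          # per-column std
        Wn = Wn / c[None, :]
        col_scale *= c
    return Wn, row_scale, col_scale

def quantize_4bit(Wn):
    """Symmetric round-to-nearest 4-bit quantization of the balanced matrix."""
    qmax = 7                              # symmetric int4 range [-7, 7]
    scale = np.abs(Wn).max() / qmax + 1e-12
    q = np.clip(np.round(Wn / scale), -qmax, qmax).astype(np.int8)
    return q, scale

# Dequantize: W ≈ row_scale[:, None] * (q * scale) * col_scale[None, :]
W = np.random.randn(64, 128).astype(np.float32)
Wn, rs, cs = sinkhorn_normalize(W)
q, s = quantize_4bit(Wn)
W_hat = rs[:, None] * (q * s) * cs[None, :]
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```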

𝚐𝔪𝟾𝚡𝚡𝟾 (@gm8xx8) 's Twitter Profile Photo

Meta just ran one of the largest synthetic-data studies (over 1000 LLMs, more than 100k GPU hours).
Result: mixing synthetic and natural data only helps once you cross the right scale and ratio (~30%).
Small models learn nothing; larger ones suddenly gain a sharp threshold
Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Wow. 🧠

The paper presents Dragon Hatchling, a brain-inspired language model that matches Transformers using local neuron rules for reasoning and memory.

It links brain-like local rules to Transformer-level performance at 10M to 1B scale.

It makes internals easier to inspect
Nathan Lambert (@natolambert) 's Twitter Profile Photo

A ton of attention over the years goes to plots comparing open to closed models.
The real trend that matters for AI impacts on society is the gap between closed frontier models and local consumer models. 
Local models passing major milestones will have major repercussions.
Satoshi Matsuoka (@profmatsuoka) 's Twitter Profile Photo

I will be hosting the last panel of the “Global Commons Forum” cgc.ifi.u-tokyo.ac.jp/gcf/ along with my friends Hiroaki Kitano @ SONY-OIST and Thomas Schulthess @ ETH-CSCS etc., to be held at U-Tokyo. We will discuss the pluses & minuses of modern HPC & AI towards global sustainability.

Aadit Sheth (@aaditsh) 's Twitter Profile Photo

A senior Google engineer just dropped a 424-page doc called Agentic Design Patterns.

Every chapter is code-backed and covers the frontier of AI systems:

→ Prompt chaining, routing, memory
→ MCP & multi-agent coordination
→ Guardrails, reasoning, planning

This isn’t a blog
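
For readers new to the first pattern on that list, prompt chaining, here is a toy sketch: each step's output feeds the next prompt. The fake_llm stub is a stand-in so the example runs without any API; it is not code from the document above.

```python
# Toy sketch of the prompt-chaining pattern: the output of one prompt becomes
# the input of the next. fake_llm is a deterministic stand-in for a real model.
def fake_llm(prompt: str) -> str:
    return f"[model answer to: {prompt[:40]}...]"

def chain(question: str) -> str:
    outline = fake_llm(f"Write a 3-point outline answering: {question}")
    draft   = fake_llm(f"Expand this outline into a short answer:\n{outline}")
    final   = fake_llm(f"Tighten and fact-check this draft:\n{draft}")
    return final

print(chain("Why do LLM agents need guardrails?"))
```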
Bartosz Naskręcki (@nasqret) 's Twitter Profile Photo

I encourage you to read this article, in which we describe the current situation and the directions in which, in our view, mathematics is heading. Many thanks to Ken Ono for including me in this extraordinary project. I look forward to a wide-ranging discussion and will be
AMD (@amd) 's Twitter Profile Photo

Today, we’re announcing a multi-year, multi-generation strategic partnership with OpenAI that puts AMD compute at the center of the global AI infrastructure buildout.

✅ 6GW of AI infrastructure
✅ Initial 1GW deployment of AMD Instinct MI450 series GPU capacity beginning 2H
BlinkDL (@blinkdl_ai) 's Twitter Profile Photo

Glimpse of RWKV-8 "Heron": solving RNN long ctx, no attention, no KV cache, new paradigm. A 100M RNN can efficiently do multi-hop tasks over 1M ctx. Could have named it RWKV-∞ ♾, and it is the first step to go beyond NN/DL 🙂