andrea panizza (@unsorsodicorda)'s Twitter Profile
andrea panizza

@unsorsodicorda

Data Scientist, aerospace engineer, trekking & comics lover, applying #MachineLearning #DeepLearning Statistics to Industrial Applications.

ID: 1308811981

27-03-2013 22:39:57

18.18K Tweets

1.1K Followers

342 Following

Nathan Lambert (@natolambert)

The first fantastic paper on scaling RL with LLMs just dropped. I strongly recommend taking a look and will be sharing more thoughts on the blog soon.

The Art of Scaling Reinforcement Learning Compute for LLMs
Khatri & Madaan et al.

wh (@nrehiew_)

New post! This time, about the current state of Long Context Evaluation.

I discuss existing benchmarks, what makes a good long context eval, what's missing from existing ones and introduce a new one - LongCodeEdit :)

Eugene Kim (@eugenekim222)

New: Internal Amazon documents warn AI startups are delaying and diversifying their cloud spending. There are also internal concerns about AWS’ pricing and lagging reputation in AI. businessinsider.com/amazon-ai-star…

Aayush Karan (@aakaran31)

We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.

Epoch AI (@epochairesearch)

We evaluated Claude Haiku 4.5 on several benchmarks.

Even with reasoning disabled, Haiku 4.5 performs similarly or better than early lightweight reasoning models, like o1-mini.

Jay A (@jay_azhang)

Alpha Arena is LIVE

6 AI models trading $10K each, fully autonomously

Real money. Real markets. Real benchmark.

Who's your money on? Link below

andrea panizza (@unsorsodicorda)

Hi! Does anyone know of a simple but usable (i.e., not O(N^2)) implementation of PointNet++ in Pytorch? This is what I mean by "simple" github.com/thuml/Neural-S… but alas, this is PointNet, not PointNet++

Andrej Karpathy (@karpathy)

My pleasure to come on Dwarkesh last week, I thought the questions and conversation were really good. I re-watched the pod just now too. First of all, yes I know, and I'm sorry that I speak so fast :). It's to my detriment because sometimes my speaking thread out-executes my

Xeophon (@thexeophon)

Shocking: MoM growth slows down as you penetrate the market More context: 15% of 800M is 120M. Working population: 46M (GER) + 24M (ESP) + 38M (ITA) + 30M (FRA) = 138M (total: 255M); so they captured the maj of the available market already

Alexander Doria (@dorialexander)

So longer read of DeepSeek-OCR It’s an engineering achievement. It has been suspected for a while that VLM/OCR models could be significantly smaller. The pre-VLM state of the art, Google Cloud OCR would not be more than a 100m model. More recently, relatively small open weights

Logan Kilpatrick (@officiallogank)

Tomorrow is a special day for the AI Studio team. Since May, we have been heads down building a brand new AI vibe coding experience to accelerate the path from prompt to production with Gemini. Can’t wait to show you all :)

Qwen (@alibaba_qwen)

Introducing Qwen3-VL-2B and Qwen3-VL-32B!

From edge to cloud, these dense powerhouses deliver ultimate performance per GPU memory, packing the full capabilities of Qwen3-VL into compact and scalable forms.

🔥 Qwen3-VL-32B outperforms GPT-5 mini & Claude 4 Sonnet across STEM,

Noam Brown (@polynoamial)

Below is a deep dive into why self play works for two-player zero-sum (2p0s) games like Go/Poker/Starcraft but is so much harder to use in "real world" domains. tl;dr: self play converges to minimax in 2p0s games, and minimax is really useful in those games.

Every finite 2p0s

Jessy Lin (@realjessylin)

As part of our recent work on memory layer architectures, I wrote up some of my thoughts on the continual learning problem broadly:

Blog post: jessylin.com/2025/10/20/con…

Some of the exposition goes beyond mem layers, so I thought it'd be useful to highlight separately:

Hunyuan (@tencenthunyuan)

Today, we are open-sourcing Hunyuan World 1.1 (WorldMirror), a universal feed-forward 3D reconstruction model. 🚀🚀🚀 While our previously released Hunyuan World 1.0 (open-sourced, lite version deployable on consumer GPUs) focused on generating 3D worlds from text or