Álvaro Barbero Jiménez (@albarjip) Twitter Tweets • TwiCopy

Álvaro Barbero Jiménez

@albarjip

Chief Data Scientist at @IIConocimiento, crazy scientist, artificial artist. Machine Learning, AI, optimization, programming, 日本文化, geek stuff

ID: 749039234

Link: https://albarji.substack.com/ · Joined: 10-08-2012 09:43:08

3,3K Tweet

913 Followers

673 Following

John Carmack

@id_aa_carmack

3 months ago

I have never seen it expressed exactly like that, but I wholeheartedly endorse it: Feedback beats planning. My plea at Meta was “No grand plans, follow the gradient of user value”.

Likes: 10,10K · Replies: 505 · Retweets: 1,1K

Clarifying o3’s ARC-AGI Performance
OpenAI has confirmed:
* The released o3 is a different model from what we tested in December 2024
* All released o3 compute tiers are smaller than the version we tested
* The released o3 was not trained on ARC-AGI data, not even the train

Likes: 1,1K · Replies: 36 · Retweets: 83

Programmer Humor

@pr0grammerhum0r

3 months ago

pleaseTellMyEngineeringDirector redd.it/1k1xs3t

Likes: 200 · Replies: 4 · Retweets: 21

nic

@nicdunz

3 months ago

LMAOO

Likes: 35,35K · Replies: 150 · Retweets: 1,1K

Zhao Tianyu

@zhaoting1024

3 months ago

Official announcement: Qwen 3 this week. Reasoning and non-reasoning in one.

Likes: 794 · Replies: 6 · Retweets: 88

Andrej Karpathy

@karpathy

3 months ago

There's a new paper circulating looking in detail at LMArena leaderboard: "The Leaderboard Illusion" arxiv.org/abs/2504.20879 I first became a bit suspicious when at one point a while back, a Gemini model scored #1 way above the second best, but when I tried to switch for a few

Likes: 4,4K · Replies: 192 · Retweets: 429

David Rozado

@davidrozado

2 months ago

1/ Do AI systems discriminate based on gender when choosing the most qualified candidate for a job? I ran an experiment with several leading LLMs to find out. Here's what I discovered:👇

Likes: 887 · Replies: 40 · Retweets: 205

Blank

@blankpapper_

2 months ago

get off that silly

Likes: 73,73K · Replies: 55 · Retweets: 7,7K

hussam

@hussamfyi

2 months ago

Likes: 33,33K · Replies: 346 · Retweets: 5,5K

Carlos Santana

@dotcsv

2 months ago

If we really lived in a simulation and I were the screenwriter, this is the kind of video I would use to start dropping hints that set off the season-finale events, ending with the reveal that we do, in fact, live in a simulation.

Likes: 1,1K · Replies: 48 · Retweets: 277

ARC Prize

@arcprize

2 months ago

Claude Sonnet 4 on ARC-AGI Semi Private Eval
Base
* ARC-AGI-1: 23%, $0.08/task
* ARC-AGI-2: 1.2%, $0.12/task
Thinking 16K
* ARC-AGI-1: 40%, $0.36/task
* ARC-AGI-2: 5.9%, $0.48/task
Sonnet 4 sets new SOTA (5.9%) on ARC-AGI-2

Likes: 611 · Replies: 41 · Retweets: 74

Álvaro Barbero Jiménez

@albarjip

2 months ago

Unintentional (?) humor on Twitter

Likes: 4 · Replies: 0 · Retweets: 1

Álvaro Barbero Jiménez

@albarjip

2 months ago

A very necessary paper, showing once again that LLMs, even "reasoning" LLMs, do not actually reason but collapse when far from their training distribution. AGI or PhD-level AIs are not around the corner. But they are still extremely useful when used correctly.

Likes: 8 · Replies: 0 · Retweets: 3

Rohan Paul

@rohanpaul_ai

a month ago

It’s a hefty 206-page research paper, and the findings are concerning. "LLM users consistently underperformed at neural, linguistic, and behavioral levels" This study finds LLM dependence weakens the writer’s own neural and linguistic fingerprints. 🤔🤔 Relying only on EEG,

Likes: 9,9K · Replies: 256 · Retweets: 2,2K

Escaños en Blanco para dejar Escaños Vacíos

@escanosenblanco

a month ago

Since last week, this account has been growing organically. That is, it is growing without any specific post or action driving it. The tool that Escaños en Blanco offers, the idea of leaving seats empty, is starting to move on its own. Citizens. No parties.

Likes: 297 · Replies: 17 · Retweets: 104

UAM Autónoma Madrid

@uam_madrid

a month ago

🏆 The UAM 2024 Awards for Young Researchers recognize significant contributions to the development of research activity at #UAM. Congratulations to everyone honored in this edition!👏🏻 Alfonso Santos López, Anne-Marie Reynaers, Fátima

Likes: 57 · Replies: 7 · Retweets: 18

PyTorch

@pytorch

10 days ago

Discover how #verl simplifies #ReinforcementLearning for advanced #LLM reasoning and tool use in our Aug 6 Expert Exchange with Haibin Lin (ByteDance). Supports PPO/GRPO/DAPO, async rollout, expert parallelism for MoE, and more. #PyTorch #OpenSourceAI 🔗 hubs.la/Q03xkQW-0

Likes: 53 · Replies: 4 · Retweets: 18

ARC Prize

@arcprize

7 days ago

Today, we're announcing a preview of ARC-AGI-3, the Interactive Reasoning Benchmark with the widest gap between easy for humans and hard for AI
We’re releasing:
* 3 games (environments)
* $10K agent contest
* AI agents API
Starting scores - Frontier AI: 0%, Humans: 100%

Likes: 1,1K · Replies: 61 · Retweets: 218