Florent BARTOCCIONI (@fbartoc) 's Twitter Profile
Florent BARTOCCIONI

@fbartoc

Building world models at valeoAI

ID: 1258383960721231872

Link: http://f-barto.github.io | Joined: 07-05-2020 13:11:43

209 Tweets

41 Followers

672 Following

RoboPapers (@robopapers) 's Twitter Profile Photo

Full episode dropping soon! Geeking out with Paul Zhou on AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World auto-eval.github.io Co-hosted by Chris Paxton & Michael Cho - Rbt/Acc

Hongyang Li (@francislee2020) 's Twitter Profile Photo

Introducing #UniVLA, a unified VLA framework that enables policy learning across different environments, showing consistent improvements on multiple manipulation and navigation tasks. #RSS2025 github.com/OpenDriveLab/U…

Michael Cho - Rbt/Acc (@micoolcho) 's Twitter Profile Photo

This makes the past 3 years of work and often months away from my family worth it. A big shoutout to noriaki_hirose, Lydia Ignatova, Kyle Stachowicz, Catherine Glossop, Dhruv Shah, Sergey Levine for giving meaning to the work we do at FrodoBots. While some see our attempts in robotic

Rudy Gilman (@rgilman33) 's Twitter Profile Photo

The secret life of SwiGLU  

Simple neurons like those using ReLU, GELU or SiLU create a new dimension, then slice across that same dimension to lop off part of the space.  

A gated neuron, on the other hand, can align the knife however it wants. In DINO-v2 what's interesting is
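The geometric picture above can be sketched numerically. A minimal toy illustration (made-up weights, not DINO-v2's actual MLP):

```python
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

# "Simple" neuron: project the input onto one direction w, then the
# activation cuts along that SAME direction: the knife is tied to the
# dimension the projection just created.
def simple_neuron(x, w):
    return silu(x @ w)

# Gated (SwiGLU-style) neuron: the value path x @ w and the gate path
# x @ v use independent directions, so the gate can place the cut
# along any direction v, not just along w.
def gated_neuron(x, w, v):
    return (x @ w) * silu(x @ v)
```

In the gated case the zero set of the output is controlled by v while the magnitude is controlled by w, which is the "align the knife however it wants" freedom the tweet describes.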
OpenDriveLab (@opendrivelab) 's Twitter Profile Photo

💥 Forget slow autoregression and skip rigid full-sequence denoising! Nexus is a next-gen predictive pipeline for realistic, safety-critical driving scene generation. What’s new?
✅ Decoupled diffusion → fast updates, goal-driven control
✅ Noise-masking training → inject

Edward Milsom (@edward_milsom) 's Twitter Profile Photo

To address the "parameterisation lottery" (ideas win because they work well with popular choices of e.g. learning rates) I think empirical hyperparameter transfer methods are crucial. Rules like mu-P require you to derive them first, which is painful... x.com/edward_milsom/…

Seohong Park (@seohong_park) 's Twitter Profile Photo

We found a way to do RL *only* with BC policies.

The idea is simple:

1. Train a BC policy π(a|s)
2. Train a conditional BC policy π(a|s, z)
3. Amplify(!) the difference between π(a|s, z) and π(a|s) using CFG

Here, z can be anything (e.g., goals for goal-conditioned RL).

🧵↓
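Step 3 above can be sketched for a discrete action space. A minimal illustration of the CFG-style amplification (the probability tables and guidance weight are made up, not from the thread):

```python
import numpy as np

# Classifier-free-guidance-style amplification in log space:
#   logits_w = log pi(a|s) + w * (log pi(a|s,z) - log pi(a|s))
# with guidance weight w > 1; w = 1 recovers pi(a|s,z).
def cfg_amplify(logp_uncond, logp_cond, w):
    logits = logp_uncond + w * (logp_cond - logp_uncond)
    p = np.exp(logits - logits.max())   # renormalize into a distribution
    return p / p.sum()

logp_uncond = np.log(np.array([0.5, 0.3, 0.2]))  # pi(a|s)
logp_cond   = np.log(np.array([0.3, 0.6, 0.1]))  # pi(a|s,z)
p_amp = cfg_amplify(logp_uncond, logp_cond, w=3.0)
```

With w=3 the action the conditional policy prefers is pushed well past its original 0.6 probability, which is the "amplify the difference" step.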
jack morris (@jxmnop) 's Twitter Profile Photo

new paper from our work at Meta!

**GPT-style language models memorize 3.6 bits per param**

we compute capacity by measuring total bits memorized, using some theory from Shannon (1953)

shockingly, the memorization-datasize curves look like this:
    ___________
   /
  /

(🧵)
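A back-of-the-envelope use of the headline figure. The 124M parameter count below is an illustrative assumption (roughly GPT-2-small-sized), not a number from the thread:

```python
# ~3.6 bits memorized per parameter, per the headline claim.
BITS_PER_PARAM = 3.6

def memorization_capacity_bits(num_params):
    return BITS_PER_PARAM * num_params

num_params = 124_000_000
cap_bits = memorization_capacity_bits(num_params)
cap_megabytes = cap_bits / 8 / 1e6   # ~55.8 MB of raw memorized data
```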
Xun Huang (@xunhuang1995) 's Twitter Profile Photo

Real-time video generation is finally real — without sacrificing quality. Introducing Self-Forcing, a new paradigm for training autoregressive diffusion models. The key to high quality? Simulate the inference process during training by unrolling transformers with KV caching.
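The training pattern described ("unroll with KV caching during training") can be sketched with a toy model. This is a hypothetical illustration, not the Self-Forcing implementation:

```python
import numpy as np

# Toy autoregressive "model": a linear next-frame predictor with a
# running cache standing in for transformer KV state.
class ToyARModel:
    def __init__(self, dim, seed=0):
        self.W = np.random.default_rng(seed).normal(size=(dim, dim)) * 0.1

    def init_cache(self):
        return []  # stands in for a transformer KV cache

    def step(self, frame, cache):
        cache.append(frame)            # analogous to appending K/V entries
        return self.W @ frame, cache   # predict the next frame

def self_forcing_loss(model, first_frame, targets):
    # Unroll on the model's OWN outputs, as it would run at inference,
    # instead of teacher-forcing on ground-truth frames.
    cache, frame, losses = model.init_cache(), first_frame, []
    for tgt in targets:
        frame, cache = model.step(frame, cache)   # condition on own output
        losses.append(np.mean((frame - tgt) ** 2))
    return float(np.mean(losses))
```

Teacher forcing would pass the ground-truth previous frame into `step`; here the loss sees the compounding of the model's own errors, matching the inference-time distribution.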

Robert Lange (@roberttlange) 's Twitter Profile Photo

Text-to-LoRA: What if you no longer had to fine-tune your LLM for every single downstream task?

🚀 Stoked to share our work on instant LLM adaptation using meta-learned hypernetworks 📝 →  🔥

The idea is simple yet elegant: We text-condition a hypernetwork to output LoRA
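The idea above, a text-conditioned hypernetwork emitting LoRA factors, can be sketched as follows. All shapes and the linear hypernetwork are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

# A "hypernetwork" maps a task-description embedding to LoRA factors
# A (r x d_in) and B (d_out x r); the adapted weight is W + B @ A.
d_in, d_out, r, d_task = 32, 32, 4, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))               # frozen base weight
H_A = rng.normal(size=(d_task, r * d_in)) * 0.01
H_B = rng.normal(size=(d_task, d_out * r)) * 0.01

def lora_from_task(task_emb):
    A = (task_emb @ H_A).reshape(r, d_in)
    B = (task_emb @ H_B).reshape(d_out, r)
    return A, B

task_emb = rng.normal(size=d_task)  # stand-in for an encoded task description
A, B = lora_from_task(task_emb)
W_adapted = W + B @ A               # rank-<=r update, no fine-tuning step
```

The point of the design is that adaptation becomes a single forward pass of the hypernetwork rather than a per-task fine-tuning run.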
Ethan (@torchcompiled) 's Twitter Profile Photo

Modeling dolphin language is cool. Translating it into human speak is cooler. 

Somewhere you're gonna want to figure out how to align the latent space of dolphin language with that of human language in an unpaired, unbiased manner.
Brian Christian (@brianchristian) 's Twitter Profile Photo

Reward models (RMs) are the moral compass of LLMs – but no one has x-rayed them at scale. We just ran the first exhaustive analysis of 10 leading RMs, and the results were...eye-opening. Wild disagreement, base-model imprint, identity-term bias, mere-exposure quirks & more: 🧵

leloy! (@leloykun) 's Twitter Profile Photo

This effect seems to just be an artifact of SGD/Adam/AdamW/etc.; more modern optimizers, e.g. Muon/Shampoo/PSGD, don't have this 'issue'.

The crux is that the raw 'gradients' we get from backpropagation tend to have low (stable) rank, and optimizers like SGD/AdamW preserve
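The low-rank claim, and what an orthogonalizing optimizer does differently, can be sketched numerically. A toy illustration using an SVD (Muon itself approximates the orthogonalization with a Newton-Schulz iteration):

```python
import numpy as np

rng = np.random.default_rng(0)
# A hypothetical low-rank "gradient": rank <= 4 inside a 64x64 matrix.
g = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 64))

# An SGD step is a scalar multiple of g, so it has exactly the same
# rank: the update only ever touches a 4-dimensional subspace.
sgd_update = -0.01 * g

# A Muon-style step instead replaces g by its orthogonal polar factor,
# making all nonzero singular values equal, so the step is spread
# evenly across the directions the gradient contains.
u, s, vt = np.linalg.svd(g, full_matrices=False)
keep = s > 1e-8 * s.max()            # drop numerically-zero directions
muon_update = (u * keep) @ vt
```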
Y Combinator (@ycombinator) 's Twitter Profile Photo

François Chollet on the ARC Prize and how we get to AGI. At AI Startup School in San Francisco.
00:00 - The Falling Cost of Compute
00:57 - Deep-Learning’s Scaling Era & Benchmarks
01:59 - The ARC Benchmark
03:02 - The 2024 Shift to Test-Time Adaptation
05:01 - What

Arthur Douillard (@ar_douillard) 's Twitter Profile Photo

I'll discuss distributed learning on Saturday, July 12. First, I'll cover current methods that need high bandwidth, then next-generation methods for decentralized learning.

Soumith Chintala (@soumithchintala) 's Twitter Profile Photo

considering Muon is so popular and validated at scale, we've just decided to welcome a PR for it in PyTorch core by default.
If anyone wants to take a crack at it... 
github.com/pytorch/pytorc…
Shashank (@shawshank_v) 's Twitter Profile Photo

Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research 🧵
