Fabian Gloeckle (@fabiangloeckle) 's Twitter Profile
Fabian Gloeckle

@fabiangloeckle

PhD student at @AIatMeta and @EcoledesPonts with @syhw and @Amaury_Hayat, co-supervised by @wtgowers. Machine learning for mathematics and programming.

ID: 1694458386635534336

Joined: 23-08-2023 21:15:42

52 Tweets

519 Followers

220 Following

Tom Sander @NeurIPS (@rednastom) 's Twitter Profile Photo

You didn’t believe in Differentially Private training for foundation models? We achieved the same performance as a non-private MAE trained on the same dataset, but with rigorous DP. Code is released: github.com/facebookresear…. Presenting tomorrow at ICML, 11:30AM poster, #2313
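
To make the recipe concrete, here is a minimal sketch of DP-SGD (per-sample gradient clipping plus Gaussian noise), the standard mechanism behind differentially private training; the manual per-sample loop and the hyperparameters below are illustrative assumptions, not the released MAE configuration.

```python
import torch
from torch import nn

# Minimal DP-SGD step: clip each per-sample gradient, then add Gaussian noise.
# clip_norm and noise_multiplier are placeholders, not the paper's values.
def dp_sgd_step(model: nn.Module, loss_fn, xs, ys, lr=0.1,
                clip_norm=1.0, noise_multiplier=1.0):
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):                               # per-sample gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))  # clip to clip_norm
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p.add_(-(lr / len(xs)) * (s + noise))           # noisy averaged update
```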

Kaiyu Yang (@kaiyuyang4) 's Twitter Profile Photo

We're looking for a postdoc at Meta FAIR to work on AI4Math, e.g., neural theorem proving, autoformalization, learning mathematical rules and abstractions, and automated discovery/conjecturing in math. Please apply at metacareers.com/jobs/145969190… and email me. Please help share this

Jason Rute @ JMM 2025 (@jasonrute) 's Twitter Profile Photo

Prof. Anima Anandkumar I really worry that the, shall we say, “questionable” claims in this paper (162 theorems supposedly unproven by humans) will get taken seriously, and the rest of us working in this field will look really bad for it. There are much better works already in this field.

TimDarcet (@timdarcet) 's Twitter Profile Photo

🚨 RELEASE ALERT ‼️ github.com/facebookresear… THIS CHANGES EVERYTHING $META just dropped a game-changing codebase! Now everyone can do LLM research! 😱 🧵10 best things people are already building with lingua 🔥👇

Gabriel Synnaeve (@syhw) 's Twitter Profile Photo

Want to do research in code generation with LLMs and wonky deep learning from the 90s? We're recruiting one Master's (M2) student intern for 2025 at FAIR Paris on my team metacareers.com/jobs/106871446…

Mathurin Videau (@mathuvu_) 's Twitter Profile Photo

Meta Lingua: a minimal, fast LLM codebase for training and inference. By researchers, for researchers. Easily hackable, still reproducible. Built-in efficiency, profiling (CPU, GPU and memory) and interpretability (automatic activation and gradient statistics). Joint work w/ Badr Youbi Idrissi
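
Lingua's own code isn't shown in the tweet, but the "automatic activation and gradient statistics" idea can be sketched with plain PyTorch hooks; the helper below and its stat names are assumptions for illustration, not Lingua's API.

```python
import torch
from torch import nn

# Sketch: collect activation mean/std and gradient norms via hooks,
# in the spirit of "automatic activation and gradient statistics".
def attach_stat_hooks(model: nn.Module, stats: dict):
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            def fwd_hook(mod, inp, out, name=name):
                stats[f"{name}/act_mean"] = out.detach().mean().item()
                stats[f"{name}/act_std"] = out.detach().std().item()
            module.register_forward_hook(fwd_hook)
    for name, param in model.named_parameters():
        def grad_hook(grad, name=name):
            stats[f"{name}/grad_norm"] = grad.detach().norm().item()
        param.register_hook(grad_hook)

stats = {}
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
attach_stat_hooks(model, stats)
model(torch.randn(8, 16)).sum().backward()   # stats now holds the statistics
```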

Taco Cohen (@tacocohen) 's Twitter Profile Photo

🚨 New intern position in the FAIR CodeGen Team! 🚨 I'm particularly interested in working with candidates with expertise in off-policy RL methods, and/or in code generation with LLMs, but this is not a hard requirement and the project topic is somewhat flexible.

Ekin Akyürek (@akyurekekin) 's Twitter Profile Photo

Why do we treat train and test times so differently? Why is one “training” and the other “in-context learning”? Just take a few gradient steps at test time — a simple way to increase test-time compute — and get SoTA on the ARC public validation set: 61%, the average human score! ARC Prize

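A minimal sketch of the test-time training idea (a few gradient steps on a task's demonstration pairs before answering the query); the model-agnostic loop, step count and learning rate are assumptions, not the authors' exact ARC setup.

```python
import copy
import torch
from torch import nn

# Sketch of test-time training: fine-tune a copy of the model on the task's
# demonstration pairs, then predict on the query with the adapted copy.
def test_time_train(model: nn.Module, demos, query, steps=8, lr=1e-3):
    adapted = copy.deepcopy(model)               # keep the base model untouched
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    adapted.train()
    for _ in range(steps):
        for x, y in demos:                       # the task's in-context examples
            opt.zero_grad()
            loss_fn(adapted(x), y).backward()
            opt.step()
    adapted.eval()
    with torch.no_grad():
        return adapted(query)                    # prediction from the adapted model
```
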
Lean (@leanprover) 's Twitter Profile Photo

We have just launched the new Lean reference manual, our core documentation intended as a comprehensive, precise description of Lean! #leanlang #leanprover Check out the manual: lean-lang.org/doc/reference/… Read more about the release: lean-lang.org/blog/2024-12-1…

Kunhao Zheng @ ICLR 2025 (@kunhaoz) 's Twitter Profile Photo

🚨 Your RL only improves 𝗽𝗮𝘀𝘀@𝟭, not 𝗽𝗮𝘀𝘀@𝗸? 🚨 That’s not a bug — it’s a 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗼𝗳 𝘁𝗵𝗲 𝗼𝗯𝗷𝗲𝗰𝘁𝗶𝘃𝗲 you’re optimizing. You get what you optimize for. If you want better pass@k, you need to optimize for pass@k at training time. 🧵 How?

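For reference, this is the standard unbiased pass@k estimator the thread is about, computed from n samples of which c are correct; the thread's point is to optimize this quantity at training time rather than only report it at evaluation. The numbers below are just an illustration.

```python
from math import comb

# Unbiased pass@k estimator from n sampled solutions with c correct.
def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 16 samples, 3 of them correct.
print(pass_at_k(16, 3, 1))   # 0.1875  (what pass@1-style RL optimizes)
print(pass_at_k(16, 3, 8))   # 0.90    (what you may actually care about)
```
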
Mathurin Videau (@mathuvu_) 's Twitter Profile Photo

We present an Autoregressive U-Net that incorporates tokenization inside the model, pooling raw bytes into words then word-groups. AU-Net focuses most of its compute on building latent vectors that correspond to larger units of meaning. Joint work with Badr Youbi Idrissi 1/8

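As a rough illustration of the byte-to-word pooling stage only (not the AU-Net architecture itself), one can mean-pool byte embeddings between whitespace boundaries; the module below and its dimensions are assumptions for illustration.

```python
import torch
from torch import nn

# Sketch: pool raw-byte embeddings into word-level vectors by mean-pooling
# between whitespace boundaries. Illustrates byte -> word pooling only.
class BytePooler(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(256, dim)      # one embedding per byte value

    def forward(self, text: str) -> torch.Tensor:
        data = text.encode("utf-8")
        embs = self.embed(torch.tensor(list(data)))
        words, start = [], 0
        for i, b in enumerate(data):
            if b == ord(" "):                    # split at spaces
                if i > start:
                    words.append(embs[start:i].mean(dim=0))
                start = i + 1
        if start < len(data):
            words.append(embs[start:].mean(dim=0))
        return torch.stack(words)                # (num_words, dim)

pooled = BytePooler()("pooling raw bytes into words")
print(pooled.shape)                              # torch.Size([5, 64])
```
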
Belen Alastruey (@b_alastruey) 's Twitter Profile Photo

🚀New paper alert! 🚀 In our work at AI at Meta, we dive into the struggles of mixing languages in highly multilingual Transformer encoders and use this analysis as a tool to better design multilingual models for optimal performance. 📄: arxiv.org/abs/2508.02256 🧵(1/n)
