Federico Barbero (@fedzbar)'s Twitter Profile
Federico Barbero

@fedzbar

I like Transformers and graphs. I also like chess and a few other things as well.

ID: 1073302912854564870

Link: https://federicobarbero.com · Joined: 13-12-2018 19:45:30

231 Tweets

2.2K Followers

274 Following

Federico Barbero (@fedzbar)'s Twitter Profile Photo

Heading tomorrow to Vancouver for NeurIPS! Please do reach out if you want to chat about reasoning in Transformers / LLMs :) I'll be presenting our work "Transformers need glasses! 👓" on Thursday at 4:30pm at East Exhibit Hall A-C #1806.

Ben Finkelshtein (@benfinkelshtein)'s Twitter Profile Photo

Come check out Learning on Large Graphs using Intersecting Communities! (With a “hint” of Game of Thrones references) @NeurIPS 📌East Exhibit Hall A-C #3001, Session 3

Ethan (@torchcompiled)'s Twitter Profile Photo

so yeah, this is something I've always been confused about with softmax. your denominator keeps growing with sequence length, but logits of individual items are invariant to this. So attention sharpness ultimately depends on sequence length, becoming easier for noise to drown out the signal.

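The effect the tweet describes is easy to check numerically: if one relevant token has a fixed logit, its softmax weight decays as the number of competing tokens grows, because only the denominator changes. A minimal sketch (logit values are illustrative, not from any model):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One "relevant" token with logit 5.0 among n noise tokens with logit 0.0.
# The relevant logit never changes, but the denominator grows with n,
# so the attention weight on the relevant token drifts toward 0.
for n in [10, 100, 1000, 10000]:
    weight = softmax([5.0] + [0.0] * n)[0]
    print(f"n={n:>5}  weight on relevant token = {weight:.4f}")
```

With logit 5.0, the weight is exactly e^5 / (e^5 + n), so it is roughly halved once n reaches e^5 ≈ 148 noise tokens.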
Petar Veličković (@petarv_93)'s Twitter Profile Photo

This just in -- Looks like you'll be seeing more of p-RoPE at #ICLR2025! 🔄 Congratulations Federico Barbero on yet another epic paper from your internship getting published! 🎉

EEML (@eemlcommunity)'s Twitter Profile Photo

Applications are now open for EEML 2025 in Sarajevo, Bosnia and Herzegovina, 21-26 July! 🎉 Learn from top AI researchers and connect with peers in Sarajevo 🇧🇦, a historical crossroads of East and West. Needs-based scholarships are available. Deadline: 31 March 2025.

Simone Scardapane (@s_scardapane)'s Twitter Profile Photo

*Round and Round We Go! What makes Rotary Positional Encodings useful?* by Federico Barbero, Petar Veličković, and Christos Perivolaropoulos. They show RoPE has distinct behavior for different rotation angles - high freq for position, low freq for semantics. arxiv.org/abs/2410.06205

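The high-freq/low-freq split can be seen from RoPE's angle schedule alone: each 2-D pair of dimensions is rotated by pos · θ_i with θ_i = base^(−2i/d). A toy sketch, assuming the standard parameterisation (dimension and base values are illustrative defaults, not tied to the paper):

```python
import math

def rope_angles(pos, dim=64, base=10000.0):
    """Rotation angle (radians) applied to each 2-D pair at position `pos`.

    Standard RoPE schedule: theta_i = base^(-2i/dim), angle = pos * theta_i.
    """
    return [pos * base ** (-2 * i / dim) for i in range(dim // 2)]

angles = rope_angles(pos=100)
# Highest-frequency pair: many full turns over 100 positions, so its
# content is highly position-sensitive. Lowest-frequency pair: barely
# rotates, so what it carries stays nearly invariant across positions.
print(f"fastest pair: {angles[0]:.1f} rad over 100 positions")
print(f"slowest pair: {angles[-1]:.4f} rad over 100 positions")
```

The spread between the two ends of the schedule is roughly four orders of magnitude here, which is the intuition behind "high freq for position, low freq for semantics."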
Alvaro Arroyo (@arroyo_alvr)'s Twitter Profile Photo

Vanishing gradients are central to RNNs and SSMs, but how do they affect GNNs? We explore this in our new paper! w/ A. Gravina, Ben Gutteridge, Federico Barbero, C. Gallicchio, Xiaowen Dong, Michael Bronstein, Pierre Vandergheynst 🔗 arxiv.org/abs/2502.10818 🧵(1/11)

Frank Noe (@franknoeberlin)'s Twitter Profile Photo

The BioEmu-1 model and inference code are now public under MIT license!!! Please go ahead, play with it and let us know if there are issues. github.com/microsoft/bioe…

charliebtan (@charliebtan)'s Twitter Profile Photo

New preprint! 🚨 We scale equilibrium sampling to hexapeptides (in Cartesian coordinates!) with Sequential Boltzmann generators! 📈 🤯 Work with Joey Bose, Chen Lin, Leon Klein, Michael Bronstein and Alex Tong Thread 🧵 1/11

Itay Yona (@itay__yona)'s Twitter Profile Photo

Ever felt like you're talking to a parrot with a glitch? 🦜 Turns out, LLMs struggle with repetition in a fascinating way! 🕵️‍♂️ We reverse-engineered the circuit responsible for that bug 🤯

Federico Barbero (@fedzbar)'s Twitter Profile Photo

I was left so impressed by the amount of effort and care Tim Scarfe puts into the production of his videos. Definitely recommend his channel; a true privilege to have been interviewed. Please excuse me as I was very jet-lagged, so be nice!! :)

Petar Veličković (@petarv_93)'s Twitter Profile Photo

Indeed it is! Let's look at these techniques together 🌟 Join me at the virtual GLOW seminar today (5pm CET) for the first public showing of my 'LLMs as GNNs' talk. 💬🕸️ (Instructions for joining in reply)

Ji-Ha (@ji_ha_kim)'s Twitter Profile Photo

LLMs anchor themselves on the first token to dampen and stabilize the interactions on the other tokens. A great explanation of attention sinks with minimal math, and great diagrams!

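The dampening is just the softmax denominator at work: a sink token with a persistently high score absorbs probability mass, shrinking and flattening the weights on all the other tokens. A toy illustration (scores are made up, not from any real model):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy attention scores for 5 "content" tokens.
scores = [1.0, 0.8, 1.2, 0.9, 1.1]

no_sink = softmax(scores)
# Prepend a sink token with a high score: it soaks up most of the mass,
# leaving smaller, flatter weights on the content tokens.
with_sink = softmax([4.0] + scores)

print("max content weight without sink:", round(max(no_sink), 3))
print("max content weight with sink:   ", round(max(with_sink[1:]), 3))
```

The relative ordering among the content tokens is unchanged; only the magnitude of their interactions is damped, which matches the "stabilizing anchor" intuition.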
Alexander Doria (@dorialexander)'s Twitter Profile Photo

"Instructions work better at the top of long context". Not going to repeat this thread but prompt engineers should really get better acquainted with the geometry of LLMs.

"Instructions work better at the top of long context". Not going to repeat this thread but prompt engineers should really get better acquainted with the geometry of LLMs.
Federico Barbero (@fedzbar)'s Twitter Profile Photo

Super excited to be heading to Singapore tomorrow to present our work on RoPE with Alex, Christos Perivolaropoulos, Razvan, Petar Veličković. Christos and I will be presenting on Fri 25 Apr, 7:00–9:30 p.m. PDT, Hall 3 + Hall 2B #242. Happy to meet and catch up :) DMs are open!
