Alessio Devoto (@devoto_alessio)'s Twitter Profile
Alessio Devoto

@devoto_alessio

PhD in Data Science at @SapienzaRoma | Researching Efficient ML/AI ☘️ | Visiting @EdinburghNLP | alessiodevoto.github.io | Also on 🦋

ID: 1496187730463760388

Link: https://alessiodevoto.github.io/ · Joined: 22-02-2022 18:19:15

165 Tweets

457 Followers

508 Following

Sonia (@soniajoseph_)'s Twitter Profile Photo

We visualized the features of 16 SAEs trained on CLIP in a collaboration between Fraunhofer HHI and Mila - Institut québécois d'IA! Search thousands of interpretable CLIP features in our vision atlas, with autointerp labels and scores like clarity and polysemanticity. Some fun features in the thread:
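
For the mechanics behind such an atlas: a sparse autoencoder learns an overcomplete dictionary of directions over a model's activations. A minimal sketch over stand-in CLIP activations (the dimensions, layer choice, and L1 coefficient are illustrative assumptions, not the project's actual setup):

```python
# Minimal sparse autoencoder (SAE) sketch over stand-in CLIP activations.
# Dimensions and the L1 coefficient are illustrative assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)  # activations -> sparse codes
        self.decoder = nn.Linear(d_dict, d_model)  # codes -> reconstruction

    def forward(self, x):
        codes = torch.relu(self.encoder(x))  # ReLU keeps codes non-negative and sparse
        return self.decoder(codes), codes

sae = SparseAutoencoder(d_model=768, d_dict=16384)  # 768 ~ ViT-B CLIP width (assumption)
acts = torch.randn(32, 768)  # stand-in for CLIP activations
recon, codes = sae(acts)
loss = (recon - acts).pow(2).mean() + 1e-3 * codes.abs().mean()  # reconstruction + L1 sparsity
```

Each decoder column is then a candidate feature direction to which autointerp labels and scores like clarity or polysemanticity can be attached.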

Pasquale Minervini is hiring postdocs! 🚀 (@pminervini)'s Twitter Profile Photo

My amazing collaborators will present several works at ICLR and NAACL later this month -- please catch up with them if you're attending! I tried to summarise our recent work in a blog post: neuralnoise.com/2025/march-res…

Hongru Wang (@wangcarrey)'s Twitter Profile Photo

🎉 Thrilled to share our TWO #NAACL2025 oral papers! 👇 Welcome to catch me and talk about anything!

1️⃣ Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
📅 30 Apr • 11:30–11:45 AM • Ballroom C
TLDR: A general representation learning
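
The abstract is cut off above; for the steering side, one common recipe (a sketch of the general idea, not necessarily the paper's exact method) is to add a scaled SAE decoder direction to the residual stream:

```python
import torch

def steer_hidden_state(h, W_dec, feature_idx, alpha=4.0):
    """Add a scaled SAE feature direction to residual-stream activations.

    h: (batch, d_model) hidden states; W_dec: (d_model, d_dict) SAE decoder.
    feature_idx and alpha are hypothetical choices, not values from the paper.
    """
    direction = W_dec[:, feature_idx]
    direction = direction / direction.norm()  # unit-norm feature direction
    return h + alpha * direction              # broadcasts over the batch

h = torch.randn(2, 4096)          # stand-in hidden states
W_dec = torch.randn(4096, 65536)  # stand-in decoder matrix
h_steered = steer_hidden_state(h, W_dec, feature_idx=123)
```
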
Yu Zhao (@yuzhaouoe)'s Twitter Profile Photo

NAACL 2025 Oral Presentation💥 Our work on using sparse autoencoders to resolve knowledge conflicts will be presented on 30 Apr, 11:30–11:45 AM, in Ballroom C. Thanks to Hongru for presenting our work!!!

Ne Luo (seeking PhD opportunities) (@neluo19)'s Twitter Profile Photo

Hi! I will be attending #NAACL2025 and presenting our paper on self-training for tool-use today, an extension of my MSc dissertation at EdinburghNLP, supervised by Pasquale Minervini.

Time: 14:00-15:30
Location: Hall 3

Let’s chat and connect!😊
Aryo Pradipta Gema (@aryopg)'s Twitter Profile Photo

MMLU-Redux just touched down at #NAACL2025! 🎉 
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope 😅
If anyone's swinging by, give our research some love! Hit me up if you check it out! 👋
Alberto Carlo Maria Mancino (@alberto_mancino)'s Twitter Profile Photo

Are you ready to play with us?🎲 

Our tutorial D&D4Rec, short for "Standard Practices for Data Processing and Multimodal Feature Extraction in Recommendation with DataRec and Ducho", has been accepted at #RecSys2025 (ACM RecSys) 🥳🥳

More details in the thread 🧵👇
Jary Pomponi (@jarypom)'s Twitter Profile Photo

A new paper is out! In collaboration with Alessio Devoto and Simone Scardapane, we tackle catastrophic forgetting in class-incremental learning via Probability Dampening (self-scaling logit margins) and a Cascaded Gated Classifier (sigmoid-gated mini-heads per task).
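
The tweet gives only one-line glosses; here is a rough sketch of what "sigmoid-gated mini-heads per task" could look like (an illustration under those glosses, not the paper's code):

```python
import torch
import torch.nn as nn

class GatedTaskHead(nn.Module):
    """One mini-head per task; a sigmoid gate decides how much it contributes."""
    def __init__(self, d_feat: int, n_classes: int):
        super().__init__()
        self.gate = nn.Linear(d_feat, 1)          # scalar gate per example
        self.head = nn.Linear(d_feat, n_classes)  # task-specific classifier

    def forward(self, z):
        return torch.sigmoid(self.gate(z)) * self.head(z)

# One head per task seen so far; logits are concatenated across tasks.
heads = nn.ModuleList([GatedTaskHead(512, 10) for _ in range(3)])
z = torch.randn(8, 512)  # stand-in backbone features
logits = torch.cat([head(z) for head in heads], dim=-1)  # shape (8, 30)
```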

Simone Scardapane (@s_scardapane)'s Twitter Profile Photo

*attention is logarithmic, actually*
by spike

Short & nice blog post on the difference between time complexity and work-depth complexity and how it applies to many neural network operations (e.g., attention).

supaiku.com/attention-is-l…
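
The post's core distinction in miniature: summing n numbers costs O(n) work but only O(log n) depth, because all pairwise adds at one level are independent and can run in parallel. A toy sketch of that counting:

```python
# Work vs. depth on a toy reduction: O(n) total additions, O(log n) parallel depth.
def tree_sum(xs):
    depth = 0
    while len(xs) > 1:
        # All pairwise adds at one level are independent -> one parallel step.
        paired = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:          # odd element carries over to the next level
            paired.append(xs[-1])
        xs = paired
        depth += 1
    return xs[0], depth

print(tree_sum(list(range(16))))  # (120, 4): 16 numbers reduce in log2(16) = 4 levels
```

Attention's row-wise softmax reductions parallelize the same way, which is the sense in which its depth can be logarithmic even though its total work is quadratic.
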
Simone Scardapane (@s_scardapane)'s Twitter Profile Photo

Happy to share I just started as associate professor at Sapienza Università di Roma! I have now reached my perfect thermodynamical equilibrium. 😄

Also, ChatGPT's idea of me is infinitely cooler, so I'll leave it here to trick people into giving me money.
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Resa: Transparent Reasoning Models via SAEs

"Specifically, SAE-Tuning involves two key stages: First, we use an SAE to probe the internal activations of a source model, identifying and extracting a dictionary of latent features that correspond to its reasoning processes. Second,
Sebastian Raschka (@rasbt)'s Twitter Profile Photo

Feels good to be back coding! Just picked a fun one from my “someday” side project list and finally added a KV cache to the LLMs From Scratch repo: github.com/rasbt/LLMs-fro…
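
For context (a generic sketch, not the repo's implementation): a KV cache stores each layer's past keys and values, so a decode step only computes projections for the newest token and attends over everything cached:

```python
import torch

class KVCache:
    """Append-only per-layer cache of key/value tensors for autoregressive decoding."""
    def __init__(self):
        self.k = None  # (batch, heads, seq, head_dim)
        self.v = None

    def update(self, k_new, v_new):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

cache = KVCache()
for step in range(3):                  # one new token per decode step
    k_new = torch.randn(1, 8, 1, 64)   # stand-in projections for the new token
    v_new = torch.randn(1, 8, 1, 64)
    k, v = cache.update(k_new, v_new)  # attention then runs over all cached positions
print(k.shape)  # torch.Size([1, 8, 3, 64])
```
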
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

"we propose LongLLaDA, a training-free method that integrates LLaDA with  the NTK-based RoPE extrapolation. Our results validate that established  extrapolation scaling laws remain effective for extending the
Unsloth AI (@unslothai)'s Twitter Profile Photo

We made a Guide on mastering LoRA Hyperparameters, so you can learn to fine-tune LLMs correctly!

Learn to:
• Train smarter models with fewer hallucinations
• Choose optimal: learning rates, epochs, LoRA rank, alpha
• Avoid overfitting & underfitting

🔗docs.unsloth.ai/get-started/fi…
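
The knobs the guide lists map onto a standard Hugging Face PEFT configuration; the values below are common starting points, not Unsloth's recommendations:

```python
# A typical LoRA setup with Hugging Face PEFT; values are common starting
# points, not Unsloth's recommended hyperparameters.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # small stand-in model
config = LoraConfig(
    r=16,                       # rank of the low-rank update
    lora_alpha=16,              # scaling: effective update is (alpha / r) * BA
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused QKV projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```
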
Simone Scardapane (@s_scardapane)'s Twitter Profile Photo

Twitter friends, here's some draft notes for my upcoming course on automatic differentiation, mostly based on the "Elements of differentiable programming" book. Let me know what you think! They also include a notebook on operator overloading. 🙃

notion.so/sscardapane/Au…
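
A taste of what an operator-overloading autodiff notebook typically covers (a minimal sketch, not the course's actual notebook): forward-mode differentiation with dual numbers, where each overloaded op propagates a (value, derivative) pair:

```python
# Forward-mode autodiff via operator overloading with dual numbers:
# each value carries (primal, tangent) and overloaded ops propagate both.
class Dual:
    def __init__(self, val, grad=0.0):
        self.val, self.grad = val, grad

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.grad + other.grad)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.grad * other.val + self.val * other.grad)  # product rule

    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1   # f'(x) = 6x + 2

x = Dual(2.0, 1.0)                 # seed tangent = 1 to get df/dx
y = f(x)
print(y.val, y.grad)               # 17.0 14.0
```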