Ben Hoover (@ben_hoov)'s Twitter Profile
Ben Hoover

@ben_hoov

AI Visualization & (re)Interpretability Researcher @IBMResearch @GeorgiaTech

ID: 960964042197217280

Website: http://bhoov.com · Joined: 06-02-2018 19:50:56

194 Tweets

765 Followers

320 Following

Dmitry Krotov (@dimakrotov)

I am super excited to announce the call for papers for the New Frontiers in Associative Memories workshop at ICLR 2025. New architectures and algorithms, memory-augmented LLMs, energy-based models, Hopfield networks, associative memory and diffusion, and many other exciting…

Julia Kempe (@kempelab)

Submit to our New Frontiers in Associative Memories workshop at ICLR 2026. New architectures & algorithms, memory-augmented LLMs, energy-based models, Hopfield networks, associative memory & diffusion… nfam.vizhub.ai openreview.net/group?id=ICLR.… Organizing with Dmitry Krotov et al.

John Hopfield (@hopfieldjohn)

I very much enjoyed reading the papers from the first iteration of this workshop in 2023. If you are working on associative memory, consider submitting your work and participating in this event.

Dmitry Krotov (@dimakrotov)

Now that ICML papers are submitted and we are in the midst of discussions on whether scaling is enough or new architectural/algorithmic ideas are needed, what better time to submit your best work to our workshop on New Frontiers in Associative Memory at ICLR 2026?

Alec Helbling (@alec_helbling)

Gradient descent alone tends to get stuck in local minima. Momentum frames optimization as a ball with mass moving down a hill. By adding inertia, the ball resists settling in small basins, making it more likely to reach a deeper (possibly global) minimum.
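
A minimal sketch of that picture in code, assuming a toy 1-D loss with one shallow and one deep basin; the loss function, learning rate, and momentum coefficient below are illustrative choices, not from the tweet:

```python
def loss_grad(x):
    # Toy non-convex loss f(x) = x^4 - 3x^2 + x, which has a shallow local
    # basin near x ~ 1.12 and a deeper one near x ~ -1.30.
    return 4 * x**3 - 6 * x + 1

def momentum_descent(x0, lr=0.01, beta=0.9, steps=500):
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v - lr * loss_grad(x)  # inertia: old velocity persists
        x = x + v                         # the "ball" can coast through shallow dips
    return x

print(momentum_descent(x0=1.5))  # with enough inertia it can roll past the local basin
```

With beta = 0 the update reduces to plain gradient descent, which settles in whichever basin it starts above; the velocity term is what carries the iterate through small dips.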

Alec Helbling (@alec_helbling)

Introducing ConceptAttention, an approach to interpreting diffusion transformer models! Write a prompt, choose some concepts, generate an image, and get high-quality heatmaps of text concepts. Our method outperforms existing approaches like raw cross-attention maps. Link to demo 👇

Alec Helbling (@alec_helbling)

Create heatmaps that localize text concepts in generated videos. We discovered that our approach, ConceptAttention, can be directly extended from image generation to video generation models! It's amazing how simple techniques often generalize way better than more complex ones.

Alec Helbling (@alec_helbling)

Diffusion models leverage a variety of samplers. Deterministic methods like DDIM produce orderly paths. In contrast, stochastic samplers like DDPM produce chaotic trajectories. Despite their differences, both methods draw valid samples from the underlying distribution.
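
For concreteness, here is a hedged sketch of one reverse step under each sampler, using a standard linear beta schedule and a placeholder noise-prediction network; `eps_model` stands in for a trained model, and nothing here is specific to the animation in the tweet:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)   # common linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x, t):
    # Placeholder for a trained noise-prediction network eps_theta(x_t, t).
    return np.zeros_like(x)

def ddim_step(x, t):
    # Deterministic update (eta = 0): no noise is injected, so repeated
    # steps trace an orderly path. Requires t >= 1.
    eps = eps_model(x, t)
    x0 = (x - np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alpha_bars[t])
    return np.sqrt(alpha_bars[t - 1]) * x0 + np.sqrt(1 - alpha_bars[t - 1]) * eps

def ddpm_step(x, t):
    # Stochastic update: fresh Gaussian noise each step makes the
    # trajectory jittery, yet the marginals still match the target.
    eps = eps_model(x, t)
    mean = (x - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)

x = rng.standard_normal(2)
for t in range(T - 1, 0, -1):
    x = ddim_step(x, t)   # swap in ddpm_step to see the chaotic variant
```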

Alec Helbling (@alec_helbling)

One of the simplest algorithms for sampling from a probability distribution is Random Walk Metropolis-Hastings. It proposes new samples by taking Gaussian-distributed steps, accepting or rejecting them to maintain the target distribution. I call this pdf the "fidget spinner".
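
A compact sketch of the algorithm as described, with a standard 2-D Gaussian standing in for the target (the "fidget spinner" density itself isn't reproduced here):

```python
import numpy as np

def random_walk_metropolis(log_p, x0, step=0.5, n=10_000, seed=0):
    """Random Walk Metropolis-Hastings with a Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n):
        proposal = x + step * rng.standard_normal(x.shape)  # Gaussian step
        # The proposal is symmetric, so the acceptance ratio is p(x') / p(x).
        if np.log(rng.uniform()) < log_p(proposal) - log_p(x):
            x = proposal  # accept; otherwise keep the current point
        samples.append(x.copy())
    return np.array(samples)

# Illustrative target: a standard 2-D Gaussian.
log_p = lambda x: -0.5 * np.sum(x**2)
draws = random_walk_metropolis(log_p, x0=np.zeros(2))
print(draws.mean(axis=0), draws.std(axis=0))  # roughly 0 mean, unit std
```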

Alec Helbling (@alec_helbling)

Hamiltonian Monte Carlo (HMC) frames sampling from a probability distribution as simulating the dynamics of a physical system. Samples are expressed as particles whose trajectories are updated following Hamilton's equations based on the structure of the target distribution.
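
A sketch of a single HMC transition with leapfrog integration of Hamilton's equations, again using a standard Gaussian as an illustrative target; the step size `eps` and path length `L` are arbitrary choices here:

```python
import numpy as np

def hmc_step(x, log_p, log_p_grad, eps=0.1, L=20, rng=np.random.default_rng()):
    """One HMC transition: momentum resampling, leapfrog, Metropolis check."""
    p = rng.standard_normal(x.shape)  # sample momentum (unit mass matrix)
    x_new, p_new = x.copy(), p.copy()
    # Leapfrog integration of Hamilton's equations.
    p_new += 0.5 * eps * log_p_grad(x_new)
    for _ in range(L - 1):
        x_new += eps * p_new
        p_new += eps * log_p_grad(x_new)
    x_new += eps * p_new
    p_new += 0.5 * eps * log_p_grad(x_new)
    # Accept/reject to correct integration error; H = -log p(x) + |p|^2 / 2.
    d_h = (log_p(x_new) - 0.5 * p_new @ p_new) - (log_p(x) - 0.5 * p @ p)
    return x_new if np.log(rng.uniform()) < d_h else x

# Illustrative target: standard Gaussian, so grad log p(x) = -x.
x = np.zeros(2)
for _ in range(1000):
    x = hmc_step(x, log_p=lambda v: -0.5 * v @ v, log_p_grad=lambda v: -v)
```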

Alec Helbling (@alec_helbling)

Random walks aim to explore a distribution through random perturbations, which often wander into low-density regions and waste computation. Hamiltonian MC sails through a distribution much more rapidly by incorporating the distribution's structure into its proposals.

Alec Helbling (@alec_helbling)

Our work ConceptAttention was accepted to ICML 2025 as a Spotlight Poster ("top" 2.6% of submissions)! ConceptAttention creates rich saliency maps of text concepts present in generated images and videos. It requires no additional training, only repurposing existing parameters.

Alec Helbling (@alec_helbling)

Flow Matching aims to learn a "flow" that transforms a simple source distribution (e.g. Gaussian) to an arbitrarily complex target distribution. This video shows the evolution of the marginal probability path as a source distribution is transformed to a target distribution.
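
A sketch of how the regression targets for such a flow are commonly built, assuming linear interpolation paths (the conditional flow matching recipe); the toy target distribution below is illustrative and the neural velocity field itself is omitted:

```python
import numpy as np

# Linear paths: x_t = (1 - t) * x0 + t * x1, with x0 drawn from the source
# (Gaussian) and x1 from the target. The velocity to regress is x1 - x0.

def sample_target(n, rng):
    # Illustrative target: a mixture of two 1-D Gaussians.
    centers = rng.choice([-2.0, 2.0], size=(n, 1))
    return centers + 0.3 * rng.standard_normal((n, 1))

def flow_matching_batch(n=256, rng=np.random.default_rng(0)):
    x0 = rng.standard_normal((n, 1))  # source samples
    x1 = sample_target(n, rng)        # target samples
    t = rng.uniform(size=(n, 1))      # random times in [0, 1]
    xt = (1 - t) * x0 + t * x1        # point on the marginal probability path
    ut = x1 - x0                      # conditional velocity target
    return xt, t, ut                  # train v_theta(xt, t) to match ut

xt, t, ut = flow_matching_batch()
```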

Alec Helbling (@alec_helbling)

I've been putting together an interactive tool called DiffusionLab for explaining the geometric intuition behind diffusion and flow-based generative models. Sampling actually runs in the browser using TensorFlow.js! It is still in the very early stages.

Alec Helbling (@alec_helbling)

I made a tool called Diffusion Explorer that lets you train and visualize simple 2D diffusion and flow models live in the browser. You can draw your own distributions and watch the generated samples converge during training. Try it live 👇

Anthony Peng (@realanthonypeng)

Guardrail models like 🛑 Llama Guard do more than filter: we repurpose them to track how safety risk evolves 📉 through a response. This gives rise to the STAR ⭐ score, a fine-grained signal for finetuning LLMs more safely 🤖🔒

Curious how it works? More in the thread 👇
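
The thread's actual recipe isn't reproduced here, but purely as an illustration of "tracking how risk evolves through a response", a prefix-scoring sketch might look like the following; `risk_of_prefix` is a made-up stand-in for a guardrail classifier, not Llama Guard's real API:

```python
def star_style_scores(response: str, risk_of_prefix) -> list[float]:
    """Score growing prefixes of a response to see where risk rises or falls.

    `risk_of_prefix` is a hypothetical callable mapping text to a risk
    probability in [0, 1], standing in for a guardrail model.
    """
    tokens = response.split()
    prefixes = [" ".join(tokens[: i + 1]) for i in range(len(tokens))]
    scores = [risk_of_prefix(p) for p in prefixes]
    # Per-step deltas give a fine-grained signal along the response.
    return [scores[0]] + [b - a for a, b in zip(scores, scores[1:])]

# Dummy scorer for demonstration only.
print(star_style_scores("step one step two", lambda s: min(1.0, len(s) / 20)))
```
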
Anthony Peng (@realanthonypeng)

🚨 Sharing our new #ACL2025NLP main paper!
🎥 Deploying video VLMs at scale? Inference compute is your bottleneck.

We study how to optimally allocate inference FLOPs across LLM size, frame count, and visual tokens.
💡 Large-scale training sweeps (~100k A100 hrs)
📊 Parametric
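
As a rough illustration of why this allocation matters, the common estimate of about 2 x parameters x tokens FLOPs per forward pass shows how LLM size, frame count, and visual tokens trade off; all numbers below are illustrative, and this is not the paper's cost model:

```python
def inference_tflops(params_b: float, frames: int, tokens_per_frame: int,
                     text_tokens: int = 100) -> float:
    """Very rough forward-pass cost: ~2 * parameters * total tokens."""
    total_tokens = frames * tokens_per_frame + text_tokens
    return 2 * params_b * 1e9 * total_tokens / 1e12

# Comparable budgets, different allocations: bigger LLM vs. more frames.
print(inference_tflops(params_b=7, frames=8, tokens_per_frame=256))   # ~30.1 TFLOPs
print(inference_tflops(params_b=1, frames=64, tokens_per_frame=256))  # ~33.0 TFLOPs
```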