
Akshat Shrivastava
@akshats07
Co-founder & CTO @perceptroninc; ex Research Scientist @MetaAI (FAIR, AR, Assistant)
ID: 1932483559
http://akshatsh.github.io
Joined: 04-10-2013 00:00:34
138 Tweets
736 Followers
314 Following

Physical world modeling introduces a set of challenges around designing the right interaction space for our model and building the right/scalable data strategy. Reach out to [email protected] if you're interested!

I’m very excited to announce that I’ll be joining Perceptron AI (perceptron.inc?) as a researcher and founding member of the technical staff. I’ll be working with Akshat Shrivastava and Armen Aghajanyan to create the world’s first visual language foundation models specifically …



MoEs have been a key driver in improving performance for LLMs when memory is abundant, but what happens when we get to resource-constrained devices? Check out our latest work led by Patrick Huber exploring design decisions in making MoEs optimal for on-device deployment!
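
For readers unfamiliar with the mechanism, here is a minimal sketch of a top-1 routed MoE layer in PyTorch. It is illustrative only, not the architecture from the paper; the name TinyMoE and all sizes are assumptions. The property that matters on-device: each token touches only its selected expert's weights, so inactive experts could stay out of fast memory.

```python
# Minimal top-1 routed MoE FFN -- an illustrative sketch, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)  # (n_tokens, n_experts)
        top_p, top_i = gate.max(dim=-1)           # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_i == e
            if mask.any():
                # Only the chosen expert's weights touch these tokens, so an
                # on-device runtime could page inactive experts out of memory.
                out[mask] = top_p[mask, None] * expert(x[mask])
        return out
```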

Bringing Efficiency to LLMs with Fine-Tuning
LayerSkip, introduced in the 2024 paper by Mostafa Elhoushi et al. (arXiv:2404.16710), is a brilliant technique to accelerate large language model (LLM) inference without compromising accuracy. By training models with layer dropout and …
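
A minimal sketch of the layer-dropout half of that recipe (the paper pairs it with an early-exit loss and self-speculative decoding at inference). The linear depth-dependent schedule and the max_drop value below are illustrative assumptions, not the paper's exact settings:

```python
# Stack of transformer blocks with depth-increasing layer dropout (sketch).
import torch
import torch.nn as nn

class LayerDropStack(nn.Module):
    def __init__(self, layers: nn.ModuleList, max_drop: float = 0.2):
        super().__init__()
        self.layers = layers
        self.max_drop = max_drop

    def forward(self, x):
        n = len(self.layers)
        for i, layer in enumerate(self.layers):
            p_drop = self.max_drop * i / max(n - 1, 1)  # later layers drop more
            if self.training and torch.rand(()).item() < p_drop:
                continue  # skip the block; the residual stream passes through
            x = layer(x)
        return x
```

Training under this schedule makes early layers' representations decodable on their own, which is what lets an early-exit head leave the rest of the stack unexecuted at inference.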

fun debugging journey w/ Akshat Shrivastava: be careful around FP8 w/ activation checkpointing. activation checkpointing works under the assumption that different calls of forward give similar results, which we move away from the more we quantize. when you re-quantize in activation …
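
A toy repro of that failure mode, using a stateful fake-quantizer as a stand-in for FP8 delayed scaling (every name here is hypothetical; this is not the actual FP8 code path). Checkpointing reruns forward during backward, and by then the quantizer's state has moved, so weight grads are computed from a different activation than the one the loss saw:

```python
import torch
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

class StatefulFakeQuant(torch.nn.Module):
    """Fake-quantize with a stateful scale: each forward updates the amax
    used by the *next* forward, mimicking delayed-scaling FP8."""
    def __init__(self):
        super().__init__()
        self.register_buffer("amax", torch.tensor(1.0))

    def forward(self, x):
        scale = 127.0 / self.amax
        # straight-through estimator so gradients flow through the rounding
        q = (torch.round(x * scale) / scale - x).detach() + x
        self.amax.copy_(x.abs().max().detach())  # state changes between calls
        return q

quant, lin = StatefulFakeQuant(), torch.nn.Linear(8, 8)
x = torch.randn(4, 8)

# Checkpointed path: forward runs now, then AGAIN during backward -- by
# then amax has changed, so the recomputed activation differs from the
# original one, and lin's weight grad is taken against the wrong tensor.
y = checkpoint(lambda t: lin(quant(t)), x, use_reentrant=False)
y.sum().backward()
g_ckpt = lin.weight.grad.clone()

lin.weight.grad = None
quant.amax.fill_(1.0)  # reset state; plain path saves q instead of recomputing
lin(quant(x)).sum().backward()
print(torch.allclose(g_ckpt, lin.weight.grad))  # typically False
```

Under these assumptions, one way out is to snapshot the quantizer's scale so both forward passes quantize on the same grid, or to save the quantized activations instead of recomputing them.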



When Maciej Kilian and I first started talking about alignment and parameterization, he introduced several ideas presented in this blog post. As we continue to scale foundation models (esp multimodal), and with data-aware, scale-aware parameterization becoming more prevalent, …
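
As one concrete flavor of "scale-aware parameterization", here is a hedged sketch in the muP spirit: matrix-shaped ("hidden") weights get their learning rate scaled down by the width multiplier so feature updates stay O(1) as the model widens, while vector parameters keep the base rate. The helper mup_param_groups is hypothetical and may differ from the scheme in the blog post:

```python
import torch

def mup_param_groups(model: torch.nn.Module, d_base: int, d: int, lr: float):
    """muP-style optimizer groups: hidden weights get lr / width-multiplier."""
    m = d / d_base  # width multiplier vs. a small proxy model
    hidden = [p for p in model.parameters() if p.ndim >= 2]
    vectors = [p for p in model.parameters() if p.ndim < 2]
    return [
        {"params": hidden, "lr": lr / m},  # keeps feature updates O(1) in width
        {"params": vectors, "lr": lr},     # biases / norm scales: unscaled
    ]

# usage (hypothetical sizes): tune lr on a d=256 proxy, reuse at d=4096
# opt = torch.optim.AdamW(mup_param_groups(model, d_base=256, d=4096, lr=3e-4))
```

The practical payoff of this kind of parameterization is hyperparameter transfer: sweep on a small proxy model, then reuse the result at scale.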




I'm excited to be at ICML this week :-) Perceptron AI is co-sponsoring the Assessing World Models workshop this Friday. Come see some great talks from Jacob Andreas, Naomi Saphra, and more; topics include mechanistic interpretability, intuitive physics, and LLMs for generating scientific …