Prafull Sharma (@prafull7)'s Twitter Profile
Prafull Sharma

@prafull7

PostDoc @MIT with Josh Tenenbaum and Phillip Isola // PhD @MIT with Bill Freeman and Fredo Durand // BS @Stanford

ID: 188711985

Link: http://prafullsharma.net · Joined: 09-09-2010 12:12:26

346 Tweets

1.1K Followers

755 Following

Jeremy Bernstein (@jxbz) 's Twitter Profile Photo

New paper and pip package:
modula: "Scalable Optimization in the Modular Norm"

📦 github.com/jxbz/modula
📝 arxiv.org/abs/2405.14813

We rewrote the @pytorch module tree so that training automatically scales across width and depth.
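
For context on what "training automatically scales across width and depth" can mean in practice, here is a minimal sketch under my own assumptions; it is not the modula API. It normalizes each weight matrix's update and rescales it by a width-dependent factor so step sizes stay comparable as layers get wider; the exact scaling rule and hyperparameters are illustrative placeholders.

```python
# Illustrative sketch only (not the modula package): normalize each weight
# matrix's gradient and rescale by an assumed width-aware factor so the size
# of the induced update stays comparable as layers get wider.
import torch
import torch.nn as nn

def width_scaled_sgd_step(model: nn.Module, lr: float = 0.1) -> None:
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None or p.ndim != 2:
                continue  # toy example: only handle 2-D weight matrices
            fan_out, fan_in = p.shape
            scale = (fan_out / fan_in) ** 0.5             # assumed width-aware scale
            direction = p.grad / (p.grad.norm() + 1e-12)  # normalized update direction
            p.add_(direction, alpha=-lr * scale)
```
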
Ta-Ying Cheng (@chengtim0708) 's Twitter Profile Photo

Thrilled to share that ZeST has been accepted to #ECCV2024!! A huge thanks to my collaborators/mentors Prafull Sharma, Varun Jampani, and my supervisors Niki Trigoni and Andrew Markham for the amazing support!

Boyuan Chen (@boyuanchen0) 's Twitter Profile Photo

Introducing Diffusion Forcing, which unifies next-token prediction (e.g., LLMs) and full-sequence diffusion (e.g., Sora)! It offers improved performance & new sampling strategies in vision and robotics, such as stable, infinite video generation, better diffusion planning, and more! (1/8)
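
A rough sketch of the core idea as I read it from this thread (assumptions on my part, not the paper's code): each frame in a training sequence gets its own independently sampled noise level, so one model can act as a next-token predictor (clean past, noisy future) or as a full-sequence diffusion model (one shared level). The linear noise schedule below is a placeholder.

```python
# Hedged sketch of per-frame independent noise levels (assumed linear schedule).
import torch

def noise_sequence(frames: torch.Tensor, num_levels: int = 1000):
    """frames: (batch, time, dim) -> (noised frames, per-frame levels, noise)."""
    b, t, _ = frames.shape
    levels = torch.randint(0, num_levels, (b, t))   # independent level per frame
    alpha = 1.0 - levels.float() / num_levels       # toy linear schedule
    alpha = alpha[..., None]                        # broadcast over feature dim
    noise = torch.randn_like(frames)
    noised = alpha.sqrt() * frames + (1.0 - alpha).sqrt() * noise
    return noised, levels, noise
```

Setting all levels equal would recover ordinary sequence diffusion, while keeping past frames at level 0 and future frames fully noised mimics next-frame prediction.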

Google AI (@googleai) 's Twitter Profile Photo

Presenting a novel approach that harnesses generative text-to-image models to enable users to precisely edit specific material properties (like roughness and transparency) of objects in images while retaining their original shape. Learn more → goo.gle/4deVgj5

AK (@_akhaliq) 's Twitter Profile Photo

Sakana AI announces The AI Scientist

Towards Fully Automated Open-Ended Scientific Discovery

discuss: huggingface.co/papers/2408.06…

One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new
MIT CSAIL Alliances (@csail_alliances) 's Twitter Profile Photo

MIT CSAIL PhD student Marianne Rakic's most recent project, Tyche, is a medical image segmentation model that aims to generalize to new tasks & capture uncertainty in medical images. Learn more about Marianne and her recent projects: bit.ly/4d6X39t

Shobhita Sundaram (@shobsund) 's Twitter Profile Photo

What happens when models see the world as humans do?

In our #NeurIPS2024 paper we show that aligning to human perceptual preferences can *improve* general-purpose representations!

📝: arxiv.org/abs/2410.10817
🌐: percep-align.github.io
💻: github.com/ssundaram21/dr…

(1/n)
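
For a concrete sense of what "aligning to human perceptual preferences" can look like, here is a minimal sketch under my own assumptions (not the paper's released code): fine-tune a vision backbone on two-alternative forced-choice triplets so that its embedding distances agree with human judgments of which image is closer to a reference.

```python
# Hedged sketch: 2AFC alignment of a vision backbone to human similarity judgments.
import torch
import torch.nn.functional as F

def twoafc_loss(backbone, ref, img_a, img_b, human_choice):
    """human_choice[i] = 0 if annotators judged img_a closer to ref, else 1."""
    e_ref, e_a, e_b = backbone(ref), backbone(img_a), backbone(img_b)
    d_a = 1.0 - F.cosine_similarity(e_ref, e_a)   # perceptual distance to A
    d_b = 1.0 - F.cosine_similarity(e_ref, e_b)   # perceptual distance to B
    logits = torch.stack([-d_a, -d_b], dim=-1)    # closer image -> higher logit
    return F.cross_entropy(logits, human_choice)
```
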
Josh McDermott (@joshhmcdermott) 's Twitter Profile Photo

We just wrote a primer on how the physics of sound constrains auditory perception:

authors.elsevier.com/a/1jzSR3QW8S6E…

Covers sound propagation and object interactions, and touches on their relevance to music and film.

I enjoyed working on this with Vin Agarwal and James Traer.
Vin Agarwal (@vin_agarwal) 's Twitter Profile Photo

Had a lot of fun working on this. Stay tuned for more research on how human listeners reverse engineer the physics of the world using the sounds they hear

Shivam Duggal (@shivamduggal4) 's Twitter Profile Photo

Current vision systems use fixed-length representations for all images. In contrast, human intelligence or LLMs (e.g., OpenAI o1) adjust compute budgets based on the input. Since different images demand different processing & memory, how can we enable vision systems to be adaptive? 🧵

Phillip Isola (@phillip_isola) 's Twitter Profile Photo

As a kid I was fascinated by the Search for Extraterrestrial Intelligence (SETI). Now we live in an era when it's becoming meaningful to search for "extraterrestrial life" not just in our universe but in simulated universes as well. This project provides new tools toward that dream:

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper

Vincent Sitzmann (@vincesitzmann) 's Twitter Profile Photo

We wrote a new video diffusion paper! Kiwhan Song and Boyuan Chen and co-authors did absolutely amazing work here. Apart from really working, the method of "variable-length history guidance" is really cool and based on some deep truths about sequence generative modeling....

Jeremy Bernstein (@jxbz) 's Twitter Profile Photo

I just wrote my first blog post in four years! It is called "Deriving Muon". It covers the theory that led to Muon and how, for me, Muon is a meaningful example of theory leading practice in deep learning

(1/11)
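
For readers who have not seen Muon, here is a simplified sketch of the update as described in public implementations (not code from the blog post): keep ordinary momentum per weight matrix, but replace the momentum matrix with an approximate orthogonalization, roughly the U Vᵀ factor of its SVD, computed with a Newton-Schulz iteration before stepping. The coefficients and learning rate below follow commonly published values and should be treated as assumptions.

```python
# Simplified Muon-style update (sketch; coefficients from public implementations).
import torch

def orthogonalize(m: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Newton-Schulz iteration approximating U @ V.T for the matrix m."""
    x = m / (m.norm() + 1e-7)
    a, b, c = 3.4445, -4.7750, 2.0315
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x

def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
    momentum.mul_(beta).add_(grad)                    # standard momentum buffer
    weight.add_(orthogonalize(momentum), alpha=-lr)   # orthogonalized step
```
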
Shaden (@sa_9810) 's Twitter Profile Photo

Excited to share our ICLR 2025 paper, I-Con, a unifying framework that ties together 23 methods across representation learning, from self-supervised learning to dimensionality reduction and clustering.

Website: aka.ms/i-con

A thread 🧵 1/n
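
My loose paraphrase of the unifying idea, sketched under assumptions rather than taken from the paper: many of these methods minimize a divergence between a "supervisory" neighbor distribution over data points and a learned neighbor distribution induced by embedding similarities; different choices of the two distributions recover different methods.

```python
# Loose sketch (assumed form, not the paper's code): cross-entropy between a
# given neighbor distribution p(j|i) and a learned q(j|i) from embedding
# similarities; this equals KL(p || q) up to the constant entropy of p.
import torch
import torch.nn.functional as F

def neighbor_matching_loss(embeddings: torch.Tensor, p_neighbors: torch.Tensor,
                           tau: float = 0.1) -> torch.Tensor:
    """embeddings: (n, d); p_neighbors: (n, n) with rows summing to 1."""
    sims = embeddings @ embeddings.T / tau
    sims = sims - 1e9 * torch.eye(len(sims), device=sims.device)  # drop self-pairs
    log_q = F.log_softmax(sims, dim=-1)           # learned neighbor distribution
    return -(p_neighbors * log_q).sum(dim=-1).mean()
```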

Akarsh Kumar (@akarshkumar0101) 's Twitter Profile Photo

Excited to share our position paper on the Fractured Entangled Representation (FER) Hypothesis! We hypothesize that the standard paradigm of training networks today — while producing impressive benchmark results — is still failing to create a well-organized internal

Hyojin Bahng (@hyojinbahng) 's Twitter Profile Photo

Image-text alignment is hard — especially as multimodal data gets more detailed. Most methods rely on human labels or proprietary feedback (e.g., GPT-4V).

We introduce:
1. CycleReward: a new alignment metric focused on detailed captions, trained without human supervision.
2.