Prafull Sharma (@prafull7)'s Twitter Profile
Prafull Sharma

@prafull7

PostDoc @MIT with Josh Tenenbaum and Phillip Isola // PhD @MIT with Bill Freeman and Fredo Durand // BS @Stanford

ID: 188711985

Link: http://prafullsharma.net · Joined: 09-09-2010 12:12:26

346 Tweets

1.1K Followers

755 Following

Jeremy Bernstein (@jxbz) 's Twitter Profile Photo

New paper and pip package:
modula: "Scalable Optimization in the Modular Norm"

📦 github.com/jxbz/modula
📝 arxiv.org/abs/2405.14813

We rewrote the @pytorch module tree so that training automatically scales across width and depth.
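
For context on what "training automatically scales across width and depth" can mean in practice, here is a minimal sketch under my own assumptions; it is not the modula API. It normalizes each weight matrix's update and rescales it by a width-dependent factor so step sizes stay comparable as layers get wider; the exact scaling rule and hyperparameters are illustrative placeholders.

```python
# Illustrative sketch only (not the modula package): normalize each weight
# matrix's gradient and rescale by an assumed width-aware factor so the size
# of the induced update stays comparable as layers get wider.
import torch
import torch.nn as nn

def width_scaled_sgd_step(model: nn.Module, lr: float = 0.1) -> None:
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None or p.ndim != 2:
                continue  # toy example: only handle 2-D weight matrices
            fan_out, fan_in = p.shape
            scale = (fan_out / fan_in) ** 0.5             # assumed width-aware scale
            direction = p.grad / (p.grad.norm() + 1e-12)  # normalized update direction
            p.add_(direction, alpha=-lr * scale)
```
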
Ta-Ying Cheng (@chengtim0708) 's Twitter Profile Photo

Thrilled to share that ZeST has been accepted to #ECCV2024!! A huge thanks to my collaborators/mentors Prafull Sharma, Varun Jampani, and my supervisors Niki Trigoni and Andrew Markham for the amazing support!

Boyuan Chen (@boyuanchen0) 's Twitter Profile Photo

Introducing Diffusion Forcing, which unifies next-token prediction (e.g., LLMs) and full-sequence diffusion (e.g., Sora)! It offers improved performance & new sampling strategies in vision and robotics, such as stable, infinite video generation, better diffusion planning, and more! (1/8)
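
A rough sketch of the core idea as I read it from this thread (assumptions on my part, not the paper's code): each frame in a training sequence gets its own independently sampled noise level, so one model can act as a next-token predictor (clean past, noisy future) or as a full-sequence diffusion model (one shared level). The linear noise schedule below is a placeholder.

```python
# Hedged sketch of per-frame independent noise levels (assumed linear schedule).
import torch

def noise_sequence(frames: torch.Tensor, num_levels: int = 1000):
    """frames: (batch, time, dim) -> (noised frames, per-frame levels, noise)."""
    b, t, _ = frames.shape
    levels = torch.randint(0, num_levels, (b, t))   # independent level per frame
    alpha = 1.0 - levels.float() / num_levels       # toy linear schedule
    alpha = alpha[..., None]                        # broadcast over feature dim
    noise = torch.randn_like(frames)
    noised = alpha.sqrt() * frames + (1.0 - alpha).sqrt() * noise
    return noised, levels, noise
```

Setting all levels equal would recover ordinary sequence diffusion, while keeping past frames at level 0 and future frames fully noised mimics next-frame prediction.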

Google AI (@googleai) 's Twitter Profile Photo

Presenting a novel approach that harnesses generative text-to-image models to enable users to precisely edit specific material properties (like roughness and transparency) of objects in images while retaining their original shape. Learn more → goo.gle/4deVgj5

AK (@_akhaliq) 's Twitter Profile Photo

Sakana AI announces The AI Scientist

Towards Fully Automated Open-Ended Scientific Discovery

discuss: huggingface.co/papers/2408.06…

One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new
MIT CSAIL Alliances (@csail_alliances) 's Twitter Profile Photo

MIT CSAIL PhD student Marianne Rakic's most recent project, Tyche, is a medical image segmentation model that aims to generalize to new tasks & capture uncertainty in medical images. Learn more about Marianne and her recent projects: bit.ly/4d6X39t

Shobhita Sundaram (@shobsund) 's Twitter Profile Photo

What happens when models see the world as humans do?

In our #NeurIPS2024 paper we show that aligning to human perceptual preferences can *improve* general-purpose representations!

📝: arxiv.org/abs/2410.10817
🌐: percep-align.github.io
💻: github.com/ssundaram21/dr…

(1/n)
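
For a concrete sense of what "aligning to human perceptual preferences" can look like, here is a minimal sketch under my own assumptions (not the paper's released code): fine-tune a vision backbone on two-alternative forced-choice triplets so that its embedding distances agree with human judgments of which image is closer to a reference.

```python
# Hedged sketch: 2AFC alignment of a vision backbone to human similarity judgments.
import torch
import torch.nn.functional as F

def twoafc_loss(backbone, ref, img_a, img_b, human_choice):
    """human_choice[i] = 0 if annotators judged img_a closer to ref, else 1."""
    e_ref, e_a, e_b = backbone(ref), backbone(img_a), backbone(img_b)
    d_a = 1.0 - F.cosine_similarity(e_ref, e_a)   # perceptual distance to A
    d_b = 1.0 - F.cosine_similarity(e_ref, e_b)   # perceptual distance to B
    logits = torch.stack([-d_a, -d_b], dim=-1)    # closer image -> higher logit
    return F.cross_entropy(logits, human_choice)
```
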
Josh McDermott (@joshhmcdermott) 's Twitter Profile Photo

We just wrote a primer on how the physics of sound constrains auditory perception:

authors.elsevier.com/a/1jzSR3QW8S6E…

Covers sound propagation and object interactions, and touches on their relevance to music and film.

I enjoyed working on this with Vin Agarwal and James Traer.
Vin Agarwal (@vin_agarwal) 's Twitter Profile Photo

Had a lot of fun working on this. Stay tuned for more research on how human listeners reverse engineer the physics of the world using the sounds they hear

Shivam Duggal (@shivamduggal4) 's Twitter Profile Photo

Current vision systems use fixed-length representations for all images. In contrast, human intelligence or LLMs (e.g., OpenAI o1) adjust compute budgets based on the input. Since different images demand different processing & memory, how can we enable vision systems to be adaptive? 🧵

Phillip Isola (@phillip_isola) 's Twitter Profile Photo

As a kid I was fascinated by the Search for Extraterrestrial Intelligence (SETI). Now we live in an era when it's becoming meaningful to search for "extraterrestrial life" not just in our universe but in simulated universes as well. This project provides new tools toward that dream:

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper

Vincent Sitzmann (@vincesitzmann) 's Twitter Profile Photo

We wrote a new video diffusion paper! Kiwhan Song and Boyuan Chen and co-authors did absolutely amazing work here. Apart from really working, the method of "variable-length history guidance" is really cool and based on some deep truths about sequence generative modeling....

Jeremy Bernstein (@jxbz) 's Twitter Profile Photo

I just wrote my first blog post in four years! It is called "Deriving Muon". It covers the theory that led to Muon and how, for me, Muon is a meaningful example of theory leading practice in deep learning

(1/11)
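
For readers who have not seen Muon, here is a simplified sketch of the update as described in public implementations (not code from the blog post): keep ordinary momentum per weight matrix, but replace the momentum matrix with an approximate orthogonalization, roughly the U Vᵀ factor of its SVD, computed with a Newton-Schulz iteration before stepping. The coefficients and learning rate below follow commonly published values and should be treated as assumptions.

```python
# Simplified Muon-style update (sketch; coefficients from public implementations).
import torch

def orthogonalize(m: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Newton-Schulz iteration approximating U @ V.T for the matrix m."""
    x = m / (m.norm() + 1e-7)
    a, b, c = 3.4445, -4.7750, 2.0315
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x

def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
    momentum.mul_(beta).add_(grad)                    # standard momentum buffer
    weight.add_(orthogonalize(momentum), alpha=-lr)   # orthogonalized step
```
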
Shaden (@sa_9810) 's Twitter Profile Photo

Excited to share our ICLR 2025 paper, I-Con, a unifying framework that ties together 23 methods across representation learning, from self-supervised learning to dimensionality reduction and clustering.

Website: aka.ms/i-con

A thread 🧵 1/n
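
My loose paraphrase of the unifying idea, sketched under assumptions rather than taken from the paper: many of these methods minimize a divergence between a "supervisory" neighbor distribution over data points and a learned neighbor distribution induced by embedding similarities; different choices of the two distributions recover different methods.

```python
# Loose sketch (assumed form, not the paper's code): cross-entropy between a
# given neighbor distribution p(j|i) and a learned q(j|i) from embedding
# similarities; this equals KL(p || q) up to the constant entropy of p.
import torch
import torch.nn.functional as F

def neighbor_matching_loss(embeddings: torch.Tensor, p_neighbors: torch.Tensor,
                           tau: float = 0.1) -> torch.Tensor:
    """embeddings: (n, d); p_neighbors: (n, n) with rows summing to 1."""
    sims = embeddings @ embeddings.T / tau
    sims = sims - 1e9 * torch.eye(len(sims), device=sims.device)  # drop self-pairs
    log_q = F.log_softmax(sims, dim=-1)           # learned neighbor distribution
    return -(p_neighbors * log_q).sum(dim=-1).mean()
```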

Akarsh Kumar (@akarshkumar0101) 's Twitter Profile Photo

Excited to share our position paper on the Fractured Entangled Representation (FER) Hypothesis! We hypothesize that the standard paradigm of training networks today — while producing impressive benchmark results — is still failing to create a well-organized internal

Hyojin Bahng (@hyojinbahng) 's Twitter Profile Photo

Image-text alignment is hard — especially as multimodal data gets more detailed. Most methods rely on human labels or proprietary feedback (e.g., GPT-4V).

We introduce:
1. CycleReward: a new alignment metric focused on detailed captions, trained without human supervision.
2.