Ivan Skorokhodov (@isskoro) Twitter Tweets • TwiCopy

If you tried Gen-3/MovieGen, you might have realized that it's close to impossible to create any meaningful long video with them, where the subjects would do a sequence of complex, non-trivial actions. MinT solves this problem really well by reformulating a complex, multi-event

25 likes, 1 reply, 2 retweets

Ivan Skorokhodov

@isskoro

7 months ago

Diffusion models are very strong and robust feature extractors, but recent works were only using them for recognition tasks. In our recent work (led by Moayed Haji Ali), we harness them for video2audio generation: they by far outperform conventional video feature extractors for

14 likes, 0 replies, 1 retweet

Ivan Skorokhodov

@isskoro

6 months ago

I also feel quite Schmidhuber-ish about VAR (NeurIPS'24 best paper): its core idea is the same as that of Multiscale PixelCNN (ICML'17), but the authors allocate just a single sentence to discuss it and frame it as some bizarre "raster-scan" + "super-resolution" model. In early 2023,

30 likes, 2 replies, 1 retweet

Ivan Skorokhodov

@isskoro

6 months ago

So, is it OK to have a "master" branch in my GitHub repo now?

5 likes, 0 replies, 0 retweets

Ivan Skorokhodov

@isskoro

6 months ago

This year, our research team at Snap is hiring multiple Research Engineer interns to help us build large-scale generative models. You don't need to have papers, but should have solid coding and ML skills. It is a good opportunity for BS/MS and early PhD students. Apply at

399 likes, 7 replies, 25 retweets

Kfir Aberman

@abermankfir

5 months ago

We discovered that imposing a spatio-temporal weight space via LoRAs on DiT-based video models unlocks powerful customization! It captures dynamic concepts with precision and even enables composition of multiple videos together!🎥✨

621 likes, 15 replies, 89 retweets

Ivan Skorokhodov

@isskoro

5 months ago

According to Google Scholar, CVPR has now become the second-ranked venue *worldwide*, with Nature being the only one ahead (also, NeurIPS, ICLR, ICCV and ICML are in the top 20, taking 7th/10th/13th/17th place). You're impressed at first glance, but then you realize that the ranking is

36 likes, 0 replies, 4 retweets

Ivan Skorokhodov

@isskoro

3 months ago

Recently, there were many tweets from people frustrated with their ICML results (I feel your pain). It was my first time submitting to ICML, and somehow it was maybe the most reasonable set of reviewers I've ever got (and we had 4 of them). There were multiple concerns raised but

18 likes, 1 reply, 0 retweets

Ivan Skorokhodov

@isskoro

2 months ago

Turns out, ICML does not restrict using special characters in the paper TLDR/lay summary, so here we go:

23 likes, 0 replies, 0 retweets

Ziyi Wu

@dazitu_616

2 months ago

📢 Introducing DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models. Compared to vanilla DPO, we improve paired data construction and preference label granularity, leading to better visual quality and motion strength with only 1/3 of the data. 🧵

169 likes, 2 replies, 36 retweets

Ashkan Mirzaei

@ashmrz10

a month ago

[1/9] 🚀 We introduce 4Real-Video-V2, a method that can generate 4D scenes from a simple text prompt, viewable from any angle at any moment in time. It’s fast, photorealistic, and works on full scenes. Here's how it works and why it matters. 👇 snap-research.github.io/4Real-Video-V2/

86 likes, 2 replies, 26 retweets

Moayed Haji Ali

@moayedhajiali

a month ago

Where are the good old progressive diffusion models? 🤔 Breaking generation into multiple resolution scales is a great idea, but complexity (multiple models, custom diffusion process, etc.) stalled scaling. Our Decomposable Flow Matching packs multi-scale perks into one scalable model.

64 likes, 3 replies, 18 retweets