
Vittorio Ferrari
@vittoferraricv
Director of Science at Synthesia.io
ID: 1275184138664976384
https://sites.google.com/view/vittoferrari 22-06-2020 21:50:10
60 Tweets
5.5K Followers
12 Following

Four papers accepted to #ICCV2023! 1/4 Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories arxiv.org/abs/2306.09224 Dataset release coming soon! With @tejmensink, Jasper Uijlings, Lluís Castrejón, Arushi Goel, Cadar, Howard Zhou, Fei Sha, A. Araujo



Four papers accepted to #ICCV2023! 4/4 Tracking by 3D Model Estimation of Unknown Objects in Videos arxiv.org/abs/2304.06419 With Denys Rozumnyi, Jiri Matas, Martin R. Oswald, Marc Pollefeys


Four papers accepted to #ICCV2023! 2/4 CAD-Estate: Large-scale CAD Model Annotation in RGB Videos. >100k 3D objects annotated on RGB videos of complex scenes. Dataset release coming soon! arxiv.org/abs/2306.09011 With Stefan Popov, Kevis-Kokitsi Maninis, Matthias Niessner


Check out CAD-Estate: a large dataset with 3D object and room layout annotations on RGB videos of complex multi-object scenes (101k objects in total!). github.com/google-researc… arxiv.org/abs/2306.09011 arxiv.org/abs/2306.09077 With Stefan Popov, Kevis-Kokitsi Maninis, Matthias Niessner


Three papers accepted to #NeurIPS 1/3 StoryBench: a new benchmark for text-to-video generation of stories to guide progress in assistive technology for filmmaking 🧑‍🎨 arxiv.org/abs/2308.11606 github.com/google/storybe… x.com/ebugliarello/s… With Emanuele Bugliarello, Hernan Moraldo, many others


Three papers accepted to #NeurIPS 2/3 "Estimating Generic 3D Room Structures from 2D Annotations" 3D room layout annotations for 2246 videos (part of the CAD-Estate dataset). arxiv.org/abs/2306.09077 github.com/google-researc… With Denys Rozumnyi, Stefan Popov, Kevis-Kokitsi Maninis, Matthias Niessner

Three papers accepted to #NeurIPS 3/3 NAVI: a dataset of image collections of objects, along with high-quality 3D object scans, near-perfect 2D-3D alignments, and accurate camera parameters. arxiv.org/abs/2306.09109 navidataset.github.io With Varun Jampani, Kevis-Kokitsi Maninis, others



We are running the Vision and Sports Summer School again this year! Prague, July 22-27. We offer a broad range of lectures on state-of-the-art Computer Vision techniques, as well as exciting sport activities, such as Volleyball, Frisbee and Table Tennis. cmp.felk.cvut.cz/summerschool20…

Paper accepted to #CVPR2024! Grounding Everything: Emerging Localization Properties in Vision-Language Transformers Paper: arxiv.org/abs/2312.00878 Demo: huggingface.co/spaces/WalidBo… Code: github.com/WalBouss/GEM With Walid BOUSSELHAM, Felix Petersen, Hilde Kuehne


Introducing HAMMR: hierarchical multimodal agents that handle a broad range of VQA tasks within a single system (counting, spatial reasoning, OCR, visual pointing, external knowledge, and more). arxiv.org/abs/2404.05465 With Lluís Castrejón, @tejmensink, Howard Zhou, André Araujo, Jasper Uijlings


AI Avatars have learned to interpret text now. 😬 Our soon-to-be-public EXPRESS-1 AI model enables Synthesia avatars to understand and adjust to the script automatically. 🤯 Join the pre-launch tech chat with: Victor Riparbelli, Matthias Niessner & Jon Starck 👀 x.com/i/spaces/1YpJk…

Our EXPRESS-1 AI model enables @Synthesiaio avatars to understand and adjust to the script automatically 💥 This is a big milestone, so tune in tomorrow for a pre-launch chat with Matthias Niessner, Jon Starck, Victor Riparbelli and @AlexVoica X Spaces event link: x.com/i/spaces/1YpJk…



Paper accepted to the “Multimodal Algorithmic Reasoning” NeurIPS workshop! HAMMR: Hierarchical multimodal agents for handling many diverse VQA tasks in a single system arxiv.org/abs/2404.05465 With Lluís Castrejón, @tejmensink, Howard Zhou, André Araujo, Jasper Uijlings

