Bill Psomas (@bill_psomas) 's Twitter Profile
Bill Psomas

@bill_psomas

Postdoctoral researcher @ VisualRecognitionGroup, @CVUTPraha. PhD @ntua. Former IARAI, @Inria, @athenaRICinfo intern. Photographer. Crossfit freak.

ID: 1394460390319341569

Website: http://users.ntua.gr/psomasbill/ · Joined: 18-05-2021 01:10:46

260 Tweets

342 Followers

795 Following

Christian Wolf (🦋🦋🦋) (@chriswolfvision) 's Twitter Profile Photo

At NAVER LABS Europe in Grenoble, France, we are searching for talented PhD interns for work on Spatial AI, geometric and robotic foundation models for navigation and manipulation. If you have experience in Embodied AI and are interested, DM me.

Bill Psomas (@bill_psomas) 's Twitter Profile Photo

🚀New paper alert: FREEDOM is here! Check out “Composed Image Retrieval for Training-FREE DOMain Conversion,” our training-free method for domain conversion with VLMs.🎯 📜WACV 2025 💡Retrieve images using image+text queries! 📖arxiv.org/abs/2412.03297 🔗github.com/NikosEfth/free…
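For context, composed image retrieval ranks a gallery against a query built from an image plus text. Below is a minimal sketch of that general setup, with a simple convex-combination fusion and random stand-in embeddings as illustrative assumptions; FREEDOM's training-free domain conversion itself works differently, so see the paper for the actual method.

```python
import numpy as np

def composed_query(image_emb: np.ndarray, text_emb: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Fuse an image embedding with a target-domain text embedding into one query.
    A plain convex combination is used purely for illustration; it is not
    FREEDOM's conversion procedure."""
    q = alpha * image_emb + (1.0 - alpha) * text_emb
    return q / np.linalg.norm(q)

def retrieve(query: np.ndarray, gallery: np.ndarray, k: int = 10) -> np.ndarray:
    """Rank gallery embeddings by cosine similarity to the composed query."""
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ query))[:k]

# Toy usage with random stand-ins for CLIP-style embeddings (dimension 512).
rng = np.random.default_rng(0)
img_q, txt_q = rng.normal(size=512), rng.normal(size=512)
gallery = rng.normal(size=(1000, 512))
print(retrieve(composed_query(img_q, txt_q), gallery, k=5))
```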

Marcin Przewięźlikowski (@pszwnzl) 's Twitter Profile Photo

Self-supervised Learning with Masked Autoencoders (MAE) is known to produce worse image representations than Joint-Embedding approaches (e.g. DINO). In our new paper, we identify new reasons for why that is and point towards solutions: arxiv.org/abs/2412.03215 🧵

Efstathios Karypidis (@k_sta8is) 's Twitter Profile Photo

1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features! Links to the arXiv and Github 👇

Bill Psomas (@bill_psomas) 's Twitter Profile Photo

🚀Exciting news🚀 I’ve been awarded the Marie Skłodowska-Curie Postdoctoral Fellowship (#MSCA-PF) 2024 with 98/100!🎉 🥟My project, RAVIOLI, hosted at ČVUT v Praze, integrates retrieval-augmented predictions into vision-language models for open-vocabulary segmentation.

Shashank (@shawshank_v) 's Twitter Profile Photo

Excited to share that the recordings and slides of our SSLBIG tutorial are now online! If you notice any missing reference or have feedback, feel free to reach out. European Conference on Computer Vision #ECCV2026 Stay tuned for future editions! webpage: shashankvkt.github.io/eccv2024-SSLBI… Youtube: youtube.com/@SSLBiG_tutori…

Thodoris Kouzelis (@thkouz) 's Twitter Profile Photo

1/n🚀If you’re working on generative image modeling, check out our latest work! We introduce EQ-VAE, a simple yet powerful regularization approach that makes latent representations equivariant to spatial transformations, leading to smoother latents and better generative models.👇
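A rough sketch of what an equivariance regularizer on autoencoder latents can look like, assuming a generic encoder that maps images to spatial latent maps. The transform set (a horizontal flip here) and the way the penalty is applied are illustrative assumptions, not the EQ-VAE recipe.

```python
import torch
import torch.nn.functional as F

def equivariance_loss(encoder, x: torch.Tensor) -> torch.Tensor:
    """Penalize disagreement between 'transform then encode' and
    'encode then transform' for a spatial transform.

    encoder: any module mapping images (B, C, H, W) to latents (B, c, h, w).
    The transform here is a horizontal flip; EQ-VAE uses richer transforms
    (e.g. scaling) and applies them differently.
    """
    t = lambda z: torch.flip(z, dims=[-1])   # simple spatial transform
    z_of_tx = encoder(t(x))                  # encode the transformed image
    t_of_zx = t(encoder(x))                  # transform the latent map
    return F.mse_loss(z_of_tx, t_of_zx)

# This term would be added to the usual reconstruction (+ KL) objective, e.g.:
# loss = recon_loss + beta * kl + lambda_eq * equivariance_loss(encoder, x)
```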

Efstathios Karypidis (@k_sta8is) 's Twitter Profile Photo

🧵 Excited to share our latest work: FUTURIST - A unified transformer architecture for multimodal semantic future prediction, is accepted to #CVPR2025 ! Here's how it works (1/n) 👇 Links to the arxiv and github below

Giorgos Kordopatis-Zilos (@g_kordo) 's Twitter Profile Photo

ILIAS is a large-scale dataset for evaluation on Instance-Level Image retrieval At Scale. It is designed to support research in image-to-image and text-to-image retrieval for particular objects and serves as a benchmark for evaluating foundation models and retrieval techniques
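Evaluation on such a benchmark boils down to embedding-based nearest-neighbour search over a large gallery. A toy sketch of that loop follows; the metric (top-k recall) and the single-relevant-set setup are simplifications chosen here, not the benchmark's official protocol.

```python
import numpy as np

def recall_at_k(query_embs, gallery_embs, positives, k=10):
    """Fraction of queries whose relevant gallery items appear in the top-k.

    query_embs:   (Q, d) query embeddings (image or text encoder output)
    gallery_embs: (G, d) gallery image embeddings
    positives:    list of sets of relevant gallery indices, one set per query
    """
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    hits = 0
    for qi, rel in enumerate(positives):
        topk = np.argsort(-(g @ q[qi]))[:k]
        hits += bool(rel.intersection(topk))
    return hits / len(positives)
```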

valeo.ai (@valeoai) 's Twitter Profile Photo

👏 Huge congrats to our research scientist Elias Ramzi for winning the AFRIF 2024 PhD award for his thesis "Robust image retrieval with deep learning", conducted at CNAM. Well deserved recognition for amazing work! 🏆 🔗 afrif.irisa.fr/?page_id=54

Bill Psomas (@bill_psomas) 's Twitter Profile Photo

🚀 Greeks in AI is booming! 200+ sign-ups, 30+ OpenReview submissions, and 🔥 sponsors joining daily. 📍Limited seats at Serafeio — register now: 👉 greeksin.ai Stay tuned for speakers, program, and abstract previews! #GreeksInAI #AI #ML #Research #Greece

Giorgos Kordopatis-Zilos (@g_kordo) 's Twitter Profile Photo

🚨 Call for Papers! 7th Instance-Level Recognition and Generation Workshop (ILR+G) at #ICCV2025 📍 Honolulu, Hawaii 🌺 📅 October 19–20, 2025 🌐 ilr-workshop.github.io/ICCVW2025/ in-proceedings deadline: June 7 out-of-proceedings deadline: June 30 #ICCV2025

Andrea Tagliasacchi 🇨🇦 (@taiyasaki) 's Twitter Profile Photo

Thank god that nobody submits papers to both #ICCV2025 and #NeurIPS2025. Writing rebuttals for one while working on the deadline for the other would be a total nightmare.

Yunzhi Zhang (@zhang_yunzhi) 's Twitter Profile Photo

(1/n) Time to unify your favorite visual generative models, VLMs, and simulators for controllable visual generation—Introducing a Product of Experts (PoE) framework for inference-time knowledge composition from heterogeneous models.
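The product-of-experts rule itself is compact: combine experts by multiplying their (unnormalized) densities, i.e. summing log-scores. A toy sketch over a discrete candidate set follows; the weighting scheme and the fixed candidate list are illustrative assumptions, since the paper's framework composes generative models at inference time rather than re-ranking a list.

```python
import numpy as np

def product_of_experts(log_scores: np.ndarray, weights=None) -> np.ndarray:
    """Combine expert opinions over the same discrete candidate set.

    log_scores: (num_experts, num_candidates) log p_i(x) up to a constant.
    Returns a distribution proportional to prod_i p_i(x)^{w_i}.
    """
    weights = np.ones(log_scores.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    combined = (weights[:, None] * log_scores).sum(axis=0)
    combined -= combined.max()          # numerical stability before exponentiating
    p = np.exp(combined)
    return p / p.sum()

# e.g. one expert scores text-image consistency (a VLM), another scores physical
# plausibility (a simulator); candidates that satisfy both experts dominate.
```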

Maria Brbic (@mariabrbic) 's Twitter Profile Photo

Can we build multimodal models by simply aligning pretrained unimodal models with limited paired data? We introduce STRUCTURE 🏗️: a lightweight, plug-and-play regularizer that preserves latent geometry to align frozen unimodal models using <1% of paired data typically used in
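One way to read "preserves latent geometry": while training lightweight adapters on top of frozen encoders with few paired samples, add a term that keeps each batch's pairwise-similarity structure close to what the frozen model produced. The sketch below follows that reading; the variable names and the exact regularizer are assumptions for illustration, not the paper's definition of STRUCTURE.

```python
import torch
import torch.nn.functional as F

def geometry_preserving_loss(z_frozen: torch.Tensor, z_adapted: torch.Tensor) -> torch.Tensor:
    """Keep the batch's pairwise cosine-similarity structure after adaptation.

    z_frozen:  (B, d)  embeddings from the frozen unimodal encoder
    z_adapted: (B, d') embeddings after the lightweight adapter
    """
    s_frozen = F.normalize(z_frozen, dim=-1) @ F.normalize(z_frozen, dim=-1).T
    s_adapted = F.normalize(z_adapted, dim=-1) @ F.normalize(z_adapted, dim=-1).T
    return F.mse_loss(s_adapted, s_frozen)

# Sketch of a total objective: a contrastive alignment loss on the few paired
# samples plus this regularizer applied per modality:
# loss = clip_style_loss(img_adapted, txt_adapted) \
#        + lam * (geometry_preserving_loss(img_frozen, img_adapted)
#                 + geometry_preserving_loss(txt_frozen, txt_adapted))
```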

Shashank (@shawshank_v) 's Twitter Profile Photo

New paper out - accepted at #ICCV2025 We introduce MoSiC, a self-supervised learning framework that learns temporally consistent representations from video using motion cues. Key idea: leverage long-range point tracks to enforce dense feature coherence across time.🧵
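The key idea is concrete enough to sketch: sample point tracks across frames, pull the dense feature at each tracked location, and penalize features along the same track for drifting apart over time. Tensor layouts, the bilinear sampling, and the cosine penalty below are illustrative assumptions, not MoSiC's actual objective.

```python
import torch
import torch.nn.functional as F

def track_consistency_loss(feats: torch.Tensor, tracks: torch.Tensor) -> torch.Tensor:
    """Encourage features sampled along each point track to stay consistent over time.

    feats:  (T, C, H, W) dense feature maps for T frames of one video
    tracks: (T, N, 2)    track coordinates in [-1, 1] (grid_sample convention)
    """
    # Sample one feature vector per track per frame: (T, C, 1, N) -> (T, N, C)
    sampled = F.grid_sample(
        feats, tracks.unsqueeze(1), mode="bilinear", align_corners=False
    ).squeeze(2).permute(0, 2, 1)
    sampled = F.normalize(sampled, dim=-1)
    anchor = sampled[0]                                   # features in the first frame
    # Cosine distance between each later frame's track features and the anchor.
    return (1.0 - (sampled[1:] * anchor).sum(dim=-1)).mean()
```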

Marcin Przewięźlikowski (@pszwnzl) 's Twitter Profile Photo

Our paper "Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations" has been accepted to #ICCV2025! 🧵 TL;DR: Masked image models (like MAE) underperform not just because of weak features, but because they aggregate them poorly. [1/7]

Our paper "Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations" has been accepted to <a href="/ICCVConference/">#ICCV2025</a>!

🧵 TL;DR: Masked image models (like MAE) underperform not just because of weak features, but because they aggregate them poorly.

[1/7]