From Video to Virtual: Object-centric 3D scene understanding from videos

Sponsor: IEEE CS San Diego Invited Seminar Series 2025 – Lecture 3 (Virtual)
Speaker: Yash Bhalgat of the University of Oxford’s Visual Geometry Group (VGG)
Date: 25 Feb 2025
Time: 3:00 PM to 4:00 PM
Cost:
Location: Online webinar (event moved online)
Reservations: IEEE
Summary:
The growing demand for immersive, interactive experiences has underscored the importance of 3D data in understanding our surroundings. Traditional methods for capturing 3D data are often complex and equipment-intensive. In contrast, my research aims to use unconstrained videos, such as those from augmented reality glasses, to effortlessly capture scenes and objects in their full 3D complexity. As a first step, I will describe a method that incorporates epipolar geometry priors into multi-view Transformer models, enabling objects to be identified across extreme pose variations. Next, I will discuss my work “Contrastive Lift” on 3D object segmentation using 2D pre-trained foundation models, and then I will turn to addressing the same problem using language.

Bio: Yash Bhalgat is a final-year PhD student in the University of Oxford’s Visual Geometry Group (VGG), supervised by Andrew Zisserman, Andrea Vedaldi, João Henriques, and Iro Laina. His research is broadly in 3D computer vision and machine learning, with a focus on geometry-aware transformers, neural rendering, and vision-language models. Previously, he was a Senior Researcher at Qualcomm AI Research in San Diego, working on efficient deep learning. He received his Master’s in Computer Science from the University of Michigan, Ann Arbor, and his Bachelor’s in Electrical Engineering from IIT Bombay.
