March 22 Colloquium: "Multimodal Learning from Pixels to People"

Talk Abstract

People experience the world through modalities of sight, sound, words, touch, and more. By leveraging the natural relationships between these modalities and developing multimodal learning methods, my research creates artificial perception systems with diverse skills, including spatial, physical, logical, and cognitive abilities, for flexibly analyzing visual data. This multimodal approach provides versatile representations for tasks like 3D reconstruction, visual question answering, and object recognition, while offering inherent explainability and excellent zero-shot generalization across tasks. By closely integrating diverse modalities, we can overcome key challenges in machine learning and enable new capabilities for computer vision, especially for the many upcoming applications where trust is required.

Biography

Carl Vondrick is the YM Associate Professor of Computer Science at Columbia University. Previously, he was a Research Scientist at Google, and he received his PhD from MIT. His research interests are in computer vision, machine learning, and their applications. He is the recipient of the NSF CAREER award, and his research is supported by the NSF, DARPA, Amazon, Google, and Toyota.

Website: https://www.cs.columbia.edu/~vondrick/

Location

Sennott Square Building, Room 5317

Date

Friday, March 22, from 2:00 p.m. to 3:15 p.m.

Faculty Host

Dr. Adriana Kovashka
