Recent Progress in Combining Perception and Language Models

Ask the Experts
June 4, 2024 | 12-12:30 PM ET

Join MIT Senior Research Scientist James Glass for a virtual discussion of recent advances in audio-visual foundation models and large language models, which have enabled conversational multimodal agents that can perceive their environment and communicate in natural language. Dr. Glass will cover some of his ongoing research in this area. Moderated by Aude Oliva, lab co-director.

Meet the Expert

Speaker
James Glass, Ph.D., is a Senior Research Scientist at the Massachusetts Institute of Technology, where he leads the Spoken Language Systems Group in the Computer Science and Artificial Intelligence Laboratory. He is also a member of the Harvard University Program in Speech and Hearing Bioscience and Technology. Since obtaining his S.M. and Ph.D. degrees in Electrical Engineering and Computer Science at MIT, he has focused his research on automatic speech recognition, unsupervised speech processing, and spoken language understanding using machine learning. He is an IEEE Fellow and a Fellow of the International Speech Communication Association, and is currently an Associate Editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence.