All Work

Helping computer vision and language models understand what they see
MIT News
AI model speeds up high-resolution computer vision
MIT News
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Dynamic Video Quantization for Efficient Inference
Curious Representation Learning for Embodied Intelligence
Reasoning about Human-Object Interactions through Dual Attention Networks