Dan Gutfreund

Research Staff Member, Video Analytics

I am a principal investigator at the MIT-IBM Watson AI Lab, where I focus on developing capabilities toward fully automated video comprehension.

Before moving to the Cambridge Research Lab, I was at the Haifa Research Lab, where I held several managerial and technical leadership positions. In my last role there, I was the manager in charge of IBM Debating Technologies.

In 2005, I received a PhD in computer science from the Hebrew University of Jerusalem, Israel. Following that, I was a postdoctoral fellow and a lecturer at Harvard University and MIT. My research interests are in computational complexity, the foundations of cryptography, and machine learning, with applications to natural language processing and computer vision.

Selected Publications

Top Work

Moments in Time Dataset: one million videos for event understanding

Computer Vision

ObjectNet: A bias-controlled dataset for object recognition

Computer Vision

Publications with the MIT-IBM Watson AI Lab

Zero-shot linear combinations of grounded social interactions with Linear Social MDPs

How hard are computer vision datasets? Calibrating dataset difficulty to viewing time

Finding Fallen Objects Via Asynchronous Audio-Visual Integration

A Bayesian-Symbolic Approach to Reasoning and Learning in Intuitive Physics

ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

3DP3: 3D Scene Perception via Probabilistic Programming

Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding

AGENT: A Benchmark for Core Psychological Reasoning

ObjectNet: A bias-controlled dataset for object recognition
 
ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models

SimVAE: Simulator-Assisted Training for Interpretable Generative Models

Reasoning about Human-Object Interactions through Dual Attention Networks

Identifying Interpretable Action Concepts in Deep Networks

Grounding Spoken Words in Unlabeled Video

Moments in Time Dataset: one million videos for event understanding