Teaching AI to generalize

Gregory Wornell (MIT)


The success of deep neural networks (DNNs) typically rests on two factors: (1) sufficient labeled training data for the task, and (2) similarity between the distribution of the training data and the distribution of the data the trained DNN encounters in deployment. DNNs struggle to apply what they have learned when the deployment distribution differs from the training distribution, a problem known as domain shift, and obtaining enough labeled training data to cover every possible shift is often impractical. The ability to transfer learning effectively from one domain to another would therefore be a big win for AI systems.

MIT-IBM Watson AI Lab scientists are developing a new technique to accomplish this. The technique is robust to domain shift: it enables a neural network to learn a good predictive model for a new domain using labeled examples from the original domain and only unlabeled examples from the new domain. On common tasks such as object recognition, this method generalizes from one domain to another significantly better than state-of-the-art techniques. Better generalization in the presence of domain shift opens the door to AI applications where obtaining labeled data is challenging.
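To make the problem setting concrete, here is a minimal sketch of unsupervised domain adaptation, the regime the article describes: labeled data from a source domain, only unlabeled data from a shifted target domain. The classifier, the Gaussian toy data, and the mean-matching adaptation step below are illustrative assumptions for this sketch, not the Lab's actual method, which the article does not detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(shift, n=500):
    """Two-class Gaussian data; `shift` translates the whole domain (domain shift)."""
    X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(n, 2)) + shift
    X1 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(n, 2)) + shift
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

# Labeled source domain, and a shifted target domain whose labels
# are used only to measure accuracy, never for training.
Xs, ys = make_domain(shift=np.array([0.0, 0.0]))
Xt, yt = make_domain(shift=np.array([2.0, 2.0]))

# Train a simple nearest-class-mean classifier on source labels only.
means = np.stack([Xs[ys == c].mean(axis=0) for c in (0, 1)])

def predict(X, means):
    # Assign each point to the closest class mean.
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return d.argmin(axis=1)

acc_source = (predict(Xs, means) == ys).mean()
acc_target = (predict(Xt, means) == yt).mean()

# A crude unsupervised adaptation step: re-center the target features to
# match the source feature mean, using no target labels at all.
Xt_adapted = Xt - Xt.mean(axis=0) + Xs.mean(axis=0)
acc_adapted = (predict(Xt_adapted, means) == yt).mean()

print(f"source accuracy:  {acc_source:.2f}")
print(f"target accuracy:  {acc_target:.2f}")
print(f"adapted accuracy: {acc_adapted:.2f}")
```

Running this shows the pattern the article describes: accuracy is high on the source domain, collapses on the shifted target domain, and recovers once the unlabeled target data is used to align the two distributions. Real methods replace the mean-matching step with learned, shift-invariant feature representations, but the structure of the problem is the same.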