Shiyu Chang

Research Staff Member

Shiyu Chang is a research scientist at the MIT-IBM Watson AI Lab, working closely with Prof. Regina Barzilay and Prof. Tommi S. Jaakkola. His research focuses on machine learning and its applications in natural language processing and computer vision.

Most recently, he has been studying how machine predictions can be made more interpretable to humans, and how human intuition and rationalization can improve AI transferability, data efficiency, and adversarial robustness.

Prior to his current position, Shiyu was a research scientist at the IBM T.J. Watson Research Center. He received his B.S. and Ph.D. from the University of Illinois at Urbana-Champaign, where his Ph.D. advisor was Prof. Thomas S. Huang.

Some words that keep me moving forward:

“A job well done is its own reward. You take pride in the things you do, not for others to see, not for the respect, or glory, or any other rewards it might bring. You take pride in what you do, because you’re doing your best. If you believe in something, you stick with it. When things get difficult, you try harder.”

Top Work

Class-wise rationalization: teaching AI to weigh pros and cons

Natural Language Processing

Publications with the MIT-IBM Watson AI Lab

Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning

Robust Overfitting may be mitigated by properly learned smoothening

Generating Adversarial Computer Programs using Optimized Obfuscations

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

Self-Progressing Robust Training

Complementary Evidence Identification in Open-Domain Question Answering

Global Prosody Style Transfer Without Text Transcriptions

Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning

Training Stronger Baselines for Learning to Optimize

The Lottery Ticket Hypothesis for the Pre-trained BERT Networks

Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing

Tight Certificates of Adversarial Robustness

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization