Shiyu Chang

Research Staff Member

Shiyu Chang is a research scientist at the MIT-IBM Watson AI Lab, working closely with Prof. Regina Barzilay and Prof. Tommi S. Jaakkola. His research focuses on machine learning and its applications in natural language processing and computer vision.

Most recently, he has been studying how machine predictions can be made more interpretable to humans, and how human intuition and rationalization can improve AI transferability, data efficiency, and adversarial robustness.

Prior to his current position, Shiyu was a research scientist at the IBM T.J. Watson Research Center. He received his B.S. and Ph.D. from the University of Illinois at Urbana-Champaign, where his Ph.D. advisor was Prof. Thomas S. Huang.

Some words that keep me moving forward:

“A job well done is its own reward. You take pride in the things you do, not for others to see, not for the respect, or glory, or any other rewards it might bring. You take pride in what you do, because you’re doing your best. If you believe in something, you stick with it. When things get difficult, you try harder.”

Top Work

Class-wise rationalization: teaching AI to weigh pros and cons

Natural Language Processing

Publications with the MIT-IBM Watson AI Lab

PromptBoosting: Black-Box Text Classification with Ten Forward Passes

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models

TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization

Fairness Reprogramming

An Adversarial Framework for Generating Unseen Images by Activation Maximization

CONTENTVEC: An Improved Self-Supervised Speech Representation by Disentangling Speakers

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

Adversarial Support Alignment

Optimizer Amalgamation

How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective

PARP: Prune Once, Adjust and Re-Prune for Self-Supervised Speech Recognition

Understanding Interlocking Dynamics of Cooperative Rationalization

TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up

Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning

Robust Overfitting may be mitigated by properly learned smoothening

Generating Adversarial Computer Programs using Optimized Obfuscations

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

Self-Progressing Robust Training

Complementary Evidence Identification in Open-Domain Question Answering

Global Prosody Style Transfer Without Text Transcriptions

Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning

Training Stronger Baselines for Learning to Optimize

The Lottery Ticket Hypothesis for the Pre-trained BERT Networks

Invariant Rationalization

Unsupervised Speech Decomposition via Triple Information Bottleneck

Proper Network Interpretability Helps Adversarial Robustness in Classification

Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning

Learning to learn with distributional signatures for text data

A Game Theoretic Approach to Class-wise Selective Rationalization

Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers

Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader

Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers

Self-Supervised Learning for Contextualized Extractive Summarization

TWEETQA: A Social Media Focused Question Answering Dataset

Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets

Context-Aware Conversation Thread Detection in Multi-Party Chat

Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control

Out-of-Domain Detection for Low-Resource Text Classification Tasks

AutoGAN: Neural Architecture Search for Generative Adversarial Networks

Coupled Variational Recurrent Collaborative Filtering

Additive Adversarial Learning for Unbiased Authentication

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing

Tight Certificates of Adversarial Robustness

Deriving Machine Attention from Human Rationales

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization