Yang Zhang

Research Staff Member

Yang Zhang is a research scientist at the MIT-IBM Watson AI Lab. His research focuses on deep learning for speech, natural language, and other time-series processing. Recently, he has been working on multi-modal large language models (LLMs), improving the reliability and interpretability of LLMs, and disentanglement techniques for speech and their application to low-resource languages. Before joining the MIT-IBM Watson AI Lab, Yang was a researcher at IBM Research Yorktown. Yang obtained his PhD from the University of Illinois at Urbana-Champaign (UIUC), where he was advised by Mark Hasegawa-Johnson.

Top Work

Class-wise rationalization: teaching AI to weigh pros and cons

Natural Language Processing

Publications with the MIT-IBM Watson AI Lab

PromptBoosting: Black-Box Text Classification with Ten Forward Passes

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models

TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization

Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing

Fairness Reprogramming

An Adversarial Framework for Generating Unseen Images by Activation Maximization

CONTENTVEC: An Improved Self-Supervised Speech Representation by Disentangling Speakers

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

Adversarial Support Alignment

Linking Emergent and Natural Languages via Corpus Transfer

PARP: Prune Once, Adjust and Re-Prune for Self-Supervised Speech Recognition

Understanding Interlocking Dynamics of Cooperative Rationalization

Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks

SACoD: Sensor Algorithm Co-Design Towards Efficient CNN-Powered Intelligent PhlatCam

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

Global Prosody Style Transfer Without Text Transcriptions

Auto-NBA: Efficient and Effective Search Over The Joint Space of Networks, Bitwidths, and Accelerators

The Lottery Ticket Hypothesis for the Pre-trained BERT Networks

Invariant Rationalization

Unsupervised Speech Decomposition via Triple Information Bottleneck

Deep Symbolic Superoptimization Without Human Knowledge

A Game Theoretic Approach to Class-wise Selective Rationalization

Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control

Grounding Spoken Words in Unlabeled Video

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss