Abstract
Models readily learn biases present in their training data, and their predictions directly reflect those biases. We analyze gender bias in dialogue data and examine how this bias is not merely replicated but amplified in downstream generative chit-chat dialogue models. We measure gender bias in six existing dialogue datasets and focus on the most biased one, the multi-player text-based fantasy adventure dataset LIGHT, as a testbed for our bias mitigation techniques. The LIGHT dataset is highly imbalanced with respect to gender, containing predominantly male characters, likely because it was collected entirely from crowdworkers and reflects common biases in fantasy and medieval settings. We consider three techniques to mitigate gender bias: counterfactual data augmentation, targeted data collection, and bias-controlled training. We show that these techniques mitigate gender bias in LIGHT by balancing the genderedness of generated dialogue utterances, and that they are particularly effective in combination. We quantify performance using several evaluation methods, including counts of gendered words, a dialogue safety classifier, and human studies, all of which show that our models generate less gendered but equally engaging chit-chat responses.
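Of the three mitigation techniques, counterfactual data augmentation is the most mechanical: each training utterance is paired with a copy in which gendered words are swapped, so the model sees both variants. The following is a minimal sketch of that idea in Python; the word-pair list, function names, and capitalization handling are illustrative assumptions, not the paper's actual word lists or implementation.

import re

# Illustrative swap list; a real system would use a much larger curated lexicon.
GENDER_PAIRS = [
    ("he", "she"), ("him", "her"), ("his", "her"),
    ("man", "woman"), ("men", "women"),
    ("king", "queen"), ("lord", "lady"),
]
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    # Note: "her" is ambiguous between "him"/"his"; real systems disambiguate by part of speech.
    SWAP[b] = a

def counterfactual(utterance: str) -> str:
    """Return a copy of the utterance with gendered words swapped."""
    def swap(match):
        word = match.group(0)
        repl = SWAP.get(word.lower(), word)
        # Preserve the capitalization of the original token.
        return repl.capitalize() if word[0].isupper() else repl
    return re.sub(r"[A-Za-z]+", swap, utterance)

# Augmentation: train on both the original utterance and its counterfactual.
print(counterfactual("The king greeted his men."))  # -> "The queen greeted her women."

Bias-controlled training takes a different route: rather than rewriting the data, the paper appends a control token to each training input indicating how gendered the target response is, so that at test time generation can be conditioned toward the most gender-neutral bin.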
Description
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation