Racial Disparity in Natural Language Processing: A Case Study of Social Media African-American English

Su Lin Blodgett and Brendan O’Connor

We highlight an important frontier in algorithmic fairness: disparity in the quality of natural language processing algorithms when applied to language from authors of different social groups. For example, current systems sometimes analyze the language of females and minorities more poorly than they do the language of whites and males. We conduct an empirical analysis of racial disparity in language identification for tweets written in African-American English, and discuss the implications of disparity in NLP.
