Everyone’s Voice Matters: Quantifying Annotation Disagreement Using Demographic Information

Ruyuan Wan, Jaehyung Kim, Dongyeop Kang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

In NLP annotation, it is common to have multiple annotators label each text and then derive the ground-truth labels from the majority vote. However, annotators are individuals with different backgrounds, and minority opinions should not simply be ignored. As annotation tasks become more subjective and their topics more controversial in modern NLP, we need NLP systems that can represent people’s diverse voices on subjective matters and predict the level of diversity. This paper examines whether the text of the task and annotators’ demographic background information can be used to estimate the level of disagreement among annotators. In particular, we extract disagreement labels from the annotators’ voting histories in five subjective datasets, and then fine-tune language models to predict annotators’ disagreement. Our results show that knowing annotators’ demographic information, such as gender, ethnicity, and education level, helps predict disagreements. To distinguish disagreement that stems from the inherent controversy of the text content from disagreement that stems from annotators’ different perspectives, we simulate everyone’s voices with different combinations of artificial annotator demographics and examine the variance in the predictions of the fine-tuned disagreement predictor. Our paper aims to improve the annotation process for more efficient and inclusive NLP systems through a novel disagreement prediction mechanism. Our code and dataset are publicly available.
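
As a rough illustration of what extracting a disagreement label from annotators' votes can look like, the sketch below scores one item by the share of annotators who did not choose the most common label. This is an assumption made for illustration only: the abstract does not state the paper's actual label definition, and the function name `disagreement_rate` is hypothetical.

```python
from collections import Counter


def disagreement_rate(votes):
    """Return the fraction of annotators who did not pick the most common label.

    Hypothetical scoring rule for illustration; not taken from the paper.
    0.0 means every annotator agreed; larger values mean more disagreement.
    """
    if not votes:
        raise ValueError("votes must contain at least one label")
    # Count of the single most frequent label among the votes.
    majority_count = Counter(votes).most_common(1)[0][1]
    return 1 - majority_count / len(votes)


# One item labeled by four annotators, with one dissenting vote.
item_votes = ["offensive", "offensive", "offensive", "not_offensive"]
score = disagreement_rate(item_votes)
```

A continuous score like this could serve as a regression target, or be thresholded into a binary agree/disagree label, when fine-tuning a model on the task text.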

Original language: English (US)
Title of host publication: AAAI-23 Special Tracks
Editors: Brian Williams, Yiling Chen, Jennifer Neville
Publisher: AAAI Press
Pages: 14523-14530
Number of pages: 8
ISBN (Electronic): 9781577358800
State: Published - Jun 27 2023
Event: 37th AAAI Conference on Artificial Intelligence, AAAI 2023 - Washington, United States
Duration: Feb 7 2023 to Feb 14 2023

Publication series

Name: Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Volume: 37

Conference

Conference: 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Country/Territory: United States
City: Washington
Period: 2/7/23 to 2/14/23

Bibliographical note

Publisher Copyright:
Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
