Simulation experiments on (the absence of) ratings bias in reputation systems

Jacob Thebault-Spieker, Daniel Kluver, Maximillian Klein, Aaron Halfaker, Brent Hecht, Loren G Terveen, Joseph A Konstan

Research output: Contribution to journal › Article › peer-review


Abstract

As the gig economy continues to grow and freelance work moves online, five-star reputation systems are becoming more and more common. At the same time, there are increasing accounts of race and gender bias in evaluations of gig workers, with negative impacts for those workers. We report on a series of four Mechanical Turk-based studies in which participants who rated simulated gig work did not show race or gender bias, while manipulation checks showed they reliably distinguished between low- and high-quality work. Given prior research, this was a striking result. To explore further, we used a Bayesian approach to verify the absence of ratings bias (as opposed to merely failing to detect bias). This Bayesian test let us identify an upper bound: if any bias did exist in our studies, it was below an average of 0.2 stars on a five-star scale. We discuss possible interpretations of our results and outline future work to better understand them.
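The Bayesian "verify absence" logic described above is in the spirit of a region-of-practical-equivalence (ROPE) check: rather than asking whether a difference is nonzero, one asks how much posterior probability falls within a band of practically negligible differences (here, ±0.2 stars). The sketch below is an illustrative simplification, not the paper's actual model: it assumes a normal likelihood for the group mean difference and a weak Normal(0, 1) prior, with hypothetical function and parameter names.

```python
import numpy as np

def rope_mass(group_a, group_b, rope=0.2, prior_sd=1.0,
              n_draws=100_000, seed=0):
    """Posterior probability that the mean rating difference between two
    groups lies within +/- `rope` stars.

    Simplified conjugate normal-normal sketch: the observed difference in
    sample means is treated as a noisy measurement of the true difference,
    with a Normal(0, prior_sd^2) prior on that difference.
    """
    rng = np.random.default_rng(seed)
    diff = np.mean(group_a) - np.mean(group_b)
    # Standard error of the difference in means (Welch-style)
    se = np.sqrt(np.var(group_a, ddof=1) / len(group_a)
                 + np.var(group_b, ddof=1) / len(group_b))
    # Conjugate update: prior N(0, prior_sd^2), likelihood N(diff, se^2)
    post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / se**2)
    post_mean = post_var * (diff / se**2)
    draws = rng.normal(post_mean, np.sqrt(post_var), n_draws)
    # Fraction of posterior draws inside the equivalence region
    return np.mean(np.abs(draws) < rope)
```

A posterior mass near 1 supports the claim that any bias is smaller than the ROPE bound; a mass near 0 indicates a practically meaningful difference. This differs from a non-significant frequentist test, which cannot distinguish "no bias" from "insufficient data."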

Original language: English (US)
Article number: 101
Journal: Proceedings of the ACM on Human-Computer Interaction
Volume: 1
Issue number: CSCW
DOIs
State: Published - Nov 2017

Bibliographical note

Publisher Copyright:
© 2017 Association for Computing Machinery.

Keywords

  • Bayesian statistics
  • Gender bias
  • Gig economy
  • Racial bias
  • Reputation bias
  • Reputation systems
