Bot detection in Wikidata using behavioral and other informal cues

Andrew Hall, Loren Terveen, Aaron Halfaker

Research output: Contribution to journal › Article › peer-review

Abstract

Bots have been important to peer production’s success. Wikipedia, OpenStreetMap, and Wikidata all have taken advantage of automation to perform work at a rate and scale exceeding that of human contributors. Understanding the ways in which humans and bots behave in these communities is an important topic, and one that relies on accurate bot recognition. Yet, in many cases, bot activities are not explicitly flagged and could be mistaken for human contributions. We develop a machine classifier to detect previously unidentified bots using implicit behavioral and other informal editing characteristics. We show that this method yields a high level of fitness under both formal evaluation (PR-AUC: 0.845, ROC-AUC: 0.985) and a qualitative analysis of “anonymous” contributor edit sessions. We also show that, in some cases, unflagged bot activities can significantly misrepresent human behavior in analyses. Our model has the potential to support future research and community patrolling activities.
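As an illustration of the evaluation described in the abstract, the sketch below trains a classifier on hypothetical per-session behavioral features and reports PR-AUC and ROC-AUC with scikit-learn. The feature names, model choice, and synthetic data are assumptions made for illustration only; they are not the authors' actual feature set or pipeline.

```python
# Minimal, illustrative sketch (not the authors' pipeline): fit a classifier on
# hypothetical per-session behavioral cues and report the two fitness metrics
# cited in the abstract (PR-AUC and ROC-AUC). All features and labels here are
# synthetic assumptions for demonstration.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Hypothetical behavioral/informal cues per edit session (assumed, not from the paper):
# edit rate, mean time between edits, fraction of comment-less edits, session size.
X = np.column_stack([
    rng.gamma(2.0, 2.0, n),    # edits_per_minute
    rng.gamma(2.0, 30.0, n),   # mean_inter_edit_seconds
    rng.uniform(0.0, 1.0, n),  # frac_no_comment
    rng.poisson(20, n),        # session_edit_count
])
y = rng.integers(0, 2, n)      # 1 = bot-like session, 0 = human (synthetic labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = GradientBoostingClassifier().fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]

print("PR-AUC: ", average_precision_score(y_test, scores))
print("ROC-AUC:", roc_auc_score(y_test, scores))
```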

Original language: English (US)
Article number: 64
Journal: Proceedings of the ACM on Human-Computer Interaction
Volume: 2
Issue number: CSCW
DOIs
State: Published - Nov 2018

Bibliographical note

Publisher Copyright:
Copyright 2018 held by Owner/Author(s).

Keywords

  • Automation
  • Bots
  • Machine learning
  • Peer production
  • Structured data
  • Wikidata
