A baseline methodology for word sense disambiguation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

This paper describes a methodology for supervised word sense disambiguation that relies on standard machine learning algorithms to induce classifiers from sense-tagged training examples, where the contexts in which ambiguous words occur are represented by simple lexical features. This constitutes a baseline approach since it produces classifiers based on easy-to-identify features that result in accurate disambiguation across a variety of languages. This paper reviews several systems based on this methodology that participated in the Spanish and English lexical sample tasks of the Senseval-2 comparative exercise among word sense disambiguation systems. These systems fared much better than standard baselines, and were within seven to ten percentage points of accuracy of the most highly ranked systems.
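As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a small Naive Bayes classifier on sense-tagged contexts represented as unigram (bag-of-words) features. The abstract names neither the learning algorithm nor the exact features, so both choices are assumptions, as are the class name `NaiveBayesWSD`, the helper `unigram_features`, and the toy training sentences, which are invented for illustration and are not Senseval-2 data.

```python
import math
from collections import Counter, defaultdict


def unigram_features(context):
    """Represent a context as the set of its lowercased, whitespace-split tokens."""
    return set(context.lower().split())


class NaiveBayesWSD:
    """Hypothetical minimal classifier; not the systems reviewed in the paper."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # add-alpha smoothing constant
        self.sense_counts = Counter()               # training examples per sense
        self.feature_counts = defaultdict(Counter)  # sense -> feature -> example count
        self.vocab = set()                          # all features seen in training

    def train(self, examples):
        """examples: iterable of (context string, sense label) pairs."""
        for context, sense in examples:
            feats = unigram_features(context)
            self.sense_counts[sense] += 1
            for f in feats:
                self.feature_counts[sense][f] += 1
            self.vocab |= feats

    def predict(self, context):
        """Return the sense with the highest log-probability score."""
        # Simplification: only features present in the context are scored,
        # and words never seen in training are ignored.
        feats = unigram_features(context) & self.vocab
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)  # log prior of the sense
            for f in feats:
                # Smoothed estimate of P(feature present | sense)
                p = (self.feature_counts[sense][f] + self.alpha) / (n + 2 * self.alpha)
                score += math.log(p)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Invented toy data for the ambiguous word "bank" (illustration only).
TRAIN = [
    ("he sat on the bank of the river", "shore"),
    ("the river bank was muddy after rain", "shore"),
    ("she deposited money at the bank", "finance"),
    ("the bank approved the loan", "finance"),
]

clf = NaiveBayesWSD()
clf.train(TRAIN)
print(clf.predict("a loan from the bank"))
print(clf.predict("fishing on the river bank"))
```

Because the features are plain surface tokens, nothing in this sketch is specific to English; the same code could be run on sense-tagged examples in another language, which is in the spirit of the baseline described above.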

Original language: English (US)
Title of host publication: Computational Linguistics and Intelligent Text Processing - 3rd International Conference, CICLing 2002, Proceedings
Editors: Alexander Gelbukh
Publisher: Springer Verlag
Pages: 126-135
Number of pages: 10
ISBN (Print): 3540432191, 9783540457152
State: Published - 2002
Event: 3rd Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2002 - Mexico City, Mexico
Duration: Feb 17 2002 to Feb 23 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2276
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 3rd Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2002
Country/Territory: Mexico
City: Mexico City
Period: 2/17/02 to 2/23/02

Bibliographical note

Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2002.
