The ngram statistics package (Text::NSP) - A flexible tool for identifying ngrams, collocations, and word associations

Ted Pedersen, Satanjeev Banerjee, Bridget T. McInnes, Saiyam Kohli, Mahesh Joshi, Ying Liu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

12 Scopus citations

Abstract

The Ngram Statistics Package (Text::NSP) is freely available open-source software that identifies ngrams, collocations and word associations in text. It is implemented in Perl and takes advantage of regular expressions to provide very flexible tokenization and to allow for the identification of non-adjacent ngrams. It includes a wide range of measures of association that can be used to identify collocations.
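The ideas named in the abstract (regular-expression tokenization, non-adjacent ngrams, and a measure of association) can be illustrated with a minimal Python sketch. This is not Text::NSP code and does not reproduce its interface. The function names `tokenize`, `windowed_bigrams`, and `pmi`, the default token pattern, and the `window` parameter are all hypothetical, and pointwise mutual information is used here only as one common example of an association measure.

```python
import math
import re
from collections import Counter


def tokenize(text, pattern=r"\w+"):
    """Split text into lowercase tokens using a regular expression.

    The pattern is an assumption for illustration; changing it changes
    what counts as a token.
    """
    return re.findall(pattern, text.lower())


def windowed_bigrams(tokens, window=2):
    """Yield ordered pairs (tokens[i], tokens[j]) with 0 < j - i < window.

    window=2 gives ordinary adjacent bigrams; larger values also admit
    pairs separated by up to window - 2 intervening tokens.
    """
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            yield (left, right)


def pmi(pair_counts, pair):
    """Pointwise mutual information (base 2) computed from pair counts.

    Assumes the pair occurs at least once; a zero count would make the
    logarithm undefined.
    """
    total = sum(pair_counts.values())
    left_total = sum(c for (l, _), c in pair_counts.items() if l == pair[0])
    right_total = sum(c for (_, r), c in pair_counts.items() if r == pair[1])
    joint = pair_counts[pair]
    return math.log2(joint * total / (left_total * right_total))


if __name__ == "__main__":
    tokens = tokenize("New York, New York: big city")
    counts = Counter(windowed_bigrams(tokens, window=2))
    # Rank the observed pairs by their association score, highest first.
    for pair in sorted(counts, key=lambda p: pmi(counts, p), reverse=True):
        print(pair, counts[pair], round(pmi(counts, pair), 3))
```

Raising `window` above 2 makes the sketch count pairs of words that are not adjacent, which is the sense of "non-adjacent ngrams" assumed here.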

Original language: English (US)
Title of host publication: Workshop on Multiword Expressions
Subtitle of host publication: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Proceedings
Editors: Valia Kordoni, Carlos Ramisch, Aline Villavicencio
Publisher: Association for Computational Linguistics (ACL)
Pages: 131-133
Number of pages: 3
ISBN (Electronic): 9781932432978
State: Published - 2011
Event: 2011 Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Portland, United States
Duration: Jun 23 2011 → …

Publication series

Name: Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Proceedings

Conference

Conference: 2011 Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011
Country/Territory: United States
City: Portland
Period: 6/23/11 → …

Bibliographical note

Publisher Copyright:
© 2011 Association for Computational Linguistics.
