Abstract
The Ngram Statistics Package (Text::NSP) is freely available open-source software that identifies ngrams, collocations and word associations in text. It is implemented in Perl and takes advantage of regular expressions to provide very flexible tokenization and to allow for the identification of non-adjacent ngrams. It includes a wide range of measures of association that can be used to identify collocations.
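As a rough illustration of the ideas in the abstract, the following Python sketch tokenizes text with a regular expression, counts bigrams within a configurable window (so non-adjacent pairs can be included), and scores a pair with pointwise mutual information, one common measure of association. It is not the package's code or interface: Text::NSP itself is written in Perl, and the names `tokenize`, `count_bigrams`, `pmi`, `token_pattern`, and `window` are hypothetical, chosen only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text, token_pattern=r"\w+"):
    """Split text into lowercase tokens using a user-supplied regular expression."""
    return re.findall(token_pattern, text.lower())


def count_bigrams(tokens, window=2):
    """Count ordered pairs (w1, w2) where w2 follows w1 within `window` tokens.

    window=2 yields adjacent bigrams only; larger windows also admit
    non-adjacent pairs (an assumption made for this sketch).
    """
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            pairs[(left, right)] += 1
    return pairs


def pmi(pairs, pair):
    """Pointwise mutual information (base 2) of `pair` from the bigram counts."""
    total = sum(pairs.values())
    n11 = pairs[pair]  # joint count of the pair
    n1p = sum(c for (l, _), c in pairs.items() if l == pair[0])  # left marginal
    np1 = sum(c for (_, r), c in pairs.items() if r == pair[1])  # right marginal
    return math.log2(n11 * total / (n1p * np1))


tokens = tokenize("New York is big and New York is old")
adjacent = count_bigrams(tokens)           # adjacent bigrams only
windowed = count_bigrams(tokens, window=3)  # allows one intervening token
score = pmi(adjacent, ("new", "york"))      # 2.0 for this toy text
```

In this toy text "new" is always followed by "york", so the pair scores well above zero; with `window=3` the non-adjacent pair ("new", "is") is counted as well.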
Original language | English (US) |
---|---|
Title of host publication | Workshop on Multiword Expressions |
Subtitle of host publication | From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Proceedings |
Editors | Valia Kordoni, Carlos Ramisch, Aline Villavicencio |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 131-133 |
Number of pages | 3 |
ISBN (Electronic) | 9781932432978 |
State | Published - 2011 |
Event | 2011 Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Portland, United States; Duration: Jun 23 2011 → … |
Publication series
Name | Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Proceedings |
---|
Conference
Conference | 2011 Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE 2011 at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 |
---|---|
Country/Territory | United States |
City | Portland |
Period | 6/23/11 → … |
Bibliographical note
Publisher Copyright: © 2011 Association for Computational Linguistics.