DuluthNLP at SemEval-2023 Task 12: AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset

Samuel Akrah, Ted Pedersen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

This paper describes the DuluthNLP system that participated in SemEval-2023 Task 12, AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset. Given a set of tweets, the task requires participating systems to classify each tweet as negative, positive, or neutral. We evaluate a range of monolingual and multilingual pre-trained models on the Twi language dataset, one of the 14 African languages included in the SemEval task. We introduce TwiBERT, a new pre-trained model trained from scratch. We show that TwiBERT and mBERT generally perform best when trained on the Twi dataset, achieving an F1 score of 64.29% on the official evaluation test data, which ranks 14th out of the 30 submissions for Track 10. The TwiBERT model is released at https://huggingface.co/sakrah/TwiBERT.
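
The abstract reports a single F1 score for a three-way classification but does not say how the per-class scores are averaged. As a minimal sketch, assuming a support-weighted average over the three labels (an assumption, not something stated above), the score could be computed as follows. The function names and the toy gold/predicted labels are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter

# The three sentiment classes named in the abstract.
LABELS = ("negative", "neutral", "positive")


def per_class_f1(gold, pred, label):
    """F1 for one label, treating that label as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(gold, pred, labels=LABELS):
    """Per-class F1 averaged with weights equal to each class's gold count.

    Assumption: support-weighted averaging; the abstract does not specify it.
    """
    total = len(gold)
    if total == 0:
        return 0.0
    support = Counter(gold)
    return sum(per_class_f1(gold, pred, lab) * support[lab] for lab in labels) / total


# Hypothetical toy labels, for illustration only.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "neutral", "neutral", "positive"]
print(round(weighted_f1(gold, pred), 4))  # 0.6667
```

In this toy case the positive class scores 1.0, the negative class 0.0, and the neutral class 2/3, so the support-weighted average is (2 × 1.0 + 1 × 0.0 + 1 × 2/3) / 4 = 2/3.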

Original language: English (US)
Title of host publication: 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop
Editors: Atul Kr. Ojha, A. Seza Dogruoz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Publisher: Association for Computational Linguistics
Pages: 1697-1701
Number of pages: 5
ISBN (Electronic): 9781959429999
State: Published - 2023
Event: 17th International Workshop on Semantic Evaluation, SemEval 2023, co-located with the 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Hybrid, Toronto, Canada
Duration: Jul 13 2023 to Jul 14 2023

Publication series

Name: 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop

Conference

Conference: 17th International Workshop on Semantic Evaluation, SemEval 2023, co-located with the 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Hybrid, Toronto
Period: 7/13/23 to 7/14/23

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.
