Long-Tail Theory Under Gaussian Mixtures

Arman Bolatov, Maxat Tezekbayev, Igor Melnykov, Artur Pak, Vassilina Nikoulina, Zhenisbek Assylbekov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a simple Gaussian mixture model for data generation that complies with Feldman's long-tail theory (2020). We demonstrate that, in the proposed model, a linear classifier cannot decrease the generalization error below a certain level, whereas a nonlinear classifier with memorization capacity can. This confirms that, for long-tailed distributions, rare training examples must be taken into account for optimal generalization to new data. Finally, we show that the performance gap between linear and nonlinear models narrows as the tail of the subpopulation frequency distribution becomes shorter, as confirmed by experiments on both synthetic and real data.
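To make the setup concrete, the following is a minimal sketch of the kind of experiment the abstract describes: data drawn from a Gaussian mixture whose subpopulation frequencies follow a long-tailed (Zipf-like) law, with a linear and a nonlinear classifier compared on held-out data. All specifics below (number of subpopulations, Zipf exponent, alternating labels, classifier choices) are illustrative assumptions, not the authors' exact construction.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Long-tailed subpopulation frequencies: Zipf-like weights (assumed form).
K, alpha, d = 50, 1.5, 2
weights = 1.0 / np.arange(1, K + 1) ** alpha
weights /= weights.sum()

# One Gaussian per subpopulation; labels alternate across subpopulations so
# that no single hyperplane separates all of them (assumed construction).
centers = rng.normal(scale=5.0, size=(K, d))
labels_per_sub = np.arange(K) % 2

def sample(n):
    subs = rng.choice(K, size=n, p=weights)
    X = centers[subs] + rng.normal(scale=0.5, size=(n, d))
    return X, labels_per_sub[subs]

X_train, y_train = sample(5000)
X_test, y_test = sample(5000)

linear = LogisticRegression().fit(X_train, y_train)
nonlinear = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000,
                          random_state=0).fit(X_train, y_train)

print("linear test accuracy:   ", accuracy_score(y_test, linear.predict(X_test)))
print("nonlinear test accuracy:", accuracy_score(y_test, nonlinear.predict(X_test)))

Under these assumptions, the nonlinear model, which can memorize the rare clusters, typically scores well above the linear one; flattening the frequency weights (a smaller alpha) shrinks the gap, mirroring the paper's claim.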

Original language: English (US)
Title of host publication: ECAI 2023 - 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings
Editors: Kobi Gal, Ann Nowe, Grzegorz J. Nalepa, Roy Fairstein, Roxana Radulescu
Publisher: IOS Press BV
Pages: 109-116
Number of pages: 8
ISBN (Electronic): 9781643684369
DOIs
State: Published - Sep 28 2023
Externally published: Yes
Event: 26th European Conference on Artificial Intelligence, ECAI 2023 - Krakow, Poland
Duration: Sep 30 2023 – Oct 4 2023

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 372
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 26th European Conference on Artificial Intelligence, ECAI 2023
Country/Territory: Poland
City: Krakow
Period: 9/30/23 – 10/4/23

Bibliographical note

Publisher Copyright:
© 2023 The Authors.
