Abstract
We propose a simple Gaussian mixture model for data generation that is consistent with Feldman's long-tail theory (2020). We show that in this model a linear classifier cannot reduce the generalization error below a certain level, whereas a nonlinear classifier with memorization capacity can. This confirms that for long-tailed distributions, rare training examples must be considered for optimal generalization to new data. Finally, we show that the performance gap between linear and nonlinear models narrows as the tail of the subpopulation frequency distribution becomes shorter, as confirmed by experiments on synthetic and real data.
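The abstract's claim can be illustrated with a small toy simulation. The sketch below is not the paper's exact construction: the cluster layout, Zipf-like frequencies, the 1-nearest-neighbour "memorizer", and the threshold "linear" rule are all illustrative assumptions. Subpopulations are Gaussian blobs along a line with alternating labels, so no single linear boundary fits every blob, while a memorizing classifier handles rare blobs it has seen in training.

```python
import random

random.seed(0)

# Illustrative assumption: K subpopulations with Zipf-like frequencies
# p_k ∝ 1/(k+1); each subpopulation is a 2D Gaussian blob.
K = 8
weights = [1.0 / (k + 1) for k in range(K)]
total = sum(weights)
probs = [w / total for w in weights]
centers = [(3.0 * k, 0.0) for k in range(K)]  # blobs along a line
labels = [k % 2 for k in range(K)]            # alternating class labels

def sample(n):
    """Draw n labelled points from the long-tailed Gaussian mixture."""
    data = []
    for _ in range(n):
        r, acc, k = random.random(), 0.0, K - 1
        for i, p in enumerate(probs):
            acc += p
            if r < acc:
                k = i
                break
        cx, cy = centers[k]
        point = (cx + random.gauss(0, 0.3), cy + random.gauss(0, 0.3))
        data.append((point, labels[k]))
    return data

train, test = sample(500), sample(500)

def nn_predict(x):
    # memorizing classifier: label of the nearest training point
    return min(train, key=lambda t: (t[0][0] - x[0]) ** 2
                                    + (t[0][1] - x[1]) ** 2)[1]

def linear_predict(x):
    # a single-threshold linear rule; with alternating labels it must
    # misclassify some subpopulations entirely
    return 1 if x[0] > 1.5 else 0

acc_nn = sum(nn_predict(x) == y for x, y in test) / len(test)
acc_lin = sum(linear_predict(x) == y for x, y in test) / len(test)
```

In this toy setup the memorizing 1-NN classifier approaches perfect test accuracy, while the linear rule is capped well below it because it writes off entire rare subpopulations, mirroring the paper's qualitative claim.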
Original language | English (US) |
---|---|
Title of host publication | ECAI 2023 - 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings |
Editors | Kobi Gal, Ann Nowe, Grzegorz J. Nalepa, Roy Fairstein, Roxana Radulescu |
Publisher | IOS Press BV |
Pages | 109-116 |
Number of pages | 8 |
ISBN (Electronic) | 9781643684369 |
DOIs | |
State | Published - Sep 28 2023 |
Externally published | Yes |
Event | 26th European Conference on Artificial Intelligence, ECAI 2023 - Krakow, Poland Duration: Sep 30 2023 → Oct 4 2023 |
Publication series
Name | Frontiers in Artificial Intelligence and Applications |
---|---|
Volume | 372 |
ISSN (Print) | 0922-6389 |
ISSN (Electronic) | 1879-8314 |
Conference
Conference | 26th European Conference on Artificial Intelligence, ECAI 2023 |
---|---|
Country/Territory | Poland |
City | Krakow |
Period | 9/30/23 → 10/4/23 |
Bibliographical note
Publisher Copyright: © 2023 The Authors.