Abstract
Many image classification tasks (e.g., medical image classification) have a severe class imbalance problem. The convolutional neural network (CNN) is currently a state-of-the-art method for image classification, but it relies on a large training dataset to achieve high classification performance. However, manual labeling is costly and may not even be feasible in the medical domain. In this paper, we propose a novel similarity-based active deep learning framework (SAL) that deals with class imbalance. SAL actively learns a similarity model to recommend unlabeled rare-class samples for experts' manual labeling. Based on the similarity ranking, SAL also recommends high-confidence unlabeled common-class samples for automatic pseudo-labeling, which requires no labeling effort from experts. To the best of our knowledge, SAL is the first active deep learning framework that deals with significant class imbalance. Our experiments show that SAL consistently outperforms two other recent active deep learning methods on two challenging datasets. Moreover, SAL achieves nearly the upper-bound classification performance (obtained by using all the images in the training dataset) while the domain experts labeled only 5.6% and 7.5% of all images in the Endoscopy dataset and the Caltech-256 dataset, respectively. SAL thus significantly reduces the experts' manual labeling effort while achieving near-optimal classification performance.
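The selection step described in the abstract can be illustrated with a rough sketch. This is not the paper's implementation: SAL learns its similarity model, whereas the sketch below assumes fixed feature vectors and plain cosine similarity as a stand-in. The names `cosine`, `split_pool`, `rare_refs`, `pool`, `n_expert`, and `n_pseudo` are hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def split_pool(rare_refs, pool, n_expert, n_pseudo):
    """Rank unlabeled samples by their best similarity to any labeled rare sample.

    Assumption: a fixed cosine similarity stands in for a learned similarity model.
    Returns (expert_ids, pseudo_ids):
      - expert_ids: the n_expert most rare-like samples, sent to experts for labeling
      - pseudo_ids: the n_pseudo least rare-like samples, pseudo-labeled as common class
    """
    scores = {sid: max(cosine(vec, r) for r in rare_refs) for sid, vec in pool.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)  # most rare-like first
    expert_ids = ranked[:n_expert]
    remaining = ranked[n_expert:]  # keeps the two groups disjoint
    pseudo_ids = remaining[-n_pseudo:] if n_pseudo > 0 else []
    return expert_ids, pseudo_ids

# Toy data: one labeled rare-class feature vector and four unlabeled samples.
rare_refs = [[1.0, 0.0]]
pool = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5], "d": [-1.0, 0.2]}

expert_ids, pseudo_ids = split_pool(rare_refs, pool, n_expert=1, n_pseudo=2)
print("send to experts:", expert_ids)
print("pseudo-label as common:", pseudo_ids)
```

In this toy setup, the sample closest to the rare-class reference is routed to experts, and the samples furthest from it are treated as confident common-class examples. In an active learning loop, a step like this would be repeated after each round of retraining.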
Original language | English (US) |
---|---|
Title of host publication | 2018 IEEE International Conference on Data Mining, ICDM 2018 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1422-1427 |
Number of pages | 6 |
ISBN (Electronic) | 9781538691588 |
State | Published - Dec 27 2018 |
Event | 18th IEEE International Conference on Data Mining, ICDM 2018 - Singapore, Singapore. Duration: Nov 17 2018 → Nov 20 2018 |
Publication series
Name | Proceedings - IEEE International Conference on Data Mining, ICDM |
---|---|
Volume | 2018-November |
ISSN (Print) | 1550-4786 |
Conference
Conference | 18th IEEE International Conference on Data Mining, ICDM 2018 |
---|---|
Country/Territory | Singapore |
City | Singapore |
Period | 11/17/18 → 11/20/18 |
Bibliographical note
Funding Information: This research was supported in part by a grant from the NIH (Grant #1R01DK106130-01A1). Tavanapong, Wong, and Oh have an equity interest and management role in EndoMetric Corp. De Groen serves as the company's Medical Advisor. The terms of this arrangement have been reviewed and approved by Iowa State University and University of Minnesota in accordance with their conflict-of-interest policies.
Publisher Copyright:
© 2018 IEEE.
Keywords
- Active deep learning
- Class imbalance
- Image classification
- Similarity learning