MISMATCHED SUPERVISED LEARNING

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Supervised learning scenarios in which labels and features may be mismatched are an emerging concern in machine learning applications. For example, in socioeconomic studies, researchers often need to align heterogeneous data from multiple sources to the same entities without a unique identifier. If it is not appropriately addressed, such a mismatch can significantly degrade learning performance. Because the mismatch problem is combinatorial in nature, existing methods are often designed for small datasets and simple linear models and do not scale to large datasets and complex models. In this paper, we first present a new formulation of the mismatch problem that supports continuous optimization and allows for gradient-based methods. Moreover, we develop a computation- and memory-efficient method to process complex data and models. Empirical studies on synthetic and real-world data show that the proposed algorithms perform significantly better than state-of-the-art methods.
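To make the effect of a mismatch concrete, the following is a minimal sketch that illustrates only the problem, not the formulation or algorithm proposed in the paper. It assumes a hypothetical one-dimensional linear model with no intercept, and every name and constant in it (`fit_slope`, `true_w`, the noise level, shuffling half of the labels) is chosen purely for illustration.

```python
import random


def fit_slope(xs, ys):
    """Ordinary least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


random.seed(0)
true_w = 3.0  # hypothetical ground-truth slope

# Correctly matched features and labels.
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
ys = [true_w * x + random.gauss(0.0, 0.05) for x in xs]

# Simulate a mismatch: permute the labels of the first 100 records,
# leaving the remaining 100 correctly aligned.
perm = list(range(100))
random.shuffle(perm)
ys_mismatched = ys[:]
for i, j in enumerate(perm):
    ys_mismatched[i] = ys[j]

w_clean = fit_slope(xs, ys)
w_mismatched = fit_slope(xs, ys_mismatched)

print(f"slope from matched data:    {w_clean:.3f}")
print(f"slope from mismatched data: {w_mismatched:.3f}")
```

In this toy setting the slope fitted to the matched data should land close to `true_w`, while the slope fitted to the partially shuffled data is pulled toward zero, because the permuted labels are nearly uncorrelated with their features. Recovering the unknown permutation is the combinatorial step that the abstract says limits existing methods.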

Original language: English (US)
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4228-4232
Number of pages: 5
ISBN (Electronic): 9781665405409
State: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: May 23 2022 to May 27 2022

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Virtual, Online
Period: 5/23/22 to 5/27/22

Bibliographical note

Funding Information:
This paper is based upon work supported by Cisco Research and the National Science Foundation under grant number DMS-2134148.

Publisher Copyright:
© 2022 IEEE

Keywords

  • Mismatched Data
  • Permutation Matrix
  • Stochastic Gradient Descent (SGD)
  • Supervised Learning
