CAREER: New Frontiers of Private Learning and Synthetic Data

Project: Research project

Project Details

Description

The vast collection of detailed personal data offers significant benefits to researchers, companies, and policymakers. To protect individual privacy, many organizations in both the public and private sectors have adopted differential privacy as a rigorous privacy measure. However, recent deployments of differential privacy have revealed three key research gaps. First, much of the existing theoretical work in differential privacy focuses on worst-case analyses, which often lead to overly pessimistic results and fail to inform algorithm design in practice. Second, despite recent advances, differentially private algorithms for machine learning and data sharing are still not widely adopted. Third, the lack of comprehensive tools for privacy risk assessment makes it difficult for practitioners to evaluate the effectiveness of differential privacy and to determine appropriate privacy risk parameters. This project aims to address these challenges by expanding the repertoire of privacy-preserving algorithms and developing auditing mechanisms to assess the privacy protection these algorithms provide.

The research focuses on two fundamental and closely related problems: private learning and private synthetic data. In private learning, the goal is to train accurate machine learning models on sensitive data with differential privacy guarantees. In private synthetic data, the goal is to generate, under differential privacy, a synthetic dataset that preserves important statistical trends of the sensitive dataset. The project advances the frontiers of these two problems through three research thrusts. The first thrust develops a theoretical framework that goes beyond pessimistic worst-case analyses to better capture practical scenarios and guide algorithm design. The second thrust designs practical algorithms that are informed by theoretical principles and by the empirical structure of the problems as they arise in practice. The third thrust focuses on privacy attacks and auditing mechanisms that evaluate the privacy risks of learning and synthetic data algorithms. The project also includes a comprehensive educational and outreach program, providing research opportunities for students at different educational levels and developing new courses and educational materials.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status: Active
Effective start/end date: 3/1/24 to 2/28/29

Funding

  • National Science Foundation: $680,000.00
