Abstract
Langevin algorithms are gradient descent methods with additive noise. They have been used for decades in Markov Chain Monte Carlo (MCMC) sampling, optimization, and learning. Their convergence properties for unconstrained non-convex optimization and learning problems have been studied widely in the last few years. Other work has examined projected Langevin algorithms for sampling from log-concave distributions restricted to convex compact sets. For learning and optimization, log-concave distributions correspond to convex losses. In this paper, we analyze the case of non-convex losses with compact convex constraint sets and IID external data variables. We term the resulting method the projected stochastic gradient Langevin algorithm (PSGLA). We show the algorithm achieves a deviation of O(T^{-1/4} (log T)^{1/2}) from its target distribution in 1-Wasserstein distance. For optimization and learning, we show that the algorithm achieves ε-suboptimal solutions, on average, provided that it is run for a time that is polynomial in ε^{-1} and slightly super-exponential in the problem dimension.
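To make the method concrete, below is a minimal sketch of one possible PSGLA loop in Python, assuming the standard projected Langevin update x_{k+1} = Π_C(x_k − η g_k + sqrt(2η/β) w_k), where g_k is a stochastic gradient and w_k is standard Gaussian noise. The Euclidean-ball constraint, the step size `eta`, the inverse temperature `beta`, and the toy non-convex loss in `stoch_grad` are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto {x : ||x|| <= radius},
    a simple example of a compact convex constraint set."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def psgla(stoch_grad, x0, n_steps, eta=1e-3, beta=10.0,
          project=project_ball, rng=None):
    """Projected stochastic gradient Langevin iteration:
    x_{k+1} = Pi_C(x_k - eta * g_k + sqrt(2 * eta / beta) * w_k),
    with stochastic gradient g_k and w_k ~ N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        g = stoch_grad(x, rng)                    # stochastic gradient from IID data
        w = rng.standard_normal(x.shape)          # additive Gaussian noise
        x = project(x - eta * g + np.sqrt(2.0 * eta / beta) * w)
    return x

# Illustrative non-convex loss with IID data: f(x) = ||x||^2 / 2 + E[cos(xi^T x)]
def stoch_grad(x, rng):
    xi = rng.standard_normal(x.shape)             # IID external data variable
    return x - np.sin(xi @ x) * xi

x_final = psgla(stoch_grad, x0=np.zeros(5), n_steps=10_000)
```

With a small step size, long runs of this iteration sample approximately from the Gibbs distribution over the constraint set, which is how PSGLA is used for both sampling and optimization.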
Original language | English (US)
---|---
Pages (from-to) | 2891-2937
Number of pages | 47
Journal | Proceedings of Machine Learning Research
Volume | 134
State | Published - 2021
Event | 34th Conference on Learning Theory, COLT 2021 - Boulder, United States (Duration: Aug 15 2021 → Aug 19 2021)
Bibliographical note
Publisher Copyright: © 2021 A. Lamperski.
Keywords
- Langevin Methods
- Markov Chain Monte Carlo Sampling
- Non-Asymptotic Analysis
- Non-Convex Learning
- Stochastic Gradient Algorithms