Online control basis selection by a regularized actor critic algorithm

Jianjun Yuan, Andrew Lamperski

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Policy gradient algorithms are useful reinforcement learning methods that optimize a control policy by performing stochastic gradient descent with respect to the controller parameters. In this paper, we extend actor-critic algorithms by adding an ℓ1-norm regularization on the actor part, which makes our algorithm automatically select and optimize the useful controller basis functions. Our method is closely related to existing approaches to sparse controller design and actuator selection, but in contrast to these, our approach runs online and does not require a plant model. To utilize ℓ1 regularization online, the actor updates are extended to include an iterative soft-thresholding step. Convergence of the algorithm is proved using methods from stochastic approximation. The effectiveness of our algorithm for control basis and actuator selection is demonstrated on numerical examples.
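
For illustration, the following is a minimal sketch of how an actor update combining a stochastic gradient step with iterative soft-thresholding might look, assuming a controller that is linear in its basis-function weights and a gradient estimate supplied by the critic. The function names, parameter values, and update form are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def actor_update(theta, grad_estimate, step_size, l1_weight):
    """One ISTA-style actor step: gradient move followed by soft-thresholding.

    theta         -- current actor (controller) weights, one per basis function
    grad_estimate -- stochastic estimate of the policy gradient (e.g. from the critic)
    step_size     -- actor learning rate
    l1_weight     -- strength of the l1 regularization
    """
    theta = theta + step_size * grad_estimate            # ascent on the performance objective
    return soft_threshold(theta, step_size * l1_weight)  # shrink small weights exactly to zero

# Hypothetical usage: weights on unused basis functions are driven to zero,
# which is what selects a sparse set of controller basis functions online.
theta = np.array([0.8, -0.05, 0.3, 0.01])
grad = np.array([0.1, 0.0, -0.2, 0.0])
theta = actor_update(theta, grad, step_size=0.5, l1_weight=0.2)
print(theta)
```

The soft-thresholding step is what distinguishes this from a plain policy-gradient update: it is the proximal map of the ℓ1 penalty, so small parameters are set exactly to zero rather than merely shrunk, yielding automatic basis and actuator selection.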

Original language: English (US)
Title of host publication: 2017 American Control Conference, ACC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4448-4453
Number of pages: 6
ISBN (Electronic): 9781509059928
DOIs
State: Published - Jun 29, 2017
Event: 2017 American Control Conference, ACC 2017 - Seattle, United States
Duration: May 24, 2017 - May 26, 2017

Publication series

Name: Proceedings of the American Control Conference
ISSN (Print): 0743-1619

Other

Other: 2017 American Control Conference, ACC 2017
Country/Territory: United States
City: Seattle
Period: 5/24/17 - 5/26/17
