Abstract
The two leading solution paradigms for imitation learning (IL), behavioral cloning (BC) and generative adversarial imitation learning (GAIL), each suffer from notable drawbacks. BC, a supervised learning approach that mimics expert actions, is vulnerable to covariate shift. GAIL applies adversarial training to minimize the discrepancy between expert and learner behaviors, but is prone to unstable training and mode collapse. In this work, we propose DC (Distributional Cloning), a novel IL approach that addresses the covariate shift and mode collapse problems simultaneously. DC directly maximizes the likelihood of observed expert and learner demonstrations, and gradually encourages the learner to evolve toward expert behaviors through an averaging effect. The DC solution framework contains two stages in each training loop: in stage one, the mixed expert and learner state distribution is estimated via SoftFlow, and in stage two, the learner policy is trained to match both the expert's policy and state distribution via ADMM. Experimental evaluation of DC against several baselines on 10 physics-based control tasks reveals superior results in learner policy performance, training stability, and mode distribution preservation.
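The two-stage training loop described in the abstract can be sketched in a minimal, illustrative form. Everything below is a stand-in, not the paper's actual method: a Gaussian kernel density estimate replaces SoftFlow for stage one, and a crude finite-difference density-ascent step on toy 1-D "states" replaces the ADMM policy update of stage two. The names (`kde_log_density`, the toy mode locations) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def kde_log_density(samples, x, bandwidth=0.5):
    """Stand-in for SoftFlow: Gaussian KDE log-density of points x
    under the empirical distribution given by `samples`."""
    diffs = (x[:, None] - samples[None, :]) / bandwidth
    k = np.exp(-0.5 * diffs**2) / (bandwidth * np.sqrt(2 * np.pi))
    return np.log(k.mean(axis=1) + 1e-12)

# Toy 1-D setting: expert states cluster around two modes (multimodal),
# learner states start from a single broad distribution.
expert_states = np.concatenate([rng.normal(-2, 0.3, 200), rng.normal(2, 0.3, 200)])
learner_states = rng.normal(0, 1.0, 400)

for _ in range(5):
    # Stage 1: estimate the mixed expert/learner state distribution.
    mixed = np.concatenate([expert_states, learner_states])
    # Stage 2 (proxy): move learner samples toward higher mixed-density
    # regions via a finite-difference gradient of the log-density --
    # a loose analogue of the averaging effect, not ADMM itself.
    eps = 1e-3
    grad = (kde_log_density(mixed, learner_states + eps)
            - kde_log_density(mixed, learner_states - eps)) / (2 * eps)
    learner_states = learner_states + 0.5 * grad
```

Because the mixture density is re-estimated each loop, the target the learner matches drifts from the mixed distribution toward the expert's as the learner improves, which is the intuition behind the averaging effect; the actual paper optimizes a likelihood objective with a flow model and an ADMM-based policy update rather than this toy gradient step.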
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 23rd IEEE International Conference on Data Mining, ICDM 2023 |
Editors | Guihai Chen, Latifur Khan, Xiaofeng Gao, Meikang Qiu, Witold Pedrycz, Xindong Wu |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 818-827 |
Number of pages | 10 |
ISBN (Electronic) | 9798350307887 |
DOIs | |
State | Published - 2023 |
Event | 23rd IEEE International Conference on Data Mining, ICDM 2023, Shanghai, China. Duration: Dec 1 2023 → Dec 4 2023 |
Publication series
Name | Proceedings - IEEE International Conference on Data Mining, ICDM |
---|---|
ISSN (Print) | 1550-4786 |
Conference
Conference | 23rd IEEE International Conference on Data Mining, ICDM 2023 |
---|---|
Country/Territory | China |
City | Shanghai |
Period | 12/1/23 → 12/4/23 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- imitation learning
- neural ordinary differential equations