SOLVE MINIMAX OPTIMIZATION BY ANDERSON ACCELERATION

Huan He, Shifan Zhao, Yuanzhe Xi, Joyce C. Ho, Yousef Saad

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

Many modern machine learning algorithms such as generative adversarial networks (GANs) and adversarial training can be formulated as minimax optimization. Gradient descent ascent (GDA) is the most commonly used algorithm due to its simplicity. However, GDA can converge to non-optimal minimax points. We propose a new minimax optimization framework, GDA-AM, that views the GDA dynamics as a fixed-point iteration and solves it using Anderson Mixing to converge to the local minimax. It addresses the diverging issue of simultaneous GDA and accelerates the convergence of alternating GDA. We show theoretically that the algorithm can achieve global convergence for bilinear problems under mild conditions. We also empirically show that GDA-AM solves a variety of minimax problems and improves adversarial training on several datasets. Codes are available on Github.
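The abstract describes viewing the GDA update as a fixed-point iteration and accelerating it with Anderson Mixing. The following is a minimal sketch of that idea, not the authors' released code: it applies window-limited Anderson acceleration to the simultaneous-GDA map on a bilinear problem min_x max_y xᵀAy, whose saddle point is the origin. The window size m, step size eta, and the random matrix A are illustrative choices.

```python
import numpy as np

def gda_map(z, A, eta=0.1):
    """One step of simultaneous GDA, viewed as a fixed-point map z -> G(z)."""
    n = A.shape[0]
    x, y = z[:n], z[n:]
    x_new = x - eta * (A @ y)        # gradient descent on x
    y_new = y + eta * (A.T @ x)      # gradient ascent on y
    return np.concatenate([x_new, y_new])

def anderson_gda(A, z0, m=5, iters=200, eta=0.1):
    """Anderson-accelerated fixed-point iteration on the GDA map (beta = 1)."""
    z = z0.copy()
    Z_hist, R_hist = [], []          # past iterates and residuals r = G(z) - z
    for _ in range(iters):
        g = gda_map(z, A, eta)
        r = g - z
        Z_hist.append(z); R_hist.append(r)
        if len(Z_hist) > m + 1:      # sliding window of at most m+1 entries
            Z_hist.pop(0); R_hist.pop(0)
        if len(Z_hist) > 1:
            # Least-squares combination of residual differences (Anderson mixing)
            dR = np.stack([R_hist[i + 1] - R_hist[i] for i in range(len(R_hist) - 1)], axis=1)
            dZ = np.stack([Z_hist[i + 1] - Z_hist[i] for i in range(len(Z_hist) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dR, r, rcond=None)
            z = z + r - (dZ + dR) @ gamma
        else:
            z = g                    # plain GDA step until history is available
    return z

rng = np.random.default_rng(0)
n = 10
A = rng.standard_normal((n, n))
z = anderson_gda(A, rng.standard_normal(2 * n))
print("distance to saddle point:", np.linalg.norm(z))
```

On this bilinear example, plain simultaneous GDA spirals away from the saddle point, while the Anderson-mixed iteration drives the iterate toward the origin, which is consistent with the bilinear convergence result claimed in the abstract.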

Original language: English (US)
State: Published - 2022
Externally published: Yes
Event: 10th International Conference on Learning Representations, ICLR 2022 - Virtual, Online
Duration: Apr 25 2022 - Apr 29 2022

Conference

Conference: 10th International Conference on Learning Representations, ICLR 2022
City: Virtual, Online
Period: 4/25/22 - 4/29/22

Bibliographical note

Funding Information:
This work was funded in part by NSF grants OAC 2003720 and IIS 1838200 and NIH grants 5R01LM013323-03 and 5K01LM012924-03.

Publisher Copyright:
© 2022 ICLR 2022 - 10th International Conference on Learning Representations. All rights reserved.
