Watch and learn: Optimizing from revealed preferences feedback

Aaron Roth; Jonathan Ullman; Zhiwei Steven Wu

doi:10.1145/2897518.2897579

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

33 Scopus citations

Abstract

A Stackelberg game is played between a leader and a follower. The leader first chooses an action, then the follower plays his best response. The goal of the leader is to pick the action that will maximize his payoff given the follower's best response. In this paper we present an approach to solving for the leader's optimal strategy in certain Stackelberg games where the follower's utility function (and thus the subsequent best response of the follower) is unknown. Stackelberg games capture, for example, the following interaction between a producer and a consumer. The producer chooses the prices of the goods he produces, and then a consumer chooses to buy a utility-maximizing bundle of goods. The goal of the seller here is to set prices to maximize his profit, that is, his revenue minus the production cost of the purchased bundle. It is quite natural that the seller in this example should not know the buyer's utility function. However, he does have access to revealed preference feedback: he can set prices, and then observe the purchased bundle and his own profit. We give algorithms for efficiently solving, in terms of both computational and query complexity, a broad class of Stackelberg games in which the follower's utility function is unknown, using only "revealed preference" access to it. This class includes in particular the profit maximization problem, as well as the optimal tolling problem in nonatomic congestion games, when the latency functions are unknown. Surprisingly, we are able to solve these problems even though the optimization problems are non-convex in the leader's actions.
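The revealed preference feedback model described in the abstract can be illustrated with a small toy instance. The Python sketch below is only an illustration under stated assumptions and is not the paper's algorithm: the single good, the buyer utility a*sqrt(x), the linear production cost, and all function names are hypothetical choices made for this example, and the seller's strategy is a plain grid search over prices.

```python
def buyer_best_response(price, a=4.0):
    """Hidden from the seller (assumed utility): buyer maximizes a*sqrt(x) - price*x."""
    # First-order condition a / (2*sqrt(x)) = price gives x = (a / (2*price))**2.
    return (a / (2.0 * price)) ** 2


def revealed_preference_query(price, unit_cost=1.0):
    """What the seller observes after posting a price: the bundle and his own profit."""
    bundle = buyer_best_response(price)
    profit = price * bundle - unit_cost * bundle  # revenue minus production cost
    return bundle, profit


def grid_search_price(prices):
    """Naive leader strategy (not the paper's method): query each price, keep the best."""
    return max(prices, key=lambda p: revealed_preference_query(p)[1])


# Candidate prices 0.5, 0.75, ..., 4.0 (an arbitrary grid chosen for the example).
prices = [0.5 + 0.25 * i for i in range(15)]
best_price = grid_search_price(prices)
best_bundle, best_profit = revealed_preference_query(best_price)
print(best_price, best_bundle, best_profit)
```

In this toy instance the seller never evaluates the buyer's utility directly; he only sees the purchased bundle and the resulting profit for each posted price. A grid search needs one query per candidate price, which is the kind of query cost the paper's results are concerned with reducing.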

Original language	English (US)
Title of host publication	STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing
Editors	Yishay Mansour, Daniel Wichs
Publisher	Association for Computing Machinery
Pages	949-962
Number of pages	14
ISBN (Electronic)	9781450341325
DOIs	https://doi.org/10.1145/2897518.2897579
State	Published - Jun 19 2016
Externally published	Yes
Event	48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016 - Cambridge, United States Duration: Jun 19 2016 → Jun 21 2016

Publication series

Name	Proceedings of the Annual ACM Symposium on Theory of Computing
Volume	19-21-June-2016
ISSN (Print)	0737-8017

Other

Other	48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016
Country/Territory	United States
City	Cambridge
Period	6/19/16 → 6/21/16

Bibliographical note

Funding Information:
Partially supported by an NSF CAREER award, NSF Grants CCF-1101389 and CNS-1065060, and a Google Focused Research Award.

Keywords

Game theory
Learning
Optimization
Revealed preferences

Cite this

Roth, A., Ullman, J., & Wu, Z. S. (2016). Watch and learn: Optimizing from revealed preferences feedback. In Y. Mansour, & D. Wichs (Eds.), STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing (pp. 949-962). (Proceedings of the Annual ACM Symposium on Theory of Computing; Vol. 19-21-June-2016). Association for Computing Machinery. https://doi.org/10.1145/2897518.2897579

Watch and learn: Optimizing from revealed preferences feedback. / Roth, Aaron; Ullman, Jonathan; Wu, Zhiwei Steven.
STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing. ed. / Yishay Mansour; Daniel Wichs. Association for Computing Machinery, 2016. p. 949-962 (Proceedings of the Annual ACM Symposium on Theory of Computing; Vol. 19-21-June-2016).

Roth, A, Ullman, J & Wu, ZS 2016, Watch and learn: Optimizing from revealed preferences feedback. in Y Mansour & D Wichs (eds), STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing. Proceedings of the Annual ACM Symposium on Theory of Computing, vol. 19-21-June-2016, Association for Computing Machinery, pp. 949-962, 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, United States, 6/19/16. https://doi.org/10.1145/2897518.2897579

@inproceedings{38330fd606464bf2913d37adf357f980,

title = "Watch and learn: Optimizing from revealed preferences feedback",

keywords = "Game theory, Learning, Optimization, Revealed preferences",

author = "Aaron Roth and Jonathan Ullman and Wu, {Zhiwei Steven}",

note = "Funding Information: Partially supported by an NSF CAREER award, NSF Grants CCF-1101389 and CNS-1065060, and a Google Focused Research Award.; 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016 ; Conference date: 19-06-2016 Through 21-06-2016",

year = "2016",

month = jun,

day = "19",

doi = "10.1145/2897518.2897579",

language = "English (US)",

series = "Proceedings of the Annual ACM Symposium on Theory of Computing",

publisher = "Association for Computing Machinery",

pages = "949--962",

editor = "Yishay Mansour and Daniel Wichs",

booktitle = "STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing",

}

TY - GEN

T1 - Watch and learn: Optimizing from revealed preferences feedback

T2 - 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016

AU - Roth, Aaron

AU - Ullman, Jonathan

AU - Wu, Zhiwei Steven

N1 - Funding Information: Partially supported by an NSF CAREER award, NSF Grants CCF-1101389 and CNS-1065060, and a Google Focused Research Award.

PY - 2016/6/19

Y1 - 2016/6/19

KW - Game theory

KW - Learning

KW - Optimization

KW - Revealed preferences

UR - http://www.scopus.com/inward/record.url?scp=84979222072&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979222072&partnerID=8YFLogxK

U2 - 10.1145/2897518.2897579

DO - 10.1145/2897518.2897579

M3 - Conference contribution

AN - SCOPUS:84979222072

T3 - Proceedings of the Annual ACM Symposium on Theory of Computing

SP - 949

EP - 962

BT - STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing

A2 - Mansour, Yishay

A2 - Wichs, Daniel

PB - Association for Computing Machinery

Y2 - 19 June 2016 through 21 June 2016

ER -
