Learning While Doing Algorithms for Online Decision Making Problems

  • Wang, Zizhuo (PI)

Project: Research project

Project Details

Description

The research objective of this award is to develop efficient and effective algorithms for sequential decision making problems in which the input data are unknown or uncertain at the beginning of the decision period. Applications of this research are broad, including online routing problems, online advertisement allocation problems, dynamic pricing problems in revenue management, and various other online resource allocation problems in service systems. If successful, the decision tools developed in this research will enable decision makers to better cope with uncertainty in practical problems by absorbing, analyzing and utilizing data in an online fashion, a substantial step toward meeting the challenges presented by the era of big data. The project also serves as an educational opportunity. Graduate students involved will benefit from the training provided by this research. New graduate courses will be developed based on this project, and topics in the research will be incorporated in the Principal Investigator's (PI's) teaching materials.

To achieve the goals of this research, the PI will develop decision rules that simultaneously learn the input data structure of the underlying problem and optimize the long-term objective value. Such decision rules are called learning-while-doing algorithms because they balance two objectives: exploring the underlying data structure (i.e., learning) and exploiting the structure that has already been learned (i.e., doing). Theoretical performance guarantees as well as practical implementation strategies will be derived from this research. Specifically, the PI will (1) further study the design, theory, and robustness of learning-while-doing algorithms in different and more complex problem settings; (2) investigate implementation issues; and (3) apply the algorithms to various important and practical problems. Among the applications, the PI plans to integrate learning customer choice behavior into revenue management problems, which will complement recent interest in such models. The PI also plans to apply the method to problems in the management of the smart grid, an emerging application of online decision making. From the theoretical point of view, the study of the algorithms will advance the understanding of the tradeoffs between learning and doing in such problems, and the analysis will establish important connections between optimization and probability, which will be of independent interest to the research community.
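To make the learning/doing tradeoff concrete, the following is a minimal sketch of an explore-then-exploit rule for a toy dynamic pricing problem. It is an illustration only, not the PI's algorithm: the function name `learn_while_doing`, the fixed price grid, the `buy_prob` demand function, and the fixed exploration budget are all assumptions introduced here.

```python
import random


def learn_while_doing(prices, buy_prob, horizon, explore_rounds, seed=0):
    """Explore-then-exploit pricing over a fixed grid of candidate prices.

    prices         -- candidate prices (hypothetical grid)
    buy_prob       -- maps a price to the true purchase probability;
                      the decision rule never reads it directly, it only
                      observes simulated purchase outcomes
    horizon        -- total number of customers (decision periods)
    explore_rounds -- number of initial periods spent cycling through prices
    """
    rng = random.Random(seed)
    offers = [0] * len(prices)  # how often each price was posted
    sales = [0] * len(prices)   # how often each posted price led to a sale
    revenue = 0.0

    for t in range(horizon):
        if t < explore_rounds:
            # Learning: cycle through the grid to estimate demand.
            i = t % len(prices)
        else:
            # Doing: post the price with the best estimated revenue per offer.
            i = max(
                range(len(prices)),
                key=lambda j: prices[j] * sales[j] / max(offers[j], 1),
            )
        bought = rng.random() < buy_prob(prices[i])
        offers[i] += 1
        sales[i] += bought
        revenue += prices[i] * bought

    return revenue, offers


if __name__ == "__main__":
    # Hypothetical linear demand: purchase probability falls as price rises.
    total, counts = learn_while_doing(
        prices=[1.0, 2.0, 3.0, 4.0],
        buy_prob=lambda p: max(0.0, 1.0 - 0.2 * p),
        horizon=1000,
        explore_rounds=100,
    )
    print(total, counts)
```

A larger `explore_rounds` gives better demand estimates but spends more periods on prices that may be poor; a smaller value risks committing to the wrong price. Choosing that balance well, with provable guarantees, is the kind of question the project description refers to.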

Status: Finished
Effective start/end date: 9/1/14 to 8/31/18

Funding

  • National Science Foundation: $285,599.00
