Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression

Richard Maclin; Jude Shavlik; Lisa Torrey; Trevor Walker; Edward Wild


Computer Science (Duluth)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

66 Scopus citations

Abstract

We present a novel formulation for providing advice to a reinforcement learner that employs support-vector regression as its function approximator. Our new method extends a recent advice-giving technique, called Knowledge-Based Kernel Regression (KBKR), that accepts advice concerning a single action of a reinforcement learner. In KBKR, users can say that in some set of states, an action's value should be greater than some linear expression of the current state. In our new technique, which we call Preference KBKR (Pref-KBKR), the user can provide advice in a more natural manner by recommending that some action is preferred over another in the specified set of states. Specifying preferences essentially means that users are giving advice about policies rather than Q values, which is a more natural way for humans to present advice. We present the motivation for preference advice and a proof of the correctness of our extension to KBKR. In addition, we show empirical results that our method can make effective use of advice on a novel reinforcement-learning task, based on the RoboCup simulator, which we call Breakaway. Our work demonstrates the significant potential of advice-giving techniques for addressing complex reinforcement learning problems, while further demonstrating the use of support-vector regression for reinforcement learning.
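
The two advice forms contrasted in the abstract can be sketched roughly as follows. All notation here is assumed for illustration and does not appear in this record: $x$ is the state vector, the "set of states" is taken to be a region defined by linear constraints $Bx \le d$, $h^{\top}x + \beta$ is the linear expression of the current state, and $Q_a$, $Q_p$, $Q_n$ are the learned value functions for a single action $a$, a preferred action $p$, and a non-preferred action $n$.

```latex
% KBKR-style advice about a single action a (assumed notation):
% within the advised region, the action's value should exceed a linear expression of the state
Bx \le d \;\Longrightarrow\; Q_a(x) \ge h^{\top}x + \beta

% Preference advice (Pref-KBKR), assumed notation:
% within the advised region, action p should be valued above action n by some margin \beta
Bx \le d \;\Longrightarrow\; Q_p(x) - Q_n(x) \ge \beta
```

In the second form the advice constrains only the difference between two action values, which is why it can be read as advice about the policy rather than about any particular Q value.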

Original language	English (US)
Title of host publication	Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Pages	819-824
Number of pages	6
Volume	2
State	Published - Dec 1 2005
Event	20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 - Pittsburgh, PA, United States; Duration: Jul 9 2005 → Jul 13 2005

Other

Other	20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Country/Territory	United States
City	Pittsburgh, PA
Period	7/9/05 → 7/13/05

Cite this

Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression. / Maclin, Richard; Shavlik, Jude; Torrey, Lisa et al.
Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05. Vol. 2 2005. p. 819-824.

Maclin, R, Shavlik, J, Torrey, L, Walker, T & Wild, E 2005, Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression. in Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05. vol. 2, pp. 819-824, 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05, Pittsburgh, PA, United States, 7/9/05.

@inproceedings{eb84540eccd54d33b72248d6074cfcd6,
  title = "Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression",
  abstract = "We present a novel formulation for providing advice to a reinforcement learner that employs support-vector regression as its function approximator. Our new method extends a recent advice-giving technique, called Knowledge-Based Kernel Regression (KBKR), that accepts advice concerning a single action of a reinforcement learner. In KBKR, users can say that in some set of states, an action's value should be greater than some linear expression of the current state. In our new technique, which we call Preference KBKR (Pref-KBKR), the user can provide advice in a more natural manner by recommending that some action is preferred over another in the specified set of states. Specifying preferences essentially means that users are giving advice about policies rather than Q values, which is a more natural way for humans to present advice. We present the motivation for preference advice and a proof of the correctness of our extension to KBKR. In addition, we show empirical results that our method can make effective use of advice on a novel reinforcement-learning task, based on the RoboCup simulator, which we call Breakaway. Our work demonstrates the significant potential of advice-giving techniques for addressing complex reinforcement learning problems, while further demonstrating the use of support-vector regression for reinforcement learning.",
  author = "Richard Maclin and Jude Shavlik and Lisa Torrey and Trevor Walker and Edward Wild",
  year = "2005",
  month = dec,
  day = "1",
  language = "English (US)",
  volume = "2",
  pages = "819--824",
  booktitle = "Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05",
  note = "20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 ; Conference date: 09-07-2005 Through 13-07-2005",
}

TY - GEN

T1 - Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression

AU - Maclin, Richard

AU - Shavlik, Jude

AU - Torrey, Lisa

AU - Walker, Trevor

AU - Wild, Edward

PY - 2005/12/1

Y1 - 2005/12/1

N2 - We present a novel formulation for providing advice to a reinforcement learner that employs support-vector regression as its function approximator. Our new method extends a recent advice-giving technique, called Knowledge-Based Kernel Regression (KBKR), that accepts advice concerning a single action of a reinforcement learner. In KBKR, users can say that in some set of states, an action's value should be greater than some linear expression of the current state. In our new technique, which we call Preference KBKR (Pref-KBKR), the user can provide advice in a more natural manner by recommending that some action is preferred over another in the specified set of states. Specifying preferences essentially means that users are giving advice about policies rather than Q values, which is a more natural way for humans to present advice. We present the motivation for preference advice and a proof of the correctness of our extension to KBKR. In addition, we show empirical results that our method can make effective use of advice on a novel reinforcement-learning task, based on the RoboCup simulator, which we call Breakaway. Our work demonstrates the significant potential of advice-giving techniques for addressing complex reinforcement learning problems, while further demonstrating the use of support-vector regression for reinforcement learning.

UR - http://www.scopus.com/inward/record.url?scp=29344474034&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=29344474034&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:29344474034

VL - 2

SP - 819

EP - 824

BT - Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05

T2 - 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05

Y2 - 9 July 2005 through 13 July 2005

ER -
