Deep Reinforcement Learning for Interactive Systems.

Zhiwen Tang

ERIC Number: ED670127

Record Type: Non-Journal

Publication Date: 2021

Pages: 205

Abstractor: As Provided

ISBN: 979-8-5381-5405-0

ISSN: N/A

EISSN: N/A

Available Date: 0000-00-00

ProQuest LLC, Ph.D. Dissertation, Georgetown University

Artificial intelligence (AI) aims to build intelligent systems that can interact with and assist humans. During an interaction, a system learns the user's requirements and adapts to them to complete tasks. A popular type of interactive system is retrieval-based: the system uses a retrieval function to retrieve relevant answers from a document collection or a knowledge repository. Because of the interactive setting and the goal-oriented objective, reinforcement learning (RL) has become a popular solution. However, developing RL-based interactive systems is not always successful. Prior methods either failed to build representations that provide a complete picture of the task or could not enable the system to control the retrieval results directly. The costly labeling of interactive data further hinders the application of RL-based methods, and RL agents trained on limited annotated data may fail to generalize. The evaluation metrics for interactive systems are often unbounded, and the large variance among search tasks may bias the evaluation. In this dissertation, I formulate the task of building retrieval-based interactive systems as an RL problem and propose a systematic solution for building, training, generalizing, and evaluating RL-based interactive systems. I propose providing the system with a global representation of the knowledge repository to enable full exploration of the state and action spaces. I then employ a differentiable retrieval action to allow the system to control the retrieval process effectively. To improve generalizability, I propose methods that adaptively train the system in randomized environments and generate high-quality, diverse interactions. I also propose a metric normalization scheme that improves the fairness of evaluation. The proposed representation improves substantially over other neural methods on ad-hoc retrieval tasks in the Text REtrieval Conference (TREC) Web track and LETOR.
With the representation and the differentiable retrieval function, the interactive system improves on the state-of-the-art performance on the TREC Dynamic Domain (DD) track. The proposed adaptive training method enhances the system's generalizability when tested in novel environments on TREC DD. The proposed trajectory diversification method boosts RL systems' performance on the MultiWOZ dialogue dataset. The metric normalization scheme has been adopted by TREC DD and enables a fairer and more robust evaluation. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml.]

Descriptors: Artificial Intelligence, Intelligent Tutoring Systems, Objectives, Reinforcement, Learning Processes, Costs, Evaluation Methods, Systems Development, Knowledge Management, Generalizability Theory, Problem Solving

ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml

Publication Type: Dissertations/Theses - Doctoral Dissertations

Education Level: N/A

Audience: N/A

Language: English

Sponsor: National Science Foundation (NSF), Division of Information and Intelligent Systems (IIS)

Authoring Institution: N/A

Grant or Contract Numbers: 1453721

Author Affiliations: N/A