ERIC Number: ED124161
Record Type: Non-Journal
Publication Date: 1976-Mar
Pages: 25
Abstractor: N/A
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Partially Observable Markov Decision Processes Over an Infinite Planning Horizon with Discounting. Technical Report No. 77.
Wollmer, Richard D.
The true state of the system described here is characterized by a probability vector. At each stage, an action must be chosen from a finite set of actions. Each action yields an expected reward, transforms the system to a new state according to a Markov transition matrix, and produces an observable outcome. The problem of finding the maximum total discounted reward as a function of the probability state vector can be formulated as a linear program with an infinite number of constraints. The reward function can be expressed as a partial N-dimensional Maclaurin series, whose coefficients are likewise determined as an optimal solution to a linear program with an infinite number of constraints. Solving a sequence of related, finitely constrained linear programs generates a sequence of solutions that converges to a local minimum of the infinitely constrained program. This model is applicable to computer-assisted instruction systems as well as to other situations. (Author/CH)
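The objects the abstract describes (a probability state vector, an action-dependent Markov transition matrix, an observable outcome, and a discounted reward) can be illustrated with a minimal sketch. The Python below is a generic, hypothetical POMDP belief update and one-step discounted lookahead, not the report's linear-programming or Maclaurin-series method; the names `T`, `O`, `R`, `beta`, and `V` are illustrative assumptions rather than notation from the report.

```python
import numpy as np

# Hypothetical setup: n hidden states, actions indexed by a,
# T[a] is an n x n transition matrix with T[a][s, s2] = P(s2 | s, a),
# O[a] is an n x m observation matrix with O[a][s2, o] = P(o | s2, a),
# R[a] is a length-n expected-reward vector, and beta in (0, 1) is the discount factor.

def belief_update(b, a, o, T, O):
    """Bayes update of the probability state vector after taking action a and observing o."""
    unnormalized = (b @ T[a]) * O[a][:, o]
    return unnormalized / unnormalized.sum()

def one_step_lookahead(b, T, O, R, beta, V):
    """Expected discounted value of each action from belief b, given some
    approximation V of the value function over probability state vectors."""
    values = []
    for a in range(len(T)):
        expected_reward = b @ R[a]
        obs_probs = (b @ T[a]) @ O[a]  # P(o | b, a) for each observation o
        future = sum(
            p_o * V(belief_update(b, a, o, T, O))
            for o, p_o in enumerate(obs_probs)
            if p_o > 0
        )
        values.append(expected_reward + beta * future)
    return np.array(values)
```

Choosing the action with the largest lookahead value at each stage is the standard way such a belief-state model is used; the report's contribution, per the abstract, is computing the value function itself via infinitely constrained linear programs approximated by finitely constrained ones.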
Publication Type: Reports - Descriptive
Education Level: N/A
Audience: N/A
Language: N/A
Sponsor: Advanced Research Projects Agency (DOD), Washington, DC.; Office of Naval Research, Arlington, VA. Personnel and Training Research Programs Office.
Authoring Institution: University of Southern California, Los Angeles. Behavioral Technology Labs.
Grant or Contract Numbers: N/A
Author Affiliations: N/A