Safe Haskell	None
Language	Haskell98

Algorithms.MDP.CTMDP

Description

A continuous-time Markov decision process (CTMDP) is an MDP where transitions between states take a random amount of time. Each transition time is assumed to be exponentially distributed with an action- and state-dependent transition rate.

The record accessors of the CTMDP type conflict with those of the MDP type, so either import only the mkCTMDP and uniformize functions or import this module qualified.

Documentation

data CTMDP a b t

A continuous-time Markov decision process.

A CTMDP is a continuous-time analog of an MDP in which each stage lasts an exponentially distributed amount of time, characterized by a state- and action-dependent rate parameter. Instead of a single cost associated with each state and action, the costs of a CTMDP are split into fixed costs and rate costs. Fixed costs are incurred when an action is chosen, while rate costs are paid for the duration of the stage.
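As a rough illustration of the split between fixed and rate costs, the sketch below computes the expected undiscounted cost of a single stage. It assumes that rate costs accrue per unit of time and relies on the fact that an exponentially distributed stage length with rate r has mean 1/r. The name expectedStageCost is hypothetical and not part of this module, and discounting is ignored.

```haskell
-- Expected undiscounted cost of one stage (hypothetical helper).
-- Assumption: the rate cost is charged per unit of time, and the stage
-- length is exponential with the given rate, so its mean is 1 / rate.
expectedStageCost
  :: Fractional t
  => t  -- ^ fixed cost, incurred once when the action is chosen
  -> t  -- ^ rate cost, charged per unit of time during the stage
  -> t  -- ^ transition rate of the stage (must be positive)
  -> t
expectedStageCost fixedCost rateCost rate = fixedCost + rateCost / rate
```

For example, expectedStageCost 2 3 4 evaluates to 2.75: a fixed cost of 2 plus a rate cost of 3 paid over an expected stage length of 1/4.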

Here the type variable a represents the type of the states, b represents the type of the actions, and t represents the numeric type used in computations. Choosing t to be Double is generally fine, although a higher-precision type can be used instead.

This type should not be constructed directly; use the mkCTMDP constructor instead.

Constructors

CTMDP
Fields
_states :: Vector a
_actions :: Vector b
_fixedCosts :: Vector (Vector t)
_rateCosts :: Vector (Vector t)
_rates :: Vector (Vector t)
_trans :: Vector (Vector (Vector t))
_discount :: t
_actionSet :: Vector (Vector Int)

mkCTMDP

Arguments

:: Eq b
=> [a]	The state space
-> [b]	The action space
-> Transitions a b t	The transition probabilities
-> Rates a b t	The transition rates
-> Costs a b t	The action-dependent fixed costs
-> Costs a b t	The action-dependent rate costs
-> ActionSet a b	The state-dependent actions
-> t	The discount factor in (0, 1]
-> CTMDP a b t	The resulting CTMDP

Create a CTMDP.
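The following sketch shows how the arguments to mkCTMDP might be written for a small machine-repair model. To keep the example self-contained, it declares local type synonyms that are only assumed to match the library's Transitions, Costs and ActionSet (only Rates is documented in this module). The states, actions and all numeric values are invented for illustration.

```haskell
-- Local stand-ins for the library's type synonyms. Their shapes are an
-- assumption, except for Rates, which is documented in this module.
type Transitions a b t = b -> a -> a -> t
type Rates a b t       = b -> a -> t
type Costs a b t       = b -> a -> t
type ActionSet a b     = a -> [b]

-- Hypothetical model: a machine that fails and can be repaired.
data Machine = Working | Broken deriving (Eq, Show, Enum, Bounded)
data Action  = Wait | Repair    deriving (Eq, Show, Enum, Bounded)

states :: [Machine]
states = [minBound .. maxBound]

actions :: [Action]
actions = [minBound .. maxBound]

-- Probability of moving from s to s' under an action.
-- Repairing a broken machine restores it; every other case leads to Broken.
trans :: Transitions Machine Action Double
trans Repair Broken s' = if s' == Working then 1 else 0
trans _      _      s' = if s' == Broken  then 1 else 0

-- Rate of the exponential stage length for each action and state.
rates :: Rates Machine Action Double
rates Repair Broken  = 2.0
rates _      Working = 0.1
rates _      Broken  = 1.0

-- Fixed cost: paid once when the action is chosen.
fixedCosts :: Costs Machine Action Double
fixedCosts Repair _ = 5
fixedCosts Wait   _ = 0

-- Rate cost: paid for the duration of the stage.
rateCosts :: Costs Machine Action Double
rateCosts _ Broken  = 1
rateCosts _ Working = 0

-- Only a broken machine can be repaired.
actionSet :: ActionSet Machine Action
actionSet Working = [Wait]
actionSet Broken  = [Wait, Repair]

-- With the library imported, the model would then be built as:
--   mkCTMDP states actions trans rates fixedCosts rateCosts actionSet 0.9
```

The argument order in the final comment follows the signature of mkCTMDP listed above. The definitions for the disallowed pair (Repair, Working) exist only to keep the functions total.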

type Rates a b t = b -> a -> t

A function mapping an action and a state to a transition rate.

uniformize :: (Ord t, Fractional t) => CTMDP a b t -> MDP a b t

Convert a CTMDP into an MDP.
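For intuition, the sketch below shows the textbook uniformization step for transition probabilities under a single fixed action. It is not taken from this library's source, and it says nothing about how uniformize treats costs or the discount factor. The names uniformTrans, exampleRate, exampleP and exampleUniform are hypothetical. The idea is to pick a uniform rate at least as large as every state's rate and to turn the "missing" rate in slower states into self-transitions.

```haskell
-- Textbook uniformization of transition probabilities for one action.
-- Assumption: maxRate is positive and at least as large as rate s for all s.
uniformTrans
  :: (Eq a, Fractional t)
  => t              -- ^ uniform rate, an upper bound on all rates
  -> (a -> t)       -- ^ original transition rate in each state
  -> (a -> a -> t)  -- ^ original transition probabilities p s s'
  -> a -> a -> t    -- ^ uniformized transition probabilities
uniformTrans maxRate rate p s s'
  | s == s'   = 1 - ratio * (1 - p s s)  -- real self-loops plus fictitious ones
  | otherwise = ratio * p s s'           -- real jumps, scaled down
  where
    ratio = rate s / maxRate

-- Two-state example: state 0 has rate 1, state 1 has rate 4, and each
-- state always jumps to the other one.
exampleRate :: Int -> Double
exampleRate 0 = 1
exampleRate _ = 4

exampleP :: Int -> Int -> Double
exampleP s s' = if s /= s' then 1 else 0

exampleUniform :: Int -> Int -> Double
exampleUniform = uniformTrans 4 exampleRate exampleP
```

In this example the slow state 0 stays put with probability 0.75 and jumps with probability 0.25, while the fast state 1 still jumps with probability 1, so every row remains a probability distribution.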