
How many possible policies in a Markov Decision Process?

Submitted by: @import:stackexchange-cs

Problem

If a policy yields an action for a state, how come a 3-state MDP with 2 possible actions, i.e. $S = \{Hot, Mild, Cold\}$, $A = \{East, West\}$, has 8 possible policies? Isn't it 6 if there are 2 possible actions for every state?

Solution

It looks like you are a bit confused by the notion of an MDP policy. There's a detailed discussion with lots of examples in this question.

A (deterministic) policy is a complete strategy for the environment: it assigns an action to every state, not just to one of them. Example: "go $East$ in any state" is a valid policy in your MDP (though maybe not optimal), as is "go $West$ in any state".

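So there are 3 states and 2 possible actions for each state. The action is chosen independently for each state, so the choices multiply rather than add: $|A|^{|S|} = 2^3 = 8$ possible policies: $EEE$, $EEW$, ..., $WWW$. The number 6 is $|S| \cdot |A|$, which counts individual (state, action) pairs, not policies.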
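
The count can be checked by brute-force enumeration. The following is a minimal sketch, assuming deterministic policies that pick exactly one action per state; the variable names `states`, `actions`, and `policies` are illustrative and not part of the original question.

```python
from itertools import product

states = ["Hot", "Mild", "Cold"]
actions = ["East", "West"]

# A deterministic policy picks one action for each state, so the set of all
# policies is the Cartesian product of `actions` with itself len(states) times.
policies = [
    dict(zip(states, choice))
    for choice in product(actions, repeat=len(states))
]

for policy in policies:
    print(policy)

print(len(policies))  # 8, i.e. len(actions) ** len(states)
```

Each printed dictionary is one policy, for example `{'Hot': 'East', 'Mild': 'East', 'Cold': 'West'}`, which corresponds to $EEW$ in the notation above.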

Context

StackExchange Computer Science Q#81070, answer score: 4
