HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Value Iteration Implementation for MDPs

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
mdpsvalueiterationforimplementation

Problem

I've been working for a while on a decision theory library, and since I've never really had any formal training in code best practices I'd love to hear your feedback. This particular class is one of my older ones, and it performs the Value Iteration algorithm on a supplied class which respects a particular interface.

There are includes in the code, but I'm not sure I should include all relevant code, or give more information about parts not shown. Please tell me if I should.

My main concern is whether I should live getters and setters like this, or if I should templatize/make every method inline so as to improve ease of use/speed. Of course, if there is anything else you feel I should change, I'd be very happy to know.

Header file:

```
#ifndef AI_TOOLBOX_MDP_VALUE_ITERATION_HEADER_FILE
#define AI_TOOLBOX_MDP_VALUE_ITERATION_HEADER_FILE

#include
#include
#include

#include
#include
#include

namespace AIToolbox {
namespace MDP {
/**
* @brief This class applies the value iteration algorithm on a Model.
*
* This algorithm solves an MDP model for the specified horizon, or less
* if convergence is encountered.
*
* The idea of this algorithm is to iteratively compute the
* ValueFunction for the MDP optimal policy. On the first iteration,
* the ValueFunction for horizon 1 is obtained. On the second
* iteration, the one for horizon 2. This process is repeated until the
* ValueFunction has converged to a specific value within a certain
* accuracy, or the horizon requested is reached.
*
* This implementation in particular is ported from the MATLAB
* MDPToolbox (although it is simplified).
*/
class ValueIteration {
public:
/**
* @brief Basic constructor.
*
* The epsilon parameter must be >= 0.0, otherwise the
* constru

Solution

I personally don't see the point in splitting things into hpp/cpp files here. For such a small amount of functionality that is basically just getters and setters, you may as well just place them in the header file, along with everything else.

I dislike the use of std::enable_if to toggle functions on and off when it isn't absolutely necessary. It does have a place, but this is generally when you need to perform overload resolution between multiple functions based on template parameters. If you simply want to ensure that some condition is met for a given template parameter, prefer to use static_assert. Without seeing exactly what the implementation of is_model is, it's hard to say exactly what this should be, but I imagine something like:

template 
QFunction ValueIteration::computeQFunction(const M & model, const Table2D & ir) const {
    static_assert(is_model::value, "M must be a model!");
    //Implementation
    .....
}


On a separate note, you're missing an include here: there should be an #include for std::enable_if. You're also missing an #include (for things like std::max_element and std::transform), and an #include for std::fabs. Even if other internal headers of yours are pulling these in, it's bad practice to rely on that - each file should pull in everything it needs by itself.

Perhaps the variable names S and A make sense in your particular context, but if possible, I'd try and give them more descriptive names.

Is there a reason that you've created an inner scope in ValueIterator::operator()?

template ::value, int>::type>
std::tuple ValueIteration::operator()(const M & model) {
// Extract necessary knowledge from model so we don't have to pass it around
S = model.getS();
A = model.getA();
discount_ = model.getDiscount();

// Inner scope here?
{
    // Verify that parameter value function is compatible.
    size_t size = std::get(vParameter_).size();
    if ( size != S ) {
        if ( size != 0 )
            std::cerr << "AIToolbox: Size of starting value function in ValueIteration::solve() is incorrect, ignoring...\n";
        // Defaulting
        v1_ = makeValueFunction(S);
    }
    else
        v1_ = vParameter_;
}


Generally when I see this, I'm looking for something that uses RAII to clean thing up (files, locks, threads, etc). I can't see any particular reason for it here, which is a bit confusing when reading it.

Any particular reason why ValueFunction is passed by pointer here?

void ValueIteration::bellmanOperator(const QFunction & q, ValueFunction * v) const;


There are no check for nullptr, so I'd expect it to be passed by reference.

Mostly your code looks pretty reasonable. Using {} around for/if/etc has been mentioned, and I'll second that.

Code Snippets

template <typename M>
QFunction ValueIteration::computeQFunction(const M & model, const Table2D & ir) const {
    static_assert(is_model<M>::value, "M must be a model!");
    //Implementation
    .....
}
template <typename M, typename std::enable_if<is_model<M>::value, int>::type>
std::tuple<bool, ValueFunction, QFunction> ValueIteration::operator()(const M & model) {
// Extract necessary knowledge from model so we don't have to pass it around
S = model.getS();
A = model.getA();
discount_ = model.getDiscount();

// Inner scope here?
{
    // Verify that parameter value function is compatible.
    size_t size = std::get<VALUES>(vParameter_).size();
    if ( size != S ) {
        if ( size != 0 )
            std::cerr << "AIToolbox: Size of starting value function in ValueIteration::solve() is incorrect, ignoring...\n";
        // Defaulting
        v1_ = makeValueFunction(S);
    }
    else
        v1_ = vParameter_;
}
void ValueIteration::bellmanOperator(const QFunction & q, ValueFunction * v) const;

Context

StackExchange Code Review Q#60874, answer score: 4

Revisions (0)

No revisions yet.