patterncppMinor
Task scheduler with dependencies
Viewed 0 times
withschedulertaskdependencies
Problem
I have need for a task scheduler determined by a directed graph. The tasks are held in a
To set up a task dependency graph as below, use the driver provided.
scheduler_driver.hpp:
scheduler_driver.cpp:
scheduler.h:
```
#ifndef __SCHEDULER2_H__
#define __SCHEDULER2_H__
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#in
std::vector, while the dependency graph is held in a adjacency_list (bidirectionalS so that I have access to the in_degree() function). A single dispatch will start the tasks. Are there obvious improvements here?To set up a task dependency graph as below, use the driver provided.
scheduler_driver.hpp:
#ifndef __SCHEDULER_DRIVER_HPP__
#define __SCHEDULER_DRIVER_HPP__
#include
#include
#include
#include
#include
#include "scheduler.h"
#endifscheduler_driver.cpp:
#include "scheduler_driver.hpp"
enum task_nodes
{
task_0,
task_1,
task_2,
task_3,
task_4,
task_5,
task_6,
task_7,
task_8,
task_9,
N
};
int basic_task(int a, int d)
{
std::chrono::milliseconds sleepDuration(d);
std::this_thread::sleep_for(sleepDuration);
std::cout F;
Graph deps(N);
boost::add_edge(task_0, task_1, deps);
boost::add_edge(task_0, task_2, deps);
boost::add_edge(task_0, task_3, deps);
boost::add_edge(task_1, task_4, deps);
boost::add_edge(task_1, task_5, deps);
boost::add_edge(task_1, task_6, deps);
boost::add_edge(task_2, task_7, deps);
boost::add_edge(task_2, task_8, deps);
boost::add_edge(task_2, task_9, deps);
std::vector tasks =
{
std::bind(basic_task, 0, 1000),
std::bind(basic_task, 1, 1000),
std::bind(basic_task, 2, 1000),
std::bind(basic_task, 3, 1000),
std::bind(basic_task, 4, 1000),
std::bind(basic_task, 5, 1000),
std::bind(basic_task, 6, 1000),
std::bind(basic_task, 7, 1000),
std::bind(basic_task, 8, 1000),
std::bind(basic_task, 9, 1000)
};
scheduler *s = new scheduler(std::move(deps), std::move(tasks));
s->doit();
return 0;
}scheduler.h:
```
#ifndef __SCHEDULER2_H__
#define __SCHEDULER2_H__
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#in
Solution
Here are some things that may help you improve your code.
Be careful with signed and unsigned
In the
Be careful with namespace assumptions
The C++ standard says that
Avoid
The
This may seem convenient, but it means that any code that includes this header has now brought everything from the
it would fail to compile because of the name clash on
Avoid data races
The
Note that
I know this is just test code, but if the actual function called within your code uses shared resources, you should apply the same technique.
Consider exceptions
If
Consider an alternative representation
The graph-based mechanism you current have seems to work, but I wonder if it might not be made simpler. In particular, one thing that comes to mind is the use of a
"Don't throw away your future"
OK, that sounds like it should be fatherly advice on life, but it's actually a lot more minor than that. Within the
Be careful with signed and unsigned
In the
doit routine and various others, the code compares an int i to a size_t num_tasks, but size_t is unsigned and int is signed. Instead, declare both variables as size_t types. Also, see the next suggestion.Be careful with namespace assumptions
The C++ standard says that
size_t is defined in the std namespace. It might also be defined as ::size_t (and often is) but it's not guaranteed. Better would be to use std::size_t explicitly.Avoid
using namespace in headersThe
scheduler.h header includes this line:using namespace boost;This may seem convenient, but it means that any code that includes this header has now brought everything from the
boost namespace into the global namespace which is a recipe for annoying name clashes later. If, for example, we used both your header and code like this:#include "scheduler_driver.hpp"
#include
#include
void clash() {
using namespace std;
array, 5> m;
}it would fail to compile because of the name clash on
array. In this artificial example, it's easy to spot, but in real code it's a headache that will have a future programmer cursing your name. It's easy to fix -- just remove that line and add the boost:: namespace prefix explicitly where needed.Avoid data races
The
basic_task function accesses std::cout but that's a single resource that might simultaneously be used by other threads. One fix for this is to use a std::mutex like this:static std::mutex cout_mutex;
// wherever cout is used:
some_function() {
std::lock_guard lock(cout_mutex);
std::cout << "Now we can do this safely!\n";
}Note that
std::lock_guard is intended to be used via RAII so that the lock is obtained when the object is created and released when it's destroyed, making it easy to avoid the bug of forgetting to unlock the mutex.I know this is just test code, but if the actual function called within your code uses shared resources, you should apply the same technique.
Consider exceptions
If
valid() is not true when std::future::get() is called, the behavior is undefined. One way that can happen is if the future throws an exception before the computation of the result is complete. That's not going to happen with this toy code, but if you're going to apply this scheduler to real tasks, that should be taken into account.Consider an alternative representation
The graph-based mechanism you current have seems to work, but I wonder if it might not be made simpler. In particular, one thing that comes to mind is the use of a
std::priority_queue in which the priority is simply the depth of that node in the dependency tree. The difference is that although it would mean that none of the leaf tasks would get executed until all of the higher priority tasks were executed, the priority queue has the benefit of constant lookup time. Might not be a big difference with this code, (yours is linear) but it might be worth considering if you have a very large number of tasks."Don't throw away your future"
OK, that sounds like it should be fatherly advice on life, but it's actually a lot more minor than that. Within the
task_thread() routine, the value of the future.get() is assigned to val but never used.Code Snippets
using namespace boost;#include "scheduler_driver.hpp"
#include <array>
#include <boost/array.hpp>
void clash() {
using namespace std;
array<array<int, 5>, 5> m;
}static std::mutex cout_mutex;
// wherever cout is used:
some_function() {
std::lock_guard<std::mutex> lock(cout_mutex);
std::cout << "Now we can do this safely!\n";
}Context
StackExchange Code Review Q#122176, answer score: 4
Revisions (0)
No revisions yet.