Accepting user defined functions for custom map reduce functionality in C++

Submitted by: @import:stackexchange-codereview

Problem

I am implementing map- and reduce-style functions for processing geospatial raster datasets.
I would like the map and reduce functions to accept a user-defined function as input, which will be applied to the raster dataset. Currently I am using function pointers. Is this a good starting point? Could there be problems scaling this to a large code base, for example when accepting complicated algorithms?
Project Aims:

As well as providing the mentioned map and reduce functions for users to pass their own custom functions to, I would eventually like to provide a library of functions which can be passed to map/reduce. As such a library operates on arrays, it may be useful outside of the geospatial domain, so having it decoupled from the map/reduce library is quite useful - users could ignore that and just pass arrays to this 'algorithms' library. I think my initial approach to this would be to wrap selected functions from OpenCV (or another, similar library) which are useful for geospatial raster analysis, before developing other custom functions.

Given the above, I suppose the overall aim could be thought of as a marriage of GDAL and OpenCV while remaining incredibly flexible and extensible.
The Code

So far I have only implemented the map function and the template for a RasterProcess class.

header:

```
// Include processing functions

/**
* \brief Definition of a raster processing function.
*
* A GALGRasterProcessFn accepts an array of data as input, applies custom logic and writes the output to padfOutArray.
* Such a function can be passed to GALGRunRasterProcess to apply custom processing to a GDALDataset in chunks and create
* a new GDALDataset.
*
* @param padfInArray The input array of data.
*
* @param padfOutArray The output array of data. On first call (via GALGRunRasterProcess) this will be an empty, initialised array,
* which should be populated with the result of calculations on padfInArray. In subsequent calls it will contain t
```

Solution

In modern C++, an algorithm is conventionally a function template, and the callback is an argument whose type is an automatically deduced template parameter.

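```
// C++03
template <class F>
void traverse(F& cb) { ... }

// C++20 (abbreviated function template; some compilers accepted it earlier as an extension)
void traverse(auto& cb) { ... }
```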


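Such a signature lets you pass anything callable: a raw function pointer, an instance of a class with an overloaded operator(), or an instance of std::function with a compatible signature. Note that a non-const reference parameter cannot bind to a temporary, so a lambda written inline at the call site needs a by-value or forwarding-reference (F&&) parameter instead.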
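To make the point about "anything callable" concrete, here is a minimal sketch. The `map_array` template, `twice`, `AddOffset` and `demo_callables` are hypothetical names invented for illustration (they are not part of the asker's GALG code), and a `std::vector<double>` stands in for a raster chunk. The callback is taken by value, as the standard library algorithms do, so temporaries can be passed directly.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical algorithm: applies cb to every element and returns the results.
// F is deduced, so any callable that accepts a double works.
template <class F>
std::vector<double> map_array(const std::vector<double>& in, F cb) {
    std::vector<double> out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out.push_back(cb(in[i]));
    }
    return out;
}

// 1. A plain function, passed as a raw function pointer.
double twice(double x) { return 2.0 * x; }

// 2. A class with an overloaded call operator.
struct AddOffset {
    double offset;
    double operator()(double x) const { return x + offset; }
};

// Runs all three kinds of callable over the same input.
inline std::vector<std::vector<double>> demo_callables() {
    const std::vector<double> data = {1.0, 2.0, 3.0};
    // 3. A std::function with a compatible signature.
    std::function<double(double)> negate = [](double x) { return -x; };
    return {
        map_array(data, &twice),           // function pointer
        map_array(data, AddOffset{10.0}),  // function object
        map_array(data, negate),           // std::function
    };
}
```

The same `map_array` definition serves all three calls; the compiler generates one instantiation per callback type.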

One reason to accept something more generic than a function pointer is that callbacks often need state, which is difficult to implement with a raw function pointer. Another is that you might want to be C++11/14-friendly and let people use lambdas, including generic ones. For instance, a prototype snippet for exploring your algorithm might be something very localized and easy to type, such as:

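```
// std::tuple has no operator<<, so each argument is printed with a C++17 fold expression
map([&](auto&&... args) { ((std::cout << args << ' '), ...); std::cout << std::endl; }, ...);
```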
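The remark about stateful callbacks can be sketched as follows. Everything here is an assumption made for illustration: `for_each_chunk`, `MeanAccumulator`, `chunked_mean` and `count_chunks` are hypothetical names, and a flat `std::vector<double>` processed in fixed-size chunks stands in for chunked raster access. The callback is taken by reference so that the state it accumulates is still visible to the caller afterwards.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: calls cb once per chunk of the input.
// chunk must be greater than zero. cb is taken by reference so that
// whatever state it accumulates survives the call.
template <class F>
void for_each_chunk(const std::vector<double>& in, std::size_t chunk, F& cb) {
    for (std::size_t start = 0; start < in.size(); start += chunk) {
        const std::size_t end =
            (start + chunk < in.size()) ? start + chunk : in.size();
        cb(&in[start], end - start);
    }
}

// Stateful callback: keeps a running sum and count across chunks.
// A raw function pointer would need globals to do the same.
struct MeanAccumulator {
    double sum = 0.0;
    std::size_t count = 0;
    void operator()(const double* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) sum += data[i];
        count += n;
    }
    double mean() const { return count ? sum / count : 0.0; }
};

inline double chunked_mean(const std::vector<double>& in, std::size_t chunk) {
    MeanAccumulator acc;
    for_each_chunk(in, chunk, acc);
    return acc.mean();
}

// The same idea with the state held in a lambda capture instead of a class.
inline std::size_t count_chunks(const std::vector<double>& in, std::size_t chunk) {
    std::size_t calls = 0;
    auto counter = [&calls](const double*, std::size_t) { ++calls; };
    for_each_chunk(in, chunk, counter);
    return calls;
}
```

Both callbacks go through the same template; neither needs a `void*` user-data parameter, which is the usual workaround when only function pointers are accepted.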


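To keep object file size to a minimum, you can explicitly instantiate the most commonly used specializations (say, the raw function pointer case) in the translation unit that defines the algorithm, and since C++11 declare them extern template elsewhere so that other translation units do not instantiate them again.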
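A minimal sketch of that pre-instantiation, shown as a single file for brevity. The names `apply_in_place`, `UnaryFn`, `square` and `demo_preinstantiated` are hypothetical, and the comments mark where each part would live in a real header/source split.

```cpp
#include <cstddef>

// Signature of the "raw pointer case" that is worth pre-instantiating.
using UnaryFn = double (*)(double);

// --- would live in the header ---
// Hypothetical algorithm: overwrites each element with cb(element).
template <class F>
void apply_in_place(double* data, std::size_t n, F cb) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = cb(data[i]);
    }
}

// Explicit instantiation declaration (C++11): translation units that include
// the header reuse the single instantiation below instead of emitting their own.
extern template void apply_in_place<UnaryFn>(double*, std::size_t, UnaryFn);

// --- would live in exactly one .cpp file ---
// Explicit instantiation definition: the code for this case is generated here.
template void apply_in_place<UnaryFn>(double*, std::size_t, UnaryFn);

// --- example caller ---
inline double square(double x) { return x * x; }

inline double demo_preinstantiated() {
    double data[3] = {1.0, 2.0, 3.0};
    apply_in_place(data, 3, &square);  // uses the pre-instantiated case
    apply_in_place(data, 3, [](double x) { return x + 1.0; });  // instantiated on demand
    return data[0] + data[1] + data[2];
}
```

Because the template definition stays in the header, callers can still pass lambdas or function objects; only the function pointer case is shared across translation units.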

Context

StackExchange Code Review Q#136322, answer score: 2
