HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcppMinor

Parser for a custom scene definition format for a raytracer

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
definitionformatraytracerparsercustomforscene

Problem

For a raytracer I’ve been writing with a classmate, we use a custom scene definition format that allows specifying shapes, composite shapes, materials, lights, cameras and transform and render commands.

The assets of a scene (i.e. shapes, composite shapes, materials, lights and cameras) are stored in the data transfer object Scene. The task of this parser is to take a text file with the scene definition, parse its rules and transform them into objects which populate a Scene object.

A few pieces of the code I left out of simplicity, because I want the review to focus on the parsing strategy. Here is a list of these things and what they do.

  • Logger: A simple logger class (a singleton) that prints messages when in debug mode. It’s also possible to generate a log file of the currently stored messages



  • load_file(file_path): Takes a path as a string, loads the file and returns its content as a string



  • sanitize_inputs(file_string): Removes leading/trailing spaces, consecutive spaces, comments (starting with #, anywhere in the line) and empty lines



  • split(file_string, delimiter): Takes a string and splits it by a character. Returns a vector of strings



  • order_rule_vector(rule_vector): Some rules in the scene definition format depend on other rules. To allow the user to write them out in any order, the vector of rules is partitioned first so that rules with dependencies come after their dependants



Also I left out the code for parsing lights, cameras, transform commands and render commands, because the strategy is identical to what’s shown in the code below.

Scene Definition Format

Our raytracer will implement the Phong reflection model. Currently there are regular spheres and axis-aligned boxes.

```
# name, ambient, diffuse, specular terms, shininess
define material red 1 0 0 1 0 0 1 0 0 1

# name, min, max positions, material name
define shape box b1 -100 -80 -20 1002 80 -100 red

# name, center position, radius, material name
define

Solution

Getting rid of the branches

I don’t like that code. It’s a branch nightmare.

Yes, let's get rid of the branches. Consider that all the branches are in the form of:

if (command == "foo") {
    // parse rule_stream further
    …
}


When you have a sequence of homogenous if/else statements like that, there are alternatives to consider. One would be a switch-statement, but those unfortunately only work with integers and enums. Another option would be to create a map from strings to std::functions:

using ParseFunction = std::function;

static const std::unordered_map commands = {
    {"define", parse_define},
    {"transform", parse_transform},
    {"render", parse_render},
};


Then inside load_scene(), you would write something like:

std::string command;
rule_stream >> command;
auto it = commands.find(command);

if (it != commands.end()) {
    auto parse_function = it->second;
    parse_function(rule_stream, scene->get());
} else {
    logger->log("--- Skipping unknown command " + command);
    continue;
}


Then you can define a function for each command, for example:

void parse_define(std::istream& rule_stream, Scene* scene) {
    std::string define_target;
    rule_stream >> define_target;
    …
    scene->materials.insert(…);
    …
}


That shows how to get rid of the outer ifs. You can also do the same for targets and shapes.
Avoiding writing rule_stream >>

I write rule_stream >> a lot.

This is because you are reading each individual value. You can write your own overloads of operator>>() for the types you are using. For example:

std::istream& operator>>(std::istream& is, glm::vec3 &v) {
    return is >> v.x >> v.y >> v.z;
}


And then you can write:

glm::vec3 min, max;
rule_stream >> min >> max;


Split up large functions

Another reason your code looks bad is because everything is in one huge function. It is easy to lose overview. The techniques I've shown above already introduce new functions that in turn simplify your load_scene(). But even without those you could have split up the function yourself:

static void parse_define(std::istream& rule_stream, Scene* scene) {
    …
}
…
static void parse_rule(const std::string& rule, Scene* scene) {
    std::istringstream rule_stream{ rule };
    std::string command;
    rule_stream >> command;
    if (command == "define") {
        parse_define(rule_stream, scene):
    } else if (command == "transform") {
        …
    } …
};

std::shared_ptr load_scene(std::string const& file_path) {
    …
    for (auto&& rule : rule_vector) {
        parse_rule(rule, scene->get());
    }

    return scene;
}


Are the std::shared_ptrs necessary?

I see you use std::shared_ptrs a lot. But are they really necessary? You could have load_scene() return a Scene directly; even without C++17's guaranteed copy elision, if Scene has a move constructor, it could move all its contents efficiently to the caller. If you do need a smart pointer because of polymorphism, consider whether you can just use astd::unique_ptr instead of the more expensive std::shared_ptr.

Let's have a look at scene->materials. This looks like a std::map or std::unordered_map. These containers will already do memory management for you, and pointers to values stored in the map are guaranteed to be stable. So if it was defined as:

struct Scene {
    …
    std::unordered_map materials;
    …
};


The code in your parser could be simplified a lot:

std::string mat_name;
Color ambient, diffuse, specular;
float shininess;

rule_stream >> mat_name >> ambient >> diffuse >> specular >> shininess;

auto result = scene->materials.try_emplace(
    mat_name, mat_name, ambient, diffuse, specular, shininess
);

if (!result.second) {
    logger->log("Error: Material '" + mat_name + "' wasn't added to the scene object.");
}


(I used C++17's try_emplace() here, you can achieve the same in other ways in earlier versions of C++, but less elegantly).

If you ensure any material used by shapes is in scene->materials before the shapes using them are added, then you can just store a reference to the Material in a Shape instead of using a std::shared_ptr.

If you have a default constructor and an operator>> overload for Material, then you could also consider writing:

Material mat;
rule_stream >> mat;
if (scene->materials.try_emplace(mat.name(), mat) {
    logger->log("Error: Material '" + mat.name() + "' wasn't added to the scene object.");
}


Do you really need to order the rules first?

You are ordering the rules before parsing them. That means you have to read the whole file in memory first. But do you really need that? I see you are already handling missing materials, by assigning a default constructed Material to a Shape. But why not add that default Material under the given name to scene->materials, and then when you later do encounter the rule defining that Material, just update it

Code Snippets

if (command == "foo") {
    // parse rule_stream further
    …
}
using ParseFunction = std::function<void(std::istream&, Scene*)>;

static const std::unordered_map<std::string, ParseFunction> commands = {
    {"define", parse_define},
    {"transform", parse_transform},
    {"render", parse_render},
};
std::string command;
rule_stream >> command;
auto it = commands.find(command);

if (it != commands.end()) {
    auto parse_function = it->second;
    parse_function(rule_stream, scene->get());
} else {
    logger->log("--- Skipping unknown command " + command);
    continue;
}
void parse_define(std::istream& rule_stream, Scene* scene) {
    std::string define_target;
    rule_stream >> define_target;
    …
    scene->materials.insert(…);
    …
}
std::istream& operator>>(std::istream& is, glm::vec3 &v) {
    return is >> v.x >> v.y >> v.z;
}

Context

StackExchange Code Review Q#134089, answer score: 2

Revisions (0)

No revisions yet.