snippetcppMinor
Parser for a custom scene definition format for a raytracer
Viewed 0 times
definitionformatraytracerparsercustomforscene
Problem
For a raytracer I’ve been writing with a classmate, we use a custom scene definition format that allows specifying shapes, composite shapes, materials, lights, cameras and transform and render commands.
The assets of a scene (i.e. shapes, composite shapes, materials, lights and cameras) are stored in the data transfer object
A few pieces of the code I left out of simplicity, because I want the review to focus on the parsing strategy. Here is a list of these things and what they do.
Also I left out the code for parsing lights, cameras, transform commands and render commands, because the strategy is identical to what’s shown in the code below.
Scene Definition Format
Our raytracer will implement the Phong reflection model. Currently there are regular spheres and axis-aligned boxes.
```
# name, ambient, diffuse, specular terms, shininess
define material red 1 0 0 1 0 0 1 0 0 1
# name, min, max positions, material name
define shape box b1 -100 -80 -20 1002 80 -100 red
# name, center position, radius, material name
define
The assets of a scene (i.e. shapes, composite shapes, materials, lights and cameras) are stored in the data transfer object
Scene. The task of this parser is to take a text file with the scene definition, parse its rules and transform them into objects which populate a Scene object.A few pieces of the code I left out of simplicity, because I want the review to focus on the parsing strategy. Here is a list of these things and what they do.
Logger: A simple logger class (a singleton) that prints messages when in debug mode. It’s also possible to generate a log file of the currently stored messages
load_file(file_path): Takes a path as a string, loads the file and returns its content as a string
sanitize_inputs(file_string): Removes leading/trailing spaces, consecutive spaces, comments (starting with#, anywhere in the line) and empty lines
split(file_string, delimiter): Takes a string and splits it by a character. Returns a vector of strings
order_rule_vector(rule_vector): Some rules in the scene definition format depend on other rules. To allow the user to write them out in any order, the vector of rules is partitioned first so that rules with dependencies come after their dependants
Also I left out the code for parsing lights, cameras, transform commands and render commands, because the strategy is identical to what’s shown in the code below.
Scene Definition Format
Our raytracer will implement the Phong reflection model. Currently there are regular spheres and axis-aligned boxes.
```
# name, ambient, diffuse, specular terms, shininess
define material red 1 0 0 1 0 0 1 0 0 1
# name, min, max positions, material name
define shape box b1 -100 -80 -20 1002 80 -100 red
# name, center position, radius, material name
define
Solution
Getting rid of the branches
I don’t like that code. It’s a branch nightmare.
Yes, let's get rid of the branches. Consider that all the branches are in the form of:
When you have a sequence of homogenous
Then inside
Then you can define a function for each command, for example:
That shows how to get rid of the outer
Avoiding writing
I write
This is because you are reading each individual value. You can write your own overloads of
And then you can write:
Split up large functions
Another reason your code looks bad is because everything is in one huge function. It is easy to lose overview. The techniques I've shown above already introduce new functions that in turn simplify your
Are the
I see you use
Let's have a look at
The code in your parser could be simplified a lot:
(I used C++17's
If you ensure any material used by shapes is in
If you have a default constructor and an
Do you really need to order the rules first?
You are ordering the rules before parsing them. That means you have to read the whole file in memory first. But do you really need that? I see you are already handling missing materials, by assigning a default constructed
I don’t like that code. It’s a branch nightmare.
Yes, let's get rid of the branches. Consider that all the branches are in the form of:
if (command == "foo") {
// parse rule_stream further
…
}When you have a sequence of homogenous
if/else statements like that, there are alternatives to consider. One would be a switch-statement, but those unfortunately only work with integers and enums. Another option would be to create a map from strings to std::functions:using ParseFunction = std::function;
static const std::unordered_map commands = {
{"define", parse_define},
{"transform", parse_transform},
{"render", parse_render},
};Then inside
load_scene(), you would write something like:std::string command;
rule_stream >> command;
auto it = commands.find(command);
if (it != commands.end()) {
auto parse_function = it->second;
parse_function(rule_stream, scene->get());
} else {
logger->log("--- Skipping unknown command " + command);
continue;
}Then you can define a function for each command, for example:
void parse_define(std::istream& rule_stream, Scene* scene) {
std::string define_target;
rule_stream >> define_target;
…
scene->materials.insert(…);
…
}That shows how to get rid of the outer
ifs. You can also do the same for targets and shapes.Avoiding writing
rule_stream >>I write
rule_stream >> a lot.This is because you are reading each individual value. You can write your own overloads of
operator>>() for the types you are using. For example:std::istream& operator>>(std::istream& is, glm::vec3 &v) {
return is >> v.x >> v.y >> v.z;
}And then you can write:
glm::vec3 min, max;
rule_stream >> min >> max;Split up large functions
Another reason your code looks bad is because everything is in one huge function. It is easy to lose overview. The techniques I've shown above already introduce new functions that in turn simplify your
load_scene(). But even without those you could have split up the function yourself:static void parse_define(std::istream& rule_stream, Scene* scene) {
…
}
…
static void parse_rule(const std::string& rule, Scene* scene) {
std::istringstream rule_stream{ rule };
std::string command;
rule_stream >> command;
if (command == "define") {
parse_define(rule_stream, scene):
} else if (command == "transform") {
…
} …
};
std::shared_ptr load_scene(std::string const& file_path) {
…
for (auto&& rule : rule_vector) {
parse_rule(rule, scene->get());
}
return scene;
}Are the
std::shared_ptrs necessary?I see you use
std::shared_ptrs a lot. But are they really necessary? You could have load_scene() return a Scene directly; even without C++17's guaranteed copy elision, if Scene has a move constructor, it could move all its contents efficiently to the caller. If you do need a smart pointer because of polymorphism, consider whether you can just use astd::unique_ptr instead of the more expensive std::shared_ptr.Let's have a look at
scene->materials. This looks like a std::map or std::unordered_map. These containers will already do memory management for you, and pointers to values stored in the map are guaranteed to be stable. So if it was defined as:struct Scene {
…
std::unordered_map materials;
…
};The code in your parser could be simplified a lot:
std::string mat_name;
Color ambient, diffuse, specular;
float shininess;
rule_stream >> mat_name >> ambient >> diffuse >> specular >> shininess;
auto result = scene->materials.try_emplace(
mat_name, mat_name, ambient, diffuse, specular, shininess
);
if (!result.second) {
logger->log("Error: Material '" + mat_name + "' wasn't added to the scene object.");
}(I used C++17's
try_emplace() here, you can achieve the same in other ways in earlier versions of C++, but less elegantly).If you ensure any material used by shapes is in
scene->materials before the shapes using them are added, then you can just store a reference to the Material in a Shape instead of using a std::shared_ptr.If you have a default constructor and an
operator>> overload for Material, then you could also consider writing:Material mat;
rule_stream >> mat;
if (scene->materials.try_emplace(mat.name(), mat) {
logger->log("Error: Material '" + mat.name() + "' wasn't added to the scene object.");
}Do you really need to order the rules first?
You are ordering the rules before parsing them. That means you have to read the whole file in memory first. But do you really need that? I see you are already handling missing materials, by assigning a default constructed
Material to a Shape. But why not add that default Material under the given name to scene->materials, and then when you later do encounter the rule defining that Material, just update it Code Snippets
if (command == "foo") {
// parse rule_stream further
…
}using ParseFunction = std::function<void(std::istream&, Scene*)>;
static const std::unordered_map<std::string, ParseFunction> commands = {
{"define", parse_define},
{"transform", parse_transform},
{"render", parse_render},
};std::string command;
rule_stream >> command;
auto it = commands.find(command);
if (it != commands.end()) {
auto parse_function = it->second;
parse_function(rule_stream, scene->get());
} else {
logger->log("--- Skipping unknown command " + command);
continue;
}void parse_define(std::istream& rule_stream, Scene* scene) {
std::string define_target;
rule_stream >> define_target;
…
scene->materials.insert(…);
…
}std::istream& operator>>(std::istream& is, glm::vec3 &v) {
return is >> v.x >> v.y >> v.z;
}Context
StackExchange Code Review Q#134089, answer score: 2
Revisions (0)
No revisions yet.