patterncppMinor
File Output given repeated small operations
Viewed 0 times
fileoperationsrepeatedoutputsmallgiven
Problem
I want to write a file which contains values for each cell of a CFD computation. This will be an input file to the program which runs the computation.
Currently, the most dubious section of my code looks like:
For most of the file these
I can rewrite it like this (with appropriate min/max functions):
Now my
Can I rewrite this in a way that doesn't require several thousand separate (identical) writes to a single file?
Currently, the most dubious section of my code looks like:
for (int y = 0; y cellToY)
{
file << "0\n";
}
else
{
double value = (0.5 + (-0.5 * (deltaY / cellToY)));
file << value << "\n";
}
}
}
}For most of the file these
if checks are going to be slow/useless - since depending on the value of yPos, they will all evaluate to true/false for large regions.I can rewrite it like this (with appropriate min/max functions):
for (int y = 0; y yHeightMax)
{
for (int x=0; x cellToY)
{
file << "0\n";
}
else
{
double value = (0.5 + (-0.5 * (deltaY / cellToY)));
file << value << "\n";
}
}
}
}Now my
if conditions are evaluated less frequently, but I'm still left with segments likefor (int x=0; x<xcells; x++) // where xcells is ~ 1-10k
{
file << "1\n";
}Can I rewrite this in a way that doesn't require several thousand separate (identical) writes to a single file?
Solution
Measuring is good, but it will only tell you what those data will result in, so you will need to reason on the code too and then measure that it isn't totally wrong.
The reason to change the code is the suspicion that the repeated calls to file output would be a large overhead.
Usage, dups is the number of duplicates
There is a method using char[] which doesn't make any copy. (not shown)
Saving on the math operations as '/' is expensive and '-0.5 *' also is a constant that might as well be calculated outside the loop.
WARNING this may result in some subtle differences in final values due to float pointiness.
Outside the loop or if cellToY is a global const make it a constexpr
This leaves
Saving a divide.
The
If they are true at a near random occurrence you have a serious performance problem. If they come in streaks you can do something like this
This guarantees one mispredict for the
This would also mean that the original code would get lots of mispredicts, you could try this
as in some cases (trinaries) some compilers (gcc+?) can find out to generate conditional moves.
But my guess would be the
The reason to change the code is the suspicion that the repeated calls to file output would be a large overhead.
// ToDo: add sanity checks.
string Duplicate(string org, size_t maxSize) {
string dup;
dup.reserve(org.length() * maxSize);
for (auto i = 0; i < maxSize; ++i)
dup.append(org);
return dup; // hope for NVRO
}
string Ones = "1\n";
size_t numLens = Ones.length();
string Ones = Duplicate(Ones, MaxNumberOfRepeats);
string Zeroes = Duplicate("0\n", MaxNumberOfRepeats);Usage, dups is the number of duplicates
inline void Writer(string text, size_t dups) {
file << text.substr(0, dups*numLens); // potentially makes a copy
}There is a method using char[] which doesn't make any copy. (not shown)
double value = (0.5 + (-0.5 * (deltaY / cellToY)));Saving on the math operations as '/' is expensive and '-0.5 *' also is a constant that might as well be calculated outside the loop.
WARNING this may result in some subtle differences in final values due to float pointiness.
Outside the loop or if cellToY is a global const make it a constexpr
constexpr double invCellToY = -0.5 / cellToY;This leaves
double value = (0.5 + ( (deltaY * invCellToY)));Saving a divide.
The
ifs are a problemif (deltaY cellToY) {
file << "0\n";If they are true at a near random occurrence you have a serious performance problem. If they come in streaks you can do something like this
double deltaY = yProfileHeight[x] - yPos;
if (deltaY < -cellToY) {
int count = 1;
while (++x < xcells) {
deltaY = yProfileHeight[x] - yPos;
if (!(deltaY < -cellToY))
break;
++count;
}
Writer(Ones, count);
} else
...This guarantees one mispredict for the
while so this might be a loss if they don't come in streaks. (branch prediction is somewhat dependend on global branch history so some of the lost might be redeemed again as the next x itereation should have an increased probability of predict the first if not taken after the loop exit). This would also mean that the original code would get lots of mispredicts, you could try this
// these bool calculations should be branchless
bool LT = (deltaY cellToY);
// ??? LT and GT are mutual exclusive
bool LTGT = LT | GT;
if (LTGT)
res = LT?Ones:Zeroes;
else
// very costly operation of turning double to string ...
// should the result be 0 or 1 also?
res = string(0.5 + ( (deltaY * invCellToY)))+"\n";
fill << res;as in some cases (trinaries) some compilers (gcc+?) can find out to generate conditional moves.
But my guess would be the
while implementation would be faster as the domain suggest some form of streaks. Merge your final decision with this code.string res;
for (int y = 0; y yHeightMax) {
Writer(Ones, xcells);
} else {
for (int x=0; x cellToY) {
file << "0\n";
} else {
double value = (0.5 + ( (deltaY * invCellToY)));
file << value << "\n";
}
}
}
}Code Snippets
// ToDo: add sanity checks.
string Duplicate(string org, size_t maxSize) {
string dup;
dup.reserve(org.length() * maxSize);
for (auto i = 0; i < maxSize; ++i)
dup.append(org);
return dup; // hope for NVRO
}
string Ones = "1\n";
size_t numLens = Ones.length();
string Ones = Duplicate(Ones, MaxNumberOfRepeats);
string Zeroes = Duplicate("0\n", MaxNumberOfRepeats);inline void Writer(string text, size_t dups) {
file << text.substr(0, dups*numLens); // potentially makes a copy
}double value = (0.5 + (-0.5 * (deltaY / cellToY)));constexpr double invCellToY = -0.5 / cellToY;double value = (0.5 + ( (deltaY * invCellToY)));Context
StackExchange Code Review Q#69682, answer score: 2
Revisions (0)
No revisions yet.