HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Directory Snapshot

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
snapshotdirectorystackoverflow

Problem

The following code creates a recursive backup of a directory in the form of interlinked HTML files, structured in the same form as the input directory.

It does not take the backup of the contents of a directory, just stores the names and sizes of all the files and folders contained in the directory.

(It can be thought of as a hyperlinked version of the dir /s or tree /f commands.)

For more details (including an example), refer to this.


As an example, this would be the output directory considering this as the input directory.

I cannot use C++ 14.

```
#include
#include
#include
#include
#include
#include
#include
#include

using namespace std;
using namespace boost::filesystem;

const path LogFileName = "DirectorySnapshotLog.txt";
ofstream Log;
stringstream LogErrorStream; // Buffer soft errors to output them separately after the informational messages in the log file.

// Convert any type to its string representation
template std::string ToString( const T obj )
{
std::stringstream ss;
ss units;
units.push_back( "bytes" );
units.push_back( "KB" );
units.push_back( "MB" );
units.push_back( "GB" );
units.push_back( "TB" );

const unsigned ratio = 1024;
unsigned i = 0;

while ( ret > ratio && i & dirContents )
{
if ( exists( dirPath ) && is_directory( dirPath ) )
{
copy( directory_iterator( dirPath ), directory_iterator(), back_inserter( dirContents ) );
}
}

// Create a set of HTML files containing information about source directory's contents and store it in the destination directory, in a directory structure similar to the source directory
// Returns the total size of the source directory
long long Snapshot( const path& sourcePath, const path& destinationPath )
{
Log dirContents, files, directories;
try
{
DirectoryIterate( sourcePath, dirContents );
}
catch ( const filesystem_error& ex )
{
LogErrorStream \n";
outFile \n";
outFile \n";

Solution

I see a number of things that may help you improve your code.

Don't hardcode file names

The LogFileName might be something that a user of this program wants to place elsewhere and it will fail entirely if run from a read-only directory such as a CD or DVD.

Prefer command line parameters to runtime interaction

There is not an easy way to use this program in a script because it requires a response to a prompt rather than allowing command line parameters to be passed. You can provide both, if you wish, by looking for required command line parameters and then only prompting for missing ones.

Consider using standard SI units

The code calculates units in ratio of 1024 but then uses the abbreviation "MB" which is 10001000 rather than 10241024 which should be "MiB". See the Wikipedia article on Mebibyte for more details.

Avoid dynamically creating const structures

Within your RoundSize routine, the units vector is created and destroyed every time the function is called, which is not at all necessary. Further, since you're using C++11, you can simply create the vector using a std::initializer_list:

static const vector units{ 
    "bytes", "KiB", "MiB", "GiB", "TiB" 
};


It would be even better as a constexpr, but we can't do that as written because std::vector has a non-trivial destructor.

Use for instead of while where appropriate

In that same RoundSize routine, the while loop would be much more idiomatic C++ if it were instead a for loop.

unsigned i;

for ( i = 0; ret > ratio && i < units.size() - 1; ++i)
{
    ret /= ratio;
}


Consider the performance cost of creating objects

I don't know if you had performance goals for your program, but it's useful to understand the performance characteristics of code you write. Consider this line, also from RoundSize:

return ToString( ret ) + " " + units[i];


The ToString() routine creates a string, then the space character is converted to a string and those two strings are concatenated, and then the [] operator is called on the vector and that string concatenated. Whew! One simple way to avoid some of that would be to simply have the space as part of the units.

Similarly, your sourcePath and destinationPath variables are declared as string objects but are used as path objects in most uses. This means that a conversion is done from string to path almost every time you use those variables. Better would be to declare both as path and then use explicit strings sourcePathStr and destinationPathStr for the few places you actually need strings.

Pass const references where possible

The ToString templated function should take const T &obj as its argument rather than const T obj to avoid making a unnecessary copy of the passed object.

Declare file scope items as static

Unless you intend to share the variables with other code in other files, your variables and functions should be declared static.

Catch exceptions

In a number of places within the code the underlying boost library call can throw an exception, but it is not caught by your program. An example of that is in this loop within Snapshot:

for ( const auto& file : files )
{
    auto size = file_size( file );
    outFile \n";
    sourcePathSize += size;
}


The call to file_size can throw an error and did when I tried it on my Linux machine. The issue was a symlink that pointed to a nonexistent file. You could either catch the exception using try...catch or use the form of file_size that takes an error_code as a parameter.

Separate data manipulation from output

The Snapshot function really does two things. It creates the directories and files vectors and then it outputs those data structures as HTML. Better would be to split that one long function into the two logical halves. Or, even better...

Use object-oriented programming

Your files and directories are both objects. Why not instead create a filesystem object? That way, the files and directories would be parts of the filesystem and your Snapshot function would be better expressed as a constructor and a output member function. This would be much cleaner and also have the advantage of making it easier to create alternative output formats.

Use constant string concatenation

The Snapshot routine currently has these lines:

outFile \n";
outFile \n";
outFile \n";
outFile " \n";
outFile \n";


But you don't really need to do it that way which potentially calls the ` and tags are out of order). Also, consider changing the outputs to ` rather than just lists. This would potentially allow nicer formatting via stylesheets.

Re-read the boost pages

As noted on the boost Filesystem home page, using

#define BOOST_FILESYSTEM_NO_DEPRECATED


above the line where you include the boost library is highly recommended for new code. Doing so now will help you preserve functionality as you expand this p

Code Snippets

static const vector<string> units{ 
    "bytes", "KiB", "MiB", "GiB", "TiB" 
};
unsigned i;

for ( i = 0; ret > ratio && i < units.size() - 1; ++i)
{
    ret /= ratio;
}
return ToString( ret ) + " " + units[i];
for ( const auto& file : files )
{
    auto size = file_size( file );
    outFile << file.filename() << "----" << RoundSize( size ) << "<br>\n";
    sourcePathSize += size;
}
outFile << "<!DOCTYPE html>\n";
outFile << "<meta charset=\"UTF-8\">\n";
outFile << "<html>\n";
outFile << "<title>" << sourcePathName << "</title>\n";
outFile << "<body>\n";

Context

StackExchange Code Review Q#85255, answer score: 5

Revisions (0)

No revisions yet.