HiveBrain v1.2.0
Get Started
← Back to all entries
patterncModerate

Removing multiple slashes from path

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
pathslashesremovingmultiplefrom

Problem

I have been trying to write a sample code block to remove multiple backward or forward slashes from a path (e.g. //a//b should be /a/b; on Windows, c:\\a\\b\\\c should be c:\a\b\c).

I am assuming that:

  • Code must be platform independent



  • Should work in all cases (or report a proper error)



  • It should be readable and maintainable



  • It shouldn't be over-complex and shouldn't have errors



Please provide point-wise comments for the following code sample (you may ignore main and assume that path is not NULL - it is just for demonstration purposes). Or else you may provide a better mechanism for the same.

```
void
die( int error_code, char* message )
{
fprintf( stderr, message );
exit( error_code );
}

/Following function removes duplicate slashes from a given path/
char*
validate_path ( char* path )
{
char* path_copy = path;
int dup_flag = 0;
char path_without_dup_slash = (char)malloc(strlen(path)+1);
if ( path_without_dup_slash == NULL ) {
die( EXIT_FAILURE, "Failed to allocate memory for processing path.");
}
char* copy_pwds = path_without_dup_slash;
/ Travel through given path /
while( *path_copy != '\0' ) {
/* If there is '\' or '/' then copy a single '/'
, ignore others until a real char comes /
if ( (( path_copy == '\\') || ( path_copy == '/' )) &&
(dup_flag == 0) ) {
copy_pwds = path_copy;
dup_flag = 1;
copy_pwds++;
}else if (( path_copy != '\\') && ( path_copy != '/' )) {
copy_pwds = path_copy;
dup_flag = 0;
copy_pwds++;
}
path_copy++;
}
*copy_pwds = '\0';
path_without_dup_slash = (char *)realloc( path_without_dup_slash,

Solution

Here are some suggestions for how you might improve your code.

Use the appropriate #includes

This program fragment requires headers, which should be included:

#include 
#include 
#include 


Use const where practical

Because the validate_path function does not alter the passed string, the parameter should be declared const:

char *validate_path ( const char* path )


Don't make pointless copies of variables

When you pass a parameter to a function, it's a copy, so there's no reason for this line:

char* path_copy = path;


Just use path directly.

Check for NULL before dereferencing

The first thing the validate_path does is to pass path to strlen, but this results in undefined behavior if path isn't a '\0'-terminated string. A simple safety check would be to check to see if it's a NULL pointer and exit if it is.

Consider a change to the interface

Instead of allocating memory within the function, consider instead altering the function so that it is passed an input and output buffer. This way, the code would not need to allocate memory and the caller would do both allocation and freeing. That makes for a more robust interface and it makes your code simpler.

Prefer for to while

The while loop within the code would likely be more succinctly expressed as a for loop. That way it's easier to see what is being iterated through and how.

Use a more accurate function name

I would expect that validate_path would do just that and validate a path, but that's not really what's happening here. Instead, it seems that a more accurate function name would be something like remove_dup_slashes.

Simplify your code

Here's an alternative implementation that uses these suggestions:

char *validate_path(const char *path, char *curr) {
    if (path == NULL || curr == NULL) {
        return NULL;
    }
    char *begin = curr;
    // always copy first character
    *curr = *path;
    // if it's just a single `NUL`, we're done
    if (*curr == '\0') {
        return curr;
    }
    for (path++; *path; ++path) {
        // path points to next char to read
        // curr points to last written
        if (*curr != *path || (*curr != '\\' && *curr != '/')) {
            // only copy if not duplicate slash
            *(++curr) = *path;
        }
    }
    *curr = '\0';  // terminate string
    return begin;
}


Note that the interface is still not a very good one. It could be improved by also passing in the length of the output buffer and maybe the length of the input buffer as well.

Code Snippets

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *validate_path ( const char* path )
char* path_copy = path;
char *validate_path(const char *path, char *curr) {
    if (path == NULL || curr == NULL) {
        return NULL;
    }
    char *begin = curr;
    // always copy first character
    *curr = *path;
    // if it's just a single `NUL`, we're done
    if (*curr == '\0') {
        return curr;
    }
    for (path++; *path; ++path) {
        // path points to next char to read
        // curr points to last written
        if (*curr != *path || (*curr != '\\' && *curr != '/')) {
            // only copy if not duplicate slash
            *(++curr) = *path;
        }
    }
    *curr = '\0';  // terminate string
    return begin;
}

Context

StackExchange Code Review Q#144957, answer score: 10

Revisions (0)

No revisions yet.