HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavascriptMinor

Prefix and suffix detection algorithm

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
suffixandalgorithmdetectionprefix

Problem

I have written an algorithm intended to find prefixes and suffixes in arrays of strings.

I would like to get opinions/suggestions/reviews on the code I wrote (specially cases in which my code would fail), but also, I would like to know if this algorithm can be rewritten in a more optimal way since I will be running this code against lots of entries.

Finding preffixes

function findPrefix(strings){
    if(!strings || strings.length === 0){
        return null;   
    }

    var prefix = '',
        characters = strings[0].split(''),
        i, j;

    for(i = 0; i = i + 1 && string[i] === character);
        }

        if(isPrefix) {
            prefix += character;
        } else {
            return prefix;
        }
    }
}


Finding suffixes

function reverse(string){
    return string.split('').reverse().join('');
}

function findSuffix(strings){
    if(!strings || strings.length === 0){
        return null;   
    }

    return reverse(findPrefix(strings.map(reverse)));
}


A running example can be found here: http://jsbin.com/razuta/2/edit (click JavaScript and console to see the results)

Solution

As Martin R said, if all strings must match a common prefix, then that prefix is whatever the first and last string in a sorted array have in common.

The code in the SO answer isn't too pretty though. Here's my take (rags-to-riches?):

function findPrefix(strings) {
  if(!strings.length) {
    return ""; // or null or undefined; your choice
  }

  var sorted = strings.slice(0).sort(), // copy the array before sorting!
      string1 = sorted[0],
      string2 = sorted[sorted.length-1],
      i = 0,
      l = Math.min(string1.length, string2.length);

  while(i < l && string1[i] === string2[i]) {
    i++;
  }

  return string1.slice(0, i);
}


Compared to the SO answer, the main point here (besides style and camelCased variables) is that the array is copied, and the copy is sorted. Otherwise, the same array object that was passed to the function will be sorted as a side-effect.

Compared to your code, I've skipped the strings-it-truth'y check at the very start. I prefer to assume that arguments are at least the correct type (i.e. an array) - otherwise it's the caller's problem.

As for your current code:

-
As mentioned in the comments, you can use string.charAt(i) or simply string[i] (though the latter is a more recent addition to JS) instead of splitting the string into an array.

-
Your inner loop starts at index 0 - but that's the same string you're already using as your characters array, so you already know that it's going to match.

-
The moment you hit a non-match, use break to skip the rest of the loop. My bad, you're already returning from the loop, which has the same effect.

As for finding suffixes, your current code seems fine to me if you use the "sort-and-compare" approach above for findPrefix. Otherwise, it'd probably be infinitesimally faster to write a separate findSuffix function that just has its loops reversed. The logic's the same, the indices just run backward.

While nothing beats the sort-and-compare approach (except a better one; see below), here's, just for fun, another way to to use all strings, unsorted, to find the prefix:

function findPrefix(strings) {
  function commonPrefix(a, b) {
    var i = 0,
        l = Math.min(a.length, b.length);

    while(i < l && a[i] === b[i]) {
      i++;
    }
    return a.slice(0, i);
  }

  return strings.reduce(function (prefix, string) {
    // if prefix is empty or matches the string,
    // then just return the prefix immediately
    if(string.indexOf(prefix) === 0) {
      return prefix;
    }

    // otherwise, find the intersection between
    // prefix and the string
    return commonPrefix(prefix, string);
  });
}


Update: As MatinR pointed out in the comments, sort may be a drag on performance, and besides it isn't necessary, as you only need what would be the first and last string in a sorted array. These strings can alternative be found like so:

function extremes(strings) {
  // grab two strings to use as our initial guesses
  var initial = {
    largest: strings[0],
    smallest: strings[strings.length-1]
  };

  // loop through to find the extremes
  return strings.reduce(function (memo, string) {
    // slightly funky syntax, but it's short.
    (memo.largest > string) || (memo.largest = string);
    (memo.smallest < string) || (memo.smallest = string);
    return memo;
  }, initial);
}


The two strings in the returned object may then be used to determine the shared prefix (if any). The entire thing can also be built into findPrefixes, of course, rather than be a separate function as shown below:

function findPrefix(strings) {
  if(!strings.length) {
    return ""; // or null or undefined; your choice
  }

  var string1 = strings[0],
      string2 = strings[strings.length-1],
      i, l;

  // note: a regular for-loop is faster than forEach
  // I'm using forEach because it's prettier
  strings.forEach(function (string) {
    (string1  string) || (string2 = string);
  });

  i = 0;
  l = Math.min(string1.length, string2.length);
  while(i < l && string1[i] === string2[i]) {
    i++;
  }

  return string1.slice(0, i);
}

Code Snippets

function findPrefix(strings) {
  if(!strings.length) {
    return ""; // or null or undefined; your choice
  }

  var sorted = strings.slice(0).sort(), // copy the array before sorting!
      string1 = sorted[0],
      string2 = sorted[sorted.length-1],
      i = 0,
      l = Math.min(string1.length, string2.length);

  while(i < l && string1[i] === string2[i]) {
    i++;
  }

  return string1.slice(0, i);
}
function findPrefix(strings) {
  function commonPrefix(a, b) {
    var i = 0,
        l = Math.min(a.length, b.length);

    while(i < l && a[i] === b[i]) {
      i++;
    }
    return a.slice(0, i);
  }

  return strings.reduce(function (prefix, string) {
    // if prefix is empty or matches the string,
    // then just return the prefix immediately
    if(string.indexOf(prefix) === 0) {
      return prefix;
    }

    // otherwise, find the intersection between
    // prefix and the string
    return commonPrefix(prefix, string);
  });
}
function extremes(strings) {
  // grab two strings to use as our initial guesses
  var initial = {
    largest: strings[0],
    smallest: strings[strings.length-1]
  };

  // loop through to find the extremes
  return strings.reduce(function (memo, string) {
    // slightly funky syntax, but it's short.
    (memo.largest > string) || (memo.largest = string);
    (memo.smallest < string) || (memo.smallest = string);
    return memo;
  }, initial);
}
function findPrefix(strings) {
  if(!strings.length) {
    return ""; // or null or undefined; your choice
  }

  var string1 = strings[0],
      string2 = strings[strings.length-1],
      i, l;

  // note: a regular for-loop is faster than forEach
  // I'm using forEach because it's prettier
  strings.forEach(function (string) {
    (string1 < string) || (string1 = string);
    (string2 > string) || (string2 = string);
  });

  i = 0;
  l = Math.min(string1.length, string2.length);
  while(i < l && string1[i] === string2[i]) {
    i++;
  }

  return string1.slice(0, i);
}

Context

StackExchange Code Review Q#65111, answer score: 7

Revisions (0)

No revisions yet.