HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Remove repeated words from a 2D character array

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
repeatedarraywordscharacterremovefrom

Problem

This code removes repeated words from a word array (2D char array). I want to optimize this as much possible for speed.

#include "stdio.h"
#include "string.h"

main()
{
    char words[2000][20];

   /*This array is populated by some other module. possible example of this array is given below, but in reality it would far more elements.
        char words[6][20] = 
    {
        {"barrack"},        
        {"david"},
        {"John"},
        {"david"},
        {"benjamin"},
        {"barrack"}

    };*/

    int names_found = 10;
    char out_words[6][20];
    int i,j;
    int act_names=0;

    for (i = 0; i < names_found; i ++)
    {
        j = i + 1;
        while (j < names_found)
        {
            if (strcmp(words[i], words[j]) == 0)
            {
                memmove(words + j, words + (names_found - 1), sizeof(words[0]));

                -- names_found;             
            }
            else
                ++ j;
        }
    }
    for(i=0;i<names_found;i++)
    {
        printf("%s\n",words[i]);
    }

}


-
How can the 2 for loops for i and j above be optimized?

-
As you see, the above logic currently outputs to same input buffer (in-place processing). Can it be optimized further if we allow a second buffer say char out_words[2000][20]?

Solution

-
Your algorithm is \$O(n^2)\$. You can reduce this to \$O(n)\$ by first inserting them into a closed hash table (i.e., one with bucket chains).

-
I'd optimise is rather than copying the whole items, just copy pointers to the original strings (unless you intend to alter their contents later on).

Context

StackExchange Code Review Q#3488, answer score: 6

Revisions (0)

No revisions yet.