HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Find the dominator in an array of integers

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
dominatorthearrayfindintegers

Problem

A zero-indexed array A consisting of N integers is given. The
dominator of array A is the value that occurs in more than half of the
elements of A.


For example, consider array A such that


A[0] = 3 A[1] = 4 A[2] = 3 A[3] = 2 A[4] = 3 A[5] =
-1 A[6] = 3 A[7] = 3


The dominator of A is 3 because it occurs in 5 out of 8 elements of A
(namely in those with indices 0, 2, 4, 6 and 7) and 5 is more than a
half of 8.


Write a function


int solution(int A[], int N);


that, given a zero-indexed array A consisting of N integers, returns
index of any element of array A in which the dominator of A occurs.
The function should return −1 if array A does not have a dominator.


Assume that:


N is an integer within the range [0..100,000]; each element of array A
is an integer within the range [−2,147,483,648..2,147,483,647].

int solution(int A[], int N)  { 
        int size = 0; 
        int value = 0; 
        for(int i = 0 ; i  0) { 
            candidate = value; 
         } 
         int leader = -1; 
         int count = 0; 
         for(int j = 0 ; j  (N/2)) {
            leader = candidate; 
         }
         else {
            return -1;  
         }
         int index[count]; 
         int cnt=0; 
         for(int k = 0 ; k < N ; k++) { 
            if(A[k] == leader) { 
               index[cnt] = k; 
               cnt++; 
            } 
         } 
         return index[0]; 
    }

Solution

I see a number of things that may help you improve your program.

Test thoroughly

There is a flaw in your program which causes it to fail on certain kinds of input. Here's the test code I used:

#include 
#include 
#include 
#include 

// solution() goes here

struct test {
    int expected;
    int N;
    int numbers[20];
};

struct test tests[] = {
    { -1, 6, { 3, 3, 0, 6, 9, 3 }},
    { 3, 8, { 3, 4, 3, 2, 3, -1, 3, 3 }},
    { 3, 8, { 3, 3, 4, 2, 3, 3, -1, 3 }},
    { 3, 7, { 3, 3, 4, 3, 3, -1, 3}},
    { 3, 7, { 3, 3, 4, 3, 3, -1, 0}},
    { 3, 6, { 3, 3, 4, 3, 3, -1 }},
    { 6, 2, { 6, 6}},
    { 8, 1, { 8}},
};

bool doTest(struct test* t) {
    int index = solution(t->numbers, t->N);
    if (t->expected == -1) {
        return index == -1;
    }
    return t->expected == t->numbers[index];
}

int main()
{
    int testcount = sizeof(tests) / sizeof(tests[0]);
    for (int i=0; i < testcount; ++i) {
        printf("Test %d: %s\n", i+1, doTest(&tests[i]) ? "OK " : "BAD");
    }
}


With your code as posted, tests 5 and 6 (of 8) fail.

Fix your formatting

The indentation of the code is not consistent, which makes it a little harder to read and understand than if you format the code consistently.

Read the problem statement carefully

The problem statement says that the return value can be any of the index values that point to the dominator number, so there's no need to store all of them as the code does in the index[] array.

Reduce redundant variables

The value, candidate and leader could all be collapsed to a single local variable which would make the code shorter and more understandable.

Minimize loops through arrays

The code attempts to identify a candidate and then does several more loops through the code to find the count and an index value to return. The code could be simplified to this (although the bug still remains):

int solution(int A[], int N)
{
    int size = 0;
    int candidate = 0;
    for (int i = 0; i  (N / 2)) {
        return index;
    }
    return -1;
}


Consider a different algorithm

There is another approach that passes all of the tests I've tried. It first sorts the array. Because the array is now sorted, we know that if there is a dominator, more than half the sorted array must be the dominator value. In other words, for an array size of N, the index A[N/2] of the sorted array must contain the dominator value if one exists. Then all that remains is to verify that it is indeed a dominator and return an index into this sorted version of the original array. (Alternatively, this can be done with a copy of the original array if you make the assumption that the passed array can't be modified. The problem statement doesn't say and doesn't specify const for the array parameter.) Here's the code:

int comp(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

int solution(int A[], int N)
{
    qsort(A, N, sizeof(int), comp);
    int count = 0;
    for (int i = 0; i  N / 2) ? N / 2 : -1;
}


Note that the index of N / 2 is guaranteed to point to the dominator if it exists, and the specification says that one may return any element of A in which the dominator exists.

Put another way, if one has a 10 integer array, there is no way to place 6 contiguous numbers within that array without having one of them be at index 5.

The corrected original algorithm

As @vnp and @janos have pointed out, the algorithm above, while simple and correct, is not optimal in terms of performance, so by popular demand, here's a corrected version of your original algorithm:

int solution(int A[], int N)
{
    int count = 1;
    int candidate = A[0];
    int index = 0;
    for (int i = 1; i  0) {
            --count;
        } else {
            candidate = A[i];
            index = i;
            count = 1;
        }
    }
    if (count > 0) { 
        count = 0;
        for (int i = 0; i  (N / 2)) {
            return index;
        }
    }
    return -1;
}


Thanks to @JS1 who pointed out that an earlier version of this failed on an all zeroes input, reinforcing my first point: Test thoroughly! :) I've added the following line to my test array:

{0, 5, {0, 0, 0, 0, 0}},


Timing tests

I modified the test routines a bit to have them copy each test and then sort rather than using a pointer. Then I had each version run 100000 (1e6) times over the test suite of 8 tests. Timing was then noted on each version and, as expected, the qsort algorithm is slower. Tests were all done on a 64-bit Linux machine with a quad-core i7 running at 3.40GHz. Compiler was gcc version 5.3.1 with -O2 optimization.

qsort version: 1.25 seconds
Moore version: 0.77 seconds


Results are fairly consistently repeatable and independent of which algorithm is tested first. Since qsort is \${\mathcal O}(n \log n)\$, it's likely that the difference would tilt even more toward the Moore algorithm if longer strings had been used.

Code Snippets

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

// solution() goes here

struct test {
    int expected;
    int N;
    int numbers[20];
};

struct test tests[] = {
    { -1, 6, { 3, 3, 0, 6, 9, 3 }},
    { 3, 8, { 3, 4, 3, 2, 3, -1, 3, 3 }},
    { 3, 8, { 3, 3, 4, 2, 3, 3, -1, 3 }},
    { 3, 7, { 3, 3, 4, 3, 3, -1, 3}},
    { 3, 7, { 3, 3, 4, 3, 3, -1, 0}},
    { 3, 6, { 3, 3, 4, 3, 3, -1 }},
    { 6, 2, { 6, 6}},
    { 8, 1, { 8}},
};

bool doTest(struct test* t) {
    int index = solution(t->numbers, t->N);
    if (t->expected == -1) {
        return index == -1;
    }
    return t->expected == t->numbers[index];
}

int main()
{
    int testcount = sizeof(tests) / sizeof(tests[0]);
    for (int i=0; i < testcount; ++i) {
        printf("Test %d: %s\n", i+1, doTest(&tests[i]) ? "OK " : "BAD");
    }
}
int solution(int A[], int N)
{
    int size = 0;
    int candidate = 0;
    for (int i = 0; i < N; i++) {
        if (size == 0) {
            size++;
            candidate = A[i];
        } else {
            if (candidate != A[i])
                size--;
        }
    }
    if (size <= 0) {
        return -1;
    }
    int count = 0;
    int index = -1;
    for (int i = 0; i < N; i++) {
        if (A[i] == candidate) {
            count++;
            index = i;
        }
    }
    if (count > (N / 2)) {
        return index;
    }
    return -1;
}
int comp(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

int solution(int A[], int N)
{
    qsort(A, N, sizeof(int), comp);
    int count = 0;
    for (int i = 0; i < N && count <= N / 2; ++i) {
        if (A[i] == A[N / 2]) {
            ++count;
        }
    }
    return (count > N / 2) ? N / 2 : -1;
}
int solution(int A[], int N)
{
    int count = 1;
    int candidate = A[0];
    int index = 0;
    for (int i = 1; i < N; i++) {
        if (A[i] == candidate) {
            ++count;
        } else if (count > 0) {
            --count;
        } else {
            candidate = A[i];
            index = i;
            count = 1;
        }
    }
    if (count > 0) { 
        count = 0;
        for (int i = 0; i < N && count <= N / 2; ++i) {
            if (A[i] == candidate) {
                count++;
            }
        }
        if (count > (N / 2)) {
            return index;
        }
    }
    return -1;
}
{0, 5, {0, 0, 0, 0, 0}},

Context

StackExchange Code Review Q#123090, answer score: 5

Revisions (0)

No revisions yet.