HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Find largest and second largest elements of the array

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
theelementsarraylargestsecondfindand

Problem

So here is a HW problem I have been working. On I was wondering if anybody could give me a hint of what I am doing wrong. I don't want to be given the answer just hints and advice on how to solve it.

I am given this simple algorithm that finds the greatest value and the second greatest value in a array.

If A[1]  largest {
         second = largest;
        largest = A[i];
        }
        else
            if A[i] > second
                second = A[i]
    }


My first task was to find the number of comparisons it makes not in big-oh terms. So I said 2n-3 because it has one in the beginning and worst case
the largest element is in the first two elements of the array A so from
index 3 to n it has two comparisons. So 1 + 2(n-2) = 2n - 3 comparisons.

My second task was to develop an algorithm that would use less comparisons. So I broke it down with divide and conquer and said this.

```
// start is the first index inclusive
// finish is the last index inclusive
// A is an array of elements
function getGreatestAndSecondGreatestValue(A, start, finish) {
T[] // T is an array of size 2
if(finish - start == 1) {
if(A[start] > A[finish]) {
T[0] = A[start]
T[1] = A[finish]
} else {
T[0] = A[finish]
T[1] = A[start]
}
return T
}

Array1 = getGreatestAndSecondGreatestValue(A,start, start + floor((last-first)/2))
Array2 = getGreatestAndSecondGreatestValue(A, start + ceil((last-first)/2), finish)
return merge(Array1, Array2);
// The greast value will be in the 0th index of the array returned and the second
// greatest value will be in the 1 index of the array.
}

// this merges two arrays keeping the greatest value on the left side of the new array and
// keeping the second greatest val

Solution

Here I present an optimal solution minimizing the number of comparisons.

The thing to take advantage of is the first step of the original algorithm:

if A[1] < A[2],
    largest = A[2], second = A[1]
else
    largest = A[1], second = A[2]


We can get the largest and second largest of sets of two in one comparison!
So the first thing we should do is divide our array into $n/2$ pairs of two. Namely:
$$p_1 = (a_1, a_2), p_2 = (a_3, a_4), \ldots, p_i = (a_{2i - 1}, a_{2i}), \ldots, p_{n/2} = (a_{n-1}, a_n)$$
For every pair $p_i$, we can determine it's largest and second largest value in one comparison. We'll call these $p_i.L$ and $p_i.S$ respectively. So it takes $n/2$ comparisons to determine all of thee values for every pair.

Let's call the overall largest and second largest elements $\mathbb{L}$ and $\mathbb{S}$ respectively. We know that $\mathbb{L}$ will be among the $\{p_i.L\}$'s because it will be the largest of these. We also know $\mathbb{S}$ will be among $\{p_i.L\} \setminus \mathbb{L}$ or it could be the second largest value in the pair that $\mathbb{L}$ came from.

An example: Let our original array be:
$$A = [1,4,5,2,7,9,10,19]$$
We then break it up into pairs:
$$P = [(1,4)(5,2)(7,9)(10,19)]$$
We then know $\mathbb{L}$ will be in:
$$W = \forall i.\; \{p_i.L\} = \{4, 5, 9, 19\}$$
So now we just need to find three values: $W.\mathbb{L}$ (this will be $A.\mathbb{L}$), $W.\mathbb{S}$ (this could potentially be $A.\mathbb{S}$) and $p_i.S$ where $p_i$ is the pair that $W.\mathbb{L}$ came from (this could also potentially be $A.\mathbb{S}$).

From our example we see $A.\mathbb{L} = W.\mathbb{L} = 19$. We then see $A.\mathbb{S}$ is either $W.\mathbb{S} = 9$ or $10$ (from the pair containing $W.\mathbb{L}$), and it is $10$.

So this turns our problem of $A$ into a subproblem of size $n/2$, $W$. Then we can run the original algorithm on this subproblem and do some extra comparisons to get $A.\mathbb{S}$ and this totals about $\approx 1.5n$ comparisons. But we can do better!

We use recursion on this subproblem instead of just using the original algorithm. So the algorithm looks like this, and will return the indices of the $A.\mathbb{L}$ and $A.\mathbb{S}$. (assume $n = 2^k$ and this is 1-indexed). The algorithm will modify the input array so we will assume $A$ to be a copy of the input.

L_and_S (array of length n):
  Let A = copy of input
  // base case
  if n == 2: 
    if A[1] > A[2]:
      return (1, 2)
    else:
      return (2, 1)

  // recursive case
  Let W be new array length n/2 (keeps track of largest elements)
  Let M be new array length n/2 (keeps track of swaps)
  for i in 1 ... n/2:
    // swap larger element into first position of pair
    // set M[i] to 1 if pair was swapped, else 0
    // move larger element into W 
    if A[2*i - 1]  A[k]:
    k = j + 1
    k_swap = 0 - j_swap
  return (j + j_swap, k + k_swap) // account for the swap


We can run through the example above:

L_and_S([1,4,5,2,7,9,10,19]):
  // do n/2 comparisons
  A = [4,1,5,2,9,7,19,10]
  W = [4,5,9,19]
  M = [1,0,1,1]
  (j, k) = L_and_S([4,5,9,19]):
             // do n/4 comparisons
             A = [5,4,19,9]
             W = [5,19]
             M = [1,1]
             (j, k) = L_and_S([5,19]):
                        // do 1 comparison (5 > 19)
                        return (2,1)
             (j_swap, k_swap) = (1, 1)
             (j, k) = (3, 1)
             // do 1 comparison (9 > 5)
             k = 4
             k_swap = -1
             return (3 + 1, 4 - 1) = (4, 3)
  (j_swap, k_swap) = (1, 1)
  (j, k) = (7, 5)
  // do 1 comparison (10 > 9)
  k = 8
  k_swap = -1
  return (7 + 1, 8 - 1) = (8, 7)


We see overall this takes $n/2 + n/4 + 1 + 1 + 1 = 9 = n + 1$ comparisons! These are comparisons of the array elements, ignoring the base case comparison in the beginning.

The exact recurrence relation for number of comparisons is the following (assume $n = 2^k$):
$$\begin{align}
T(2) & = 1\\
T(n) & = T(\frac{n}{2}) + \frac{n}{2} + 1\\
& = 1 + \sum_{1}^{\log_2 n - 1} \frac{n}{2^i} + 1\\
& = \log_2 n + \sum_{1}^{\log_2 n - 1} \frac{n}{2^i}\\
& = \log_2 n + n - 2\\
& = n + \log_2 n - 2
\end{align}$$
So we get the total number of comparisons in the worst case is $n + \log_2 n - 2$, which is pretty good. In fact, this is a lower bound! (see here) If you include the base case comparison it adds another $\log_2 n$ to the comparisons. Here's a table for the first few $n$:

$$\begin{array}{|c|c|}
\hline n & n + \log_2 n - 2 & 2n - 3 \\
\hline 2 & 1 & 1 \\
\hline 4 & 4 & 5 \\
\hline 8 & 9 & 13 \\
\hline 16 & 18 & 29 \\
\hline 32 & 35 & 61 \\
\hline 64 & 68 & 125 \\
\hline 128 & 133 & 253 \\
\hline 256 & 262 & 509 \\
\hline 512 & 519 & 1021 \\
\hline 1024 & 1032 & 2045 \\
\hline
\end{array}$$

Implementation of the code in python can be found here, with test cases at the bottom.

Code Snippets

if A[1] < A[2],
    largest = A[2], second = A[1]
else
    largest = A[1], second = A[2]
L_and_S (array of length n):
  Let A = copy of input
  // base case
  if n == 2: 
    if A[1] > A[2]:
      return (1, 2)
    else:
      return (2, 1)

  // recursive case
  Let W be new array length n/2 (keeps track of largest elements)
  Let M be new array length n/2 (keeps track of swaps)
  for i in 1 ... n/2:
    // swap larger element into first position of pair
    // set M[i] to 1 if pair was swapped, else 0
    // move larger element into W 
    if A[2*i - 1] < A[2*i]:
      temp = A[2*i - 1]
      A[2*i - 1] = A[2*i]
      A[2*i] = temp
      M[i] = 1
    else:
      M[i] = 0
    W[i] = A[2*i - 1]
  
  // now recurse on W
  (j, k) = L_and_S(W)
  // map back from W to A
  j_swap = M[j]
  k_swap = M[k] 
  j = 2*j - 1
  k = 2*k - 1
  if A[j + 1] > A[k]:
    k = j + 1
    k_swap = 0 - j_swap
  return (j + j_swap, k + k_swap) // account for the swap
L_and_S([1,4,5,2,7,9,10,19]):
  // do n/2 comparisons
  A = [4,1,5,2,9,7,19,10]
  W = [4,5,9,19]
  M = [1,0,1,1]
  (j, k) = L_and_S([4,5,9,19]):
             // do n/4 comparisons
             A = [5,4,19,9]
             W = [5,19]
             M = [1,1]
             (j, k) = L_and_S([5,19]):
                        // do 1 comparison (5 > 19)
                        return (2,1)
             (j_swap, k_swap) = (1, 1)
             (j, k) = (3, 1)
             // do 1 comparison (9 > 5)
             k = 4
             k_swap = -1
             return (3 + 1, 4 - 1) = (4, 3)
  (j_swap, k_swap) = (1, 1)
  (j, k) = (7, 5)
  // do 1 comparison (10 > 9)
  k = 8
  k_swap = -1
  return (7 + 1, 8 - 1) = (8, 7)

Context

StackExchange Computer Science Q#83022, answer score: 14

Revisions (0)

No revisions yet.