HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Why are the two random variables independent in the analysis of Randomized Selection algorithm in CLRS?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
randomwhytheselectionindependentareanalysisrandomizedclrsalgorithm

Problem

In section 9.2 of CLRS (Introduction to Algorithms; page 185 in the 2nd edition and page 215 in the 3rd edition), a randomized selection algorithm is presented.

For its analysis, $T(n)$ is a random variable denoting the time required on an input array $A[p \cdots r]$ of $n$ elements and $X_k$ is an indicator random variable $X_k = I \{ \text{the subarray } A[p \cdots q] \text{ has exactly } k \text{ elements (due to the pivot)} \}$.

It has been claimed that $X_k$ and $T(\max(k-1, n-k))$ are independent (page 187 in the 2nd edition and page 218 in the 3rd edition). However, I find it quite counter-intuitive to understand. How to verify it?

Solution

The two random variables $X_k$ and $T(\max(k-1, n-k))$ are independent because in each recursive call to RANDOMIZED-SELECT() we invoke RANDOMIZED-PARTITION(), a pivot is randomly selected, and the choice of the pivot in one recursive call is independent from the choice of the pivot in another recursive call. Remember that in the analysis we assume that the elements are distinct, so that each element in a particular subarray has the same probability of being selected as the pivot. Moreover, simply stated, the value of $\max(k-1,n-k)$ does not depend on $X_k$: it is the same independently of $X_k$ being equal to zero or one.

Context

StackExchange Computer Science Q#23811, answer score: 2

Revisions (0)

No revisions yet.