HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Proving $\log (N)$ mistake bound is "tight" for a learner when learning thresholds

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
provingboundtightlogmistakelearnerlearningthresholdsforwhen

Problem

I'm studying the online learning model and the Halving algorithm.

We've seen the threshold problem:

The domain is $X=\left\{ 1,2,...,N-1\right\} $, the label set is $Y=\{-1,1\}$ and the hypotheses class is $H=\left\{ h_{\theta}(x)=sign(x-\theta):\theta\in\left\{ \frac{1}{2},1+\frac{1}{2},...,(N-1)+\frac{1}{2}\right\} \right\} $

I'll denote the number of mistakes the learner $A$ does on the environment $E$ by $M(A,E) = \left|\left\{ i:\hat{y_{i}}=A_{\mbox{pred}}(x_{i})\neq y_{i}\right\} \right|$ where $E=\left\{ (x_{i},y_{i})\right\} _{i=1}^{n}$

For any environment $E$, the mistake bound for Halving is $M$(Halving,$E$)$\leq\left\lfloor \log N\right\rfloor $

Now I must prove that this bound is "tight", i.e., for any learning algorithm $A$ there exists a realizable environment $E$ such that $M(A,E)\geq \left\lfloor \log N\right\rfloor$

My try - I came to the conclusion that finding the correct $h^{*} \in H$ is pretty similar in this case to finding a value in a sorted array.
Since we'll need at least $\left\lfloor \log N\right\rfloor $ comparisons, an algorithm $A$ which can find $h^{*}$ and make less than $\left\lfloor \log N\right\rfloor$ mistakes "can find a value in a sorted array in less than $\left\lfloor \log N\right\rfloor$ comparisons for any given array, which is a contradiction."

The sentence in quotes is my intuition, but I'm having trouble proving it formally.

Is this the right way to go?

Many thanks.

Solution

I am assuming the learning model described in Wikipedia. The environment will maintain a set of hypothesis forming an interval $S = [h_\alpha,h_\beta]$ consistent with its current answers, and will ensure that the first $\log N$ answers by the algorithm are wrong. In the beginning, $S = H$. At each step, the environment presents the middle point of the interval (we're identifying hypotheses with points in the domain in a natural way). Depending on the learner's answer, the interval $S$ is halved so that the learner is wrong. This process can go on for $\log N$ steps, in each of which the learner is mistaken, so that in total the learner makes at least $\log N$ mistakes.

Context

StackExchange Computer Science Q#21969, answer score: 4

Revisions (0)

No revisions yet.