
AdaBoost - why use this particular alpha function?

Submitted by: @import:stackexchange-cs

Problem

I'm reading the paper in which AdaBoost was introduced (link), and I can't understand why the authors chose the formula α_t = 1/2 * ln((1-ε_t) / ε_t).

[Figure: AdaBoost algorithm from the paper]

What is the motivation behind that specific formula?
Why not use something that feels more natural, like ε_t?

Solution

Recall that the final hypothesis after $T$ rounds is $H(x)=\operatorname{sign}\left(\sum\limits_{t=1}^T \alpha_t h_t(x)\right)$, i.e. $\alpha_t$ is the weight of $h_t$ in $H$. If $\epsilon_t$ is high (near one), you want to answer the opposite of $h_t$, so you want $\alpha_t$ to be negative and very large in absolute value. If, on the other hand, $\epsilon_t$ is very low (near zero), you want $\alpha_t$ to be positive and very large. The worst case is $\epsilon_t=\frac{1}{2}$, in which case $h_t$ is of no use to you and $\alpha_t$ should be zero.
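To make these three regimes concrete, here is a minimal Python sketch of the weight formula from the question and of the weighted vote. The names `alpha` and `final_hypothesis` are made up for this illustration, and returning $+1$ on an exact tie is an assumption, not something taken from the paper.

```python
import math


def alpha(eps: float) -> float:
    """Vote weight from the question: 0.5 * ln((1 - eps) / eps), for 0 < eps < 1."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - eps) / eps)


def final_hypothesis(alphas, votes):
    """Sign of the weighted vote; each vote is +1 or -1.

    Ties (a total of exactly 0) are broken towards +1, which is an
    assumption made for this sketch.
    """
    total = sum(a * v for a, v in zip(alphas, votes))
    return 1 if total >= 0 else -1


if __name__ == "__main__":
    # Low error gives a large positive weight, error 1/2 gives zero weight,
    # high error gives a large negative weight.
    for eps in (0.01, 0.25, 0.5, 0.75, 0.99):
        print(f"eps={eps:.2f}  alpha={alpha(eps):+.3f}")

    # Two weak hypotheses both vote +1. The first is wrong 90% of the time,
    # so its negative weight flips its vote and dominates the second one.
    weights = [alpha(0.9), alpha(0.4)]
    print(final_hypothesis(weights, [+1, +1]))
```

A hypothesis that is wrong most of the time is therefore still useful: its negative weight turns it into a vote for the opposite label.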

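The function $\log\frac{1-\epsilon_t}{\epsilon_t}$ satisfies these properties: it is positive for $\epsilon_t<\frac{1}{2}$, zero at $\epsilon_t=\frac{1}{2}$, negative for $\epsilon_t>\frac{1}{2}$, and unbounded as $\epsilon_t$ approaches $0$ or $1$. By contrast, using $\epsilon_t$ itself as the weight would never give a negative weight and would give the largest weight to the worst hypotheses. The quantity $\log\frac{1-\epsilon_t}{\epsilon_t}$ is known as the log odds (where the probability considered is $p=1-\epsilon_t$, the success probability). Intuitively, the odds $\frac{p}{1-p}$ tell you how often an event with probability $p$ occurs relative to how often it does not. For example, if $\epsilon_t=1/3$ then $\frac{1-\epsilon_t}{\epsilon_t}=2$, i.e. $h_t$ is expected to succeed twice for every failure, odds of $2:1$.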
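As a worked example, the odds and the resulting weight can be evaluated at three error rates. This assumes the natural logarithm and the factor $\frac{1}{2}$ from the formula in the question; the factor only rescales the weight and does not change its sign.

```latex
\[
\epsilon_t = \tfrac{1}{3}:\quad
\frac{1-\epsilon_t}{\epsilon_t} = \frac{2/3}{1/3} = 2,
\qquad
\alpha_t = \tfrac{1}{2}\ln 2 \approx 0.347
\]
\[
\epsilon_t = \tfrac{1}{2}:\quad
\frac{1-\epsilon_t}{\epsilon_t} = 1,
\qquad
\alpha_t = \tfrac{1}{2}\ln 1 = 0
\]
\[
\epsilon_t = \tfrac{2}{3}:\quad
\frac{1-\epsilon_t}{\epsilon_t} = \frac{1/3}{2/3} = \tfrac{1}{2},
\qquad
\alpha_t = \tfrac{1}{2}\ln\tfrac{1}{2} = -\tfrac{1}{2}\ln 2 \approx -0.347
\]
```

Swapping $\epsilon_t$ for $1-\epsilon_t$ inverts the odds and so flips the sign of $\alpha_t$ while keeping its magnitude: a hypothesis that is wrong two times out of three carries as much information as one that is right two times out of three.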

Context

StackExchange Computer Science Q#136958, answer score: 4
