What does this performance formula mean?

Submitted by: @import:stackexchange-cs

Problem

I have to make a quick clustering program but the following formula is gibberish to me:


$\operatorname{Perf}(X,C) = \sum\limits_{i=1}^n\min\{||X_i-C_l||^2 \mid l = 1,...,K\}$


where $X$ is a set of $n$ multi-dimensional data points and $C$ is a set of $K$ centroids, one per data cluster.

This formula is the fitness function for an artificial bee colony clustering algorithm, used as a substitute for the k-means clustering algorithm. It is described as the total within-cluster variance, or the total mean-square quantization error (MSE).

Can anyone translate it to pseudo-code, normal human English, or at least enlighten me?

Solution

Just break it down into parts:


$ \{ f(l) \mid l = 1,...,K \} $

This is a simple set construction (set-builder notation). It creates the set of values $f(l)$ obtained by letting $l$ run from $1$ to $K$, that is, $\{f(1), f(2), ..., f(K)\}$. In your case $f(l)$ is the function:


$ ||X_i-C_l||^2 $

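Since || denotes the norm, $X_i$ and $C_l$ are vectors (rows of the $X$ and $C$ matrices). Subtract the vectors, take the norm of the difference, and square it. Doing this for every $l$ produces a set of $K$ squared distances from the point $X_i$ to each centroid. The $\min$ then picks the smallest of them: the squared distance from $X_i$ to its nearest centroid.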
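As a rough illustration of this inner step, here is a minimal Python sketch. It assumes the norm is the ordinary Euclidean norm (the formula itself does not say which norm is meant), and the function names `squared_distance` and `min_squared_distance` are made up for this example.

```python
def squared_distance(x, c):
    """Squared Euclidean norm ||x - c||^2 of two equal-length vectors (assumed norm)."""
    return sum((xj - cj) ** 2 for xj, cj in zip(x, c))


def min_squared_distance(x, centroids):
    """min{ ||x - C_l||^2 | l = 1..K }: squared distance from x to its closest centroid."""
    return min(squared_distance(x, c) for c in centroids)


point = (1.0, 2.0)
centroids = [(0.0, 0.0), (1.0, 3.0), (5.0, 5.0)]

# Squared distances to the three centroids are 5.0, 1.0 and 25.0.
print(min_squared_distance(point, centroids))  # prints 1.0
```

The generator expression inside `min` plays the role of the set $\{ f(l) \mid l = 1,...,K \}$: one value per centroid, of which only the smallest is kept.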


$ \sum\limits_{i=1}^n $

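This part is then just the sum of the above minimum over every index $i$ from $1$ to $n$, that is, over all data points. In plain English: for each data point, find the squared distance to its closest centroid, then add those values up.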
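Putting the pieces together, the whole formula could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the norm is taken to be Euclidean, `X` and `C` are plain lists of equal-length tuples, and the name `perf` simply mirrors $\operatorname{Perf}$ from the question.

```python
def perf(X, C):
    """Sum over all points of the squared (Euclidean, assumed) distance to the nearest centroid."""
    total = 0.0
    for x in X:                          # outer sum: i = 1..n
        best = float("inf")
        for c in C:                      # inner min: l = 1..K
            d2 = sum((xj - cj) ** 2 for xj, cj in zip(x, c))
            best = min(best, d2)
        total += best
    return total


X = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]   # n = 3 data points
C = [(0.0, 0.0), (10.0, 11.0)]               # K = 2 centroids

# Nearest-centroid squared distances are 0.0, 1.0 and 1.0.
print(perf(X, C))  # prints 2.0
```

A lower value means the points sit closer to their nearest centroids, which is why the formula can serve as a fitness function to minimize.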

Context

StackExchange Computer Science Q#2067, answer score: 6
