snippetjavascriptTip
Group data using the K-means clustering algorithm in JavaScript
Viewed 0 times
clusteringdatajavascriptgroupmeanstheusingalgorithm
Problem
The K-means clustering) algorithm is a popular unsupervised machine learning algorithm used to group a set of data into clusters. It works by iteratively assigning data points to the nearest cluster centroid and then recalculating the centroids based on the new assignments. This process is repeated until the centroids no longer change significantly or a maximum number of iterations is reached.
This implementation of the K-means clustering algorithm groups the given data into
This implementation of the K-means clustering algorithm groups the given data into
k clusters. No maximum number of iterations is set, so the algorithm will run until convergence is reached.- As no initial centroids are provided, start by using the first
kdata points as the initialcentroids, usingArray.prototype.slice(). - Initialize the
distancesarray to store the distances between each data point and each centroid, as well as theclassesarray to store the cluster assignments for each data point. - Use a
whileloop to repeat the assignment and update steps as long as there are changes in the previous iteration, as indicated by theitrvariable.
Solution
const kMeans = (data, k = 1) => {
const centroids = data.slice(0, k);
const distances = Array.from({ length: data.length }, () =>
Array.from({ length: k }, () => 0)
);
const classes = Array.from({ length: data.length }, () => -1);
let itr = true;
while (itr) {
itr = false;
for (let d in data) {
for (let c = 0; c < k; c++) {
distances[d][c] = Math.hypot(
...Object.keys(data[0]).map(key => data[d][key] - centroids[c][key])
);
}
const m = distances[d].indexOf(Math.min(...distances[d]));
if (classes[d] !== m) itr = true;
classes[d] = m;
}
for (let c = 0; c < k; c++) {
centroids[c] = Array.from({ length: data[0].length }, () => 0);
const size = data.reduce((acc, _, d) => {
if (classes[d] === c) {
acc++;
for (let i in data[0]) centroids[c][i] += data[d][i];
}
return acc;
}, 0);
for (let i in data[0]) {
centroids[c][i] = Number.parseFloat(
Number(centroids[c][i] / size).toFixed(2)
);
}
}
}
return classes;
};
kMeans([[0, 0], [0, 1], [1, 3], [2, 0]], 2); // [0, 1, 1, 0]- As no initial centroids are provided, start by using the first
kdata points as the initialcentroids, usingArray.prototype.slice(). - Initialize the
distancesarray to store the distances between each data point and each centroid, as well as theclassesarray to store the cluster assignments for each data point. - Use a
whileloop to repeat the assignment and update steps as long as there are changes in the previous iteration, as indicated by theitrvariable. - Calculate the Euclidean distance between each data point and centroid using
Math.hypot(),Object.keys(), andArray.prototype.map(). - Use
Array.prototype.indexOf()andMath.min()to find the closest centroid for each data point. - Update the cluster assignments in the
classesarray and check if any changes were made.
Code Snippets
const kMeans = (data, k = 1) => {
const centroids = data.slice(0, k);
const distances = Array.from({ length: data.length }, () =>
Array.from({ length: k }, () => 0)
);
const classes = Array.from({ length: data.length }, () => -1);
let itr = true;
while (itr) {
itr = false;
for (let d in data) {
for (let c = 0; c < k; c++) {
distances[d][c] = Math.hypot(
...Object.keys(data[0]).map(key => data[d][key] - centroids[c][key])
);
}
const m = distances[d].indexOf(Math.min(...distances[d]));
if (classes[d] !== m) itr = true;
classes[d] = m;
}
for (let c = 0; c < k; c++) {
centroids[c] = Array.from({ length: data[0].length }, () => 0);
const size = data.reduce((acc, _, d) => {
if (classes[d] === c) {
acc++;
for (let i in data[0]) centroids[c][i] += data[d][i];
}
return acc;
}, 0);
for (let i in data[0]) {
centroids[c][i] = Number.parseFloat(
Number(centroids[c][i] / size).toFixed(2)
);
}
}
}
return classes;
};
kMeans([[0, 0], [0, 1], [1, 3], [2, 0]], 2); // [0, 1, 1, 0]Context
From 30-seconds-of-code: k-means
Revisions (0)
No revisions yet.