patternMinor
Classification algorithm for high dimensional data which is uniquely definable in a very small sub-space
Viewed 0 times
spacewhichhighsubdefinablealgorithmsmallforuniquelyvery
Problem
I am new to machine learning, so forgive me if I am doing something absolutely absurd.
I have a classification task (~100 classes) and have about 2 million training data points in a 2000-dimensional space. Coordinates of data points are integers (discrete). All points have non-zero coordinates only for < 10 dimensions. That is, each point can be uniquely defined in < 10 dimensional sub-space.
If I use a Gaussian Mixture Model (GMM) for each class, I will end up with ~100 GMMs in a 2000-dimensional space. I feel that given the fact that each point is uniquely definable in less than 10 dimensional space, there can possibly be a better way of doing it.
What am I missing here?
I have a classification task (~100 classes) and have about 2 million training data points in a 2000-dimensional space. Coordinates of data points are integers (discrete). All points have non-zero coordinates only for < 10 dimensions. That is, each point can be uniquely defined in < 10 dimensional sub-space.
If I use a Gaussian Mixture Model (GMM) for each class, I will end up with ~100 GMMs in a 2000-dimensional space. I feel that given the fact that each point is uniquely definable in less than 10 dimensional space, there can possibly be a better way of doing it.
What am I missing here?
Solution
Since your data are extremely sparse, using GMMs or a traditional SVM will result in an over-fit model. By employing methods that exploit the sparsity of the structure, you should get much better results. Regression methods typically add some penalty function as a measure of the amount of non-zero values. This is usually referred to as "regularization". Doing this exactly (under the $L_0$ norm) is difficult, so relaxations are used (lasso: $L_1$, ridge: $L_2$).
If you are comfortable with Python, SciKit Learn has implementations for several approaches that utilize either lasso or ridge penalties. Here is an example of Lasso penalty for an SVM model. I would try that and see how well it works. It may require a parameter to adjust the regularization amount. You can do cross-validation to find what parameter works best.
If you are comfortable with Python, SciKit Learn has implementations for several approaches that utilize either lasso or ridge penalties. Here is an example of Lasso penalty for an SVM model. I would try that and see how well it works. It may require a parameter to adjust the regularization amount. You can do cross-validation to find what parameter works best.
Context
StackExchange Computer Science Q#22217, answer score: 3
Revisions (0)
No revisions yet.