HiveBrain v1.2.0

Theory of multi-label classification

Submitted by: @import:stackexchange-cs
theory · classification · multi-label

Problem

Multi-label classification is a machine-learning problem where each sample can have zero or more labels from a closed set of possible labels. This task has applications in several fields. For example, in dialog systems, each sentence that the human says may have several intents, and the classifier should detect all of them: the sentence "I want a cake and a drink" contains the two intents WANTCAKE and WANTDRINK.

Theoretically, I expect a classifier to classify multi-label samples, even if the training data contained only single-label samples. For example, consider the following training set (where each word is considered a feature):

  • "I want a cake" -> WANTCAKE
  • "I want a drink" -> WANTDRINK
  • "I want a solution" -> WANTSOLUTION
I would expect a classifier to realize that the words "I want a" are not relevant for classification, that the words cake/drink/solution are indicative of the classes WANTCAKE/WANTDRINK/WANTSOLUTION respectively, and thus to classify the sentence "I want a cake and a drink" correctly as {WANTCAKE, WANTDRINK}.

This seems trivial to humans. Therefore, I was very surprised to find that many state-of-the-art multi-label classifiers fail miserably on this simple task!

For example, consider a multi-label classifier built with the "Binary Relevance" method. In this method, there is a separate binary classifier for each label. For example, there is a binary classifier for the "WANTCAKE" label, trained with "I want a cake" as a positive sample and the other two sentences as negative samples. When this classifier sees the sentence "I want a cake and a drink and a solution", it sees a single feature, "cake", that is a positive signal of WANTCAKE, and two features, "drink" and "solution", that are negative signals of WANTCAKE, because they appeared in the training set only in sentences that did not have the WANTCAKE label. Therefore, this classifier returns 'negative'. The same happens for the other two binary classifiers, and thus the multi-label classifier returns the empty label set for this sentence.
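This failure mode can be reproduced end-to-end in a few lines. The sketch below (a hypothetical toy implementation in plain Python, not taken from any library) trains one multinomial naive Bayes classifier per label, with Laplace smoothing, on the three single-label sentences. Each per-label classifier fits its own training data, yet the combined sentence receives no labels at all, because "drink" acts as a negative signal for WANTCAKE and "cake" as a negative signal for WANTDRINK:

```python
import math

TRAIN = [
    ("i want a cake", "WANTCAKE"),
    ("i want a drink", "WANTDRINK"),
    ("i want a solution", "WANTSOLUTION"),
]
VOCAB = sorted({w for s, _ in TRAIN for w in s.split()})

def train_nb(label):
    """Per-label binary naive Bayes: log-prior plus Laplace-smoothed
    log-likelihoods for the positive and negative classes."""
    pos = [s for s, l in TRAIN if l == label]
    neg = [s for s, l in TRAIN if l != label]
    def loglik(sentences):
        counts = {w: 0 for w in VOCAB}
        for s in sentences:
            for w in s.split():
                counts[w] += 1
        total = sum(counts.values())
        return {w: math.log((counts[w] + 1) / (total + len(VOCAB)))
                for w in VOCAB}
    return (math.log(len(pos) / len(TRAIN)), loglik(pos),
            math.log(len(neg) / len(TRAIN)), loglik(neg))

def predict(models, sentence):
    """Binary relevance: return every label whose classifier votes positive.
    Out-of-vocabulary words (like "and") are ignored."""
    labels = set()
    for label, (lp_pos, ll_pos, lp_neg, ll_neg) in models.items():
        words = [w for w in sentence.split() if w in VOCAB]
        if lp_pos + sum(ll_pos[w] for w in words) > \
           lp_neg + sum(ll_neg[w] for w in words):
            labels.add(label)
    return labels

models = {label: train_nb(label) for _, label in TRAIN}
print(predict(models, "i want a cake"))              # -> {'WANTCAKE'}
print(predict(models, "i want a cake and a drink"))  # -> set(): both labels missed
```

In effect, each classifier has learned that a sentence mentioning a drink is evidence against wanting cake, an artifact of every training sentence carrying exactly one label.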

Solution

You subsequently clarified that you are looking for a way to do multi-label classification in general, and the example in the question about wanting a cake was just an example.

OK, here is one standard way to do multi-label classification. For each candidate label, you build a boolean classifier that outputs true or false: true means that the label applies, false means it doesn't.

In your example, you'd have three classifiers: a "cake classifier" that outputs true if the sentence should be labelled "wants-cake" and false otherwise; a "drink classifier" that outputs true if the sentence should be labelled "wants-drink" and false otherwise; and a "solution classifier" that outputs true if the sentence should be labelled "wants-solution" and false otherwise. You now train each one separately. Given a sentence, you run all three classifiers on it and use their outputs to select which labels should or should not be associated with the sentence. For instance, if the "cake classifier" outputs true, the "drink classifier" outputs false, and the "solution classifier" outputs true, then you label the sentence as "wants-cake + wants-solution".

This allows you to use any boolean classifier as your underlying building block. For instance, you can use SVMs, decision trees, random forests, and many other schemes.
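As a minimal sketch of this scheme (plain Python, hypothetical names, not from any library), the code below wraps any boolean base classifier in a binary-relevance layer: one classifier is trained per label, and a sentence receives every label whose classifier outputs true. The toy base learner here is a single-word "stump" that predicts true iff its chosen word is present; because such a stump simply ignores the irrelevant words, this particular combination happens to handle the composite sentence from the question, though other base learners need not, as the Problem section shows:

```python
class WordStump:
    """Toy boolean base learner: predicts True iff one chosen word is present.
    fit() picks the word whose presence best matches the boolean labels."""
    def fit(self, sentences, labels):
        vocab = sorted({w for s in sentences for w in s.split()})
        def accuracy(word):
            return sum((word in s.split()) == y
                       for s, y in zip(sentences, labels))
        self.word = max(vocab, key=accuracy)
        return self

    def predict(self, sentence):
        return self.word in sentence.split()


class BinaryRelevance:
    """One boolean classifier per label; a sentence gets every label whose
    classifier outputs True. The base classifier type is pluggable."""
    def __init__(self, base_factory):
        self.base_factory = base_factory

    def fit(self, sentences, label_sets):
        all_labels = sorted({l for ls in label_sets for l in ls})
        self.classifiers = {
            label: self.base_factory().fit(
                sentences, [label in ls for ls in label_sets])
            for label in all_labels}
        return self

    def predict(self, sentence):
        return {label for label, clf in self.classifiers.items()
                if clf.predict(sentence)}


sentences = ["i want a cake", "i want a drink", "i want a solution"]
label_sets = [{"WANTCAKE"}, {"WANTDRINK"}, {"WANTSOLUTION"}]
model = BinaryRelevance(WordStump).fit(sentences, label_sets)
print(sorted(model.predict("i want a cake")))              # -> ['WANTCAKE']
print(sorted(model.predict("i want a cake and a drink")))  # -> ['WANTCAKE', 'WANTDRINK']
```

Swapping WordStump for an SVM, decision tree, or any other scheme only requires implementing the same fit/predict interface.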

Context

StackExchange Computer Science Q#14107, answer score: 3

Revisions (0)

No revisions yet.