HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Finding "fingerprint" sets

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
fingerprintsetsfinding

Problem

Let's say we have 10 people, each with a list of favorite books. For a given person X, I would like to find a special subset of X's books liked only by X, i.e. there is no other person that likes all of the books in X's special subset. I think of this special subset as a unique "fingerprint" for X.

I would appreciate suggestions on an approach for finding such sets. (While this reads like a homework problem, it is related to a problem in my biology research that I am trying to solve.)

Solution

I assume you want the fingerprint to be as small as possible. Then this is the Hitting Set problem: For each person, make a list of all books liked by X but not by this person. Then, the goal is to select at least one book from each list. The problem is NP-hard, so you can't expect to find an algorithm that always solves it optimally in polynomial time. The greedy algorithm has a bad theoretical worst-case bound, but often works quite decent in practice. If you want to solve it optimally, an Integer Linear Programming solver should be able to solve instances of up to 1000 or maybe 10000 books. If you give more details on the size and structure of your instances, we could suggest other approaches.

Context

StackExchange Computer Science Q#7210, answer score: 6

Revisions (0)

No revisions yet.