HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Finding the spots furthest away from bad data

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
spotsawaythebadfindingfurthestfromdata

Problem

The real-world problem I'm trying to solve is to create an algorithm which finds the ideal channels where I can transmit radio traffic, inside a radio frequency band that could contain various disturbances. These n ideal channels are those who are furthest away from any of the disturbances, but also the furthest away from each other. Where n is a constant picked by me, depending on how many channels the radio can support.

In abstract terms, I have a chunk of generic data where each item could either be labelled "good" or "bad". It can be considered to be an array of booleans, or illustrated as a series of 1=good and 0=bad, for example 0010111100.

In this data, consecutive sequences of good data form good segments:

0010111100   data size=10
  ^ ^
  | +------- good segment 2, size 4
  +--------- good segment 1, size 1


In the best case, there only exists one good segment covering all of the data:

1111111111   data size=10, 1 good segment with size 10


No matter the nature of the data, I need to find the n "best places" that are furthest away from any bad data, but also from each other. It is considered equally bad to be near a "bad" item as it is to be near another "best place".

The ideal solution is where the minimum distance between all "best places" and bad data/other best places is as long as possible, for all best places.

For example, in the case of 1111111111 where n=3 places, they could be located as follows (simply divide the data size by 3):

1111111111   data size=10, n=3
  ^  ^  ^


Or in case of 0110111110 n=3, the following would be fine:

0110111110   data size=10, n=3
 ^   ^ ^

or

0110111110   data size=10, n=3
  ^  ^ ^


In case there is too much bad data, the problem has no solution. The sum of all good segment sizes must be at least n or there exists no solution.
Example:

0000010001   data size=10, n=3, no solution possible.


I suppose one solution is to find the n best segments out of all the good

Solution

I am going to assume you want to maximize the "separation". In particular, I'll say that a solution has separation $\ge s$ if every selected place is at least $s$ apart from any selected place, and if every selected place is at least $s$ away from any bad place.

Then this can be solved in $O(m \lg (m/n))$ time using binary search and a greedy algorithm, where $m = $ the data size. I'll show how to determine whether it's possible to place $n$ stations achieving separation $\ge s$, in $O(m)$ time; then you can use binary search on $s$ to find the largest $s$ achievable.

To determine whether separation $\ge s$ is achievable, just use a greedy algorithm with a left-to-right scan. Select the left-most place you can. Then, scanning left to right, select the first place that can be selected (taking into account it has to be at least $s$ to the right of the one you previously selected, and at least $s$ away from any bad place). Keep iterating. When you're done, that gives you the maximum number of places that can be selected, subject to the requirement that you have separation $\ge s$.

Now you can use binary search from here. If the number of places that could be selected was less than $n$, decrease $s$ and try again. If the number of places that could be selected was greater than or equal to $n$, increase $s$ and try again. There are at most $m/n$ possible values of $s$, so after $\lg (m/n)$ steps of binary search, this will terminate. If you start the binary search at the smallest possible value of $s$, the running time will be $O(m \lg s)$.

Context

StackExchange Computer Science Q#68214, answer score: 3

Revisions (0)

No revisions yet.