Algorithm to distribute items "evenly"
Problem
I'm looking for an algorithm that rearranges the values in a list so that the result is as "balanced" or "evenly distributed" as possible (in quotes because I'm not sure these are the best terms; below I give a way to measure whether one result is better than another).
So, for the list:

    [1, 1, 2, 2, 3, 3]

One of the best results, after redistributing the values, is:

    [1, 2, 3, 1, 2, 3]

There may be other results as good as this one, and of course this gets more complicated with a less uniform set of values.
This is how to measure whether one result is better than another:
- Count the distance between each item and the next item with the same value.
- Calculate the standard deviation of that set of distances. A lower dispersion means a better result.
Observations:
- When calculating a distance and the end of the list is reached without finding an item with the same value, we go back to the beginning of the list. So, at most, the same item will be found and the distance for that item will be the length of the list. This means that the list is cyclic;
- A typical list has ~50 items with ~15 different values in varied quantities.
So:
- For the result [1, 2, 3, 1, 2, 3], the distances are [3, 3, 3, 3, 3, 3], and the standard deviation is 0;
- For the result [1, 1, 2, 2, 3, 3], the distances are [1, 5, 1, 5, 1, 5], and the standard deviation is 2;
- This makes the first result better than the second (lower deviation is better).
Given these definitions, I ask for a clue of which algorithms or strategies should I search for.
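The measure defined in this question can be sketched in C# roughly as follows. This is only an illustration: the names `SpreadMetric`, `CyclicDistances` and `StdDev` are made up for the example, and it assumes the population standard deviation (dividing by N), since that is what reproduces the values 0 and 2 quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SpreadMetric
{
    // For each position, count the steps forward (wrapping around the end)
    // until the same value is seen again. A value that occurs only once
    // gets a distance equal to the length of the list.
    public static int[] CyclicDistances(IReadOnlyList<int> items)
    {
        int n = items.Count;
        var distances = new int[n];
        for (int i = 0; i < n; i++)
        {
            int step = 1;
            while (items[(i + step) % n] != items[i]) step++;
            distances[i] = step;
        }
        return distances;
    }

    // Population standard deviation (assumption: divide by N, not N - 1).
    // Throws on an empty list because Average() has nothing to average.
    public static double StdDev(IReadOnlyList<int> values)
    {
        double mean = values.Average();
        double variance = values.Sum(v => (v - mean) * (v - mean)) / values.Count;
        return Math.Sqrt(variance);
    }

    public static void Main()
    {
        var balanced = new[] { 1, 2, 3, 1, 2, 3 };
        var clumped = new[] { 1, 1, 2, 2, 3, 3 };
        Console.WriteLine(StdDev(CyclicDistances(balanced))); // 0
        Console.WriteLine(StdDev(CyclicDistances(clumped)));  // 2
    }
}
```

Any candidate arrangement can be scored this way, which makes it possible to compare the output of different strategies on the same input.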
Solution
I ran across this question while researching a similar problem: optimum additions of liquids to reduce stratification. It seems like my solution would be applicable to your situation, as well.
If you want to mix liquids A, B, and C in the proportion 30,20,10 (that is, 30 units of A, 20 units of B, and 10 units of C), you end up with stratification if you add all the A, then all the B, and then all the C. You're better off mixing smaller units. For example, do single-unit additions in the sequence [A,B,A,C,B,A]. That will prevent stratification altogether.
The way I found to do it is to treat it as a kind of merge, using a priority queue. If I create a structure to describe the additions:

    MergeItem
        Item, Count, Frequency, Priority

The Frequency is expressed as "one every N". So A, which is added three out of six times, has a Frequency of 2 (6/3).
I then initialize a heap that contains:

    (A, 3, 2, 2)
    (B, 2, 3, 3)
    (C, 1, 6, 6)

Now I remove the first item from the heap and output it. Then I reduce its Count by 1, increase its Priority by its Frequency, and add it back to the heap. The resulting heap is:

    (B, 2, 3, 3)
    (A, 2, 2, 4)
    (C, 1, 6, 6)

Next, I remove B from the heap, output and update it, then add it back to the heap:

    (A, 2, 2, 4)
    (C, 1, 6, 6)
    (B, 1, 3, 6)

If I continue in that fashion, I get the desired mixture. I use a custom comparer to ensure that when items with equal Priority are inserted into the heap, the one with the highest Frequency value (i.e. the least frequent) is ordered first.
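A minimal C# sketch of the merge described above, for illustration only. It is not the code from the blog post: the names `EvenMerge` and `Distribute` are hypothetical, and it assumes .NET 6 or later for `PriorityQueue<TElement, TPriority>`. The tie-break rule is encoded as the second component of the queue priority.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EvenMerge
{
    // counts[i] is how many times item i must appear in the output.
    public static List<int> Distribute(IReadOnlyList<int> counts)
    {
        int total = counts.Where(c => c > 0).Sum();

        // The queue key is (Priority, -Frequency): lowest Priority first and,
        // on a tie, the highest Frequency (the least frequent item) first.
        var heap = new PriorityQueue<(int Item, int Remaining, double Frequency),
                                     (double Priority, double NegFrequency)>();
        for (int i = 0; i < counts.Count; i++)
        {
            if (counts[i] <= 0) continue;
            double frequency = (double)total / counts[i]; // "one every N"
            heap.Enqueue((i, counts[i], frequency), (frequency, -frequency));
        }

        var output = new List<int>(total);
        while (heap.TryDequeue(out var entry, out var key))
        {
            output.Add(entry.Item);
            if (entry.Remaining > 1)
            {
                // Reduce the count by 1 and push the item back with
                // its Priority increased by its Frequency.
                heap.Enqueue((entry.Item, entry.Remaining - 1, entry.Frequency),
                             (key.Priority + entry.Frequency, -entry.Frequency));
            }
        }
        return output;
    }

    public static void Main()
    {
        // Items 0, 1, 2 stand for A, B, C with counts 3, 2, 1.
        var order = Distribute(new[] { 3, 2, 1 });
        Console.WriteLine(string.Join(",", order.Select(i => "ABC"[i]))); // A,B,A,C,B,A
    }
}
```

When two items have the same Priority and the same Frequency, `PriorityQueue` does not guarantee which one comes out first, so inputs with equal counts can produce different but equally spread sequences.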
I wrote a more complete description of the problem and its solution on my blog, and presented some working C# code that illustrates it. See Evenly distributing items in a list.
Update after comments
I do think my problem is similar to the OP's problem, and therefore that my solution is potentially useful. I apologize for not framing my answer more in the terms of the OP's question.
The first objection, that my solution is using A, B, and C rather than 0, 1, and 2, is easily remedied. It's simply a matter of nomenclature. I find it easier and less confusing to think about and say "two A's" rather than "two 1's". But for purposes of this discussion I have modified my outputs below to use the OP's nomenclature.
Of course my problem deals with the concept of distance. If you want to "spread things out evenly," distance is implied. But, again, it was my failing for not adequately showing how my problem is similar to the OP's problem.
I ran a few tests with the two examples that the OP provided. That is:

    [1,1,2,2,3,3]  // which I converted to [0,0,1,1,2,2]
    [0,0,0,0,1,1,1,2,2,3]

In my nomenclature those are expressed as [2,2,2] and [4,3,2,1], respectively. That is, in the last example, "4 items of type 0, 3 items of type 1, 2 items of type 2, and 1 item of type 3."
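Converting a list into this count-based form takes only a few lines with LINQ. The sketch below is an illustration, assuming that item types are integers and that counts are listed in ascending order of type; the names `Counts` and `FromList` are made up for the example.

```csharp
using System;
using System.Linq;

public static class Counts
{
    // Returns one count per distinct value, ordered by value.
    public static int[] FromList(int[] items) =>
        items.GroupBy(x => x)
             .OrderBy(g => g.Key)
             .Select(g => g.Count())
             .ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FromList(new[] { 0, 0, 1, 1, 2, 2 })));             // 2,2,2
        Console.WriteLine(string.Join(",", FromList(new[] { 0, 0, 0, 0, 1, 1, 1, 2, 2, 3 }))); // 4,3,2,1
    }
}
```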
I ran my test program (as described immediately below), and have posted my results. Absent input from the OP, I can't say if my results are similar to, worse than, or better than his. Nor can I compare my results to anybody else's results because nobody else has posted any.
I can say, however, that the algorithm provides a good solution to my problem of eliminating stratification when mixing liquids. And it looks like it provides a reasonable solution to the OP's problem.
For the results shown below, I used the algorithm that I detailed in my blog entry, with the initial priority set to Frequency/2, and the heap comparer modified to favor the more frequent item. The modified code is shown here, with the modified lines commented.

    private class HeapItem : IComparable<HeapItem>
    {
        public int ItemIndex { get; private set; }
        public int Count { get; set; }
        // Expressed as "one every N": totalItems / Count.
        public double Frequency { get; private set; }
        public double Priority { get; set; }

        public HeapItem(int itemIndex, int count, int totalItems)
        {
            ItemIndex = itemIndex;
            Count = count;
            Frequency = (double)totalItems / Count;
            // ** Modified the initial priority setting.
            Priority = Frequency / 2;
        }

        public int CompareTo(HeapItem other)
        {
            if (other == null) return 1;
            var rslt = Priority.CompareTo(other.Priority);
            if (rslt == 0)
            {
                // ** Modified to favor the more frequent item.
                rslt = Frequency.CompareTo(other.Frequency);
            }
            return rslt;
        }
    }

Running my test program with the OP's first example, I get:

    Counts: 2,2,2
    Sequence: 1,0,2,1,0,2
    Distances for item type 0: 3,3
    Stddev = 0
    Distances for item type 1: 3,3
    Stddev = 0
    Distances for item type 2: 3,3
    Stddev = 0

So my algorithm works for the trivial problem of all counts being equal.
For the second problem that the OP posted, I got:
```
Counts: 4,3,2,1
Sequence: 0,1,2,0,1,3,0,2,1,0
Distances for item type 0: 3,3,3,1
Stddev = 0.866025403784439
Distances for item type 1: 3,4,3
Stddev = 0.471404520791032
Distances for item type 2: 5,5
Stddev = 0
Distances for item ty
```
Context
StackExchange Computer Science Q#29709, answer score: 13