patterncsharpMinor
Count duplicate List<int>
Viewed 0 times
countlistduplicateint
Problem
Count duplicate
A duplicate is not order-dependent. { 1, 2, 3 } is a duplicate of { 2, 1, 3 }
Looking for speed without getting silly.
List in List.A duplicate is not order-dependent. { 1, 2, 3 } is a duplicate of { 2, 1, 3 }
Looking for speed without getting silly.
public static void CountListTest()
{
List> testData = new List>
{
new List { 7, 3, 4 },
new List { 1, 2, 3 },
new List { 2, 1, 3 },
new List { 6, 8, 3 },
new List { 9, 2, 4 },
new List { 6, 3, 8 },
new List { 7, 3, 4 },
};
List> countList = CountList(testData);
Debug.WriteLine("");
testData = new List>
{
new List { 1, 2, 3 },
new List { 2, 1, 3 },
new List { 6, 8, 3 },
new List { 2, 1, 3, 9 },
new List { 6, 8, 3, 9 },
new List { 2, 1, 3, 9 },
new List { 9, 2, 4 },
new List { 6, 3, 8 },
new List { 7, 3, 4 },
};
countList = CountList(testData);
}
public static List> CountList(List> input)
{
List> returnListList = new List>();
foreach (List list in input)
list.Sort();
HashSet alreadyMatched = new HashSet();
int count = 0;
bool match = true;
int lastEval = 0;
for (int i = 0; i 0)
{
Debug.WriteLine("Count = {0} List {1}", 1, string.Join(", ", input[lastEval]));
returnListList.Add(input[lastEval]);
}
return returnListList;
}Solution
The code itself is not so bad although it could use more
I can't tell how fast/slow it would be if it returned a result that can be printed outside the method. I compared it with another search algorithm and for 1.000.000 iterations mine is ~20ms faster.
I think the method could do better by returning a dictionary with the results.
Currently the original method runs twice over the item list. First by sorting them and then by searching for duplicates.
This could be improved by using a helper array to mark which item is already sorted and skip subsequent sortings thus run over the items only once.
The last optimization is to skip items that already have been checked. I think you are doing this with the
Making the method generic and require
The value comparison takes place in a local function
Similarly it sorts the lists and keeps track of the sorting in the local
I have run some more tests and these are the results.
Test collection
Results
{} and maybe more precise names then lastEval or match. Of course it shouldn't print to the console but return a usable result.I can't tell how fast/slow it would be if it returned a result that can be printed outside the method. I compared it with another search algorithm and for 1.000.000 iterations mine is ~20ms faster.
I think the method could do better by returning a dictionary with the results.
Currently the original method runs twice over the item list. First by sorting them and then by searching for duplicates.
This could be improved by using a helper array to mark which item is already sorted and skip subsequent sortings thus run over the items only once.
The last optimization is to skip items that already have been checked. I think you are doing this with the
alreadyMatched helper but I'm not sure about it.Making the method generic and require
T to be IComparable makes it possbile to reuse it for other collections with comparable items.The value comparison takes place in a local function
sequenceEqualFast.Similarly it sorts the lists and keeps track of the sorting in the local
sortList function. public static Dictionary, int> FindDuplicatesFast(
List> values
) where T : IComparable
{
var isDuplicate = new bool[values.Count];
var isSorted = new bool[values.Count];
// Key: first of duplicates
// Value: count of duplicates
var results = new Dictionary, int>();
var sequenceEqualFast = new Func, List, bool>((l1, l2) =>
{
if (l1.Count != l2.Count) { return false; }
for (int k = 0; k (i =>
{
if (!isSorted[i])
{
values[i].Sort();
isSorted[i] = true;
}
});
for (int i = 0; i < values.Count - 1; i++)
{
if (isDuplicate[i]) { continue; }
sortList(i);
results[values[i]] = 1;
isDuplicate[i] = true;
for (int j = i + 1; j < values.Count; j++)
{
if (isDuplicate[j]) { continue; }
sortList(j);
if (sequenceEqualFast(values[i], values[j]))
{
isDuplicate[j] = true;
results[values[i]] += 1;
}
}
}
return results;
}I have run some more tests and these are the results.
Test collection
var rnd = new Random();
var itemCount = 100;
var testData = Enumerable.Range(0, itemCount).Select(e =>
{
var count = rnd.Next(3, 5);
return Enumerable.Range(0, count).Select(en => rnd.Next(0, 10)).ToList();
}).ToList();
var testCount = 10000;Results
@Paparazzi
00:00:03.2791287
@t3chb0t
00:00:01.6763969
Code Snippets
public static Dictionary<List<T>, int> FindDuplicatesFast<T>(
List<List<T>> values
) where T : IComparable
{
var isDuplicate = new bool[values.Count];
var isSorted = new bool[values.Count];
// Key: first of duplicates
// Value: count of duplicates
var results = new Dictionary<List<T>, int>();
var sequenceEqualFast = new Func<List<T>, List<T>, bool>((l1, l2) =>
{
if (l1.Count != l2.Count) { return false; }
for (int k = 0; k < l1.Count(); k++)
{
if (l1[k].CompareTo(l2[k]) != 0)
{
return false;
}
}
return true;
});
var sortList = new Action<int>(i =>
{
if (!isSorted[i])
{
values[i].Sort();
isSorted[i] = true;
}
});
for (int i = 0; i < values.Count - 1; i++)
{
if (isDuplicate[i]) { continue; }
sortList(i);
results[values[i]] = 1;
isDuplicate[i] = true;
for (int j = i + 1; j < values.Count; j++)
{
if (isDuplicate[j]) { continue; }
sortList(j);
if (sequenceEqualFast(values[i], values[j]))
{
isDuplicate[j] = true;
results[values[i]] += 1;
}
}
}
return results;
}var rnd = new Random();
var itemCount = 100;
var testData = Enumerable.Range(0, itemCount).Select(e =>
{
var count = rnd.Next(3, 5);
return Enumerable.Range(0, count).Select(en => rnd.Next(0, 10)).ToList();
}).ToList();
var testCount = 10000;Context
StackExchange Code Review Q#151606, answer score: 4
Revisions (0)
No revisions yet.