patternpythonMinor
Creating a table using given items
Viewed 0 times
creatingitemsusinggiventable
Problem
This simply creates a table using the items passed in. I would like to make this function shorter and/or more efficient. What I have seems pretty redundant.
def apriori_creator(items, k):
"""Function that creates apriori table using supported candidates"""
table = []
the_length = len(items)
new_k = k-2 # first k-2 items as described in algorithm in book for 'apriori gen'
for x in range(the_length):
next = x + 1
for y in range(next, the_length):
combined = items[x].union(items[y]) # merge
l_sub_one = list(items[x])
l_sub_one.sort()
l_sub_two = list(items[y])
l_sub_two.sort()
first = l_sub_one[:new_k] # getting the first k-2 items of the set
second = l_sub_two[:new_k]
if first == second:
table.append(combined)
return tableSolution
For efficiency, you can take take some things out of the inner loop:
-
Revised code:
A further improvement is possible.
One way to describe what you are doing is that you are generating all different pairs of items that share the same value of
Here's again revised code. I'm using
- The computations that depend only on
xshould be in the outer loop.
- The computation of
combinedshould be under theifwhere it is used.
- Using
enumerateand direct iteration instead ofrangeis simpler and a little faster.
-
l_sub_one = sorted(items[x]) does the same as these two lines:l_sub_one = list(items[x])
l_sub_one.sort()Revised code:
def apriori_creator(items, k):
"""Function that creates apriori table using supported candidates"""
table = []
new_k = k-2 # first k-2 items as described in algorithm in book for 'apriori gen'
for x, item_x in enumerate(items):
l_sub_one = sorted(item_x)
first = l_sub_one[:new_k] # getting the first k-2 items of the set
for item_y in items[x+1:]:
l_sub_two = sorted(item_y)
second = l_sub_two[:new_k]
if first == second:
combined = item_x.union(item_y) # merge
table.append(combined)
return tableA further improvement is possible.
One way to describe what you are doing is that you are generating all different pairs of items that share the same value of
sorted(item)[:new_k]. Your code does this by considering all possible pairs. A more efficient approach would begin by grouping the items by that value. With the help of a dictionary this only requires one pass over the items. Then you only need to generate pairs withing each group.Here's again revised code. I'm using
collections.defaultdict for grouping, and itertools.combinations for pairing.from collections import defaultdict
from itertools import combinations
def apriori_creator(items, k):
"""Function that creates apriori table using supported candidates"""
new_k = k-2 # first k-2 items as described in algorithm in book for 'apriori gen'
groups = defaultdict(list) # group by the first k-2 items of the set
for item in items:
l_sub_one = sorted(item)
key = tuple(l_sub_one[:new_k])
groups[key].append(item)
table = []
for group in groups.values():
if len(group) > 1:
for item_x, item_y in combinations(group, 2): # all pairs
combined = item_x.union(item_y) # merge
table.append(combined)
return tableCode Snippets
l_sub_one = list(items[x])
l_sub_one.sort()def apriori_creator(items, k):
"""Function that creates apriori table using supported candidates"""
table = []
new_k = k-2 # first k-2 items as described in algorithm in book for 'apriori gen'
for x, item_x in enumerate(items):
l_sub_one = sorted(item_x)
first = l_sub_one[:new_k] # getting the first k-2 items of the set
for item_y in items[x+1:]:
l_sub_two = sorted(item_y)
second = l_sub_two[:new_k]
if first == second:
combined = item_x.union(item_y) # merge
table.append(combined)
return tablefrom collections import defaultdict
from itertools import combinations
def apriori_creator(items, k):
"""Function that creates apriori table using supported candidates"""
new_k = k-2 # first k-2 items as described in algorithm in book for 'apriori gen'
groups = defaultdict(list) # group by the first k-2 items of the set
for item in items:
l_sub_one = sorted(item)
key = tuple(l_sub_one[:new_k])
groups[key].append(item)
table = []
for group in groups.values():
if len(group) > 1:
for item_x, item_y in combinations(group, 2): # all pairs
combined = item_x.union(item_y) # merge
table.append(combined)
return tableContext
StackExchange Code Review Q#73629, answer score: 2
Revisions (0)
No revisions yet.