
An array of dictionaries; comparing each {key, value} pair; and combining dictionaries

Submitted by: @import:stackexchange-codereview··

Problem

I'm trying to optimize a nested for loop that compares each element in an array with the rest of the elements in the array.

There are two parts. In the first part, for example, an array has 3 elements, and each element is a dictionary:

[{"someKey_1":"a"}, {"someKey_1":"b"}, {"someKey_1":"a"}]


1st iteration (1st element compared with 2nd element):

Test the "someKey" key of the two elements; since a != b, we do nothing.

2nd iteration (1st element compared with 3rd element):

Test the "someKey" key of the two elements; since a == a, we run some logic.

The code (pseudocode):

for idx, first_dictionary in enumerate(set_of_pk_values):
    for second_dictionary in set_of_pk_values[idx+1:]:
        if first_dictionary['someKey'] == second_dictionary['someKey']:
            #Some Logic


The #Some Logic part of the code copies keys from one dictionary into another, for example:

for key in val_2.keys():
    val[key]=val_2[key]
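
The two parts described above can be put together as a small runnable sketch. The function name `merge_matching` and the extra "x" and "y" keys in the sample data are made up for illustration; they are not part of the original code.

```python
def merge_matching(records, match_key):
    """Compare every record with each later record; when their values under
    match_key are equal, copy the later record's keys into the earlier one."""
    for idx, first in enumerate(records):
        for second in records[idx + 1:]:
            if first[match_key] == second[match_key]:
                for key in second.keys():
                    first[key] = second[key]
    return records


# Hypothetical sample data: the "x" and "y" keys are illustrative only.
records = [
    {"someKey": "a", "x": 1},
    {"someKey": "b", "x": 2},
    {"someKey": "a", "y": 3},
]
merge_matching(records, "someKey")
print(records[0])
```

Only the first and third records share a value under "someKey", so only the first record picks up the third record's "y" key.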


The Code:

```
import ast
from operator import itemgetter

newList = []
skipList = []
checked = []
getter = itemgetter("predecessor")
getter_2 = itemgetter("setid_hash")

for idx, val in enumerate(set_of_pk_values):
    if(idx not in skipList):
        for val_2 in set_of_pk_values[idx+1:]:
            if(idx not in checked):
                try:
                    if (ast.literal_eval(getter(val)) == ast.literal_eval(getter(val_2))):
                        for key in val_2.keys():
                            if(key != "block" and key != "username" and key != "setid"
                               and key != "setid_hash" and key != "predecessor"
                               and key != "time_string" and key != "condition"):
                                val[key]=val_2[key]
                        skipList.append(idx)
                except:
                    if (getter(val) == getter(val_2)):
                        for key in val_2.keys():
                            if(key != "block" and key != "username" and key != "setid"
```

Solution

ast.literal_eval(...)

If we can remove your calls to ast.literal_eval(...), we should see a nice reduction in the run time of your loops. But why can we remove them? Consider:

import ast

m = str(list(range(10000)))  # a str representation of a list w/ 10k elements, 0-9999
n = '[0, 1, 2]'

x = ast.literal_eval(m)
y = ast.literal_eval(n)

x == list(range(10000))  # True


As the snippet above shows, ast.literal_eval(...) parses and evaluates whatever string you pass it and returns the Python object that the string represents (assuming, of course, that the string is a valid literal). Comparing the strings m and n directly is clearly cheaper than parsing both and then comparing x and y. It also doesn't appear that you care whether val or val_2 holds a valid Python literal: when ast.literal_eval(...) raises an exception, you fall back to comparing the strings returned by getter(val) and getter(val_2). Long story short, you can remove the try: / except: and keep only the statements under the except clause.
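
To get a rough feel for the difference, here is a minimal timing sketch. It assumes the values under "predecessor" are string-encoded lists like m above; the variable names are illustrative and the absolute timings will vary by machine. The last lines show the one case where the two comparisons disagree: equal values that are spelled differently as strings.

```python
import ast
import timeit

# Two string-encoded lists, as they might be stored under "predecessor".
m = str(list(range(10000)))
n = str(list(range(10000)))

# Both comparisons agree when equal values share one string form.
same_as_strings = (m == n)
same_as_literals = (ast.literal_eval(m) == ast.literal_eval(n))

# Time both approaches; absolute numbers depend on the machine.
t_strings = timeit.timeit(lambda: m == n, number=10)
t_literals = timeit.timeit(
    lambda: ast.literal_eval(m) == ast.literal_eval(n), number=10
)
print(f"string compare: {t_strings:.6f}s, literal_eval compare: {t_literals:.6f}s")

# Caveat: different spellings of the same value compare unequal as strings.
spaced, compact = "[0, 1, 2]", "[0,1,2]"
strings_differ = (spaced != compact)
literals_match = (ast.literal_eval(spaced) == ast.literal_eval(compact))
```

If the stored strings are always produced the same way, the plain string comparison gives the same answer without the parsing cost.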

for key in val_2.keys()

The loop above is the innermost loop of both loops 1 and 2. On each iteration you check that key is not equal to any of 7 excluded key values. 6 of these occur in the data you've presented and the 7th (condition) doesn't. It should be more efficient to replace:

for key in val_2.keys():
    if (key != "block" and key != "username" and key != "setid"
            and key != "setid_hash" and key != "predecessor"
            and key != "time_string" and key != "condition"):
        val[key] = val_2[key]


with:

# put this at the top of the test function
x_keys = {'block', 'username', 'setid', 'setid_hash', 'predecessor', 'time_string', 'condition'}
# ...
for key in set(val_2.keys()) - x_keys:
    val[key] = val_2[key]
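
As a quick check of the set-difference version, here is a small self-contained sketch. The two records and their "score" and "level" keys are hypothetical; only the excluded key names come from the code above.

```python
# Keys that must never be copied from one record into another.
x_keys = {"block", "username", "setid", "setid_hash",
          "predecessor", "time_string", "condition"}

# Hypothetical records: "score" and "level" are made-up payload keys.
val = {"username": "alice", "predecessor": "[1]", "score": 10}
val_2 = {"username": "bob", "predecessor": "[1]", "score": 20, "level": 3}

# set(val_2) holds the dictionary's keys; the difference drops excluded ones.
for key in set(val_2) - x_keys:
    val[key] = val_2[key]

print(val)
```

After the loop, val still has its own "username" and "predecessor" but has taken "score" and "level" from val_2.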


Context

StackExchange Code Review Q#84235, answer score: 5
