Parsing a complicated CSV generated on election nights
Problem
I have a Python script which parses a complicated CSV generated on election nights. Each row of the CSV represents a race. As I loop through the races, I store the candidates for each race into a list called
cnds. The other variable to note is called num_win, and it holds the number of people who will be elected for that particular race. Usually it's just 1, but in cases like school boards, it can be much higher.
For illustration purposes, here's some sample data we might process:
num_win = 6
cnds = [
    { 'cnd' : 'Christine Matthews', 'votes' : 200, 'winner': False },
    { 'cnd' : 'Dexter Holmes', 'votes' : 123, 'winner': False },
    { 'cnd' : 'Gerald Wheeler', 'votes' : 123, 'winner': False },
    { 'cnd' : 'Timothy Hunter', 'votes' : 100, 'winner': False },
    { 'cnd' : 'Sheila Murray', 'votes' : 94, 'winner': False },
    { 'cnd' : 'Elisa Banks', 'votes' : 88, 'winner': False },
    { 'cnd' : 'John Park', 'votes' : 88, 'winner': False },
    { 'cnd' : 'Guadalupe Bates', 'votes' : 76, 'winner': False },
    { 'cnd' : 'Lynne Austin', 'votes' : 66, 'winner': False }
]
First attempt:
My initial version was pretty straightforward. Make a copy of cnds, sort it in descending order of vote count, and keep only the first num_win candidates. These are the winners. Then loop through cnds and mark the winners.
winners = sorted(cnds, key=lambda k: int(k['votes']), reverse=True)[0:num_win]
for cnd in cnds:
    for winner in winners:
        if cnd['cnd'] == winner['cnd']:
            cnd['winner'] = True
This works great -- except I realized later that it doesn't account for ties.
Since this script is for election night when the results are unofficial, I only want to mark as winners the candidates I am sure are winners. In the data above, the clear winners would be: Christine Matthews, Dexter Holmes, Gerald Wheeler, Timothy Hunter, and Sheila Murray. There is a tie for the sixth spot. Depending on the type of race, etc, it might be sett
Solution
# Take one more candidate than necessary into the winners list
winners = sorted(cnds, key=lambda k: int(k['votes']), reverse=True)[0:num_win + 1]
# A tie to be resolved happens when the last two candidates have equal vote counts.
# If so, go backwards removing everybody with the same vote count.
# Binary search would be faster, of course. Python's bisect module is the
# closest standard-library analogue of std::lower_bound from the C++ STL.
index = min(num_win, len(winners))
# Only check for a tie when the extra candidate actually exists
if len(winners) > num_win:
    while index > 0 and winners[index - 1]['votes'] == winners[num_win]['votes']:
        index -= 1
# Finally, adjust your list
winners = winners[0:index]
PS: One more thing to mention. The final nested loop is not really the best approach. You should decorate the original list with sequence numbers (or use some other method to remember the initial ordering), sort it, mark the winners, which are at the beginning of the list, and then sort it by sequence number to restore the original order.
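The decorate, sort, mark, and restore approach from the PS could look roughly like the sketch below. The function name mark_clear_winners is hypothetical, and the sketch assumes each candidate is a dict with 'cnd', 'votes', and 'winner' keys as in the question's sample data. It reuses the same backwards tie-trimming loop, so anyone tied at the cutoff stays unmarked.

```python
def mark_clear_winners(cnds, num_win):
    """Mark clear winners; leave anyone tied at the cutoff unmarked.

    Hypothetical helper: assumes each item is a dict with
    'cnd', 'votes', and 'winner' keys.
    """
    # Decorate: pair each candidate with its original position.
    decorated = list(enumerate(cnds))
    # Sort by vote count, highest first.
    decorated.sort(key=lambda pair: int(pair[1]['votes']), reverse=True)

    # Trim a tie that straddles the cutoff by walking backwards.
    index = min(num_win, len(decorated))
    if len(decorated) > num_win:
        cutoff_votes = int(decorated[num_win][1]['votes'])
        while index > 0 and int(decorated[index - 1][1]['votes']) == cutoff_votes:
            index -= 1

    # Mark the winners, which now sit at the front of the list.
    for _, cnd in decorated[:index]:
        cnd['winner'] = True

    # Undecorate: sort by sequence number to restore the original order.
    decorated.sort(key=lambda pair: pair[0])
    return [cnd for _, cnd in decorated]
```

Calling mark_clear_winners(cnds, num_win) returns the candidates in their original order. The dicts are the same objects that cnds holds, so the 'winner' flags are also visible through cnds itself.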
Code Snippets
# Take one more candidate than necessary into the winners list
winners = sorted(cnds, key=lambda k: int(k['votes']), reverse=True)[0:num_win + 1]
# A tie to be resolved happens when the last two candidates have equal vote counts.
# If so, go backwards removing everybody with the same vote count.
# Binary search would be faster, of course. Python's bisect module is the
# closest standard-library analogue of std::lower_bound from the C++ STL.
index = min(num_win, len(winners))
# Only check for a tie when the extra candidate actually exists
if len(winners) > num_win:
    while index > 0 and winners[index - 1]['votes'] == winners[num_win]['votes']:
        index -= 1
# Finally, adjust your list
winners = winners[0:index]
Context
StackExchange Code Review Q#49570, answer score: 2