
Soup of the day: best served during election season

Submitted by: @import:stackexchange-codereview

Problem

Community moderator elections on the Stack Exchange network are really exciting.
Alas, on the page of the primaries, I find it mildly annoying that candidates are randomly reordered on every page load, even if it's surely for a good reason.

So, I cooked up (pun totally intended) this script using Beautiful Soup 4 in Python to list the candidates in an election by their score. To use it, install beautifulsoup4 (for example with pip install --user beautifulsoup4), and then run it with a site alias and an election number, for example:

$ python primaries.py so -n 6 # the 6th election on Stack Overflow
10905 Martijn Pieters
7310 meagar
5923 Jon Clements
5461 Matt
5211 Madara Uchiha
4133 deceze
3301 Raghav Sood
3180 Paresh Mayani
3126 Jeremy Banks
2651 Ed Cottrell


Since all the Stack Exchange sites seem to use the same format, the script can work with all of them. As a proof of concept I included support for a few sites ("cr" for Code Review, "so" for Stack Overflow, "sf" for Server Fault); obviously it should be extended to others. Another feature I plan to add soon is finding the latest election by default, rather than using the first.

I mainly suspect some improvement opportunities concerning my use of Beautiful Soup, but I'm open to any kind of improvement ideas in general.

#!/usr/bin/env python

from argparse import ArgumentParser
from urllib import request

from bs4 import BeautifulSoup

STANDARD_URL_FORMAT = 'http://{}.stackexchange.com/election/{}?tab=primary'
COM_URL_FORMAT = 'http://{}.com/election/{}?tab=primary'

SITES_INFO_HELPER = [
    (('cr', 'codereview'), STANDARD_URL_FORMAT, 'codereview'),
    (('sf', 'serverfault'), COM_URL_FORMAT, 'serverfault'),
    (('so', 'stackoverflow'), COM_URL_FORMAT, 'stackoverflow'),
]


def build_sites_info():
    sites_info = {}
    for info in SITES_INFO_HELPER:
        names, url_format, url_component = info
        for name in names:
            sites_info[name] = url_format, url_component
    return sites_info

Solution

Since your list SITES_INFO_HELPER remains constant throughout the entire program, it should be a tuple instead.

Tuples are slightly faster to create than lists and are immutable, which costs you nothing here because you never write to SITES_INFO_HELPER.

Read more about when you should use a tuple and when you should use a list here.
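As a small illustration of the difference, here is a sketch using shortened placeholder entries rather than the full table from the question. The helper try_to_modify is hypothetical and exists only for this demonstration. It shows that a tuple rejects the in-place changes a list allows, so an accidental modification fails loudly instead of silently altering the constant.

```python
# Shortened stand-ins for SITES_INFO_HELPER; the entries are illustrative only.
SITES_AS_LIST = [('cr', 'codereview'), ('so', 'stackoverflow')]
SITES_AS_TUPLE = (('cr', 'codereview'), ('so', 'stackoverflow'))

# A list can be changed by any code that gets hold of it.
SITES_AS_LIST.append(('sf', 'serverfault'))


def try_to_modify(sites):
    """Hypothetical helper: report whether item assignment succeeded."""
    try:
        sites[0] = ('xx', 'example')
    except TypeError:
        # Tuples do not support item assignment.
        return False
    return True


list_was_modified = try_to_modify(SITES_AS_LIST)
tuple_was_modified = try_to_modify(SITES_AS_TUPLE)
```

Here list_was_modified ends up True and tuple_was_modified ends up False, and the tuple still holds its original entries.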

Good job using ArgumentParser in your code; it isn't seen often enough in Python code, and it is an extremely useful tool.
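For readers who have not used it, a minimal sketch of how the command line from the question (a site alias plus -n for the election number) could be parsed. This is not the asker's actual parsing code, which is not shown here; the function name parse_args, the long option --number, and the help texts are assumptions made for illustration.

```python
from argparse import ArgumentParser


def parse_args(argv=None):
    """Hypothetical parser mirroring the usage `primaries.py so -n 6`."""
    parser = ArgumentParser(description='List election candidates by score')
    parser.add_argument('site', help='site alias, for example "so" or "cr"')
    # The question says the script currently uses the first election by default.
    parser.add_argument('-n', '--number', type=int, default=1,
                        help='election number (defaults to the first election)')
    return parser.parse_args(argv)


# Passing a list makes the parser easy to try out without a real command line.
args = parse_args(['so', '-n', '6'])
```

With this sketch, args.site is 'so' and args.number is the integer 6; argparse also generates the --help output and rejects a non-numeric election number on its own.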

You could make your code more object-oriented by turning the information inside SITES_INFO_HELPER into instances of a class, probably called Site.

It doesn't have to be anything too complicated; it's just supposed to be a way to store some values:

class Site:
    def __init__(self, name, initials, format=STANDARD_URL_FORMAT):
        self.name = name
        self.initials = initials
        self.format = format


Note: I set the format parameter to STANDARD_URL_FORMAT by default because most Stack Exchange sites follow this format.

Then, when storing the sites, you'd store them like this:

SITES_INFO_HELPER = (
    Site("codereview", "cr"),
    Site("stackoverflow", "so", COM_URL_FORMAT),
    # ...
)


I'm not entirely sure, but storing site information like this may remove the need for build_sites_info.
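One possible way this could look, as a sketch under assumptions rather than a drop-in replacement: the election_url method and the names SITES and SITES_BY_ALIAS are hypothetical, and the URL formats are copied from the question. A single dict comprehension maps both the full name and the initials to the same Site instance, which covers what build_sites_info was doing.

```python
STANDARD_URL_FORMAT = 'http://{}.stackexchange.com/election/{}?tab=primary'
COM_URL_FORMAT = 'http://{}.com/election/{}?tab=primary'


class Site:
    def __init__(self, name, initials, format=STANDARD_URL_FORMAT):
        self.name = name
        self.initials = initials
        self.format = format

    def election_url(self, number):
        # Hypothetical helper: fill in the site name and the election number.
        return self.format.format(self.name, number)


SITES = (
    Site('codereview', 'cr'),
    Site('serverfault', 'sf', COM_URL_FORMAT),
    Site('stackoverflow', 'so', COM_URL_FORMAT),
)

# Both the full name and the initials point at the same Site instance.
SITES_BY_ALIAS = {
    alias: site
    for site in SITES
    for alias in (site.name, site.initials)
}

url = SITES_BY_ALIAS['so'].election_url(6)
```

Looking up 'so' or 'stackoverflow' returns the same object, and adding a new site becomes a one-line change to SITES.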

I think you made a good choice using urllib and BeautifulSoup together to read the election webpage. And, from what I know about BeautifulSoup, what you are doing looks just fine.

Also, looking over the election page layout, the way you are scanning the webpage seems to be the best, if not the only, option.

Context

StackExchange Code Review Q#96346, answer score: 7
