Soup of the day: best served during election season
Problem
Community moderator elections on the Stack Exchange network are really exciting.
Alas, I find it mildly annoying that the candidates on the primaries page are randomly reordered on every page load, even though there is surely a good reason for it.
So I cooked up (pun totally intended) this script, which uses Beautiful Soup 4 in Python to list the candidates in an election by their score. To use it, install beautifulsoup4 (for example with pip install --user beautifulsoup4), and then pass it the site whose candidates you want to list (Code Review, for instance). For example:
```
$ python primaries.py so -n 6  # the 6th election on Stack Overflow
10905 Martijn Pieters
7310 meagar
5923 Jon Clements
5461 Matt
5211 Madara Uchiha
4133 deceze
3301 Raghav Sood
3180 Paresh Mayani
3126 Jeremy Banks
2651 Ed Cottrell
```
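A command line like the one above could be parsed with argparse roughly as follows. This is only a sketch: the positional site name and the -n option come from the example, while the function name parse_args, the destination name election, the help texts, and the default of 1 are assumptions rather than the script's actual code.

```python
from argparse import ArgumentParser


def parse_args(argv=None):
    """Parse a site name and an optional election number (hypothetical helper)."""
    parser = ArgumentParser(description='List primary candidates by score')
    parser.add_argument('site', help='site name or abbreviation, e.g. "so"')
    # Assumed default: the first election, as the text describes.
    parser.add_argument('-n', dest='election', type=int, default=1,
                        help='number of the election to look up')
    return parser.parse_args(argv)


args = parse_args(['so', '-n', '6'])
print(args.site, args.election)  # so 6
```

Passing argv explicitly, instead of always reading sys.argv, keeps the parsing step easy to try out in isolation.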
Since all Stack Exchange sites seem to use the same page format, the script can work with any of them. As a proof of concept I included support for a few other sites (use "so" for Stack Overflow, "sf" for Server Fault); obviously it should be extended to others. Another feature I plan to add soon is to find the latest election by default, rather than using the first.
I mainly suspect there is room for improvement in my use of Beautiful Soup, but I'm open to improvement ideas of any kind.
```python
#!/usr/bin/env python

from argparse import ArgumentParser
from urllib import request

from bs4 import BeautifulSoup

STANDARD_URL_FORMAT = 'http://{}.stackexchange.com/election/{}?tab=primary'
COM_URL_FORMAT = 'http://{}.com/election/{}?tab=primary'

# Each entry: (accepted names, URL format, site component of the URL)
SITES_INFO_HELPER = [
    (('cr', 'codereview'), STANDARD_URL_FORMAT, 'codereview'),
    (('sf', 'serverfault'), COM_URL_FORMAT, 'serverfault'),
    (('so', 'stackoverflow'), COM_URL_FORMAT, 'stackoverflow'),
]


def build_sites_info():
    # Map every accepted name to its (URL format, URL component) pair.
    sites_info = {}
    for info in SITES_INFO_HELPER:
        names, url_format, url_component = info
        for name in names:
            sites_info[name] = url_format, url_component
    return sites_info

# ...
```
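To make the role of this lookup table concrete, here is a minimal sketch of how an entry might be turned into an election URL. The two format strings and the table are taken from the code above; the helper election_url and its default election number are hypothetical and only illustrate that each format string takes a site component and an election number.

```python
STANDARD_URL_FORMAT = 'http://{}.stackexchange.com/election/{}?tab=primary'
COM_URL_FORMAT = 'http://{}.com/election/{}?tab=primary'

SITES_INFO_HELPER = [
    (('cr', 'codereview'), STANDARD_URL_FORMAT, 'codereview'),
    (('sf', 'serverfault'), COM_URL_FORMAT, 'serverfault'),
    (('so', 'stackoverflow'), COM_URL_FORMAT, 'stackoverflow'),
]


def build_sites_info():
    # Map every accepted name to its (URL format, URL component) pair.
    sites_info = {}
    for names, url_format, url_component in SITES_INFO_HELPER:
        for name in names:
            sites_info[name] = url_format, url_component
    return sites_info


def election_url(site, election=1):
    """Hypothetical helper: fill the site's URL format with its component."""
    url_format, url_component = build_sites_info()[site]
    return url_format.format(url_component, election)


print(election_url('cr'))     # http://codereview.stackexchange.com/election/1?tab=primary
print(election_url('so', 6))  # http://stackoverflow.com/election/6?tab=primary
```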
Solution
Since your list SITES_INFO_HELPER remains constant throughout the entire program, it should be a tuple instead.
Tuples are marginally faster than lists and are write-protected, which doesn't make a difference to you because you are not writing to them.
Read more about when you should use a tuple and when you should use a list here.
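As a small illustration of the write protection mentioned above, the sketch below tries to overwrite the first item of a list and of a tuple. The helper try_overwrite and the sample values are made up for the demonstration, and the size comparison at the end reflects CPython behaviour rather than a language guarantee.

```python
import sys


def try_overwrite(sequence):
    """Return True if the first item could be replaced, False otherwise."""
    try:
        sequence[0] = 'superuser'
    except TypeError:
        # Tuples do not support item assignment.
        return False
    return True


sites_as_list = ['codereview', 'serverfault', 'stackoverflow']
sites_as_tuple = ('codereview', 'serverfault', 'stackoverflow')

print(try_overwrite(sites_as_list))   # True: the list was modified
print(try_overwrite(sites_as_tuple))  # False: the tuple is unchanged

# On CPython a tuple also tends to need a little less memory than an equal list.
print(sys.getsizeof(sites_as_tuple), sys.getsizeof(sites_as_list))
```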
Good job using ArgumentParser in your code; it is not seen in Python code as often as it deserves, and it is an extremely useful tool.
You could make your code more object oriented by turning the information inside SITES_INFO_HELPER into instances of a class, probably called Site.
It doesn't have to be anything too complicated; it's just supposed to be a way to store some values:
```python
class Site:
    def __init__(self, name, initials, format=STANDARD_URL_FORMAT):
        self.name = name
        self.initials = initials
        self.format = format
```
Note: I set the format parameter to STANDARD_URL_FORMAT by default because most Stack Exchange sites follow this format.
Then, when storing the sites, you'd store them like this:
```python
SITES_INFO_HELPER = (
    Site("codereview", "cr"),
    Site("stackoverflow", "so", COM_URL_FORMAT),
    # ...
)
```
I'm not entirely sure, but storing the site information like this may remove the need for build_sites_info.
I think you made a good choice using urllib and BeautifulSoup together to read the election web page. And, from what I know about BeautifulSoup, what you are doing looks just fine.
Also, looking over the election page layout, the way you are scanning the page seems to be the best, if not the only, way.
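One way the Site idea might replace build_sites_info is sketched below, under some assumptions: the names SITES and SITES_BY_KEY and the method election_url are hypothetical, and the Server Fault entry is carried over from the question's table. A dictionary comprehension registers each site under both its full name and its initials.

```python
STANDARD_URL_FORMAT = 'http://{}.stackexchange.com/election/{}?tab=primary'
COM_URL_FORMAT = 'http://{}.com/election/{}?tab=primary'


class Site:
    def __init__(self, name, initials, format=STANDARD_URL_FORMAT):
        self.name = name
        self.initials = initials
        self.format = format

    def election_url(self, election=1):
        # Hypothetical method: fill in the site name and the election number.
        return self.format.format(self.name, election)


SITES = (
    Site("codereview", "cr"),
    Site("serverfault", "sf", COM_URL_FORMAT),
    Site("stackoverflow", "so", COM_URL_FORMAT),
)

# Each site can be looked up by its full name or by its initials.
SITES_BY_KEY = {
    key: site
    for site in SITES
    for key in (site.name, site.initials)
}

print(SITES_BY_KEY["so"].election_url(6))  # http://stackoverflow.com/election/6?tab=primary
```

Because both keys point at the same Site object, any attribute added to the class later is available under either spelling without touching the lookup code.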
Context
StackExchange Code Review Q#96346, answer score: 7