Plot the approval rating of the president with the lowest approval rating, by day of presidency
Problem
I was curious which U.S. president had the lowest approval rating for each day in their presidency. For example, which president had the lowest approval rating on day 42, and what was the rating? I downloaded the data from here and built this code to visualize it.
I'm particularly interested in feedback on anything inefficient or clumsy that I'm doing. I want the code to be clean and professional-looking. This might be outside the scope of this site, but any thoughts on how to visualize the data more effectively would be welcome as well.
```
# Here are the imports that we'll use
import os
import pandas as pd
from datetime import datetime
from collections import Counter
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
from matplotlib import font_manager as fm
'''
Here's the path to all the data. The data were copied from http://www.presidency.ucsb.edu/data/popularity.php and saved as tsv
files.
'''
djt_path = os.getcwd() + '/data/djt.tsv'
bho_path = os.getcwd() + '/data/bho.tsv'
gwb_path = os.getcwd() + '/data/gwb.tsv'
wjc_path = os.getcwd() + '/data/wjc.tsv'
ghwb_path = os.getcwd() + '/data/ghwb.tsv'
rwr_path = os.getcwd() + '/data/rwr.tsv'
jec_path = os.getcwd() + '/data/jec.tsv'
grf_path = os.getcwd() + '/data/grf.tsv'
rmn_path = os.getcwd() + '/data/rmn.tsv'
lbj_path = os.getcwd() + '/data/lbj.tsv'
jfk_path = os.getcwd() + '/data/jfk.tsv'
dde_path = os.getcwd() + '/data/dde.tsv'
hst_path = os.getcwd() + '/data/hst.tsv'
# Now let's read in all the data
djt = pd.read_table(djt_path)
bho = pd.read_table(bho_path)
gwb = pd.read_table(gwb_path)
wjc = pd.read_table(wjc_path)
ghwb = pd.read_table(ghwb_path)
rwr = pd.read_table(rwr_path)
jec = pd.read_table(jec_path)
grf = pd.read_table(grf_path)
rmn = pd.read_table(rmn_path)
lbj = pd.read_table(lbj_path)
jfk = pd.read_table(jfk_path)
dde = pd.read_table(dde_path)
hst = pd.read_table(hst_path)
# The first Gallup poll for this question was on 07/22/1941, which was in
# FDR's third term, so
```
Solution
I would change the setup of the dataframes. You could turn your list of president dataframes into a dictionary keyed by the president's name. This way, you can greatly reduce the amount of code duplication:
```
president_names = ["Donald Trump", "Barack Obama", "George W. Bush",
                   "Bill Clinton", "George H.W. Bush", "Ronald Reagan",
                   "Jimmy Carter", "Gerald Ford", "Richard Nixon",
                   "Lyndon Johnson", "John F. Kennedy", "Dwight Eisenhower",
                   "Harry Truman"]
file_names = ['djt.tsv', 'bho.tsv', 'gwb.tsv', 'wjc.tsv', 'ghwb.tsv',
              'rwr.tsv', 'jec.tsv', 'grf.tsv', 'rmn.tsv', 'lbj.tsv', 'jfk.tsv',
              'dde.tsv', 'hst.tsv']
presidents = {name: pd.read_table(os.path.join(os.getcwd(), "data", file_name))
              for name, file_name in zip(president_names, file_names)}
inauguration_dates = ['01/20/2017', '01/20/2009', '01/20/2001', '01/20/1993',
                      '01/20/1989', '01/20/1981', '01/20/1977', '08/09/1974',
                      '01/20/1969', '11/22/1963', '01/20/1961', '01/20/1953',
                      '04/12/1945']
```
After this, you can iterate over the dataframes without always having to use
```
for x in range(len(presidents)):
    print(presidents[x])
```
and can just do
```
for name, president_df in presidents.items():
    print(president_df)
```
pandas.read_table has a parse_dates argument. Passing True only tries to parse the index, so to convert specific columns you pass a list of column names or indices. By default it parses dates in the standard US format (month first), so this should work. If not, there is also a switch (dayfirst=True) to parse them as DD/MM/YYYY, or you can pass a custom parser function with date_parser=func (deprecated since pandas 2.0 in favour of date_format).
So, I would use
```
presidents = {name: pd.read_table(os.path.join(os.getcwd(), "data", file_name),
                                  parse_dates=['Start Date'])
              for name, file_name in zip(president_names, file_names)}
# conv is not defined in this answer; it is assumed to be a helper that turns
# an MM/DD/YYYY string into a datetime (pd.to_datetime would also work).
inauguration_dates = {name: conv(inauguration)
                      for name, inauguration in zip(president_names, inauguration_dates)}
```
And for the time in office you can use:
```
for name, president in presidents.items():
    inauguration = inauguration_dates[name]
    # The subtraction yields a Series of timedeltas, so use the .dt accessor
    president['days_in_admin'] = (president['Start Date'] - inauguration).dt.days
```
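The dictionary approach and the date handling from this answer can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the two in-memory TSV strings stand in for the real files and contain made-up numbers, the `Approving` column name is a guess at how the rating column is labelled, and `load_presidents` and `lowest_by_day` are hypothetical helper names. Only the `Start Date` column name comes from the answer itself. `pd.read_csv(..., sep="\t")` is used in place of `pd.read_table`; for tab-separated input the two behave the same.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the .tsv files (made-up sample values).
# 'Approving' is an assumed column name; 'Start Date' is from the answer.
SAMPLE_FILES = {
    "Barack Obama": "Start Date\tApproving\n01/21/2009\t68\n01/30/2009\t65\n",
    "Donald Trump": "Start Date\tApproving\n01/21/2017\t45\n01/30/2017\t43\n",
}
INAUGURATIONS = {"Barack Obama": "01/20/2009", "Donald Trump": "01/20/2017"}


def load_presidents(files, inaugurations):
    """Return {name: DataFrame}, each with an integer 'days_in_admin' column."""
    presidents = {}
    for name, text in files.items():
        # An explicit column list makes pandas convert just that column.
        df = pd.read_csv(io.StringIO(text), sep="\t", parse_dates=["Start Date"])
        inauguration = pd.to_datetime(inaugurations[name], format="%m/%d/%Y")
        # Series minus Timestamp gives timedeltas; .dt.days extracts whole days.
        df["days_in_admin"] = (df["Start Date"] - inauguration).dt.days
        presidents[name] = df
    return presidents


def lowest_by_day(presidents):
    """For each day in office, pick the president with the lowest rating."""
    # The dict keys become an index level, which is then turned into a column.
    combined = pd.concat(presidents, names=["president"])
    combined = combined.reset_index(level="president").reset_index(drop=True)
    # idxmin returns the row label of the minimum rating within each day.
    rows = combined.groupby("days_in_admin")["Approving"].idxmin()
    return combined.loc[rows].set_index("days_in_admin")[["president", "Approving"]]


presidents = load_presidents(SAMPLE_FILES, INAUGURATIONS)
result = lowest_by_day(presidents)
print(result)
```

Keeping the per-president frames in a dictionary means the final comparison reduces to one `concat` plus one `groupby`, rather than a loop over thirteen separately named variables.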
Context
StackExchange Code Review Q#160730, answer score: 2