Tweeting statistics on shared links

Problem

I tend to be rather prolific in sharing things I find online, and I'm interested in tracking my own habits to make them more effective. Therefore, I submit the code below to be critiqued by the lot of you. Have at it.

import argparse
from backports.lzma import LZMAFile
import csv
import datetime
import time
import tweepy

parser = argparse.ArgumentParser(description='Links report on twitter -- some basic metrics as to how many links I shared today')
parser.add_argument('API_KEY', action='store', metavar='API_KEY', help='Twitter API key')
parser.add_argument('API_SECRET', action='store', metavar='API_SECRET', help='Twitter API secret')
parser.add_argument('ACCESS_TOKEN', action='store', metavar='ACCESS_TOKEN', help='Twitter access token')
parser.add_argument('ACCESS_SECRET', action='store', metavar='ACCESS_SECRET', help='Twitter access secret')
parsed = parser.parse_args()

auth = tweepy.OAuthHandler(parsed.API_KEY, parsed.API_SECRET)
auth.set_access_token(parsed.ACCESS_TOKEN, parsed.ACCESS_SECRET)
api = tweepy.API(auth)

tweets = []

today = datetime.datetime.today() - datetime.timedelta(hours=24)
today.replace(hour=0, minute=0, second=0, microsecond=0)
today = today.timetuple()

def links_today(link):
    date_of_link = time.localtime(long(link[0]))
    return today < date_of_link

with LZMAFile('/home/ec2-user/public_html/links.csv.xz') as fin:
    reader = csv.reader(fin.readlines()[1:])
    tweets = filter(links_today, reader)

status = 'In the last 24 hours, I sent {0} links to {1} unique recipients.'.format(len(tweets[1]), len(set(tweets[2])))

if len(tweets) != 0:
    api.update_status(status=status)

Solution

A pair of style comments based on PEP 8:

  • It’s good that you sort your module imports alphabetically, but you should also split out third-party imports from standard library modules. Specifically, this means moving backports and tweepy into their own group. (Imports)



  • Wrap lines to 79 characters. (Maximum Line Length)



Now on to the meat of the program:

- Get your secret information from the keychain, not the command line.

I’m not a big fan of passing in passwords or secret information over the command line. They’ll show up in your command history, and you probably need to copy/paste them from somewhere every time you use them (Twitter’s access tokens are not particularly easy to remember).

I prefer to use the keyring module, which stores and accesses passwords via the system keychain.

That’s how I store my secret tokens for Twitter, which I think is a bit more secure and more convenient. Here’s a snippet from my Twitter backup script which gets my tokens this way:

import keyring
import tweepy

def setup_api():
    """Authorise the use of the Twitter API."""
    a = {attr: keyring.get_password("twitter", attr) for attr in [
        'consumerKey',
        'consumerSecret',
        'token',
        'tokenSecret'
    ]}
    auth = tweepy.OAuthHandler(a['consumerKey'], a['consumerSecret'])
    auth.set_access_token(a['token'], a['tokenSecret'])
    return tweepy.API(auth)


and here’s how I would initially set the tokens:

import keyring

keyring.set_password("twitter", "consumerKey", 'abcdefg1234567')
# repeat for other secrets/tokens as appropriate


- Tidy up your date handling code.

There are several things I’d change about this part:

  • Use a different name for the today variable. Not only is it incorrect (because the datetime object it creates points to yesterday), but it also has the potential for confusion with the today() method.



  • Add some comments to explain the purpose of this variable – what does it represent, and why are you doing it this way?



- The today.replace() line has no effect on the program. The replace() method returns a new datetime object; it doesn’t modify in-place. Here’s a short snippet that demonstrates the difference:

import datetime

mydate = datetime.datetime.today()
mydate.replace(hour=0, minute=0, second=0, microsecond=0)  # result is discarded
print(mydate)   # 2015-04-29 19:29:54.572354

mydate2 = datetime.datetime.today()
mydate2 = mydate2.replace(hour=0, minute=0, second=0, microsecond=0)
print(mydate2)  # 2015-04-29 00:00:00


You need to add today = to the start of the line.

(Although it’s not clear to me why you need this line, anyway. It means you’ll get links from the last 24 hours and a bit. If you’re not using it, remove this line from the script.)
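The date-handling points above could be combined into one small, clearly named function. This is only a minimal sketch: the name links_cutoff and the from_midnight flag are assumptions introduced for illustration, not part of the original script, and the fixed date is just sample input.

```python
import datetime


def links_cutoff(now, from_midnight=False):
    """Return the earliest sharing time that should still be counted.

    By default this is exactly 24 hours before `now`. With
    from_midnight=True the cutoff is moved back to midnight at the
    start of that day.
    """
    cutoff = now - datetime.timedelta(hours=24)
    if from_midnight:
        # replace() returns a new datetime, so the result must be kept.
        cutoff = cutoff.replace(hour=0, minute=0, second=0, microsecond=0)
    return cutoff


now = datetime.datetime(2015, 4, 29, 19, 29, 54)
print(links_cutoff(now))                      # 2015-04-28 19:29:54
print(links_cutoff(now, from_midnight=True))  # 2015-04-28 00:00:00
```

Passing `now` in as an argument, rather than calling today() inside the function, makes the behaviour easy to check with a fixed date.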

- Clarify the links_today() function.

  • You should add a docstring to the function to explain what it does, and what sort of input it expects. It’s not clear what link is supposed to be, or why you're taking the 0th index to get the time.



  • The name of the function suggests to me it might return a list of links from today. Instead it tells me whether a link was posted today. I’d give the function a more descriptive name, e.g. was_posted_today().
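As a rough illustration of both suggestions, here is how the renamed function might look with a docstring. It is a sketch under assumptions: the cutoff is passed in as a parameter instead of being read from a global, and index 0 of each row is assumed to hold a Unix timestamp in seconds, which is what the original code implies but does not state.

```python
import datetime


def was_posted_today(row, cutoff):
    """Return True if the link in `row` was shared after `cutoff`.

    `row` is one parsed CSV record. Its first field (index 0) is assumed
    to hold the sharing time as a Unix timestamp in seconds. `cutoff` is
    a naive datetime in local time.
    """
    shared_at = datetime.datetime.fromtimestamp(int(row[0]))
    return shared_at > cutoff
```

Comparing datetime objects directly also avoids the detour through time.localtime() and timetuple() in the original.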



- Don’t hard-code the path to your csv.xz file halfway down the program. It would be better to define it as a global variable right at the top of the script: that way, if you ever want to change it, it’s easy to find.

- It would be good to add some comments about the format of the file, and what each line of the CSV looks like. That would help to explain the seemingly random indices of tweets that you take later.
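A sketch of what the last two points might look like together follows. The column layout shown is a guess made purely for illustration (the real file format is not given in the question), and the names LINKS_CSV_PATH, TIMESTAMP, LINK, RECIPIENT and read_rows are all hypothetical.

```python
import csv
import io

# Kept at the top of the script so it is easy to find and change.
LINKS_CSV_PATH = '/home/ec2-user/public_html/links.csv.xz'

# ASSUMED column layout of each CSV row (adjust to the real file):
#   timestamp (Unix seconds), link URL, recipient
TIMESTAMP, LINK, RECIPIENT = 0, 1, 2


def read_rows(fileobj):
    """Return the data rows of an open CSV text file, skipping the header."""
    reader = csv.reader(fileobj)
    next(reader, None)  # the first line is a header, not a shared link
    return list(reader)


# An in-memory file stands in for the real compressed CSV here.
sample = io.StringIO(
    u'timestamp,link,recipient\n'
    u'1430300000,http://example.com,alice\n'
)
rows = read_rows(sample)
print(rows[0][RECIPIENT])  # alice
```

With named indices, an expression such as row[RECIPIENT] explains itself where row[2] does not.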

- For the status string, I would precompute the number of links and recipients, accompanied by a comment about why you take index 1 and 2, respectively. Otherwise these seem like magic numbers.

Also note that if either of these numbers is 1, you will have a pluralisation bug. Oops.
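One possible way to precompute the counts and avoid the pluralisation bug is sketched below. It assumes that each row records one shared link and that index 2 of a row holds the recipient; both are assumptions, as are the helper names pluralise and build_status.

```python
def pluralise(count, singular, plural):
    """Return e.g. '1 link' or '3 links', depending on `count`."""
    return '{0} {1}'.format(count, singular if count == 1 else plural)


def build_status(rows, recipient_col=2):
    """Build the status text from the rows shared in the last 24 hours."""
    # ASSUMPTION: one row per shared link, recipient stored at recipient_col.
    link_count = len(rows)
    recipient_count = len(set(row[recipient_col] for row in rows))
    return 'In the last 24 hours, I sent {0} to {1}.'.format(
        pluralise(link_count, 'link', 'links'),
        pluralise(recipient_count, 'unique recipient', 'unique recipients'),
    )


rows = [
    ['1430300000', 'http://example.com/a', 'alice'],
    ['1430300100', 'http://example.com/b', 'alice'],
]
print(build_status(rows))
# In the last 24 hours, I sent 2 links to 1 unique recipient.
```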

- Wrap your code in a main() function.

Rather than have all your program code in the top-level, define a series of functions, concluding in main(), which is where your main program flow goes. Then put this at the end of your script:

if __name__ == '__main__':
    main()


Now, if the script is run as a standalone program, everything in main() is executed, but you can also import the functions from this script into another file without running it. That makes it easier to reuse code.
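The overall shape of such a script might look like this skeleton. It is only an outline: summarise is a hypothetical helper, the sample rows stand in for the filtered CSV data, and the recipient is again assumed to live at index 2.

```python
def summarise(rows):
    """Return (number of links, number of unique recipients) for `rows`."""
    recipients = set(row[2] for row in rows)  # ASSUMPTION: recipient at index 2
    return len(rows), len(recipients)


def main():
    # Sample rows stand in for the filtered CSV data in this sketch;
    # the real script would read the file and post via the Twitter API here.
    rows = [['1430300000', 'http://example.com', 'alice']]
    link_count, recipient_count = summarise(rows)
    print('links: {0}, recipients: {1}'.format(link_count, recipient_count))


if __name__ == '__main__':
    main()
```

Another module can now import summarise without triggering main().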

Context

StackExchange Code Review Q#88214, answer score: 2
