
Rapidly requesting a file from a server

Submitted by: @import:stackexchange-codereview

Problem

Here is a Python script I wrote earlier for rapidly requesting a file from a server and timing each transaction. Can I improve this script?

from time import time
from urllib import urlopen

# vars
url = raw_input("Please enter the URL you want to test: ")

for i in range(0,100):
    start_time = time()
    pic = urlopen(url)

    if pic.getcode() == 200:
        delta_time = time() - start_time
        print "%d" % (delta_time * 100)
    else:
        print "error"
print "%d requests made. File size: %d B" % (i, len(pic.read()))


I'm new to Python, though, so I'm not sure if this is the best way to go about it.

Solution

Here are some comments on your code:

- raw_input is a very inflexible way to get data into a program: it is only suitable for interactive use. For more general use, you should get your data into the program in some other way, for example via command-line arguments.

- urllib.urlopen was removed in Python 3, so your program will be more forward-compatible if you use urllib2.urlopen instead. (You'll need to change the way your error handling works, because urllib2.urlopen deals with an error by raising an exception instead of returning a response object with a getcode method; see the sketch after this list.)

- The comment # vars does not seem to contain any information. Improve it or remove it.

- The range function starts at 0 by default, so range(0,100) can be written more simply as range(100).

- 100 seems rather arbitrary. Why not take this as a command-line parameter too?

- The variable pic seems poorly named. Is it short for picture? The urlopen function returns a file-like object from which the resource at the URL can be read. The examples in the Python documentation accordingly use f or r as variable names.

- I assume that this code is only going to be used for testing your own site. If it were going to be used to time the fetching of resources from public sites, it would be polite (a) to respect the robots exclusion standard and (b) not to fetch resources as fast as possible, but to sleep for a while between each request (see the politeness sketch after this list).

- If an attempt to fetch the resource fails, you probably want to exit the loop rather than fail again many times.

- You say in your post that the code is supposed to time each transaction, but it does not do this. It only times how long it takes to call urlopen. This generally completes quickly, because it just reads the headers, not the entire resource. You do read the resource once (at the very end), but outside the timing code.

- You multiply the time taken by 100 and then print it as an integer. This seems unnecessarily misleading: if you want two decimal places, then why not use "%.2f"?

- Finally, there's a built-in Python library timeit for measuring the execution time of a piece of code, which compensates for things like the time taken to call time.
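A minimal sketch of what that error handling might look like with urllib2 (the variable names are hypothetical; urllib2 raises HTTPError when the server responds with an error status, and URLError when the server can't be reached at all):

import urllib2

try:
    f = urllib2.urlopen(url)
except urllib2.HTTPError as e:
    # The server responded, but with an error status (404, 500, ...).
    print("HTTP error %d" % e.code)
except urllib2.URLError as e:
    # The request never completed (DNS failure, connection refused, ...).
    print("failed to reach server: %s" % e.reason)
else:
    data = f.read()  # read the body only if the request succeeded
    f.close()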
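And a sketch of the politeness points, using the Python 2 standard library's robotparser and urlparse modules (the one-second delay is an arbitrary assumption; pick whatever rate is appropriate for the site):

import time
import robotparser
import urlparse

parts = urlparse.urlparse(url)
rp = robotparser.RobotFileParser()
rp.set_url('%s://%s/robots.txt' % (parts.scheme, parts.netloc))
rp.read()  # fetch and parse the site's robots.txt

if rp.can_fetch('*', url):
    for i in range(n):
        # ... fetch and time the resource as before ...
        time.sleep(1)  # be polite: pause between requests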

So I might write something more like this:

from sys import argv
from timeit import Timer
from urllib2 import urlopen

if __name__ == '__main__':
    url = argv[1]        # URL to fetch
    n = int(argv[2])     # number of downloads to time
    length = 0           # total bytes downloaded

    def download():
        # Read the resource in full, so the timing covers the whole
        # transaction, not just the call to urlopen.
        global length
        f = urlopen(url)
        length += len(f.read())
        f.close()

    t = Timer(download).timeit(number=n)
    print('{0:.2f} seconds/download ({1} downloads, average length = {2} bytes)'
          .format(t / n, n, length / n))
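Assuming the script is saved as timeurl.py (a name chosen here for illustration), it would be run as:

python timeurl.py http://example.com/ 100

The download function reports its byte count through the global length because timeit's Timer wants a zero-argument callable; a closure or a class would avoid the global statement, but for a script this size the global is the simplest option.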

Context

StackExchange Code Review Q#15113, answer score: 3
