WSGI static file handler
Problem
I put together a simple WSGI static file handler, but I'm skeptical about its efficiency, partly because I got a handful of errors while benchmarking it with ApacheBench (ab).
The errors are:
IOError: [Errno 24] Too many open files: 'test.png'
Client 101 hit errno 32
Client 601 hit errno 32
Client 408 hit errno 32
Client 225 hit errno 32
Client 668 hit errno 32
Client 415 hit errno 32
Client 237 hit errno 32
Client 316 hit errno 104
...
This is what brought me to think it's inefficient.
The following is the WSGI static file handler:
import bjoern
import os

# Get file size
size = os.path.getsize("test.png")

def app(environ, start_response):
    status = "200 OK"
    # Open image file
    the_file = open("test.png", "rb")
    response_headers = [('Content-Type', 'image/png'), ('Content-length', str(size))]
    start_response(status, response_headers)
    # return the entire file
    if 'wsgi.file_wrapper' in environ:
        # Return env[wsgi.fw](file, block size)
        return environ['wsgi.file_wrapper'](the_file, 1024)
    else:
        return iter(lambda: the_file.read(1024), '')

bjoern.run(app, "localhost", 8000)

Any advice/tips/suggestions?
Solution
A process on a Unix system can have only a limited number of open file handles. On Linux the default is typically only about 1024 per process, and every open socket also consumes a file descriptor. Even if you tune the system, opening and reading the file for each request will be costly.
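To see what the limit actually is on your own machine, something like the following should work on Unix-like systems. It is a minimal Python 3 sketch using the standard `resource` module; the function name `fd_limits` is only for illustration.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    # RLIMIT_NOFILE is the per-process cap on open descriptors
    # (regular files and sockets both count against it).
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft)
    print("hard limit:", hard)
```

The soft limit is the one that triggers `Too many open files`; the hard limit is the ceiling an unprivileged process can raise the soft limit to.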
If you are really concerned about performance in this case, read the image into memory once (as bytes/str) and send it from there instead:

with open('test.png', 'rb') as f:
    image_data = f.read()
...
return (image_data,)

However, if you have a very big file, you could mmap it into memory once, globally, when the process starts (this consumes just 1 file descriptor) and send it to each client from there:
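import mmap

im_file = open('verybigfile.iso', 'rb')
# mmap.mmap() takes a file descriptor, not a file object;
# ACCESS_READ matches the read-only open mode
mm = mmap.mmap(im_file.fileno(), 0, access=mmap.ACCESS_READ)
...
return (mm,)

Both of these would save you 1 file descriptor per client requesting the file.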
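For reference, here is roughly how the two approaches could look as complete WSGI apps. This is only a sketch under some assumptions: it targets Python 3, the factory names `make_memory_app` and `make_mmap_app` and the `CHUNK_SIZE` value are made up for illustration, and the mmap variant yields `bytes` slices rather than returning the mmap object itself, since WSGI bodies are expected to be byte strings.

```python
import mmap

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, tune as needed


def make_memory_app(path, content_type):
    """Read the file once at startup and serve the bytes from memory."""
    with open(path, "rb") as f:
        data = f.read()
    headers = [("Content-Type", content_type),
               ("Content-Length", str(len(data)))]

    def app(environ, start_response):
        start_response("200 OK", list(headers))
        return [data]  # no file is opened per request

    return app


def make_mmap_app(path, content_type):
    """Map the file once at startup and serve bytes slices of the mapping."""
    with open(path, "rb") as f:
        # ACCESS_READ is required because the file is opened read-only.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    size = len(mm)
    headers = [("Content-Type", content_type),
               ("Content-Length", str(size))]

    def app(environ, start_response):
        start_response("200 OK", list(headers))
        # Slicing an mmap yields bytes, one chunk at a time.
        return (mm[i:i + CHUNK_SIZE] for i in range(0, size, CHUNK_SIZE))

    return app
```

Either app can then be handed to the server the same way as in the question, for example `bjoern.run(make_memory_app("test.png", "image/png"), "localhost", 8000)`.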
However, if you need to serve lots of files like this, you might want to increase the limit of open file descriptors per process, use a front-side cache such as Varnish, scale out to multiple processes or hosts, or even use a CDN to deliver your files.
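If you go the route of raising the descriptor limit, one option is to do it from inside the process at startup. The following is a minimal Python 3 sketch for Unix-like systems using the standard `resource` module; the helper name `raise_fd_soft_limit` and the target value in the usage note are assumptions for illustration, and an unprivileged process cannot go above its hard limit.

```python
import resource


def raise_fd_soft_limit(target):
    """Raise the soft RLIMIT_NOFILE toward `target`; never lowers it.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        # The soft limit cannot exceed the hard limit.
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling `raise_fd_soft_limit(4096)` before starting the server would request 4096 descriptors if the hard limit allows it. The limit can also be raised outside the process (for example with the shell's `ulimit -n`), which may be preferable for deployment.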
Context
StackExchange Code Review Q#60241, answer score: 8