Recent Entries 4
- pattern minor 112d ago
  Python CGI front-end for web service to perform machine translation

  I am trying to optimize this Python script, which processes web requests for machine translation. The actual translation executable that is called is quite fast, and so are the Perl scripts that are called. The largest performance boost came from removing unnecessary imports. I would like to have this code reviewed so I can further optimize the performance. I also welcome any advice on a Pythonic way of testing performance. My code is littered with timing and print commands that I removed for this post.

  ```
  #!/usr/bin/env python
  # -*- coding: UTF-8 -*-
  import time
  import sys
  import cgi
  import subprocess
  import string
  import xmlrpclib

  reload(sys)
  sys.setdefaultencoding('utf8')

  isTestPerformance = len(sys.argv) == 4

  # Parameters
  if isTestPerformance:
      source = sys.argv[1]
      target = sys.argv[2]
      sourceText = sys.argv[3]
  else:
      # this part is important to tell the browser that output is html text.
      print "Access-Control-Allow-Origin: *"
      print "Content-Type: text/plain;charset=utf-8"
      print
      form = cgi.FieldStorage()
      sourceText = form.getvalue("sourceText").decode('utf8')
      source = form.getvalue("source").lower()
      target = form.getvalue("target").lower()

  # Decode the CGI encoded source text
  # NOTE: Custom encoding of semicolon (;), (?), (&), (#), etc, is only done here because
  # CGI can not handle them. Do not used this (decode) if you are not using CGI,
  # or use some other decoding that matches the encoding from the caller of this code
  sourceText = sourceText.replace("__QUESTION_MARK__", "?")
  sourceText = sourceText.replace("__SEMICOLON__", ";")
  sourceText = sourceText.replace("__AMPERSAND__", "&")
  sourceText = sourceText.replace("__NUMBER__", "#")
  # sourceText = sourceText.replace("__NEWLINE__", "\n")

  # Tokenize the Source Text
  if source == "zh":
      # Chinese has to do word alignment
      # options are slim: write the text to a file
      # use NLTK Stanford NLP (python>java) to segment chinese
  ```
- pattern minor 112d ago
  CGI output gzip compression module

  Edit: How should I interpret the silence, on a scale from 0 to 10 where 0 means "Bloody awful" and 10 means "Nothing to complain about"?

  I'm mainly concerned about readability and things I don't seem to be aware of. If the code is readable, I won't need to explain it.

  ```
  #!/usr/bin/python
  import os
  import sys
  import subprocess

  assert __name__ != '__main__'

  max_load_avg1 = 3.5

  '''
  compressout
  ===========
  Simple CGI output module. Uses gzip compression on the output stream if
  the client accepts it.

  NOTICE: The `cgitb` module will write to stdout if the script crashes,
  you should use a browser that does not accept gzip, when you are testing
  your scripts.

  NOTICE: In the beginning of this file `max_load_avg1` is defined.
  This is the maximum allowed load average under one minute. If the one
  minute load average exceeds this value, compressout will abort.

  Functions
  =========

  init(write_headers=True)
  ------------------------
  Initialize the module. This function will detect if the client supports
  gzip. If `write_headers`, write a 'Vary' and (if used)
  'Content-Encdoing' header.

  write_h(s)
  ----------
  Write part of header. Write `s` to standard output, will never go through
  gzip.

  write_b(s)
  ----------
  Write part of body.

  gzip is supported by the client
  -------------------------------
  `s` will be appended to a local buffer which `done` will compress and
  print.

  gzip is not supported
  ---------------------
  `s` will go straight to stdout.

  done()
  ------
  Done writing output. This function will invoke gzip.

  Dos and don'ts
  ==============
  ## ## ##
  if __name__ == '__main__':
      compressout.init()
      main()
      compressout.done()
  ## ## ##

  * Never call `write_h` after any call to `write_b`
  * Always call `done` when your done.
  * Use only compressout to write output
  * NOTICE: The `cgitb` module will write to stdout if the script crashes,
    you should use a browser that d
  ```
- pattern minor 112d ago
  CGI script for managing Unix passwords

  All the services I run on my server are based on Unix accounts. Since most web services have their own users and perform all the account management separately from the actual system accounts, I created a CGI script that handles:

  - Changing passwords (requires old password)
  - Assigning contact info (requires password)
  - Requesting a password reset (no passwords sent in email)

  I've tried to use only system commands and no external scripts (aside from the one to get POST variables). The application is not run setuid, but permissions are required for `sudo` to run `chpasswd`.

  I'm looking for any issues with sanitizing form data, how I'm using `expect` to feed input to system commands, etc. I know I can clean up the code a bit and refactor all the duplicate code. Basically, I got the thing working and am now looking for ways to make it better before I start cleaning it up.

  Code posted on GitHub

  ```
  #!/bin/bash -
  #===============================================================================
  # Copyright (c) 2015 Jeff Parent
  # All rights reserved.
  #
  # Redistribution and use in source and binary forms, with or without modification,
  # are permitted provided that the following conditions are met:
  #
  # * Redistributions of source code must retain the above copyright notice, this
  # list of conditions and the following disclaimer.
  # * Redistributions in binary form must reproduce the above copyright notice,
  # this list of conditions and the following disclaimer in the documentation
  # and/or other materials provided with the distribution.
  # * Neither the name of the passwd.sh authors nor the names of its contributors
  # may be used to endorse or promote products derived from this software without
  # specific prior written permission.
  #
  # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
  # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
  # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
  ```
- pattern minor 112d ago
  Bash CGI Upload File

  I'm using the following Bash CGI to upload a file:

  ```
  #!/bin/bash
  echo "Content-Type: text/plain"
  echo
  if [ "$REQUEST_METHOD" = "POST" ]; then
      TMPOUT=hello
      cat >$TMPOUT
      # Get the line count
      LINES=$(wc -l $TMPOUT | cut -d ' ' -f 1)
      # Remove the first four lines
      tail -$((LINES - 4)) $TMPOUT >$TMPOUT.1
      # Remove the last line
      head -$((LINES - 5)) $TMPOUT.1 >$TMPOUT
      # Copy everything but the new last line to a temporary file
      head -$((LINES - 6)) $TMPOUT >$TMPOUT.1
      # Copy the new last line but remove trailing \r\n
      tail -1 $TMPOUT | tr -d '\r\n' >> $TMPOUT.1
  fi
  ```

  This is for a uClinux/Busybox server. When a file is passed this way, the original `$TMPOUT` contains a four-line header and a one-line trailer that need to be removed to end up with the same file. The resulting file's hash is identical to the original.

  It works, but it seems pretty ugly, creating two files and such. I'm by no means a pro in bash. Can this be made prettier? Keep in mind that the target is a little embedded device and has no Perl/Python or anything on it. It needs to be pure bash.
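
The header and trailer stripping described above could perhaps be condensed along the following lines. This is only a sketch under stated assumptions, not a tested drop-in for the device: `strip_wrapper` is a hypothetical helper name, the Busybox build is assumed to provide `sed` and `head -c`, and the payload is assumed to survive line-oriented tools, as it already must for the original `head`/`tail` approach. It removes the four header lines and the trailer line in one `sed` pass, then cuts the final `\r\n` that precedes the closing boundary by byte count.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): strip a four-line
# multipart header and a one-line trailer from the file named in $1 and
# write the bare payload to the file named in $2.
strip_wrapper() {
    local src=$1 dst=$2 size

    # Delete lines 1-4 and the last line in a single pass.
    sed '1,4d;$d' "$src" > "$dst.tmp"

    # The payload is still followed by the "\r\n" that belongs to the
    # closing boundary; work out how many bytes to keep.
    size=$(( $(wc -c < "$dst.tmp") ))
    if [ "$size" -lt 2 ]; then
        rm -f "$dst.tmp"
        return 1
    fi

    # Keep everything except those last two bytes (assumes head -c exists).
    head -c "$((size - 2))" "$dst.tmp" > "$dst"
    rm -f "$dst.tmp"
}
```

Inside the CGI script this might be called as `cat > "$TMPOUT"; strip_wrapper "$TMPOUT" "$TMPOUT.1"`. It still needs one intermediate file, because the payload size has to be known before the last two bytes can be cut.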