HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Python CSV command line tool to import CSV into MySQL

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
lineintocsvimportmysqlpythoncommandtool

Problem

I've written my first Python program longer than 15 rows and, of course, it looks terrible.

I'd be very glad to receive comments and suggestion about how to implement a similar tool correctly.

```
#!/usr/bin/python
# -- coding: utf-8 --

# This script imports the csv data into a mysql database

import csv
import MySQLdb
import os
import sys
import datetime
import subprocess
import time
import logging
import notification

SCRIPT_PATH = os.path.dirname(sys.argv[0])
DATE_FORMAT = '%Y-%m-%d'
TIME_FORMAT = '%H:%M:%S'
DB_HOST = 'myhost'
DB_USER = 'myuser'
DB_PASSWORD = 'mypassword'
DB_NAME = 'mydbname'

logging.basicConfig(filename='csv_import_log.log', level=logging.DEBUG)

def main():
clear_screen()
if len(sys.argv) == 2:
period = int(sys.argv[1])

delete_all_csv_in_script_path()

if period '%s'"
% last_ticket_number)

def delete_all_data_from_db():
mydb = MySQLdb.connect(host=DB_HOST, user=DB_USER,
passwd=DB_PASSWORD, db=DB_NAME)
cursor = mydb.cursor()
cursor.execute('truncate docs')

def csv_import_start():
'''
Search into the script folder for csv file and imports those files into Mysql
'''

for file in os.listdir(SCRIPT_PATH):
if file.endswith('.csv'):
if file.startswith('Docs'):
print 'Importing ' + file
import_docs(file)
elif file.startswith('GetCash'):
print 'Importing ' + file
import_getcash(file)
elif file.startswith('GetUti'):
print 'Importing ' + file
import_getuti(file)
else:
print 'Input error'

def import_docs(docs_csv_file):
'''
Imports Docs.csv into mysql
'''

mydb = MySQLdb.connect(host=DB_HOST, user=DB_USER,
passwd=DB_PASSWORD, db=DB_NAME)
cursor = mydb.cursor()
firstline = True
progress = 0
with open(os.path.join(SCRIPT_PAT

Solution

In function import_getuti,

for row in csv_reader:
            if firstline:
                firstline = False
                continue


The above code skips first line of the file, but repetitious. To skip a line from any interator, use next.

next(csvfile)


cursor.execute can be done as follows,

row is a list. Just build a tuple out of it and pass.

cursor.execute('INSERT INTO getuti (field_1, field_2, field_3, field_4, field_5, field_6, field_7, field_8, field_9, field_10, field_11, field_12, field_13)
                VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', tuple(row[:13]))


The following two lines can be moved out of functions and cursor should be passed to functions that need database access.

mydb = MySQLdb.connect(host=DB_HOST, user=DB_USER,
                           passwd=DB_PASSWORD, db=DB_NAME)
cursor = mydb.cursor()


Use str.format to format stings.

print 'Rows imported in getcash: ' + str(progress)
print 'Rows imported in getcash: {}'.format(progress)


In function import_docs, ',' is replaced with '.' in series of strings that can be done as follows,

row[17:34] = [s.replace(',', '.') for s in row[17:34]]


If the strings to be processed are non-contiguous in the list and can't be calculated, the whole list can be processed because str.replace is safe, it doesn't throw an error if couldn't find the specified old string.

cursor.execute("DELETE FROM docs where number > '%s'" % last_ticket_number)


String formatting shouldn't be used in SQL statements for parameter substitution. Pass a tuple.

cursor.execute("DELETE FROM docs where number > %s", (last_ticket_number,))


strftime returns string. No need to call str again.

In the try block of function get_last_ticket_number_and_date,

cursor.execute("select number,date from docs where date = '%s' order by number desc limit 1" % first_of_the_month)           
last_ticket_number_and_date = cursor.fetchone()            
return last_ticket_number_and_date


non-critical code shouldn't be in try bolck and use else clause to return values.

try:
    cursor.execute("select number,date from docs where date = '%s' order by number desc limit 1", (first_of_the_month,))            
    last_ticket_number_and_date = cursor.fetchone()
except Exception as e:
    print "An error occured: {}".format(e)
else:
    return last_ticket_number_and_date


glob module can be used to get list of files matching a pattern.

for file in glob.glob(os.path.join(SCRIPT_PATH,'*.csv')):
    os.remove(file)


As no calculation are performed on the input sys.argv[1], it's needless to convert to int, just string comparison can be done as perion == '1'.

if period in '123456':
    start_process(period)
else:
    print 'Wrong option'


Rather than dumping all the functions in the main moudle, fuctions should be split by usage and put into separate modules.

Code Snippets

for row in csv_reader:
            if firstline:
                firstline = False
                continue
next(csvfile)
cursor.execute('INSERT INTO getuti (field_1, field_2, field_3, field_4, field_5, field_6, field_7, field_8, field_9, field_10, field_11, field_12, field_13)
                VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', tuple(row[:13]))
mydb = MySQLdb.connect(host=DB_HOST, user=DB_USER,
                           passwd=DB_PASSWORD, db=DB_NAME)
cursor = mydb.cursor()
print 'Rows imported in getcash: ' + str(progress)
print 'Rows imported in getcash: {}'.format(progress)

Context

StackExchange Code Review Q#85428, answer score: 2

Revisions (0)

No revisions yet.