
Downloading two tables from a db, doing a calculation, and uploading the resulting table

Submitted by: @import:stackexchange-codereview

Problem

I have written this script, which downloads two tables from the db, performs an intersection on them, adds two new columns to the resulting table, and uploads that table to the db. It's a bit slow, and I realize my code is messy (I'm new to Python and GIS). I suspect more steps can be done in parallel, for example downloading both tables from the database at once.

```
import arcpy, os, subprocess
import logging, shutil, sys

if __name__ == '__main__':
    #create logger
    logger = logging.getLogger(__name__)
    FORMAT = '%(name)s - %(levelname)s - %(message)s'
    logging.basicConfig(filename='py_intersection.log', level=logging.DEBUG, format=FORMAT)
    logger.info('Started shp intr arcpy')

    host = sys.argv[1]
    database = sys.argv[2]
    schema1 = sys.argv[3]
    schema2 = sys.argv[4]
    username = sys.argv[5]
    password = sys.argv[6]
    table1 = sys.argv[7]
    table2 = sys.argv[8]

    print 'program started'

    out_path = r'\\storage1\gis\temp\output.shp'
    new_dir = r'\\storage1\gis\temp'
    if not os.path.exists(new_dir):
        os.makedirs(new_dir)

    pgsql2shp = 'pgsql2shp -f %s\\%s -h %s -p 5432 -P %s -u %s %s %s.%s' % (new_dir, table1, host, password, username, database, schema1, table1)
    print pgsql2shp
    logger.info('Return code pgsql2shp invocation 1: '+str(subprocess.Popen(pgsql2shp, shell=True).wait()))
    pgsql2shp = 'pgsql2shp -f %s\\%s -h %s -p 5432 -P %s -u %s %s %s.%s' % (new_dir, table2, host, password, username, database, schema2, table2)
    print pgsql2shp
    logger.info('Return code pgsql2shp invocation 2: '+str(subprocess.Popen(pgsql2shp, shell=True).wait()))
    print('Argument:'+ new_dir+'\\'+table1+'.shp'+' , '+ new_dir+'\\'+table2+'.shp')
    arcpy.Intersect_analysis([new_dir+'\\'+table1+'.shp', new_dir+'\\'+table2+'.shp'], out_path, "ALL")

    arcpy.AddField_management(out_path, "intersected_area", "DOUBLE")
    arcpy.AddField_management(out_path, "dist_perct", "DOUBLE")
    arcpy.CalculateField_m
```

Solution

I know that you know your code is messy, but I'm still going to give a few tips on what can be improved in your coding style.

  • Prefer str.format() over % for string formatting. The % operator is not deprecated, but str.format() is usually easier to read once there are several arguments. Here's an example: print "Hello {0}".format("world")



  • You can remove the docstring at the end of your file containing """!@endcond""". This does nothing, and serves no purpose.



  • Separate this out into different functions. In its current state, it's really hard to read as one large block of code.



  • Add more comments. Comments will make your code much clearer and easier to read. Functions and classes should use docstrings rather than regular inline comments.



  • Don't import multiple modules on one line. If one of the imports fails, it is harder to tell which module the error came from.

  • Some variable names could be better. For example, the names sts and shp2pgsql provide no useful information. Look for other names that are similarly unclear, and see how you can improve them.



  • I'm not an expert on asynchronous programming, or on the Python modules built for it, but if you want to run tasks like these concurrently, note that Python 3.5 added the async and await keywords for asynchronous programming.
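For the specific goal in the question, downloading both tables at once, async and await may not be needed. A minimal sketch of a simpler alternative: since each download is already an external process, both can be started with subprocess.Popen before waiting on either. The function name run_in_parallel and the placeholder commands are hypothetical, chosen only for illustration; in the real script the two commands would be the pgsql2shp invocations.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command, then wait for all of them to finish.

    `commands` is a list of argument lists. Returns the exit codes in
    the same order as the commands.
    """
    # Popen returns as soon as the process has started, so all of the
    # commands are running at the same time after this line.
    processes = [subprocess.Popen(command) for command in commands]
    # wait() blocks until that process exits and returns its exit code.
    return [process.wait() for process in processes]


if __name__ == '__main__':
    # Stand-in commands so the sketch runs anywhere (assumption: the real
    # script would pass the two pgsql2shp argument lists here instead).
    placeholder = [sys.executable, '-c', 'import time; time.sleep(0.5)']
    print(run_in_parallel([placeholder, placeholder]))
```

With this approach the total wait is roughly that of the slower download rather than the sum of both, and the exit codes can still be logged individually.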



The official style guide for Python is PEP 8.
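Several of the style tips above can be sketched together: one import per line, str.format() instead of %, small functions, and docstrings. This is only an illustration under assumptions, not a full rewrite of the script. The function names build_export_command and export_table are hypothetical; the pgsql2shp flags are copied from the command string in the question. The sketch also passes the command as an argument list rather than a single string with shell=True, which is a change from the original.

```python
import logging
import subprocess

logger = logging.getLogger(__name__)


def build_export_command(out_dir, table, host, password, username,
                         database, schema):
    """Return the pgsql2shp argument list that exports schema.table.

    The shapefile is written to out_dir, named after the table.
    """
    return [
        'pgsql2shp',
        '-f', '{0}\\{1}'.format(out_dir, table),
        '-h', host,
        '-p', '5432',
        '-P', password,
        '-u', username,
        database,
        '{0}.{1}'.format(schema, table),
    ]


def export_table(command):
    """Run one export command, log its return code, and return it."""
    return_code = subprocess.call(command)
    logger.info('Return code pgsql2shp: {0}'.format(return_code))
    return return_code
```

Passing a list means no shell is involved, so values such as a password containing spaces do not need manual quoting. Each function can also be read and tested on its own, which is the main point of splitting the script up.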

Context

StackExchange Code Review Q#94590, answer score: 4
