Downloading 2 tables from db, does calculation and uploads resulting table
Problem
I have written this script that downloads two tables from the db, performs an intersection on them, adds two new columns to the resulting table, and uploads the result back to the db. It's a bit slow and I realize my code is messy (I'm new to Python and GIS). I suspect more steps can be done in parallel, for example downloading both tables from the database at once.
```
import arcpy, os, subprocess
import logging, shutil, sys

if __name__ == '__main__':
    #create logger
    logger = logging.getLogger(__name__)
    FORMAT = '%(name)s - %(levelname)s - %(message)s'
    logging.basicConfig(filename='py_intersection.log', level=logging.DEBUG, format=FORMAT)
    logger.info('Started shp intr arcpy')
    host = sys.argv[1]
    database = sys.argv[2]
    schema1 = sys.argv[3]
    schema2 = sys.argv[4]
    username = sys.argv[5]
    password = sys.argv[6]
    table1 = sys.argv[7]
    table2 = sys.argv[8]
    print 'program started'
    out_path = r'\\storage1\gis\temp\output.shp'
    new_dir = r'\\storage1\gis\temp'
    if not os.path.exists(new_dir):
        os.makedirs(new_dir)
    pgsql2shp = 'pgsql2shp -f %s\\%s -h %s -p 5432 -P %s -u %s %s %s.%s' % (new_dir, table1, host, password, username, database, schema1, table1)
    print pgsql2shp
    logger.info('Return code pgsql2shp invocation 1: '+str(subprocess.Popen(pgsql2shp, shell=True).wait()))
    pgsql2shp = 'pgsql2shp -f %s\\%s -h %s -p 5432 -P %s -u %s %s %s.%s' % (new_dir, table2, host, password, username, database, schema2, table2)
    print pgsql2shp
    logger.info('Return code pgsql2shp invocation 2: '+str(subprocess.Popen(pgsql2shp, shell=True).wait()))
    print('Argument:'+ new_dir+'\\'+table1+'.shp'+' , '+ new_dir+'\\'+table2+'.shp')
    arcpy.Intersect_analysis([new_dir+'\\'+table1+'.shp', new_dir+'\\'+table2+'.shp'], out_path, "ALL")
    arcpy.AddField_management(out_path, "intersected_area", "DOUBLE")
    arcpy.AddField_management(out_path, "dist_perct", "DOUBLE")
    arcpy.CalculateField_m
```
Solution
I know that you know your code is messy, but I'm still going to give some tips on your coding style. Here are a few things that can be improved:
- Prefer `str.format()` over `%` for string formatting. `%` formatting is not formally deprecated, but `str.format()` is more flexible and is the generally recommended style. Here's an example: `print "Hello {0}".format("world")`
- You can remove the docstring at the end of your file containing `"""!@endcond"""`. It does nothing and serves no purpose.
- Separate this out into different functions. In its current state, it's really hard to read as one large block of code.
- Add more comments. Comments will make your code much clearer and easier to read. Functions and classes should use docstrings rather than regular inline comments.
- Don't import multiple modules on one line. If an error occurs with one module, it can be hard to tell where the error came from.
- Some variable names could be better. For example, the names `sts` and `shp2pgsql` provide no useful information. Look for other names that are not very descriptive and see how you can improve them.
- I'm not an expert on asynchronous programming, or on the Python modules made for it, but if you want to run tasks like these concurrently, Python 3.5 introduces the `async` and `await` keywords built for asynchronous programming.
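As a rough sketch of the parallel-download idea from the question, both `pgsql2shp` exports can be started before either one is waited on, which needs nothing beyond `subprocess`. The helper names `build_export_command` and `run_in_parallel` are hypothetical and are not part of the original script. The flags are copied from the command string in the question, and the port `5432` is the value hard-coded there.

```python
import os
import subprocess


def build_export_command(out_dir, host, username, password, database, schema, table):
    """Return the pgsql2shp argument list for exporting one table.

    Hypothetical helper: the flags mirror the command string in the question.
    """
    return [
        'pgsql2shp',
        '-f', os.path.join(out_dir, table),
        '-h', host,
        '-p', '5432',
        '-P', password,
        '-u', username,
        database,
        '{0}.{1}'.format(schema, table),
    ]


def run_in_parallel(commands):
    """Start every command, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    # Popen returns immediately, so every process is already running
    # before the first wait() call blocks.
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]
```

With the script's variables, this could be called as `run_in_parallel([build_export_command(new_dir, host, username, password, database, schema1, table1), build_export_command(new_dir, host, username, password, database, schema2, table2)])`, and the two return codes logged afterwards. Passing an argument list instead of a single string with `shell=True` also avoids quoting problems if a table name or password contains special characters.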
The official style guide for Python is PEP 8.
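To illustrate a few of the style points together (one import per line, `str.format()`, small functions with docstrings), here is a minimal sketch of how the argument handling and path building might be pulled out of the main block. The function names `parse_arguments` and `shapefile_path` are hypothetical, and the argument order is taken from the `sys.argv` assignments in the question.

```python
import os
import sys


def parse_arguments(argv):
    """Map the eight positional command-line arguments to named settings."""
    names = ['host', 'database', 'schema1', 'schema2',
             'username', 'password', 'table1', 'table2']
    # argv[0] is the script name, so one extra entry is expected.
    if len(argv) != len(names) + 1:
        raise ValueError('expected {0} arguments, got {1}'.format(
            len(names), len(argv) - 1))
    return dict(zip(names, argv[1:]))


def shapefile_path(directory, table):
    """Return the path of the shapefile exported for the given table."""
    return os.path.join(directory, '{0}.shp'.format(table))
```

The main block could then start with `settings = parse_arguments(sys.argv)` and build both inputs with `shapefile_path(new_dir, settings['table1'])` and `shapefile_path(new_dir, settings['table2'])`, instead of repeating the string concatenation.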
Context
StackExchange Code Review Q#94590, answer score: 4