HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Uploading data to Google Drive

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
drivegoogledatauploading

Problem

I have created a project which uploads data from my local storage to my Google Drive account using the Google Drive API.

You can read about how to run it on its Github repository.

The main script upload.py:

```
#!/usr/bin/env python
from __future__ import print_function

import os
import sys
import json
import subprocess
import hashlib

from apiclient.http import MediaFileUpload
from api_boilerplate import file_service

def file_exists(fileId):
"""
Checks whether a file exists on the Drive or not.
:param fileId: The ID of the file to check.
:type fileId: str
:returns: bool
"""
if not fileId:
return False
try:
file_service.get(fileId=fileId, fields="").execute()
return True
except:
return False

def create_file(file_path, parentId=None):
"""
Creates a new file on the Drive.
:param file_path: The path of the source file on local storage.
:type file_path: str
:param parentId: The ID of the directory in which the file has to be
created. If it is None, the file will be created in the root directory.
:type parentId: str or None
:returns: A dictionary containing the ID of the file created.
"""
file_name = os.path.basename(file_path)

mimetype = None
if not os.path.splitext(file_name)[1]:
# Required for files with names like '.astylerc'
mimetype = "text/plain"

media_body = MediaFileUpload(file_path, mimetype=mimetype)
body = {'name': file_name}
if parentId:
body['parents'] = [parentId]

results = file_service.create(
body=body, media_body=media_body, fields="id").execute()

return results

def update_file(file_path, fileId):
"""
Modifies an already existing file on the Drive.
:param file_path: The path of the source file on local storage.
:type file_path: str
:param fileId: The ID of the file to be modified.
:type fileId: str
:returns: A dictionary containing the ID of th

Solution

This is a "blind" review in that I'm not testing my changes or running your code.

Don't use #!/usr/bin/env python unless actually version-agnostic; use #!/usr/bin/env python2 or #!/usr/bin/env python3.

This here

if not fileId:
    return False
try:
    file_service.get(fileId=fileId, fields="").execute()
    return True
except:
    return False


has a raw except. I don't like that; replace it with an except catching the specific wanted exception, or Exception if you really don't know which will be raised. This would at least avoid catching BaseException, which is rarely wanted.

Don't use if x to test if something is None, use is None. This is a stricter, faster check and is less likely to cause false positives.

You'd be better off using pathlib than the string and os.path combination you're using now. For example, create_file would be roughly

file_path = PurePosixPath(file_path)

mimetype = None
if not file_path.suffix:
    # Required for files with names like '.astylerc'
    mimetype = "text/plain"

media_body = MediaFileUpload(file_path, mimetype=mimetype)
body = {'name': file_path.name}
...


which I'd shorten to

file_path = PurePosixPath(file_path)

# Required for files with names like '.astylerc'
mimetype = "text/plain" if file_path.suffix else None

media_body = MediaFileUpload(file_path, mimetype=mimetype)
body = {'name': file_path.name}
...


update_file does

if not os.path.splitext(os.path.basename(file_path))[1]:


which seems redundant compared to just

if not os.path.splitext(file_path)[1]:


and still worse than with pathlib.

You do

some_function(name_of_thing=name_of_thing)


quite often. Don't bother duplicating the name, it doesn't help. Only do this if the parameter name is different and adds information.

Your code looks race-y. For example,

if file_exists(fileId):
    return update_file(file_path, fileId)
else:
    return create_file(file_path, parentId=parentId)


would likely break if a file is created or deleted between the check and the action. You should check the provided API for solutions here (although this might not be a problem).

The code

local_file_hash = hashlib.md5(open(file_path, 'rb').read()).hexdigest()


leaks a file handle. Always (almost always) use with for files. Do so in main too.

The flush here isn't needed:

print(str(input_file))
sys.stdout.flush()

Code Snippets

if not fileId:
    return False
try:
    file_service.get(fileId=fileId, fields="").execute()
    return True
except:
    return False
file_path = PurePosixPath(file_path)

mimetype = None
if not file_path.suffix:
    # Required for files with names like '.astylerc'
    mimetype = "text/plain"

media_body = MediaFileUpload(file_path, mimetype=mimetype)
body = {'name': file_path.name}
...
file_path = PurePosixPath(file_path)

# Required for files with names like '.astylerc'
mimetype = "text/plain" if file_path.suffix else None

media_body = MediaFileUpload(file_path, mimetype=mimetype)
body = {'name': file_path.name}
...
if not os.path.splitext(os.path.basename(file_path))[1]:
if not os.path.splitext(file_path)[1]:

Context

StackExchange Code Review Q#116238, answer score: 3

Revisions (0)

No revisions yet.