HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Collate votes on MP3 files to list them by popularity

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
popularitycollatefilesmp3listthemvotes

Problem

For my first Python project, I decided to create a script that would automatically collate MP3 files into a music countdown. I wanted to 'learn by doing', so I dived straight in and used the Python docs and Stack Overflow as my primary learning resource.

The idea is as follows:

  • Python script resides in base directory.



  • In the base directory there are sub-folders for each voter containing their MP3 files. Each sub-folder is named with the voter's name.



  • Each voter submits their MP3 files in the following format:



(vote_count). Artist - Song Title.mp3.




So, for example, 10. Foo Fighters - Walk.mp3. The '10.' at the start of the song specifies how many votes are being allocated to this song.

  • Once sub-folders have been populated and named correctly in the format, the script can be executed.



  • Upon execution, the script will scan all sub-folders for files ending in .mp3



  • It will then strip the necessary information such as vote count and song title and add the songs to a list.



  • Any duplicates (i.e multivotes) and the script will add the votes together to get the total vote count for that song.



  • Final results will then output to a CSV file in order of song with highest votes first.



So far so good. I have completed everything in that list, however there are still some more features i would like to add, like renaming all of the MP3 files so they are in the correct order for importing to a playlist. I.e add numbers to the start of each filename.

```
import os, csv, operator, itertools
from collections import Counter, defaultdict

print("Welcome to the Music vote Collator" + '\n')
print("STAGE 1. Processing Votes..." + '\n')
folder_list = os.getcwd() #Get list of files in all directories

#global variables and lists
song_list = []
voter_list = defaultdict(list)
tally = Counter()
total_songs = 0

for subdir, dirs, files in os.walk(folder_list):
for file in files:
if file.endswith(".mp3"):
base_dir = os.path

Solution

Avoid wrong comments

folder_list = os.getcwd() #Get list of files in all directories


So the comment states that os.getcwd() -> List[Filenames] but the docs say:

>>> help(os.getcwd)
Help on built-in function getcwd in module posix:

getcwd(...)
    getcwd() -> path

    Return a unicode string representing the current working directory.


A comment that says that the code does another thing from what it actually does is incredibly confusing.

Avoid obvious comments

Some comments may be needed, but stating the obvious in comments is frowned upon.

For example:

song = file_name.lower() #create variable for song and convert all letters to lowercase to prevent song name match issues


If you see x = you know it is a variable assignment, and .lower() obviously converts to lower case.

The only valuable part is to prevent song name match issues because it tells me why you are doing what you can see are doing from the code so I would change that to:

song = file_name.lower() # Lower-casing prevents song-name match issues.

Instead #global variables and lists can be deleted wholesale as it adds no information at all to the code.

Unpacking

A neat trick where you can replace

a = lst[0]
b = lst[1]


with

a, b = lst


In practice:

file_name = file.split('.')[1] #removes the .mp3 extension and numbers at the start of the file to clean up song name
    vote_count = file.split('.')[0] #retrieves vote count for each song


Becomes:

vote_count, file_name = file.split('.')


Simpler for loops

for row in itertools.islice(reader, 0, 1):
    print("The number one song for this year's count is:" + str(row[0]) + ", with a total of " + str(row[1]) + " votes!")
for row in itertools.islice(reader, 0, 1):
    print("The number two song for this year's count is:" + str(row[0]) + ", with a total of " + str(row[1]) + " votes!")
for row in itertools.islice(reader, 0, 1):
    print("The number three song for this year's count is:" + str(row[0]) + ", with a total of " + str(row[1]) + " votes!")


itertools.islice(xs, 0, 1) just returns the first item of xs, doing it three times equals getting the first three items, so you can greatly simplify:

for iteration, row in enumerate(csv.reader(results)):
    print("The number {} song for this year's count is:{}, with a total of {} votes!".format(
        ["one", "two", "three"][iteration], row[0], row[1]))
    if iteration >= 3:
        break


Note:

-
The use of .format instead of + with strings, as the former is more flexible and readable.

-
The use of a loop repeat 3 times, instead of three identical loops repeat 3 times.

-
The simplicity in raising the number of songs to print, just change 3 to something else along with the ["one", "two", "three"] list.

-
The use of enumerate, that gives the index of the item along with the item. At times it is much more readable than indexing the collection (indexing may also not always be possible)

Just delete dead code

dead code is code that serves no purpose, and just sits there making the program harder to read. I am talking about base_dir = os.path.join(subdir) You yourself state:


maybe not needed anymore

I can argue, surely not needed anymore, just use Control-F and look for base_dir and you will find no uses at all! Just delete it and enjoy the simpler code.

Follow the underscore convention

You write:

head, tail = os.path.split(subdir) #retrieves the name of the voter from the folder name


But head is never used again, the Pythonic way of saying I don't care about that value is giving it the _ [UNDERSCORE] name, this way readers will stop looking all over the code for potential uses of head. All in all, using _ is a variable less to keep track of, one further step towards code simplification.

Code Snippets

folder_list = os.getcwd() #Get list of files in all directories
>>> help(os.getcwd)
Help on built-in function getcwd in module posix:

getcwd(...)
    getcwd() -> path

    Return a unicode string representing the current working directory.
song = file_name.lower() #create variable for song and convert all letters to lowercase to prevent song name match issues
a = lst[0]
b = lst[1]
file_name = file.split('.')[1] #removes the .mp3 extension and numbers at the start of the file to clean up song name
    vote_count = file.split('.')[0] #retrieves vote count for each song

Context

StackExchange Code Review Q#113504, answer score: 6

Revisions (0)

No revisions yet.