HiveBrain v1.2.0

Word frequency counter

Submitted by: @import:stackexchange-codereview

Problem

I recently worked through a self-assessment question to gauge my Python ability for an online class. The task was to return the most frequent words in a string, each paired with its count as a tuple.


Implement a function count_words() in Python that takes as input a
string word_string and a number number_of_words, and returns the n most
frequently-occurring words in word_string. The return value should be a
list of tuples - the top n words paired with their respective counts
[(<word>, <count>), (<word>, <count>), ...], sorted in descending count order.


You can assume that all input will be in lowercase and that there will
be no punctuations or other characters (only letters and single
separating spaces). In case of a tie (equal count), order the tied
words alphabetically.


E.g.: print count_words("this is an example sentence with a repeated word example",3) Output: [('example', 2), ('a', 1), ('an', 1)]

def count_words(word_string, number_of_words):
  """
  take in a word string and return a tuple of the
  most frequently counted words

  word_string = "This is an example sentence with a repeated word example",
  number_of_words = 3
  return [('example', 2), ('This', 1), ('a', 1)]
  """
  word_array = word_string.split(' ')
  word_occurence_array = []
  for word in word_array:
    if word in word_string:
      occurence_count = word_array.count(word)
      word_occurence_array.append((word, occurence_count))
    else:
      # no occurences, count = 0
      word_occurence_array.append((word, 0))

  # dedupe
  word_occurence_array = list(set(word_occurence_array))

  # reorder
  # can also pass, reverse=True, but cannot apply `-` to string
  word_occurence_array.sort(key=lambda tup: (-tup[1], tup[0]))

  # only return the Nth number of pairs
  return word_occurence_array[:number_of_words]


You can then call this function:

count_words(word_string="this is an example sentence with a repeated word example", number_of_words=3)


which returns `[('example', 2), ('a', 1), ('an', 1)]`, matching the expected output in the problem statement.

Solution

This is exactly the kind of task that should be handled using collections.Counter, which offers a most_common() method to extract the results.
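
As a rough sketch of that suggestion (not necessarily the exact code the answer had in mind), the function could look something like the following. One caveat worth stating: on Python 3.7+ most_common() returns tied words in first-seen order, not alphabetically, so this version sorts the Counter items explicitly to honour the tie-breaking rule from the problem statement. The variable names counts, ranked and sentence are illustrative only.

```python
from collections import Counter


def count_words(word_string, number_of_words):
    """Return the number_of_words most frequent words as (word, count) tuples."""
    # Counter tallies every word in a single pass over the split string.
    counts = Counter(word_string.split())
    # most_common() keeps tied words in first-seen order, so sort explicitly:
    # highest count first, then alphabetically among equal counts.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:number_of_words]


sentence = "this is an example sentence with a repeated word example"
print(count_words(sentence, 3))
# [('example', 2), ('a', 1), ('an', 1)]

# For comparison: most_common() alone does not order ties alphabetically.
print(Counter(sentence.split()).most_common(3))
# [('example', 2), ('this', 1), ('is', 1)]
```

If alphabetical tie-breaking were not required, Counter(word_string.split()).most_common(number_of_words) on its own would be enough. Either way, the counting is a single pass, instead of calling list.count() once per word as in the original.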

Context

StackExchange Code Review Q#118914, answer score: 4
