Count number of occurrences of each word of text, display as circles of varying size


Problem

My focus was to have clean, structured code. It has to be efficient. The display doesn't have to look that good. The circles do overlap and sometimes the word doesn't fit inside the circle, but those problems aren't really my concern right now and I will fix them later.

circle.py

import pygame
import random

# Define the colors and screen sizes, define center

# Colors
black = (0, 0, 0)
blue = (0, 128, 255)
red = (255, 102, 102)
green = (102, 255, 178)
purple = (178, 102, 255)
yellow = (255, 255, 102)
colors = [blue, red, green, purple, yellow]
white = (255, 255, 255)

# Radius and font size
radius = [30, 40, 50, 60, 70, 80, 90, 100]

# Screen dimensions and title
SCREEN_WIDTH, SCREEN_HEIGHT = 640, 480
CENTER = (SCREEN_WIDTH/2, SCREEN_HEIGHT/2)
TITLE = "Occurences"

# Actual screen
screen = pygame.display.set_mode((SCREEN_WIDTH, SCREEN_HEIGHT))

# Initialize screen
pygame.display.set_caption(TITLE)
screen.fill(white)

class Circle:
    def __init__(self, size, text):
        rand_color = random.randint(0, 4)
        rand_x = random.randint(size+5, SCREEN_WIDTH-size)
        rand_y = random.randint(size+5, SCREEN_HEIGHT-size)

        self.x = rand_x
        self.y = rand_y
        self.size = size
        self.color = colors[rand_color]
        self.screen = screen
        self.text = text

    def display(self):
        pygame.draw.circle(self.screen, self.color, (self.x, self.y), self.size)

        myfont = pygame.font.SysFont("monospace", self.size-15, bold=True)
        label = myfont.render(self.text, 1, black)
        screen.blit(label, (self.x-self.size+10, self.y-20))


occurences.py

```
import pygame
from pygame.locals import *
import circle
import time
import re
from operator import itemgetter
from collections import Counter

# Intro
pygame.init()

intro_text = ["Hi!", "Welcome to 'occurences'!", "A program (with a misspelt name) that counts the number",
"of occurrences in a piece of text", "and displays circles with sizes based on", "the
```

Solution

Needless work to get in sorted order.

I'd like to take a look at the following section:

words = Counter(re.findall(r"[\w']+", text.lower())).most_common() # Split words

while words: # While the list of words isn't empty,
    max_occ = max(words, key=itemgetter(1))[1] # Take current max occurrence of the list

    for word in words:

        if word[1] == max_occ: # If the current word's occurrence is same as the max occurrence
            curr_word = word 

            if len(circle.radius) != 1:
                c = circle.Circle(circle.radius[-1], word[0]) # Draw a circle with the current largest size of circle.radius
                c.display() # Displaying the word on the circle
                circle.radius.pop() # Remove that current largest size
            else:
                c = circle.Circle(30, word[0]) # When there's only one size left (the smallest one), make all other words that size.
                c.display()

            words = [x for x in words if x != curr_word] # Change the list so that the words already dealt with are gone, making a new max_occ


First, let's look at the help of most_common:

most_common(self, n=None) unbound collections.Counter method
    List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.
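
To see that ordering in practice, here is a minimal check on a made-up sentence (the sample text is an assumption for illustration, not taken from the question):

```python
import re
from collections import Counter

# Hypothetical sample text; any string works here.
text = "a a a b b c"

# Same expression as in the question: tokenize, count, list by frequency.
words = Counter(re.findall(r"[\w']+", text.lower())).most_common()
print(words)  # [('a', 3), ('b', 2), ('c', 1)]

# The counts never increase as we walk the list.
counts = [count for _, count in words]
assert counts == sorted(counts, reverse=True)
```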


So they are already in exactly the order you want. Thus, we can replace the whole block with:

words = Counter(re.findall(r"[\w']+", text.lower())).most_common()
for word in words:
    if len(circle.radius) != 1:
        c = circle.Circle(circle.radius[-1], word[0]) # Draw a circle with the current largest size of circle.radius
        c.display() # Displaying the word on the circle
        circle.radius.pop() # Remove that current largest size
    else:
        c = circle.Circle(30, word[0]) # When there's only one size left (the smallest one), make all other words that size.
        c.display()


Notice how the while and the outer if are now both gone? That works because .most_common() already returns the words ordered from most to least frequent.
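
If you want to convince yourself that nothing changed, the visiting order of both versions can be compared without pygame. This is only a sketch: order_original is a hypothetical helper that mirrors the while/max/filter loop from the question and records the words instead of drawing them.

```python
from collections import Counter
from operator import itemgetter


def order_original(pairs):
    """Order in which the original while/max/filter loop visits the words."""
    words = list(pairs)
    visited = []
    while words:
        max_occ = max(words, key=itemgetter(1))[1]
        # The for loop keeps iterating over the old list object even
        # though the name `words` is rebound inside the loop body.
        for word in words:
            if word[1] == max_occ:
                visited.append(word)
                words = [x for x in words if x != word]
    return visited


# Hypothetical sample input, including a tie between "c" and "d".
pairs = Counter("a b a c b a d".split()).most_common()

# A plain `for word in pairs` visits the words in exactly this order too.
assert order_original(pairs) == pairs
```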

Code duplication in getting the radius

Let's take a look at the inner if:

if len(circle.radius) != 1:
    c = circle.Circle(circle.radius[-1], word[0]) # Draw a circle with the current largest size of circle.radius
    c.display() # Displaying the word on the circle
    circle.radius.pop() # Remove that current largest size
else:
    c = circle.Circle(30, word[0]) # When there's only one size left (the smallest one), make all other words that size.
    c.display()


I'm going to do something sneaky: circle.radius.pop() and c.display() don't depend on each other (the circle stored its size when it was constructed), so we can swap them.

if len(circle.radius) != 1:
    c = circle.Circle(circle.radius[-1], word[0]) # Draw a circle with the current largest size of circle.radius
    circle.radius.pop() # Remove that current largest size
    c.display() # Displaying the word on the circle
else:
    c = circle.Circle(30, word[0]) # When there's only one size left (the smallest one), make all other words that size.
    c.display()


Now, I see how both branches of the if end with the same statement: c.display() (and a comment which I'll ignore ;) ). So we can move it out of the if.

if len(circle.radius) != 1:
    c = circle.Circle(circle.radius[-1], word[0]) # Draw a circle with the current largest size of circle.radius
    circle.radius.pop() # Remove that current largest size
else:
    c = circle.Circle(30, word[0]) # When there's only one size left (the smallest one), make all other words that size.
c.display()


Another thing I'd like to note. circle.radius.pop() actually returns a value. What value? The last value in the list, that is circle.radius[-1]. Let's inline it.

if len(circle.radius) != 1:
    c = circle.Circle(circle.radius.pop(), word[0]) # Draw a circle with the current largest size of circle.radius
else:
    c = circle.Circle(30, word[0]) # When there's only one size left (the smallest one), make all other words that size.
c.display()
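
The claim about pop() is easy to verify in isolation. A small check, using a shortened stand-in for circle.radius (the list values here are just an example):

```python
# Stand-in for circle.radius, shortened for the example.
radius = [30, 40, 50]

last = radius[-1]      # what the original code read first
popped = radius.pop()  # removes the last element and returns it

assert popped == last == 50
assert radius == [30, 40]  # the list is one element shorter afterwards
```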


Ooh, see how they now have almost the same body? All that differs is the calculation of the radius. Let's move getting the radius outside of the construction of the circle.

if len(circle.radius) != 1:
    radius = circle.radius.pop()
    c = circle.Circle(radius, word[0]) # Draw a circle with the current largest size of circle.radius
else:
    radius = 30
    c = circle.Circle(radius, word[0]) # When there's only one size left (the smallest one), make all other words that size.
c.display()


And now we do the same trick as before, moving the shared code out of the branches:

if len(circle.radius) != 1:
    radius = circle.radius.pop()
else:
    radius = 30
c = circle.Circle(radius, word[0])
c.display()


Combining above results

```
words = Counter(re.findall(r"[\w']+", text.lower())).most_common()
for word in words:
    if len(circle.radius) != 1:
        radius = circle.radius.pop()
    else:
        radius = 30
    c = circle.Circle(radius, word[0])
    c.display()
```
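
To try the combined loop without opening a pygame window, the radius selection can be pulled into a small function. This is a sketch under assumptions: assign_radii and its smallest parameter are hypothetical names, and it returns (word, radius) pairs instead of constructing circle.Circle objects.

```python
import re
from collections import Counter


def assign_radii(text, sizes, smallest=30):
    """Pair each word with a radius, most frequent word first."""
    sizes = list(sizes)  # copy, so the caller's list is not consumed
    pairs = []
    counted = Counter(re.findall(r"[\w']+", text.lower())).most_common()
    for word, _count in counted:
        if len(sizes) != 1:
            radius = sizes.pop()  # current largest remaining size
        else:
            radius = smallest     # only one size left: reuse it for the rest
        pairs.append((word, radius))
    return pairs


# Hypothetical input: four distinct words, three available sizes.
result = assign_radii("a a a b b c d", [30, 40, 50])
assert result == [("a", 50), ("b", 40), ("c", 30), ("d", 30)]
```

Copying the list also sidesteps a side effect of the original: popping from the module-level circle.radius means a second run in the same process would start with fewer sizes.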

Context

StackExchange Code Review Q#131199, answer score: 2
