pattern | python | Minor
Assigning sentiment to each tweet - Twitter trend
Problem
The assignment below is taken from here.
Introduction

In this project, you will develop a geographic visualization of Twitter data across the USA. You will need to use dictionaries, lists, and data abstraction techniques to create a modular program. Below is phase 1 of this project.

Phase 1: The Feelings in Tweets

In this phase, you will create an abstract data type for tweets, split the text of a tweet into words, and calculate the amount of positive or negative feeling in a tweet.

Tweets

First, you will implement an abstract data type for Tweets. The constructor make_tweet is defined at the top of trends.py. make_tweet returns a Python dictionary with the following entries:

- text: a string, the text of the tweet, all in lowercase
- time: a datetime object, when the tweet was posted
- latitude: a floating-point number, the latitude of the tweet's location
- longitude: a floating-point number, the longitude of the tweet's location
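To make the dictionary layout above concrete, here is a minimal sketch of such a tweet abstraction. Only the four keys and the names make_tweet and tweet_time come from the assignment text; the parameter names, the tweet_text selector and the tweet_coordinates helper are assumptions made for illustration, and the real tweet_location is expected to return a position from geo.py rather than a tuple.

```python
from datetime import datetime


def make_tweet(text, time, lat, lon):
    """Build a tweet as a plain dictionary with the four documented entries."""
    return {'text': text, 'time': time, 'latitude': lat, 'longitude': lon}


def tweet_text(tweet):
    """Hypothetical selector: return the (already lowercase) text of the tweet."""
    return tweet['text']


def tweet_time(tweet):
    """Return the datetime at which the tweet was posted."""
    return tweet['time']


def tweet_coordinates(tweet):
    """Hypothetical stand-in for tweet_location: return (latitude, longitude)."""
    return (tweet['latitude'], tweet['longitude'])


example = make_tweet("just ate lunch", datetime(2012, 9, 24, 13), 38.0, -74.0)
```

Code that only goes through the selectors keeps working if the underlying representation changes later, which is the point of the data abstraction the assignment asks for.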
Problem 1 (1 pt). Implement the tweet_words and tweet_time selectors. Call extract_words to list the words in the text of a tweet.

Problem 2 (1 pt). Implement the tweet_location selector, which returns a position. Positions are another abstract data type, defined at the top of geo.py. Make sure that you understand how to manipulate positions; they play an important role in this project.

When you complete problems 1 and 2, the doctest for make_tweet should pass.

    python3 trends.py -t make_tweet

Problem 3 (1 pt). Improve the extract_words function as follows: Assume that a word is any consecutive substring of text that consists only of ASCII letters. The string ascii_letters in the string module contains all letters in the ASCII character set. The extract_words function should list all such words in order and nothing else.

When you complete this problem, the doctest for extract_words should pass.
Solution
The overall design looks a bit odd to me, but I'll comment on the code you've written.
extract_words:

The code is properly formatted. A few remarks anyway:

- You don't need that many parentheses.
- You don't need to check `character in ascii_letters`, as it has to be true at this point.
- `require_current_index_change` looks like it should be a boolean. Just replace `1` by `True`, `0` by `False` and `if require_current_index_change == 1:` by `if require_current_index_change:`.
- Instead of having `require_current_index_change` to know whether you can use `current_index` or not, you could simply set `current_index` to `None`: it is easy to check and, if you use the index anyway, you'll probably get an exception.
- You can get rid of the part comparing the index to the length and just handle it after the loop.
- `current_index` is probably not the best name, as it lets the reader think it corresponds to the index we are iterating over (aka `index`). It could be a good idea to convey the idea of a beginning or starting index.

At the end, the code looks like:
```python
from string import ascii_letters


def extract_words(text):
    lst = []
    # None means we are not currently inside a word.
    starting_index = None
    for index, character in enumerate(text):
        if character not in ascii_letters:
            if starting_index is not None:
                # A word just ended: store it.
                lst.append(text[starting_index:index])
                starting_index = None
        elif starting_index is None:
            # A new word starts here.
            starting_index = index
    # Handle a word that runs up to the end of the text.
    if starting_index is not None:
        lst.append(text[starting_index:])
    return lst
```

Another idea would be to do things differently: replace the unwanted characters with spaces and then split on whitespace.
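A minimal sketch of that second idea could look like the following. It is one possible way to write it, not the assignment's reference solution; it relies on `str.split()` with no argument, which splits on runs of whitespace and drops empty strings.

```python
from string import ascii_letters


def extract_words(text):
    """Return the maximal runs of ASCII letters found in text, in order."""
    # Turn every character that is not an ASCII letter into a space...
    cleaned = ''.join(c if c in ascii_letters else ' ' for c in text)
    # ...so that splitting on whitespace leaves only the words.
    return cleaned.split()


print(extract_words("anything else.....not my job"))
# ['anything', 'else', 'not', 'my', 'job']
```

This version builds a second string the size of the input, which is usually an acceptable price for not having to track a starting index by hand.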
make_sentiment:

Instead of asserting, it could be an idea to raise a `ValueError`.

has_sentiment:

You can simply write `return s is not None`. Also, you should not compare to `None` using `==` but with `is`, as per PEP 8. You'll find various tools like pep8 (since renamed pycodestyle), pyflakes, etc. to check your code and detect such things.
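As a rough illustration of these two remarks, here is a sketch of what the functions might look like. The original code is not shown here, so the valid range of [-1, 1] and the dictionary representation are assumptions; only the names make_sentiment, has_sentiment and sentiment_value come from the review.

```python
def make_sentiment(value):
    """Return a sentiment for value, which is None or a number in [-1, 1]."""
    # Assumed range. Unlike an assert, this check still runs under `python -O`.
    if value is not None and not -1 <= value <= 1:
        raise ValueError('sentiment must be None or in [-1, 1], got %r' % (value,))
    return {'value': value}  # hypothetical representation


def has_sentiment(s):
    """Return whether the sentiment s carries a value."""
    # Identity test, as recommended by PEP 8, instead of `!= None`.
    return s['value'] is not None


def sentiment_value(s):
    """Return the value of s, which must carry one."""
    if not has_sentiment(s):
        raise ValueError('no sentiment value')
    return s['value']
```

Note that a sentiment of 0 still counts as having a value here, because the test is on identity with `None` and not on truthiness.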
analyze_tweet_sentiment:

Because non-zero numbers are considered true in boolean contexts, you can write:

```python
if total_sentiment:
    return total_sentiment / count_sentiment
else:
    return average
```

which can be written:

```python
return total_sentiment / count_sentiment if total_sentiment else average
```

Also, `average` does not need to be defined that early; it could simply be:

```python
return total_sentiment / count_sentiment if total_sentiment else make_sentiment(None)
```

Then, I am wondering whether you should be checking `total_sentiment` or `count_sentiment`. This amounts to choosing whether a tweet can have a sentiment of value 0 (for instance if it has both positive and negative words) or whether that case corresponds to `None`. This is an open question and I do not have the answer.

Finally, a slightly different way to write this function would be to abuse list comprehensions in order to reuse the builtin functions `len` and `sum`. For instance, we'd have something like:

```python
def analyse(tweet):
    sentiment_values = [sentiment_value(s)
                        for s in (get_word_sentiment(w) for w in tweet_words(tweet))
                        if has_sentiment(s)]
    return sum(sentiment_values) / len(sentiment_values) if sentiment_values else make_sentiment(None)
```
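To see what is at stake in the total-versus-count question, here is a small self-contained sketch. The function average_or_none and its check_total flag are hypothetical and exist only for this illustration; it works on plain numbers and returns `None` instead of going through the sentiment abstraction.

```python
def average_or_none(values, check_total):
    """Average values, or return None when the chosen guard is falsy."""
    total, count = sum(values), len(values)
    # Guard on the total (as in the snippets above) or on the number of values.
    guard = total if check_total else count
    return total / count if guard else None


mixed = [0.5, -0.5]  # one positive word and one negative word
print(average_or_none(mixed, check_total=True))   # None
print(average_or_none(mixed, check_total=False))  # 0.0
```

Guarding on the total treats a tweet whose words cancel out like a tweet with no known words at all, while guarding on the count reports it as neutral.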
Context
StackExchange Code Review Q#91158, answer score: 2