C String - new function to detect user's anger

Submitted by: @import:stackexchange-codereview

Problem

This was the most humorous coding I've ever done. It's for my string library in C. The function, str_isHeated(), detects whether the user is angry, to various degrees.

Why?

Ever play a text-based game where you're swearing at the computer, typing multiple !!!, and the computer responds with something dumb? I think this may be useful for AI, where NPCs (non-player characters) can judge your mood and respond appropriately. It might even be used for online customer service.

It works, but I'm interested to see if anyone has any thoughts on how to improve it. I've been having some fun with it.

/*
Function: str_getHeat()
Software usually gets user information, but it hardly 
detects the user's emotion when entering in the information. 
This may be useful for checking a customer's or player's 
typing behavior, which may generate better responses with AI.
Calculated as follows:
    All Caps 
    One or more words in caps
    Exclamation Point Count
    If 'please' or 'sorry' is found, take off heat points.
    Swearing words
Returns: EHeat_Cold, EHeat_Warm, EHeat_Heated, EHeat_VeryHeated
*/
EHeat str_isHeated(STRING *objString)
{
int i;
int intHeatScore = 0;       /* 0% cold; 100% very heated */
STRINGCOLLECTION tokens; 
STRING temp_a;

/* Count how many exclamations there are */
for (i = 0; i < objString->length; i++)
{
    if (objString->str[i] == '!')
        intHeatScore += 10;
}

/* tokenize user's input */
sc_init(&tokens);
str_tokenize(objString, &tokens); 

    /* Check if all caps. That can be taken as impatient. */
if (str_isUpper(objString))
{
    intHeatScore += 10;
}
else
{
    /* check if one or more words are all in caps. That is 
       demanding behavior, and that is not nice. */
    for (i = 0; i < ...

...

if (intHeatScore >= 50)
    return EHeat_VeryHeated;

else if (intHeatScore >= 30)
    return EHeat_Heated;

else if (intHeatScore > 10)
    return EHeat_Warm;

else if (intHeatScore >= 0)
    return EHeat_Cold; 

return EHeat_Cold;
}

Solution

Algorithm:

Your approach makes a few questionable assumptions. Nothing is outright wrong, but these things could be improved:

- You assume bounds on your "heat score" with this comment:

/* 0% cold; 100% very heated */


But no code enforces these bounds: the score can go negative as well as above 100. I'd recommend enforcing them. Sentiment is better seen as a probability, which can only be between 0 and 1.

To match this probability, it might be better to store your sentiment score as a double rather than an int, but that choice is up to you.

- Right now you are using a bag-of-words model. This is a typical approach when first starting out with sentiment analysis because it is easy to implement, but it is usually less accurate at capturing the actual sentiment of the text.

It's a fairly straightforward and practical way to go, but there are a lot of situations where it will make mistakes:

1. Ambiguous sentiment words - "This product works terribly" vs. "This product is terribly good"

2. Missed negations - "I would never in a million years say that this product is worth buying"

3. Quoted/Indirect text - "My dad says this product is terrible, but I disagree"

4. Comparisons - "This product is about as useful as a hole in the head"

5. Anything subtle - "This product is ugly, slow and uninspiring, but it's the only thing on the market that does the job"

As far as NLP helping you with any of this: word sense disambiguation (or even just part-of-speech tagging) may help with (1), syntactic parsing might help with the long-range dependencies in (2), and some kind of chunking might help with (3). It's all research-level work, though; there's nothing that I know of that you can use directly. Issues (4) and (5) are a lot harder, and at this point I throw up my hands and give up.

- You will fail to score a good number of sentences that occur in real life. Take a look at many of the sentences in this post, for example. They don't contain swear words, "please" or "sorry", exclamation points, or words in caps. You need a more general lexicon of positive, negative, and neutral words, and then a system for weighing the effects of these words into your score.

- Some of your score modifications are odd. Why is the score considered more positive when I refer to myself with "I"? I don't think it should be, and I would reconsider the reasoning behind it.

Sentences that end with an exclamation point aren't necessarily negative (higher heat) either. An exclamation point is often used to indicate strong feelings (such as excitement) or high volume. Most sentiment analysis systems that I've looked at don't consider punctuation in the final score at all.

"A" should be considered the same as "I". It is very possible that a sentence could start with "A", and be either positive or neutral but your program considers it to have a negative connotation.

For a basic sentiment analysis this is fine, but do note that it has its flaws. If you're looking to improve the accuracy of your algorithm, I'd recommend reading this research paper, which achieves a classification accuracy of 90% (higher than any other published results).
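
To make two of the points above concrete (a more general weighted lexicon, and a score that actually stays within its documented bounds), here is a minimal sketch. The words, the weights, and the names LexiconEntry, lexicon_weight, lexicon_score, and heat_fraction are all made up for illustration and are not part of the original library. It is still a bag-of-words scorer, so it shares every weakness listed earlier.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lexicon entry: a lowercase word and its heat weight.
   Positive weights add heat, negative weights remove it. */
typedef struct {
    const char *word;
    int weight;
} LexiconEntry;

/* Illustrative values only; a real lexicon would be far larger. */
static const LexiconEntry lexicon[] = {
    { "please",  -10 },
    { "sorry",   -10 },
    { "thanks",   -5 },
    { "stupid",   15 },
    { "useless",  15 },
    { "hate",     20 },
};

/* Look up one lowercase word; unknown (neutral) words weigh 0. */
static int lexicon_weight(const char *word)
{
    size_t i;
    for (i = 0; i < sizeof lexicon / sizeof lexicon[0]; i++) {
        if (strcmp(lexicon[i].word, word) == 0)
            return lexicon[i].weight;
    }
    return 0;
}

/* Sum the weights of every alphabetic word in text, ignoring case.
   Words longer than the buffer are truncated before lookup. */
static int lexicon_score(const char *text)
{
    char word[32];
    size_t len = 0;
    int score = 0;

    for (;; text++) {
        unsigned char c = (unsigned char)*text;
        if (isalpha(c)) {
            if (len < sizeof word - 1)
                word[len++] = (char)tolower(c);
        } else {
            if (len > 0) {          /* a word just ended: score it */
                word[len] = '\0';
                score += lexicon_weight(word);
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
    return score;
}

/* Clamp a raw score to the documented 0..100 range and express it
   as a fraction in [0, 1]. */
static double heat_fraction(int raw_score)
{
    if (raw_score < 0)
        raw_score = 0;
    else if (raw_score > 100)
        raw_score = 100;
    return raw_score / 100.0;
}
```

With these made-up weights, lexicon_score("I HATE this useless thing") adds 20 and 15, and a polite sentence ends up negative before heat_fraction() clamps it to 0. Clamping once, at the end, keeps the individual scoring rules simple.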

Code:

- Right now you have the function str_findString(). I'm guessing this is a variation of strstr(). I'm also guessing that <string.h>'s implementation will be faster and more efficient, given that it is part of the standard library.

if (strstr(temp_a, "$#@#"))
{
    ...
}


- Declare i inside your for loops (C99).

for (int i = 0; i < objString->length; i++)


- I would indent the function body one more level.

EHeat str_isHeated(STRING *objString)
{
    int intHeatScore = 0;       /* 0% cold; 100% very heated */
    STRINGCOLLECTION tokens;


- I would combine your last two return conditions into one.

else if (intHeatScore >= 0)
    return EHeat_Cold; 

return EHeat_Cold;


I find the last else-if comparison useless overall; the function would return EHeat_Cold anyway if it weren't included.

return EHeat_Cold;


- I'm not too sure about what I'm assuming are #defines: STRING and STRINGCOLLECTION. I guess it is okay to keep them, but is there a specific reason you don't just write what they actually are: char* and char* array respectively?
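
Putting the code suggestions above together, a simplified scorer might look like the sketch below. It works on a plain C string rather than the library's STRING type, the EHeat enum is a stand-in because the real definition is not shown, and both the function name heat_from_text and the 10-point deduction for polite words are assumptions. The thresholds are the ones from the question.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the library's EHeat type; the real definition is not
   shown in the question, so this enum is an assumption. */
typedef enum {
    EHeat_Cold,
    EHeat_Warm,
    EHeat_Heated,
    EHeat_VeryHeated
} EHeat;

/* Hypothetical simplified scorer working on a plain C string. */
static EHeat heat_from_text(const char *text)
{
    int intHeatScore = 0;

    /* C99: the loop counter is declared in the for statement itself. */
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '!')
            intHeatScore += 10;
    }

    /* strstr() from <string.h> returns NULL when the substring is absent.
       The size of the deduction is an assumed value. */
    if (strstr(text, "please") != NULL || strstr(text, "sorry") != NULL)
        intHeatScore -= 10;

    if (intHeatScore >= 50)
        return EHeat_VeryHeated;
    else if (intHeatScore >= 30)
        return EHeat_Heated;
    else if (intHeatScore > 10)
        return EHeat_Warm;

    /* Everything else, including negative scores, is cold. */
    return EHeat_Cold;
}
```

The single fall-through return replaces the redundant final else-if, so a negative score no longer needs a branch of its own.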

Code Snippets

/* 0% cold; 100% very heated */

if (strstr(temp_a, "$#@#"))
{
    ...
}

for (int i = 0; i < objString->length; i++)

EHeat str_isHeated(STRING *objString)
{
    int intHeatScore = 0;       /* 0% cold; 100% very heated */
    STRINGCOLLECTION tokens;

else if (intHeatScore >= 0)
    return EHeat_Cold; 

return EHeat_Cold;

Context

StackExchange Code Review Q#37155, answer score: 8
