
Lexical Analysis Program

Submitted by: @import:stackexchange-codereview

Problem

For my computer science class, I was required to write a lexical analysis program that would perform several functions on a std::string. The assignment required a function for each of the following:

  • count number of a certain substring
  • count number of words excluding numbers
  • count number of unique words (excludes repeated words)
  • count number of sentences (by end punctuation)
  • average words per sentence
  • lexical density as percent (unique word count / word count * 100)

My code is fairly long, but here it is:

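```
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

int substringCount(const string&, const string&);
int wordCount(const string&);
int uniqueWordCount(const string&);
int sentenceCount(const string&);
double wordsPerSentence(const string&);
double lexicalDensity(const string&); // different words / total words * 100

int main() {
    string source = ("This is the source text for this program.");
    // [the rest of main() was lost in extraction]
}

vector<int> substringIndices(const string& str, const string& sub) { // Find indices of substrings
    vector<int> indices = {};
    // NOTE: both loop headers and the definition of t are reconstructed
    // assumptions; the original text between "i" and ">= str.size()" was lost.
    for (unsigned int i = 0; i < str.size(); i++) {
        for (unsigned int j = 0; j < sub.size(); j++) {
            unsigned int t = i + j;
            if (t >= str.size()) {
                break;
            }
            if (str[t] == sub[j]) {
                if (j + 1 == sub.size()) indices.push_back(i);
                continue;
            } else {
                break;
            }
        }
    }
    return indices;
}

vector<string> splitByWhitespace(const string& str) { // split by whitespace
    vector<string> tokens;
    istringstream iss(str); // create istringstream
    copy(istream_iterator<string>(iss), istream_iterator<string>(), back_inserter(tokens)); // copy into tokens
    return tokens;
}

vector<string> deleteNumbers(const vector<string>& data) {
    vector<string> tr = {};
    for (unsigned int i = 0; i < data.size(); i++) {
        for (char c: data[i]) {
            if (string("0123456789").find(c) == -1) continue;
            else goto mainLoop;
        }
        tr.push_back(data[i]);
        mainLoop: continue;
    }
    return tr;
}

// [code between deleteNumbers() and uniqueWordCount() was lost in extraction]

int uniqueWordCount(const string& str) { // header reconstructed from the prototype above
    vector<string> words = deleteNumbers(splitByWhitespace(str));
    sort(words.begin(),words.end()); // sort
    words.erase(unique(words.begin(),words.end()),words.end()); // delete extra non-unique words
    return words.size();
}

int sentenceCount(const string& str) { // g
// [the listing is truncated here in the source]
```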

Solution

Namespaces

A common practice is to prefix the types and functions supplied by the
Standard Library (still often called the Standard Template Library) with
std:: rather than bypassing namespaces with the statement

    using namespace std;


This becomes more helpful as problems and code grow more complex and you
need to include other namespaces. Some languages, such as C, don't
support namespaces; C++ does, and it is a useful feature.
More complex programs may use multiple namespaces, and each of them
can contain a definition for functions such as sort() or find_if(),
or overloads of operators such as <<. You may even need to
write your own sort() or find(). Writing std::sort explicitly means you
don't have to create a namespace of your own just to keep a sort() with
the same arguments from clashing with the standard one.
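
As a minimal sketch of the point above: the global sort() below is a hypothetical user-written function (it is not from the original post) that takes the same arguments as std::sort but leaves the range in descending order. The helper names sortedAscending and sortedDescending are also made up for illustration. Because the standard call is spelled std::sort and the user-written one ::sort, both can live in one file without a using directive.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical user-written sort() with the same arguments as std::sort.
// It sorts ascending with the standard algorithm, then reverses the range,
// so the result is in descending order.
template <typename Iterator>
void sort(Iterator first, Iterator last) {
    std::sort(first, last);
    std::reverse(first, last);
}

// Uses the Standard Library version, selected explicitly with std::.
std::vector<int> sortedAscending(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

// Uses the user-written version, selected explicitly with the global scope ::.
std::vector<int> sortedDescending(std::vector<int> values) {
    ::sort(values.begin(), values.end());
    return values;
}
```

An unqualified call to sort() on vector iterators can also pick up std::sort through argument-dependent lookup and become ambiguous, which is why the sketch qualifies both calls.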

Initialization of Empty Containers

There is no need for the empty braces on this line:

    vector<int> indices = {};


The vector container class has a default constructor that properly
initializes an empty vector:

    vector<int> indices;


Use the Features of the Container Classes and the Standard Template Library

You may find this website useful for learning all of the features of a
particular container class or of the standard library.

There are some functions you could be using, such as
std::find(), std::find_if(), std::count_if() and std::string::substr(), that could shorten the code.
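
As one illustration of leaning on the library, substring counting could be sketched with std::string::find instead of a hand-written nested loop. countSubstring is a hypothetical name, not the poster's substringCount(), and the sketch assumes that overlapping matches should be counted and that an empty pattern counts as zero.

```cpp
#include <cstddef>
#include <string>

// Counts how many times `sub` occurs in `str`, overlapping matches included.
// Assumption: an empty pattern is treated as occurring zero times.
int countSubstring(const std::string& str, const std::string& sub) {
    if (sub.empty()) {
        return 0;
    }
    int count = 0;
    // find() returns std::string::npos when there is no further match;
    // restarting at pos + 1 lets overlapping matches be found as well.
    for (std::size_t pos = str.find(sub); pos != std::string::npos;
         pos = str.find(sub, pos + 1)) {
        ++count;
    }
    return count;
}
```

To count only non-overlapping matches, the search would restart at pos + sub.size() instead of pos + 1.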

It might be wise to investigate std::map as well for counting words.
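
A rough sketch of how std::map could carry the word counting: each whitespace-separated token becomes a key and its value is the number of occurrences, so the number of unique words is simply the size of the map. The names wordFrequencies and lexicalDensityPercent are hypothetical, and the sketch assumes tokens are compared exactly as written (it does not filter out numbers or strip punctuation, which the assignment would still need).

```cpp
#include <map>
#include <sstream>
#include <string>

// Maps each whitespace-separated token to the number of times it occurs.
std::map<std::string, int> wordFrequencies(const std::string& text) {
    std::map<std::string, int> counts;
    std::istringstream stream(text);
    std::string word;
    while (stream >> word) {
        ++counts[word];  // operator[] inserts a zero count for a new key
    }
    return counts;
}

// Lexical density as a percentage: unique words / total words * 100.
// Assumption: an empty text is given a density of 0.
double lexicalDensityPercent(const std::map<std::string, int>& counts) {
    int total = 0;
    for (const auto& entry : counts) {
        total += entry.second;
    }
    if (total == 0) {
        return 0.0;
    }
    return 100.0 * static_cast<double>(counts.size()) / total;
}
```

With this layout the total word count, the unique word count and the density all come from a single pass over the text.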

GOTO

There are almost always ways to avoid using goto in C++. In rare cases
goto may be appropriate for error handling, more so in C than in
C++ because C++ has try{}/catch{} and exceptions.

In the following code there is really no reason to use a goto:

vector<string> deleteNumbers(const vector<string>& data) {
    vector<string> tr = {};
    for (unsigned int i = 0; i < data.size(); i++) {
        for (char c: data[i]) {
            if (string("0123456789").find(c) == -1) continue;
            else goto mainLoop;
        }
        tr.push_back(data[i]);
        mainLoop: continue;
    }
    return tr;
}
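
One possible goto-free version, sketched under the assumption that the intent is to drop every word containing at least one digit (which is what the loop above does). containsDigit is a hypothetical helper introduced here; std::any_of and std::copy_if are standard algorithms available since C++11.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: true if any character of `word` is a decimal digit.
bool containsDigit(const std::string& word) {
    return std::any_of(word.begin(), word.end(), [](unsigned char c) {
        return std::isdigit(c) != 0;
    });
}

// Returns the words of `data` that contain no digits, in their original order.
std::vector<std::string> deleteNumbers(const std::vector<std::string>& data) {
    std::vector<std::string> result;
    std::copy_if(data.begin(), data.end(), std::back_inserter(result),
                 [](const std::string& word) { return !containsDigit(word); });
    return result;
}
```

Moving the inner loop into its own function is what removes the need for a jump: the early exit becomes the return value of containsDigit, and the outer loop turns into a plain filter.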

Context

StackExchange Code Review Q#141817, answer score: 5
