Creating a simple Interpreter for the Quartz language

Submitted by: @import:stackexchange-codereview··

Problem

I've been trying to improve my C++ skills, and decided to try my hand at making an interpreter for a toy language. The language is called Quartz, and so far the only thing you can do is output strings. Two keywords print strings: output, which prints all the strings on one line, and nl_output, which prints each string on its own line.

The following program is valid in Quartz:

nl_output "Hello World"
nl_output "Goodbye Wolrd"
nl_output "This is a test of the Quartz language"


Each file has the .qz extension and is essentially a plain text file.

An overview of how my interpreter works:

  • It first opens a .qz file, and then checks whether the file was opened successfully.
  • After ensuring that the file has been opened properly, the file contents are read into a string. The string is then fed to a lexer that checks for tokens. The lexer uses a for-loop to iterate over the string, and adds any tokens it finds to a vector.
  • The lexer then returns the vector to be read by the parser. The parser uses a while loop to iterate over the vector, and calls the matching code whenever a keyword is found.

main.cpp

#include <iostream>
using std::cout;
using std::cerr;
using std::endl;
#include <fstream>
using std::ifstream;
using std::fstream;
#include <string>
using std::string;
using std::getline;
#include <vector>
using std::vector;

void open_file(const char *filename, ifstream &data)
{
    data.open(filename);
    if(data.fail())
    {
        cerr << …
…
vector<string> lexer(string &data_str, ifstream &data)
{
    string tok;
    string string_var;
    string expr;

    vector<string> tokens;
    getline(data, data_str, '\0');
    bool is_string = false;
    data_str += '…
    for(unsigned int i=0; i …
…
… &tokens)
{
    unsigned int i = 0;
    while(i …
…
    … tokens = lexer(data_str, data);
    parser(tokens);
    return 0;
}


To test the interpreter, simply compile the code in your command prompt/terminal window.

In my case I did:

g++ C:\main.cpp -o quartz.exe


Then run the …

Solution

The three main questions I have are:

OK.


Is the way I'm reading over my string, and adding my tokens to the tokens vector inefficient and slow?

It's not the worst I have seen. But there does seem to be an awful lot of copying going on. You don't have to actually send back strings as the tokens. The lexer usually reads the string and converts this into a stream (or a vector) of lexemes. The lexemes only need to be a stream of numbers.

nl_output => 256
output    => 257
<string>  => 258


But the worst part is that it is not clear what you are trying to achieve (without really digging into the code). Your code should be self documenting and currently is not.
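
As a rough sketch of the numeric lexemes suggested above: the names `TokenKind`, `Token` and `tokenize` are made up for illustration and are not part of the original code. It assumes strings contain no embedded quotes, does not check word boundaries around keywords, and silently skips anything it does not recognise.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; the values mirror the 256/257/258 numbering above.
enum TokenKind { NL_OUTPUT = 256, OUTPUT = 257, STRING = 258 };

struct Token {
    TokenKind   kind;
    std::string text;   // only filled in for STRING tokens (quotes removed)
};

// Assumption: unknown characters are skipped rather than reported as errors.
std::vector<Token> tokenize(const std::string& src)
{
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src[i] == '"') {
            std::size_t close = src.find('"', i + 1);
            if (close == std::string::npos) break;   // unterminated string
            tokens.push_back({STRING, src.substr(i + 1, close - i - 1)});
            i = close + 1;
        } else if (src.compare(i, 9, "nl_output") == 0) {
            tokens.push_back({NL_OUTPUT, ""});
            i += 9;
        } else if (src.compare(i, 6, "output") == 0) {
            tokens.push_back({OUTPUT, ""});
            i += 6;
        } else {
            ++i;   // whitespace or an unrecognised character
        }
    }
    return tokens;
}
```

The parser can then switch on `kind` instead of comparing strings, and only string literals carry any text around.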


Is it bad practice to read a file until a NULL character ('\0') is reached?

Yes. Because there can potentially be '\0' characters as valid input. Are you assuming that a file is null terminated? It is not. When you reach the end of the file, the end-of-file flag will be set on the stream.


Is it mandatory to close a file after opening it? What might occur if I choose not to?

Not mandatory. In my opinion, closing it explicitly is not good practice (unless you plan to do something if the close fails). And closing a file opened for reading is not going to fail in an exciting way. Other things will have gone wrong first. Let the destructor of the stream close the file for you.
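
A small sketch of what that looks like in practice. The function name `load_source` and the choice to report failure through a `bool` are assumptions made for the example:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Assumption: an unreadable file is reported by returning false.
bool load_source(const std::string& filename, std::string& out)
{
    std::ifstream data(filename);
    if (!data) {
        return false;
    }
    out.assign(std::istreambuf_iterator<char>(data),
               std::istreambuf_iterator<char>());
    return true;
}   // data goes out of scope here; its destructor closes the file
```

There is no explicit `close()` call anywhere: the stream is a local object, so the file is released on every path out of the function.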

Code Review.

I think your lexer can be much more easily written.

  • Assuming all lexemes are "white space separated".
  • The list of lexemes is:
      • TERMINAL: nl_output
      • TERMINAL: output
      • Quoted String: -> "*"



Code

#include <algorithm>
#include <istream>
#include <iterator>
#include <string>
#include <vector>

// Declared up front because lexer() calls it before its definition.
std::string readComment(std::string const& word, std::istream& s);

std::vector<std::string> lexer(std::istream& s)
{
    std::vector<std::string> result;
    std::string word;
    while(s >> word)   // reads a word from the stream.
    {                  // Drops all preceding white space.

        if (word == "nl_output") {
            result.push_back(word);
        }
        else if (word == "output") {
            result.push_back(word);
        }
        else if (word[0] == '"') {
            result.push_back(readComment(word, s));
        }
        else {
            // Error
        }
    }
    return result;
}

// Returns the complete quoted string (including both quotes) that starts in word.
std::string readComment(std::string const& word, std::istream& s)
{
    // First see if the whole quote is in the first word.
    auto find = std::find(std::begin(word) + 1, std::end(word), '"');
    if (find != std::end(word))
    {
        // Push any characters after the closing quote back onto the stream.
        auto extraStart = find+1;
        auto extraDist  = std::distance(extraStart, std::end(word));
        for(int loop = 0; loop < extraDist; ++loop)
        {
            s.unget();
        }
        return word.substr(0, std::distance(std::begin(word), extraStart));
    }

    // OK the quote spans multiple words.
    std::string moreData;
    std::getline(s, moreData, '"');
    return word + moreData + '"';
}


But this logic will get convoluted real quickly. I suggest you use a real lexer (like flex). Writing the rules is much simpler.

Space              [ \r\n\t]
QuotedString       \"[^"]*\"
%%
nl_output          {return 256;}
output             {return 257;}
{QuotedString}     {return 258;}
{Space}            {/* Ignore */}
.                  {error("Unmatched character");}
%%


Context

StackExchange Code Review Q#138402, answer score: 4
