HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcMinor

Parse 2D matrix, 2 versions

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
versionsmatrixparse

Problem

I'm writing a little C program that computes matrices (for learning purpose). The matrix is fed through arguments in the following form : "{{43,543.324,34},{43,432,87}}"
The user doesn't give the dimensions.

I have solved this problem two times in two different ways:

  • Short series of entangled if / else / mess



  • Kind of a state machine doing one task at a time, much longer



Here they are, the first one is commented and the second one follows:

```
int matrix_fill(matrix A, char input)
{
size_t i, j;
float val;

/*
size_t input_len = strlen(input);
if (input[0] != '{' || input[input_len - 1] != '}') {
errno = EINVAL;
perror("matrix_fill");
exit(-1);
}

input[input_len - 1] = '\0';
input++, input_len -= 2;

i = 0;
while (*input != '\0') {
printf("input: %s\n", input);
j = 0;

if (*input == '{') {
input++;

while (1) {

val = strtof(input, &input);
printf("(%d,%d) = %f\n", i, j, val);
matrix_set(A, i, j, val);
j++;

if (*input == ',')
input++;

else if (*input == '}')
break;

else {

}
}

i++, input++;
A->m = i, A->n = j;
}

else if (*input == ',') {
input++;
}

else {
errno = EINVAL;
perror("matrix_fill");
exit(-1);
}
}
*/

enum {
start,
open,
read,
delim,
close,
end
} status = start;

while (*input != '\0') {

switch (status) {
case start:
if (input != '{' || (input + 1) != '{') {
errno = EINVAL;
goto error;
}

i = 0, input++;
status = open;
break;

case open:
j = 0, input++;

Solution

Using a state machine is generally a very elegant way to parse input. This is actually how parsers
generated by YACC or Bison work, and it's also how lexical analysers generated by LEX work. It allows
the parsing to be more flexible.

Generally the variable names you used are quite descriptive and makes the code easier to use, the
one exception is naming the matrix A.

Things That Would Have Improved This Question

It would have been much easier to review this question if the following had been provided:
  1. The definition of the struct matrix.
  2. The header files necessary (stdio.h, stdlib.h, strings.h and errno.h).
  3. If the 2 different versions of matrix_fill() had been separated and included as matrix_fill1()


and matrix_fill2().
  1. If the program had been compiled using -Wall and all warnings addressed prior to asking.



I have included the reformated code that allowed me to compile at the end of my answer to show what
would have made this a better question and easier to answer.

During development of code it is a good idea to compilte the code with -Wall compiler switch so
that all warnings are reported. The warning can also be treated as errors. The type size_t is
not completely compatible with the printf() statements in the commented out algorithm:

Parse2DMatrix.c:142: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘size_t’

Parse2DMatrix.c:142: warning: format ‘%d’ expects type ‘int’, but argument 3 has type ‘size_t’

The matrix struct should be presented so that we could have suggested possible error handling,
there are times when i or j may cause memory access problems because they are outside the scope
of the matrix.

The choice of types is interesting, I would have used double instead of float, and unsigned int
instead of size_t.

Error Reporting

Rather than using errno.h and perror() for this portion of the code it would have been
better if there was an internal error handling mechanism providing a better description of
the errors. Invalid argument is a little too general, "Missing { at start of expression" might
have been more meaningful.

Exit Versus Return

Calling exit(-1) does not allow the code to clean up after itself, returning an error value
allows the calling code to clean up any allocated memory and reset all items that need to be
reset. If this code was part of an operating system or a daemon that calling exit() is actually
not an option because you wouldn't want the system to crash.

Switch Statement Without Default Case

There should always be a default: case in a switch statement to handle unknown/undefined
behavior. In this specific instance rather than having a goto in case of errors the
default: case could have been used to handle the errors on the next iteration through the
loop. There could be one additional enum value in the enum, error could be a status type.

Enum Coding Standards

It would be better to call attention to the enumerators in the ENUM, and follow general C
Coding Standards (UCSD has published this coding standard).
By making the enumerators ALL_CAPITALS the code would be clearer and show that these are defined someplace.

It is possible that the status/state might be used elsewhere so it would be better to define the enum
type outside the function, and possibly in a header file.

A better way to define the enum would probably be:

typedef status { START, OPEN, READ, DELIM, CLOSE, END, ERROR} Status;


I would probably call the variable status state since that is what it is really being for.

Alternative Implementation For The State Machine

YACC, BISON and LEX generate tables (matricies) for transition from one state to another. One possible
alternate implementation of the state machine is to have an array of function pointers. Another possible
implementation is to use the current state as an index into an array of the next state (how YACC, BISON and
LEX are implemented).

White Space

The parsing algorithm does not allow for white space. Handling white space would be easy to add and allows
the user of the program to make the input more readable. There are a number of ways to implement check on
white space. One possibility is to have matrix_fill() be only a parser and have it run a lexical analysing
function that tokenizes the input. Note: This assumes that the buffer char *input has not been previously
processed to remove white space.

```
#include
#include
#include
#include

typedef struct Matrix {
size_t m;
size_t n;
float values[20][20];
} matrix;

int matrix_set(matrix *A, size_t i, size_t j, float val)
{
int error = 0;

/ Algorithm not specified /

return error;
}

int matrix_fill2(matrix A, char input)
{
size_t i, j;
float val;

enum {
start,
open,
read,
delim,
close,
end
} status = start;

while (*input != '\0') {

switch (status) {
case start:
if (input != '{' ||

Code Snippets

typedef status { START, OPEN, READ, DELIM, CLOSE, END, ERROR} Status;
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <errno.h>

typedef struct Matrix {
    size_t m;
    size_t n;
    float values[20][20];
} matrix;

int matrix_set(matrix *A, size_t i, size_t j, float val)
{
    int error = 0;

    /* Algorithm not specified */

    return error;
}

int matrix_fill2(matrix *A, char *input)
{
    size_t i, j;
    float val;


    enum {
        start,
        open,
        read,
        delim,
        close,
        end
    } status = start;

    while (*input != '\0') {

        switch (status) {
        case start:
            if (*input != '{' || *(input + 1) != '{') {
                errno = EINVAL;
                goto error;
            }

            i = 0, input++;
            status = open;
            break; 

        case open:
            j = 0, input++;
            status = read;
            break;

        case read:
            val = strtof(input, &input);
            if (errno < 0) goto error;

            matrix_set(A, i, j, val);
            j++;

            if (*input == ',')
                status = delim;

            else if (*input == '}')
                status = close;

            else {
                errno = EINVAL;
                goto error;
            }

            break;

        case delim:
            input++;

            if (*input == '{')
                status = open;

            else 
                status = read;

            break;

        case close:
            input++;

            if (*input == ',')
                status = delim;

            else if (*input == '}')
                status = end;

            else {
                errno = EINVAL;
                goto error;
            }

            i++;
            break;

        case end:
            input++;
            A->m = i, A->n = j;

            break;
        }
    }

    return 0;

error:
    perror("matrix_fill");
    return -1;
}

int matrix_fill1(matrix * A, char * input)
{
    size_t i, j;
    float val;
    int error = 0;

    size_t input_len = strlen(input);
    if (input[0] != '{' || input[input_len - 1] != '}') {
        errno = EINVAL;
        perror("matrix_fill");
        exit(-1);
    }

    input[input_len - 1] = '\0';
    input++, input_len -= 2;

    i = 0;
    while (*input != '\0') {
        printf("input: %s\n", input);
        j = 0;

        if (*input == '{') {
            input++;

            while (1) {

                val = strtof(input, &input);
                printf("(%d,%d) = %f\n", i, j, val);
                matrix_set(A, i, j, val);
                j++;

                if (*input == ',')
                    input++;

                else if (*input == '}')
                    break;

                else {

                }
            }

            i++, input++;
            A->m = i, A->n = j;
        }

        else if (*input == ',') {
            input++;
        }

        else {
            errno = EINVAL;
            perror("matrix_fil

Context

StackExchange Code Review Q#134888, answer score: 2

Revisions (0)

No revisions yet.