
Finite state machine for CSV data

Submitted by: @import:stackexchange-codereview

Problem

I want to read a file containing comma-separated values, so I have written a finite state machine:

```
private IList<string> Split(string line)
{
    List<string> values = new List<string>();
    string value = string.Empty;
    ParseState state = ParseState.Initial;
    foreach (char c in line)
    {
        switch (state)
        {
            // At the start of a field.
            case ParseState.Initial:
                switch (c)
                {
                    case COMMA:
                        values.Add(string.Empty);
                        break;
                    case QUOTE:
                        state = ParseState.Quote;
                        break;
                    default:
                        value += c;
                        state = ParseState.Data;
                        break;
                }
                break;
            // Inside an unquoted field.
            case ParseState.Data:
                switch (c)
                {
                    case COMMA:
                        values.Add(value);
                        value = string.Empty;
                        state = ParseState.Initial;
                        break;
                    case QUOTE:
                        throw new InvalidDataException("Improper quotes");
                    default:
                        value += c;
                        break;
                }
                break;
            // Inside a quoted field.
            case ParseState.Quote:
                switch (c)
                {
                    case QUOTE:
                        state = ParseState.QuoteInQuote;
                        break;
                    default:
                        value += c;
                        break;
                }
                break;
            // Just after a quote that was seen inside a quoted field.
            case ParseState.QuoteInQuote:
                switch (c)
                {
                    case COMMA:
                        values.Add(value);
                        value = string.Empty;
                        state = ParseState.Initial;
                        break;
                    case QUOTE:
                        // ... (the rest of the method is cut off in the source)
```

Solution

> So my question is this: is this block at the end some sort of anti-pattern, and is it something that's going to come back to bite me in the future?

Not necessarily.
It depends on whether line can contain end-of-line markers.

If line can contain an EOL character that marks the end of the CSV record for you,
then you could move the logic from the last switch into case EOL: for the state-specific handling as appropriate,
and stop further processing.
Or, if you want to support multi-line CSV records (newline characters embedded within quotes), you could continue processing with the next line.
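
As a rough illustration of the EOL-driven variant, here is a minimal sketch. It assumes that `'\n'` is the EOL marker, that the input always contains one, and that a quote following a closing quote is an escaped quote (that branch is cut off in the question, so this is a guess). The class name `CsvEolSketch` and the constant `EOL` are hypothetical; the state names follow the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvEolSketch
{
    private const char COMMA = ',';
    private const char QUOTE = '"';
    private const char EOL = '\n'; // assumption: '\n' terminates a record

    private enum ParseState { Initial, Data, Quote, QuoteInQuote }

    public static IList<string> Split(string text)
    {
        var values = new List<string>();
        var value = new StringBuilder();
        var state = ParseState.Initial;

        foreach (char c in text)
        {
            switch (state)
            {
                case ParseState.Initial:
                    switch (c)
                    {
                        case EOL: // record ends on an empty field
                            values.Add(string.Empty);
                            return values;
                        case COMMA:
                            values.Add(string.Empty);
                            break;
                        case QUOTE:
                            state = ParseState.Quote;
                            break;
                        default:
                            value.Append(c);
                            state = ParseState.Data;
                            break;
                    }
                    break;
                case ParseState.Data:
                    switch (c)
                    {
                        case EOL: // record ends on an unquoted field
                            values.Add(value.ToString());
                            return values;
                        case COMMA:
                            values.Add(value.ToString());
                            value.Clear();
                            state = ParseState.Initial;
                            break;
                        case QUOTE:
                            throw new InvalidDataException("Improper quotes");
                        default:
                            value.Append(c);
                            break;
                    }
                    break;
                case ParseState.Quote:
                    // EOL inside quotes is ordinary data, which allows multi-line fields.
                    if (c == QUOTE) state = ParseState.QuoteInQuote;
                    else value.Append(c);
                    break;
                case ParseState.QuoteInQuote:
                    switch (c)
                    {
                        case EOL: // record ends right after a closing quote
                            values.Add(value.ToString());
                            return values;
                        case COMMA:
                            values.Add(value.ToString());
                            value.Clear();
                            state = ParseState.Initial;
                            break;
                        case QUOTE: // assumed: doubled quote is an escaped quote
                            value.Append(QUOTE);
                            state = ParseState.Quote;
                            break;
                        default:
                            throw new InvalidDataException("Improper quotes");
                    }
                    break;
            }
        }
        // No EOL was seen outside quotes, so the record never ended.
        throw new InvalidDataException("Record is not terminated by EOL");
    }
}
```

With this shape, `CsvEolSketch.Split("a,\"b\nc\"\n")` yields two fields, `a` and `b` followed by a newline and `c`, and anything after the terminating EOL is ignored. An input without a terminating EOL, or with an unclosed quote, ends in the final `throw`.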

If the full length of line is expected to be the complete CSV record,
then it's fine to not terminate it with an explicit EOL and handle it in the extra switch as you did.
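
The final block is cut off in the question as reproduced here, so the following is only a guess at the shape such an end-of-input switch might take, not the author's code. The `Finish` helper and the class name are hypothetical; it would be called once after the `foreach` loop with the state the loop ended in.

```csharp
using System.Collections.Generic;
using System.IO;

public enum ParseState { Initial, Data, Quote, QuoteInQuote }

public static class CsvFinishSketch
{
    // Decides what the end of the input means for each final state.
    public static void Finish(ParseState state, string value, IList<string> values)
    {
        switch (state)
        {
            case ParseState.Initial:
                // Empty record, or the record ended right after a comma:
                // assumed here to mean a trailing empty field.
                values.Add(string.Empty);
                break;
            case ParseState.Data:
            case ParseState.QuoteInQuote:
                // The last field is complete; flush it.
                values.Add(value);
                break;
            case ParseState.Quote:
                // The input ended inside a quoted field.
                throw new InvalidDataException("Unterminated quoted field");
        }
    }
}
```

Keeping this logic in one place after the loop means the per-character switch never has to know about end of input, which is the trade-off the answer describes.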

Context

StackExchange Code Review Q#91694, answer score: 2
