Finite state machine for CSV data
Problem
I want to read a file containing comma-separated values, so I have written a finite state machine:
```
private IList<string> Split(string line)
{
    List<string> values = new List<string>();
    string value = string.Empty;
    ParseState state = ParseState.Initial;
    foreach (char c in line)
    {
        switch (state)
        {
            case ParseState.Initial:
                switch (c)
                {
                    case COMMA:
                        values.Add(string.Empty);
                        break;
                    case QUOTE:
                        state = ParseState.Quote;
                        break;
                    default:
                        value += c;
                        state = ParseState.Data;
                        break;
                }
                break;
            case ParseState.Data:
                switch (c)
                {
                    case COMMA:
                        values.Add(value);
                        value = string.Empty;
                        state = ParseState.Initial;
                        break;
                    case QUOTE:
                        throw new InvalidDataException("Improper quotes");
                    default:
                        value += c;
                        break;
                }
                break;
            case ParseState.Quote:
                switch (c)
                {
                    case QUOTE:
                        state = ParseState.QuoteInQuote;
                        break;
                    default:
                        value += c;
                        break;
                }
                break;
            case ParseState.QuoteInQuote:
                switch (c)
                {
                    case COMMA:
                        values.Add(value);
                        value = string.Empty;
                        state = ParseState.Initial;
                        break;
                    case QUOTE:
```
Solution
> So my question is this: is this block at the end some sort of anti-pattern, and is it something that's going to come back to bite me in the future?
Not necessarily. It depends on whether `line` can contain end-of-line markers.

If `line` can contain an EOL character that marks the end of the CSV record for you, then you could move the logic from the last `switch` into `case EOL:` for the state-specific handling as appropriate, and stop further processing. Or, if you want to support multi-line CSV records (newline characters embedded within quotes), you could continue processing with the next line.
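
As a rough illustration of the EOL-terminated variant, here is a minimal sketch, not the original code. It assumes the record ends with `'\n'`, that a doubled quote inside a quoted field is a literal quote, and that a newline inside quotes is ordinary data. The names `CsvSketch`, `Comma`, `Quote` and `Eol` are hypothetical, and the `Initial`/`Data` cases are merged for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvSketch
{
    private const char Comma = ',';
    private const char Quote = '"';
    private const char Eol = '\n'; // assumed record terminator

    private enum ParseState { Initial, Data, Quote, QuoteInQuote }

    // Splits one EOL-terminated record; anything after the EOL is ignored.
    public static IList<string> Split(string line)
    {
        var values = new List<string>();
        var value = new StringBuilder();
        var state = ParseState.Initial;

        foreach (char c in line)
        {
            switch (state)
            {
                case ParseState.Initial:
                case ParseState.Data:
                    if (c == Comma || c == Eol)
                    {
                        // End of field; EOL additionally ends the record.
                        values.Add(value.ToString());
                        value.Clear();
                        if (c == Eol) return values;
                        state = ParseState.Initial;
                    }
                    else if (c == Quote)
                    {
                        if (state == ParseState.Data)
                            throw new InvalidDataException("Improper quotes");
                        state = ParseState.Quote;
                    }
                    else
                    {
                        value.Append(c);
                        state = ParseState.Data;
                    }
                    break;

                case ParseState.Quote:
                    // Inside quotes, commas and EOL characters are plain data.
                    if (c == Quote) state = ParseState.QuoteInQuote;
                    else value.Append(c);
                    break;

                case ParseState.QuoteInQuote:
                    if (c == Quote)
                    {
                        // Assumption: a doubled quote is a literal quote.
                        value.Append(Quote);
                        state = ParseState.Quote;
                    }
                    else if (c == Comma || c == Eol)
                    {
                        values.Add(value.ToString());
                        value.Clear();
                        if (c == Eol) return values;
                        state = ParseState.Initial;
                    }
                    else
                    {
                        throw new InvalidDataException("Improper quotes");
                    }
                    break;
            }
        }

        // No EOL seen: either the record is incomplete or a quote is unclosed.
        throw new InvalidDataException("Record not terminated by EOL");
    }
}
```

With the terminator handled inside the loop, no trailing `switch` is needed after it. The cost is that every caller must append the EOL, and input that never reaches one is rejected.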

If the full length of `line` is expected to be the complete CSV record, then it's fine to not terminate it with an explicit EOL and handle it in the extra `switch` as you did.
Context
StackExchange Code Review Q#91694, answer score: 2