HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Parser for tab-delimited data as a subclass of StringReader

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
delimitedparsertabstringreadersubclassfordata

Problem

I wanted to parse lines of text as tab-delimited items, and I had just been using a StringReader for something else and I came up with this:

class TabDelimitedFieldReader : StringReader
    {
        private string _nextLine;
        public TabDelimitedFieldReader(string s)
            : base(s)
        {
        }

        public IEnumerable ReadFields()
        {
            if (_nextLine != null)
            {
                var fields = _nextLine.Split('\t');
                foreach (string field in fields)
                {
                    yield return field;
                }
            }
        }

        public bool HasMore
        {
            get
            {
                _nextLine = this.ReadLine();
                return (_nextLine != null);
            }
        }
    }


I can't think of another time I directly subclasses a .NET framework class like this. I figure maybe I'm just strange and it's normal for others, or maybe there's a reason I wouldn't want to do it.

Solution

Conceptually, I do not see anything inherently wrong with subclassing a framework object, particularly since many of them provide virtual/abstract members specifically to allow it.

In this case, it may be better to wrap a TextReader object instead of inheriting from StringReader. This would allow you to provide tab delimited reading for multiple data sources instead of limiting yourself to in-memory strings. For example, you could still pass in a StringReader for reading strings, but you could also pass in a StreamReader if you wanted to support reading from files instead.

Additionally, I noticed that HasMore is dangerous to call due to its side-effect. It is not immediately obvious to a caller that checking HasMore would alter the reader's current position. I would suggest using Peek instead.

You would then need to update _nextLine elsewhere. You could have it be a side-effect of calling ReadFields. Or, perhaps better, you could add another method (e.g., AdvanceLine) that is simply responsible for moving the reader to the next line.

Context

StackExchange Code Review Q#8818, answer score: 3

Revisions (0)

No revisions yet.