Parsing data from text with repeating blocks

Submitted by: @import:stackexchange-codereview

Problem

I am parsing the responses from our Varnish load balancers in order to monitor the status of various nodes. One of the responses we get is text consisting of multiple blocks of data, one for each server in the load balancer.

This is one such block:

Backend web05 is Healthy
Current states good: 10 threshold: 8 window: 10
Average responsetime of good probes: 0.010285
Oldest Newest
================================================================
4444444444444444444444444444444444444444444444444444444444444444 Good IPv4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Good Xmit
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR Good Recv
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHH-HHHHHHH---HHHHHHHH Happy

That block is repeated (with no specific delimiters) for every backend. Each line is a string in a List<string> holding all the lines of the response.

Each Node represents a Varnish server, to which I have added the ability to pull back the above health status message.

The parser accepts a List<Backend> that will be populated or updated with all the backends found, and a List<string> containing all the lines received from the server. Backend is a POCO.

Here are the two classes required to test, followed by a small sample program that exercises them. No external libraries are required, and I have removed all the actual remote network calls and connection processes, as these are not pertinent to my text-parsing code. The test program contains a sample of the response from the production services.

Main Classes

```
namespace VarnishTest
{
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    public class Backend
    {
        public string Name { get; set; }
        public string Status { get; set; }
        public int TotalCount { get; set; }
        public int OkCount { get; set; }
        public float
```

Solution

Your methods are too big and are doing too much. ParseDebugListResponse is responsible for finding the backend boundaries, splitting the response into backend objects, determining what each line is, and parsing each line. If you don't know about the Single-Responsibility Principle, you should check it out.

A solution needs to do a few things. Let's look first at how to split all of the input lines into blocks of Backend information:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class MyNodeCollection : IEnumerable<Backend>
{
    private readonly List<Backend> _backends;

    public MyNodeCollection(List<string> list)
    {
        // Cut the response into one chunk of lines per backend.
        this._backends = new List<Backend>(this.FindBoundaries(list)
            .Select(s => new Backend(
                list
                .Skip(s.Item1)
                .Take(s.Item2 - s.Item1)
                .ToList())));
    }

    private IEnumerable<Tuple<int, int>> FindBoundaries(List<string> list)
    {
        // Pair each start index with the index where that block stops.
        return FindStarts(list)
            .Zip(FindEnds(list), (start, end) => Tuple.Create(start, end));
    }

    private IEnumerable<int> FindStarts(List<string> list)
    {
        // Indexes of the lines that open a backend block.
        return list
            .Select((s, i) => new { s, i })
            .Where(w => w.s.StartsWith("Backend "))
            .Select(ss => ss.i);
    }

    private IEnumerable<int> FindEnds(List<string> list)
    {
        // A block stops where the next one starts; the last stops at list.Count.
        return this.FindStarts(list)
            .Skip(1)
            .Concat(new List<int> { list.Count });
    }

    public IEnumerator<Backend> GetEnumerator()
    {
        return this._backends.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this._backends.GetEnumerator();
    }
}


I've split the functionality into single-statement methods, making liberal use of LINQ extension methods.

  • In FindStarts, all I'm doing is returning the indexes of lines that start with "Backend ". Each of those is the first line of a block of Backend data.

  • FindEnds is slightly more complicated in that I want to return the index where each block of Backend data stops. That's either the index of the next "Backend " line, or the end of the list of strings for the final block.

  • In FindBoundaries, I need to combine the start indexes and the end indexes into tuples of indexes. I can do this using the little-used Zip method.

  • Finally, in the constructor I take the tuples of boundary indexes and use them to split the given list into chunks of Backend data.

  • Note that I have MyNodeCollection implement IEnumerable<Backend> so that your testing code will work without modification.
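
To make the index arithmetic concrete, here is a minimal, self-contained sketch of the same start/end pairing run against a made-up six-line response. The BoundaryDemo class and the sample lines are illustrative assumptions, not part of the original code. The end index in this sketch is exclusive: it is the index of the next "Backend " line, or lines.Count for the last block.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoundaryDemo
{
    // Hypothetical helper: returns (start, exclusive end) index pairs, one per block.
    public static List<Tuple<int, int>> FindBoundaries(List<string> lines)
    {
        // Index of every line that opens a block.
        List<int> starts = lines
            .Select((text, index) => new { text, index })
            .Where(x => x.text.StartsWith("Backend "))
            .Select(x => x.index)
            .ToList();

        // Each block ends where the next one starts; the last ends at lines.Count.
        IEnumerable<int> ends = starts.Skip(1).Concat(new[] { lines.Count });

        return starts.Zip(ends, (start, end) => Tuple.Create(start, end)).ToList();
    }

    public static void Main()
    {
        // Made-up sample data, shortened to three lines per backend.
        var lines = new List<string>
        {
            "Backend web01 is Healthy",
            "Current states good: 10 threshold: 8 window: 10",
            "Average responsetime of good probes: 0.010285",
            "Backend web02 is Sick",
            "Current states good: 2 threshold: 8 window: 10",
            "Average responsetime of good probes: 0.250000",
        };

        foreach (var boundary in FindBoundaries(lines))
        {
            // Prints "0..3" and then "3..6".
            Console.WriteLine("{0}..{1}", boundary.Item1, boundary.Item2);
        }
    }
}
```

With an exclusive end, Skip(start).Take(end - start) yields exactly the lines of one block, which is the slicing the constructor performs.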



I've changed your Backend class to something that more closely follows the Single-Responsibility Principle. Most of the methods are only one line long. They all have meaningful names.

```
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Backend
{
    public string Name { get; private set; }
    public string Status { get; private set; }
    public int TotalCount { get; private set; }
    public int OkCount { get; private set; }
    public float AvgResponse { get; private set; }
    public List<bool> GoodIPv4 { get; private set; }
    public List<bool> GoodXmit { get; private set; }
    public List<bool> GoodRecv { get; private set; }
    public List<bool> ErrorRecv { get; private set; }
    public List<bool> Happy { get; private set; }

    public Backend(List<string> list)
    {
        this.GoodIPv4 = new List<bool>();
        this.GoodXmit = new List<bool>();
        this.GoodRecv = new List<bool>();
        this.ErrorRecv = new List<bool>();
        this.Happy = new List<bool>();

        foreach (var line in list)
        {
            this.ParseLine(line);
        }
    }

    // Decide what kind of line this is, then hand it to the matching parser.
    private void ParseLine(string line)
    {
        if (this.IsAverageResponseTime(line))
            this.AvgResponse = this.ParseAverageResponseTime(line);

        else if (this.IsErrorRecv(line))
            this.ErrorRecv.AddRange(this.ParseGoodErrorHappy(line, 'X'));

        else if (this.IsGoodIPv4Count(line))
            this.GoodIPv4.AddRange(this.ParseGoodErrorHappy(line, '4'));

        else if (this.IsGoodRecv(line))
            this.GoodRecv.AddRange(this.ParseGoodErrorHappy(line, 'R'));

        else if (this.IsGoodXmit(line))
            this.GoodXmit.AddRange(this.ParseGoodErrorHappy(line, 'X'));

        else if (this.IsHappy(line))
            this.Happy.AddRange(this.ParseGoodErrorHappy(line, 'H'));

        else if (this.IsName(line))
            this.Name = this.ParseName(line);
    }

    private string ParseName(string arg)
    {
        return Regex.Replace(arg, @"Backend (\S*) is .*", "$1");
    }

    private bool IsName(string arg)
    {
        return arg.StartsWith("Backend ");
    }

    private float ParseAverageResponseTime(string arg)
    {
        string str = Regex.Replace(arg,
            @"Average responsetime of good probes: (.*)", "$1");
        return float.Parse(str);
    }

    private bool IsAverageResponseTime(string arg)
    {
        return arg.StartsWith("Average responsetime of good probes: ");
    }

    // Map the first 64 characters of a probe line to true where they equal ch.
    private IEnumerable<bool> ParseGoodErrorHappy(string arg, char ch)
    {
        return arg.Where((w, i) => i < 64).Select(s => s == ch);
    }

    private bool IsGoodIPv4Count(string arg)
    {
        return arg.EndsWith(" Good IPv4");
    }

    private bool IsGoodXmit(string arg)
    {
        return arg.EndsWith(" Good Xmit");
    }

    private bool IsGoodRecv(string arg)
    {
        return arg.EndsWith(" Good Recv");
    }

    private bool IsErrorRecv(string arg)
    {
        return arg.EndsWith(" Error Recv");
    }

    private bool IsHappy(string arg)
    {
        return arg.EndsWith(" Happy");
    }
}
```

Code Snippets

public class MyNodeCollection : IEnumerable<Backend>
{
    private readonly List<Backend> _backends;

    public MyNodeCollection(List<string> list)
    {
        this._backends = new List<Backend>(this.FindBoundaries(list)
            .Select(s => new Backend(
                list
                .Skip(s.Item1)
                .Take(s.Item2 - s.Item1)
                .ToList())));
    }

    private IEnumerable<Tuple<int, int>> FindBoundaries(List<string> list)
    {
        return FindStarts(list)
            .Zip(FindEnds(list), (start, end) => Tuple.Create(start, end));
    }

    private IEnumerable<int> FindStarts(List<string> list)
    {
        return list
            .Select((s, i) => new { s, i })
            .Where(w => w.s.StartsWith("Backend "))
            .Select(ss => ss.i);
    }

    private IEnumerable<int> FindEnds(List<string> list)
    {
        // A block stops where the next one starts; the last stops at list.Count.
        return this.FindStarts(list)
            .Skip(1)
            .Concat(new List<int> { list.Count });
    }

    public IEnumerator<Backend> GetEnumerator()
    {
        return this._backends.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this._backends.GetEnumerator();
    }
}
public class Backend
{
    public string Name { get; private set; }
    public string Status { get; private set; }
    public int TotalCount { get; private set; }
    public int OkCount { get; private set; }
    public float AvgResponse { get; private set; }
    public List<bool> GoodIPv4 { get; private set; }
    public List<bool> GoodXmit { get; private set; }
    public List<bool> GoodRecv { get; private set; }
    public List<bool> ErrorRecv { get; private set; }
    public List<bool> Happy { get; private set; }

    public Backend(List<string> list)
    {
        this.GoodIPv4 = new List<bool>();
        this.GoodXmit = new List<bool>();
        this.GoodRecv = new List<bool>();
        this.ErrorRecv = new List<bool>();
        this.Happy = new List<bool>();

        foreach (var line in list)
        {
            this.ParseLine(line);
        }
    }

    private void ParseLine(string line)
    {
        if (this.IsAverageResponseTime(line))
            this.AvgResponse = this.ParseAverageResponseTime(line);

        else if (this.IsErrorRecv(line))
            this.ErrorRecv.AddRange(this.ParseGoodErrorHappy(line, 'X'));

        else if (this.IsGoodIPv4Count(line))
            this.GoodIPv4.AddRange(this.ParseGoodErrorHappy(line, '4'));

        else if (this.IsGoodRecv(line))
            this.GoodRecv.AddRange(this.ParseGoodErrorHappy(line, 'R'));

        else if (this.IsGoodXmit(line))
            this.GoodXmit.AddRange(this.ParseGoodErrorHappy(line, 'X'));

        else if (this.IsHappy(line))
            this.Happy.AddRange(this.ParseGoodErrorHappy(line, 'H'));

        else if (this.IsName(line))
            this.Name = this.ParseName(line);
    }

    private string ParseName(string arg)
    {
        return Regex.Replace(arg, @"Backend (\S*) is .*", "$1");
    }

    private bool IsName(string arg)
    {
        return arg.StartsWith("Backend ");
    }

    private float ParseAverageResponseTime(string arg)
    {
        string str = Regex.Replace(arg, 
            @"Average responsetime of good probes: (.*)", "$1");
        return float.Parse(str);
    }

    private bool IsAverageResponseTime(string arg)
    {
        return arg.StartsWith("Average responsetime of good probes: ");
    }

    private IEnumerable<bool> ParseGoodErrorHappy(string arg, char ch)
    {
        return arg.Where((w, i) => i < 64).Select(s => s == ch);
    }

    private bool IsGoodIPv4Count(string arg)
    {
        return arg.EndsWith(" Good IPv4");
    }

    private bool IsGoodXmit(string arg)
    {
        return arg.EndsWith(" Good Xmit");
    }

    private bool IsGoodRecv(string arg)
    {
        return arg.EndsWith(" Good Recv");
    }

    private bool IsErrorRecv(string arg)
    {
        return arg.EndsWith(" Error Recv");
    }

    private bool IsHappy(string arg)
    {
        return arg.EndsWith(" Happy");
    }
}
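
The Backend class above declares Status, OkCount and TotalCount, but ParseLine never assigns them. The following is a rough sketch of how the two remaining header lines could be handled in the same one-small-method-per-job style. The HeaderParser class and its method names are hypothetical, and mapping "good:" to OkCount and "window:" to TotalCount is an assumption about what those properties are meant to hold. The sketch also passes CultureInfo.InvariantCulture when parsing numbers, because float.Parse(str) uses the current culture and can misread "0.010285" on a machine whose decimal separator is a comma.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class HeaderParser
{
    // Matches lines such as "Backend web05 is Healthy".
    private static readonly Regex NameLine =
        new Regex(@"^Backend (\S+) is (\S+)");

    // Matches lines such as "Current states good: 10 threshold: 8 window: 10".
    private static readonly Regex StatesLine =
        new Regex(@"^Current states\s+good:\s*(\d+)\s+threshold:\s*(\d+)\s+window:\s*(\d+)");

    public static bool TryParseNameLine(string line, out string name, out string status)
    {
        Match m = NameLine.Match(line);
        name = m.Success ? m.Groups[1].Value : null;
        status = m.Success ? m.Groups[2].Value : null;
        return m.Success;
    }

    // Assumption: "good" is the count of passing probes, "window" the total probes considered.
    public static bool TryParseStatesLine(string line, out int good, out int window)
    {
        Match m = StatesLine.Match(line);
        good = m.Success ? int.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture) : 0;
        window = m.Success ? int.Parse(m.Groups[3].Value, CultureInfo.InvariantCulture) : 0;
        return m.Success;
    }

    public static float ParseAverageResponseTime(string line)
    {
        const string prefix = "Average responsetime of good probes: ";
        // Culture-invariant parse, so "." is always the decimal separator.
        return float.Parse(line.Substring(prefix.Length), CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        string name, status;
        int good, window;

        if (TryParseNameLine("Backend web05 is Healthy", out name, out status))
            Console.WriteLine(name + " / " + status);      // prints "web05 / Healthy"

        if (TryParseStatesLine("Current states good: 10 threshold: 8 window: 10", out good, out window))
            Console.WriteLine(good + " of " + window);     // prints "10 of 10"

        Console.WriteLine(ParseAverageResponseTime(
            "Average responsetime of good probes: 0.010285") > 0f);
    }
}
```

Inside Backend, these would become two more Is.../Parse... pairs wired into ParseLine, keeping each method responsible for a single kind of line.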

Context

StackExchange Code Review Q#118641, answer score: 4
