HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpModerate

Look 'ma, I can read code

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
codecanlookread

Problem

I have put together a naive implementation of a VB6/VBA parser, and I'd like to see if the CR community sees the same things as I see can be improved with this code, before I start refactoring.

I've never written a parser, so this naive implementation is naive in all senses: I wanted to push it as far as I could possibly go, until I stopped to refactor - so here I am with a CodeFileParser that does everything except recursively parse code blocks; the actual language syntax isn't yet processed at this point, but the code still produces an ISyntaxTree that models the entire code file.

public interface ISyntaxTree
{
    string Name { get; }
    IEnumerable Attributes { get; }
    IList Nodes { get; }
}


For example I can give it a List class (here modified to implement some IEnumerable interface), and with a little HierarchicalDataTemplate in a sandbox WPF app (which probably could also use a peer review!), I get this.. which is already awesome:

So when we parse Public Property Get Item(index As Long) As Variant, it's broken down like this:

  • PropertyNode ("Item (Get)")



  • IdentifierNode ("Item")



  • ReferenceNode ("Variant")



  • ParameterNode ("index")



  • IdentifierNode ("index")



  • ReferenceNode ("Long")



  • CodeBlockNode



  • (more CodeBlockNodes)



So an IdentifierNode basically represents an identifier; I'm self-debating whether an IdentifierNode should have a Scope property, or if an identifier's scope should be inferred from its position in the tree. Anything that has a name, should have an IdentifierNode to store it: method names, class names, variables, constants, everything.

An identifier that has an explicit type specified, will have a ReferenceNode under it. A ReferenceNode represents a usage of a type, either built-in (String, Integer, etc.) or pointing to an IdentifierNode's Name. For functions and property getters (i.e. members with a return value), the ReferenceNode under the member's `IdentifierN

Solution

In ParseFileHeader you repeat yourself a lot. Trim after every content[...]. You can Trim using loop

for (int i = 0; i < content.Length; i++)
        content[i] = content[i].Trim();


This code...

attributes.Add(attributeParser.Parse(content[8].Trim()));
        attributes.Add(attributeParser.Parse(content[9].Trim()));
        attributes.Add(attributeParser.Parse(content[10].Trim()));
        attributes.Add(attributeParser.Parse(content[11].Trim()));
        attributes.Add(attributeParser.Parse(content[12].Trim()));


... can be writen as

foreach (var index in new[] { 8,9,10,11,12)
            attributes.Add(attributeParser.Parse(content[index])


ParseMembers is too massive. Split it into several methods. You have

private IEnumerable ParseMembers(string[] content, ref int currentLine)
{
    ....

    if (match.Success)
    {
        // Wall of code
    }
    else
    {
        // Wall of code
    }
}


Create KeywordMatched, KeywordNotMatched and thanks to it you splitted your wall of code to two smallers methods. It's way easier to read. Then go to KeywordMatched

if (!new[] { ReservedKeywords.Sub, ReservedKeywords.Function, ReservedKeywords.Property }.Contains(modifier))
{
    // Wall of code
}
else
{
    // Wall of code
}


And repeat process. Create ParseSubOrFunctionOrProperty and ParseOther or whateva meaningful name. After while you will have code where there are many small function with easy to read names.

EDIT: One more thing

var pattern = @"((?Public|Private|Friend)...";
var regex = new Regex(pattern);


Use better names. What is pattern? correctKeyword or what?

Code Snippets

for (int i = 0; i < content.Length; i++)
        content[i] = content[i].Trim();
attributes.Add(attributeParser.Parse(content[8].Trim()));
        attributes.Add(attributeParser.Parse(content[9].Trim()));
        attributes.Add(attributeParser.Parse(content[10].Trim()));
        attributes.Add(attributeParser.Parse(content[11].Trim()));
        attributes.Add(attributeParser.Parse(content[12].Trim()));
foreach (var index in new[] { 8,9,10,11,12)
            attributes.Add(attributeParser.Parse(content[index])
private IEnumerable<ISyntaxTree> ParseMembers(string[] content, ref int currentLine)
{
    ....

    if (match.Success)
    {
        // Wall of code
    }
    else
    {
        // Wall of code
    }
}
if (!new[] { ReservedKeywords.Sub, ReservedKeywords.Function, ReservedKeywords.Property }.Contains(modifier))
{
    // Wall of code
}
else
{
    // Wall of code
}

Context

StackExchange Code Review Q#55805, answer score: 12

Revisions (0)

No revisions yet.