Rubberduck VBA Parser, Episode VI: Return of the Abstraction
Problem
VBA comment syntax is fun... and VBA line continuation makes it even more fun.
Picture a VBA module like this:

```
Rem this is an old-style comment.
' this is a more standard comment
Rem this _
is _
a _
multiline _
comment

Private Sub Foo() ' this _
is _
also _
a _
multiline _
comment _
_
...don't do this at home.
End Sub

'@TestMethod
Private Sub Bar()
    ' todo: call Foo
End Sub
```

(no wonder syntax highlighting is getting confused!)
If you don't know what Rubberduck is: Rubberduck is a COM add-in for the VBE / VBA's IDE that I'm building with ...@RubberDuck. I have a branch where I've burned the whole parser namespace and replaced it with ANTLR-generated code.

The only problem is that the .g4 VB6 grammar file I'm using to generate the parser does not support comments. So I ended up [re-]inserting an abstraction layer between ANTLR's `IParseTree` and the rest of Rubberduck, albeit very differently this time.

I added two methods to the `IRubberduckParser` interface:

```
/// <summary>
/// Parses all code modules in specified project.
/// </summary>
/// <returns>Returns an IParseTree for each code module in the project; the qualified module name being the key.</returns>
IEnumerable<VbModuleParseResult> Parse(VBProject vbProject);
IEnumerable<CommentNode> ParseComments(VBComponent vbComponent);
```

The `VbModuleParseResult` class encapsulates a module's `IParseTree` and its `CommentNode`s:

```
public class VbModuleParseResult
{
    public VbModuleParseResult(QualifiedModuleName qualifiedName, IParseTree parseTree, IEnumerable<CommentNode> comments)
    {
        _qualifiedName = qualifiedName;
        _parseTree = parseTree;
        _comments = comments;
    }

    private readonly QualifiedModuleName _qualifiedName;
    public QualifiedModuleName QualifiedName { get { return _qualifiedName; } }

    private IParseTree _parseTree;
    public IParseTree ParseTree { get { return _parseTree; } }

    private IEnumerable<CommentNode> _comments;
    public IEnumerable<CommentNode> Comments { get { return _comments; } }
}
```
Solution
Let's tackle this piece of code, shall we?
```
public IEnumerable<CommentNode> ParseComments(VBComponent component)
{
    var code = component.CodeModule.Code();
    var qualifiedName = new QualifiedModuleName(component.Collection.Parent.Name, component.Name);

    var commentBuilder = new StringBuilder();
    var continuing = false;

    var startLine = 0;
    var startColumn = 0;

    for (var i = 0; i < code.Length; i++)
    {
        var line = code[i];
        var index = 0;

        if (continuing || line.HasComment(out index))
        {
            startLine = continuing ? startLine : i;
            startColumn = continuing ? startColumn : index;

            var commentLength = line.Length - index;

            continuing = line.EndsWith("_");
            if (!continuing)
            {
                commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
                var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

                var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
                commentBuilder.Clear();

                yield return result;
            }
            else
            {
                // ignore line continuations in comment text:
                commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
            }
        }
    }
}
```

Alright, first thing I see is some duplication:
```
continuing = line.EndsWith("_");
if (!continuing)
{
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
    var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

    var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
    commentBuilder.Clear();

    yield return result;
}
else
{
    // ignore line continuations in comment text:
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
}
```

`commentBuilder.Append(line.Substring(index, commentLength).TrimStart());` is duplicated.

So let's remove it.
```
continuing = line.EndsWith("_");
commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
if (!continuing)
{
    var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

    var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
    commentBuilder.Clear();

    yield return result;
}
```

Additionally, there's some duplication here...
```
if (continuing || line.HasComment(out index))
{
    startLine = continuing ? startLine : i;
    startColumn = continuing ? startColumn : index;

    var commentLength = line.Length - index;

    continuing = line.EndsWith("_");
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
    if (!continuing)
    {
        var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

        var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
        commentBuilder.Clear();

        yield return result;
    }
}
```

`continuing` is checked three times between changes. That's a bit of a waste.

Maybe we can fix it?
```
if (continuing || line.HasComment(out index))
{
    if(!continuing){
        startLine = i;
        startColumn = index;
    }

    var commentLength = line.Length - index;

    continuing = line.EndsWith("_");
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
    if (!continuing)
    {
        var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

        var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
        commentBuilder.Clear();

        yield return result;
    }
}
```

Hmm, we're still checking it twice... and `if (a || b) { if(!a)` structures are messy. I wonder if there's something we can do about that?

Invert it, maybe? (if `A OR B` is true then `NOT A` implies `B`)

```
if (continuing || line.HasComment(out index))
{
    if(line.HasComment(out index)){
        startLine = i;
        startColumn = index;
    }
```

Hmm...
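One way to see why the inverted check gives pause: inside the outer `if`, `NOT A` implies `B`, but `B` does not imply `NOT A`. The small sketch below enumerates the cases; `InversionCheck` and `SameBranch` are hypothetical names made up for this illustration, not part of Rubberduck.

```csharp
using System;

public static class InversionCheck
{
    // Inside `if (a || b)`, reports whether testing `b` picks the same
    // branch as testing `!a` would have.
    public static bool SameBranch(bool a, bool b)
    {
        if (!(a || b))
        {
            throw new ArgumentException("The outer condition is false, so the inner check is never reached.");
        }

        return (!a) == b;
    }
}
```

`SameBranch(false, true)` and `SameBranch(true, false)` agree, but `SameBranch(true, true)` does not. Here `A` is `continuing` and `B` is `line.HasComment(out index)`, so, assuming `HasComment` reports any apostrophe outside a string literal, a continuation line such as `...don't do this at home.` would reset the start position in the inverted version.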
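Here's a batshit crazy idea.

Put a for loop inside the current for loop to repeat the bits you need.

```
for (var i = 0; i < code.Length; i++)
{
    var line = code[i];
    var index = 0;

    if (line.HasComment(out index))
    {
        startLine = i;
        startColumn = index;

        //multiline comment forloop...
        for (; i < code.Length; i++)
        {
            line = code[i];
            var commentLength = line.Length - index;
            commentBuilder.Append(line.Substring(index, commentLength).TrimStart());

            // continuation lines are comment text from their first column:
            index = 0;

            if(!line.EndsWith("_"))
            {
                break;
            }
        }

        var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

        var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
        commentBuilder.Clear();

        yield return result;
    }
}
```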
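To try the nested-loop idea outside Rubberduck, here is a minimal, self-contained sketch that works on plain strings. Everything in it is an assumption made for illustration: `CommentScanner` and `Scan` are hypothetical names, the tuple stands in for `CommentNode` and `Selection`, and this `HasComment` is a simplified guess at what the real extension method does (it only recognizes `Rem` at the start of a line and apostrophes outside string literals).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CommentScanner
{
    // Hypothetical stand-in for the HasComment extension used above.
    public static bool HasComment(this string line, out int index)
    {
        // "Rem" is only recognized at the start of the line (a simplification).
        index = line.Length - line.TrimStart().Length;
        var rest = line.Substring(index);
        if (rest.Equals("Rem", StringComparison.OrdinalIgnoreCase) ||
            rest.StartsWith("Rem ", StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }

        // Otherwise look for the first apostrophe that is not inside a string literal.
        var inString = false;
        for (index = 0; index < line.Length; index++)
        {
            if (line[index] == '"') { inString = !inString; }
            else if (line[index] == '\'' && !inString) { return true; }
        }

        index = 0;
        return false;
    }

    // Yields one entry per comment, with zero-based line and column positions.
    public static IEnumerable<(int StartLine, int StartColumn, int EndLine, string Text)> Scan(string[] code)
    {
        var builder = new StringBuilder();
        for (var i = 0; i < code.Length; i++)
        {
            int index;
            if (!code[i].HasComment(out index)) { continue; }

            var startLine = i;
            var startColumn = index;

            // Inner loop: consume continuation lines until one does not end with "_".
            for (; i < code.Length; i++)
            {
                var line = code[i];
                builder.Append(line.Substring(index).TrimStart());
                index = 0; // continuation lines are comment text from column 0
                if (!line.EndsWith("_")) { break; }
            }

            // If the module ends on a continuation, clamp to the last line.
            var endLine = Math.Min(i, code.Length - 1);
            yield return (startLine, startColumn, endLine, builder.ToString());
            builder.Clear();
        }
    }
}
```

Calling `CommentScanner.Scan(lines)` on the sample module from the question, split into one string per line, should yield six comments, with the `Rem this _` block and the trailing comment on `Private Sub Foo()` each reported as a single multi-line entry. Like the original code, the sketch leaves the `_` continuation markers in the collected text.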
Code Snippets

```
public IEnumerable<CommentNode> ParseComments(VBComponent component)
{
    var code = component.CodeModule.Code();
    var qualifiedName = new QualifiedModuleName(component.Collection.Parent.Name, component.Name);

    var commentBuilder = new StringBuilder();
    var continuing = false;

    var startLine = 0;
    var startColumn = 0;

    for (var i = 0; i < code.Length; i++)
    {
        var line = code[i];
        var index = 0;

        if (continuing || line.HasComment(out index))
        {
            startLine = continuing ? startLine : i;
            startColumn = continuing ? startColumn : index;

            var commentLength = line.Length - index;

            continuing = line.EndsWith("_");
            if (!continuing)
            {
                commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
                var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

                var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
                commentBuilder.Clear();

                yield return result;
            }
            else
            {
                // ignore line continuations in comment text:
                commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
            }
        }
    }
}
```

```
continuing = line.EndsWith("_");
if (!continuing)
{
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
    var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

    var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
    commentBuilder.Clear();

    yield return result;
}
else
{
    // ignore line continuations in comment text:
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
}
```

```
continuing = line.EndsWith("_");
commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
if (!continuing)
{
    var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

    var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
    commentBuilder.Clear();

    yield return result;
}
```

```
if (continuing || line.HasComment(out index))
{
    startLine = continuing ? startLine : i;
    startColumn = continuing ? startColumn : index;

    var commentLength = line.Length - index;

    continuing = line.EndsWith("_");
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
    if (!continuing)
    {
        var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

        var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
        commentBuilder.Clear();

        yield return result;
    }
}
```

```
if (continuing || line.HasComment(out index))
{
    if(!continuing){
        startLine = i;
        startColumn = index;
    }

    var commentLength = line.Length - index;

    continuing = line.EndsWith("_");
    commentBuilder.Append(line.Substring(index, commentLength).TrimStart());
    if (!continuing)
    {
        var selection = new Selection(startLine + 1, startColumn + 1, i + 1, line.Length);

        var result = new CommentNode(commentBuilder.ToString(), new QualifiedSelection(qualifiedName, selection));
        commentBuilder.Clear();

        yield return result;
    }
}
```

Context
StackExchange Code Review Q#79532, answer score: 7