pattern, csharp, Minor
Rubberduck VBA Parser, Episode IV: A New Hope
new, episode, parser, hope, vba, rubberduck
Problem
My home-made wannabe-a-parser was wet behind the ears, so I started seriously looking into more mature alternatives. I ended up adding a dependency on the ANTLR lexer/parser generator tool; using it along with a slightly modified Visual Basic 6 grammar, I was able to generate the real deal - a full-blown parser that actually understands [almost] everything there is to understand about VBA code.
Here's how I'm exposing it to the rest of Rubberduck:

```
public interface IRubberduckParser
{
    /// <summary>
    /// Parses specified code into a code tree.
    /// </summary>
    /// <param name="projectName">The name of the VBA project the code belongs to.</param>
    /// <param name="componentName">The name of the VBA component (module) the code belongs to.</param>
    /// <param name="code">The VBA code fragment to be parsed.</param>
    /// <returns></returns>
    Node Parse(string projectName, string componentName, string code);
}
```

It's implemented like this:

```
namespace Rubberduck.VBA
{
    public class VBParser : IRubberduckParser
    {
        public Node Parse(string projectName, string componentName, string code)
        {
            var result = ParseInternal(code);
            var walker = new ParseTreeWalker();
            var listener = new VBTreeListener(projectName, componentName);
            walker.Walk(listener, result);
            return listener.Root;
        }

        private IParseTree ParseInternal(string code)
        {
            var input = new AntlrInputStream(code);
            var lexer = new VisualBasic6Lexer(input);
            var tokens = new CommonTokenStream(lexer);
            var parser = new VisualBasic6Parser(tokens);
            return parser.startRule();
        }
    }
}
```

The generated IParseTree can't quite be passed to the outside world as is - it's ugly generated code, with methods that essentially match grammar rules 1:1... Implementing VBA code inspections with this API would have been a nightmare. So I wrote this Node class:

```
namespace Rubberduck.VBA.Nodes
{
    /// <summary>
    /// The base class for all nodes in a Rubberd
```
Solution
This is a big improvement over that last post! It's so great that it makes my eyes sparkle and brings me joy.

I think you are making the right use of the keyword partial here. This may be my personal taste, but I don't really like to see #region directives in the code.

The only thing I see you could improve in your code at this moment is to extract these two lines:

```
_currentNode = CreateProcedureNode(context);
_currentScope = _project + "." + _module + "." + ((ProcedureNode)_currentNode).Name;
```

into a new method, like you have done with AddCurrentMember:

```
private void CreateNodeFromContext(dynamic context)
{
    var node = CreateProcedureNode(context); // I preferred to store this in a local variable to avoid the cast
    _currentNode = node;
    _currentScope = _project + "." + _module + "." + node.Name;
}
```

EDIT:

To address your comment: I don't really see how the nodes could be a bother. You already took care to create the Node class to ease your code inspections, as you stated. I also like having a class that represents each element needed in the tree.

One thing you could do, though, is remove the context field from the specific classes, since you already have it in the Node class. In the subclass you could have a property whose implementation casts the base class's context to the suitable type:

```
public abstract class Node
{
    protected readonly ParserRuleContext _context;
    //...
}

public class EnumNode : Node
{
    private VisualBasic6Parser.EnumerationStmtContext Context
    {
        get { return (VisualBasic6Parser.EnumerationStmtContext)_context; }
    }
    //...
}
```

It doesn't bring much benefit, but well, it's always worth saving 4 or 8 bytes of memory per object (one reference, depending on whether the process is 32-bit or 64-bit)...
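To make that last suggestion concrete, here is a minimal, self-contained sketch that compiles without ANTLR. ParserRuleContext and EnumerationStmtContext below are simplified stand-ins for the generated types, and the Identifier property, the constructors and the Name property are assumptions of mine for illustration, not something taken from your code:

```csharp
using System;

// Stand-ins for the ANTLR runtime / generated types (hypothetical, illustration only).
public class ParserRuleContext { }

public class EnumerationStmtContext : ParserRuleContext
{
    // Hypothetical member; the real generated context exposes its own accessors.
    public string Identifier { get; set; }
}

public abstract class Node
{
    // The only stored reference to the parse-tree context.
    protected readonly ParserRuleContext _context;

    protected Node(ParserRuleContext context)
    {
        if (context == null) throw new ArgumentNullException("context");
        _context = context;
    }
}

public class EnumNode : Node
{
    // Accepting only EnumerationStmtContext here is what keeps the cast below safe.
    public EnumNode(EnumerationStmtContext context) : base(context) { }

    // No second field in the subclass: the typed view is a cast of the base reference.
    private EnumerationStmtContext Context
    {
        get { return (EnumerationStmtContext)_context; }
    }

    public string Name
    {
        get { return Context.Identifier; }
    }
}
```

With these stand-ins, new EnumNode(new EnumerationStmtContext { Identifier = "Colors" }).Name returns "Colors". The trade-off is a cast on every access in exchange for one less field per node, so, as I said, it is a small win at best.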
Context
StackExchange Code Review Q#78390, answer score: 3