patterncsharpMinor
I think I might be having a regex nightmare
Viewed 0 times
mightthinkhavingregexnightmare
Problem
I've been working on the Rubberduck (an add-in for the VBA Editor /"VBE"), specifically here the
Now, armed with this... thing:
WARNING
Faint of heart, beware. The mere sight of these regex patterns is known to have caused nausea in at least, uh, one, somewhat documented case.
```
[ComVisible(false)]
public static class VBAGrammar
{
private static string IdentifierSyntax { get { return @"(?(?:[a-zA-Z][a-zA-Z0-9_])|(?:\[[a-zA-Z0-9_]\]))"; } }
private static string ReferenceSyntax { get { return @"(?:(?(?:(?:(?[a-zA-Z][a-zA-Z0-9_]))\.))?" + IdentifierSyntax + ")"; } }
///
/// Finds all implementations of in the Rubberduck assembly.
///
///
public static IEnumerable GetGrammarSyntax()
{
return Assembly.GetExecutingAssembly()
.GetTypes()
.Where(type => type.BaseType == typeof(SyntaxBase))
VBA.Parser namespace. Here's how I ended up implementing the syntax part.[ComVisible(false)]
public interface ISyntax
{
///
/// Parses an instruction into a syntax node, if possible.
///
/// The fully-qualified scope of the specified instruction, when the instruction is publicly scoped..
/// The fully-qualified scope of the specified instruction, when the instruction is locally scoped.
/// An instruction.
///
/// Returns a node representing the specified instruction,
/// or null if specified instruction can't be parsed.
///
SyntaxTreeNode Parse(string publicScope, string localScope, Instruction instruction);
bool IsMatch(string publicScope, string localScope, Instruction instruction, out SyntaxTreeNode node);
///
/// Gets a value indicating whether syntax is specific to a particular parent node.
///
///
/// Implementations with this member set to true will not be considered as part of the general grammar.
///
bool IsChildNodeSyntax { get; }
SyntaxType Type { get; }
}Now, armed with this... thing:
WARNING
Faint of heart, beware. The mere sight of these regex patterns is known to have caused nausea in at least, uh, one, somewhat documented case.
```
[ComVisible(false)]
public static class VBAGrammar
{
private static string IdentifierSyntax { get { return @"(?(?:[a-zA-Z][a-zA-Z0-9_])|(?:\[[a-zA-Z0-9_]\]))"; } }
private static string ReferenceSyntax { get { return @"(?:(?(?:(?:(?[a-zA-Z][a-zA-Z0-9_]))\.))?" + IdentifierSyntax + ")"; } }
///
/// Finds all implementations of in the Rubberduck assembly.
///
///
public static IEnumerable GetGrammarSyntax()
{
return Assembly.GetExecutingAssembly()
.GetTypes()
.Where(type => type.BaseType == typeof(SyntaxBase))
Solution
Regex-nightmare
By looking at the returned value of
and the fact that you couldn't make it fool proof in at least 6 months (oldest question I have found), I would say yes you have a nightmare. If you have a bug, debugging this won't be fun.
General
-
here you don't need a
This also applies to
-
In the
-
Empty xml documentation should be removed.
By looking at the returned value of
IdentifierDeclarationSyntax (?(?:(?(?:[a-zA-Z][a-zA-Z0-9_]*)|(?:\[[a-zA-Z0-9_]*\]))(?[%&@!#$])?(?\((?(([0-9]+)\,?\s?)*|([0-9]+\sTo\s[0-9]+\,?\s?)+)\))?(?\sAs(\s(?New))?\s(?:(?(?:(?:(?[a-zA-Z][a-zA-Z0-9_]*))\.)*)?(?(?:[a-zA-Z][a-zA-Z0-9_]*)|(?:\[[a-zA-Z0-9_]*\]))))?)(?:\,\s)?)+and the fact that you couldn't make it fool proof in at least 6 months (oldest question I have found), I would say yes you have a nightmare. If you have a bug, debugging this won't be fun.
General
-
here you don't need a
readonly bool, you should use an auto property with a private setter [ComVisible(false)]
public class ParameterNode : SyntaxTreeNode
{
public ParameterNode(Instruction instruction, string scope, Match match)
: base(instruction, scope, match, new[] {new IdentifierNode(instruction, scope, match)})
{
IsImplicitByRef = !match.Groups["by"].Success;
}
public IdentifierNode Identifier { get { return ChildNodes.OfType().Single(); } }
public bool IsImplicitByRef { get; private set; }
}This also applies to
SyntaxBase.Type and SyntaxBase.IsChildNodeSyntax-
In the
VBAGrammar.GetGrammarSyntax() you are calling unnecessary .ToList(). -
Empty xml documentation should be removed.
Code Snippets
(?<declarations>(?:(?<identifier>(?:[a-zA-Z][a-zA-Z0-9_]*)|(?:\[[a-zA-Z0-9_]*\]))(?<specifier>[%&@!#$])?(?<array>\((?<size>(([0-9]+)\,?\s?)*|([0-9]+\sTo\s[0-9]+\,?\s?)+)\))?(?<as>\sAs(\s(?<initializer>New))?\s(?:(?<reference>(?:(?:(?<library>[a-zA-Z][a-zA-Z0-9_]*))\.)*)?(?<identifier>(?:[a-zA-Z][a-zA-Z0-9_]*)|(?:\[[a-zA-Z0-9_]*\]))))?)(?:\,\s)?)+[ComVisible(false)]
public class ParameterNode : SyntaxTreeNode
{
public ParameterNode(Instruction instruction, string scope, Match match)
: base(instruction, scope, match, new[] {new IdentifierNode(instruction, scope, match)})
{
IsImplicitByRef = !match.Groups["by"].Success;
}
public IdentifierNode Identifier { get { return ChildNodes.OfType<IdentifierNode>().Single(); } }
public bool IsImplicitByRef { get; private set; }
}Context
StackExchange Code Review Q#73798, answer score: 7
Revisions (0)
No revisions yet.