Deriving the regular expression for C-style /**/ comments
Problem
I'm working on a parser for a C-style language, and for that parser I need the regular expression that matches C-style /**/ comments. Now, I've found this expression on the web:
/\*([^\*]*\*+[^\*/])*([^\*]*\*+|[^\*]*\*/
However, as you can see, this is a rather messy expression, and I have no idea whether it matches exactly what I want it to match.
Is there a different, rigorous way of defining regular expressions so that their correctness is easy to check by hand, and that can then be converted ('compiled') to the above regular expression?
Solution
I can think of four ways:
1. Define an automaton for the language you are interested in. Convert the regular expression to an automaton (using Brzozowski's derivatives). Check that both automata accept the same language (determinize and minimize, or use a bisimulation argument).
2. Write loads of test cases and apply your regular expression to them.
3. Convert the automaton defined in point 1 to a regular expression, using standard techniques such as state elimination.
4. A combination of the above.
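As a rough illustration of combining the automaton and the test-case approaches, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the five-state automaton is one hand-written reading of "the whole string is exactly one /*...*/ comment", the names `dfa_accepts` and `first_disagreement` are hypothetical, and `CANDIDATE` is a stand-in pattern rather than the expression quoted in the question. The check is exhaustive only over a small alphabet and a bounded length, so it is evidence of agreement, not a proof of equivalence.

```python
import re
from itertools import product

DEAD = -1  # sink state: no continuation can be accepted


def dfa_accepts(s: str) -> bool:
    """Hand-written reference automaton (an assumption of this sketch).

    States: 0 start, 1 seen "/", 2 inside the comment body,
    3 inside the body after one or more "*", 4 comment closed.
    """
    state = 0
    for ch in s:
        if state == 0:
            state = 1 if ch == "/" else DEAD
        elif state == 1:
            state = 2 if ch == "*" else DEAD
        elif state == 2:
            state = 3 if ch == "*" else 2
        elif state == 3:
            if ch == "/":
                state = 4
            elif ch == "*":
                state = 3
            else:
                state = 2
        else:
            # State 4 (already closed) or DEAD: any further input is rejected.
            state = DEAD
    return state == 4


# Stand-in candidate pattern, NOT the expression from the question.
CANDIDATE = re.compile(r"/\*([^*]|\*+[^*/])*\*+/")


def first_disagreement(pattern, alphabet="/*a", max_len=8):
    """Return the first string (shortest first) on which the pattern and the
    automaton disagree, or None if they agree on every string up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if (pattern.fullmatch(s) is not None) != dfa_accepts(s):
                return s
    return None


if __name__ == "__main__":
    print(first_disagreement(CANDIDATE))
    # A deliberately naive pattern, to show what a counterexample looks like.
    print(first_disagreement(re.compile(r"/\*.*\*/")))
```

The alphabet `"/*a"` is chosen because the automaton only distinguishes "/", "*", and "anything else", so a single ordinary character stands in for the rest. The stand-in pattern is what state elimination gives on this automaton: removing state 3 leaves a loop `[^*]|\*+[^*/]` on state 2 and an exit `\*+/`, and removing states 2 and 1 then yields `/\*([^*]|\*+[^*/])*\*+/`. The naive pattern `/\*.*\*/` is included only as a contrast; it accepts `/**/*/`, which the automaton rejects because the comment already closed after the first `*/`.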
Context
StackExchange Computer Science Q#311, answer score: 6