patternMinor
Length of a regular expression
Viewed 0 times
expressionlengthregular
Problem
What is the definition for the length of a regular expression? This question might sound simple, yet I cannot find a definition. Do I only count letters from the alphabet? Do * and $\cup$ count?
For example: let r = $(ab)^*\cup$ $a$. What is $|r|$?
For example: let r = $(ab)^*\cup$ $a$. What is $|r|$?
Solution
I assume, you want to define the length of a regular expression such that you can make statements like: The Thompson construction creates an NFA from a regular expression with a number of states that is linear in the length of the expression.
In that case, it is crucial to include operator symbols, otherwise the above statement would not be true for the expression $a^{\ast\ast\ast\ast\ast\cdots\ast}$.
However, if you preprocess expressions such that there are no nested Kleene stars or trailing $\varepsilon$s, then it does not matter much, as then the different definitions only differ by a constant factor. A short write up on this which I think is quite good, you can find in this paper.¹
¹ Keith Ellul, Bryan Krawetz, Jeffrey Shallit, and Ming-wei Wang. Regular Expressions: New Results and Open Problems. Journal of Automata, Languages and Combinatorics, 10(4) 2005
In that case, it is crucial to include operator symbols, otherwise the above statement would not be true for the expression $a^{\ast\ast\ast\ast\ast\cdots\ast}$.
However, if you preprocess expressions such that there are no nested Kleene stars or trailing $\varepsilon$s, then it does not matter much, as then the different definitions only differ by a constant factor. A short write up on this which I think is quite good, you can find in this paper.¹
¹ Keith Ellul, Bryan Krawetz, Jeffrey Shallit, and Ming-wei Wang. Regular Expressions: New Results and Open Problems. Journal of Automata, Languages and Combinatorics, 10(4) 2005
Context
StackExchange Computer Science Q#151654, answer score: 7
Revisions (0)
No revisions yet.