HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Length of a regular expression

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
expressionlengthregular

Problem

What is the definition for the length of a regular expression? This question might sound simple, yet I cannot find a definition. Do I only count letters from the alphabet? Do * and $\cup$ count?
For example: let r = $(ab)^*\cup$ $a$. What is $|r|$?

Solution

I assume, you want to define the length of a regular expression such that you can make statements like: The Thompson construction creates an NFA from a regular expression with a number of states that is linear in the length of the expression.

In that case, it is crucial to include operator symbols, otherwise the above statement would not be true for the expression $a^{\ast\ast\ast\ast\ast\cdots\ast}$.

However, if you preprocess expressions such that there are no nested Kleene stars or trailing $\varepsilon$s, then it does not matter much, as then the different definitions only differ by a constant factor. A short write up on this which I think is quite good, you can find in this paper.¹

¹ Keith Ellul, Bryan Krawetz, Jeffrey Shallit, and Ming-wei Wang. Regular Expressions: New Results and Open Problems. Journal of Automata, Languages and Combinatorics, 10(4) 2005

Context

StackExchange Computer Science Q#151654, answer score: 7

Revisions (0)

No revisions yet.