patternMinor
Is intersection used in regular expression?
Viewed 0 times
expressionusedregularintersection
Problem
As union is part of regular expression in form of positive closure, similarly intersection is also a part of regular expression? in which form it can be used?
Solution
Regular expressions over an alphabet $\Sigma$, and their denotations (denoted by $L[r]$), are defined as follows:
The definition of concatenation of languages is
$$
L_1L_2 = \{w_1w_2 : w_1 \in L_1, w_2 \in L_2\}.
$$
The definition of iteration is
$$
L^* = \bigcup_{n=0}^\infty L^n, \text{ where } L^0 = \{\epsilon\} \text{ and } L^{n+1} = L^nL.
$$
As you can see, union is built into regular expressions, whereas intersection isn't. However, the set of regular languages (which is the set of languages of the form $L[r]$) is closed under intersection, and so given any two regular expressions $r_1,r_2$, we can find a regular expression $r$ such that $L[r] = L[r_1] \cap L[r_2]$.
It turns out that sometimes $r$ will be much larger than $r_1,r_2$, see for example
Gelade and Neven, Succinctness of the Complement and Intersection of Regular Expressions, Theorem 17(2). A similar phenomenon is exhibited by complementation, see for example Theorem 12(2) in the aforementioned paper. For this reason, sometimes extended or generalized regular expressions are considered, in which operations such as intersection and complementation are allowed. While this does not increase the expressivity of regular expressions – every language expressible using an extended regular expression can also be expressed as a standard regular expression – it does increase their economy (i.e., length or size).
- $\emptyset$ is a regular expression, and $L[\emptyset] = \emptyset$ (empty language).
- $\epsilon$ is a regular expression, and $L[\epsilon] = \{\epsilon\}$ (the empty word).
- For each $\sigma \in \Sigma$, $\sigma$ is a regular expression, and $L[\sigma] = \{\sigma\}$ (single letter).
- If $r_1,r_2$ are regular expressions over $\Sigma$ then so is $r_1r_2$, and $L[r_1r_2] = L[r_1]L[r_2]$ (concatenation).
- If $r_1,r_2$ are regular expressions over $\Sigma$ then so is $r_1+r_2$, and $L[r_1+r_2] = L[r_1] \cup L[r_2]$ (union).
- If $r$ is a regular expression over $\Sigma$ then so is $r^$, and $L[r^] = L[r]^*$ (iteration).
The definition of concatenation of languages is
$$
L_1L_2 = \{w_1w_2 : w_1 \in L_1, w_2 \in L_2\}.
$$
The definition of iteration is
$$
L^* = \bigcup_{n=0}^\infty L^n, \text{ where } L^0 = \{\epsilon\} \text{ and } L^{n+1} = L^nL.
$$
As you can see, union is built into regular expressions, whereas intersection isn't. However, the set of regular languages (which is the set of languages of the form $L[r]$) is closed under intersection, and so given any two regular expressions $r_1,r_2$, we can find a regular expression $r$ such that $L[r] = L[r_1] \cap L[r_2]$.
It turns out that sometimes $r$ will be much larger than $r_1,r_2$, see for example
Gelade and Neven, Succinctness of the Complement and Intersection of Regular Expressions, Theorem 17(2). A similar phenomenon is exhibited by complementation, see for example Theorem 12(2) in the aforementioned paper. For this reason, sometimes extended or generalized regular expressions are considered, in which operations such as intersection and complementation are allowed. While this does not increase the expressivity of regular expressions – every language expressible using an extended regular expression can also be expressed as a standard regular expression – it does increase their economy (i.e., length or size).
Context
StackExchange Computer Science Q#106864, answer score: 7
Revisions (0)
No revisions yet.