patternModerate
Are HTML and CSS regular languages?
Viewed 0 times
cssarelanguagesregularandhtml
Problem
I have a question whether or not CSS and HTML are regular languages.
I believe CSS is a regular language, since it should be possible to create a regular expression to match the structure of CSS.
However, I believe that HTML is not a regular language since you have nested attributes that could be defined recursively.
I believe CSS is a regular language, since it should be possible to create a regular expression to match the structure of CSS.
However, I believe that HTML is not a regular language since you have nested attributes that could be defined recursively.
Solution
Providing a regular expression or DFA for a language and proceeding to demonstrate that it is correct for the language in question generally constitutes a pretty convincing argument for the regularity of the language. To prove a language is non-regular, you have a few options: the pumping lemma for regular languages is a classic standby, but the Myhill-Nerode theorem is pretty nifty, too (especially since you can use it to prove regularity as well as non-regularity).
Your first job should be to define what constitute valid strings in the languages $HTML$ and $CSS$. Note that most browsers will take anything you call $HTML$ and display something without error; in that sense, anything is valid $HTML$, and the language $\Sigma^*$ is trivially regular. However, I think it's safe to assume that you have something a bit more strict in mind: $HTML$ consists only of those strings that conform to some standard, which probably calls for matched tags. In this case, you should be able to prove that $HTML$ isn't regular, since you can consider the following kinds of strings (whitespace is added only for clarity and could be removed in the real string):
This language, `
For $CSS$, first figure out what strings you think are valid, then try to come up with a regular expression or DFA for the set of all $CSS$ strings, or figure out how to define a subset of $CSS$ that requires non-local checks and unbounded memory (counting, matching, nesting, etc.) If you can define such a subset (like we did for $HTML$ above), then you're good to go.
Your first job should be to define what constitute valid strings in the languages $HTML$ and $CSS$. Note that most browsers will take anything you call $HTML$ and display something without error; in that sense, anything is valid $HTML$, and the language $\Sigma^*$ is trivially regular. However, I think it's safe to assume that you have something a bit more strict in mind: $HTML$ consists only of those strings that conform to some standard, which probably calls for matched tags. In this case, you should be able to prove that $HTML$ isn't regular, since you can consider the following kinds of strings (whitespace is added only for clarity and could be removed in the real string):
...
This language, `
$($ $)^n($ $)^n$, should be enough to get to a proof of non-regularity, provided you accept the above as valid $HTML$, and would say that a missing ` tag would make the resulting string invalid $HTML$, even though browsers would try to render it anyway.For $CSS$, first figure out what strings you think are valid, then try to come up with a regular expression or DFA for the set of all $CSS$ strings, or figure out how to define a subset of $CSS$ that requires non-local checks and unbounded memory (counting, matching, nesting, etc.) If you can define such a subset (like we did for $HTML$ above), then you're good to go.
Code Snippets
<html>
<body>
<table>
<tr>
<td>
<table>
...
</table>
</td>
</tr>
</table>
</body>
</html>Context
StackExchange Computer Science Q#12867, answer score: 11
Revisions (0)
No revisions yet.