HiveBrain v1.2.0

Can formal languages be used to study mathematical notation?

Submitted by: @import:stackexchange-cs

Problem

Question: Are there any introductory texts in formal language or programming language theory which discuss how to apply it to the study of optimal notation?

In particular, I am interested to learn what stack-languages, parse trees, and indices are, and how to predict when a certain type of notation will lead to exponential redundancy.

I have basically no background in either formal language/grammar or programming theory, since as a math major the only computer science I learned was algorithms and graph theory, as well as very modest smidgens of complexity theory and Boolean functions. Thus, if the only books which discuss this are not introductory, I would be grateful for answers that both list books discussing exponential notation blow-up and list introductory books that will prepare me for the books which directly address my question.

Context: This question is inspired primarily by an answer on Physics.SE, which says that:


It is very easy to prove (rigorously) that there is no parentheses notation which reproduces tensor index contractions, because parentheses are parsed by a stack-language (context free grammar in Chomsky's classification) while indices cannot be parsed this way, because they include general graphs. The parentheses generate parse trees, and you always have exponentially many maximal trees inside any graph, so there is exponential redundancy in the notation.
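The quote's claim that parentheses are "parsed by a stack-language" refers to the fact that balanced brackets can be recognized with a single pushdown stack, which is exactly the power of a context-free recognizer. A minimal sketch of my own (function name and examples are illustrative, not from the quoted answer):

```python
# Balanced brackets can be checked with one stack: this is the
# "stack-language" (pushdown / context-free) power the quote refers to.
def balanced(s: str) -> bool:
    stack = []
    pairs = {')': '(', ']': '[', '}': '{'}
    for ch in s:
        if ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # every opener must have been closed

print(balanced("(a[b]{c})"))  # True
print(balanced("(a[b)]"))     # False
```

Index contractions, by contrast, form general graphs of pairings that no single stack can track, which is the quote's point.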

Throughout the rest of the answer, other examples of "exponential notation blow-up" are discussed, for example with Petri Nets in computational biology.

There are also other instances where mathematical notation is difficult to parse, for example, as mentioned here, when a function and that function applied to its argument are not clearly distinguished. This can become especially confusing when the function becomes the argument and the argument becomes the function, e.g. here.

Solution

- Formal language theory does not concern itself with the semantics of the language. That might seem odd, since we tend to think of language as a mechanism for communicating something, but if you think about it, there are really two levels of understanding language (at least): the surface level, in which the language is a stream of lexemes, and the underlying denotational level which is more or less divorced from the surface representation. (Chomsky posited an intermediate "transformational" level to get around some limitations with CFGs but that's not relevant here.) Consequently, it is possible to say "the same thing" in different languages; Chomsky is not a Whorfian. (See Wikipedia for a brief overview, with some references.)

- Nonetheless, a context-free grammar is not sufficient to distinguish correct and incorrect utterances. Chomsky offered the classic example: Colourless green ideas sleep furiously (which he spelled incorrectly, being a USian). See Wikipedia, again. (Unfortunately Wikipedia doesn't have a Canadian English version.) The precise division between syntactic and semantic errors is hard, if not impossible, to demarcate, and there has been considerable debate over this topic in CS fields, which I'm not going to even attempt to discuss here because I always get into trouble when I do. However, we can identify one classic grammatical rule present in many human languages: noun/verb agreement. I disagrees seems to me to be a syntactic error in the sense that I understand the intent of the utterance perfectly but also recognize it as erroneous. But this syntactic issue can only be captured by a context-free grammar if we enumerate all possible agreements. That is, we can write something vaguely like $S \to NP_{sing} VP_{sing} | NP_{plural} VP_{plural}$, but it is easy to see how the enumeration could get out of hand in languages with more complicated agreement rules (gender, for example).
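The enumeration above can be made concrete with a toy recognizer. The grammar and word lists below are my own illustration (not from the answer): number agreement is captured only by duplicating the top-level rule for each feature value.

```python
# Toy CFG with agreement enumerated into the nonterminal names, as in
# S -> NP_sing VP_sing | NP_plur VP_plur. Lexicon is illustrative only;
# "I" is listed with the plural-style (plain) verb form it selects.
GRAMMAR = {
    "S": [["NP_sing", "VP_sing"], ["NP_plur", "VP_plur"]],
    "NP_sing": [["he"], ["she"]],
    "NP_plur": [["I"], ["they"]],
    "VP_sing": [["disagrees"]],
    "VP_plur": [["disagree"]],
}

def derives(symbol, words):
    """Can `symbol` derive exactly the word list `words`? (naive CFG check)"""
    if symbol not in GRAMMAR:                    # terminal symbol
        return len(words) == 1 and words[0] == symbol
    for rhs in GRAMMAR[symbol]:
        if len(rhs) == 1:
            if derives(rhs[0], words):
                return True
        else:                                    # binary rule: try every split point
            for i in range(1, len(words)):
                if derives(rhs[0], words[:i]) and derives(rhs[1], words[i:]):
                    return True
    return False

print(derives("S", ["I", "disagree"]))    # True
print(derives("S", ["I", "disagrees"]))   # False
```

The grammar rejects "I disagrees" only because every agreement combination was spelled out by hand; nothing in the formalism lets the two halves of a rule share a feature.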

- The problem with context-free grammars is that they are context-free, although you shouldn't take that description too seriously because it is easy to fall into the trap of misinterpreting technical use of common words (which, I might argue, is the basis of this question in the first place). That means that a nonterminal (like $NP$ above) must derive exactly the same set of phrases regardless of the context in which it appears. So we could not write, for example, $S \to NP_X VP_X$ with the understanding that $X$ needs to be filled in the same way in both expansions. (This is one of the issues with which transformational grammar attempted to grapple.)
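To make the resulting blow-up tangible, here is a small sketch of my own (the feature sets are invented for the example) that generates the copies of the single logical rule $S \to NP_X VP_X$ that a CFG would need once agreement features multiply:

```python
# Since a CFG cannot say "fill in X the same way on both sides",
# every combination of agreement features needs its own copy of the rule.
from itertools import product

# Illustrative feature inventories (a real language may have more).
numbers = ["sing", "plur"]
genders = ["masc", "fem", "neut"]
persons = ["1", "2", "3"]

rules = [f"S -> NP_{n}_{g}_{p} VP_{n}_{g}_{p}"
         for n, g, p in product(numbers, genders, persons)]

print(len(rules))   # 18 copies of one logical rule
print(rules[0])     # S -> NP_sing_masc_1 VP_sing_masc_1
```

The rule count is the product of the feature inventory sizes, so each added feature dimension multiplies the grammar.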

- That is exactly the problem with tensor index contractions. A tensor index contraction places a particular requirement on the use of index variables: an index variable must be used exactly twice, in which case it cannot be on the left-hand side, or exactly once, in which case it must be on the left-hand side. (Since I'm not a physicist, I'd be tempted to collapse that into saying that an index variable must appear exactly twice in all. But there is a semantic distinction between free and placeholder variables, which is important to the understanding of the expression.) Here, there is no simple finite collection of index variables and no limit to the number of placeholders used. Moreover, renaming placeholders does not affect semantics provided that the new names are not used elsewhere in the expression, and one might expect the formal language description to capture that fact.
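The usage rule described above is easy to check mechanically, even though no CFG can express it. The following is a rough sketch of my own (the function name and the string-based index encoding are assumptions, not from the answer):

```python
# Check the index rule for an equation LHS = RHS: each index must appear
# either exactly twice on the right and never on the left (a contracted
# placeholder), or exactly once on each side (a free index).
from collections import Counter

def check_indices(lhs_indices: str, rhs_indices: str) -> bool:
    lhs, rhs = Counter(lhs_indices), Counter(rhs_indices)
    for idx in set(lhs) | set(rhs):
        if rhs[idx] == 2 and lhs[idx] == 0:   # placeholder (summed) index
            continue
        if rhs[idx] == 1 and lhs[idx] == 1:   # free index
            continue
        return False
    return True

# C_ik = A_ij B_jk : j is contracted, i and k are free
print(check_indices("ik", "ijjk"))   # True
# A_ij B_jj with a lone free i, j used three times: ill-formed
print(check_indices("i", "ijjj"))    # False
```

Note that the checker needs unbounded counting over an unbounded alphabet of index names, and accepts any consistent renaming of the placeholders; both properties are outside what a context-free grammar can encode.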

- It is in fact possible to rigorously prove the assertion that context-free grammars cannot capture contextual agreement, as in the previous examples. I think that has something to do with what the quoted claim is asserting. Depending on how omnicurious you are, you might find it interesting to learn more, but I don't think it will end up being particularly relevant to the philosophical or physical insights you seem to be seeking.

- The other linked articles, about unfortunate surface forms in mathematical notation, are simply anecdotal; none of them, as far as I can see, makes any deep or even superficial point relevant to formal language theory, just as the possibly famous joke that one man's fish is another man's poisson is not even vaguely insightful about Romance linguistics, but it's still funny (IMO).

Context

StackExchange Computer Science Q#65629, answer score: 6
