patternMinor
Automatic translation between formal languages
Viewed 0 times
languagesformalbetweenautomatictranslation
Problem
There are parser generators (some of which are limited to certain classes of grammars) which, given a grammar, automatically generate a parser for that grammar.
Would it be possible to make a general-purpose translator generator to automatically translate from a string in the language with production rules described by grammar $G_1$ to a string in the language with production rules described by grammar $G_2$? If so, what rules would have to be imposed? For practical purposes let's say these grammars must both be CFGs. Would a formalization of semantic rules for the grammars also have to specified for the translator generator?
An example of this hypothetical, general-purpose translator generator:
the translator generator is given grammar $G_1$ (which is a CFG representing a context-insenitive adaptation of the C programming language), $G_2$ (which is a CFG representing a context-insenitive adaptation of the JavaScript programming language), and a formalization of the semantic rules of the language with production rules described by grammar $G_2$1. The translator generator is expected to produce a translator which takes as input a string in the language with production rules described by grammar $G_1$ and translates that string to a string in the language with production rules described by grammar $G_2$.
1 I am not entirely sure of what this formalization of the language's semantic rules would look like, so if there exists some formalization which would make the existence of this general-purpose translator generator decidable, then I would also welcome suggestions on what this formalization would look like.
Would it be possible to make a general-purpose translator generator to automatically translate from a string in the language with production rules described by grammar $G_1$ to a string in the language with production rules described by grammar $G_2$? If so, what rules would have to be imposed? For practical purposes let's say these grammars must both be CFGs. Would a formalization of semantic rules for the grammars also have to specified for the translator generator?
An example of this hypothetical, general-purpose translator generator:
the translator generator is given grammar $G_1$ (which is a CFG representing a context-insenitive adaptation of the C programming language), $G_2$ (which is a CFG representing a context-insenitive adaptation of the JavaScript programming language), and a formalization of the semantic rules of the language with production rules described by grammar $G_2$1. The translator generator is expected to produce a translator which takes as input a string in the language with production rules described by grammar $G_1$ and translates that string to a string in the language with production rules described by grammar $G_2$.
1 I am not entirely sure of what this formalization of the language's semantic rules would look like, so if there exists some formalization which would make the existence of this general-purpose translator generator decidable, then I would also welcome suggestions on what this formalization would look like.
Solution
I think you are asking this from a stand point of specific tools so let me answer this in two parts.
Theory Part (ideal view):
There is a notion of languages and a notion of computational systems that except these different classes of languages. Finite Automata can accept regular languages, Pushdown Automata can accept context-free languages. If we go all the way up in the Chomsky hierarchy we get recursively enumerable languages which are accepted by turing machines.
As long as the class of languages define by a certain syntax is less than or equal in power to the target syntax's class of languages then this is possible (at least in principal). Parser generators typically compile context-free grammars to turing machines (or turing complete systems rather) which is just jumping rather far in Chomsky hierarchy. Nicely however this allows us to use context-free grammars in settings where we need recursive enumerability like interpreters. I could in principal compile a context-free grammar to a context-free grammar but defined using a different syntax; this is just an issue of the syntax being used to define context-free grammars. What I could not however do is compile a context-free grammar to a regular expression. No semantic rules need apply in this ideal view.
Practical part:
Real parser generators don't just generate code that accepts a given context free language however. They add in bits of your code to produce abstract syntax trees for your programs to use. I think the details of translating something like Antlr 3 (LL grammars) to Bison (LR grammars which are more general) or some other parser generator would be a bit tricky; I wouldn't want to try it. They probably use slightly different template systems for your code for starters. It could be that they execute your code in different orders so unless your code was pure there would really be no hope of translating the code properly.
So I think the answer to "would semantic rules for the grammars also have to formally specified" would be yes. The code that generates the ASTs would need to be functional and pure or something like that. The substitution principal needs to apply I think which is not something that generally applies except in languages like Haskell where even their effectual code has the substitution property.
My bet is that something like this is possible between the right tools but I think it is a very tricky thing.
Theory Part (ideal view):
There is a notion of languages and a notion of computational systems that except these different classes of languages. Finite Automata can accept regular languages, Pushdown Automata can accept context-free languages. If we go all the way up in the Chomsky hierarchy we get recursively enumerable languages which are accepted by turing machines.
As long as the class of languages define by a certain syntax is less than or equal in power to the target syntax's class of languages then this is possible (at least in principal). Parser generators typically compile context-free grammars to turing machines (or turing complete systems rather) which is just jumping rather far in Chomsky hierarchy. Nicely however this allows us to use context-free grammars in settings where we need recursive enumerability like interpreters. I could in principal compile a context-free grammar to a context-free grammar but defined using a different syntax; this is just an issue of the syntax being used to define context-free grammars. What I could not however do is compile a context-free grammar to a regular expression. No semantic rules need apply in this ideal view.
Practical part:
Real parser generators don't just generate code that accepts a given context free language however. They add in bits of your code to produce abstract syntax trees for your programs to use. I think the details of translating something like Antlr 3 (LL grammars) to Bison (LR grammars which are more general) or some other parser generator would be a bit tricky; I wouldn't want to try it. They probably use slightly different template systems for your code for starters. It could be that they execute your code in different orders so unless your code was pure there would really be no hope of translating the code properly.
So I think the answer to "would semantic rules for the grammars also have to formally specified" would be yes. The code that generates the ASTs would need to be functional and pure or something like that. The substitution principal needs to apply I think which is not something that generally applies except in languages like Haskell where even their effectual code has the substitution property.
My bet is that something like this is possible between the right tools but I think it is a very tricky thing.
Context
StackExchange Computer Science Q#41909, answer score: 6
Revisions (0)
No revisions yet.