HiveBrain v1.2.0
Get Started
← Back to all entries
patternMajor

What is the formal, rigorous definition of a programming language?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
definitionthewhatformalprogramminglanguagerigorous

Problem

In programming language theory, people study the theory behind programming languages. But I have never heard any formal definition of programming languages themselves. What is the formal definition, not of a particular programming language like Python or C++, but of programming languages themselves?

Solution

To taper expectations a little bit, I will first note that the term "programming language" is deliberately broad: it is intended to be open to some interpretation. It means, no more and no less, any convention that is used for describing instructions for computers to execute. This includes, for example, not just C++ and Python, but also things like Nondeterministic programming, where we actually don't tell the computer exactly what to do, but give it several alternatives and allow it to choose any one of them; declarative logic languages like Datalog where we give the computer a set of logical axioms and ask it to deduce all the true statements from those axioms; and even very low-level descriptions like Turing machines and electrical circuits, where we give the program explicitly as electrical or mechanical components. All of these are ways of describing instructions to computers, so all are valid programming langauges at very different levels of abstraction.

However, programming languages researchers do generally agree on some common formal components of programming languages that should always be present, and these serve as a general definition. Namely:
every programming language is defined by a syntax and a semantics.

-
Syntax. This is a formal grammar which gives the set of programs that can be written. Importantly, the formal grammar consists of finitely many syntax elements, which are described in terms of other syntax elements. For example a simple grammar is:

Variable := x | y | z
Term := 0 | 1 | Term + Term | Variable
Program := set Variable = Term | return Term | Program; Program


In this simple language, we have three syntax elements: Variables, Terms, and Programs. In a formal grammar, each syntax element has finitely many cases for how it can be constructed via other syntax elements. For example, a program is either an assignment (setting a variable to equal a term, e.g. set x = x + 1), a return statement, or a sequence of two programs which should be executed one after the other.

-
Semantics. Syntax is just describing the set of valid programs; but it doesn't say anything about what those programs mean. Semantics is a way of assigning meaning to programs. Unlike syntax, which is almost always given as a formal grammar as above, semantics can be given in at least two different ways: these include "denotational semantics", where we assign a mathematical object such as a function to each program, or "operational semantics", where we describe the execution of a program in a more true-to-life way as a sequence of steps.

To illustrate this, starting with denotational semantics: we would say that the term 3 + 5 + 8 is assigned the meaning of 16. More interestingly, the program set x = x + 3 + 5 is assigned the meaning of the mathematical function mapping every integer to that integer plus 8.

Operational semantics, on the other hand, is very different. We would say that the term 3 + 5 + 8 evaluates to 8 + 8 which in turn evaluates to 16. We would also say that the program set x = x + 3 + 5 in a context where x = 5 evaluates to a context where x = 13. So, instead of giving a meaning to each term or program itself, we give a meaning between terms called "evaluates to": we give a formal definition of what it means for A to evaluate to B in the context C.

In any case, the semantics of a language, whether denotational or operational (or something else) gives meaning to the symbols and allows us to make sense of what programs compute, not just what they look like.

Putting these together, we get the following definition.

Definition:
A programming language consists of (1) a syntax, given as a formal grammar; and (2) a semantics, given either as denotational semantics which gives a meaning to each syntax element, or an operational semantics which says when two programs or program contexts relate.

Code Snippets

Variable := x | y | z
Term := 0 | 1 | Term + Term | Variable
Program := set Variable = Term | return Term | Program; Program

Context

StackExchange Computer Science Q#129705, answer score: 40

Revisions (0)

No revisions yet.