patternMinor
Is it possible to tell if two sequences of assembly instructions are semantically equivalent?
Viewed 0 times
semanticallyinstructionsequivalentaretellpossibleassemblytwosequences
Problem
In a modern instruction set like x86-64 there are many many many ways to accomplish the same task using a series of old simple instructions all the way up to the complex SIMD instructions. If one wanted to determine the semantic equivalence of a series of instructions, could this be done rationally or must it only be done experimentally?
For the sake of simplicity I am restricting the number of instructions in a series to something like ten.
The naive way to solve the problem would be to generate all of the potential inputs and experimentally check the results.
A slightly smarter way might be to determine the edges of the input states and only test those.
But I'm wondering if there's a way to represent each of the assembly instructions (and series of instructions) as a data structure that could be compared. Is this a tractable problem? Or is this clearly something like the halting problem?
Intuitively this feels like something similar to what happens inside a compiler.
For the sake of simplicity I am restricting the number of instructions in a series to something like ten.
The naive way to solve the problem would be to generate all of the potential inputs and experimentally check the results.
A slightly smarter way might be to determine the edges of the input states and only test those.
But I'm wondering if there's a way to represent each of the assembly instructions (and series of instructions) as a data structure that could be compared. Is this a tractable problem? Or is this clearly something like the halting problem?
Intuitively this feels like something similar to what happens inside a compiler.
Solution
If you restrict to programs of some fixed, finite length then yes, this is decidable. There are only a finite number of possible programs so there is a finite list of which ones are equivalent to one another. A program that is hard-coded with that list can answer the question just by looking up the two programs.
However, if you don't restrict the length of the programs, the problem becomes undecidable: if it were decidable, you could solve the halting problem ("Does program $P$ halt with input $I$?") by rephrasing it as "Is the program that simulates $P$ running on input $I$ equivalent to this trivial program that always halts?"
As an aside, why do you feel that a compiler would need to do this? Compilers just take a program represented in one form (source code) and translate it into another form (the executable image). The translation process guarantees that the two are semantically equivalent, so there's no need to check.
However, if you don't restrict the length of the programs, the problem becomes undecidable: if it were decidable, you could solve the halting problem ("Does program $P$ halt with input $I$?") by rephrasing it as "Is the program that simulates $P$ running on input $I$ equivalent to this trivial program that always halts?"
As an aside, why do you feel that a compiler would need to do this? Compilers just take a program represented in one form (source code) and translate it into another form (the executable image). The translation process guarantees that the two are semantically equivalent, so there's no need to check.
Context
StackExchange Computer Science Q#89970, answer score: 4
Revisions (0)
No revisions yet.