patternMinor
number encoding effect on complexity
Viewed 0 times
effectnumbercomplexityencoding
Problem
I started reading the book "Data Structures and Network Algorithms" by Robert Tarjan, which is a classic (but a bit outdated - 1983) and I am a bit perplexed by the paragraph in the first chapter, Page 3, first paragraph. I have attached the paragraph below:
An important caveat concerning random-access and pointer machines is that if the machine can manipulate numbers of arbitrary size in constant time, it can perform hidden parallel computation by encoding several numbers into one. There are two ways to prevent this. Instead of counting each operation as one step (the uniform cost measure), we can charge for an operation a time proportional to the number of bits needed to represent the operands (the logarithmic cost measure). Alternatively we can limit the size of the integers we allow to those representable in a constant times log n bits, where n is a measure of the input size, and restrict the operations we allow on real numbers. We shall generally use the latter approach; all our algorithms are implementable on a random-access or pointer machine with integers of size at most n for some small constant c with only comparison, addition, and sometimes multiplication of input values allowed as operations on real numbers, with no clever encoding.
Would somebody be able to explain this paragraph or point me to some references that elaborates it better?
An important caveat concerning random-access and pointer machines is that if the machine can manipulate numbers of arbitrary size in constant time, it can perform hidden parallel computation by encoding several numbers into one. There are two ways to prevent this. Instead of counting each operation as one step (the uniform cost measure), we can charge for an operation a time proportional to the number of bits needed to represent the operands (the logarithmic cost measure). Alternatively we can limit the size of the integers we allow to those representable in a constant times log n bits, where n is a measure of the input size, and restrict the operations we allow on real numbers. We shall generally use the latter approach; all our algorithms are implementable on a random-access or pointer machine with integers of size at most n for some small constant c with only comparison, addition, and sometimes multiplication of input values allowed as operations on real numbers, with no clever encoding.
Would somebody be able to explain this paragraph or point me to some references that elaborates it better?
Solution
If we allow numbers to be unbounded while keeping the unit cost model (each operation costs one unit), we get a very powerful machine. For example, here is how to solve 3SAT in this model. Recall that a 3CNF is a formula of the form
$$ (x_1 \lor \lnot x_2 \lor x_3) \land (x_1 \lor x_4 \lor \lnot x_5) \land \cdots. $$
Here $\lor$ is logical OR, $\land$ is logical AND, and $\lnot$ is logical NOT. The 3CNF is an AND of clauses, each of which is the OR of three literals; a literal is either a variable or its negation. The 3SAT problem asks whether a given 3CNF is satisfiable, that is, whether there is a truth assignment to the variables under which the formula evaluates to TRUE.
Suppose the given 3CNF has $n$ variables. For a formula $\varphi$, let $T(\varphi)$ be a "bit-vector" of length $2^n$, indexed by truth assignments to the variables, such that the bit at position $x$ is the truth value of $\varphi$ evaluated at the assignment $x$ (i.e. 1 if $\varphi$ is TRUE and 0 if $\varphi$ is FALSE). We order the truth assignments in lexicographic order; for example, if $n=2$ then the truth assignments are $(F,F),(F,T),(T,F),(T,T)$, given in the format $(x_2,x_1)$. Continuing the example, $T(x_1) = 1010$ and $T(x_2) = 1100$ (we read the bits from right to left), both in binary notation.
Consider now the following operations:
$$
\begin{align*}
T(x_i) &= \sum_{a=0}^{2^{i-1}-1} \sum_{b=0}^{2^{n-i}-1} 2^{2^i b + 2^{i-1} + a} \\ &= 2^{2^{i-1}} \sum_{a=0}^{2^{i-1}-1} 2^a \sum_{b=0}^{2^{n-i}-1} (2^{2^i})^b \\ &=
2^{2^{i-1}} (2^{2^{i-1}}-1) \frac{(2^{2^i})^{2^{n-i}}-1}{2^{2^i}-1} \\ &=
\frac{2^{2^{i-1}} (2^{2^{i-1}}-1)(2^{2^n}-1)}{2^{2^i}-1}.
\end{align*}
$$
In this way, we can efficiently compute $T(\varphi)$, where $\varphi$ is the given 3CNF. The formula $\varphi$ is satisfiable iff $T(\varphi) \neq 0$. This gives a polynomial time algorithm for 3SAT in the unit cost model. Obviously something is wrong and needs fixing, which is what Tarjan is getting at.
$$ (x_1 \lor \lnot x_2 \lor x_3) \land (x_1 \lor x_4 \lor \lnot x_5) \land \cdots. $$
Here $\lor$ is logical OR, $\land$ is logical AND, and $\lnot$ is logical NOT. The 3CNF is an AND of clauses, each of which is the OR of three literals; a literal is either a variable or its negation. The 3SAT problem asks whether a given 3CNF is satisfiable, that is, whether there is a truth assignment to the variables under which the formula evaluates to TRUE.
Suppose the given 3CNF has $n$ variables. For a formula $\varphi$, let $T(\varphi)$ be a "bit-vector" of length $2^n$, indexed by truth assignments to the variables, such that the bit at position $x$ is the truth value of $\varphi$ evaluated at the assignment $x$ (i.e. 1 if $\varphi$ is TRUE and 0 if $\varphi$ is FALSE). We order the truth assignments in lexicographic order; for example, if $n=2$ then the truth assignments are $(F,F),(F,T),(T,F),(T,T)$, given in the format $(x_2,x_1)$. Continuing the example, $T(x_1) = 1010$ and $T(x_2) = 1100$ (we read the bits from right to left), both in binary notation.
Consider now the following operations:
- For each $i$, we can compute $T(x_i)$ as follows:
$$
\begin{align*}
T(x_i) &= \sum_{a=0}^{2^{i-1}-1} \sum_{b=0}^{2^{n-i}-1} 2^{2^i b + 2^{i-1} + a} \\ &= 2^{2^{i-1}} \sum_{a=0}^{2^{i-1}-1} 2^a \sum_{b=0}^{2^{n-i}-1} (2^{2^i})^b \\ &=
2^{2^{i-1}} (2^{2^{i-1}}-1) \frac{(2^{2^i})^{2^{n-i}}-1}{2^{2^i}-1} \\ &=
\frac{2^{2^{i-1}} (2^{2^{i-1}}-1)(2^{2^n}-1)}{2^{2^i}-1}.
\end{align*}
$$
- Given $T(\alpha),T(\beta)$, we can compute $T(\lnot \alpha), T(\alpha\land\beta), T(\alpha\lor\beta)$ using bitwise NOT, AND, OR.
In this way, we can efficiently compute $T(\varphi)$, where $\varphi$ is the given 3CNF. The formula $\varphi$ is satisfiable iff $T(\varphi) \neq 0$. This gives a polynomial time algorithm for 3SAT in the unit cost model. Obviously something is wrong and needs fixing, which is what Tarjan is getting at.
Context
StackExchange Computer Science Q#26392, answer score: 6
Revisions (0)
No revisions yet.