How many bits need to be added to 100 bits of data in order to correct up to 10 bits?
Problem
I'm trying to calculate the minimum number of bits that need to be added to 100 bits of data in order to correct up to 10 bits that are messed up by:
- bits that were deleted (erasure correction)
- bits that were corrupted (error correction)

I tried Hamming codes and Reed-Solomon codes but couldn't figure out the answer for either case. Is there some other, more efficient algorithm for this problem?

Thanks in advance!
Solution
This seemingly innocent question is actually quite difficult to answer. Let me try to give some more information on the parameters that come into play here, and give a partial survey of common results and bounds.
As mentioned in Qalnut's answer, to deal with $10$ bits of errors (in the worst case) we must have distance at least $d=2\cdot 10+1=21$. This is given. Another parameter implied by your question is that the basic unit is a bit. You ask: what is the minimal length $n$ of a (binary) code that has distance $d$ and contains at least $2^{100}$ codewords?
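The requirement $d \ge 2t+1$ can be sanity-checked with a toy example. This is only a sketch: the two "codewords" below are made up for illustration, not taken from any real code.

```python
# Toy check of the distance requirement d >= 2t + 1: take two binary
# words at Hamming distance exactly 21, flip t = 10 bits of one of them,
# and verify that nearest-neighbour decoding still recovers it.

def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

n = 100
c0 = tuple([0] * n)                    # hypothetical codeword 0
c1 = tuple([1] * 21 + [0] * (n - 21))  # hypothetical codeword 1, distance 21 away

assert hamming(c0, c1) == 21

# Corrupt c1 in t = 10 positions.
received = list(c1)
for i in range(10):
    received[i] ^= 1
received = tuple(received)

# Nearest-neighbour decoding: the received word is still closer to c1.
assert hamming(received, c1) == 10
assert hamming(received, c0) == 11   # > 10, so decoding correctly picks c1
```

With $d = 2t$ instead, a received word could sit exactly halfway between two codewords and decoding would be ambiguous; that is why the odd $2t+1$ is needed.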
The lower bound
Let's first try to ask a slightly more general question:
Given an alphabet of size $q$, what is the maximal number of codewords in a code of length $n$ that has minimal distance $d$? [Let's denote this number by $A_q(n,d)$.]
The answer to this question is not simple at all, but there are several well-known bounds. One very useful bound is the Singleton bound, which states that
$$ A_q(n,d) \le q^{n-d+1}$$
This immediately tells us that in order to have $2^{100}$ codewords with distance $d=21$, you must take $n \ge 120$; that is, you must add at least 20 bits. But this is only a bound: it never says that such a code (with length 120) actually exists, or how to find it.
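The arithmetic behind that is a one-liner, sketched here for the concrete numbers in the question:

```python
# Singleton bound: A_q(n, d) <= q^(n-d+1).  For q = 2, k = 100 message
# bits and distance d = 21, we need q^(n-d+1) >= 2^k, i.e. n - d + 1 >= k.
k, d = 100, 21
n_min = k + d - 1      # smallest n admitted by the Singleton bound
print(n_min)           # 120
print(n_min - k)       # 20 extra bits of redundancy
```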
More generally, this bound tells us that the relation between the rate of the code $R=k/n$ and the relative distance of the code $\delta=d/n$ behaves, for $n\to \infty$, as
$$ R \le 1-\delta + o(1)$$
Since you wished the relative distance of your code to be $d/n = 0.175$, the maximal rate is $\approx 0.825$, which is essentially what we got ($k/n = 100/120 = 0.833 = 0.825 + 1/n$). The moral of this story is: if you aim for relative distance $\delta$ (which can correct $\approx \delta n/2$ errors), you will have to add at least
$$(n-k) = n-Rn \approx \delta n $$
more symbols. This is a nice rule of thumb. In the asymptotic case (that is, when $n$ is very very large) that would be approximately the answer.
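The rule of thumb can be checked against the concrete numbers above (the values $n=120$, $k=100$, $d=21$ are the ones from the worked example, nothing new is assumed):

```python
# Check the rate / relative-distance accounting for the example code.
n, k, d = 120, 100, 21
delta = d / n          # relative distance
R = k / n              # rate

assert abs(delta - 0.175) < 1e-12
assert abs((1 - delta) - 0.825) < 1e-12        # the R <= 1 - delta ceiling
assert abs(R - (0.825 + 1 / n)) < 1e-12        # 0.8333... = 0.825 + 1/120

# Rule of thumb: the redundancy is roughly delta * n symbols.
assert round(delta * n) == 21                  # vs. the exact n - k = 20
```

The one-symbol gap between the rule of thumb (21) and the exact redundancy (20) is the $o(1)$ term: it vanishes as $n$ grows.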
As said, this bound is, in many cases, not really tight. For other lower bounds, look up the Plotkin bound (for binary codes) and the Johnson bound.
The upper bound
So we know we can't get $n$ smaller than, say, 120. But this is not very helpful from a practical point of view, since we don't even know whether a code with $n=120$ exists; maybe, in this case, the shortest code with $d=21$ and $k=100$ has $n=137$?
Gilbert and Varshamov come to our aid and show that there always exists a code for which
$$ A_q(n,d) \ge \frac{q^n}{\sum_{j=0}^{d-1}\binom{n}{j}(q-1)^j}$$
So now you just need to plug in $A_2(n,21)=2^{100}$ and find $n$. Let me WolframAlpha this for you; it gives$^{(1)}$ $n = 176$.
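The plug-in step can also be done directly with exact integer arithmetic. This is a sketch using the standard binary form of the Gilbert–Varshamov bound as stated above; note that the length it yields is only what GV *guarantees* to exist, and the best known codes (as in the figure quoted above) can be shorter.

```python
from math import comb

# Gilbert-Varshamov bound for q = 2: there exists a binary code of
# length n and distance d with at least 2^n / V(n, d-1) codewords,
# where V(n, r) is the volume of a Hamming ball of radius r.
# Search for the smallest n at which this guarantees 2^100 codewords.
k, d = 100, 21

def gv_guarantees(n: int) -> bool:
    """True if the GV bound guarantees >= 2^k codewords at length n."""
    volume = sum(comb(n, j) for j in range(d))   # ball of radius d - 1
    return 2 ** n // volume >= 2 ** k

n = k  # start the search at the trivial minimum n = k
while not gv_guarantees(n):
    n += 1
print(n)
```

Whatever value the search returns, it is consistent with the Singleton bound ($n \ge 120$) and is only an existence guarantee; it says nothing about how to construct or decode such a code.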
(1) At least, the best answer I was able to google. (ADDENDUM: found a better one after re-googling.)
Context
StackExchange Computer Science Q#43390, answer score: 2