HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Is it known whether the MD5 algorithm is surjective?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
themd5algorithmknownsurjectivewhether

Problem

Does MD5 map to all possible 128 bit numbers? To put it another way:
For every 128 bit number y, does there exist at least one x for which md5(x) = y?

The answer probably is yes, for arbitrary x, and probably no proof exists. However, feel free to prove me wrong :)

But what if I impose some kind of limit on x, e.g. x <= x_max? Of course, for very small x_max the answer is no. But what is the minimum value of x_max for which the answer likely becomes yes? Or the maximum value for which the answer can be shown to be no?

I'm not really interested in the exact value for x_max, rather in the approximate order of magnitude.

Solution

Q: Does MD5 map to all possible 128 bit numbers?

Taking into account the facts that MD5 accepts input of the arbitrary length, the output is "only" 128 bits, and the algorithm itself which behaves similarly to random functions, I would expect that the all outputs are possible. However, without any strong mathematical evidence, this is merely a speculation, or in the best case, the hunch.


Q: The answer probably is yes, for arbitrary x, and probably no proof exists.

Exactly, there is no proof (currently). Proving that MD5 is surjective would require either one of these:

  • Brute force check - taking into account coupon's collector problem, it takes something like 10^40 inputs to hit all 128-bit outputs. Assuming again that MD5 behaves similarly to random function, that should be enough to verify in the case that MD5 is surjective.



  • Concrete mathematical proof - eventual proof of surjectivity should be able for arbitrary 128-bit number y prove that there exists some input x such that MD5(x) = y. That should require deep mathematical analysis of MD5 function and it would probably shed some light on reversing this hash function. Even though MD5 is far from desirable hash function, it is still not known how to predict the input based on the given output.

Context

StackExchange Computer Science Q#43205, answer score: 3

Revisions (0)

No revisions yet.