patternMinor
Order preserving encoding of strings to numbers
Viewed 0 times
ordernumberspreservingencodingstrings
Problem
I want to encode strings as real numbers while preserving order. The order of the strings is the lexicographic order (as used in phone books); the order of the numbers is the standard order.
Is there an general (i.e. no additional conditions such as e.g. known max string size) encoding with this property?
Is there an general (i.e. no additional conditions such as e.g. known max string size) encoding with this property?
Solution
(I assume that by lexicographic order, you mean lexicographic order. Phone books may have all kinds of oddities due to ordering proper names, such as sorting
Suppose your strings only contained the characters
If you have $n$ characters in your alphabet, use base $n+1$.
This encoding accommodates infinite strings too, but you need to use base $n+2$ and skip the largest digit as well as the smallest digit (the largest string beginning with
McDonald near MacDonald.)Suppose your strings only contained the characters
1, 2, 3, 4, 5, 6, 7, 8 and 9. The mapping from the string $s$ to the decimal number 0.$\!\!s$ is monotonic.If you have $n$ characters in your alphabet, use base $n+1$.
This encoding accommodates infinite strings too, but you need to use base $n+2$ and skip the largest digit as well as the smallest digit (the largest string beginning with
1 is 1888… $\mapsto 0.1888\ldots$ which is smaller than 2 $\mapsto 0.2 = 0.1999\ldots$).Context
StackExchange Computer Science Q#2903, answer score: 6
Revisions (0)
No revisions yet.