patternMinor
What is the best way to compress a string with only numbers and commas in it?
Viewed 0 times
thewhatwithnumberswaycompressonlyandstringcommas
Problem
I have a string like this:
And I'm sure it's following these rules below:
Are there any simple algorithms that I can have to compress/decompress the data?
For instance, a very stupid way I can think is to convert each number into base-16 or even base base-36 numbers.
11,512,64,23,46,12,67,85,84And I'm sure it's following these rules below:
- Only
,and0123456789in it.
- No number will be bigger than 999, that means they are all 1,2 or 3 digits number.
Are there any simple algorithms that I can have to compress/decompress the data?
For instance, a very stupid way I can think is to convert each number into base-16 or even base base-36 numbers.
Solution
Treat this as a base-11 number (the comma is treated like the digit "10"). Convert the base-11 number to binary, and use the resulting binary string as the compressed version of your data.
This will be close to optimal. It falls short of optimal only in that it does not take into account the guarantee that every number in the list will have at most 3 digits -- but even taking that into account would not improve the compression ratio much.
The absolute best method will depend on the distribution of data that you will be compressing; but I expect this will be close to optimal for most real-world distributions.
This will be close to optimal. It falls short of optimal only in that it does not take into account the guarantee that every number in the list will have at most 3 digits -- but even taking that into account would not improve the compression ratio much.
The absolute best method will depend on the distribution of data that you will be compressing; but I expect this will be close to optimal for most real-world distributions.
Context
StackExchange Computer Science Q#100211, answer score: 3
Revisions (0)
No revisions yet.