# I recently came across a beautiful mathematical proof about data compression

The name S10 is an elegantly compact version of “Stien”. If you want to write down the lyrics of her song *De Diepte* more compactly, you can, for example, replace “Tadadadadadadada” with “Ta 78 x da”.
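The “Ta 78 x da” shorthand is a simple form of run-length encoding. As a minimal sketch, assuming the text ends in a run of one repeated syllable, the idea could look like this in Python. The function names `compress_tail` and `expand` are hypothetical, and the count 78 is simply taken from the example above.

```python
def compress_tail(text: str, unit: str) -> tuple[str, str, int]:
    """Split text into (prefix, unit, count) so that text == prefix + unit * count."""
    if not unit:
        raise ValueError("unit must not be empty")
    count = 0
    end = len(text)
    # Peel copies of `unit` off the end of the text for as long as they match.
    while end >= len(unit) and text[end - len(unit):end] == unit:
        count += 1
        end -= len(unit)
    return text[:end], unit, count


def expand(prefix: str, unit: str, count: int) -> str:
    """Undo compress_tail: rebuild the original text exactly."""
    return prefix + unit * count


lyric = "Ta" + "da" * 78
packed = compress_tail(lyric, "da")
print(packed)
print(expand(*packed) == lyric)
```

Because `expand` rebuilds the text exactly, nothing is lost: the short description carries the same information as the long one.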

With lossless data compression, you can exactly restore the original file from the compressed file. That is not a given: think of the blocky images you sometimes see, where the file size has been reduced in such a way that you can no longer get back to the original image.

I recently came across a beautiful mathematical proof that no single lossless compression method can work for all files. If a method shrinks at least one file, there is inevitably another file that it makes larger.
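You can see this effect with a real compressor. The sketch below uses Python's standard `zlib` module as an example of a lossless method (the article itself does not name one); the helper name `sizes` and the three sample inputs are assumptions chosen for illustration.

```python
import random
import zlib


def sizes(data: bytes) -> tuple[int, int]:
    """Return (original size, compressed size) after checking the round trip."""
    packed = zlib.compress(data)
    # Lossless: decompressing must give back exactly the original bytes.
    assert zlib.decompress(packed) == data
    return len(data), len(packed)


repetitive = b"Ta" + b"da" * 78

# Pseudo-random bytes with a fixed seed, so every run uses the same input.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1000))

empty = b""

for name, data in [("repetitive", repetitive), ("noisy", noisy), ("empty", empty)]:
    before, after = sizes(data)
    print(f"{name}: {before} bytes -> {after} bytes")
```

The repetitive input shrinks, while the noisy and empty inputs come out larger than they went in, because the compressed format has to spend a few bytes on its own bookkeeping.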

It is a proof by contradiction: you start by assuming the opposite of what you want to prove, and then reason from there through a number of perfectly logical steps until you run into a contradiction. Since all of your steps were logical and correct, the only possible conclusion is that your first assumption was wrong.

In this proof, we think of a stored file as a finite sequence of bits (each of which can be 0 or 1). We assume that there is a compression method that converts at least one file into an output shorter than the original, and that does not make any other file longer than it was. (Making files longer is not a very useful form of compression.)

Now a little notation: let M be the smallest number for which there is a file B of length M that is converted into a shorter file. Let N be the length of the compressed version of B.

Since N is smaller than M, every file of length N keeps exactly the same length when compressed: it cannot get shorter, because M was the smallest length at which a file could be shortened, and it cannot get longer, because we assumed our method never makes files larger.

How many files of length N are there? Each of the N bits can be 0 or 1, which gives a total of $2^N$ possible different files.
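For small lengths you can simply list all the files and count them. This is a minimal sketch in which a file is written as a string of the characters `0` and `1`; the function name `all_files` is hypothetical.

```python
from itertools import product


def all_files(n: int) -> list[str]:
    """List every possible file of exactly n bits, written as a '0'/'1' string."""
    return ["".join(bits) for bits in product("01", repeat=n)]


for n in range(5):
    files = all_files(n)
    # The count doubles with every extra bit: it is always 2 to the power n.
    print(n, len(files), len(files) == 2 ** n)
```

Note that there is exactly one file of length 0: the empty file.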

All of these files are converted by the compression method into some file of length N. In addition, the longer file B is also compressed into a file of length N. That makes $2^N + 1$ input files whose compressed versions have length N, while there are only $2^N$ possible files of that length. We have one file too many, which means that two different input files lead to exactly the same compressed file. That makes it impossible to determine from this compressed file what the original was.
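The counting argument can be tried out on toy compressors. The sketch below is an illustration under assumptions, not part of the proof: files are strings of `0` and `1`, and the names `find_flaw` and `drop_leading_zero` are hypothetical. It checks every file up to a given length and reports either a file that grows or two files that end up with the same compressed output.

```python
from itertools import product


def files_up_to(max_len: int):
    """Yield every bit string of length 0, 1, ..., max_len, shortest first."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def find_flaw(compress, max_len: int):
    """Search the files up to max_len bits for a flaw in `compress`.

    Returns ("grows", file, output) if a file gets longer,
    ("collision", first, second) if two files share one output,
    or None if neither happens within the files that were checked.
    """
    seen = {}  # compressed output -> the input file that produced it
    for f in files_up_to(max_len):
        out = compress(f)
        if len(out) > len(f):
            return ("grows", f, out)
        if out in seen:
            return ("collision", seen[out], f)
        seen[out] = f
    return None


def drop_leading_zero(f: str) -> str:
    """Toy 'compressor': shortens files that start with 0, leaves the rest alone."""
    return f[1:] if f.startswith("0") else f


print(find_flaw(drop_leading_zero, 3))  # the empty file and "0" collide
print(find_flaw(lambda f: f, 3))        # identity: no flaw, but nothing shrinks
```

For `drop_leading_zero`, the smallest file that shrinks is `0`, so M = 1 and N = 0. There is only one file of length 0, yet both the empty file and `0` are mapped to it, which is exactly the "one file too many" from the proof.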

This means that our original assumption cannot be correct: there is no lossless compression method that makes at least one file smaller without making any other file longer. So any meaningful compression method makes at least one file larger than it was. As S10 sang: “Do you know the feeling that your dream has not come true?”
