Ben Laurie: Compression Violates Semantic Security

There’s been quite a lot of noise about the still not-fully-disclosed CRIME attack on TLS recently. But, fully disclosed or not, I think we can say with certainty that it turns out that compression is a problem.

The interesting thing, to me at least, is that, in retrospect, this is completely obvious. In cryptography, we have standards that we hold encryption algorithms to, and one of these is semantic security. In short, this means that an attacker should learn nothing (other than length[1]) about a plaintext, given its ciphertext. One way this is often phrased is as a game: given two plaintexts of equal lengths, and one ciphertext made from one of the two plaintexts, then an attacker, who knows everything about the algorithm other than the key, should not be able to guess better than chance which of the two plaintexts was used.

It is obvious that, in general, if compression is used, this game can only go the attacker’s way: the length of the ciphertext must reveal something about the content of the plaintext. This is because, in general, not all texts can compress – indeed, if some plaintexts come out shorter, there must also be some that come out longer. So, since the attacker knows what compression algorithm is in use, he can tell which of the two plaintexts was used by the length of the ciphertext, in general (note that there may be pairs of plaintexts for which this is not true, but in general, there are pairs where the lengths are different). And thus he wins the game, which shows that compression simply cannot be used in a system giving semantic security[2].

And we expect TLS to give semantic security, just like all modern crypto. So, it should’ve been obvious from the start that compression was a non-runner. Why did we not realise? I think the answer to that question would be very interesting indeed. Also, what else do we do now that obviously violates semantic security?

[1] Sometimes even academics admit to real world constraints!

[2] Pedants might argue that actually, yes, you can use compression: just pad everything to the longest a plaintext could compress to. As I’ve noted above, if the compression works at all (that is, some texts are reduced in length), then some texts must actually expand. Which means that you must pad to longer than the original length. So, yeah, pedant, you can use compression, but only if it actually expands!

Latest Images

Trending Articles

Latest Images