``ElGamal'' is leaky and vulnerable
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Python-Crypto |
Invalid
|
Undecided
|
Unassigned |
Bug Description
The implemented ElGamal encryption scheme is leaky and vulnerable.
That is, one may distinguish ciphertexts of two distinct messages with non-negligible advantage and one may retrieve the messages from the ciphertexts.
The problems arise from the patch #985164 and the ``textbook'' ElGamal encryption scheme.
Preliminaries:
Let y = g^x be the public key and x be the secret key.
The ciphertext of M with y is (a,b) = (g^K, M * y^K), where K is a random element from Z_p^*.
For an element a in Z_p^*, its Legendre symbol is denoted by (a | p).
One can easily compute the Legendre symbol from the quadratic reciprocity.
(See, e.g., http://
Let QR denote a set of quadratic residues of Z_p^* (i.e., the subgroup of Z_p^* of order q).
In other word, a set of elements in Z_p^* such that its Legendre symbol is 1.
Attack 1:
The patch and the bad choice of the message space makes the scheme leaky.
Since one can compute the Legendre symbol easily, the scheme leaks information of the messages.
The Legendre symbol of g is -1 following g is the generator of Z_p^* of order 2q.
Hence, one can decide K is even or odd by computing (a | p) = (g | p)^K.
In addition, one can decide x is even or odd by computing (y | p) = (g | p)^x.
From the two values, (b | p) = (M | p) * (y | p)^K = (M | p) * (g | p)^{xK} leaks (M | p).
Attack2:
Even if g is of order q instead of 2q, the scheme leaks information of a message, since the message space is Z_p^*.
This is because (b | p) = (M | p) * (g | p)^{xK} = (M | p), where the last equation follows from the fact that g is in QR.
To avoid the attacks, the message space should be QR.
Attack 3 and more:
In addition, the textbook ElGamal should not be used in the real world.
Roughly speaking, one should not encrypt a short message M, say 64 bits, with the textbook ElGamal encryption scheme.
See the paper ``Why Textbook ElGamal and RSA Encryption Are Insecure'' written by Boneh, Joux, and Nguyen in ASIACRYPT 2000.
(Available at http://
Summary:
The ``textbook'' implementation leaks information and makes the scheme vulnerable.
We should stress that one should not use this implementation, shouldn't we?
I'll need some help with this, since I'm actually not too familiar with ElGamal, but I can draw a few parallels to RSA.
PyCrypto appears to have originally been designed to be an assortment of fast, low-level cryptographic primitives that can be composed to implement higher-level protocols. The target audience has really just been other crypto implementers. Unfortunately, due to being the top search result for 'python cryptography', a lot of ordinary application developers have been using it directly without really knowing what they are doing.
Crypto. PublicKey. RSA is an implementation of textbook RSA, and there's no good way to change it without breaking things for other developers who understand this and implemented the appropriate padding on top of it (e.g. the paramiko developers). We've recently added the Crypto. Cipher. PKCS1_OAEP and Crypto. Signature. PKCS1_PSS modules, so that we can at least tell people to use those instead of the raw RSA primitive.
Crypto. PublicKey. ElGamal is similar. It implements textbook ElGamal, by design. This is known, and while it's an unfortunate design decision, it's probably not going to change any time soon. We assume that application developers are implementing standard protocols with an appropriate level of understanding of what's going on, and with some interoperability testing (if they're not doing this already, they're screwed anyway). We should probably implement some sort of standard padding scheme on top of the ElGamal primitive, like we did with RSA. (Is there even a standard padding scheme for ElGamal?)
I'd like to focus the discussion on vulnerabilities, if any, under the assumption that textbook ElGamal is *not* being used. Can any of you comment on that?