C4: leaky power
222
C4 is a very advanced AES based defensive system. You are able to monitor the power lines. Is that enough?
You’re given three files:
powertraces.npy: Measurements (over time) of power consumption of a chip while performing AES encryption
plaintexts.npy: Corresponding plaintext inputs that were encrypted
instructions.jwe: File encrypted using the same key as plaintexts.npy.
note: The first two files are NumPy arrays.
note: there’s a mistake in the way instructions.jwe was created (the algorithm is A128GCM, not A256GCM).
The cipher is AES. From the challenge description we know that the power lines were monitored: powertraces.npy holds the captured power traces, plaintexts.npy holds the corresponding plaintexts, and the ciphertext we want to decrypt is in instructions.jwe.
Since we have power traces, we will need some kind of side-channel analysis:
In CPA (correlation power analysis), the goal is to build an accurate power model of the device under attack. During an attack, the aim is to find a correlation between the predicted power consumption and the measured power consumption of the device. If the power model is accurate, the two should correlate strongly. Given a large enough number of traces, the correct key guess is the one that shows the highest correlation.
One power model which may be used is the Hamming Weight Power Model. The Hamming weight of a value is the number of non-zero symbols it contains; for a binary number, that is the number of bits set to 1. For example, the binary number 1100 0010 has a Hamming weight of 3. The assumption behind the Hamming Weight Power Model is that the number of bits set to 1 in an output is correlated with the power consumption of the device. The Hamming weight is then used as an arbitrary unit to model power consumption, and these units can be compared with the actual voltage levels of power traces captured while the device was performing cryptographic operations. This comparison is the process of finding the correlation between the modelled power values and the power actually consumed.
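The Hamming weight is easy to compute in Python. The sketch below is only an illustration; the names `hamming_weight` and `HW` are our own choices, not taken from the challenge files.

```python
def hamming_weight(value: int) -> int:
    """Return the number of bits set to 1 in a non-negative integer."""
    return bin(value).count("1")


# Lookup table for every possible byte value, convenient inside a CPA loop.
HW = [hamming_weight(b) for b in range(256)]

# The example from the text: 1100 0010 has three bits set.
print(hamming_weight(0b11000010))
```

Precomputing the table once avoids recounting bits for every trace and every key guess.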
One technique to calculate the correlation between the power model and the actual power consumption is the Pearson correlation coefficient. In essence, this equation takes two sets of data (X and Y) and measures whether there is a linear (positive or negative) correlation between them. We can use it to find significance in our power traces because the assumption of the Hamming Weight Power Model is that as the number of 1s in our predicted output increases, so does the power consumption in the actual output (and vice versa).
_Figure 1 - Pearson correlation coefficient equation_
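For reference, the standard form of the Pearson coefficient for n paired samples, written with the same x, y, x̅ and ȳ symbols used later in this write-up, is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The value of r lies between -1 and 1; values near either end indicate a strong linear relationship.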
AES
At the start of encryption, the plaintext values (the data to be encrypted) and the cipher key values (the key used for encryption and decryption purposes) will be each arranged into a 4×4 matrix in the positions as shown in Figure 2. Each value in this matrix holds 1 byte of data. During the AddRoundKey step, each plaintext value is XOR’d with a cipher key value at a matching position in the 4×4 matrix.
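A minimal sketch of this step follows. The plaintext and key are the example vectors from FIPS-197 Appendix B, not data from the challenge, and the helper names `to_state` and `add_round_key` are hypothetical.

```python
# Example vectors from FIPS-197 Appendix B (not the challenge data).
PLAINTEXT = bytes.fromhex("3243f6a8885a308d313198a2e0370734")
KEY = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")


def to_state(block: bytes) -> list:
    """Arrange 16 bytes into a 4x4 matrix, filled column by column as AES does."""
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def add_round_key(plaintext: bytes, key: bytes) -> bytes:
    """XOR each plaintext byte with the key byte at the same position."""
    return bytes(p ^ k for p, k in zip(plaintext, key))


state = add_round_key(PLAINTEXT, KEY)
for row in to_state(state):
    print(" ".join(f"{byte:02x}" for byte in row))
```

Because the XOR is applied byte by byte, each key byte only influences one state byte at this point, which is what lets the attack recover the key one byte at a time.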
_Figure 2. Plaintext and cipher key arrangement._
After AddRoundKey, the SubBytes step uses the result of Pi ⊕ Ki as an index into the Rijndael S-box, and the S-box output replaces Pi ⊕ Ki. The S-box is a 16×16 table that is the same for all AES implementations. Each position in the table holds 1 byte of data.
_Figure 3 - Rijndael S-box_
For example, if the result of Pi ⊕ Ki is c5, we look up row c and column 5 of the S-box table and obtain the value a6.
The CPA attack implemented here exploits the fact that information may leak through the power consumption of a cryptographic device at the moment the S-box lookup is carried out.
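The lookup can be checked without typing in the whole table, because the S-box is defined mathematically: each entry is the multiplicative inverse in GF(2^8) followed by a fixed affine transform. The sketch below rebuilds the table that way; it is an illustration with hypothetical helper names, and a real script would more likely hard-code the 256 values.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Brute-force multiplicative inverse; 0 maps to 0 by convention."""
    for candidate in range(1, 256):
        if gf_mul(a, candidate) == 1:
            return candidate
    return 0


def rotl8(x: int, shift: int) -> int:
    """Rotate a byte left by `shift` bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF


def sbox_entry(a: int) -> int:
    """Inverse in GF(2^8), then the AES affine transform with constant 0x63."""
    inv = gf_inv(a)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


sbox = [sbox_entry(a) for a in range(256)]

# Row = high nibble, column = low nibble of the input byte.
value = 0xC5
row, col = value >> 4, value & 0x0F
print(f"row {row:x}, column {col:x} -> {sbox[16 * row + col]:02x}")
```

Indexing the flat list with the byte itself (`sbox[0xC5]`) gives the same result as the row and column lookup.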
Writing Python Script of CPA
Both of these files (powertraces.npy and plaintexts.npy) are NumPy arrays, so they can be loaded into Python with the numpy.load function:
    import numpy as np
Now, for each key byte, we want to build a set of hypotheses. The key guesses range from 0x00 to 0xff (all possible byte values). Each guess is XORed with the corresponding plaintext byte, and we calculate the Hamming weight of the resulting intermediate value:
    sbox=(
Still inside the same loop, we calculate the mean of the hypotheses and the mean of each trace point. Both come from the correlation formula in Figure 1.
The two random variables are X and Y. X is the Hamming weight of the hypothesis for each tested plaintext byte under a given key guess, and Y is the measured power consumption at each point of each trace. So xi in the formula is the hypothesis HW[intermediate(plaintext[x0][k0], kguess)] and yi runs over the points of the power consumption trace, while x̅ is the mean of X and ȳ is the mean of Y.
Calculating the means:
    #Mean of hypothesis
Now we calculate the summations in the formula and take the square root:
    #For each trace, do the following
After calculating the correlation for every key guess, the best guess for the key byte is the one with the highest correlation:
    bestguess[bnum] = np.argmax(maxcpa)
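The whole procedure for a single key byte can be tried on simulated data. The sketch below is not the script used for the challenge: it is pure Python, it assumes the targeted intermediate is the S-box output of plaintext XOR key guess, and the secret byte, trace count and noise levels are invented. The S-box is rebuilt from its definition so that the block runs on its own.

```python
import math
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """AES S-box: field inverse (0 maps to 0) followed by the affine transform."""
    rotl = lambda x, s: ((x << s) | (x >> (8 - s))) & 0xFF
    table = []
    for value in range(256):
        inv = next((c for c in range(1, 256) if gf_mul(value, c) == 1), 0)
        table.append(inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3) ^ rotl(inv, 4) ^ 0x63)
    return table


SBOX = build_sbox()
HW = [bin(v).count("1") for v in range(256)]


def simulate(secret, n_traces, n_points, leak_point, rng):
    """Hypothetical device: one sample leaks HW(SBOX[p ^ secret]) plus noise."""
    plaintexts = [rng.randrange(256) for _ in range(n_traces)]
    traces = []
    for p in plaintexts:
        trace = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        trace[leak_point] = HW[SBOX[p ^ secret]] + rng.gauss(0.0, 0.5)
        traces.append(trace)
    return plaintexts, traces


def max_abs_correlation(hyp, traces):
    """Largest |Pearson r| between the hypothesis and any sample point."""
    n = len(traces)
    mean_h = sum(hyp) / n
    den_h = sum((h - mean_h) ** 2 for h in hyp)
    best = 0.0
    for point in range(len(traces[0])):
        column = [trace[point] for trace in traces]
        mean_t = sum(column) / n
        num = sum((h - mean_h) * (y - mean_t) for h, y in zip(hyp, column))
        den_t = sum((y - mean_t) ** 2 for y in column)
        best = max(best, abs(num / math.sqrt(den_h * den_t)))
    return best


def recover_key_byte(plaintexts, traces):
    """Try all 256 guesses and keep the one with the highest correlation."""
    maxcpa = []
    for kguess in range(256):
        hyp = [HW[SBOX[p ^ kguess]] for p in plaintexts]
        maxcpa.append(max_abs_correlation(hyp, traces))
    return max(range(256), key=lambda k: maxcpa[k])


SECRET = 0x3C  # invented key byte for the simulation
rng = random.Random(1234)
plaintexts, traces = simulate(SECRET, n_traces=200, n_points=5, leak_point=2, rng=rng)
recovered = recover_key_byte(plaintexts, traces)
print(f"recovered key byte: {recovered:#04x}")
```

Taking the maximum absolute correlation over all sample points means we do not need to know in advance where in the trace the S-box lookup happens. Repeating the same search for each of the 16 byte positions yields the full key.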
In the end we get the complete key used in the encryption. The full script:
    import numpy as np
Running it:
    $ python leak_power.py
The key is d2dea057d1145f456796966024a703b2. Now that we have the key, we can decrypt the ciphertext with a few lines of Go:
    package main
Running it, we get the flag:
    $ go get gopkg.in/square/go-jose.v2
The flag was flag-e2f27bac480a7857de45