Update: a primitive GUI is now available to help remove the last false positives from signal matching. To use it, execute keyboards/classifier-read.py with the “--gui” argument; the code lives in keyboards/ui.py.
This week, I made good progress on electromagnetic keylogging (in particular, distinguishing keystrokes from a high-current-leakage keyboard). Just as the traveller may miss the forest for the trees, so I was blinded by my own mistaken interpretation of the data at hand: depending on the key, a series of unique peaks is identifiable *after* the initial strong electromagnetic leakage pulse that signifies a closed circuit within the scan matrix.
I will not yet claim to remotely understand the physics behind what I presume is the decay of the magnetic field over time (and, by extension, the way this phenomenon translates into a measured electrical signal over time). But it is enough to know that the behavior exists, and how to apply it to recovering keystroke data.
It is immediately apparent that the “unknown” signals of “0.npy” and “2.npy” are basically time-shifted versions of “p” – however, the initial pulse (the trigger) must be exempted for this to work. This is uniquely the case for every letter of the target keyboard I have tested thus far, forming an electromagnetic signature for each key, independent of position information from the scan matrix.
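To make “time-shifted versions” concrete: normalised cross-correlation is one way to score how well one trace matches a shifted copy of another. The sketch below is my own illustration of the idea, not code from the repository, and the names are made up.

```python
import numpy as np

def best_shift_similarity(unknown, reference):
    """Score how well `unknown` matches a time-shifted `reference`
    using normalised cross-correlation (trigger pulse excluded upstream)."""
    # Zero-mean both traces so the score measures shape, not DC offset.
    u = unknown - unknown.mean()
    r = reference - reference.mean()
    xcorr = np.correlate(u, r, mode="full")
    score = xcorr.max() / np.sqrt((u ** 2).sum() * (r ** 2).sum())
    shift = int(xcorr.argmax()) - (len(r) - 1)  # lag (in samples) of best fit
    return score, shift  # score near 1.0 = same signature, merely shifted
```

Comparing an “unknown” capture such as 0.npy against the stored trace for “p” this way yields a high score only when the post-trigger peak train lines up.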
From here, the bulk of the work is in signal processing. I know little of the subject, but my stumbling about in Python has produced a somewhat viable solution, as follows:
- Firstly, extract the peaks from each signal. This is an inherently fuzzy problem, so we approximate by keeping only samples above the 99.95th percentile of the trace, and accepting only peaks a minimum distance apart (taking the maximum point in the neighbourhood of each identified peak). The resulting peak set is saved as training data.
- We perform an equivalent extraction against each identified key pulse in the “unknown” signal.
- For each pairing of unknown signal and training data, we cull the longer peak set to the length of the shorter one by discarding the peaks of lowest absolute value, on the assumption that those peaks were swallowed by noise.
- We “match” a key when the following conditions are met:
- The position of the spikes is within a tolerance limit
- The correlation coefficient of the values of both trace-peak sets is within a tolerance limit
- The difference of absolute value of each equivalent peak-pair is within a tolerance limit (perhaps this and the time-position of each spike could be converted to a distance, and the entire thing fed to something starting with import tensorflow :P)
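As a concrete (and hedged) sketch of those three checks: the tolerances and names below are placeholders I invented for illustration, not the tuned values from my classifier.

```python
import numpy as np

# Hypothetical tolerances -- the real thresholds were tuned by hand.
POS_TOL = 50        # max per-peak position error, in samples
CORR_MIN = 0.9      # min Pearson correlation of peak amplitudes
AMP_TOL = 0.2       # max per-peak absolute-amplitude difference

def is_match(unk_pos, unk_amp, train_pos, train_amp):
    """Decide whether an unknown peak set matches a training peak set.
    All four arrays are assumed pre-culled to equal length."""
    if len(unk_pos) != len(train_pos) or len(unk_pos) < 2:
        return False
    # 1. Peak positions must line up within a tolerance.
    if np.any(np.abs(unk_pos - train_pos) > POS_TOL):
        return False
    # 2. Peak amplitudes must be strongly correlated.
    if np.corrcoef(unk_amp, train_amp)[0, 1] < CORR_MIN:
        return False
    # 3. Each equivalent peak pair must be close in absolute value.
    if np.any(np.abs(np.abs(unk_amp) - np.abs(train_amp)) > AMP_TOL):
        return False
    return True
```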
Of this, the peak extraction was the most finicky to get right – I ended up playing with a number of approaches before I landed on the current one, which seems fairly resistant to noise:
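The approach can be approximated with scipy.signal.find_peaks, which applies both a height floor and a minimum-distance rule in one call; the percentile and distance values here are illustrative stand-ins, not my tuned ones.

```python
import numpy as np
from scipy.signal import find_peaks

def extract_peaks(trace, percentile=99.95, min_distance=100):
    """Extract prominent peaks from a trace: keep only samples above the
    given amplitude percentile, and within any cluster of nearby candidates
    keep the single largest one (enforced via min_distance)."""
    magnitude = np.abs(trace)
    threshold = np.percentile(magnitude, percentile)
    # find_peaks enforces both the height floor and the spacing rule.
    idx, _ = find_peaks(magnitude, height=threshold, distance=min_distance)
    return idx, trace[idx]
```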
While imperfect (though I am semi-competent at Python, I am no data scientist), this generates extremely strong matches with relatively low false-positive and low false-negative rates. An improvement over previous matching efforts is that we can now generate true-negatives, where a sample simply matches nothing in our training dataset.
Still, p<unknown>ssw<unknown>rd really doesn’t leave too much to the imagination. And because we can measure our certainty, in the instances where we’re unsure we can lay the traces on top of one another like tracing paper and manually determine the best match. We can also improve on this by saving all matches for a given unknown signal, and sieving a wordlist of common passwords / using language features to improve our signal matching.
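A sieve of that sort is easy to sketch. The representation below (one set of candidate letters per recovered keystroke, with an empty set standing in for a completely unmatched position) is my own illustration, not repository code.

```python
def sieve_wordlist(candidates, wordlist):
    """Filter a wordlist against per-keystroke candidate sets.
    `candidates` holds one set of plausible letters per recovered position;
    an empty set means the classifier matched nothing (fully unknown)."""
    hits = []
    for word in wordlist:
        if len(word) != len(candidates):
            continue  # length mismatch can never fit the capture
        # A position constrains the word only if its candidate set is non-empty.
        if all(not cand or ch in cand for ch, cand in zip(word, candidates)):
            hits.append(word)
    return hits

# p<unknown>ssw<unknown>rd, with two positions left fully unknown:
candidates = [{"p"}, set(), {"s"}, {"s"}, {"w"}, set(), {"r"}, {"d"}]
```

Run against even a small common-passwords list, this collapses the unknowns very quickly.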
During the investigation, I also explored whether unique frequency components were present in the decaying magnetic field, but I am currently unable to use a frequency-based interpretation to perform reliable matching. Observe, a matching signal overlaid on the correct training data in the frequency domain (via a power spectral density estimate using scipy.signal.welch):
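For concreteness, a comparison along those lines can be sketched as follows; the sample rate, segment length, and the choice of correlating log-spectra are all illustrative assumptions on my part.

```python
import numpy as np
from scipy.signal import welch

def psd_similarity(trace_a, trace_b, fs=1e6):
    """Compare two traces in the frequency domain via Welch PSD estimates.
    fs (sample rate) is a placeholder; the real capture rate differs."""
    _, pa = welch(trace_a, fs=fs, nperseg=256)
    _, pb = welch(trace_b, fs=fs, nperseg=256)
    # Log-scale both spectra so the comparison isn't dominated by the
    # single strongest component, then correlate the shapes.
    la, lb = np.log10(pa + 1e-12), np.log10(pb + 1e-12)
    return np.corrcoef(la, lb)[0, 1]
```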
Now, observe the same, though with an unmatched training data set:
This isn’t to rule out a frequency-based side-channel, simply that I don’t know how to do it – I am eager to explore other interpretations of the same data!
It is crucial to note that while conceptually reliable, this attack is severely affected by electromagnetic noise: a typist working by the light of the aurora should be all but immune to it. To successfully correlate training data, we must ensure there are no noise sources that cause either extraneous peak detections or missing peaks. Preferably, training should be done against the same noise background as the actual signal capture.
This attack is quite cheap to carry out, requiring only a readily available off-the-shelf USB oscilloscope (PicoScope 2206B) and two loops of thin wire. Best of all, due to per-keystroke triggering, no external storage is required for the attack. All the code required for the attack is available on GitHub, and sample data is available here. Enjoy!