Preliminary Notes on Electromagnetic Keylogging

This is my keylogger. There are very few like it – this one is mine.

Over the past few weeks (and on and off before that), I have done some work on electromagnetic keylogging – this has met with some success. This post will document my progress so far, and collate some thoughts on future work in this area. All the code can be downloaded at my Github – due to the extreme work-in-progress nature of this project, I recommend you review the available branches before experimenting (or get in touch and let’s work together!)

Background

To explain this keylogger’s operation, we must look into the actual operation of a keyboard. A modern keyboard is comprised primarily of two sheets of flexible plastic, with conductive traces printed on them. The traces on the two sheets of plastic intersect at locations corresponding to the physical keys – when a key is pressed, a contact is made between the two plastic sheets, allowing a conductive path to form between a row and a column of the scan matrix.

A microcontroller scans the matrix by setting each pin in a column to high for a period of time, and monitoring the voltage of each of the rows:

(credit: http://www.pykett.org.uk/picscannermidi.htm)

Viewed through another lens, a microcontroller activates one of num_columns antennas intermittently, signalling that it is scanning. Previously, Martin Vuagnoux and Sylvain Pasini exploited this property to determine the presence of keystrokes via delays in the scan cycle caused by the microcontroller buffering a USB packet or creating a PS2 message (not sure of the name of these). In my experimentation, I was unable to reliably observe this delay, but I was able to (visually!) observe the difference in magnitude of leaked energy – this was enough to perform imperfect keylogging on a target with training data (and presumably, without).

Implementation

To measure and correlate my attacks, I built some simple magnetic pickups – these were simply lengths of wire on a cardboard plate, with some aluminum foil noise shielding on one side (there is insufficient testing to determine how effective this shielding is), as follows:

Some manual analysis was performed to determine the trigger threshold, sample rate and sample count for a keystroke, and some signal processing was done to align the measured signals – from there, two primary paths of correlation became apparent:

  • In circumstances where a high current flowed between the column and the row (creating a large magnetic spike), I could isolate only the single spike corresponding to a single row-column scan event, and correlate the magnetic field generated against test data.
  • In circumstances where not enough current flowed to reliably trigger on a single column-row pair, it appears possible to continuously capture signal, align the captured signals in a known way and correlate that against training data.

In some cases, feature extraction and signal processing techniques can be used to clean up the captured signal to reduce background noise: though I found when capturing single row-column scan events, I had to be careful to avoid information loss. Continuous visualization with matplotlib helped here.

A further improvement was to increase the fidelity of the information I captured, using two coils instead of one. Triggering was still done on a single coil (this was reliable enough on a test keyboard), but correlation was performed on both channels of data. This was enough to correctly capture words – though with any such approach, particularly under the time-limited sampling constraint of typing, the results are imperfect, but a limited test case (continuous recapturing of training data is continuously a pain in the fucking ass) shows promise:

The arrays above represent the “key guess”, and a weighted sum of correlation coefficients of scan events vs training data (“correct” hits with >0.9 correlation are weighted more heavily).

Observe further the below two two-channel signal plots representing a full scan cycle, the first with the “J” key held down during capture:

And a second, without:

The difference is already as clear as day, and is consistent across samples, some trivial signal analysis can clean up the traces and do the necessary correlation to identify the key.

A “Combined Arms” Approach

In truth, the core of this project is not in the electromagnetic pickup – the principle of signal correlation should be applicable to almost any side channel, with some interpretation where appropriate. Most empowering perhaps is the potential to use different inputs to solve the detection vs identification problem referred to by John Monaco in his work, “SoK: Keylogging Side Channels“.

Particularly in cases where a keypress does not generate a significant enough event to trigger off alone, a secondary channel, such as sound or vibration picked up via a smartphone, could allow us to reduce the amount of data we need to sample. Alternatively, a “SAD Match” trigger (as per the ChipWhisperer) could feasibly be used, though greatly increasing the cost of the equipment required for the attack, ChipWhisperer or no.

I am also keen on investigating the potential interpretation of this signal in the frequency domain – if I am able to reduce the effective sample rate (currently sampling at ~125MSPS, requiring the use of an oscilloscope at the very least), perhaps by using higher-frequency harmonics, it may be possible to greatly reduce the cost of this attack. Most importantly, I am keen to explore the fundamental relationship between the fundamental frequency and it’s harmonics in a way that it may be applicable to other “loud” electromagnetic leakage sources.

I am excited by the progress made in this project, and as my understanding of the theory behind this class of attack improves, the vast horizons opening up ahead; a warm reminder of the heady days of my youth, learning for the first time that impacting a user’s view of a web application was known as “cross-site scripting”. If you are interested in working together on this, please do leave a note – I am most keen to keep exploring this area.

Posted in Bards, Computers, Jesting | Leave a comment

Writeups – coffee_break (SECCON), cobol_otp (Hack.lu)

Over the past two weeks, I have spent some time diving back into CTF challenges. My skills have decayed over so long of not doing this (or the challenges could have gotten harder, but let’s get real, it’s skill decay), a timely reminder of the need to commit to deliberate practice.

Nevertheless, two of the challenges solved will be presented below.

coffee_break

This challenge is presented as a crypto challenge, with a Python encryption oracle, and an encrypted flag. You can download the original challenge here. On inspection, we note that the encrypt function is ultimately a trivial character substitution cipher: with this keyspace, it’s simply faster to brute force each character rather than wrangling a decryptor.

We then feed this intermediate decrypted value to AES decrypt, giving us the flag:

Thanks to the SECCON CTF team for organising this – I had the presence of mind to grab a few binaries for the road, and will hopefully have the opportunity to test myself against them as time goes on.

cobol_otp

This challenge was presented as a COBOL file and accompanying output, which you can download here and here. The goal was to work out what input was fed to the Cobol program to get the output.

From initial inspection, the actual encryption is just XOR, but the key is unknown. We start by using the flag format to derive the first five letters of the key (xor out to “flag{“). We can extend the key with zeros to work out likely key lengths, then tweak one character of the key at a time, based on likely words in the flag.

The solution is fairly simple, which you can download here.

Thankyou to the hack.lu team for organising this event. I’m a little frustrated by my inability to solve the no-risc-no-future challenge, stymied at the last minute by non-working shellcode (when I could have just used pwntools shellcraft shellcode) – I’ll chalk this up as a lesson learned.

Posted in Bards, Computers, Jesting | Leave a comment

Quick Note: WiFi Bash Bunny

This weekend, while sulking over my lack of forward progress in smartcard power analysis, I added WiFi control to a Bash Bunny I’ve got lying around. Somehow, Google turns up nothing on the topic (surely I’m not the only person to think of this), so I’ll place the steps to modify a Bash Bunny here. To do this, you’ll need:

  • Bash Bunny
  • ESP-01 module
  • A way to program the ESP-01 module
  • 2x 10K Resistors (not strictly needed, but incase you want field upgrades).

Cracking open a Bash Bunny reveals a fairly standard USB-dev-board piece of kit. The bulk of the case is empty space, held up by a block of gel:

UART is helpfully broken out, as well as a 3.3v power supply, so this should be an easy job. An initial UART connection revealed that this ran actual Debian, which is nice. I figure there’s probably some open source WiFi-to-serial bridges for backdooring routers and whatnot on Github, so I got to searching:

The most popular solution seems to be jeelabs/esp-link, which is way, way overengineered for what we need. About 5 minutes of Arduino later, and we’ve got a much more lightweight, single-file no-bells-and-whistles WiFi bridge, which you can download here.

Before we wire this up, we need to get an idea of power consumption. Plugging the Bash Bunny into a USB power meter shows a peak of approximately 200mA at 5V (and a running current of maybe 150mA). From this website, we can see that the ESP draws approximately 500mA, but at 3.3V. Ignoring loss from… random components, this is just enough to fit under the 500mA / 5V power draw permitted by modern USB.

Now, we flash the firmware to the ESP-01 and connect it to the Bash Bunny. We use a “permanent on” configuration with 2 pull-up resistors on RST and CH_PD. You can wire these up directly to VCC if you want, but the resistors are useful if you want to update the ESP firmware later. The final product looks like this:

There’s enough space to hide the entire setup inside the original case, if you prefer.

Plugging this in gives us a shell on the actual device. Note the local echo – you’ll need to turn this off in your client.

Enjoy!

Posted in Bards, Computers, Jesting | Leave a comment

Trace Alignment!

Over the past week, I have progressed my work in power analysis of smartcards. By replacing my original power shunt (1.5R) with a higher resistance (4.7R) alternative, I’m now able to retrieve significantly higher resolution power traces, even at 1.8V. Through the lens of a software low pass filter, we can clearly see the rounds of some cryptographic operation (suspected AES, rest of trace snipped for brevity):

Unfortunately, this isn’t quite ready for further analysis:

I spent some time investigating the possible strategies for realigning the traces time-wise. Two approaches were identified:

  • Firstly, the Sum of Absolute Difference approach can be used. A “window” of traces is selected, and is shifted back and forward until the sum of absolute differences with a reference window is minimized. This is the approach taken by ChipWhisperer control software.
  • Alternatively, the window can be shifted back and forth until the correlation coefficient between it and a reference window is maximised, similar to a CPA key matching attack. This is probably the easier of the two approaches to independently implement, with numpy.corrcoef doing the heavy lifting for us

Both strategies should produce roughly the same results, though I suspect a Sum-of-Absolute-Difference match will result in more fine-grained control over exactly how much a trace can differ from a reference, though this may result in false negative matches when low-frequency fluctuations in power supply present themselves (aka why USB, why), causing a sum of absolute difference to exceed a chosen maximum threshold.

Both are relatively simple to implement, and the tooling is now available as part of the fuckshitfuck toolkit, as preprocessor.py. Note that this tool only implements “soft matching”: that is, it will use the selected matching strategy only up to a point, if it can’t find a match within a certain number of samples, it will discard the offending trace.

Window selection is also crucial, particularly when looking at cryptographic operations with repeating rounds that are similar in execution. For the below trace, I took a two-pass approach: firstly, I matched the “prefix” (second “column”) loosely, matching to a minimum correlation coefficient of 0.7. I then matched the third/fourth columns more tighly, with 0.95 minimum correlation – this managed to mostly (visual inspection) eliminate the temporal misalignment of the remaining traces.

Also, the tooling’s a bit slow (though it’s questionable whether this is a problem for a one-off preprocessing activity). Initial attempts at multi-threading the code met with failure, due to what I suspect is an IO bottleneck (in other words, a single huge memory mapped array of traces).

My attempts at massaging out a working Ki+OPc value from the smartcard’s MILENAGE (I suspect? I don’t think I can really “check” unless I go work for Gemalto or GCHQ), but thin rays of hope shine through the signal-to-noise ratio. Observe:

What exciting times we live in!

Posted in Bards, Computers, Jesting | Leave a comment

ESP-on-a-PMOD Runtime FPGA Configuration

Following on from my work yesterday, I needed something to quickly reconfigure the edge triggering FPGA (i.e. to use it as a generic smartcard trigger, if without support for true ISO7816 pattern triggering). My solution was to stick a PMOD connector onto an ESP-01 and just bit-bang it over a clocked serial line:

Simply put, the ESP-01 stands up a temporary wireless access network and a webserver. For all the shit-talking Arduino gets, I’d sure as fuck rather spend 15 minutes writing this in Arduino with HL functions than 3 days trying to debug why DHCP isn’t working.

The webserver serves three endpoints:

  • /, which displays a usage message
  • /configure?io=X&clk=Y, configuring the IO edge count and clock edge count
  • /program, which bit-bangs IO and CLK parameters to the FPGA

The ESP-FPGA communication is a simple fixed-length clocked signal, sampled on each rising edge of a dedicated clock line. It’s extremely slow, holding each signal for 40ms, but as a delay-tolerant configuration activity I don’t really care, an LED on the FPGA board is quickly repurposed to serve as a “programming indicator”.

A trivial Verilog state machine allows us to receive this incoming logic, and set the FPGA’s internal registers accordingly. For future reference and re-use:

always @(posedge prog_clk)
begin
    if(prog_reset == 0)
    begin         
        prog_shift <= 0;
        prog_wait_state <= 1;
    end
    else
    begin
        if(prog_wait_state == 1)
        begin
            if(prog_io == 1)
            begin
                IO_EDGE_TARGET <= 0;
                prog_state_io <= 1;
                prog_state_clk <= 0;
            end
            else
            begin
                CLK_EDGE_TARGET <= 0;
                prog_state_io <= 0;
                prog_state_clk <= 1;
            end
            prog_wait_state <= 0;
        end
        else
        begin
            if(prog_state_io == 1)  // start shifting bits in
            begin
                IO_EDGE_TARGET <= IO_EDGE_TARGET + (prog_io << prog_shift);
            end
            else if(prog_state_clk == 1)
            begin
                CLK_EDGE_TARGET <= CLK_EDGE_TARGET + (prog_io << prog_shift);
            end
            prog_shift <= prog_shift + 1;
        end
    end
end

For now, the ESP will always take 32-bit arguments and send the full 32-bits, but there is room for later optimisation.

Posted in Bards, Computers, Jesting | Leave a comment