Quick Note: WiFi Bash Bunny

This weekend, while sulking over my lack of forward progress in smartcard power analysis, I added WiFi control to a Bash Bunny I’ve got lying around. Somehow, Google turns up nothing on the topic (surely I’m not the only person to think of this), so I’ll place the steps to modify a Bash Bunny here. To do this, you’ll need:

  • Bash Bunny
  • ESP-01 module
  • A way to program the ESP-01 module
  • 2x 10K Resistors (not strictly needed, but useful in case you want field upgrades).

Cracking open a Bash Bunny reveals a fairly standard USB-dev-board piece of kit. The bulk of the case is empty space, held up by a block of gel:

UART is helpfully broken out, as well as a 3.3V power supply, so this should be an easy job. An initial UART connection revealed that this ran actual Debian, which is nice. I figured there were probably some open source WiFi-to-serial bridges for backdooring routers and whatnot on GitHub, so I got to searching:

The most popular solution seems to be jeelabs/esp-link, which is way, way overengineered for what we need. About 5 minutes of Arduino later, and we’ve got a much more lightweight, single-file no-bells-and-whistles WiFi bridge, which you can download here.

Before we wire this up, we need to get an idea of power consumption. Plugging the Bash Bunny into a USB power meter shows a peak of approximately 200mA at 5V (and a running current of maybe 150mA). From this website, we can see that the ESP draws approximately 500mA, but at 3.3V. Ignoring loss from… random components, this is just enough to fit under the 500mA / 5V power draw permitted by modern USB.
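The budget above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: it assumes a lossless 5V to 3.3V conversion (optimistic), and the constant and function names are made up for illustration. The current figures are the ones quoted above.

```python
# Back-of-envelope USB power budget.
# Assumption: the 3.3V rail is fed by an ideal (lossless) converter.
USB_LIMIT_MA = 500        # budget at 5V, as quoted above
BUNNY_PEAK_MA = 200       # measured peak at 5V
BUNNY_RUNNING_MA = 150    # measured running current at 5V
ESP_MA_AT_3V3 = 500       # ESP figure quoted above, at 3.3V


def draw_at_5v(ma_at_3v3, efficiency=1.0):
    """Convert a current drawn at 3.3V into the equivalent current at 5V."""
    return ma_at_3v3 * 3.3 / (5.0 * efficiency)


esp_ma = draw_at_5v(ESP_MA_AT_3V3)
running_total = BUNNY_RUNNING_MA + esp_ma
peak_total = BUNNY_PEAK_MA + esp_ma

print(f"ESP-01 seen from the 5V side: {esp_ma:.0f}mA")
print(f"Running total: {running_total:.0f}mA of {USB_LIMIT_MA}mA")
print(f"Worst case (both peaks at once): {peak_total:.0f}mA of {USB_LIMIT_MA}mA")
```

Under these assumptions the running figure fits, while both peaks landing at the same instant would slightly overshoot, so "just enough" is about right.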

Now, we flash the firmware to the ESP-01 and connect it to the Bash Bunny. We use a “permanent on” configuration with 2 pull-up resistors on RST and CH_PD. You can wire these up directly to VCC if you want, but the resistors are useful if you want to update the ESP firmware later. The final product looks like this:

There’s enough space to hide the entire setup inside the original case, if you prefer.

Plugging this in gives us a shell on the actual device. Note the local echo – you’ll need to turn this off in your client.


Posted in Bards, Computers, Jesting | Leave a comment

Trace Alignment!

Over the past week, I have progressed my work in power analysis of smartcards. By replacing my original power shunt (1.5R) with a higher resistance (4.7R) alternative, I’m now able to retrieve significantly higher resolution power traces, even at 1.8V. Through the lens of a software low pass filter, we can clearly see the rounds of some cryptographic operation (suspected AES, rest of trace snipped for brevity):

Unfortunately, this isn’t quite ready for further analysis:

I spent some time investigating the possible strategies for realigning the traces time-wise. Two approaches were identified:

  • Firstly, the Sum of Absolute Difference approach can be used. A “window” of samples is selected from each trace, and is shifted back and forward until the sum of absolute differences with a reference window is minimized. This is the approach taken by the ChipWhisperer control software.
  • Alternatively, the window can be shifted back and forth until the correlation coefficient between it and a reference window is maximised, similar to a CPA key matching attack. This is probably the easier of the two approaches to independently implement, with numpy.corrcoef doing the heavy lifting for us.
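The first strategy can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from preprocessor.py or ChipWhisperer: the function names, the toy traces and the window parameters are all made up for the example.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_shift_sad(reference, trace, start, length, max_shift):
    """Return (shift, score) minimising the SAD against the reference window.

    A shift of s means the candidate window is trace[start + s : start + s + length].
    """
    ref_window = reference[start:start + length]
    best = None
    for shift in range(-max_shift, max_shift + 1):
        lo = start + shift
        if lo < 0 or lo + length > len(trace):
            continue  # candidate window would fall off the trace
        score = sad(ref_window, trace[lo:lo + length])
        if best is None or score < best[1]:
            best = (shift, score)
    return best


# Toy data: the same pulse, arriving two samples late in the second trace.
reference = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]

shift, score = best_shift_sad(reference, delayed, start=2, length=5, max_shift=3)
print(shift, score)
```

On real traces the minimum score is never exactly zero, which is where a maximum acceptable SAD threshold comes in.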

Both strategies should produce roughly the same results. I suspect a Sum-of-Absolute-Difference match gives more fine-grained control over exactly how much a trace can differ from a reference. The cost is false negatives: low-frequency fluctuations in the power supply (aka why USB, why) can push the sum of absolute differences over the chosen maximum threshold, even when the shape of the trace matches.

Both are relatively simple to implement, and the tooling is now available as part of the fuckshitfuck toolkit, as preprocessor.py. Note that this tool only implements “soft matching”: that is, it will use the selected matching strategy only up to a point. If it can’t find a match within a certain number of samples, it will discard the offending trace.
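A rough sketch of what “soft matching” with the correlation strategy could look like follows. It is not the preprocessor.py implementation: the names are hypothetical, the Pearson coefficient is hand-rolled to keep the example dependency-free (numpy.corrcoef would do the same job), and padding with edge samples is just one possible way to keep the trace length constant.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat window has no shape to match against
    return cov / sqrt(var_a * var_b)


def soft_align(reference, trace, start, length, max_shift, min_corr):
    """Return the realigned trace, or None if no shift reaches min_corr."""
    ref_window = reference[start:start + length]
    best_shift, best_corr = None, None
    for shift in range(-max_shift, max_shift + 1):
        lo = start + shift
        if lo < 0 or lo + length > len(trace):
            continue
        corr = pearson(ref_window, trace[lo:lo + length])
        if corr >= min_corr and (best_corr is None or corr > best_corr):
            best_shift, best_corr = shift, corr
    if best_shift is None:
        return None  # no acceptable match within max_shift: discard the trace
    if best_shift >= 0:
        # trace is late: drop leading samples, pad the tail with the last sample
        return trace[best_shift:] + [trace[-1]] * best_shift
    # trace is early: pad the head with the first sample, drop trailing samples
    return [trace[0]] * (-best_shift) + trace[:best_shift]


reference = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
noise = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]

aligned = soft_align(reference, delayed, start=2, length=5, max_shift=3, min_corr=0.95)
discarded = soft_align(reference, noise, start=2, length=5, max_shift=3, min_corr=0.95)
print(aligned, discarded)
```

The delayed trace is pulled back into line with the reference, while the noise trace never reaches the threshold and is thrown away.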

Window selection is also crucial, particularly when looking at cryptographic operations with repeating rounds that are similar in execution. For the below trace, I took a two-pass approach: firstly, I matched the “prefix” (second “column”) loosely, matching to a minimum correlation coefficient of 0.7. I then matched the third/fourth columns more tightly, with 0.95 minimum correlation – this managed to mostly (visual inspection) eliminate the temporal misalignment of the remaining traces.

Also, the tooling’s a bit slow (though it’s questionable whether this is a problem for a one-off preprocessing activity). Initial attempts at multi-threading the code met with failure, due to what I suspect is an IO bottleneck (in other words, a single huge memory mapped array of traces).

My attempts at massaging out a working Ki+OPc value from the smartcard’s MILENAGE (I suspect? I don’t think I can really “check” unless I go work for Gemalto or GCHQ) have not yet succeeded, but thin rays of hope shine through the signal-to-noise ratio. Observe:

What exciting times we live in!


ESP-on-a-PMOD Runtime FPGA Configuration

Following on from my work yesterday, I needed something to quickly reconfigure the edge triggering FPGA (i.e. to use it as a generic smartcard trigger, if without support for true ISO7816 pattern triggering). My solution was to stick a PMOD connector onto an ESP-01 and just bit-bang it over a clocked serial line:

Simply put, the ESP-01 stands up a temporary wireless access point and a webserver. For all the shit-talking Arduino gets, I’d sure as fuck rather spend 15 minutes writing this in Arduino with high-level functions than 3 days trying to debug why DHCP isn’t working.

The webserver serves three endpoints:

  • /, which displays a usage message
  • /configure?io=X&clk=Y, configuring the IO edge count and clock edge count
  • /program, which bit-bangs IO and CLK parameters to the FPGA

The ESP-FPGA communication is a simple fixed-length clocked signal, sampled on each rising edge of a dedicated clock line. It’s extremely slow, holding each signal for 40ms, but as a delay-tolerant configuration activity I don’t really care. An LED on the FPGA board is quickly repurposed to serve as a “programming indicator”.

A trivial Verilog state machine allows us to receive this incoming logic, and set the FPGA’s internal registers accordingly. For future reference and re-use:

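// Note: the begin/end and else lines are reconstructed from the indentation.
always @(posedge prog_clk) begin
    if(prog_reset == 0) begin
        // synchronous, active-low reset: wait for the selector bit
        prog_shift <= 0;
        prog_wait_state <= 1;
    end else begin
        if(prog_wait_state == 1) begin
            // first bit after reset selects which register gets programmed
            if(prog_io == 1) begin
                IO_EDGE_TARGET <= 0;
                prog_state_io <= 1;
                prog_state_clk <= 0;
            end else begin
                CLK_EDGE_TARGET <= 0;
                prog_state_io <= 0;
                prog_state_clk <= 1;
            end
            prog_wait_state <= 0;
        end else begin
            if(prog_state_io == 1)  // start shifting bits in
                IO_EDGE_TARGET <= IO_EDGE_TARGET + (prog_io << prog_shift);
            else if(prog_state_clk == 1)
                CLK_EDGE_TARGET <= CLK_EDGE_TARGET + (prog_io << prog_shift);
            prog_shift <= prog_shift + 1;
        end
    end
end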
For now, the ESP will always take 32-bit arguments and send the full 32 bits, but there is room for later optimisation.
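To make the framing concrete, here is a small Python model of both ends of the link, as I read the state machine: one clock edge with reset held low, one selector bit (1 for the IO edge target, 0 for the clock edge target), then the value least-significant bit first. It is a sketch for reasoning about the protocol, not the ESP firmware; the class and function names are invented, and the 32-bit register width is an assumption.

```python
WIDTH = 32               # assumed register width, matching the 32-bit arguments
MASK = (1 << WIDTH) - 1


def frame_bits(value, target_io):
    """Bits put on the IO line, one per rising clock edge, after the reset edge."""
    selector = 1 if target_io else 0
    return [selector] + [(value >> i) & 1 for i in range(WIDTH)]


class Receiver:
    """Software model of the receiving state machine."""

    def __init__(self):
        self.io_edge_target = 0
        self.clk_edge_target = 0
        self.shift = 0
        self.wait_state = 1
        self.state_io = 0
        self.state_clk = 0

    def rising_edge(self, reset_n, io):
        if reset_n == 0:
            self.shift = 0
            self.wait_state = 1
        elif self.wait_state == 1:
            # selector bit: clear the chosen register and remember the choice
            if io == 1:
                self.io_edge_target = 0
                self.state_io, self.state_clk = 1, 0
            else:
                self.clk_edge_target = 0
                self.state_io, self.state_clk = 0, 1
            self.wait_state = 0
        else:
            if self.state_io == 1:
                self.io_edge_target = (self.io_edge_target + (io << self.shift)) & MASK
            elif self.state_clk == 1:
                self.clk_edge_target = (self.clk_edge_target + (io << self.shift)) & MASK
            self.shift += 1


def program(receiver, value, target_io):
    """Clock one complete frame into the receiver."""
    receiver.rising_edge(0, 0)  # one edge with reset asserted (active low)
    for bit in frame_bits(value, target_io):
        receiver.rising_edge(1, bit)


rx = Receiver()
program(rx, 7, target_io=True)
program(rx, 1234, target_io=False)
print(rx.io_edge_target, rx.clk_edge_target)
```

At 40ms per signal, a 34-edge frame like this takes well over a second per register, which is tolerable for a one-off configuration step.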


Adventures in ISO7816 “Smart” I/O Triggering

I recently wanted to build something to help me trigger off ISO7816 traffic. This led to a week of learning-through-failure, and this post talks through some of the learning experiences, and provides a solution for anyone else attempting to solve the same problem (not a true “smart” trigger, but at least something to land you in the right general place).

An initial design constraint for me was to not use a host emulation approach: while it would be simple to build a “smartcard proxy” which sent the appropriate trigger whenever I wanted, I think this would be extremely limiting in cases where the smartcard tries to verify the host via something like baud rate support or similar.

ISO7816 communication can be summarized as a half-duplex, one-line serial protocol with UART-style character framing and a non-standard baud rate that is derived off the clock line and may change over the lifetime of a session. It can be decoded using UART, but you’ll need to experiment with baud rate to get a clear reading (the following traffic was at 149K).
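The relationship between clock and baud rate is simple enough to compute. In ISO 7816-3, one bit lasts one “elementary time unit” of F/D clock cycles, with F=372 and D=1 as the defaults until the card and reader agree on something faster. The sketch below just evaluates that formula; the function names are made up, and the 4MHz clock and the F=512, D=8 pair are illustrative values rather than measurements from this card.

```python
def clocks_per_bit(F=372, D=1):
    """Number of card clock cycles that make up one bit (one etu)."""
    return F / D


def baud_rate(clock_hz, F=372, D=1):
    """Bit rate in bits per second for a given card clock and F/D pair."""
    return clock_hz / clocks_per_bit(F, D)


default_rate = baud_rate(4_000_000)            # defaults, right after reset
faster_rate = baud_rate(4_000_000, F=512, D=8)  # illustrative negotiated values

print(f"default: {default_rate:.0f} baud, faster: {faster_rate:.0f} baud")
```

If you can measure the clock frequency and know (or guess) the F/D pair, this narrows down the baud rates worth trying on the UART decoder.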

Furthermore, there are quirks: for example, the client will “echo” one byte of a command back to the reader, typically before arguments are provided, which may not be visible at first glance.

Upon initially facing the problem, I immediately entered a fit of madness, and decided I would use an ATMega168’s external interrupts to count IO edges. I suspect I primarily wanted to do this:

This was a learning adventure in using external interrupts on the ATMega168/328, which I had not actually done before. In a nutshell, they can be used by configuring two registers:

  • EICRA, which configures when an interrupt should occur (rising edge, falling edge, any change).
  • EIMSK, which configures which interrupts are permitted

The actual interrupt routine is defined as a special function (in the below example, for interrupt 0).

ISR (INT0_vect)
{
    // toggle two output pins each time external interrupt 0 fires
    PORTB ^= (1 << PINB1);
    PORTD ^= (1 << PIND0);
}

While aesthetically pleasing in its own hot-glue-and-duct-tape way, the ATmega168 solution is unfortunately a bit too slow for this work. At 3.3V, we’re able to run at 8MHz at best, which isn’t enough to do clock-cycle-level triggering on something running at just over 4MHz. We can’t really use a higher voltage either: at 5V, we’ll miss the 1.8V logic signal (and I didn’t have any level shifters handy, and had a grand total of three 2N3904s left).

On rethinking the problem, I took the more sane approach of using an FPGA for this task. I started with a simple edge counter, which worked fine for static traffic – but I needed to be able to send somewhat variable traffic to the target for the task at hand. I dug out my old workhorse Arty board and a Saleae for debugging, and got to work:

I settled on a hybrid approach, where I first counted rising edges on the I/O line to get me “close”, then counted clock edges wherever variable (but fixed-length) data was present. This can be represented by the following logic diagram:

This unfortunately resulted in a stack of errors around “multi-driven nets”. To debug this error, I could refer to the Schematic, under “Open Elaborated Design” on the navigation menu in Vivado 2018.3. This opens up a schematic representing which inputs drive which outputs:

A correct flow graph looks like this, with your inputs driving all outputs (i.e. connected left/right). Any outputs which are driven multiple times (which Vivado turns into driven once, and ignored) should stick out pretty quickly.

I tackled this hurdle by fixing my code to use scard_clk as a “Master External Clock” controlling the sampling of all other inputs, and then using a state machine model to drive state transitions between waiting for IO edges and waiting for clock edges. Truly, FPGA programming is always a breath of fresh air and fresh thinking onto a problem.
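The hybrid trigger is easy to model in software before committing it to an FPGA. The sketch below is a behavioural model only, not the Verilog from the repository: the state names, the function name and the sample data are invented. It takes the IO level as sampled on every rising edge of the card clock, counts IO rising edges until the first target is hit, then counts clock cycles until the second.

```python
WAIT_IO, WAIT_CLK = range(2)


def trigger_cycle(io_samples, io_edge_target, clk_edge_target):
    """Return the clock-cycle index at which the trigger fires, or None.

    io_samples[i] is the IO line level sampled on the i-th rising clock edge.
    """
    state = WAIT_IO
    io_edges = 0
    clk_edges = 0
    prev_io = io_samples[0] if io_samples else 0
    for cycle, io in enumerate(io_samples):
        if state == WAIT_IO:
            if prev_io == 0 and io == 1:  # rising edge on IO
                io_edges += 1
                if io_edges == io_edge_target:
                    if clk_edge_target == 0:
                        return cycle      # no extra clock delay requested
                    state = WAIT_CLK
        else:
            clk_edges += 1
            if clk_edges == clk_edge_target:
                return cycle
        prev_io = io
    return None


# IO rises on cycles 4 and 7; trigger three clock cycles after the second rise.
samples = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
fired_at = trigger_cycle(samples, io_edge_target=2, clk_edge_target=3)
never = trigger_cycle(samples, io_edge_target=5, clk_edge_target=3)
print(fired_at, never)
```

Because everything is evaluated once per clock sample and a single piece of state decides what is being counted, each value has exactly one driver, which is the property the multi-driven net errors were complaining about.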

The result is a nice, clean consistent trigger, down to the clock cycle (lines are CLK/IO/trigger):

15 minutes of SPA (what a fancy name for “looking at it”) later, and we are able to identify the 14 rounds of the first full-size software AES operation (in this case, testing of a supplied AUTN parameter as part of MILENAGE – what exciting times we live in!).

The code is available in the “x/” directory of github.com/CreateRemoteThread/fuckshitfuck. As a future improvement, I’m keen to make a re-usable, on-the-fly configurable core (though this seems to be a rock-and-a-weird-place choice between convenience vs overhead – I will study the ChipWhisperer source code for clues).

I am keen to hear more about other people’s approaches to this problem – I am sure there are more elegant solutions out there. If you have a different implementation strategy, please do get in touch (or just comment below).


On “Hardware Hacking” Tools

I got to thinking this weekend – with the advent of one-click shopping, it’s incredibly easy to stack shiny tools which basically do the same thing… and then you always end up writing custom code to do something just slightly out of reach of existing tools.

Still, while it is convenient to have a variety of these tools available, it’s a good learning experience (and generally more productive) to write your own code to do something, once you’re done prototyping with a BusPirate or similar.

I want to provide some thoughts on the common tools available, as well as some unusual alternatives down the bottom. As always, the focus is actually hacking at the thing instead of what to type to make openocd work, so take the below with an appropriate serving of salt.

This post isn’t a dig at any of these tools. I respect the effort that has gone into their development and production. To each their own.


BusPirate

Price: $29.95 USD (Sparkfun)

The BusPirate is an FTDI USB controller attached to a PIC uC. Some custom firmware bit-bangs common protocols (it has to – it remaps the same GPIOs depending on mode). Of note, the default flywire assembly (the test clips thingy) sucks: I’ve never used the BusPirate without a multimeter testing which test clip connects to which IO pin, every fucking time.

The firmware is pretty decent – my favourite feature is the ability to simply type in data to send via say, SPI: instead of writing code, you can simply use a menu-driven system to enter SPI mode, and type something like [0xFF 0x12 0x34] and it will send the bytes, and handle chip select (the square brackets do this). An auxiliary pin you can manually toggle is always handy as well if you need to violate some specs.

Of note, the BusPirate has level shifting circuitry, allowing you to safely interface with a variety of targets.

GoodFET (Facedancer lol)

Price: $49.95 USD (Adafruit)

Just imagine the MAX3421 is a bunch of GPIOs broken out.

The GoodFET (and its descendants) are based on an MSP430 controller, tied to an FTDI USB controller to handle host communications. The Facedancer ties the MSP430 to a MAX3421E USB controller (thus its role as USB swiss army knife). The MSP430 is loaded with a basic OS, and a number of “apps” baked into the firmware.

These “apps” communicate with things the MSP430 is attached to, sometimes containing logic, mostly acting as a passthrough proxy. In the case of the Facedancer, the host sends data to the MSP430, which mostly passes it straight across to the MAX3421E via its SPI interface, and grabs a reply. While not as easy to prototype on as the BusPirate, you can get this going in a few lines of Python and maybe a half-hour of reading datasheets.

The MSP430 is surprisingly pleasant to code for using free tools, and I managed to add some GPIO triggering without lighting the board, my laptop or my person on fire, and have it work the first try.


GreatFET

Price: $89 USD (Hak5)

The logical successor to the GoodFET, the GreatFET is implemented on a more up to date LPC core, with integrated USB capability, but otherwise offering the same general capability as the MSP430.

I don’t have one, so I haven’t played with the firmware – but if the GoodFET is any indication, the GreatFET should be just as excellent in terms of usability.


Shikra

Price: From $45 USD (int3.cc)

You may notice that the Shikra is a surprisingly bare-bones device. In fact, only one IC is present on the device, an FTDI USB controller. That’s right, the Shikra is an FTDI breakout board, except with fewer pins broken out.

This becomes more obvious as you read the Xipiter page for how to use the device. To use the Shikra to dump an SPI Flash rom, you use the following command.

flashrom -p ft2232_spi:type=232H -r spidump.bin

Reading through the documentation some more, a small EEPROM is also available for configuration data (VID/PID, descriptor strings, etc).


HydraBus

Price: $196 USD (Converted from Euro, Lab401)

The HydraBus is an STM32 devkit – in my opinion, this is a beefier BusPirate (with support for more protocols), minus the level shifters. The USB controller is again on-board. This also has an SD card for storing data, though it doesn’t seem that easy to actually interface say, SPI operations, to SD card (without some custom firmware).

Again, a menu-driven firmware system is used (similar to the BusPirate), but the menu is much, much larger here.

I have one, but I haven’t played with it (primarily because there are too many alternatives). If you want to start from scratch, STMCube can generally help kick-start development with STM* family microcontrollers.

For a more in-depth review, take a look at this.

These tools undoubtedly serve their purpose, and the last time I needed to pass an SPI command to a target IC, I reached for a BusPirate instead of opening Atmel Studio (or insert tool of choice here), same as you.

Now, with that out of the way, let’s take a look at some alternative options…

FT2232 Mini Module

Price: $27 (Digikey)

Basically a Shikra, with more pins broken out. Anything you can do with the Shikra, you can do with this, and with less worry about damage to your USB connector because you can use a regular USB cable.


ATMega328p

Price: $less-than-a-coffee

Another alternative is to simply use a microcontroller – the ATMega328p is my go-to out of familiarity, particularly when you need a project to have a limited amount of smarts (e.g. “send this thing via SPI, check the results pass this ruleset, beep at me if it doesn’t, otherwise perform logic X, loop”).

With a $10 spare parts ZIF programming jig – or an Arduino board – and a library of sample code in nice, familiar C, you can be up and running in minutes. The bare minimum circuitry is (I think) one resistor for the reset pullup – you can run this off an internal clock as well as a crystal, configurable via fuses.

While this doesn’t have built-in USB support, it has UART, making it perfect for interfacing with other tools (e.g. a chipwhisperer front-end).

PSoC Dev Kit

Price: $17 USD (rs-online, CY8CKIT-059 variant)

The PSoC is a unique line of microcontrollers – you can think of them as a microcontroller ring-fenced by a CPLD. In effect, this lets you create logical functionality (like UART), then arbitrarily map the I/O to any compatible physical pin. The two are then independent – if you want to remap the pins later, you can via the PSoC Creator IDE.

Unfortunately, the software is a bit clunky, and the autogenerated code for logical functionality can be a bit special (in that you need to work with PSoC a bit to learn how these things are named, and after that it’s fine).

FX3 SuperSpeed USB Development Kit

Price: $48 USD (rs-online)

Reading material: https://github.com/cnlohr/fx3fun. Toolset seems Windows-centric.

Potentially the best until last – Cypress sells this as a USB3 development kit, but this seems a bit… fancy for a USB controller, isn’t it? Flip it over, and you discover the pleasant surprise of a fully-featured 32-bit 200MHz ARM9 core.

Multiple power domains are available (unsure how flexible, this is sourced from the datasheet), which should allow flexible interfacing to a range of targets, as well as DMA-based I/O if your name is CNLohr and you make this a logic analyzer in defiance of convention.

You can even get addon boards for this development kit (!). For approximately the same price, you can get an expansion board with a Xilinx CPLD (CYUSB3ACC-007) if you want to offload some logic, or high-speed connector boards for both Xilinx and Altera boards, and something about a machine vision interface.

This all comes in a nice foam-padded magnet-clasp box. As an added bonus, you even get a USB3 controller thrown in you can use if you’re into that kind of thing.

I hope this helps someone choosing their next shiny to buy. If you’ve got thoughts on these products, or if I’ve missed a major feature, please do comment!
