Writeups – PicoHSM Speed Dating, Super Public Algorithm, Remote Lab (Donjon CTF)

In this post, I will describe the solution for 3 challenges I was able to solve.

The related solution files and support binaries are available here.

PicoHSM Speed Dating

This challenge was presented as an ARM ELF binary (provided in the Git repository), as well as a photo of the target we are exploiting:

A special note should be made here, to commend the Ledger Donjon team for having this challenge available throughout the CTF duration, particularly given the frequency of crashes and the fact that multiple people were attempting to exploit a fixed number of targets. Great work!

Inspecting the binary in Ghidra, we note that a stack overflow is present within the handle_client function:

My first instinct was to wrangle this into a ROP chain, using gadgets to overwrite debug_enabled and the debug_key, which are at static locations.

Unfortunately, I couldn’t get the ROP mechanism itself to work (entering more than a single return address would cause a crash). As part of debugging this, I built an emulator using Unicorn Engine, as suggested in the Donjon CTF discord (emu.py, emu2.py): while ROP worked in the emulator, it refused to work on the live target.

Eventually, I realized that a ROP chain wasn’t required, I could simply return to the executable stack. From there, it was smooth sailing to use ARM binutils to craft appropriate shellcode, and write a loader to insert a nop sled of the appropriate length, and take a wild guess at where the actual return address was (the emulator gave some clues in that regard).

The final solution is in fuck2.py:

Super Public Algorithm

This challenge was presented as a single power trace in npy format, representing power consumption over an ECC signature execution. Our goal was to retrieve a private scalar and therefore, decrypt a known ciphertext.

I began by plotting the trace in numpy. It is immediately apparent that there are differences in “rounds” of the algorithm, which should theoretically correspond to different operations (point multiplication?) depending on the bits of the key:

I then automated a simple power analysis attack, using a sliding window to detect the rounds of the ECC algorithm, and then checking the magnitude of leaked energy during the first spike to distinguish between the two separate waveforms.

The solution is in solver.py, outputting the correct scalar in base10:

We can then use the provided flag decryptor, to retrieve the flag:

Remote Lab

This challenge was presented as source code, and a remote service which could provide a fault (and power trace) for the verify_flag function. The source code is presented below:

My initial approach was to study the SPECK algorithm, but I quickly realized that we could skip the algorithm entirely using a fault attack, and then perform a DPA against the result of the OR operation in each round of the “password check”.

We can first skip the fault check, by simply incremending the fault offset and looking for a low sample count trace. It turns out for a 16-byte password, the correct offset is 286. The difference between a correctly glitched trace, and trace executing all rounds of SPECK is below:

The next step was to generate a large amount of random traces with the glitch applied with corresponding plaintexts. The idea is that the OR operation will leak a different amount of energy depending on the difference between the provided password, and the true key (the provided password is directly

Interestingly, the correlations are not particularly strong (but not incorrect, for the most part). The first two bytes are however incorrect, regardless of the particularly strong correlation.

Upon further review, it becomes apparent that the correlation will be particularly weak because of the subtraction operation, that is, assuming the true key is 0xAA, and the test key is 0xAB, the leaked energy is only 1 bit. However, if the true key is 0xAA and the leaked key is 0xA9, then the leaked energy is tremendous (number is -1, so 64 1’s). A better way to review this would be simply to view the power traces manually, changing one byte at a time:

A short amount of brute forcing later (two key bytes from the DPA result), revealed the true key, and the flag. Note the significantly lower energy consumption from all key bytes towards the right, indicating the correctness of the key (also the valid flag, indicating the same).

Once more, thankyou to the Donjon CTF team for putting together this challenge, this was a lot of fun and a great learning experience!

Posted in Bards, Computers, Jesting | Leave a comment

Writeup – Glitching AES (Donjon CTF)

Update: I am an idiot. But I think this makes the accidental result fun to think about!

Over the past few weeks, I participated in Donjon CTF. This post will document my method for solving the “Glitching AES” challenge, which is presented separately for the purpose of written clarity.

The accompanying files can be found here.

It should be noted that this method is not the “intended solve” – I didn’t understand the intended method, so I used a brute force method.

Glitching AES

This challenge was presented as a large (623MB) text file, the first few lines of which are presented below:

The objective of the challenge was to use this data to recover the AES key, and decrypt a provided ciphertext blob in ECB mode:

I performed a sanity check of looking for the ciphertext blocks in the provided encryption traces, but they were not present. I then reformatted the provided data into numpy traces, using format_to_npy.py. Initial statistical analysis showed that there were no “glitched outputs” – all the fault attempts either resulted in a correct output or a mute. Plotting the output states (mutes vs glitches) shows a 14-round AES-like trace, as follows:

I started by attempting a straightforward DPA attack (with dpa.py, which can be modified to select different models, and optionally discard traces for which points of interest are 0/missing data). against the first-round Sbox output, using only the traces which result in a non-mute response. This resulted in reasonably strong single point correlation for all byte positions:

Unfortunately, the resulting key (“4b596315d21df5a531079c838e11eee6”) was incorrect. I attempted a similar attack using a T-Table outputs model, which produced the same key. I then attempted a last-round attack, which produced similarly strong false correlations for another incorrect key (“c95c9f4d0054e0376518b43e6e4eb7ab”).

I thought at this point that the key was incorrect, but related to the true key, due to the single-point correlations for all bytes. I guessed also that fundamentally, the AES algorithm itself was unchanged except for the addition of more rounds: that is, in each round, the key is changed prior to the sbox, and I couldn’t see it because of the false correlation in first round.

I tested this theory against the second round of AES: by scheduling the initially discovered correct key (“4b…”) to round 2, I performed DPA (using diffs.py, make sure you edit the bytepos target range, and supply a target round) to identify a possible XOR mask prior to SubBytes, which yielded results, for all rounds:

The positional correlation (i.e. one byte after another) was extremely encouraging, as well as the fact that these spikes were generally single points of large differences of means, generally without ghost peaks or other such silliness:

Note that some error correction had to be performed, due to false or weak correlations in specific places:

  • Where the error could be fixed by modifying the leakage point to after MixColumns in the same round, I used 8ball.py.
  • Where the error required propagating to the sbox of the next round (IIRC, this was the case in Round 6 to Round 7), I used 9ball.py. Note that using this can take several hours to identify a single byte, so please be absolutely sure you can’t use 8ball before using 9ball.
  • I used taint_track.py to identify what impact an incorrect byte of AES could have past ShiftRows and MixColumns.

Also, as the brute force went on, the modified AES encryption took increasing amounts of time – an AES progress cache helped with this, allowing me to skip rounds where I knew the result. As an aside, the following behaviour in Python3 is quite interesting (causing some wasted time as I tracked this down in my initial AES cache implementation).

a = [{}] * 10
print(a)
a[2]["lol"] = 2
print(a)

The results of this brute force were stored in an array (in keymod.py). When I had all 14 rounds, I then performed an encryption against the provided plaintexts, and identified a final XOR mask (in test.py and test3.py, the one in keymod.py is old / unused), producing the correct ciphertext for all input plaintexts.

The final step of this challenge is to write a custom decryption function (in test3.py), to reveal the correct flag:

Thankyou to the Ledger Donjon team who wrote this CTF – their challenges were well-crafted and their infrastructure was quite stable throughout the allocated play time. I was able to improve my skills and improve my toolset through this experience, and greatly look forward to the next instance of this CTF (as well as RHME4 and others!).

Posted in Bards, Computers, Jesting | Leave a comment

Writeups – nothing, reversing (TWCTF)

This weekend, I took part in Tokyo Westerns CTF 2020. I only managed a few hours on Sunday night, but I was able to solve 2 (warmup) challengs. The writeups are below, for future reference:

Nothing More to Say

This challenge was presented as a 64-bit Linux ELF file and corresponding source code. You can download this here. The challenge text describes it as a format string vulnerabiilty.

Initial inspection shows that the challenge is reasonably simple, but established some constraints:

  • The input is constrained to 0x100 characters. Minus 3*8 characters (for offsets), then minus 3*7 (for “%..$hhn”) means we have an absolute maximum of 0xd3 characters.
  • read() is the input mechanism, so null bytes are OK
  • Input is passed to printf directly.
  • We have unlimited attempts to exploit
  • execstack is enabled – maybe this is an avenue

Initially, I leaked the remote libc by printing the GOT. Using an online libc database, we quickly find this is an Ubuntu libc, with three magic ROP gadgets – unfortunately, overwriting printf with these doesn’t seem to work (constraints not satisfied?)

Unfortunately, on this version of libc, system’s address ends with 0xe0, placing this out of reach of our format string – but on reversing the libc file, we note that system simply proxies the call to 0x4EF50:

While 0xEF is out of reach of our format string, libc base randomisation might cause an overflow to the next byte (depending on where libc is loaded at). Therefore, we can hope the second and third least significant bytes on the offset of libc+0x4ef50 are lower than 0xd3, allowing us to overwrite printf in a single go.

A few tries later, and success:

The resulting script is here. Life is made significantly easier by the fact that null bytes are accepted – we can simply stash our %n pointers at the end of our input, and reference them using static / known indexes (they’re always 33/34/35 – we can pad the input’s length to put them there.

It’s been so long since I’ve seen a format string exploit, it’s good to practice this again, even if it’s a bit unrealistic.

Reversing iS Amazing

This challenge was presented as a Linux binary, which you can download here. Initial reverse engineering shows that this uses the OpenSSL Public Key API to sign the user’s input, then compares the signed input to a static block:

The simplest approach (IMO) is to extract the key and static block from the binary during execution (breaking at BIO_new_mem_buf and memcmp), save it to file and

We can use gdb’s “dump” command to extract memory:

A little reading of the OpenSSL man pages later, and we have a stub executable that can read the private key and signed block from memory, and use RSA_public_decrypt to recover the message digest into readable form:

I also attempted the web “urlcheck” challenge for fun, but wasn’t able to solve it in the time allocated.

Thankyou to the Tokyo Westerns team for putting together another great CTF, I look forward to playing again next year.

Posted in Bards, Computers, Jesting | Leave a comment

Writeups – roppity, slithery (CSAW 2020)

This weekend, I participated briefly in the CSAW 2020 CTF. I solved a number of challenges, and below are the writeups for two (fairly straightforward, unfortunately) challenges, and one I missed because I did something incredibly dumb in the exploitation process.

roppity

This challenge was presented as a 64-bit Linux binary, which you can download here.

Initial reverse engineering showed this to be a straightforward two-stage ROP challenge – leaking the location of something in libc, then directly calling one_gadget. A puts call at 0x4005F5 can be exploited, leaking the GOT – and then helpfully, this continues onto a gets call at 0x400606.

We can use ropper and one_gadget to help us construct the rop chain, and moments later, we have a working exploit:

slithery

This challenge was presented as a host / port combination, which hosted a listening service, purporting to be a Python sandbox, with source code provided. The sandbox can be found here.

Initial exploration showed that we actually have a working exec primitive, but didn’t consistently work. That said, I could print items, and the blacklist didn’t cover itself (i.e. didn’t cover “blacklist”).

The following silly trick worked to get us out of the sandbox, by deleting the blacklist, importing the “os” module then system’ing bash:

It looks like the intended solution was something different, triggering some kind of exception and hitting a second sandbox (not provided):

Thankfully, we never had to deal with this.

grid

This challenge was presented as a 64-bit Linux binary, which you can download here.

Initial reverse engineering showed that this was a mess of C++. The application seemed to allow manipulation of a “grid”, which appeared to be implemented as a linked list, and then “copied” to the stack when the “d” command was used to display the grid:

Using this as an exploit primitive, we can write arbitrary (within reason: cin dies on some characters) content to the stack, including ROP gadgets (here: a call to cout trying to point to the GOT, but notice I can’t enter the full address of the GOT onto the stack):

I spent a little while searching for a GOT leak, but I didn’t realize: you could leak libc content simply by hitting “d” without overwriting anything.

FML.

With something in libc, it’s trivial to then overwrite a return address to hit a one_gadget special and get a shell. No flag this time, better luck next time 🙂

Thanks to the CSAW team for putting together this CTF – I look forward to playing again next year.

Posted in Bards, Computers, Jesting | Leave a comment

EMFI Privilege Escalation on Raspberry Pi

Over the past week, I have been exploring the concept of extending the KERNELFAULT attack to use EMFI, with some success, in the context of setting up further attacks against other targets.

The key questions to answer for me were as follows:

  • By nature, the transient EM pulse must be an order of magnitude shorter than the crowbar glitch employed previously. What does this look like in practice?
  • Is the fault model consistent – are we still encountering the same kind of program corruption as we experienced previously, or are there new faults to deal with?
  • Where are the most vulnerable locations above (or below!) the Broadcom SoC? What effect does different coil configuration have?

This exercised used a ChipSHOUTER and a ChipWhisperer-Lite running a simple counting loop as a timer. The same trigger program was used as the KERNELFAULT experiments – no modification was needed.

An afternoon of brute forcing led us to a successful result:

Of note:

  • The pulse length is significantly shorter than the previous use case, indicating (as expected) that the impact of decoupling capacitors is greatly reduced or removed in this scenario.
  • The fault model was consistent, the same type of crashes was occurring whether I’m VCC glitching or using EMFI.

From a positioning perspective, I noted a different result to Trouchkine et al’s paper (here):

This could be due to a difference in coil selection / manufacturing imperfection in the coil, or maybe it was wind pushing the board slightly. After watching some of Colin O’flynn’s work, I’ve purchased a Panavise clamp to hold the target in place, so I can open a window and not impact the results in future.

No testing has yet been done on the effect of coil choice, but I expect this to have a similar impact to moving the position of the coil: the shape and direction of the generated magnetic field would differ, so naturally, the results would too.

Some improvements were made to the classification script, which can help us identify where the successful glitches are occuring (red diamond marks indicating where code is executing down the commit_creds path – there’s some missing “bad crashes” here to remove clutter):

Interestingly, I was able to force execution down the commit_creds path when glitching using an ext_offset of approximately 200 (same as the VCC glitching experiment which previously succeeded), but was unable to wrangle this into a successful privilege escalation (crashes in commit_creds: I suspect this timing represents a before-and-after of the ns_capable_setid call in sys_setresuid.

This experiment required minimal code changes, everything is in Github here in the experiments/pi folder.

Much, much more work is to be done here, and I look forward to extending the attack against other complex SoC’s.

Posted in Bards, Computers, Jesting | Leave a comment