Late Writeup – SimpleMachine (Codegate 2020 Teaser)

This weekend, I participated in the Codegate 2020 Teaser CTF. I started late (in my defence, ctftime’s time reporting was inaccurate), but I was still able to solve the SimpleMachine challenge, though far too late to get points. The writeup is presented below.

SimpleMachine

This challenge was presented as an executable and a “target” file, which you can download here. You can also download a copy of my working notes here.

The binary is a stripped Linux executable, but the challenge name indicates it’s some kind of virtual machine. Fortunately, the executable is not obfuscated, making reverse engineering relatively simple. Moving through the disassembly (i.e. following the call flow from main()), we can note the opcode processor at 0x17C0:

Looking at the function, we can make a few key observations:

  • The byte at $rdi + 0x30 holds the opcode.
  • The word at $rdi + 0x34 holds argument 1
  • The word at $rdi + 0x34 holds argument 2
  • The return value is held in $rdi + 0x3E. We don’t know where it goes for now.
  • 8 total operations exist – read, write, load, xor, multiply (without carry), add, compare, jump-if-zero (I initially mistook this for “exit”).

Next, we instrument the application with GDB to observe this in action. We can see it load the first opcode, a “read” opcode, corresponding to the first 8 bytes in the “target” file (the xxd options reverse the byte endianness and group the bytes into pairs):

Matching this to the disassembly, this is the “read” opcode, reading 0x24 bytes into address {base} + 0x4000. By following the read call itself, we can see that base is defined by the first qword pointer at $rdi[0] within the opcode processor function.

Further tracing of the application, correlating application behaviour with disassembly listing, sheds light on the instruction format (little endian):

AABB XXXX YYYY ZZZZ 
AA: Addressing mode (controls behaviour of XYZ)
BB: opcode
XXXX: where the result is stored.
YYYY: arg1
ZZZZ: arg2
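
As a rough illustration of the layout above, here is a minimal Python sketch that splits one 8-byte instruction into its fields. It assumes the four groups are little-endian 16-bit words and that AA is the high byte of the first word; the function name and the sample bytes are my own inventions, not values from the “target” file.

```python
import struct


def decode_instruction(raw: bytes) -> dict:
    """Split one 8-byte instruction into fields (layout assumed from the notes above)."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    # Four little-endian 16-bit words: AABB, XXXX, YYYY, ZZZZ.
    head, dest, arg1, arg2 = struct.unpack("<4H", raw)
    return {
        "mode": head >> 8,      # AA: addressing mode (assumed high byte of word 0)
        "opcode": head & 0xFF,  # BB: opcode (assumed low byte of word 0)
        "dest": dest,           # XXXX: where the result is stored
        "arg1": arg1,           # YYYY
        "arg2": arg2,           # ZZZZ
    }


# Made-up bytes for illustration only; they do not come from the challenge.
sample = bytes.fromhex("0201341278569abc")
print(decode_instruction(sample))
```

Running the same function over successive 8-byte slices of the file would give a crude disassembly listing, provided the assumed byte order holds.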

Spending a bit more time debugging the executable, we can observe the following behavior:

  • Firstly, the program reads a flag (the 06 opcode)
  • The program compares the first part of the flag to CODEGATE2020 through static compares. If any bytes don’t match, the program is exited through the use of opcode 5, with an argument of 0x1a0 (which leads to opcode 8, exit).
  • The program then loads a number of constants into memory, for an unknown purpose. Let’s call this the “key material”.

At this point, we are at 0xf8 in the “target” file. The program continues:

  • We load the constant 0xdead into a virtual register
  • We load the constant 0x1 into a virtual register
  • We double 0xdead, and save the result in a register. This is at address 0x108 – this is important later.
  • We xor 0xdead with the doubled 0xdead, and save the result in a register. The result is 0x63f7.
  • I’m not sure what the next instruction is for. We multiply 2 by 0 and save the result?
  • We load 0xf974, a part of the key material loaded earlier, into 0x154. Note that 0x154 lies within the initial code, where the “target” file holds 0xFFFF – the program has now become self-modifying. Let’s press on.
  • We store the constant 0x400c into 0x14c. This is also self-modifying. Note that 0x400c points to the next bytes of the flag after “CODEGATE2020”.
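
The arithmetic in the steps above can be reproduced with a short Python sketch. It assumes the virtual registers are 16 bits wide, so that doubling drops the carry; the names MASK16 and next_mask are mine.

```python
MASK16 = 0xFFFF  # assumption: virtual registers are 16 bits wide


def next_mask(value: int) -> int:
    """Double the value (dropping the carry), then xor it with the original."""
    doubled = (value * 2) & MASK16
    return value ^ doubled


# 0xdead doubled and truncated is 0xbd5a; xored with 0xdead this gives 0x63f7.
print(hex(next_mask(0xDEAD)))

# The flag buffer starts at 0x4000, so the bytes after "CODEGATE2020" start here.
flag_buffer = 0x4000
print(hex(flag_buffer + len(b"CODEGATE2020")))
```

Both printed values match the constants seen in the trace (0x63f7 and 0x400c), which is some evidence that the 16-bit assumption is right.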

At this point, we can monitor the executable with gdb, using the “rwatch” command:

rwatch <address>

Continuing the trace:

  • Three “nops” are executed, consisting of 0000 0000 0000 0000 instructions.
  • We xor the next two bytes of the input flag with 0x63f7
  • We add the result to the first two bytes of the key material, 0xf974
  • We compare the above result to 0.

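We can represent the above operation as follows. For the compare against 0 to succeed, the 16-bit sum has to wrap around, i.e. equal 0x10000 before truncation:

(input_flag ^ 0x63f7) + 0xf974 = 0x10000
input_flag = (0x10000 - 0xf974) ^ 0x63f7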

We can quickly check this in Python:
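
A minimal sketch of that check, assuming the two flag bytes are read as one little-endian 16-bit word (the variable names are mine):

```python
mask = 0x63F7      # 0xdead xored with its doubled value
key_word = 0xF974  # first word of the key material

# Invert "(input ^ mask) + key_word == 0 (mod 0x10000)".
chunk = ((0x10000 - key_word) & 0xFFFF) ^ mask
print(hex(chunk))                   # 0x657b
print(chunk.to_bytes(2, "little"))  # b'{e'
```

The two recovered bytes are “{e”, which is what we would expect to follow “CODEGATE2020” in a flag.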

Great, this looks pretty sane. Let’s move forward:

  • If the last result was not zero (i.e. if the flag was incorrect), jump to 0x1a0
  • The next three operations appear to be a loop counter, checking if we’ve hit 0xc iterations, and if not, jumping to 0x108.

On the second pass through 0x108, instead of doubling 0xdead, the program doubles 0x63f7. Going through the loop a few times, we note the following repeating cycle occurring:
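
The cycle can be sketched in Python as below. This is a reconstruction under assumptions rather than the program’s exact logic: I assume 16-bit arithmetic, one key word consumed per iteration in order, and two flag bytes (little-endian) recovered each time. The function name is mine, and only the first key word, 0xf974, appears in this writeup; the rest would have to be read out of the “target” file.

```python
def recover_flag_tail(key_material, seed=0xDEAD):
    """Undo the per-iteration check for each 16-bit key word, in order."""
    mask = seed
    out = bytearray()
    for key_word in key_material:
        # Same update as at 0x108: xor the value with its doubled self (16-bit).
        mask = (mask ^ (mask * 2)) & 0xFFFF
        # Invert "(chunk ^ mask) + key_word == 0 (mod 0x10000)".
        chunk = ((0x10000 - key_word) & 0xFFFF) ^ mask
        out += chunk.to_bytes(2, "little")
    return bytes(out)


# Only the first key word is known from the trace above.
print(recover_flag_tail([0xF974]))
```

With all 0xc key words supplied, this would yield the 24 bytes that follow “CODEGATE2020”, which together with those 12 bytes make up the 0x24 bytes the program reads.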

From here, it is a simple matter to derive the key manually, and confirm it via the simple_machine executable:

My solving methodology for this challenge was particularly haphazard, with a half-hearted attempt at writing an emulator and many, many failures at tracing the program, mostly due to my own fuckups in patching virtual compare operations. A shortcut I learned was that gdb can skip X iterations of a breakpoint, with the following command:

ignore 1 15
(ignores breakpoint 1 the next 15 times)

I disagree with the assessment that it is “ezpz”, but in another light, it is a humbling reminder of how much more I have to learn, and what a challenge it is to keep my skills up to date, while also expanding into other areas.

Thank you to the Codegate 2020 CTF organisers for putting together these challenges. See you in Aero CTF.

About Norman

Sometimes, I write code. Occasionally, it even works.