2017-07-13:

Blind ROP livestream followup

livestream
Yesterday I've done a livestream during which I tried to learn the Blind Return Oriented Programming technique. Well, that isn't fully accurate - I already read the paper and had one attempt the evening before, so I knew that the ideas I wanted to try live should work. And they did, but only in part - I was able to find a signaling gadget (i.e. a gadget that sent some data to the socket) which could be used later on, as well as a "pop rsi; ret"-equivalent gadget (using a different method than described in the paper - it was partly accidental and heavily challenge specific). But throughout the rest of the livestream I was able to find neither a puts function nor use the gadget I already had to get a "pop rdi; ret"-equivalent gadget. The bug, as expected, was trivial.

Long story short (a more verbose version follows), my mistake was in the following code:

 if err == "CRASH" and len(ret) > 0:
   return (err, ret)

The correct form is:

 if err == "DISCONNECTED" and len(ret) > 0:
   return (err, ret)

And that's it. If you were on the YT chat or IRC you probably saw me "facepalming" (is that even a word? it should be) and pointing out the bug even before the stream went offline (i.e. during the mission screen). In the next 5 minutes I had both gadgets I needed, so that was the only bug there. Oh well.

Full version of the story:

I'll start by saying that if you don't know what ROP is (Return Oriented Programming, which is actually an exploitation technique), you might want to start there as BROP (or Blind ROP) is a technique that heavily builds on top of ROP. Here are some random links about ROP:

The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86) (Hovav Shacham)
Return-Oriented Programming: Exploits Without Code Injection
Doing ret2libc with a Buffer Overflow because of restricted return pointer - bin 0x0F (LiveOverflow)
ROP with a very small stack - 32C3CTF teufel (pwnable 200) (LiveOverflow)
And also:
Return-oriented exploiting
Hacking Livestream #20: Return-oriented Programming

There were some follow-up papers and techniques , e.g. Return-Oriented Programming without Returns, but the above should be enough to get you started.

One of such follow-ups was the Hacking Blind paper by Andrea Bittau, Adam Belay, Ali Mashtizadeh, David Mazieres and Dan Boneh from Stanford University - a really cool piece of research that showed that you don't really need the binary to find gadgets. I highly recommend reading the paper, but as a summary I'll note that it boils down to first locating the binary in memory (by basically reading the stack using e.g. Ben Hawkes' byte-by-byte brute force technique also described by pi3 in Phrack), then finding a gadget which has an observable effect (e.g. it hangs the connection or sends some additional data through the socket) and finally finding gadgets using pretty clever heuristics that boil down to observing one of three signals (a crash, the normal behavior, or the aforementioned observable effect). So yeah, it's the blind SQLI of ROP ;).

The original paper showcases several optimizations, but I decided to start with the basics (I'm a fan of reinventing the wheel). I prepared a small forking-server binary (an echo server basically) that had PIE and stack cookies disabled (and ASLR was also disabled system-wide), compiled it statically and run on a virtual machine.

Both the binary and the source code created before and after the stream are available on my github (the binary listens on port 1337; please note that when learning blind ROP you should not look at the binary in IDA/etc and pretend you don't have access to it).

The first step was to look for the bug (though that was trivial in this case as there is only one input length+data pair basically) and the size of the overflowed buffer. This was done by using growing data lengths (from 1 byte upwards) and checking whether the "Done!" message was transmitted via the socket (denoted as "OK" in my code) or whether the connection abruptly stopped before that (denoted as "DISCONNECTED" or "ERROR" in my socket code, or just "CRASH" later on - I mixed this up on the stream thus I couldn't get it to work to the full extent in the end). It turned out that the buffer was 40 bytes long.

The next step was to find a signaling gadget. A signaling gadget (similarly to a signaling behavior in Blind SQLI) can either be some data transmitted over the socket or a connection hang (i.e. the server application entering an infinite loop or hanging on some input which can be detected e.g. using the socket timeout mechanism). During the livestream I found a gadget which did both, i.e. it printed a 16-hexdigit number (so it was probably a printf("%.16x", arg2) function call) and then started reading from the socket (which basically stopped the application and caused a socket timeout on my side).

This gadget was actually really useful, since it being (probably) a printf("%.16x", arg2) call meant that jumping just after the instruction that puts the second argument in the RSI register (64-bit Linux calling convention) would output the current value of the RSI register. This also meant that I could easily find a "pop rsi; ret"-equivalent gadget (I keep writing "-equivalent" as found gadgets might be a lot large and have unpredictable side-effects, but in the end they would e.g. also pop a value from the stack and place it in a given register) just by using the following ROP-chain:

[unknown-scanned address]
[0x4141414141414141]
[address of the printf("%.16x", rsi) gadget]

The unknown-scanned address would be a sequential-or-random address within the assumed range of the executable section of the binary. Since one needs probably quite a lot of attempts to find such gadget (tens of thousands) I decided (as the paper suggested) to use multiple threads/connections to do it (20 threads worked fine in my case). Part of the output log looked like this (this is actually from the pre-stream version):

0x37b4 PRINT
................................................00000000006ee3c3
Data size:
0x37ef PRINT
......................................................................................................
......................................................................................................
.....................................00000000006ee3c3
Data size:
0x38e14141414141414141
Data size:
PRINT0x38e2
PRINT
.............................4141414141414141

Due to multithreading this isn't really readable, but one can observe some non-4141414141414141 values (mainly 00000000006ee3c3) and then in the end the magical 4141414141414141 value right next to the "PRINT 0x38e2" information (which means: the gadget is at image_base+0x38e2 address).

The next step was to find the puts function (followed probably by a crash) that could be used to easily find a "pop rdi; ret"-equivalent gadget (the idea was to do puts(image_base+1) which would echo "ELF" when the correct gadget is found), but for some reason it didn't work (yeah, this is the place where the "CRASH"/"DISCONNECTED" mix up started to be a problem). I then tried to use the print-gadget I had already, but actually jumping 7 or 8 bytes after it to skip setting the RDI register (and using that instead of puts, risking an accidental format tag like %s at the beginning of the binary resulting in a false negative). This didn't work either (though probably because I wasn't patient enough with 8 bytes variant - well, a stream is time limited).

After some more minutes of trying I decided to finish the stream at that point (it was already 1.5h long) and promised to do this blogpost about what went wrong. So yeah, I mixed up the name of the signal from my socket-helper implementation (serves me right for using strings instead of any sane type of enums/constants).

When I spotted the problem I was able to both find the puts gadget I wanted and the "pop rdi; ret"-equivalent gadget within the next few minutes. Having these two what was left to do was actually just dumping the binary as a puts(controlled_rdi) allows one to do it by sequentially reading the memory (and assuming that there is a \0 at the end of every piece of data). It must be noted that it took several hours to dump the binary (as it was over 1MB in size).

Having the binary would bring this to a standard ROP exploitation which wouldn't take much more time (especially if the magic gadget would be found in the binary; I guess one can look for this gadget in BROP as well).

So there you have it :). Again I would like to recommend reading the paper, especially the sections about the "BROP-gadget" and PLT/GOT methods - I really liked them.

And that's it.

Add a comment:

Nick:
URL (optional):
Math captcha: 10 ∗ 10 + 4 =