2023-01-28:

Asking MEMORY.DMP and Volatility to make up

volatility:forensics

A few days ago I posted RE category write-ups from KnightCTF 2023. Another category I looked at – quite intensely at that – was forensics. While this blog post isn't a write-up for that category, I still wanted (and, well, was actually asked) to write down the steps I took to make Volatility work with the MEMORY.DMP file provided in the "Take care of this" challenge series. Or, more precisely, the steps I took to convert MEMORY.DMP into something Volatility could work with. I have to add that I didn't get the flags for these challenges*, so again, this isn't a write-up.
* It turned out that the flags weren't based on the MEMORY.DMP – the sole resource provided – at all due to an oversight in challenge creation. It was a pretty amusing situation we've learnt about after the CTF, but what can you do.

Let's start by stating the problem: neither Volatility 2 nor Volatility 3 was able to use MEMORY.DMP as input. WinDBG, on the other hand, had no issues at all, so we knew the file was correct.

$ python2 vol.py --profile Win7SP1x64 -f ../MEMORY.DMP pslist
Volatility Foundation Volatility Framework 2.6
No suitable address space mapping found
Tried to open image as:
 MachOAddressSpace: mac: need base
 LimeAddressSpace: lime: need base
 WindowsHiberFileSpace32: No base Address Space
 WindowsCrashDumpSpace64BitMap: No base Address Space
 WindowsCrashDumpSpace64: No base Address Space
 ...

If you're unfamiliar with Volatility, it's an open-source forensics framework written in Python (Volatility 2 in Python 2 and Volatility 3 in Python 3), which allows an investigator to run queries on a computer system's memory dumps. Technically it understands internal Windows and Linux kernel memory objects and can walk through them to do stuff like listing running processes, dumping console buffers or the content of the clipboard, digging through the registry (its in-memory version), etc. See this example for instance. Pretty neat tool!

Some theory on how Volatility works

As said, the input is a system memory dump. These however come in different shapes and sizes, depending on how they were acquired. For example, one common source is the automatic memory dump created at Blue Screen of Death time – its usual purpose is to allow folks to put it in WinDBG and figure out why the system crashed. Another typical example is a raw dump of physical memory – these can be acquired in a multitude of ways, though they don't really include any useful metadata (I'll get to this later). Either way, usually what you get is a dump of physical memory – physical being the keyword here.

Physical memory however won't do. That's because the great majority of the kernel structures – as well as literally everything in user-land – operates on virtual memory. So the first thing volatility has to do is basically load the proper parser for the given input format and then provide a virtual memory view for it. This can of course be done easily based on the page table structure which maps virtual addresses to physical addresses.
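To make the translation step a bit more concrete, here is a minimal sketch of a 4-level x64 page table walk over a physical memory image. This is not Volatility's implementation: read_phys is a hypothetical callback returning bytes from the physical image, dtb is the physical address of the top-level table, and the sketch ignores everything Windows-specific (transition and prototype PTEs, paged-out pages) that a real address space has to deal with.

```python
import struct

PAGE_PRESENT = 1 << 0
LARGE_PAGE = 1 << 7
ADDR_MASK = 0x000FFFFFFFFFF000  # bits 12..51 hold the physical frame address


def translate(read_phys, dtb, vaddr):
    """Translate a virtual address to a physical one, or return None.

    read_phys(addr, size) is a hypothetical callback that returns `size`
    raw bytes from physical address `addr` of the memory image.
    """
    table = dtb & ADDR_MASK
    # Index shifts for the PML4, PDPT, PD and PT levels.
    for shift in (39, 30, 21, 12):
        index = (vaddr >> shift) & 0x1FF
        (entry,) = struct.unpack("<Q", read_phys(table + index * 8, 8))
        if not entry & PAGE_PRESENT:
            return None
        if shift in (30, 21) and entry & LARGE_PAGE:
            # 1 GiB or 2 MiB page: the remaining bits are an offset into it.
            offset_mask = (1 << shift) - 1
            return (entry & ADDR_MASK & ~offset_mask) | (vaddr & offset_mask)
        table = entry & ADDR_MASK
    # After the last level, `table` is the address of the 4 KiB page itself.
    return table | (vaddr & 0xFFF)
```

With a raw dump, read_phys can be as simple as a seek and read on the file; with a crash dump it first has to map the physical address to a file offset, which is exactly where the dump format comes into play.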

One important thing to note here is that there isn't just one page table structure in memory. There are a lot of them – usually one per process, though in some cases even each thread might have one. That's OK however, since even if you find only a single page table in memory you'll be able to access the process/task/thread list, and these in turn hold physical addresses of their respective page tables. This means that each process/task/thread "sees different things" in memory, though usually at least the kernel part is seen by all of them in the same way.

Once Volatility has a virtual memory view it can proceed to do the required analysis by finding and walking through the aforementioned kernel structures. Of course each kernel version might have a bit different looking internal structures – this is best observed either on ReWolf's Terminus project website (_EPROCESS example) or Svitlana Storchak's / Sergey Podobry's Vergilius project website (_EPROCESS example again). To handle these differences Volatility 2 uses per-version profiles, which in turn refer to vtypes – these are kernel structure definitions in Python form:

ntkrnlmp_types = {
  ...
  '_EPROCESS' : [ 0x4f8, {
    'Pcb' : [ 0x0, ['_KPROCESS']],
    'ProcessLock' : [ 0x160, ['_EX_PUSH_LOCK']],
    'CreateTime' : [ 0x168, ['_LARGE_INTEGER']],
    ...

In case you're missing a profile for a Windows version and want to use Volatility 2, you can use the pdb_tpi_vtypes.py script from pdbparse to generate the vtypes. Note that you will need the PDB (debugging symbols) file for the kernel in question, but just loading a MEMORY.DMP into WinDBG will automatically download it for you (hint: set up an on-disk symbol cache so it's actually saved on disk). Then just run something like:

pdb_tpi_vtypes.py ../../../ntkrnlmp.pdb/DADDB88936DE450292977378F364B1101/ntkrnlmp.pdb > win7_sp1_x64_24214_vtypes.py

and you have vtypes ready to go. Note that you need to copy the vtypes into the proper directory (volatility/plugins/overlays/windows/) and tweak the profile files a bit to make sure these are actually used. E.g. in this case that would amount to adding the following class in volatility/plugins/overlays/windows/win7.py (yes, you need Volatility 2's source code):

class Win7SP1x64_24214(obj.Profile):
    """ A Profile for Windows 7 SP1 x64 (6.1.7601.24214 / ???) """
    _md_memory_model = '64bit'
    _md_os = 'windows'
    _md_major = 6
    _md_minor = 1
    _md_build = 7601
    _md_vtype_module = 'volatility.plugins.overlays.windows.win7_sp1_x64_24214_vtypes'
    _md_product = ["NtProductWinNt"]

Other changes might also be needed – I didn't explore this fully since it turned out in my case other profiles from "nearby" versions also work well enough (I ended up using plain old Win7SP1x64).

In Volatility 3 the vtypes are kept in JSON files generated from PDBs using the volatility3/framework/symbols/windows/pdbconv.py script. I can't tell you much about this though, since pdbconv.py refused to work with the ntkrnlmp.pdb I had and I decided not to fall into the rabbit hole of fixing PDB parsing.

The incompatible MEMORY.DMP

One thing to note is that the Windows memory dump format isn't really officially documented. The modus operandi seems to be that Microsoft programmers tweak the format at will, but implement its support in the DbgHelp library (dbghelp.dll) for third parties to use. DbgHelp being a Windows library of course means that it is not available on Linux, at least not in a straightforward fashion*. Given this, tools like Volatility have to implement their own Windows memory dump parsers – and these by definition are bound to play a catch-up game.
* Technically (i.e. ignoring any potential licensing issues which might or might not exist) there would be at least two ways to use dbghelp.dll directly on x86 Linux – either write a thin wrapper-service on it and run it through Wine, or use one of Tavis Ormandy's hacks or something similar to use it directly.

So my guess is that the incompatibility is the result of just this catch-up scenario, with the legacy Volatility 2 not having implemented some new variant of internal MEMORY.DMP structures. On the flip side it might just be that the kernel-memory-only dump used in this CTF challenge is unpopular enough that Volatility 2 just never had to support it.

Either way, the big question is: what do we do about this?

The first step is always to understand the problem. In this exact case it meant trying to compare the memory dump format specification with MEMORY.DMP file at hand. Oops, there's no specification available. So we take the next best thing – source code of the parser in Volatility 2.

I actually found two parsers in Volatility 2:

  • volatility/plugins/addrspaces/crash.py
  • volatility/plugins/addrspaces/crashbmp.py

So apparently there are two variants of Windows memory dumps supported – a non-bitmap one and a bitmap one (I'll get to the differences between them in a second). That's not fully true though, since there are actually three variants in total: the non-bitmap variant supports both 32- and 64-bit systems, while the bitmap variant supports only 64-bit systems:

  • WindowsCrashDumpSpace32
  • WindowsCrashDumpSpace64
  • WindowsCrashDumpSpace64BitMap

In any case the high-level file structure looks something like this:

A comparison of the non-bitmap variants and the bitmap variant. All variants start with a _DMP_HEADER structure followed by padding. The _PHYSICAL_MEMORY_RUN array is highlighted inside the header in the non-bitmap variants. After the padding the non-bitmap variants have a series of memory page runs, each of a different size. The bitmap variant on the other hand has a _FULL_DUMP64 structure after the padding, which contains the page bitmap. This structure is then padded to align with a 0x1000 boundary and followed by the actual single dumped memory pages.

The _DMP_HEADER and _DMP_HEADER64 definitions can be found in volatility/plugins/overlays/windows/crash_vtypes.py, while the _FULL_DUMP64 is defined in crashbmp.py itself.

The non-bitmap variants use a _PHYSICAL_MEMORY_RUN array, which is technically embedded inside another structure called _PHYSICAL_MEMORY_DESCRIPTOR – both are defined in the profile-specific vtypes files inside the volatility/plugins/overlays/windows/ directory. This array contains information about "runs", i.e. continuous physical memory regions dumped into the file. If the whole physical memory was dumped, then it's just 1 run starting at 0 with a run size equal to that of the full physical memory. However, since full dumps are usually both impractical and unnecessary – e.g. there is no need to dump GPU VRAM regions – the memory is typically split into multiple runs that skip the unnecessary areas.
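As an illustration of the idea (a sketch, not Volatility's crash.py code), mapping a physical address to a file offset in a run-based dump could look like the snippet below. The runs are given as (BasePage, PageCount) pairs, mirroring the two fields of _PHYSICAL_MEMORY_RUN, and data_start is a parameter I'm assuming here: the file offset at which the first run's data begins, i.e. right after the header and padding.

```python
PAGE_SIZE = 0x1000


def run_file_offset(runs, phys_addr, data_start):
    """Return the file offset holding phys_addr, or None if it wasn't dumped.

    runs is a list of (base_page, page_count) tuples in file order;
    data_start (an assumed parameter) is where the first run begins.
    """
    offset = data_start
    for base_page, page_count in runs:
        start = base_page * PAGE_SIZE
        size = page_count * PAGE_SIZE
        if start <= phys_addr < start + size:
            return offset + (phys_addr - start)
        # Runs are stored back to back, so skip over this one.
        offset += size
    return None
```

The key property is that runs are stored back to back in the file, so the offset of a run is the sum of the sizes of all runs before it.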

The bitmap variant is basically the same idea, but at page (0x1000 bytes aka 4KB) granularity. Instead of having memory split into continuous "runs", we deal with a bitmap (as in literally an array of bits) which answers the question of whether the Nth page is present in the dump file (Nth bit set to 1) or not (Nth bit cleared to 0).

An illustration comparing a bitmap with which pages are in the file. The bitmap has bits at the following positions set: 1, 2, 3 and 6. As such, only the physical pages 0x1000, 0x2000, 0x3000 and 0x6000 were dumped.
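The same lookup for the bitmap variant can be sketched as follows. This is an illustration rather than Volatility's crashbmp.py code, and it assumes that bit 0 of each byte describes the lowest-numbered page (LSB-first) and that the dumped pages are stored back to back starting at data_start; both names and the bit order are assumptions of this sketch.

```python
PAGE_SIZE = 0x1000


def page_present(bitmap, page):
    """True if the bit for `page` is set (assuming LSB-first bit order)."""
    return bool((bitmap[page // 8] >> (page % 8)) & 1)


def page_file_offset(bitmap, page, data_start):
    """File offset of a dumped page, or None if the page is not in the file."""
    if not page_present(bitmap, page):
        return None
    # Pages are stored back to back, so count the set bits before this one.
    preceding = sum(page_present(bitmap, p) for p in range(page))
    return data_start + preceding * PAGE_SIZE


# The bitmap from the illustration: bits 1, 2, 3 and 6 set.
bitmap = bytes([0b01001110])
print([hex(p * PAGE_SIZE) for p in range(8) if page_present(bitmap, p)])
```

Counting set bits from the start on every lookup is slow for large dumps; a real parser would precompute a page-to-offset table once.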

Having learned this from Volatility 2's source code, I started comparing it with the MEMORY.DMP at hand. What I found out is that while _DMP_HEADER64 did look correct, something else was broken down the line. Initially I thought that it was just a matter of _DMP_HEADER64.DumpType being set to an incorrect value, but nothing is ever that easy. Eventually I figured out that what I was dealing with was a 64-bit version of the crash dump header, with a 32-bit version of the bitmap variant following down the line. This became obvious due to two things I observed:

  1. When comparing individual pages in WinDBG, page by page, there were a lot more gaps than the _PHYSICAL_MEMORY_RUN[] would suggest. Also, memory pages in the file started much further than expected. So this had to be a bitmap variant.
  2. The data in the MEMORY.DMP I had – when interpreted as _FULL_DUMP64 – just made no sense.

Note that this isn't even the "missing 32-bit bitmap" variant, since this is indeed a 64-bit bitmap variant, just one that isn't using _FULL_DUMP64 structure, but rather some mythical _FULL_DUMP?32?*.
* Actually – as I found out while writing this blogpost – it turns out that when searching for _FULL_DUMP32 you can find it on GitHub in a file called ntiodump.w or ntiodump.h, which apparently is part of Windows 2003 / NT SDK. It's followed by a _SUMMARY_DUMP32 structure, which is pretty close if not identical to what I derived during the CTF. I haven't found it during the CTF though.

All this means I had to start with two things: making sure this really is a bitmap variant and reverse-engineering the structure enough to be able to add support for it in Volatility 2. The part of the file I focused on looked like this:

00002000  53 44 4d 50 44 55 4d 50  53 44 4d 50 00 70 02 00  |SDMPDUMPSDMP.p..|
00002010  00 00 12 00 20 fb 00 00  00 00 12 00 53 44 4d 50  |.... .......SDMP|
00002020  28 c0 e8 03 80 fa ff ff  fe 00 00 00 00 00 00 00  |(...............|
00002030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00002040  00 00 00 00 00 00 00 00  3f 00 00 00 00 00 00 00  |........?.......|
00002050  00 00 00 00 00 00 00 00  00 00 80 7f 80 08 00 00  |................|
00002060  00 e0 ff 01 ec 0f 00 00  00 00 a0 80 00 00 00 00  |................|

Two things immediately stood out:

  1. Well, obviously the SDMPDUMPSDMP and SDMP magic values.
  2. On top of that I noticed the bytes 00 70 02 00, or rather – if treated as a 32-bit little endian value – 0x27000. I already knew at this point from my WinDBG experiments that the actual memory data in the file starts at that offset (remember the "Also, memory pages in the file started much further than expected." sentence?).

The next 12 bytes interpreted as 32-bit little endian gave the following three values: 0x120000, 0xfb20, 0x120000. Now what could this be?

Just thinking about what's actually needed to parse the bitmap should give you a clue: you need the overall size of the bitmap. Since I knew this header was at 0x2000 and I knew the data started at 0x27000, this left about 0x25000 (minus header) bytes for the bitmap at most. Well, neither 0x120000 nor 0xfb20 is close to that value. But wait, this is actually a bitmap, so it would make sense for the size to be expressed as bits. And, to no-one's surprise, it turns out that 0x25000 * 8 bits per byte gives you 0x128000, which is suspiciously close to 0x120000. Given this I've assumed it has to be the bitmap size in bits. Why does this value appear twice though? No idea, but it also doesn't really matter.

What about 0xfb20? My suspicion here was that it is the number of actual pages in the dump file / number of bits set in the bitmap (same thing). This can be verified either by going through the whole bitmap or by comparing the file size (263505368 bytes) minus headers to 0xfb20 pages. So that's 263505368 - 0x27000 = 263345624 on one side and 0xfb20 * 0x1000 = 263323648 on the other. That's pretty close – enough to assume this hypothesis is confirmed. Where does the difference between these values come from? It turns out there are some additional structures at the end of the file, though these aren't important.
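The arithmetic from the last two paragraphs can be double-checked in a few lines of Python. All the numbers come from the text above; only the variable names are mine.

```python
PAGE_SIZE = 0x1000

header_start = 0x2000   # where the SDMPDUMPSDMP structure begins
data_start = 0x27000    # where the dumped pages begin
file_size = 263505368   # size of MEMORY.DMP in bytes
bitmap_bits = 0x120000  # candidate: bitmap size in bits
used_pages = 0xFB20     # candidate: number of pages present in the file

# Upper bound on the bitmap size if everything before the data were bitmap.
max_bitmap_bits = (data_start - header_start) * 8
assert bitmap_bits <= max_bitmap_bits

# Bytes after the headers versus bytes needed for the dumped pages.
data_bytes = file_size - data_start
page_bytes = used_pages * PAGE_SIZE
trailing = data_bytes - page_bytes

print(hex(max_bitmap_bits), data_bytes, page_bytes, trailing)
# prints: 0x128000 263345624 263323648 21976
```

So the pages account for all but 21976 bytes of the data area, which matches the observation that some additional structures sit at the end of the file.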

Now for the bitmap itself. I already knew from WinDBG that the page at 0x0000 didn't exist, that the next 7 pages did exist, and that there was a large gap after that. So the first byte of the bitmap would be either 0b01111111 (hex 0x7f) or 0b11111110 (hex 0xfe), depending on whether the bitmap uses the MSB-first or the LSB-first approach respectively (that's Most-Significant-Bit-first, i.e. bit 7 denotes page 0, or Least-Significant-Bit-first, i.e. bit 0 denotes page 0). At offset 0x2028 there's an fe value followed by a lot of zeroes ("large gap"), so it's fair to suggest that's what we are looking for, and that the bitmap is LSB-first. I've calculated the addresses of the next pages which should exist – that's the 3f byte at 0x2048, i.e. bitmap byte 0x20, which covers pages 0x100 and up and so translates to memory pages at addresses 0x100000 - 0x105fff – and checked in WinDBG if they exist (and that the pages around them don't). WinDBG said I'm right.
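The bit order question can be illustrated with a tiny helper that decodes a single bitmap byte under both conventions. This is just a sketch; the function name and parameters are made up for the example.

```python
def pages_in_byte(value, first_page=0, lsb_first=True):
    """List the page numbers marked as present by one bitmap byte.

    first_page is the page number described by the first bit of the byte.
    """
    pages = []
    for bit in range(8):
        if (value >> bit) & 1:
            # LSB-first: bit 0 is the first page; MSB-first: bit 7 is.
            index = bit if lsb_first else 7 - bit
            pages.append(first_page + index)
    return sorted(pages)


print(pages_in_byte(0xFE, lsb_first=True))   # [1, 2, 3, 4, 5, 6, 7]
print(pages_in_byte(0xFE, lsb_first=False))  # [0, 1, 2, 3, 4, 5, 6]
```

Only the LSB-first reading of 0xfe agrees with what WinDBG showed (page 0 missing, pages 1 to 7 present).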

What about the 28 c0 e8 03 80 fa ff ff bytes, which are just before the bitmap, but after the second magic? No idea. They didn't seem to be important for my purpose anyway.

To sum up, my reversed structure looked somewhat like this:

0x2000: "SDMPDUMPSDMP" magic value
0x200c: 4 bytes: start offset of pages
0x2010: number of pages total
0x2014: number of actually occupied pages
0x2018: number of pages total again ???
0x201c: "SDMP" magic value
0x2020: 8 bytes of no idea "FFFFFA8003E8C028h" ???
0x2028: start of bitmap (LSB first)
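For a quick sanity check outside of Volatility, the same layout can be expressed with Python's struct module and run against the bytes from the hex dump above. The field names are ad-hoc labels for this sketch, not official ones.

```python
import struct
from collections import namedtuple

# Little-endian, no padding: 12s + 4 * I + 4s + Q = 0x28 bytes.
HEADER = struct.Struct("<12sIIII4sQ")
SummaryHeader = namedtuple(
    "SummaryHeader",
    "signature data_start pages_total pages_used pages_total2 signature2 unknown",
)

# The 0x28 bytes found at file offset 0x2000, copied from the hex dump.
raw = bytes.fromhex(
    "53444d5044554d5053444d5000700200"
    "0000120020fb00000000120053444d50"
    "28c0e80380faffff"
)

hdr = SummaryHeader(*HEADER.unpack(raw))
print(hdr.signature, hex(hdr.data_start), hex(hdr.pages_total), hex(hdr.pages_used))

# The bitmap starts right after the header; rounding its end up to the
# next 0x1000 boundary lands exactly on the start of the page data.
bitmap_end = 0x2000 + HEADER.size + (hdr.pages_total + 7) // 8
aligned_end = (bitmap_end + 0xFFF) & ~0xFFF
print(hex(bitmap_end), hex(aligned_end))
```

The last two lines are an extra consistency check: a bitmap of 0x120000 bits ends at 0x26028, and aligning that to 0x1000 gives 0x27000, the same value as the data start field.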

As a Volatility 2 vtype it would look like this:

{
    '_FULL_DUMP64_GYNVAEL' : [ 0x28, {
        'Signature' : [ 0x0, ['array', 12, ['unsigned char']]],
        'DataStart' : [ 0xc, ['unsigned int']],
        'NumberOfPages1' : [ 0x10, ['unsigned int']],
        'NumberOfUsed' : [ 0x14, ['unsigned int']],
        'NumberOfPages2' : [ 0x18, ['unsigned int']],
        'Signature2' : [ 0x1c, ['array', 4, ['unsigned char']]],
        'NoIdea' : [ 0x20, ['unsigned long long']],
        'Buffer' : [ 0x28, ['array', lambda x: (x.NumberOfPages1 + 7) / 0x8, ['unsigned char']]],
        'Buffer2' : [ 0x28, ['array', lambda x: (x.NumberOfPages1 + 31) / 32, ['unsigned int']]]
    } ],
}

As you can see I've actually started implementing Volatility 2 support during the CTF, but then I decided it would be faster to just convert the MEMORY.DMP file to a raw (linear / headerless) memory dump. This would save me debugging, as well as "proper" structure parsing. So here's the ad-hoc code for the converter:

with open("MEMORY.DMP", "rb") as f:
    d = f.read()

# A page of zeroes stands in for every page missing from the dump.
zeroes = bytearray(0x1000)

with open("MEMORY.RAW", "wb") as f:
    page = 0x27000    # file offset of the next dumped page
    offset = 0x2028   # file offset of the bitmap
    for i in range(0x120000):  # bitmap size in bits
        byte = i // 8
        bit = i % 8
        if ((d[offset+byte] >> bit) & 1) == 0:
            f.write(zeroes)
            continue
        print("%16x %16x" % (page, i * 0x1000))
        f.write(d[page:page+0x1000])
        page += 0x1000

There isn't much magic there – I've basically hardcoded the offsets and field values I've seen in the hexeditor, went through the bitmap bit by bit, dumped every page that existed and filled the space between with zeroes. The resulting file was of course much larger (4.5GB in fact, versus 252MB of the original).
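For reuse on other dumps of the same flavour, the converter could read the values from the header instead of hardcoding them. The following is only a sketch under the assumptions derived above (header at 0x2000, LSB-first bitmap right after it); it also seeks over missing pages instead of writing zeroes, which on filesystems with sparse file support keeps the output small on disk. The zero-fill behaviour of seeking and truncate() past the end of the file is platform dependent, though it holds on common systems.

```python
import struct

PAGE_SIZE = 0x1000
HEADER = struct.Struct("<12sIIII4sQ")  # layout reverse-engineered above


def convert(src_path, dst_path, header_offset=0x2000):
    """Convert a bitmap-style dump to a raw image; returns pages written."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(header_offset)
        sig, data_start, total_bits, _used, _, sig2, _ = HEADER.unpack(
            src.read(HEADER.size)
        )
        if sig != b"SDMPDUMPSDMP" or sig2 != b"SDMP":
            raise ValueError("unexpected magic values")
        bitmap = src.read((total_bits + 7) // 8)

        src.seek(data_start)
        written = 0
        for page in range(total_bits):
            if (bitmap[page // 8] >> (page % 8)) & 1:
                # Dumped pages are stored back to back in the source file.
                dst.seek(page * PAGE_SIZE)
                dst.write(src.read(PAGE_SIZE))
                written += 1
        # Extend the output so trailing missing pages are part of the image.
        dst.truncate(total_bits * PAGE_SIZE)
    return written
```

Calling convert("MEMORY.DMP", "MEMORY.RAW") should then produce an image equivalent to the one from the ad-hoc script, and the return value can be compared against the "number of actually occupied pages" field as a sanity check.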

One missing piece of the puzzle was to tell Volatility 2 where to look for the page table. Normally Volatility 2 would take it from the DirectoryTableBase field inside the _DMP_HEADER(64). However raw memory dumps don't have metadata headers, so one needs to pass this information via the command line. In my case I've actually read the value of cr3 register in WinDBG instead of just checking the DirectoryTableBase value in MEMORY.DMP – I somehow missed that it's there.
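For completeness, pulling DirectoryTableBase straight out of the file could look roughly like this. Treat the details as assumptions of this sketch rather than a specification: a 64-bit crash dump header that starts with the "PAGE" and "DU64" magic values and keeps DirectoryTableBase as a 64-bit little-endian value at offset 0x10, which is how I read the _DMP_HEADER64 definition in Volatility 2's crash_vtypes.py.

```python
import struct


def read_dtb(path):
    """Read DirectoryTableBase from a 64-bit crash dump header.

    Assumes (not verified against any official specification) that the
    header starts with b"PAGE" + b"DU64" and stores DirectoryTableBase
    at offset 0x10.
    """
    with open(path, "rb") as f:
        header = f.read(0x18)
    if header[0:4] != b"PAGE" or header[4:8] != b"DU64":
        raise ValueError("not a 64-bit crash dump header")
    (dtb,) = struct.unpack_from("<Q", header, 0x10)
    return dtb
```

The result, formatted with hex(), would be the value to pass to Volatility 2 via --dtb.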

Final result:

$ python2 vol.py --profile Win7SP1x64 -f ../MEMORY.RAW --dtb=0x9bffa000 pslist
Volatility Foundation Volatility Framework 2.6
Offset(V)          Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start
------------------ -------------------- ------ ------ ------ -------- ------ ------ ----------------------------
0xfffffa80036ea040 System                    4      0     84      485 ------      0 2022-10-09 10:32:49 UTC+0000
0xfffffa8003f4c040 smss.exe                252      4      2       29 ------      0 2022-10-09 10:32:49 UTC+0000
0xfffffa800400f620 csrss.exe               332    324     11      475 ------      0 2022-10-09 10:32:58 UTC+0000
0xfffffa8004018b00 csrss.exe               384    376     12      141 ------      0 2022-10-09 10:33:00 UTC+0000
0xfffffa80040119b0 wininit.exe             392    324      6       89 ------      0 2022-10-09 10:33:00 UTC+0000
0xfffffa8005865060 winlogon.exe            428    376      7      133 ------      0 2022-10-09 10:33:00 UTC+0000
0xfffffa800589d350 services.exe            488    392     15      203 ------      0 2022-10-09 10:33:01 UTC+0000
0xfffffa80058b3500 lsass.exe               504    392     11      706 ------      0 2022-10-09 10:33:01 UTC+0000
0xfffffa80058b6240 lsm.exe                 512    392     10      148 ------      0 2022-10-09 10:33:01 UTC+0000
0xfffffa800591d940 svchost.exe             604    488     10      314 ------      0 2022-10-09 10:33:05 UTC+0000
...

Next step would be to add full support in Volatility 2, but I'm not sure that's actually still supported, so perhaps instead I should look at Volatility 3 and think if that would be useful there.

In any case, that's it!
