DOS, how to break on EP (qemu, bochs) (last update: 2013-11-20, created: 2013-10-20)
A problem I've recently encountered:
Having a CPU-level debugger (bochsdbg, qemu+gdb stub), how to break on the Entry Point of a DOS program?
The above task is obviously trivial when using standard DOS debuggers - you run the debugger with the program executable as the argument, then magic happens, and you end up looking at the debugger having already hit the breakpoint at the EP.
On CPU-level debuggers (or actually: CPU emulators with embedded debugging features) that's a little more complicated, since there is no notion of a program or an application - that's strictly OS stuff, which sits a layer way above the CPU emulation. So, if there is no notion of a program, then there is no notion of a program EP, and so it's hard to know where to break.
(One exception here is the DOSBox debugger, which has a "debug" command inside the emulated environment which understands programs and their EPs.)
So, one could just take a look at the MZ file, determine the entry point (IMAGE_DOS_HEADER.e_ip), and place a "hardware" breakpoint on that address - right?
The answer is of course "yes and no".
The missing piece of information here is the CS segment, which is actually chosen by the DOS MZ loader.
So "no", you can't just place the breakpoint without knowing which CS the EP will end up in.
And "yes", if you can set a breakpoint as "ip==e_ip && cs==whatever", then you are all set (though that might hit a few false positives at first too).
Unfortunately neither bochsdbg nor the qemu+gdb stub seemed to be able to set that kind of breakpoint (though you could still do it with bochs instrumentation of course).
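To make the "missing CS" problem concrete, here is a minimal Python sketch that pulls e_ip and e_cs out of an MZ header and combines them with a load segment. The function names are made up for illustration. The load segment is an input you have to learn from the running system, since the file itself cannot tell you what the loader will pick.

```python
import struct
from typing import Tuple

def mz_entry_point(header: bytes) -> Tuple[int, int]:
    """Return (e_cs, e_ip) as stored in an MZ header (hypothetical helper)."""
    if len(header) < 0x18 or header[:2] not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    # e_ip lives at offset 0x14 and e_cs at 0x16, both little-endian words.
    e_ip, e_cs = struct.unpack_from("<HH", header, 0x14)
    return e_cs, e_ip

def runtime_entry(load_segment: int, e_cs: int, e_ip: int) -> Tuple[int, int]:
    """CS:IP at run time, once the loader has chosen load_segment."""
    # e_cs is relative to the load segment; segment math wraps at 16 bits.
    return (load_segment + e_cs) & 0xFFFF, e_ip
```

The file alone only gives the second half of the picture: e_ip is fixed, but the final CS depends on where DOS decides to put the image.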
Getting the break on EP
One of the ideas I had was to analyze the DOS MZ loader (or actually the INT 21h/AH=4Bh interrupt aka EXEC - LOAD AND/OR EXECUTE PROGRAM), find how it transfers control to the program, and set a breakpoint there.
For this I used:
- DOS on some floppy image or hard disk image
- a CPU-level debugger (I used qemu+gdb stub+gdb)
- IDA for code analysis
I started (after DOS was up and running) by looking up the address of the INT 21h handler in the IVT - that boils down to displaying two 16-bit words at physical address 0x21*4:
That basically means that the handler is at 000F:40F8. I proceeded with dumping the whole 000F:* segment to a file and loading it in IDA.
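The IVT decoding above can be sketched in a few lines of Python. Each real-mode vector is 4 bytes: the offset word first, then the segment word, both little-endian. The helper names are hypothetical, and the example bytes in the usage note are reconstructed from the handler address quoted in the text, not taken from an actual memory dump.

```python
import struct
from typing import Tuple

def ivt_entry_address(vector: int) -> int:
    """Physical address of a real-mode interrupt vector (4 bytes each)."""
    return vector * 4

def decode_ivt_entry(raw: bytes) -> Tuple[int, int]:
    """Decode 4 raw IVT bytes into (segment, offset)."""
    # The offset word is stored first, the segment word second.
    offset, segment = struct.unpack("<HH", raw[:4])
    return segment, offset

def linear_address(segment: int, offset: int) -> int:
    """Real-mode segment:offset to linear address (no A20 wrap handling)."""
    return (segment << 4) + offset
```

For INT 21h, ivt_entry_address(0x21) gives 0x84, and the bytes F8 40 0F 00 at that address would decode to 000F:40F8.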
Now I needed to find the handler for the 4Bh function. The code at the beginning looks like this:
And it's followed by what looks like an environment setup for executing more complex code. The interesting part starts about 1-2 pages below:
So this code basically picks up the handler function address from the lookup table placed at 000F:3E9E (NOTE: in case of other DOS versions this can be a totally different address).
Looking at 000F:3E9E+4Bh*2 (each table entry is a 16-bit offset) I got the address of the "EXEC" implementation.
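The table lookup is easy to replay against the dumped segment. The sketch below assumes the dump file starts at offset 0 of the segment, so file offsets equal segment offsets, and that the table is indexed directly by the function number with 2-byte entries, as described above. The function name is made up for illustration.

```python
import struct

def dispatch_target(segment_dump: bytes, table_offset: int, function: int) -> int:
    """Return the 16-bit handler offset for an INT 21h function number.

    Assumes segment_dump begins at offset 0 of the handler's segment and
    that the table holds one little-endian word per function number.
    """
    entry = table_offset + function * 2
    (target,) = struct.unpack_from("<H", segment_dump, entry)
    return target
```

With the table at 0x3E9E, function 4Bh lands on the entry at 0x3E9E + 0x96 = 0x3F34. On other DOS versions the table offset will differ, so table_offset is a parameter rather than a constant.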
The function at that address (000F:9B5F) is very long, but the important thing is that at the very bottom there are two retf instructions, which one can assume do the transfer (well, it would have to be either a jmp far, a call far, or a ret far).
So, setting the breakpoints at 000F:A0D9 and 000F:A0EE (in case of my DOS version) will allow you to:
1. Read full EP (CS:IP) from the stack.
2. Single-step into the EP (
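Reading the EP off the stack works because a 16-bit retf pops IP first and CS second. As a sketch, assuming you have a snapshot of guest memory as a bytes object plus the SS and SP values at the moment the breakpoint hits (the function name and inputs are hypothetical):

```python
import struct
from typing import Tuple

def far_return_target(memory: bytes, ss: int, sp: int) -> Tuple[int, int]:
    """Return the (cs, ip) that a 16-bit retf at SS:SP would jump to."""
    top = (ss << 4) + sp  # linear address of the top of the stack
    # retf pops IP first, then CS; both are little-endian words.
    ip, cs = struct.unpack_from("<HH", memory, top)
    return cs, ip
```

This ignores SP wrapping around the end of the stack segment, which should not matter for a quick look at the two words on top of the stack.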
Let's test it:
So in my case the EP was 1FAC:0000. Now I could easily put a breakpoint there and continue debugging through the code.
To switch gdb into disassembling 16-bit code it seems you have to do this:
To switch back into 32-bit code:
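As a rough sketch of how the gdb side could be scripted once the EP is known, the Python helper below builds a list of gdb commands. It rests on a few assumptions: that qemu was started with its gdb stub on the default port 1234, that the stub matches breakpoints against linear addresses (so CS:IP has to be flattened to cs*16+ip), and that gdb's `set architecture i8086` setting is what selects 16-bit disassembly (with `set architecture i386` going back to 32-bit). The helper name is made up.

```python
from typing import List

def gdb_commands(cs: int, ip: int, port: int = 1234) -> List[str]:
    """Build gdb commands to break at a real-mode CS:IP (hypothetical helper)."""
    # Assumption: the stub compares breakpoints with the linear address.
    linear = (cs << 4) + ip
    return [
        f"target remote localhost:{port}",  # assumes qemu's default gdb port
        "set architecture i8086",           # disassemble as 16-bit code
        f"break *0x{linear:x}",
        "continue",
    ]
```

For the EP found above, "\n".join(gdb_commands(0x1FAC, 0x0000)) could be saved to a file and passed to gdb with its -x option.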
And that's about it.
Even more random notes
If you like this kind of setup (emulator+debugger), check out the Heisenberg project - it's basically qemu + gdb + volatility, quite a cool idea.