The post below was originally published (in Polish) on the 4programmers.net forum, in the "Hello world bez bibliotek i asm" thread.
--post start--
A piece of code from me - please note that I wanted to demonstrate a method and not create an always-working-code :)
The code was written to work on Linux (32-bit x86), but you can use the same method on 64-bit systems or on Windows, both 32- and 64-bit.
The code does not use any libraries (it doesn't even look for any in the memory) and there is no inline assembly/etc (well, no direct or explicit inline assembly/etc ;>).
I've placed the explanation of the method below the code.
volatile unsigned int something_wicked_this_way_comes(
int a, int b, int c, int d) {
a ^= 0xC3CA8900; b ^= 0xC3CB8900; c ^= 0xC3CE8900; d ^= 0x80CDF089;
return a+b+c+d;
}
void* find_the_witch(unsigned short witch) {
unsigned char *p = (unsigned char*)something_wicked_this_way_comes;
int i;
for(i = 0; i < 50; i++, p++) {
if(*(unsigned short*)p == witch) return (void*)p;
}
return (void*)0;
}
typedef void (*gadget)() __attribute__((fastcall));
int main(void) {
gadget eax_from_esi_call_int = (gadget)find_the_witch(0xF089);
gadget set_esi = (gadget)find_the_witch(0xCE89);
gadget set_ebx = (gadget)find_the_witch(0xCB89);
gadget set_edx = (gadget)find_the_witch(0xCA89);
if(!eax_from_esi_call_int) return 1;
if(!set_esi) return 3;
if(!set_ebx) return 4;
if(!set_edx) return 5;
set_edx(12), set_ebx(1), set_esi(4);
eax_from_esi_call_int("Hello World\n");
return 0;
}
This code uses a method really similar to the JIT-language exploitation techniques when the memory is protected via XD/NX/XN/DEP/etc - i.e. I tried to implicitly place in executable memory a couple of "gadgets" (think: ret2libc or return oriented programming - http://gynvael.coldwind.pl/?id=149) and then use them to make a syscall call into the kernel (so, there are no libraries needed at all, but of course there is interaction with the environment, i.e. the Linux kernel).
These gadgets are placed in the something_wicked_this_way_comes function as the constants used in the XORs.
a ^= 0xC3CA8900; b ^= 0xC3CB8900; c ^= 0xC3CE8900; d ^= 0x80CDF089;
The above C code on assembly / machine code level might look like this (compiled using gcc; disassembled using objdump afair):
[...]
6: 35 00 89 ca c3 xor eax,0xc3ca8900
b: 89 45 08 mov DWORD PTR [ebp+0x8],eax
e: 8b 45 0c mov eax,DWORD PTR [ebp+0xc]
11: 35 00 89 cb c3 xor eax,0xc3cb8900
16: 89 45 0c mov DWORD PTR [ebp+0xc],eax
19: 8b 45 10 mov eax,DWORD PTR [ebp+0x10]
1c: 35 00 89 ce c3 xor eax,0xc3ce8900
21: 89 45 10 mov DWORD PTR [ebp+0x10],eax
24: 8b 45 14 mov eax,DWORD PTR [ebp+0x14]
27: 35 89 f0 cd 80 xor eax,0x80cdf089
[...]
So, if we disassembled the code with a slight misalignment (one or two bytes), we would get code that differs a little:
6: 35 00 89 ca c3 → mov edx, ecx ; ret
11: 35 00 89 cb c3 → mov ebx, ecx ; ret
1c: 35 00 89 ce c3 → mov esi, ecx ; ret
27: 35 89 f0 cd 80 → mov eax, esi ; int 0x80
Thanks to the above I'm certain that in this case the needed gadgets do reside in memory (of course, if the compiler worked in a slightly different way the opcodes might never show up; but in this specific compilation case they did).
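The overlap between the constants and the gadget opcodes can be checked on paper, or with a few lines of C. The sketch below is only an illustration and not part of the original post: it stores a constant in little-endian byte order (the order in which x86 encodes an immediate) and reports where a two-byte opcode starts inside it. The helper names store_le32 and gadget_offset are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in little-endian byte order,
   the same order in which x86 encodes an immediate operand. */
void store_le32(uint32_t value, unsigned char out[4]) {
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Hypothetical helper: return the offset (0..2) at which the two-byte
   opcode op0 op1 starts inside the 4-byte immediate, or -1 if absent. */
int gadget_offset(uint32_t constant, unsigned char op0, unsigned char op1) {
    unsigned char bytes[4];
    int i;
    store_le32(constant, bytes);
    for (i = 0; i + 1 < 4; i++) {
        if (bytes[i] == op0 && bytes[i + 1] == op1) return i;
    }
    return -1;
}
```

For 0xC3CA8900 the bytes are 00 89 CA C3, so the opcode 89 CA starts at offset 1 of the immediate; counting the 0x35 opcode byte in front of it, that is the two-byte misalignment. 0x80CDF089 starts directly with 89 F0, which gives the one-byte case.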
Going further into the code, I use the find_the_witch function to actually find these gadgets in memory, inside the something_wicked_this_way_comes function (the argument of the scanning function is the first two bytes of the gadget I'm looking for, represented as a little endian uint16_t).
gadget eax_from_esi_call_int = (gadget)find_the_witch(0xF089);
gadget set_esi = (gadget)find_the_witch(0xCE89);
gadget set_ebx = (gadget)find_the_witch(0xCB89);
gadget set_edx = (gadget)find_the_witch(0xCA89);
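The scan itself can be tried out on plain data, without touching any function's machine code. Below is a minimal sketch, assuming a byte buffer with a known length; find_pattern and xor_edx_insn are hypothetical names introduced here, and the function is a bounds-checked variant of find_the_witch, not the code from the post. It compares single bytes instead of casting to unsigned short*, so it does not depend on unaligned reads or on the host's endianness.

```c
#include <stddef.h>

/* Hypothetical, bounds-checked variant of find_the_witch: scans len bytes
   for a two-byte pattern given as a little-endian 16-bit value. */
const unsigned char *find_pattern(const unsigned char *buf, size_t len,
                                  unsigned short pattern) {
    unsigned char lo = (unsigned char)(pattern & 0xFF);        /* first byte in memory */
    unsigned char hi = (unsigned char)((pattern >> 8) & 0xFF); /* second byte in memory */
    size_t i;
    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == lo && buf[i + 1] == hi) return buf + i;
    }
    return NULL; /* pattern not present */
}

/* Bytes copied from the disassembly listing: xor eax,0xc3ca8900 */
const unsigned char xor_edx_insn[5] = { 0x35, 0x00, 0x89, 0xCA, 0xC3 };
```

Searching xor_edx_insn for 0xCA89 returns a pointer two bytes into the instruction, which is exactly where the mov edx, ecx ; ret gadget begins.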
One more important thing - here's the gadget type:
typedef void (*gadget)() __attribute__((fastcall));
It has two essential features:
1. The unspecified number of arguments denoted by C's () (please note that in C++ () means (void), but in C before C23 it's closer to (...)).
2. The fastcall convention, thanks to which the function arguments are placed in general purpose registers and not on the stack (for the first few arguments, of course) - in this specific case the first argument is always placed in the ecx register (the gadgets are designed to use this fact).
After that I "construct" a simple assembly-like hello world using the gadgets I have:
set_edx(12), set_ebx(1), set_esi(4);
eax_from_esi_call_int("Hello World\n");
This will be executed as follows:
(main) mov ecx, 12
mov eax, set_edx
call eax
(gadget) mov edx, ecx
ret
(main) ...
... ...
(gadget) ...
int 0x80
Or, skipping the parts from the main() function:
[gadget 1] mov edx, 12 (length of the string)
[gadget 2] mov ebx, 1 (stdout)
[gadget 3] mov esi, 4 (sys_write)
[handled by fastcall] mov ecx, address "Hello World\n"
[gadget 4] mov eax, esi
[gadget 4] int 0x80
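For comparison, the register setup above corresponds to an ordinary write call. The sketch below deliberately goes through the libc write() wrapper, which the post itself avoids, so treat it only as a way to see which value lands in which argument; hello_via_write is a hypothetical name.

```c
#include <unistd.h>

/* Same arguments the gadgets load into registers before int 0x80:
   ebx = 1 (stdout), ecx = address of the string, edx = 12 (length).
   eax = 4 (sys_write) has no counterpart here, because the libc
   wrapper chooses the syscall number itself. */
long hello_via_write(void) {
    static const char msg[] = "Hello World\n";
    /* sizeof msg counts the terminating NUL, so subtract 1 to get 12 */
    return (long)write(1, msg, sizeof msg - 1);
}
```

On success the call returns the number of bytes written, i.e. the same 12 that set_edx(12) passes in the gadget version.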
Of course I'm missing a C3 (ret) after the int 0x80 (no place left in a 4-byte gadget) so the program will crash AFTER writing out "hello world". However it would be fairly simple to fix this :)
Test:
$ gcc -m32 test.c -O0
$ ./a.out
Hello World
Segmentation fault (core dumped)
$
--post stop--
An elegant fix to the Segmentation fault problem was posted by Azarien in the same thread: he created another function called graceful_exit which, using the existing gadgets, invokes the exit syscall. He then added a call to this function in something_wicked_this_way_comes, just after d ^= 0x80CDF089;. Thanks to this, after the gadget 89 F0 CD 80 is executed, the CPU executes whatever comes next after the CD 80 (int 0x80), and that is the call to the graceful_exit function.
The said patch looks like this (Azarien's changes are the graceful_exit function and the call to it; there was another change in the patch - the gadget type declaration was moved to the top of the file - but I'll skip this in the listing):
void graceful_exit()
{
set_ebx(0);
set_esi(1);
eax_from_esi_call_int(0);
}
volatile unsigned int something_wicked_this_way_comes(
int a, int b, int c, int d) {
a ^= 0xC3CA8900; b ^= 0xC3CB8900; c ^= 0xC3CE8900; d ^= 0x80CDF089;
graceful_exit();
return a+b+c+d;
}
As I said, a very elegant solution :)
It's also worth taking a look at MSM's post and the discussion underneath it (in Polish) - MSM's method uses the technique, commonly known in RE/shellcoding, of looking up the kernel32 address in the loaded DLLs list in the PEB, finding GetProcAddress in the import tables and acquiring the addresses of all API functions required to print out "Hello World" (that being said, it kinda relies on some libraries; still, fun to look at).
And that's that. Cheers ;>
Comments:
char _start[] __attribute__ ((section(".text#"))) = {
0xE8, 0x0D, 0x00, 0x00, 0x00, 0x48, 0x65, 0x6C, 0x6C, 0x6F,
0x20, 0x57, 0x6F, 0x72, 0x6C, 0x64, 0x21, 0x0A, 0x5E, 0x31,
0xC0, 0x89, 0xC2, 0xFF, 0xC0, 0x89, 0xC7, 0xB2, 0x0D, 0x0F,
0x05, 0x48, 0x31, 0xFF, 0x6A, 0x3C, 0x58, 0x0F, 0x05};
Hello World!
The variant without -nostdlib parameter is similar, but it also has a main():
[...]
int
main(void)
{
((void (*)(void))a)();
return 0;
}
Sure, that's the most obvious solution, but it doesn't really respect the rule of "no direct inline assembly/etc" - in this case it's inline machine code, Turbo Pascal style :)
We've talked about this kind of solution on the Polish side of this post.
That being said - sure, it would work ;)
--Thanks
You forgot -nostdlib
So you are allowing libraries. And that makes the whole thing a lot easier.
extern long int syscall (long int __sysno, ...) __attribute__ ((__nothrow__ , __leaf__));
int main(void)
{
syscall( 1, 1, "Hello world!\n", 13);
}
I think I was trying to make sure the function won't be optimized away.
@John
*cough* well the title of this post includes the phrase "without libraries" so I thought -nostdlib goes without saying :) *cough*