What is ASLR?

ASLR stands for Address Space Layout Randomisation. It was introduced in Windows with Vista, released to customers in early 2007. Before that, Linux had already been enabling ASLR by default since kernel 2.6.12, released in June 2005. ASLR tries to make life more difficult for Bad Guys™ by randomizing how an executable is laid out in memory.

When I first wanted to figure out how ASLR works internally, I came across a lot of articles, but not one that tells the entire story. After I completed my research and fully understood how ASLR works in Windows, I decided to write an article about it.

What is being moved around?

ASLR moves the following memory segments:

The stack
The heap
Code segments

How does this make life difficult for Bad Guys™?

When someone finds a bug in a program (for instance a buffer overflow), it can potentially be abused to take control of the program’s flow of execution by overwriting key values, such as the return addresses on the stack. Before DEP (or NX) came along, the common approach was to inject shellcode onto the stack and then overwrite the return address on the stack with the address of the start of the shellcode. This is no longer possible because the stack is now a so-called ‘non-executable’ memory segment; if the CPU starts executing code on the stack (EIP is pointing somewhere on the stack), the CPU will throw an exception and the program will crash without the harmful code being executed.

DEP can be worked around by executing code that already exists in an executable section of the process, such as the program itself or a library, instead of loading your own code onto the stack (a trick called return-to-libc). For instance: writing “shutdown /s” somewhere into memory, then placing a pointer to that string on the stack as an argument and changing the return address to the address of system() will make the target computer shut down. All this requires knowledge of the memory layout of the program, which is what ASLR was designed to obfuscate.
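
To make the “overwriting the return address” part concrete, here is a minimal sketch in C. It does not smash a real stack: the struct toy_frame and the helpers unchecked_copy() and build_payload() are hypothetical names made up for this illustration, and the struct only imitates a local buffer sitting next to a saved return address. The point is that the attacker has to put a meaningful address into the payload, and that is the part ASLR makes hard to get right.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a stack frame: a local buffer followed by the saved
 * return address. Real frames are laid out by the compiler; this
 * struct is only an illustration. */
struct toy_frame {
    char      buffer[16];
    uintptr_t return_address;
};

/* Copies `len` bytes into the frame starting at `buffer` without
 * checking that they fit in `buffer`, which is the bug being shown.
 * The copy is still bounded by the size of the whole struct so that
 * the example itself stays well defined. */
static void unchecked_copy(struct toy_frame *frame,
                           const unsigned char *input, size_t len)
{
    assert(len <= sizeof *frame);
    memcpy((unsigned char *)frame, input, len);
}

/* Builds an input that fills the buffer with filler bytes and then
 * supplies a new return address chosen by the "attacker".
 * Returns the number of payload bytes written to `out`. */
static size_t build_payload(unsigned char *out, uintptr_t target)
{
    size_t ret_offset = offsetof(struct toy_frame, return_address);

    memset(out, 'A', ret_offset);
    memcpy(out + ret_offset, &target, sizeof target);
    return ret_offset + sizeof target;
}
```

Feeding build_payload() the address of a library function and passing the result to unchecked_copy() leaves that address in return_address. If the address is wrong by even one byte, a real program would crash instead of jumping where the attacker wanted.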

How does that impact my code?

It shouldn’t. If ASLR breaks your code, you should definitely fix it. You probably made assumptions about where things are located in the address space of your application, which is bad practice and not very portable.

ASLR in action

We can easily demonstrate the effect of ASLR with a simple piece of C code:

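#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>	/* MessageBoxA(); link with user32.lib */

int main(void) {
	int stack;			/* lives on the stack */
	void *heap = malloc(1024);	/* lives on the heap */

	/* MSVC prints %p as bare hex digits, hence the explicit 0x prefix. */
	printf("Stack: 0x%p - Heap: 0x%p - Code: 0x%p - "
	       "strlen(): 0x%p - MessageBoxA(): 0x%p\n",
	       (void *)&stack, heap, (void *)&main,
	       (void *)&strlen, (void *)&MessageBoxA);

	free(heap);
	return 0;
}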

Life before ASLR

It is important to note that the Visual C++ linker enables ASLR in the executable by default. So first, we’ll compile the program with /DYNAMICBASE:NO to disable ASLR. Then, we run the program a few times and line up the results:

Stack	Heap	Code	strlen()	MessageBoxA()
0x0019FF34	0x006BF6B8	0x00401050	0x752207A0	0x7784D740
0x0019FF34	0x005EFEE8	0x00401050	0x752207A0	0x7784D740
0x0019FF34	0x005608E8	0x00401050	0x752207A0	0x7784D740
After reboot
0x0019FF34	0x0057E0F0	0x00401050	0x75C2EFE0	0x75D58830
0x0019FF34	0x0068DDB0	0x00401050	0x75C2EFE0	0x75D58830
0x0019FF34	0x004EEBB8	0x00401050	0x75C2EFE0	0x75D58830

The heap moves around, but the stack and the code don’t. The heap can move because malloc() returns a pointer; code can never safely assume that memory allocated on the heap ends up in the same place. The library functions stay in place between runs but have moved after the reboot: library locations are randomized at boot time, even when the executable itself is built without ASLR.

Enabling ASLR

To enable ASLR, we simply recompile the code with the /DYNAMICBASE switch, then run it a few times again:

Stack	Heap	Code	strlen()	MessageBoxA()
0x008FFAD0	0x00C4F720	0x00F21050	0x750F07A0	0x76AAD740
0x0053FCF4	0x005AF6B8	0x00F21050	0x750F07A0	0x76AAD740
0x005DF7F0	0x0014F720	0x00F21050	0x750F07A0	0x76AAD740
After reboot
0x003EFA28	0x0011EBF8	0x00A11050	0x76E007A0	0x774AD740
0x007CFDE0	0x00CD0918	0x002B1050	0x76E007A0	0x774AD740
0x00D3FBFC	0x00E5F8B0	0x002B1050	0x76E007A0	0x774AD740

Note that the stack and heap move on every run, while the code mostly stays at the same address between runs and only changes around a reboot. This is likely because Windows caches the layout once the executable has been mapped into memory and reuses it until it drops out of the cache or the system is rebooted, since mapping and rebasing an executable can take some time (see below).
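
One more thing the table hints at: the code address never changes in its lowest 16 bits. A small sketch to check this, with a helper I named varying_bits() for the occasion (it is not part of any Windows API) and the three distinct values from the Code column above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a mask with a 1 in every bit position that differs between
 * at least two of the given addresses. */
static uint64_t varying_bits(const uint64_t *addrs, size_t count)
{
    uint64_t mask = 0;

    assert(addrs != NULL && count > 0);
    for (size_t i = 1; i < count; i++)
        mask |= addrs[0] ^ addrs[i];
    return mask;
}

/* Distinct values of the "Code" column, copied from the table above. */
static const uint64_t code_addrs[] = {
    0x00F21050, 0x00A11050, 0x002B1050
};
```

For these three samples varying_bits(code_addrs, 3) yields 0x00DB0000: only bits above the low 16 change. That fits the 64 KB granularity at which Windows places images, although three samples obviously say nothing about how many bits are really random.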

What about 64-bit?

With /DYNAMICBASE:NO:

Stack	Heap	Code	strlen()	MessageBoxA()
0x000000000014FF20	0x00000000004C3C90	0x0000000140001070	0x00007FF8A8263C70	0x00007FF8AB5DFBE0
0x000000000014FF20	0x0000000000583C90	0x0000000140001070	0x00007FF8A8263C70	0x00007FF8AB5DFBE0
0x000000000014FF20	0x0000000000465570	0x0000000140001070	0x00007FF8A8263C70	0x00007FF8AB5DFBE0
After reboot
0x000000000014FF20	0x0000000000424E80	0x0000000140001070	0x00007FFFFF55F090	0x00007FF802798020
0x000000000014FF20	0x00000000005D0D50	0x0000000140001070	0x00007FFFFF55F090	0x00007FF802798020
0x000000000014FF20	0x000000000047F680	0x0000000140001070	0x00007FFFFF55F090	0x00007FF802798020

With /DYNAMICBASE:

Stack	Heap	Code	strlen()	MessageBoxA()
0x000000E58A52F770	0x0000021043823900	0x00007FF71AA21070	0x00007FF8A8263C70	0x00007FF8AB5DFBE0
0x000000449F51FA30	0x0000015B1D7A3C90	0x00007FF71AA21070	0x00007FF8A8263C70	0x00007FF8AB5DFBE0
0x000000312C74FB90	0x000001E25C7E3C90	0x00007FF71AA21070	0x00007FF8A8263C70	0x00007FF8AB5DFBE0
After reboot
0x000000B3508FF9C0	0x00000217580C0D80	0x00007FF656D91070	0x00007FFFFF55F090	0x00007FF802798020
0x0000009F4551FD40	0x0000023911081130	0x00007FF656D91070	0x00007FFFFF55F090	0x00007FF802798020
0x0000008EF03DFD50	0x0000019014F5F6D0	0x00007FF656D91070	0x00007FFFFF55F090	0x00007FF802798020

As you can see, 64-bit or 32-bit doesn’t influence which segments are randomized. It does, however, influence the entropy: since the 64-bit address space is much wider, it is harder for an attacker to guess addresses.

Summarizing

Enabling ASLR for an application randomizes two more segments on top of the heap and the DLLs, which move regardless: the code segment of the application and the stack. For convenience, here’s a table showing when the different base addresses are randomized.

/DYNAMICBASE:?	Code	Stack	Heap	DLLs
YES	OS boot	App. start	App. start	OS boot
NO	Never	Never	App. start	OS boot

How does this work?

When Windows loads an executable, the dynamic linker gets to work. An executable (a PE file) contains a PE header which tells the dynamic linker how the executable should be mapped into virtual memory and which segments should be readable, writable and executable (which is used in DEP). If ASLR is not enabled for a segment, it will be placed at a fixed address (also specified in the PE header).
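
Whether an executable opts in to ASLR is recorded in that same PE header: /DYNAMICBASE sets the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the DllCharacteristics field of the optional header. Below is a rough sketch that checks the bit in a file already read into memory. The function name pe_has_dynamic_base() and the two read helpers are my own; the offsets (e_lfanew at 0x3C, DllCharacteristics at offset 70 of the optional header, for both PE32 and PE32+) come from the PE format, but treat the parsing as a bare minimum rather than a proper PE reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PE_DYNAMIC_BASE 0x0040u  /* IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE */

/* Little-endian readers, so the sketch does not depend on host byte order. */
static uint16_t read_u16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the image opts in to ASLR, 0 if it does not, and -1 if
 * `file` does not look like a PE image. */
static int pe_has_dynamic_base(const unsigned char *file, size_t len)
{
    assert(file != NULL);
    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return -1;

    uint32_t pe = read_u32(file + 0x3C);        /* e_lfanew */
    /* 4-byte signature + 20-byte COFF header + first 72 bytes of the
     * optional header must fit in the buffer. */
    if (pe > len || len - pe < 4 + 20 + 72)
        return -1;
    if (memcmp(file + pe, "PE\0\0", 4) != 0)
        return -1;

    const unsigned char *optional = file + pe + 4 + 20;
    uint16_t dll_characteristics = read_u16(optional + 70);
    return (dll_characteristics & PE_DYNAMIC_BASE) ? 1 : 0;
}
```

Reading both builds of the test program from disk and passing them to this function should give 0 for the /DYNAMICBASE:NO build and 1 for the /DYNAMICBASE build.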

The executable also contains an import address table (IAT). This table tells the OS which functions the program will be calling from external DLLs. The dynamic linker loads the required DLLs into memory and places the addresses of these functions into the IAT in memory. When the application wants to call one of these functions, it loads the address from the IAT and calls it. Please note that the OS always loads the entire DLL into memory, but only fills in the IAT entries for the functions that are actually used by the application.
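
The idea behind the IAT can be modelled in a few lines of C. This is a toy, not how the Windows loader is implemented: toy_exports stands in for the export table of a loaded DLL, struct import_entry for one IAT slot, and resolve_imports() for the loader. All of these names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*import_fn)(int);

/* Stand-ins for functions exported by a DLL. */
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

struct export_entry { const char *name; import_fn address; };
struct import_entry { const char *name; import_fn address; };

/* Hypothetical export table of a loaded "DLL". */
static const struct export_entry toy_exports[] = {
    { "add_one", add_one },
    { "twice",   twice   },
};

/* What the loader does conceptually: look up each imported name and
 * store the resolved address in the import table. Returns the number
 * of entries that could not be resolved. */
static size_t resolve_imports(struct import_entry *iat, size_t count)
{
    size_t unresolved = 0;
    size_t export_count = sizeof toy_exports / sizeof toy_exports[0];

    assert(iat != NULL);
    for (size_t i = 0; i < count; i++) {
        iat[i].address = NULL;
        for (size_t j = 0; j < export_count; j++) {
            if (strcmp(iat[i].name, toy_exports[j].name) == 0) {
                iat[i].address = toy_exports[j].address;
                break;
            }
        }
        if (iat[i].address == NULL)
            unresolved++;
    }
    return unresolved;
}
```

The program never hard-codes where twice() lives. It calls through iat[i].address, so the DLL can be loaded anywhere and only the table has to be filled in again. That indirection is what lets library locations change at every boot without touching the program’s code.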

Because Windows executables often contain absolute addresses (as opposed to ‘position independent code’), the executable loader has to find these absolute addresses, using the relocation information stored in the executable, and recalculate them based on the new offset of the segments if they have been relocated (due to ASLR or their preferred base address not being available). This process is called rebasing.
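
Rebasing boils down to adding one delta to every absolute address the loader knows about. A simplified sketch: the function name rebase_image() and the way fixups are stored (as plain indexes into an array of 32-bit words) are assumptions for the sake of the example; the real relocation table in a PE file is encoded differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds (new_base - preferred_base) to every 32-bit slot listed in
 * `fixups`. Each fixup is an index into `image_words`; slots that are
 * not listed are left untouched. */
static void rebase_image(uint32_t *image_words,
                         const size_t *fixups, size_t fixup_count,
                         uint32_t preferred_base, uint32_t new_base)
{
    /* Unsigned arithmetic wraps around, so this also works when the
     * image moves to a lower address. */
    uint32_t delta = new_base - preferred_base;

    assert(image_words != NULL && fixups != NULL);
    for (size_t i = 0; i < fixup_count; i++)
        image_words[fixups[i]] += delta;
}
```

With the numbers from the 32-bit runs above, and assuming the image bases were 0x00400000 and 0x00F20000 (the tables only show the address of main()), a stored pointer 0x00401050 becomes 0x00F21050 after rebasing.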

Conclusion

ASLR mitigates the exploitation of software bugs by randomizing the offsets of all the program segments, which makes it hard for an attacker to create an exploit that works reliably on every system. It is, however, not a magic remedy: if software can be exploited, there is still a chance an attacker can gain control of a system; it is just harder to do. 64-bit systems benefit more from ASLR than 32-bit systems: since the 64-bit address space is larger, there is more entropy in the randomness of the addresses.

To bypass ASLR, an attacker may try to obtain pointers to certain locations in the program, which may be sitting on the stack, for instance. This is why ‘pointer leakage’ bugs are a big deal and should be fixed.
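
Why a single leaked pointer is so damaging comes down to simple arithmetic: a module is moved as a whole, so the distance between any two things inside it never changes. A sketch with two hypothetical helpers, module_base_from_leak() and address_of():

```c
#include <assert.h>
#include <stdint.h>

/* Given one leaked absolute address and the known offset of that
 * symbol from the start of its module, recover the module's base. */
static uint64_t module_base_from_leak(uint64_t leaked, uint64_t symbol_offset)
{
    assert(leaked >= symbol_offset);
    return leaked - symbol_offset;
}

/* With the base known, any other symbol in the same module can be
 * located from its offset alone. */
static uint64_t address_of(uint64_t module_base, uint64_t symbol_offset)
{
    return module_base + symbol_offset;
}
```

Take the 32-bit run where main() showed up at 0x00F21050. Assuming main() sits 0x1050 bytes into the image (which is what the non-ASLR address 0x00401050 suggests for a base of 0x00400000), the leak gives a base of 0x00F20000, and from there every other function in the executable is known again.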

In order to use the ‘return-to-libc’ method, an attacker must figure out where the library functions they need are located. Since library locations are randomized at boot time and are the same across the system, there is a chance that pointer leakage from one application running on a system can be used to exploit another application running on that system.

Thomas

Address Space Layout Randomisation on Windows