What is ASLR?

ASLR stands for Address Space Layout Randomisation. It was introduced in Windows with Vista, released to customers in early 2007. Before that, Linux had already been enabling ASLR by default since kernel 2.6.12, released in June 2005. ASLR tries to make life more difficult for the Bad Guys™ by randomizing how an executable is laid out in memory.

When I first wanted to figure out how ASLR works internally, I came across a lot of articles, but not one that told the entire story. After I completed my research and fully understood how ASLR works in Windows, I decided to write an article about it.

What is being moved around?

ASLR moves the following memory segments:

  • The stack
  • The heap
  • Code segments

How does this make life difficult for Bad Guys™?

When someone finds a bug in a program (for instance a buffer overflow), it can potentially be abused to take control of the program’s flow of execution by overwriting key values, such as the return addresses on the stack. Before DEP (or NX) came along, the common approach was to inject shellcode onto the stack and then overwrite the return address on the stack with the address of the start of the shellcode. This is no longer possible because the stack is now a so-called ‘non-executable’ memory segment: if the CPU starts executing code on the stack (EIP is pointing somewhere on the stack), it will throw an exception and the program will crash without the harmful code being executed.
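To make the overwrite concrete, here is a minimal sketch that simulates a stack frame as a plain struct instead of smashing the real stack. The struct layout, the names fake_frame, unchecked_copy and build_payload, and any addresses you feed it are made up for illustration; a real frame layout depends on the compiler and the calling convention.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of a stack frame: a local buffer
   followed by the saved return address. Real layouts differ. */
struct fake_frame {
    char      buf[16];     /* local variable the program copies input into */
    uintptr_t return_addr; /* where execution continues after the function */
};

/* Copies input into the frame without checking it against sizeof(buf),
   which is the bug. The copy is clamped to the size of the whole struct
   so that this demo itself never writes outside valid memory. */
static void unchecked_copy(struct fake_frame *f, const void *input, size_t len)
{
    if (len > sizeof *f)
        len = sizeof *f;
    memcpy((unsigned char *)f, input, len);
}

/* Builds an input that fills the buffer (plus any padding) with 'A' and
   then places an attacker-chosen address exactly where return_addr lives.
   out must have room for sizeof(struct fake_frame) bytes.
   Returns the payload length. */
static size_t build_payload(unsigned char *out, uintptr_t target)
{
    size_t off = offsetof(struct fake_frame, return_addr);

    memset(out, 'A', off);
    memcpy(out + off, &target, sizeof target);
    return off + sizeof target;
}
```

After unchecked_copy() has run with such a payload, return_addr holds whatever the attacker chose. That only pays off if the attacker knows which address to write, which is exactly the knowledge ASLR takes away.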

DEP can be worked around: instead of loading your own code onto the stack, you execute code that already exists in an executable section of the process, such as the program itself or a library (a trick called return-to-libc). For instance: writing “shutdown /s” somewhere into memory, then placing a pointer to that string on the stack as an argument and changing the return address to the address of system() will make the target computer shut down. All this requires knowledge of the memory layout of the program, which is what ASLR was designed to obfuscate.
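As a rough sketch of what ends up on the stack in that example, assuming classic 32-bit x86 with the cdecl calling convention: the overwritten return address is followed by the address system() itself will return to, and then by its argument. The function name build_ret2libc_words and all addresses passed to it are hypothetical.

```c
#include <stdint.h>

/* Fills the three 32-bit words that replace the saved return address and
   the two stack slots directly above it (x86 cdecl assumed). */
static void build_ret2libc_words(uint32_t words[3],
                                 uint32_t system_addr,
                                 uint32_t after_system,
                                 uint32_t command_ptr)
{
    words[0] = system_addr;  /* overwritten return address: "return" into system() */
    words[1] = after_system; /* where system() returns to when it is done */
    words[2] = command_ptr;  /* first argument of system(): pointer to the command */
}
```

Every one of the three values is an absolute address, so the attacker has to know where system() and the command string live in the target process.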

How does that impact my code?

It shouldn’t. If ASLR breaks your code, you should definitely fix it. You probably made assumptions about where things are located in the address space of your application, which is bad practice and not very portable.

ASLR in action

We can easily demonstrate the effect of ASLR with a simple piece of C code:

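#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>   /* MessageBoxA(); link with user32.lib */

int main(void) {
	int stack;                   /* lives on the stack */
	void *heap = malloc(1024);   /* lives on the heap */

	/* MSVC's %p prints bare hex digits, hence the explicit 0x prefix */
	printf("Stack: 0x%p - Heap: 0x%p - Code: 0x%p - "
	       "strlen(): 0x%p - MessageBoxA(): 0x%p\n",
	       (void *)&stack, heap, (void *)&main,
	       (void *)&strlen, (void *)&MessageBoxA);

	free(heap);
	return 0;
}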

Life before ASLR

It is important to note that the Visual C++ linker enables ASLR in the executable by default. So first, we’ll compile the program with /DYNAMICBASE:NO to disable ASLR. Then, we run the program a few times and line up the results:

Stack       Heap        Code        strlen()    MessageBoxA()
0x0019FF34  0x006BF6B8  0x00401050  0x752207A0  0x7784D740
0x0019FF34  0x005EFEE8  0x00401050  0x752207A0  0x7784D740
0x0019FF34  0x005608E8  0x00401050  0x752207A0  0x7784D740
After reboot
0x0019FF34  0x0057E0F0  0x00401050  0x75C2EFE0  0x75D58830
0x0019FF34  0x0068DDB0  0x00401050  0x75C2EFE0  0x75D58830
0x0019FF34  0x004EEBB8  0x00401050  0x75C2EFE0  0x75D58830

The heap moves around, but the stack and the code don’t. The heap can be moved because malloc() returns a pointer; code can never safely assume that memory allocated on the heap ends up in the same place. The library functions are the other exception: their addresses stay the same between runs, but they are randomized at boot time.

Enabling ASLR

To enable ASLR, we simply recompile the code with the /DYNAMICBASE switch, then run it a few times again:

Stack       Heap        Code        strlen()    MessageBoxA()
0x008FFAD0  0x00C4F720  0x00F21050  0x750F07A0  0x76AAD740
0x0053FCF4  0x005AF6B8  0x00F21050  0x750F07A0  0x76AAD740
0x005DF7F0  0x0014F720  0x00F21050  0x750F07A0  0x76AAD740
After reboot
0x003EFA28  0x0011EBF8  0x00A11050  0x76E007A0  0x774AD740
0x007CFDE0  0x00CD0918  0x002B1050  0x76E007A0  0x774AD740
0x00D3FBFC  0x00E5F8B0  0x002B1050  0x76E007A0  0x774AD740

Note that the stack and heap move on every run, but the code usually stays where it is between runs. This is likely because Windows caches the layout after the executable has been mapped into memory once, and keeps it until the executable is gone from the cache or the system is rebooted. The reason for caching is that mapping and rebasing an executable can take some time (see below).
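One detail worth pulling out of the table: the code and library addresses only change in their upper bits. Windows places images on 64 KB boundaries, so the low 16 bits of such an address are an offset that survives randomization. A small sketch to check this against the values above (the helper names low16 and block_base are made up for this example):

```c
#include <stdint.h>

/* Offset of an address within its 64 KB-aligned block (the low 16 bits). */
static uint64_t low16(uint64_t addr)
{
    return addr & 0xFFFFu;
}

/* Start of the 64 KB-aligned block that contains the address. */
static uint64_t block_base(uint64_t addr)
{
    return addr & ~(uint64_t)0xFFFF;
}
```

For the three code addresses 0x00F21050, 0x00A11050 and 0x002B1050, low16() gives 0x1050 every time, and block_base(0x00F21050) is 0x00F20000. The same holds for strlen() (0x07A0) and MessageBoxA() (0xD740) before and after the reboot.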

What about 64-bit?

With /DYNAMICBASE:NO:

Stack               Heap                Code                strlen()            MessageBoxA()
0x000000000014FF20  0x00000000004C3C90  0x0000000140001070  0x00007FF8A8263C70  0x00007FF8AB5DFBE0
0x000000000014FF20  0x0000000000583C90  0x0000000140001070  0x00007FF8A8263C70  0x00007FF8AB5DFBE0
0x000000000014FF20  0x0000000000465570  0x0000000140001070  0x00007FF8A8263C70  0x00007FF8AB5DFBE0
After reboot
0x000000000014FF20  0x0000000000424E80  0x0000000140001070  0x00007FFFFF55F090  0x00007FF802798020
0x000000000014FF20  0x00000000005D0D50  0x0000000140001070  0x00007FFFFF55F090  0x00007FF802798020
0x000000000014FF20  0x000000000047F680  0x0000000140001070  0x00007FFFFF55F090  0x00007FF802798020

With /DYNAMICBASE:

Stack               Heap                Code                strlen()            MessageBoxA()
0x000000E58A52F770  0x0000021043823900  0x00007FF71AA21070  0x00007FF8A8263C70  0x00007FF8AB5DFBE0
0x000000449F51FA30  0x0000015B1D7A3C90  0x00007FF71AA21070  0x00007FF8A8263C70  0x00007FF8AB5DFBE0
0x000000312C74FB90  0x000001E25C7E3C90  0x00007FF71AA21070  0x00007FF8A8263C70  0x00007FF8AB5DFBE0
After reboot
0x000000B3508FF9C0  0x00000217580C0D80  0x00007FF656D91070  0x00007FFFFF55F090  0x00007FF802798020
0x0000009F4551FD40  0x0000023911081130  0x00007FF656D91070  0x00007FFFFF55F090  0x00007FF802798020
0x0000008EF03DFD50  0x0000019014F5F6D0  0x00007FF656D91070  0x00007FFFFF55F090  0x00007FF802798020

As you can see, 64-bit or 32-bit doesn’t influence which segments are randomized. It does however influence the entropy: since the 64-bit address space is much wider, it would be harder for an attacker to guess addresses.
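To get a feeling for what more entropy buys, here is a small sketch that turns a number of random bits into a number of possible positions and the chance of a single blind guess. The function names are made up, and the bit counts in the note below (8 and 17) are illustrative values only, not measurements from the tables above.

```c
#include <stdint.h>

/* Number of equally likely positions for a region that is randomized
   with the given number of bits of entropy (bits must be below 64). */
static uint64_t possible_positions(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Chance that one blind guess hits the right position, assuming the
   position is picked uniformly at random. */
static double single_guess_chance(unsigned bits)
{
    return 1.0 / (double)possible_positions(bits);
}
```

With 8 bits of entropy an attacker has a 1 in 256 chance per guess; with 17 bits that drops to 1 in 131072. Every extra bit halves the odds.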

Summarizing

Enabling ASLR for an application basically randomizes two segments: the code segment of the application and the stack. For convenience, here’s a table showing when the different base addresses are randomized.

/DYNAMICBASE:?  Code     Stack       Heap        DLLs
YES             OS boot  App. start  App. start  OS boot
NO              Never    Never       App. start  OS boot

How does this work?

When Windows loads an executable, the dynamic linker gets to work. A Windows executable (a PE file) contains a PE header which tells the dynamic linker how the executable should be mapped into virtual memory and which segments should be readable, writable and executable (which is what DEP relies on). If ASLR is not enabled, the executable will be placed at a fixed address (also specified in the PE header).

The executable also contains an import address table (IAT). This table tells the OS which functions the program will be calling from external DLLs. The dynamic linker loads the required DLLs into memory and places the addresses of these functions into the IAT in memory. When the application wants to call one of these functions, it loads the address from the IAT and calls it. Please note that the OS always maps the entire DLL into memory, but only fills in the IAT entries for the functions that the application actually uses.
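The lookup-then-call mechanism can be sketched in a few lines of C. This is a toy model, not the real PE data structures: the types export_entry and iat_slot and the functions bind_imports and call_strlen_via_iat are invented for this example, and the "DLL" is just an array of names and function pointers.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type used to store any function's address. */
typedef void (*generic_fn)(void);

/* What a (hypothetical) DLL offers: a name and where the function lives. */
struct export_entry {
    const char *name;
    generic_fn  addr;
};

/* One IAT slot: the name the program asked for and the address the
   loader filled in. addr stays NULL until the import is resolved. */
struct iat_slot {
    const char *name;
    generic_fn  addr;
};

/* Mimics the loader: for every import, look up the name in the DLL's
   exports and store the address in the slot. Returns the number of
   imports that could not be resolved. */
static size_t bind_imports(struct iat_slot *iat, size_t n_imports,
                           const struct export_entry *exports, size_t n_exports)
{
    size_t missing = 0;

    for (size_t i = 0; i < n_imports; i++) {
        iat[i].addr = NULL;
        for (size_t j = 0; j < n_exports; j++) {
            if (strcmp(iat[i].name, exports[j].name) == 0) {
                iat[i].addr = exports[j].addr;
                break;
            }
        }
        if (iat[i].addr == NULL)
            missing++;
    }
    return missing;
}

/* Mimics a call site: fetch the address from the IAT slot, then call it. */
static size_t call_strlen_via_iat(const struct iat_slot *slot, const char *s)
{
    size_t (*fn)(const char *) = (size_t (*)(const char *))slot->addr;
    return fn(s);
}
```

The point of the indirection is that the program's code never contains the address of strlen() itself, only the location of the slot. The DLL can therefore be loaded anywhere without touching the calling code.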

Because Windows executables often contain absolute addresses (as opposed to ‘position independent code’), the loader has to patch these addresses when the executable ends up somewhere other than its preferred base address (due to ASLR, or because the preferred address was not available). It finds them through the relocation table stored in the executable and adjusts each one by the difference between the new and the preferred base. This process is called rebasing.
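The arithmetic behind rebasing is simple. Below is a simplified sketch, assuming 32-bit absolute addresses and assuming the loader already has a list of offsets at which such addresses are stored (in a real PE file that list comes from the relocation table). The function name rebase_image and its parameters are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Applies a simplified rebase to an in-memory image.
   image          : bytes of the loaded image
   fixups         : offsets into image where a 32-bit absolute address is stored
   preferred_base : base address the linker assumed
   actual_base    : base address the image really got */
static void rebase_image(unsigned char *image,
                         const size_t *fixups, size_t n_fixups,
                         uint32_t preferred_base, uint32_t actual_base)
{
    /* Unsigned arithmetic wraps, so this also works when moving downwards. */
    uint32_t delta = actual_base - preferred_base;

    for (size_t i = 0; i < n_fixups; i++) {
        uint32_t value;

        memcpy(&value, image + fixups[i], sizeof value); /* unaligned-safe read */
        value += delta;
        memcpy(image + fixups[i], &value, sizeof value);
    }
}
```

Taking the numbers from the 32-bit tables: an address of 0x00401050 stored in an image that was linked for base 0x00400000 becomes 0x00F21050 when the image is loaded at 0x00F20000.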

Conclusion

ASLR mitigates the exploitation of software bugs by randomizing the offsets of the program’s segments, which makes it hard for an attacker to create an exploit that works reliably on every system. It is, however, not a magic remedy: if software can be exploited, there is still a chance an attacker can gain control of a system; it is just harder to do. 64-bit systems benefit more from ASLR than 32-bit systems: since the 64-bit address space is larger, there is more entropy in the randomness of the addresses.

To bypass ASLR, an attacker may try to obtain pointers to certain locations in the program, which may for instance be found on the stack. This is why ‘pointer leakage’ bugs are a big deal and should be fixed.

In order to use the ‘return-to-libc’ method, an attacker must figure out where the libc functions they need are located. Since library locations are randomized at boot time and are the same across the system, there is a chance that a pointer leaked from one application running on a system can be used to exploit another application running on that system.
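Why a single leaked pointer is enough can be shown with a little arithmetic: a library is moved as a whole, so the distance between two functions inside it never changes. A minimal sketch, where address_from_leak is a made-up helper and the offsets in the note below are hypothetical values (an attacker would read the real ones from their own copy of the DLL):

```c
#include <stdint.h>

/* Given one leaked address inside a library and the offsets of the leaked
   function and the wanted function within that library, compute where the
   wanted function lives in the target process. */
static uint64_t address_from_leak(uint64_t leaked_addr,
                                  uint64_t leaked_offset,
                                  uint64_t wanted_offset)
{
    uint64_t library_base = leaked_addr - leaked_offset;

    return library_base + wanted_offset;
}
```

Suppose the leaked pointer is the strlen() address 0x750F07A0 from the table, and suppose (hypothetically) that strlen() sits at offset 0x607A0 in its DLL and the wanted function at offset 0x3B2C0. The library base is then 0x75090000 and the wanted function is at 0x750CB2C0, no guessing required.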

References

  • https://www.symantec.com/connect/articles/dynamic-linking-linux-and-windows-part-two
  • http://blog.morphisec.com/aslr-what-it-is-and-what-it-isnt/
  • https://en.wikipedia.org/wiki/Return-oriented_programming
  • https://en.wikipedia.org/wiki/Portable_Executable