Sunday, December 13, 2009

Using heap as stack and finding stack usage

Recently I learned how a memory allocated from heap can be used as stack and how we can find the stack usage of particular function invocation. The following is a sample code utilizing this technique and also it detects the stack usage of function call.
#include <stdlib.h>
#include <string.h>

static int depth = 0;
static void *prevESP;

void func()
{
    if (depth++ < 10)  // change the 10 to increase/decrease stack usage
        func();
}

int main(int argc, _TCHAR* argv[])
{
    const int STACK_SIZE = 1024 * 1024 * 4; // we allocate 4MB for our heap stack
    unsigned char *heapStack = (unsigned char*)malloc(STACK_SIZE);
    unsigned char *heapStackBottom = heapStack + STACK_SIZE; // move to bottom of the heap stack

    // fill the stack with 0xaa byte so that we can detect the stack
    // usage by scanning non 0xaa byte
    memset(heapStack, 0xaa, STACK_SIZE);

    __asm
    {
        mov prevESP, ESP          // take backup of current stack pointer
        mov ESP, heapStackBottom  // store our heap stack bottom as current stack pointer  
    }

    func();

    __asm
    {
        mov ESP, prevESP  // restore the original stack pointer
    }

    // lets scan for a byte which is not 0xaa. this reveals the
    // last dirty byte from which we can calculate the stack usage
    heapStackBottom = heapStack;
    while (*heapStackBottom++ == 0xaa);

    // the stack started from bottom, subtract it from top and the total size to find the stack usage
    printf ("Heap stack usage %d\n", STACK_SIZE - (heapStackBottom - heapStack));

    free(heapStack);

    return 0;
}
So in the above code we allocate 4MB heap space to be used as stack(lets call this as "heap stack"). Now we need to store the bottom(x86 stack grows downwards, e.g. 2000 -> 1000) of heap stack into ESP register. Before modifying the ESP we need to take the current ESP value which we will restore after function invocation. These operations are done in inlined assembly code. To the stack usage of a function invocation, we fill the allocated heap area with some known pattern(here 0xaa -> 0b10101010) and call the function. When the function returns we start scanning for a byte which is not 0xaa. This will be the last dirty byte from heap stack bottom. We subtract this offset from heap stack top which results the no. of bytes not used. Again subtracting it with total heap stack size we will end up with actual stack usage. Though the above code is writtend for VC++ compiler it can be ported to Linux by changing the inlined assembly with its GCC equivalent.
I thought about the above technique when I was reading setcontext/getcontext/swapcontext POSIX functions and how it could be implemented. As per C99 standard setjmp/longjmp is not guaranteed to work when we call longjmp of a function which is already completed its execution(i.e. returned). This is because the stack is not guaranteed to be same when setjmp was called. The POSIX solves this by using user allocated memory for stack. At this pointed I was wondering how could it be implemented and came up with the above code.

2 comments:

Joe Steeve said...

That could be dangerous. Compilers usually generate code to reference local variables as offsets of the EBP or the ESP.

So, the reference to prevESP is bound to be invalid when restoring the original ESP.

Siva Chandran P said...

Yes, you are correct. Access to prevESP will become invalid once we change the ESP as prevESP is a local variable.

Now I've made it as global and assuming it will not get affected by changing the ESP.

Thanks for pointing it out.