Crash Dump Analysis Patterns (Part 2)

Crash dump 불펌스페샬 2007. 5. 13. 23:55 posted by CecilDeSK
반응형
Crash Dump Analysis Patterns (Part 2)

Another pattern I would like to discuss isDynamic MemoryCorruption (and its user and kernel variants called Heap Corruption and Pool Corruption). You might have already guessed it :-) It is so ubiquitous. And its manifestations are random and usually crashes happen far away from the original corruption point. In your user mode and space part of exception threads (don’t forget about Multiple Exceptions pattern) you would see something like this:

ntdll!RtlpCoalesceFreeBlocks+0x10c
ntdll!RtlFreeHeap+0x142
MSVCRT!free+0xda
componentA!xxx

or this

ntdll!RtlpCoalesceFreeBlocks+0x10c
ntdll!RtlpExtendHeap+0x1c1
ntdll!RtlAllocateHeap+0x3b6
componentA!xxx

or any similar variants and you need to know exact component that corrupted application heap (which usually is not the same as componentA.dll you see in crashed thread stack).

For this common recurrent problem we have a general solution: enable heap checking. This general solution has many variants applied in a specific context:

  • parameter value checking for heap functions
  • user space software heap checks before or after certain checkpoints (like “malloc”/”new” and/or “free”/”delete” calls): usually implemented by checking various fill patterns, etc.
  • hardware/OS supported heap checks (like using guardand nonaccessible pages to trap buffer overruns)

The latter variant is the mostly used according to my experience and mainly due to the fact that most heap corruptions originate from buffer overflows.And it is easier to rely on instant MMU support than on checking fill patterns. Here is the article from Citrix support web site describing how you can enable full page heap. It uses specific process as an example: Citrix Independent Management Architecture (IMA) service but you can substitute any application name you are interested in debugging:

How to enable full page heap

and another article:

How to check in a user dump that full page heap was enabled

The following Microsoft article discusses various heap related checks:

How to use Pageheap.exe in Windows XP and Windows 2000

The Windows kernel analog to user mode and space heap corruption is called page and nonpaged pool corruption. If we consider Windows kernel pools as variants of heap then exactly the same techniques are applicable there, for example, the so called special pool enabled by Driver Verifier is implemented by nonaccessible pages. Refer to the following Microsoft article for further details:

How to use the special pool feature to isolate pool damage

- Dmitry Vostokov -

반응형