Crash Dump Analysis Patterns (Part 1)

Crash dump 불펌스페샬 2007. 5. 13. 23:54 posted by CecilDeSK

After doing crash dump analysis exclusively for more than 3 years, I decided to organize my knowledge into a set of patterns (a dump analysis pattern language, so to speak) and thereby try to facilitate a common vocabulary.

What is a pattern? It is a general solution you can apply in a specific context to a common recurrent problem.

There are many patterns and pattern languages in software engineering. For example, the following almanac lists more than 700 patterns:

The Pattern Almanac 2000

and the following link is very useful:

Patterns Library

The first pattern I’m going to introduce today is Multiple Exceptions. This pattern captures the known fact that there could be as many exceptions (“crashes”) as there are threads in a process. The following UML diagram depicts the relationship between Process, Thread and Exception entities:

[Figure: UML diagram of Process, Thread and Exception entities (da_pattern_1_corrected.JPG)]

Every process in Windows has at least one execution thread, so there could be at least one exception per thread (like an invalid memory reference) if things go wrong. There could be a second exception in that thread if the exception handling code experiences another exception, or if the first exception was handled and another one follows, and so on.

So what is the general solution to that common problem when an application or service crashes and you have a crash dump file (common recurrent problem) from a customer (specific context)? The general solution is to look at all threads and their stacks and not to rely on what tools say.

Here is a concrete example from one of the dumps I got today:

Internet Explorer crashed, and I opened the dump in WinDbg and ran the ‘!analyze -v’ command. This is what I got in my WinDbg output:

ExceptionAddress: 7c822583 (ntdll!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 3
Parameter[0]: 00000000
Parameter[1]: 8fb834b8
Parameter[2]: 00000003

A break instruction, you might think, shows that the dump was taken manually from the running application and there was no crash: the customer sent the wrong dump or misunderstood instructions. However, I looked at all threads and noticed the following two stacks (threads 15 and 16):

0:016> ~*kL
...

15 Id: 1734.8f4 Suspend: 1 Teb: 7ffab000 Unfrozen
ntdll!KiFastSystemCallRet
ntdll!NtRaiseHardError+0xc
kernel32!UnhandledExceptionFilter+0x54b
kernel32!BaseThreadStart+0x4a
kernel32!_except_handler3+0x61
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xe
componentA!xxx
componentB!xxx
mshtml!xxx
kernel32!BaseThreadStart+0x34

# 16 Id: 1734.11a4 Suspend: 1 Teb: 7ffaa000 Unfrozen
ntdll!DbgBreakPoint
ntdll!DbgUiRemoteBreakin+0x36

So we see here that the real crash happened in componentA.dll, and componentB.dll or mshtml.dll might have influenced it. Why did this happen? The customer might have dumped Internet Explorer manually while it was displaying an exception message box. The following reference says that ZwRaiseHardError (the NtRaiseHardError seen on the stack of thread 15; in user mode both names refer to the same ntdll routine) displays a message box containing an error message:

Windows NT/2000 Native API Reference

Or perhaps something else happened. Many cases where we see multiple thread exceptions in one process dump happen because crashed threads displayed message boxes, like the Visual C++ debug message box, which prevented the process from terminating. In the dump under discussion, the WinDbg automatic analysis command recognized only the last breakpoint exception (shown as # 16). In conclusion, we shouldn’t rely on “automatic analysis” too often anyway, and we should probably write our own extension to list possible multiple exceptions (based on some heuristics I will talk about later).
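To make the idea of listing candidate exceptions a bit more concrete, here is a minimal sketch in Python that works on saved ‘~*kL’ text output rather than as a real WinDbg extension. It flags every thread whose stack contains frames typical of exception dispatch. The frame list (SUSPICIOUS_FRAMES), the function name find_exception_threads and the header regular expression are assumptions made for this illustration, taken only from the two stacks shown above; they are not the heuristics promised for later parts.

```python
import re

# Assumption: frames that hint a thread is inside exception dispatch or
# unhandled-exception processing. Illustrative only, not an exhaustive list.
SUSPICIOUS_FRAMES = (
    "ntdll!KiUserExceptionDispatcher",
    "kernel32!UnhandledExceptionFilter",
    "ntdll!NtRaiseHardError",
)

# Thread header line as it appears in the output above, for example
# "15 Id: 1734.8f4 Suspend: 1 ..." optionally prefixed by "." or "#".
THREAD_HEADER = re.compile(
    r"^\s*[.#]?\s*(\d+)\s+Id:\s+[0-9a-fA-F]+\.[0-9a-fA-F]+"
)


def find_exception_threads(stack_text):
    """Return {thread_number: [matching frames]} for flagged threads."""
    hits = {}
    current = None  # thread number whose frames we are currently reading
    for line in stack_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            continue
        if current is None:
            continue  # text before the first thread header
        frame = line.strip()
        for marker in SUSPICIOUS_FRAMES:
            if marker in frame:
                hits.setdefault(current, []).append(frame)
                break
    return hits


# The two stacks from the dump discussed above.
SAMPLE = """\
  15 Id: 1734.8f4 Suspend: 1 Teb: 7ffab000 Unfrozen
ntdll!KiFastSystemCallRet
ntdll!NtRaiseHardError+0xc
kernel32!UnhandledExceptionFilter+0x54b
kernel32!BaseThreadStart+0x4a
kernel32!_except_handler3+0x61
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xe
componentA!xxx
componentB!xxx
mshtml!xxx
kernel32!BaseThreadStart+0x34

# 16 Id: 1734.11a4 Suspend: 1 Teb: 7ffaa000 Unfrozen
ntdll!DbgBreakPoint
ntdll!DbgUiRemoteBreakin+0x36
"""

if __name__ == "__main__":
    for thread, frames in find_exception_threads(SAMPLE).items():
        print(f"thread {thread}: {', '.join(frames)}")
```

On the sample, only thread 15 is flagged, while thread 16 (the breakpoint that the automatic analysis reported) is not. A substring match is used instead of an exact match so that lines with address columns or offsets in front of or after the symbol would still be caught.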

- Dmitry Vostokov -
