Crash Dump Analysis AntiPatterns (Part 2)

Let’s define Zippocricy, a common sin in software support environments worldwide: someone gets something from a customer in archived form and, without checking the contents, forwards it to another person in the support chain. By the time the evidence is unzipped somewhere, checked, and found corrupt or irrelevant, the customer has suffered not hours but days.

This happens not only with crash dumps but with any type of problem evidence.
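
One way to avoid forwarding unchecked evidence is to test the archive before passing it on. The following is a minimal sketch in Python, assuming the evidence arrives as a ZIP file; the function name `check_archive` is hypothetical, and other archive formats would need a different library.

```python
import zipfile


def check_archive(path):
    """Return (ok, details) for a ZIP archive at `path`.

    On success, details is the list of member names; otherwise it is
    a short string describing what is wrong.
    """
    if not zipfile.is_zipfile(path):
        return False, "not a ZIP archive"
    try:
        with zipfile.ZipFile(path) as archive:
            names = archive.namelist()
            if not names:
                return False, "archive is empty"
            # testzip() reads every member and verifies its CRC;
            # it returns the first bad member name, or None.
            bad = archive.testzip()
            if bad is not None:
                return False, f"corrupt member: {bad}"
            return True, names
    except zipfile.BadZipFile as exc:
        return False, f"unreadable archive: {exc}"
```

This only confirms that the archive opens and lists what it contains. Whether the contents are relevant to the problem still has to be judged by a person before the file moves further along the support chain.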

- Dmitry Vostokov -

Crash Dump Analysis AntiPatterns (Part 3)

I have heard engineers saying, “I didn’t know about this debugging command, let’s use it!” after a training session or after reading other people’s analysis of crash dumps. A year later I hear the same phrase from them about another debugging command. In the meantime they continue to use the same set of commands they know about until they hear about the next command that is old but new to them.

This is a manifestation of the Word of Mouth anti-pattern.

General solution: Know your tools. Study them proactively. RTFM.

Example solution: periodically read and re-read WinDbg help.

More refined solution: debugger.chm on Windows Mobile PC.

- Dmitry Vostokov -

Crash Dump Analysis AntiPatterns (Part 1)

In any domain of activity where patterns exist we can find anti-patterns too. They are bad solutions for recurrent problems in specific contexts. One of them I would like to introduce briefly is Alien Component. In essence, when every technique fails or you run out of WinDbg commands, look at some innocent component you have never seen before or don’t have symbols for, be it some driver or hook. Of course, this component cannot be the component developed by the company you are working for. :-)

- Dmitry Vostokov -

Crash Dump Analysis Patterns (Part 13a)

The Insufficient Memory pattern can be seen in many complete and kernel memory dumps. This condition can cause the system to crash, become slow, hang, or refuse to provide the expected functionality, for example, refuse new terminal server connections. There are many types of memory resources and we can classify them initially into the following categories:

  • Committed memory
  • Virtual memory
    • Kernel space
      • Paged pool
      • Non-paged pool
      • Session pool
      • PTE limits
      • Desktop heap
      • GDI limits
    • User space
      • Virtual regions
      • Process heap

We will talk about all of them in separate parts. What I outline in this part is committed memory exhaustion. Committed memory is allocated memory backed by physical memory or by reserved space in the page file. The space has to be reserved in case the OS wants to swap that memory’s data out to disk when it is not currently used and there is no physical memory available for other processes. If that data is needed again, the OS brings it back into physical memory. If there is no space in the page file then physical memory is filled up. If committed memory is exhausted, the system will most likely hang or bugcheck soon, so memory statistics should always be checked when you get a kernel or a complete memory dump. Even access violation bugchecks can result from insufficient memory when a memory allocation operation fails but a kernel mode component doesn’t check the return value for NULL. Here is an example:

BugCheck 8E, {c0000005, 809203af, aa647c0c, 0}
0: kd> !analyze -v
...
...
...
TRAP_FRAME: aa647c0c -- (.trap ffffffffaa647c0c)
...
...
...
0: kd> .trap ffffffffaa647c0c
ErrCode = 00000000
eax=00000000 ebx=bc1f3cfc ecx=89589250 edx=000018c1 esi=bc1f3ce0 edi=aa647d14
eip=809203af esp=aa647c80 ebp=aa647c80 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!SeTokenType+0x8:
809203af 8b8080000000 mov eax,dword ptr [eax+80h] ds:0023:00000080=????????
0: kd> k
ChildEBP RetAddr
aa647c80 bf8173c5 nt!SeTokenType+0x8
aa647cdc bf81713b win32k!GreGetSpoolMessage+0xb0
aa647d4c 80834d3f win32k!NtGdiGetSpoolMessage+0x96
aa647d4c 7c82ed54 nt!KiFastCallEntry+0xfc
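
The trap frame above already shows the NULL pointer: eax is zero and the faulting instruction reads from [eax+80h]. As a small sketch of that arithmetic, the Python snippet below pulls eax out of the register line and computes the address the instruction tried to read. The helper name `effective_address` is hypothetical; the register text is copied from the output above.

```python
import re

# First register line from the .trap output above.
REGISTERS = ("eax=00000000 ebx=bc1f3cfc ecx=89589250 "
             "edx=000018c1 esi=bc1f3ce0 edi=aa647d14")


def effective_address(register_line, base, displacement):
    """Compute base register + displacement, wrapped to 32 bits."""
    values = dict(re.findall(r"(\w+)=([0-9a-fA-F]+)", register_line))
    return (int(values[base], 16) + displacement) & 0xFFFFFFFF


# mov eax,dword ptr [eax+80h]
address = effective_address(REGISTERS, "eax", 0x80)
print(f"{address:08x}")  # prints 00000080
```

The result matches the 00000080 shown after ds:0023: in the disassembly line. An address this close to zero typically means a structure field was read through a NULL pointer.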

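If we enter the !vm command to display memory statistics, we see that committed memory is practically exhausted: Committed pages is almost equal to Commit limit, and both pool allocations and commit requests have failed: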

0: kd> !vm
*** Virtual Memory Usage ***
Physical Memory: 999294 ( 3997176 Kb)
Page File: \??\C:\pagefile.sys
Current: 4193280 Kb Free Space: 533744 Kb
Minimum: 4193280 Kb Maximum: 4193280 Kb
Available Pages: 18698 ( 74792 Kb)
ResAvail Pages: 865019 ( 3460076 Kb)
Locked IO Pages: 290 ( 1160 Kb)
Free System PTEs: 155265 ( 621060 Kb)
Free NP PTEs: 32766 ( 131064 Kb)
Free Special NP: 0 ( 0 Kb)
Modified Pages: 113 ( 452 Kb)
Modified PF Pages: 61 ( 244 Kb)
NonPagedPool Usage: 12380 ( 49520 Kb)
NonPagedPool Max: 64799 ( 259196 Kb)
PagedPool 0 Usage: 40291 ( 161164 Kb)
PagedPool 1 Usage: 2463 ( 9852 Kb)
PagedPool 2 Usage: 2455 ( 9820 Kb)
PagedPool 3 Usage: 2453 ( 9812 Kb)
PagedPool 4 Usage: 2488 ( 9952 Kb)
PagedPool Usage: 50150 ( 200600 Kb)
PagedPool Maximum: 67584 ( 270336 Kb)
********** 18 pool allocations have failed **********
Shared Commit: 87304 ( 349216 Kb)
Special Pool: 0 ( 0 Kb)
Shared Process: 56241 ( 224964 Kb)
PagedPool Commit: 50198 ( 200792 Kb)
Driver Commit: 1892 ( 7568 Kb)
Committed pages: 2006945 ( 8027780 Kb)
Commit limit: 2008205 ( 8032820 Kb)
********** 1216024 commit requests have failed **********

Total Private: 1715957 ( 6863828 Kb)

There might have been a memory leak, or too many terminal sessions with fat applications to fit in physical memory and the page file. Actually, in this particular case there were both.
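
The commit check on the !vm output can be sketched as a short script. This is a minimal Python example, assuming the !vm output has been saved as text in the format shown in this post; the function name `commit_summary` and the regular expressions are illustrative and may need adjusting for other debugger versions.

```python
import re

# The commit-related lines copied from the !vm output in this post.
VM_OUTPUT = """\
Committed pages: 2006945 ( 8027780 Kb)
Commit limit: 2008205 ( 8032820 Kb)
********** 1216024 commit requests have failed **********
"""

PAGE_KB = 4  # 4 Kb pages, consistent with the page/Kb pairs in the output


def commit_summary(text):
    """Extract commit usage figures from saved !vm output."""
    committed = int(re.search(r"Committed pages:\s*(\d+)", text).group(1))
    limit = int(re.search(r"Commit limit:\s*(\d+)", text).group(1))
    failed = re.search(r"(\d+) commit requests have failed", text)
    return {
        "committed_pages": committed,
        "limit_pages": limit,
        "free_kb": (limit - committed) * PAGE_KB,
        "used_percent": 100.0 * committed / limit,
        "failed_requests": int(failed.group(1)) if failed else 0,
    }


summary = commit_summary(VM_OUTPUT)
print(summary["free_kb"])                 # 5040
print(round(summary["used_percent"], 2))  # 99.94
print(summary["failed_requests"])         # 1216024
```

For this dump only 5040 Kb of commit charge remain below the limit, and more than a million commit requests have already failed.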

- Dmitry Vostokov -
