Wednesday, December 20, 2006

QA - IIS6 Debugging with NTSD, Setup

Question:

Using some very helpful guidance from this forum, I made my first attempt at catching a problem I see periodically in my ISAPI module.

I installed the latest NTSD.EXE and supporting DLLs on the server of interest and loaded them by adding the following registry entry and restarting IIS:

REG ADD "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\w3wp.exe" /v Debugger /d "C:\DEBUG\NTSD.EXE -g -G" /t REG_SZ /f

Note that I wasn't attempting to monitor remotely at this point, as I have RDP access to this server.

NTSD loaded as expected as a process along with W3WP.EXE (as shown in Task Manager). My ISAPI app was exposed to the load overnight and when the morning heavy loads hit, the event log posted the two events shown below. At that point in time, IIS stopped processing requests (although it still seemed to be running). So we removed this server from the WLBS array and began to look for some signs of the debugger's dump.

The problem is that I can't find any results from the debugging process. This might be as simple as not having a client running on that server to monitor the debug process. But I figured NTSD would throw up some sort of message box indicating that a dump was occurring and where. We didn't see anything like that.

I'm likely missing something obvious in this overall process -- can anyone see what I'm doing wrong?

Event Type: Warning
Event Source: W3SVC
Event Category: None
Event ID: 1010
Date: 12/15/2006
Time: 3:01:16 PM
User: N/A
Computer: DAZLOADBAL3
Description:
A process serving application pool 'DefaultAppPool' failed to respond to a ping.
The process id was '3800'.

Event Type: Information
Event Source: W3SVC
Event Category: None
Event ID: 1082
Date: 12/15/2006
Time: 3:01:16 PM
User: N/A
Computer: DAZLOADBAL3
Description:
A worker process with pid '3800' that serves application pool
'DefaultAppPool' has been determined to be unhealthy (see previous event log
message), but because a debugger is attached to it, the World Wide Web
Publishing Service will ignore the error.

Answer:

Ah, this attempt is correct except for one tiny detail - how to manipulate the debugger when it is auto-attached to an NT Service via Image File Execution Options. Unfortunately, the current situation is unrecoverable, so you will have to start over and account for the missing but critical detail.

Debuggers like CDB, NTSD, and WINDBG from the Microsoft Debugging Toolkit are general-purpose debuggers which expect interactive command input to perform tasks like taking a crash dump, disassembling instructions, examining memory, etc. On the other hand, JIT debuggers like OCA and Dr. Watson are specialized debuggers which automatically perform certain pre-programmed tasks when triggered. This is why NTSD never showed a message box or wrote a dump on its own: it was waiting for someone to tell it what to do.

Common Ways to Manipulate a Debugger

Basically, the question is "now that I have a debugger attached to the process of interest, how do I manipulate the debugger to do what I want?"

The following are some common ways to manipulate an NTSD debugger:

  • Make the debugger command window show up on a WinStation which you can access by launching the debugger interactively as the logged-on user
  • Make the debugger command window show up on a WinStation which you can access by making the NT Service interactive with the Console desktop (WinStation#0)
  • Make the debugger into a "conduit" for an eventual debugging client by piping usermode output into a kernel mode debugger with -d
  • Make the debugger into a "conduit" for an eventual debugging client by opening a TCP/IP port or NamedPipe with -server

The astute reader should note that there are other debugging methods, such as JIT Debugger, Kernel Debugger, etc... but they are neither relevant nor useful here, so I will skip them for the sake of logical clarity.

Yes, it may seem like a large number of choices for something as simple as "how do I manipulate the debugger", but rest assured, they exist because at one point or another some Microsoft product team needed the feature to debug some aspect of Windows. One may never need to use all of the options, but the utility of having the right option for the right situation means everything in a debugger. Remember, this is the same Debugging Toolkit used within Microsoft to debug native code, so it is plenty powerful when properly wielded.

The Issue, Reformulated

Now that I have enumerated some options, the issue should hopefully make more sense.

  • The NTSD debugger is configured to auto-attach via Image File Execution Options to the W3WP.EXE process launched by an NT Service, which does not interact with the Console desktop by default.
  • An unhandled exception occurred in the W3WP.EXE process and was caught by the attached NTSD debugger (also non-interactive with the Console desktop), which halted all code execution within the W3WP.EXE process.
  • The NTSD debugger is now awaiting commands following the caught exception, but you cannot type them into any debugger command window since it is not interacting with the Console desktop, nor were any commands queued for the debugger ahead of time.
  • And since a Windows Process only has one Debugger port, you cannot attach a second debugger via any other method to regain control of the debugger/process...
  • Thus, the current debugging session is inaccessible and dead.
  • To add insult to injury - when W3SVC wants to recycle and/or terminate a monitored W3WP and detects that a debugger is already attached to it, it simply skips taking action against it (i.e. the second event log entry above, Event ID 1082). So, not only is the W3WP.EXE halted from executing code and inaccessible for debugging, IIS also skips cleaning it up.

    This is ok, though, because the feature was added during IIS6 development as a fail-safe against losing W3WP.EXE for investigations. Yes, the behavior looks silly when misconfigured, but the benefits outweigh the occasional mishap.

Corrective Actions

How to address this issue? Well, one can reconfigure the system to support debugging in any of the ways listed earlier. This is how to do each:

  • Make the debugger command window show up on a WinStation which you can access by launching the debugger interactively as the logged-on user

    With the target W3WP.EXE already running, run:

    C:\DEBUG\NTSD -g -G -p {PID of W3WP.EXE}

    If there is only one W3WP.EXE, you can use -pn w3wp.exe to select the unambiguous process name "w3wp.exe" to attach to.

  • Make the debugger command window show up on a WinStation which you can access by making the NT Service interactive with the Console desktop (WinStation#0)

    REG ADD "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\w3wp.exe" /v Debugger /d "C:\DEBUG\NTSD.EXE -g -G" /t REG_SZ /f
    SC CONFIG IISADMIN type= share type= interact
    SC CONFIG W3SVC type= own type= interact
    NET STOP /y IISADMIN
    NET START W3SVC

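    Be careful with the SC commands - the exact parameters and whitespacing are (unfortunately) important. In particular, none of the following work: type=interact (no space after the equals sign), type =interact (space on the wrong side), or type= interact on its own without an accompanying type= own or type= share.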

    The NTSD window now automatically shows up in WinStation#0 (the local console) for each new W3WP.EXE.

  • Make the debugger into a "conduit" for an eventual debugging client by piping usermode output into a kernel mode debugger with -d

    REG ADD "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\w3wp.exe" /v Debugger /d "C:\DEBUG\NTSD.EXE -g -G -d" /t REG_SZ /f

  • Make the debugger into a "conduit" for an eventual debugging client by opening a TCP/IP port or NamedPipe with -server

    • REG ADD "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\w3wp.exe" /v Debugger /d "C:\DEBUG\NTSD.EXE -server tcp:port=%d -g -G" /t REG_SZ /f
    • REG ADD "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\w3wp.exe" /v Debugger /d "C:\DEBUG\NTSD.EXE -server npipe:pipe=w3wp%d -g -G" /t REG_SZ /f
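With either -server form above, the debugger only becomes useful once a debugging client connects to it with the matching -remote option. The following is a minimal sketch, in Python purely for assembling the command line, of what that client command looks like. The helper name remote_client_command, port 9999, and the pipe name w3wp3800 are illustrative assumptions, not values from the setup above; substitute whatever port or pipe name your debugging server actually opened.

```python
def remote_client_command(debugger, transport, server, endpoint):
    """Build the client command line that connects to an NTSD -server session.

    transport: "tcp" (endpoint is a port number) or
               "npipe" (endpoint is a pipe name).
    server:    machine running the debugging server.
    """
    if transport == "tcp":
        target = f"tcp:server={server},port={int(endpoint)}"
    elif transport == "npipe":
        target = f"npipe:server={server},pipe={endpoint}"
    else:
        raise ValueError(f"unsupported transport: {transport!r}")
    return f"{debugger} -remote {target}"


if __name__ == "__main__":
    # Hypothetical values: port 9999 and pipe "w3wp3800" are placeholders.
    print(remote_client_command(r"C:\DEBUG\NTSD.EXE", "tcp", "localhost", 9999))
    print(remote_client_command(r"C:\DEBUG\NTSD.EXE", "npipe", "DAZLOADBAL3", "w3wp3800"))
```

Run the printed command from any session on a machine that can reach the server; the client window then accepts debugger commands on behalf of the non-interactive NTSD attached to W3WP.EXE.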

Conclusion

Which one is "best"? They are all "best" for certain situations and painfully inadequate for the wrong situations... so "best" really depends on the debugging task at hand. I recommend evaluating the needs of the debugging situation and then selecting the debugging approach that fits it and that you are comfortable with. While the above list is not exhaustive, it should suffice for most debugging situations.

Personally, I favor -server over TCP/IP, with the client run from a non-console WinStation on the server, because it alters no service/server configuration. Yes, the commandline syntax can be complicated, but that's what batch scripting is for. :-)
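As one way to tame that syntax, here is a small sketch (again in Python, only generating text) that produces the REG ADD line for a given debugger command line, plus the REG DELETE line to remove the Debugger value once the investigation is over. The function names are hypothetical. One detail worth knowing if you paste the result into a .bat or .cmd file: a literal % must be doubled there, so %d has to be written as %%d.

```python
IFEO_KEY = (r"HKLM\Software\Microsoft\Windows NT\CurrentVersion"
            r"\Image File Execution Options\w3wp.exe")


def ifeo_add_command(debugger_cmdline, for_batch_file=False):
    """Return the REG ADD line that sets the IFEO Debugger value for w3wp.exe."""
    if for_batch_file:
        # Inside a .bat/.cmd file a literal % must be written as %%.
        debugger_cmdline = debugger_cmdline.replace("%", "%%")
    return (f'REG ADD "{IFEO_KEY}" /v Debugger '
            f'/d "{debugger_cmdline}" /t REG_SZ /f')


def ifeo_remove_command():
    """Return the REG DELETE line that stops auto-attaching to new w3wp.exe."""
    return f'REG DELETE "{IFEO_KEY}" /v Debugger /f'


if __name__ == "__main__":
    cmdline = r"C:\DEBUG\NTSD.EXE -server npipe:pipe=w3wp%d -g -G"
    print(ifeo_add_command(cmdline))                       # for typing at a prompt
    print(ifeo_add_command(cmdline, for_batch_file=True))  # for a .bat file
    print(ifeo_remove_command())
```

Removing the Debugger value afterwards matters because, as long as it is present, every new W3WP.EXE starts under the debugger and W3SVC will not recycle it on health failures.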

//David
