Kernel Memory Space Analyzer Specification (Phase 3)

 

Last Updated: 12/25/99

 


1.     Introduction

1.1       Reducing Analysis Time

1.2       Enhancing the Debugging Experience

1.3       Extensibility

2.     Using This Document

 


1.      Introduction

Debugging Windows NT system crashes can be a daunting task. A high level of expertise is generally required to properly classify and diagnose all but the simplest problems. Significant time can be spent simply reaching the point where the symptom of the problem is determined, let alone the point where the actual cause is understood. In addition, analysis procedures are often repetitive and prone to guesswork.

 

The reliability of Windows NT is improving over time. New features such as Special Pool and the Windows 2000 Driver Verifier help pinpoint defects that were previously highly likely to cause system problems, many of which were extremely difficult to analyze and diagnose. From a supportability standpoint, the flip side is that problems not caught by such advanced system features and other steadily evolving tools will require ever-increasing expertise to analyze.

 

Hence the Kernel Memory Space Analyzer (“kanalyze”) is a supportability tool with three fundamental goals:

 

·        Reduce the time required to analyze Windows NT crashes;

·        Enhance and enrich the experience of debugging Windows NT crashes;

·        Provide a highly extensible framework within which others can build upon the capabilities provided to fulfill the first two goals.

 

Kanalyze is not a reliability tool in that it is entirely focused on dealing with crashes after they have already happened; kanalyze’s primary operand is the crash dump file. Kanalyze does, however, provide a framework within which additional information from a crashed system could be integrated into the analysis.

 

Kanalyze’s behavior can be configured and customized as required by anyone distributing or using kanalyze; the range of possibilities is essentially infinite. Kanalyze is delivered with several different “personalities” (pre-set configurations for specific scenario groups).

 

·        Personality 1: aid to an expert debugger, by identifying and inspecting a broad range of kernel space data items and the relationships among them, and exposing this information through an interactive command line interface.

·        Personality 2: identification and analysis of problems based on stop codes, anomalous conditions detected in kernel space, etc., with the results presented to an end-user.

·        Personality 3: similar to personality 2, except that it includes the ability to use (and update) information about previously seen crashes in a central database. This is the main focus for phase 3.

As delivered, kanalyze examines the following items to ensure that there are no illegal overlaps among them and that they appear correct and consistent. Read-only memory areas of the kernel, the HAL, and loaded device drivers are checked against the original images to ensure that no code or data has been overwritten.
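
The two checks described above can be illustrated with a minimal sketch. It is not kanalyze's implementation: the `Region` type, the function names, and the half-open address ranges are assumptions made for illustration. The sketch also reports every overlap it finds, whereas deciding which overlaps are illegal (as opposed to legitimate containment) would require knowledge of the item types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical kernel space item: a typed, half-open address range."""
    kind: str
    start: int
    end: int  # exclusive


def find_overlaps(regions):
    """Return pairs of regions whose address ranges intersect."""
    ordered = sorted(regions, key=lambda r: (r.start, r.end))
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so no later region can overlap a
            overlaps.append((a, b))
    return overlaps


def find_modified_ranges(original: bytes, in_dump: bytes):
    """Return (offset, length) runs where the dump differs from the image.

    Only the common prefix of the two buffers is compared.
    """
    runs = []
    run_start = None
    for offset, (x, y) in enumerate(zip(original, in_dump)):
        if x != y and run_start is None:
            run_start = offset  # a differing run begins here
        elif x == y and run_start is not None:
            runs.append((run_start, offset - run_start))
            run_start = None
    if run_start is not None:
        compared = min(len(original), len(in_dump))
        runs.append((run_start, compared - run_start))
    return runs
```

For example, `find_modified_ranges(image_bytes, dump_bytes)` on a read-only section would return an empty list when nothing has been overwritten, and a list of damaged offsets otherwise.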

 

 

The output of kanalyze is also highly extensible. As delivered, output shows the items located and analyzed and/or descriptions of problems detected with them, at varying levels of verbosity. A mechanism for interacting with kanalyze is also provided.

 

1.1      Reducing Analysis Time

The pie-in-the-sky goal for a tool like kanalyze would be to look at a crash dump file and come to a precise conclusion about not only the symptom of the problem (e.g., “this block of pool has been corrupted”) but also the precise cause of the problem (e.g., “driver foobar.sys’s interrupt routine wrote past the end of an allocated block”) and the solution to the problem (e.g., “install version 2.4 of foobar.sys from ftp://ftp.foobar.com/support/winnt/foobar.sys”).

 

We note that the solution step is what customers care about; the cause step is generally unimportant to them. We also note that the symptoms of a problem, as expressed by the information typically found on a bluescreen, lend themselves nicely to being quantified and cataloged.

 

Thus reducing analysis time will mean that we attempt to leverage previously applied expertise to match crashes against a database of previously seen crashes and their solutions as described by a human being. After an expert has analyzed a crash dump, the results of the analysis can be added to the database. In this way time is saved by adding a new and completely automated initial step to crash dump analysis: for an investment of a few minutes in running kanalyze, analysis may be revealed to be completely unnecessary because the solution is already known.
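
As a rough illustration of this matching step, the sketch below keys previously seen crashes on a small symptom signature and returns the recorded solution when one exists. The signature fields (stop code and faulting module), the class names, and the in-memory dictionary are all assumptions made for illustration; they do not describe the actual known issues database or its schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrashSignature:
    """Hypothetical symptom signature taken from bluescreen information."""
    stop_code: int
    faulting_module: str


class KnownIssues:
    """In-memory stand-in for a database of previously seen crashes."""

    def __init__(self):
        self._solutions = {}

    def record(self, signature: CrashSignature, solution: str) -> None:
        # An expert adds the result of a completed analysis.
        self._solutions[signature] = solution

    def lookup(self, signature: CrashSignature) -> Optional[str]:
        # Automated first step: None means expert analysis is still needed.
        return self._solutions.get(signature)


known = KnownIssues()
known.record(
    CrashSignature(stop_code=0x0000000A, faulting_module="foobar.sys"),
    "install version 2.4 of foobar.sys",
)
```

A crash with the same signature then resolves immediately through `known.lookup(...)`, while an unseen signature returns `None` and falls through to manual analysis.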

 

And of course, for those cases where analysis is in fact needed, enhancing the expert’s debugging experience will naturally lead to a reduction in the time required.

 

1.2      Enhancing the Debugging Experience

Kanalyze essentially parses kernel space into an internal format, which includes indices based on data types, starting and ending addresses, etc. It understands things like how items are linked together via pointers, which items are contained within which other items, and which items of which types overlap with each other and with particular addresses in memory.
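
A minimal sketch of such an internal format follows, assuming each item is reduced to a type name and a half-open address range. The `Item` and `KernelSpaceIndex` names and their methods are hypothetical and are not taken from kanalyze's architecture; the sketch only shows how indices by type and by starting address can answer "which items cover this address?".

```python
import bisect
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical parsed kernel space item."""
    kind: str
    start: int
    end: int  # exclusive


class KernelSpaceIndex:
    """Indexes items by type and by starting address."""

    def __init__(self, items):
        self._by_start = sorted(items, key=lambda it: it.start)
        self._starts = [it.start for it in self._by_start]
        self._by_kind = defaultdict(list)
        for it in self._by_start:
            self._by_kind[it.kind].append(it)

    def of_kind(self, kind):
        """Return all items of one type, ordered by starting address."""
        return list(self._by_kind.get(kind, []))

    def containing(self, address):
        """Return all items whose range covers the given address."""
        # Only items that start at or before the address can contain it.
        upper = bisect.bisect_right(self._starts, address)
        return [it for it in self._by_start[:upper] if address < it.end]
```

Because every item covering an address is returned, the same query also exposes containment: an item nested inside a larger one appears together with its container.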

 

Kanalyze is therefore in a position to provide unique and interesting views of kernel space, a little like a large set of hyper-enhanced kernel debugger extensions. To do this, kanalyze includes a module which allows the user to interact with kanalyze in a debugger-like fashion.

 

1.3      Extensibility

Kanalyze is highly extensible in terms of its functionality, user interface, and output. Its operation can be customized such that it takes on entirely different personalities depending on its desired role in a particular scenario. For example, it can be run from the command line, or third parties may configure it for automatic invocation at boot time after a crash. Kanalyze is a set of DLLs, which allows third parties to use it as the analysis engine inside a larger analysis package. The core portions of kanalyze contain no UI of their own. The "standard" kanalyze includes two executables: one so that it can be invoked as a stand-alone console application, and another that functions as the main phase 3 kanalyze, providing wizard functionality for analysis and database matching.

 

2.      Using This Document

Documentation for kanalyze is organized into several separate areas. Each document deals with a different aspect of kanalyze.

 

·          Architecture. This document describes kanalyze’s architecture, and specifies the interface for plug-in modules. The intended audience for this document is developers who will be writing custom modules to extend or enhance kanalyze’s functionality. However, the concepts described therein are invaluable for understanding all of the other kanalyze documentation. Read this document first.

 

·          Known Issues Database. This document describes the database of previously seen crash data which can be used by kanalyze to determine whether a crash appears to match a previously seen crash scenario, and/or update the database to include new crash data. Additional tools for manipulating the database outside the context of kanalyze are also described.

 

·          Usage and Customization. This document describes how to invoke and use kanalyze. It also documents a set of SQL scripts that can be used to create and maintain a known issues database. The target audience for this document is anyone who will deploy, use, or customize kanalyze.

 

·          Plug-in Reference. This document functions much like an appendix to the architecture documentation, and specifies the plug-ins that are included with kanalyze as delivered, including their functionality and the interfaces and methods they export for use by other plug-ins.