Homepage of Vito Plantamura, Windows Researcher and Developer. Updated: August 08, 2007
BCSTRANS MODULE

INTRODUCTION

The BCSTRANS module is a DLL I developed some time ago (2003) to convert .PDB and .DBG symbol files into .BCS files, the file format used internally by my kernel debugger (BugChecker) to access symbol information. BCSTRANS is closely akin to the DbgHelp library from Microsoft in that it reads .PDB/.DBG data and outputs it to the user in a consumable format. However, the DbgHelp API is ideal for user-mode applications that need to access debug information and to call debug-oriented service functions (for generating a stack trace or taking a minidump), whereas BCSTRANS was designed for kernel-mode debugging of user and system components. The .BCS format (similar in purpose to the SoftICE .NMS file type) makes available to a kernel debugger all the information required to:

     generate accurate stack back traces using FPO data.
     access private symbols in order to describe functions and memory data in the debugger environment.
     access type data in order to provide type information.
     access a filtered collection of module symbols in order to display local variable and function parameter contents in the debugger environment.
     access source file line information.
     access the original source files, if available and/or explicitly packaged with the .BCS file.

BCSTRANS is used by the BugChecker Symbol Loader (screenshot of that application not reproduced here).



HOW BCSTRANS WORKS

When you want to write code that accesses symbol information stored in .PDB files generated by Microsoft compilers, you usually have two choices: using the DbgHelp DLL or using the newer DIA COM SDK. However, I settled on a third method, which I have seen used only in older versions of the SoftICE Symbol Loader and (obviously) in Microsoft tools such as Visual Studio itself, and which is totally undocumented: referencing MSPDBXX.dll directly.

I chose that approach for two main reasons:

     The DIA SDK (used in later versions of the SoftICE Symbol Loader and included, along with a small example application, in Visual Studio .NET), although the best way to go in this case, is unfortunately not freely redistributable. Its server exposes COM objects that give direct and linear access to the debug information inside the program database file. Even the DbgHelp module itself can consume DIA objects in special circumstances (you can infer this from the MSDN documentation for DbgHelp: the references to MSDIA in the DbgHelp proprietary structure definitions lead to this conclusion). However, its licensing terms and rates are a bit confusing: in 2003, when I wrote BCSTRANS, I could not find a conclusive answer as to whether it could be redistributed by paying a fee or by acquiring a special license from Microsoft. So I gave up and focused on the MSPDBXX solution mentioned above, which is the most "brute force" approach possible, second only to opening and reading the .PDB file directly (the "Program Database" file format is not documented because, as Microsoft has stated, it needs to be updated too often to follow the technological changes in the tools and compilers, so, for simplicity, it has been kept closed; this, however, has increased the need for, and the growth of, end-user API libraries like DbgHelp and DIA).
     The early 2003 version of DbgHelp (6.0) that I studied and evaluated when deciding which method to use for BCSTRANS lacked some important APIs, such as SymEnumLines, SymEnumSourceFiles and SymGetFileLineOffsets64, that were introduced in later versions and that are required to convert the .PDB debug information in a comprehensive and "linear" manner (part of the missing information could be obtained with that old version, and with newer ones, but only in tricky ways and with a considerable increase in conversion times). In any case, access to the type and symbol debug information in native or translated (as happens with newer versions of MSPDBXX) CODEVIEW format gives more comprehensive and quicker access to every piece of debug data about the analyzed module than DbgHelp does, where the same information is exposed to the end application through proprietary structures and multiple API calls (with DbgHelp, a library design that promotes portability and abstraction, and the need to adhere to the SYMBOL_INFO and SymGetTypeInfo paradigms, may cause some information to be filtered out, as I noticed during my preliminary tests in 2003).

Although consuming MSPDBXX directly from a Visual Studio installation is by far the least "correct" of the three methods, it worked very well in my case. Luckily, the function names exported by MSPDBXX are rather self-explanatory and the CODEVIEW type format is well documented. The major downsides of this approach are that you rely on a private DLL that refers to a private file format that can change at any moment (although the MSPDBXX releases are basically backward compatible), and that you need a licensed copy of Visual Studio to obtain a copy of that DLL (the version of MSPDBXX used has to be at least as recent as the one that generated the .PDB file you are trying to open: I noticed, however, that several distributions of the NTDDK come with old versions of MSPDBXX; in any case, if you are compiling and linking drivers with Microsoft tools, it is reasonable to assume that these tools come with a suitable MSPDBXX DLL that can be used by BCSTRANS and BugChecker). Another small annoyance is that BCSTRANS needs extra processing in its code for things that, with DbgHelp for example, come for free: this is the case for OMAP translations, which need to be handled explicitly. In any case, as you can guess, I always like a bit of reverse engineering in my development projects...
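To give an idea of what handling OMAP translations explicitly involves, here is a minimal sketch in C of how an OMAP table is typically consulted. It is not taken from BCSTRANS: the entry layout mirrors the pair of RVAs (rva, rvaTo) that DbgHelp documents for its OMAP structure, while the names OmapEntry and omap_translate, and the convention of returning 0 for an unmapped address, are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One OMAP entry: addresses starting at `rva` map to `rva_to`.
   An `rva_to` of 0 marks a range that has no counterpart. */
typedef struct {
    uint32_t rva;
    uint32_t rva_to;
} OmapEntry;

/* Translate `addr` through a table sorted by ascending `rva`.
   Returns the translated RVA, or 0 when the address is unmapped.
   (Hypothetical helper, not part of the BCSTRANS API.) */
uint32_t omap_translate(const OmapEntry *table, size_t count, uint32_t addr)
{
    size_t lo = 0, hi = count;

    assert(table != NULL || count == 0);

    /* Binary search for the first entry whose rva is greater than addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].rva <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;                      /* addr precedes the first entry */

    /* The entry just before that one covers addr. */
    const OmapEntry *e = &table[lo - 1];
    if (e->rva_to == 0)
        return 0;                      /* range has no counterpart */
    return e->rva_to + (addr - e->rva);
}
```

The same routine serves both directions (original to optimized image and back), provided it is given the matching table.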

The versions of MSPDBXX.dll that BCSTRANS supports are as follows:

     MSPDB71.DLL from Visual C++ .NET 2003.
     MSPDB70.DLL from Visual C++ .NET.
     MSPDB60.DLL from Visual C++ 6.0.
     MSPDB50.DLL from Visual C++ 5.0.

USING BCSTRANS

I will discuss here how to reference and use the BCSTRANS module, in case you are interested in incorporating it into a project of your own. If you do, don't forget to read my licensing policy file for binary components, and keep in mind that the type and symbol data stored in the .BCS files is tailored to the needs of BugChecker and its features.

After loading it through LoadLibrary, you can call two functions on it:

     INFOFromMOD: this function is used to obtain various debug information from an executable image. That information includes the signature and the age of the debug file, the image characteristics and, above all, the name of the debug file associated with the specified image. The use of this function is demonstrated in the "Symbol Retriever" example, whose source code is provided for download.

Specifically, with this function you can get the following details about the source module:

     Debug file name.
     Signature (32 or 128 bit) and age of PDB.
     Timestamp, size of image, checksum, image alignment.
     Image section data.

     BCSFromPDB: this is the core function exported by BCSTRANS. It takes as parameters the name of the .PDB or .DBG file, other information about the source image (obtained by calling INFOFromMOD) and specific directives about what to do. The "Symbol Retriever" example shows an alternative use of this function for obtaining information about a .DBG file.

The levels of action of this function are defined as follows:

     TRANSLEVEL_GETDBGINFOONLY: no conversion occurs; only information about the .DBG module is retrieved.
     TRANSLEVEL_PUBLICSONLY: public symbol names. No types, no source.
     TRANSLEVEL_TYPESONLY: only type information.
     TRANSLEVEL_SYMBOLSONLY: public+local symbol names. No source.
     TRANSLEVEL_SYMANDSOURCE: full debug information (including line information). This is the BCSTRANS default behaviour.
     TRANSLEVEL_SYMANDSRCWTFILES: full debug information + source code packaged in the BCS file. The source files are searched for in the specified paths, in the order they appear in the "vpszSourceDirs" vector. If a source file is not found, a user callback is called in order to determine what to do: the file can be skipped (MISSINGSRCRES_SKIPONLYTHIS), all the remaining source files can be skipped (MISSINGSRCRES_SKIPALL) or an alternative path can be specified (MISSINGSRCRES_OTHERPROVIDED).
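The source file lookup described for TRANSLEVEL_SYMANDSRCWTFILES can be sketched in C roughly as follows. This is only an illustration of the search order and of the three callback outcomes, not the BCSTRANS implementation: the MISSINGSRCRES_XXX names come from the text above, but their numeric values, the callback signature and the helper names (locate_source, skip_this_file, file_exists) are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Names from the text above; the numeric values are ASSUMED. */
typedef enum {
    MISSINGSRCRES_SKIPONLYTHIS,
    MISSINGSRCRES_SKIPALL,
    MISSINGSRCRES_OTHERPROVIDED
} MissingSrcRes;

/* HYPOTHETICAL callback shape: may write an alternative path into `alt`. */
typedef MissingSrcRes (*MissingSrcCallback)(const char *name,
                                            char *alt, size_t alt_size);

/* Sample callback that gives up on the current file only. */
MissingSrcRes skip_this_file(const char *name, char *alt, size_t alt_size)
{
    (void)name; (void)alt; (void)alt_size;
    return MISSINGSRCRES_SKIPONLYTHIS;
}

static int file_exists(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Returns 1 and fills `out` when the file is located, 0 when it is skipped.
   Sets *skip_all once the callback asks to skip every remaining file. */
int locate_source(const char *const *dirs, size_t dir_count, const char *name,
                  MissingSrcCallback on_missing, int *skip_all,
                  char *out, size_t out_size)
{
    assert(name != NULL && skip_all != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    if (*skip_all)
        return 0;                 /* an earlier callback said "skip all" */

    /* Try each directory in the order given. */
    for (size_t i = 0; i < dir_count; i++) {
        int n = snprintf(out, out_size, "%s/%s", dirs[i], name);
        if (n > 0 && (size_t)n < out_size && file_exists(out))
            return 1;
    }
    if (on_missing == NULL)
        return 0;

    /* Not found anywhere: let the caller decide what to do. */
    switch (on_missing(name, out, out_size)) {
    case MISSINGSRCRES_OTHERPROVIDED:
        return file_exists(out);  /* callback wrote a replacement path */
    case MISSINGSRCRES_SKIPALL:
        *skip_all = 1;
        return 0;
    default:
        return 0;                 /* MISSINGSRCRES_SKIPONLYTHIS */
    }
}
```

The caller keeps a single skip_all flag for the whole conversion and passes it to every lookup, so that a MISSINGSRCRES_SKIPALL answer silences the callback for all the files that follow.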

You may notice, looking at the TRANSLEVEL_XXX constant names above, that the behaviour and programmability of BCSTRANS mimic those of the NMS translation module of the SoftICE Symbol Loader. Well, the intention was just that: I wanted to keep things simple.

The "../bcstrpub_stripped.h" file contains the definitions and prototypes for working with BCSTRANS:

bcstrpub_stripped.h
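As a rough starting point, loading the module and resolving the two functions described above might look like the following C sketch. Several things here are assumptions: the exported names are taken to be the undecorated "INFOFromMOD" and "BCSFromPDB" (check the actual export table), the BcstransApi structure and bcstrans_load function are hypothetical, and the entry points are kept as untyped function pointers because the real prototypes are in the header and are not reproduced here. On non-Windows systems the function simply reports failure.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Untyped placeholder; cast to the real prototype from the header
   before calling. */
typedef void (*generic_fn)(void);

/* HYPOTHETICAL holder for the module handle and its two entry points. */
typedef struct {
    void      *module;
    generic_fn info_from_mod;
    generic_fn bcs_from_pdb;
} BcstransApi;

/* Returns 1 when the DLL is loaded and both exports are resolved,
   0 otherwise (in which case `api` is left zeroed). */
int bcstrans_load(const char *dll_path, BcstransApi *api)
{
    assert(dll_path != NULL && api != NULL);
    api->module = NULL;
    api->info_from_mod = NULL;
    api->bcs_from_pdb = NULL;

#ifdef _WIN32
    HMODULE h = LoadLibraryA(dll_path);
    if (h == NULL)
        return 0;

    /* ASSUMED export names: they may be decorated in the real DLL. */
    generic_fn info = (generic_fn)GetProcAddress(h, "INFOFromMOD");
    generic_fn bcs  = (generic_fn)GetProcAddress(h, "BCSFromPDB");
    if (info == NULL || bcs == NULL) {
        FreeLibrary(h);
        return 0;
    }
    api->module = (void *)h;
    api->info_from_mod = info;
    api->bcs_from_pdb = bcs;
    return 1;
#else
    return 0;   /* BCSTRANS is a Windows DLL */
#endif
}
```

Once loaded, the natural order of calls is INFOFromMOD first, to collect the image details, and then BCSFromPDB with the chosen TRANSLEVEL_XXX directive.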

FORMAT OF BCS FILES

The "../bcsfiles.h" file contains the definitions and structures for reading a BCSTRANS generated file:

bcsfiles.h

Here are various notes that can be useful for interpreting this format:

     After the header string (MACRO_BCS_HEADERSTRING), the rest of the file is organized in tagged records: each record is prefixed with a BCSRECORD structure whose dwTYP field is set to one of the MACRO_R_XXX constants.
     The difference between the MACRO_R_SECTIONS and MACRO_R_ORIGINALSECTIONS data is that the former is obtained from the PE image header whereas the latter is derived from the SegMap data of the symbol table. The MACRO_R_ORIGINALSECTIONS data reflects the PE section structure of the original image, before any OMAP transformation.
     MACRO_R_TYPES data represents a filtered and consistent series of type definitions in OMF format. The OMF format is the Microsoft format for describing symbols and types and is rather well documented. Before being included in the BCS archive, all the records are checked for dependency errors and redundancy and are ordered by type name. The MACRO_R_TYPIDS data contains a series of DWORD values that represent the type indexes for the data in the MACRO_R_TYPES section.
     The MACRO_R_OMFSYMBOLS data represents a filtered and consistent series of symbol definitions in OMF format. This information includes local variable data, label positions etc.
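The tagged-record organization described in the first note can be walked with a loop like the one in this C sketch. The record layout used here is an assumption: only the dwTYP field is named above, so the dwSize field (taken to be the payload length in bytes) and the BcsRecordSketch and bcs_find_record names are hypothetical. The real BCSRECORD definition and the MACRO_R_XXX values are in the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED layout: only dwTYP is named in the text. dwSize (payload
   length in bytes) is a hypothetical field used for illustration. */
typedef struct {
    uint32_t dwTYP;
    uint32_t dwSize;
} BcsRecordSketch;

/* `data` points just past the header string. Returns the payload of the
   first record tagged `type`, or NULL when there is no such record or
   the data is truncated. */
const uint8_t *bcs_find_record(const uint8_t *data, size_t len,
                               uint32_t type, uint32_t *payload_size)
{
    size_t pos = 0;

    assert(data != NULL || len == 0);

    while (len - pos >= sizeof(BcsRecordSketch)) {
        BcsRecordSketch rec;
        memcpy(&rec, data + pos, sizeof rec);  /* avoids unaligned reads */
        pos += sizeof rec;

        if (rec.dwSize > len - pos)
            return NULL;                       /* truncated record */
        if (rec.dwTYP == type) {
            if (payload_size != NULL)
                *payload_size = rec.dwSize;
            return data + pos;
        }
        pos += rec.dwSize;                     /* skip to the next record */
    }
    return NULL;
}
```

A reader built this way can ignore record types it does not know about, which is the main advantage of a tagged layout.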

USEFUL LINKS

OMF symbol and type format: link.

DOWNLOAD

Download BCSTRANS.DLL (binary only) from here (59KB).

Everything here (code, binaries, text, graphics, design, html) is © 2010 Vito Plantamura and VPC Technologies SRL (VATID: IT06203700965).
If you download something (compilable or not) from the site, you should read the license policy file.