VitoPlantamura's Website

"extremely impressive work"
- Mark Russinovich, www.sysinternals.com
(referring to my BugChecker)

Homepage of Vito Plantamura, Windows Researcher and Developer.

Updated: August 08, 2007

..:: BugCheckerVideo Example of Advanced Kernel Hooking Techniques ::..

INTRODUCTION

NOTE: the example code this article refers to was extracted from the implementation of the "BugChecker Video Driver". BugChecker is still in development, and so is the video driver itself: the code presented here is incomplete in several respects and should not be considered for inclusion in a commercial or large-scale product. This article and the accompanying source code should be regarded simply as an example of a solution to a rather complex hooking problem.

I have decided to post the source code of the "BugChecker Video Driver" (more information here) because it shows the curious low-level programmer many interesting hooking and programming techniques, and how very complex hooking problems are solved in actual code.

This driver is fundamental to the operation of the BugChecker debugger driver: it is loaded at operating system boot time and then remains active, waiting for another driver or a user application to send an IOCTL requesting its services. Its duty is to hook some DirectDraw functions and then notify the subscribed clients (up to 16) when the display video mode changes.

The information returned by this driver is stored in the following structure:

typedef struct _VIDEOMEMORYINFO
{
    FLATPTR             fpPrimary;              // offset to primary surface
    DWORD               dwFlags;                // flags
    DWORD               dwDisplayWidth;         // current display width
    DWORD               dwDisplayHeight;        // current display height
    LONG                lDisplayPitch;          // current display pitch
    DDPIXELFORMAT       ddpfDisplay;            // pixel format of display
    DWORD               dwOffscreenAlign;       // byte alignment for offscreen surfaces
    DWORD               dwOverlayAlign;         // byte alignment for overlays
    DWORD               dwTextureAlign;         // byte alignment for textures
    DWORD               dwZBufferAlign;         // byte alignment for z buffers
    DWORD               dwAlphaAlign;           // byte alignment for alpha
    PVOID               pvPrimary;              // kernel-mode pointer to primary surface
} VIDEOMEMORYINFO;

The most important pieces of information in this structure, as far as BugChecker is concerned, are the pointer to the display framebuffer in kernel memory, the current resolution (width and height) and the layout of the display memory. BugChecker needs to know these low-level details because it has to draw its own interface directly into the mapped video memory: no operating system service (such as GDI) at any level can be used by the debugger, because it has to "stay above" the OS itself. Once these details are known, BugChecker becomes independent of the OS and can debug and trace into any portion of the system (including the modules that actually provide graphics and video services...) while displaying its own, independent, interface.
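To make the role of the pitch and pixel format concrete, here is a minimal user-mode sketch of how a pixel address can be derived from these fields. The FrameBufferInfo type and the PixelOffset and PutPixel32 functions are hypothetical names introduced only for this illustration (they are not part of the driver), and the sketch assumes a positive pitch and a 32 bits-per-pixel mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the VIDEOMEMORYINFO fields used here. */
typedef struct {
    uint32_t width;         /* dwDisplayWidth                           */
    uint32_t height;        /* dwDisplayHeight                          */
    int32_t  pitch;         /* lDisplayPitch: bytes per scanline        */
    uint32_t bitsPerPixel;  /* ddpfDisplay.dwRGBBitCount                */
    uint8_t *primary;       /* pvPrimary: start of the mapped surface   */
} FrameBufferInfo;

/* Byte offset of pixel (x, y) from the start of the framebuffer.
   The pitch, not the width, decides where each scanline begins. */
size_t PixelOffset(const FrameBufferInfo *fb, uint32_t x, uint32_t y)
{
    assert(fb != NULL && fb->pitch > 0);
    assert(x < fb->width && y < fb->height);
    return (size_t)y * (size_t)fb->pitch + (size_t)x * (fb->bitsPerPixel / 8u);
}

/* Store one pixel; this sketch only handles 32 bits-per-pixel modes. */
void PutPixel32(const FrameBufferInfo *fb, uint32_t x, uint32_t y, uint32_t color)
{
    assert(fb->primary != NULL && fb->bitsPerPixel == 32);
    memcpy(fb->primary + PixelOffset(fb, x, y), &color, sizeof color);
}
```

With a 1024x768 mode at 32 bits per pixel and a pitch of 4096 bytes, the pixel at (1, 1) sits 4096 + 4 = 4100 bytes past the start of the surface.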

I will not discuss the generic structure of an NT device driver here (I assume that the reader has this knowledge); instead I will go straight to the logic and implementation of this software and to how I arrived at the approach used to solve this specific problem. The first step when writing a driver or a user component that does this type of job is to study thoroughly the technology that the software is going to hook/hack. I say this because there can be many ways to solve the same problem: some are equivalent to each other, others are less convenient in terms of development time and effort and, especially, in terms of longevity/maintenance (with respect to possible changes in future versions of the software being hooked...).

HOW RELEVANT DIRECTDRAW INITIALIZATION WORKS

( in a nutshell ! )

In this specific case, I referred to the Windows 2000 DDK documentation that comes with the DDK standard installation archive.

When the system boots up, GDI looks in the registry to determine which display driver (and video miniport driver) to load. A display driver, as stated in the documentation, is a DLL with the "primary responsibility of rendering". "When an application calls a Win32 function with device-independent graphics requests, the Graphics Device Interface (GDI) interprets these instructions and calls the display driver." The features of the display driver are exposed to GDI via the DDI interface: this is a set of functions that the driver must implement in order to be used by GDI. The display driver initialization occurs immediately after the system has loaded the display driver DLL into system space. The only function that the display driver makes available to GDI up front is "DrvEnableDriver", which is actually the entry point of the module. In fact, on my system, the display driver loaded by GDI at boot time was "ati2dvai.dll", and this DLL doesn't export any function. "DrvEnableDriver" is the first step in the display driver initialization: GDI calls it passing a pointer to a structure of type DRVENABLEDATA. This structure is zero-initialized by GDI before the call, so that the driver can fill its fields with the information needed for the initialization process to continue.

typedef struct tagDRVENABLEDATA {
  ULONG iDriverVersion;
  ULONG c;
  DRVFN *pdrvfn;
} DRVENABLEDATA, *PDRVENABLEDATA;

In our case, the most important field is "pdrvfn": it points to an array of DRVFN entries, each pairing a function index with a function pointer, that the driver itself has to fill in; the "c" field holds the number of entries. The functions listed in the array are all the DDI functions that the display driver wants to implement and expose to the system. When "DrvEnableDriver" returns, the system knows the pointers to all the DDI functions the driver implements.
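As an illustration of how such a table can be searched, here is a small user-mode sketch. The MOCK_ types mirror the shape of the DDK's DRVFN and DRVENABLEDATA structures but are stand-ins defined only for this example, the index values are arbitrary (the real INDEX_ constants come from the DDK headers), and FindDdiEntry and DemoLookup are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Mock types: same shape as the DDK structures, simplified for user mode. */
typedef void (*MOCK_PFN)(void);
typedef struct { unsigned long iFunc; MOCK_PFN pfn; } MOCK_DRVFN;
typedef struct {
    unsigned long iDriverVersion;
    unsigned long c;          /* number of entries in pdrvfn */
    MOCK_DRVFN   *pdrvfn;     /* table filled in by the driver */
} MOCK_DRVENABLEDATA;

/* Arbitrary index values, used only by this sketch. */
#define MOCK_INDEX_ENABLE_PDEV      1UL
#define MOCK_INDEX_GET_DDRAW_INFO   2UL
#define MOCK_INDEX_NOT_IMPLEMENTED  3UL

/* Return the table entry registered for iFunc, or NULL if the driver
   did not register that function. */
MOCK_DRVFN *FindDdiEntry(const MOCK_DRVENABLEDATA *pded, unsigned long iFunc)
{
    unsigned long i;

    assert(pded != NULL);
    if (pded->pdrvfn == NULL)
        return NULL;
    for (i = 0; i < pded->c; i++)
        if (pded->pdrvfn[i].iFunc == iFunc)
            return &pded->pdrvfn[i];
    return NULL;
}

/* Stand-ins for two functions a driver might register. */
static void StubEnablePDEV(void) {}
static void StubGetDirectDrawInfo(void) {}

/* Builds a two-entry table and returns 1 if the lookups behave as expected. */
int DemoLookup(void)
{
    MOCK_DRVFN table[] = {
        { MOCK_INDEX_ENABLE_PDEV,    StubEnablePDEV },
        { MOCK_INDEX_GET_DDRAW_INFO, StubGetDirectDrawInfo }
    };
    MOCK_DRVENABLEDATA ded = { 0, 2, table };
    const MOCK_DRVFN *hit  = FindDdiEntry(&ded, MOCK_INDEX_GET_DDRAW_INFO);
    const MOCK_DRVFN *miss = FindDdiEntry(&ded, MOCK_INDEX_NOT_IMPLEMENTED);

    return hit != NULL && hit->pfn == StubGetDirectDrawInfo && miss == NULL;
}
```

The search is by index rather than by position because a driver is free to list its functions in any order and to leave out the ones it does not implement.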

At this point, we are interested in a specific DDI function: "DrvGetDirectDrawInfo". This function is called as part of the DirectDraw initialization and in Windows 2000 the "DirectDraw initialization sequence is done at boot time and after each mode change" (see documentation for further details...). The prototype of this function is:

BOOL DrvGetDirectDrawInfo(
  IN DHPDEV  dhpdev,
  OUT DD_HALINFO  *pHalInfo,
  OUT DWORD  *pdwNumHeaps,
  OUT VIDEOMEMORY  *pvmList,
  OUT DWORD  *pdwNumFourCC,
  OUT DWORD  *pdwFourCC
  );

Looking through these parameters, the one relevant for our purposes is the second: "pHalInfo", of type "DD_HALINFO". The documentation states that the DD_HALINFO structure describes the capabilities of the hardware and driver and that it must be filled in by the aforementioned "DrvGetDirectDrawInfo" function.

typedef struct _DD_HALINFO{
  DWORD                  dwSize;
  VIDEOMEMORYINFO        vmiData;
  DDNTCORECAPS           ddCaps;
  PDD_GETDRIVERINFO      GetDriverInfo;
  DWORD                  dwFlags;
  LPVOID                 lpD3DGlobalDriverData;
  LPVOID                 lpD3DHALCallbacks;
  PDD_D3DBUFCALLBACKS    lpD3DBufCallbacks;
} DD_HALINFO, *PDD_HALINFO;

Interestingly, the second field of this structure contains exactly the type of information that we are looking for: the VIDEOMEMORYINFO structure has references to the current video resolution metrics and comprises a pointer to the virtual framebuffer that is mapped in kernel memory.

So the key to solving our initial problem seems to be this: hook DrvGetDirectDrawInfo, let the display driver execute its own implementation of the function and, immediately afterwards, get a pointer to the resulting, filled-in data structure...

FOLLOWING THE HOOK SEQUENCE

Everything begins in the "DriverEntry" function. Here we create our device object and the corresponding symbolic link, and we define our IRP handlers. This is pretty standard for NT drivers.

Something interesting happens at the end of this function: using the NT API "PsSetLoadImageNotifyRoutine", we set up a callback function that the system will call every time a system or user module is mapped into virtual memory.

Besides this, it is interesting to note that before setting up this callback, and so effectively starting the whole hook mechanism, the driver shows a message asking the user whether to proceed with the hook driver initialization. Remember that our driver is loaded at boot time, when there is no graphical interface to which we could delegate the display of our warning message. This forces us to draw the alert message directly in video memory and to poll the keyboard hardware (in this case we refer to PS/2 keyboards) for a predefined hotkey that will prevent "PsSetLoadImageNotifyRoutine" from being called.

This is accomplished by these two functions: WaitForKeyBeingPressed and OutputTextString. The first of the two functions has a pretty simple implementation:

NTSTATUS WaitForKeyBeingPressed( IN BYTE bKeyScanCode, IN ULONG ulElapseInHundredthsOfSec )
{
      LARGE_INTEGER                 liWaitElapse;
      ULONG                         i;
      BYTE                          bKeybPortByte;
 
      // Wait and Read the Keyboard Port.
 
      liWaitElapse = RtlConvertLongToLargeInteger( - 100000 ); // Ten Milliseconds.
 
      for ( i=0; i<ulElapseInHundredthsOfSec; i++ )
      {
            // Read from the Keyboard Port.
 
            __asm
            {
                  in          al, 0x64
                  test      al, 1
                  mov         al, 0
                  jz          _SkipKeybInputPortRead
 
                  in          al, 0x60
 
_SkipKeybInputPortRead:
 
                  mov         bKeybPortByte, al
            }
 
            // Check if we have to Exit.
 
            if ( bKeybPortByte == bKeyScanCode )
                  return STATUS_SUCCESS;
 
            // Wait.
 
            KeDelayExecutionThread( UserMode, FALSE, & liWaitElapse );
      }
 
      // Return to the Caller.
 
      return STATUS_UNSUCCESSFUL;
}

Here we poll the keyboard I/O ports, comparing the byte read with the scancode passed in: port 0x64 is the controller status port, whose bit 0 signals that a byte is waiting, and port 0x60 is the data port. This is a simplistic approach, but it is the one with the fewest complications and is the most appropriate in this case: we need a safe way to read from the keyboard that will not cause any problems on users' computers. In BugChecker, for example, the keyboard is read in a far more complex dedicated Interrupt Service Routine, which avoids polling altogether.
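The same polling logic can be sketched in portable C by hiding the port access behind a function pointer, which makes it possible to exercise the loop in user mode against a fake controller. PortReadFn, PollScanCode, WaitForScanCode, FakePortRead and DemoKeyboardPoll are hypothetical names used only in this illustration; unlike the driver routine, the sketch does not sleep between polls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KBD_DATA_PORT    0x60   /* scancodes are read from here            */
#define KBD_STATUS_PORT  0x64   /* controller status register              */
#define KBD_STATUS_OBF   0x01   /* bit 0: a byte is waiting in the buffer  */

/* Hypothetical abstraction over the IN instruction. */
typedef uint8_t (*PortReadFn)(uint16_t port);

/* Returns the pending scancode, or 0 if the controller has nothing for us. */
uint8_t PollScanCode(PortReadFn readPort)
{
    assert(readPort != NULL);
    if ((readPort(KBD_STATUS_PORT) & KBD_STATUS_OBF) == 0)
        return 0;
    return readPort(KBD_DATA_PORT);
}

/* Polls up to maxPolls times; returns 1 as soon as 'wanted' is seen.
   A driver would wait about ten milliseconds between two polls. */
int WaitForScanCode(PortReadFn readPort, uint8_t wanted, unsigned maxPolls)
{
    unsigned i;

    for (i = 0; i < maxPolls; i++) {
        uint8_t code = PollScanCode(readPort);
        if (code != 0 && code == wanted)
            return 1;
    }
    return 0;
}

/* Fake controller: 'A' pressed, 'A' released, then Esc pressed (set 1 codes). */
static const uint8_t g_fakeQueue[] = { 0x1E, 0x9E, 0x01 };
static size_t g_fakePos = 0;

static uint8_t FakePortRead(uint16_t port)
{
    size_t count = sizeof g_fakeQueue / sizeof g_fakeQueue[0];

    if (port == KBD_STATUS_PORT)
        return (g_fakePos < count) ? KBD_STATUS_OBF : 0;
    if (port == KBD_DATA_PORT && g_fakePos < count)
        return g_fakeQueue[g_fakePos++];
    return 0;
}

/* Returns 1 if Esc (0x01) is found and F1 (0x3B) is not. */
int DemoKeyboardPoll(void)
{
    int foundEsc, foundF1;

    g_fakePos = 0;
    foundEsc = WaitForScanCode(FakePortRead, 0x01, 10);
    foundF1  = WaitForScanCode(FakePortRead, 0x3B, 10);
    return foundEsc && !foundF1;
}
```

Note that every byte read from the data port is consumed, so keystrokes that do not match the hotkey are discarded; that is acceptable for a boot-time prompt but would not be for a general keyboard handler.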

The "OutputTextString" function writes to the text video memory exactly as we used to do in the old days of DOS. A more interesting function is the one that gives us a virtual memory pointer to the text video memory:

NTSTATUS InitializeTextVideoBufferPtr( VOID )
{
      PDEVICE_EXTENSION       extension = g_pDeviceObject->DeviceExtension;
      PVOID                         pvTextVideoBuffer;
      ULONG                         ulOutBufferSize;
 
      // Initialize the Pointer.
 
      if ( extension->pvTextVideoBuffer == NULL )
      {
#ifdef USE_PAGETABLE_TO_OBTAIN_TEXTMODE_VIDEOADDRESS
            if ( PhysAddressToLinearAddresses( & pvTextVideoBuffer, 1, & ulOutBufferSize, 0xB8000 ) == STATUS_SUCCESS &&
                  ulOutBufferSize == 1 )
            {
                  extension->pvTextVideoBuffer = pvTextVideoBuffer;
 
                  return STATUS_SUCCESS;
            }
            else
            {
                  return STATUS_UNSUCCESSFUL;
            }
#else
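            PHYSICAL_ADDRESS        paPhysAddr;
            paPhysAddr.QuadPart = 0xB8000;
            extension->pvTextVideoBuffer = MmMapIoSpace( paPhysAddr, 1, MmNonCached );

            // MmMapIoSpace returns NULL if the mapping fails.

            if ( extension->pvTextVideoBuffer == NULL )
                  return STATUS_UNSUCCESSFUL;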
#endif
      }
 
      // Return to the Caller.
 
      return STATUS_SUCCESS;
}

In this snippet, two different approaches are presented. The first one uses the "PhysAddressToLinearAddresses" function to search the page directory and the page tables for an existing mapping of the 0xB8000 physical address. The second approach, which is more "correct" in many ways, uses the NT "MmMapIoSpace" function to do the same job: this function does not search the page tables for the specified address (and so does not depend on a mapping already being present in order to return a valid, non-null pointer) but forces the system to map the address and return the corresponding virtual pointer.

Note that, in order to use this code to display messages in text video mode, the "/NOGUIBOOT" option must be specified in the boot.ini file for the currently running system installation. Otherwise the system will boot with the graphical splash screen and our message will probably end up as a small mess of pixels in the upper portion of the screen...
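The "OutputTextString" function is not listed in this article, so the following is only a sketch of the general technique it relies on, not the driver's code. In the standard 80x25 colour text mode each character cell is two bytes, the character code in the low byte and the colour attribute in the high byte. WriteTextAt is a hypothetical name, and the buffer is passed in as a parameter so that the routine can be tried against an ordinary array in user mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TEXT_COLS 80
#define TEXT_ROWS 25

/* Writes 'text' starting at (row, col), clipping at the end of the row.
   Each cell is a 16-bit value: attribute in the high byte, character in
   the low byte. Returns the number of characters actually written.
   In a driver, textBuffer would be the (volatile) pointer mapped onto
   physical address 0xB8000. */
size_t WriteTextAt(uint16_t *textBuffer, unsigned row, unsigned col,
                   const char *text, uint8_t attribute)
{
    size_t written = 0;

    assert(textBuffer != NULL && text != NULL);
    if (row >= TEXT_ROWS)
        return 0;

    while (*text != '\0' && col < TEXT_COLS) {
        textBuffer[row * TEXT_COLS + col] =
            (uint16_t)(((uint16_t)attribute << 8) | (uint8_t)*text);
        text++;
        col++;
        written++;
    }
    return written;
}
```

For example, writing "OK" with attribute 0x1F (bright white on blue) at row 1, column 2 stores 0x1F4F and 0x1F4B in cells 82 and 83 of the buffer.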

So, if the user doesn't press the hotkey, "PsSetLoadImageNotifyRoutine" is called and the hook mechanism is activated. From now on, our "VPCIceVideoImageCallback" is called at every image mapping operation, system-wide. Here we do nothing special: we check whether the name of the display driver has already been read from the registry (it is read only once and then cached for future reference), we check whether the image being mapped is our display driver, and then we check whether the "EnforceWriteProtection" setting is set to 0 in the registry, which allows writing to the read-only code sections of kernel images (this is required for installing our detour code; without it the system will bugcheck immediately at the write attempt).
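The "is this our display driver?" test boils down to comparing the file-name part of the mapped image path with the cached driver name, ignoring case. The sketch below shows one way to do that with plain C strings; the real callback receives a UNICODE_STRING, so this is a simplified illustration, and ImagePathMatchesDriver is a hypothetical helper rather than a function from the driver.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the last path component of fullImagePath equals
   driverFileName, compared case-insensitively; 0 otherwise. */
int ImagePathMatchesDriver(const char *fullImagePath, const char *driverFileName)
{
    size_t pathLen, nameLen, i;
    const char *tail;

    assert(fullImagePath != NULL && driverFileName != NULL);

    pathLen = strlen(fullImagePath);
    nameLen = strlen(driverFileName);
    if (nameLen == 0 || nameLen > pathLen)
        return 0;

    /* The candidate must start right after a backslash (or be the whole
       string), so that "xati2dvai.dll" does not match "ati2dvai.dll". */
    tail = fullImagePath + (pathLen - nameLen);
    if (tail != fullImagePath && tail[-1] != '\\')
        return 0;

    for (i = 0; i < nameLen; i++) {
        if (tolower((unsigned char)tail[i]) !=
            tolower((unsigned char)driverFileName[i]))
            return 0;
    }
    return 1;
}
```

Anchoring the comparison at a path separator matters because a plain "ends with" test would also accept any module whose name merely ends with the driver's name.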

Following the logic of the display driver initialization, the next step is to intercept the call that the system is about to make to the entry point of the driver module (remember that the module entry point is actually the display driver's "DrvEnableDriver" function). We need to know the virtual address of the entry point in the just-mapped module, which we obtain by calling this simple function:

PVOID GePeImageEntryPoint( PVOID pvPeImageStart )
{
      IMAGE_DOS_HEADER*             pidhImageDosHeader;
      IMAGE_NT_HEADERS*             pinhPeNtHdrs;
 
      __try
      {
            // Do the Requested Operation.
 
            pidhImageDosHeader = (IMAGE_DOS_HEADER*) pvPeImageStart;
 
            if ( pidhImageDosHeader->e_magic != IMAGE_DOS_SIGNATURE )
                  return NULL;
            if ( pidhImageDosHeader->e_lfarlc < 0x40 )
                  return NULL;
 
            pinhPeNtHdrs = (IMAGE_NT_HEADERS*) ( (BYTE*) pvPeImageStart + pidhImageDosHeader->e_lfanew );
 
            if ( pinhPeNtHdrs->Signature != IMAGE_NT_SIGNATURE )
                  return NULL;
 
            // Return the Information to the Caller.
 
            return (PVOID) ( (BYTE*) pvPeImageStart + pinhPeNtHdrs->OptionalHeader.AddressOfEntryPoint );
      }
      __except ( EXCEPTION_EXECUTE_HANDLER )
      {
            return NULL;
      }
}

As you can guess, we will allow the system to call the "DrvEnableDriver" function as expected but, using the Detours hooking library from Microsoft Research, we will insert ourselves in the execution flow just after "DrvEnableDriver" exits and just before control is returned to GDI. This works because the image callback provided by the NT image loader, and exposed to the programmer through the "PsSetLoadImageNotifyRoutine" API, is invoked at a point in the module load sequence where the DLL entry point has not been called yet.

The MS Research Detours Library (http://research.microsoft.com/sn/detours/) is a C++ library that allows the programmer to "detour" the normal execution of system or user functions to programmer-defined hook code. Later, at the programmer's discretion, execution can be resumed in the original function: this can be done at any point in the hook procedure. In our case, we are interested in intercepting the call to the aforementioned "DrvEnableDriver", calling the original display driver implementation as the first step of our hook function and then replacing the function pointer to the "DrvGetDirectDrawInfo" procedure in the "pdrvfn" array returned by the display driver.

I have used Detours in this project, and proposed it in this example, to cut my own development time and effort in this preliminary stage of BugChecker and to simplify the code of the example. Because of its licensing conditions, the Detours library cannot be used in commercial applications; I plan to replace it in this code with my own similar solution in the near future.

The job of the Detours library is, in our case, pretty simple. It disassembles the code at the entry point of the given function, one instruction at a time, until it has covered enough whole instructions to make room for the detour itself, which is an IA-32 near relative JMP instruction (opcode E9 followed by a 32-bit displacement) that takes 5 bytes of memory. The code bytes that will be overwritten by the JMP instruction are first copied into a structure named "trampoline". Right after this copied code, Detours places in the trampoline another JMP instruction that points to the original function just past the copied bytes, thus allowing execution to resume in the original implementation. This has to be done through a disassembler because Detours cannot take the risk of moving into the trampoline a number of bytes that ends in the middle of an IA-32 instruction: the result can be very destructive if the processor ever executes such truncated code.

To use the library in kernel code, I have removed from the original Detours sources several references to Win32 functions, introduced some syntax changes to the macros and converted the various C++ source files into a handy single C file (you can get this file by downloading the zip archive of this example at the bottom of this page).
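The byte-level bookkeeping behind a detour of this kind can be sketched as follows. This is not the Detours implementation: it is a simplified illustration that works on plain byte buffers with explicit 32-bit addresses, and it takes the number of prologue bytes to copy as a parameter instead of finding it with a disassembler. EncodeJmpRel32 and InstallDetour are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_SIZE 5u   /* opcode E9 + 32-bit displacement */

/* Writes "JMP rel32" into dst. dstAddr is the address at which these bytes
   will execute; the displacement is relative to the NEXT instruction. */
void EncodeJmpRel32(uint8_t *dst, uint32_t dstAddr, uint32_t targetAddr)
{
    uint32_t rel = targetAddr - (dstAddr + JMP_REL32_SIZE); /* wraps mod 2^32 */

    dst[0] = 0xE9;
    dst[1] = (uint8_t)(rel);
    dst[2] = (uint8_t)(rel >> 8);
    dst[3] = (uint8_t)(rel >> 16);
    dst[4] = (uint8_t)(rel >> 24);
}

/* Builds the trampoline and patches the function.
   'copied' is the size of the whole instructions at the start of 'func'
   that cover at least 5 bytes (the caller must have determined it).
   Returns 1 on success, 0 if the sizes do not fit. */
int InstallDetour(uint8_t *func, uint32_t funcAddr, size_t copied,
                  uint8_t *tramp, uint32_t trampAddr, size_t trampSize,
                  uint32_t hookAddr)
{
    assert(func != NULL && tramp != NULL);
    if (copied < JMP_REL32_SIZE || copied + JMP_REL32_SIZE > trampSize)
        return 0;

    /* 1. Save the original prologue in the trampoline. */
    memcpy(tramp, func, copied);

    /* 2. After it, jump back to the first untouched original instruction. */
    EncodeJmpRel32(tramp + copied, trampAddr + (uint32_t)copied,
                   funcAddr + (uint32_t)copied);

    /* 3. Overwrite the function entry with a jump to the hook. */
    EncodeJmpRel32(func, funcAddr, hookAddr);
    return 1;
}
```

As a worked example, take a function at address 0x1000 whose first three instructions occupy 6 bytes, a hook at 0x2000 and a trampoline at 0x3000. The entry patch is E9 FB 0F 00 00 (0x2000 - 0x1005 = 0xFFB), and the jump stored at trampoline offset 6 is E9 FB DF FF FF (0x1006 - 0x300B = 0xFFFFDFFB modulo 2^32). The sketch ignores one thing a real library has to handle: copied instructions that contain relative addresses need to be adjusted for their new location.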

So, at file scope, you define the trampoline (note that "DETOUR_TRAMPOLINE_GLOBVAR" is my own customization of the standard macro provided by the Detours library):

static PVOID            g_pfnDrvEnableDriver = NULL;
DETOUR_TRAMPOLINE_GLOBVAR( BOOL APIENTRY Trampoline_DrvEnableDriver( ULONG iEngineVersion, ULONG cj, DRVENABLEDATA *pded ), g_pfnDrvEnableDriver )

and then you can hook the function:

NTSTATUS HookDisplayDriverEntryPoint( PVOID pvDrvEnableDriverFnPtr )
{
      NTSTATUS          nsRetVal = STATUS_UNSUCCESSFUL;
      BOOLEAN                 bHookRes;
 
      // Do the Requested Operation.
 
      __try
      {
            g_pfnDrvEnableDriver = pvDrvEnableDriverFnPtr;
            bHookRes = DetourFunctionWithTrampoline( (PBYTE) Trampoline_DrvEnableDriver, (PBYTE) Hooked_DrvEnableDriver, NULL );
 
            if ( bHookRes )
                  nsRetVal = STATUS_SUCCESS;
            else
                  nsRetVal = STATUS_UNSUCCESSFUL;
      }
      __except ( EXCEPTION_EXECUTE_HANDLER )
      {
            nsRetVal = STATUS_UNSUCCESSFUL;
      }
 
      // Return to the Caller.
 
      return nsRetVal;
}

Note that the __try/__except clause is required because I have removed from the original Detours code the calls to the Win32 APIs that, in user mode, check whether the specified input addresses are valid for reading or writing. From now on, execution is detoured from the original "DrvEnableDriver" function to our hook function:

BOOL APIENTRY Hooked_DrvEnableDriver(
    ULONG          iEngineVersion,
    ULONG          cj,
    DRVENABLEDATA *pded )
{
      BOOL        bOriginalFnRetVal;

      // Call the Original Function.

      bOriginalFnRetVal = Trampoline_DrvEnableDriver( iEngineVersion, cj, pded );

      // ### ### ### HERE it can do whatever it wants with the filled pded->pdrvfn pointers array... ### ### ###

      // Return the Original Result to GDI.

      return bOriginalFnRetVal;
}

In the remaining part of the function, we iterate over the returned "pded->pdrvfn" array, searching for the entry of the display driver's DrvGetDirectDrawInfo function (each returned function pointer is identified by an index: in this case the identifier of the function we are looking for is INDEX_DrvGetDirectDrawInfo). When that entry is found, we simply replace its function pointer in the pdrvfn array with a pointer to our own hook of DrvGetDirectDrawInfo, saving the old pointer so that the hook can call the original implementation before grabbing the VIDEOMEMORYINFO data structure and notifying the clients.
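The pointer swap and the "call the original first, then grab the result" pattern can be illustrated with the following user-mode sketch. All types and functions here are mock-ups with a deliberately simplified signature (the hooked function just fills an int), the index values are arbitrary, and none of the names come from the driver's source.

```c
#include <assert.h>
#include <stddef.h>

/* Mock table entry: same shape as a DRVFN, simplified function signature. */
typedef int (*MOCK_PFN)(int *pOut);
typedef struct { unsigned long iFunc; MOCK_PFN pfn; } MOCK_DRVFN;

#define MOCK_INDEX_OTHER     1UL   /* arbitrary values for this sketch */
#define MOCK_INDEX_GET_INFO  2UL

static MOCK_PFN g_pfnOriginalGetInfo = NULL;  /* saved original pointer   */
static int      g_capturedWidth      = 0;     /* data grabbed by the hook */

/* Stand-ins for the display driver's own implementations. */
static int Driver_Other(int *pOut)   { *pOut = 0;    return 1; }
static int Driver_GetInfo(int *pOut) { *pOut = 1024; return 1; }

/* The hook: let the driver fill the output first, then take a copy. */
static int Hooked_GetInfo(int *pOut)
{
    int ok = g_pfnOriginalGetInfo(pOut);

    if (ok)
        g_capturedWidth = *pOut;   /* a driver would notify its clients here */
    return ok;
}

/* Replaces the pointer registered for iFunc and hands back the old one.
   Returns 1 if the entry was found, 0 otherwise. */
int ReplaceDdiEntry(MOCK_DRVFN *table, unsigned long count, unsigned long iFunc,
                    MOCK_PFN hook, MOCK_PFN *ppfnOriginal)
{
    unsigned long i;

    assert(table != NULL && hook != NULL && ppfnOriginal != NULL);
    for (i = 0; i < count; i++) {
        if (table[i].iFunc == iFunc) {
            *ppfnOriginal = table[i].pfn;
            table[i].pfn  = hook;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if a call through the patched table reaches the driver's
   function and the hook captures the value it produced. */
int DemoHookSwap(void)
{
    MOCK_DRVFN table[] = {
        { MOCK_INDEX_OTHER,    Driver_Other },
        { MOCK_INDEX_GET_INFO, Driver_GetInfo }
    };
    int value = 0;

    g_capturedWidth = 0;
    if (!ReplaceDdiEntry(table, 2, MOCK_INDEX_GET_INFO,
                         Hooked_GetInfo, &g_pfnOriginalGetInfo))
        return 0;
    if (!table[1].pfn(&value))     /* the "system" calls through the table */
        return 0;
    return value == 1024 && g_capturedWidth == 1024 &&
           table[0].pfn == Driver_Other;
}
```

Because the caller only ever goes through the table, swapping one pointer is enough: no code in the display driver has to be patched for this second hook.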

SAMPLE APPLICATION

Along with the source code of the BugCheckerVideo example, I have also included an MFC application that must be used to install the device driver the first time. From that point on, the driver will be loaded at OS boot time. You can use the same application to test the driver, by sending it an IOCTL that requests the current VIDEOMEMORYINFO structure.

This is the text that the application will show when the "ioctl" button is clicked:

fpPrimary = 0x0
dwFlags = 0x0
dwDisplayWidth = 0x400
dwDisplayHeight = 0x300
lDisplayPitch = 0x1000
ddpfDisplay.dwSize = 0x20
ddpfDisplay.dwFlags = 0x40
ddpfDisplay.dwFourCC = 0x0
ddpfDisplay.dwRGBBitCount = 0x20
ddpfDisplay.dwRBitMask = 0xFF0000
ddpfDisplay.dwGBitMask = 0xFF00
ddpfDisplay.dwBBitMask = 0xFF
ddpfDisplay.dwRGBAlphaBitMask = 0x0
dwOffscreenAlign = 0x40
dwOverlayAlign = 0x40
dwTextureAlign = 0x40
dwZBufferAlign = 0x40
dwAlphaAlign = 0x40
pvPrimary = 0xBCA3B000
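The values above describe a 1024x768 mode (0x400 by 0x300) at 32 bits per pixel, with a pitch of 0x1000 = 4096 bytes, that is 1024 pixels of 4 bytes each. The channel bit masks tell a client how to build a pixel value. The following sketch shows one possible way to do that; PackRgb and its helpers are hypothetical names, and the code assumes that each mask is a contiguous run of bits.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the lowest set bit of the mask (0 for an empty mask). */
static unsigned MaskShift(uint32_t mask)
{
    unsigned shift = 0;

    if (mask == 0)
        return 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        shift++;
    }
    return shift;
}

/* Number of contiguous set bits starting at the lowest set bit. */
static unsigned MaskWidth(uint32_t mask)
{
    unsigned width = 0;

    mask >>= MaskShift(mask);
    while (mask & 1u) {
        mask >>= 1;
        width++;
    }
    return width;
}

/* Scales an 8-bit channel value to the width of the mask and moves it
   into position. */
static uint32_t ScaleChannel(uint8_t value, uint32_t mask)
{
    unsigned width = MaskWidth(mask);
    uint32_t scaled;

    if (width == 0)
        return 0;
    scaled = (width >= 8) ? ((uint32_t)value << (width - 8))
                          : ((uint32_t)value >> (8 - width));
    return (scaled << MaskShift(mask)) & mask;
}

/* Builds a pixel value from 8-bit R, G, B components and the three masks. */
uint32_t PackRgb(uint8_t r, uint8_t g, uint8_t b,
                 uint32_t rMask, uint32_t gMask, uint32_t bMask)
{
    assert((rMask & gMask) == 0 && (rMask & bMask) == 0 && (gMask & bMask) == 0);
    return ScaleChannel(r, rMask) | ScaleChannel(g, gMask) | ScaleChannel(b, bMask);
}
```

With the masks reported above (0xFF0000, 0xFF00, 0xFF), the components 0x12, 0x34, 0x56 pack to 0x123456; with 16-bit 5-6-5 masks (0xF800, 0x07E0, 0x001F) pure red becomes 0xF800.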

DOWNLOAD

You can download the binaries and the source code from here (133KB).

Everything here (code, binaries, text, graphics, design, html) is © 2010 Vito Plantamura and VPC Technologies SRL (VATID: IT06203700965).
If you download something (compilable or not) from the site, you should read the license policy file.
If you want to contact me via email, write at this address.