CUSTOM READ/WRITE-PROCESSMEMORY IMPLEMENTATION
When dealing with code that is synchronized with the NT loader, it may be necessary to call a custom implementation of the ReadProcessMemory and WriteProcessMemory functions in order to circumvent an insidious deadlock problem.
INTRODUCTION
Several years ago, I developed a Windows 2000 kernel driver whose primary goal was to notify a user-mode counterpart when a particular executable image was started. To achieve this without resorting to unreliable polling, I had to use the well-documented "Process Structure Routines" described in the standard DDK documentation. I needed the notification because I had to patch the entry point and TLS callback routines of each new executable module, and the patches had to be applied strictly before the system could execute that code. The loader runs the TLS callbacks, DllMain and so on right after the image has been mapped in memory, and in the case of that software the system obviously had to execute my freshly modified code and not the "old" version.
So the driver installed both a process creation callback and an image load callback (through the PsSetCreateProcessNotifyRoutine and PsSetLoadImageNotifyRoutine routines). When a new PE module had just been mapped in memory, and before any of its code had executed, the image callback was called. My implementation took the image name passed as a parameter to the callback routine and transformed it into a file path that user mode could understand (among other things, by sending an IRP to the Mount Manager system component, i.e. IOCTL_MOUNTMGR_QUERY_POINTS, asking it to resolve the volume device name present in the original kernel path, and eventually by referencing an undocumented slot in the kernel TIB). Then, after signaling a user-mode event, the driver handed the new module name to the user-mode NT service for processing, along with the module's starting address in memory and the target process ID.
This is where the specific problem occurs. Upon receiving the kernel notification, the user-mode component had to open the newly created target process and perform the remote code injection and modification explained above. Meanwhile, the image callback routine (which is called as an integral part of process creation) was made to wait until the entire user-mode operation had completed. This forced synchronization with the system loader was required to prevent the system from executing the code mapped in the new modules before all the injected code and jumps were in place. It was achieved through a simple KeWaitForSingleObject call in the image callback routine, with a safety timeout so that the system loader (running, and stuck, in the context of the new process) would be unblocked if the user-mode service crashed or something went horribly wrong in the whole procedure. The problem is that, when you stop the loader in this way, calling ReadProcessMemory or WriteProcessMemory on the blocked process from a remote process causes a system-wide deadlock that prevents the system loader from being invoked by other system components and services (due to one or more locks, whose names I have since forgotten, that are held by the loader in the blocked thread).
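As a rough illustration of the timed wait in the image callback, here is a minimal sketch. It is not the original driver code: the names relative_timeout_from_ms, g_ServiceDoneEvent and WaitForServiceOrTimeout are hypothetical, and the event is assumed to be a KEVENT that the driver initialized elsewhere and that gets signaled once the user-mode service has finished patching. The only fixed facts it relies on are that KeWaitForSingleObject takes its timeout in 100-nanosecond units, that a negative value means "relative to now", and that the call returns STATUS_TIMEOUT when the interval expires. The sketch also assumes a compiler that provides stdint.h.

```c
#include <stdint.h>

/* Convert a millisecond interval into the form KeWaitForSingleObject
   expects for a relative timeout: 100-nanosecond units, negated.
   (Hypothetical helper, not part of the DDK.) */
static int64_t relative_timeout_from_ms(uint32_t milliseconds)
{
    return -((int64_t)milliseconds * 10000);
}

#ifdef _NTDDK_  /* kernel-only part: compiled only when ntddk.h was included first */

/* Hypothetical event, assumed to be set up with KeInitializeEvent in
   DriverEntry and signaled when the user-mode service is done. */
static KEVENT g_ServiceDoneEvent;

/* Block the calling thread (here: the loader, inside the image callback)
   until the service signals completion or the safety timeout expires.
   Returns STATUS_SUCCESS if signaled, STATUS_TIMEOUT otherwise. */
static NTSTATUS WaitForServiceOrTimeout(uint32_t milliseconds)
{
    LARGE_INTEGER timeout;

    timeout.QuadPart = relative_timeout_from_ms(milliseconds);
    return KeWaitForSingleObject(&g_ServiceDoneEvent,
                                 Executive,
                                 KernelMode,
                                 FALSE,      /* not alertable */
                                 &timeout);
}

#endif /* _NTDDK_ */
```

While the loader thread sits inside a wait like this one, it still holds its locks, which is what makes remote ReadProcessMemory and WriteProcessMemory calls on the blocked process dangerous.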
Resorting to a custom implementation of the ReadProcessMemory and WriteProcessMemory functions can resolve this deadlock. Here is a minimal implementation of a function almost analogous to ReadProcessMemory (it has to be inserted into a simple driver skeleton in order to be used):
case IOCTL_AVMKRNL_READ_PROCESS_MEM:
{
    // do the read operation
    if ( irpStack->Parameters.DeviceIoControl.InputBufferLength == sizeof( READPROCESSMEMORY_PARAMS ) &&
         irpStack->Parameters.DeviceIoControl.OutputBufferLength != 0 )
    {
        PEPROCESS pepProcess;
        NTSTATUS ntMyStatus;
        PREADPROCESSMEMORY_PARAMS prpmpParams = (PREADPROCESSMEMORY_PARAMS) Irp->AssociatedIrp.SystemBuffer;
        // try to obtain an EPROCESS pointer for the target process
        ntMyStatus = PsLookupProcessByProcessId(
            prpmpParams->ulProcess_ID,
            & pepProcess );
        if ( NT_SUCCESS( ntMyStatus ) )
        {
            // attach to the address space of the target process
            KeAttachProcess( pepProcess );
            // try to copy the memory into the system buffer of the IRP
            __try
            {
                RtlCopyMemory(
                    Irp->AssociatedIrp.SystemBuffer,
                    prpmpParams->pvSourceAddress,
                    irpStack->Parameters.DeviceIoControl.OutputBufferLength );
                ntStatus = STATUS_SUCCESS;
            }
            __except |