No one who knows what they're talking about would say that writing a debugger is easy. It's certainly made harder when the platform offers so many opportunities for things to go wrong. Here are two examples.

CreateToolhelp32Snapshot

This function was introduced to the Windows NT line in Windows 2000, though it existed as far back as Windows 95 in a separate DLL. On Windows NT-based systems, it calls into the ntdll RtlQueryProcessDebugInformation() function, which performs the majority of the work. Depending on the information that is requested, that function might inject a thread into the target process to gather that information.

This has the unintended consequence of resuming a suspended process. For example, calling CreateProcess() on myfile.exe with the CREATE_SUSPENDED flag, and then calling CreateToolhelp32Snapshot() with the process ID of myfile.exe, will cause myfile.exe to wake up and start running.

If a debugger has attached to the process, then Windows will create another thread that executes a breakpoint on behalf of the debugger. The problem is that when the process wakes up, the debug breakpoint will execute before the debugger can call WaitForDebugEvent() to intercept it.

This will typically cause the process to crash (though there are ways to intercept this and continue to run, no longer under the control of the debugger). One debugger is known to misbehave as a result of this bug.

Windows XP and later attempt to read from the process memory first. This attempt fails for a suspended process because it has not been completely initialised at that time. As a result, Windows XP and later do not create a new thread, so they do not demonstrate the problem.

CREATE_PROCESS_DEBUG_EVENT

When a process is started, a debugger typically wants to place a breakpoint at the main entrypoint. There are two common ways to locate this address.

The first way is to query the EntryPoint field of the module entry (an LDR_DATA_TABLE_ENTRY structure) that is reached through the InMemoryOrderModuleList list. Interestingly, we document this field as "unsupported", even though PSAPI.DLL uses it.

The second way is to wait for the CREATE_PROCESS_DEBUG_EVENT event to occur, and then to query the lpStartAddress field in the CREATE_PROCESS_DEBUG_INFO structure.  However, there is a problem with this second way. Windows has supported the relocation of EXE files since Windows 2000, though this fact has never been documented officially. With the introduction of Windows Vista and Address Space Layout Randomisation (ASLR), this "feature" came to be supported officially.

As a result, a file can be loaded to an address other than the one that it requested. One case in particular is when the requested address is intentionally invalid, such as zero or above 2 GB. This causes Windows to load the file to 0x10000. So far, so good.
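
To make the "intentionally invalid" case concrete, here is a minimal sketch in C that reads the ImageBase and AddressOfEntryPoint fields from a 32-bit PE header held in a buffer, and flags a requested base of zero or of 2 GB and above. The function and structure names are hypothetical, only PE32 (not PE32+) images are handled, and treating exactly 2 GB as out of range is an assumption of this sketch. The offsets follow the published PE format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the two header fields of interest. */
typedef struct {
    uint32_t image_base;        /* OptionalHeader.ImageBase (PE32) */
    uint32_t address_of_entry;  /* OptionalHeader.AddressOfEntryPoint */
} pe32_fields;

/* Read a little-endian 32-bit value without relying on host
   alignment or byte order. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer does not look like PE32. */
int pe32_read_fields(const uint8_t *buf, size_t len, pe32_fields *out)
{
    assert(out != NULL);
    if (buf == NULL || len < 0x40) return -1;
    if (buf[0] != 'M' || buf[1] != 'Z') return -1;

    uint32_t nt = rd32(buf + 0x3C);              /* e_lfanew */
    /* Need the signature (4 bytes), the file header (20 bytes) and
       the optional header up to and including ImageBase (32 bytes). */
    if (nt > len || len - nt < 4 + 20 + 32) return -1;
    if (memcmp(buf + nt, "PE\0\0", 4) != 0) return -1;

    const uint8_t *opt = buf + nt + 24;          /* optional header */
    if ((opt[0] | (opt[1] << 8)) != 0x10B) return -1;  /* PE32 magic */

    out->address_of_entry = rd32(opt + 16);
    out->image_base       = rd32(opt + 28);
    return 0;
}

/* Nonzero when the requested base cannot be honoured in a 2 GB
   user address space (assumption: 0x80000000 itself is invalid). */
int pe32_base_is_invalid(uint32_t image_base)
{
    return image_base == 0 || image_base >= 0x80000000u;
}
```

A caller would map or read the start of the file into a buffer, call pe32_read_fields(), and then pass the resulting image_base to pe32_base_is_invalid() to decide whether the file is likely to end up somewhere other than its requested address.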

The problem is that for such files, the lpStartAddress field in the CREATE_PROCESS_DEBUG_INFO structure contains the "expected" (and incorrect) entrypoint value, which is calculated by summing the values of two PE header fields: ImageBase and AddressOfEntryPoint.
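
The arithmetic behind the mismatch can be sketched in a few lines of C. The function names are hypothetical. The load base is passed in as a parameter; a debugger might take it from the lpBaseOfImage field of the same CREATE_PROCESS_DEBUG_INFO structure, though that suggestion is an assumption of this sketch and not something the text above states.

```c
#include <assert.h>
#include <stdint.h>

/* The value reported for such files, as described above:
   header ImageBase + AddressOfEntryPoint (wraps modulo 2^32). */
uint32_t expected_entry(uint32_t header_image_base,
                        uint32_t address_of_entry)
{
    return header_image_base + address_of_entry;
}

/* Where execution really begins: the address at which the image
   was loaded + AddressOfEntryPoint. This sketch assumes a nonzero
   load base. */
uint32_t actual_entry(uint32_t load_base, uint32_t address_of_entry)
{
    assert(load_base != 0);
    return load_base + address_of_entry;
}

/* Nonzero when a breakpoint placed at the expected address would
   not sit on the real entrypoint. */
int breakpoint_would_miss(uint32_t header_image_base,
                          uint32_t load_base,
                          uint32_t address_of_entry)
{
    return expected_entry(header_image_base, address_of_entry) !=
           actual_entry(load_base, address_of_entry);
}
```

For a file that requests an ImageBase of zero, has an AddressOfEntryPoint of 0x1000, and is loaded to 0x10000, the expected address is 0x1000 while the real entrypoint is at 0x11000.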

A breakpoint that a debugger places there will not be hit. If the debugger then resumes the process, the process will run freely. One debugger is known to misbehave as a result of this bug.

Such seemingly simple things, yet such potentially disastrous effects. That's why debugging malware is best left to the professionals. If you can't trust your debugger, whom can you trust?

- Peter Ferrie