Ring 0 debugging and Windbg – part 2
As promised earlier I am including some more savory Kernel debugging topics. Some of these scenarios are more corner case one-off ones.
Virtual Machine (VM) debugging
Debugging a virtual machine
You will need to enable debugging on the VM just like on a regular OS. You will also need to map one of the VM’s COM ports (COM1 or COM2) to a named pipe (\\.\pipe\<com1|com2|yourstring>). On the VM host OS you can then type this to attach to the debuggee:
windbg [-y SymbolPath] -k com:pipe,port=\\.\pipe\PipeName[,resets=0][,reconnect]
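If you attach to VMs often, it can help to generate this command line from a script. The following is a minimal sketch, not part of the debugger tooling: the function name kd_pipe_command and its parameters are hypothetical, and it only assembles the string shown above from its optional parts (it does not quote a symbol path that contains spaces).

```python
def kd_pipe_command(pipe_name, symbol_path=None, resets=None, reconnect=False):
    """Build a WinDbg command line for kernel debugging over a named pipe.

    Hypothetical helper: it only concatenates the options shown in the
    article and does not launch anything.
    """
    parts = ["windbg"]
    if symbol_path:
        parts += ["-y", symbol_path]
    # \\.\pipe\<name> is the Windows named-pipe namespace.
    connection = "com:pipe,port=\\\\.\\pipe\\" + pipe_name
    if resets is not None:
        connection += ",resets=%d" % resets
    if reconnect:
        connection += ",reconnect"
    parts += ["-k", connection]
    return " ".join(parts)


if __name__ == "__main__":
    print(kd_pipe_command("com1", resets=0, reconnect=True))
```

Called as above, it prints the full command with the resets and reconnect options appended to the pipe connection string.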
Virtual to virtual Machine debugging
This is the scenario where you use one VM running on the host to debug another VM running on the same host. It is an interesting setup and is useful whenever you may have to reimage the host OS, for example in a lab environment. I had to set this up to demonstrate kernel debugging for KMDF at WinHEC.
This is how I did it:
Created a named pipe on COM1, as follows, on the VM acting as host/debugger:
\\.\pipe\com1
Created a named pipe on COM2 on the debuggee:
\\.\pipe\com2
When I tried this with the COM1 port on the debuggee VM I couldn’t get it to work, but a named pipe on the COM2 port worked for me. If you had a different experience, please share it for others’ benefit.
Invoke the debugger on the host VM as:
windbg -k com:port=com2
Local (single-machine) debugging
In some cases you may not have access to another machine, or you may just want to look at some device state or read a global variable for a driver. In the absence of other applications or tools you can use the local debugger. On Vista and later you will need to enable debugging on the local machine:
bcdedit /debug on
followed by a reboot. See the documentation for how to edit the boot.ini file to do this on pre-Vista operating systems.
Then you can begin debugging with the following command:
C:\> kd -kl
However, it has limited use. You can’t set break points or check call stacks.
You can use it for checking the state of global variables in your drivers, etc. You could also use it for trying out commands or approaches when you don’t have another machine handy, for example when you are visiting a customer.
I am stuck, now what? How do I get my machine back?
At some point in your life, just by the law of averages :), you will come across a system that crashes so regularly, maybe even at boot, that you can’t get any work done or even log on. These are a few techniques you could use to get your system back.
Safe mode
Try booting into safe mode. The drivers loaded in safe mode are a subset of the drivers loaded in a normal boot, and most of them are well-tested critical boot drivers. This is especially useful if the failing driver is one of the non-safe-mode drivers, since it doesn’t get loaded in safe mode.
In some cases (more of a driver developer/test/verification scenario) the non-safe-mode driver may be failing a verifier check because you enabled verifier for all drivers on your machine, maybe to debug a corruption problem.
System restore
System restore helps you restore your system to a previous state. A thing to note is that driver binaries are not rolled back. It can be useful as it primarily reverts the registry to a prior well known state. If the state the system is in, involved the registry directly or indirectly, this is a good option to try.
The scenario I discussed earlier of enabling driver verifier to safe mode drivers and one such driver barfing could get your system to not boot even in the safe mode. It is unfortunate but you could find even safe mode drivers sometimes fail verifier checks.
Driver Verifier saves state in the registry so by reverting the registry (with system restore) you can disable the verifier and claim your machine back perhaps temporarily since the offending driver is still on your system. If you are lucky there is an update waiting for you from the IHV/ISV which fixes the issue flagged by driver verifier.
BIOS and disabling devices
Most BIOSes also have the option of disabling devices. If the offending driver is for a piece of hardware and you can figure out the name of the crashing driver binary (from the BSOD), you can try disabling that device in the BIOS. For example, this is useful for devices you couldn’t care less about, like a fingerprint reader (unless you actually use it) which has a buggy driver. Once the driver is fixed you can re-enable the device.
Kdfiles and Windbg
Kdfiles is an excellent way to replace a boot driver or another driver, especially if you are debugging or developing a driver. Windbg acts as a conduit for passing the bits of the driver you need to change on to the target machine. It can be used for boot drivers as well, except on Windows Vista.
For example:
kd> .kdfiles -m \systemroot\system32\driver.sys \myshare\checked\driver.sys
Crash dump
If all else fails, you may need a crash dump to pass on to the IHV for debugging. On Windows 7, a kernel crash dump, which is normally good enough for debugging, is enabled by default.
F10 trick to edit the boot parameters passed to the kernel at boot time
If you forgot to enable kernel debugging with bcdedit, you can always do it at boot time. Pressing F8 gives you the boot menu, which has an option to enable debugging. There is another way of doing it, but it needs precise timing: press the F10 key during boot and add one of the following lines to the boot debug options. Options set this way don’t persist across a reboot.
Serial – “/debug /debugport=comX /baudrate=115200”
1394 – “/debug /debugport=1394 /channel=[1-63]”
USB – “/debug /debugport=usb /targetname=String”
Getting a crash dump for a process without enabling the kernel debugger
If the machine is not booted in debug mode (so local kd cannot be used) and you can’t enable debugging and reboot for fear of losing the repro, you could try the following:
kdbgctrl -td <pid> <file>
This captures a kernel triage dump for the hung process, which should contain the IRP information needed to see whether any thread in the process is hung on an IRP. Kdbgctrl.exe is a tool from the debuggers package.
Debugging over USB-serial converter
When you set a target to use a COM port for debugging, the debugging engine takes over the port and drives it itself. This is why a COM port no longer appears in Device Manager when it is being used for debugging. Now let’s say you want to use a port on a USB device. When the system first boots, that port does not exist as far as the system hardware is concerned, so there is no such thing as COMX and it just doesn’t work. Ports on USB-serial converters do not get created until later on, when the PnP manager loads the driver for the USB device.
The kernel drives the debug port itself, so a port that has to be driven by a driver can’t be used as a debug port.
So debugging over USB-serial converters doesn’t work.
Accessing the registry from Windbg
Wouldn’t it be great if you could access registry values from windbg? That way you could disable driver verifier through the debugger, or set or unset some driver registry value. There are numerous scenarios in which this could be handy. You can actually do this with windbg. Let me warn you that it may require a little poking and digging around, as I show below.
The verifier stores its settings in the registry at:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\VerifyDrivers
For more information refer to http://support.microsoft.com/kb/244617
If we could access this from the debugger we could actually turn off verifier using windbg.
Before we dig into the registry I would recommend you to read the following article from Mark Russinovich which explains the internals of the registry: http://technet.microsoft.com/en-us/libr ... 50583.aspx
What is the system registry?
The registry holds system information, such as configuration data, for both hardware and software. The registry is broken up into “Hives” which are registry files for parts of the registry. For example, the Software hive is where the HKEY_LOCAL_MACHINE\Software information is kept. The Hives are broken up into “Bins” and the Bins are broken up into “Cells” which hold the registry key and value data.
The diagram below which is borrowed from the above article shows the layout of the registry.
[Diagram: layout of the registry]
How is this information managed?
The registry subsystem maintains the registry on a Hive basis. That is how the registry keeps track of open registry files. The open Hives in the system can be displayed by using the debugger extension “!reg hivelist”.
Below is an example output:
lkd> !reg hivelist
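--------------------------------------------------------------------------------------------------------------
| HiveAddr |Stable Length|Stable Map|Volatile Length|Volatile Map|MappedViews|PinnedViews|U(Cnt)| BaseBlock | FileName
--------------------------------------------------------------------------------------------------------------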
| 8bc0c2e8 | 1000 | 8bc0c364 | 1000 | 8bc0c4a0 | 0 | 0 | 0| 8bc0e000 | <UNKNOWN>
| 8bc1a5a0 | dbb000 | 8bc22000 | 44000 | 8bc1a758 | 0 | 0 | 0| 8bc1c000 | SYSTEM
| 8bc3f678 | 14000 | 8bc3f6f4 | 7000 | 8bc3f830 | 0 | 0 | 0| 8bc48000 | <UNKNOWN>
| 8bd6d9d0 | 8f000 | 8bd6da4c | 1000 | 8bd6db88 | 0 | 0 | 0| 901cb000 | temRoot\System32\Config\DEFAULT
| 8bd811f8 | 3c85000 | 901a3000 | 3000 | 8bd813b0 | 0 | 0 | 0| 8bcdd000 | emRoot\System32\Config\SOFTWARE
| 8bd85008 | 6000 | 8bd85084 | 0 | 00000000 | 0 | 0 | 0| 8bd05000 | <UNKNOWN>
| 8bd87008 | 8000 | 8bd87084 | 1000 | 8bd871c0 | 0 | 0 | 0| 8bd8e000 | <UNKNOWN>
| 911f53e8 | 3e000 | 911f5464 | a000 | 911f55a0 | 0 | 0 | 0| 95661000 | files\NetworkService\NTUSER.DAT
| 956e2798 | 3d000 | 956e2814 | 0 | 00000000 | 0 | 0 | 0| 956e3000 | rofiles\LocalService\NTUSER.DAT
| 9ec258b8 | 406000 | 9ec3d000 | d000 | 9ec25a70 | 0 | 0 | 0| 9ec27000 | \??\D:\Users\vmanan\ntuser.dat
| a0003650 | 3b6000 | a0073000 | 0 | 00000000 | 0 | 0 | 0| a007b000 | \Microsoft\Windows\UsrClass.dat
| a89d0008 | 861000 | a89d8000 | 0 | 00000000 | 0 | 0 | 0| a89d2000 | Volume Information\Syscache.hve
| d1d0c3b0 | b000 | d1d0c42c | 0 | 00000000 | 0 | 0 | 0| b0bc3000 | Device\HarddiskVolume1\Boot\BCD
--------------------------------------------------------------------------------------------------------------
Before we can access the registry, we need to know where to begin, so we need the root block.
In the above case I pick the root block for the SYSTEM hive. Below I try to dump a value under the “HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Session Manager\Memory Management\” key, namely “SystemPages”.
lkd> !reg baseblock 8bc1a5a0
FileName : SYSTEM
Signature: HBASE_BLOCK_SIGNATURE
Sequence1: 84a7
Sequence2: 84a7
TimeStamp: 1ca3f22 34d2fd1c
Major : 1
Minor : 5
Type : HFILE_TYPE_PRIMARY
Format : HBASE_FORMAT_MEMORY
RootCell : 20 -> Index of the root cell
Length : dbb000
Cluster : 1
CheckSum : 4bd3452c
Since everything in the registry is represented as a cell, we use the cell index of the root cell to calculate its cell address. The cell index is broken down into a map directory offset, a map table offset, and a block offset to access registry data. The debugger extension !reg cellindex does this for us automatically and returns the cell address.
lkd> !reg cellindex 8bc1a5a0 20
Map = 8bc22000 Type = 0 Table = 0 Block = 0 Offset = 20
MapTable = 8bc23000
BlockAddress = 937d3000
pcell: 937d3024
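To make the breakdown concrete, here is a small sketch of how a cell index could be split into the fields that !reg cellindex prints. The bit layout (1 bit for stable/volatile storage, 10 bits for the table, 9 bits for the block, 12 bits for the offset) is an assumption based on published descriptions of the registry format rather than something stated in this article. The extra 4 bytes in cell_address are inferred from the output above, where pcell equals BlockAddress + Offset + 4. The names decode_cell_index and cell_address are made up for illustration.

```python
from collections import namedtuple

# Fields as printed by "!reg cellindex": Type, Table, Block, Offset.
CellIndex = namedtuple("CellIndex", "storage table block offset")


def decode_cell_index(index):
    """Split a 32-bit cell index into its fields (assumed bit layout)."""
    return CellIndex(
        storage=(index >> 31) & 0x1,   # 0 = stable, 1 = volatile
        table=(index >> 21) & 0x3FF,   # map directory entry
        block=(index >> 12) & 0x1FF,   # map table entry
        offset=index & 0xFFF,          # byte offset inside the block
    )


def cell_address(block_address, index):
    """Address of the cell data, skipping an assumed 4-byte size field."""
    return block_address + (index & 0xFFF) + 4


if __name__ == "__main__":
    print(decode_cell_index(0x20))
    print(hex(cell_address(0x937D3000, 0x20)))  # 0x937d3024
```

For the root cell index 0x20 and the BlockAddress 937d3000 shown above, this reproduces the pcell value 937d3024. The block address itself still has to come from the hive’s map, which is why the debugger extension is needed for that part.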
The !reg subkeylist command takes the hive address and a key node address and lists that key’s sub-keys. It breaks down the key node and does all the work.
lkd> !reg subkeylist 8bc1a5a0 937d3024
Dumping SubkeyList of Key <CMI-CreateHive{3FE9C764-973F-45AC-B646-C7564C5B6CAD}> :
SubKeyCount[Stable ]: 0x7
SubKeyLists[Stable ]: 0xaf1f30
SubKeyCount[Volatile]: 0x1
SubKeyLists[Volatile]: 0x80000178
[ 7] Stable SubKeys:
[Idx] [SubKeyAddr] [SubKeyName]
[0] 937d3164 ControlSet001
[1] 937d3164 ControlSet001
[2] 937d3164 ControlSet001
[3] 937d3164 ControlSet001
[4] 937d3164 ControlSet001
[5] 937d3164 ControlSet001
[6] 937d3164 ControlSet001
[ 1] Volatile SubKeys:
[Idx] [SubKeyAddr] [SubKeyName]
[0] 8bc31024 CurrentControlSet
Use ‘!reg knode <SubKeyAddr>’ to dump the key
What the list points to depends on the number of sub-keys. The registry uses an index mechanism to access the sub-key data, and the type of index is determined from its type code. A type code of 0x686c (“lh”) indicates a CM_KEY_HASH_LEAF, where each entry contains the cell index of a key and a hash of the key name. Another example is a CM_KEY_FAST_LEAF, where the first four characters of the key name are used instead of a hash. Which one is used depends on the version of the hive. Also, when there are many sub-keys, the index can be a CM_KEY_INDEX_ROOT, which contains cell indexes that point to leaves.
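As a rough illustration of these index types, the sketch below parses the raw bytes of a sub-key index cell. It assumes a 2-byte signature followed by a 2-byte entry count, then 8-byte entries (cell index plus hash or name hint) for the leaf types and 4-byte entries for the index root. The “lf” and “ri” signatures and the exact layout are assumptions drawn from general descriptions of the hive format, not from this article, and parse_subkey_index is a hypothetical name. The sample buffer is hand-built for the example.

```python
import struct

# Signature bytes as they appear in memory; 0x686c read little-endian is "lh".
SIGNATURES = {
    b"lh": "CM_KEY_HASH_LEAF",
    b"lf": "CM_KEY_FAST_LEAF",   # assumed signature
    b"ri": "CM_KEY_INDEX_ROOT",  # assumed signature
}


def parse_subkey_index(cell_data):
    """Return (kind, entries) for the raw bytes of a sub-key index cell."""
    signature, count = struct.unpack_from("<2sH", cell_data, 0)
    kind = SIGNATURES.get(signature, "unknown")
    entries = []
    if signature in (b"lh", b"lf"):
        # Each entry: 4-byte cell index + 4-byte hash (lh) or name hint (lf).
        for i in range(count):
            cell_index, hint = struct.unpack_from("<I4s", cell_data, 4 + 8 * i)
            entries.append((cell_index, hint))
    elif signature == b"ri":
        # Each entry: 4-byte cell index of another leaf.
        for i in range(count):
            (leaf_index,) = struct.unpack_from("<I", cell_data, 4 + 4 * i)
            entries.append((leaf_index, None))
    return kind, entries


if __name__ == "__main__":
    # Hand-built sample: an "lh" list with two entries.
    sample = struct.pack("<2sH", b"lh", 2)
    sample += struct.pack("<II", 0x00018E30, 0xD9DC4F27)
    sample += struct.pack("<II", 0x00101220, 0x0041D336)
    kind, entries = parse_subkey_index(sample)
    print(kind, [hex(index) for index, _ in entries])
```

In a live session the bytes would come from dumping memory at the cell address of the SubKeyLists value.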
You’ll see that some hives are volatile and don’t have associated files. The system creates and manages these hives entirely in memory; the hives are therefore temporary in nature. The system creates volatile hives every time the system boots. An example of a volatile hive is the HKEY_LOCAL_MACHINE\HARDWARE hive, which stores information regarding physical devices and the devices’ assigned resources. Resource assignment and hardware detection occur every time the system boots, so not storing this data on disk is logical.
You can use !reg subkeylist recursively until you hit a roadblock where not all the child keys show up, in which case we resort to manual means. Since we have ControlSet001, let’s try to go down to its child key “Control”.
lkd> !reg subkeylist 8bc1a5a0 937d3164
Dumping SubkeyList of Key <ControlSet001> :
SubKeyCount[Stable ]: 0x5
SubKeyLists[Stable ]: 0x2ccbb0
SubKeyCount[Volatile]: 0x0
SubKeyLists[Volatile]: 0xffffffff
[ 5] Stable SubKeys:
[Idx] [SubKeyAddr] [SubKeyName]
[0] 937d3634 Control
[1] 937d3634 Control
[2] 937d3634 Control
[3] 937d3634 Control
[4] 937d3634 Control
[ 0] Volatile SubKeys:
Use ‘!reg knode <SubKeyAddr>’ to dump the key
Trying again recursively to dump the sub-keys didn’t work, as the debugger extension showed only ACPI for all of the 84 sub-keys.
lkd> !reg subkeylist 8bc1a5a0 937d3634
Dumping SubkeyList of Key <Control> :
SubKeyCount[Stable ]: 0x54
SubKeyLists[Stable ]: 0x12a968
SubKeyCount[Volatile]: 0x3
SubKeyLists[Volatile]: 0x8001d930
[ 84] Stable SubKeys:
[Idx] [SubKeyAddr] [SubKeyName]
[0] 93732884 ACPI
[1] 93732884 ACPI
[2] 93732884 ACPI
[3] 93732884 ACPI
[4] 93732884 ACPI
[5] 93732884 ACPI
[6] 93732884 ACPI
[7] 93732884 ACPI
[8] 93732884 ACPI
[9] 93732884 ACPI
[10] 93732884 ACPI
[11] 93732884 ACPI
[12] 93732884 ACPI
[13] 93732884 ACPI
…
[79] 93732884 ACPI
[80] 93732884 ACPI
[81] 93732884 ACPI
[82] 93732884 ACPI
[83] 93732884 ACPI
[ 3] Volatile SubKeys:
[Idx] [SubKeyAddr] [SubKeyName]
[0] 8bc31484 hivelist
[1] 8bc31484 hivelist
[2] 8bc31484 hivelist
Use ‘!reg knode <SubKeyAddr>’ to dump the key
Since I was interested in getting to the address of “Session Manager”, I decided to do this manually. On my particular system I looked up the position of Session Manager under Control, and it was the 60th key in alphabetical order.
I took a guess that the keys in the list are arranged alphabetically, so if you have a rough idea of a key’s position you can calculate an offset to start looking at. The alphabetical position displayed in the registry editor doesn’t map exactly to the position in the key list, but I have been lucky most times in getting close.
Let me warn you that your mileage may vary, but instead of going brute force and looking at every key, this approach can be fast if you get lucky. So I looked at position 60. Each sub-key entry takes 8 bytes, so it was simple math:
Offset of Session Manager = address of the Control key’s sub-key list + offset to sub-keys + (60+1) * 8 bytes
The offset to the sub-keys, found by trial and error, is 4.
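The arithmetic can be written out as a tiny helper. This is only a sketch of the calculation described here: the name subkey_entry_address is made up, and reading the 4-byte offset as the list header (signature plus count) is my assumption, since the article only says it was found by trial and error. The list address 936a996c is the one used in the command that follows.

```python
LIST_HEADER_BYTES = 4   # offset to the first entry, found by trial and error
ENTRY_BYTES = 8         # each sub-key entry takes 8 bytes


def subkey_entry_address(list_cell_address, entry_number):
    """Address of the given entry inside a sub-key list."""
    return list_cell_address + LIST_HEADER_BYTES + entry_number * ENTRY_BYTES


if __name__ == "__main__":
    # Session Manager: guessed position 60, plus one.
    print(hex(subkey_entry_address(0x936A996C, 60 + 1)))  # 0x936a9b58
```

The result, 936a9b58, is the first address shown in the memory dump below, and (60+1) * 8 is the 0x1e8 that appears in the command.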
lkd> dc 936a996c+0x4+0x1e8
936a9b58 00018e30 d9dc4f27 00101220 0041d336 0…’O.. …6.A.
936a9b68 0000c8e0 5f6aaac2 0012acd0 0001c805 ……j_……..
936a9b78 0012ad80 fc209d70 00281b80 bb4d8ce3 ….p. …(…M.
936a9b88 00281950 f964a859 000a4020 dfc3fe3c P.(.Y.d. @..<…
936a9b98 00281a60 95e28f5b 000367d0 c0743deb `.(.[….g…=t.
936a9ba8 000fdba8 3d165065 002fd5f0 70574717 ….eP.=../..GWp
936a9bb8 0013ebd0 25abb786 000988a8 0001d599 …….%……..
936a9bc8 000c4ba0 09d542e7 0003d048 2ed17f8e .K…B..H…….
lkd> !reg cellindex 8bc1a5a0 00018e30
Map = 8bc22000 Type = 0 Table = 0 Block = 18 Offset = e30
MapTable = 8bc23000
BlockAddress = 937bb000
pcell: 937bbe34
lkd> !reg knode 937bbe34
Signature: CM_KEY_NODE_SIGNATURE (kn)
Name : Session Manager
ParentCell : 0x630
Security : 0xfc10 [cell index]
Class : 0xffffffff [cell index]
Flags : 0x20
MaxNameLen : 0x2a
MaxClassLen : 0x0
MaxValueNameLen : 0x3c
MaxValueDataLen : 0x2e
LastWriteTime : 0x 1ca319c:0x2470026e
SubKeyCount[Stable ]: 0xf
SubKeyLists[Stable ]: 0xaf8c0
SubKeyCount[Volatile]: 0x0
SubKeyLists[Volatile]: 0xffffffff
ValueList.Count : 0xe
ValueList.List : 0x3b8db8
After I got to Session Manager, the same dilemma presented itself on the way to the “Memory Management” sub-key: only the first sub-key was visible and the others weren’t, so I had to go sub-key hunting again :).
lkd> !reg subkeylist 8bc1a5a0 937bbe34
Dumping SubkeyList of Key <Session Manager> :
SubKeyCount[Stable ]: 0xf
SubKeyLists[Stable ]: 0xaf8c0
SubKeyCount[Volatile]: 0x0
SubKeyLists[Volatile]: 0xffffffff
[ 15] Stable SubKeys:
[Idx] [SubKeyAddr] [SubKeyName]
[0] 937bbe94 AppCompatCache
[1] 937bbe94 AppCompatCache
[2] 937bbe94 AppCompatCache
[3] 937bbe94 AppCompatCache
[4] 937bbe94 AppCompatCache
[5] 937bbe94 AppCompatCache
[6] 937bbe94 AppCompatCache
[7] 937bbe94 AppCompatCache
[8] 937bbe94 AppCompatCache
[9] 937bbe94 AppCompatCache
[10] 937bbe94 AppCompatCache
[11] 937bbe94 AppCompatCache
[12] 937bbe94 AppCompatCache
[13] 937bbe94 AppCompatCache
[14] 937bbe94 AppCompatCache
[ 0] Volatile SubKeys:
Use ‘!reg knode <SubKeyAddr>’ to dump the key
In this case at least the challenge was a little smaller, since I was dealing with only 15 sub-keys instead of the 84 earlier. So, just like earlier, first get the cell address of the sub-key list.
lkd> !reg cellindex 8bc1a5a0 0xaf8c0
Map = 8bc22000 Type = 0 Table = 0 Block = af Offset = 8c0
MapTable = 8bc23000
BlockAddress = 93724000
pcell: 937248c4
Now let’s go through this list and use the position of the “Memory Management” sub-key, which happens to be 11 on my particular system. By trial and error I found that the correct offset was 0x54 bytes for “Memory Management”.
lkd> dc 937248c4+0x54
93724918 00098068 b76d431e 00040910 092eb60d h….Cm………
93724928 00304c98 122bab7f 000af950 381a2f7e .L0…+.P…~/.8
93724938 000af178 0001dd10 00000000 00000000 x……………
93724948 00000000 00000000 ffffffa0 00206b6e …………nk .
93724958 03a3c1c6 01ca3ef7 00000000 00018e30 …..>……0…
93724968 00000000 00000001 ffffffff 80035830 …………0X..
93724978 00000007 000b0340 0000fc10 ffffffff ….@………..
93724988 0000000a 00000000 00000010 00000246 …………F…
To confirm it, let’s calculate the cell address and check the knode.
lkd> !reg cellindex 8bc1a5a0 00098068
Map = 8bc22000 Type = 0 Table = 0 Block = 98 Offset = 68
MapTable = 8bc23000
BlockAddress = 9373b000
pcell: 9373b06c
lkd> !reg knode 9373b06c
Signature: CM_KEY_NODE_SIGNATURE (kn)
Name : Memory Management
ParentCell : 0x18e30
Security : 0x3a68c8 [cell index]
Class : 0xffffffff [cell index]
Flags : 0x20
MaxNameLen : 0x24
MaxClassLen : 0x0
MaxValueNameLen : 0x30
MaxValueDataLen : 0x2a
LastWriteTime : 0x 1ca3ef7:0x 3a88487
SubKeyCount[Stable ]: 0x2
SubKeyLists[Stable ]: 0x2a0c48
SubKeyCount[Volatile]: 0x0
SubKeyLists[Volatile]: 0xffffffff
ValueList.Count : 0xf
ValueList.List : 0x303f70
What about Values?
Keynodes have values as well. Now that we have got the actual sub-key we need to get the Value from the sub-key. This task is similar to sub-key hunting as each sub-key maintains a sub-key list and a value list.
Just like with subkeylist, there is also a valuelist debugger extension:
lkd> !reg valuelist 8bc1a5a0 9373b06c
Dumping ValueList of Key <Memory Management> :
[Idx] [ValAddr] [ValueName]
[ 0] 9373b1a4 ClearPageFileAtShutdown
[ 1] 9373b204 DisablePagingExecutive
[ 2] 9373b374 LargeSystemCache
[ 3] 9373b39c NonPagedPoolQuota
[ 4] 9373b3cc NonPagedPoolSize
[ 5] 9373b3f4 PagedPoolQuota
[ 6] 9373b43c PagedPoolSize
[ 7] 9373b4e4 SecondLevelDataCache
[ 8] 9373b53c SessionPoolSize
[ 9] 9373b564 SessionViewSize
[ a] 9373b514 SystemPages
[ b] 93724f9c PagingFiles
[ c] 935077ec PhysicalAddressExtension
[ d] 9cd4ffcc IOPageLockLimit
[ e] 9cd0d434 ExistingPageFiles
Use ‘!reg kvalue <ValAddr>’ to dump the value
lkd> !reg kvalue 9373b514
Signature: CM_KEY_VALUE_SIGNATURE (kv)
Name : SystemPages {compressed}
DataLength: 80000004
Data : c3000 [cell index]
Type : 4
In our case the value of SystemPages is 0xc3000. You can dump the node at address 9373b514 directly and change the value with the “ed” command. The actual value is at offset 0x8.
lkd> dc 9373b514
9373b514 000b6b76 80000004 000c3000 00000004 vk…….0……
9373b524 00090001 74737953 61506d65 00736567 ….SystemPages.
9373b534 00098538 ffffffd8 000f6b76 80000004 8…….vk……
9373b544 00000004 00000004 00000001 73736553 …………Sess
9373b554 506e6f69 536c6f6f 00657a69 ffffffd8 ionPoolSize…..
9373b564 000f6b76 80000004 00000030 00000004 vk……0…….
9373b574 00000001 73736553 566e6f69 53776569 ….SessionViewS
9373b584 00657a69 ffffffe0 00086b76 00000042 ize…..vk..B…
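To see why the data sits at offset 0x8, it helps to decode the first few dwords of the dump as a key value (“vk”) record. The sketch below does that for the bytes at 9373b514. The field order used (signature, name length, data length, data, type, flags, spare, name) and the reading of the high bit of DataLength as “data stored inline” are assumptions based on general descriptions of the hive format, and parse_key_value is a hypothetical helper, not a debugger API.

```python
import struct

VALUE_HEADER = "<2sHIIIHH"                         # assumed vk layout
HEADER_BYTES = struct.calcsize(VALUE_HEADER)       # 20 bytes before the name


def parse_key_value(raw):
    """Decode the start of a key value cell into a dict of its fields."""
    signature, name_len, data_len, data, value_type, flags, _spare = \
        struct.unpack_from(VALUE_HEADER, raw, 0)
    name_bytes = raw[HEADER_BYTES:HEADER_BYTES + name_len]
    # Flag bit 0 marks a compressed (single-byte) name.
    name = name_bytes.decode("latin-1" if flags & 1 else "utf-16-le")
    return {
        "signature": signature,
        "name": name,
        "inline": bool(data_len & 0x80000000),  # assumed meaning of high bit
        "data_length": data_len & 0x7FFFFFFF,
        "data": data,                            # located at byte offset 8
        "type": value_type,                      # 4 is REG_DWORD
    }


if __name__ == "__main__":
    # First eight dwords of "dc 9373b514" from the dump above.
    dwords = [0x000B6B76, 0x80000004, 0x000C3000, 0x00000004,
              0x00090001, 0x74737953, 0x61506D65, 0x00736567]
    value = parse_key_value(struct.pack("<8I", *dwords))
    print(value["name"], hex(value["data"]))
```

Under these assumptions the record decodes to the name SystemPages with data 0xc3000, which agrees with the !reg kvalue output and with the third dword of the dump.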
Now you can change the value of SystemPages directly using “ed”.
lkd> ed 9373b514+0x8 <new value>
Just like with sub-keys, sometimes the debugger extension may not show all the values, in which case you can do it manually just like earlier. First let’s get the cell address of the value list.
lkd> !reg cellindex 8bc1a5a0 0x303f70
Map = 8bc22000 Type = 0 Table = 1 Block = 103 Offset = f70
MapTable = 8bc25000
BlockAddress = 934d0000
pcell: 934d0f74
Since our value list has 15 values, let’s dump the cell indexes of all of them.
lkd> dc 934d0f74 l 0xf
934d0f74 000981a0 00098200 00098370 00098398 ……..p…….
934d0f84 000983c8 000983f0 00098438 000984e0 ……..8…….
934d0f94 00098538 00098560 00098510 000aff98 8…`………..
934d0fa4 002cc7e8 00c72fc8 00cb3430 ..,../..04..
The value “SystemPages” is the last one of the 15, but by trial and error I found that “SystemPages” in the value list was not close to position 15, so it was pretty much manually poking around at each value’s cell index. I found it at position 11.
lkd> !reg kvalue 9373b514
Signature: CM_KEY_VALUE_SIGNATURE (kv)
Name : SystemPages {compressed}
DataLength: 80000004
Data : c3000 [cell index]
Type : 4
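Instead of poking at each entry by hand, the position can also be found by working backwards from a known value address. The sketch below is only an illustration: find_value_slot is a made-up name, the cell indexes are the 15 dwords from the value list dump above, and block 98 with BlockAddress 9373b000 is taken from the earlier !reg cellindex output for the Memory Management key. It assumes the value lives in table 0 and that the cell data starts 4 bytes after the cell’s offset in the block, which is how the pcell values in this article relate to their block addresses.

```python
# Cell indexes from "dc 934d0f74 l 0xf" above.
VALUE_LIST = [
    0x000981A0, 0x00098200, 0x00098370, 0x00098398,
    0x000983C8, 0x000983F0, 0x00098438, 0x000984E0,
    0x00098538, 0x00098560, 0x00098510, 0x000AFF98,
    0x002CC7E8, 0x00C72FC8, 0x00CB3430,
]


def find_value_slot(cell_indexes, block_number, block_address, value_address):
    """Return the 0-based position of a value in the list.

    Rebuilds the cell index from the block number and the value's address
    (assuming table 0 and a 4-byte cell size field), then looks it up.
    """
    offset = value_address - block_address - 4
    wanted = (block_number << 12) | offset
    return cell_indexes.index(wanted)


if __name__ == "__main__":
    slot = find_value_slot(VALUE_LIST, 0x98, 0x9373B000, 0x9373B514)
    print(slot)  # 10, i.e. the 11th entry
```

The 0-based slot 10 matches both the [ a] index in the !reg valuelist output and the 11th position found by hand.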