Saturday, August 8, 2015

Writing a Hypervisor for Kernel Mode Code Analysis and Fun

In this entry, I am going to share some tips to develop your own hypervisors using VMware Workstation and briefly introduce a sample ring-1 monitoring tool based on a home made hypervisor. This entry may not be for you if you already have your own hypervisors you can update at your disposal or are not interested in development at all.  

Motivation

You know that hypervisors are helpful for dynamic analysis, and most of you use them in some forms. However, I guess not many of researchers have ever written by your self might be because it sounds challenging, while it can be quite handy and fun to have ring-1 monitoring tools you can update however you want. You may, for example, want to detect and monitor PatchGuard contexts a piece of kernel mode code that relocates itself onto a random memory location and performs some uncommon operations such as disabling write protection with modifying the CR0, clearing hardware breakpoints with resetting the DR7 and accessing IDTR using the SIDT and LIDT instructions. Without having hypervisors, you are not able to detect any of those operations as those are just instructions, and you do not know where to set breakpoints due to periodic relocation. But if you wish, do hypervisors help you accomplish it.

Now, you may wonder why you want to write your own hypervisors even though there are quite a few open source projects you could re-purpose. The reason is "it sounds better if you say that you wrote your own hypervisor from scratch." ;) Besides, reading full-fledged hypervisors may not be as enjoyable as writing your own code. Let us see how you can do it.

What You Need

You need VMware Workstation as a test environment. It supports nested virtualization (emulation of VT-x technology) and lets you debug your hypervisor code form the host Windows through Windbg in the exact same manner as regular kernel debugging.

You may also be able to use other VMware products that support nested virtualization like Fusion, but you will have to configure kernel debugging between two VMs, which is not the most straightforward way. Also, having Windows as a host lets you use VirtualKD which makes communication between a debugee and a debugger very fast. VirtualBox, unfortunately, does not support nested VM and allow to execute VMX operations in it. 

If you are paranoia and do not trust software emulated VT-x, you could use a real box with a serial port (which means it is not going to be a laptop) as a debugee. You might already know that kernel debugging through USB is possible. DO NOT GO THERE unless you already have hardware that was confirmed that it supports USB debugging. There are some subtle requirements which you will not tell if a debugee device suffices by just looking at the specs online. Besides that, being able to take snapshots and memory dump from a hanged machine using VMware drastically speeds up your development.
Image 1: USB2 debug-cable. Not recommended.

Configuring Virtual Machines

You have to make some changes in a debugee virtual machine. Firstly, you have to check the following options:
  • Virtualize Intel VT-x/EPT or AMD-V/RVI
  • Virtualize CPU performance counters
Image 2: Virtual Machine Config

Secondly, you should add those lines in a corresponding VMX file [1], otherwise you will end up with getting mysterious, random-looking NMI_HARDWARE_FAILURE bug check.
----
hypervisor.cpuid.v0 = "FALSE"
mce.enable = "TRUE"
vhu.enable = "TRUE" 
----

EDIT (Nov 1, 2015): Removed entries with apic.xapic since I confirmed that it still worked without them.

Gotchas

Once you have configured the VM, the rest is only a matter of programming, but there are some gotchas I would like to share to keep you sane:
  • Try to avoid use of APIs inside a VMExit handler (in VMX root mode). Since the handler can be executed from any contexts including exception handlers or code under a very high IRQL, it is tough to conclude that calling an API which you do not know *exactly* what it does is 100% safe. 
  • For the same reason, avoid calling DbgPrint() from the VMExit handler. It usually works fine but sometimes causes mysterious errors like triple fault when you request a lot of log. Instead, store log texts into pre-allocated non-paged pool and print them out later from a safe context. 
  • Do not step-in to vmlaunch and vmresume instructions. The debugger will never return control to you, and the debugee will hang. 
  • Do not put software breakpoints everywhere in the VMExit handler. Although it seems to be fine in most cases, in some situation, the debugger does not get control from the debugee, and the system just freezes when int 3 is executed.  

Getting Memory Dump From a Hanged Debugee System

One of the biggest advantages of using VMware is that you can take memory dump (.dmp files) from a hanged debugee and give it to Windbg just like normal crash dump analysis. 

To take a dump file, first you suspend the virtual machine when it is freezed and take a snapshot.
Image 3: Suspend a hanged virtual machine

Then, navigate to where snapshot files are stored and run the vmss2core command under the VMware Workstation directory with names of the latest vmsn and vmem files. For instance, commands look like this:
----
> cd "C:\Users\user\Documents\Virtual Machines\Windows 8 x64"

> "C:\Program Files (x86)\VMware\VMware Workstation\vmss2core-win.exe" -W8 "Windows 8 x64-Snapshot45.vmsn" "Windows 8 x64-Snapshot45.vmem"

vmss2core version 2452889 Copyright (C) 1998-2015 VMware, Inc. All rights reserved.
scanning pa=0 len=0x10000000
Cannot translate linear address 7ff7d12b1b00.
Cannot read context LA from PRCB.
...
... 2020 MBs written.
... 2030 MBs written.
... 2040 MBs written.
Finished writing core.
----
Note that vmss2core comes with VMware Workstation by default (version 2780323) does not seem to be functioning (always generates 0 byte of empty files). If that the case for you too, download version 2452889 from VMware's website.

Now, you have gotten the memory.dmp and can give it to the debugger.
----
> windbg -z memory.dmp
----
Image 4: Memory dump analysis



References

There are some open source hypervisor projects you can refer to for implementation. Those are small enough to read quickly and written for Windows hosts.

A Sample Monitoring Tool Based on a Hypervisor

With those tips, you should be able to develop your own hypervisor fairly smoothly and utilize it for your research. I, for example, wrote a proof-of-concept hypervisor, Sushi, monitoring use of some uncommon instructions from non-image kernel space and stopping a thread when write protection in CR0 is modified.
Image 5: Demo hypervisor "Sushi" detecting interesting stuff

This is less than 4000 lines of code yet gives me an ability to investigate some run-time behaviour of the kernel I was unable to monitor. It is pretty awesome, and above all, playing with low-level stuff like this is quite fun.

In short, if you are interested in developing hypervisors for whatever reasons, you can do it without buying any extra hardware, and then, you can also make it one of your analysis tools like this demo hypervisor.

Thanks

Thank you @brucedang for letting me know that nested virtualization of VMware is reliable enough to write and test hypervisors.