Fuzzing ping(8)… and finding a 24 year old bug.

Author: Florian Obser
Archival date: Dec 13, 2022 7:53 PM
Tags: OpenBSD, Ping, AFL

Prologue

FreeBSD had a security fluctuation in their implementation of ping(8) the other day. As someone who has done a lot of work on ping(8) in OpenBSD this tickled my interests.

What about OpenBSD?

ping(8) is ancient:

* Author -
*      Mike Muuss
*      U. S. Army Ballistic Research Laboratory
*      December, 1983

What we know today as ping(8) started to become recognizable in 1986, for example see this csrg commit.

FreeBSD identified a stack overflow in the pr_pack() function and I expected a lot of similarity between the BSDs. This stuff did not change a lot since the csrg days.

Step one: Does this affect us? Turns out, it does not. FreeBSD rewrote pr_pack() in 2019, citing alignment problems.

Now we could join the punters on the Internet and point and laugh. But that's just rude, uncalled for, and generally boring and pointless. Technically I'm on vacation and I had resolved to only do fun things this week. So let's have some fun.

Step two: Did we mess something else up? FreeBSD had a problem in pr_pack() because that function handles data from the network. The data is untrusted and needs to be validated. Now is as good a time as any to check OpenBSD's implementation of pr_pack(). I had wanted to try fuzzing something, anything, with afl for a few years, but never got around to it. I thought I might as well do it now, might be fun.

Make sure you are not holding it wrong.

I installed afl++ from packages and glanced at "Fuzzing libxml2 with AFL++". Here is what we need:

  • A program to test. Something with a known bug so that we can tell the fuzzing works.
  • An input file, that does not trigger the bug.
  • Compile the program with afl-clang-fast.
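  • Run afl-fuzz.

The test program, test.c:

/* Written by Florian Obser, Public Domain */
#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
        FILE    *f;
        size_t   fsize;
        uint8_t *buf, len, *dbuf;

        if (argc != 2)
                errx(1, "usage: test file");

        /* Read the whole input file into buf. */
        if ((f = fopen(argv[1], "rb")) == NULL)
                err(1, "%s", argv[1]);
        fseek(f, 0, SEEK_END);
        fsize = ftell(f);
        rewind(f);

        buf = malloc(fsize + 1);
        if (buf == NULL)
                err(1, NULL);
        fread(buf, fsize, 1, f);
        fclose(f);

        buf[fsize] = 0;

        /* The first byte claims how long the payload is. */
        len = buf[0];

        dbuf = malloc(len);
        if (dbuf == NULL)
                err(1, NULL);
        /* Intentional bug: copies fsize - 1 bytes, not len bytes. */
        memcpy(dbuf, buf + 1, fsize - 1);
        warnx("len: %d", len);
        return 0;
}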

This program has a trivial buffer overflow. It figures out how big a file is on disk and stores this in fsize. It allocates a buffer of that size plus one byte and then reads the whole file into it. It interprets the first byte as the length of the data (len) and allocates a new buffer (dbuf) of this size. It skips the length byte and copies fsize - 1 bytes into the new buffer. So it trusts that the amount of data it read from disk is the same as indicated by the length byte.

While this might seem silly, this is what real world buffer overflows look like.
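
For contrast, here is a minimal sketch of the check the test program deliberately leaves out. The helper name copy_payload_checked() is hypothetical and not part of test.c; it only illustrates comparing the length byte against the amount of data actually read before copying.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of test.c: copy the payload that follows
 * the length byte, but only if the length byte matches what was read.
 * Returns a malloc'd copy, or NULL if the input is inconsistent.
 */
uint8_t *
copy_payload_checked(const uint8_t *buf, size_t fsize)
{
        uint8_t  len, *dbuf;

        assert(buf != NULL);
        if (fsize < 1)
                return NULL;            /* no length byte at all */
        len = buf[0];
        if ((size_t)len != fsize - 1)
                return NULL;            /* length byte and file size disagree */
        dbuf = malloc(len > 0 ? len : 1);
        if (dbuf == NULL)
                return NULL;
        memcpy(dbuf, buf + 1, len);     /* bounded by the validated length */
        return dbuf;
}
```

With a check like this in place the fuzzer would have nothing to find, which is exactly why the test program omits it.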

Here is a file where the length byte and file size agree. Create folders in and out and place test.txt into in/test.txt. Don't forget the newline.

ABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
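
The arithmetic behind such a seed: 'A' is 65 in ASCII, so for the length byte to agree with the file size, 65 bytes have to follow it, for example 64 'B' characters plus the newline. A small sketch that builds a seed of this shape in memory; build_seed() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEED_LEN_BYTE   'A'     /* 65 in ASCII */

/*
 * Fill out with a seed whose length byte matches its payload.
 * Returns the number of bytes written, or 0 if cap is too small.
 */
size_t
build_seed(uint8_t *out, size_t cap)
{
        size_t   payload = (size_t)SEED_LEN_BYTE;      /* 65 bytes must follow */

        assert(out != NULL);
        if (cap < payload + 1)
                return 0;
        out[0] = SEED_LEN_BYTE;
        memset(out + 1, 'B', payload - 1);      /* 64 filler bytes */
        out[payload] = '\n';                    /* the newline is payload byte 65 */
        return payload + 1;                     /* 66 bytes in total */
}
```

Writing the returned bytes to in/test.txt gives a file for which fsize - 1 equals len, so the test program does not overflow on the seed itself.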

Compile test.c:

CC=/usr/local/bin/afl-clang-fast make test

and run afl-fuzz:

afl-fuzz -i in/ -o out -- ./test @@

It more or less immediately finds a crash. The reproducer(s) are in out/default/crashes/.

Fuzzing ping(8)

At this point we are facing a few problems. What does it mean to fuzz ping(8)? Where do we get the sample input from, and how do we feed it to ping(8)?

From a high level point of view ping(8) parses arguments, initializes a bunch of stuff and then enters an infinite loop sending ICMP echo request packets and waiting for a reply. It parses and prints each reply.

Parsing the reply is the interesting thing. The reply comes from the network and is untrusted. This is where things can go wrong. The parsing is handled by pr_pack(), so that's what we should fuzz.

in/ for ping(8)

We need some sample data. An ICMP packet is binary data on the wire. Crafting it by hand is annoying. So let's just hack ping(8) to dump the packet to disk.

diff --git sbin/ping/ping.c sbin/ping/ping.c
index a3b3d650eb5..78b571b95b4 100644
--- sbin/ping/ping.c
+++ sbin/ping/ping.c
@@ -79,6 +79,7 @@

 #include <sys/types.h>
 #include <sys/socket.h>
+#include <sys/stat.h>
 #include <sys/time.h>
 #include <sys/uio.h>

@@ -95,6 +96,7 @@
 #include <ctype.h>
 #include <err.h>
 #include <errno.h>
+#include <fcntl.h>
 #include <limits.h>
 #include <math.h>
 #include <poll.h>
@@ -217,6 +219,8 @@ const char          *pr_addr(struct sockaddr *, socklen_t);
 void                    pr_pack(u_char *, int, struct msghdr *);
 __dead void             usage(void);

+void                    output(char *, u_char *, int);
+
 /* IPv4 specific functions */
 void                    pr_ipopt(int, u_char *);
 int                     in_cksum(u_short *, int);
@@ -255,7 +259,7 @@ main(int argc, char *argv[])
        int df = 0, tos = 0, bufspace = IP_MAXPACKET, hoplimit = -1, mflag = 0;
        u_char *datap, *packet;
        u_char ttl = MAXTTL;
-       char *e, *target, hbuf[NI_MAXHOST], *source = NULL;
+       char *e, *target, hbuf[NI_MAXHOST], *source = NULL, *output_path = NULL;
        char rspace[3 + 4 * NROUTES + 1];       /* record route space */
        const char *errstr;
        double fraction, integral, seconds;
@@ -264,11 +268,13 @@ main(int argc, char *argv[])
        u_int rtableid = 0;
        extern char *__progname;

+#if 0
        /* Cannot pledge due to special setsockopt()s below */
        if (unveil("/", "r") == -1)
                err(1, "unveil /");
        if (unveil(NULL, NULL) == -1)
                err(1, "unveil");
+#endif

        if (strcmp("ping6", __progname) == 0) {
                v6flag = 1;
@@ -297,8 +303,8 @@ main(int argc, char *argv[])
        preload = 0;
        datap = &outpack[ECHOLEN + ECHOTMLEN];
        while ((ch = getopt(argc, argv, v6flag ?
-           "c:DdEefgHh:I:i:Ll:mNnp:qS:s:T:V:vw:" :
-           "DEI:LRS:c:defgHi:l:np:qs:T:t:V:vw:")) != -1) {
+           "c:DdEefgHh:I:i:Ll:mNno:p:qS:s:T:V:vw:" :
+           "DEI:LRS:c:defgHi:l:no:p:qs:T:t:V:vw:")) != -1) {
                switch(ch) {
                case 'c':
                        npackets = strtonum(optarg, 0, INT64_MAX, &errstr);
@@ -375,6 +381,9 @@ main(int argc, char *argv[])
                case 'n':
                        options &= ~F_HOSTNAME;
                        break;
+               case 'o':
+                       output_path = optarg;
+                       break;
                case 'p':               /* fill buffer with user pattern */
                        options |= F_PINGFILLED;
                        fill((char *)datap, optarg);
@@ -768,10 +777,10 @@ main(int argc, char *argv[])
        }

        if (options & F_HOSTNAME) {
-               if (pledge("stdio inet dns", NULL) == -1)
+               if (pledge("stdio inet dns wpath cpath", NULL) == -1)
                        err(1, "pledge");
        } else {
-               if (pledge("stdio inet", NULL) == -1)
+               if (pledge("stdio inet wpath cpath", NULL) == -1)
                        err(1, "pledge");
        }

@@ -960,8 +969,11 @@ main(int argc, char *argv[])
                                }
                        }
                        continue;
-               } else
+               } else {
+                       if (output_path != NULL)
+                               output(output_path, packet, cc);
                        pr_pack(packet, cc, &m);
+               }

                if (npackets && nreceived >= npackets)
                        break;
@@ -2274,3 +2286,29 @@ usage(void)
        }
        exit(1);
 }
+
+void
+output(char *path, u_char *pack, int len)
+{
+       size_t bsz, off;
+       ssize_t nw;
+       int fd;
+       char *fname;
+
+       bsz = len;
+       if (asprintf(&fname, "%s/ping_%lld_%d.out", path, time(NULL),
+           getpid()) == -1)
+               err(1, NULL);
+
+       fd = open(fname, O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR | S_IRGRP |
+           S_IROTH);
+       free(fname);
+
+       if (fd == -1)
+               err(1, "open");
+
+       for (off = 0; off < bsz; off += nw)
+               if ((nw = write(fd, pack + off, bsz - off)) == 0 || nw == -1)
+                       err(1, "write");
+       close(fd);
+}

After building and installing our hacked version of ping(8) we can create sample input data for afl thusly:

while :; do
    ping -o ./in/ -w 1 -c 1 \
         $(jot -r 0 255 | head -4 | tr '\n' '.' | sed 's/.$//')
done

jot creates a stream of random numbers between 0 and 255; we take the first four, concatenate them with '.' and cut off the trailing dot. Voilà, we have a bunch of random IPv4 addresses. We then send a single ping and wait for one second. The ICMP reply is written to ./in/.
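
The same idea (four random octets joined by dots) can be sketched in C. This is only an illustration of what the shell pipeline is described as doing; format_ipv4() and random_ipv4() are hypothetical names, and rand() stands in for whatever random source you prefer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Format four octets as a dotted quad. Returns 0, or -1 if dst is too small. */
int
format_ipv4(char *dst, size_t cap, const uint8_t oct[4])
{
        int      n;

        assert(dst != NULL && oct != NULL);
        n = snprintf(dst, cap, "%u.%u.%u.%u", (unsigned)oct[0],
            (unsigned)oct[1], (unsigned)oct[2], (unsigned)oct[3]);
        return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Pick four pseudo-random octets; the caller is expected to call srand(). */
int
random_ipv4(char *dst, size_t cap)
{
        uint8_t  oct[4];
        int      i;

        for (i = 0; i < 4; i++)
                oct[i] = (uint8_t)(rand() % 256);
        return format_ipv4(dst, cap, oct);
}
```

A 16 byte buffer is enough for any dotted quad, since the longest form is 15 characters plus the terminating NUL.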

Fuzzing pr_pack()

At this point I wrote a main() function that accepts a file name as argument and reads it into a buffer. I then ripped pr_pack() out of ping(8) and fed it the file contents.

Of course compiling fails quite spectacularly at this point. So I added a bunch of missing functions, defines and global variables. That gets it pretty close. We don't have the msghdr from recvmsg(2), so we need to #if 0 some code. We also need to get rid of the validation of the data packet using SipHash, because the whole point is that the data does not validate, and the SipHash check would short-circuit the rest of the parsing.

Oh yeah, and the thing is legacy IP only at this point.

So here it is (afl_ping.c), and it is quite terrible. It would probably make more sense to copy all of ping(8) and slap on a new main() function. Maybe.

Anyway, at this point I was 30 minutes in, from reading about afl for the first time until firing up afl-fuzz on my hacked pr_pack(). Not too bad. It was time for dinner and I left the thing running.
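
The overall shape of such a harness can be sketched as follows. This is not afl_ping.c: parse_packet() is a hypothetical stand-in for pr_pack() that only looks at the IP version nibble, and fuzz_one_file() is an assumed name for the "read the file, call the parser" step.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the parser under test. In the article this
 * role is played by pr_pack(); here it only checks the IP version nibble
 * so that the sketch stays self-contained.
 */
static int
parse_packet(const uint8_t *buf, size_t len)
{
        if (len < 1)
                return -1;
        return (buf[0] >> 4) == 4 ? 0 : -1;
}

/*
 * Read one input file completely and hand it to the parser.
 * Returns the parser's result, or -2 if the file could not be read.
 */
int
fuzz_one_file(const char *path)
{
        FILE    *f;
        long     sz;
        uint8_t *buf;
        int      ret;

        assert(path != NULL);
        if ((f = fopen(path, "rb")) == NULL)
                return -2;
        if (fseek(f, 0, SEEK_END) != 0 || (sz = ftell(f)) < 0) {
                fclose(f);
                return -2;
        }
        rewind(f);
        if ((buf = malloc(sz > 0 ? (size_t)sz : 1)) == NULL) {
                fclose(f);
                return -2;
        }
        if (sz > 0 && fread(buf, 1, (size_t)sz, f) != (size_t)sz) {
                free(buf);
                fclose(f);
                return -2;
        }
        fclose(f);
        ret = parse_packet(buf, (size_t)sz);
        free(buf);
        return ret;
}
```

A real harness would call fuzz_one_file(argv[1]) from main(), be compiled with afl-clang-fast, and be started the same way as the test program earlier, with @@ in place of the file name.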

The promised bug

I came back after dinner and afl found zero crashes. That's disappointing. Or good. Depending on how you look at it. But it found hangs. Running afl_ping on one of the reproducers, it printed "unknown option 20" forever.

The problem is in this part of the code:

for (; hlen > (int)sizeof(struct ip); --hlen, ++cp) {
/* [...] */
    switch (*cp) {
    /* [...] */
    default:
       printf("\nunknown option %x", *cp);
       hlen = hlen - (cp[IPOPT_OLEN] - 1);
       cp = cp + (cp[IPOPT_OLEN] - 1);
       break;
    }
}

cp points at untrusted data. If cp[IPOPT_OLEN] is zero, the default case increases hlen by one and moves cp back by one; the for loop then decrements hlen and increments cp again. Both end up exactly where they started, so we never make any progress and spin forever.

The diff is fairly simple:

diff --git ping.c ping.c
index fb31365ad31..6019c87d8db 100644
--- ping.c
+++ ping.c
@@ -1525,8 +1525,11 @@ pr_ipopt(int hlen, u_char *buf)
                        break;
                default:
                        printf("\nunknown option %x", *cp);
-                       hlen = hlen - (cp[IPOPT_OLEN] - 1);
-                       cp = cp + (cp[IPOPT_OLEN] - 1);
+                       if (cp[IPOPT_OLEN] > 0 && (cp[IPOPT_OLEN] - 1) <= hlen) {
+                               hlen = hlen - (cp[IPOPT_OLEN] - 1);
+                               cp = cp + (cp[IPOPT_OLEN] - 1);
+                       } else
+                               hlen = 0;
                        break;
                }
        }
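
To see the lack of progress without actually hanging, here is a simplified model of the loop. It is a sketch under assumptions, not the code from ping.c: every option is treated as unknown, IP_HDR_LEN and OPT_OLEN are stand-ins for sizeof(struct ip) and IPOPT_OLEN, and the function name and the step limit are made up so that the unguarded variant can be observed safely. The guarded branch mirrors the check from the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IP_HDR_LEN      20      /* stand-in for sizeof(struct ip) */
#define OPT_OLEN        1       /* stand-in for IPOPT_OLEN */

/*
 * Walk the option area, treating every option as "unknown".
 * Returns the number of loop iterations, or -1 if step_limit is
 * exceeded, which is how this model reports "spins forever".
 */
int
walk_unknown_options(const uint8_t *pkt, int hlen, int guarded, int step_limit)
{
        const uint8_t   *cp;
        int              steps = 0;

        assert(pkt != NULL && step_limit > 0);
        cp = pkt + IP_HDR_LEN;
        for (; hlen > IP_HDR_LEN; --hlen, ++cp) {
                if (++steps > step_limit)
                        return -1;
                if (!guarded) {
                        /* Original logic: a zero length undoes the loop step. */
                        hlen = hlen - (cp[OPT_OLEN] - 1);
                        cp = cp + (cp[OPT_OLEN] - 1);
                } else if (cp[OPT_OLEN] > 0 && (cp[OPT_OLEN] - 1) <= hlen) {
                        hlen = hlen - (cp[OPT_OLEN] - 1);
                        cp = cp + (cp[OPT_OLEN] - 1);
                } else
                        hlen = 0;       /* bogus length: stop parsing options */
        }
        return steps;
}
```

Feeding this a 24 byte header whose single option has a length byte of zero hits the step limit in the unguarded variant and finishes after one iteration in the guarded one.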

I foolishly tweaked the diff after collecting OKs and of course the tweak was wrong. Note to self: Never do this. So it's spread out over two commits: ping.c, Revision 1.247 and ping.c, Revision 1.248.

This bug was introduced April 3rd, 1998 in revision 1.30, over 24 years ago.

Epilogue

Afl uses files to feed data to programs to get them to crash or otherwise misbehave. I had wondered for a few years how I could use afl with things that talk to the network. Because that's what I mostly work on. In hindsight it's quite obvious. You identify the main parsing function, wrap it in a new main() function and Robert is your father's nearest male relative.

The two main takeaways from this are: One, if someone messes up somewhere, go look if you messed up in the same or similar way somewhere else. Two, afl is pretty easy to use, even for network programs. 30 minutes from reading about afl for the first time to finding a bug in a real world program is pretty neat.