The Oxeye research team gained remote code execution in Backstage, Spotify’s open source, CNCF-incubated project, by exploiting a VM sandbox escape through the vm2 third-party library. We reported this RCE vulnerability via Spotify’s bug bounty program, and the Backstage team responded rapidly by patching it in version 1.5.1 and rating the vulnerability with a CVSS score of 9.8.
Potential vulnerability impact – An unauthenticated threat actor can execute arbitrary system commands on a Backstage application by exploiting a vm2 sandbox escape in the Scaffolder core plugin.
What is Backstage?
With more than 19,000 stars on GitHub, Backstage, a CNCF-incubated project by Spotify, is one of the most popular open source platforms for building developer portals. It restores order to your microservices and infrastructure, enabling your product teams to ship high-quality code quickly without compromising autonomy. Backstage unifies all of your infrastructure tooling, services, and documentation to create a streamlined development environment from end to end. In addition to Spotify, it’s used by a variety of organizations, including American Airlines, Netflix, Splunk, Fidelity Investments and Epic Games.
Backstage can hold integration details for many of an organization’s systems, such as Prometheus, Jira, Elasticsearch, and others. Successful exploitation therefore has critical implications for any affected organization and can compromise those services and the data they hold.
Out of the box, Backstage includes:
- Backstage software catalog for managing all your software (microservices, libraries, data pipelines, websites, ML models, etc.)
- Backstage software templates for quickly spinning up new projects and standardizing your tooling with your organization’s best practices
- Backstage techdocs for making it easy to create, maintain, find, and use technical documentation, using a "docs like code" approach
- A growing ecosystem of open source plugins that further expand its customizability and functionality
Every research project we spin up starts with mapping potential inputs to an application. A key Backstage feature that caught our attention is its software templates.
Backstage has three parts:
- Core – Base functionality built by core developers in the open source project
- App – A Backstage app instance that is deployed and tweaked. It ties together core functionality with additional plugins. It’s built and maintained by app developers, usually by a productivity team at a company.
- Plugins – Additional functionality to make your Backstage app more useful. Plugins can be specific to a company or open sourced/reusable.
Software templates let an application owner quickly spin up new projects, components, and plugins in their organization according to its best practices and guidelines.
Each template is defined by a YAML file that resembles a Kubernetes resource. It contains various fields that define how the component should behave. For example:
Evaluating user-provided strings in a template engine can be dangerous since it exposes the application to template-based attacks. The severity of such an attack depends on the features the templating engine offers. Here, the engine is Nunjucks, which is extremely powerful but notoriously unsafe. For example, this 2016 blog post describes popular techniques to bypass restrictions that Nunjucks imposes.
Abusing the template engine
In an earlier research paper, Oxeye found a vm2 sandbox escape vulnerability that results in remote code execution (RCE) on the hosting machine. Once we had sorted out that payload, we wondered whether we could exploit it in Backstage.
Exploiting the vm2 sandbox escape
When attempting this exploit in our previous blog post, we sought to control properties outside of the sandbox context. When our basic payload executed, we tried invoking the getThis method on any CallSite objects in the array. (These are used to manipulate the call stack generated by the raised error.)
This time we initially tried a simple approach, storing the vm2 sandbox escape payload in a Backstage template YAML and then trying to trigger that. But calling the getThis method yielded an undefined result and caused our exploit to fail.
To understand why our exploit failed, we found the following in the V8 documentation:
"To maintain restrictions imposed on strict mode functions, frames that have a strict mode function and all frames below (its caller, etc.) are not allowed to access their receiver and function objects. For those frames, getFunction() and getThis() returns undefined."
To understand which of the functions in our call stack run in strict mode, we added the following code snippet to the functions that appeared in the call stack:
When we ran our payload again with the added code snippet above, we could see that the earliest function executed in strict mode within the vm2 sandbox was the “renderString2” function.
The following screenshot shows the “renderString2” function executed in strict mode:
To overcome this limitation, we decided to try and override the “renderString2” function with our own implementation in an attempt to force it to run in non-strict mode.
Another behavior helped us gain code execution: when an error was raised while rendering the template, Backstage called the renderString2 function a second time. This let us divide our payload into two stages. First, we overwrote the renderString2 function with our own implementation. Then we triggered an error, causing Backstage to call the function again, this time invoking our overridden implementation.
This next screenshot shows Backstage calling the renderTemplate function (that calls renderString2) twice in the event of an error:
Our final payload was:
Here is an explanation:
1. Access the range function’s constructor property, which provides access to the sandboxed Function constructor. We use that to create an immediately invoked function expression (IIFE) containing our exploit code (within Nunjucks’s rendering engine context).
2. Override the renderString function with our own implementation so that we run in the context of the VM instead of Nunjucks. This is essential to exploit the vulnerability (here env is the Nunjucks reference).
3. Trigger an error by invoking an undefined function (triggerException). This causes NunjucksWorkflowRunner.render to call the SecureTemplater.render function a second time, which happens in the following code: https://github.com/backstage/backstage/blob/2c54d6446fefe30e8e6d81ce7029db74e594b9cb/plugins/scaffolder-backend/src/scaffolder/tasks/NunjucksWorkflowRunner.ts#L147.
From here, the exploit runs in the context of the vm2 sandbox.
4. Save a copy of the original Error class.
5. Override the original Error class with a new empty class.
6. Implement the prepareStackTrace function under the newly created Error class.
7. Create an instance of the saved Error class and access its stack property. This triggers the built-in Node error module to call our implementation of the prepareStackTrace function. https://github.com/nodejs/node/blob/main/lib/internal/errors.js#L140
8. Access the CallSite object supplied in the traces array, on which we invoke the getThis function. This gets us an object created outside the sandbox, allowing us to execute an arbitrary system command.
Looking at the call stack as our payload ran, none of the functions executed up to our overridden renderString2 function was declared as strict. When we accessed index 2 of the trace array, we got a CallSite object created outside the sandbox. This enabled us to execute arbitrary code.
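The first step of the explanation relies on a general JavaScript property rather than anything specific to Nunjucks: every ordinary function inherits a constructor property that points at the Function constructor, which compiles code from strings. The sketch below illustrates only that property in plain Node.js; the range implementation is a hypothetical stand-in, not the Nunjucks one.

```javascript
// Hypothetical stand-in for a helper that a template engine exposes to templates.
function range(start, stop) {
  const out = [];
  for (let i = start; i < stop; i += 1) out.push(i);
  return out;
}

// Any function reaches the Function constructor through `constructor`.
const FunctionCtor = range.constructor;

// The constructor builds a new function from parameter names and a body string.
const compiled = FunctionCtor('a', 'b', 'return a + b;');
const sum = compiled(2, 3);

console.log(FunctionCtor === Function, sum);
```

This is why exposing even a harmless-looking function to user-controlled templates can be enough to evaluate arbitrary code inside the rendering context.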
Exploiting Backstage instances in the wild
Once we had successfully executed our payload locally, we attempted to assess the potential impact of such a vulnerability if exploited in the wild.
We started by running a simple query for the Backstage favicon hash in Shodan; it resulted in more than 500 Backstage instances exposed to the internet. We then tried to assess how they could be exploited remotely without authenticating to the target Backstage instance.
The first thing we noticed was that Backstage is deployed by default without an authentication mechanism or an authorization mechanism, which allows guest access. Some of the public Backstage servers accessible to the internet did not require any authentication.
Additionally, this warning message caught our attention when we reviewed Backstage authentication documentation:
With this in mind, next we tried setting up a local Backstage instance that requires authentication, following tutorial guidelines originally maintained by Backstage. We ended up with authentication only enforced on the client side; requests flowing to the backend API were not verified for authentication or for authorization.
When trying to send requests directly to the backend API server of some of the internet-exposed instances, we found a handful did not require any form of authentication or authorization. Thus we concluded the vulnerability could be exploited without authentication on many instances.
If you’re using Backstage in your organization, we strongly recommend updating it to the latest version to defend against this vulnerability as soon as possible.
Moreover, if you’re using a template engine in your application, make sure you choose the right one in relation to security. Robust template engines are extremely useful but might pose a risk to your organization.
And if you use Backstage with authentication, enable it for both the frontend and the backend.
"Sandbreak" - RCE In vm2 Sandbox Module (CVE-2022-36067)
The Oxeye research team has found "Sandbreak", a critical remote code execution vulnerability in the popular sandbox library vm2. The vulnerability was disclosed to the project owners and was rapidly patched in version 3.9.11. GitHub issued CVE-2022-36067 for this critical vulnerability and assigned it the maximum CVSS score of 10.0.
What Is The Potential Impact of this Vulnerability?
Because this vulnerability has the maximum CVSS score of 10.0 and vm2 is extremely popular, its potential impact is widespread and critical.
An application may sometimes require the execution of untrusted code provided by the user as part of its business logic. This is considered dangerous since the user can abuse this mechanism to take over the application. Utilizing a sandbox mechanism such as vm2 is intended to reduce this risk.
The term “sandbox” refers to an isolated environment within which the untrusted code can run in an attempt to mitigate the risk of malicious code affecting the host machine running it. While sandboxes are extremely useful as isolation mechanisms, they should be used with caution since it is possible to bypass the restrictions, as demonstrated below.
Technical deep dive
The guiding principles behind our choice of research topics are:
- Pervasiveness - how wide-reaching is the vulnerability?
- Impact - how severe can the consequences be if the vulnerability is exploited?
As we looked for potential vulnerabilities to dig deeper into, the idea of sandboxes came up. By their very definition, sandboxes are considered safe places and trusted as mechanisms that isolate potentially dangerous code from our applications. But what would happen if this trust was compromised? This thesis drove our explorations and eventually led us to discover the vm2 sandbox vulnerability.
Laying the groundwork
Our usual approach when evaluating a given software's security is first to analyze the previous security lapses discovered in the same software. This helps us better grasp the available attack surface and may also lead to low-hanging bugs stemming from incomplete fixes. It also helps us come up with techniques to bypass the implemented fixes. While reviewing the previous bugs disclosed to the vm2 maintainers, we noticed an interesting technique: the bug reporter abused the error mechanism in Node.js to escape the sandbox.
Node.js allows the application developer to customize the call stack of an error that occurred in the application. This is done by implementing the “prepareStackTrace” method under the global “Error” object. When an error occurs and the “stack” property of the error object is accessed, Node.js calls this method, passing it the error object alongside an array of “CallSite” objects as arguments. The following screenshot shows Node.js attempting to call the “prepareStackTrace” function:
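Independently of the screenshot, the hook can be exercised with a few lines of plain Node.js. This is a minimal sketch with hypothetical function names (inner, outer) and a hypothetical bookkeeping object (seen); it only shows when the hook runs and what it receives.

```javascript
// Record what the hook sees so it can be inspected afterwards.
const seen = { calls: 0, frameNames: [] };

const originalPrepare = Error.prepareStackTrace;
Error.prepareStackTrace = (error, callSites) => {
  // Invoked the first time `error.stack` is read, not when the error is created.
  seen.calls += 1;
  seen.frameNames = callSites.map((site) => site.getFunctionName());
  // Whatever is returned here becomes the value of `error.stack`.
  return `custom trace for: ${error.message}`;
};

function inner() {
  return new Error('boom');
}

function outer() {
  return inner();
}

const err = outer();
const stackValue = err.stack; // reading `.stack` triggers the hook

Error.prepareStackTrace = originalPrepare; // restore default formatting

console.log(seen.calls, seen.frameNames.slice(0, 2), stackValue);
```

Each entry in the array passed to the hook is a CallSite object, ordered from the most recent frame (inner) towards its callers.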
Each “CallSite” object in the array represents a different stack frame. Together, they comprise the call stack state when the error occurred. One of the methods exposed by the “CallSite” objects is “getThis” which is responsible for returning the “this” object that was available in the related stack frame. This behavior may lead to sandbox escapes as some of the “CallSite” objects may return objects created outside the sandbox when invoking the “getThis” method. After gaining hold of a “CallSite” object created outside the sandbox, it might be possible to access Node’s global objects and execute arbitrary system commands from there.
The vm2 maintainers were aware that overriding “prepareStackTrace” could lead to a sandbox escape and tried to mitigate this escape path by wrapping the Error object and the “prepareStackTrace” method with their own implementation, which prevents the users from overriding this method.
Escaping the sandbox
At this point, we understood that the prepareStackTrace function of the Error object was the function we wanted to override. Providing our own implementation of it and then triggering an error would result in a sandbox escape.
That got us thinking about what would happen if we tried to use a similar escape technique, but instead of finding a way to override “prepareStackTrace” itself, we would simply try to override the global Error object with our own object, which implements the prepareStackTrace function.
The following code would result in our own implementation being called:
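The original listing is not reproduced in this text. The sketch below illustrates the idea in plain Node.js, outside of vm2 and without any escape logic: it assumes, as the Node.js error module linked earlier suggests, that Node.js looks up prepareStackTrace on whatever object globalThis.Error currently refers to. The variable names are hypothetical.

```javascript
// Keep a reference to the real Error constructor.
const OriginalError = globalThis.Error;

let hookCalls = 0;
let receivedCallSites = null;

try {
  // Replace the global binding with a plain object that carries the hook.
  globalThis.Error = {
    prepareStackTrace(error, callSites) {
      hookCalls += 1;
      receivedCallSites = callSites;
      return 'formatted by the replacement Error object';
    },
  };

  // Create an error with the saved constructor and read its stack,
  // which makes Node.js look for a prepareStackTrace implementation.
  const probe = new OriginalError('probe');
  void probe.stack;
} finally {
  // Always put the real Error constructor back.
  globalThis.Error = OriginalError;
}

console.log(hookCalls, Array.isArray(receivedCallSites));
```

The point of the sketch is that no property of the original Error object is modified; only the global name Error is rebound, so a wrapper around the original object alone does not prevent the hook from being supplied.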
The only thing left to do here is to access the CallSite object of a frame that resides outside the sandbox; from there, we can access Node’s global members and access the currently executing process, which allows us to execute commands:
Although sandboxes are meant to run untrusted code within your application, you shouldn’t automatically assume that they are safe. If the use of a sandbox is unavoidable, we recommend separating the logically sensitive parts of your application from the microservice that runs the sandboxed code, so that if a threat actor successfully breaks out of the sandbox, the attack surface is limited to the isolated microservice.
Vulnerability researchers are more likely to look at the high-profile dependencies of your application, so vulnerabilities in those dependencies are discovered more frequently. Make sure to monitor your application dependencies frequently and upgrade their versions accordingly.