A Race to Report a TOCTOU: Analysis of a Bug Collision in Intel SMM

⚠️ [ ORIGIN SOURCE ]

https://research.nccgroup.com/2023/03/15/a-race-to-report-a-toctou-analysis-of-a-bug-collision-in-intel-smm/

📅 [ Archival Date ]

Mar 19, 2023 11:41 AM

🏷️ [ Tags ]

Bios, SMM, UEFI, Intel

✍️ [ Author ]

Jeremy Boone

💣 [ PoC / Exploit ]

https://crash.software/STRLCPY/ICE_TEA_BIOS

About four months ago, in October 2022, I was idly poking around the “ICE TEA” leak. This leak was of particular interest to me, because it happened to expose the source code for Intel’s Alder Lake platform BIOS. It’s always fun to finally get to see the code for modules that you previously reverse engineered.

Soon enough, on October 13th, I discovered a time-of-check-time-of-use (TOCTOU) vulnerability in an SMI handler and reported it to Intel. The vulnerability was high risk (CVSS 7.9) because it enabled a local/physical DMA-capable adversary to corrupt SMRAM and escalate privilege into System Management Mode (SMM).

Approximately one month later, on November 17th, Intel triaged my bug report as a duplicate of another bug (CVE-2022-21198) that had been discovered internally by Intel engineers. In fact, the bug had already been fixed a week earlier, on November 8th, as part of the 2022.3 IPU release (INTEL-SA-00688).

Upon reviewing Intel’s advisory, I felt that it lacked sufficient technical detail. So, in this blog post I want to present my own root-cause analysis and description of the bug. I think this bug serves as a good illustration of a classic SMI handler TOCTOU vulnerability. Let’s dive on in.

The Bug

The following SMI handler is part of Intel’s SPI flash SMM module. We can see that the handler calls SmmIsBufferOutsideSmmValid to check whether the attacker-controlled CommBuffer pointer overlaps SMRAM. This follows best practices for preventing confused deputy vulnerabilities. That is, if the CommBuffer overlapped SMRAM, the SMI handler could be coerced into reading/writing its own address space. So far, so good.

EFI_STATUS EFIAPI SmmSpiHandler (
  IN EFI_HANDLE DispatchHandle,
  IN CONST VOID *RegisterContext,
  IN OUT VOID *CommBuffer,
  IN OUT UINTN *CommBufferSize
  )
{
  ...
  TempCommBufferSize = *CommBufferSize;
  ...
  if (!SmmIsBufferOutsideSmmValid ((UINTN)CommBuffer, TempCommBufferSize)) {
    DEBUG ((..., "SmmSpiHandler: SMM communication buffer in SMRAM or overflow!\n"));
    return EFI_SUCCESS;
  }
  
  CommBufferPayloadSize = TempCommBufferSize - SMM_SPI_COMMUNICATE_HEADER_SIZE;
  ...
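
To make the idea behind the overlap check concrete, here is a minimal standalone sketch in plain C. It is not the EDK II implementation of SmmIsBufferOutsideSmmValid, which performs additional checks. The SMRAM window constants and the name buffer_is_outside_smram are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical SMRAM window, chosen only for illustration. */
#define SMRAM_BASE 0x7F000000ULL
#define SMRAM_SIZE 0x00800000ULL

/* Returns true when [addr, addr + len) neither wraps around the address
 * space nor intersects the hypothetical SMRAM window. */
static bool buffer_is_outside_smram(uint64_t addr, uint64_t len)
{
    if (len > UINT64_MAX - addr) {
        return false;                 /* addr + len would overflow */
    }
    uint64_t end = addr + len;        /* exclusive end of the caller's buffer */
    uint64_t smram_end = SMRAM_BASE + SMRAM_SIZE;

    /* Two half-open ranges are disjoint when one ends before the other begins. */
    return end <= SMRAM_BASE || addr >= smram_end;
}

/* Small usage example: one benign range, one overlapping, one wrapping. */
static void range_check_examples(void)
{
    assert(buffer_is_outside_smram(0x1000, 0x100));
    assert(!buffer_is_outside_smram(SMRAM_BASE - 0x10, 0x20));
    assert(!buffer_is_outside_smram(UINT64_MAX - 4, 0x10));
}
```

Note that the overflow test comes first: without it, a huge length could wrap the end address back below SMRAM and slip past the comparison.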

Next, a local copy of the communication buffer is made in a heap allocation referenced by SmmSpiFunctionHeader. Once again, this behavior is in line with best practice: because the Comm Buffer resides in memory that is shared between the adversary and SMM, all data fetches are raceable and subject to TOCTOU risks. However, these race conditions can be avoided by making a local copy of the Comm Buffer and working only from that copy. Everything still looks good here.

  ...
  SMM_SPI_COMMUNICATE_FUNCTION_HEADER *SmmSpiFunctionHeader;
  SMM_SPI_COMMUNICATE_FUNCTION_HEADER *ExternalSmmSpiFunctionHeader;
  ...
  Status = gSmst->SmmAllocatePool (EfiRuntimeServicesData,
                                   TempCommBufferSize, 
                                   (VOID**) &SmmSpiFunctionHeader);
  ...
  CopyMem (SmmSpiFunctionHeader, CommBuffer, TempCommBufferSize);
  ...

Next, the original Comm Buffer pointer is aliased to ExternalSmmSpiFunctionHeader. Presumably, the variable name’s prefix of “external” is supposed to act as a reminder to the developer that this pointer refers to external untrusted memory.

  ...
  ExternalSmmSpiFunctionHeader = (SMM_SPI_COMMUNICATE_FUNCTION_HEADER *) CommBuffer;
  ...

From this point forward, we need to understand that an attacker still controls two important things:

SmmSpiFunctionHeader – The individual structure fields in this local copy of the Comm Buffer have not yet been sanitized.
ExternalSmmSpiFunctionHeader – This “external” pointer still points to attacker-controlled shared memory, so all data fetches are raceable!

Unfortunately, the code that follows demonstrates some confusion about SMI handler security requirements.

Let’s consider the following SPI_FUNCTION_FLASH_READ sub-command handler. Here, the SmmSpiFlashRead structure pointer initially refers to external attacker-controlled memory. That is to say, the initial checks that ensure the buffer range is valid are all performed through the external pointer. An attacker can race these fetches and rewrite the fields after they have been checked, which renders these important input validation steps ineffective.

  ...
  SMM_SPI_FLASH_READ *SmmSpiFlashRead;
  ...
  switch (SmmSpiFunctionHeader->Function) {
    case SPI_FUNCTION_FLASH_READ:
      ...
      SmmSpiFlashRead = (SMM_SPI_FLASH_READ *) ExternalSmmSpiFunctionHeader->Data;
      ...
      if (((SmmSpiFlashRead->Buffer) != (UINT8 *)(SmmSpiFlashRead + 1)) ||
          ((SmmSpiFlashRead->Buffer + SmmSpiFlashRead->ByteCount) >
               (ExternalSmmSpiFunctionHeader->Data + CommBufferPayloadSize)))
      {
        DEBUG ((..., "FlashRead: SMM communication buffer range invalid!\n"));
        return EFI_SUCCESS;
      }
      ...

Next, the SmmSpiFlashRead pointer is updated to refer to the local copy of the communication buffer. However, as we established earlier, this local copy has not yet been validated — only the external copy has been checked. Therefore, it is unsafe to trust the SmmSpiFlashRead fields (in particular, Buffer and ByteCount) because they may not contain the same values that were validated above when the external pointer was checked.

  ...
  SmmSpiFlashRead = (SMM_SPI_FLASH_READ *) SmmSpiFunctionHeader->Data;
  SmmSpiFlashRead->Buffer = (UINT8 *)(SmmSpiFlashRead + 1);

  Status = mSmmSpiProtocol->FlashRead (
                              mSmmSpiProtocol,
                              SmmSpiFlashRead->FlashRegionType,
                              SmmSpiFlashRead->Address,
                              SmmSpiFlashRead->ByteCount,
                              SmmSpiFlashRead->Buffer
                              );
  ...
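
The check-one-copy, use-the-other pattern can be reproduced deterministically outside of firmware. The following is a simplified simulation and not the handler itself: request_t, PAYLOAD_MAX, flawed_handler and the dma_hook_t callback are all hypothetical names. The callback stands in for a DMA agent rewriting shared memory at exactly the right moment, whereas a real attacker would have to win the timing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_MAX 64u   /* hypothetical payload capacity */

/* Hypothetical, heavily simplified stand-in for a flash read request. */
typedef struct {
    uint32_t byte_count;
} request_t;

/* Callback standing in for a DMA agent that rewrites shared memory. */
typedef void (*dma_hook_t)(request_t *shared);

/* Mirrors the flawed ordering: snapshot, validate the shared copy, then use
 * the snapshot. Returns the byte count that would reach the flash read,
 * or 0 if the request is rejected. */
static uint32_t flawed_handler(request_t *shared, dma_hook_t between_copy_and_check)
{
    request_t local;
    memcpy(&local, shared, sizeof local);       /* fetch 1: local snapshot */

    if (between_copy_and_check) {
        between_copy_and_check(shared);         /* attacker's window */
    }

    if (shared->byte_count > PAYLOAD_MAX) {     /* fetch 2: check shared memory */
        return 0;
    }
    return local.byte_count;                    /* use: unvalidated snapshot */
}

/* The simulated attacker swaps in a benign value just before the check. */
static void make_benign(request_t *shared) { shared->byte_count = 16; }

static void race_example(void)
{
    request_t shared = { .byte_count = 0x10000u };  /* oversized at snapshot time */
    uint32_t used = flawed_handler(&shared, make_benign);
    assert(used == 0x10000u);   /* check passed, yet the oversized value is used */
}
```

Without the hook, the same oversized request is rejected, which is why the flaw is easy to miss in ordinary testing.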

Then finally, the SPI flash contents are copied back to external shared memory. However, due to the previously established potential for race conditions, an attacker can tamper with ExternalSmmSpiFunctionHeader->Data and SmmSpiFlashRead->Buffer to cause the buffers to overlap with SMRAM. The attacker could also tamper with SmmSpiFlashRead->ByteCount to be excessively large, enabling out-of-bounds writes in the following CopyMem call.

  ...
  if (!EFI_ERROR (Status)) {
    Buffer = (UINT8 *)SmmSpiFlashRead->Buffer;
    SmmSpiFlashRead = (SMM_SPI_FLASH_READ *) ExternalSmmSpiFunctionHeader->Data;
    CopyMem (SmmSpiFlashRead->Buffer, 
             Buffer,
             SmmSpiFlashRead->ByteCount);
  }
  break;
  ...

Most of the other SPI flash SMI sub-command handlers exhibit similar problems, including:

SPI_FUNCTION_FLASH_WRITE
SPI_FUNCTION_FLASH_READ_SFDP
SPI_FUNCTION_FLASH_READ_JEDEC_ID
SPI_FUNCTION_FLASH_WRITE_STATUS
SPI_FUNCTION_FLASH_READ_STATUS
SPI_FUNCTION_GET_REGION_ADDRESS
SPI_FUNCTION_READ_PCH_SOFTSTRAP
SPI_FUNCTION_READ_CPU_SOFTSTRAP

Conclusions

At a quick glance, this function appeared to take all the right steps to avoid the most common SMI handler vulnerability classes:

It ensured that the Comm Buffer pointer did not overlap with SMRAM.
It made a local copy of the Comm Buffer.
It validated the input structure fields in the Comm Buffer.

However, upon closer inspection we learn that, due to a subtle oversight, all the input validation steps were performed using the external Comm Buffer, rather than the local copy. This left a small window of opportunity for a TOCTOU to occur, which would undermine all the earlier input validation steps, paving the way for corruption of SMRAM.
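
For contrast, here is a minimal sketch of the safer ordering: fetch once, validate only the local copy, and derive the copy-back destination and length from trusted values. This is not Intel's actual patch, which is not reproduced here. The request_t layout and handle_read are hypothetical, and the flash contents are modeled as a plain byte array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout: a length field followed by a payload area. */
typedef struct {
    uint32_t byte_count;   /* number of bytes requested */
    uint8_t  data[64];     /* payload area that receives the result */
} request_t;

/* 'shared' models the Comm Buffer. The pointer value itself is held by the
 * handler; only the memory it points to is attacker-writable. */
static bool handle_read(request_t *shared, const uint8_t *flash, size_t flash_len)
{
    request_t local;
    memcpy(&local, shared, sizeof local);          /* the only fetch of inputs */

    /* Validate the snapshot, never the shared memory. */
    if (local.byte_count > sizeof local.data || local.byte_count > flash_len) {
        return false;
    }

    memcpy(local.data, flash, local.byte_count);   /* work on private memory */

    /* Copy back: the destination is computed from the handler's own pointer and
     * the length is the validated local value; nothing is re-read from 'shared'. */
    memcpy(shared->data, local.data, local.byte_count);
    return true;
}

static void fixed_handler_example(void)
{
    const uint8_t flash[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    request_t shared = { .byte_count = 4 };

    assert(handle_read(&shared, flash, sizeof flash));
    assert(memcmp(shared.data, flash, 4) == 0);

    shared.byte_count = 1000;                      /* out of range: rejected */
    assert(!handle_read(&shared, flash, sizeof flash));
}
```

Because every decision is made on the snapshot, later changes to shared memory can at most corrupt the attacker's own buffer.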

Thankfully, Intel’s 2022.3 IPU release contained fixes for this issue.

Editor’s Note (2023-03-15): I updated the introductory paragraphs to make it clear that this bug is likely only exploitable by DMA-capable agents (e.g., a firmware IP or malicious PCIe peripheral device), as the SMI Rendezvous procedure will force other cores to wait while the SMI request is being serviced. Thanks to Dmytro Oleksiuk for pointing out this blunder.