GMO Flatt Security Research
June 21, 2021

CVE-2020-15702 Race Condition vulnerability in handling of PID by apport


Note: This is an English version of a previous post.

Hello, I’m Shiga (@Ga_ryo_), a security engineer at Flatt Security Inc.

In this article, I give a technical description of CVE-2020-15702, which was published recently. I discovered this vulnerability and reported it to the vendor via the Zero Day Initiative. The goal of this article is not to warn about the dangers of vulnerabilities, but to share tips from a technical point of view.

Notes

If you have any questions or find any mistakes, I’d appreciate it if you could contact me directly. Unless noted otherwise, the code in this article refers to the apport source code at version 2.20.11-0ubuntu27.

I will explain the outline of the PoC I wrote, but I will not post the actual code. Please understand.

Overview

A race condition exists in the way apport handles PIDs. By exploiting it, an attacker can escalate privileges when the preconditions below are met.

Preconditions

  1. Arbitrary code execution on a target Ubuntu machine.

  2. A process that meets certain conditions (explained later) is executed with privileges, e.g. logrotate.

Exploiting this gives us the privileges of the targeted process. So if the target runs with higher authority (e.g. root), we obtain that authority.

Impact

Local Privilege Escalation

What is apport

https://wiki.ubuntu.com/Apport

apport is the crash reporting facility that ships with Ubuntu by default. When a userland process crashes (due to SIGSEGV, for example), it creates a structured report for later analysis.

garyo@garyo:~/sandbox$ sleep 100 &  
[1] 13048  
garyo@garyo:~/sandbox$ kill -SIGSEGV 13048  
garyo@garyo:~/sandbox$ head -n 20 /var/crash/_usr_bin_sleep.1000.crash  
ProblemType: Crash  
Architecture: amd64  
Date: Tue Aug 25 10:33:06 2020  
DistroRelease: Ubuntu 20.04  
ExecutablePath: /usr/bin/sleep  
ExecutableTimestamp: 1567679920  
ProcCmdline: sleep 100  
ProcCwd: /home/garyo/sandbox  
ProcEnviron:  
 SHELL=/bin/bash  
 LANG=en_US.UTF-8  
 TERM=xterm-256color  
 XDG_RUNTIME_DIR=  
 PATH=(custom, no user)  
ProcMaps:  
 564cdc98d000-564cdc98f000 r--p 00000000 08:02 1051009                    /usr/bin/sleep  
 564cdc98f000-564cdc993000 r-xp 00002000 08:02 1051009                    /usr/bin/sleep  
 564cdc993000-564cdc995000 r--p 00006000 08:02 1051009                    /usr/bin/sleep  
 564cdc996000-564cdc997000 r--p 00008000 08:02 1051009                    /usr/bin/sleep  
 564cdc997000-564cdc998000 rw-p 00009000 08:02 1051009                    /usr/bin/sleep  

On Linux, a core file is usually dumped when a process crashes. However, if the first byte of /proc/sys/kernel/core_pattern is “|” (pipe), the kernel instead launches a process using usermodehelper (a mechanism that launches a userland process from the kernel). The contents of the core dump are written to a pipe connected to the standard input of the launched process.

On Ubuntu 20.04, this kernel parameter is set as follows.

garyo@garyo:~/sandbox$ cat /proc/sys/kernel/core_pattern  
|/usr/share/apport/apport %p %s %c %d %P %E  

The meaning of the format specifiers passed as arguments can be found by reading the format_corename() function, which is called from the do_coredump() function of the Linux kernel. All you need to know here is that the PID of the crashed process is passed to apport.
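As a rough illustration of how such a value is interpreted, the following Python sketch splits a core_pattern string into the handler program and its argument templates. The function name parse_core_pattern is made up for this example, and the whitespace splitting is a simplification of what the kernel does.

```python
def parse_core_pattern(pattern):
    """Split a core_pattern value into (is_pipe, program, args).

    Hypothetical helper for illustration only; it mimics the idea that a
    leading "|" means "pipe the core dump to this program".
    """
    pattern = pattern.strip()
    if not pattern.startswith("|"):
        # Plain file name template: the kernel writes a core file itself.
        return (False, None, [])
    parts = pattern[1:].split()
    if not parts:
        return (True, None, [])
    # parts[0] is the handler; the rest are % templates expanded by the kernel.
    return (True, parts[0], parts[1:])


is_pipe, program, args = parse_core_pattern(
    "|/usr/share/apport/apport %p %s %c %d %P %E"
)
print(is_pipe, program, args)
```

On a real system the input would come from reading /proc/sys/kernel/core_pattern; the string is hard-coded here so the sketch runs anywhere.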

Vulnerability explanation

By the way, many of you may have heard of apport’s PID-related race conditions. Similar vulnerabilities have been found in the past, and I referred to the blog post listed in the Reference section when writing my PoC.

In particular, its explanation of the attack for CVE-2019-15790 was very helpful. To explain that vulnerability briefly, apport at that time had the following flow.

  1. Launched by the kernel with the information of the crashed process (hereinafter A) as an argument.

  2. Get uid/gid/cwd from procfs based on the PID of process A which is taken as an argument.

  3. Change the real uid/gid to process A’s uid/gid by the drop_privileges() function.

  4. Get other information such as maps from procfs based on PID

  5. Report is created with permissions that process A can read

Because apport drops privileges at step 3, an unprivileged user can stop its execution by sending SIGSTOP to the apport process. During this time, process A (a process with normal user privileges that was crashed intentionally) can be terminated by sending it SIGKILL *1. If we then spawn a large number of processes, the PID gets reused (once PIDs up to the maximum value have been handed out, allocation wraps around and free PIDs are reused starting from the low end of the range). So before step 4 runs, we can replace the process that apport refers to with another, privileged process. apport then reads /proc/[PID]/maps of the privileged process and includes it in the report, which could be used to bypass ASLR of the privileged process.

*1: In the first place, the Linux kernel probably does not guarantee that the crashed process is still alive after the crash reporting program is started. In practice, the process is usually alive when the crash reporting program starts, but not because the kernel waits for that program to finish (when usermodehelper is run inside do_coredump(), the flag UMH_WAIT_EXEC is used, which waits for the exec to complete but not for the launched process to terminate). The process stays alive because the core dump is generally large: it cannot be written to the stdin pipe in one go, so the write blocks. Once the crash reporting program has read all of its standard input, you can see that the crashed process immediately disappears from procfs without waiting for the program to finish.

The vulnerability I found lies in the behavior described in this footnote. Before getting to it, let’s first look at the fix for the previous vulnerability.

-    global pidstat, real_uid, real_gid, cwd  
+    global pidstat, real_uid, real_gid, cwd, proc_pid_fd  
+  
+    proc_pid_fd = os.open('/proc/%s' % pid, os.O_RDONLY | os.O_PATH | os.O_DIRECTORY)  
  
-    pidstat = os.stat('/proc/%s/stat' % pid)  
+    pidstat = os.stat('stat', dir_fd=proc_pid_fd)  
  

https://git.launchpad.net/ubuntu/+source/apport/commit/data/apport?h=applied/ubuntu/bionic-devel&id=0c4ff2db0788ecffe84a5dc6938a616140f179c2

The important part is the variable proc_pid_fd. Before the patch, apport built paths by concatenating the PID as a string and opened /proc/[PID]/{stat,cwd,maps} by path every time. Therefore, if /proc/[PID] was replaced by a directory with the same name, apport ended up referring to the new (replaced) directory. After the patch, apport opens the directory /proc/[PID] once, and subsequent reads from procfs use openat() with the file descriptor of that opened directory. Below is verification code showing that openat() prevents the directory from being swapped out.

garyo@garyo:~/sandbox$ cat test.py  
#!/usr/bin/python3  
import os, tempfile  
d=tempfile.mkdtemp()  
dirfd=os.open(d, os.O_RDONLY|os.O_DIRECTORY|os.O_PATH)  
os.open(d+"/bbb", os.O_RDWR|os.O_CREAT)  
print("Open   {0}".format(os.open(d+"/bbb", os.O_RDONLY)))  
print("OpenAt {0}".format(os.open("bbb", os.O_RDONLY, dir_fd=dirfd)))  
os.remove(d+"/bbb")  
os.rmdir(d)  
os.mkdir(d)#create again instead of pid recycle  
os.open(d+"/ccc", os.O_RDWR|os.O_CREAT)  
print("Open   {0}".format(os.open(d+"/ccc", os.O_RDONLY)))  
print("OpenAt {0}".format(os.open("ccc", os.O_RDONLY, dir_fd=dirfd)))  
  
garyo@garyo:~/sandbox$ python3 test.py  
Open   5  
OpenAt 6  
Open   8  
Traceback (most recent call last):  
  File "test.py", line 13, in <module>  
    print("OpenAt {0}".format(os.open("ccc", os.O_RDONLY, dir_fd=dirfd)))  
FileNotFoundError: [Errno 2] No such file or directory: 'ccc'  
  

With this patch, if apport is stopped by SIGSTOP between steps 2 and 4 and the directory is replaced, retrieving the data now fails.
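The same idea can be combined with Python's built-in open() through its opener parameter, so that every later read is resolved relative to one directory descriptor. The sketch below is a minimal illustration and not apport's actual helper: the name dir_fd_opener is made up, and a temporary directory stands in for /proc/[PID].

```python
import os
import tempfile


def dir_fd_opener(dir_fd):
    """Return an opener for open() that resolves paths relative to dir_fd.

    Hypothetical helper name, for illustration only.
    """
    def opener(path, flags):
        # os.open() with dir_fd= uses openat() underneath.
        return os.open(path, flags, dir_fd=dir_fd)
    return opener


# A temporary directory plays the role of /proc/[PID] here (an assumption
# made so the sketch runs without depending on a particular process).
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "stat"), "w") as f:
    f.write("demo contents")

dir_fd = os.open(workdir, os.O_RDONLY | os.O_DIRECTORY)
with open("stat", "r", opener=dir_fd_opener(dir_fd)) as f:
    contents = f.read()
os.close(dir_fd)
print(contents)
```

Because the relative path is resolved against the descriptor rather than a path string, a directory that later appears under the same name is never consulted.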

However, as mentioned earlier, there is no guarantee that the crashed process is still alive after apport is launched. This means that the PID may already point to a different process at the moment apport starts. This is the vulnerability I discovered. (The actual replacement method is explained in the PoC overview section.)

In the previous vulnerability, it was possible to include the memory map of a privileged process in the report while the report stayed owned by the attacker's own uid, so it could be used to leak addresses. What about this time? Here, the process has already been replaced at the moment apport starts. So the uid that apport reads when dropping privileges also belongs to the privileged process, and a normal user has no permission to read the report. This cannot be used to leak addresses. However, apport has one more important feature: it can dump the core file as is. That is, it saves the core dump data sent from the kernel into the working directory of the crashed process (the same as a normal core file dump without apport). By abusing this, we can write the core dump of our own process as a core file into the cwd of the privileged process. This is the key to exploiting this vulnerability. How to actually escalate privileges from here is explained in the PoC overview section.
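To make the consequence concrete, here is a toy model in Python. It does not touch real processes: ToyProcfs, Proc and handler_core_destination are hypothetical names, and the PID, uids and paths are example values. It only shows that a handler which resolves uid and cwd from a PID number gets the new owner's data if the PID changed hands before the lookup.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    uid: int
    cwd: str


class ToyProcfs:
    """Maps a PID to whichever process currently owns it (toy model)."""

    def __init__(self):
        self._table = {}

    def spawn(self, pid, uid, cwd):
        if pid in self._table:
            raise ValueError("PID still in use")
        self._table[pid] = Proc(uid, cwd)

    def exit(self, pid):
        del self._table[pid]

    def lookup(self, pid):
        return self._table[pid]


def handler_core_destination(procfs, pid):
    """Resolve uid and core file location from the PID at handling time."""
    proc = procfs.lookup(pid)
    return proc.uid, proc.cwd + "/core"


procfs = ToyProcfs()
procfs.spawn(4242, uid=1000, cwd="/home/user/sandbox")
crashed_pid = 4242  # the number handed to the crash handler

# Without a race, the handler sees the crashed process itself.
expected = handler_core_destination(procfs, crashed_pid)

# With the race: the crashed process is gone and a privileged process
# has taken over the same PID before the handler looks it up.
procfs.exit(4242)
procfs.spawn(4242, uid=0, cwd="/etc/logrotate.d")
raced = handler_core_destination(procfs, crashed_pid)

print(expected)
print(raced)
```

The core dump data itself still comes from the crashed process, which is why its contents end up in the privileged process's working directory.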

PoC overview

There are two main points that are important in writing this PoC.

  1. Adjust timing of PID recycling

  2. The way to escalate privileges when core file is saved to cwd of privileged process

Adjust timing of PID recycling

The first important point is the timing of PID recycling. What we need is to send SIGSEGV to launch apport, and then make the PID refer to another process by sending SIGKILL and launching a privileged process before apport reads procfs.

It is practically difficult to cycle through the whole PID space in a short time. But since we can send SIGSEGV whenever we like, we can first repeat spawning and killing processes until the PID counter is just about to wrap around to the target PID, and only then send SIGSEGV. This removes the time problem.

However, a timing issue remains. For example, in an implementation that sends SIGSEGV and then sends SIGKILL one second later, apport completes its task and terminates before SIGKILL arrives. On the other hand, sending SIGKILL too early does not work either: if we send both signals at about the same time, the do_coredump() function is not called in the first place. I’ll explain why below.

The Linux kernel handles pending signals when the target process transitions from kernel mode to user mode. At this point, the get_signal() function performs the action for each received signal. For example, if a fatal signal such as SIGSEGV is received, the do_coredump() function is called as shown below, and a core dump file is saved (or the crash reporting program is run, if a pipe is set).

if (sig_kernel_coredump(signr)) {  
			if (print_fatal_signals)  
				print_fatal_signal(ksig->info.si_signo);  
			proc_coredump_connector(current);  
			/*  
			 * If it was able to dump core, this kills all  
			 * other threads in the group and synchronizes with  
			 * their demise.  If we lost the race with another  
			 * thread getting here, it set group_exit_code  
			 * first and our do_group_exit call below will use  
			 * that value and ignore the one we pass it.  
			 */  
			do_coredump(&ksig->info);  
		}  

https://github.com/torvalds/linux/blob/9907ab371426da8b3cffa6cc3e4ae54829559207/kernel/signal.c#L2739-L2751

If both SIGSEGV and SIGKILL are pending when get_signal() is called, the do_coredump() function is not called because of the following branch, which treats the process as one that must simply be terminated. This is why the attack does not work if SIGKILL is sent too early.

/* Has this task already been marked for death? */  
	if (signal_group_exit(signal)) {  
		ksig->info.si_signo = signr = SIGKILL;  
		sigdelset(&current->pending.signal, SIGKILL);  
		trace_signal_deliver(SIGKILL, SEND_SIG_NOINFO,  
				&sighand->action[SIGKILL - 1]);  
		recalc_sigpending();  
		goto fatal;  
	}  

https://github.com/torvalds/linux/blob/9907ab371426da8b3cffa6cc3e4ae54829559207/kernel/signal.c#L2601-L2609
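The effect of these two branches can be summarized with a deliberately simplified Python model. This is not kernel code: default_action is a hypothetical function, the set of core-dumping signals is only a subset, and all other signals are lumped into "none".

```python
import signal

# Signals whose default action dumps core (a subset, for illustration).
COREDUMP_SIGNALS = {
    signal.SIGSEGV, signal.SIGABRT, signal.SIGQUIT,
    signal.SIGILL, signal.SIGBUS, signal.SIGFPE,
}


def default_action(pending):
    """Toy model of what happens when pending signals are handled.

    If SIGKILL is pending, the task is already marked for death and exits
    without a core dump, no matter what else is pending.
    """
    if signal.SIGKILL in pending:
        return "exit"        # corresponds to the "goto fatal" branch
    if any(sig in COREDUMP_SIGNALS for sig in pending):
        return "coredump"    # corresponds to the do_coredump() branch
    return "none"            # everything else is outside this model


print(default_action({signal.SIGSEGV}))                  # coredump
print(default_action({signal.SIGSEGV, signal.SIGKILL}))  # exit
```

In this model the crash handler only runs in the first case, which matches the observation that SIGKILL must not arrive together with SIGSEGV.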

In other words, the two signals must not be sent at nearly the same time, but if SIGKILL comes too late, apport will already have finished its task. This makes the timing hard to adjust. So I decided to use the lock implemented in apport. At startup, apport checks whether another apport instance is already running by using a file called /var/run/apport.lock. This check (the check_lock() function) is called before proc_pid_fd is obtained. So, as long as another apport process keeps running, we can pause execution between the call to do_coredump() and the point where procfs is referenced.
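The general mechanism, an exclusive lock on a file that makes a second instance wait for the first, can be sketched in Python as follows. This is not apport's check_lock(): the helper names are made up, a temporary file stands in for /var/run/apport.lock, and flock() with a non-blocking attempt is used as an assumption so that the held state can be observed inside one process without hanging. A second instance that takes the lock in blocking mode would simply wait at this point.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive lock on path; return the fd or None."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock; a blocking caller would wait here.
        os.close(fd)
        return None
    return fd


def unlock(fd):
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")

first = try_lock(lock_path)    # "first instance" takes the lock
second = try_lock(lock_path)   # "second instance" cannot proceed
unlock(first)                  # first instance finishes
third = try_lock(lock_path)    # now the lock is available again
print(first is not None, second is None, third is not None)
```

Whoever controls how long the first holder runs therefore also controls how long the second one stays paused.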

However, execution of course resumes as soon as the other apport process terminates. Here I remembered a D-Bus-related behavior that I used when I discovered another vulnerability (CVE-2020-11936). The is_closing_session() function reads the environment variable DBUS_SESSION_BUS_ADDRESS from the crashed process, so by setting this variable, we can make the request go to a TCP server owned by the attacker.

def is_closing_session(uid):  
    '''Check if pid is in a closing user session.  
  
    During that, crashes are common as the session D-BUS and X.org are going  
    away, etc. These crash reports are mostly noise, so should be ignored.  
    '''  
    with open('environ', 'rb', opener=proc_pid_opener) as e:  
        env = e.read().split(b'\0')  
    for e in env:  
        if e.startswith(b'DBUS_SESSION_BUS_ADDRESS='):  
            dbus_addr = e.split(b'=', 1)[1].decode()  
            break  
    else:  
        error_log('is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment')  
        return False  
  
    orig_uid = os.geteuid()  
    os.setresuid(-1, os.getuid(), -1)  
    try:  
        gdbus = subprocess.Popen(['/usr/bin/gdbus', 'call', '-e', '-d',  
                                  'org.gnome.SessionManager', '-o', '/org/gnome/SessionManager', '-m',  
                                  'org.gnome.SessionManager.IsSessionRunning'], stdout=subprocess.PIPE,  
                                 stderr=subprocess.PIPE, env={'DBUS_SESSION_BUS_ADDRESS': dbus_addr})  
        (out, err) = gdbus.communicate()  
  

https://git.launchpad.net/ubuntu/+source/apport/tree/data/apport?h=applied/ubuntu/focal-devel&id=02a1fd19eafeae3f8e98f1461c9bcea850f0c419#n253

Now, what if the TCP server doesn’t send a response? The gdbus command does not terminate, so gdbus.communicate() does not return and execution is paused. In other words, the program is stopped while holding the lock on /var/run/apport.lock.
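The stalling effect itself is generic and easy to reproduce with plain sockets. The following Python sketch is only a demonstration of a peer that accepts a connection and stays silent: start_silent_server is a hypothetical helper, an ordinary client socket with a timeout stands in for gdbus, and the reply bytes are arbitrary.

```python
import socket
import threading


def start_silent_server(release):
    """Accept one client on a free localhost port and stay silent until
    `release` is set. Returns the port number. Illustration only."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def serve():
        conn, _ = srv.accept()
        with conn, srv:
            release.wait()            # send nothing: the peer just waits
            conn.sendall(b"bye\r\n")  # arbitrary bytes to unblock the peer

    threading.Thread(target=serve, daemon=True).start()
    return port


release = threading.Event()
port = start_silent_server(release)

client = socket.create_connection(("127.0.0.1", port), timeout=0.5)
try:
    client.recv(64)
    stalled = False
except socket.timeout:
    stalled = True       # no reply within 0.5 s: the client is stuck

release.set()            # the server decides when the client may continue
client.settimeout(5)
reply = client.recv(64)
client.close()
print(stalled, reply)
```

A client without a timeout would wait indefinitely in the first recv(), which is the situation described above for gdbus.communicate().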

In summary, the following steps make it possible to pause apport for the target PID and reliably wait for the PID to be reused.

  1. Set the environment variable DBUS_SESSION_BUS_ADDRESS to a value like tcp:host=127.0.0.1,port=8888

  2. Launch the TCP server on the specified port

  3. Launch apport by sending SIGSEGV to process A

  4. An AUTH request arrives from gdbus at the TCP server; stall it by not responding.

  5. Launch apport by sending SIGSEGV to a newly created process B

  6. apport stops at the check_lock() function, because the other apport process has not terminated.

  7. Send SIGKILL to process B.

  8. Launch a privileged process in some way to duplicate the PID of process B

  9. Send some random response from TCP server to resume execution

  10. apport for process A terminates

  11. The lock is released and the apport execution for process B resumes (at this point the PID of process B has already been recycled).

How to escalate privileges when core file is saved to cwd of privileged process

The previous section explained how to recycle the PID, but not yet how to escalate privileges. For this part, I referred to the exploit listed in the Reference section.

First, to summarize the situation so far: we can save the core dump of a process we prepared into the cwd of a privileged process, provided that this privileged process can be launched by an unprivileged process or that its start time can be predicted. This means that the target process must be exploitable through a file that it reads from its cwd. It was a little difficult to find a process that meets this condition. There is a lot of software (such as cron) that lets us specify commands in a file, but it is hard to find one that calls chdir() to change its cwd to the directory it reads from. In the end, logrotate turned out to meet the condition.

logrotate was perfect for this exploit for the following reasons.

  1. Basically always executed at a specific time (timing can be predicted)

  2. Basically run with high privileges

  3. Calls chdir() before reading files in /etc/logrotate.d/
    https://github.com/logrotate/logrotate/blob/dac0b9e5db48861f7554fdf668f51ed59a07adb5/config.c#L684-L713

  4. It doesn’t depend on the file name (it reads all files in that directory)

  5. Since the file format is not strictly checked and invalid characters are skipped, it is enough for an otherwise malformed binary file to contain the text of a valid configuration.
    https://github.com/logrotate/logrotate/blob/dac0b9e5db48861f7554fdf668f51ed59a07adb5/config.c#L1048-L2050

Besides point 3, which I mentioned earlier, point 5 is also quite important. Since a core dump file is an image of memory, we cannot control its contents completely, so it is basically treated as a malformed binary file.

Thanks to point 5, by crashing a program that defines the following string, I was able to create a configuration file in /etc/logrotate.d/ that executes touch /tmp/exploited.

char payload[] = "\n/tmp/pwn.log{\n    su root root\n    daily\n    size=0\n    firstaction\n        touch /tmp/exploited;\n    endscript\n}\n";  

When logrotate is executed again, the command will be executed.
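The reason a string in memory survives into a usable config can be illustrated with a small Python sketch that pulls printable runs out of a binary blob, similar to the strings utility. This does not emulate logrotate's parser: printable_runs is a hypothetical helper, and fake_core is a made-up byte sequence that stands in for a real core file.

```python
import re

# The configuration text from the article, as it would sit in memory.
payload = (b"\n/tmp/pwn.log{\n    su root root\n    daily\n    size=0\n"
           b"    firstaction\n        touch /tmp/exploited;\n"
           b"    endscript\n}\n")


def printable_runs(data, min_len=8):
    """Return runs of printable ASCII (plus tab/newline) of min_len or more."""
    pattern = rb"[\t\n\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]


# Stand-in for a core file: binary noise around the embedded text.
fake_core = b"\x7fELF\x02\x01\x01\x00" * 4 + payload + b"\x00\xff\x10\x00"

runs = printable_runs(fake_core)
print(runs[0])
```

The surrounding bytes are discarded and the stanza comes out intact, which is the property that point 5 relies on.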

Aside

As an aside, the explanation so far is not quite sufficient when you actually try to exploit this. The window between logrotate calling chdir(), finishing parsing the config files, and calling chdir() back to the original directory is too short. Since several lines of a Python program (apport) have to run within that window, very precise timing is required.

Since this is only a PoC, it was enough for me to show that exploitation is possible in principle. So I dealt with the problem by creating a meaningless config file with a large file size in /etc/logrotate.d/. The larger the config file, the longer it takes to parse, which extends the time during which logrotate's cwd is /etc/logrotate.d/. I used inotify to detect the chdir(), the core file was then written to /etc/logrotate.d/, and I confirmed that privileges are escalated when logrotate is executed again.

Below is an image of the result of the execution. (logrotate is executed manually.)

Fix

The vendor released the patch below, so upgrading the apport package is recommended.

http://launchpadlibrarian.net/491870223/apport_2.20.11-0ubuntu27.4_2.20.11-0ubuntu27.6.diff.gz

Summary

In this post, I explained a privilege escalation method that uses the crash reporting function of Ubuntu. I will post technical articles about other vulnerabilities once they become public. Thank you :)

Reference

https://securitylab.github.com/research/ubuntu-apport-CVE-2019-15790

https://www.exploit-db.com/exploits/37088

https://scan.netsecurity.ne.jp/article/2015/06/15/36624.html