From linux-kernel@vger.kernel.org Thu Dec 18 21:50:26 2003 Date: Tue, 16 Dec 2003 15:59:16 +0000 From: Linux Kernel Mailing List To: bk-commits-24@vger.kernel.org Subject: [PATCH] duplicate PID fix ChangeSet 1.1302, 2003/12/16 13:59:16-02:00, t-kochi@bq.jp.nec.com [PATCH] duplicate PID fix Hello Marcelo, This fix was sent to lkml in April, and was merged to -ac tree, but is not merged in the main tree yet. Please consider taking this in. Without this, duplicate pids can be allocated, which will make one of them unkillable (signals are deliverd to only one of them), and this can be exploitable (I don't know for sure, but maybe, like brk() ;) This situation happens only when all pid space is full. Usually, users cannot fork processes more than 32768 (PID_MAX), but default user limit of max processes can be more than PID_MAX on large memory machines such as 64bit platforms (although it's adjustable by threads-max sysctl). This patch modifies common code and affects all architectures, but modifies code only executed when no pid is available, so it doesn't hurt any normal path anyway. (BTW, once I sent this patch to Rusty's Trivial patch monkey, but his reply was non-trivial, and he also said this is scary ;) The details are described below: In get_pid(), an available pid is searched through all task_structs even when there is no available pid. If a new pid is not available, the kernel exits the loop with static variable 'next_safe' untouched, which usually is no problem. spin_lock(&lastpid_lock); beginpid = last_pid; if((++last_pid) & 0xffff8000) { last_pid = 300; /* Skip daemons etc. */ goto inside; } if(last_pid >= next_safe) { inside: next_safe = PID_MAX; read_lock(&tasklist_lock); repeat: for_each_task(p) { if(p->pid == last_pid || p->pgrp == last_pid || p->tgid == last_pid || p->session == last_pid) { <= (A) if(++last_pid >= next_safe) { <= (B) if(last_pid & 0xffff8000) last_pid = 300; next_safe = PID_MAX; } if(unlikely(last_pid == beginpid)) <= (C) goto nomorepids; goto repeat; } if(p->pid > last_pid && next_safe > p->pid) next_safe = p->pid; if(p->pgrp > last_pid && next_safe > p->pgrp) next_safe = p->pgrp; if(p->tgid > last_pid && next_safe > p->tgid) next_safe = p->tgid; if(p->session > last_pid && next_safe > p->session) next_safe = p->session; } In a rare case, both (B) and (C) can be true and then, next_safe will remain PID_MAX (32768). If that happens, following get_pid() will always succeed until last_pid reaches 32768 and there may be many duplicate pids. For example, this happens when * PID space are full (300-32767 are all occupied) * the last pid allocated is 10000 * task list chain is like: ...(pids < 9999), 9999, ...(pids 300~9998, 10001~32767)... , 10000 The loop starts searching an available pid with beginpid=10000 and last_pid=10001. last_pid is incremented until it gets PID_MAX and then wraps around to 300, then is incremented again. At the point that p->pid=9999 is found in tasklist (condition (A)), last_pid = 9999 next_safe <= 9998 therefore condition (B) is true, and then last_pid = 10000 next_safe = PID_MAX and then, condition (C) is also true, and exits the loop. To protect this case is simple; when the condition (C) is true, set next_safe to 0 or any safe value to guarantee that a free pid will be searched through next time. Thanks, # This patch includes the following deltas: # ChangeSet 1.1301 -> 1.1302 # kernel/fork.c 1.31 -> 1.32 # fork.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletion(-) diff -Nru a/kernel/fork.c b/kernel/fork.c --- a/kernel/fork.c Tue Dec 16 09:02:43 2003 +++ b/kernel/fork.c Tue Dec 16 09:02:43 2003 @@ -114,8 +114,10 @@ last_pid = 300; next_safe = PID_MAX; } - if(unlikely(last_pid == beginpid)) + if(unlikely(last_pid == beginpid)) { + next_safe = 0; goto nomorepids; + } goto repeat; } if(p->pid > last_pid && next_safe > p->pid) - To unsubscribe from this list: send the line "unsubscribe bk-commits-24" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html