The siblings of `struct task_struct current` always include a process with `pid = 0`

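Question

I am hacking on the Linux kernel and playing with `struct task_struct current`.

When I print the pid and command name of each sibling, there appears to be a malformed process with `pid = 0` whose command name is garbage.

The same thing happens for the process's parent.

Why does a process with `pid = 0` show up among the siblings? Isn't that pid reserved for `swapper`?

Code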

// Loop over process and parents using something like:

/*
printk("--syscall ## Begin process results ##"); 

printk("--syscall // View children //");
my_list_head = &(current->children);

printk("--syscall // View siblings //");
my_list_head = &(current->sibling);

printk("--syscall results:  ...");
*/


if (my_list_head == NULL) {
  return 0;
}

list_for_each(tempNode, my_list_head) {
  tempTask = list_entry(tempNode, struct task_struct,
          sibling);
  printk("--syscall The %ld-th process's pid is %d and command %s",
         count, tempTask->pid, tempTask->comm);
 }

Output

Formatted with blank lines added for readability

[ 2938.994084] --syscall ## Begin process results ##
[ 2938.994089] --syscall // View children //
[ 2938.994105] --syscall // View siblings //
[ 2938.994116] --syscall The 1-th process's pid is 0 and command \x80ݶE\x96\xff\xff
[ 2938.994133] --syscall results: pid=1400 name=process_ancesto state=0 uid=1000 nvcsw=1 nivcsw=0 num_children=0 num_siblings=1

[ 2938.994139] --syscall ## Begin process results ##
[ 2938.994144] --syscall // View children //
[ 2938.994149] --syscall The 1-th process's pid is 1400 and command process_ancesto
[ 2938.994158] --syscall // View siblings //
[ 2938.994163] --syscall The 1-th process's pid is 0 and command
[ 2938.994176] --syscall results: pid=1282 name=bash state=1 uid=1000 nvcsw=88 nivcsw=18 num_children=1 num_siblings=1

[ 2938.994180] --syscall ## Begin process results ##
[ 2938.994185] --syscall // View children //
[ 2938.994190] --syscall The 1-th process's pid is 1282 and command bash
[ 2938.994198] --syscall // View siblings //
[ 2938.994203] --syscall The 1-th process's pid is 1275 and command systemd
[ 2938.994210] --syscall The 2-th process's pid is 0 and command
[ 2938.994216] --syscall The 3-th process's pid is 117 and command systemd-journal
[ 2938.994222] --syscall The 4-th process's pid is 145 and command systemd-udevd
[ 2938.994227] --syscall The 5-th process's pid is 148 and command systemd-network
[ 2938.994233] --syscall The 6-th process's pid is 369 and command systemd-resolve
[ 2938.994239] --syscall The 7-th process's pid is 370 and command systemd-timesyn
[ 2938.994245] --syscall The 8-th process's pid is 412 and command accounts-daemon
[ 2938.994321] --syscall The 9-th process's pid is 413 and command dbus-daemon
[ 2938.994336] --syscall The 10-th process's pid is 417 and command irqbalance
[ 2938.994346] --syscall The 11-th process's pid is 418 and command rsyslogd
[ 2938.994352] --syscall The 12-th process's pid is 419 and command snapd
[ 2938.994359] --syscall The 13-th process's pid is 420 and command systemd-logind
[ 2938.994365] --syscall The 14-th process's pid is 439 and command cron
[ 2938.994372] --syscall The 15-th process's pid is 451 and command atd
[ 2938.994378] --syscall The 16-th process's pid is 456 and command agetty
[ 2938.994385] --syscall The 17-th process's pid is 461 and command sshd
[ 2938.994390] --syscall The 18-th process's pid is 491 and command unattended-upgr
[ 2938.994397] --syscall The 19-th process's pid is 501 and command polkitd
[ 2938.994413] --syscall results: pid=1200 name=login state=1 uid=0 nvcsw=31 nivcsw=33 num_children=1 num_siblings=19

Tags: c, linux, linux-kernel

Answer


Here is an illustration of how two sibling child processes are linked onto their parent's list of child processes:

     PARENT              CHILD 1             CHILD 2
     ======              =======             =======

                         task_struct         task_struct
                        +-------------+     +-------------+
                        |             |     |             |
     task_struct        ~             ~     ~             ~
    +-------------+     |             |     |             |
    |             |     |-------------|     |-------------|
    ~             ~     | children    |     | children    |
    |             |     |             |     |             |
. . |-------------| . . |-------------| . . |-------------| . .
    | children    |     | sibling     |     | sibling     |
X==>| prev | next |<===>| prev | next |<===>| prev | next |<==X
. . |-------------| . . |-------------| . . |-------------| . .
    | sibling     |     |             |     |             |
    |             |     ~             ~     ~             ~
    |-------------|     |             |     |             |
    |             |     +-------------+     +-------------+
    ~             ~
    |             |     'X's are joined together, making
    +-------------+     a doubly linked, circular list.

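Although `children` and `sibling` are both of type `struct list_head`, `children` is used as an actual list head (linking to the list of the task's child processes), whereas `sibling` is used as a list entry.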

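The parent's `children.next` link points to child 1's `sibling` member, child 1's `sibling.next` link points to child 2's `sibling` member, and child 2's `sibling.next` link points back to the parent's `children` member (the list head). Similarly, the parent's `children.prev` link points to child 2's `sibling` member, child 2's `sibling.prev` link points to child 1's `sibling` member, and child 1's `sibling.prev` link points to the parent's `children` member.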
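The six links described above can be checked with a small user-space sketch. `struct task`, `task_init`, `add_child`, and the pids are hypothetical stand-ins invented for illustration; they only mimic the shape of the kernel's `struct task_struct` and its list helpers, and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-ins (assumptions, not the kernel's code). */
struct list_head { struct list_head *next, *prev; };

struct task {
    int pid;
    struct list_head children;  /* list head for this task's children */
    struct list_head sibling;   /* entry in the parent's children list */
};

static void task_init(struct task *t, int pid)
{
    t->pid = pid;
    t->children.next = t->children.prev = &t->children;
    t->sibling.next = t->sibling.prev = &t->sibling;
}

/* Append child->sibling to the tail of parent->children. */
static void add_child(struct task *parent, struct task *child)
{
    struct list_head *head = &parent->children;
    struct list_head *entry = &child->sibling;

    assert(entry->next == entry);  /* child must not be on a list yet */
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Returns 1 if all six links from the description hold, else 0. */
static int links_match_description(void)
{
    struct task parent, child1, child2;

    task_init(&parent, 100);
    task_init(&child1, 101);
    task_init(&child2, 102);
    add_child(&parent, &child1);
    add_child(&parent, &child2);

    return parent.children.next == &child1.sibling &&
           child1.sibling.next  == &child2.sibling &&
           child2.sibling.next  == &parent.children &&
           parent.children.prev == &child2.sibling &&
           child2.sibling.prev  == &child1.sibling &&
           child1.sibling.prev  == &parent.children;
}
```

For this layout `links_match_description()` returns 1: the parent's `children` head and the two `sibling` entries form a single ring.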

The `list_for_each(pos, head)` macro visits each node in the list, starting with `pos` at `head->next` and continuing while `pos != head`.

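Normally, the `head` parameter of `list_for_each(pos, head)` should be an actual list head, but the macro has no way to tell a list head from a list entry. They are the same type, and all the nodes are linked together in a circle. (The whole list consists of one list head with zero or more list entries linked into a circle. For an empty list, the list head simply links back to itself.) The `list_for_each` macro iterates around the doubly linked list until it gets back to where it started.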
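As a rough illustration of that behaviour, here is a simplified user-space version of the loop. The macro below follows the "start at `head->next`, stop at `head`" shape described above, but it is a sketch and may differ from the exact definition in a given kernel version; `struct ring` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Simplified sketch: start at head->next, stop on returning to head. */
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Counts the nodes visited when iteration starts from 'start'. */
static int count_nodes_after(struct list_head *start)
{
    struct list_head *pos;
    int n = 0;

    assert(start != NULL);
    list_for_each(pos, start)
        n++;
    return n;
}

/* Hypothetical ring: one real head plus two entries. */
struct ring { struct list_head head, a, b; };

static void ring_build(struct ring *r)
{
    list_init(&r->head);
    list_add_tail(&r->a, &r->head);
    list_add_tail(&r->b, &r->head);
}

/* Empty list: the head links to itself, so the body never runs. */
static int count_empty(void)
{
    struct list_head head;

    list_init(&head);
    return count_nodes_after(&head);
}

/* Starting at the real head visits the two entries. */
static int count_from_head(void)
{
    struct ring r;

    ring_build(&r);
    return count_nodes_after(&r.head);
}

/* Starting at entry 'a' visits 'b' and then the head itself. */
static int count_from_entry(void)
{
    struct ring r;

    ring_build(&r);
    return count_nodes_after(&r.a);
}
```

Both `count_from_head()` and `count_from_entry()` return 2, which is the point: the loop cannot know that one of the two nodes it visited in the second case was the list head rather than an entry.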

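If `list_for_each(pos, head)` is called with `head` pointing to the parent's `children` member, then `pos` will point to child 1's `sibling` member on the first iteration and to child 2's `sibling` member on the second iteration, and the loop terminates with `pos` pointing to the parent's `children` member. Inside the loop, `list_entry(pos, struct task_struct, sibling)` will correctly point to the start of the child process's `struct task_struct`.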
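The correct case can be sketched in user space as follows. `struct task` is a hypothetical stand-in for `struct task_struct`, the pids are made up, and the two macros are simplified versions written for this example rather than copies of the kernel's.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for struct task_struct. */
struct task {
    int pid;
    struct list_head children;
    struct list_head sibling;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

static void task_init(struct task *t, int pid)
{
    t->pid = pid;
    t->children.next = t->children.prev = &t->children;
    t->sibling.next = t->sibling.prev = &t->sibling;
}

static void add_child(struct task *parent, struct task *child)
{
    struct list_head *head = &parent->children;
    struct list_head *entry = &child->sibling;

    assert(entry->next == entry);  /* child must not be on a list yet */
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Returns the pid of the index-th child (0-based), or -1 if none. */
static int child_pid_at(struct task *parent, int index)
{
    struct list_head *pos;
    int i = 0;

    /* head is the parent's children member, so every pos is a sibling entry */
    list_for_each(pos, &parent->children) {
        if (i++ == index)
            return list_entry(pos, struct task, sibling)->pid;
    }
    return -1;
}

static int demo_child_pid(int index)
{
    struct task parent, child1, child2;

    task_init(&parent, 100);
    task_init(&child1, 101);
    task_init(&child2, 102);
    add_child(&parent, &child1);
    add_child(&parent, &child2);
    return child_pid_at(&parent, index);
}
```

Here `demo_child_pid(0)` returns 101, `demo_child_pid(1)` returns 102, and `demo_child_pid(2)` returns -1 because the loop stops when `pos` comes back around to the parent's `children` head.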

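Let us say that child 1 is the `current` process. OP's code calls `list_for_each(pos, head)` with `head` pointing to child 1's `sibling` member. Therefore, `pos` will point to child 2's `sibling` member on the first iteration and to the parent's `children` member on the second iteration, and the loop terminates with `pos` pointing to child 1's `sibling` member. Inside the loop, `list_entry(pos, struct task_struct, sibling)` will correctly point to the start of child 2's `struct task_struct` on the first iteration, but on the second iteration it will point to somewhere before the start of the parent's `struct task_struct`. That is where the problem lies in OP's code: the bogus `pid = 0` entry with a garbage command name is the parent's list head being misread as a task.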
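The following user-space sketch reproduces the faulty walk and one possible way around it: start from the parent's `children` head and skip the task itself. `struct task` and every helper name here are hypothetical. In the kernel the corresponding head would presumably be `&current->real_parent->children`, and a real implementation would also need suitable locking around the walk; both points are assumptions that this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for struct task_struct. */
struct task {
    int pid;
    struct task *parent;
    struct list_head children;
    struct list_head sibling;
};

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

static void task_init(struct task *t, int pid)
{
    t->pid = pid;
    t->parent = NULL;
    t->children.next = t->children.prev = &t->children;
    t->sibling.next = t->sibling.prev = &t->sibling;
}

static void add_child(struct task *parent, struct task *child)
{
    struct list_head *head = &parent->children;
    struct list_head *entry = &child->sibling;

    assert(entry->next == entry);  /* child must not be on a list yet */
    child->parent = parent;
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Walk as in the question (head = self->sibling). Returns how many visited
 * nodes are really the parent's list head rather than a sibling entry. */
static int heads_visited_from_sibling(struct task *self)
{
    struct list_head *pos;
    int heads = 0;

    list_for_each(pos, &self->sibling) {
        if (pos == &self->parent->children)
            heads++;  /* list_entry() on this pos would not yield a task */
    }
    return heads;
}

/* Walk from the real list head and skip self. Returns the sibling count. */
static int count_siblings(struct task *self)
{
    struct list_head *pos;
    int n = 0;

    list_for_each(pos, &self->parent->children) {
        if (list_entry(pos, struct task, sibling) != self)
            n++;
    }
    return n;
}

static void family_build(struct task *p, struct task *c1, struct task *c2)
{
    task_init(p, 100);
    task_init(c1, 101);
    task_init(c2, 102);
    add_child(p, c1);
    add_child(p, c2);
}

static int demo_heads_visited(void)
{
    struct task parent, child1, child2;

    family_build(&parent, &child1, &child2);
    return heads_visited_from_sibling(&child1);
}

static int demo_sibling_count(void)
{
    struct task parent, child1, child2;

    family_build(&parent, &child1, &child2);
    return count_siblings(&child1);
}
```

`demo_heads_visited()` returns 1, showing that the walk from child 1's `sibling` member passes through the parent's head exactly once, while `demo_sibling_count()` returns 1 because only child 2 is a genuine sibling. The sketch deliberately avoids calling `list_entry()` on the head, since the resulting pointer would not refer to a valid `struct task`.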

