filp_open crashes in the release function of a char device driver

Problem description

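I am writing a device driver that has to do a lot of work when a file is closed, and one of those operations is opening another file. If the user program does open, close, and then returns, everything works perfectly. If it only does open and then returns, my driver crashes, and I have noticed that it crashes exactly when it tries to call filp_open in my release function. My impression is that I cannot call filp_open when the release function is invoked not by the user (through close) but by the kernel (because the program returned without closing the file). The path is clearly correct, because the same code works when the user program does open, close, and return in that order.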

This is the code that crashes when it is called from my release method:

struct file *file_open(const char *path, int flags, int rights)
{
    if(path==NULL){
        print_message("path is NULL\n");
        return NULL;
    }
    struct file *filp = NULL;
    mm_segment_t oldfs;
    int err = 0;

    oldfs = get_fs();
    set_fs(get_ds());
    print_message("I'm doing filp_open");

    filp = filp_open(path, flags, rights);
    set_fs(oldfs);
    if (IS_ERR(filp)) {
        print_message("error during filp_open\n");
        err = PTR_ERR(filp);
        return NULL;
    }
    if(filp==NULL){
        print_message("filp is NULL\n");
        return NULL;
    }
    return filp;
}

This is the kernel dump shown by dmesg:

[  961.870540] I'm doing filp_open
[  961.870548] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  961.870550] PGD 0 P4D 0
[  961.870554] Oops: 0000 [#1] SMP PTI
[  961.870556] CPU: 1 PID: 2315 Comm: userspace Tainted: G           OE     4.18.0-25-generic #26~18.04.1-Ubuntu
[  961.870558] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  961.870570] RIP: 0010:set_root+0x26/0xc0
[  961.870571] Code: 1f 44 00 00 0f 1f 44 00 00 55 65 48 8b 04 25 00 5c 01 00 48 89 e5 41 55 41 54 41 52 53 f6 47 38 40 4c 8b a0 88 0a 00 00 74 3d <41> 8b 4c 24 08 f6 c1 01 75 7c 49 8b 54 24 20 49 8b 44 24 18 48 89
[  961.870596] RSP: 0018:ffffbd9bc33dbaa8 EFLAGS: 00010202
[  961.870598] RAX: ffff9a8f72384500 RBX: ffffbd9bc33dbbf0 RCX: 0000000000000001
[  961.870599] RDX: ffffffff8fef34c8 RSI: 0000000000000041 RDI: ffffbd9bc33dbbf0
[  961.870600] RBP: ffffbd9bc33dbac8 R08: ffff9a8fbfd27080 R09: ffff9a8fb474c600
[  961.870602] R10: ffffbd9bc33dba98 R11: 00000000ffffffff R12: 0000000000000000
[  961.870603] R13: ffffbd9bc33dbbf0 R14: 0000000000000001 R15: 0000000000000002
[  961.870605] FS:  0000000000000000(0000) GS:ffff9a8fbfd00000(0000) knlGS:0000000000000000
[  961.870606] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  961.870607] CR2: 0000000000000008 CR3: 0000000050e0a002 CR4: 00000000000606e0
[  961.870610] Call Trace:
[  961.870615]  path_init+0x16f/0x2f0
[  961.870617]  path_openat+0x78/0x1780
[  961.870621]  ? sched_clock+0x9/0x10
[  961.870626]  ? sched_clock_cpu+0x11/0xb0
[  961.870628]  do_filp_open+0x9b/0x110
[  961.870633]  ? vprintk_emit+0xec/0x290
[  961.870636]  file_open_name+0x114/0x180
[  961.870638]  ? file_open_name+0x114/0x180
[  961.870640]  filp_open+0x33/0x60
[  961.870643]  file_open+0x56/0x90 [driver]
[  961.870645]  my_char_device_driver_close+0x96/0x190 [driver]
[  961.870647]  __fput+0xea/0x220
[  961.870649]  ____fput+0xe/0x10
[  961.870652]  task_work_run+0x9d/0xc0
[  961.870655]  do_exit+0x2eb/0xb30
[  961.870658]  ? __do_page_fault+0x270/0x4d0
[  961.870660]  do_group_exit+0x43/0xb0
[  961.870662]  __x64_sys_exit_group+0x18/0x20
[  961.870666]  do_syscall_64+0x5a/0x120
[  961.870672]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  961.870674] RIP: 0033:0x7f0304ee3e06
[  961.870675] Code: Bad RIP value.
[  961.870679] RSP: 002b:00007fff99afb5c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[  961.870680] RAX: ffffffffffffffda RBX: 00007f03051e6740 RCX: 00007f0304ee3e06
[  961.870682] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[  961.870686] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff80
[  961.870688] R10: 0000000000000002 R11: 0000000000000246 R12: 00007f03051e6740
[  961.870689] R13: 0000000000000001 R14: 00007f03051ef628 R15: 0000000000000000
[  961.870691] Modules linked in: driver(OE) vboxvideo(OE) vboxsf(OE) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm crct10dif_pclmul crc32_pclmul snd_seq_midi snd_seq_midi_event ghash_clmulni_intel joydev pcbc snd_rawmidi aesni_intel aes_x86_64 crypto_simd cryptd glue_helper snd_seq intel_rapl_perf input_leds snd_seq_device snd_timer serio_raw snd soundcore mac_hid vboxguest(OE) sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid psmouse video vmwgfx ttm ahci drm_kms_helper libahci syscopyarea e1000 sysfillrect i2c_piix4 sysimgblt fb_sys_fops pata_acpi drm
[  961.870724] CR2: 0000000000000008
[  961.870726] ---[ end trace 3fb99e3beca99ccc ]---
[  961.870728] RIP: 0010:set_root+0x26/0xc0
[  961.870729] Code: 1f 44 00 00 0f 1f 44 00 00 55 65 48 8b 04 25 00 5c 01 00 48 89 e5 41 55 41 54 41 52 53 f6 47 38 40 4c 8b a0 88 0a 00 00 74 3d <41> 8b 4c 24 08 f6 c1 01 75 7c 49 8b 54 24 20 49 8b 44 24 18 48 89
[  961.870753] RSP: 0018:ffffbd9bc33dbaa8 EFLAGS: 00010202
[  961.870755] RAX: ffff9a8f72384500 RBX: ffffbd9bc33dbbf0 RCX: 0000000000000001
[  961.870756] RDX: ffffffff8fef34c8 RSI: 0000000000000041 RDI: ffffbd9bc33dbbf0
[  961.870757] RBP: ffffbd9bc33dbac8 R08: ffff9a8fbfd27080 R09: ffff9a8fb474c600
[  961.870759] R10: ffffbd9bc33dba98 R11: 00000000ffffffff R12: 0000000000000000
[  961.870760] R13: ffffbd9bc33dbbf0 R14: 0000000000000001 R15: 0000000000000002
[  961.870761] FS:  0000000000000000(0000) GS:ffff9a8fbfd00000(0000) knlGS:0000000000000000
[  961.870763] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  961.870764] CR2: 00007f0304ee3ddc CR3: 0000000050e0a002 CR4: 00000000000606e0
[  961.870765] Fixing recursive fault but reboot is needed!

Tags: c, linux, linux-kernel, linux-device-driver

Solution


I think you are correct: the crash is the result of filp_open() being called from the file's "release" handler during process exit.

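The immediate cause of the crash appears to be that current->fs is NULL when set_root() is reached from filp_open() (via path_init() in the call trace).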

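The do_exit() function calls exit_files() to close the open files, but the files' "release" handlers are not called immediately. Instead, work items are queued on the current task so that the "release" handlers are called later.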

The do_exit() function then calls exit_fs(), which destroys current->fs and sets current->fs to NULL.

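Further on, do_exit() calls exit_task_work(), which runs the previously queued work items (and prevents any more work items from being added). This is the point at which the files' "release" handlers are called.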

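The upshot is that current->fs is valid in the "release" handler when the file is closed normally, but current->fs is NULL in the "release" handler when the file is closed because the task is exiting. filp_open() crashes when current->fs is NULL, so you should avoid calling it from a "release" handler, or at least check that current->fs is non-NULL before calling filp_open() from a "release" handler.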
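The ordering described above, together with the suggested NULL check, can be modeled in plain user-space C. This is only a sketch for illustration: model_fs, model_task, and the model_* functions are hypothetical stand-ins invented here, not kernel types or APIs, and the "open" is reduced to a boolean that says whether calling filp_open() would have been safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct model_fs { int unused; };

struct model_task {
    struct model_fs *fs;    /* plays the role of current->fs */
    bool release_queued;    /* plays the role of the queued task work */
};

/* Stand-in for the driver's release handler. Returns true if it would be
 * safe to call filp_open(), false if the guard skipped the open. */
bool model_release(const struct model_task *t)
{
    if (t->fs == NULL)
        return false;       /* guard: fs is gone, do not open */
    return true;
}

/* Explicit close(): the handler runs while the task's fs is still set. */
bool model_close(struct model_task *t)
{
    return model_release(t);
}

/* Exit without close(), in the order described for do_exit(). */
bool model_exit(struct model_task *t)
{
    bool opened = false;

    t->release_queued = true;   /* exit_files(): handler is only queued */
    t->fs = NULL;               /* exit_fs(): fs is dropped */
    if (t->release_queued) {    /* exit_task_work(): queued handler runs */
        t->release_queued = false;
        opened = model_release(t);
    }
    return opened;
}

/* Small self-check exercising both paths. */
void model_selfcheck(void)
{
    struct model_fs fs = { 0 };
    struct model_task closer = { &fs, false };
    struct model_task exiter = { &fs, false };

    assert(model_close(&closer));   /* open allowed on explicit close */
    assert(!model_exit(&exiter));   /* open skipped on the exit path */
    assert(exiter.fs == NULL);
}
```

In the real driver the equivalent guard would be a test of current->fs at the top of the release handler. Note that the guard only avoids the crash: on the exit path the file is simply not opened.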

A possible workaround may be to call filp_open() from a kernel thread, or from a work item queued on the system workqueue with schedule_work().
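A minimal kernel-side sketch of the schedule_work() variant might look like the following. It is untested and makes several assumptions: struct deferred_open, deferred_open_fn(), and queue_deferred_open() are hypothetical names introduced here, the open flags and mode are placeholders, and the path is assumed to be an absolute path whose storage outlives the work item. It relies on the work function running in a kernel worker thread, where current->fs is expected to be valid.

```c
#include <linux/err.h>
#include <linux/fcntl.h>
#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

/* Hypothetical work item carrying what the deferred open needs. */
struct deferred_open {
	struct work_struct work;
	const char *path;	/* assumed to stay valid until the work has run */
};

/* Runs later in a kernel worker thread, not in the exiting task. */
static void deferred_open_fn(struct work_struct *work)
{
	struct deferred_open *d = container_of(work, struct deferred_open, work);
	/* Flags and mode are placeholders; use whatever the driver needs. */
	struct file *filp = filp_open(d->path, O_WRONLY | O_CREAT, 0600);

	if (IS_ERR(filp)) {
		pr_err("deferred filp_open failed: %ld\n", PTR_ERR(filp));
	} else {
		/* ... do the work that needed the file ... */
		filp_close(filp, NULL);
	}
	kfree(d);	/* the work item owns its own allocation */
}

/* Call from the release handler instead of calling filp_open() there. */
static int queue_deferred_open(const char *path)
{
	struct deferred_open *d = kmalloc(sizeof(*d), GFP_KERNEL);

	if (!d)
		return -ENOMEM;
	d->path = path;
	INIT_WORK(&d->work, deferred_open_fn);
	schedule_work(&d->work);
	return 0;
}
```

With this approach the release handler returns before the file is opened, so any result of the open has to be handled inside the work function. The module's exit path also needs to make sure no such work item is still pending or running before the module is unloaded.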

