Oops in hfi1 when using multiple Ceph writers
On lake2, using Ceph over IPoIB with multiple writers triggers a kernel oops after a few seconds:
[ 2116.528509] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 2116.536343] #PF: supervisor read access in kernel mode
[ 2116.542106] #PF: error_code(0x0000) - not-present page
[ 2116.547853] PGD 0 P4D 0
[ 2116.550699] Oops: 0000 [#1] PREEMPT SMP PTI
[ 2116.555380] CPU: 4 PID: 42 Comm: ksoftirqd/4 Not tainted 6.4.11 #1-NixOS
[ 2116.562889] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
[ 2116.574768] RIP: 0010:napi_schedule_prep+0x9/0x50
[ 2116.580050] Code: 68 54 0c 94 e8 58 3e cf ff 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 <48> 8b 4f 10 f6 c1 04 75 29 48 89 ca 48 89 c8 83 e2 01 48 01 d2 48
[ 2116.601069] RSP: 0018:ffffabe5c65f0eb8 EFLAGS: 00010046
[ 2116.606923] RAX: ffffffffc14f1ab0 RBX: 0000000000000000 RCX: 0000000000000001
[ 2116.614916] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2116.622905] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 2116.630897] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000617
[ 2116.638887] R13: ffff9164955396b0 R14: 0000000000000016 R15: ffff916498d09a00
[ 2116.646878] FS: 0000000000000000(0000) GS:ffff9173bfb00000(0000) knlGS:0000000000000000
[ 2116.655940] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2116.662375] CR2: 0000000000000010 CR3: 0000000a8ee20002 CR4: 00000000003706e0
[ 2116.670366] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2116.678356] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2116.686346] Call Trace:
[ 2116.689089] <IRQ>
[ 2116.691350] ? __die+0x23/0x70
[ 2116.694782] ? page_fault_oops+0x17d/0x4b0
[ 2116.700050] ? ip_protocol_deliver_rcu+0x32/0x170
[ 2116.705968] ? exc_page_fault+0x6d/0x150
[ 2116.711007] ? asm_exc_page_fault+0x26/0x30
[ 2116.716336] ? __pfx_hfi1_ipoib_sdma_complete+0x10/0x10 [hfi1]
[ 2116.723646] ? napi_schedule_prep+0x9/0x50
[ 2116.728875] hfi1_ipoib_sdma_complete+0x38/0x90 [hfi1]
[ 2116.735353] sdma_make_progress+0x178/0x460 [hfi1]
[ 2116.741459] ? __pfx_hfi1_ipoib_sdma_complete+0x10/0x10 [hfi1]
[ 2116.748712] sdma_engine_interrupt+0x72/0x100 [hfi1]
[ 2116.755030] sdma_interrupt+0x36/0x110 [hfi1]
[ 2116.760632] __handle_irq_event_percpu+0x4d/0x1a0
[ 2116.766538] handle_irq_event+0x3e/0x80
[ 2116.771462] handle_edge_irq+0x9d/0x280
[ 2116.776380] __common_interrupt+0x46/0xc0
[ 2116.781495] common_interrupt+0x81/0xa0
[ 2116.786418] </IRQ>
[ 2116.789403] <TASK>
[ 2116.792382] asm_common_interrupt+0x26/0x40
[ 2116.797708] RIP: 0010:skb_segment+0x86b/0xf00
[ 2116.803222] Code: 24 44 8b 74 24 60 49 89 cc 48 8b 4c 24 28 e9 8b 00 00 00 48 8b 11 48 8b 79 08 49 89 14 24 48 89 d0 49 89 7c 24 08 48 8b 50 08 <f6> c2 01 0f 85 c9 03 00 00 0f 1f 44 00 00 f0 ff 40 34 41 8b 44 24
[ 2116.825561] RSP: 0018:ffffabe5c65dbb90 EFLAGS: 00000213
[ 2116.832097] RAX: ffffd6a144ae8c00 RBX: ffff9164af715c00 RCX: ffff9164db525400
[ 2116.840773] RDX: 0000000000000000 RSI: ffff91648734f0e8 RDI: 0000000000008000
[ 2116.849444] RBP: ffffabe5c65dbc60 R08: 0000000000005dac R09: 0000000000006574
[ 2116.858127] R10: 25dd4e99d6e1ffe7 R11: 0000000000000003 R12: ffff916487cb7980
[ 2116.866801] R13: 0000000000005df8 R14: 0000000000000001 R15: 0000000000000000
[ 2116.875493] ? __pfx_csum_partial_ext+0x10/0x10
[ 2116.881263] ? __pfx_csum_block_add_ext+0x10/0x10
[ 2116.887289] tcp_gso_segment+0xec/0x4e0
[ 2116.892247] ? __pfx_tcp_wfree+0x10/0x10
[ 2116.897283] inet_gso_segment+0x159/0x3d0
[ 2116.902393] ? hfi1_ipoib_send+0x246/0x560 [hfi1]
[ 2116.908364] skb_mac_gso_segment+0xa4/0x110
[ 2116.914180] __skb_gso_segment+0xb7/0x170
[ 2116.919271] ? netif_skb_features+0x151/0x2e0
[ 2116.924746] validate_xmit_skb+0x16c/0x340
[ 2116.929930] validate_xmit_skb_list+0x4e/0x70
[ 2116.935392] sch_direct_xmit+0x18a/0x380
[ 2116.940372] __qdisc_run+0x149/0x5a0
[ 2116.944952] net_tx_action+0x1df/0x2a0
[ 2116.949714] __do_softirq+0xca/0x2ae
[ 2116.954278] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 2116.960005] run_ksoftirqd+0x2c/0x40
[ 2116.964575] smpboot_thread_fn+0xdc/0x1d0
[ 2116.969622] kthread+0xe8/0x120
[ 2116.973702] ? __pfx_kthread+0x10/0x10
[ 2116.978465] ret_from_fork+0x2c/0x50
[ 2116.983033] </TASK>
[ 2116.986029] Modules linked in: netconsole ipmi_si nfsv3 nfs_acl nfs lockd grace netfs fscache msr sb_edac edac_core intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common hfi1 x86_pkg_temp_thermal intel_powerclamp coretemp crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha512_generic aesni_intel mgag200 libaes drm_shmem_helper crypto_simd cryptd igb drm_kms_helper rdmavt rapl iTCO_wdt mei_me intel_cstate intel_pmc_bxt ptp syscopyarea ib_uverbs pps_core watchdog sysfillrect mxm_wmi sunrpc intel_uncore sysimgblt mei i2c_i801 i2c_algo_bit ioatdma i2c_smbus lpc_ich evdev dca input_leds joydev led_class mousedev mac_hid wmi tiny_power_button acpi_power_meter acpi_pad button xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat sch_fq_codel nf_tables libcrc32c nfnetlink atkbd libps2 serio vivaldi_fmap loop cpufreq_powersave tun tap macvlan bridge stp llc kvm irqbypass ib_ipoib ib_cm
[ 2116.986177] ib_umad ib_core ipmi_watchdog ipmi_devintf ipmi_msghandler fuse drm efi_pstore backlight configfs dmi_sysfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sd_mod ahci xhci_pci xhci_pci_renesas libahci firmware_class ehci_pci xhci_hcd libata ehci_hcd nvme nvme_core usbcore scsi_mod t10_pi crc32c_intel crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common usb_common scsi_common rtc_cmos dm_mod dax [last unloaded: ipmi_si]
[ 2117.145385] CR2: 0000000000000010
[ 2117.149915] ---[ end trace 0000000000000000 ]---
[ 2117.215956] RIP: 0010:napi_schedule_prep+0x9/0x50
[ 2117.222128] Code: 68 54 0c 94 e8 58 3e cf ff 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 <48> 8b 4f 10 f6 c1 04 75 29 48 89 ca 48 89 c8 83 e2 01 48 01 d2 48
[ 2117.244851] RSP: 0018:ffffabe5c65f0eb8 EFLAGS: 00010046
[ 2117.251528] RAX: ffffffffc14f1ab0 RBX: 0000000000000000 RCX: 0000000000000001
[ 2117.260351] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2117.269151] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 2117.277962] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000617
[ 2117.286754] R13: ffff9164955396b0 R14: 0000000000000016 R15: ffff916498d09a00
[ 2117.295538] FS: 0000000000000000(0000) GS:ffff9173bfb00000(0000) knlGS:0000000000000000
[ 2117.305396] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2117.312654] CR2: 0000000000000010 CR3: 0000000a8ee20002 CR4: 00000000003706e0
[ 2117.321457] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2117.330257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2117.339079] Kernel panic - not syncing: Fatal exception in interrupt
[ 2117.347081] Kernel Offset: 0x12200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 2117.420699] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
I didn't see any hfi1 changes in 6.4.12, but it may be worth a try.
Reported to the kernel maintainers.
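
A note on reading the trace: the faulting bytes marked in the `Code:` line (`<48> 8b 4f 10`) decode to `mov 0x10(%rdi),%rcx`, and RDI holds the first function argument on x86-64. With RDI at zero, the read lands on address 0x10, which is what CR2 reports. This suggests, without proving it, that `hfi1_ipoib_sdma_complete` passed a NULL napi pointer to `napi_schedule_prep`. The small sketch below only re-checks that arithmetic against values copied from the log; the helper name `reg` is made up for illustration.

```python
import re

# Lines copied verbatim from the oops above (timestamps dropped).
OOPS = """\
RIP: 0010:napi_schedule_prep+0x9/0x50
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
CR2: 0000000000000010 CR3: 0000000a8ee20002 CR4: 00000000003706e0
"""


def reg(name, text):
    """Return the value of a register printed as 'NAME: <16 hex digits>'."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})\b", text)
    if match is None:
        raise ValueError(f"register {name} not found")
    return int(match.group(1), 16)


# Displacement taken from the faulting instruction bytes: 48 8b 4f 10
# is mov 0x10(%rdi),%rcx, so the load address is RDI + 0x10.
DISPLACEMENT = 0x10

rdi = reg("RDI", OOPS)  # first argument of napi_schedule_prep
cr2 = reg("CR2", OOPS)  # faulting address recorded by the CPU

# True when the fault address is explained by a NULL first argument.
fault_matches = rdi == 0 and rdi + DISPLACEMENT == cr2
print(f"RDI={rdi:#x} CR2={cr2:#x} NULL-argument explanation holds: {fault_matches}")
```

If that reading is right, the thing to look for in the driver is a completion callback running for a TX queue whose napi pointer is unset or already torn down, but that part is an assumption and not something the log confirms.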