Doesn't corrupt my XFS file systems the way 6.3.3 did.
The XFS issue is still present in kernel 6.3.4, but it seems to be gone in the 6.4 series.
@jforbes The crash is easily reproducible. We just needed to start the server, we didn't even have to start any services to make it happen. We are working on providing information based on the guidelines from the XFS maintainer. Hope it's ready in a couple of hours.
@jforbes I should mention that the corrupted file systems were created with Fedora 31, while the server that survived was created with Fedora 36.
@jforbes The issue seems to still be there. I got data corruption soon after reboot and service start on two of three servers. This post refers to xfs changes in 6.3: https://www.spinics.net/lists/linux-xfs/msg71098.html.
@jforbes I have put the scratch build on four servers just now. Will let you know how they behave.
That makes a lot of sense. Since it happened to me on more than half of the servers with this kernel, it's not likely it's out there. I will test again when it's pushed to testing.
Hi!
This happened on two HPE DL360 Gen8 servers and one HPE DL580 Gen8. All use hardware RAID and Intel CPUs. Two cases happened after a reboot, one of which was a hard reboot. On the DL580 it happened during normal operation and the machine died.
I haven't seen any issues like this in several years of running 100 Fedora servers. So it seems something bad happened.
Hi
It seems this kernel corrupts XFS file systems quite frequently. It has happened three times for me on different servers. Repairing the file system didn't make the system boot.
My advice: Stay away from this if you use XFS!
On an HP DL360 G6 server, I can't open xterm and cssh through an ssh connection after upgrading to kernel 6.0.17. I get the "Can't open display: localhost:10.0" error when trying. Kernel 6.0.16 and all previous kernels tested since the server was installed with Fedora in 2015 have worked for this purpose.
Hi! Only XFS and Ext4 are in use on the affected servers at my end. In total, 8 different servers crash consistently with this update: HP ProLiant DL380 G7 and IBM BladeCenter HS22 machines. It took anywhere from an hour to a day for them to crash.
I can look for more traces in the logs if it's of interest.
I also see freezes after a few hours on several servers with heavy load that used to work stably with the 5.13 series.
For what it's worth - here are two traces:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 2520 at kernel/ucount.c:280 dec_rlimit_ucounts+0x50/0x60
Modules linked in: binfmt_misc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache netfs rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf>
CPU: 1 PID: 2520 Comm: agent_control Tainted: G I 5.14.7-200.fc34.x86_64 #1
Hardware name: HP ProLiant DL380 G7, BIOS P67 05/21/2018
RIP: 0010:dec_rlimit_ucounts+0x50/0x60
Code: c8 f0 48 0f c1 04 31 48 29 d0 78 1e 48 39 cf 4c 0f 44 c0 48 8b 41 10 48 8b 88 e8 01 00 00 48 85 c9 75 db 4d 85 c0 0f 94 c0 c3 <0f> 0b eb de 31 c0 c3 66 0f 1f 84 00 00 00 00 00 0f 1f>
RSP: 0018:ffffb53690527db8 EFLAGS: 00010296
RAX: ffffffffa963d36f RBX: ffffb53690527eb8 RCX: ffff977268261b00
RDX: 0000000000000001 RSI: 0000000000000080 RDI: ffff977268261b00
RBP: ffff977268261b00 R08: ffffffffffffffff R09: ffffffffffffffff
R10: 00000000fb6cd72d R11: 0000000000000000 R12: 0000000000000b3f
R13: 0000000000000010 R14: dead000000000122 R15: ffff976649b10000
FS: 00007f51cac6e740(0000) GS:ffff977c37a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8293daeb00 CR3: 0000000d2ea8e005 CR4: 00000000000206e0
Call Trace:
 release_task+0x45/0x4d0
 ? thread_group_cputime_adjusted+0x3b/0x50
 wait_consider_task+0x494/0xa90
 do_wait+0x1e8/0x2e0
 kernel_wait4+0x96/0x120
 ? thread_group_exited+0x50/0x50
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f51cad3faca
Code: ff e9 0a 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 3d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28>
RSP: 002b:00007ffe7b2fe338 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
RAX: ffffffffffffffda RBX: ffffffffffffff60 RCX: 00007f51cad3faca
RDX: 0000000000000003 RSI: 00007ffe7b2fe34c RDI: 00000000ffffffff
RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffe7b2fe700
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f51cac6e6a0
R13: 0000000000000000 R14: 0000000000000001 R15: 00000000018206d0
------------[ cut here ]------------
------------[ cut here ]------------
kernel BUG at mm/slub.c:321!
invalid opcode: 0000 [#1] SMP PTI
CPU: 21 PID: 1205930 Comm: python3 Not tainted 5.14.7-200.fc34.x86_64 #1
Hardware name: IBM BladeCenter HS22 -[7870TKN]-/68Y8161, BIOS -[P9E164CUS-1.28]- 04/17/2018
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b 4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49 3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb6934db7fda0 EFLAGS: 00010246
RAX: ffff9ec506bf81e0 RBX: ffff9ec506bf8180 RCX: ffff9ec506bf8180
RDX: 00000000802a0029 RSI: ffffe444841afe00 RDI: ffff9ec500042800
RBP: ffffb6934db7fe50 R08: 0000000000000001 R09: ffffffffb710b505
R10: ffff9ed6f338d000 R11: 0000000062667658 R12: ffffe444841afe00
R13: ffff9ec500042800 R14: ffff9ec506bf8180 R15: ffff9ec506bf8180
FS: 00007fb9deb00740(0000) GS:ffff9ee21fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb9d15315f0 CR3: 000000010b450001 CR4: 00000000000206e0
Call Trace:
 ? filename_lookup+0x135/0x1b0
 ? put_ucounts+0x65/0x70
 kfree+0x369/0x3c0
 put_ucounts+0x65/0x70
 put_cred_rcu+0x70/0xd0
 do_faccessat+0x113/0x240
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fb9ded6744b
Code: 77 05 c3 0f 1f 40 00 48 8b 15 29 1a 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 0f 1f 40 00 f3 0f 1e fa b8 15 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f9 19 0>
RSP: 002b:00007fff21cd8e98 EFLAGS: 00000202 ORIG_RAX: 0000000000000015
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb9ded6744b
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007fb8fc07f390
RBP: 0000000000000001 R08: 0000000000000000 R09: 00007fb9d1596930
R10: 00007fb8fc07f000 R11: 0000000000000202 R12: 00007fff21cd8eb0
R13: 0000000000000001 R14: 000055b9f23d2bd0 R15: 00000000ffffff9c
Modules linked in: binfmt_misc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache netfs xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nft_objref n>
 scsi_transport_sas mptscsih bnx2 mptbase crc32c_intel i2c_algo_bit target_core_mod fuse
---[ end trace 2615b3389f7a01df ]---
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b 4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49 3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb6934db7fda0 EFLAGS: 00010246
RAX: ffff9ec506bf81e0 RBX: ffff9ec506bf8180 RCX: ffff9ec506bf8180
RDX: 00000000802a0029 RSI: ffffe444841afe00 RDI: ffff9ec500042800
RBP: ffffb6934db7fe50 R08: 0000000000000001 R09: ffffffffb710b505
R10: ffff9ed6f338d000 R11: 0000000062667658 R12: ffffe444841afe00
R13: ffff9ec500042800 R14: ffff9ec506bf8180 R15: ffff9ec506bf8180
FS: 00007fb9deb00740(0000) GS:ffff9ee21fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb9d15315f0 CR3: 000000010b450001 CR4: 00000000000206e0
------------[ cut here ]------------
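In case it helps anyone comparing traces like these across several servers, here is a rough Python sketch that pulls the trace type, source location, and hardware name out of saved kernel log text. The function name `summarize_traces` and the regular expressions are my own assumptions based on the two traces in this comment; this is not part of any kernel tooling and other trace formats may not match.

```python
import re

# Hypothetical patterns, derived only from the two traces posted here.
HEADER_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<warn_loc>\S+)"
    r"|kernel BUG at (?P<bug_loc>[^\s!]+)"
)
HARDWARE_RE = re.compile(r"Hardware name: (?P<hw>.+?), BIOS")
CUT_MARKER = "------------[ cut here ]------------"


def summarize_traces(log_text):
    """Return one dict per trace found between 'cut here' markers."""
    summaries = []
    for chunk in log_text.split(CUT_MARKER):
        header = HEADER_RE.search(chunk)
        if header is None:
            continue  # nothing recognizable in this chunk
        if header.group("warn_loc"):
            kind, location = "WARNING", header.group("warn_loc")
        else:
            kind, location = "BUG", header.group("bug_loc")
        hardware = HARDWARE_RE.search(chunk)
        summaries.append({
            "kind": kind,
            "location": location,
            "hardware": hardware.group("hw") if hardware else None,
        })
    return summaries


if __name__ == "__main__":
    sample = (
        CUT_MARKER + " WARNING: CPU: 1 PID: 2520 at kernel/ucount.c:280 "
        "dec_rlimit_ucounts+0x50/0x60 Hardware name: HP ProLiant DL380 G7, "
        "BIOS P67 05/21/2018 " + CUT_MARKER
    )
    for entry in summarize_traces(sample):
        print(entry)
```

It only groups traces by where they fired and on which machine; it does not interpret the call trace itself.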
This breaks ghostscript on some PDFs.
/usr/bin/ghostscript -o 'compressed.pdf' -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dNOPAUSE -dQUIET -dBATCH -dPDFSETTINGS=/printer -f 'orig.pdf'
DEBUG: FC_WEIGHT didn't match in /usr/share/fonts/cantarell/Cantarell-VF.otf
jbig2dec FATAL ERROR decoding image: incompatible jbig2dec header (0.17) and library (0.18) versions
**** Error reading a content stream. The page may be incomplete. Output may be incorrect.
**** Error: File has unbalanced q/Q operators (too many Q's) Output may be incorrect.
**** Error: Form stream has unbalanced q/Q operators (too many q's) Output may be incorrect.
**** Error reading a content stream. The page may be incomplete. Output may be incorrect.
**** Error: File did not complete the page properly and may be damaged. Output may be incorrect.
Before the upgrade things worked fine.
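The fatal line in that output reports that the jbig2dec header version (0.17) differs from the library version (0.18). A small Python sketch for spotting this in captured ghostscript output, assuming the message wording stays exactly as shown above; the function name `find_jbig2dec_mismatch` is made up for illustration:

```python
import re

# Pattern copied from the error text in this comment; other versions of
# ghostscript or jbig2dec may word the message differently.
MISMATCH_RE = re.compile(
    r"incompatible jbig2dec header \((?P<header>[\d.]+)\) "
    r"and library \((?P<library>[\d.]+)\) versions"
)


def find_jbig2dec_mismatch(output):
    """Return (header_version, library_version), or None if not reported."""
    match = MISMATCH_RE.search(output)
    if match is None:
        return None
    return match.group("header"), match.group("library")


if __name__ == "__main__":
    line = (
        "jbig2dec FATAL ERROR decoding image: incompatible jbig2dec "
        "header (0.17) and library (0.18) versions"
    )
    print(find_jbig2dec_mismatch(line))
```

This only detects the reported mismatch in saved output; it says nothing about which package needs rebuilding.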
Network interfaces are not found on an IBM HS22 (Type 7870) blade.
Unable to mount an NFS volume. The kernel log says:
selinux unable to set superblock options before the security server is initialized
even though SELinux is disabled. It works on 6.4.15.