I should mention that the crash above happened while I was using about 4 GB of VRAM with PyTorch.
GPU hang with the following journalctl log (at p3):
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00000080010e5000 from client 0x1b (UTCL2)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501431
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00000080010e5000 from client 0x1b (UTCL2)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:88 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000002000 from client 0x1b (UTCL2)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:88 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000002000 from client 0x1b (UTCL2)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:88 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000002000 from client 0x1b (UTCL2)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:88 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000002000 from client 0x1b (UTCL2)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:13 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:88 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000002000 from client 0x1b (UTCL2)
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x005012B1
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: SQC (inst) (0x9)
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0xb
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:88 vmid:5 pasid:32774, for process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223)
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000002000 from client 0x1b (UTCL2)
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Okt 09 11:20:23 maximaschine kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x0
Okt 09 11:20:23 maximaschine kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
Okt 09 11:20:33 maximaschine kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=181272, emitted seq=181275
Okt 09 11:20:33 maximaschine kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process chromium-browse pid 5188 thread chromium-b:cs0 pid 5223
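For anyone comparing their own logs: the per-field lines amdgpu prints after each GCVM_L2_PROTECTION_FAULT_STATUS value appear to be plain bit fields of that register. Below is a minimal sketch of the decoding. The bit positions in FIELDS are an assumption, inferred from the two non-zero status values in the log above (0x00501431 and 0x005012B1) and not checked against the driver headers for this GPU. The helper name decode_fault_status is made up for illustration.

```python
# Assumed bit layout (shift, width) of GCVM_L2_PROTECTION_FAULT_STATUS.
# Inferred from the decoded fields in the log above; not taken from the
# amdgpu register headers, so treat it as a guess that fits these values.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # printed in the log as "Faulty UTCL2 client ID"
    "RW": (18, 1),
    "VMID": (20, 4),  # printed in the "page fault (... vmid:N ...)" line
}


def decode_fault_status(status: int) -> dict:
    """Split a raw status word into the named fields listed in FIELDS."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The two non-zero status values from the log above.
    for raw in (0x00501431, 0x005012B1):
        decoded = decode_fault_status(raw)
        print(f"{raw:#010x}: " + ", ".join(f"{k}={v:#x}" for k, v in decoded.items()))
```

With this layout, both values reproduce the fields shown in the log: client ID 0xa with PERMISSION_FAULTS 0x3 for the first fault, and client ID 0x9 with PERMISSION_FAULTS 0xb for the second, both on VMID 5.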
KDE Plasma crashed when opening Firefox. After running rm -r .cache/kwin .cache/plasmashell the crash is fixed. Maybe these updates really do need to trigger some kind of cache invalidation. (AMD RX 6600 XT, so likely not related to the Intel fix.)
Works
Works for me (AMD Ryzen 7600X + Radeon 6600XT)
ASUS PRIME B650 PLUS, Ryzen 7600X, Radeon RX6600XT: working well (I did not have the GPU clock issue on 6.4.6 either).
Works
Works
Seems good, default tests pass
Works on Ryzen 5 7600X + Radeon RX6600XT
Works
works with kernel 6.3.11
On Workstation this:
- doesn't break flatpak
- makes it not print an error to the journal
so I'm upvoting
Works
Seems ok on AMD Ryzen 5 7600X
Works with Ryzen 7600X + RX 6600 XT
Works.
Works