Short tip: amdgpu freezes with Steam under Pop!_OS 22.04
Recently I noticed that the entire graphical interface crashes within a few seconds of a game being streamed via Steam, for example to a Steam Link or an Apple TV box.
I'm using Pop!_OS 22.04 LTS; the kernel in use was 6.2.6-76060206, but the older versions 6.1 and 6.0 seem to be affected as well.
The whole thing is reproducible regardless of which game is started, and it appears to be documented in the kernel log:
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring comp_1.2.0 timeout, signaled seq=30, emitted seq=32
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process LiS-Win64-Shipp pid 84780 thread LiS-Win64-:cs0 pid 84847
[ +0,000308] amdgpu 0000:2d:00.0: amdgpu: GPU reset begin!
[ +0,054467] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <vce_v4_0> failed -22
[ +0,092833] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
[ +0,028231] amdgpu 0000:2d:00.0: amdgpu: BACO reset
[ +0,556223] amdgpu 0000:2d:00.0: amdgpu: GPU reset succeeded, trying to resume
[ +0,000197] [drm] PCIE GART of 512M enabled.
[ +0,000002] [drm] PTB located at 0x000000F400000000
[ +0,000050] [drm] VRAM is lost due to GPU reset!
[ +0,000000] [drm] PSP is resuming...
[ +0,187570] [drm] reserve 0x400000 from 0xf5fec00000 for PSP TMR
[ +0,116934] [drm] kiq ring mec 2 pipe 1 q 0
[ +0,021896] [drm] UVD and UVD ENC initialized successfully.
[ +0,099689] [drm] VCE initialized successfully.
[ +0,000010] amdgpu 0000:2d:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[ +0,000002] amdgpu 0000:2d:00.0: amdgpu: ring gfx_low uses VM inv eng 1 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring gfx_high uses VM inv eng 4 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 5 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 6 on hub 0
[ +0,000000] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 7 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 8 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 9 on hub 0
[ +0,000000] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 10 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 11 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 12 on hub 0
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 13 on hub 0
[ +0,000000] amdgpu 0000:2d:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring page0 uses VM inv eng 1 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring sdma1 uses VM inv eng 4 on hub 1
[ +0,000000] amdgpu 0000:2d:00.0: amdgpu: ring page1 uses VM inv eng 5 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring uvd_0 uses VM inv eng 6 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring uvd_enc_0.0 uses VM inv eng 7 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring uvd_enc_0.1 uses VM inv eng 8 on hub 1
[ +0,000000] amdgpu 0000:2d:00.0: amdgpu: ring vce0 uses VM inv eng 9 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring vce1 uses VM inv eng 10 on hub 1
[ +0,000001] amdgpu 0000:2d:00.0: amdgpu: ring vce2 uses VM inv eng 11 on hub 1
[ +0,001953] amdgpu 0000:2d:00.0: amdgpu: recover vram bo from shadow start
[ +0,000028] amdgpu 0000:2d:00.0: amdgpu: recover vram bo from shadow done
[ +0,000017] [drm] Skip scheduling IBs!
[ +0,000000] amdgpu 0000:2d:00.0: amdgpu: GPU reset(2) succeeded!
...[ +0,000000] [drm] Skip scheduling IBs!
[ +0,420667] rfkill: input handler enabled
[ +0,090733] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
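To check whether another machine shows the same symptom, the kernel log can be searched for the ring timeout message. The following is only a minimal sketch: the function name amdgpu_ring_timeouts is made up for this example, and it assumes that the log text is piped in on standard input, for example from dmesg or journalctl.

```shell
# Hypothetical helper (name is an assumption, not part of any tool):
# print only the amdgpu ring timeout lines from kernel log text on stdin.
amdgpu_ring_timeouts() {
    grep 'amdgpu_job_timedout.*ring .* timeout'
}

# Possible use, as root:
#   dmesg | amdgpu_ring_timeouts                   # current boot
#   journalctl -k -b -1 | amdgpu_ring_timeouts     # previous boot, if the journal is persistent
```

If the function prints nothing, the freeze probably has a different cause than the one described here.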
The solution was to install and boot the Ubuntu Hardware Enablement (HWE) kernel:
# apt-get install linux-image-generic-hwe-22.04
This installs the older kernel version 5.19, in which the driver seems to be less buggy. Since kernel 6.0, there seem to be never-ending problems with the amdgpu driver module in general.
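Since the freezes were only observed on kernels from the 6.x series, it can help to check which kernel is actually running. A small sketch, assuming a release string in the usual form printed by uname -r; the function name is_kernel_6_or_newer is hypothetical, and the string is passed as an argument so the check can be tried with any value:

```shell
# Hypothetical helper: succeed if the given kernel release string
# has a major version of 6 or higher.
is_kernel_6_or_newer() {
    major=${1%%.*}    # text before the first dot, e.g. "6" from "6.2.6-76060206"
    [ "$major" -ge 6 ]
}

# Possible use:
#   is_kernel_6_or_newer "$(uname -r)" && echo "running a 6.x kernel, possibly affected"
```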
To make Pop!_OS boot the older kernel by default, the following command helps, according to the documentation:
# kernelstub -v -k /boot/vmlinuz-5.19.0-46-generic -i /boot/initrd.img-5.19.0-46-generic
Otherwise, the space bar must be pressed at every boot to call up the boot menu and select the older kernel.
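The kernelstub call refers to fixed file names, and the exact version installed by the HWE package may differ from 5.19.0-46-generic. Before running it, it may be worth confirming that both files exist. A minimal sketch; the function name kernel_files_present and its optional second argument are assumptions made for this example:

```shell
# Hypothetical helper: check that the kernel image and the initrd for a
# given version exist in the boot directory (default: /boot).
kernel_files_present() {
    version=$1
    bootdir=${2:-/boot}
    [ -f "$bootdir/vmlinuz-$version" ] && [ -f "$bootdir/initrd.img-$version" ]
}

# Possible use, as root:
#   kernel_files_present 5.19.0-46-generic && \
#     kernelstub -v -k /boot/vmlinuz-5.19.0-46-generic -i /boot/initrd.img-5.19.0-46-generic
```

After the next reboot, uname -r should report the 5.19 kernel if the change took effect.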