GPU has fallen off the bus and system froze

Hi, I'm using Manjaro Xfce as my working system, and recently I've been experiencing unexpected system freezes.

From what I remember, most of them happened when I left the computer untouched for a while with the screen still on (for example, while playing a video in Chrome). Once it happens, nothing responds, not even the Caps Lock key. The only way out is a long press on the power button.

I searched the system log and got the following error message:

5月 30 11:46:14 shore-82b6 kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=993, GPU has fallen off the bus.
5月 30 11:46:21 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:27 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:37 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57d:0:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:1:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:3:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:5:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:7:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57d:0:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:1:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:3:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:5:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:7:0:0x0000000f
5月 30 11:46:45 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
5月 30 11:46:47 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
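To pick the Xid events out of the repeated nvidia-modeset errors, a small filter can help. This is a minimal sketch, assuming the log lines keep the exact `NVRM: Xid (PCI:...): <code>, pid=<pid>, <message>` layout shown above; the function name `find_xid_events` and the regular expression are illustrative, not part of any NVIDIA tool.

```python
import re

# Assumption: Xid lines follow the layout seen in the log above.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): "
    r"(?P<xid>\d+), pid=(?P<pid>[^,]+), (?P<msg>.*)"
)

def find_xid_events(lines):
    """Return one dict per NVRM Xid line found in `lines`."""
    events = []
    for line in lines:
        m = XID_RE.search(line)
        if m:
            events.append({
                "pci": m.group("pci"),
                "xid": int(m.group("xid")),
                "pid": m.group("pid"),  # kept as text; not always numeric
                "msg": m.group("msg").strip(),
            })
    return events

# Two lines copied from the log above: one Xid event, one modeset error.
sample = [
    "5月 30 11:46:14 shore-82b6 kernel: NVRM: Xid (PCI:0000:01:00): 79, "
    "pid=993, GPU has fallen off the bus.",
    "5月 30 11:46:21 shore-82b6 kernel: nvidia-modeset: ERROR: GPU:0: Failed "
    "to query display engine channel state: 0x0000c57e:0:0:0x0000000f",
]
events = find_xid_events(sample)
for event in events:
    print(event)
```

After a freeze, the lines could come from the kernel messages of the previous boot, for example by saving the output of `journalctl -k -b -1` to a file and passing its lines to `find_xid_events`.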

Here is the output of inxi -Fxxx, in case it helps:

System:
  Host: shore-82b6 Kernel: 5.6.12-1-MANJARO x86_64 bits: 64 compiler: gcc 
  v: 9.3.0 Desktop: Xfce 4.14.2 tk: Gtk 3.24.20 info: xfce4-panel wm: xfwm4 
  dm: LightDM 1.30.0 Distro: Manjaro Linux 
Machine:
  Type: Laptop System: LENOVO product: 82B6 v: Lenovo Legion R7000 2020 
  serial: <root required> Chassis: type: 10 v: Lenovo Legion R7000 2020 
  serial: <root required> 
  Mobo: LENOVO model: LNVNB161216 v: SDK0L77769 WIN serial: <root required> 
  UEFI: LENOVO v: EUCN18WW date: 04/27/2020 
Battery:
  ID-1: BAT0 charge: 60.7 Wh condition: 61.3/60.0 Wh (102%) volts: 17.4/15.4 
  model: Celxpert L19C4PC0 type: Li-poly serial:  4751 status: Unknown 
  cycles: 2 
  ID-2: hidpp_battery_0 charge: N/A condition: N/A volts: 3.8/N/A 
  model: Logitech G903 LIGHTSPEED Wireless Gaming Mouse w/ HERO type: N/A 
  serial: 4087-ad-46-38-99 status: Discharging 
CPU:
  Topology: 8-Core model: AMD Ryzen 7 4800H with Radeon Graphics bits: 64 
  type: MT MCP arch: Zen rev: 1 L2 cache: 4096 KiB 
  flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm 
  bogomips: 92660 
  Speed: 1614 MHz min/max: 1400/2900 MHz boost: enabled Core speeds (MHz): 
  1: 1455 2: 1396 3: 1397 4: 1397 5: 1397 6: 1397 7: 1397 8: 1397 9: 1581 
  10: 1396 11: 1640 12: 2819 13: 1560 14: 1797 15: 1397 16: 1396 
Graphics:
  Device-1: NVIDIA vendor: Lenovo driver: nvidia v: 440.82 bus ID: 01:00.0 
  chip ID: 10de:1f99 
  Display: x11 server: X.Org 1.20.8 driver: nvidia 
  resolution: 1920x1080~60Hz 
  OpenGL: renderer: GeForce GTX 1650/PCIe/SSE2 v: 4.6.0 NVIDIA 440.82 
  direct render: Yes 
Audio:
  Device-1: NVIDIA driver: snd_hda_intel v: kernel bus ID: 01:00.1 
  chip ID: 10de:10fa 
  Device-2: AMD Raven/Raven2/FireFlight/Renoir Audio Processor 
  vendor: Lenovo driver: N/A bus ID: 06:00.5 chip ID: 1022:15e2 
  Device-3: AMD Family 17h HD Audio vendor: Lenovo driver: snd_hda_intel 
  v: kernel bus ID: 06:00.6 chip ID: 1022:15e3 
  Sound Server: ALSA v: k5.6.12-1-MANJARO 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  vendor: Lenovo driver: r8169 v: kernel port: 1000 bus ID: 03:00.0 
  chip ID: 10ec:8168 
  IF: eno1 state: down mac: 00:2b:67:2c:2a:d1 
  Device-2: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel port: 1000 
  bus ID: 04:00.0 chip ID: 8086:2723 
  IF: wlp4s0 state: up mac: f8:e4:e3:d5:8d:a6 
Drives:
  Local Storage: total: 465.76 GiB used: 11.51 GiB (2.5%) 
  ID-1: /dev/nvme0n1 vendor: Samsung model: MZVLB512HBJQ-000L2 
  size: 476.94 GiB speed: 31.6 Gb/s lanes: 4 serial: S4DYNF0N130294 
  rev: 3L1QEXF7 scheme: GPT 
  ID-2: /dev/nvme1n1 vendor: Western Digital model: WDS500G3X0C-00SJG0 
  size: 465.76 GiB speed: 31.6 Gb/s lanes: 4 serial: 191729803490 
  rev: 102000WD scheme: GPT 
Partition:
  ID-1: / size: 457.16 GiB used: 11.51 GiB (2.5%) fs: ext4 
  dev: /dev/nvme1n1p2 
Sensors:
  System Temperatures: cpu: 55.0 C mobo: N/A gpu: nvidia temp: 41 C 
  Fan Speeds (RPM): N/A 
Info:
  Processes: 334 Uptime: 9m Memory: 30.74 GiB used: 1.85 GiB (6.0%) 
  Init: systemd v: 245 Compilers: gcc: 9.3.0 Shell: bash v: 5.0.16 
  running in: xfce4-terminal inxi: 3.0.37 

It has been annoying, and I actually reinstalled the system this morning. Please help! Thank you very much!!

It's pretty warm with summer almost here in the Northern hemisphere. What's with all this about falling off the bus and getting hypothermia.

Do try to be more careful. :upside_down_face: :wink:

Seriously though, have you tested other kernels?
If not be sure to test at least 3 alternate kernels.

Is your BIOS up to date?

I would also try temporarily masking tlp. Search the forum for information about that troubleshooting step.

See here:


I found that too, thank you. I'll try it; I've already changed the power mode.

That's super you got things fixed up, but please be so kind as to elaborate on what actually was the thing that corrected the issue.


I'm having a similar issue with my Dell XPS 7950 laptop. I've been unable to solve it by disabling TLP or by following the solution tbg linked.

Hi, sorry for the late reply. The GPU hasn't fallen off the bus since I permanently set NVIDIA PowerMizer to performance mode.

It has been 2 days and seems OK.

But I'm still not sure if it is totally fixed, I'll keep watching and if it happens again, I'll post here.
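For anyone wanting to try the same setting from a script, here is a minimal sketch. It assumes that `nvidia-settings` is installed, that an X session is running, and that the value `1` for the `GPUPowerMizerMode` attribute selects "Prefer Maximum Performance". The helper names are made up for illustration, and the thread does not say how the setting was made permanent, so that part is not covered.

```python
import subprocess

# Assumption: 1 selects "Prefer Maximum Performance" for GPUPowerMizerMode.
PREFER_MAX_PERFORMANCE = 1

def powermizer_command(gpu_index=0, mode=PREFER_MAX_PERFORMANCE):
    """Build the nvidia-settings argv that assigns the PowerMizer mode."""
    return [
        "nvidia-settings",
        "-a",
        f"[gpu:{gpu_index}]/GPUPowerMizerMode={mode}",
    ]

def apply_powermizer(gpu_index=0, mode=PREFER_MAX_PERFORMANCE):
    """Run the command; requires the NVIDIA driver and a running X session."""
    return subprocess.run(powermizer_command(gpu_index, mode), check=True)

# Only build and show the command here; call apply_powermizer() to run it.
cmd = powermizer_command()
print(" ".join(cmd))
```

A setting applied this way normally lasts only for the current X session, so it would have to be re-applied at login (for example from a session autostart entry) to behave as a permanent change.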


Hi, I set NVIDIA PowerMizer to performance mode, and so far the GPU hasn't fallen off the bus again.

What driver and kernel version do you have? I'm on 5.6 with prime-440, and my graphics card is a 1650.

Thanks for the reply, shore.

I'm on kernel 5.4 with the video-nvidia-430xx driver. I will try upgrading the kernel and driver, and watch for any changes. :slight_smile:

I have the GTX 1650 graphics card as well, so this seems like the right direction to go.

Update:

I switched to kernel 5.6, and for some time I was doing fine (better than yesterday). Then I noticed my temps on the GPU were 5-10 degrees lower than normal. I then noticed that I had forgotten to plug in my laptop. When I did this, my GPU temps rose to about 78C under heavy load. After a minute of this, my GPU fell off the bus again.

I never have any problems with my GPU except when it reaches a temperature of around 80C. I now suspect that my issue is simply a built-in hardware safeguard protecting the graphics card.

I had a quick search around the internet for temperature problems on the Dell XPS 7950 (my model), and I've found multiple users having the same problem. Looks like the fix for me is to undervolt my GPU :slight_smile: Thanks for the help though.
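To check whether the dropouts really line up with temperature, the GPU temperature can be polled and compared against a threshold. The following is a minimal sketch, assuming `nvidia-smi` is available. The 78C default is taken from the reading reported above and is not a documented limit, and the function names are illustrative.

```python
import subprocess

def parse_gpu_temps(output):
    """Parse nvidia-smi CSV output: one integer temperature per GPU per line."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]

def read_gpu_temps():
    """Query current GPU temperatures in degrees Celsius via nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_temps(result.stdout)

def too_hot(temps, threshold_c=78):
    """Return True if any GPU is at or above the (assumed) threshold."""
    return any(t >= threshold_c for t in temps)

# Offline example with a made-up reading; use read_gpu_temps() on a real system.
example = parse_gpu_temps("78\n")
print(example, too_hot(example))
```

Calling `read_gpu_temps()` in a loop with a short sleep and appending the result to a file would leave a record of the last temperature seen before a freeze.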

Hi Shore,

I got a Legion R7000 2020 too. Did you get Manjaro running well on your laptop?

Yanan

So far: the touchpad is not working, and there is a warning message at shutdown about "failed to umount /oldroot", but the GPU hasn't fallen off the bus again. Everything else is OK, and I actually work and program on this system.

I got this laptop yesterday and am looking for a way to get the touchpad working. Could it be caused by the new chipset for Ryzen 4000?

I don't know... but I currently work with a mouse, so it doesn't really matter to me, lol.

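@shore
Hello,
I have some questions I'd like to ask you.
I also bought an R7000,
and I hope it can do the things I want it to.
You are the "senior" here,
so I'm coming to you with questions about the graphics card.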

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
