I recently bought an Acer Nitro 5 to replace my old dev workstation, and as I don't want to part ways with Manjaro, I fired up Architect and began wrestling with it right away :). After using the new install for a while, I started getting random freezes (possibly lightdm crashing, but I'm not 100% sure) that lock up everything but the mouse cursor; switching to a different tty gives a blank screen with no prompt.
Working in the CLI only, I noticed random PCIe bus error messages printed to the console over whatever was on screen (I captured an isolated one on camera, image below), and these seem to be what makes my GUI setup (xorg + lightdm + xfce) freeze so hard. While looking for a solution, I found a post here from April recommending the linux-amd-staging-drm-next-git kernel. Not only did it not work, it was also noticeably slower and disabled the touchpad.
I am now desperate for a solution: working with cli/tmux only is crippling my productivity, and on top of that these messages covering the console are infuriating to work with. As you may have noticed, this is my first post here, so sorry in advance if it is in the wrong category.
First thing is to check for a BIOS update. Second thing is to make sure you're running kernel 4.18.
Next, you could try the kernel boot parameter idle=nomwait and check out this thread: AMD Ryzen: Problems and Fixes 🔧
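To confirm that a boot parameter such as idle=nomwait actually reached the running kernel, you can inspect /proc/cmdline. The following is only a minimal sketch: the helper names parse_cmdline and has_param are made up for illustration, and the parsing is naive (it splits on whitespace and does not handle quoted values).

```python
from pathlib import Path
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def has_param(cmdline: str, name: str, value: Optional[str] = None) -> bool:
    """True if `name` is present and, when `value` is given, matches it."""
    params = parse_cmdline(cmdline)
    if name not in params:
        return False
    return value is None or params[name] == value


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    proc = Path("/proc/cmdline")
    if proc.exists():
        active = proc.read_text().strip()
        print("idle=nomwait active:", has_param(active, "idle", "nomwait"))
```

If this reports False after a reboot, the parameter was most likely not written to the bootloader configuration, or the configuration was not regenerated.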
I'm on the latest Insyde BIOS (1.08) and kernel 4.19, as of today.
Just logged in to try solutions from that thread, only to get this in 1 second
Also, why does the kernel on Ryzen still require intel-ucode.img?
It doesn't. Replace it with amd-ucode. That's also worth trying here.
Also, please try not to upload massively high-resolution photos to the forum unless it's absolutely necessary. I've cropped, resized, and recompressed the uploads to reduce the size from 2MB->60KB.
Sorry, my bad, it is hard to do it otherwise on this site
After enabling nomwait and disabling C6, the GUI still locks up at random intervals, and tty output is still interleaved with these messages, making mc unusable. From what I can see, the messages pop up more often with increased network activity, be it wifi or ethernet.
Sorry to hear that. The APU should be ok in that configuration. A WiFi NIC is a definite candidate, especially the cheap Realtek crap they usually defile computers with. Energy saving options are notorious for being problematic. Someone knowledgeable in this area should be able to guide you further.
I replaced the ucode and updated the kernel and mesa to today's versions. During the update I got amdgpu vblank errors (while on tty3) alongside exactly the same PCIe bus correction messages. After the reboot the problems persist. And as Klorax guessed, the Nitro 5 has a Realtek chipset with some pseudo-Killer functionality.
Try adding pci=noaer to the kernel parameters to get rid of the PCI errors; it disables Advanced Error Reporting for PCI.
Use kernel version 4.18 as 4.19 is not stable.
Remove quiet from the parameters to confirm the errors are gone.
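On a GRUB-based install, persistent kernel parameters usually live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub. As a rough sketch of the edit suggested above (adding pci=noaer and dropping quiet), the function below rewrites that line in a config text. The function name edit_grub_cmdline is hypothetical, and it assumes the value is wrapped in double quotes on a single line.

```python
import re
from typing import Iterable


def edit_grub_cmdline(config_text: str,
                      add: Iterable[str] = (),
                      remove: Iterable[str] = ()) -> str:
    """Return config_text with GRUB_CMDLINE_LINUX_DEFAULT adjusted.

    Parameters listed in `remove` are dropped, parameters in `add` are
    appended unless already present. Other lines are left untouched.
    """
    to_add = list(add)
    to_remove = set(remove)
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"[ \t]*$', re.MULTILINE
    )

    def rewrite(match: "re.Match[str]") -> str:
        params = [p for p in match.group(2).split() if p not in to_remove]
        for param in to_add:
            if param not in params:
                params.append(param)
        return '{}"{}"'.format(match.group(1), " ".join(params))

    new_text, count = pattern.subn(rewrite, config_text)
    if count == 0:
        raise ValueError("no GRUB_CMDLINE_LINUX_DEFAULT line found")
    return new_text


if __name__ == "__main__":
    # Illustrative input only, not a real system's configuration.
    sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet idle=nomwait"\n'
    print(edit_grub_cmdline(sample, add=["pci=noaer"], remove=["quiet"]))
```

The sketch only transforms text; it does not write the file. After saving the changed line to /etc/default/grub as root, the GRUB configuration still has to be regenerated (update-grub on Manjaro) before the next boot picks it up.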
The first thing I do with new installations is remove the quiet option :). I switched to 4.19 after 4.17 refused to boot; I had no idea it was the unstable one. I'll switch immediately if you say so, because I'd like to be productive on Monday with this laptop.
4.18.9 started on a scary note (sorry in advance if it comes up big again)
and then it became even worse
What should I do now?
Do you have amd-ucode enabled?
If all else fails, try installing Arch Linux.
If the Arch installer boots up without any errors, you are good to go.
That's a pretty extreme solution to a driver firmware error...
I couldn't agree more. Also, amd-ucode seems to change nothing; both types of messages still appear.
OK, I think there may be separate issues happening here. The ath10k firmware issue is down to your network card, so have a search for other ath10k issue threads on the forum (IIRC some cards with that chip are problematic).
AMDGPU errors are a separate issue; that looks to be pointing towards
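To keep the separate issues apart when reading the kernel log, it can help to count messages per source. This is a minimal sketch, assuming the relevant lines contain the substrings "PCIe Bus Error", "ath10k", or "amdgpu"; the labels and function names are made up for illustration, and the first matching marker wins when a line contains more than one.

```python
from collections import Counter
from typing import Iterable, Optional

# Substrings based on the message types discussed in this thread.
# Matching is case-insensitive; dict order decides which marker wins.
MARKERS = {
    "pcie_aer": "pcie bus error",
    "ath10k": "ath10k",
    "amdgpu": "amdgpu",
}


def classify(line: str) -> Optional[str]:
    """Return the label of the first marker found in the line, else None."""
    lowered = line.lower()
    for label, marker in MARKERS.items():
        if marker in lowered:
            return label
    return None


def count_by_source(lines: Iterable[str]) -> Counter:
    """Count kernel log lines per label, ignoring unrelated lines."""
    counts: Counter = Counter()
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] += 1
    return counts
```

One way to feed it is to save the kernel messages of the current boot to a file (for example the output of journalctl -k -b) and pass the file's lines to count_by_source. If one label dominates during heavy network traffic, that points at which of the issues to chase first.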
Should I change their drivers?
I tried to find a solution to the ath10k problem, with no results (and the thread about it here has been auto-closed).
I just want to get rid of these messages altogether, ASAP. The card works fine, but the messages hang any GUI operation and are distracting on the tty.
There are numerous people running a Ryzen CPU, but if no one has had and resolved the same issue, then no, there's no "fast solution".
However, the fastest way of getting help is posting to a public thread rather than messaging a single person.
Which one? As I mentioned before, the only thread about this issue with ath10k has been auto-locked.
I guess you know this already, but confirm that the system is fully updated.
Then try booting straight to a TTY by editing the GRUB entry. That should save you from the error spam, IIRC. Then check and maybe start your DM, or run startx.