Primus could not load gpu driver

Hello everybody,

Since the update to kernel 4.4.52-1, I have had trouble launching programs with primus (for example, primusrun glxspheres64).

Each time, I get the following message:
primus: fatal: Bumblebee daemon reported: error: Could not load GPU driver

As this topic suggests, I upgraded my kernel to version 4.8.17-1.
Unfortunately, nothing changed.

I checked my Python install, and pacman said "Python has been installed as a dependency for another package".

Here is the output of inxi:
System:    Host: manjaro Kernel: 4.8.17-1-MANJARO x86_64 (64 bit gcc: 6.2.1)
           Desktop: Xfce 4.12.3 (Gtk 2.24.31) Distro: Manjaro Linux
CPU:       Quad core Intel Core i7-6700HQ (-HT-MCP-) cache: 6144 KB
Graphics:  Card-1: Intel HD Graphics 530 bus-ID: 00:02.0
           Card-2: NVIDIA GM107M [GeForce GTX 960M] bus-ID: 01:00.0
           Display Server: X.Org 1.19.2 driver: intel Resolution: 1920x1080@60.00hz
           GLX Renderer: Mesa DRI Intel HD Graphics 530 (Skylake GT2)
           GLX Version: 3.0 Mesa 17.0.1 Direct Rendering: Yes

At the moment, I am a bit annoyed and looking for some guidance.

Thanks in advance for any help.

Cheers :slight_smile:

You use the nonfree driver for your NVIDIA card, right?

I forgot to mention it.
Indeed yes @Strit, I'm using video-hybrid-intel-nvidia-bumblebee

Also, I think kernel 4.8 is actually EOL. Have you tried kernel 4.9 LTS?

I'm using 4.9 LTS with the nonfree driver and have the same problem. It broke for me with the last update, and it works fine after downgrading bumblebee (3.2.1-17 -> 3.2.1-16).

Graphics: Card-1: Intel 3rd Gen Core processor Graphics Controller
Card-2: NVIDIA GF108M [GeForce GT 630M]

Here's the change from -16 to -17:

Ugh, upgrading to kernel 4.9 LTS went very badly on my computer.
Since I upgraded, I cannot run Manjaro any more.

Should I look into those modifications while running kernel 4.4 or 4.8?
Is it related to an update of Mesa? I remember updating Mesa from 16 to 17 at the same time.

Oh and... One more question :disappointed_relieved:
I haven't used git yet. Is there any Manjaro documentation on how to use this tool?

Cheers. :slight_smile:

The ACPI errors are not stopping your system. It's just the kernel telling you that the BIOS/UEFI is using an old standard for power management or something.

But it does sound like a GPU driver issue. Did you use mhwd to change drivers?

ACPI errors matter a lot, and they may be the cause of the primus crash.
Please share the full dmesg log.

This issue is directly related to the ACPICA kernel subsystem, not bumblebee.

My friend's ACPI errors went away when we updated his UEFI to the latest version.

I did some work based on your messages.
Here are the results :slight_smile:
(Unfortunately, the issue is still the same)

@Strit,
Thanks for your explanation about the ACPI error.
Thankfully, you were right: it was a GPU issue.
I forgot to disable bumblebeed.service before restarting the laptop. (It is an old issue I had solved with a timer, but the timer is disabled at the moment.)
Kernel 4.9 works 'well'.

However, I didn't understand what you meant by

I have been using video-hybrid-intel-nvidia-bumblebee since the original installation.
Maybe you are talking about updates of this driver? If so, I used pacman. (It updated mhwd-nvidia, if I remember correctly.)

@FadeMind,
Indeed, my BIOS was out of date.
Following the official MSI website, I updated the BIOS to version E175IMS.107, and then to version E175IMS.118.
Now the dmidecode command shows that the BIOS is up to date.

I don't really know if it's better, worse, or the same, but here is the dmesg output:
http://pastebin.com/htSe7JvS

@All,
Just as a test, I tried to run primusrun glxspheres64 once more.
Here is the systemctl status bumblebeed output:


I wonder what libkmod is? :confused:

Cheers :slight_smile:

Whoooa:

[   15.040308] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[   15.040317] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[   15.043946] pcieport 0000:00:1c.0:   device [8086:a110] error status/mask=00000001/00002000
[   15.046880] pcieport 0000:00:1c.0:    [ 0] Receiver Error         (First)

Please upgrade to the Linux 4.10 branch ASAP :stuck_out_tongue:

sudo mhwd-kernel -i linux410

And finally:

add this to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub:

pci=nomsi

Between the quotation marks, of course.

Update grub:

sudo update-grub

Shut down (do NOT reboot).
Power the device on and boot the Linux 4.10 kernel.
Then post a new dmesg log and check whether bumblebee works.
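
For anyone who prefers not to edit the file by hand, here is a minimal sketch of the GRUB change above. It assumes the GRUB_CMDLINE_LINUX_DEFAULT line uses double quotes, as in a stock /etc/default/grub. The helper name add_kernel_param is made up for this post and is not part of GRUB or Manjaro, and the parameter must not contain characters special to sed or grep patterns (pci=nomsi is fine).

```shell
# Hypothetical helper: add a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file, unless that parameter is already on the line.
add_kernel_param() {
    file=$1
    param=$2
    # Already present as a whole word between the quotes? Then do nothing.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    scratch=$(mktemp) || return 1
    # Insert the parameter right after the opening quote.
    if sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"|&${param} |" "$file" > "$scratch"; then
        # Write back in place so the file keeps its owner and permissions.
        cat "$scratch" > "$file"
    fi
    rm -f "$scratch"
}
```

It is probably safest to try it on a copy first (cp /etc/default/grub /tmp/grub.test, then add_kernel_param /tmp/grub.test pci=nomsi) and compare the two files before touching the real one. After changing the real file, sudo update-grub is still needed.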

Yeah, sorry about that. Brainfart. I meant: did you use mhwd to change the kernel?
That installs all the necessary modules for NVIDIA graphics as well.

Couldn't it be this "simple" TLP error, which prevents nvidia from loading once the laptop has run on battery (even once)? If so, there is an easy fix for that:

Get the output of: $ lspci | grep "NVIDIA" | cut -b -8

Open: sudo nano /etc/default/tlp

Put the output of the first command into RUNTIME_PM_BLACKLIST so that it looks like:

RUNTIME_PM_BLACKLIST="01:00.0"

Reboot.
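
The same steps can be sketched as a small script that builds the line for you. This is only an illustration: nvidia_bus_ids and tlp_blacklist_line are names invented here, and the sketch assumes the bus ID is the first field of each lspci line that mentions NVIDIA.

```shell
# Hypothetical helper: read lspci output on stdin and print the bus ID
# (first field) of every line that mentions NVIDIA.
nvidia_bus_ids() {
    awk '/NVIDIA/ { print $1 }'
}

# Hypothetical helper: turn those bus IDs into a ready-to-paste TLP setting,
# with multiple IDs separated by spaces.
tlp_blacklist_line() {
    ids=$(nvidia_bus_ids | tr '\n' ' ' | sed 's/ *$//')
    printf 'RUNTIME_PM_BLACKLIST="%s"\n' "$ids"
}
```

Running lspci | tlp_blacklist_line prints the line to paste into /etc/default/tlp. It does not edit the file itself, so you can check the value before changing anything.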

Hi again guys,

Here is some news:

@michaldybczak,
I set RUNTIME_PM_BLACKLIST to 01:00.0. (That was the output of your command.)
Unfortunately, no change, so I commented the whole line out again. (That is how it was set before.)
In any case, thanks for the tip :slight_smile:

@Strit,
I didn't use mhwd-kernel to change the kernel; I used the GUI. (Settings/Manjaro Settings Manager/Kernel)
It installed some additional stuff, like ndiswrapper and bbswitch.
I suppose that's good. :slight_smile:

@FadeMind,
I have always had this PCIe bus error and never understood what it means. Could you give some details about it? :slight_smile:
Indeed, the new dmesg is free of this message.


Unfortunately, bumblebeed doesn't seem to work any better. :confused:

In the dmesg log, I found another error. Could the issue be related to that?
[ 0.956952] ACPI Error: No handler for Region [EC__] (ffff8802760eb5a0) [EmbeddedControl] (20160930/evregion-166)
[ 0.956959] ACPI Error: Region EmbeddedControl (ID=3) has no handler (20160930/exfldio-299)
[ 0.956963] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8802760ec438), AE_NOT_EXIST (20160930/psparse-543)

@All,
Have a good weekend morning!

Cheers :slight_smile:

[   89.335414] bbswitch: loading out-of-tree module taints kernel.
[   89.335844] bbswitch: version 0.8
[   89.335848] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
[   89.335852] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG0.PEGP
[   89.335859] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160930/nsarguments-95)
[   89.335941] bbswitch: detected an Optimus _DSM function
[   89.336034] bbswitch: disabling discrete graphics
[   89.336037] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160930/nsarguments-95)
[   89.541315] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is off
[   99.320502] bbswitch: enabling discrete graphics

Hmmm

IMO, because you have a major issue in the ACPI DSDT table (the ACPI errors for SB.PCI0.LPCB.EC._REG), bbswitch can turn the NVIDIA card on, but the kernel cannot handle it at all.

Why do you have duplicated parameters in:

Command line: BOOT_IMAGE=/vmlinuz-4.10-x86_64 root=UUID=29914461-0fc4-413a-bcb0-eff0d38d2445 rw pci=nomsi quiet splash pci=nomsi quiet splash resume=UUID=c423e44a-2435-4081-b93d-cdabc971148e

Paste your grub config:

cat /etc/default/grub

Check this command too:

optirun nvidia-smi

@FadeMind,
The duplicated parameters come from having the same values in both GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX.
Here is the head of the grub file:

GRUB_DEFAULT=saved
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Manjaro"
GRUB_CMDLINE_LINUX_DEFAULT="pci=nomsi quiet splash resume=UUID=c423e44a-2435-4081-b93d-cdabc971148e"
GRUB_CMDLINE_LINUX="pci=nomsi quiet splash"

(I did this duplication myself. Shouldn't I have?)

And below is the output of optirun nvidia-smi:

[ 407.104220] [ERROR]Cannot access secondary GPU - error: Could not load GPU driver
[ 407.104287] [ERROR]Aborting because fallback start is disabled.

Cheers.

Change from

GRUB_CMDLINE_LINUX="pci=nomsi quiet splash"

to

GRUB_CMDLINE_LINUX=""

Update grub:

sudo update-grub

Leave GRUB_CMDLINE_LINUX_DEFAULT unchanged.

Try an OLDER kernel (the 4.1 series, if 4.4 and newer have the issue).
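
Some background on the GRUB_CMDLINE_LINUX cleanup: for normal boot entries GRUB appends the contents of both GRUB_CMDLINE_LINUX and GRUB_CMDLINE_LINUX_DEFAULT, so anything listed in both shows up twice on the kernel command line. A minimal sketch to check for leftovers after the reboot (dup_params is a name invented for this post, and it assumes parameters are separated by spaces and contain none themselves):

```shell
# Hypothetical helper: print every parameter that occurs more than once
# in the command line string passed as the first argument.
dup_params() {
    # One parameter per line, sorted, then keep only the repeated ones.
    printf '%s\n' "$1" | tr -s ' ' '\n' | sort | uniq -d
}
```

On the running system it can be called as dup_params "$(cat /proc/cmdline)". Empty output means no parameter is duplicated.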

@FadeMind,
Oops, where's my head today. Sorry :confused:
GRUB_CMDLINE_LINUX cleaned.
Here is the output of dmesg while running 4.1.

Unfortunately, I'm not able to reach the GUI with this kernel. (I used tty2 to get this output.)
With kernel 4.4, I didn't notice any errors, but primusrun/optirun still cannot load the secondary GPU.

Cheers.

Please use Linux 4.10 (the latest) for now.
