WIFI doesn't work - Help with gutting Network and installing it anew from live system

I've run into the same problem again as in the previous topic:

The thing is, I know now why it happened, sort of...

The trigger is suspending the system and then waking it without AC power plugged in.

What doesn't help:

  • reboot
  • hard reboot
  • boot into Windows and back into Manjaro
  • change of kernel
  • boot into fallback mode
  • systemd network restart
  • logging in as a test user with clean configs
  • restoring a backup (sort of; see the topic above)

Conclusions:

  • Wifi works in Windows or in Manjaro Live, so this isn't a hardware problem.
  • It also fails for a clean user, so it isn't a problem with user configs.
  • Can ACPI somehow permanently alter the network-related daemons or system-wide settings?

I don't want to destroy my system again, as I did in the previous attempt when restoring backups failed, so I need to find some other solution.

My idea is to uninstall all the network-related packages, together with their configs, from the live system and then install them again. I'm looking for a network specialist who knows the packages and how to fix this, at least for the moment. I'm not sure I can prevent it from happening again, but a temporary fix or a workaround (avoiding suspend) would already help. The nuclear option would be to reinstall the system, but that doesn't guarantee the issue won't come back. So I guess I need @tbg or someone who knows the topic inside out.

I would like to avoid fully draining the battery, because that permanently weakens it.


I'm trying to use the solution given in the topic you posted, but I have a problem moving/copying a file between the live system and the installed one.

From the live system I can't access /home because its permissions hide it. Other drives also appear empty (not mounted?). I can't copy the file to the root filesystem either, again because of permissions.
From the chroot, I also don't know how to reach the live system.

I'm sure there has to be a way around it, some mounting or permission trick that I'm not familiar with.

EDIT: I found a solution.

Instead of mhwd-chroot I used manjaro-chroot -a, and then ran mount -a (mount all file systems) inside the chroot. This let me see some partitions other than root and home that I can also access from the live system, so I could put the board.bin file I had downloaded in the live system there.

Then, from the chroot, I renamed board.bin to board.bin.bak in:

/usr/lib/firmware/ath10k/QCA6174/hw2.1/
/usr/lib/firmware/ath10k/QCA6174/hw3.0/

and then used mc (Midnight Commander) to open those directories on one side and, on the other, my mounted partition (not home or root) where the new board.bin resides, and copied it into the proper directories.

I'm stopping at this point and rebooting to see if it helped.

EDIT 2: It didn't help, still no wifi networks available. To sum up what I did and where I have problems with the given solution:

  1. Rename the /usr/lib/firmware/ath10k/QCA6174/ board-2.bin files:
sudo mv /usr/lib/firmware/ath10k/QCA6174/hw2.1/board-2.bin /usr/lib/firmware/ath10k/QCA6174/hw2.1/board-2.bin.bak
sudo mv /usr/lib/firmware/ath10k/QCA6174/hw3.0/board-2.bin /usr/lib/firmware/ath10k/QCA6174/hw3.0/board-2.bin.bak

I did it from the chroot, no problems there. However, I don't understand why we are doing it. What do these files do, and why do we need to disable them?

  2. After some trouble copying the board.bin file from the live system into the installed OS via chroot (see the explanation above), I replaced board.bin in these directories:
/usr/lib/firmware/ath10k/QCA6174/hw2.1/
/usr/lib/firmware/ath10k/QCA6174/hw3.0/

with a file downloaded from http://www.killernetworking.com/support/K1535_Debian/board.bin
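
The backup-and-replace step above can be sketched as a small shell function. This is only an illustration: the function name install_board_file and its optional second argument (an alternative firmware root, handy for trying it on a scratch directory first) are hypothetical, and it copies the original to board.bin.bak instead of renaming it.

```shell
# Sketch: keep a backup of board.bin and install a replacement in each
# hw* directory named in the post. Run as root inside the chroot.
install_board_file() {
    new_file=$1
    # Default root is the path from the post; override it for a dry run.
    fw_root=${2:-/usr/lib/firmware/ath10k/QCA6174}
    for dir in "$fw_root"/hw2.1 "$fw_root"/hw3.0; do
        [ -d "$dir" ] || continue
        # Back up the original once; never overwrite an existing backup.
        if [ -f "$dir/board.bin" ] && [ ! -e "$dir/board.bin.bak" ]; then
            cp -p "$dir/board.bin" "$dir/board.bin.bak"
        fi
        cp "$new_file" "$dir/board.bin"
        # cmp -s is silent and fails if the copy differs from the source.
        cmp -s "$new_file" "$dir/board.bin" || return 1
    done
}

# Example (path to the downloaded file is an assumption):
#   install_board_file /mnt/data/board.bin
```

Because the backup is only taken when no board.bin.bak exists yet, running the function twice does not replace the original backup with the already-swapped file.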

I rebooted and still nothing. So I'm proceeding with the instructions:

  3. If your wifi is still not working correctly then add an ath10k driver option file to /etc/modprobe.d:
echo "options ath10k_core skip_otp=y" | sudo tee /etc/modprobe.d/ath10k.conf 

Still nothing :frowning: . This is slowly getting more and more frustrating. Luckily, there are more things to try on the list, so I'm continuing.
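
Before moving on, it can be worth confirming that the option file really contains what was intended; a file in /etc/modprobe.d only takes effect the next time the module is loaded. A minimal sketch, where has_module_option is a hypothetical helper name:

```shell
# Sketch: succeed if a modprobe.d-style file sets OPTION for MODULE.
# Usage: has_module_option FILE MODULE OPTION
has_module_option() {
    awk -v mod="$2" -v opt="$3" '
        # Only "options <module> ..." lines count; comment lines start with "#".
        $1 == "options" && $2 == mod {
            for (i = 3; i <= NF; i++) if ($i == opt) found = 1
        }
        END { exit !found }
    ' "$1"
}

# Example with the file from the step above:
#   has_module_option /etc/modprobe.d/ath10k.conf ath10k_core skip_otp=y && echo "option is set"
```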

  4. I followed the next piece of advice from within a chroot:
sudo pacman -S iwd 
sudo systemctl mask wpa_supplicant
sudo systemctl enable iwd.service

Reboot, still nothing... but I wondered if it worked so I checked:

sudo systemctl status iwd.service
● iwd.service - Wireless service
     Loaded: loaded (/usr/lib/systemd/system/iwd.service; enabled; vendor preset: disabled)
     Active: active (running) since Sat 2020-03-21 15:24:18 CET; 1min 0s ago
   Main PID: 889 (iwd)
      Tasks: 1 (limit: 9369)
     Memory: 1.9M
     CGroup: /system.slice/iwd.service
             └─889 /usr/lib/iwd/iwd

mar 21 15:24:18 alienware-PC systemd[1]: Starting Wireless service...
mar 21 15:24:18 alienware-PC iwd[889]: Wireless daemon version 1.5
mar 21 15:24:18 alienware-PC systemd[1]: Started Wireless service.
mar 21 15:24:18 alienware-PC iwd[889]: netconfig: Network configuration is disabled.
mar 21 15:24:18 alienware-PC iwd[889]: rfkill id 0 can't be matched to a wiphy
mar 21 15:24:18 alienware-PC iwd[889]: Wiphy: 0, Name: phy0
mar 21 15:24:18 alienware-PC iwd[889]:         Permanent Address: 9c:b6:d0:07:03:85
mar 21 15:24:18 alienware-PC iwd[889]:         Bands: 2.4 GHz 5 GHz
mar 21 15:24:18 alienware-PC iwd[889]:         Ciphers: CCMP TKIP BIP
mar 21 15:24:18 alienware-PC iwd[889]:         Supported iftypes: ad-hoc station ap p2p-client p2p-go p2p-device

The thing that sounds suspicious is: netconfig: Network configuration is disabled.
@tbg, what do you say? How to proceed further?

It is a bit annoying to keep switching from the installed OS to the live system, where I have to enable the network and my password manager to access the forum over and over again, but what else can I do? This is still quicker than booting into Windows :wink: plus I have chroot there, while from Windows... nothing.

https://lists.01.org/hyperkitty/list/iwd%40lists.01.org/thread/L5OXQ52CQCO2WAYUKOSZZHH6QR4UMQKB/

I have no experience with Atheros, but I thought I'd mention this as a possible troubleshooting step. If you search for the error, there are a bunch of results with the same answer: restart iwd after boot. :man_shrugging:


OK, that didn't really help because the issue is slightly different, but it gave me some ideas about what to check next, and I hope this will be enough for @tbg to know how to proceed.

From what I saw in that thread, the guy also had netconfig: Network configuration is disabled. and it was dismissed as normal, so I must assume this is correct and not a problem.
However, he has an INTERFACE line at the end, which I don't have. Here is my output again:

  michaldybczak  alienware-PC  ~  sudo systemctl status iwd.service
[sudo] password for michaldybczak: 
● iwd.service - Wireless service
     Loaded: loaded (/usr/lib/systemd/system/iwd.service; enabled; vendor preset: disabled)
     Active: active (running) since Sat 2020-03-21 16:12:59 CET; 35s ago
   Main PID: 886 (iwd)
      Tasks: 1 (limit: 9369)
     Memory: 2.1M
     CGroup: /system.slice/iwd.service
             └─886 /usr/lib/iwd/iwd

mar 21 16:12:59 alienware-PC systemd[1]: Starting Wireless service...
mar 21 16:12:59 alienware-PC iwd[886]: Wireless daemon version 1.5
mar 21 16:12:59 alienware-PC iwd[886]: `netconfig: Network configuration is disabled.`
mar 21 16:12:59 alienware-PC iwd[886]: rfkill id 0 can't be matched to a wiphy
mar 21 16:12:59 alienware-PC systemd[1]: Started Wireless service.
mar 21 16:12:59 alienware-PC iwd[886]: Wiphy: 0, Name: phy0
mar 21 16:12:59 alienware-PC iwd[886]:         Permanent Address: 9c:b6:d0:07:03:85
mar 21 16:12:59 alienware-PC iwd[886]:         Bands: 2.4 GHz 5 GHz
mar 21 16:12:59 alienware-PC iwd[886]:         Ciphers: CCMP TKIP BIP
mar 21 16:12:59 alienware-PC iwd[886]:         Supported iftypes: ad-hoc station ap p2p-client p2p-go p2p-devi>

I tried restarting iwd to see if anything changed, but no: the output and status are the same as above.

Then I ran networkctl:

 michaldybczak  alienware-PC  ~  networkctl
WARNING: systemd-networkd is not running, output will be incomplete.

IDX LINK    TYPE     OPERATIONAL SETUP    
  1 lo      loopback n/a         unmanaged
  2 enp59s0 ether    n/a         unmanaged
  5 wlan0   wlan     n/a         unmanaged

3 links listed.

This told me that systemd-networkd is not running, so I checked:

 michaldybczak  alienware-PC  ~  sudo systemctl status systemd-networkd
● systemd-networkd.service - Network Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-networkd.service(8)

Then I started it and checked networkctl again:


 michaldybczak  alienware-PC  ~  sudo systemctl start systemd-networkd

 michaldybczak  alienware-PC  ~  sudo systemctl status systemd-networkd
● systemd-networkd.service - Network Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; disabled; vendor preset: enabled)
     Active: active (running) since Sat 2020-03-21 16:18:31 CET; 5s ago
       Docs: man:systemd-networkd.service(8)
   Main PID: 2582 (systemd-network)
     Status: "Processing requests..."
      Tasks: 1 (limit: 9369)
     Memory: 2.6M
     CGroup: /system.slice/systemd-networkd.service
             └─2582 /usr/lib/systemd/systemd-networkd

mar 21 16:18:31 alienware-PC systemd[1]: Starting Network Service...
mar 21 16:18:31 alienware-PC systemd-networkd[2582]: Enumeration completed
mar 21 16:18:31 alienware-PC systemd[1]: Started Network Service.

 michaldybczak  alienware-PC  ~  networkctl
IDX LINK    TYPE     OPERATIONAL SETUP    
  1 lo      loopback carrier     unmanaged
  2 enp59s0 ether    off         unmanaged
  5 wlan0   wlan     no-carrier  unmanaged

3 links listed.
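
For reading this table: according to the networkctl man page, OPERATIONAL "off" means the device is powered down, "no-carrier" means it is powered up but has no carrier yet (for wifi, not associated with an access point), and "routable" means it has a carrier and a routable address. SETUP "unmanaged" only says that systemd-networkd is not configuring the link, which is expected when another network manager handles it. A small filter can list the links that are not usable yet; links_not_routable is a hypothetical name:

```shell
# Sketch: read `networkctl` output on stdin and print every non-loopback
# link whose OPERATIONAL column is anything other than "routable".
links_not_routable() {
    awk '
        # Data rows start with a numeric index and have five columns;
        # this skips the header and the "N links listed." footer.
        $1 ~ /^[0-9]+$/ && NF >= 5 && $3 != "loopback" && $4 != "routable" {
            print $2 ": " $4
        }
    '
}

# Example:
#   networkctl | links_not_routable
```

For the second table above this would print enp59s0 as off and wlan0 as no-carrier, which matches the symptom: the wireless interface exists but is not associated with any network.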

Since I don't understand the output, I also tried something else. Someone here once gave me a simple utility for restarting the network easily, network-restart.service, which has been very helpful from time to time.

 michaldybczak  alienware-PC  ~  sudo systemctl start network-restart.service
Job for network-restart.service failed because the control process exited with error code.
See "systemctl status network-restart.service" and "journalctl -xe" for details.

This clearly didn't work correctly, so I checked what was suggested:

 michaldybczak  alienware-PC  ~  systemctl status network-restart.service
● network-restart.service - Atheros WiFi Restart Service
     Loaded: loaded (/etc/systemd/system/network-restart.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Sat 2020-03-21 16:20:56 CET; 36s ago
    Process: 2610 ExecStart=/usr/bin/sudo -u $USER /bin/bash -lc nmcli networking off (code=exited, status=0/S>
    Process: 2620 ExecStart=/usr/bin/sleep 1 (code=exited, status=0/SUCCESS)
    Process: 2621 ExecStart=/usr/bin/systemctl stop NetworkManager (code=exited, status=0/SUCCESS)
    Process: 2623 ExecStart=/usr/bin/ip link set wlp60s0 down (code=exited, status=1/FAILURE)
   Main PID: 2623 (code=exited, status=1/FAILURE)

mar 21 16:20:55 alienware-PC systemd[1]: Starting Atheros WiFi Restart Service...
mar 21 16:20:55 alienware-PC sudo[2610]:     root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/bash -lc nm>
mar 21 16:20:55 alienware-PC sudo[2610]: pam_unix(sudo:session): session opened for user root by (uid=0)
mar 21 16:20:55 alienware-PC sudo[2610]: pam_unix(sudo:session): session closed for user root
mar 21 16:20:56 alienware-PC ip[2623]: Cannot find device "wlp60s0"
mar 21 16:20:56 alienware-PC systemd[1]: network-restart.service: Main process exited, code=exited, status=1/F>
mar 21 16:20:56 alienware-PC systemd[1]: network-restart.service: Failed with result 'exit-code'.
mar 21 16:20:56 alienware-PC systemd[1]: Failed to start Atheros WiFi Restart Service.

 michaldybczak  alienware-PC  ~  journalctl -xe
mar 21 16:20:56 alienware-PC systemd[1]: network-restart.service: Failed with result 'exit-code'.
-- Subject: Unit hasn't succeed
-- Defined-By: systemd
-- Support: https://archived.forum.manjaro.org/c/technical-issues-and-assistance
-- 
-- The unit network-restart.service has entered the 'failed' state
-- with result 'exit-code'.
mar 21 16:20:56 alienware-PC dbus-daemon[884]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
mar 21 16:20:56 alienware-PC systemd[1]: Failed to start Atheros WiFi Restart Service.
-- Subject: Task of starting for unit network-restart.service has failed
-- Defined-By: systemd
-- Support: https://archived.forum.manjaro.org/c/technical-issues-and-assistance
-- 
-- Task of starting for unit network-restart.service has failed.
-- 
--Task identifier: 1614, task result: failed.
mar 21 16:20:56 alienware-PC audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=network-restart comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
mar 21 16:20:56 alienware-PC systemd[1]: Started Network Manager Script Dispatcher Service.
-- Subject: Successfully finished task of starting unit for NetworkManager-dispatcher.service.
-- Defined-By: systemd
-- Support: https://archived.forum.manjaro.org/c/technical-issues-and-assistance
-- 
-- Successfully finished task of starting unit for NetworkManager-dispatcher.service.
-- 
-- Task identifier: 1701.
mar 21 16:20:56 alienware-PC audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
mar 21 16:20:56 alienware-PC sudo[2607]: pam_unix(sudo:session): session closed for user root
mar 21 16:20:56 alienware-PC audit[2607]: USER_END pid=2607 uid=0 auid=1000 ses=1 msg='op=PAM:session_close grantors=pam_limits,pam_unix,pam_permit acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/2 res=success'
mar 21 16:20:56 alienware-PC audit[2607]: CRED_DISP pid=2607 uid=0 auid=1000 ses=1 msg='op=PAM:setcred grantors=pam_unix,pam_permit,pam_env acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/2 res=success'
mar 21 16:21:06 alienware-PC systemd[1]: NetworkManager-dispatcher.service: Succeeded.
-- Subject: Unit has failed
-- Defined-By: systemd
-- Support: https://archived.forum.manjaro.org/c/technical-issues-and-assistance
-- 
-- Unit NetworkManager-dispatcher.service has successfully entered the 'dead' state.
mar 21 16:21:06 alienware-PC audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
mar 21 16:21:06 alienware-PC kernel: kauditd_printk_skb: 5 callbacks suppressed
mar 21 16:21:06 alienware-PC kernel: audit: type=1131 audit(1584804066.618:150): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

I enabled systemd-networkd (not sure why it was disabled in the first place) and did a HARD REBOOT, which in the past was often the only way out of a network lock-up (when a restart didn't work either and returned errors).

After reboot I checked the status:


 michaldybczak  alienware-PC  ~  sudo systemctl status systemd-networkd
[sudo] password for michaldybczak: 
● systemd-networkd.service - Network Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2020-03-21 16:31:03 CET; 33s ago
TriggeredBy: ● systemd-networkd.socket
       Docs: man:systemd-networkd.service(8)
   Main PID: 433 (systemd-network)
     Status: "Processing requests..."
      Tasks: 1 (limit: 9369)
     Memory: 3.1M
     CGroup: /system.slice/systemd-networkd.service
             └─433 /usr/lib/systemd/systemd-networkd

mar 21 16:31:03 alienware-PC systemd[1]: Starting Network Service...
mar 21 16:31:03 alienware-PC systemd-networkd[433]: Enumeration completed
mar 21 16:31:03 alienware-PC systemd[1]: Started Network Service.
mar 21 16:31:03 alienware-PC systemd-networkd[433]: eth0: Interface name change detected, eth0 has been renamed to enp59s0.
mar 21 16:31:08 alienware-PC systemd-networkd[433]: wlan0: Link UP

Now I noticed something interesting, namely:

mar 21 16:31:03 alienware-PC systemd-networkd[433]: eth0: Interface name change detected, eth0 has been renamed to enp59s0.

and then I noticed that networkctl was also showing it as enp59s0.

Some other lines also point to an issue:

Cannot find device "wlp60s0"

which may be an additional factor in the mix.

Since I can't even disable networking from the GUI (I click the checkbox but nothing happens), something seems to be misconfigured, and there is a good chance that this is it.
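
The restart service fails at ip link set wlp60s0 down because that name is hard-coded in the unit while the interface is currently called wlan0. One way to make such a script independent of the name is to look it up at run time. A minimal sketch, assuming that wireless interfaces expose a wireless or phy80211 entry under /sys/class/net; wifi_interfaces is a hypothetical helper name:

```shell
# Sketch: print the names of all wireless network interfaces.
# The optional argument replaces /sys/class/net, which is useful for testing.
wifi_interfaces() {
    sys_net=${1:-/sys/class/net}
    for path in "$sys_net"/*; do
        # cfg80211 devices have a phy80211 link; "wireless" exists when
        # wireless extensions are enabled in the kernel.
        if [ -e "$path/wireless" ] || [ -e "$path/phy80211" ]; then
            basename "$path"
        fi
    done
}

# Example: use the first wireless interface instead of a fixed name.
#   ip link set "$(wifi_interfaces | head -n 1)" down
```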

Unfortunately, I don't know enough about this to know where I should change it. What do you think, @tbg?

For comparison, here is how it looks on the live system:

[manjaro@manjaro ~]$ networkctl
IDX LINK             TYPE               OPERATIONAL      SETUP     
  1 lo               loopback           carrier          unmanaged 
  2 enp59s0          ether              no-carrier       unmanaged 
  3 wlp60s0          wlan               routable         unmanaged

So on installed OS I have:

2 enp59s0       ether   off
3 wlan0         wlan    no-carrier

so clearly my configuration has changed over time and now I'm experiencing the results. It's quite possible that my previous issues, where the network wasn't working, were also due to this change in interface names.

iwd renames the adapter to wlan0. Mine is the same way.


Then why does inxi still show wlp60s0?

Network:   Device-1 Qualcomm Atheros Killer E2400 Gigabit Ethernet driver alx 
           IF enp59s0 state down mac <filter> 
           Device-2 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter driver ath10k_pci 
           IF wlp60s0 state up mac <filter> 
           Device-3 Qualcomm Atheros type USB driver btusb

Try this:



Add the following kernel boot parameters to grub:

ipv6.disable=1 pcie_aspm=off net.ifnames=0

The following command will automatically add the above kernel boot parameter to /etc/default/grub:

sudo cp /etc/default/grub /etc/default/grub.bak && sudo sed '/^GRUB_CMDLINE_LINUX_DEFAULT=/s/"$/ ipv6.disable=1 pcie_aspm=off net.ifnames=0"/g' -i /etc/default/grub

After adding the boot parameter(s) and saving your changes, run:

sudo update-grub 

Reboot, and test for improvement.
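
Note that the sed one-liner above appends the parameters every time it is run, so running it twice duplicates them. A sketch of a variant that only adds what is missing is shown here; add_kernel_params is a hypothetical helper, and it assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line:

```shell
# Sketch: append kernel parameters to GRUB_CMDLINE_LINUX_DEFAULT in FILE,
# skipping any parameter that is already present as a whole word.
# Usage: add_kernel_params FILE PARAM...
add_kernel_params() {
    grub_file=$1
    shift
    for param in "$@"; do
        # Re-read the current value on every pass so earlier additions count.
        current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$grub_file")
        case " $current " in
            *" $param "*) continue ;;
        esac
        # "|" as the delimiter keeps parameters containing "/" from breaking sed.
        sed -i "/^GRUB_CMDLINE_LINUX_DEFAULT=/s|\"\$| $param\"|" "$grub_file"
    done
}

# Example (back up the file first, then regenerate the GRUB config):
#   sudo cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_params /etc/default/grub ipv6.disable=1 pcie_aspm=off net.ifnames=0   # as root
#   sudo update-grub
```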



No change :frowning: ...

 michaldybczak  alienware-PC  ~  networkctl status
● State: no-carrier

mar 21 23:12:00 alienware-PC systemd[1]: Starting Network Service...
mar 21 23:12:01 alienware-PC systemd-networkd[406]: Enumeration completed
mar 21 23:12:01 alienware-PC systemd[1]: Started Network Service.
mar 21 23:12:01 alienware-PC systemd[1]: Starting Wait for Network to be Configured...
mar 21 23:12:01 alienware-PC systemd-networkd[406]: eth0: Interface name change detected, eth0 has been renamed to enp59s0.
mar 21 23:12:05 alienware-PC systemd-networkd[406]: wlan0: Link UP
mar 21 23:14:01 alienware-PC systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
mar 21 23:14:01 alienware-PC systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
mar 21 23:14:01 alienware-PC systemd[1]: Failed to start Wait for Network to be Configured.

 michaldybczak  alienware-PC  ~  networkctl status wlan0
● 4: wlan0                                                       
             Link File: /usr/lib/systemd/network/80-iwd.link     
          Network File: n/a                                      
                  Type: wlan                                     
                 State: no-carrier (unmanaged)                   
                  Path: pci-0000:3c:00.0                         
                Driver: ath10k_pci                               
                Vendor: Qualcomm Atheros                         
                 Model: QCA6174 802.11ac Wireless Network Adapter
            HW Address: 9c:b6:d0:07:03:85 (Rivet Networks)       
                   MTU: 1500 (min: 256, max: 2304)               
  Queue Length (Tx/Rx): 1/1                                      

mar 21 23:12:05 alienware-PC systemd-networkd[406]: wlan0: Link UP

My GRUB kernel parameters:

acpi_rev_override=1 acpi_backlight=vendor quiet splash resume=/dev/sda4 resume_offset=1363968 ipv6.disable=1 pcie_aspm=off ifnames=0

When I suspend the system and then resume, the network service doesn't come back up.

Again, it looks like a misconfiguration somewhere:

systemd[1]: Failed to start Wait for Network to be Configured.

or the system is unable to read its configs for some reason.

I think that parameter should be net.ifnames=0, see https://www.freedesktop.org/software/systemd/man/systemd-udevd.service.html
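
A typo like this is easy to miss, so it can help to check the command line the running kernel actually booted with. A minimal sketch that compares whole space-separated tokens, so that ifnames=0 is not mistaken for net.ifnames=0; cmdline_has is a hypothetical helper name:

```shell
# Sketch: succeed if PARAM appears as a complete token on the kernel
# command line. The optional second argument replaces /proc/cmdline.
cmdline_has() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # One token per line, then an exact fixed-string match on a whole line.
    tr ' ' '\n' < "$cmdline_file" | grep -Fxq -- "$param"
}

# Example:
#   cmdline_has net.ifnames=0 && echo "predictable interface names are disabled"
```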


I corrected it, stripped the parameters down to the essentials, disabled plymouth and made the boot non-quiet to see what it's doing, so now the kernel parameters look like this:

ipv6.disable=1 pcie_aspm=off net.ifnames=0

Still, no change.

I tried disabling tlp to see if it helped: nothing.
I compared my network-related configs in /etc/ with the ones from older backups; nothing stands out, everything is as before. Either default configs or no configs, as always (so the ones from /usr/ should be used).

My journalctl looks like:

 michaldybczak  alienware-PC  ~  journalctl -b -p3
-- Logs begin at Thu 2019-10-31 22:06:09 CET, end at Sun 2020-03-22 10:50:47 CET. --
mar 22 10:50:32 alienware-PC kernel: DMAR: [Firmware Bug]: No firmware reserved region can cover this RMRR [0x0000000038800000-0x000000003cffffff], contact BIOS vendor for fixes
mar 22 10:50:32 alienware-PC kernel: sd 3:0:0:0: [sdd] No Caching mode page found
mar 22 10:50:32 alienware-PC kernel: sd 3:0:0:0: [sdd] Assuming drive cache: write through
mar 22 10:50:35 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.
mar 22 10:50:36 alienware-PC kernel: nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 022554 [ IBUS ]
mar 22 10:50:38 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.
mar 22 10:50:38 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.
mar 22 10:50:39 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.
mar 22 10:50:39 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.
mar 22 10:50:40 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.
mar 22 10:50:41 alienware-PC bluetoothd[863]: RFCOMM server failed for Headset Voice gateway: rfcomm_bind: Address already in use (98)
mar 22 10:50:41 alienware-PC bluetoothd[863]: RFCOMM server failed for :1.74/Profile/HSPHS/00001108-0000-1000-8000-00805F9B34FB: rfcomm_bind: Address already in use (98)
mar 22 10:50:41 alienware-PC systemd[1]: Failed to start Tell Plymouth To Write Out Runtime Data.

So aside from the systemd plymouth issue, everything shows a green OK during boot and looks fine.

This is the second weekend wasted on the wifi issue :frowning: . Two days are gone and my laptop is barely functional; I'm mostly in the live system.

I believe the root of the issue is this:

systemd[1]: Failed to start Wait for Network to be Configured.

It's waiting for some config that is not there, not readable, or misconfigured, but nothing stands out as different from my backups or the live system, so I have no idea what to do.

/etc/systemd/network/ is empty, so I added wireless configs according to the Arch Wiki, but still no dice, so I deleted them again, since everywhere else (installed OS, backups, live system) this directory is empty.

EDIT: This probably has nothing to do with the issue, but the systemd-networkd.service file differs from the one on the live system by having these additional lines:

DeviceAllow=char-* rw
ProtectKernelLogs=yes
RestartKillSignal=SIGUSR2

and this line differs in its last element:

RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6 AF_PACKET AF_ALG (on the live system the last element is AF_PACKET).
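
Differences like these would be consistent with the installed system and the live ISO simply shipping different systemd versions, though that is an assumption and not something verified here. To compare the two unit files without eyeballing them, a minimal sketch follows; lines_only_in is a hypothetical helper and the example paths are assumptions:

```shell
# Sketch: print the lines that occur in file A but nowhere in file B.
# Usage: lines_only_in A B
lines_only_in() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file B.
    grep -Fxv -f "$2" -- "$1"
}

# Example: save the live system's unit with
#   systemctl cat systemd-networkd.service > /mnt/data/networkd-live.service
# then, on the installed system:
#   lines_only_in /usr/lib/systemd/system/systemd-networkd.service /mnt/data/networkd-live.service
```

The function returns a non-zero status when there are no differing lines, because grep found nothing to print.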

Here is another batch of data that may help.

DEBUG AND HARDWARE INFO:

michaldybczak  alienware-PC  ~  journalctl -b -p3
-- Logs begin at Thu 2019-10-31 22:06:09 CET, end at Sun 2020-03-22 15:50:07 CET. --
mar 22 15:48:52 alienware-PC kernel: DMAR: [Firmware Bug]: No firmware reserved region can cover this R>
mar 22 15:48:52 alienware-PC kernel: sd 3:0:0:0: [sdd] No Caching mode page found
mar 22 15:48:52 alienware-PC kernel: sd 3:0:0:0: [sdd] Assuming drive cache: write through
mar 22 15:48:59 alienware-PC kernel: nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 022554 [>
mar 22 15:49:04 alienware-PC bluetoothd[880]: RFCOMM server failed for Headset Voice gateway: rfcomm_bi>
mar 22 15:49:04 alienware-PC bluetoothd[880]: RFCOMM server failed for :1.72/Profile/HSPHS/00001108-000>

 michaldybczak  alienware-PC  ~  lspci -v | grep -iA 7 network
3c:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
        Subsystem: Bigfoot Networks, Inc. QCA6174 802.11ac Wireless Network Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 131
        Memory at dd200000 (64-bit, non-prefetchable) [size=2M]
        Capabilities: <access denied>
        Kernel driver in use: ath10k_pci
        Kernel modules: ath10k_pci

3d:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5227 PCI Express Card Reader (rev 01)

 michaldybczak  alienware-PC  ~  lsmod | grep ath10k_pci
ath10k_pci             65536  0
ath10k_core           606208  1 ath10k_pci

 michaldybczak  alienware-PC  ~  dmesg | grep ath10k_pci
dmesg: read kernel buffer failed: Operation not permitted

 michaldybczak  alienware-PC  ~  sudo dmesg | grep ath10k_pci
[sudo] password for michaldybczak: 
[    4.743487] ath10k_pci 0000:3c:00.0: enabling device (0000 -> 0002)
[    4.744570] ath10k_pci 0000:3c:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[    5.001560] ath10k_pci 0000:3c:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
[    5.001562] ath10k_pci 0000:3c:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 0 testmode 0
[    5.001988] ath10k_pci 0000:3c:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1 api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
[    5.066070] ath10k_pci 0000:3c:00.0: board_file api 1 bmi_id N/A crc32 70c38a29
[    5.138554] ath10k_pci 0000:3c:00.0: unsupported HTC service id: 1536
[    5.158084] ath10k_pci 0000:3c:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
[    8.668697] ath10k_pci 0000:3c:00.0: unsupported HTC service id: 1536

 michaldybczak  alienware-PC  ~  sudo lshw -c network
sudo: lshw: command not found

 michaldybczak  alienware-PC  ~  ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 14  bytes 2219 (2.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 14  bytes 2219 (2.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 9c:b6:d0:07:03:85  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


 michaldybczak  alienware-PC  ~  iwconfig
lo        no wireless extensions.

eth0      no wireless extensions.

wlan0     IEEE 802.11  ESSID:off/any  
          Mode:Managed  Access Point: Not-Associated   Tx-Power=0 dBm   
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          

 michaldybczak  alienware-PC  ~  iwlist scan
lo        Interface doesn't support scanning.

eth0      Interface doesn't support scanning.

wlan0     No scan results


 michaldybczak  alienware-PC  ~  rfkill list all
0: dell-rbtn: Wireless LAN
        Soft blocked: no
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: yes
        Hard blocked: no
2: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no


 michaldybczak  alienware-PC  ~  ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether f8:ca:b8:47:16:2c brd ff:ff:ff:ff:ff:ff
    altname enp59s0
4: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DORMANT group default qlen 1000
    link/ether 9c:b6:d0:07:03:85 brd ff:ff:ff:ff:ff:ff

The last one, ip link, gave some interesting info, namely wlan0 being DOWN and DORMANT. So I tried scanning for wifi networks:

 michaldybczak  alienware-PC  ~  sudo iw dev wlan0 scan
[sudo] password for michaldybczak: 
command failed: Device or resource busy (-16)

The result was not surprising, but this is the default state after the system boots... So something is clearly wrong. Let's try to activate it:

 michaldybczak  alienware-PC  ~  sudo ip link set wlan0 mode default

 michaldybczak  alienware-PC  ~  ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether f8:ca:b8:47:16:2c brd ff:ff:ff:ff:ff:ff
    altname enp59s0
4: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 9c:b6:d0:07:03:85 brd ff:ff:ff:ff:ff:ff
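
To compare the state and mode fields before and after a change without reading the whole listing, the ip link output can be filtered. A minimal sketch with a hypothetical helper, link_state_mode. As a caveat, error -16 is EBUSY, which often just means another scan (for example one started by iwd or NetworkManager) was in progress, so a later successful scan is not necessarily caused by the mode change.

```shell
# Sketch: read `ip link` output on stdin and print "STATE MODE" for IFACE.
# Usage: ip link | link_state_mode wlan0
link_state_mode() {
    awk -v ifc="$1:" '
        # Header lines look like "4: wlan0: <FLAGS> ... state DOWN mode DORMANT ...".
        $2 == ifc {
            for (i = 3; i < NF; i++) {
                if ($i == "state") state = $(i + 1)
                if ($i == "mode")  mode  = $(i + 1)
            }
            print state, mode
        }
    '
}
```

On the first listing above this would report DOWN DORMANT for wlan0, and DOWN DEFAULT after the ip link set wlan0 mode default command.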

Looks like I'm on the right track: from DORMANT to DEFAULT. So now scanning again:

michaldybczak  alienware-PC  ~  sudo iw dev wlan0 scan
BSS d0:84:b0:4f:4b:a3(on wlan0)
        TSF: 116569582548 usec (1d, 08:22:49)
        freq: 2412
        beacon interval: 100 TUs
        capability: ESS Privacy ShortSlotTime (0x0411)
        signal: -56.00 dBm
        last seen: 4697 ms ago
        Information elements from Probe Response frame:
        SSID: FunBox-4BA3
        Supported rates: 1.0* 2.0* 5.5* 11.0* 18.0 24.0 36.0 54.0 
        DS Parameter set: channel 1
        ERP: Barker_Preamble_Mode
        ERP D4.0: Barker_Preamble_Mode
        RSN:     * Version: 1
                 * Group cipher: TKIP
                 * Pairwise ciphers: CCMP TKIP
                 * Authentication suites: PSK
                 * Capabilities: 16-PTKSA-RC 1-GTKSA-RC (0x000c)
        Extended supported rates: 6.0 9.0 12.0 48.0 
        HT capabilities:
                Capabilities: 0x187c
                        HT20
                        SM Power Save disabled
                        RX Greenfield
                        RX HT20 SGI
                        RX HT40 SGI
                        No RX STBC
                        Max AMSDU length: 7935 bytes
                        DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: 8 usec (0x06)
                HT RX MCS rate indexes supported: 0-15
                HT TX MCS rate indexes are undefined
        HT operation:
                 * primary channel: 1
                 * secondary channel offset: no secondary
                 * STA channel width: 20 MHz
                 * RIFS: 1
                 * HT protection: no
                 * non-GF present: 1
                 * OBSS non-GF present: 0
                 * dual beacon: 0
                 * dual CTS protection: 0
                 * STBC beacon: 0
                 * L-SIG TXOP Prot: 0
                 * PCO active: 0
                 * PCO phase: 0
        WPS:     * Version: 1.0
                 * Wi-Fi Protected Setup State: 2 (Configured)
                 * Response Type: 3 (AP)
                 * UUID: 4e3a423e-4c3a-4e50-be4c-4b3d673a423e
                 * Manufacturer: Sagemcom
                 * Model: F5310
                 * Model Number: 123456
                 * Serial Number: 1234
                 * Primary Device Type: 6-0050f204-1
                 * Device name: 
                 * Config methods: Ethernet, Label
                 * RF Bands: 0x1
                 * Unknown TLV (0x1049, 6 bytes): 00 37 2a 00 01 20
        WPA:     * Version: 1
                 * Group cipher: TKIP
                 * Pairwise ciphers: CCMP TKIP
                 * Authentication suites: PSK
                 * Capabilities: 16-PTKSA-RC 1-GTKSA-RC (0x000c)
        WMM:     * Parameter version 1
                 * u-APSD
                 * BE: CW 15-1023, AIFSN 3
                 * BK: CW 15-1023, AIFSN 7
                 * VI: CW 7-15, AIFSN 2, TXOP 3008 usec
                 * VO: CW 3-7, AIFSN 2, TXOP 1504 usec
BSS c4:6e:1f:7d:bc:f8(on wlan0)
        TSF: 116539394542 usec (1d, 08:22:19)
        freq: 2412
        beacon interval: 100 TUs
        capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431)
        signal: -85.00 dBm
        last seen: 23840 ms ago
        SSID: 
        Supported rates: 1.0* 2.0* 5.5* 11.0* 6.0 9.0 12.0 18.0 
        DS Parameter set: channel 1
        TIM: DTIM Count 0 DTIM Period 1 Bitmap Control 0x0 Bitmap[0] 0x0
        Country: US     Environment: Indoor/Outdoor
                Channels [1 - 13] @ 20 dBm
        ERP: Barker_Preamble_Mode
        RSN:     * Version: 1
                 * Group cipher: CCMP
                 * Pairwise ciphers: CCMP
                 * Authentication suites: PSK
                 * Capabilities: 1-PTKSA-RC 1-GTKSA-RC (0x0000)
        Extended supported rates: 24.0 36.0 48.0 54.0 
        HT capabilities:
                Capabilities: 0x2c
                        HT20
                        SM Power Save disabled
                        RX HT20 SGI
                        No RX STBC
                        Max AMSDU length: 3839 bytes
                        No DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: No restriction (0x00)
                HT RX MCS rate indexes supported: 0-7
                HT TX MCS rate indexes are undefined
        HT operation:
                 * primary channel: 1
                 * secondary channel offset: no secondary
                 * STA channel width: 20 MHz
                 * RIFS: 1
                 * HT protection: no
                 * non-GF present: 0
                 * OBSS non-GF present: 0
                 * dual beacon: 0
                 * dual CTS protection: 0
                 * STBC beacon: 0
                 * L-SIG TXOP Prot: 0
                 * PCO active: 0
                 * PCO phase: 0
        WMM:     * Parameter version 1
                 * BE: CW 15-1023, AIFSN 3
                 * BK: CW 15-1023, AIFSN 7
                 * VI: CW 7-15, AIFSN 2, TXOP 3008 usec
                 * VO: CW 3-7, AIFSN 2, TXOP 1504 usec

Now I'm getting somewhere. Unfortunately, I spent hours trying to connect using iw dev or iwconfig, but there was always something wrong with the key format, and the articles and command output were super unhelpful and confusing. For the life of me, I just cannot establish a connection via CLI. It's not that it produces an error, I simply couldn't figure out the correct command :frowning: I'm so close and yet so far... :cry:
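A note on the key format problem: `iw ... connect` and `iwconfig` can only join open or WEP networks, and the scan above shows FunBox-4BA3 using WPA/RSN with PSK, so the passphrase has to go through wpa_supplicant instead. Below is a minimal sketch, assuming wpa_supplicant and dhcpcd are installed, NetworkManager is stopped so the two don't fight over the device, and the script runs as root. The function name, the scratch config path and the DRY_RUN switch are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: join a WPA/WPA2-PSK network without NetworkManager.
# Assumptions: wpa_supplicant and dhcpcd are installed; run as root.
# Set DRY_RUN=1 to only print the steps instead of running them.

wifi_connect() {
    local iface="$1" ssid="$2" psk="$3"
    local conf="/tmp/wpa-${iface}.conf"   # hypothetical scratch path

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "wpa_passphrase <ssid> <passphrase> > ${conf}"
        echo "wpa_supplicant -B -i ${iface} -c ${conf}"
        echo "dhcpcd ${iface}"
        return 0
    fi

    # Turn SSID + passphrase into a wpa_supplicant network block.
    wpa_passphrase "$ssid" "$psk" > "$conf" || return 1
    # Authenticate in the background (-B) on the given interface.
    wpa_supplicant -B -i "$iface" -c "$conf" || return 1
    # Request an IP address once associated.
    dhcpcd "$iface"
}
```

Usage would be something like `wifi_connect wlan0 FunBox-4BA3 'the-passphrase'`. With NetworkManager running, the rough equivalent is `nmcli device wifi connect <SSID> password <passphrase>`, but that only helps if NetworkManager can see the device at all.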

So I tried the next thing I could think of. Since wlan0 was no longer DORMANT, maybe the restart service would work this time. To make a long story short, it was hit and miss: the service was written for the wl060 device, so I had to rewrite it; then after a reboot wlan0 changed its name to wlan1, so I had to rewrite the service again. In the end it seemed to work fine:


  michaldybczak  alienware-PC  ~  sudo systemctl status network-restart.service
● network-restart.service - Atheros WiFi Restart Service
     Loaded: loaded (/etc/systemd/system/network-restart.service; enabled; vendor preset: disabled)
     Active: deactivating (stop) since Sun 2020-03-22 17:29:28 CET; 12s ago
    Process: 2101 ExecStart=/usr/bin/sudo -u $USER /bin/bash -lc nmcli networking off (code=exited, status=0/SUCCESS)
    Process: 2111 ExecStart=/usr/bin/sleep 1 (code=exited, status=0/SUCCESS)
    Process: 2112 ExecStart=/usr/bin/systemctl stop NetworkManager (code=exited, status=0/SUCCESS)
    Process: 2114 ExecStart=/usr/bin/ip link set wlan0 down (code=exited, status=0/SUCCESS)
    Process: 2118 ExecStart=/usr/bin/rmmod ath10k_pci (code=exited, status=0/SUCCESS)
    Process: 2124 ExecStart=/usr/bin/sleep 1 (code=exited, status=0/SUCCESS)
    Process: 2125 ExecStart=/usr/bin/rmmod ath10k_core (code=exited, status=0/SUCCESS)
    Process: 2126 ExecStop=/usr/bin/sleep 5 (code=exited, status=0/SUCCESS)
    Process: 2128 ExecStop=/usr/bin/modprobe ath10k_pci (code=exited, status=0/SUCCESS)
    Process: 2134 ExecStop=/usr/bin/sleep 2 (code=exited, status=0/SUCCESS)
    Process: 2141 ExecStop=/usr/bin/modprobe ath10k_core (code=exited, status=0/SUCCESS)
    Process: 2142 ExecStop=/usr/bin/sleep 2 (code=exited, status=0/SUCCESS)
    Process: 2143 ExecStop=/usr/bin/ip link set wlan1 up (code=exited, status=0/SUCCESS)
    Process: 2144 ExecStop=/usr/bin/sleep 2 (code=exited, status=0/SUCCESS)
   Main PID: 2125 (code=exited, status=0/SUCCESS); Control PID: 2145 (systemctl)
      Tasks: 1 (limit: 9370)
     Memory: 2.1M
     CGroup: /system.slice/network-restart.service
             └─2145 /usr/bin/systemctl start NetworkManager

mar 22 17:29:25 alienware-PC systemd[1]: Starting Atheros WiFi Restart Service...
mar 22 17:29:25 alienware-PC sudo[2101]:     root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/bash -lc nmcli networking off
mar 22 17:29:25 alienware-PC sudo[2101]: pam_unix(sudo:session): session opened for user root by (uid=0)
mar 22 17:29:25 alienware-PC sudo[2101]: pam_unix(sudo:session): session closed for user root
mar 22 17:29:28 alienware-PC systemd[1]: Finished Atheros WiFi Restart Service.
mar 22 17:29:28 alienware-PC systemd[1]: Stopping Atheros WiFi Restart Service...

 michaldybczak  alienware-PC  ~  nmcli g
STATE         CONNECTIVITY  WIFI-HW    WIFI       WWAN-HW    WWAN
disconnected  unknown       connected  turned on  turned on  turned on

The problem is that the network tray icon disappears and doesn't show up again as it should. Also, when I check ip link, the new wlan1 device is DORMANT again. I can set it to DEFAULT again, but I still can't get the graphical interface to connect, and I can't do it via the command line either.
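To make that check repeatable, both the operational state and the link mode can be pulled out of `ip -o link show` in one go. This is only a sketch: the function name is made up, and it assumes the usual one-line-per-interface format in which the words `state` and `mode` are each followed by their value.

```shell
#!/usr/bin/env bash
# Sketch: print "<STATE> <MODE>" for one interface.
# Reads `ip -o link show` output on stdin; $1 is the interface name.

link_state() {
    awk -v ifc="$1" '
        $2 == (ifc ":") {
            # Walk the fields and pick the values after "state" and "mode".
            for (i = 3; i < NF; i++) {
                if ($i == "state") state = $(i + 1)
                if ($i == "mode")  mode  = $(i + 1)
            }
            print state, mode
        }'
}
```

It would be called as `ip -o link show | link_state wlan1`, right after resume and again after the restart service has run, to see which of the two values actually changes.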

I hope that this info will help debug it.

I'm not sure how many days I can live without the Internet, using only the live system. I'm trying to save my install because of the plethora of personalisations done over the years, and I'm not sure I could restore it so easily, with all the games and so on.

In your service you've written:

ExecStop=/usr/bin/modprobe ath10k_core
ExecStop=/usr/bin/sleep 2

After the above lines, try adding:

ExecStop=/usr/bin/ip link set wlan0 mode default
ExecStop=/usr/bin/rfkill unblock all
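Because the interface has been flipping between wlan0 and wlan1 after the module reload, a hardcoded name in these lines may miss the device. One option is to look the name up at run time from `iw dev`. A minimal sketch, assuming `iw` is installed and prints each interface as an `Interface <name>` line; the function name is made up:

```shell
#!/usr/bin/env bash
# Sketch: list wireless interface names.
# Reads `iw dev` output on stdin and prints one name per line.

wifi_ifaces() {
    awk '$1 == "Interface" { print $2 }'
}
```

A small helper script called from the service could then do `iface="$(iw dev | wifi_ifaces | head -n 1)"` followed by `ip link set "$iface" mode default` and `ip link set "$iface" up`, instead of naming wlan0 or wlan1 directly.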

Thank you for mentioning that. Not sure how I managed to truncate that part. I will edit my prior posts to correct the entry.

Thanks, will do that and try it out. In the past, when the device locked itself up, a normal reboot didn't bring it back and I had to hard reboot. I wonder, if I manage to reboot it properly, will it start to work normally?

On the other hand, this lockup is within the installed system, because outside of it the device works. So something is wrong, and the restart service is just a workaround, not a solution.

Since you are quiet and short with answers, I assume you don't have much idea what is wrong either and we must find someone who knows the network devices in and out, knows relationships between them, the system and how to set it properly.

The problem is, besides you and a few other people, there is not much help, so it takes time. I'm doing all in my power to solve it on my own, but it takes a huge amount of my spare time and I don't have much of it. Even in the current situation I have a load of work, even though I'm working from home, so I'm too tired in the evenings to deal with it, and weekends are apparently too short. I will probably hang on 1 or 2 days more and try restoring the backup first, although that didn't fix the problem last time, so I don't expect it will now. If that doesn't work, I will have to seriously think about reinstalling the system, because I'm going nowhere.

The most frustrating part is that there has to be something simple that we are missing. One moment everything works, I close the lid, the system goes to sleep, my AC is disconnected (the cable doesn't hold too tight), I wake the system and wifi is gone... forever. This is ridiculous. Error messages are sparse and vague :frowning:

Even if I create a successful connection, after reboot the device is always DORMANT. WHY??? I believe a network specialist should know where this is set and where to look. Hmm.... I have a sysadmin friend. Maybe he will know more.

Generally the symptoms you speak of are kernel or firmware related. The wifi suspend problems can only be worked around using a service if the kernel is responsible.

If this is a kernel regression you may be able to find an older kernel that does not have this problem. The real time kernels are also worth testing as they sometimes do not contain the same bugs as the mainline kernels (usually only a temporary fix).

Making changes while running a live system is difficult as kernels can not be switched in the live environment.

Have you tried avoiding NetworkManager and just using connman?

I don't think it's about the kernel. Like I said, on the live system it works: I can suspend the system and wake it on battery without any consequences.

I installed kernel 4.19 just to try it and it didn't help, still the same.

I used nmtui and found out it has the same problem as the GUI: it sees past connections, but when I want to connect, the list shows only the cable connection, even if the device is DEFAULT.

So I came to the conclusion that the device has to be not only DEFAULT but also UP. The restart service has this line as well:

ip link set wlan1 up

and it worked in the past, but now when I try to bring wlan0 or wlan1 (whatever it is at the moment) UP, I don't get any failure output, as if it succeeded. But when checked, it's still DOWN.

So I suspect that as long as the device is DOWN, it won't scan for wifi and won't connect.

I remind you, that rfkill showed it as soft and hard unblocked. So I have no idea where else this is saved and set.

I'm not sure what you mean. How would it work?

The live system will be using a different version of the kernel than any installed system that has been updated. Kernels are not static on Manjaro and only the live versions are frozen at a specific version. Installed kernels are continuously being updated and change regularly. Generally, kernel regressions can often end up being back ported to older kernel versions that are still being maintained.

The exceptions to this are older Manjaro kernels that are designated End Of Life (EOL). I do not recommend using EOL kernels as that may be a security risk. The Real Time kernels generally have delayed updates, so they may be free of recent regressions for a period of time. The real time kernels may fix an issue temporarily (until they are updated). However, many computers can have other serious problems with running an RT kernel. The RT kernels are worth testing as a troubleshooting step to determine if a kernel regression is responsible for a recent change.

I have dealt with wifi suspend issues extensively on the forum and I am very familiar with these problems and how to resolve them. A kernel update/change or a service is the only way to avoid suspend issues in most cases.

The driver/firmware could also be a part of the issue along with the kernel. It is very hard to pin down exactly what may be creating a suspend problem and they are sometimes difficult to cure.

You should also test masking tlp if you have not done so already.

Re connman:


Thanks. After a long time examining my options and trying various CLI tools, I came to the conclusion that the issue is that the wlan0 device is powered down, hence none of the other tools work correctly. It isn't a vendor lock (that looks different, and I learned how to recognize it and turn it on/off), and rfkill confirms it. Besides, a vendor lock is effective in all systems regardless of settings, and it makes the device look as if it were gone, so I know that's not it. I even tried locking and unlocking it, and that worked, but it still didn't power up the device.

I tried all the methods mentioned in many articles, and they should work.

sudo ifconfig wlan0 down
sudo ifconfig wlan0 up
sudo rfkill unblock all
sudo rfkill unblock wifi

reboot

and many others.

So it looks like the commands that should work simply don't work in this case, and THERE IS NO ERROR, which makes it worse.

This tells me that this really could be either a kernel bug or a driver bug.

At first I thought that since the device had worked and the kernel has no configs of its own, the kernel was out of the question. Then I realized: "What if the kernel or driver bug is that it is unable to power up the device correctly?"
If that were the case, once the device was powered down during the suspend process, the kernel or drivers would not wake it up again, so I can command it all I want; the kernel or drivers do nothing or do it wrong.
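One way to test that hypothesis is to look at what the kernel logged for the wifi driver around resume. A minimal sketch, assuming the systemd journal is available and the driver is ath10k (as in the restart service above); the function name and the choice of search terms are made up:

```shell
#!/usr/bin/env bash
# Sketch: keep only kernel log lines that mention the wifi driver
# or firmware. Reads log lines on stdin.

wifi_kernel_msgs() {
    grep -iE 'ath10k|firmware'
}
```

It would be fed with the kernel messages of the current boot, for example `journalctl -k -b | wifi_kernel_msgs`, once right after a fresh boot and once after a suspend and resume on battery, to compare what the driver reports in each case.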

On the other hand, the device state must be written somewhere, but I suspect it can be changed only through interaction with the kernel or drivers. We would need kernel/driver-level knowledge to know how to prepare such a signal manually.

The final conclusion is:

  1. Try RT kernels; if that doesn't work,
  2. Downgrade to the stable branch; if that doesn't work,
  3. Manually overwrite the driver files with files from older backups; if that doesn't work,
  4. Try restoring backups; if that doesn't work,
  5. Reinstall the system

The worrying thing is that after a reinstall the issue may reappear at some point, even if I stay on the stable branch, unless a fix gets passed down to stable. In the worst case I would have to change distro, which would be devastating because I've been using Manjaro for ca. 6 years and I cannot imagine going back to other distros. I don't even want to try. I love having cutting-edge software and the AUR. I hate openSUSE, and Solus has little software choice, which is important to me. In the end, I don't have much choice, so I don't know what to do, but let's worry about it later.

Then maybe give this a try so you won't have to nuke your current install.

  1. get rid of your old problematic backups and create a new one.
  2. shrink your manjaro partition and create a new one about 20gb or more if you have room.
  3. do a clean install of manjaro on that 20gb partition, and if it works properly then..
  4. run an update and see if it breaks wifi.
  5. if it all works, start replacing any network related files/configs over to your broken manjaro.
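Before copying anything in step 5, it may help to see which network-related files actually differ between the two installs. A minimal sketch, assuming both root filesystems are mounted somewhere readable; the function name and the list of directories are guesses at likely candidates, not a complete inventory:

```shell
#!/usr/bin/env bash
# Sketch: report which files differ between a clean and a broken install.
# $1 = mount point of the clean root, $2 = mount point of the broken root.
# The directory list is an assumption; extend it as needed.

compare_net_configs() {
    local clean_root="$1" broken_root="$2" rel
    for rel in etc/NetworkManager etc/modprobe.d etc/udev/rules.d; do
        if [ -d "$clean_root/$rel" ] && [ -d "$broken_root/$rel" ]; then
            # -r: recurse, -q: only say which files differ or are missing.
            diff -rq "$clean_root/$rel" "$broken_root/$rel"
        else
            echo "skipped $rel (missing on one side)"
        fi
    done
    return 0
}
```

For example, `compare_net_configs /mnt/clean /mnt/broken` (both mount points hypothetical) prints one line per differing or one-sided file, which narrows down what is worth copying over.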
