WiFi stopped working, suddenly

Yesterday it was fine. Today I booted into the system, didn't check the wifi status (it usually connects during boot) and closed the lid, which triggers the suspend action.

When I opened the lid, there was no wifi connection. Network restart, reboot, hard reboot: none of it worked. It looks as if there were no wifi networks around, or as if wifi was on but didn't try to search for networks.

Screenshot-2020-03-12-16-32-43

I booted into my Windows install, which I hadn't booted for over half a year, and the wifi works fine there, so this is not a hardware failure.

I've been using the unstable branch lately, so I thought this might be caused by the recent systemd 245 craziness that I complained about in the Announcement section. I couldn't switch to testing and downgrade because of some file ownership and dependency issue:

The lack of internet is a show stopper, so I restored my backup from a few days ago (7.03.2020) with systemd 244 and... no change, the wifi is still not working.

Since this wasn't caused directly by some package versions, maybe some config got corrupted, I don't know. I'm writing from my Win 10 :frowning:, because Manjaro without wifi is not usable for me. I don't have a long enough cable to use it normally.

Please help me figure this out. I'm not familiar with networking.

I'm posting some VERY OLD inxi output that I bookmarked and posted elsewhere, because at the moment of writing I'm on Windows, so ignore the software-related info, which is clearly obsolete:

System:    Host: alienware-linux-PC Kernel: 4.10.0-1-MANJARO x86_64 (64 bit gcc: 6.3.1)
           Desktop: KDE Plasma 5.9.2 (Qt 5.8.0) Distro: Manjaro Linux
Machine:   Device: laptop System: Alienware product: Alienware 17 R3 v: 1.2.3
           Mobo: Alienware model: Alienware 17 R3 v: A00
           UEFI: Alienware v: 1.2.3 date: 11/11/2015
Battery    BAT1: charge: 55.3 Wh 100.0% condition: 55.3/64.0 Wh (86%)
           model: COMPAL PABAS0241231 status: Full
CPU:       Quad core Intel Core i7-6700HQ (-HT-MCP-) cache: 6144 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 20744
           clock speeds: max: 3500 MHz 1: 799 MHz 2: 799 MHz 3: 799 MHz
           4: 799 MHz 5: 799 MHz 6: 799 MHz 7: 799 MHz 8: 799 MHz
Graphics:  Card-1: Intel HD Graphics 530 bus-ID: 00:02.0
           Card-2: NVIDIA GM204M [GeForce GTX 970M] bus-ID: 01:00.0
           Display Server: X.Org 1.19.1 driver: intel
           Resolution: 1920x1080@60.02hz
           GLX Renderer: Mesa DRI Intel HD Graphics 530 (Skylake GT2)
           GLX Version: 3.0 Mesa 17.0.0 Direct Rendering: Yes
Audio:     Card Intel Sunrise Point-H HD Audio
           driver: snd_hda_intel bus-ID: 00:1f.3
           Sound: Advanced Linux Sound Architecture v: k4.10.0-1-MANJARO
Network:   Card-1: Qualcomm Atheros Killer E2400 Gigabit Ethernet Controller
           driver: alx port: d000 bus-ID: 3b:00.0
           IF: enp59s0 state: down mac: <filter>
           Card-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           driver: ath10k_pci bus-ID: 3c:00.0
           IF: wlp60s0 state: up mac: <filter>
           Card-3: Qualcomm Atheros usb-ID: 001-004
           IF: null-if-id state: N/A speed: N/A duplex: N/A mac: N/A
Drives:    HDD Total Size: 1120.2GB (51.5% used)
           ID-1: /dev/sdb model: HGST_HTS721010A9 size: 1000.2GB
           ID-2: /dev/sda model: KINGSTON_SM2280S size: 120.0GB
Partition: ID-1: / size: 39G used: 32G (87%) fs: ext4 dev: /dev/sda4
           ID-2: /home size: 56G used: 36G (68%) fs: ext4 dev: /dev/sda2
           ID-3: swap-1 size: 16.78GB used: 0.00GB (0%) fs: swap dev: /dev/sda1
Sensors:   System Temperatures: cpu: 57.5C mobo: N/A
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 201 Uptime: 1:37 Memory: 2452.7/7887.6MB
           Init: systemd Gcc sys: 6.3.1 Client: Shell (bash 4.4.121) inxi: 2.3.8

Have you looked at any blocks?

$ rfkill 

and possibly tried unblocking?

$ rfkill unblock all
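
For reference, rfkill reports the same soft/hard block flags that the kernel exposes under /sys/class/rfkill, so they can also be read directly. The following is only a rough sketch: the function name `show_rfkill_state` is made up for illustration, and the optional directory argument exists only so it can be tried against a fake tree.

```shell
#!/bin/sh
# Print the soft/hard block state of every rfkill device.
# The sysfs root is a parameter only so the function can be tried
# against a fake directory tree; the default is the real kernel path.
show_rfkill_state() {
    root="${1:-/sys/class/rfkill}"
    for dev in "$root"/rfkill*; do
        [ -d "$dev" ] || continue   # no rfkill devices found
        name=$(cat "$dev/name")
        type=$(cat "$dev/type")
        soft=$(cat "$dev/soft")     # 1 = blocked in software
        hard=$(cat "$dev/hard")     # 1 = blocked by a hardware switch
        echo "$name type=$type soft=$soft hard=$hard"
    done
}
```

Calling `show_rfkill_state` with no argument reads the real devices. A wlan entry with soft=0 and hard=0 means neither block is active.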

Thanks, I tried it and there is no soft or hard block on wlan. Besides, I know the key combination to turn the network off (hard switch) and that's not it.

If you look at my screenshot above, you will see that the system shows wlan as active, but the connections list is empty. What is striking is that I can't uncheck the wifi box to turn it off.

When I use the command to restart the network service, the network goes down and... doesn't re-appear. This happened sporadically in the past, but a hard reboot always fixed it (normal reboots did nothing). So this suggests some deep-level issue.

I switched to another kernel, the same. I switched to fallback mode, the same. I logged in as a default test user, the same. So this is not a user config issue.

At this point I'm starting to believe that some package got corrupted. Maybe going further back with backups will help? On the other hand, if some file is corrupted, timeshift should detect it as different and replace it with the backup file.

Hmmm... Who is the network expert on the forum? @dglt?

Here is a list of things you can attempt:



Disable MAC Address Randomization in NetworkManager with the following command:

echo -e "[device]\nwifi.scan-rand-mac-address=no" | sudo tee /etc/NetworkManager/conf.d/disable-random-mac.conf

After creating the new conf file, power down both your router and your computer.
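
A slightly more portable way to write the same drop-in, as a sketch only: `echo -e` behaves differently between shells, while `printf` does not. The function name `write_mac_conf` is hypothetical, and the target directory is a parameter so the result can be inspected in a scratch directory before touching /etc.

```shell
#!/bin/sh
# Write the drop-in that disables Wi-Fi scan MAC randomization.
# conf_dir is a parameter so the function can be tried on a scratch
# directory first; NetworkManager itself reads /etc/NetworkManager/conf.d.
write_mac_conf() {
    conf_dir="$1"
    printf '[device]\nwifi.scan-rand-mac-address=no\n' \
        > "$conf_dir/disable-random-mac.conf"
}
```

On the real system the function has to run as root with /etc/NetworkManager/conf.d as the argument, and NetworkManager has to be restarted (or the machine rebooted) before the setting takes effect.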



Fully reset your BIOS to the factory defaults. Always completely remove all power sources from the computer first. You may alternatively wish to remove the small battery from the motherboard as well.

Before proceeding, disconnect any other devices that can access the internet and do not require troubleshooting. This includes LAN cables, USB to Ethernet adapters, and USB phone tethering.

Do a hard power down, which means completely powering off and then removing all your computer's power sources.

Power down the laptop and remove the battery and power plug. Let the laptop sit without power for a few minutes. Then, hold down the power button for approximately 20 seconds. Pressing the power button will help drain all leftover power from the unit’s motherboard.

Then reattach the AC power (no battery), and restart. Boot into the BIOS settings utility and reset the BIOS back to the factory default and save the default settings. There should be an option similar to "Reset BIOS Settings to the factory default levels".

You may need to change a few settings in the BIOS afterwards (such as secure boot) to get Manjaro to boot correctly again.


Clearing your CMOS (BIOS) may also be accomplished by removing your motherboard's battery.

Steps to clear BIOS using the CMOS battery method:

Turn off all peripheral devices connected to the computer.

Disconnect the power cord and, on a laptop, remove the main battery first (not the mobo CMOS battery).

Remove the computer cover.

Find the small round CMOS battery on the board.

Remove the battery.

Wait 1–5 minutes, then reconnect the battery.

Put the computer cover back on.


There is yet another method to clear your CMOS (BIOS).

Before proceeding with this alternate method, again be sure all power sources have been removed/turned off.

Your CMOS may be able to be cleared via jumper pins on your motherboard. This option is not available on all mobos. Most desktop motherboards have a jumper pin to do this; many laptops and tablets do not.

Search your motherboard model on your manufacturer's website to find directions on where the pins are located and how best to perform this procedure (if you do not have your motherboard manual). Near the jumper pin the mobo usually has a label such as "Clear CMOS". Perform the procedure by moving the pin to the alternate position, then reboot. After booting the computer fully, turn off your computer, remove power again and return the jumper to its original position. You should now have reset the system to factory.

Please follow your manufacturer's detailed guidelines. These directions are generally applicable, but some manufacturers' procedures may differ.



Restart your computer after disabling MAC randomization and resetting your bios. Check to see if there has been any improvement with your networking. If not, then put your computer into suspend. Give it a minute of sleep, then resume, and run these commands to see if anything has changed:

inxi -SMa; hwinfo --netcard --bluetooth | grep -Ei "(hotplug|speed|model|status|cmd|file|detected|driver:)" | grep -v "Config S" && echo "System install date $(head -n1 /var/log/pacman.log | cut -d " " -f1 | cut -c 2-11)" 
cat /etc/NetworkManager/conf.d/disable-random-mac.conf

Please post the outputs of any commands requested.


Questions:

Is your bios up to date?

Are you also booting Windows (or any other OS) on this machine?

Have you tested other kernels yet? If so, which ones?

I would suggest you test every kernel above 4.14 including the real time 4.19 & 5.4 kernels, as well as 5.4 & 5.5. I see you have already tested kernel 5.6.



Add ipv6.disable=1 & pcie_aspm=off via GRUB boot parameters.

The following command will add GRUB kernel boot parameters to /etc/default/grub:

sudo cp /etc/default/grub /etc/default/grub.bak && sudo sed '/^GRUB_CMDLINE_LINUX_DEFAULT=/s/"$/ ipv6.disable=1 pcie_aspm=off "/g' -i /etc/default/grub

After adding the boot parameter(s) and saving your changes, run:

sudo update-grub 

Reboot, and test your connection for improvement.

*sed magic courtesy of @dalto
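
One thing to be aware of: the one-liner above appends the parameters every time it is run. Below is a minimal sketch of a repeatable variant that skips parameters already present and can be tried on a copy of the file first. The function name `add_kernel_params` is hypothetical, and it assumes the parameters contain no `/`, `&` or backslash characters.

```shell
#!/bin/sh
# Append kernel parameters to GRUB_CMDLINE_LINUX_DEFAULT in the given file,
# skipping any parameter that is already present on that line.
# Assumes parameters contain no '/', '&' or backslash (they go into sed).
add_kernel_params() {
    file="$1"; shift
    for param in "$@"; do
        line=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" || true)
        case "$line" in
            # Parameter already there, delimited by spaces and/or the quotes.
            *" $param "*|*" $param\""*|*"\"$param "*|*"\"$param\""*) continue ;;
        esac
        # Replace the closing quote with " <param>" plus the closing quote.
        sed -i "/^GRUB_CMDLINE_LINUX_DEFAULT=/s/\"\$/ $param\"/" "$file"
    done
}
```

To preview the result, copy /etc/default/grub to a scratch file, run `add_kernel_params` on the copy with `ipv6.disable=1 pcie_aspm=off`, and diff the two files. The real file needs root, and `sudo update-grub` is still required afterwards.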



You may also want to either temporarily mask or uninstall tlp (and reboot).

this guy :point_up:

  1. I can't remove the battery. My laptop's construction is the worst of the worst. To get to it I would have to tear the laptop apart completely. I did it once to replace a nonfunctional battery and I don't wish to do it again. It took me several hours and I risked completely damaging the computer.

  2. I was able to launch Manjaro Live and... it has wifi without problems. So the issue is not with the hardware, and the shenanigans with the batteries and power are not needed.

There is something in the Manjaro installation that broke when I suspended or rebooted the computer.

Unfortunately, I was trying to dig further and restored a backup from 26.02.2020 and... the system doesn't boot.

It turns out that timeshift is a sh&%t and cannot be relied on. I have issues chrooting from the current live USB and timeshift segfaults.

So at the moment I don't even have access to my system.

EDIT: I'm downloading a fresh iso to get the newest live and will see if chroot will work there.

Is there a way to debug the problem? Your post gives some drastic solutions blindly. Since this isn't a hardware problem, most of the suggestions are useless in my case.

The fact that network-manager is kinda stuck-frozen in an enabled but not fully working state suggests an issue with the network manager or the network service itself.

I also checked various wifi networks, nothing. Network-manager simply doesn't look for them and the list is empty. Again, other devices, computers, systems (including Manjaro live) work well, so the issue is somewhere within the network service that won't start when restarted manually.
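
On the debugging question, a minimal sketch of a first look using standard tools: NetworkManager's own status via nmcli, its log for the current boot via journalctl, and kernel messages from the wifi driver. The function name `collect_wifi_diag` is made up, and the ath10k filter is an assumption based on the driver shown in the old inxi output above. All commands only report; none of them change settings.

```shell
#!/bin/sh
# Collect the usual first-look Wi-Fi diagnostics in one go.
# Each step is skipped if its tool is missing and failures are ignored,
# so the function only reports and does not change any settings.
collect_wifi_diag() {
    echo "== NetworkManager state =="
    if command -v nmcli >/dev/null 2>&1; then
        nmcli general status 2>&1 || true
        nmcli device status 2>&1 || true
        nmcli device wifi list 2>&1 || true
    else
        echo "nmcli not found"
    fi

    echo "== NetworkManager log, current boot =="
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -b -u NetworkManager --no-pager 2>&1 | tail -n 40 || true
    else
        echo "journalctl not found"
    fi

    echo "== Kernel messages from the ath10k driver =="
    dmesg 2>/dev/null | grep -i ath10k || echo "no ath10k messages readable"
}
```

Running `collect_wifi_diag > wifi-diag.txt` right after a failed resume gives one file that can be pasted into the thread. Firmware load failures or repeated scan errors would show up in the last two sections.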

You can simply run your battery until it is completely drained if you can't remove it. My troubleshooting steps are not really that drastic. Resetting your BIOS is a simple procedure, and many BIOSes allow you to back up your configuration if you have a lot of custom settings.

The most likely resolution in situations like this is to reset or update your BIOS. Upgrade/downgrade your firmware and test alternate kernel versions.

Timeshift is not a sh*t program, but it does have its limitations. There are rare instances when timeshift may not be able to recover from a disaster. In those situations you really need to have made backup images of your drive. Clonezilla is the preferred backup program for full drive imaging. You can also simply use the dd utility to clone your drive to another.
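
As a rough sketch of the dd approach: copy the source to an image and then compare the two byte for byte. The function name `clone_to_image` is hypothetical, and it assumes GNU dd (for `bs=4M` and `conv=fsync`). The source should not be mounted or otherwise in use while it is copied, and on a real drive the source would be a device node such as /dev/sdX (placeholder).

```shell
#!/bin/sh
# Copy a block device (or, for a dry run, any file) to an image and
# verify that the copy is byte-identical to the source.
clone_to_image() {
    src="$1"
    dst="$2"
    # conv=fsync makes dd flush the data to disk before it exits.
    dd if="$src" of="$dst" bs=4M conv=fsync 2>/dev/null || return 1
    # cmp exits non-zero if the two differ anywhere.
    cmp "$src" "$dst"
}
```

Double-check the source and destination arguments before running this against real devices. dd overwrites the destination without asking.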

So far timeshift is the disaster.
The system was fine until I used timeshift. I can't use timeshift in chroot because it segfaults. So basically it's a useless program that rather breaks the system.

It looks like many /var/lib/pacman/local/ files are missing, while I get hundreds of duplicates in the pacman db (deleting the db and restoring it doesn't help, the duplicates persist), so the restore process completely messed up my system and I'm not sure how to recover from it. I can't just use a system update, because that part is broken.

i think you can use timeshift gui from a live environment without needing to chroot, tell it where the backups are stored and where you want it restored to. assuming you have more than just one snapshot on timeshift then try a couple of them and also make sure (it's been a while since i used timeshift) you check the option to update initramfs/mkinitcpio

Thanks, timeshift from the GUI works, however the system still doesn't boot, although I restored to the last backup that worked before. So GRUB must be broken somehow.

I marked the option to refresh the initramfs and will see if it helped. I usually mark not to reinstall GRUB because I don't understand the options and, although I asked about it many times, NO ONE explained it to me. Everybody omits it, which is super frustrating, because this is probably the root of the issue and possibly also a way to fix it.

If I had a BIOS install, it would be easy: I would choose the hard drive and MBR install. With UEFI I have no idea what to choose. The UEFI bootloader is on the EFI partition, which is not visible in timeshift, so should I choose the drive itself or the root partition, since EFI is mounted there? This is irritating, because I know how to partition systems and install Linux and dual boots without a problem, but with the restore process I'm completely lost.

EDIT: When I restore the same backup a second time, timeshift still has plenty of files (a few thousand) marked as changed... So what gives? Was the first restore not successful, or what?

EDIT 2: This time I got the info "Restored with errors". Updating the initramfs took a very long time, so I guess this is the problem and the system still won't boot, but I guess I have to check it anyway.

OK, after the second restore I was able to boot the system, but instead of a Plasma session I could only log into LXQt. Wifi was still a no-show. However, I ran sudo update-grub and attempted a THIRD restore (still the same, last backup) and it FINALLY BOOTED AND WIFI WAS WORKING!

Unfortunately, pacman is still broken and I'm in the middle of figuring out how to fix it, so I can update the system.

This is absolutely mind-blowing. Zero consistency and stability. Timeshift is incredibly unreliable, it seems :frowning: .

glad to hear you made some progress. what's happening with pacman?

I fixed it. In /var/lib/pacman/local/ I had versions of the files both from the backup and from before restoring the backup, and those duplicated files were interfering with pacman. I deleted the content and manually restored it from the backup (since timeshift wasn't able to do it automatically despite many attempts) and that fixed the double entries problem.

Then it turned out that during the branch change or update I got multiple "file exists" errors which stopped the process, so I figured that timeshift (again) wasn't able to restore the /usr/ content, so I had to delete it and restore it from the backup manually (as root from the live system).

There were some other hiccups along the way but I'm recovering slowly, because I also removed my configs to have a clean start, so I'm downloading configs that I'm sure are OK and checking the system.

Anyway, the system works, the update was successful, wifi works, kwin hasn't crashed so far, but I have plenty to do with re-creating the setup from scratch, because I don't want to use certain desktop configs that were very old.

So it looks like the backup content is OK but timeshift has issues with restoring content, is inconsistent and unreliable. And yes, I deleted the timeshift configuration and set it up anew; it still didn't help. Luckily the manual method works, but it's a lot of work and takes a lot of time.

So despite a very poor, unbootable state I was able to fix the system and my almost 4 year old OS install is intact, so far. Maybe I'll reconsider a reinstall in case of problems, but I'm lazy. Too many personalizations were done over the years. Anyway, we'll see if the system works correctly; so far it works OK.
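
For anyone hitting the same thing, here is a small sketch for spotting such double entries before deleting anything. Entries in the local database are directories named name-version-release, and neither version nor release can contain a hyphen, so the package name is what remains after dropping the last two hyphen-separated fields. The function name `find_duplicate_pkgs` is made up for illustration.

```shell
#!/bin/sh
# List package names that have more than one entry in a pacman local
# database directory. Entries are directories named name-version-release;
# version and release never contain a hyphen, so the name is whatever is
# left after dropping the last two hyphen-separated fields.
find_duplicate_pkgs() {
    db_dir="${1:-/var/lib/pacman/local}"
    for entry in "$db_dir"/*/; do
        [ -d "$entry" ] || continue   # empty or missing directory
        basename "$entry"
    done | sed 's/-[^-]*-[^-]*$//' | sort | uniq -d
}
```

From a live system the argument would be the database path on the mounted root, for example under /mnt. Every name printed has at least two version directories, and one of them needs to go.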

One last tip: Start making Crash-proof Backups.

  • CloneZilla for the system backup (/boot, including EFI; /)
    Do one before any major upgrade, do another one after 24h of the same successful upgrade and keep 3 system backups.
  • Borg Backup for /home: that's a deduplicating, compressing, encrypting (if you want) backup system and the best there currently is. (I've used them all: in another forum I used to be called "dnʞɔɐq ˙ɹW") This one has all the advantages of a professional backup system except the "pulling backups automatically from a central server" (last time I looked)

I'll try that, but the problem with other backup solutions is that they usually:

  • are not accessible (you can't browse, access files or copy them manually)
  • don't back up changes but whole partitions, which takes a huge amount of time and space, space that I don't have

Timeshift on paper is the ideal tool, but the restoring is unreliable :frowning:.

Yup, CloneZilla and Borg both have those disadvantages, but:

  • A system backup must be taken off-line as one atomic transaction and restored as such
    Browsing files on a system backup is a disadvantage.
  • Borg has a GUI (Disclaimer: which I don't use, I'm CLI)

¯\_(ツ)_/¯

The thing is, I rarely use backups as a whole. More often I need to find some old config or install file. So far I've never destroyed a backup by browsing it, yet I need the browsing regularly. With an un-browsable backup I'm basically screwed, as restoring a whole backup is the last and most drastic resort, done super rarely (once or twice a year at most, or even less often).

In that case you should do something slightly different from me:

  • Cold system backup (/boot partition, / partition, swap partition, excluding /home) using CloneZilla that you (hopefully) never need, but that is there if you run into restore hell.
  • Warm system backup of / (so the entire file system) where you can restore individual files using Borg, through a GUI or not...

Remember: A backup is all about the restore! Not about the backup itself...

Actually I have used dd to clone from a running system and it worked just fine. It's really not supposed to be done that way, but the new system duplicate seemed flawless when used.

There is a very old backup program called Redo Backup that images drives. It has a GUI and is super simple to use compared to the terminal-based Clonezilla.

I don't recommend it normally because it is outdated. Newer hardware will not likely work with RedoBackup, but older hardware generally can run on the very old live disk image it uses.

I'm pretty sure @michaldybczak uses an older laptop, so it might be worth trying. Clonezilla does take a bit of getting used to, redo is much simpler.
