LUKS Performance and System Freeze

Hi there!

After I made the Manjaro installation on my wife's laptop unbootable by destroying GRUB with the help of user @gohlip (see "Grub all entries duplicated & Windows won't boot (Grub says UUID not found?)" for details), I've now done a fresh Manjaro installation.

My problem now is really horrible filesystem performance, around 20 MB/s.

This looks good I think:

$ cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1272543 iterations per second for 256-bit key
PBKDF2-sha256    1438375 iterations per second for 256-bit key
PBKDF2-sha512    1133595 iterations per second for 256-bit key
PBKDF2-ripemd160  954987 iterations per second for 256-bit key
PBKDF2-whirlpool  675628 iterations per second for 256-bit key
argon2i       4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       573,3 MiB/s      2432,5 MiB/s
    serpent-cbc        128b        73,4 MiB/s       511,1 MiB/s
    twofish-cbc        128b       154,1 MiB/s       319,5 MiB/s
        aes-cbc        256b       448,8 MiB/s      1865,2 MiB/s
    serpent-cbc        256b        80,1 MiB/s       504,1 MiB/s
    twofish-cbc        256b       160,6 MiB/s       319,7 MiB/s
        aes-xts        256b      1572,2 MiB/s      1603,4 MiB/s
    serpent-xts        256b       506,2 MiB/s       488,9 MiB/s
    twofish-xts        256b       311,7 MiB/s       311,6 MiB/s
        aes-xts        512b      1308,3 MiB/s      1259,1 MiB/s
    serpent-xts        512b       506,7 MiB/s       485,5 MiB/s
    twofish-xts        512b       301,3 MiB/s       310,9 MiB/s

But even though my system disk uses aes-xts, the real performance I see is much worse.

$ sudo cryptsetup status luks-b59ff969-7051-4bd4-bf75-0619f2b66885
/dev/mapper/luks-b59ff969-7051-4bd4-bf75-0619f2b66885 is active and is in use.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/sdb2
  sector size:  512
  offset:  4096 sectors
  size:    976562191 sectors
  mode:    read/write

$ lsblk --fs
NAME                                          FSTYPE      LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINT
sda                                                                                                                      
├─sda1                                        vfat        SYSTEM     22EB-45A0                                           
├─sda2                                        ntfs        Recovery   EE88A1F288A1BA0B                                    
├─sda3                                                                                                                   
├─sda4                                        ntfs        OS         A282A57482A54D9B                       15,6G    86% /run/media/biene/OS
├─sda5                                        ntfs                   24D28C6AD28C41D2                                    
├─sda6                                        swap                   e8926249-9576-49ac-8744-8b261c38e125                [SWAP]
└─sda7                                        ntfs        Restore    90EEA95DEEA93BFA                                    
sdb                                                                                                                      
├─sdb1                                        vfat                   5D0F-15D2                                99M     0% /boot/efi
└─sdb2                                        crypto_LUKS            b59ff969-7051-4bd4-bf75-0619f2b66885                
  └─luks-b59ff969-7051-4bd4-bf75-0619f2b66885 ext4                   f6fe0ada-29ea-46a7-b832-bae51f6a059d    168G    58% /

A related problem: when I move or copy files from an external disk that is faster than those 20 MB/s, the data fills some cache in RAM until it is full, and the system becomes very unresponsive. When I pause the copy or move task, the cache slowly clears and the system becomes usable again.

Help please. How do I disable or limit this copy/move cache so that it doesn't freeze the system by filling all available RAM? And how do I get my LUKS system disk to perform better than 20 MB/s, especially since cryptsetup benchmark reports 1308,3 MiB/s and 1259,1 MiB/s for encryption and decryption?

Note the first line of the benchmark output: "Tests are approximate using memory only (no storage IO)."
In other words, no drive I/O is involved in these benchmarks.

Also, what's your CPU? Older CPUs don't have AES-NI instructions and are thus noticeably slower when using AES encrypted volumes.
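
If you want to check for AES-NI yourself rather than go by the model number, the CPU flags in /proc/cpuinfo list it as aes. A minimal sketch (the function name has_aes_flag is made up for illustration, and the optional file argument only exists so it can be tried against a saved copy of cpuinfo):

```shell
# Succeeds (exit status 0) if the "aes" CPU flag is present.
# $1: file to inspect, defaults to /proc/cpuinfo
has_aes_flag() {
    grep -qw aes "${1:-/proc/cpuinfo}"
}

if has_aes_flag; then
    echo "AES-NI flag found"
else
    echo "no AES-NI flag found"
fi
```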

Yes, but the underlying hardware is an SSD that should be capable of around 500 MB/s, and it did not show any such problems on my previous Manjaro installation.

$ inxi -Fxzc0
System:    Host: Blumenmonster Kernel: 5.2.21-1-MANJARO x86_64 bits: 64 compiler: gcc v: 9.2.0 Desktop: KDE Plasma 5.16.5 
           Distro: Manjaro Linux 
Machine:   Type: Laptop System: ASUSTeK product: N750JV v: 1.0 serial: <filter> 
           Mobo: ASUSTeK model: N750JV v: 1.0 serial: <filter> UEFI: American Megatrends v: N750JV.207 date: 08/07/2013 
Battery:   ID-1: BAT0 charge: 40.2 Wh condition: 39.7/69.5 Wh (57%) model: ASUSTeK N750-62 status: Unknown 
CPU:       Topology: Quad Core model: Intel Core i7-4700HQ bits: 64 type: MT MCP arch: Haswell rev: 3 L2 cache: 6144 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 38334 
           Speed: 2234 MHz min/max: 800/3400 MHz Core speeds (MHz): 1: 2234 2: 1888 3: 2677 4: 1965 5: 2114 6: 1967 7: 2299 
           8: 2736 
Graphics:  Device-1: Intel 4th Gen Core Processor Integrated Graphics vendor: ASUSTeK driver: i915 v: kernel bus ID: 00:02.0 
           Device-2: NVIDIA GK107M [GeForce GT 750M] driver: N/A bus ID: 01:00.0 
           Display: x11 server: X.Org 1.20.5 driver: intel resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa DRI Intel Haswell Mobile v: 4.5 Mesa 19.2.1 direct render: Yes 
Audio:     Device-1: Intel Xeon E3-1200 v3/4th Gen Core Processor HD Audio driver: snd_hda_intel v: kernel bus ID: 00:03.0 
           Device-2: Intel 8 Series/C220 Series High Definition Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel 
           bus ID: 00:1b.0 
           Sound Server: ALSA v: k5.2.21-1-MANJARO 
Network:   Device-1: Intel Centrino Advanced-N 6235 driver: iwlwifi v: kernel port: f040 bus ID: 03:00.0 
           IF: wlp3s0 state: up mac: <filter> 
           Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: ASUSTeK driver: r8169 v: kernel port: d000 
           bus ID: 04:00.0 
           IF: enp4s0 state: down mac: <filter> 
Drives:    Local Storage: total: 3.42 TiB used: 1.55 TiB (45.3%) 
           ID-1: /dev/sda vendor: LITE-ON IT model: LCS-256M6S size: 238.47 GiB 
           ID-2: /dev/sdb vendor: Samsung model: SSD 840 EVO 500GB size: 465.76 GiB 
           ID-3: /dev/sdc type: USB vendor: Western Digital model: WD My Passport 0827 size: 2.73 TiB 
Partition: ID-1: / size: 457.35 GiB used: 266.02 GiB (58.2%) fs: ext4 dev: /dev/dm-0 
           ID-2: swap-1 size: 111.33 GiB used: 71.8 MiB (0.1%) fs: swap dev: /dev/sda6 
Sensors:   System Temperatures: cpu: 71.0 C mobo: N/A 
           Fan Speeds (RPM): cpu: 3800 
Info:      Processes: 273 Uptime: 5h 54m Memory: 15.55 GiB used: 3.14 GiB (20.2%) Init: systemd Compilers: gcc: 9.2.0 
           Shell: bash v: 5.0.11 inxi: 3.0.36

Well OK, your CPU is fine, it has full AES-NI support.

So there must be something else, 20 MB/s is definitely too slow.
Do you have anything in your logs/journal?

About the cache issue, I suggest you halve the values of vm.dirty_ratio and vm.dirty_background_ratio.

What logs/journals? Sorry if I'm being dumb, I usually rely on stuff to just work (TM) :wink:

journalctl
Look especially for I/O errors. (I'm not saying there are any, but please have a look.)
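
To narrow the journal down to kernel messages and pick out storage errors, something along these lines may help. It is only a sketch: filter_io_errors is a made-up helper name, and the patterns are a guess at typical kernel wording, so an empty result does not prove the disk is healthy.

```shell
# Keep only lines that look like block-layer or ATA errors.
# Reads log text on stdin; the pattern list is an assumption, extend as needed.
filter_io_errors() {
    grep -iE 'I/O error|failed command|medium error'
}

# Usage (kernel messages of the current boot):
#   journalctl -k -b --no-pager | filter_io_errors
```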

Mostly hundreds of lines reading:
plasmashell[2677]: qml: temp unit: 0

And a few blocks of those in between:

Okt 16 16:16:05 Blumenmonster plasmashell[2677]: qml: temp unit: 0
Okt 16 16:16:07 Blumenmonster plasmashell[2677]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationItem.qml:138:17: QML Heading: Binding loop detected for property "height"
Okt 16 16:16:07 Blumenmonster plasmashell[2677]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationItem.qml:138:17: QML Heading: Binding loop detected for property "height"
Okt 16 16:16:08 Blumenmonster plasmashell[2677]: qml: temp unit: 0
Okt 16 16:16:08 Blumenmonster plasmashell[2677]: qml: temp unit: 0
Okt 16 16:16:08 Blumenmonster plasmashell[2677]: qml: temp unit: 0
Okt 16 16:16:08 Blumenmonster plasmashell[2677]: qml: temp unit: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37018, resource id: 117458629, major code: 15 (QueryTree), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37023, resource id: 117458629, major code: 18 (ChangeProperty), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37026, resource id: 117440522, major code: 42 (SetInputFocus), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37027, resource id: 117440522, major code: 25 (SendEvent), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37051, resource id: 117440522, major code: 20 (GetProperty), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37052, resource id: 117440522, major code: 20 (GetProperty), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37054, resource id: 117440522, major code: 20 (GetProperty), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37055, resource id: 117440522, major code: 20 (GetProperty), minor code: 0
Okt 16 16:16:10 Blumenmonster kwin_x11[2667]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 37056, resource id: 117440522, major code: 15 (QueryTree), minor code: 0
Okt 16 16:16:11 Blumenmonster plasmashell[2677]: qml: temp unit: 0

This might be bad? I found it in dmesg, but it is from a few hours ago, so it's probably not relevant either:

[  +0,003018] systemd[1]: Starting Journal Service...
[  +0,269364] systemd-coredump[27621]: Process 16609 (systemd-journal) of user 0 dumped core.
[  +0,000009] systemd-coredump[27621]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.68738b849858417286dc9f7811806d83.16609.1571232565000000.lz4
[  +0,000006] systemd-coredump[27621]: Stack trace of thread 16609:
[  +0,000006] systemd-coredump[27621]: #0  0x00007ffb66dfaa87 __GI___pthread_timedjoin_ex (libpthread.so.0)
[  +0,000006] systemd-coredump[27621]: #1  0x00007ffb675a411f n/a (libsystemd-shared-242.so)
[  +0,000005] systemd-coredump[27621]: #2  0x00007ffb675a4237 n/a (libsystemd-shared-242.so)
[  +0,000005] systemd-coredump[27621]: #3  0x00007ffb675a7826 journal_file_append_object (libsystemd-shared-242.so)
[  +0,000005] systemd-coredump[27621]: #4  0x00007ffb675a7ee6 n/a (libsystemd-shared-242.so)
[  +0,000005] systemd-coredump[27621]: #5  0x00007ffb675abda3 journal_file_append_entry (libsystemd-shared-242.so)
[  +0,000005] systemd-coredump[27621]: #6  0x0000558295a68191 n/a (systemd-journald)

These are unrelated.

I don't have any further ideas, sorry... well one more question:
When exactly do you encounter these low speeds? Always? Or just when you copy files from external USB to your SSD?

Well, USB to my SSD is the only big file operation I've tried so far.

Maybe this is a hint: I got one of these popups before, and now again. The message reads:

Unable to create io-slave. Cannot talk to klauncher: Message recipient disconnected from message bus without replying

It gives me the options Retry, Skip, AutoSkip & Cancel.

Can you try copying a file in a terminal with cp -v and see whether it makes a difference?

Can you also try copying a large file from LUKS to LUKS (not from USB)?
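
For a number that is easier to compare than the file manager's progress dialog, dd can write a test file and flush it to disk before reporting. A rough sketch, assuming GNU dd; bench_write and the file name in the usage line are placeholders:

```shell
# Write $2 MiB of zeros to file $1 and print dd's summary line.
# conv=fsync makes dd flush the data to disk before it reports the speed,
# so the page cache does not inflate the result.
bench_write() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fsync 2>&1 | tail -n 1
}

# Usage (writes 2 GiB to the LUKS root, remove the file afterwards):
#   bench_write ~/ddtest.bin 2048 && rm ~/ddtest.bin
```

Running it once on the LUKS volume and once on the external disk should give a hint as to which side is the slow one.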

OK, so I tried copying a file using
cp -v
in a terminal. Performance is a bit better (~30-40 MB/s), but it still freezes the GUI, with lots of CPU wait time.

Copying a file within the SSD LUKS root partition shows the same bad performance & behavior.

Something else I noticed: when I start a copy or move job (no matter whether from the GUI or the CLI), it first just reads the file from the source for around 10 seconds before it even starts to write to the target disk. The file does, however, already show up (virtually?) in the target directory, with the size that has been read so far. So there is some bad caching in between that I want to disable or limit. Where and how?

FYI only:
Kernel 5.2.21 has reached EOL (end of life).
If you feel like experimenting, try a newer or an older kernel :wink:

I've just tried 4.19. At first I thought it worked, since it gave me read/write performance of around 300 MB/s, but after around 20 seconds it went back to 20 MB/s and made the KDE panels freeze up again.

That's just how the virtual memory system works. You can tweak and tune the VM system, but you cannot disable it.
You can, for example, reduce the dirty limits (see my edited comment above), which should already help a bit.
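
To get a feel for why the desktop stalls: the dirty limits are percentages of memory, so on a machine with about 15.5 GiB of RAM they allow a lot of unwritten data to pile up. A back-of-the-envelope sketch, assuming the common kernel defaults of vm.dirty_ratio = 20 and vm.dirty_background_ratio = 10 (check yours with sysctl vm.dirty_ratio vm.dirty_background_ratio) and using total RAM, although the kernel actually bases the limit on available memory. dirty_limit_mib is a made-up helper:

```shell
# Approximate dirty-page ceiling in MiB.
# $1: ratio in percent, $2: memory size in KiB
dirty_limit_mib() {
    echo $(( $2 * $1 / 100 / 1024 ))
}

# 15.55 GiB is roughly 16305357 KiB
dirty_limit_mib 20 16305357   # prints 3184
dirty_limit_mib 10 16305357   # prints 1592
```

At a write speed of 20 MB/s, around 3 GiB of dirty pages would take well over two minutes to flush, which would fit the long freezes described above.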

Where do I find those settings?

Here's a suggestion of settings, I honestly don't know how well they work for you.

Put these into a file in /etc/sysctl.d and either load with sudo sysctl -p /etc/sysctl.d/filename to apply immediately, or simply reboot.

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.extfrag_threshold = 100
vm.swappiness = 10
vm.vfs_cache_pressure = 60

If these settings do not help, or if it gets worse (remember it's just a suggestion), simply delete the file again.
Or you can test each setting individually on the fly and experiment with different values, e.g.
sudo sysctl -w vm.dirty_background_ratio=5
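
To see whether a changed value has any effect, you can watch how much data is waiting to be written while a copy runs. A small sketch; dirty_stats is a made-up name, and the file argument only exists so it can be tried against a saved copy of meminfo:

```shell
# Print the Dirty and Writeback counters (in kB) from /proc/meminfo.
# $1: file to read, defaults to /proc/meminfo
dirty_stats() {
    grep -E '^(Dirty|Writeback):' "${1:-/proc/meminfo}"
}

# Usage: refresh once per second during a copy
#   watch -n 1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"
```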

It must be said though, it's entirely possible and probable that the problem lies elsewhere.

After you fix this, and with encrypted LUKS, make sure you install grub-customizer again. It is in the repository.

Did that, thanks. I tried an SSD-to-SSD copy and got around 200 MB/s initially, and it kept that up for longer than before, but it still degraded to 30 MB/s after around 35 seconds.
Copying from USB to SSD gets around 80 MB/s for 50 seconds before degrading to 25 MB/s, which is also when the desktop starts to become unresponsive again.

You poor petty man. It's probably hard to believe, but my coming back with another problem, after the one you tried to help me with that we couldn't solve and only made worse, is not something I do to spite you. It's not really about you at all.

BTW, the grub customizer was not what caused the Manjaro installation to be unbootable, it was your suggestion to replace grub with the vanilla-grub version.

Then go ahead and install grub-customizer. Install another kernel and do update-grub in your encrypted btrfs OS. Go ahead. Petty? You bet!

Please stop with the insults and name calling.

You can flag @gohlip as off-topic but the member has every right to comment as you called the member in by name.

And you started this topic by telling everyone that not you but @gohlip was to blame, who is one of the best at solving GRUB issues. He can be hard to read, with his special humor, but his advice is usually spot on.

I had a look at your previous issue, and the cause was not @gohlip's suggestions. You failed to mention the LUKS-encrypted installation, and you didn't mention that you had used grub-customizer either. grub-customizer is known for causing GRUB issues.

Since you didn't provide all the information needed, you were presenting an XY problem, so don't blame others for your issues or for their inability to provide the exact solution.


Addendum:

Installing grub-vanilla replacing the Manjaro grub works.

I know because I have done it - not today - earlier when the package was tested.
