Failed to mount btrfs RAID0, after a repair it worked

I had a serious issue with my btrfs setup today. After compiling linux-ck 4.16.16 I wasn't able to boot: my btrfs partition failed to mount. I have two 128GB M.2 SSDs, created a 127GB btrfs partition on each (sda3 and sdc3), and then combined them into a RAID0 with btrfs device add.

~ >>> lsblk -f                                                                 
NAME   FSTYPE     LABEL     UUID                                 MOUNTPOINT
sda                                                              
├─sda1 vfat                 7550-DABB                            /boot/efi
├─sda2 ext2                 302ae0b1-b913-471b-a1ab-3bb4575b4d00 /boot
└─sda3 btrfs                ea59470a-443b-4dbd-90b9-b6e0e6b32876 /home/eugen/Dat
sdb                                                              
├─sdb1 btrfs      Linux     cf269dfd-df9c-4939-a8ff-796a4ced6643 /mnt
├─sdb2 swap       swap      e487a4eb-8a7f-4196-896e-10ca767c4269 
├─sdb3 vfat                 4692-78DF                            
├─sdb4 crypto_LUK           7b749090-f58c-484e-ae9f-e7415f72da2e 
├─sdb5 xfs        DATA2     3de85b28-17b3-4a39-bd5e-61b4e118d7fd 
├─sdb6 xfs        DataHDD   bb0e217e-dc93-486c-93f8-c5d73dcf7c4d 
└─sdb9 ext4       casper-rw 62eb1d2b-5aeb-4bfd-b9df-e89162626a58 
sdc                                                              
├─sdc1                                                           
├─sdc2 ext2       boot32    bd30fb1c-2107-4489-98df-7d68a02a148b 
└─sdc3 btrfs                ea59470a-443b-4dbd-90b9-b6e0e6b32876 

I forgot to write down what the error was; I will post it from the browser history on a different system.
EDIT: I'm reconstructing the errors from my browser history, because I searched for them:

  • this happens when I try to mount: btrfs mount: /mnt: wrong fs type, bad option, bad superblock on /dev/, missing codepage or helper program, or other error.
  • this was in dmesg output: BTRFS critical (device sda3): corrupt leaf: root=1 block=38301958144 slot=199, bad key order, prev (576460795430043648 168 73728) current (43126693888 168 53248)
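
The two keys in that dmesg line can be compared numerically. The sketch below assumes that key type 168 is an extent item whose third field is the extent length in bytes; the variable names are made up for illustration. Under that assumption, the previous key is exactly one bit away from a value that would sort correctly, which would fit a memory or other hardware fault, although it does not prove one.

```python
# Keys copied from the dmesg line: (objectid, type, offset).
prev_objectid, prev_type, prev_offset = 576460795430043648, 168, 73728
curr_objectid, curr_type, curr_offset = 43126693888, 168, 53248

# Index of the highest set bit in the suspiciously large previous objectid.
top_bit = prev_objectid.bit_length() - 1

# Clear only that one bit and see what value remains.
candidate = prev_objectid & ~(1 << top_bit)

# Assumption: objectid is the extent start and offset is its length, so a
# healthy previous extent should end at or before the current objectid.
candidate_end = candidate + prev_offset

print("highest set bit:", top_bit)
print("objectid with that bit cleared:", candidate)
print("cleared objectid + previous length:", candidate_end)
print("equals current objectid:", candidate_end == curr_objectid)
```

With the numbers from this thread, the highest set bit is bit 59, and clearing it gives 43126620160, which plus 73728 lands exactly on the current objectid 43126693888.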

Anyway, what helped was running a dangerous command from a Manjaro install on a different disk:
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-check#DANGEROUS_OPTIONS
sudo btrfs check --init-extent-tree /dev/sda3
Before that I had tried to mount the partition and to run a normal btrfs check, but neither worked, so I decided to run the riskier command. I didn't bother with https://btrfs.wiki.kernel.org/index.php/Restore because I had a backup of my system and user data.
The --init-extent-tree run took very long, maybe 10 hours. It checked about 4 million units of whatever and finally produced a harmless-looking summary, which I didn't save.

After a reboot to the main install the filesystem is back.

It happened again.
This is the screen when booting after compiling and installing the current linux-ck 4.17.4:

[Photo of the boot screen: IMG_20180704_123439]

Why 4.16.y, which is already EOL?
See the LTS kernel 4.14.y.

Because the topic is 17 days old.

This time it is corrupt leaf: root=2 ...

This error could also be due to faulty hardware.
What does the following show?

btrfs-debug-tree -b 38301958144 /dev/sda3

The fact that it is happening twice makes it likely.

Why -b 38301958144 ?

I mean this time it is block=38536470720.

~ >>> btrfs-debug-tree -b 38536470720 /dev/sda3                                                                                                               
zsh: command not found: btrfs-debug-tree

They must have renamed the command.

At 13:16 your message said block=38301958144 on device sda3.

It was the block from the previous time 17 days ago. block=38536470720 is on the photo from today.

Use the info from the screenshot.

They did:

Deprecated and obsolete tools:
  • btrfs-debug-tree: moved to btrfs inspect-internal dump-tree
  • btrfs-show-super: moved to btrfs inspect-internal dump-super
  • btrfs-zero-log: moved to btrfs rescue zero-log
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs#STANDALONE_TOOLS

~ >>> sudo btrfs inspect-internal dump-tree -b 38536470720 /dev/sda3                                                                                       [1]
[sudo] password for eugen: 
btrfs-progs v4.16.1
ERROR: tree block bytenr 38536470720 is not aligned to sectorsize 4096
ERROR: failed to read 38536470720
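
The alignment complaint can be reproduced with plain arithmetic. This sketch only uses the two block numbers quoted in this thread and the sectorsize of 4096 from the error message; the helper name is made up for illustration.

```python
SECTORSIZE = 4096  # value named in the btrfs-progs error message

def misalignment(bytenr, sectorsize=SECTORSIZE):
    """Return how many bytes bytenr lies past the last sectorsize boundary."""
    return bytenr % sectorsize

blocks = {
    "first incident (dmesg)": 38301958144,
    "second incident (read off the photo)": 38536470720,
}

remainders = {label: misalignment(bytenr) for label, bytenr in blocks.items()}
for label, rem in remainders.items():
    state = "aligned" if rem == 0 else f"off by {rem} bytes"
    print(f"{label}: {state}")
```

The first number is an exact multiple of 4096 and the second is 192 bytes past one, so one possible explanation is simply a digit misread when copying the number from the photo. The check works on the number alone and does not depend on the disk's sector size.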

Any tool will probably run into problems with the size calculation (x 1024) if the result does not land on a cylinder (sector) boundary of the volume.

Edit:
The calculation is always relative to 0 (zero).

You mean sectorsize 4096 is a bad choice? What would you suggest?

What I mean is the cylinder boundary that the error message refers to.
Example:

75VJ ~]$ fdisk -l
Disk /dev/sda: 931,5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D6C0986D-0E20-47A4-BC6A-6A4C3524AE08

Device         Start        End    Sectors   Size Type
/dev/sda1       2048    8390655    8388608     4G EFI System
/dev/sda2    8392704   58724351   50331648    24G Linux swap
/dev/sda3   58724352  373297151  314572800   150G Linux filesystem
/dev/sda4  373297152 1953523711 1580226560 753,5G Linux filesystem

Note the End column.

~ >>> sudo fdisk -l                                                                                                                                           
[sudo] password for eugen: 
Disk /dev/sdb: 465,8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: BEA3C593-F71B-4AD5-BD9D-A42293BFEEC2

Device         Start       End   Sectors  Size Type
/dev/sdb1  311558144 515604479 204046336 97,3G Linux root (x86)
/dev/sdb2  958988288 976773119  17784832  8,5G Linux swap
/dev/sdb3       2048    526335    524288  256M EFI System
/dev/sdb5  764428288 958988287 194560000 92,8G Linux filesystem
/dev/sdb6  559628288 764428287 204800000 97,7G Linux filesystem
/dev/sdb9  515604480 559628287  44023808   21G Linux filesystem

Partition table entries are not in disk order.


Disk /dev/sda: 119,2 GiB, 128035676160 bytes, 250069680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 02573044-EBBA-4069-92F6-3579DD2DEFEC

Device       Start       End   Sectors   Size Type
/dev/sda1     2048    231423    229376   112M EFI System
/dev/sda2   231424   1050623    819200   400M Linux filesystem
/dev/sda3  1050624 250069646 249019023 118,8G Linux filesystem


Disk /dev/sdc: 119,2 GiB, 128035676160 bytes, 250069680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 6A345543-77DA-4CC8-8733-90BF8C97719A

Device       Start       End   Sectors   Size Type
/dev/sdc1     2048      4095      2048     1M Linux filesystem
/dev/sdc2   526336   1050623    524288   256M Linux filesystem
/dev/sdc3  1050624 250069646 249019023 118,8G Linux filesystem
~ >>>      

Sorry for my stupidity: I see "512 bytes". Should I change the sector size for btrfs to 512 somewhere, and if so, how?
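
As far as I know, the sectorsize in the btrfs error is the filesystem's own block size (normally the 4096-byte page size on x86) and not the disk's 512-byte logical sector size, so there should be nothing to change to 512. What can be checked from the fdisk output above is whether the two btrfs partitions start on a 4096-byte boundary. A minimal sketch, with illustrative names:

```python
LOGICAL_SECTOR = 512   # bytes per sector, from "Units: sectors of 1 * 512"
FS_SECTORSIZE = 4096   # bytes, the sectorsize named in the btrfs error

# Start sectors of the two btrfs partitions, as printed by fdisk.
partition_starts = {"/dev/sda3": 1050624, "/dev/sdc3": 1050624}

# Map each device to (start offset in bytes, remainder modulo the sectorsize).
results = {}
for dev, start_sector in partition_starts.items():
    start_byte = start_sector * LOGICAL_SECTOR
    results[dev] = (start_byte, start_byte % FS_SECTORSIZE)

for dev, (start_byte, remainder) in results.items():
    state = "aligned" if remainder == 0 else f"off by {remainder} bytes"
    print(f"{dev}: starts at byte {start_byte}, {state}")
```

Both partitions start at byte 537919488, which is a multiple of 4096, so the partition layout itself does not look like the cause of the alignment message.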

Some more information:

~ >>> sudo btrfs filesystem show                                                                                                                              
[sudo] password for eugen: 
Label: 'Linux'  uuid: cf269dfd-df9c-4939-a8ff-796a4ced6643
	Total devices 1 FS bytes used 74.56GiB
	devid    1 size 97.30GiB used 84.07GiB path /dev/sdb1

Label: none  uuid: ea59470a-443b-4dbd-90b9-b6e0e6b32876
	Total devices 2 FS bytes used 40.41GiB
	devid    1 size 118.74GiB used 21.03GiB path /dev/sda3
	devid    2 size 118.74GiB used 21.00GiB path /dev/sdc3

~ >>> sudo btrfs check --check-data-csum /dev/sdc3                                                                                                            
bad key ordering 201 202
ERROR: cannot open file system
~ >>>  

Maybe related

Bad News

I need the manufacturer, model, and type of the SSD
to check its technical data.

                                  Your disk               My disk
Sector size (logical/physical):   512 bytes / 512 bytes   512 bytes / 4096 bytes
I/O size (minimum/optimal):       512 bytes / 512 bytes   4096 bytes / 4096 bytes
is not aligned to sectorsize 4096

Edit:

Net capacities are calculated as follows: (cylinder * heads * sectors) * 512 bytes / 1024³
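
For disks that report a plain sector count instead of cylinders and heads, the same calculation reduces to sectors * 512 / 1024³. A small sketch using the sector counts from the fdisk listings in this thread (the function name is illustrative):

```python
def capacity_gib(sectors, sector_bytes=512):
    """Convert a sector count to GiB (1 GiB = 1024**3 bytes)."""
    return sectors * sector_bytes / 1024**3

# Sector counts copied from the "Disk ..." lines of the fdisk output.
disks = {
    "other poster's /dev/sda": 1953525168,
    "/dev/sdb": 976773168,
    "/dev/sda and /dev/sdc": 250069680,
}

for name, sectors in disks.items():
    print(f"{name}: {capacity_gib(sectors):.1f} GiB")
```

The results (931.5, 465.8 and 119.2 GiB) match the sizes fdisk prints for these disks.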
~ >>> hwinfo --disk                                                                                                                                        [1]
25: IDE 100.0: 10600 Disk                                       
  [Created at block.245]
  Unique ID: WZeP.4MqfuMuv266
  Parent ID: 7EWs.ZF4VoENCRFC
  SysFS ID: /class/block/sdb
  SysFS BusID: 1:0:0:0
  SysFS Device Link: /devices/pci0000:00/0000:00:11.0/ata2/host1/target1:0:0/1:0:0:0
  Hardware Class: disk
  Model: "ST500LM012 HN-M5"
  Vendor: "ST500LM012"
  Device: "HN-M5"
  Revision: "0002"
  Driver: "ahci", "sd"
  Driver Modules: "ahci", "sd_mod"
  Device File: /dev/sdb
  Device Files: /dev/sdb, /dev/disk/by-id/ata-ST500LM012_HN-M500MBB_S2R7J9EF300175, /dev/disk/by-id/wwn-0x50004cf20cd6d085, /dev/disk/by-path/pci-0000:00:11.0-ata-2
  Device Number: block 8:16-8:31
  Drive status: no medium
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #21 (SATA controller)

26: IDE 200.0: 10600 Disk
  [Created at block.245]
  Unique ID: _kuT.a07scMb7ub9
  Parent ID: VCu0.miFdfHUX3YF
  SysFS ID: /class/block/sdc
  SysFS BusID: 2:0:0:0
  SysFS Device Link: /devices/pci0000:00/0000:00:02.1/0000:01:00.0/ata3/host2/target2:0:0/2:0:0:0
  Hardware Class: disk
  Model: "SAMSUNG MZHPU128"
  Vendor: "SAMSUNG"
  Device: "MZHPU128"
  Revision: "501Q"
  Driver: "ahci", "sd"
  Driver Modules: "ahci", "sd_mod"
  Device File: /dev/sdc
  Device Files: /dev/sdc, /dev/disk/by-id/ata-SAMSUNG_MZHPU128HCGM-00004_S1NBNYAFB00246, /dev/disk/by-id/wwn-0x50025386000491a9, /dev/disk/by-path/pci-0000:01:00.0-ata-1
  Device Number: block 8:32-8:47
  Drive status: no medium
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #11 (SATA controller)

27: IDE 00.0: 10600 Disk
  [Created at block.245]
  Unique ID: 3OOL.__O2WJ2PEP7
  Parent ID: 7EWs.ZF4VoENCRFC
  SysFS ID: /class/block/sda
  SysFS BusID: 0:0:0:0
  SysFS Device Link: /devices/pci0000:00/0000:00:11.0/ata1/host0/target0:0:0/0:0:0:0
  Hardware Class: disk
  Model: "KINGSTON RBU-SNS"
  Vendor: "KINGSTON"
  Device: "RBU-SNS"
  Revision: "06.9"
  Driver: "ahci", "sd"
  Driver Modules: "ahci", "sd_mod"
  Device File: /dev/sda
  Device Files: /dev/sda, /dev/disk/by-id/ata-KINGSTON_RBU-SNS8100S3128GD_50026B7248042CC6, /dev/disk/by-path/pci-0000:00:11.0-ata-1
  Device Number: block 8:0-8:15
  Drive status: no medium
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #21 (SATA controller)

Short SMART test shows OK:
[Screenshot: DeepinScreenshot_select-area_20180704151629]
