Trying to recover a broken ext4 partition

Hi there!

Following the discussion in another thread, I asked that some posts be moved over here. I'll get back to the GPU discussion later, once I have the laptop running again.

I have sorta lost track of this thread because of all the off-topic posts. I have two questions, and please report back with the output from the actual commands.

Have you tried
# dumpe2fs /dev/sda2 | grep superblock

and then, when you get the list, something like
# fsck -b XXXXXX /dev/sda2

This is stolen from
https://www.cyberciti.biz/faq/recover-bad-superblock-from-corrupted-partition/
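
If dumpe2fs fails because even the primary superblock is unreadable, the backup locations can also be worked out with a dry run of mke2fs (just a possibly useful fallback, not something from that article; the -n flag makes mke2fs only print what it would do without writing anything, so double-check it is there before running, and note the locations only match if the same filesystem parameters, e.g. block size, are used):

# mke2fs -n /dev/sda2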

1 Like

No, I haven't. Thanks for the input. I've only used fsck manually a few times, so I don't know its options very well. I'll definitely try that when I get to that phase.

Personally, I would probably move the drive into a less critical role (a third-backup type of thing) and also actually check up on that memory.
A high percentage of data corruption comes from faulty memory reads or writes; this is why we have ECC memory.
But then again, an SSD has memory too, and so on, so who knows. :grinning::grin::confused:

Edit: forgot to add that some modern file systems (unlike ext4) are built to mitigate some of these memory read/write problems, which makes them preferable. I am not sure how Btrfs handles this, but ZFS has some checks in place.

Edit 2: I am not recommending ZFS on Linux; it is not there yet as far as I know. Btrfs should be the path on Linux, I would think.
Anyone?
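
For example, both can verify all data against the stored checksums on demand (just an illustration; the mount point and pool name below are placeholders, and the filesystem/pool must already be mounted/imported):

$ sudo btrfs scrub start /mnt/data
$ sudo btrfs scrub status /mnt/data
$ sudo zpool scrub mypool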

2 Likes

Update: all files successfully copied except one journal file (probably the one I'd like to read). I'll split this off into a new thread and try to recover it after dinner.

3 Likes

Update:

[mochobb@mocho-desktop testdisk-7.1-WIP]$ sudo dumpe2fs -h /dev/sdh1
dumpe2fs 1.43.9 (8-Feb-2018)
Filesystem volume name:   Manjaro
Last mounted on:          /
Filesystem UUID:          f80640d5-cb8c-4daa-95bf-7f1964536d9b
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              4202496
Block count:              16777216
Reserved block count:     838860
Free blocks:              4703502
Free inodes:              3673097
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8208
Inode blocks per group:   513
Flex block group size:    16
Filesystem created:       Wed Jul 26 20:40:15 2017
Last mount time:          Fri Feb 23 17:16:12 2018
Last write time:          Fri Feb 23 17:16:12 2018
Mount count:              230
Maximum mount count:      -1
Last checked:             Wed Jul 26 20:40:15 2017
Check interval:           0 (<none>)
Lifetime writes:          1169 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       3284473
Default directory hash:   half_md4
Directory Hash Seed:      fb23d973-a28f-4467-941f-1657927a4706
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke journal_64bit
Journal size:             512M
Journal length:           131072
Journal sequence:         0x000eb40b
Journal start:            1
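
(Quick sanity check on that output: Journal length of 131072 blocks × 4096-byte block size = 512 MiB, which matches the reported Journal size, so the superblock's journal fields at least look internally consistent.)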

The last mounted date is a good sign. It means I wouldn't lose anything from that date if I recovered the journal.
These are the superblock backups:

[mochobb@mocho-desktop testdisk-7.1-WIP]$ sudo dumpe2fs /dev/sdh1 | grep superblock   
[sudo] password for mochobb: 
dumpe2fs 1.43.9 (8-Feb-2018)
  Primary superblock at 0, Group descriptors at 1-8
  Backup superblock at 32768, Group descriptors at 32769-32776
  Backup superblock at 98304, Group descriptors at 98305-98312
  Backup superblock at 163840, Group descriptors at 163841-163848
  Backup superblock at 229376, Group descriptors at 229377-229384
  Backup superblock at 294912, Group descriptors at 294913-294920
  Backup superblock at 819200, Group descriptors at 819201-819208
  Backup superblock at 884736, Group descriptors at 884737-884744
  Backup superblock at 1605632, Group descriptors at 1605633-1605640
  Backup superblock at 2654208, Group descriptors at 2654209-2654216
  Backup superblock at 4096000, Group descriptors at 4096001-4096008
  Backup superblock at 7962624, Group descriptors at 7962625-7962632
  Backup superblock at 11239424, Group descriptors at 11239425-11239432

However:

[mochobb@mocho-desktop testdisk-7.1-WIP]$ sudo fsck.ext4 -b 32768 /dev/sdh1
e2fsck 1.43.9 (8-Feb-2018)
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
Manjaro: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
fsck.ext4: Unknown code ____ 251 while recovering journal of Manjaro
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
fsck.ext4: unable to set superblock flags on Manjaro


Manjaro: ***** FILE SYSTEM WAS MODIFIED *****

Manjaro: ********** WARNING: Filesystem still has errors **********

[mochobb@mocho-desktop testdisk-7.1-WIP]$ sudo fsck.ext4 /dev/sdh1
e2fsck 1.43.9 (8-Feb-2018)
Manjaro: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? yes
fsck.ext4: Unknown code ____ 251 while recovering journal of Manjaro
fsck.ext4: unable to set superblock flags on Manjaro


Manjaro: ********** WARNING: Filesystem still has errors **********

So, any other ideas?
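
(One thing I haven't tried, noted here for reference: when e2fsck cannot replay the journal, a commonly suggested last resort is to discard the journal and check the filesystem without it. This throws away any unreplayed transactions, so it should only ever be done on a copy or once the data is already safe:

$ sudo tune2fs -f -O '^has_journal' /dev/sdh1
$ sudo fsck.ext4 -f /dev/sdh1

The journal can be re-created afterwards with tune2fs -O has_journal, but again, this is untested here.)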

Update:
I was able to read the journal copied from the drive with testdisk. It doesn't show any systemd or Xorg errors. The log ends at Feb 24 23:14:50, which probably coincides with the time I went to shut it down on Saturday. The date is pretty reliable because nmbd generates an error every 5 minutes if the laptop isn't connected to any network (and it wasn't). Of course, the journal is probably incomplete. After all, there was a testdisk error when copying this file, and if there was some extra information I probably won't ever see it :disappointed_relieved:

Update: Windows' partitions weren't affected. I can boot Windows, though it hangs because I have the VirtualBox WDDM video driver installed (I won't be using the WDDM driver any more; basic 3D with a bootable Windows is better).

Later I'll try to recover the root partition with testdisk (just because; I already have the data :slight_smile: )
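
For anyone following along, testdisk is menu-driven, so there isn't much to script; the minimal invocation (assuming the disk still shows up as /dev/sdh, as above) is just:

$ sudo testdisk /log /dev/sdh

The /log switch writes a testdisk.log in the current directory, which is handy for reporting back here.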

ZFS is great, if you're great at administering it; I knew someone who managed to recover his ZFS partition after he had ruined it with dd.

However, I think that just rsync’ing everything after every session is the simplest way to keep your data safe.
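
For example, something along these lines (the paths, exclude pattern, and host are placeholders to adapt):

$ rsync -aAXH --delete --exclude='.cache/' ~/work/ user@desktop:/backup/work/

The -a flag preserves permissions and timestamps, -AXH additionally preserves ACLs, extended attributes, and hard links, and --delete makes the destination an exact mirror (so be careful not to point it at the wrong directory).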

1 Like

Yep, I do that at the end of each work day (I sync with my work desktop). Besides that, my current work data is synced with MEGA to my desktop, so I always have the most important stuff in three places. However, those folders aren't all the data I have, and these things always happen when you relax your methods.

Coincidentally or not, last Friday I was lazy and didn't sync my laptop at work (and I worked mostly on the laptop that day). Even worse, my workplace's computer chose that day to perform updates, so I worked for a long time without a network connection (files weren't synced). Then I got home and didn't turn the laptop on until Saturday night, without WiFi on (files weren't synced again). As you can see, ■■■■ always happens whether you want it or not. Luckily I got all my files back; not that I'd lose a huge portion of work (just a part of Friday), but it would still be annoying.

Update:
I tried to recover the partitions with testdisk, with no success (it still boots to rootfs). I give up. Time to reinstall.

I never install updates. And I install the software I need for work in ~ so that I can be more “system independent” (FF, LibreOffice, Master PDF Editor, pdftk, audio-recorder, etc.; all of them can easily be installed in ~). Some rolling-release orthodox folks argued with me about this here before (guys, please don't start that again now).

Yeah... My work desktop doesn't belong to me. I shouldn't even be able to connect it to my laptop, but that's another story...

Update: I finally realised this "frozen" feature didn't let me format the drive. Now, after unfreezing, I still am not able to format the drive. I'm currently dd'ing it, but I'm starting to think the SSD went kaput. Time to check the warranty...
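
(For reference, the ATA security/frozen state can be checked with hdparm; /dev/sdX below is a placeholder for the actual device:

$ sudo hdparm -I /dev/sdX | grep -A8 'Security:'

A "frozen" line there means the BIOS locked the security state at boot; a suspend/resume cycle often clears it.)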

1 Like

Update: so, the "frozen" feature has nothing to do with the inability to write to the disk. It is completely broken. It is an SP S55, and even their toolbox can't do anything about it. Luckily it is covered by warranty; I already contacted them. Too bad I'll have to return it with my data on it. At least the only important files on there are encrypted with gpg (though the keys are also there). No pictures, nothing (besides work stuff, of course).

Update: I received the disk last Friday but only had time to install it today. I tried a direct clone from the old disk without success. I had to reinstall the system, and now I'll need to configure it (or copy the old configuration over) and copy my Data partition's contents (this time I went with NILFS2, just because).

So, in the end the solution was testdisk (it allowed me to recover the contents). It isn't the first time this little program has saved my butt. Highly recommended. I'll mark this post as the solution. Thanks for the feedback.

2 Likes

So the disk did actually die; glad you were able to recover your data.

This is why I use Clonezilla to make backup disk images: if a disk dies, I have a bit-by-bit copy to restore to another disk. It's also handy if the disk is encrypted, particularly when returning faulty hardware.
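
For a failing or flaky disk, a plain bit-by-bit image can also be taken by hand; the device and image path below are placeholders:

$ sudo dd if=/dev/sdX of=/backup/disk.img bs=4M status=progress conv=noerror,sync

conv=noerror,sync keeps dd going past read errors (padding the bad blocks with zeros); GNU ddrescue is usually the better tool for that job, since it retries and keeps a map of the bad areas.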

Yes, I'm glad it wasn't encrypted (my habit is to encrypt files/folders, not the whole disk). I wouldn't lose much important data, as I usually have it in different places, but I'd need to search for it, and in the end I'd always have doubts, except for my current work files. I really need to buy a large spinning disk to make regular backups. Luckily the drive remained readable with few errors!
