Softpanorama

May the source be with you, but remember the KISS principle ;-)

Recovery of LVM partitions


Introduction

The price of excessive complexity is the loss of reliability. This statement is never more true than in attempts to recover lost data from LVM volumes. LVM adds an additional layer of complexity, and one side effect of that layer is that recovery of damaged volumes becomes much more complex. You really need to understand the "nuts and bolts" of LVM, which is not a strong point of the typical Linux sysadmin. Moreover, LVM is badly documented for such a critical subsystem. Most of what exists is either outdated, irrelevant for the purposes of recovering damaged volumes, or outright junk (like most docs provided by Red Hat).

As with everything, the more you do before the problem occurs, the less you need to do during the emergency. Having a current baseline of the server with all the necessary configuration files helps greatly. It takes a couple of minutes to create, and the absence of a current baseline of the server (at least an SOS report in RHEL, or supportconfig in SUSE) on some NFS volume or remote server is the most serious blunder a sysadmin can commit in his professional career.

Having an overnight backup of data typically moves the situation from the SNAFU to the nuisance category. You should also create a copy of the /etc directory (or better, the whole root partition) on each successful reboot and store it on a USB drive or on an external filesystem/remote server. This trick, which gives you access to either a cpio copy of the root partition or a tarball of the /etc directory in /boot, provides a set of important parameters that can greatly help during recovery.
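A minimal sketch of such a baseline job (paths and the 30-copy retention are assumptions; in real use point DEST at a USB drive or NFS mount, and hook the script into cron, rc.local, or a login profile):

```shell
#!/bin/sh
# Baseline the configuration directory; keep a rotating set of dated tarballs.
SRC="${SRC:-/etc}"
DEST="${DEST:-/tmp/baselines}"   # assumption: replace with a USB/NFS mount point
STAMP=$(date +%Y%m%d-%H%M%S)
mkdir -p "$DEST"
# tar what we can read; unreadable files are skipped rather than aborting
tar czf "$DEST/etc-$STAMP.tar.gz" -C / "${SRC#/}" 2>/dev/null || true
# keep only the 30 most recent baselines so the medium does not fill up
ls -1t "$DEST"/etc-*.tar.gz 2>/dev/null | tail -n +31 | xargs -r rm -f
```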

Another important tip for using LVM is to always have at least 2GB of free space in each volume group. That allows using snapshots during patching and similar operations in which you can damage or destroy LVM. Learning how to use snapshots is a must for any sysadmin who uses LVM in a business critical environment.
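As an illustration, patching with a safety snapshot might look like the sketch below. The VG/LV names (vg00/root) and the 2G size are assumptions; DRY_RUN=1, the default here, only prints the commands instead of running them:

```shell
#!/bin/sh
# Sketch: take a snapshot before patching; merge it back only if patching fails.
DRY_RUN="${DRY_RUN:-1}"
VG=vg00 LV=root SNAP=root_presnap   # placeholders: substitute your own names

run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run lvcreate -s -n "$SNAP" -L 2G "/dev/$VG/$LV"   # needs ~2GB free in the VG
# ... apply patches here ...
# success: drop the snapshot
run lvremove -f "/dev/$VG/$SNAP"
# failure: roll the LV back instead (takes effect on next activation):
#   run lvconvert --merge "/dev/$VG/$SNAP"
```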

Having the root filesystem on LVM can additionally complicate recovery of damaged file systems, so this is one thing that is probably prudent to avoid. There is no justification for putting operating system partitions on LVM with modern disks. Without /var, /opt and /tmp they are usually less than 4GB, and if you allocate, say, 32GB to the root partition, the chances that you ever run out of space during the lifetime of the system are very slim. Even for /var you can move excess log archives to other partitions overnight, making the size of this partition static even for the most log-intensive services such as a corporate proxy server or a major webserver.

If your root partition is outside LVM, you can at least edit files in the /etc directory without jumping through hoops.

Complexity of recovery also depends on the Linux distribution you are using. For example, the SLES DVD in rescue mode automatically recognizes the LVM group, which makes recovery of LVM somewhat simpler, unless this is a major screw-up.

Generally you need a copy of /etc/lvm/backup/volume_name to access the LVM partitions. That's why it is prudent to back up the root partition at least weekly and back up the /etc directory on each login. You can write such data to a flash card on the server or blade (vFlash in Dell), or to a FIT form factor flash drive permanently installed in one of the USB ports. The same FIT flash drive can contain a tarball of major filesystems as produced by Relax-and-Recover. At current prices there is no excuse not to use a FIT drive as a recovery medium. As of November 2016 a 128GB SanDisk FIT drive costs around $22, a 64GB one $15, and a 32GB one $8. Samsung FIT form factor USB flash drives are even cheaper.

In a large enterprise environment you can use a dedicated server for such "partial" backups and bare metal recovery ISO files. You should do it yourself, as central backup in a typical, highly bureaucratized large corporation is often unreliable. Often it is performed by another department (operators on the night shift), and restores are delegated to operators too, who can screw up an already damaged system further instead of helping to recover it, simply due to a lack of knowledge and understanding of the environment. Then you will need to deal with two problems :-(

Strategies

There are two strategies for recovery of an LVM volume, depending on the situation:

  1. When the hardware is fully operational and the data are "mostly" intact.

  2. When the hardware is on its way south or the content is mangled beyond recognition. Much depends on whether you can create a more or less complete dd image of such a disk or not. Recovery actions should be attempted only on the dd image, never on the damaged disk.

Never use LVM with Linux software RAID for important data

Software RAID in Linux is generally an invitation to trouble. It is a badly written and badly integrated subsystem. Unfortunately, Red Hat popularized this horrible mess by including it in the certification.

The combination of Linux software RAID and LVM is especially toxic. As Richard Bullington-McGuire noted (Recovery of RAID and LVM2 Volumes, April 28, 2006):

The combination of Linux software RAID (Redundant Array of Inexpensive Disks) and LVM2 (Logical Volume Manager, version 2) offered in modern Linux operating systems offers both robustness and flexibility, but at the cost of complexity should you ever need to recover data from a drive formatted with software RAID and LVM2 partitions. I found this out the hard way when I recently tried to mount a system disk created with RAID and LVM2 on a different computer. The first attempts to read the filesystems on the disk failed in a frustrating manner.

I had attempted to put two hard disks into a small-form-factor computer that was really only designed to hold only one hard disk, running the disks as a mirrored RAID 1 volume. (I refer to that system as raidbox for the remainder of this article.) This attempt did not work, alas. After running for a few hours, it would power-off with an automatic thermal shutdown failure. I already had taken the system apart and started re-installing with only one disk when I realized there were some files on the old RAID volume that I wanted to retrieve.

Recovering the data would have been easy if the system did not use RAID or LVM2. The steps would have been to connect the old drive to another computer, mount the filesystem and copy the files from the failed volume. I first attempted to do so, using a computer I refer to as recoverybox, but this attempt met with frustration.

Importance of having fresh backup and baseline

As always, the most critical thing that distinguishes a minor inconvenience from a major SNAFU is the availability of up-to-date backups. There is no replacement for up-to-date backups and a baseline (for example, creating a baseline of the /etc directory on each login can save you from a lot of trouble), and spending enough time and effort on this issue is really critical for recovery from major LVM screw-ups.

It is much better to spend a couple of hours organizing some additional, private backup and automatic baselining of at least the /etc directory on each login, than to spend 10 hours in a cold sweat trying to recover a horribly messed-up LVM partition. So, on a fundamental level, this is a question of your priorities ;-)

The second option is to have a support contract and insist on a kernel engineer performing the recovery ;-). That might well not be a same-day recovery, but a good kernel engineer can do amazing things with a messed-up system.

Some information about the recovery process

There are very few good articles on the Net that describe the nuts and bolts of the recovery process. I have found just two:

You need to study both first, before jumping into action. And don't forget to make dd copies of the disk before attempting recovery. Instead of the hard drives themselves you can actually work with the images via the loopback interface, as described in LVM partitions recovery - Skytechwiki.
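A hedged sketch of that workflow follows. The device and image paths are assumptions, and DRY_RUN=1 (the default) only prints the commands, since imaging and loop-mounting require root and real devices:

```shell
#!/bin/sh
# Sketch: image the damaged disk first, then recover from the image over a
# loop device, never from the disk itself.
DRY_RUN="${DRY_RUN:-1}"
DISK=/dev/sdb IMG=/recovery/sdb.img   # placeholders: substitute your own

run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run dd if="$DISK" of="$IMG" bs=4M conv=noerror,sync  # ddrescue is better on failing media
run losetup -fP --show "$IMG"     # -P also maps partitions (/dev/loop0p1, ...)
run pvscan                        # LVM should now see the PVs on the loop device
run vgchange -ay                  # activate whatever volume groups it found
```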

One of the most tragic blunders in recovery is the loss of the initial configuration. If you do not have enough disks, buy them ASAP. Your data are much more valuable.

Novell recommendations

The information below is from Cool Solutions Recovering a Lost LVM Volume Disk

Logical Volume Management (LVM) provides a high level, flexible view of a server's disk storage. Though LVM is robust, problems can occur. The purpose of this document is to review the recovery process when a disk is missing or damaged, and then apply that process to plausible examples. When a disk is accidentally removed or damaged in some way that adversely affects the logical volume, the general recovery process is:
  1. Replace the failed or missing disk
  2. Restore the missing disk's UUID
  3. Restore the LVM meta data
  4. Repair the file system on the LVM device

The recovery process will be demonstrated in three specific cases:

  1. A disk belonging to a logical volume group is removed from the server
  2. The LVM meta data is damaged or corrupted
  3. One disk in a multi-disk volume group has been permanently removed

This article discusses how to restore the LVM meta data. This is a risky proposition. If you restore invalid information, you can lose all the data on the LVM device. An important part of LVM recovery is having backups of the meta data to begin with, and knowing how it's supposed to look when everything is running smoothly. LVM keeps backup and archive copies of its meta data in /etc/lvm/backup and /etc/lvm/archive. Back up these directories regularly, and be familiar with their contents. You should also manually back up the LVM meta data with vgcfgbackup before starting any maintenance projects on your LVM volumes.
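A small sketch of such a routine meta data backup; the MEDIA path is an assumption (a mounted FIT USB stick, NFS share, or similar):

```shell
#!/bin/sh
# Sketch: refresh the LVM meta data backups and copy them somewhere that
# survives loss of the root filesystem.
MEDIA="${MEDIA:-/tmp/lvm-meta}"   # assumption: replace with your recovery medium
mkdir -p "$MEDIA"
# vgcfgbackup refreshes /etc/lvm/backup; skip quietly if LVM is not installed
if command -v vgcfgbackup >/dev/null 2>&1; then
    vgcfgbackup >/dev/null 2>&1 || true
fi
for d in /etc/lvm/backup /etc/lvm/archive; do
    if [ -d "$d" ]; then
        cp -a "$d" "$MEDIA/$(basename "$d").$(date +%Y%m%d)"
    fi
done
```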

If you are planning on removing a disk from the server that belongs to a volume group, you should refer to the LVM HOWTO before doing so.

Server Configuration

In all three examples, a server with SUSE Linux Enterprise Server 10 with Service Pack 1 (SLES10 SP1) will be used with LVM version 2. The examples will use a volume group called "sales" with a linear logical volume called "reports". The logical volume and its mount point are shown below. You will need to substitute your mount points and volume names as needed to match your specific environment.

ls-lvm:~ # cat /proc/partitions
major minor  #blocks  name

   8     0    4194304 sda
   8     1     514048 sda1
   8     2    1052257 sda2
   8     3          1 sda3
   8     5     248976 sda5
   8    16     524288 sdb
   8    32     524288 sdc
   8    48     524288 sdd

ls-lvm:~ # pvcreate /dev/sda5 /dev/sd[b-d]
  Physical volume "/dev/sda5" successfully created
  Physical volume "/dev/sdb" successfully created
  Physical volume "/dev/sdc" successfully created
  Physical volume "/dev/sdd" successfully created

ls-lvm:~ # vgcreate sales /dev/sda5 /dev/sd[b-d]
  Volume group "sales" successfully created

ls-lvm:~ # lvcreate -n reports -L +1G sales
  Logical volume "reports" created

ls-lvm:~ # pvscan
  PV /dev/sda5   VG sales   lvm2 [240.00 MB / 240.00 MB free]
  PV /dev/sdb    VG sales   lvm2 [508.00 MB / 0    free]
  PV /dev/sdc    VG sales   lvm2 [508.00 MB / 0    free]
  PV /dev/sdd    VG sales   lvm2 [508.00 MB / 500.00 MB free]
  Total: 4 [1.72 GB] / in use: 4 [1.72 GB] / in no VG: 0 [0   ]

ls-lvm:~ # vgs
  VG    #PV #LV #SN Attr   VSize VFree
  sales   4   1   0 wz--n- 1.72G 740.00M

ls-lvm:~ # lvs
  LV      VG    Attr   LSize Origin Snap%  Move Log Copy%
  reports sales -wi-ao 1.00G

ls-lvm:~ # mount | grep sales
/dev/mapper/sales-reports on /sales/reports type ext3 (rw)

ls-lvm:~ # df -h /sales/reports
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/sales-reports
                     1008M   33M  925M   4% /sales/reports

Disk Belonging to a Volume Group Removed

Removing a disk that belongs to a logical volume group from the server may sound a bit strange, but with Storage Area Networks (SAN) or fast-paced schedules, it happens.

Symptom:

The first thing you may notice when the server boots are messages like:

"Couldn't find all physical volumes for volume group sales."
"Couldn't find device with uuid '56pgEk-0zLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'."
'Volume group "sales" not found'

If you are automatically mounting /dev/sales/reports, then the server will fail to boot and prompt you to login as root to fix the problem.

Boot failed due to invalid fstab entry
  1. Type root's password.
  2. Edit the /etc/fstab file.
  3. Comment out the line with /dev/sales/report
  4. Reboot
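The edit in steps 2-3 can also be done non-interactively with sed. The sketch below works on a throwaway demo copy; point FSTAB at the real /etc/fstab on the broken server:

```shell
#!/bin/sh
# Sketch: comment out the /dev/sales/reports entry so the server can boot.
FSTAB="${FSTAB:-/tmp/fstab.demo}"   # demo copy; use FSTAB=/etc/fstab for real
# demo entry matching the article's example
printf '%s\n' "/dev/sales/reports /sales/reports ext3 defaults 1 2" > "$FSTAB"

# prefix '#' on any uncommented line that mentions the device (keeps a .bak copy)
sed -i.bak '\|/dev/sales/reports|s|^[^#]|#&|' "$FSTAB"
cat "$FSTAB"
```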

The LVM symptom is a missing sales volume group. Typing cat /proc/partitions confirms the server is missing one of its disks.

ls-lvm:~ # cat /proc/partitions
major minor  #blocks  name

   8     0    4194304 sda
   8     1     514048 sda1
   8     2    1052257 sda2
   8     3          1 sda3
   8     5     248976 sda5
   8    16     524288 sdb
   8    32     524288 sdc

ls-lvm:~ # pvscan
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  PV /dev/sda5        VG sales   lvm2 [240.00 MB / 240.00 MB free]
  PV /dev/sdb         VG sales   lvm2 [508.00 MB / 0    free]
  PV unknown device   VG sales   lvm2 [508.00 MB / 0    free]
  PV /dev/sdc         VG sales   lvm2 [508.00 MB / 500.00 MB free]
  Total: 4 [1.72 GB] / in use: 4 [1.72 GB] / in no VG: 0 [0   ]

Solution:

  1. Fortunately, the meta data and file system on the disk that was /dev/sdc are intact.
  2. So the recovery is to just put the disk back.
  3. Reboot the server.
  4. The /etc/init.d/boot.lvm start script will scan and activate the volume group at boot time.
  5. Don't forget to uncomment the /dev/sales/reports device in the /etc/fstab file.

 

If this procedure does not work, then you may have corrupt LVM meta data.

Corrupted LVM Meta Data

The LVM meta data does not get corrupted very often; but when it does, the file system on the LVM logical volume should also be considered unstable. The goal is to recover the LVM volume, and then check file system integrity.

Symptom 1:

Attempting to activate the volume group gives the following:

ls-lvm:~ # vgchange -ay sales
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  Couldn't read volume group metadata.
  Volume group sales metadata is inconsistent
  Volume group for uuid not found: m4Cg2vkBVSGe1qSMNDf63v3fDHqN4uEkmWoTq5TpHpRQwmnAGD18r44OshLdHj05
  0 logical volume(s) in volume group "sales" now active

This symptom is the result of a minor change in the meta data. In fact, only three bytes were overwritten. Since only a portion of the meta data was damaged, LVM can compare its internal checksum against the meta data on the device and know it's wrong. There is enough meta data for LVM to know that the "sales" volume group and devices exist, but are unreadable.

ls-lvm:~ # pvscan
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  /dev/sdc: Checksum error
  PV /dev/sda5   VG sales   lvm2 [240.00 MB / 240.00 MB free]
  PV /dev/sdb    VG sales   lvm2 [508.00 MB / 0    free]
  PV /dev/sdc    VG sales   lvm2 [508.00 MB / 0    free]
  PV /dev/sdd    VG sales   lvm2 [508.00 MB / 500.00 MB free]
  Total: 4 [1.72 GB] / in use: 4 [1.72 GB] / in no VG: 0 [0   ]

Notice pvscan shows all devices present and associated with the sales volume group. It's not the device UUID that is not found, but the volume group UUID.

Solution 1:

  1. Since the disk was never removed, leave it as is.
  2. There were no device UUID errors, so don't attempt to restore the UUIDs.
  3. This is a good candidate to just try restoring the LVM meta data.

     

    ls-lvm:~ # vgcfgrestore sales
      /dev/sdc: Checksum error
      /dev/sdc: Checksum error
      Restored volume group sales
    
    ls-lvm:~ # vgchange -ay sales
      1 logical volume(s) in volume group "sales" now active
    
    ls-lvm:~ # pvscan
      PV /dev/sda5   VG sales   lvm2 [240.00 MB / 240.00 MB free]
      PV /dev/sdb    VG sales   lvm2 [508.00 MB / 0    free]
      PV /dev/sdc    VG sales   lvm2 [508.00 MB / 0    free]
      PV /dev/sdd    VG sales   lvm2 [508.00 MB / 500.00 MB free]
      Total: 4 [1.72 GB] / in use: 4 [1.72 GB] / in no VG: 0 [0   ]
    
  4. Run a file system check on /dev/sales/reports.
    ls-lvm:~ # e2fsck /dev/sales/reports
    e2fsck 1.38 (30-Jun-2005)
    /dev/sales/reports: clean, 961/131072 files, 257431/262144 blocks
    
    ls-lvm:~ # mount /dev/sales/reports /sales/reports/
    
    ls-lvm:~ # df -h /sales/reports/
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/sales-reports
                         1008M  990M     0 100% /sales/reports
    

Symptom 2:

Minor damage to the LVM meta data is easily fixed with vgcfgrestore. If the meta data is gone, or severely damaged, then LVM will consider that disk an "unknown device." If the volume group contains only one disk, then the volume group and its logical volumes will simply be gone. In this case the symptom is the same as if the disk was accidentally removed, with the exception of the device name. Since /dev/sdc was not actually removed from the server, the devices are still labeled a through d.

ls-lvm:~ # pvscan
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
  PV /dev/sda5        VG sales   lvm2 [240.00 MB / 240.00 MB free]
  PV /dev/sdb         VG sales   lvm2 [508.00 MB / 0    free]
  PV unknown device   VG sales   lvm2 [508.00 MB / 0    free]
  PV /dev/sdd         VG sales   lvm2 [508.00 MB / 500.00 MB free]
  Total: 4 [1.72 GB] / in use: 4 [1.72 GB] / in no VG: 0 [0   ]

Solution 2:

  1. First, replace the disk. Most likely the disk is already there, just damaged.
  2. Since the UUID on /dev/sdc is not there, a vgcfgrestore will not work.
    ls-lvm:~ # vgcfgrestore sales
      Couldn't find device with uuid '56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu'.
      Couldn't find all physical volumes for volume group sales.
      Restore failed.
    
  3. Comparing the output of cat /proc/partitions and pvscan shows the missing device is /dev/sdc, and pvscan shows which UUID it needs for that device. So, copy and paste the UUID that pvscan shows for /dev/sdc.
    ls-lvm:~ # pvcreate --uuid 56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu /dev/sdc
      Physical volume "/dev/sdc" successfully created
    
  4. Restore the LVM meta data
    ls-lvm:~ # vgcfgrestore sales
      Restored volume group sales
    
    ls-lvm:~ # vgscan
      Reading all physical volumes.  This may take a while...
      Found volume group "sales" using metadata type lvm2
    
    ls-lvm:~ # vgchange -ay sales
      1 logical volume(s) in volume group "sales" now active
    
  5. Run a file system check on /dev/sales/reports.
    ls-lvm:~ # e2fsck /dev/sales/reports
    e2fsck 1.38 (30-Jun-2005)
    /dev/sales/reports: clean, 961/131072 files, 257431/262144 blocks
    
    ls-lvm:~ # mount /dev/sales/reports /sales/reports/
    
    ls-lvm:~ # df -h /sales/reports
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/sales-reports
                         1008M  990M     0 100% /sales/reports
    
    

Disk Permanently Removed

This is the most severe case. Obviously if the disk is gone and unrecoverable, the data on that disk is likewise unrecoverable. This is a great time to feel good knowing you have a solid backup to rely on. However, if the good feelings are gone, and there is no backup, how do you recover as much data as possible from the remaining disks in the volume group? No attempt will be made to address the data on the unrecoverable disk; this topic will be left to the data recovery experts.

Symptom:

The symptom will be the same as Symptom 2 in the Corrupted LVM Meta Data section above. You will see errors about an "unknown device" and missing device with UUID.

Solution:

  1. Add a replacement disk to the server. Make sure the disk is empty.
  2. Create the LVM meta data on the new disk using the old disk's UUID that pvscan displays.
    ls-lvm:~ # pvcreate --uuid 56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu /dev/sdc
      Physical volume "/dev/sdc" successfully created
    
  3. Restore the backup copy of the LVM meta data for the sales volume group.
    ls-lvm:~ # vgcfgrestore sales
      Restored volume group sales
    
    ls-lvm:~ # vgscan
      Reading all physical volumes.  This may take a while...
      Found volume group "sales" using metadata type lvm2
    
    ls-lvm:~ # vgchange -ay sales
      1 logical volume(s) in volume group "sales" now active
    
  4. Run a file system check to rebuild the file system.
    ls-lvm:~ # e2fsck -y /dev/sales/reports
    e2fsck 1.38 (30-Jun-2005)
    --snip--
    Free inodes count wrong for group #5 (16258, counted=16384).
    Fix? yes
    
    Free inodes count wrong (130111, counted=130237).
    Fix? yes
    
    /dev/sales/reports: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sales/reports: 835/131072 files (5.7% non-contiguous), 137213/262144 blocks
    
  5. Mount the file system and recover as much data as possible.
  6. NOTE: If the missing disk contains the beginning of the file system, then the file system's superblock will be missing. You will need to rebuild or use an alternate superblock. Restoring a file system superblock is outside the scope of this article, please refer to your file system's documentation.
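For the ext2/ext3 case in the note above, the usual approach is to list the backup superblock locations and hand one to e2fsck. A hedged sketch (the device name is from the example; DRY_RUN=1, the default, only prints the commands since they touch a real block device):

```shell
#!/bin/sh
# Sketch: locate ext2/3 backup superblocks and repair with one of them.
DRY_RUN="${DRY_RUN:-1}"
DEV=/dev/sales/reports

run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run mke2fs -n "$DEV"        # -n: report the layout (incl. backup superblocks), write nothing
run e2fsck -b 32768 "$DEV"  # 32768 is a typical backup location for 4k-block filesystems
```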

Conclusion

LVM by default keeps backup copies of its meta data for all LVM devices. These backup files are stored in /etc/lvm/backup and /etc/lvm/archive. If a disk is removed or the meta data gets damaged in some way, it can be easily restored, provided you have backups of the meta data. This is why it is highly recommended never to turn off LVM's auto backup feature. Even if a disk is permanently removed from the volume group, it can be reconstructed, and oftentimes the remaining data on the file system can be recovered.


Till Brehm recommendations

The second valuable source of information about the recovery process is Recover Data From RAID1 LVM Partitions With Knoppix Linux LiveCD:

Version 1.0
Author: Till Brehm <t.brehm [at] projektfarm [dot] com>
Last edited: 04/11/2007

This tutorial describes how to rescue data from a single hard disk that was part of an LVM2 RAID1 setup like the one created by e.g. the Fedora Core installer. Why is it so problematic to recover the data? Every single hard disk that formerly was part of an LVM RAID1 setup contains all data that was stored in the RAID, but the hard disk cannot simply be mounted. First, a RAID setup must be configured for the partition(s), and then LVM must be set up to use this (these) RAID partition(s) before you will be able to mount it. I will use the Knoppix Linux LiveCD to do the data recovery.

Prerequisites

I used a Knoppix 5.1 LiveCD for this tutorial. Download the CD ISO image from here and burn it on CD, then connect the hard disk which contains the RAID partition(s) to the IDE / ATA controller of your mainboard, put the Knoppix CD in your CD drive and boot from the CD.

The hard disk I used is an IDE drive that is attached to the first IDE controller (hda). In my case, the hard disk contained only one partition.

Restoring The Raid

After Knoppix has booted, open a shell and execute the command:

sudo su

to become the root user.

As I don't have the mdadm.conf file from the original configuration, I create it with this command:

mdadm --examine --scan /dev/hda1 >> /etc/mdadm/mdadm.conf

The result should be similar to this one:

DEVICE partitions
CREATE owner=root group=disk mode=0660 auto=yes metadata=1
MAILADDR root
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=a28090aa:6893be8b:c4024dfc:29cdb07a

Edit the file and add devices=/dev/hda1,missing at the end of the line that describes the RAID array.

vi /etc/mdadm/mdadm.conf

Finally the file looks like this:

DEVICE partitions
CREATE owner=root group=disk mode=0660 auto=yes metadata=1
MAILADDR root
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=a28090aa:6893be8b:c4024dfc:29cdb07a devices=/dev/hda1,missing

The string /dev/hda1 is the hardware device and missing means that the second disk in this RAID array is not present at the moment.

Edit the file /etc/default/mdadm:

and change the line:

AUTOSTART=false

to:

AUTOSTART=true

Now we can start our RAID setup:

/etc/init.d/mdadm start
/etc/init.d/mdadm-raid start	

To check if our RAID device is ok, run the command:

cat /proc/mdstat

The output should look like this:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [ra id10]
md0 : active raid1 hda1[1]
293049600 blocks [2/1] [_U]
unused devices: 

	

Recovering The LVM Setup

The LVM configuration file cannot be created with a simple command the way mdadm.conf can, but LVM stores one or more copies of the configuration file content at the beginning of the partition. I use the command dd to extract the first part of the partition and write it to a text file:

dd if=/dev/md0 bs=512 count=255 skip=1 of=/tmp/md0.txt

Open the file with a text editor:

vi /tmp/md0.txt

You will find some binary data first and then a configuration file part like this:

VolGroup00 {
	id = "evRkPK-aCjV-HiHY-oaaD-SwUO-zN7A-LyRhoj"
	seqno = 2
	status = ["RESIZEABLE", "READ", "WRITE"]
	extent_size = 65536		# 32 Megabytes
	max_lv = 0
	max_pv = 0

	physical_volumes {

		pv0 {
			id = "uMJ8uM-sfTJ-La9j-oIuy-W3NX-ObiT-n464Rv"
			device = "/dev/md0"	# Hint only

			status = ["ALLOCATABLE"]
			pe_start = 384
			pe_count = 8943	# 279,469 Gigabytes
		}
	}

	logical_volumes {

		LogVol00 {
			id = "ohesOX-VRSi-CsnK-PUoI-GjUE-0nT7-ltxWoy"
			status = ["READ", "WRITE", "VISIBLE"]
			segment_count = 1

			segment1 {
				start_extent = 0
				extent_count = 8942	# 279,438 Gigabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"pv0", 0
				]
			}
		}
	}
}

Create the file /etc/lvm/backup/VolGroup00:

vi /etc/lvm/backup/VolGroup00

and insert the configuration data so the file looks similar to the above example.

Now we can start LVM:

/etc/init.d/lvm start

Read in the volume:

vgscan

Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
pvscan
PV /dev/md0 VG VolGroup00 lvm2 [279,47 GB / 32,00 MB free]
Total: 1 [279,47 GB] / in use: 1 [279,47 GB] / in no VG: 0 [0 ]

Now activate the volume:

vgchange VolGroup00 -a y

1 logical volume(s) in volume group "VolGroup00" now active

Now we are able to mount the partition to /mnt/data:

mkdir /mnt/data
mount /dev/VolGroup00/LogVol00 /mnt/data/

If you recover data from a hard disk with filenames in UTF-8 format, it might be necessary to convert them to your current non-UTF-8 locale. In my case, the RAID hard disk is from a Fedora Core system with UTF-8 encoded filenames. My target locale is ISO-8859-1. In this case, the Perl script convmv helps to convert the filenames to the target locale.

Installation Of convmv

cd /tmp
wget http://j3e.de/linux/convmv/convmv-1.10.tar.gz
tar xvfz convmv-1.10.tar.gz
cd convmv-1.10
cp convmv /usr/bin/convmv

To convert all filenames in /mnt/data to the ISO-8859-1 locale, run this command:

convmv -f UTF-8 -t ISO-8859-1 -r --notest /mnt/data/*

If you want to test the conversion first, use:

convmv -f UTF-8 -t ISO-8859-1 -r /mnt/data/*



Old News ;-)

A simple introduction to working with LVM

I got badly bitten by this. Had a disk die on me. Very luckily only one logical volume was using the space on that physical volume.

Not that this is an in depth lvm review - but - just to make it more visible (I spent a _long_ time googling) - if you get a dead PV in your volume group - then - you can run

vgreduce --removemissing <volgrpname>

to get rid of it. Right enough that you will lose any partitions wholly or partly on that PV. But you should be able to rescue the rest :)

#

Re: A simple introduction to working with LVM

Posted by marki (89.173.xx.xx) on Thu 29 Jun 2006 at 23:19

I had this problem - one of the disks in LVM died (I wasn't using RAID on this server - but now I use :)

The failed disk contained part of /home. I had a backup, but it was few days old, so I wanted to try to read new files from /home.

I put all good disks to another machine and booted from Live CD (INSERT or RiPLINUX, I don't remember which one worked). The problem was the VG refused to activate itself because of missing PV. I have found that switch "-P" to vgchange allows it to activate in partial mode. That was OK, but it activates itself only in read-only mode. Problem was the ext3 filesystem on /home, which wasn't unmounted and required recovery - which is not possible on read-only "disk" :(

I had to use mdadm to create bogus PV (which returns all nulls on read) instead of the missing one (it's written in man vgchange). But I had to google on how to create it.

Finally I created a "replacement" PV in RAM. Just created a big enough file on ramdisk, used losetup to make a loopback device of it, then used pvcreate --uuid with uuid of the missing PV. pvscan recognized it, but it didn't show that it is part of VG. Running vgcfgrestore solved also this. This allowed vgchange to activate the VG in read-write mode and I could mount the ext3 fs. I was able to read all data on the good disk.

So using LVM does not make your data unavailable when one of the disks dies (I mean it is possible to get data out of the good ones).
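The "replacement PV in RAM" trick marki describes can be sketched roughly as follows. The size, the volume group name "vg00" and the UUID are placeholders (use the UUID pvscan reports as missing); DRY_RUN=1, the default here, only prints the commands:

```shell
#!/bin/sh
# Sketch of the bogus-PV-in-RAM trick described in the comment above.
DRY_RUN="${DRY_RUN:-1}"
UUID=56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu   # placeholder: the missing PV's UUID
VG=vg00                                       # placeholder: your volume group

run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run dd if=/dev/zero of=/dev/shm/fakepv.img bs=1M count=512  # at least the lost PV's size
run losetup -f --show /dev/shm/fakepv.img                   # e.g. prints /dev/loop0
run pvcreate --uuid "$UUID" /dev/loop0                      # recreate the PV label
run vgcfgrestore "$VG"                                      # reattach it to the VG
run vgchange -ay "$VG"                                      # activate; add -P if still partial
```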

[Sep 14, 2010] Learn Linux, 101 Maintain the integrity of filesystems by Ian Shields

Aug 24, 2010 | developerWorks

Checking filesystems

In cases when your system crashes or loses power, Linux may not be able to cleanly unmount your filesystems. Thus, your filesystems may be left in an inconsistent state, with some changes completed and some not. Operating with a damaged filesystem is not a good idea as you are likely to further compound any existing errors.

The main tool for checking filesystems is fsck, which, like mkfs, is really a front end to filesystem-checking routines for the various filesystem types. Some of the underlying check routines are shown in Listing 1.

Listing 1. Some of the fsck programs
[ian@echidna ~]$ ls /sbin/*fsck*
/sbin/btrfsck  /sbin/fsck         /sbin/fsck.ext3     /sbin/fsck.msdos
/sbin/dosfsck  /sbin/fsck.cramfs  /sbin/fsck.ext4     /sbin/fsck.vfat
/sbin/e2fsck   /sbin/fsck.ext2    /sbin/fsck.ext4dev  /sbin/fsck.xfs

You may be surprised to learn that several of these files are hard links to just one file as shown in Listing 2. Remember that these programs may be used so early in the boot process that the filesystem may not be mounted and symbolic link support may not yet be available. See our article Learn Linux, 101: Create and change hard and symbolic links for more information about hard and symbolic links.

Listing 2. One fsck program with many faces
[ian@echidna ~]$ find /sbin -samefile /sbin/e2fsck
/sbin/fsck.ext4dev
/sbin/e2fsck
/sbin/fsck.ext3
/sbin/fsck.ext4
/sbin/fsck.ext2
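The hard-link behavior described above is easy to reproduce with ordinary files. A small illustration (not part of the original article; it assumes GNU coreutils stat):

```shell
# Create a file and give it a second name via a hard link
f=$(mktemp)
ln "$f" "$f.link"

# Both names report the same inode number and a link count of 2
stat -c 'inode=%i links=%h' "$f" "$f.link"

rm -f "$f" "$f.link"
```

find -samefile, used in Listing 2, matches files by exactly this shared inode.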

The system boot process uses fsck with the -A option to check the root filesystem and any other filesystems that are specified for checking in the /etc/fstab control file. If the filesystem was not cleanly unmounted, a consistency check is performed and repairs are made, if they can be done safely. This is controlled by the pass (or passno) field (the sixth field) of the /etc/fstab entry. Filesystems with pass set to zero are not checked at boot time. The root filesystem has a pass value of 1 and is checked first. Other filesystems will usually have a pass value of 2 (or higher), indicating the order in which they should be checked.

Multiple fsck operations can run in parallel if the system determines it is advantageous, so different filesystems are allowed to have the same pass value, as is the case for the /grubfile and /mnt/ext3test filesystems shown in Listing 3. Note that fsck will avoid running multiple filesystem checks on the same physical disk. To learn more about the layout of /etc/fstab, check the man pages for fstab.

Listing 3. Boot checking of filesystems with /etc/fstab entries
                   

filesystem                           mount point  type   options    dump pass
UUID=a18492c0-7ee2-4339-9010-3a15ec0079bb /              ext3    defaults        1   1
UUID=488edd62-6614-4127-812d-cbf58eca85e9 /grubfile      ext3    defaults        1   2
UUID=2d4f10a6-be57-4e1d-92ef-424355bd4b39 swap           swap    defaults        0   0
UUID=ba38c08d-a9e7-46b2-8890-0acda004c510 swap           swap    defaults        0   0
LABEL=EXT3TEST                            /mnt/ext3test  ext3    defaults        0   2
/dev/sda8                                 /mnt/xfstest   xfs     defaults        0   0
LABEL=DOS                                 /dos           vfat    defaults        0   0
tmpfs                   /dev/shm                         tmpfs   defaults        0   0
devpts                  /dev/pts                         devpts  gid=5,mode=620  0   0
sysfs                   /sys                             sysfs   defaults        0   0
proc                    /proc                            proc    defaults        0   0

Some journaling filesystems, such as ReiserFS and XFS, might have a pass value of 0 because the journaling code, rather than fsck, does the filesystem consistency check and repair. On the other hand, some filesystems, such as /proc, are built at initialization time and therefore do not need to be checked.

You can check filesystems after the system is booted. You will need root authority, and the filesystem you want to check should be unmounted first. Listing 4 shows how to check two of our filesystems, using the device name, label, or UUID. You can use the blkid command to find the device for a given label or UUID, or to find the label and UUID for a given device.


Listing 4. Using fsck to check filesystems
[root@echidna ~]# # find the device for LABEL=EXT3TEST
[root@echidna ~]# blkid -L EXT3TEST
/dev/sda7
[root@echidna ~]# # Find label and UUID for /dev/sda7
[root@echidna ~]# blkid /dev/sda7
/dev/sda7: LABEL="EXT3TEST" UUID="7803f979-ffde-4e7f-891c-b633eff981f0" SEC_TYPE="ext2" 
 TYPE="ext3" 
[root@echidna ~]# # Check /dev/sda7
[root@echidna ~]# fsck /dev/sda7
fsck from util-linux-ng 2.16.2
e2fsck 1.41.9 (22-Aug-2009)
EXT3TEST: clean, 11/7159808 files, 497418/28637862 blocks
[root@echidna ~]# # Check it by label using fsck.ext3
[root@echidna ~]# fsck.ext3 LABEL=EXT3TEST
e2fsck 1.41.9 (22-Aug-2009)
EXT3TEST: clean, 11/7159808 files, 497418/28637862 blocks
[root@echidna ~]# # Check it by UUID using e2fsck
[root@echidna ~]# e2fsck UUID=7803f979-ffde-4e7f-891c-b633eff981f0
e2fsck 1.41.9 (22-Aug-2009)
EXT3TEST: clean, 11/7159808 files, 497418/28637862 blocks
[root@echidna ~]# # Finally check the vfat partition
[root@echidna ~]# fsck LABEL=DOS
fsck from util-linux-ng 2.16.2
dosfsck 3.0.9, 31 Jan 2010, FAT32, LFN
/dev/sda9: 1 files, 1/513064 clusters

If you attempt to check a mounted filesystem, you will usually see a warning similar to the one in Listing 5 where we try to check our root filesystem. Heed the warning and do not do it!


Listing 5. Do not attempt to check a mounted filesystem
[root@echidna ~]# fsck UUID=a18492c0-7ee2-4339-9010-3a15ec0079bb 
fsck from util-linux-ng 2.16.2
e2fsck 1.41.9 (22-Aug-2009)
/dev/sdb9 is mounted.  

WARNING!!!  Running e2fsck on a mounted filesystem may cause
SEVERE filesystem damage.

Do you really want to continue (y/n)? no

check aborted.

It is also a good idea to let fsck figure out which check to run on a filesystem; running the wrong check can corrupt the filesystem. If you want to see what fsck would do for a given filesystem or set of filesystems, use the -N option as shown in Listing 6.


Listing 6. Finding what fsck would do to check /dev/sda7, /dev/sda8, and /dev/sda9
[root@echidna ~]# fsck -N /dev/sda7 /dev/sda[89]
fsck from util-linux-ng 2.16.2
[/sbin/fsck.ext3 (1) -- /mnt/ext3test] fsck.ext3 /dev/sda7 
[/sbin/fsck.xfs (2) -- /mnt/xfstest] fsck.xfs /dev/sda8 
[/sbin/fsck.vfat (3) -- /dos] fsck.vfat /dev/sda9 

... ... ...


Monitoring free space

On a storage device, a file or directory is contained in a collection of blocks. Information about a file is contained in an inode, which records information such as who the owner is, when the file was last accessed, how large it is, whether it is a directory, and who can read from or write to it. The inode number is also known as the file serial number and is unique within a particular filesystem. See our article Learn Linux, 101: File and directory management for more information on files and directories.

Data blocks and inodes each take space on a filesystem, so you need to monitor the space usage to ensure that your filesystems have space for growth.

The df command

The df command displays information about mounted filesystems. If you add the -T option, the filesystem type is included in the display; otherwise, it is not. The output from df for the Fedora 12 system that we used above is shown in Listing 8.


Listing 8. Displaying filesystem usage
[ian@echidna ~]$ df -T
Filesystem    Type   1K-blocks      Used Available Use% Mounted on
/dev/sdb9     ext3    45358500  24670140  18384240  58% /
tmpfs        tmpfs     1927044       808   1926236   1% /dev/shm
/dev/sda2     ext3      772976     17760    716260   3% /grubfile
/dev/sda8      xfs    41933232      4272  41928960   1% /mnt/xfstest
/dev/sda7     ext3   112754024    192248 106834204   1% /mnt/ext3test
/dev/sda9     vfat     2052256         4   2052252   1% /dos

Notice that the output includes the total number of blocks as well as the number used and available. Also notice each filesystem, such as /dev/sdb9, and its mount point, such as /. The tmpfs entry is for a virtual memory filesystem. These exist only in RAM or swap space and are created when mounted, without need for a mkfs command. You can read more about tmpfs in "Common threads: Advanced filesystem implementor's guide, Part 3".

For specific information on inode usage, use the -i option on the df command. You can exclude certain filesystem types using the -x option, or restrict information to just certain filesystem types using the -t option. Use these multiple times if necessary. See the examples in Listing 9.


Listing 9. Displaying inode usage
[ian@echidna ~]$ df -i -x tmpfs
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sdb9            2883584  308920 2574664   11% /
/dev/sda2              48768      41   48727    1% /grubfile
/dev/sda8            20976832       3 20976829    1% /mnt/xfstest
/dev/sda7            7159808      11 7159797    1% /mnt/ext3test
/dev/sda9                  0       0       0    -  /dos
[ian@echidna ~]$ df -iT -t vfat -t ext3
Filesystem    Type    Inodes   IUsed   IFree IUse% Mounted on
/dev/sdb9     ext3   2883584  308920 2574664   11% /
/dev/sda2     ext3     48768      41   48727    1% /grubfile
/dev/sda7     ext3   7159808      11 7159797    1% /mnt/ext3test
/dev/sda9     vfat         0       0       0    -  /dos

You may not be surprised to see that the FAT32 filesystem does not have inodes. If you had a ReiserFS filesystem, its information would also show no inodes. ReiserFS keeps metadata for files and directories in stat items. And since ReiserFS uses a balanced tree structure, there is no predetermined number of inodes as there are, for example, in ext2, ext3, or xfs filesystems.

There are several other options you may use with df to limit the display to local filesystems or control the format of output. For example, use the -h option to display human-readable sizes in powers of 1024 (such as 1K for 1024 bytes), or use the -H (or --si) option to get sizes in powers of 10 (1K=1000).

If you aren't sure which filesystem a particular part of your directory tree lives on, you can give the df command a parameter of a directory name or even a filename as shown in Listing 10.


Listing 10. Human readable output for df
[ian@echidna ~]$ df --si ~ian/index.html
Filesystem             Size   Used  Avail Use% Mounted on
/dev/sdb9               47G    26G    19G  58% /

The tune2fs command

The ext family of filesystems also has a utility called tune2fs, which can be used to inspect information about the block count as well as information about whether the filesystem is journaled (ext3 or ext4) or not (ext2). The command can also be used to set many parameters or convert an ext2 filesystem to ext3 by adding a journal. Listing 11 shows the output for a near-empty ext3 filesystem using the -l option to simply display the existing information.


Listing 11. Using tune2fs to display ext3 filesystem information
[root@echidna ~]# tune2fs -l /dev/sda7
tune2fs 1.41.9 (22-Aug-2009)
Filesystem volume name:   EXT3TEST
Last mounted on:          <not available>
Filesystem UUID:          7803f979-ffde-4e7f-891c-b633eff981f0
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype 
 needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash 
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              7159808
Block count:              28637862
Reserved block count:     1431893
Free blocks:              28140444
Free inodes:              7159797
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1017
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Filesystem created:       Mon Aug  2 15:23:34 2010
Last mount time:          Tue Aug 10 14:17:53 2010
Last write time:          Tue Aug 10 14:17:53 2010
Mount count:              3
Maximum mount count:      30
Last checked:             Mon Aug  2 15:23:34 2010
Check interval:           15552000 (6 months)
Next check after:         Sat Jan 29 14:23:34 2011
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:	          256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      2438df0d-fa91-4a3a-ba88-c07b2012f86a
Journal backup:           inode blocks
... ... ...

Listing 13. Using du
[testuser1@echidna ~]$ du -hc *
4.0K	Desktop
4.0K	Documents
4.0K	Downloads
16K	index.html
4.0K	Music
4.0K	Pictures
4.0K	Public
4.0K	Templates
4.0K	Videos
48K	total
[testuser1@echidna ~]$ du -hs .
1.1M	.

The reason for the difference between the 48K total from du -c * and the 1.1M summary from du -s . is that the latter includes the entries starting with a dot, such as .bashrc, while the former does not.

One other thing to note about du is that you must be able to read the directories that you are running it against.
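The dot-entry difference is easy to verify in a scratch directory. A small illustration (assuming GNU du; -b reports apparent size in bytes):

```shell
d=$(mktemp -d)
mkdir "$d/Visible"
head -c 5000 /dev/zero > "$d/.hidden"    # ~5 KB dotfile

cd "$d"
du -scb *      # counts only 'Visible' -- the shell glob skips .hidden
du -sb .       # counts the directory itself, .hidden included
cd / && rm -rf "$d"
```

The second total is larger because the glob `*` never expands to names beginning with a dot.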

So now, let's use du to display the total space used by the /usr tree and each of its first-level subdirectories. The result is shown in Listing 14. Use root authority to make sure you have appropriate access permissions.


Listing 14. Using du on /usr
[root@echidna ~]# du -shc /usr/*
394M	/usr/bin
4.0K	/usr/etc
4.0K	/usr/games
156M	/usr/include
628K	/usr/kerberos
310M	/usr/lib
1.7G	/usr/lib64
110M	/usr/libexec
136K	/usr/local
30M	/usr/sbin
2.9G	/usr/share
135M	/usr/src
0	/usr/tmp
5.7G	total

Repairing filesystems

Occasionally, very occasionally we hope, the worst will happen and you will need to repair a filesystem because of a crash or other failure to unmount cleanly. The fsck command that you saw above can repair filesystems as well as check them. Usually the automatic boot-time check will fix the problems and you can proceed.

If the automatic boot-time check of filesystems is unable to restore consistency, you are usually dumped into a single user shell with some instructions to run fsck manually. For an ext2 filesystem, which is not journaled, you may be presented with a series of requests asking you to confirm proposed actions to fix particular blocks on the filesystem. You should generally allow fsck to attempt to fix problems, by responding y (for yes). When the system reboots, check for any missing data or files.

If you suspect corruption, or want to run a check manually, most of the checking programs require the filesystem to be unmounted, or at least mounted read-only. Because you can't unmount the root filesystem on a running system, the best you can do is drop to single user mode (using telinit 1) and then remount the root filesystem read-only, at which time you should be able to perform a consistency check. A better way to check a filesystem is to boot a recovery system, such as a live CD or a USB memory key, and perform the check of your unmounted filesystems from that.
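The remount-read-only approach can be sketched as follows (a hypothetical console session, not taken from the article; all of it requires root, and it should be done at the physical console rather than over ssh):

```shell
telinit 1                    # drop to single-user mode
mount -o remount,ro /        # remount the root filesystem read-only
fsck -f /                    # force a full consistency check
```

If the remount fails because some process still holds a file open for writing, booting a live CD and checking the unmounted root device is the safer route.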

If fsck cannot fix the problem, you do have some other tools available, although you will generally need advanced knowledge of the filesystem layout to successfully fix it.

Why journal?

An fsck scan of an ext2 disk can take quite a while to complete, because the internal data structure (or metadata) of the filesystem must be scanned completely. As filesystems get larger and larger, this takes longer and longer, even though disks also keep getting faster, so a full check may take one or more hours.

This problem was the impetus for journaled, or journaling, filesystems. Journaled filesystems keep a log of recent changes to the filesystem metadata. After a crash, the filesystem driver inspects the log in order to determine which recently changed parts of the filesystem may possibly have errors. With this design change, checking a journaled filesystem for consistency typically takes just a matter of seconds, regardless of filesystem size. Furthermore, the filesystem driver will usually check the filesystem on mounting, so an external fsck check is generally not required. In fact, for the xfs filesystem, fsck does nothing!

If you do run a manual check of a filesystem, check the man pages for the appropriate fsck command (fsck.ext3, e2fsck , reiserfsck, and so on) to determine the appropriate parameters. The -p option, when used with ext2, ext3, or ext4 filesystems will cause fsck to automatically fix all problems that can be safely fixed. This is, in fact, what happens at boot time.

We'll illustrate the danger by deliberately running e2fsck on an empty XFS filesystem; repairing the resulting damage is a job for xfs_repair. Remember we suggested that you use the fsck front end to be sure you are using the right checker, and we warned you that failure to do so may result in filesystem corruption.

In Listing 15, we start running e2fsck against /dev/sda8, which contains an XFS filesystem. After a few interactions we use ctrl-Break to break out, but it is too late. Warning: Do NOT do this unless you are willing to destroy your filesystem.


Listing 15. Deliberately running e2fsck manually on an XFS filesystem
[root@echidna ~]# e2fsck /dev/sda8
e2fsck 1.41.9 (22-Aug-2009)
/dev/sda8 was not cleanly unmounted, check forced.
Resize inode not valid.  Recreate<y>? yes

Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #0 (31223, counted=31224).
Fix<y>? ctrl-Break

/dev/sda8: e2fsck canceled.

/dev/sda8: ***** FILE SYSTEM WAS MODIFIED *****

Even if you broke out at the first prompt, your XFS filesystem would still have been corrupted. Repeat after me. Do NOT do this unless you are willing to destroy your filesystem.

... ... ...

Superblocks

You may be wondering how all these checking and repairing tools know where to start. Linux and UNIX filesystems usually have a superblock, which describes the filesystem metadata, or data describing the filesystem itself. This is usually stored at a known location, frequently at or near the beginning of the filesystem, and replicated at other well-known locations. You can use the -n option of mke2fs to display the superblock locations for an existing filesystem. If you specified parameters such as the bytes per inode ratio, you should invoke mke2fs with the same parameters when you use the -n option. Listing 17 shows the location of the superblocks on /dev/sda7.


Listing 17. Finding superblock locations
[root@echidna ~]# mke2fs -n /dev/sda7
mke2fs 1.41.9 (22-Aug-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
7159808 inodes, 28637862 blocks
1431893 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
874 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872
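If the primary superblock is damaged, one of the backups listed by mke2fs -n can be handed to e2fsck directly. A hedged sketch (root-only; the device and block numbers are the ones from Listing 17, and -B supplies the block size so the backup location is interpreted correctly):

```shell
# Try the first backup superblock at block 32768, with a 4 KiB block size
e2fsck -b 32768 -B 4096 /dev/sda7
```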

Advanced tools

There are several more advanced tools that you can use to examine or repair a filesystem. Check the man pages for the correct usage and the Linux Documentation Project (see Resources) for how-to information. Almost all of these commands require a filesystem to be unmounted, although some functions can be used on filesystems that are mounted read-only. A few of the commands are described below.

You should always back up your filesystem before attempting any repairs.

Tools for ext2 and ext3 filesystems

tune2fs
Adjusts parameters on ext2 and ext3 filesystems. Use this to add a journal to an ext2 system, making it an ext3 system, as well as display or set the maximum number of mounts before a check is forced. You can also assign a label and set or disable various optional features.
dumpe2fs
Prints the super block and block group descriptor information for an ext2 or ext3 filesystem.
debugfs
Is an interactive file system debugger. Use it to examine or change the state of an ext2 or ext3 file system.

Tools for Reiserfs filesystems

reiserfstune
Displays and adjusts parameters on ReiserFS filesystems.
debugreiserfs
Performs similar functions to dumpe2fs and debugfs for ReiserFS filesystems.

... ... ...

We will wrap up our tools review with an illustration of the debugfs command, which allows you to explore the inner workings of an ext family filesystem. By default, it opens the filesystem in read-only mode. It does have many commands that allow you to attempt undeletion of files or directories, as well as other operations that require write access, so you will specifically have to enable write access with the -w option. Use it with extreme care. Listing 18 shows how to open the root filesystem on my system; navigate to my home directory; display information, including the inode number, about a file called index.html; and finally, map that inode number back to the pathname of the file.

Listing 18. Using debugfs
[root@echidna ~]# debugfs /dev/sdb9
debugfs 1.41.9 (22-Aug-2009)
debugfs:  cd home/ian
debugfs:  pwd
[pwd]   INODE: 165127  PATH: /home/ian
[root]  INODE:      2  PATH: /
debugfs:  stat index.html
Inode: 164815   Type: regular    Mode:  0644   Flags: 0x0
Generation: 2621469650    Version: 0x00000000
User:  1000   Group:  1000   Size: 14713
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 32
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x4bf1a3e9 -- Mon May 17 16:15:37 2010
atime: 0x4c619cf0 -- Tue Aug 10 14:39:44 2010
mtime: 0x4bf1a3e9 -- Mon May 17 16:15:37 2010
Size of extra inode fields: 4
Extended attributes stored in inode body: 
  selinux = "unconfined_u:object_r:user_home_t:s0\000" (37)
BLOCKS:
(0-2):675945-675947, (3):1314836
TOTAL: 4

debugfs:  ncheck 164815
Inode	Pathname
164815	/home/ian/index.html
debugfs:  q

Conclusion

We've covered many tools you can use for checking, modifying, and repairing your filesystems. Remember to always use extreme care when using the tools discussed in this article or any other tools. Data loss may be only a keystroke away.

[Jul 08, 2010] Recovering a Lost LVM Volume Disk Novell User Communities

Contents:

Overview

Logical Volume Management (LVM) provides a high-level, flexible view of a server's disk storage. Though LVM is robust, problems can still occur. The purpose of this document is to review the recovery process when a disk is missing or damaged, and then apply that process to plausible examples. When a disk is accidentally removed or damaged in some way that adversely affects the logical volume, the general recovery process is:

  1. Replace the failed or missing disk
  2. Restore the missing disk's UUID
  3. Restore the LVM meta data
  4. Repair the file system on the LVM device

The recovery process will be demonstrated in three specific cases:

  1. A disk belonging to a logical volume group is removed from the server
  2. The LVM meta data is damaged or corrupted
  3. One disk in a multi-disk volume group has been permanently removed

This article discusses how to restore the LVM metadata. This is a risky proposition: if you restore invalid information, you can lose all the data on the LVM device. An important part of LVM recovery is having backups of the metadata to begin with, and knowing how it's supposed to look when everything is running smoothly. LVM keeps backup and archive copies of its metadata in /etc/lvm/backup and /etc/lvm/archive. Back up these directories regularly, and be familiar with their contents. You should also manually back up the LVM metadata with vgcfgbackup before starting any maintenance projects on your LVM volumes.
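A minimal habit implementing this advice might look like the following (a sketch, not from the article; needs root, and "backuphost" and the destination path are placeholders):

```shell
# Refresh the text backup of every VG's metadata, then copy it off-box
vgcfgbackup
tar czf /tmp/lvm-meta-$(date +%F).tar.gz /etc/lvm/backup /etc/lvm/archive
scp /tmp/lvm-meta-$(date +%F).tar.gz backuphost:/srv/baselines/
```

The tarball is small, and having a copy on another machine is what makes vgcfgrestore possible when the local /etc is gone.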

If you are planning on removing a disk from the server that belongs to a volume group, you should refer to the LVM HOWTO before doing so.

[Jun 09, 2010] SLES10 SP2 can't boot after Kernel Update - NOVELL FORUMS

The SLES10 DVD in rescue mode automatically recognizes the LVM group.

Hi everyone,

Last Saturday I updated my servers running SLES10 SP2 to the latest kernel (to fix the infamous NULL pointer issue in the net stack) and after a reboot the system could not start due to an LVM failure.

Grub works OK but then there come a lot of messages like this: "Waiting for device /dev/system/sys_root to appear. Volume "system" not found. Fallback to sh", and the messages repeat until a mini-bash appears. I can't do anything with that bash (or shell?).

At first I thought the new kernel (or initrd) didn't have lvm2 support, so I booted with a rescue CD and checked every LVM partition with fsck (I was able to mount them all), then chrooted, ran "mkinitrd -f lvm2" and rebooted the system, but nothing changed.

I ran pvdisplay, vgdisplay and lvdisplay and everything looks fine. (See bottom of the post.)

After the chroot, I ran "yast2 lvm_config" and it recognized the volume; I even created a new volume to check, but when I restart the server the problem occurs again.

BTW: The SLES10 SP2 DVD in rescue mode automatically recognizes the LVM group.

So I downgraded the kernel (2.6.16-0.42.4) to the previous working one (2.6.16-60-0.39.3), but the problem persisted.

My situation is very similar to the ones below, but without any filesystem corruption:

http://forums.novell.com/novell-prod...st1536272.html

LVM and RAID - Waiting for device to appear - openSUSE Forums

The "/boot" partition is Linux native (0x83) with ReiserFS, and I have 17 LVM partitions inside the "system" group; my machine is a Dell PowerEdge 2950 with a PERC5/i controller and 5 SAS disks in a RAID5 setup, and all the disks are OK.

I don't think the problem was the kernel update; the server had an uptime of 160 days at the moment I rebooted it, so the problem could have happened days before but gone unnoticed until this reboot.

For the moment, I can start the server with the rescue CD, chroot, and start every process manually so the users can work during the week; all the data is perfectly fine.

What can I do to resolve this problem? It is very important to have this server working flawlessly because it is the main data server of the company.

Thanks in advance for any advise,

Raul

PS: These are the pvdisplay, lvdisplay and vgdisplay output (I had to shorten the output of lvdisplay because it was too long for the post).

Code:
Server:~ # pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda6
  VG Name               system
  PV Size               542,23 GB / not usable 2,53 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              138811
  Free PE               32315
  Allocated PE          106496
  PV UUID               yhllOV-uPt2-XiAX-mMPf-94Hb-xow4-tOHIHN

Code:
Server:~ # lvdisplay
  --- Logical volume ---
  LV Name                /dev/system/sys_home
  VG Name                system
  LV UUID                5zR1Ze-ISx2-7NNj-HAGJ-vaUt-5UfJ-x1y4RM
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                30,00 GB
  Current LE             7680
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:0

  --- Logical volume ---
  LV Name                /dev/system/sys_root
  VG Name                system
  LV UUID                7nue1u-Qci6-VHtX-U1Yv-cMfx-bWxG-UJ6R2a
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                12,00 GB
  Current LE             3072
  Segments               2
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1

  --- Logical volume ---
  LV Name                /dev/system/sys_tmp
  VG Name                system
  LV UUID                aNgfsd-Bn7f-TcqP-swoq-HhLx-jtUw-L15gSC
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                2,00 GB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:2

  --- Logical volume ---
  LV Name                /dev/system/sys_usr
  VG Name                system
  LV UUID                lkT9K7-csO8-QMEe-3R9J-BUar-7Oa2-FTUu0r
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                8,00 GB
  Current LE             2048
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:3

  --- Logical volume ---
  LV Name                /dev/system/sys_var
  VG Name                system
  LV UUID                kXcoKf-UeYc-s5t5-8gqR-I8r9-6aT9-vzZtj6
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                10,00 GB
  Current LE             2560
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:4

  --- Logical volume ---
  LV Name                /dev/system/sys_compras
  VG Name                system
  LV UUID                xefE83-SlWD-S7Ax-GeHw-0W3T-kPyI-imEjL2
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                8,00 GB
  Current LE             2048
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:5

  --- Logical volume ---
  LV Name                /dev/system/sys_proyectos
  VG Name                system
  LV UUID                ulgdax-bPqI-Vi2f-ynYL-pNm4-V4i1-CBn2q9
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                120,00 GB
  Current LE             30720
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:6

  --- Logical volume ---
  LV Name                /dev/system/sys_restore
  VG Name                system
  LV UUID                0jAS4z-iN1V-bR2p-b0l5-bPYa-rKFY-rMftMV
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                60,00 GB
  Current LE             15360
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:7

  --- Logical volume ---
  LV Name                /dev/system/sys_administracion
  VG Name                system
  LV UUID                GkmdIM-Qa2c-6DHs-R6PB-jrNo-pg1o-Q0H6gt
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                80,00 GB
  Current LE             20480
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:8

  --- Logical volume ---
  LV Name                /dev/system/sys_direccion
  VG Name                system
  LV UUID                uc3wr4-qBqL-Mnco-4JPN-vpmi-XZE6-1KarGn
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                4,00 GB
  Current LE             1024
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:9

  --- Logical volume ---
  LV Name                /dev/system/sys_misc
  VG Name                system
  LV UUID                cl7dYM-c9eJ-FAFS-jz0e-EOQN-9saF-kDuJed
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                10,00 GB
  Current LE             2560
  Segments               2
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:10

  --- Logical volume ---
  LV Name                /dev/system/sys_lulo
  VG Name                system
  LV UUID                2mSZiq-mvZ4-iinE-DMxt-ndF2-GF7U-SwGC9o
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                8,00 GB
  Current LE             2048
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:11
Code:
Server:~ # vgdisplay
  --- Volume group ---
  VG Name               system
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  32
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                17
  Open LV               16
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               542,23 GB
  PE Size               4,00 MB
  Total PE              138811
  Alloc PE / Size       106496 / 416,00 GB
  Free  PE / Size       32315 / 126,23 GB
  VG UUID               7SoTGW-QyzO-Ns5d-UpDM-Wjef-JJSj-1KswVA

[May 17, 2010] How to mount a LVM partition on Ubuntu - Server Fault

#fdisk -l 
 
/dev/sdb1   *           1        9702    77931283+  8e  Linux LVM 

I tried the following command:

#mkdir /media/backup 
#mount /dev/sdb1 /media/backup 
 
mount: unknown filesystem type 'LVM2_member' 

How do I mount it?
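An LVM physical volume cannot be mounted directly; you have to activate the volume group and then mount a logical volume inside it. A minimal sketch of the usual procedure follows; the volume group and logical volume names are placeholders, use whatever vgscan and lvscan report on your system:

```shell
vgscan --mknodes      # detect volume groups and create missing /dev nodes
vgchange -a y         # activate all logical volumes that were found
lvscan                # list logical volumes; pick the one to mount
mkdir -p /media/backup
mount /dev/VolGroup/LogVol /media/backup   # placeholder VG/LV names
```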

[May 15, 2010] Recover Data From RAID1 LVM Partitions With Knoppix Linux LiveCD HowtoForge - Linux Howtos and Tutorials

LVM stores one or more copies of its configuration at the beginning of the partition. I use the command dd to extract the first part of the partition and write it to a text file:

dd if=/dev/md0 bs=512 count=255 skip=1 of=/tmp/md0.txt

Open the file with a text editor:

vi /tmp/md0.txt

You will find some binary data first and then a configuration file part like this:

VolGroup00 {
	id = "evRkPK-aCjV-HiHY-oaaD-SwUO-zN7A-LyRhoj"
	seqno = 2
	status = ["RESIZEABLE", "READ", "WRITE"]
	extent_size = 65536		# 32 Megabytes
	max_lv = 0
	max_pv = 0

	physical_volumes {

		pv0 {
			id = "uMJ8uM-sfTJ-La9j-oIuy-W3NX-ObiT-n464Rv"
			device = "/dev/md0"	# Hint only

			status = ["ALLOCATABLE"]
			pe_start = 384
			pe_count = 8943	# 279,469 Gigabytes
		}
	}

	logical_volumes {

		LogVol00 {
			id = "ohesOX-VRSi-CsnK-PUoI-GjUE-0nT7-ltxWoy"
			status = ["READ", "WRITE", "VISIBLE"]
			segment_count = 1

			segment1 {
				start_extent = 0
				extent_count = 8942	# 279,438 Gigabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"pv0", 0
				]
			}
		}
	}
}

Create the file /etc/lvm/backup/VolGroup00:

vi /etc/lvm/backup/VolGroup00

and insert the configuration data so the file looks similar to the above example.

Now we can start LVM:

/etc/init.d/lvm start
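Instead of hand-editing a file under /etc/lvm/backup, the same extracted metadata text can be fed to the standard LVM restore tools. A hedged sketch, assuming the configuration recovered above was saved as /tmp/VolGroup00.conf (the PV UUID is the one from the pv0 section of the listing):

```shell
# Recreate the physical volume header with its original UUID,
# validating it against the saved metadata:
pvcreate --uuid "uMJ8uM-sfTJ-La9j-oIuy-W3NX-ObiT-n464Rv" \
         --restorefile /tmp/VolGroup00.conf /dev/md0
# Restore the volume group metadata and activate the group:
vgcfgrestore -f /tmp/VolGroup00.conf VolGroup00
vgchange -a y VolGroup00
```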

Missing bits found

July 18th, 2007 | linuxjournal.com

On July 18th, 2007 Richard Bullington-McGuire says:

> ... couldn't you have booted the recovery box with a Live CD and simply mounted

only the drive partitions you needed?

That was what I was originally hoping to do, but that did not work automatically. RAID arrays on USB-connected drives are not available to the system when it does its first scan for RAID arrays. Also, if the recovery box has a volume group with the same name, it will not recognize the newly-attached volume group.

I have used USB RAID arrays in production, and you have to take some extra steps to activate them late in the boot process. I typically use a script similar to this to do the job:


#!/bin/sh
#
# Mount a USB raid array
#
# Call from /etc/rc.d/rc.local

DEVICE=/dev/ExampleVolGroup/ExampleVol00
MOUNTPOINT=/mnt/ExampleVol00

# Activate the array. This assumes that /etc/mdadm.conf has an entry for it already
/sbin/mdadm -A -s
# Look for LVM2 volume groups on all connected partitions, including the array
/sbin/vgscan --mknodes
# Activate all LVM partitions, including that on the array
/sbin/vgchange -a y
# Make sure to fsck the device so it stays healthy long-term
fsck -T -a $DEVICE
mount $DEVICE $MOUNTPOINT

> In other words, just don't mount the drive in the recovery box that had the equivalent volume group. That way there would have been no conflict, right?

That's mostly right. You'd still need to scan for the RAID arrays with 'mdadm --examine --scan $MYDEVICENAME' , then activate them after creating /etc/mdadm.conf.

If you had other md software RAID devices on the system, you might have to fix up the device numbering on the md devices.

> If the recovery box did NOT have any LVM partitions or LVM config native to it.. could i simply plug the raid drive in and the recovery box would automagically find the raid LVM partitions or would I still have to something else to make it work?

On a recovery box without any software RAID or LVM configuration, if you plugged the RAID drive directly into the IDE or SATA connector, it might automagically find the RAID array and LVM volume. I have not done that particular experiment, you might try it and let me know how it goes.

If the drive was attached to the recovery box using a USB enclosure, the RAID and LVM configurations probably won't be autodetected during the early boot stages, and you'll almost certainly have to do a scan / activate procedure on both the RAID and LVM layers.

You might have to scan for RAID partitions, build an /etc/mdadm.conf file, and then scan for volume groups and activate them in either case.

The most difficult part of the recovery outlined in the article was pulling the LVM configuration out of the on-disk ring buffer. You can avoid that by making sure you have a backup of the LVM configuration for that machine stored elsewhere.

Experienced this exact

On September 13th, 2006 Neekofab (not verified) says:

Experienced this exact problem. Moved an md0/md1 disk to a recovery workstation that already had md0/md1 devices. They could not coexist, and I could not find a way to move the additional md0/md1 devices to md2/md3. I ended up disconnecting the system's md0/md1 devices, booting up with sysresccd and shoving the data over the network.

bleah

I ran into the same issue

On May 9th, 2007 Anonymous (not verified) says:

I ran into the same issue and solved it with a little reading about mdadm. All you have to do is create a new array from the old disks.

# MAKEDEV md1
# mdadm -C /dev/md1 -l 1 -n 2 missing /dev/sdb1

Voila. Your raid array has now been moved from md0 to md1.

Help restoring LVM partition Redhat, RHEL 5, 5.1, LVM

An example of a wrong move that destroys the LVM metadata. To add a new disk to an existing volume group you need
vgextend VolGroup00 /dev/sdb1 
not pvcreate -ff. See Recovery of RAID and LVM2 Volumes for information about recovery.
Zones: Linux, Linux Administration, Linux Setup

Tags: Redhat, RHEL 5, 5.1, LVM

I was working on a Dell Precision Workstation system with 2 SAS drives.
The first drive /dev/sda was partitioned with the following table

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 36472 292856917+ 8e Linux LVM

[root@lrc200604665 tsm]# cat /etc/fstab
/dev/VolGroup00/LogVol00 / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
devpts /dev/pts devpts gid=5,mode=620 0 0
tmpfs /dev/shm tmpfs defaults 0 0
proc /proc proc defaults 0 0
sysfs /sys sysfs defaults 0 0
/dev/VolGroup00/LogVol01 swap swap defaults 0 0

I wanted to add a second hard drive to the system and mount it as "home"

My idea was to add it to the existing Volume Group VolGroup00

After I formatted the drive using the standard Linux 8e LVM partition type, I ran the following command to prepare it for LVM

[root@lrc200604665 home]# pvcreate /dev/sdb1
Can't initialize physical volume "/dev/sdb1" of volume group "VolGroup00" without -ff

Well, I ran the command and it overwrote my LVM table.

pvscan detects a UUID mismatch, since pvcreate overwrote the VolGroup00 metadata.
vgscan and lvscan are also useless.

The system will obviously not boot now

any help would be greatly appreciated
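If only the metadata was overwritten and the data blocks are intact, the archived metadata that LVM keeps under /etc/lvm/archive (reachable from a rescue CD by mounting the old root partition, or from a baseline copy of /etc) can usually bring the group back. A sketch; the archive file name and the UUID placeholder below are assumptions you must fill in from your own system:

```shell
# See which archived metadata versions survive for the damaged group:
vgcfgrestore --list VolGroup00
# Recreate the clobbered PV with its ORIGINAL UUID (pvscan prints it
# in the mismatch complaint; it is also inside the archive file):
pvcreate --uuid "<original-pv-uuid>" \
         --restorefile /etc/lvm/archive/VolGroup00_00001.vg /dev/sda2
# Put the metadata back and reactivate:
vgcfgrestore -f /etc/lvm/archive/VolGroup00_00001.vg VolGroup00
vgchange -a y VolGroup00
```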

I found the solution
#pvs 
/dev/sdc5 intranet lvm2 a- 372,37G 0 
 
# lvdisplay /dev/intranet 
LV Name                /dev/intranet/root 
 
#mount /dev/intranet/root /media/backup 

[May 17, 2010] LinuxDevCenter.com Managing Disk Space with LVM

Second, having the root filesystem on LVM can complicate recovery of damaged file systems. Because boot loaders don't support LVM yet, you must also have a non-LVM /boot partition (though it can be on a RAID 1 device).

Third, you need some spare unallocated disk space for the new LVM partition. If you don't have this, use parted to shrink your existing root partition, as described in the LVM HOWTO.

For this example, assume you have your swap space and /boot partitions already set up outside of LVM on their own partitions. You can focus on moving your root filesystem onto a new LVM partition in the partition /dev/hda4. Check that the filesystem type on hda4 is LVM (type 8e).

Initialize LVM and create a new physical volume:

# vgscan
# pvcreate /dev/hda4
# vgcreate rootvg /dev/hda4

Now create a 5G logical volume, formatted into an xfs file system:

# lvcreate --name rootlv --size 5G rootvg
# mkfs.xfs /dev/rootvg/rootlv

Copy the files from the existing root file system to the new LVM one:

# mkdir /mnt/new_root
# mount /dev/rootvg/rootlv /mnt/new_root
# cp -ax /. /mnt/new_root/

Next, modify /etc/fstab to mount / on /dev/rootvg/rootlv instead of /dev/hda3.

The trickiest part is to rebuild your initrd to include LVM support. This tends to be distro-specific, but look for mkinitrd or yaird. Your initrd image must have the LVM modules loaded or the root filesystem will not be available. To be safe, leave your original initrd image alone and make a new one named, for example, /boot/initrd-lvm.img.
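For example, on a Red Hat-style system mkinitrd can be told to include the device-mapper module explicitly; the module name and the target image path here are assumptions, so check your distro's documentation:

```shell
# Build a new initrd for the running kernel, forcing dm-mod in,
# and leave the original initrd image untouched:
mkinitrd --with=dm-mod /boot/initrd-lvm.img $(uname -r)
```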

Finally, update your bootloader. Add a new section for your new root filesystem, duplicating your original boot stanza. In the new copy, change the root from /dev/hda3 to /dev/rootvg/rootlv, and change your initrd to the newly built one.

For example, with grub, if you have:

title=Linux
  root (hd0,0)
  kernel /vmlinuz root=/dev/hda3 ro single
  initrd /initrd.img

add a new section such as:

title=LinuxLVM

  root (hd0,0)
  kernel /vmlinuz root=/dev/rootvg/rootlv ro single
  initrd /initrd-lvm.img

Conclusion

LVM is only one of many enterprise technologies in the Linux kernel that has become available for regular users. LVM provides a great deal of flexibility with disk space, and combined with RAID 1, NFS, and a good backup strategy, you can build a bulletproof, easily managed way to store, share, and preserve any quantity of files.

Bryce Harrington is a Senior Performance Engineer at the Open Source Development Labs in Beaverton, Oregon.

Kees Cook is the senior network administrator at OSDL.

[Nov 11, 2008] EXT3 filesystem recovery in LVM2

--------------------------------------------------------------------------------
This is the bug I started on the Fedora bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142737
--------------------------------------------------------------------------------
Very good idea to do something like the following, so that you have
an copy of the partition you're trying to recover, in case something
bad happens:
dd if=/dev/hda2 bs=1024k conv=noerror,sync,notrunc | reblock -t 65536 30 |
ssh remote.host.uci.edu 'cat > /recovery/damaged-lvm2-ext3'
--------------------------------------------------------------------------------
e2salvage died with "Terminated".  I assume it OOM'd.
--------------------------------------------------------------------------------
e2extract gave a huge list of 0 length files.  Doesn't seem right,
and it was taking forever, so I decided to move on to other methods.
But does anyone know if this is normal behavior for e2extract on an ext3?
--------------------------------------------------------------------------------
I wrote a small program that searches for ext3 magic numbers.  It's
finding many, EG 438, 30438, 63e438 and so on (hex).  The question is,
how do I convert from that to an fsck -b number?
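One plausible answer (a sketch; the block size is an assumption you have to guess per filesystem): the ext2/ext3 magic 0xEF53 sits 56 bytes (0x38) into each superblock, so an offset ending in hex 38 marks a superblock candidate starting at that offset minus 0x38, and dividing that by the block size gives the block number for fsck's -b option. For the offsets found above, with 1024-byte blocks:

```shell
for off in 0x438 0x30438 0x63e438; do
  # superblock start = magic offset - 0x38; block = start / blocksize
  echo $(( (off - 0x38) / 1024 ))
done
```

This yields 1 (the primary superblock), 193 and 6393. Only hits matching the known backup locations (8193, 16385, ... for 1 KiB blocks) are worth feeding to fsck -b; the others are likely journal or stale copies, which would also explain why the known-good disk gave different offsets.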
--------------------------------------------------------------------------------
Running the same program on a known-good ext3, the first offset was the
same, but others were different.  However, they all ended in hex 38...
--------------------------------------------------------------------------------
I'm now running an "fsck -vn -b" with the -b argument ranging from 0 to
999999.  I'm hoping this will locate a suitable -b for me via brute force.
--------------------------------------------------------------------------------
Sent a post to gmane.linux.kernel 2004-12-16 
--------------------------------------------------------------------------------
Robin Green very helpfully provided the
following instructions, which appear to be getting somewhere:

1) Note down what the root= device is that appears on the kernel
command line (this can be found by going to boot from hard drive and
then examining
the kernel command line in grub, or by looking in /boot/grub/grub.conf )

2) Be booted from rescue disk

3) Sanity check: ensure that the nodes /dev/hda, /dev/hda2 etc. exist

4) Start up LVM2 (assuming it is not already started by the rescue disk!) by
typing:

  lvm vgchange --ignorelockingfailure -P -a y

Looking at my initrd script, it doesn't seem necessary to run any other
commands to get LVM2 volumes activated - that's it.

5) Find out which major/minor number the root device is. This is the
slightly tricky bit. You may have to use trial-and-error. In my case,
I guessed right first time (no comments about my odd hardware setup please ;):

[root@localhost t]# ls /sys/block
dm-0  dm-2  hdd    loop1  loop3  loop5  loop7  ram0  ram10  ram12  ram14
ram2  ram4  ram6  ram8
dm-1  hdc   loop0  loop2  loop4  loop6  md0    ram1  ram11  ram13  ram15
ram3  ram5  ram7  ram9
[root@localhost t]# cat /sys/block/dm-0/dev
253:0
[root@localhost t]# devmap_name 253 0
Volume01-LogVol02

In the first command, I listed the block devices known to the kernel.
dm-* are the LVM devices (on my 2.6.9 kernel, anyway). In the second
command, I found out the major:minor numbers of /dev/dm-0. In the third
command, I used devmap_name to check that the device mapper name of the
node with major 253 and minor 0 is the same as the name of the root
device from my kernel command line (cf. step 1). Apart from a slight
punctuation difference, it is the same, therefore I have found the root device.

I'm not sure if FC3 includes the devmap_name command. According to
fr2.rpmfind.net, it doesn't. But you don't really need it: you can just
try all the LVM devices in turn until you find your root device. Or, I
can email you a statically-linked binary of it if you want.

6) Create the /dev node for the root filesystem if it doesn't already
exist, e.g.:

  mknod /dev/dm-0 b 253 0

using the major-minor numbers found in step 5.

Please note that for the purpose of _rescue_, the node doesn't actually
have to be under /dev (so /dev doesn't have to be writeable) and its
name does not matter. It just needs to exist somewhere on a filesystem,
and you have to refer to it in the next command.

7) Do what you want to the root filesystem, e.g.:

  fsck /dev/dm-0
  mount /dev/dm-0 /where/ever

As you probably know, the fsck might actually work, because a fsck
can sometimes
correct filesystem errors that the kernel filesystem modules cannot.

8) If the fsck doesn't work, look in the output of fsck and in dmesg
for signs of
physical drive errors. If you find them, (a) think about calling a
data recovery
specialist, (b) do NOT use the drive!
--------------------------------------------------------------------------------
On FC3's rescue disk, what I actually did was:

1) Do startup network interfaces
2) Don't try to automatically mount the filesystems - not even readonly
3) lvm vgchange --ignorelockingfailure -P -a y
4) fdisk -l, and guess which partition is which based on size: the small
one was /boot, and the large one was /
5) mkdir /mnt/boot
6) mount /dev/hda1 /mnt/boot
7) Look up the device node for the root filesystem in /mnt/boot/grub/grub.conf
8) A first tentative step, to see if things are working: fsck -n
/dev/VolGroup00/LogVol00
9) Dive in: fsck -f -y /dev/VolGroup00/LogVol00
10) Wait a while...  Be patient.  Don't interrupt it
11) Reboot
--------------------------------------------------------------------------------
Are these lvm1 or lvm2?

lvmdiskscan -v
vgchange -ay
vgscan -P
vgchange -ay -P
--------------------------------------------------------------------------------
jeeves:~# lvm version
  LVM version:     2.01.04 (2005-02-09)
  Library version: 1.01.00-ioctl (2005-01-17)
  Driver version:  4.1.0
--------------------------------------------------------------------------------
I think you are making a potentially very dangerous mistake!

Type 8e is a partition type. You don't want to use resize2fs on the PARTITION,
which is not an ext2 partition, but an lvm partition. You want
to resize the filesystem on the logical VOLUME.

And yes, resize2fs is appropriate for logical volumes. But resize the VOLUME
(e.g. /dev/VolGroup00/LogVol00), not the partition or volume group.

On Fri, Mar 04, 2005 at 06:35:31PM +0000, Robert Buick wrote:
> I'm using type 8e, does anyone happen to know if resize2fs is
> appropriate for this type; the man page only mentions type2.
--------------------------------------------------------------------------------
A method of hunting for two text strings in a raw disk, after files
have been deleted.  The data blocks of the disk are read once, but
grep'd twice.

seki-root> reblock -e 75216016 $(expr 1024 \* 1024) 300 <
/dev/mapper/VolGroup00-LogVol00 | mtee 'egrep --binary-files=text -i -B
1000 -A 1000 dptutil > dptutil-hits' 'egrep --binary-files=text -i
-B 1000 -A 1000 dptmgr > dptmgr-hits'
stdin seems seekable, but file length is 0 - no exact percentages
Estimated filetransfer size is 77021200384 bytes
Estimated percentages will only be as accurate as your size estimate
Creating 2 pipes
popening egrep --binary-files=text -i -B 1000 -A 1000 dptutil > dptutil-hits
popening egrep --binary-files=text -i -B 1000 -A 1000 dptmgr > dptmgr-hits
(estimate: 0.1%  0s 56m 11h) Kbytes: 106496.0  Mbits/s: 13.6  Gbytes/hr:
6.0  min: 1.0
(estimate: 0.2%  9s 12m 12h) Kbytes: 214016.0  Mbits/s: 13.3  Gbytes/hr:
5.8  min: 2.0
(estimate: 0.3%  58s 58m 11h) Kbytes: 257024.0  Mbits/s: 13.5  Gbytes/hr:
5.9  min: 2.4
...

references:
http://stromberg.dnsalias.org/~strombrg/reblock.html
http://stromberg.dnsalias.org/~strombrg/mtee.html
egrep --help
--------------------------------------------------------------------------------
Performing the above reblock | mtee, my Fedora Core 3 system got -very-
slow.  If I were to suspend the pipeline above, performance would be
great.  If I resumed it, very quickly, performance would be bad again.
The following scheduler change left my system a little bit jerky, but it's
-far- more usable now, despite the pipeline above still pounding the
SATA drive my home directory is on.

seki-root> echo deadline > scheduler 
Wed Mar 09 17:56:58

seki-root> cat scheduler 
noop anticipatory [deadline] cfq 
Wed Mar 09 17:57:00

seki-root> pwd
/sys/block/sdb/queue
Wed Mar 09 17:58:31

BTW, I looked into tagged command queuing for this system as well,
but apparently VIA SATA doesn't support TCQ on linux 2.6.x.
--------------------------------------------------------------------------------
Eventually the reblock | mtee egrep egrep gave:
egrep: memory exhausted
...using GNU egrep 2.5.1.
...so now I'm trying something closer to my classical method:
seki-root> reblock -e 75216016 $(expr 1024 \* 1024) 300 <
/dev/mapper/VolGroup00-LogVol00 | mtee './bgrep dptutil | ./ranges >
dptutil-ranges' './bgrep dptmgr | ./ranges > dptmgr-ranges'
Creating 2 pipes
popening ./bgrep dptutil | ./ranges > dptutil-ranges
popening ./bgrep dptmgr | ./ranges > dptmgr-ranges
stdin seems seekable, but file length is 0 - no exact percentages
Estimated filetransfer size is 77021200384 bytes
Estimated percentages will only be as accurate as your size estimate
(estimate: 1.3%  16s 12m 1h) Kbytes: 1027072.0  Mbits/s: 133.6  Gbytes/hr:
58.7  min: 1.0
(estimate: 2.5%  36s 16m 1h) Kbytes: 1913856.0  Mbits/s: 124.5  Gbytes/hr:
54.7  min: 2.0
(estimate: 3.7%  10s 17m 1h) Kbytes: 2814976.0  Mbits/s: 122.1  Gbytes/hr:
53.6  min: 3.0
(estimate: 4.9%  10s 17m 1h) Kbytes: 3706880.0  Mbits/s: 120.6  Gbytes/hr:
53.0  min: 4.0
...
--------------------------------------------------------------------------------
I've added a -s option to reblock, which makes it sleep for an  arbitrary
number of (fractions of) seconds between blocks.  Between this and the
I/O scheduler change, seki has become very pleasant to work on again,
despite the hunt for my missing palm memo.  :)
--------------------------------------------------------------------------------
From Bryan Ragon  

Here is a detailed list of steps that worked:

;; first backed up the first 512 bytes of /dev/hdb
# dd if=/dev/hdb of=~/hdb.first512 count=1 bs=512
1+0 records in
1+0 records out
 

;; zero them out, per Alasdair
# dd if=/dev/zero of=/dev/hdb count=1 bs=512
1+0 records in
1+0 records out

;; verified
# blockdev --rereadpt /dev/hdb
BLKRRPART: Input/output error

;; find the volumes
# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "media_vg" using metadata type lvm2

# pvscan
  PV /dev/hdb   VG media_vg   lvm2 [111.79 GB / 0    free]
  Total: 1 [111.79 GB] / in use: 1 [111.79 GB] / in no VG: 0 [0   ]

# lvmdiskscan
  /dev/hda1 [      494.16 MB]
  /dev/hda2 [        1.92 GB]
  /dev/hda3 [       18.65 GB]
  /dev/hdb  [      111.79 GB] LVM physical volume
  /dev/hdd1 [       71.59 GB]
  0 disks
  4 partitions
  1 LVM physical volume whole disk
  0 LVM physical volumes

# vgchange -a y
  1 logical volume(s) in volume group "media_vg" now active

;; /media is a defined mount point in fstab, listed below for future archive
searches
# mount /media
# ls /media
graphics  lost+found  movies  music


Success!!  Thank you, Alasdair!!!!

/etc/fstab

/dev/media_vg/media_lv  /media          ext3            noatime
0 0

--------------------------------------------------------------------------------
home blee has:
hdc1 ext3 /big wdc
sda5 xfs /backups
00/00 ext3 hda ibm fc3: too hot?
00/01 swap hda ibm
01/00 ext3 hdd maxtor fc4
01/01 swap hdd maxtor
hdb that samsung dvd drive that overheats
--------------------------------------------------------------------------------
 

Recommended Links

Recovering a Lost LVM Volume Disk Novell User Communities

LVM by default keeps backup copies of its metadata for all LVM devices. These backup files are stored in /etc/lvm/backup and /etc/lvm/archive. If a disk is removed or the metadata gets damaged in some way, it can be easily restored, provided you have backups of the metadata. This is why it is highly recommended never to turn off LVM's auto backup feature. Even if a disk is permanently removed from the volume group, it can be reconstructed, and often the remaining data on the file system recovered.
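The corresponding commands are small enough to run routinely; a sketch, using the media_vg group name from the example earlier on this page:

```shell
vgcfgbackup media_vg          # force a metadata backup to /etc/lvm/backup/media_vg
vgcfgrestore --list media_vg  # show archived versions available for restore
```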

Recovery of RAID and LVM2 Volumes

A simple introduction to working with LVM (look at comments)

Reference

[May 17, 2010] Appendix D. LVM Volume Group Metadata

The configuration details of a volume group are referred to as the metadata. By default, an identical copy of the metadata is maintained in every metadata area in every physical volume within the volume group. LVM volume group metadata is small and stored as ASCII.

If a volume group contains many physical volumes, having many redundant copies of the metadata is inefficient. It is possible to create a physical volume without any metadata copies by using the --metadatacopies 0 option of the pvcreate command. Once you have selected the number of metadata copies the physical volume will contain, you cannot change that at a later point. Selecting 0 copies can result in faster updates on configuration changes. Note, however, that at all times every volume group must contain at least one physical volume with a metadata area (unless you are using the advanced configuration settings that allow you to store volume group metadata in a file system). If you intend to split the volume group in the future, every volume group needs at least one metadata copy.

The core metadata is stored in ASCII. A metadata area is a circular buffer. New metadata is appended to the old metadata and then the pointer to the start of it is updated.

You can specify the size of the metadata area with the --metadatasize option of the pvcreate command. The default size is too small for volume groups with many logical volumes or physical volumes.

The Physical Volume Label

By default, the pvcreate command places the physical volume label in the 2nd 512-byte sector. This label can optionally be placed in any of the first four sectors, since the LVM tools that scan for a physical volume label check the first 4 sectors. The physical volume label begins with the string LABELONE.

The physical volume label contains:

Metadata locations are stored as offset and size (in bytes). There is room in the label for about 15 locations, but the LVM tools currently use 3: a single data area plus up to two metadata areas.
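The label is easy to inspect with dd. A self-contained sketch that fakes a label in a scratch file so the pipeline can be shown end to end; on a real system you would point dd at the PV device (e.g. /dev/sdb1) instead of pv.img:

```shell
# Build a 4-sector scratch file with LABELONE at the start of the
# second 512-byte sector, where pvcreate normally puts it:
dd if=/dev/zero of=pv.img bs=512 count=4 2>/dev/null
printf 'LABELONE' | dd of=pv.img bs=1 seek=512 conv=notrunc 2>/dev/null
# Scan the first four sectors for the label, as the LVM tools do:
dd if=pv.img bs=512 count=4 2>/dev/null | strings | grep LABELONE
```

The last command prints LABELONE; on a real PV the same pipeline confirms the label survived, while silence suggests the header was overwritten.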

Metadata Contents

The volume group metadata contains:

The volume group information contains:



Etc

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusively for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.



Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.


Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.



Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: September 12, 2017