Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)


Linux filesystems


The file system is one of the most important parts of an operating system. The file system stores and manages user data on disk drives, and ensures that what’s read from storage is identical to what was originally written. In addition to storing user data in files, the file system also creates and manages information about files and about itself. Besides guaranteeing the integrity of all that data, file systems are also expected to be extremely reliable and have very good performance.

File systems update their structural information (called metadata) with synchronous writes. Each metadata update may require many separate writes, and if the system crashes during the write sequence, the metadata may be left in an inconsistent state. At the next boot, the filesystem check utility (fsck) must walk through the metadata structures, examining and repairing them. This operation takes a very long time on large filesystems, and the disk may not contain sufficient information to correct the structure, which results in misplaced or removed files.

A journaling file system uses a separate area called a log or journal. Before metadata changes are actually performed, they are logged to this separate area. The operation is then performed. If the system crashes during the operation, there is enough information in the log to "replay" the log record and complete the operation. This approach does not require a full scan of the file system, so the filesystem check is very quick even on large file systems, generally a few seconds for a multiple-gigabyte file system. In addition, because all information for the pending operation is saved, no removals or lost-and-found moves are required.

The disadvantage of journaling filesystems is that logging adds write overhead, so they can be slower than non-journaling filesystems. Some journaling filesystems: BeFS, HTFS, JFS, NSS, Ext3, VxFS and XFS.

Fortunately, a number of other Linux file systems take up where Ext2 leaves off. Indeed, Linux now offers four alternatives to Ext2: Ext3, ReiserFS, XFS, and JFS.

In addition to meeting some or all of the requirements listed above, each of these alternative file systems also supports journaling, a feature certainly demanded by enterprises, but beneficial to anyone running Linux. A journaling file system can simplify restarts, reduce fragmentation, and accelerate I/O. Better yet, journaling file systems make fscks a thing of the past.

If you maintain a system of fair complexity or require high-availability, you should seriously consider a journaling file system. Let’s find out how journaling file systems work, look at the four journaling file systems available for Linux, and walk through the steps of installing one of the newer systems, JFS. Switching to a journaling file system is easier than you might think, and once you switch — well, you’ll be glad you did.

Fun with File Systems

To better appreciate the benefits of journaling file systems, let’s start by looking at how files are saved in a non-journaled file system like Ext2. To do that, it’s helpful to speak the vernacular of file systems.

Figure One: How file extents work

An extent is described by its block offset in the file, the location of the first block in the extent, and the length of the extent.

If file sample.txt requires 18 blocks, and the file system is able to allocate one extent of length 8, a second extent of length 5, and a third extent of length 5, the file system would look something like the drawing in Figure One. The first extent has offset 0 (block A in the file), location 0, and length 8. The second extent has offset 8 (block I), location 20, and length 5. The last extent has offset 13, location 35, and length 5.
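The extent arithmetic above can be sketched in a few lines of Python. This is only an illustration: the `Extent` type and the `physical_block` helper are hypothetical names, not part of any file system's real code, and the three extents are the ones given for sample.txt.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    offset: int    # first logical block of the file covered by this extent
    location: int  # first physical block of the extent on disk
    length: int    # number of contiguous blocks in the extent


# The three extents of the 18-block sample.txt described in the text.
SAMPLE_EXTENTS = [Extent(0, 0, 8), Extent(8, 20, 5), Extent(13, 35, 5)]


def physical_block(extents: Sequence[Extent], logical: int) -> Optional[int]:
    """Return the disk block that holds a logical file block, or None if unmapped."""
    for ext in extents:
        if ext.offset <= logical < ext.offset + ext.length:
            # Blocks inside one extent are contiguous on disk.
            return ext.location + (logical - ext.offset)
    return None


print(physical_block(SAMPLE_EXTENTS, 10))  # logical block 10 lives at disk block 22
```

Three small records are enough to describe all 18 blocks, which is the point of extent mapping: the map grows with the number of contiguous runs, not with the number of blocks.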

Figure Two illustrates blocks, inodes (with a number of meta-data attributes), directories, and their relationships.

Figure Two: Blocks, inodes, directories, files, and their relationships

When Good File Systems Go Bad

With those concepts in mind, here’s what happens when a three-block file is modified and grows to be a five-block file:

  1. First, two new blocks are allocated to hold the new data.
     
  2. Next, the file’s inode is updated to record the two new block pointers and the new size of the file.
     
  3. Finally, the actual data is written into the blocks.

As you can see, while writing data to a file appears to be a single atomic operation, the actual process involves a number of steps (even more steps than shown here if you consider all of the accounting required to remove free blocks from a list of free blocks, among other possible metadata changes).

If all the steps to write a file are completed perfectly (and this happens most of the time), the file is saved successfully. However, if the process is interrupted at any time (perhaps due to power failure or other systemic failure), a non-journaled file system can end up in an inconsistent state. Corruption occurs because the logical operation of writing (or updating) a file is actually a sequence of I/O, and the entire operation may not be totally reflected on the media at any given point in time.

If the meta-data or the file data is left in an inconsistent state, the file system will no longer function properly.
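To make the failure modes concrete, here is a toy Python model of the three-step update, with a crash injected after a chosen step. Everything in it (the dictionary layout, `grow_file`, `problems`) is an assumption made for illustration; real file systems keep far more state than this.

```python
def new_disk():
    """A toy disk: 7 blocks, a free-block set, one inode, and block contents."""
    return {
        "nblocks": 7,
        "free": {3, 4, 5, 6},
        "inode": {"blocks": [0, 1, 2], "size": 3},
        "data": {0: "a", 1: "b", 2: "c"},
    }


def grow_file(disk, new_data, crash_after=None):
    """Append len(new_data) blocks to the file; stop after step `crash_after`."""
    # Step 1: allocate new blocks by taking them off the free list.
    blocks = sorted(disk["free"])[:len(new_data)]
    disk["free"] -= set(blocks)
    if crash_after == 1:
        return
    # Step 2: record the new block pointers and the new size in the inode.
    disk["inode"]["blocks"] += blocks
    disk["inode"]["size"] += len(blocks)
    if crash_after == 2:
        return
    # Step 3: write the actual data into the blocks.
    for block, payload in zip(blocks, new_data):
        disk["data"][block] = payload


def problems(disk):
    """A tiny fsck-like consistency check over the toy structures."""
    found = []
    referenced = set(disk["inode"]["blocks"])
    for block in sorted(referenced - set(disk["data"])):
        found.append(f"inode points at unwritten block {block}")
    every_block = set(range(disk["nblocks"]))
    for block in sorted(every_block - referenced - disk["free"]):
        found.append(f"block {block} is neither free nor in use")
    return found


disk = new_disk()
grow_file(disk, ["d", "e"], crash_after=2)
print(problems(disk))
```

A crash after step 1 leaks two blocks; a crash after step 2 leaves the inode pointing at blocks whose contents were never written. Only the uninterrupted run passes the check.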

Non-journaled file systems rely on fsck to examine all of the file system’s metadata and detect and repair structural integrity problems before restarting. If Linux shuts down smoothly, fsck will typically return a clean bill of health. However, after a power failure or crash, fsck is likely to find some kind of error in meta-data.

A file system has a lot of meta-data, and fsck can be very time consuming. After all, fsck has to scan a file system’s entire repository of meta-data to ensure consistency and error-free operation. As you may have experienced, the time fsck takes on a disk partition grows with the size of the partition, the number of directories, and the number of files in each directory.

For large file systems, journaling becomes crucial. A journaling file system provides improved structural consistency, better recovery, and faster restart times than non-journaled file systems. In most cases, a journaled file system can restart in less than a second.

Dear Journal…

The magic of journaling file systems lies in transactions. Just like a database transaction, a journaling file system transaction treats a sequence of changes as a single, atomic operation — but instead of tracking updates to tables, the journaling file system tracks changes to file system meta-data and/or user data. The transaction guarantees that either all or none of the file system updates are done.

For example, the process of creating a new file modifies several meta-data structures (inodes, free lists, directory entries, etc.). Before the file system makes those changes, it creates a transaction that describes what it’s about to do. Once the transaction has been recorded (on disk), the file system goes ahead and modifies the meta-data. The journal in a journaling file system is simply a list of transactions.

In the event of a system failure, the file system is restored to a consistent state by replaying the journal. Rather than examine all meta-data (the fsck way), the file system inspects only those portions of the meta-data that have recently changed. Recovery is much faster, usually only a matter of seconds. Better yet, recovery time is not dependent on the size of the partition.
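Journal replay can be sketched as follows. This is a simplified model under stated assumptions, not how any of the four file systems actually lays out its log: each transaction is a list of metadata updates plus a `committed` flag, and a transaction whose commit record never reached the log is simply skipped.

```python
def replay(journal, metadata):
    """Re-apply every committed transaction in order; ignore incomplete ones."""
    for txn in journal:
        if not txn["committed"]:
            # The crash happened before this transaction was fully logged,
            # so none of its updates are applied (all-or-nothing).
            continue
        for key, value in txn["updates"]:
            metadata[key] = value
    return metadata


# Hypothetical metadata and journal contents for illustration.
metadata = {"inode.size": 3, "free_blocks": 4}
journal = [
    {"updates": [("inode.size", 5), ("free_blocks", 2)], "committed": True},
    {"updates": [("inode.size", 9)], "committed": False},
]
print(replay(journal, metadata))
```

Note that the work done depends on the length of the journal, not on the size of the file system, and that replaying the same journal twice gives the same result.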

In addition to faster restart times, most journaling file systems also address another significant problem: scalability. If you combine even a few large-capacity disks, you can assemble some massive (certainly by early-90s’ standards) file systems. Features of modern file systems include:

More advanced file systems also manage sparse files, internal fragmentation, and the allocation of inodes better than Ext2.

A Wealth of Options

While advanced file systems are tailored primarily for the high throughput and high uptime requirements of servers (from single processor systems to clusters), these file systems can also benefit client machines where performance and reliability are wanted or needed.

As mentioned in the introduction, recent releases of Linux include not one, but four journaling file systems. JFS from IBM, XFS from SGI, and ReiserFS from Namesys have all been “open sourced” and subsequently included in the Linux kernel. In addition, Ext3 was developed as a journaling add-on to Ext2.

Figure Three shows where the file systems fit in Linux. You’ll note that JFS, XFS, ReiserFS, and Ext3 are independent “peers.” It’s possible for a single Linux machine to use all of those file systems at the same time. A system administrator could configure a system to use XFS on one partition, and ReiserFS on another.

Figure Three: Where file systems fit in the operating system

What are the features and benefits of each system? Let’s take a quick look at Ext3, ReiserFS, and XFS, and then an in-depth look at JFS.

EXT3

As mentioned above, Ext2 is the de facto file system for Linux. While it lacks some of the advanced features (extremely large files, extent-mapped files, etc.) of XFS and ReiserFS and others, it’s reliable, stable, and still the default “out of the box” file system for all Linux distributions. Ext2’s real weakness is fsck: the bigger the Ext2 file system, the longer it takes to fsck. Longer fsck times mean longer down times.

The Ext3 file system was designed to provide higher availability without impacting the robustness (at least the simplicity and reliability) of Ext2. Ext3 is a minimal extension to Ext2 to add support for journaling. Ext3 uses the same disk layout and data structures as Ext2, and it’s forward- and backward-compatible with Ext2. Migration from Ext2 to Ext3 (and vice versa) is quite easy, and can even be done in-place in the same partition. The other three journaling file systems require the partition to be formatted with their mkfs utility.

If you want to adopt a journaling file system, but don’t have free partitions on your system, Ext3 could be the journaling file system to use. See “Switching to Ext3” for information on how to switch to Ext3 on your Linux machine.


 
 


Switching to Ext3


 

If you want to switch to Ext3, it’s a good idea to make a backup of your file systems. Once you’ve done that, run the tune2fs program with the -j option to add a journal file to an existing Ext2 file system. You can run tune2fs on a mounted or unmounted Ext2 file system. For instance, if /dev/hdb3 is an Ext2 file system, the command

# tune2fs -j /dev/hdb3

creates the log. If the file system is mounted, a journal file named .journal will be placed in the root directory of the file system. If the file system is not mounted, the journal file will be hidden. (When you mount an Ext3 file system, the .journal file will appear. The .journal file is just an indicator to show that the file system is indeed Ext3.)

Next, the entry for /dev/hdb3 in /etc/fstab needs to be changed from ext2 to ext3. The final step is to reboot and verify that the /dev/hdb3 partition has type ext3. Type mount. The output should include an entry like this one:

% mount

/dev/hdb3 on /test type ext3 (rw)
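If you would rather check the type from a script than by eye, a minimal sketch is to parse the mount output shown above. The `fs_type` helper is a hypothetical name, and the parser assumes the classic `<device> on <mountpoint> type <fstype> (<options>)` layout with no spaces in the mount point.

```python
def fs_type(mount_output, device):
    """Return the file system type reported for `device`, or None if not listed."""
    for line in mount_output.splitlines():
        fields = line.split()
        # Expected shape: <device> on <mountpoint> type <fstype> (<options>)
        if (len(fields) >= 5 and fields[0] == device
                and fields[1] == "on" and fields[3] == "type"):
            return fields[4]
    return None


sample = "/dev/hdb3 on /test type ext3 (rw)"
print(fs_type(sample, "/dev/hdb3"))  # ext3
```

On a live system the same function could be fed the text returned by running mount, for example through `subprocess.run(["mount"], capture_output=True, text=True).stdout`.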

Ext3 provides three data journaling modes that can be set at mount time: data=journal, data=writeback, and data=ordered. The data=journal mode journals both meta-data and file data. The data=writeback mode journals only meta-data. The data=ordered mode, which is the default, also journals only meta-data, but it writes data blocks to disk before committing the related meta-data to the journal, which improves integrity. With three modes, a system administrator can trade off performance against file data consistency.
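As a small illustration of selecting a mode at mount time, the sketch below builds the argument list for a mount command with the `data=` option. The helper name `ext3_mount_command` and the one-line mode summaries are this example's own; the device and mount point are the ones used in the sidebar.

```python
# One-line summaries of the three Ext3 data journaling modes.
EXT3_DATA_MODES = {
    "journal": "meta-data and file data are both journaled",
    "ordered": "meta-data is journaled; data is written before the commit (default)",
    "writeback": "meta-data is journaled; data writes are not ordered",
}


def ext3_mount_command(device, mountpoint, mode="ordered"):
    """Build (but do not run) a mount command that selects an Ext3 data mode."""
    if mode not in EXT3_DATA_MODES:
        raise ValueError(f"unknown Ext3 data mode: {mode!r}")
    return ["mount", "-t", "ext3", "-o", f"data={mode}", device, mountpoint]


print(" ".join(ext3_mount_command("/dev/hdb3", "/test", "journal")))
```

The function only assembles the command; running it would need root privileges and an unmounted /dev/hdb3.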

If for some reason you’d like to change the Ext3 partition back to Ext2, the process is very simple: umount the file system, and re-mount it using Ext2.

# mount -t ext2 /dev/hdb3 /test

If you want the file system to mount as Ext2 at boot time, you’ll also have to change its entry in /etc/fstab.

 

The downside of Ext3? It’s an add-on to Ext2, so it still has the same limitations that Ext2 has. The fixed internal structures of Ext2 are simply too small (too few bits) to capture large file sizes, extremely large partition sizes, and enormous numbers of files in a single directory. Moreover, the bookkeeping techniques of Ext2, such as its linked-list directory implementation, do not scale well to large file systems (there is an upper limit of 32,768 subdirectories in a single directory, and a “soft” upper limit of 10,000-15,000 files in a single directory.) To make radical improvements to Ext2, you’d have to make radical changes. Radical change was not the intent of Ext3.

However, newer file systems do not have to be backward-compatible with Ext2. ReiserFS, XFS, and JFS offer scalability, high-performance, very large file systems, and of course, journaling. “Why Four Journaling File Systems is a Good Thing” presents an overview of the capabilities of the four journaling file systems.

Why Four Journaling File Systems is Good

One of the great things about open source is that choice is looked upon favorably. Linux is the only operating system with four journaling file systems in production: ReiserFS, Ext3, JFS, and XFS.

All four file systems are licensed under the GPL, and source code is available at http://www.kernel.org or on each project’s home page. Each of the journaling file system teams follows a community model and welcomes users and contributors. In fact, the teams share their best ideas, and competitive benchmarking encourages constant improvement of all of the systems.

The table below summarizes the features and limits of the four Linux journaling file systems. The first section provides some history of when the journaling file systems were accepted into the kernel.org source trees. The next section lists some of the features of the file systems. The final section lists some of the distributions that are currently shipping the journaling file systems. If the distribution is shipping the file system that you want to use, you can use that file system right “out-of-the-box.”

For complete feature lists of each journaling file system, see the respective project Web pages.

A comparison of journaling file systems

                                         Ext3        ReiserFS    XFS         JFS

Kernel support
  Kernel prerequisites                   No          No          Yes         No
  In kernel.org source tree 2.4.x        2.4.15      2.4.1       -           -
  In kernel.org source tree 2.5.x        2.5.0       2.5.0       -           2.5.6
  License                                GPL         GPL         GPL         GPL

Features
  Largest block size supported on ia32   4 Kb        4 Kb        4 Kb        4 Kb
  File system size maximum               16384 Gb    17592 Gb    18,000 Pb+  32 Pb
  File size maximum                      2048 Gb     1 Eb*       9,000 Pb    4 Pb
  Growing the file system size           Patch       Yes         Yes         Yes
  Access Control Lists                   Patch       No          Yes         WIP
  Dynamic disk inode allocation          No          Yes         Yes         Yes
  Data logging                           Yes         No          No          No
  Place log on an external device        Yes         Yes         Yes         Yes

Distros with journaling file systems
  Red Hat 7.3                            Yes         Yes         No          Yes
  SuSE 8.0                               Yes         Yes         Yes         Yes
  Mandrake Linux 8.2                     Yes         Yes         Yes         Yes
  Slackware Linux 8.1                    Yes         Yes         Yes         Yes

+ Pb is petabyte, or 10^15 bytes

* Eb is exabyte, or 10^18 bytes

By the way, the 2.4 kernel has a limit of 2048 Gb for a single block device, so no file system larger than that can be created at this time (without patching the standard kernel). This restriction could be removed in the 2.5.x development kernel, and there are patches available to remove this limit, but as of 2.5.29, the patches haven’t been officially included yet.
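Several of these limits are what 32-bit counters give you, which a little arithmetic makes visible. The sketch below assumes 512-byte sectors and 4 Kb blocks; reading the limits as 32-bit sector and block counts is this example's interpretation, not something the table states.

```python
SECTOR_BYTES = 512     # assumed sector size
BLOCK_BYTES = 4096     # largest block size listed in the table
GIB = 2**30            # binary gigabyte
GB = 10**9             # decimal gigabyte

# 2^32 sectors of 512 bytes: the 2.4 block device limit.
device_limit = 2**32 * SECTOR_BYTES
# 2^32 blocks of 4 Kb: a file system addressed with 32-bit block numbers.
fs_limit = 2**32 * BLOCK_BYTES

print(device_limit // GIB)  # 2048
print(fs_limit // GIB)      # 16384
print(fs_limit // GB)       # 17592
```

Under these assumptions the 16384 Gb and 17592 Gb entries in the table are the same number of bytes (2^44), expressed once in binary and once in decimal gigabytes.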

 

REISERFS

ReiserFS is designed and developed by Hans Reiser and his team of developers at Namesys. Like the other journaling file systems, it’s open source, is available in most Linux distributions, and supports meta-data journaling.

One of the unique advantages of ReiserFS is support for small files — lots and lots of small files. Reiser’s philosophy is simple: small files encourage coding simplicity. Rather than use a database or create your own file caching scheme, use the filesystem to handle lots of small pieces of information.

ReiserFS is about eight to fifteen times faster than Ext2 at handling files smaller than 1K.

Even more impressive, (when properly configured) ReiserFS can actually store about 6% more data than Ext2 on the same physical file system. Rather than allocate space in fixed 4K blocks, ReiserFS can allocate the exact space that’s needed. A B* tree manages all file system meta-data, and stores and compresses tails, portions of files smaller than a block.
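The cost of fixed-size blocks is easy to estimate. The following sketch computes the slack (allocated but unused bytes) when every file is rounded up to whole 4K blocks; the file sizes are made up for the example, and the code does not model ReiserFS itself or reproduce the 6% figure.

```python
BLOCK_BYTES = 4096


def allocated_fixed(size):
    """Bytes consumed when a file of `size` bytes is stored in whole blocks."""
    blocks = -(-size // BLOCK_BYTES)  # ceiling division
    return blocks * BLOCK_BYTES


def slack(sizes):
    """Total bytes wasted in partially filled last blocks (the 'tails')."""
    return sum(allocated_fixed(size) - size for size in sizes)


sizes = [100, 700, 4096, 5000]  # hypothetical file sizes in bytes
print(slack(sizes))             # 10584 bytes lost to rounding
```

The smaller the files, the larger the share of each block that is slack, which is why packing tails pays off most for collections of tiny files.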

Of course, ReiserFS also has excellent performance for large files, but it’s especially adept at managing small files.

For a more in-depth discussion of ReiserFS and instructions on how to install it, see “Journaling File Systems” in the August 2000 issue, available online at http://www.linux-mag.com/2000-08/journaling_01.html.

JFS

JFS for Linux is based on IBM’s successful JFS file system for OS/2 Warp. Donated to open source in early 2000 and ported to Linux soon after, JFS is well-suited to enterprise environments. JFS uses many advanced techniques to boost performance, provide for very large file systems, and of course, journal changes to the file system. SGI’s XFS (described next) has many similar features. Some of the features of JFS include:

There are other advanced features in JFS such as allocation groups (which speed file access times by maximizing locality), and various block sizes ranging from 512 bytes to 4096 bytes (which can be tuned to avoid internal and external fragmentation). You can read about all of them at the JFS Web site at http://www-124.ibm.com/developerworks/oss/jfs.

XFS

A little more than a year ago, SGI released a version of its high-end XFS file system for Linux. Based on SGI’s Irix XFS file system technology, XFS supports meta-data journaling, and extremely large disk farms. How large? A single XFS file system can be 18,000 petabytes (a petabyte is 10^15 bytes) and a single file can be 9,000 petabytes. XFS is also capable of delivering excellent I/O performance.

In addition to truly amazing scale and speed, XFS uses many of the same techniques found in JFS.

Installing JFS

For the rest of the article, let’s look at how to install and use IBM’s JFS system. If you have the latest release of Turbolinux, Mandrake, SuSE, Red Hat, or Slackware, you can probably skip ahead to the section “Creating a JFS Partition.” If you want to include the latest JFS source code drop into your kernel, the next few sections show you what to do.

THE LATEST AND GREATEST

JFS has been incorporated into the 2.5.6 Linux kernel, and is also included in Alan Cox’s 2.4.X-ac kernels beginning with 2.4.18-pre9-ac4, which was released on February 14, 2002. Alan’s patches for the 2.4.x series are available from http://www.kernel.org. You can also download a 2.4 kernel source tree and add the JFS patches to this tree. JFS comes as a patch for several of the 2.4.x kernels, so first of all, get the latest kernel from http://www.kernel.org.

At the time of writing, the latest kernel was 2.4.18 and the latest release of JFS was 1.0.20. We’ll be using those in the instructions below. The JFS patch is available from the JFS web site. You need the utilities (jfsutils-1.0.20.tar.gz), the kernel patch (jfs-2.4.18-patch), and the file system source (jfs-2.4-1.0.20.tar.gz).

If you’re using any of the latest distros, you probably won’t have to patch the kernel for the JFS code. Instead, you’ll only need to compile the kernel to update to the latest release of JFS (you can build JFS either as built-in or as a module). (To determine what version of JFS was shipped in the distribution you’re running, you can edit the JFS file super.c and look for a printk() that has the JFS development version number string.)

PATCHING THE KERNEL TO SUPPORT JFS

In the example below, we’ll use the 2.4.18 kernel source tree to show how to patch JFS into the kernel.

First, you need to download the Linux kernel archive, linux-2.4.18.tar.gz, and save it under /usr/src. If you already have a linux subdirectory there, move it to linux-org so it won’t be replaced by the linux-2.4.18 source tree. Then expand the kernel source tree by using:

% mv linux linux-org
% tar zxvf linux-2.4.18.tar.gz

This operation will create a directory named /usr/src/linux.

The next step is to get the JFS utilities and the appropriate patch for kernel 2.4.18. Before you do that, you need to create a directory for JFS source, /usr/src/jfs1020, and download (to that directory) the JFS kernel patch and the JFS file system source files. Once you have those files, you have everything you need to patch the kernel.

Next, change to the directory of the kernel 2.4.18 source tree and apply the JFS kernel patch:

% cd /usr/src/linux
% patch -p1 < /usr/src/jfs1020/jfs-2.4.18-patch
% cp /usr/src/jfs1020/jfs-2.4-1.0.20.tar.gz .
% tar zxvf jfs-2.4-1.0.20.tar.gz

Now, you need to configure the kernel and enable JFS by going to the File systems section of the configuration menu and enabling JFS file system support (CONFIG_JFS_FS=y). You also have the option to configure JFS as a module, in which case you only need to recompile and reinstall kernel modules by typing:

% make modules && make modules_install

Otherwise, if you configured the JFS option as a kernel built-in, you need to:

1. Recompile the kernel (in /usr/src/linux). Run the command

% make dep && make clean && make bzImage

2. Recompile and install modules (only if you added other options as modules)

% make modules && make modules_install

3. Install the kernel.

# cp arch/i386/boot/bzImage /boot/jfs-bzImage
# cp System.map /boot/jfs-System.map
# ln -s /boot/jfs-System.map /boot/System.map

Next, update /etc/lilo.conf with the new kernel. Add an entry like the one that follows and a jfs1020 entry should appear at the lilo boot prompt:

image=/boot/jfs-bzImage
label=jfs1020
read-only
root=/dev/hda5  # Change to your partition

Be sure to specify the correct root partition. Then run

# lilo

to make the system aware of the new kernel. Reboot and select the jfs1020 kernel to boot from the new image.

After you compile and install the kernel, you should compile and install the JFS utilities. Save the jfsutils-1.0.20.tar.gz file into the /usr/src/jfs1020 directory, expand it, run configure, and then install the utilities.

  % tar zxvf jfsutils-1.0.20.tar.gz
  % cd jfsutils-1.0.20
  % ./configure
  % make && make install

Creating a JFS partition

Having built and installed the JFS utilities, the next step is to create a JFS partition. In this example, we’ll demonstrate the process using a spare partition.

(If there’s unpartitioned space on your disk, you can create a partition using fdisk. After you create the partition, reboot the system to make sure that the new partition is available to create a JFS file system on it. In our test system, we had /dev/hdb3 as a spare partition.)

To create the JFS file system with the log inside the JFS partition, run the following command:

# mkfs.jfs /dev/hdb3

After the file system has been created, you need to mount it, which requires a mount point. Create a new empty directory such as /jfs, then mount the file system with the following command:

# mount -t jfs /dev/hdb3 /jfs

After the file system is mounted, you are ready to try out JFS. To unmount the JFS file system, you simply use the umount command with the same mount point as the argument:

# umount /jfs

 
 


A Performance Tweak for All File Systems


 

Linux records an atime, or access time, whenever a file is read. However, access time isn’t very useful, and can be quite costly to track.

To get a quick performance boost on any kind of Linux file system, simply disable access time updates with the mount option noatime. For example, to disable access times on a JFS partition, do something like this in /etc/fstab:

/dev/hda6 /jfs jfs noatime 1 2
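If you have many entries to adjust, the edit can be scripted. The sketch below adds an option to the fourth field of a single fstab line; `add_mount_option` is a hypothetical helper, it normalizes whitespace between fields to single spaces, and it leaves comment lines and malformed lines untouched.

```python
def add_mount_option(fstab_line, option):
    """Return the fstab line with `option` added to its options field."""
    fields = fstab_line.split()
    # Skip comments and lines without the device/mountpoint/type/options fields.
    if len(fields) < 4 or fields[0].startswith("#"):
        return fstab_line
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return " ".join(fields)


print(add_mount_option("/dev/hda6 /jfs jfs defaults 1 2", "noatime"))
```

Applied to the line above, this yields an options field of defaults,noatime; applying it a second time changes nothing. Work on a copy of /etc/fstab rather than the live file.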

 

Go Faster with An External Log

An external log improves performance since the log updates are saved to a different partition than its corresponding file system.

To create the JFS file system with the log on an external device, your system will need to have 2 unused partitions. Our test system had /dev/hda6 and /dev/hdb1 as spare partitions.

# mkfs.jfs -j /dev/hdb1 /dev/hda6
mkfs.jfs version: 1.0.20 21-Jun-2002
Warning! All data on device /dev/hda6 will be lost!
Warning! All data on device /dev/hdb1 will be lost!
Continue? (Y/N) y
Format completed successfully.
10249438 kilobytes total disk space.

To mount the file system use the following mount command:

# mount -t jfs /dev/hda6 /jfs

So you don’t have to mount this file system every time you boot, you can add it to /etc/fstab. Make a backup of /etc/fstab and edit it with your favorite editor. Add the /dev/hda6 device. For example, add:

/dev/hda6 /jfs jfs defaults 1 2

Not Just for Reboots Anymore

Some people have the impression that journaling file systems only provide fast restart times. As you’ve seen, this isn’t true. Considerable coding efforts have made journaling file systems scalable, reliable, and fast.

Whether you’re running an enterprise server, a cluster supercomputer, or a small Web site, XFS, JFS, and ReiserFS add credibility and oomph to Linux. Need a better reason to switch to a journaling file system? Just imagine yourself in a world without fsck. What will you do with all that extra time?



Steve Best works in the Linux Technology Center of IBM in Austin, Texas. He is currently working on the Journaled File System (JFS) for Linux project. Steve has done extensive work in operating system development with a focus in the areas of file systems, internationalization, and security. He can be reached at sbest@us.ibm.com.

Old News ;-)

[Apr 16, 2009] Freezing filesystems and containers [LWN.net]

By Jake Edge
June 25, 2008

Freezing seems to be on the minds of some kernel hackers these days, whether it is the northern summer or southern winter that is causing it is unclear. Two recent patches posted to linux-kernel look at freezing, suspending essentially, two different pieces of the kernel: filesystems and containers. For containers, it is a step along the path to being able to migrate running processes elsewhere, whereas for filesystems it will allow backup systems to snapshot a consistent filesystem state. Other than conceptually, the patches have little to do with each other, but each is fairly small and self-contained so a combined look seemed in order.

Takashi Sato proposes taking an XFS-specific feature and moving it into the filesystem code. The patch would provide an ioctl() for suspending write access to a filesystem, freezing, along with a thawing option to resume writes. For backups that snapshot the state of a filesystem or otherwise operate directly on the block device, this can ensure that the filesystem is in a consistent state.

Essentially the patch just exports the freeze_bdev() kernel function in a user accessible way. freeze_bdev() locks a file system into a consistent state by flushing the superblock and syncing the device. The patch also adds tracking of the frozen state to the struct block_device state field. In its simplest form, freezing or thawing a filesystem would be done as follows:

    ioctl(fd, FIFREEZE, 0);

    ioctl(fd, FITHAW, 0);
Where fd is a file descriptor of the mount point and the argument is ignored.
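A backup tool might wrap the two calls so that the thaw always happens, even if the backup step fails. The Python sketch below is only an illustration: the numeric request codes are assumptions (the values of `_IOWR('X', 119, int)` and `_IOWR('X', 120, int)` on common architectures such as x86), the `frozen` helper is a hypothetical name, and the calls need root privileges and a kernel and file system that support freezing.

```python
import fcntl
import os
from contextlib import contextmanager

# Assumed ioctl request codes; check <linux/fs.h> on the target system.
FIFREEZE = 0xC0045877  # _IOWR('X', 119, int)
FITHAW = 0xC0045878    # _IOWR('X', 120, int)


@contextmanager
def frozen(mountpoint):
    """Freeze the file system mounted at `mountpoint` for the duration of a block."""
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
        try:
            yield
        finally:
            # Always thaw, even if the body of the with-block raised.
            fcntl.ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)


# Usage (as root), with a hypothetical snapshot step:
#     with frozen("/jfs"):
#         take_block_level_snapshot()
```

Keeping the thaw in a `finally` clause serves the same purpose as the timeout discussed below: a failing backup program should not leave the file system frozen.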

 

In another part of the patchset, Sato adds a timeout value as the argument to the ioctl(). For XFS compatibility—though courtesy of a patch by David Chinner, the XFS-specific ioctl() is removed—a value of 1 for the pointer argument means that the timeout is not set. A value of 0 for the argument also means there is no timeout, but any other value is treated as a pointer to a timeout value in seconds. It would seem that removing the XFS-specific ioctl() would break any applications that currently use it anyway, so keeping the compatibility of the argument value 1 is somewhat dubious.

If the timeout occurs, the filesystem will be automatically thawed. This is to protect against some kind of problem with the backup system. Another ioctl() flag, FIFREEZE_RESET_TIMEOUT, has been added so that an application can periodically reset its timeout while it is working. If it deadlocks, or otherwise fails to reset the timeout, the filesystem will be thawed. Another FIFREEZE_RESET_TIMEOUT after that occurs will return EINVAL so that the application can recognize that it has happened.

Moving on to containers, Matt Helsley posted a patch which reuses the software suspend (swsusp) infrastructure to implement freezing of all the processes in a control group (i.e. cgroup). This could be used now to checkpoint and restart tasks, but eventually could be used to migrate tasks elsewhere entirely for load balancing or other reasons. Helsley's patch set is a forward port of work originally done by Cedric Le Goater.

The first step is to make the freeze option, in the form of the TIF_FREEZE flag, available to all architectures. Once that is done, moving two functions, refrigerator() and freeze_task(), from the power management subsystem to the new kernel/freezer.c file makes freezing tasks available even to architectures that don't support power management.

As is usual for cgroups, controlling the freezing and thawing is done through the cgroup filesystem. Adding the freezer option when mounting will allow access to each container's freezer.state file. This can be read to get the current freezer state or written to change it as follows:

    # cat /containers/0/freezer.state
    RUNNING
    # echo FROZEN > /containers/0/freezer.state
    # cat /containers/0/freezer.state
    FROZEN
It should be noted that it is possible for tasks in a cgroup to be busy doing something that will not allow them to be frozen. In that case, the state would be FREEZING. Freezing can then be retried by writing FROZEN again, or canceled by writing RUNNING. Moving the offending tasks out of the cgroup will also allow the cgroup to be frozen. If the state does reach FROZEN, the cgroup can be thawed by writing RUNNING.
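The retry-or-cancel logic just described can be sketched as a short Python function. The `freeze` helper, its retry count, and its delay are assumptions for illustration; the only interface taken from the article is the freezer.state file and the FROZEN, FREEZING, and RUNNING values.

```python
import time
from pathlib import Path


def freeze(cgroup_dir, retries=5, delay=0.1):
    """Try to freeze a cgroup; cancel and return False if it never reaches FROZEN."""
    state_file = Path(cgroup_dir) / "freezer.state"
    for _ in range(retries):
        state_file.write_text("FROZEN\n")
        if state_file.read_text().strip() == "FROZEN":
            return True
        # Still FREEZING: some task could not be frozen yet, so wait and retry.
        time.sleep(delay)
    # Give up and cancel the freeze so the tasks keep running.
    state_file.write_text("RUNNING\n")
    return False


# Usage (as root), with the container path from the example above:
#     freeze("/containers/0")
```

Thawing is the reverse operation: write RUNNING to the same file.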

 

In order for swsusp and cgroups to share the refrigerator() it is necessary to ensure that frozen cgroups do not get thawed when swsusp is waking up the system after a suspend. The last patch in the set ensures that thaw_tasks() checks for a frozen cgroup before thawing, skipping over any that it finds.

There has not been much in the way of discussion about the patches on linux-kernel, but an ACK from Pavel Machek would seem to be a good sign. Some comments by Paul Menage, who developed cgroups, also indicate interest in seeing this feature merged.

Recommended Links

Guide to Linux Filesystem Mastery

Journaling File Systems Linux Magazine


    Ext3: http://www.zipworld.com.au/~akpm/linux/ext3

    JFS for Linux: http://oss.software.ibm.com/jfs

    ReiserFS: http://www.namesys.com

    Linux XFS: http://oss.sgi.com/projects/xfs

    Extended Attributes & Access Controls Lists: http://acl.bestbits.at



Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author’s free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.


Last modified: August 14, 2009