Tuesday, February 20, 2007

ext3 tuning

The ext3 filesystem is a great workhorse. Lots of tools support it, lots of distros know how to read it, and it's pretty much the "safe" choice for almost all workloads. Still, there are things the default ext3 setup doesn't do as well as it should, so most installations need a little TLC.

For the most part, you should plan on shutting down a system before tuning it (after making backups!). Tuning doesn't take too long and is a lot simpler to do if the system is offline.

First off, you should check out your existing filesystem settings with:

# tune2fs -l /dev/hdXY
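
The full listing is fairly long. As a rough sketch, the handful of fields that matter for the steps in this post can be filtered out with a small helper. The function name ext3_summary is made up for illustration, and the field labels are assumed to match what your version of e2fsprogs prints.

```shell
# Hypothetical helper: reads `tune2fs -l` output on stdin and keeps
# only the lines relevant to the tuning steps in this post.
# The field labels are assumptions based on typical e2fsprogs output.
ext3_summary() {
    grep -E '^(Filesystem features|Default mount options|Journal inode|Block size):'
}

# Usage (as root, substituting your own device):
#   tune2fs -l /dev/hdXY | ext3_summary
```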

1) Directory indexing - This helps ext3 deal with directories that hold lots of files. (See the Gentoo Forums link below for an explanation of why.)

# tune2fs -O dir_index /dev/hdXY
# e2fsck -D /dev/hdXY

The first command enables directory indexing for all new directories; the second re-indexes all existing directories.
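
Before changing anything, it may be worth checking whether dir_index is already turned on. A minimal sketch, assuming the feature list appears on a "Filesystem features:" line as space-separated words; the helper name has_feature is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) if the named feature
# appears on the "Filesystem features:" line of `tune2fs -l` output
# read from stdin.
has_feature() {
    grep '^Filesystem features:' | tr -s ' ' '\n' | grep -qx "$1"
}

# Usage (as root, substituting your own device):
#   if tune2fs -l /dev/hdXY | has_feature dir_index; then
#       echo "dir_index already enabled"
#   fi
```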

2) Journal mode

# tune2fs -O has_journal -o journal_data /dev/hdXY

I prefer full data journaling mode, where file data as well as metadata goes through the journal. The "-O has_journal" should be unnecessary (all ext3 filesystems have journals, after all) but it makes sure a journal gets created if you accidentally run this on an ext2 filesystem.
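
To confirm which journaling mode ended up as the default, the "Default mount options:" line of the tune2fs listing can be inspected. The sketch below is only an illustration: the helper name default_data_mode is hypothetical, and it assumes the three option names that tune2fs accepts with -o (journal_data, journal_data_ordered, journal_data_writeback).

```shell
# Hypothetical helper: reads `tune2fs -l` output on stdin and prints
# the default journaling mode recorded in the superblock, or "unset"
# if none is recorded (the kernel's own default then applies).
default_data_mode() {
    opts=$(grep '^Default mount options:' | cut -d: -f2)
    case " $opts " in
        *" journal_data "*)           echo journal ;;
        *" journal_data_ordered "*)   echo ordered ;;
        *" journal_data_writeback "*) echo writeback ;;
        *)                            echo unset ;;
    esac
}

# Usage (as root, substituting your own device):
#   tune2fs -l /dev/hdXY | default_data_mode
```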

3) Journal size

This requires poking around a bit to find out what your current journal size is. First, you need to find the inode of the journal.

# tune2fs -l /dev/hdXY | grep -i "journal inode"
Journal inode: 8
# /sbin/debugfs /dev/hdXY
debugfs 1.39 (29-May-2006)
debugfs: stat <8>
Inode: 8 Type: regular Mode: 0600 Flags: 0x0 Generation: 0
User: 0 Group: 0 Size: 134217728
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 262416
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x4658e77b -- Sat May 26 22:05:47 2007
atime: 0x00000000 -- Wed Dec 31 19:00:00 1969
mtime: 0x4658e77b -- Sat May 26 22:05:47 2007


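In this particular case, for a 12GB partition, the journal size is 128MB: the "Size:" field is in bytes, and 134217728 bytes is exactly 128MB. (The "Blockcount:" field is counted in 512-byte units, not 4096-byte filesystem blocks, so 262416 of them comes to slightly more than 128MB, presumably because it includes the journal file's indirect blocks.) On my 64GB partition, the journal size is also only 128MB.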
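
The arithmetic can be checked with plain shell integer math. This is just a sketch using the two numbers from the debugfs output above; it assumes "Blockcount:" is reported in 512-byte units, which is the usual convention for inode block counts.

```shell
# Values copied from the debugfs "stat <8>" output above.
size_bytes=134217728
blockcount=262416

# "Size:" is in bytes, so divide down to megabytes.
echo "journal size: $((size_bytes / 1024 / 1024)) MB"

# "Blockcount:" is assumed to be in 512-byte units.
echo "allocated: $((blockcount * 512)) bytes"
```

The same stat output can also be fetched without the interactive prompt by passing the request on the command line, e.g. debugfs -R "stat <8>" /dev/hdXY.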

So, do we want to muck with the journal size? Well, maybe... Doubling the size is probably okay, maybe even making it 4x larger. Beyond that, I think you'd want to tread carefully.

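# tune2fs -O ^has_journal /dev/hdXY
# tune2fs -J size=$SIZE /dev/hdXY

$SIZE is given in megabytes, so to double the 128MB journal I'd use "size=256". tune2fs won't add a journal to a filesystem that already has one, so the first command removes the existing journal (the filesystem must be unmounted and clean) and the second creates the new, larger one.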
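
As a dry-run sketch, the new size and the commands can be printed out for review before anything is run. The helper name plan_journal_resize is hypothetical. It also assumes the existing journal has to be removed (on an unmounted, clean filesystem) before a larger one can be created, since tune2fs refuses to add a journal where one already exists.

```shell
# Hypothetical dry-run helper: prints, but does not run, the commands
# for replacing a journal with a larger one.
# Arguments: device, current journal size in MB, growth factor.
plan_journal_resize() {
    dev=$1
    current_mb=$2
    factor=$3
    new_mb=$((current_mb * factor))
    # Assumption: the old journal must be dropped before a new one is made.
    echo "tune2fs -O ^has_journal $dev"
    echo "tune2fs -J size=$new_mb $dev"
}

# Usage: doubling a 128MB journal
#   plan_journal_resize /dev/hdXY 128 2
```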

Source links:
Whitepaper: Red Hat's New Journaling File System: ext3 (Red Hat, 2001)
EXT3 Filesystem tuning (Christoph C. Cemper, 2005)
Tuning ext3 for large disk arrays (LKML, Peter Chubb, 2005)
Some ext3 Filesystem Tips (Gentoo Forums, Peter Gordon, 2005)
Performance Tuning Guidelines for Large Deployments (Zimbra, 2007)
Tuning Journaling File Systems (Linux Magazine, 2007, registration required)