Arch was never a mistake. Arch Linux had already taught me so much about Linux (and will continue to do so on my other desktop). But Arch definitely requires more time and attention than I would like to spend on a server. Ideally I’d prefer to be able to forget about the server for a while until a reminder email says “um … there’s a couple updates you should look at, buddy.”
Space isn’t free – and neither is space
The opportunity to migrate to Ubuntu arose because I had run out of SATA ports – the ports that connect hard drives to the rest of the computer – and that 7TB RAID array uses a lot of them! I had even given away my very old 200GB hard disk as it took up one of those ports (warning the recipient that the disk’s SMART monitoring indicated it was unreliable). As a temporary workaround to the lack of SATA ports, I had even migrated the server’s OS to a set of four USB sticks in an md RAID1. Crazy, I know – and I wasn’t too happy about the speed. I decided to go out and buy a new, reliable hard drive and a SATA expansion card to go with it.
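For the curious, a four-way md RAID1 across USB sticks looks something like the sketch below. The device names (/dev/sd[b-e]1) and array name are assumptions – check your own with lsblk before running anything like this:

```shell
# Build a 4-device md RAID1 mirror (device names are assumptions!)
mdadm --create /dev/md0 --level=1 --raid-devices=4 \
    /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

# Record the array so it assembles automatically at boot (Debian/Ubuntu path)
mdadm --detail --scan >> /etc/mdadm/mdadm.conf

# Watch the initial sync progress
cat /proc/mdstat
```

With four mirrors of the same data, any single stick can fail (or fall out of its port) without taking the OS down – at the cost of USB-stick write speeds.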
The server’s primary Arch partition was using about 7GB of disk. A big chunk of that was a swap file, cached data, and other miscellaneous or unnecessary files. Overall, the actual size of the OS, including the /home folder, was only about 2GB. This prompted me to look into a super-fast SSD, thinking perhaps a smaller one might not be so expensive. It turned out that the cheapest non-SSD drive I could find actually cost more than one of these relatively small SSDs. Yay for me. 🙂
In choosing the OS, I’d already decided it wouldn’t be Arch. Out of all the other popular distributions, I’m most familiar with Ubuntu and CentOS. Fedora was also a possibility – but I hadn’t yet seriously considered it for a server. Ubuntu won the round.
I was new to using SSDs in Linux, though I was well aware of the pitfalls of not using them correctly – chiefly the risk of poor longevity if misused.
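One of the standard SSD precautions is making sure unused blocks actually get discarded (TRIM), so the drive can recycle them cheaply instead of shuffling stale data around. A minimal manual version, assuming the SSD holds the root filesystem:

```shell
# Discard unused blocks on the root filesystem so the SSD's garbage
# collection has free blocks to work with. Requires root and a
# TRIM-capable drive/filesystem.
fstrim -v /
```

This can be run from a periodic cron job; mounting with the `discard` option is the continuous alternative, though opinions differ on which approach is kinder to the drive.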
I didn’t want to use a dedicated swap partition. I plan on upgrading the server’s motherboard/CPU/memory in the not-too-distant future, so I decided to put swap into a swap file on the existing md RAID instead. The swap won’t be particularly fast, but its only purpose is to cover that rare occasion when something’s gone wrong and the memory isn’t available.
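Setting up a swap file is only a handful of commands. The path and size below are illustrative, not what I necessarily used:

```shell
# Create a 2GB swap file on the RAID-backed filesystem.
# dd is used rather than fallocate, which is unsafe for swap on some filesystems.
dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile      # swap files must not be world-readable
mkswap /swapfile         # write the swap signature
swapon /swapfile         # enable it immediately

# Make it permanent across reboots
echo '/swapfile none swap sw 0 0' >> /etc/fstab
```

A nice side benefit over a swap partition: if the sizing turns out wrong, a swap file can be recreated at a different size without repartitioning.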
That left me to give the root path the full 60GB of an Intel 330 SSD. I considered separating /home, but it seemed a little pointless given how little space was used in the past. I first set up the partition with LVM – something I’ve recently been doing whenever I set up a Linux box (really, there’s no excuse not to use LVM). When I got to the part where I would configure the filesystem, I clicked the drop-down and instinctively selected ext4. Then I noticed btrfs in the same list. Hang on!!
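The installer handles this for you, but by hand the LVM layout amounts to something like the following. The partition, volume-group and logical-volume names (sda2, vg0, root) are my own placeholders:

```shell
# Mark the SSD's partition as an LVM physical volume (device is an assumption)
pvcreate /dev/sda2

# Create a volume group on it, then one logical volume spanning all free space
vgcreate vg0 /dev/sda2
lvcreate -l 100%FREE -n root vg0

# Format the logical volume – btrfs in my case, ext4 had I not spotted it
mkfs.btrfs /dev/vg0/root
```

The payoff for the extra layer: logical volumes can later be resized, snapshotted at the block level, or moved to a new disk with pvmove, none of which a plain partition allows.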
But a what?
Btrfs (“butter-eff-ess”, “better-eff-ess”, “bee-tree-eff-ess”, or whatever you fancy on the day) is a relatively new filesystem developed to bring Linux’s filesystem capabilities back on track with current filesystem tech. The existing King-of-the-Hill filesystem, ext (currently at version 4, hence “ext4”), is pretty good – but it is limited, stuck in an old paradigm (think of a brand-new F22 Raptor vs. an F4 Phantom with a half-jested attempt at an equivalency upgrade), and unlikely to be able to compete for very long with newer enterprise filesystems such as Oracle’s ZFS. Btrfs still has a long way to go and is still considered experimental (depending on who you ask and which features you need). Many consider it stable for basic use – but nobody is going to make any guarantees. And, of course, everyone says to make and test backups!
The most fundamental difference between ext and btrfs is that btrfs is a “CoW” or “Copy on Write” filesystem: data is never deliberately overwritten by the filesystem’s internals. If you write a change to a file, btrfs writes your changes to a new location on the physical media and updates the internal pointers to refer to the new location. Btrfs goes a step further in that those internal pointers (referred to as metadata) are also CoW. Older versions of ext would simply have overwritten the data in place. Ext4 uses a journal to ensure that corruption won’t occur should the AC plug be yanked out at the most inopportune moment – which results in a similar number of steps being required to update data.

With an SSD, the underlying hardware performs a similar CoW process no matter what filesystem you’re using. This is because SSDs cannot actually overwrite data – they have to copy the data (with your changes) to a new location and then erase the old block entirely. As an optimisation, an SSD might not even erase the old block immediately, but simply make a note to erase it at a later time when things aren’t so busy. The end result is that SSDs fit very well with a CoW filesystem and don’t perform as well with non-CoW filesystems.
To make matters interesting, CoW in the filesystem goes hand in hand with a feature called deduplication. This allows two (or more) identical blocks of data to be stored using only a single copy, saving space. With CoW, if a deduplicated file is modified, its twin won’t be affected, as the modified file’s data will have been written to a different physical block.
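You can see this block-sharing behaviour directly with a “reflink” copy on a btrfs mount. The filenames here are purely illustrative:

```shell
# On btrfs, a reflink copy is instant and shares all data blocks
# with the original (fails on filesystems without CoW support):
cp --reflink=always original.img clone.img

# Modifying the clone triggers CoW: only the changed blocks get
# new storage, and original.img is left untouched.
echo "some change" >> clone.img
```

The two files diverge only where one of them is modified – the rest of their data remains a single shared copy on disk.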
CoW in turn makes snapshotting relatively easy to implement. When a snapshot is made, the system merely records the new snapshot as a duplicate of all the data and metadata within the volume. With CoW, when changes are made, the snapshot’s data stays intact, and a consistent view of the filesystem as it was at the time the snapshot was taken can be maintained.
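Because no data is copied up front, taking a btrfs snapshot is nearly instantaneous regardless of volume size. A sketch, with example paths of my own choosing:

```shell
# Take a read-only snapshot of the root subvolume into a snapshots
# directory (both paths are illustrative):
btrfs subvolume snapshot -r / /snapshots/root-before-upgrade

# Snapshots appear as ordinary subvolumes
btrfs subvolume list /
```

Since the snapshot shares blocks with the live filesystem, it only starts consuming extra space as the live data diverges from it – handy insurance immediately before a risky upgrade.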
A new friend
With the above in mind, especially as Ubuntu has made btrfs available as an install-time option, I figured it would be a good time to dive into btrfs and explore a little. 🙂
Part 2 coming soon …