Hands Up! This is a RAID!

And who wouldn't put their hands up for RAID? The Redundant Array of Independent Disks can improve your disk performance, or provide you with data integrity, or both.

Windows Media Center provides an excellent home for your digital entertainment -- and like any excellent home you want to pack more and more into it. Take a look at your physical collections -- photos, albums, CDs, DVDs, video tapes, etc. If you are into home entertainment then you've probably collected plenty of good stuff. The same thing will happen to your digital collection. Before long you will outgrow a single drive, then two. Around the time you get your third drive, and file searches require three separate operations, you'll probably start looking around for a better storage system.

raid5-1

Adding a RAID Controller card and an array of 3 or more hard drives will combine those separate drives (minus a bit of space for safe data storage) into a single virtual drive that is easier to manage and provides protection against a single disk failure. It may improve your disk performance, depending on how you configure it.

 

In a Nutshell:

A RAID system can be software or hardware -- the software system is generally not favoured as a system crash can lose data. Hardware is the preferred solution and it typically consists of a card that fits into your motherboard, a software driver to integrate the card and its drives into your system, and 3 or more hard drives (the minimum for a parity-based redundant system such as RAID 5). In most cases the drives need to be at least the same size, and sometimes the same brand and model as well [1].

RAID reformats your 3 drives into one large virtual drive and reserves some space (usually hidden) for the information that provides the redundancy. According to Wikipedia, the term RAID was first defined by three guys working at the University of California, Berkeley in 1987. In the early years the technology was used mainly in corporate and high-end systems. Today, fortunately, RAID technology has become affordable for most home users.

There are several standard levels of RAID, creatively named RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, and RAID 6. Each level offers a different combination of benefits. RAID technology relies on three primary methods: "striping", whereby a block of data is split into chunks (stripes) that are written to, and read back from, all the drives in the array simultaneously, providing a significant performance improvement; "mirroring", which writes the same data to two or more drives at the same time; and "parity", which we'll look at later.

In the diagram below, data is broken into blocks and in this example, three blocks are written to 3 disks at the same time providing a significant performance improvement, but no redundancy.

raid5-2
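To make the striping idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration: the function names `stripe` and `unstripe`, the 2-byte chunk size, and the use of plain lists to stand in for disks are all assumptions made for the example, not how any real controller is implemented.

```python
def stripe(data, num_disks, chunk_size):
    """Deal fixed-size chunks of data across the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        disks[index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks):
    """Read the chunks back in the same round-robin order they were written."""
    chunks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


# Hypothetical example: 12 bytes split into 2-byte chunks over 3 "disks".
disks = stripe(b"ABCDEFGHIJKL", num_disks=3, chunk_size=2)
for number, disk in enumerate(disks):
    print(f"disk {number}: {disk}")

restored = unstripe(disks)
print(restored)
```

Each disk ends up holding only a third of the data, which is where the speed comes from, and also why losing any one disk in a pure striped array loses the lot.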

 

Mirroring is similar, but this method writes the same block to (in this case) 3 drives. Lots of redundancy, but no improvement in write performance.

raid5-3
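Mirroring is simple enough to sketch in a few lines. The `Mirror` class below is purely hypothetical (dictionaries stand in for disks, and `None` marks a failed one); it just shows that every write goes to every drive, so a read can be served by any drive that is still alive.

```python
class Mirror:
    """Toy model of a mirrored array: every disk holds a full copy."""

    def __init__(self, num_disks):
        self.disks = [dict() for _ in range(num_disks)]

    def write(self, block_no, data):
        # The same block is written to every working disk.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def fail(self, index):
        # Simulate a dead drive.
        self.disks[index] = None

    def read(self, block_no):
        # Any surviving disk can answer the read.
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("all mirrors have failed")


mirror = Mirror(3)
mirror.write(0, b"holiday.jpg")
mirror.fail(0)
mirror.fail(1)
survivor_copy = mirror.read(0)
print(survivor_copy)
```

The price is capacity: three drives in this arrangement store only one drive's worth of data.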

 

The third method provides redundancy by creating parity data in the hidden space on the disks. That space gets created when the drives are initialized to form the single array, and it's that space that prevents 100% utilization of the combined disk space.

There are a couple of ways to store the parity information - one way is to write it all to a single disk. This is RAID 3.

 

The more popular approach today is to distribute the parity information across all available drives. This is RAID 5.

raid5-4
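The difference between a dedicated parity disk and distributed parity is easiest to see as a layout. The sketch below prints one possible RAID 5-style arrangement in which the parity block (P) moves to a different disk on each stripe. The function name `raid5_layout` and the particular rotation order are assumptions for illustration; real controllers use a variety of rotation schemes.

```python
def raid5_layout(num_disks, num_stripes):
    """Return a grid of labels: D<n> for data blocks, P<n> for parity blocks."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        # Rotate the parity position one disk to the left on each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


layout = raid5_layout(num_disks=3, num_stripes=3)
for row in layout:
    print("  ".join(row))
```

With a dedicated parity disk, every P block would sit in the same column, so that one disk would be involved in every write. Spreading the parity around shares that load across all the drives.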

 

The lower RAID levels (RAID 0 and RAID 1) need as few as two disks and focus on striping and mirroring, while the higher RAID levels focus on parity and generally require more disks.

Wikipedia has an excellent page on the subject (http://en.wikipedia.org/wiki/RAID) but don't be embarrassed if you get lost around the second or third screen -- the "parity" part of RAID is rocket science in my estimation.

 

Management:

In addition to the official RAID levels, there are also more flavors of RAID, many of them proprietary (RAID hardware and software created by a company for use with its own gear). Any RAID solution, be it open source or proprietary, will have a management screen that allows you to change the disk configuration, build and resize the virtual drive, and adjust operating variables such as block size and stripe size to tweak performance. It should also let you manage alarms and reports on any faults that may occur, and manage the disk re-build and re-initialisation process should that become necessary.

Here's the management screen I'm using at the moment from LSI:

raid5-6

 

... admittedly a bit dense, but it's got a lot of features. To be honest, I haven't looked at mine since the day I installed it a couple of years ago.

 

UPDATE: After a disk reconfiguration I temporarily lost my RAID 5 array and it's been up and down ever since. I've been in contact with HighPoint support and I'll let you know how we get on in a subsequent post. -dh

 

RAID 5 seems to be the favorite just now -- it doesn't provide much in the way of performance enhancements, but it does consolidate all of your physical drives into a single virtual drive, and it protects your data. You can lose a whole disk and your system will continue to operate normally with perhaps a small loss of performance -- when you replace the bad disk, RAID will rebuild it for you. It's a great feeling to know your data is safe.

raid5-7

Another tip on data protection - keep your operating system separate. In a one-disk environment, this means partitioning your drive into two virtual drives. The "C:\" portion holds Windows and your programs -- I've allocated 100 GB for this drive and that's always been enough. The "D:\" portion holds your data. In this way, even if you lose your Windows installation in a system crash, your data partition will remain unaffected (partitioning won't save you from a physical failure of the disk itself, of course). It's great for maintenance too -- sometimes it's easier just to wipe your C:\ drive and start over from scratch, formatting C:\ and reloading Windows, than to try to locate a fault or the source of a system slow-down. Be sure to keep a copy of your favourites, cookies, etc. (personal preferences normally held on the C:\ drive) in a backup folder on D:\.

Currently I use a separate physical drive for this purpose. While all the drives in my RAID array are SATA, I use the 40-pin ATA (commonly known as IDE) port on my motherboard to connect an old 120 Gig Seagate just to hold the Windows operating system.

 

Parity:

This is where the rocket science kicks in -- I've got 8 x 1 TB drives on my system, and they give me 6.63 TB of usable space, which means that about 1.37 TB goes to parity information and overhead. Yet under this system I can lose any single 1 TB disk, meaning that somehow 6.63 TB of data is protected by no more than 1.37 TB of disk! How can that be?

The short answer is "Parity" and, sadly, much of the available information on how it works is pretty glib. At the low end you get explanations like:

"Parity can be added to protect the striped data. Parity data is calculated for the stripes and placed on another disk drive."

But at the high end, explanations start to look like this:

The calculation process is also displayed below:

1111 XOR 1110 XOR 1100 XOR 1000
((1111 XOR 1110) XOR 1100) XOR 1000
(0001 XOR 1100) XOR 1000
1101 XOR 1000
= 0101

But fear not, I found an explanation that, while simple, gives a picture of how parity works.

Let's say you have three drives - A, B, and C. In a RAID 5-type environment you would write data to two of them, A and B, and write the parity information to C. The information on A is "5", the information on B is "12", and the calculated parity information is "7" (12 - 5). If you lose the information on B ("12"), you can recalculate it by adding A and C (5 + 7 = 12).

So now, just imagine that the data written to A and B is much longer and more complex than simple integers, and imagine that the adding and subtracting calculations are more like the "XOR" operation above and you can start to get a feel for what must be going on.
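For anyone who wants to poke at it, here is a small Python sketch of the XOR version. It reproduces the four-value calculation quoted above, then "loses" one of the values and rebuilds it from the survivors plus the parity. The helper name `parity_of` is made up for this example, and the last few lines redo the A, B, C example with XOR instead of subtraction, so the parity value comes out as 9 rather than 7.

```python
from functools import reduce
from operator import xor


def parity_of(blocks):
    """XOR all the blocks together to get the parity block."""
    return reduce(xor, blocks)


# The four 4-bit values from the calculation quoted earlier.
blocks = [0b1111, 0b1110, 0b1100, 0b1000]
parity = parity_of(blocks)
parity_bits = format(parity, "04b")
print(parity_bits)

# Pretend the third block is lost: XOR the survivors with the parity.
survivors = blocks[:2] + blocks[3:]
rebuilt = parity_of(survivors + [parity])
print(format(rebuilt, "04b"))

# The three-drive example, using XOR in place of subtraction and addition.
a, b = 5, 12
c = a ^ b            # parity written to drive C
recovered_b = a ^ c  # drive B rebuilt from A and C
print(c, recovered_b)
```

The trick is that XOR undoes itself: XOR-ing the parity with everything that survived leaves exactly the piece that went missing, no matter which drive it was on.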

 

The Down Side:

There is a cost for these improvements of course - you will need a RAID card ($300 - $400 in New Zealand). You will also need a minimum of 3 hard drives of identical capacity to form a RAID 5 array.

And a word of caution here. My first RAID setup (RocketRaid) allowed me to connect drives to my system even though they contained data, and it would gradually build the new disk array and migrate my data into it. The second system (an LSI card) began to re-format my disks the moment I attached them. Fortunately the 3 disks didn't contain valuable information. If they had, that information would be gone.

The safest course of action is to start with blank disks. Any existing data on your system will need to be copied to some other temporary repository (borrowing your friends' external drives for a few days, perhaps), and depending on the level of RAID you choose, you will lose a portion of that disk space to provide the redundancy. From my 8 x 1 TB disks I get 6.63 TB of storage space, but the payoff is large in easier disk management and peace of mind.
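As a rough sanity check on figures like these, the usual rule of thumb is that a RAID 5 array gives up the equivalent of one drive to parity, so n drives yield (n - 1) drives' worth of space. The sketch below applies that rule to 8 x 1 TB drives. The function name and constants are made up for illustration, and the figure a real controller or Windows reports may differ because of formatting overhead and because a terabyte can be counted as 10^12 bytes or as 2^40 bytes.

```python
TB = 10**12   # decimal terabyte, as printed on the drive's box
TIB = 2**40   # binary terabyte, as many operating systems count it


def raid5_usable_bytes(num_disks, disk_bytes):
    """Rule of thumb: RAID 5 keeps (n - 1) disks' worth of space for data."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (num_disks - 1) * disk_bytes


usable = raid5_usable_bytes(num_disks=8, disk_bytes=1 * TB)
usable_tb = usable / TB
usable_tib = usable / TIB

print(f"usable: {usable_tb:.2f} TB (decimal)")
print(f"usable: {usable_tib:.2f} TiB (binary)")
```

Run the same function with your own drive count and size before you buy, so the "missing" space doesn't come as a surprise.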

A second word of caution - ensure that you have sufficient power from your power supply to handle the additional hardware. Along with the added hardware is more heat so also make sure you have sufficient cooling. I went overboard with 5 fans, but as long as you keep your system from over-heating you should be fine - fans are cheap and pretty easy to install.

 

Conclusion:

Perhaps more than any other type of package, Windows Media Center (and Media Center-type packages) encourages users to load their systems with the largest types of files held on a home computer. A typical movie in AVI format will be 800 MB, and a movie held in full DVD format (.VOB files) can be 4 to 5 Gigs, so, sadly, Terabytes are becoming the new Megabytes.

Upgrading to a RAID storage solution was a very good move for me and I'd recommend it to anyone who knows how to add a disk to their system.

 

References:

My thanks to the good folks on the Internet that have chosen to share their knowledge with us.

 

http://www.gigaserver.nl/index.php?cPath=178_203

http://www.iomega.com/europe/support/francais/manuals/nas800_fr/trblshoot_raid.html

http://books.google.com/ - search for "Raid parity for dummies"

http://www.scottklarr.com/topic/23/how-raid-5-really-works/

http://riceball.com/d/content/raid-5-parity-what-it-and-how-does-it-work

http://compreviews.about.com/od/storage/l/aaRAIDPage1.htm

http://www.zdnet.com/blog/ou/raid-storage-explained/502

http://www.dataclinic.co.uk/raid-parity-xor.htm

 

 

Footnotes

[1] Drobo is a system that can build an array out of dissimilar drives, but it's quite expensive.

 

 
