RAID storage basics guide for beginners

Perspectives on redundant and performance storage solutions for beginners

This is meant to be a quick read. These concepts can be very dry at first for someone new to data storage solutions. This article is a summary that I created from what I have learned about RAID storage. It also includes strategy and direction.





I wrote this basic guide for beginners. It simplifies what I have learned about RAID storage through my own work and research. Keep in mind that a storage solution is always best when it is optimized for the needs of its specific application, so there is no single solution that will work in every instance.

Skip past the quick reference and read the guide below first. Come back up to the reference when you need clarification of any terms or concepts.

This is all my opinion and I do not take any responsibility for what you do or do not do with this information. I'm hoping that it could be a good place for someone to get started in their own research. For simplification, the illustration used in this guide treats RAID 10 and RAID 01 as identical. The illustration is actually of RAID 01. Please feel free to ask questions.


Quick reference


Terms:
RAID - Redundant Array of Independent Disks
Array - A group of hard disks functioning together within a managed RAID setup.
Level - This refers to the type of RAID you are using.

RAID management:
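Hardware - This refers to a dedicated controller card or external device that manages the array.
Software - In this guide, this usually refers to an array that is managed by a motherboard's onboard RAID controller, which relies on drivers and the system's CPU. Strictly speaking, the term also covers arrays managed directly by the operating system.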

RAID Levels:

RAID 0 = stripe


- This RAID level is used for increased performance and does not provide any data protection. It spreads information across two or more drives in "stripes." This allows multiplication of drive speed and performance. Drives in the array all act together as if they were one single drive. The total storage size will be [size of smallest drive in the array x number of drives].
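To make the idea of "stripes" more concrete, here is a minimal Python sketch of how a logical block number could be mapped onto the drives of a stripe. The function name `locate_block` and the stripe unit of exactly one block are assumptions made for illustration. Real controllers use larger, configurable stripe units.

```python
def locate_block(logical_block, num_drives):
    """Map a logical block to (drive index, block offset on that drive).

    Assumes a stripe unit of one block, which is a simplification.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    if logical_block < 0:
        raise ValueError("block numbers start at 0")
    # Consecutive blocks rotate across the drives, so a large read or
    # write can keep every drive busy at the same time.
    return logical_block % num_drives, logical_block // num_drives


# Where the first six logical blocks land on a three-drive stripe.
for block in range(6):
    drive, offset = locate_block(block, 3)
    print(f"logical block {block} -> drive {drive}, offset {offset}")
```

Because neighboring blocks sit on different drives, losing any one drive leaves gaps throughout every large file, which is why this level offers no data protection.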




RAID 1 = mirror

- This RAID level is used to protect data from drive failure by creating an exact duplicate of the information. It usually uses two drives. The total storage size will be equal to that of the smallest drive in the array.




RAID 10 = striped mirrors (RAID 01 = mirrored stripes)

- This RAID level takes advantage of both RAID 0 and RAID 1, thus the name RAID 1+0 or RAID 10. It manages separate mirrors and stripes data across them. This RAID level requires an even number of drives: 4, 6, 8 and so on (minimum of 4 drives). One common variation is RAID 0+1 or RAID 01. It is so similar that many manufacturers simply refer to it as RAID 10. The only difference is that RAID 01 manages the data in the opposite way: it mirrors multiple striped sets. It performs the same basic function (statistically speaking, true RAID 10 is slightly more fault tolerant than RAID 01). The total storage size of this type of array will be [(size of smallest drive in the array x number of drives)/(2)].
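The fault tolerance difference between RAID 10 and RAID 01 can be checked by counting which two-drive failures each layout survives. The sketch below is only an illustration under stated assumptions: four drives numbered 0 to 3, grouped as (0, 1) and (2, 3), and a RAID 01 controller that takes a whole striped set offline as soon as one of its drives fails. All function names are hypothetical.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    # RAID 10: data is lost only when every drive in some mirror has failed.
    return not any(all(d in failed for d in pair) for pair in mirror_pairs)


def raid01_survives(failed, stripe_sets):
    # RAID 01: a striped set is unusable once any of its drives fails,
    # so the array survives only while one striped set is fully intact.
    return any(not any(d in failed for d in s) for s in stripe_sets)


def count_survivable(survives, groups, num_drives, num_failures=2):
    """Return (survivable cases, total cases) for simultaneous failures."""
    cases = list(combinations(range(num_drives), num_failures))
    ok = sum(1 for failed in cases if survives(set(failed), groups))
    return ok, len(cases)


groups = [(0, 1), (2, 3)]
print("RAID 10:", count_survivable(raid10_survives, groups, 4))  # (4, 6)
print("RAID 01:", count_survivable(raid01_survives, groups, 4))  # (2, 6)
```

Under these assumptions, RAID 10 survives four of the six possible two-drive failures and RAID 01 survives two of them. Both layouts survive any single drive failure.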




RAID 5 = redundancy achieved through distributed parity

- This RAID level uses three or more disks. It creates parity bits to prevent data loss due to drive failure. Parity bits are information that can be used to rebuild the contents of a failed drive after it has been replaced. This level stripes the parity along with the actual data across all of the disks in the array. The storage size will be [(size of smallest drive in the array x number of drives)-(size of smallest drive in the array)].
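Parity can feel abstract, so here is a minimal Python sketch of the XOR arithmetic behind it. It covers a single stripe with three data blocks and one parity block. The helper name `xor_blocks` and the sample data are made up for illustration, and the sketch does not show how a real RAID 5 array rotates the parity block from drive to drive.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed. XORing the surviving
# blocks with the parity block recovers exactly the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'five'
```

The same arithmetic works no matter which single block is missing, which is why the array only gives up the capacity of one drive for redundancy.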




RAID, data storage basics

Among the great many standard, nested and hybrid levels available, RAID 1, RAID 0, RAID 5 and RAID 10 (or RAID 01) are currently the most common levels. There is also Intel Matrix RAID, which allows you to use different RAID levels within one array, separated by partitions.

RAID 5 is the most widely used level in business because of its low cost of redundancy, but that low cost comes at a performance disadvantage, so I would not recommend a software-controlled RAID 5 array for a home system, or for any system for that matter. This is where the price advantage of RAID 5 becomes a bit cloudy. There are hardware controllers that can boost the performance of RAID 5 to somewhere between that of RAID 0 and RAID 10, but this added performance comes at a significant cost increase, which further complicates the decision.

For a performance-based machine, you should use RAID 0 for your operating system in order to take advantage of the increased throughput and decreased read/write times. You need this for any hardcore gaming system. In my experience, you currently need a minimum of 150GB for an operating system drive. This is especially true for a gaming system on which you plan to install a lot of games.

You will find that once you start installing games and software, especially if you use a large operating system such as newer versions of Windows, you will quickly have 100GB of disk usage. And you should always have extra room in order to be able to stay relatively defragmented. Full drives do not function as well as half full drives.

Two arrays are sufficient for a home system. This would allow you to take advantage of options such as external storage and redundancy while retaining a separate performance-based array for the operating system.

You could manage a system safely with one RAID 10 array for the entire system using two separate partitions. This is the best option to take advantage of both redundancy and improved performance using a single software-controlled array. In this case, in the event of a single drive failure, you would not have to reformat. You could just replace the drive and the array would rebuild. Additionally, if ever needed, you could reformat your operating system partition without damaging your file storage.

I was checking prices at a reputable retailer I purchase hardware from. At the time this article was written, a high quality 250GB drive had a $0.24/GB ratio, a high quality 640GB drive had a $0.14/GB ratio. The cheapest 80GB drive came into that equation at $0.54/GB. In the case of the 80GB drive, you would be paying about 4x the price/GB of the 640GB drive.
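The $/GB figures can be reproduced with a couple of lines of Python. This sketch uses the 250GB and 640GB drive prices listed in the links section of this guide. The function name `price_per_gb` is just an illustrative choice, and the price of the 80GB drive is not listed, so only its quoted $/GB figure is used.

```python
def price_per_gb(price_usd, size_gb):
    """Cost of one gigabyte of raw capacity, in dollars."""
    return price_usd / size_gb


# (price in USD, capacity in GB), taken from the links section.
drives = {
    "250GB drive": (59.99, 250),
    "640GB drive": (89.99, 640),
}

for name, (price, size) in drives.items():
    print(f"{name}: ${price_per_gb(price, size):.2f}/GB")

# The 80GB drive was quoted at $0.54/GB, compared with $0.14/GB.
print(f"80GB vs 640GB: {0.54 / 0.14:.1f}x the price per GB")
```

The last line works out to roughly 3.9, which is where the "about 4x" figure comes from.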

However, the largest drive is not always your best bet. When you're dealing with a RAID array, rebuild time comes into play when you consider the statistical probability of drive failure causing data loss. Larger drives take longer to rebuild, so they increase the chance that a second drive fails before the rebuild finishes and data is lost.

At some point, the risks caused by extended array rebuild times would offset the value of larger drives. The ultimate choice would depend on your budget and tolerance for data loss.
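To get a feel for how rebuild time grows with drive size, here is a rough Python estimate. It assumes the rebuild simply writes one full drive at a constant sustained rate. The 50 MB/s rate is a placeholder picked for illustration, not a measured figure, and real rebuilds also depend on the controller, the RAID level and how busy the array is.

```python
def rebuild_hours(drive_size_gb, rebuild_rate_mb_s):
    """Rough estimate: hours needed to write one full drive at a steady rate."""
    if rebuild_rate_mb_s <= 0:
        raise ValueError("rebuild rate must be positive")
    seconds = drive_size_gb * 1000 / rebuild_rate_mb_s  # 1 GB = 1000 MB
    return seconds / 3600


ASSUMED_RATE_MB_S = 50  # hypothetical sustained rebuild rate

for size_gb in (250, 640, 1000):
    hours = rebuild_hours(size_gb, ASSUMED_RATE_MB_S)
    print(f"{size_gb}GB drive: about {hours:.1f} hours")
```

Whatever rate you plug in, the estimate scales in direct proportion to drive size, so a 1TB drive leaves the array exposed about four times as long as a 250GB drive.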

When considering storage costs, you should always distinguish between data that is not replaceable, such as family photos, and replaceable data, such as MP3s you've downloaded, and then weigh this against your storage budget.

Personally, I want all of my file storage protected from drive failure. RAID 0 provides no protection on its own, so I keep a second copy of my data: I run a 750GB RAID 0 stripe (3 x 250GB drives) for my operating system boot drive, and I have a separate 500GB external RAID 0 array (2 x 250GB drives) that I back up data to. I have one folder in the root directory of my operating system stripe that I back up in its entirety to that external array. Rather than using any of the backup functions of my operating system, I use a freeware program to synchronize the data in that folder between the two arrays.
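The capacity formulas from the quick reference can be collected into one small Python helper, which makes it easy to check array sizes like the ones mentioned here. The function name `usable_capacity_gb` and the level strings are illustrative choices, and the sketch only covers the four levels discussed in this guide.

```python
def usable_capacity_gb(level, drive_sizes_gb):
    """Usable size of an array, following the quick-reference formulas."""
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if level == "0":
        if n < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return smallest * n
    if level == "1":
        if n < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return smallest
    if level in ("10", "01"):
        if n < 4 or n % 2 != 0:
            raise ValueError("RAID 10/01 needs an even number of drives, 4 or more")
        return smallest * n // 2
    if level == "5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return smallest * (n - 1)
    raise ValueError(f"unsupported RAID level: {level}")


print(usable_capacity_gb("0", [250, 250, 250]))        # 750
print(usable_capacity_gb("0", [250, 250]))             # 500
print(usable_capacity_gb("10", [250, 250, 250, 250]))  # 500
print(usable_capacity_gb("5", [250, 250, 250]))        # 500
```

Note that every formula is driven by the smallest drive in the array, so mixing a 250GB drive with a 640GB drive in a mirror still yields only 250GB.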

That folder looks like this:
    
File Folder
    - Audio
    - Documents
    - Images
    - Software
    - Video

This is a main storage directory with internal sub-directories for organization, and each of those sub-directories has a similar internal directory structure.

A thought to leave you with: RAID is an amazing technology. It can provide both data protection and a performance advantage. Data is becoming more important to all of us every day in both our business and personal lives. Additionally, it is becoming more and more common for motherboards to come equipped with onboard RAID. Because of all these factors, RAID has become integral to business and performance computing. We should make use of the advantages it can provide in our daily lives.
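For readers who would rather script this kind of folder backup than use a separate program, here is a minimal one-way mirror sketch in Python. It is not how any particular synchronization tool works. The function names are hypothetical, it only copies new or changed files from the source folder to the backup folder, and it never deletes anything from the backup.

```python
import os
import shutil


def needs_copy(src, dst):
    """True if dst is missing, differs in size, or is older than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime


def mirror_folder(source, destination):
    """Copy new or changed files one way; return their relative paths."""
    copied = []
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.join(destination, rel)
        os.makedirs(target_dir, exist_ok=True)  # recreate sub-directories
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also keeps the timestamps
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied
```

A call such as `mirror_folder("C:/File Folder", "E:/File Folder")` would carry the whole directory tree across, where both paths are placeholders for your own source folder and backup array.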






Links and additional information


Additional information
Wikipedia: http://en.wikipedia.org/wiki/RAID#Standard_levels

This is a link to an interesting article testing SSD vs VelociRaptor: http://www.hothardware.com/News/OCZ_Core_Series_SSD_Vs_VelociRaptor_Sneak_Peek/


Hard disk drives
This drive offers very good value, size, speed and noise level for a home computer:
http://www.newegg.com/Product/Product.aspx?Item=N82E16822148262
(Seagate Barracuda 7200.10 ST3250410AS 250GB 7200 RPM 16MB Cache SATA 3.0Gb/s Hard Drive - OEM - $59.99 + free shipping)

Because of the exceptional $/GB value and its performance track record, this drive would also be a viable choice:
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136218
(Western Digital Caviar SE16 WD6400AAKS 640GB 7200 RPM 16MB Cache SATA 3.0Gb/s Hard Drive - OEM - $89.99 + free shipping)

New drive to keep your eye on:
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136284&Tpk=caviar%2bblack
(Western Digital Caviar Black WD1001FALS 1TB 7200 RPM 32MB Cache SATA 3.0Gb/s Hard Drive - OEM)



Hardware RAID specific articles and guides
I think that this article really helps you get an idea about external hardware based RAID storage solutions:
http://www.tomshardware.com/reviews/external-raid-storage,1922.html



I hope you take away some knowledge from this article that will help you in your daily and business life.

Thank you for reading RAID storage basics guide for beginners,
Apache0c

Comments

I have questions

I was thinking about two different RAID levels for personal desktop use.

First of all, you gave an example of two arrays on a personal computer, but both were RAID 0. Is there an option to set up one RAID 0 array for the O.S. and another array at a different RAID level, such as RAID 1? And if I want that configuration, can I use the onboard RAID capabilities, or do I need an external PCI RAID controller?

Secondly, can I have one SATA II 3 Gb/s RAID 10 array connected to the motherboard with the O.S. installed on it, and another SATA 1.5 Gb/s mirrored array used only for personal data and backups, with those HDDs connected to an expansion PCI card (a Promise PCI-X card with 8 SATA 1.5 Gb/s ports and no RAID capabilities)?

Thanks in advance, and apologies if the questions were too long.

Last edited Aug 15, 2008 1:10 AM