What is RAID and what are the different types of RAID
configurations?

Answer Posted / s.s.k.samy

These descriptions are based on the original RAID
definitions from the Berkeley paper by Patterson, Gibson
and Katz. RAID originally stood for Redundant Array of
Inexpensive Disks, but the disk vendors did not like that,
as it had cost implications. They changed it to mean
Redundant Array of Independent Disks.
Now this page has turned out to be a lot more popular than
I ever thought it would, and needs a bit more explanation,
as a lot of people are coming in from the home PC angle.
I'm from a big systems background (IBM mainframes, big Unix
servers, Windows and Netware clusters, that sort of stuff)
and that biases my opinions on RAID. If you want to put
RAID onto your home PC, then in my opinion, RAID1 is the
best way to go. It's simple, it works and it only needs two
disks. It will even perform well as a software
implementation.
If you run big storage systems with gigabytes of cache and
hundreds of physical disks, then I would definitely go for
RAID5. Why? It is cheaper because it uses fewer disks for a
given capacity and it performs just as well as RAID1. If
you have eighty 500GB disks, you can only store 20
terabytes of data on them with RAID1, but you will get 35
TB on them in a 7+1 RAID5 implementation. That's why I
claim that RAID5 is cheaper than RAID1. That holds for big
systems, but not for small systems, say less than a couple
of terabytes.
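
The capacity figures above can be checked with a short sketch. The function name and its parameters are made up for this illustration, and it assumes decimal units (1000 GB per TB) and that the disks divide into whole array groups.

```python
def usable_capacity_tb(disk_count, disk_size_gb, data_disks, redundancy_disks):
    """Usable capacity in TB when disks are grouped as data + redundancy."""
    group_size = data_disks + redundancy_disks
    groups = disk_count // group_size  # whole groups only; leftover disks are ignored
    return groups * data_disks * disk_size_gb / 1000  # decimal TB, as in the text

# RAID1 is treated as 1 data disk + 1 mirror; RAID5 7+1 as 7 data + 1 parity.
raid1_tb = usable_capacity_tb(80, 500, data_disks=1, redundancy_disks=1)
raid5_tb = usable_capacity_tb(80, 500, data_disks=7, redundancy_disks=1)
print(raid1_tb, raid5_tb)  # prints: 20.0 35.0
```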
I had an animated discussion (which is one way of
describing it) with a DBA last year who insisted that
Oracle databases had to have RAID1 or they would not
perform. We bought a DMX and ran some tests with the same
database on RAID1 and RAID5, and the RAID5 setup actually
performed better, I suspect, because it was pulling the
data off more spindles.
However, I would never touch a software implementation of
RAID5 as the write penalty will kill performance.
So there you go: PCs and small systems, RAID1; big systems,
RAID5. But at the end of the day it is your money.
RAID can be implemented by software in the host, but this
is not usually successful. It is best implemented by
microcode in the storage subsystem controller. The various
types of RAID are explained below. In the diagrams, the
square box represents the controller and the cache.
Parity is a means of adding extra data, so that if one of
the bits of data is lost, it can be recreated from the
parity. For example, suppose a four-bit value consists of
the bits 1011. The total number of '1's in the value is
odd, so we make the parity bit a 1. The value then
becomes 10111. Suppose the third bit is lost; the value
is then 10?11. We know from the last bit that there should
be an odd number of '1's in the data bits. The number of
recognisable '1's is even, so the missing bit must be a '1'.
This is a very simplistic explanation; in practice, disk
parity is calculated on blocks of data using XOR hardware
functions.
The advantage of parity is that it is possible to recover
data from errors. The disadvantage is that more storage
space is required.
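
Block-level XOR parity can be sketched in a few lines. This is only an illustration: the block contents are invented, each block is two bytes long, and the XOR is done in software rather than in controller hardware.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three made-up data blocks, one per data disk.
data = [b"\x0b\x10", b"\xa5\x5a", b"\xff\x00"]
parity = xor_blocks(data)  # what would go on the parity disk

# Pretend the disk holding data[1] has failed: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # prints: True
```

The same function does both jobs, because XORing the parity with the surviving blocks cancels them out and leaves the missing block.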
· RAID0 is simply data striped over several disks.
This gives a performance advantage, as it is possible to
read parts of a file in parallel. However, not only is there
no data protection, it is actually less reliable than a
single disk, as all the data is lost if a single disk in
the stripe fails.
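
To put a rough number on that reliability point, here is a minimal sketch. It assumes disks fail independently, and the 97% one-year survival figure is purely hypothetical, not a measured value.

```python
def raid0_survival(disk_survival, disk_count):
    """Chance that a RAID0 set keeps its data: every disk must survive."""
    return disk_survival ** disk_count

# Hypothetical figure: each disk has a 97% chance of surviving the year.
for disks in (1, 4, 8):
    print(disks, round(raid0_survival(0.97, disks), 3))
```

Under those assumptions, the survival chance drops with every disk added to the stripe.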

· RAID1 is data mirroring. Two copies of the data are
held on two physical disks, and the data is always
identical. RAID1 has a performance advantage, as reads can
come from either disk, and is simple to implement. However,
it is expensive, as twice as many disks are needed to store
the data.

· RAID2 is a theoretical entity. It stripes data at
bit level across an array of disks, then writes check bytes
to other disks in the array. The check bytes are calculated
using a Hamming code. Theoretical performance is very high,
but it would be so expensive to implement that no-one uses
it.
· RAID3 stripes a block of data over an array of
disks, then writes parity data to a dedicated parity
disk. Successful implementations usually require that all
the disks have synchronised rotation. RAID3 is very
effective for large sequential data, such as satellite
imagery and video.

In the gif above, the right hand disk is dedicated parity,
the other three disks are data disks.
· RAID4 writes data in blocks onto the data disks
(i.e. not striped), then generates parity and writes it to
a dedicated parity disk.

In the gif above, the right hand disk is dedicated parity,
the other three disks are data disks.
· RAID5 writes data in blocks onto the data disks,
and the parity is generated and rotated around the disks.
It gives good general performance and is reasonably cheap
to implement. It is used extensively for general data.

The gif below illustrates the RAID5 write overhead. If a
block of data on a RAID5 disk is updated, then all the
unchanged data blocks from the RAID stripe have to be read
back from the disks, then new parity calculated before the
new data block and new parity block can be written out.
This means that a RAID5 write operation requires 4 IOs. The
performance impact is usually masked by a large subsystem
cache.
As Nat Makarevitch pointed out, more efficient RAID-5
implementations hang on to the original data and use that
to generate the parity according to the formula new_parity
= old_data XOR new_data XOR old_parity. If the old data
block is retained in cache, and it often is, then this just
requires one extra IO to fetch the old parity. In the worst
case, two extra blocks have to be read (the old data and the
old parity), not four.
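
The formula new_parity = old_data XOR new_data XOR old_parity can be checked against a full parity recalculation. This is a sketch with invented two-byte blocks in a 3+1 stripe; the helper name xor is just for this example.

```python
def xor(a, b):
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

# A made-up 3+1 stripe: three data blocks and their parity.
stripe = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
old_parity = xor(xor(stripe[0], stripe[1]), stripe[2])

# Update the middle block using only the old data and the old parity.
old_data, new_data = stripe[1], b"\xaa\xbb"
new_parity = xor(xor(old_data, new_data), old_parity)

# Full recalculation from every data block in the stripe, for comparison.
full_parity = xor(xor(stripe[0], new_data), stripe[2])
print(new_parity == full_parity)  # prints: True
```

The short cut touches two blocks no matter how wide the stripe is, which is why it matters more on 7+1 than on 3+1.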

RAID 5 often gets a bad press, due to potential data loss
on hardware errors and poor performance on random writes.
Some database manufacturers will positively tell you to
avoid RAID5. The truth is, it depends on the
implementation. Avoid software implemented RAID5; it will
not perform. RAID5 on smaller subsystems will not perform
unless the subsystem has a large amount of cache. However,
RAID5 is fine on enterprise class subsystems like the EMC
DMX, the HDS USP or the IBM DDS devices. They all have
large, gigabyte size caches and force all write IOs to be
written to cache, thus guaranteeing performance and data
integrity.
Most manufacturers will let you have some control over the
RAID5 configuration now. You can select your stripe
size and the number of volumes in an array group.
A smaller stripe size is more efficient for a heavy random
write workload, while a larger stripe size works better for
sequential writes. A smaller number of disks in an array
will perform better, but has a bigger parity overhead.
Typical configurations are 3+1 (25% parity) and 7+1 (12.5%
parity).
· RAID6 is growing in popularity, as its double parity is
seen as the best way to guarantee data integrity. It was
originally used in SUN V2X devices, where
there are a lot of disks in a RAID array, and so a higher
chance of multiple failures. RAID6 as implemented by SUN
does not have a write overhead, as the data is always
written out to a different block.
The problem with RAID6 is that there is no standard method
of implementation; every manufacturer has their own method.
In fact there are two distinct architectures, RAID6 P+Q and
RAID6 DP.
DP, or Double Parity RAID, uses a mathematical method to
generate two independent parity bits for each block of
data, and several mathematical methods are used. P+Q
generates a horizontal P parity block, then combines those
disks into a second vertical RAID stripe and generates a Q
parity, hence P+Q. One way to visualise this is to picture
three standard four-disk RAID5 arrays stacked as rows, plus
a fourth four-disk array as a bottom row. A second set of
RAID arrays is then striped vertically, each consisting of
one disk from each of the first three arrays plus one disk
from the fourth array. The consequence is that those
sixteen disks will only contain nine disks' worth of data.
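
The sixteen-disk picture can be modelled as a 4x4 grid. This is a simplified sketch, not any vendor's implementation: each integer stands in for one disk's block, the values are invented, parity is plain XOR, and the parity sits in a fixed column and row rather than being rotated.

```python
from functools import reduce
from operator import xor

# Nine data "disks" arranged 3x3; each integer stands in for a block.
data = [[0x11, 0x22, 0x33],
        [0x44, 0x55, 0x66],
        [0x77, 0x88, 0x99]]

def with_parity(rows):
    """Append a horizontal parity column, then a vertical parity row."""
    grid = [row + [reduce(xor, row)] for row in rows]
    grid.append([reduce(xor, column) for column in zip(*grid)])
    return grid

grid = with_parity(data)
print(sum(len(row) for row in grid), sum(len(row) for row in data))  # prints: 16 9

# Lose two disks in the same row: horizontal parity alone cannot rebuild
# both, but each one can be rebuilt from its vertical stripe.
lost = [(1, 0), (1, 2)]
damaged = [row[:] for row in grid]
for r, c in lost:
    damaged[r][c] = None
for r, c in lost:
    damaged[r][c] = reduce(xor, [damaged[i][c] for i in range(4) if i != r])
print(damaged == grid)  # prints: True
```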
P+Q architectures tend to perform better than DP
architectures and are more flexible in the number of disks
that can be in each RAID array. DP architectures usually
insist that the number of disks is prime, something like
4+1, 6+1 or 10+1. This can be a problem as the physical
disks usually come in units of eight, and so do not easily
fit a prime number scheme.
· RAID7 is a registered trademark of Storage Computer
Corporation, and is basically RAID3 with an embedded
operating system in the controller to manage the data and
cache to speed up the access.
· RAID1+0 is a combination of RAID1 mirroring and
data striping. This means it has very good performance and
high reliability, so it's ideal for mission critical
database applications. All that redundancy means that it is
expensive.
· RAID50 is implemented as a RAID5 array that is then
striped in RAID0 fashion for fast access.
· RAID53 applies this 'RAID then stripe' principle to
RAID3. It should really be called RAID3+0. Both these RAID
versions are expensive to implement in hardware terms.
· RAID0+1 is implemented as a mirrored array whose
segments are RAID 0 arrays, which is not the same as
RAID10. RAID 0+1 has the same fault tolerance as RAID level
5. The data will survive the loss of a single disk, but at
this point, all you have is a striped RAID0 disk set. It
does provide high performance, with lower resilience than
RAID10.
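
The difference in resilience between RAID10 and RAID0+1 can be shown by counting which double disk failures each one survives. This sketch assumes the smallest case, four disks numbered 0 to 3 and grouped as (0, 1) and (2, 3); the layout and function names are hypothetical.

```python
from itertools import combinations

DISKS = range(4)
PAIRS = [(0, 1), (2, 3)]  # hypothetical four-disk layout

def raid10_survives(failed):
    """Each pair is a mirror: every mirror needs at least one working disk."""
    return all(any(d not in failed for d in pair) for pair in PAIRS)

def raid01_survives(failed):
    """Each pair is a RAID0 stripe set: one whole set must be intact."""
    return any(all(d not in failed for d in pair) for pair in PAIRS)

double_failures = [set(c) for c in combinations(DISKS, 2)]
raid10_ok = sum(raid10_survives(f) for f in double_failures)
raid01_ok = sum(raid01_survives(f) for f in double_failures)
print(raid10_ok, raid01_ok, len(double_failures))  # prints: 4 2 6
```

Both layouts survive any single disk failure. In this four-disk case, RAID10 survives four of the six possible double failures and RAID0+1 only two.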
· RAID-S or parity RAID is a specific implementation
of RAID5, used by EMC. It uses hardware facilities within
the disks to produce the parity information, and so does
not have the RAID5 write overhead. It used to be called
RAID-S, and is sometimes called 3+1 or 7+1 RAID.
· RAIDZ is part of the SUN ZFS file system. It is a
software based variant of RAID5 which does not use a fixed
size RAID stripe but writes out the current block of data
as a varying size RAID stripe. With standard RAID, data is
written and read in blocks and several blocks are usually
combined together to make up a RAID stripe. If you need to
update one data block, you have to read back all the other
data blocks in that stripe to calculate the new RAID
parity. RAIDZ eliminates the RAID 5 write penalty as any
read and write of existing data will just include the
current block. In a failure, data is re-created by reading
checksum bytes from the file system itself, not the
hardware, so recovery is independent of hardware failures.
The problem, of course, is that RAIDZ closely couples the
operating system and the hardware. In other words, you have
to buy them both from SUN.
