Software-RAID HOWTO
Linas Vepstas, linas@linas.org v0.54, 21 November 1998
Korean translation: 최희철, ironyjk@kldp.org
1 March 2000
RAID stands for ''Redundant Array of Inexpensive Disks'', and
is meant to be a way of creating a fast and reliable disk-drive
subsystem out of individual disks. RAID can guard against disk
failure, and can also improve performance over that of a single
disk drive.
This document is a tutorial/HOWTO/FAQ for users of
the Linux MD kernel extension, the associated tools, and their use.
The MD extension implements RAID-0 (striping), RAID-1 (mirroring),
RAID-4 and RAID-5 in software. That is, with MD, no special hardware
or disk controllers are required to get many of the benefits of RAID.
- Preface
This document is copyrighted and GPL'ed by Linas Vepstas
(linas@linas.org).
Permission to use, copy, distribute this document for any purpose is
hereby granted, provided that the author's / editor's name and
this notice appear in all copies and/or supporting documents; and
that an unmodified version of this document is made freely available.
This document is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY, either expressed or implied. While every effort
has been taken to ensure the accuracy of the information documented
herein, the author / editor / maintainer assumes NO RESPONSIBILITY
for any errors, or for any damages, direct or consequential, as a
result of the use of the information documented herein.
Translator's note: the translator accepts no responsibility for any
mental or physical harm caused by this rough and irresponsible translation.
(This is my first translation, so the document contains some mistranslations.)
This translation is covered by the GPL. Please e-mail me about
mistranslations, or about information that is wrong or needs updating.
I call this a translation, but it is very sloppy. I would like to produce
a more careful one, but there is a lot I do not know and a lot more I want to do.
RAID, although designed to improve system reliability by adding
redundancy, can also lead to a false sense of security and confidence
when used improperly. This false confidence can lead to even greater
disasters. In particular, note that RAID is designed to protect against
*disk* failures, and not against *power* failures or *operator*
mistakes. Power failures, buggy development kernels, or operator/admin
errors can lead to damaged data that is not recoverable!
RAID is *not* a substitute for proper backup of your system.
Know what you are doing, test, be knowledgeable and aware!
- Q:
What is RAID?
A:
RAID stands for "Redundant Array of Inexpensive Disks",
and is meant to be a way of creating a fast and reliable disk-drive
subsystem out of individual disks. In the PC world, "I" has come to
stand for "Independent", where marketing forces continue to
differentiate IDE and SCSI. In its original meaning, "I" meant
"Inexpensive as compared to refrigerator-sized mainframe
3380 DASD", monster drives which made nice houses look cheap,
and diamond rings look like trinkets.
- Q:
What is this document?
A:
This document is a tutorial/HOWTO/FAQ for users of the Linux MD
kernel extension, the associated tools, and their use.
The MD extension implements RAID-0 (striping), RAID-1 (mirroring),
RAID-4 and RAID-5 in software. That is, with MD, no special
hardware or disk controllers are required to get many of the
benefits of RAID.
This document is NOT an introduction to RAID;
you must find this elsewhere.
- Q:
What levels of RAID does the Linux kernel support?
A:
Striping (RAID-0) and linear concatenation are a part
of the stock 2.x series of kernels. This code is
of production quality; it is well understood and well
maintained. It is being used in some very large USENET
news servers.
RAID-1, RAID-4 & RAID-5 are a part of the 2.1.63 and greater
kernels. For earlier 2.0.x and 2.1.x kernels, patches exist
that will provide this function. Don't feel obligated to
upgrade to 2.1.63; upgrading the kernel is hard; it is *much*
easier to patch an earlier kernel. Most of the RAID user
community is running 2.0.x kernels, and that's where most
of the historic RAID development has focused. The current
snapshots should be considered near-production quality; that
is, there are no known bugs but there are some rough edges and
untested system setups. There are a large number of people
using Software RAID in a production environment.
RAID-1 hot reconstruction has been recently introduced
(August 1997) and should be considered alpha quality.
RAID-5 hot reconstruction will be alpha quality any day now.
A word of caution about the 2.1.x development kernels:
these are less than stable in a variety of ways. Some of
the newer disk controllers (e.g. the Promise Ultra's) are
supported only in the 2.1.x kernels. However, the 2.1.x
kernels have seen frequent changes in the block device driver,
in the DMA and interrupt code, in the PCI, IDE and SCSI code,
and in the disk controller drivers. The combination of
these factors, coupled to cheapo hard drives and/or
low-quality ribbon cables can lead to considerable
heartbreak. The ckraid tool, as well as
fsck and mount put considerable stress
on the RAID subsystem. This can lead to hard lockups
during boot, where even the magic Alt-SysRq key sequence
won't save the day. Use caution with the 2.1.x kernels,
and expect trouble. Or stick to the 2.0.34 kernel.
- Q:
Where can I get the kernel patches?
A:
Software RAID-0 and linear mode are a stock part of
all current Linux kernels. Patches for Software RAID-1,4,5
are available from
http://luthien.nuclecu.unam.mx/~miguel/raid.
See also the quasi-mirror
ftp://linux.kernel.org/pub/linux/daemons/raid/
for patches, tools and other goodies.
- Q:
Are there other documents about Linux RAID?
A:
- Q:
Who do I complain to about this document?
A:
Linas Vepstas slapped this thing together.
However, most of the information,
and some of the words were supplied by
Copyrights
- Copyright (C) 1994-96 Marc ZYNGIER
- Copyright (C) 1997 Gadi Oxman, Ingo Molnar, Miguel de Icaza
- Copyright (C) 1997, 1998 Linas Vepstas
- By copyright law, additional copyrights are implicitly held
by the contributors listed above.
Thanks all for being there!
- Q:
What is RAID? Why would I ever use it?
A:
RAID is a way of combining multiple disk drives into a single
entity to improve performance and/or reliability. There are
a variety of different types and implementations of RAID, each
with its own advantages and disadvantages. For example, by
putting a copy of the same data on two disks (called
disk mirroring, or RAID level 1), read performance can be
improved by reading alternately from each disk in the mirror.
On average, each disk is less busy, as it is handling only
1/2 the reads (for two disks), or 1/3 (for three disks), etc.
In addition, a mirror can improve reliability: if one disk
fails, the other disk(s) have a copy of the data. Different
ways of combining the disks into one, referred to as
RAID levels, can provide greater storage efficiency
than simple mirroring, or can alter latency (access-time)
performance, or throughput (transfer rate) performance, for
reading or writing, while still retaining redundancy that
is useful for guarding against failures.
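The read-sharing argument above can be illustrated with a small sketch. It assumes a plain round-robin policy, which is chosen here only for illustration; the MD driver's actual read-balancing policy may differ. The function name distribute_reads is hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute_reads(num_reads, num_mirrors):
    """Assign each read to the next mirror in turn (round-robin).

    Round-robin is an assumption made for this sketch, not a
    description of what the Linux MD driver does.
    """
    mirrors = cycle(range(num_mirrors))
    return Counter(next(mirrors) for _ in range(num_reads))


if __name__ == "__main__":
    for n in (2, 3):
        counts = distribute_reads(600, n)
        # Each mirror ends up with 1/n of the reads.
        print(n, "mirrors:", dict(counts))
```

With two mirrors each disk serves half of the reads, and with three mirrors a third, which is the 1/2 and 1/3 figure quoted above.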
Although RAID can protect against disk failure, it does
not protect against operator and administrator (human)
error, or against loss due to programming bugs (possibly
due to bugs in the RAID software itself). The net abounds with
tragic tales of system administrators who have bungled a RAID
installation, and have lost all of their data. RAID is not a
substitute for frequent, regularly scheduled backup.
RAID can be implemented
in hardware, in the form of special disk controllers, or in
software, as a kernel module that is layered in between the
low-level disk driver, and the file system which sits above it.
RAID hardware is always a "disk controller", that is, a device
to which one can cable up the disk drives. Usually it comes
in the form of an adapter card that will plug into a
ISA/EISA/PCI/S-Bus/MicroChannel slot. However, some RAID
controllers are in the form of a box that connects into
the cable in between the usual system disk controller, and
the disk drives. Small ones may fit into a drive bay; large
ones may be built into a storage cabinet with its own drive
bays and power supply.
The latest RAID hardware used with
the latest & fastest CPU will usually provide the best overall
performance, although at a significant price. This is because
most RAID controllers come with on-board DSP's and memory
cache that can off-load a considerable amount of processing
from the main CPU, as well as allow high transfer rates into
the large controller cache. Old RAID hardware can act as
a "de-accelerator" when used with newer CPU's: yesterday's
fancy DSP and cache can act as a bottleneck, and its
performance is often beaten by pure-software RAID and new
but otherwise plain, run-of-the-mill disk controllers.
RAID hardware can offer an advantage over pure-software
RAID, if it can make use of disk-spindle synchronization
and its knowledge of the disk-platter position with
regard to the disk head, and the desired disk-block.
However, most modern (low-cost) disk drives do not offer
this information and level of control anyway, and thus,
most RAID hardware does not take advantage of it.
RAID hardware is usually
not compatible across different brands, makes and models:
if a RAID controller fails, it must be replaced by another
controller of the same type. As of this writing (June 1998),
a broad variety of hardware controllers will operate under Linux;
however, none of them currently come with configuration
and management utilities that run under Linux.
Software-RAID is a set of kernel modules, together with
management utilities that implement RAID purely in software,
and require no extraordinary hardware. The Linux RAID subsystem
is implemented as a layer in the kernel that sits above the
low-level disk drivers (for IDE, SCSI and Paraport drives),
and the block-device interface. The filesystem, be it ext2fs,
DOS-FAT, or other, sits above the block-device interface.
Software-RAID, by its very software nature, tends to be more
flexible than a hardware solution. The downside is that it
requires more CPU cycles and power to run well
than a comparable hardware system. On the other hand, the cost
can't be beat. Software RAID has one further important
distinguishing feature: it operates on a partition-by-partition
basis, where a number of individual disk partitions are
ganged together to create a RAID partition. This is in
contrast to most hardware RAID solutions, which gang together
entire disk drives into an array. With hardware, the fact that
there is a RAID array is transparent to the operating system,
which tends to simplify management. With software, there
are far more configuration options and choices, tending to
complicate matters.
As of this writing (June 1998), the administration of RAID
under Linux is far from trivial, and is best attempted by
experienced system administrators. The theory of operation
is complex. The system tools require modification to startup
scripts. And recovery from disk failure is non-trivial,
and prone to human error. RAID is not for the novice,
and any benefits it may bring to reliability and performance
can be easily outweighed by the extra complexity. Indeed,
modern disk drives are incredibly reliable and modern
CPU's and controllers are quite powerful. You might more
easily obtain the desired reliability and performance levels
by purchasing higher-quality and/or faster hardware.
- Q:
What are RAID levels? Why so many? What distinguishes them?
A:
The different RAID levels have different performance,
redundancy, storage capacity, reliability and cost
characteristics. Most, but not all levels of RAID
offer redundancy against disk failure. Of those that
offer redundancy, RAID-1 and RAID-5 are the most popular.
RAID-1 offers better performance, while RAID-5 provides
for more efficient use of the available storage space.
However, tuning for performance is an entirely different
matter, as performance depends strongly on a large variety
of factors, from the type of application, to the sizes of
stripes, blocks, and files. The more difficult aspects of
performance tuning are deferred to a later section of this HOWTO.
The following describes the different RAID levels in the
context of the Linux software RAID implementation.
- RAID-linear
is a simple concatenation of partitions to create
a larger virtual partition. It is handy if you have a number
of small drives, and wish to create a single, large partition.
This concatenation offers no redundancy, and in fact
decreases the overall reliability: if any one disk
fails, the combined partition will fail.
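The reliability point can be made concrete with a short calculation. This is only a sketch: it assumes that disks fail independently and with the same probability, and the 5% figure is an illustrative number, not a measurement.

```python
def array_failure_probability(p_disk, num_disks):
    """Probability that at least one of num_disks disks fails.

    Assumes independent failures with identical probability p_disk;
    both are simplifying assumptions made for this sketch.
    """
    return 1.0 - (1.0 - p_disk) ** num_disks


if __name__ == "__main__":
    p = 0.05  # illustrative per-disk failure probability over some period
    for n in (1, 2, 4):
        print(n, "disk(s):", round(array_failure_probability(p, n), 4))
```

Since a concatenated (or striped) array is lost when any one member is lost, the chance of losing the array grows with every disk that is added.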
- RAID-1
is also referred to as "mirroring".
Two (or more) partitions, all of the same size, each store
an exact copy of all data, disk-block by disk-block.
Mirroring gives strong protection against disk failure:
if one disk fails, there is another with an exact copy
of the same data. Mirroring can also help improve
performance in I/O-laden systems, as read requests can
be divided up between several disks. Unfortunately,
mirroring is also the least efficient in terms of storage:
two mirrored partitions can store no more data than a
single partition.
- Striping
is the underlying concept behind all of
the other RAID levels. A stripe is a contiguous sequence
of disk blocks. A stripe may be as short as a single disk
block, or may consist of thousands. The RAID drivers
split up their component disk partitions into stripes;
the different RAID levels differ in how they organize the
stripes, and what data they put in them. The interplay
between the size of the stripes, the typical size of files
in the file system, and their location on the disk is what
determines the overall performance of the RAID subsystem.
(Translator's note: a stripe is a band. It does not sit on a
single disk; rather, the same region on each of several disks
together forms the band.)
- RAID-0
is much like RAID-linear, except that
the component partitions are divided into stripes and
then interleaved. Like RAID-linear, the result is a single
larger virtual partition. Also like RAID-linear, it offers
no redundancy, and therefore decreases overall reliability:
a single disk failure will knock out the whole thing.
RAID-0 is often claimed to improve performance over the
simpler RAID-linear. However, this may or may not be true,
depending on the characteristics of the file system, the
typical size of the file as compared to the size of the
stripe, and the type of workload. The ext2fs
file system already scatters files throughout a partition,
in an effort to minimize fragmentation. Thus, at the
simplest level, any given access may go to one of several
disks, and thus, the interleaving of stripes across multiple
disks offers no apparent additional advantage. However,
there are performance differences, and they are data,
workload, and stripe-size dependent.
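The difference between concatenation and interleaving is easiest to see as a mapping from a logical block number to a (disk, offset) pair. The sketch below is illustrative only: the function names are hypothetical, sizes are counted in blocks, and the RAID-0 version assumes equally sized component partitions. It does not reproduce the MD driver's actual code.

```python
def linear_map(block, disk_sizes):
    """Map a logical block to (disk, offset) for RAID-linear.

    Blocks fill the first partition completely before moving on
    to the next one.
    """
    for disk, size in enumerate(disk_sizes):
        if block < size:
            return disk, block
        block -= size
    raise ValueError("block beyond end of array")


def raid0_map(block, num_disks, stripe_blocks):
    """Map a logical block to (disk, offset) for RAID-0.

    Consecutive stripes of stripe_blocks blocks are dealt out to
    the disks in turn. Assumes equal-sized component partitions.
    """
    stripe, within = divmod(block, stripe_blocks)
    disk = stripe % num_disks
    offset = (stripe // num_disks) * stripe_blocks + within
    return disk, offset


if __name__ == "__main__":
    for b in range(0, 16, 4):
        print(b, linear_map(b, [8, 8]), raid0_map(b, 2, 4))
```

With two 8-block partitions and 4-block stripes, logical blocks 0 to 7 all land on disk 0 under RAID-linear, while under RAID-0 blocks 0 to 3 go to disk 0 and blocks 4 to 7 go to disk 1.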
- RAID-4
interleaves stripes like RAID-0, but
it requires an additional partition to store parity
information. The parity is used to offer redundancy:
if any one of the disks fails, the data on the remaining disks
can be used to reconstruct the data that was on the failed
disk. Given N data disks, and one parity disk, the
parity stripe is computed by taking one stripe from each
of the data disks, and XOR'ing them together. Thus,
the storage capacity of an (N+1)-disk RAID-4 array
is N, which is a lot better than mirroring (N+1) drives,
and is almost as good as a RAID-0 setup for large N.
Note that for N=1, where there is one data drive, and one
parity drive, RAID-4 is a lot like mirroring, in that
each of the two disks is a copy of each other. However,
RAID-4 does NOT offer the read-performance
of mirroring, and offers considerably degraded write
performance. In brief, this is because updating the
parity requires a read of the old parity, before the new
parity can be calculated and written out. In an
environment with lots of writes, the parity disk can become
a bottleneck, as each write must access the parity disk.
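The parity arithmetic described above can be sketched in a few lines. The helper xor_blocks and the 4-byte "stripes" are hypothetical and chosen only for illustration. The small-write update shown at the end is the usual read-modify-write identity (new parity = old parity XOR old data XOR new data), stated here as general background rather than as a description of the MD driver's code.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe from each of N = 3 data disks (toy 4-byte stripes).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)  # contents of the parity stripe

# Disk 1 fails: rebuild its stripe from the surviving data plus parity.
rebuilt = xor_blocks(data[0], data[2], parity)
assert rebuilt == data[1]

# Small write to disk 0: the new parity needs the old parity and the
# old data, which is why every write touches the parity disk.
new_block = b"ZZZZ"
new_parity = xor_blocks(parity, data[0], new_block)
assert new_parity == xor_blocks(new_block, data[1], data[2])
```

The last two lines show why the parity disk becomes a bottleneck: even a write that changes a single data disk must read and rewrite the parity stripe.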
- RAID-5
avoids the write-bottleneck of RAID-4
by alternately storing the parity stripe on each of the
drives. However, write performance is still not as good
as for mirroring, as the parity stripe must still be read
and XOR'ed before it is written. Read performance is
also not as good as it is for mirroring, as, after all,
there is only one copy of the data, not two or more.
RAID-5's principal advantage over mirroring is that it
offers redundancy and protection against single-drive
failure, while offering far more storage capacity when
used with three or more drives.
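Two ideas from the RAID-4 and RAID-5 descriptions can be sketched together: parity that rotates across the drives, and the storage efficiency of each level. The rotation order used below is a hypothetical one picked for illustration; the layout the MD driver actually uses may differ. The function names are likewise assumptions.

```python
def parity_disk(stripe_row, num_disks):
    """Disk holding the parity stripe for a given row of stripes.

    Uses a simple rotation, starting at the last disk and moving
    down by one each row (an assumed layout, for illustration only).
    """
    return (num_disks - 1 - stripe_row) % num_disks


def usable_fraction(level, num_disks):
    """Fraction of raw capacity available for data, equal-sized disks."""
    if level == "raid1":
        return 1 / num_disks                 # every disk holds a full copy
    if level in ("raid4", "raid5"):
        return (num_disks - 1) / num_disks   # one disk's worth of parity
    if level in ("raid0", "linear"):
        return 1.0                           # no redundancy at all
    raise ValueError("unknown level: " + level)


if __name__ == "__main__":
    print([parity_disk(row, 3) for row in range(6)])
    for level in ("raid1", "raid5", "raid0"):
        print(level, usable_fraction(level, 3))
```

With three drives, mirroring leaves one third of the raw space for data while RAID-5 leaves two thirds, and the parity stripe moves to a different drive on each row so that no single drive absorbs all parity writes.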
- RAID-2 and RAID-3
are seldom used anymore, and
to some degree have been made obsolete by modern disk
technology. RAID-2 is similar to RAID-4, but stores
ECC information instead of parity. Since all modern disk
drives incorporate ECC under the covers, this offers
little additional protection. RAID-2 can offer greater
data consistency if power is lost during a write; however,
battery backup and a clean shutdown can offer the same
benefits. RAID-3 is similar to RAID-4, except that it
uses the smallest possible stripe size. As a result, any
given read will involve all disks, making overlapping
I/O requests difficult/impossible. In order to avoid
delay due to rotational latency, RAID-3 requires that
all disk drive spindles be synchronized. Most modern
disk drives lack spindle-synchronization ability, or,
if capable of it, lack the needed connectors, cables,
and manufacturer documentation. Neither RAID-2 nor RAID-3
are supported by the Linux Software-RAID drivers.
- Other RAID levels
have been defined by various
researchers and vendors. Many of these represent the
layering of one type of raid on top of another. Some
require special hardware, and others are protected by
patent. There is no commonly accepted naming scheme
for these other levels. Sometimes the advantages of these
other systems are minor, or at least not apparent
until the system is highly stressed. Except for the
layering of RAID-1 over RAID-0/linear, Linux Software
RAID does not support any of the other variations.
- Q:
What is the best way to configure Software RAID?
A:
I keep rediscovering that file-system planning is one
of the more difficult Unix configuration tasks.
To answer your question, I can describe what we did.
We planned the following setup:
- two EIDE disks, 2.1 gig each.
disk  partition  mount pt.  size  device
1     1          /          300M  /dev/hda1
1     2          swap       64M   /dev/hda2
1     3          /home      800M  /dev/hda3
1     4          /var       900M  /dev/hda4
2     1          /root      300M  /dev/hdc1
2     2          swap       64M   /dev/hdc2
2     3          /home      800M  /dev/hdc3
2     4          /var       900M  /dev/hdc4
- Each disk is on a separate controller (and ribbon cable).
The theory is that a controller failure and/or
ribbon failure won't disable both disks.
Also, we might possibly get a performance boost
from parallel operations over two controllers/cables.
- Install the Linux kernel on the root (/)
partition /dev/hda1. Mark this partition as
bootable.
- /dev/hdc1 will contain a ``cold'' copy of
/dev/hda1. This is NOT a raid copy,
just a plain old copy-copy. It's there just in
case the first disk fails; we can use a rescue disk,
mark /dev/hdc1 as bootable, and use that to
keep going without having to reinstall the system.
You may even want to put /dev/hdc1 's copy
of the kernel into LILO to simplify booting in case of
failure.
The theory here is that in case of severe failure,
I can still boot the system without worrying about
raid superblock-corruption or other raid failure modes
& gotchas that I don't understand.
- /dev/hda3 and /dev/hdc3 will be mirrored
as /dev/md0.
- /dev/hda4 and /dev/hdc4 will be mirrored
as /dev/md1.
- We picked /var and /home to be mirrored,
and in separate partitions, using the following logic:
/ (the root partition) will contain
relatively static, non-changing data:
for all practical purposes, it will be
read-only without actually being marked &
mounted read-only.
/home will contain ''slowly'' changing
data.
/var will contain rapidly changing data,
including mail spools, database contents and
web server logs.
The idea behind using multiple, distinct partitions is
that if, for some bizarre reason,
whether it is human error, power loss, or an operating
system gone wild, corruption is limited to one partition.
In one typical case, power is lost while the
system is writing to disk. This will almost certainly
lead to a corrupted filesystem, which will be repaired
by fsck during the next boot. Although
fsck does its best to make the repairs
without creating additional damage during those repairs,
it can be comforting to know that any such damage has been
limited to one partition. In another typical case,
the sysadmin makes a mistake during rescue operations,
leading to erased or destroyed data. Partitions can
help limit the repercussions of the operator's errors.
- Other reasonable choices for partitions might be
/usr or /opt . In fact, /opt
and /home make great choices for RAID-5
partitions, if we had more disks. A word of caution:
DO NOT put /usr in a RAID-5
partition. If a serious fault occurs, you may find
that you cannot mount /usr , and that
you want some of the tools on it (e.g. the networking
tools, or the compiler.) With RAID-1, if a fault has
occurred, and you can't get RAID to work, you can at
least mount one of the two mirrors. You can't do this
with any of the other RAID levels (RAID-5, striping, or
linear append).
So, to complete the answer to your question:
- Install the OS on disk 1, partition 1.
Do NOT mount any of the other partitions.
- Install RAID per instructions.
- Configure md0 and md1.
- Convince yourself that you know
what to do in case of a disk failure!
Discover sysadmin mistakes now,
and not during an actual crisis.
Experiment!
(We turned off power during disk activity;
this proved to be ugly but informative.)
-
/var ¸¦ /dev/md1 À¸·Î ¿Å±â´Â Áß,
¾î´À Á¤µµ À߸øµÈ mount/copy/unmount/rename/reboot À» Çغ¸¶ó.
Á¶½ÉÈ÷¸¸ ÇÑ´Ù¸é, À§ÇèÇÏÁö´Â ¾ÊÀ» °ÍÀÌ´Ù.
do some ugly mount/copy/unmount/rename/reboot scheme to
move /var over to the /dev/md1 .
Done carefully, this is not dangerous.
- ±×¸®°í, ±×°ÍµéÀ» Áñ°Ü¶ó.
- Q:
What is the difference between the mdadd , mdrun , etc. commands
and the raidadd , raidrun commands?
A:
The names of the tools have changed as of the 0.5 release of the
raidtools package. The md naming convention was used
in the 0.43 and older versions, while raid is used in
0.5 and newer versions.
- Q:
I want to run RAID-linear and RAID-0 on my stock 2.0.34 kernel.
I don't want to apply the raid patches, since they are not needed
for RAID-0/linear. Where can I get the raid-tools to manage this?
A:
This is a tough question, indeed, as the newest raid tools
package needs to have the RAID-1,4,5 kernel patches installed
in order to compile. I am not aware of any pre-compiled, binary
version of the raid tools that is available at this time.
However, experiments show that the raid-tools binaries, when
compiled against kernel 2.1.100, seem to work just fine
in creating a RAID-0/linear partition under 2.0.34. A brave
soul has asked for these, and I've temporarily
placed the binaries mdadd, mdcreate, etc.
at http://linas.org/linux/Software-RAID/
You must get the man pages, etc. from the usual raid-tools
package.
- Q:
Can I stripe/mirror the root partition (/)?
Why can't I boot Linux directly from the md disks?
A:
Both LILO and Loadlin need a non-striped/mirrored partition
to read the kernel image from. If you want to stripe/mirror
the root partition (/),
then you'll want to create an unstriped/mirrored partition
to hold the kernel(s).
Typically, this partition is named /boot.
Then you either use the initial ramdisk support (initrd),
or patches from Harald Hoyer <HarryH@Royal.Net>
that allow a striped partition to be used as the root
device. (These patches are now a standard part of recent
2.1.x kernels.)
There are several approaches that can be used.
One approach is documented in detail in the
Bootable RAID mini-HOWTO:
ftp://ftp.bizsystems.com/pub/raid/bootable-raid.
Alternately, use mkinitrd to build the ramdisk image,
see below.
Edward Welbon <welbon@bga.com> writes:
- ... all that is needed is a script to manage the boot setup.
To mount an
md filesystem as root,
the main thing is to build an initial file system image
that has the needed modules and md tools to start md .
I have a simple script that does this.
- For boot media, I have a small cheap SCSI disk
(170MB; I got it used for $20).
This disk runs on an AHA1452, but it could just as well be an
inexpensive IDE disk on the native IDE.
The disk need not be very fast since it is mainly for boot.
- This disk has a small file system which contains the kernel and
the file system image for
initrd .
The initial file system image has just enough stuff to allow me
to load the raid SCSI device driver module and start the
raid partition that will become root.
I then do an
echo 0x900 > /proc/sys/kernel/real-root-dev
(0x900 is for /dev/md0 )
and exit linuxrc .
The boot proceeds normally from there.
- I have built most support as a module except for the AHA1452
driver that brings in the
initrd filesystem.
So I have a fairly small kernel. The method is perfectly
reliable, I have been doing this since before 2.1.26 and
have never had a problem that I could not easily recover from.
The file systems even survived several 2.1.4[45] hard
crashes with no real problems.
- At one time I had partitioned the raid disks so that the initial
cylinders of the first raid disk held the kernel and the initial
cylinders of the second raid disk held the initial file system
image. Instead, I now use the initial cylinders of the raid disks
for swap, since they are the fastest cylinders
(why waste them on boot?).
- The nice thing about having an inexpensive device dedicated to
boot is that it is easy to boot from and can also serve as
a rescue disk if necessary. If you are interested,
you can take a look at the script that builds my initial
ram disk image and then runs
LILO .
http://www.realtime.net/~welbon/initrd.md.tar.gz
It is current enough to show the picture.
It isn't especially pretty and it could certainly build
a much smaller filesystem image for the initial ram disk.
It would be easy to make it more efficient.
But it uses LILO as is.
If you make any improvements, please forward a copy to me. 8-)
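The value 0x900 written to real-root-dev in the procedure quoted above is the md block major number (9) and the array's minor number (0) packed into one word, major in the high byte and minor in the low byte. A minimal sketch of that arithmetic follows. The helper name md_rootdev is made up for illustration and is not part of any raid tools package; it assumes the classic 8-bit major / 8-bit minor device encoding used by kernels of this era.

```shell
#!/bin/sh
# Hypothetical helper (not part of raidtools): print the real-root-dev
# value for /dev/mdN, assuming the old 8-bit major / 8-bit minor encoding.
MD_MAJOR=9   # block major number of the md driver

md_rootdev() {
    minor=$1
    # major goes in the high byte, minor in the low byte
    printf '0x%x\n' $(( (MD_MAJOR << 8) | minor ))
}

md_rootdev 0   # /dev/md0
md_rootdev 1   # /dev/md1
```

For /dev/md0 this reproduces the 0x900 used in the linuxrc step above; a root filesystem on /dev/md1 would use the value printed by the second call.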
- Q:
I have heard that I can run mirroring over striping. Is this true?
Can I run mirroring over the loopback device?
A:
Yes, but not the reverse. That is, you can put a stripe over
several disks, and then build a mirror on top of this. However,
striping cannot be put on top of mirroring.
A brief technical explanation is that the linear and stripe
personalities use the ll_rw_blk routine for access.
The ll_rw_blk routine
maps disk devices and sectors, not blocks. Block devices can be
layered one on top of the other; but devices that do raw, low-level
disk accesses, such as ll_rw_blk , cannot.
Currently (November 1997) RAID cannot be run over the
loopback devices, although this should be fixed shortly.
- Q:
I have two small disks and three larger disks. Can I
concatenate the two smaller disks with RAID-0, and then create
a RAID-5 out of that and the larger disks?
A:
Currently (November 1997), for a RAID-5 array, no.
Currently, one can do this only for a RAID-1 on top of the
concatenated drives.
- Q:
What is the difference between RAID-1 and RAID-5 for a
two-disk configuration?
A:
There is no difference in storage capacity. Nor can disks be
added to either array to increase capacity (see the question below for
details).
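The capacity claim can be checked with a line of arithmetic. The sketch below assumes equal-sized disks and uses the standard rules of thumb that a RAID-1 set stores one disk's worth of data and a RAID-5 set stores one disk less than it contains; the function names are illustrative only.

```shell
#!/bin/sh
# Usable capacity, in units of one disk, for N equal-sized disks.
# Illustrative helpers; the argument is the number of disks in the array.
raid1_capacity() { echo 1; }                 # every disk holds the same data
raid5_capacity() { echo $(( $1 - 1 )); }     # one disk's worth goes to parity

echo "2-disk RAID-1: $(raid1_capacity 2) disk(s) usable"
echo "2-disk RAID-5: $(raid5_capacity 2) disk(s) usable"
```

With two disks both expressions give one disk of usable space, which is why the two layouts do not differ in capacity.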
RAID-1 offers a performance advantage for reads: the RAID-1
driver uses distributed-read technology to simultaneously read
two sectors, one from each drive, thus doubling read performance.
The RAID-5 driver, although it contains many optimizations, does not
currently (September 1997) realize that the parity disk is actually
a mirrored copy of the data disk. Thus, it serializes data reads.
- Q:
How can I guard against a two-disk failure?
A:
Some of the RAID algorithms do guard against multiple disk
failures, but these are not currently implemented for Linux.
However, the Linux Software RAID can guard against multiple
disk failures by layering an array on top of an array. For
example, nine disks can be used to create three raid-5 arrays.
Then these three arrays can in turn be hooked together into
a single RAID-5 array on top. In fact, this kind of a
configuration will guard against a three-disk failure. Note that
a large amount of disk space is ''wasted'' on the redundancy
information.
For an NxN raid-5 array,
N=3, 5 out of 9 disks are used for parity (=55%)
N=4, 7 out of 16 disks
N=5, 9 out of 25 disks
...
N=9, 17 out of 81 disks (=~20%)
In general, an MxN array will use M+N-1 disks for parity.
The least amount of space is "wasted" when M=N.
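The figures in the table follow directly from the M+N-1 rule. A small sketch that recomputes them is shown here; the function names are made up for illustration, and the percentages are integer-truncated.

```shell
#!/bin/sh
# parity_disks M N: disks' worth of space spent on redundancy in an
# M x N layered RAID-5, using the M+N-1 rule stated in the text.
parity_disks() {
    echo $(( $1 + $2 - 1 ))
}

# parity_percent M N: the same figure as an integer percentage (rounded down).
parity_percent() {
    total=$(( $1 * $2 ))
    echo $(( ( $1 + $2 - 1 ) * 100 / total ))
}

for n in 3 4 5 9; do
    echo "N=$n: $(parity_disks $n $n) of $(( n * n )) disks, about $(parity_percent $n $n)%"
done
```

Calling the functions with unequal M and N (for example 2 and 8) shows how the redundancy share grows as the layout moves away from a square.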
Another alternative is to create a RAID-1 array with
three disks. Note that since all three disks contain
identical data, two-thirds of the space is ''wasted''.
- Q:
I'd like to understand how it'd be possible to have something
like fsck: if the partition hasn't been cleanly unmounted,
fsck runs and fixes the filesystem by itself more than
90% of the time. Since the machine is capable of fixing it
by itself with ckraid --fix, why not make it automatic?
A:
This can be done by adding lines like the following to
/etc/rc.d/rc.sysinit :
mdadd /dev/md0 /dev/hda1 /dev/hdc1 || {
ckraid --fix /etc/raid.usr.conf
mdadd /dev/md0 /dev/hda1 /dev/hdc1
}
or
mdrun -p1 /dev/md0
if [ $? -gt 0 ] ; then
ckraid --fix /etc/raid1.conf
mdrun -p1 /dev/md0
fi
Before presenting a more complete and reliable script,
let's review the theory of operation.
Gadi Oxman writes:
In an unclean shutdown, Linux might be in one of the following states:
- (1) The in-memory disk cache was in sync with the RAID set when
the unclean shutdown occurred; no data was lost.
- (2) The in-memory disk cache was newer than the RAID set contents
when the crash occurred; this results in a corrupted filesystem
and potentially in data loss.
This state can be further divided into the following two states:
- (2a) Linux was writing data when the unclean shutdown occurred.
- (2b) Linux was not writing data when the crash occurred.
Suppose we were using a RAID-1 array. In (2a), it might happen that
before the crash, a small number of data blocks were successfully
written only to some of the mirrors, so that on the next reboot,
the mirrors will no longer contain the same data.
If we were to ignore the mirror differences, the raidtools-0.36.3
read-balancing code
might choose to read the above data blocks from any of the mirrors,
which will result in inconsistent behavior (for example, the output
of e2fsck -n /dev/md0 can differ from run to run).
Since RAID doesn't protect against unclean shutdowns, usually
there isn't any ''obviously correct'' way to fix the mirror
differences and the filesystem corruption.
For example, by default ckraid --fix will choose
the first operational mirror and update the other mirrors
with its contents. However, depending on the exact timing
at the crash, the data on another mirror might be more recent,
and we might want to use it as the source
mirror instead, or perhaps use another method for recovery.
The following script provides one of the more robust
boot-up sequences. In particular, it guards against
long, repeated ckraid 's in the presence
of uncooperative disks, controllers, or controller device
drivers. Modify it to reflect your config,
and copy it to rc.raid.init . Then invoke
rc.raid.init after the root partition has been
fsck'ed and mounted rw, but before the remaining partitions
are fsck'ed. Make sure the current directory is in the search
path.
mdadd /dev/md0 /dev/hda1 /dev/hdc1 || {
rm -f /fastboot # force an fsck to occur
ckraid --fix /etc/raid.usr.conf
mdadd /dev/md0 /dev/hda1 /dev/hdc1
}
# if a crash occurs later in the boot process,
# we at least want to leave this md in a clean state.
/sbin/mdstop /dev/md0
mdadd /dev/md1 /dev/hda2 /dev/hdc2 || {
rm -f /fastboot # force an fsck to occur
ckraid --fix /etc/raid.home.conf
mdadd /dev/md1 /dev/hda2 /dev/hdc2
}
# if a crash occurs later in the boot process,
# we at least want to leave this md in a clean state.
/sbin/mdstop /dev/md1
mdadd /dev/md0 /dev/hda1 /dev/hdc1
mdrun -p1 /dev/md0
if [ $? -gt 0 ] ; then
rm -f /fastboot # force an fsck to occur
ckraid --fix /etc/raid.usr.conf
mdrun -p1 /dev/md0
fi
# if a crash occurs later in the boot process,
# we at least want to leave this md in a clean state.
/sbin/mdstop /dev/md0
mdadd /dev/md1 /dev/hda2 /dev/hdc2
mdrun -p1 /dev/md1
if [ $? -gt 0 ] ; then
rm -f /fastboot # force an fsck to occur
ckraid --fix /etc/raid.home.conf
mdrun -p1 /dev/md1
fi
# if a crash occurs later in the boot process,
# we at least want to leave this md in a clean state.
/sbin/mdstop /dev/md1
# OK, just blast through the md commands now. If there were
# errors, the above checks should have fixed things up.
/sbin/mdadd /dev/md0 /dev/hda1 /dev/hdc1
/sbin/mdrun -p1 /dev/md0
/sbin/mdadd /dev/md1 /dev/hda2 /dev/hdc2
/sbin/mdrun -p1 /dev/md1
In addition to the above, you'll want to create an
rc.raid.halt which should look like the following:
/sbin/mdstop /dev/md0
/sbin/mdstop /dev/md1
Be sure to modify both rc.sysinit and
init.d/halt to include this everywhere that
filesystems get unmounted before a halt/reboot. (Note
that rc.sysinit unmounts and reboots if fsck
returned with an error.)
- Q:
Can I set up one-half of a RAID-1 mirror with the one disk I
have now, and then add the other disk later?
A:
With the current tools, no, not in any easy way. In particular,
you cannot just copy the contents of one disk onto another,
and then pair them up. This is because the RAID drivers
use a glob of space at the end of the partition to store the
superblock. This decreases the amount of space available to
the file system slightly; if you just naively try to force
a RAID-1 arrangement onto a partition with an existing
filesystem, the
raid superblock will overwrite a portion of the file system
and mangle data. Since the ext2fs filesystem scatters
files randomly throughout the partition (in order to avoid
fragmentation), there is a very good chance that some file will
land at the very end of a partition long before the disk is
full.
If you are clever, I suppose you can calculate how much room
the RAID superblock will need, and make your filesystem
slightly smaller, leaving room for it when you add it later.
But then, if you are this clever, you should also be able to
modify the tools to do this automatically for you.
(The tools are not terribly complex).
Note: A careful reader has pointed out that the
following trick may work; I have not tried or verified this:
Do the mkraid with /dev/null as one of the
devices. Then mdadd -r with only the single, true
disk (do not mdadd /dev/null ). The mkraid
should have successfully built the raid array, while the
mdadd step just forces the system to run in "degraded" mode,
as if one of the disks had failed.
- Q:
I use RAID-1 (mirroring), and the power failed while there was
disk activity. What do I do now?
A:
The redundancy of RAID levels is designed to protect against a
disk failure, not against a power failure.
There are several ways to recover from this situation.
- Method (1): Use the raid tools. These can be used to sync
the raid arrays. They do not fix file-system damage; after
the raid arrays are sync'ed, then the file-system still has
to be fixed with fsck. Raid arrays can be checked with
ckraid /etc/raid1.conf (for RAID-1, else,
/etc/raid5.conf , etc.)
Calling ckraid /etc/raid1.conf --fix will pick one of the
disks in the array (usually the first), and use that as the
master copy, and copy its blocks to the others in the mirror.
To designate which of the disks should be used as the master,
you can use the --force-source flag: for example,
ckraid /etc/raid1.conf --fix --force-source /dev/hdc3
The ckraid command can be safely run without the --fix
option
to verify the inactive RAID array without making any changes.
When you are comfortable with the proposed changes, supply
the --fix option.
- Method (2): Paranoid, time-consuming, not much better than the
first way. Let's assume a two-disk RAID-1 array, consisting of
partitions /dev/hda3 and /dev/hdc3 . You can
try the following:
- fsck /dev/hda3
- fsck /dev/hdc3
- decide which of the two partitions had fewer errors,
or were more easily recovered, or recovered the data
that you wanted. Pick one, either one, to be your new
``master'' copy. Say you picked /dev/hdc3 .
- dd if=/dev/hdc3 of=/dev/hda3
- mkraid raid1.conf -f --only-superblock
Instead of the last two steps, you can instead run
ckraid /etc/raid1.conf --fix --force-source /dev/hdc3
which should be a bit faster.
- Method (3): Lazy man's version of above. If you don't want to
wait for long fsck's to complete, it is perfectly fine to skip
the first three steps above, and move directly to the last
two steps.
Just be sure to run fsck /dev/md0 after you are done.
Method (3) is actually just method (1) in disguise.
In any case, the above steps will only sync up the raid arrays.
The file system probably needs fixing as well: for this,
fsck needs to be run on the active, unmounted md device.
With a three-disk RAID-1 array, there are more possibilities,
such as using two disks to ''vote'' a majority answer. Tools
to automate this do not currently (September 97) exist.
- Q:
I have a RAID-4 or a RAID-5 setup, and the power failed while
there was disk activity. What do I do now?
A:
The redundancy of RAID levels is designed to protect against a
disk failure, not against a power failure.
Since the disks in a RAID-4 or RAID-5 array do not contain a file
system that fsck can read, there are fewer repair options. You
cannot use fsck to do preliminary checking and/or repair; you must
use ckraid first.
The ckraid command can be safely run without the
--fix option
to verify the inactive RAID array without making any changes.
When you are comfortable with the proposed changes, supply
the --fix option.
If you wish, you can try designating one of the disks as a ''failed
disk''. Do this with the --suggest-failed-disk-mask flag.
Only one bit should be set in the flag: RAID-5 cannot recover two
failed disks.
The mask is a binary bit mask: thus:
0x1 == first disk
0x2 == second disk
0x4 == third disk
0x8 == fourth disk, etc.
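The table continues by doubling the value for each further disk, so the mask for the N-th disk is 1 shifted left by N-1 bits. The sketch below computes it; the helper name disk_mask is an illustration only and is not part of ckraid.

```shell
#!/bin/sh
# disk_mask N: bit mask selecting the N-th disk (counting from 1),
# following the table above: 1 -> 0x1, 2 -> 0x2, 3 -> 0x4, 4 -> 0x8.
# Hypothetical helper, shown only to illustrate the bit arithmetic.
disk_mask() {
    printf '0x%x\n' $(( 1 << ($1 - 1) ))
}

disk_mask 3    # third disk
disk_mask 6    # sixth disk
```

Only one such value should be given at a time, since the text above notes that RAID-5 cannot recover from two failed disks.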
Alternately, you can choose to modify the parity sectors, by using
the --suggest-fix-parity flag. This will recompute the
parity from the other sectors.
The flags --suggest-failed-disk-mask and
--suggest-fix-parity
can be safely used for verification. No changes are made if the
--fix flag is not specified. Thus, you can experiment with
different possible repair schemes.
- Q:
My RAID-1 device, /dev/md0 consists of two hard drive
partitions: /dev/hda3 and /dev/hdc3 .
Recently, the disk with /dev/hdc3 failed,
and was replaced with a new disk. My best friend,
who doesn't understand RAID, said that the correct thing to do now
is to ''dd if=/dev/hda3 of=/dev/hdc3 ''.
I tried this, but things still don't work.
A:
You should keep your best friend away from your computer.
Fortunately, no serious damage has been done.
You can recover from this by running:
mkraid raid1.conf -f --only-superblock
By using dd , two identical copies of the partition
were created. This is almost correct, except that the RAID-1
kernel extension expects the RAID superblocks to be different.
Thus, when you try to reactivate RAID, the software will notice
the problem, and deactivate one of the two partitions.
By re-creating the superblock, you should have a fully usable
system.
- Q:
My version of mkraid doesn't have an --only-superblock flag.
What do I do?
A:
The newer tools drop support for this flag, replacing it with
the --force-resync flag. It has been reported
that the following sequence appears to work with the latest tools
and software:
umount /web (where /dev/md0 was mounted on)
raidstop /dev/md0
mkraid /dev/md0 --force-resync --really-force
raidstart /dev/md0
After doing this, a cat /proc/mdstat should report
resync in progress , and one should be able to
mount /dev/md0 at this point.
- Q:
My RAID-1 device, /dev/md0 consists of two hard drive
partitions: /dev/hda3 and /dev/hdc3 .
My best (girl?)friend, who doesn't understand RAID,
ran fsck on /dev/hda3 while I wasn't looking,
and now the RAID won't work. What should I do?
A:
You should re-examine your concept of ``best friend''.
In general, fsck should never be run on the individual
partitions that compose a RAID array.
Assuming that neither of the partitions is/was heavily damaged,
no data loss has occurred, and the RAID-1 device can be recovered
as follows:
- make a backup of the file system on /dev/hda3
- dd if=/dev/hda3 of=/dev/hdc3
- mkraid raid1.conf -f --only-superblock
This should leave you with a working disk mirror.
- Q:
Why does the above work as a recovery procedure?
A:
Because each of the component partitions in a RAID-1 mirror
is a perfectly valid copy of the file system. In a pinch,
mirroring can be disabled, and one of the partitions
can be mounted and safely run as an ordinary, non-RAID
file system. When you are ready to restart using RAID-1,
then unmount the partition, and follow the above
instructions to restore the mirror. Note that the above
works ONLY for RAID-1, and not for any of the other levels.
It may make you feel more comfortable to reverse the direction
of the copy above: copy from the disk that was untouched
to the one that was. Just be sure to fsck the final md.
- Q:
I am confused by the above questions.
Is it safe to run fsck /dev/md0 ?
A:
Yes, it is safe to run fsck on the md devices.
In fact, this is the only safe place to run fsck .
- Q:
If a disk is slowly failing, will it be obvious which one it is?
I am concerned that this confusion could lead to some dangerous
decisions by a sysadmin.
A:
Once a disk fails, an error code will be returned from
the low level driver to the RAID driver.
The RAID driver will mark it as ``bad'' in the RAID superblocks
of the ``good'' disks (so we will later know which mirrors are
good and which aren't), and continue RAID operation
on the remaining operational mirrors.
This, of course, assumes that the disk and the low level driver
can detect a read/write error, and will not silently corrupt data,
for example. This is true of current drives
(error detection schemes are being used internally),
and is the basis of RAID operation.
- Q:
What about hot-repair?
A:
Work is underway to complete ``hot reconstruction''.
With this feature, one can add several ``spare'' disks to
the RAID set (be it level 1 or 4/5), and once a disk fails,
it will be reconstructed on one of the spare disks in run time,
without ever needing to shut down the array.
However, to use this feature, the spare disk must have
been declared at boot time, or it must be hot-added,
which requires the use of special cabinets and connectors
that allow a disk to be added while the electrical power is
on.
As of October 97, there is a beta version of MD that
allows:
- RAID 1 and 5 reconstruction on spare drives
- RAID-5 parity reconstruction after an unclean
shutdown
- spare disk to be hot-added to an already running
RAID 1 or 4/5 array
Automatic reconstruction is currently (Dec 97)
disabled by default, due to the preliminary nature of this
work. It can be enabled by changing the value of
SUPPORT_RECONSTRUCTION in
include/linux/md.h .
If spare drives were configured into the array when it
was created and kernel-based reconstruction is enabled,
the spare drive will already contain a RAID superblock
(written by mkraid ), and the kernel will
reconstruct its contents automatically (without needing
the usual mdstop , replace drive, ckraid ,
mdrun steps).
If you are not running automatic reconstruction, and have
not configured a hot-spare disk, the procedure described by
Gadi Oxman
<
gadio@netvision.net.il>
is recommended:
- Currently, once the first disk is removed, the RAID set will be
running in degraded mode. To restore full operation mode,
you need to:
- stop the array (mdstop /dev/md0)
- replace the failed drive
- run ckraid raid.conf to reconstruct its contents
- run the array again (mdadd, mdrun).
At this point, the array will be running with all the drives,
and again protects against a failure of a single drive.
Currently, it is not possible to assign a single hot-spare disk
to several arrays. Each array requires its own hot-spare.
- Q:
I want an audible warning along the lines of
''one of the mirrored disks has failed, you fool'',
so that a novice sysadmin knows that there is a problem.
A:
The kernel is logging the event with a
``KERN_ALERT '' priority in syslog.
There are several software packages that will monitor the
syslog files, and beep the PC speaker, call a pager, send e-mail,
etc. automatically.
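As a rough illustration of what such a monitoring package does, here is a minimal Python sketch. The regular expression and the sample log lines are made up for the example; the exact wording of the kernel's RAID messages is not given here, so the pattern is an assumption that would need adjusting to real logs.

```python
import re

# Hypothetical pattern: adjust to the messages your kernel actually logs.
ALERT_PATTERN = re.compile(r"\b(raid|md\d+)\b.*\b(fail|error)", re.IGNORECASE)

def find_raid_alerts(lines):
    """Return the log lines that look like RAID failure reports."""
    return [line.rstrip("\n") for line in lines if ALERT_PATTERN.search(line)]

def alert(messages):
    """Ring the terminal bell once per message and print it."""
    for msg in messages:
        print("\a" + msg)

# Made-up sample lines, standing in for the contents of a syslog file.
sample = [
    "Dec  1 10:00:01 host kernel: md0: disk failure on sdb1 (made-up example line)",
    "Dec  1 10:00:02 host sshd[42]: session opened",
]
hits = find_raid_alerts(sample)
alert(hits)
```

In practice the lines would come from the syslog file itself, and the alert function could just as well send mail or call a pager.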
- Q:
How do I run RAID-5 in degraded mode
(with one disk failed, and not yet replaced)?
A:
Gadi Oxman
<
gadio@netvision.net.il>
writes:
Normally, to run a RAID-5 set of n drives you have to:
mdadd /dev/md0 /dev/disk1 ... /dev/disk(n)
mdrun -p5 /dev/md0
Even if one of the disks has failed,
you still have to mdadd it as you would in a normal setup.
(?? try using /dev/null in place of the failed disk ???
watch out)
The array will then be active in degraded mode with (n - 1) drives.
If ``mdrun '' fails, the kernel has noticed an error
(for example, several faulty drives, or an unclean shutdown).
Use ``dmesg '' to display the kernel error messages from
``mdrun ''.
If the raid-5 set is corrupted due to a power loss,
rather than a disk crash, one can try to recover by
creating a new RAID superblock:
mkraid -f --only-superblock raid5.conf
A RAID array doesn't provide protection against a power failure or
a kernel crash, and can't guarantee correct recovery.
Rebuilding the superblock will simply cause the system to ignore
the condition by marking all the drives as ``OK'',
as if nothing happened.
- Q:
How does RAID-5 work when a disk fails?
A:
The typical operating scenario is as follows:
- A RAID-5 array is active.
- One drive fails while the array is active.
- The drive firmware and the low-level Linux disk/controller
drivers detect the failure and report an error code to the
MD driver.
- The MD driver continues to provide an error-free
/dev/md0
device to the higher levels of the kernel (with a performance
degradation) by using the remaining operational drives.
- The sysadmin can
umount /dev/md0 and
mdstop /dev/md0 as usual.
- If the failed drive is not replaced, the sysadmin can still
start the array in degraded mode as usual, by running
mdadd and mdrun .
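The reason the MD driver can keep serving data with one drive missing is that RAID-5 parity is the XOR of the data blocks in a stripe, so any single missing block is the XOR of the survivors. The following Python sketch shows the idea on one stripe of a hypothetical 4-disk array; the block contents and the helper name are illustrative only, and this is not the kernel's code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails: rebuild its block from the rest.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

Every read of a block on the failed drive needs all the other drives in the stripe, which is where the performance degradation comes from.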
- Q:
A:
- Q:
Why is there no question 13?
A:
If you are concerned about RAID, High Availability, and UPS,
then it's probably a good idea to be superstitious as well.
It can't hurt, can it?
- Q:
I just replaced a failed disk in a RAID-5 array. After
rebuilding the array, fsck is giving me many, many
errors. Is this normal?
A:
No. And, unless you ran fsck in "verify only; do not update"
mode, it's quite possible that you have corrupted your data.
Unfortunately, a not-uncommon scenario is one of
accidentally changing the disk order in a RAID-5 array,
after replacing a hard drive. Although the RAID superblock
stores the proper order, not all tools use this information.
In particular, the current version of ckraid
will use the information specified with the -f
flag (typically, the file /etc/raid5.conf )
instead of the data in the superblock. If the specified
order is incorrect, then the replaced disk will be
reconstructed incorrectly. The symptom of this
kind of mistake seems to be heavy & numerous fsck
errors.
And, in case you are wondering, yes, someone lost
all of their data by making this mistake. Making
a tape backup of all data before reconfiguring a
RAID array is strongly recommended.
- Q:
The QuickStart says that mdstop is just to make sure that the
disks are sync'ed. Is this really necessary? Isn't unmounting the
file systems enough?
A:
The command mdstop /dev/md0 will:
- mark it ''clean''. This allows us to detect unclean shutdowns, for
example due to a power failure or a kernel crash.
- sync the array. This is less important after unmounting a
filesystem, but is important if the
/dev/md0 is
accessed directly rather than through a filesystem (for
example, by e2fsck ).
- Q:
What is the current best known-stable RAID patch for the 2.0.x series kernels?
A:
As of 18 Sept 1997, it is
"2.0.30 + pre-9 2.0.31 + Werner Fink's swapping patch
+ the alpha RAID patch". As of November 1997, it is
2.0.31 + ... !?
- Q:
The RAID patches will not install cleanly for me. What's wrong?
A:
Make sure that /usr/include/linux is a symbolic link to
/usr/src/linux/include/linux .
Make sure that the new files raid5.c , etc.
have been copied to their correct locations. Sometimes
the patch command will not create new files. Try the
-f flag on patch .
- Q:
While compiling raidtools 0.42, compilation stops trying to
include <pthread.h> but it doesn't exist on my system.
How do I fix this?
A:
raidtools-0.42 requires linuxthreads-0.6 from:
ftp://ftp.inria.fr/INRIA/Projects/cristal/Xavier.Leroy
Alternately, use glibc v2.0.
- Q:
I get the message:
mdrun -a /dev/md0: Invalid argument
A:
Use mkraid to initialize the RAID set prior to the first use.
mkraid ensures that the RAID array is initially in a
consistent state by erasing the RAID partitions. In addition,
mkraid will create the RAID superblocks.
- Q:
I get the message:
mdrun -a /dev/md0: Invalid argument
A:
Try lsmod (or, alternately, cat
/proc/modules ) to see if the raid modules are loaded.
If they are not, you can load them explicitly with
the modprobe raid1 or modprobe raid5
command. Alternately, if you are using the autoloader,
and expected kerneld to load them and it didn't,
this is probably because your loader is missing the info to
load the modules. Edit /etc/conf.modules and add
the following lines:
alias md-personality-3 raid1
alias md-personality-4 raid5
- Q:
While doing mdadd -a I get the error:
/dev/md0: No such file or directory
Indeed, there does not seem to be any /dev/md0 anywhere.
Now what do I do?
A:
The raid-tools package will create these devices when
you run make install as root. Alternately,
you can do the following:
cd /dev
./MAKEDEV md
- Q:
After creating a RAID array on /dev/md0 , I try to mount it
and get the following error. What's wrong?
mount: wrong fs type, bad option, bad superblock on /dev/md0,
or too many mounted file systems .
A:
You need to create a file system on /dev/md0
before you can mount it. Use mke2fs .
- Q:
Truxton Fulton wrote:
On my Linux 2.0.30 system, while doing a mkraid for a
RAID-1 device,
during the clearing of the two individual partitions, I got
"Cannot allocate free page" errors appearing on the console,
and "Unable to handle kernel paging request at virtual address ..."
errors in the system log. At this time, the system became quite
unusable, but it appears to recover after a while. The operation
appears to have completed with no other errors, and I am
successfully using my RAID-1 device. The errors are disconcerting
though. Any ideas?
A:
This was a well-known bug in the 2.0.30 kernels. It is fixed in
the 2.0.31 kernel; alternately, fall back to 2.0.29.
- Q:
When I try to mdrun a device that has been mdadd 'ed,
I get the message
''invalid raid superblock magic ''.
A:
Make sure that you've run the mkraid part of the install
procedure.
- Q:
When I access /dev/md0 , the kernel spits out a
lot of errors like md0: device not running, giving up !
and I/O error... . I've successfully added my devices to
the virtual device.
A:
To be usable, the device must be running. Use
mdrun -px /dev/md0 where x is l for linear, 0 for
RAID-0 or 1 for RAID-1, etc.
- Q:
I've created a linear md-dev with 2 devices.
cat /proc/mdstat shows
the total size of the device, but df only shows the size of the first
physical device.
A:
You must mkfs your new md-dev before using it
the first time, so that the filesystem will cover the whole device.
- Q:
I've set up /etc/mdtab using mdcreate, I've
mdadd 'ed, mdrun and fsck 'ed
my two /dev/mdX partitions. Everything looks
okay before a reboot. As soon as I reboot, I get an
fsck error on both partitions: fsck.ext2: Attempt to read block from filesystem resulted in short
read while trying too open /dev/md0 . Why?! How do
I fix it?!
A:
During the boot process, the RAID partitions must be started
before they can be fsck 'ed. This must be done
in one of the boot scripts. For some distributions,
fsck is called from /etc/rc.d/rc.S , for others,
it is called from /etc/rc.d/rc.sysinit . Change this
file so that mdadd -ar is run *before* fsck -A
is executed. Better yet, it is suggested that
ckraid be run if mdadd returns with an
error. How to do this is discussed in greater detail in
question 14 of the section ''Error Recovery''.
- Q:
When I try to run an array containing partitions that are
bigger than 4GB, I get the message:
invalid raid superblock magic
A:
This bug is now fixed. (September 97) Make sure you have the latest
raid code.
- Q:
When I try to run mke2fs on a partition that is larger than 2GB,
I get the following error:
Warning: could not write 8 blocks in inode table starting at 2097175
A:
This seems to be a problem with mke2fs
(November 97). A temporary work-around is to get the mke2fs
code, and add #undef HAVE_LLSEEK to
e2fsprogs-1.10/lib/ext2fs/llseek.c just before the
first #ifdef HAVE_LLSEEK and recompile mke2fs.
- Q:
ckraid is unable to read /etc/mdtab .
A:
The RAID0/linear configuration file format used in
/etc/mdtab is obsolete, although it will be supported
for a while more. The up-to-date config files
are currently named /etc/raid1.conf , etc.
- Q:
The personality modules (raid1.o and the like) are not loaded
automatically; they have to be modprobe 'd by hand before mdrun .
How can this be fixed?
A:
To autoload the modules, we can add the following to
/etc/conf.modules :
alias md-personality-3 raid1
alias md-personality-4 raid5
- Q:
I've mdadd 'ed 13 devices, and now I'm trying to
mdrun -p5 /dev/md0 and get the message:
/dev/md0: Invalid argument
A:
The default configuration for software RAID is 8 real
devices. Edit linux/md.h , change
#define MAX_REAL 8 to a larger number, and
rebuild the kernel.
- Q:
I can't make md work with partitions on our
latest SPARCstation 5. I suspect that this has something
to do with disk-labels.
A:
Sun disk-labels sit in the first 1K of a partition.
For RAID-1, the Sun disk-label is not an issue since
ext2fs will skip the label on every mirror.
For other raid levels (0, linear and 4/5), this
appears to be a problem; it has not yet (Dec 97) been
addressed.
- Q:
I have SCSI adapter brand XYZ (with or without several channels),
and disk brand(s) PQR and LMN, will these work with md to create
a linear/striped/mirrored personality?
A:
Yes! Software RAID will work with any disk controller (IDE
or SCSI) and any disks. The disks do not have to be identical,
nor do the controllers. For example, a RAID mirror can be
created with one half the mirror being a SCSI disk, and the
other an IDE disk. The disks do not even have to be the same
size. There are no restrictions on the mixing & matching of
disks and controllers.
This is because Software RAID works with disk partitions, not
with the raw disks themselves. The only recommendation is that
for RAID levels 1 and 5, the disk partitions that are used as part
of the same set be the same size. If the partitions used to make
up the RAID 1 or 5 array are not the same size, then the excess
space in the larger partitions is wasted (not used).
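To put numbers on the wasted space, here is a small Python sketch. It assumes the usual capacity rules (a RAID-1 set holds as much as its smallest partition, a RAID-5 set holds n-1 times its smallest partition); the function name and the partition sizes are invented for the example.

```python
def usable_and_wasted(level, sizes_mb):
    """Return (usable MB, wasted MB) for a RAID-1 or RAID-5 set
    built from partitions of the given sizes."""
    smallest = min(sizes_mb)
    n = len(sizes_mb)
    if level == 1:
        usable = smallest            # every mirror holds the same data
    elif level == 5:
        usable = (n - 1) * smallest  # one partition's worth goes to parity
    else:
        raise ValueError("only RAID levels 1 and 5 are handled here")
    # Anything beyond the smallest partition's size is never used.
    wasted = sum(size - smallest for size in sizes_mb)
    return usable, wasted

print(usable_and_wasted(1, [500, 540]))       # (500, 40)
print(usable_and_wasted(5, [500, 540, 600]))  # (1000, 140)
```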
- Q:
I have a twin channel BT-952, and the box states that it supports
hardware RAID 0, 1 and 0+1. I have made a RAID set with two
drives, the card apparently recognizes them when it's doing its
BIOS startup routine. I've been reading in the driver source code,
but found no reference to the hardware RAID support. Anybody out
there working on that?
A:
The Mylex/BusLogic FlashPoint boards with RAIDPlus are
actually software RAID, not hardware RAID at all. RAIDPlus
is only supported on Windows 95 and Windows NT, not on
Netware or any of the Unix platforms. Aside from booting and
configuration, the RAID support is actually in the OS drivers.
While in theory Linux support for RAIDPlus is possible, the
implementation of RAID-0/1/4/5 in the Linux kernel is much
more flexible and should have superior performance, so
there's little reason to support RAIDPlus directly.
- Q:
I want to run RAID with an SMP box. Is RAID SMP-safe?
A:
"I think so" is the best answer available at the time I write
this (April 98). A number of users report that they have been
using RAID with SMP for nearly a year, without problems.
However, as of April 98 (circa kernel 2.1.9x), the following
problems have been noted on the mailing list:
- Adaptec AIC7xxx SCSI drivers are not SMP safe
(General note: Adaptec adapters have a long
history of problems & flakiness in general. Although
they seem to be the most easily available, widespread
and cheapest SCSI adapters, they should be avoided.
After factoring for time lost, frustration, and
corrupted data, Adaptec's will prove to be the
costliest mistake you'll ever make. That said,
if you have SMP problems with 2.1.88, try the patch
ftp://ftp.bero-online.ml.org/pub/linux/aic7xxx-5.0.7-linux21.tar.gz
I am not sure if this patch has been pulled into later
2.1.x kernels.
For further info, take a look at the mail archives for
March 98 at
http://www.linuxhq.com/lnxlists/linux-raid/lr_9803_01/
As usual, due to the rapidly changing nature of the
latest experimental 2.1.x kernels, the problems
described in these mailing lists may or may not have
been fixed by the time you read this. Caveat Emptor.
)
- IO-APIC with RAID-0 on SMP has been reported
to crash in 2.1.90
- Q:
Are linear MD's expandable?
Can a new hard-drive/partition be added,
and the size of the existing file system expanded?
A:
Miguel de Icaza
<
miguel@luthien.nuclecu.unam.mx>
writes:
I changed the ext2fs code to be aware of multiple-devices
instead of the regular one device per file system assumption.
So, when you want to extend a file system,
you run a utility program that makes the appropriate changes
on the new device (your extra partition) and then you just tell
the system to extend the fs using the specified device.
You can extend a file system with new devices at system operation
time, no need to bring the system down
(and whenever I get some extra time, you will be able to remove
devices from the ext2 volume set, again without even having
to go to single-user mode or any hack like that).
You can get the patch for 2.1.x kernel from my web page:
http://www.nuclecu.unam.mx/~miguel/ext2-volume
- Q:
Can I add disks to a RAID-5 array?
A:
Currently, (September 1997) no, not without erasing all
data. A conversion utility to allow this does not yet exist.
The problem is that the actual structure and layout
of a RAID-5 array depends on the number of disks in the array.
Of course, one can add drives by backing up the array to tape,
deleting all data, creating a new array, and restoring from
tape.
- Q:
What would happen to my RAID1/RAID0 sets if I shift one
of the drives from being /dev/hdb to /dev/hdc ?
Because of cabling/case size/stupidity issues, I had to
make my RAID sets on the same IDE controller (/dev/hda
and /dev/hdb ). Now that I've fixed some stuff, I want
to move /dev/hdb to /dev/hdc .
What would happen if I just change the /etc/mdtab and
/etc/raid1.conf files to reflect the new location?
A:
For RAID-0/linear, one must be careful to specify the
drives in exactly the same order. Thus, in the above
example, if the original config is
mdadd /dev/md0 /dev/hda /dev/hdb
Then the new config *must* be
mdadd /dev/md0 /dev/hda /dev/hdc
For RAID-1/4/5, the drive's ''RAID number'' is stored in
its RAID superblock, and therefore the order in which the
disks are specified is not important.
RAID-0/linear does not have a superblock due to its older
design, and the desire to maintain backwards compatibility
with this older design.
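To see why the order matters when there is no superblock to record it, consider this simplified Python model of a linear (concatenated) array. The device sizes and the function are hypothetical, and the real driver is more involved, but the point carries over: the same logical block lands in a different place if the members are listed in a different order.

```python
def locate(block, members):
    """Map a logical block number to (device name, block offset).
    members is a list of (name, size_in_blocks) in mdadd order."""
    for name, size in members:
        if block < size:
            return name, block
        block -= size
    raise IndexError("block is beyond the end of the array")

# Made-up sizes: /dev/hda has 100 blocks, /dev/hdc has 200.
right_order = [("/dev/hda", 100), ("/dev/hdc", 200)]
wrong_order = [("/dev/hdc", 200), ("/dev/hda", 100)]

print(locate(150, right_order))  # ('/dev/hdc', 50)
print(locate(150, wrong_order))  # ('/dev/hdc', 150)
```

With the members swapped, block 150 is read from a different offset, so the file system on top sees scrambled data.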
- Q:
Can I convert my two-disk RAID-1 set to a three-disk
RAID-5 set?
A:
Yes. Michael at BizSystems has come up with a clever,
sneaky way of doing this. However, like virtually all
manipulations of RAID arrays once they have data on
them, it is dangerous and prone to human error.
Make a backup before you start.
I will make the following assumptions:
---------------------------------------------
disks
original: hda - hdc
raid1 partitions hda3 - hdc3
array name /dev/md0
new hda - hdc - hdd
raid5 partitions hda3 - hdc3 - hdd3
array name: /dev/md1
You must substitute the appropriate disk and partition numbers for
your system configuration. This will hold true for all config file
examples.
--------------------------------------------
DO A BACKUP BEFORE YOU DO ANYTHING
1) recompile kernel to include both raid1 and raid5
2) install new kernel and verify that raid personalities are present
3) disable the redundant partition on the raid 1 array. If this is a
root mounted partition (mine was) you must be more careful.
Reboot the kernel without starting raid devices or boot from rescue
system ( raid tools must be available )
start non-redundant raid1
mdadd -r -p1 /dev/md0 /dev/hda3
4) configure raid5 but with 'funny' config file, note that there is
no hda3 entry and hdc3 is repeated. This is needed since the
raid tools don't want you to do this.
-------------------------------
# raid-5 configuration
raiddev /dev/md1
raid-level 5
nr-raid-disks 3
chunk-size 32
# Parity placement algorithm
parity-algorithm left-symmetric
# Spare disks for hot reconstruction
nr-spare-disks 0
device /dev/hdc3
raid-disk 0
device /dev/hdc3
raid-disk 1
device /dev/hdd3
raid-disk 2
---------------------------------------
mkraid /etc/raid5.conf
5) activate the raid5 array in non-redundant mode
mdadd -r -p5 -c32k /dev/md1 /dev/hdc3 /dev/hdd3
6) make a file system on the array
mke2fs -b {blocksize} /dev/md1
recommended blocksize by some is 4096 rather than the default 1024.
this improves the memory utilization for the kernel raid routines and
matches the blocksize to the page size. I compromised and used 2048
since I have a relatively high number of small files on my system.
7) mount the two raid devices somewhere
mount -t ext2 /dev/md0 mnt0
mount -t ext2 /dev/md1 mnt1
8) move the data
cp -a mnt0 mnt1
9) verify that the data sets are identical
10) stop both arrays
11) correct the information for the raid5.conf file
change /dev/md1 to /dev/md0
change the first disk to read /dev/hda3
12) upgrade the new array to full redundant status
(THIS DESTROYS REMAINING raid1 INFORMATION)
ckraid --fix /etc/raid5.conf
- Q:
I've created a RAID-0 device on
/dev/sda2 and
/dev/sda3 . The device is a lot slower than a
single partition. Isn't md a pile of junk?
A:
To have a RAID-0 device running at full speed, you must
have partitions from different disks. Besides, putting
the two halves of the mirror on the same disk fails to
give you any protection whatsoever against disk failure.
- Q:
What's the use of having RAID-linear when RAID-0 will do the
same thing, but provide higher performance?
A:
It's not obvious that RAID-0 will always provide better
performance; in fact, in some cases, it could make things
worse.
The ext2fs file system scatters files all over a partition,
and it attempts to keep all of the blocks of a file
contiguous, basically in an attempt to prevent fragmentation.
Thus, ext2fs behaves "as if" there were a (variable-sized)
stripe per file. If there are several disks concatenated
into a single RAID-linear, this will result in files being
statistically distributed on each of the disks. Thus,
at least for ext2fs, RAID-linear will behave a lot like
RAID-0 with large stripe sizes. Conversely, RAID-0
with small stripe sizes can cause excessive disk activity
leading to severely degraded performance if several large files
are accessed simultaneously.
In many cases, RAID-0 can be an obvious win. For example,
imagine a large database file. Since ext2fs attempts to
cluster together all of the blocks of a file, chances
are good that it will end up on only one drive if RAID-linear
is used, but will get chopped into lots of stripes if RAID-0 is
used. Now imagine a number of (kernel) threads all trying
to randomly access this database. Under RAID-linear, all
accesses would go to one disk, which would not be as efficient
as the parallel accesses that RAID-0 entails.
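The database example can be made concrete with a small Python sketch. It assumes three equal-sized disks and a made-up chunk size, and uses two simplified mapping functions that are illustrations rather than the MD driver's actual code.

```python
def linear_disk(block, disk_blocks):
    """Disk index for a block when equal-sized disks are concatenated."""
    return block // disk_blocks

def raid0_disk(block, n_disks, chunk_blocks):
    """Disk index for a block when chunks are dealt round-robin to disks."""
    return (block // chunk_blocks) % n_disks

# Hypothetical layout: 3 disks of 1000 blocks each, chunks of 8 blocks,
# and one contiguous file occupying blocks 100..199.
file_blocks = range(100, 200)
linear_used = {linear_disk(b, 1000) for b in file_blocks}
raid0_used = {raid0_disk(b, 3, 8) for b in file_blocks}

print(sorted(linear_used))  # [0]
print(sorted(raid0_used))   # [0, 1, 2]
```

Under RAID-linear the whole file sits on the first disk, while under RAID-0 its chunks are spread over all three, so concurrent random accesses can proceed in parallel.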
- Q:
How does RAID-0 handle a situation where the different stripe
partitions are different sizes? Are the stripes uniformly
distributed?
A:
To understand this, let's look at an example with three
partitions; one that is 50MB, one 90MB and one 125MB.
Let's call D0 the 50MB disk, D1 the 90MB disk and D2 the 125MB
disk. When you start the device, the driver calculates 'strip
zones'. In this case, it finds 3 zones, defined like this:
Z0 : (D0/D1/D2) 3 x 50 = 150MB total in this zone
Z1 : (D1/D2) 2 x 40 = 80MB total in this zone
Z2 : (D2) 125-50-40 = 35MB total in this zone.
You can see that the total size of the zones is the size of the
virtual device, but, depending on the zone, the striping is
different. Z2 is rather inefficient, since there's only one
disk.
Since ext2fs and most other Unix
file systems distribute files all over the disk, you
have a 35/265 = 13% chance that a file will end up
on Z2, and not get any of the benefits of striping.
(DOS tries to fill a disk from beginning to end, and thus,
the oldest files would end up on Z0. However, this
strategy leads to severe filesystem fragmentation,
which is why no one besides DOS does it this way.)
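The zone arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation as described in the example, not the driver's code; the function name is made up.

```python
def stripe_zones(sizes_mb):
    """Return one (number of disks, MB per disk, zone total MB) tuple
    per zone, for partitions of the given sizes."""
    zones = []
    used = 0                      # MB of each disk already assigned to zones
    remaining = sorted(sizes_mb)
    while remaining:
        smallest = remaining[0]
        depth = smallest - used   # how far this zone extends on each disk
        if depth > 0:
            zones.append((len(remaining), depth, len(remaining) * depth))
        used = smallest
        remaining = remaining[1:]  # the smallest disk is now full
    return zones

zones = stripe_zones([50, 90, 125])
for disks, depth, total in zones:
    print(f"{disks} disk(s) x {depth}MB = {total}MB")
total_mb = sum(total for _, _, total in zones)
print(f"unstriped share: {zones[-1][2] / total_mb:.0%}")
```

For the 50MB/90MB/125MB example this gives zones of 150MB, 80MB and 35MB, 265MB in total, with the last zone making up about 13% of the device.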
- Q:
I have some Brand X hard disks and a Brand Y controller,
and am considering using md.
Does it significantly increase the throughput?
Is the performance really noticeable?
A:
The answer depends on the configuration that you use.
- Linux MD RAID-0 and RAID-linear performance:
-
If the system is heavily loaded with lots of I/O,
statistically, some of it will go to one disk, and
some to the others. Thus, performance will improve
over a single large disk. The actual improvement
depends a lot on the actual data, stripe sizes, and
other factors. In a system with low I/O usage,
the performance is equal to that of a single disk.
- Linux MD RAID-1 (mirroring) read performance:
-
MD implements read balancing. That is, the RAID-1
code will alternate between each of the (two or more)
disks in the mirror, making alternate reads to each.
In a low-I/O situation, this won't change performance
at all: you will have to wait for one disk to complete
the read.
But, with two disks in a high-I/O environment,
this could as much as double the read performance,
since reads can be issued to each of the disks in parallel.
For N disks in the mirror, this could improve performance
N-fold.
- Linux MD RAID-1 (mirroring) write performance:
-
A write must wait until it has completed on all of the disks
in the mirror. This is because a copy of the data
must be written to each of the disks in the mirror.
Thus, performance will be roughly equal to the write
performance to a single disk.
- Linux MD RAID-4/5 read performance:
-
Statistically, a given block can be on any one of a number
of disk drives, and thus RAID-4/5 read performance is
a lot like that for RAID-0. It will depend on the data, the
stripe size, and the application. It will not be as good
as the read performance of a mirrored array.
- Linux MD RAID-4/5 write performance:
-
This will in general be considerably slower than that for
a single disk. This is because the parity must be written
out to one drive as well as the data to another. However,
in order to compute the new parity, the old parity and
the old data must be read first. The old data, new data and
old parity must all be XOR'ed together to determine the new
parity: this requires considerable CPU cycles in addition
to the numerous disk accesses.
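The parity update described for RAID-4/5 writes can be sketched as follows. This is a minimal illustration of the XOR step only, assuming byte strings stand in for disk blocks; the function name updated_parity is hypothetical and this is not the driver's code.

```python
def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data, byte by byte."""
    return bytes(d ^ n ^ p for d, n, p in zip(old_data, new_data, old_parity))


# Three data blocks and their parity block (toy two-byte "blocks").
blocks = [b"\x0f\x10", b"\xf0\x01", b"\xaa\x55"]
parity = bytes(a ^ b ^ c for a, b, c in zip(*blocks))

# Overwrite the second block: two reads (old data, old parity), two writes.
new_block = b"\x33\x44"
new_parity = updated_parity(blocks[1], new_block, parity)
blocks[1] = new_block

# The shortcut gives the same result as recomputing parity from all blocks.
recomputed = bytes(a ^ b ^ c for a, b, c in zip(*blocks))
print(new_parity == recomputed)
```

The point of the sketch is that a single logical write turns into two reads and two writes, which is why write performance suffers.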
- Q:
What RAID configuration should I use for optimal performance?
A:
Is the goal to maximize throughput, or to minimize latency?
There is no easy answer, as there are many factors that
affect performance:
- operating system - will one process/thread, or many
be performing disk access?
- application - is it accessing data in a
sequential fashion, or random access?
- file system - clusters files or spreads them out
(the ext2fs clusters together the blocks of a file,
and spreads out files)
- disk driver - number of blocks to read ahead
(this is a tunable parameter)
- CEC hardware - one drive controller, or many?
- hd controller - able to queue multiple requests or not?
Does it provide a cache?
- hard drive - buffer cache memory size -- is it big
enough to handle the write sizes and rate you want?
- physical platters - blocks per cylinder -- accessing
blocks on different cylinders will lead to seeks.
- Q:
What is the optimal RAID-5 configuration for performance?
A:
Since RAID-5 experiences an I/O load that is equally
distributed
across several drives, the best performance will be
obtained when the RAID set is balanced by using
identical drives, identical controllers, and the
same (low) number of drives on each controller.
Note, however, that using identical components will
raise the probability of multiple simultaneous failures,
for example due to a sudden jolt or drop, overheating,
or a power surge during an electrical storm. Mixing
brands and models helps reduce this risk.
- Q:
What is the optimal block size for a RAID-4/5 array?
A:
When using the current (November 1997) RAID-4/5
implementation, it is strongly recommended that
the file system be created with mke2fs -b 4096
instead of the default 1024 byte filesystem block size.
This is because the current RAID-5 implementation
allocates one 4K memory page per disk block;
if a disk block were just 1K in size, then
75% of the memory which RAID-5 is allocating for
pending I/O would not be used. If the disk block
size matches the memory page size, then the
driver can (potentially) use all of the page.
Thus, for a filesystem with a 4096 block size as
opposed to a 1024 byte block size, the RAID driver
will potentially queue 4 times as much
pending I/O to the low level drivers without
allocating additional memory.
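The 75% and "4 times" figures follow directly from the ratio of block size to page size. A small sketch of that arithmetic, assuming the 4K x86 page size mentioned above (the function name page_utilisation is made up for this example):

```python
PAGE_SIZE = 4096  # bytes; x86 page size, an assumption for other CPUs


def page_utilisation(block_size, page_size=PAGE_SIZE):
    """Fraction of a page that carries data when one page is allocated per block."""
    return block_size / page_size


for block_size in (1024, 2048, 4096):
    used = page_utilisation(block_size)
    print(f"{block_size}-byte blocks: {used:.0%} of each page used, {1 - used:.0%} idle")
```

With 1024-byte blocks only a quarter of each page is used, so the same amount of pre-allocated memory can hold four times as much pending I/O with 4096-byte blocks.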
Note: the above remarks do NOT apply to the Software
RAID-0/1/linear drivers.
Note: the statements about 4K memory page size apply to the
Intel x86 architecture. The page size on Alpha, Sparc, and other
CPUs is different; I believe it is 8K on Alpha/Sparc (????).
Adjust the above figures accordingly.
Note: if your file system has a lot of small
files (files less than 10KBytes in size), a considerable
fraction of the disk space might be wasted. This is
because the file system allocates disk space in multiples
of the block size. Allocating large blocks for small files
clearly results in a waste of disk space: thus, you may
want to stick to small block sizes, get a larger effective
storage capacity, and not worry about the "wasted" memory
due to the block-size/page-size mismatch.
Note: most ''typical'' systems do not have that many
small files. That is, although there might be thousands
of small files, this would lead to only some 10 to 100MB
wasted space, which is probably an acceptable tradeoff for
performance on a multi-gigabyte disk.
However, for news servers, there might be tens or hundreds
of thousands of small files. In such cases, the smaller
block size, and thus the improved storage capacity,
may be more important than the more efficient I/O
scheduling.
Note: there exists an experimental file system for Linux
which packs small files and file chunks onto a single block.
It apparently has some very positive performance
implications when the average file size is much smaller than
the block size.
Note: Future versions may implement schemes that obsolete
the above discussion. However, this is difficult to
implement, since dynamic run-time allocation can lead to
dead-locks; the current implementation performs a static
pre-allocation.
- Q:
How does the chunk size (stripe size) influence the speed of
my RAID-0, RAID-4 or RAID-5 device?
A:
The chunk size is the amount of data contiguous on the
virtual device that is also contiguous on the physical
device. In this HOWTO, "chunk" and "stripe" refer to
the same thing: what is commonly called the "stripe"
in other RAID documentation is called the "chunk"
in the MD man pages. Stripes or chunks apply only to
RAID 0, 4 and 5, since stripes are not used in
mirroring (RAID-1) and simple concatenation (RAID-linear).
The stripe size affects both read and write latency (delay),
throughput (bandwidth), and contention between independent
operations (ability to simultaneously service overlapping I/O
requests).
Assuming the use of the ext2fs file system, and the current
kernel policies about read-ahead, large stripe sizes are almost
always better than small stripe sizes, and stripe sizes
from about a fourth to a full disk cylinder in size
may be best. To understand this claim, let us consider the
effects of large stripes on small files, and small stripes
on large files. The stripe size does
not affect the read performance of small files: For an
array of N drives, the file has a 1/N probability of
being entirely within one stripe on any one of the drives.
Thus, both the read latency and bandwidth will be comparable
to that of a single drive. Assuming that the small files
are statistically well distributed around the filesystem,
(and, with the ext2fs file system, they should be), roughly
N times more overlapping, concurrent reads should be possible
without significant collision between them. Conversely, if
very small stripes are used, and a large file is read sequentially,
then a read will be issued to all of the disks in the array.
For the read of a single large file, the latency will almost
double, as the probability of a block being 3/4'ths of a
revolution or farther away will increase. Note, however,
the trade-off: the bandwidth could improve almost N-fold
for reading a single, large file, as N drives can be reading
simultaneously (that is, if read-ahead is used so that all
of the disks are kept active). But there is another,
counter-acting trade-off: if all of the drives are already busy
reading one file, then attempting to read a second or third
file at the same time will cause significant contention,
ruining performance as the disk ladder algorithms lead to
seeks all over the platter. Thus, large stripes will almost
always lead to the best performance. The sole exception is
the case where one is streaming a single, large file at a
time, and one requires the top possible bandwidth, and one
is also using a good read-ahead algorithm, in which case small
stripes are desired.
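To make the trade-off concrete, the following sketch counts how many disks a single read touches for a given chunk size. It assumes a plain round-robin layout over equal-sized disks and a file stored contiguously from the given offset; it is an illustration, not the MD driver's mapping code, and the function name disks_touched is hypothetical.

```python
def disks_touched(offset_kb, length_kb, chunk_kb, n_disks):
    """Return the sorted list of disks holding any part of a contiguous read."""
    first_chunk = offset_kb // chunk_kb
    last_chunk = (offset_kb + length_kb - 1) // chunk_kb
    # Chunks are assumed to be dealt out to the disks in round-robin order.
    return sorted({chunk % n_disks for chunk in range(first_chunk, last_chunk + 1)})


# A 16KB file on a 4-disk array.
print(disks_touched(0, 16, 256, 4))  # large chunks
print(disks_touched(0, 16, 4, 4))    # small chunks
```

With 256KB chunks the 16KB read stays on one disk, leaving the other three free for concurrent requests; with 4KB chunks the same read occupies all four disks.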
Note that this HOWTO previously recommended small stripe
sizes for news spools or other systems with lots of small
files. This was bad advice, and here's why: news spools
contain not only many small files, but also large summary
files, as well as large directories. If the summary file
is larger than the stripe size, reading it will cause
many disks to be accessed, slowing things down as each
disk performs a seek. Similarly, the current ext2fs
file system searches directories in a linear, sequential
fashion. Thus, to find a given file or inode, on average
half of the directory will be read. If this directory is
spread across several stripes (several disks), the
directory read (e.g. due to the ls command) could get
very slow. Thanks to Steven A. Reisman
<sar@pressenter.com> for this correction.
Steve also adds:
I found that using a 256k stripe gives much better performance.
I suspect that the optimum size would be the size of a disk
cylinder (or maybe the size of the disk drive's sector cache).
However, disks nowadays have recording zones with different
sector counts (and sector caches vary among different disk
models). There's no way to guarantee stripes won't cross a
cylinder boundary.
The tools accept the stripe size specified in KBytes.
You'll want to specify a multiple of the page size
for your CPU (4KB on the x86).
- Q:
What is the correct stride factor to use when creating the
ext2fs file system on the RAID partition? By stride, I mean
the -R flag on the
mke2fs command:
mke2fs -b 4096 -R stride=nnn ...
What should the value of nnn be?
A:
The -R stride flag is used to tell the file system
about the size of the RAID stripes. Since only RAID-0,4 and 5
use stripes, and RAID-1 (mirroring) and RAID-linear do not,
this flag is applicable only for RAID-0,4,5.
Knowledge of the size of a stripe allows mke2fs
to allocate the block and inode bitmaps so that they don't
all end up on the same physical drive. An unknown contributor
wrote:
I noticed last spring that one drive in a pair always had a
larger I/O count, and tracked it down to these meta-data
blocks. Ted added the -R stride= option in response
to my explanation and request for a workaround.
For a 4KB block file system, with stripe size 256KB, one would
use -R stride=64 .
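The stride value is simply the stripe (chunk) size divided by the filesystem block size. A small sketch of that calculation follows; the helper name stride is made up for illustration, and the printed command keeps the elided arguments of the example above as-is.

```python
def stride(chunk_kb, block_bytes):
    """Number of filesystem blocks per RAID chunk."""
    chunk_bytes = chunk_kb * 1024
    if chunk_bytes % block_bytes:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_bytes // block_bytes


block_bytes = 4096
nnn = stride(256, block_bytes)
print(f"mke2fs -b {block_bytes} -R stride={nnn} ...")
```

For a 256KB stripe and 4096-byte blocks this gives the stride=64 quoted above.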
If you don't trust the -R flag, you can get a similar
effect in a different way. Steven A. Reisman
<sar@pressenter.com> writes:
Another consideration is the filesystem used on the RAID-0 device.
The ext2 filesystem allocates 8192 blocks per group. Each group
has its own set of inodes. If there are 2, 4 or 8 drives, these
inodes cluster on the first disk. I've distributed the inodes
across all drives by telling mke2fs to allocate only 7932 blocks
per group.
Some mke2fs pages do not describe the [-g blocks-per-group]
flag used in this operation.
- Q:
Where can I put the
md commands in the startup scripts,
so that everything will start automatically at boot time?
A:
Rod Wilkens <rwilkens@border.net> writes:
What I did is put ``mdadd -ar '' in
the ``/etc/rc.d/rc.sysinit '' right after the kernel
loads the modules, and before the ``fsck '' disk check.
This way, you can put the ``/dev/md? '' device in the
``/etc/fstab ''. Then I put the ``mdstop -a ''
right after the ``umount -a '' unmounting the disks,
in the ``/etc/rc.d/init.d/halt '' file.
For raid-5, you will want to look at the return code
for mdadd , and if it failed, do a
ckraid --fix /etc/raid5.conf
to repair any damage.
- Q:
I was wondering if it's possible to setup striping with more
than 2 devices in
md0 ? This is for a news server,
and I have 9 drives... Needless to say I need much more than two.
Is this possible?
A:
Yes. (describe how to do this)
- Q:
When is Software RAID superior to Hardware RAID?
A:
Normally, Hardware RAID is considered superior to Software
RAID, because hardware controllers often have a large cache,
and can do a better job of scheduling operations in parallel.
However, integrated Software RAID can (and does) gain certain
advantages from being close to the operating system.
For example, ... ummm. Opaque description of caching of
reconstructed blocks in buffer cache elided ...
On a dual PPro SMP system, it has been reported that
Software-RAID performance exceeds the performance of a
well-known hardware-RAID board vendor by a factor of
2 to 5.
Software RAID is also a very interesting option for
high-availability redundant server systems. In such
a configuration, two CPUs are attached to one set
of SCSI disks. If one server crashes or fails to
respond, then the other server can mdadd ,
mdrun and mount the software RAID
array, and take over operations. This sort of dual-ended
operation is not always possible with many hardware
RAID controllers, because of the state configuration that
the hardware controllers maintain.
- Q:
If I upgrade my version of raidtools, will it have trouble
manipulating older raid arrays? In short, should I recreate my
RAID arrays when upgrading the raid utilities?
A:
No, not unless the major version number changes.
An MD version x.y.z consists of three sub-versions:
x: Major version.
y: Minor version.
z: Patchlevel version.
Version x1.y1.z1 of the RAID driver supports a RAID array with
version x2.y2.z2 in case (x1 == x2) and (y1 >= y2).
Different patchlevel (z) versions for the same (x.y) version are
designed to be mostly compatible.
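The compatibility rule can be written as a one-line check. This is a sketch of the rule as stated above, not code from the driver; the function name and the version numbers used in the example are hypothetical.

```python
def driver_supports(driver_version, array_version):
    """True if driver x1.y1.z1 supports an array created as x2.y2.z2."""
    x1, y1, _z1 = driver_version
    x2, y2, _z2 = array_version
    # Same major version, and the driver's minor version is not older.
    return x1 == x2 and y1 >= y2


# Hypothetical version numbers, for illustration only.
print(driver_supports((0, 36, 3), (0, 35, 1)))  # newer minor version: supported
print(driver_supports((0, 35, 1), (0, 36, 3)))  # older minor version: not supported
print(driver_supports((1, 0, 0), (0, 36, 3)))   # different major version: not supported
```

The patchlevel (z) is deliberately ignored, matching the statement that patchlevels within the same x.y version are designed to be mostly compatible.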
The minor version number is increased whenever the RAID array layout
is changed in a way which is incompatible with older versions of the
driver. New versions of the driver will maintain compatibility with
older RAID arrays.
The major version number will be increased if it will no longer make
sense to support old RAID arrays in the new kernel code.
For RAID-1, it's not likely that the disk layout or the
superblock structure will change anytime soon. Most
optimizations and new features (reconstruction, multithreaded
tools, hot-plug, etc.) do not affect the physical layout.
- Q:
The command
mdstop /dev/md0 says that the device is busy.
A:
There's a process that has a file open on /dev/md0 , or
/dev/md0 is still mounted. Terminate the process or
umount /dev/md0 .
- Q:
Are there performance tools?
A:
There is a new utility called iotrace in the
linux/iotrace
directory. It reads /proc/io-trace and analyses/plots its
output. If you feel your system's block IO performance is too
low, just look at the iotrace output.
- Q:
I was reading the RAID source, and saw the value
SPEED_LIMIT defined as 1024K/sec. What does this mean?
Does this limit performance?
A:
SPEED_LIMIT is used to limit RAID reconstruction
speed during automatic reconstruction. Basically, automatic
reconstruction allows you to e2fsck and
mount immediately after an unclean shutdown,
without first running ckraid . Automatic
reconstruction is also used after a failed hard drive
has been replaced.
In order to avoid overwhelming the system while
reconstruction is occurring, the reconstruction thread
monitors the reconstruction speed and slows it down if
it's too fast. The 1M/sec limit was arbitrarily chosen
as a reasonable rate which allows the reconstruction to
finish reasonably rapidly, while creating only a light load
on the system so that other processes are not interfered with.
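As a rough feel for what the limit implies, the sketch below estimates the shortest possible reconstruction time at 1024K/sec. It assumes the limit applies to the amount of data that has to be rebuilt on one disk, which the text above does not state explicitly, and the 4300MB figure is only an illustrative disk size.

```python
SPEED_LIMIT_KB_PER_SEC = 1024  # the SPEED_LIMIT value quoted in the question


def min_reconstruction_minutes(size_mb, limit_kb_per_sec=SPEED_LIMIT_KB_PER_SEC):
    """Lower bound on reconstruction time if the rate never exceeds the limit."""
    seconds = size_mb * 1024 / limit_kb_per_sec
    return seconds / 60


print(f"{min_reconstruction_minutes(4300):.0f} minutes for 4300MB")
```

Under these assumptions a disk of a few gigabytes takes on the order of an hour to rebuild, which is the price paid for keeping the load on the rest of the system light.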
- Q:
What about ''spindle synchronization'' or ''disk
synchronization''?
A:
Spindle synchronization is used to keep multiple hard drives
spinning at exactly the same speed, so that their disk
platters are always perfectly aligned. This is used by some
hardware controllers to better organize disk writes.
However, for software RAID, this information is not used,
and spindle synchronization might even hurt performance.
- Q:
How can I set up swap spaces using RAID-0?
Wouldn't striped swap areas over 4+ drives be really fast?
A:
Leonard N. Zubkoff replies:
It is really fast, but you don't need to use MD to get striped
swap. The kernel automatically stripes across equal priority
swap spaces. For example, the following entries from
/etc/fstab stripe swap space across five drives in
three groups:
/dev/sdg1 swap swap pri=3
/dev/sdk1 swap swap pri=3
/dev/sdd1 swap swap pri=3
/dev/sdh1 swap swap pri=3
/dev/sdl1 swap swap pri=3
/dev/sdg2 swap swap pri=2
/dev/sdk2 swap swap pri=2
/dev/sdd2 swap swap pri=2
/dev/sdh2 swap swap pri=2
/dev/sdl2 swap swap pri=2
/dev/sdg3 swap swap pri=1
/dev/sdk3 swap swap pri=1
/dev/sdd3 swap swap pri=1
/dev/sdh3 swap swap pri=1
/dev/sdl3 swap swap pri=1
- Q:
I want to maximize performance. Should I use multiple
controllers?
A:
In many cases, the answer is yes. Using several
controllers to perform disk access in parallel will
improve performance. However, the actual improvement
depends on your actual configuration. For example,
it has been reported (Vaughan Pratt, January 98) that
a single 4.3GB Cheetah attached to an Adaptec 2940UW
can achieve a rate of 14MB/sec (without using RAID).
Installing two disks on one controller, and using
a RAID-0 configuration results in a measured performance
of 27 MB/sec.
Note that the 2940UW controller is an "Ultra-Wide"
SCSI controller, capable of a theoretical burst rate
of 40MB/sec, and so the above measurements are not
surprising. However, a slower controller attached
to two fast disks would be the bottleneck. Note also,
that most out-board SCSI enclosures (e.g. the kind
with hot-pluggable trays) cannot be run at the 40MB/sec
rate, due to cabling and electrical noise problems.
If you are designing a multiple controller system,
remember that most disks and controllers typically
run at 70-85% of their rated max speeds.
Note also that using one controller per disk
can reduce the likelihood of system outage
due to a controller or cable failure (In theory --
only if the device driver for the controller can
gracefully handle a broken controller. Not all
SCSI device drivers seem to be able to handle such
a situation without panicking or otherwise locking up).
- Q:
RAID can help protect me against data loss. But how can I also
ensure that the system is up as long as possible, and not prone
to breakdown? Ideally, I want a system that is up 24 hours a
day, 7 days a week, 365 days a year.
A:
High-Availability is difficult and expensive. The harder
you try to make a system be fault tolerant, the harder
and more expensive it gets. The following hints, tips,
ideas and unsubstantiated rumors may help you with this
quest.
- IDE disks can fail in such a way that the failed disk
on an IDE ribbon can also prevent the good disk on the
same ribbon from responding, thus making it look as
if two disks have failed. Since RAID does not
protect against two-disk failures, one should either
put only one disk on an IDE cable, or if there are two
disks, they should belong to different RAID sets.
- SCSI disks can fail in such a way that the failed disk
on a SCSI chain can prevent any device on the chain
from being accessed. The failure mode involves a
short of the common (shared) device ready pin;
since this pin is shared, no arbitration can occur
until the short is removed. Thus, no two disks on the
same SCSI chain should belong to the same RAID array.
- Similar remarks apply to the disk controllers.
Don't load up the channels on one controller; use
multiple controllers.
- Don't use the same brand or model number for all of
the disks. It is not uncommon for severe electrical
storms to take out two or more disks. (Yes, we
all use surge suppressors, but these are not perfect
either). Heat & poor ventilation of the disk
enclosure are other disk killers. Cheap disks
often run hot.
Using different brands of disk & controller
decreases the likelihood that whatever took out one disk
(heat, physical shock, vibration, electrical surge)
will also damage the others on the same date.
- To guard against controller or CPU failure,
it should be possible to build a SCSI disk enclosure
that is "twin-tailed": i.e. is connected to two
computers. One computer will mount the file-systems
read-write, while the second computer will mount them
read-only, and act as a hot spare. When the hot-spare
is able to determine that the master has failed (e.g.
through a watchdog), it will cut the power to the
master (to make sure that it's really off), and then
fsck & remount read-write. If anyone gets
this working, let me know.
- Always use a UPS, and perform clean shutdowns.
Although an unclean shutdown may not damage the disks,
running ckraid on even small-ish arrays is painfully
slow. You want to avoid running ckraid as much as
possible. Or you can hack on the kernel and get the
hot-reconstruction code debugged ...
- SCSI cables are well-known to be very temperamental
creatures, and prone to cause all sorts of problems.
Use the highest quality cabling that you can find for
sale. Use e.g. bubble-wrap to make sure that ribbon
cables do not get too close to one another and
cross-talk. Rigorously observe cable-length
restrictions.
- Take a look at SSA (Serial Storage Architecture).
Although it is rather expensive, it is rumored
to be less prone to the failure modes that SCSI
exhibits.
- Enjoy yourself, it's later than you think.
- Q:
If, for cost reasons, I try to mirror a slow disk with a fast disk,
is the S/W smart enough to balance the reads accordingly or will it
all slow down to the speed of the slowest?
- Q:
For testing the raw disk throughput:
is there a character device for raw reads/raw writes, instead of
/dev/sdaxx, that we can use to measure performance
on the RAID drives?
Is there a GUI-based tool to use to watch the disk throughput?
Bradley Ward Allen <ulmo@Q.Net> wrote:
Ideas include:
- Boot-up parameters to tell the kernel which devices are
to be MD devices (no more ``mdadd'')
- Making MD transparent to ``mount''/``umount''
such that there is no ``mdrun'' and ``mdstop''
- Integrating ``ckraid'' entirely into the kernel,
and letting it run as needed
(So far, all I've done is suggest getting rid of the tools and putting
them into the kernel; that's how I feel about it,
this is a filesystem, not a toy.)
- Deal with arrays that can easily survive N disks going out
simultaneously or at separate moments,
where N is a whole number > 0 settable by the administrator
- Handle kernel freezes, power outages,
and other abrupt shutdowns better
- Don't disable a whole disk if only parts of it have failed,
e.g., if the sector errors are confined to less than 50% of
access over the attempts of 20 dissimilar requests,
then it continues just ignoring those sectors of that particular
disk.
- Bad sectors:
- A mechanism for saving which sectors are bad,
someplace onto the disk.
- If there is a generalized mechanism for marking degraded
bad blocks that upper filesystem levels can recognize,
use that. Program it if not.
- Perhaps alternatively a mechanism for telling the upper
layer that the size of the disk got smaller,
even arranging for the upper layer to move out stuff from
the areas being eliminated.
This would help with degraded blocks as well.
- Failing the above ideas, keeping a small (admin settable)
amount of space aside for bad blocks (distributed evenly
across disk?), and using them (nearby if possible)
instead of the bad blocks when it does happen.
Of course, this is inefficient.
Furthermore, the kernel ought to log every time the RAID
array starts each bad sector and what is being done about
it with a ``crit'' level warning, just to get
the administrator to realize that his disk has a piece of
dust burrowing into it (or a head with platter sickness).
- Software-switchable disks:
- ``disable this disk''
-
would block until kernel has completed making sure
there is no data on the disk being shut down
that is needed (e.g., to complete an XOR/ECC/other error
correction), then release the disk from use
(so it could be removed, etc.);
- ``enable this disk''
-
would mkraid a new disk if appropriate
and then start using it for ECC/whatever operations,
enlarging the RAID5 array as it goes;
- ``resize array''
-
would respecify the total number of disks
and the number of redundant disks, and the result
would often be to resize the size of the array;
where no data loss would result,
doing this as needed would be nice,
but I have a hard time figuring out how it would do that;
in any case, a mode where it would block
(for possibly hours (kernel ought to log something every
ten seconds if so)) would be necessary;
- ``enable this disk while saving data''
-
which would save the data on a disk as-is and move it
to the RAID5 system as needed, so that a horrific save
and restore would not have to happen every time someone
brings up a RAID5 system (instead, it may be simpler to
only save one partition instead of two,
it might fit onto the first as a gzip'd file even);
finally,
- ``re-enable disk''
-
would be an operator's hint to the OS to try out
a previously failed disk (it would simply call disable
then enable, I suppose).
Other ideas off the net:
- finalrd analog to initrd, to simplify root raid.
- a read-only raid mode, to simplify the above
- Mark the RAID set as clean whenever there are no
"half writes" done. -- That is, whenever there are no write
transactions that were committed on one disk but still
unfinished on another disk.
Add a "write inactivity" timeout (to avoid frequent seeks
to the RAID superblock when the RAID set is relatively
busy).