Commit Graph

86 Commits

Author SHA1 Message Date
NeilBrown
4f2e639af4 [PATCH] md: endian annotations for the bitmap superblock
And a couple of bug fixes found by sparse.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
NeilBrown
1c05b4bc22 [PATCH] md: endian annotation for v1 superblock access
Includes a couple of bugfixes found by sparse.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
NeilBrown
e8703fe1f5 [PATCH] md: remove MAX_MD_DEVS which is an arbitrary limit
Once upon a time we needed to fixed limit to the number of md devices,
probably because we preallocated some array.  This need no longer exists, but
we still have an arbitrary limit.

So remove MAX_MD_DEVS and allow as many devices as we can fit into the 'minor'
part of a device number.

Also remove some useless noise at init time (which reports MAX_MD_DEVS) and
remove MD_THREAD_NAME_MAX which hasn't been used for a while.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:18 -07:00
NeilBrown
11ce99e625 [PATCH] md: Remove working_disks from raid1 state data
It is equivalent to conf->raid_disks - conf->mddev->degraded.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
Paul Clements
9b1d1dac18 [PATCH] md: new sysfs interface for setting bits in the write-intent-bitmap
Add a new sysfs interface that allows the bitmap of an array to be dirtied.
The interface is write-only, and is used as follows:

echo "1000" > /sys/block/md2/md/bitmap

(dirty the bit for chunk 1000 [offset 0] in the in-memory and on-disk
bitmaps of array md2)

echo "1000-2000" > /sys/block/md1/md/bitmap

(dirty the bits for chunks 1000-2000 in md1's bitmap)

This is useful, for example, in cluster environments where you may need to
combine two disjoint bitmaps into one (following a server failure, after a
secondary server has taken over the array).  By combining the bitmaps on
the two servers, a full resync can be avoided (This was discussed on the
list back on March 18, 2005, "[PATCH 1/2] md bitmap bug fixes" thread).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
NeilBrown
76186dd8b7 [PATCH] md: remove 'working_disks' from raid10 state
It isn't needed as mddev->degraded contains equivalent info.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
NeilBrown
02c2de8cc8 [PATCH] md: remove the working_disks and failed_disks from raid5 state data.
They are not needed.  conf->failed_disks is the same as mddev->degraded and
conf->working_disks is conf->raid_disks - mddev->degraded.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
NeilBrown
850b2b420c [PATCH] md: replace magic numbers in sb_dirty with well defined bit flags
Instead of magic numbers (0,1,2,3) in sb_dirty, we have
some flags instead:
MD_CHANGE_DEVS
   Some device state has changed requiring superblock update
   on all devices.
MD_CHANGE_CLEAN
   The array has transitions from 'clean' to 'dirty' or back,
   requiring a superblock update on active devices, but possibly
   not on spares
MD_CHANGE_PENDING
   A superblock update is underway.

We wait for an update to complete by waiting for all flags to be clear.  A
flag can be set at any time, even during an update, without risk that the
change will be lost.

Stop exporting md_update_sb - isn't needed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
NeilBrown
b5c124af69 [PATCH] md: fix a comment that is wrong in raid5.h
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:16 -07:00
Adrian Bunk
fbedac04fa [PATCH] md: the scheduled removal of the START_ARRAY ioctl for md
This patch contains the scheduled removal of the START_ARRAY ioctl for md.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:16 -07:00
David Howells
9361401eb7 [PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer.  Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.

This patch does the following:

 (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
     support.

 (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
     an item that uses the block layer.  This includes:

     (*) Block I/O tracing.

     (*) Disk partition code.

     (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

     (*) The SCSI layer.  As far as I can tell, even SCSI chardevs use the
     	 block layer to do scheduling.  Some drivers that use SCSI facilities -
     	 such as USB storage - end up disabled indirectly from this.

     (*) Various block-based device drivers, such as IDE and the old CDROM
     	 drivers.

     (*) MTD blockdev handling and FTL.

     (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
     	 taking a leaf out of JFFS2's book.

 (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
     linux/elevator.h contingent on CONFIG_BLOCK being set.  sector_div() is,
     however, still used in places, and so is still available.

 (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
     parts of linux/fs.h.

 (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

 (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

 (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
     is not enabled.

 (*) fs/no-block.c is created to hold out-of-line stubs and things that are
     required when CONFIG_BLOCK is not set:

     (*) Default blockdev file operations (to give error ENODEV on opening).

 (*) Makes some /proc changes:

     (*) /proc/devices does not list any blockdevs.

     (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

 (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

 (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
     given command other than Q_SYNC or if a special device is specified.

 (*) In init/do_mounts.c, no reference is made to the blockdev routines if
     CONFIG_BLOCK is not defined.  This does not prohibit NFS roots or JFFS2.

 (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
     error ENOSYS by way of cond_syscall if so).

 (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
     CONFIG_BLOCK is not set, since they can't then happen.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30 20:52:31 +02:00
David Woodhouse
fadcfa33b6 [HEADERS] One line per header in Kbuild files to reduce conflicts
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-09-19 12:43:58 +01:00
NeilBrown
ff4e8d9a9f [PATCH] md: fix resync speed calculation for restarted resyncs
We introduced 'io_sectors' recently so we could count the sectors that causes
io during resync separate from sectors which didn't cause IO - there can be a
difference if a bitmap is being used to accelerate resync.

However when a speed is reported, we find the number of sectors processed
recently by subtracting an oldish io_sectors count from a current
'curr_resync' count.  This is wrong because curr_resync counts all sectors,
not just io sectors.

So, add a field to mddev to store the curren io_sectors separately from
curr_resync, and use that in the calculations.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:16 -07:00
Linus Torvalds
6fa0cb1141 Merge git://git.infradead.org/hdrinstall-2.6
* git://git.infradead.org/hdrinstall-2.6:
  Remove export of include/linux/isdn/tpam.h
  Remove <linux/i2c-id.h> and <linux/i2c-algo-ite.h> from userspace export
  Restrict headers exported to userspace for SPARC and SPARC64
  Add empty Kbuild files for 'make headers_install' in remaining arches.
  Add Kbuild file for Alpha 'make headers_install'
  Add Kbuild file for SPARC 'make headers_install'
  Add Kbuild file for IA64 'make headers_install'
  Add Kbuild file for S390 'make headers_install'
  Add Kbuild file for i386 'make headers_install'
  Add Kbuild file for x86_64 'make headers_install'
  Add Kbuild file for PowerPC 'make headers_install'
  Add generic Kbuild files for 'make headers_install'
  Basic implementation of 'make headers_check'
  Basic implementation of 'make headers_install'
2006-07-04 12:55:45 -07:00
NeilBrown
4254376914 [PATCH] md: Don't write dirty/clean update to spares - leave them alone
- record the 'event' count on each individual device (they
  might sometimes be slightly different now)
- add a new value for 'sb_dirty': '3' means that the super
  block only needs to be updated to record a clean<->dirty
  transition.
- Prefer odd event numbers for dirty states and even numbers
  for clean states
- Using all the above, don't update the superblock on
  a spare device if the update is just doing a clean-dirty
  transition.  To accomodate this, a transition from
  dirty back to clean might now decrement the events counter
  if nothing else has changed.

The net effect of this is that spare drives will not see any IO requests
during normal running of the array, so they can go to sleep if that is what
they want to do.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:39 -07:00
NeilBrown
d785a06a0b [PATCH] md/bitmap: change md/bitmap file handling to use bmap to file blocks
If md is asked to store a bitmap in a file, it tries to hold onto the page
cache pages for that file, manipulate them directly, and call a cocktail of
operations to write the file out.  I don't believe this is a supportable
approach.

This patch changes the approach to use the same approach as swap files.  i.e.
bmap is used to enumerate all the block address of parts of the file and we
write directly to those blocks of the device.

swapfile only uses parts of the file that provide a full pages at contiguous
addresses.  We don't have that luxury so we have to cope with pages that are
non-contiguous in storage.  To handle this we attach buffers to each page, and
store the addresses in those buffers.

With this approach the pagecache may contain data which is inconsistent with
what is on disk.  To alleviate the problems this can cause, md invalidates the
pagecache when releasing the file.  If the file is to be examined while the
array is active (a non-critical but occasionally useful function), O_DIRECT io
must be used.  And new version of mdadm will have support for this.

This approach simplifies a lot of code:
 - we no longer need to keep a list of pages which we need to wait for,
   as the b_endio function can keep track of how many outstanding
   writes there are.  This saves a mempool.
 - -EAGAIN returns from write_page are no longer possible (not sure if
    they ever were actually).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
NeilBrown
0b79ccf0cd [PATCH] md/bitmap: remove bitmap writeback daemon
md/bitmap currently has a separate thread to wait for writes to the bitmap
file to complete (as we cannot get a callback on that action).

However this isn't needed as bitmap_unplug is called from process context and
waits for the writeback thread to do it's work.  The same result can be
achieved by doing the waiting directly in bitmap_unplug.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
Adrian Bunk
5e56341d02 [PATCH] md: make md_print_devices() static
This patch makes the needlessly global md_print_devices() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
c93983bf51 [PATCH] md: support stripe/offset mode in raid10
The "industry standard" DDF format allows for a stripe/offset layout where
data is duplicated on different stripes.  e.g.

  A  B  C  D
  D  A  B  C
  E  F  G  H
  H  E  F  G

(columns are drives, rows are stripes, LETTERS are chunks of data).

This is similar to raid10's 'far' mode, but not quite the same.  So enhance
'far' mode with a 'far/offset' option which follows the layout of DDFs
stripe/offset.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
7c7546ccf6 [PATCH] md: allow a linear array to have drives added while active
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
5fd6c1dce0 [PATCH] md: allow checkpoint of recovery with version-1 superblock
For a while we have had checkpointing of resync.  The version-1 superblock
allows recovery to be checkpointed as well, and this patch implements that.

Due to early carelessness we need to add a feature flag to signal that the
recovery_offset field is in use, otherwise older kernels would assume that a
partially recovered array is in fact fully recovered.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
16a53ecc35 [PATCH] md: merge raid5 and raid6 code
There is a lot of commonality between raid5.c and raid6main.c.  This patches
merges both into one module called raid456.  This saves a lot of code, and
paves the way for online raid5->raid6 migrations.

There is still duplication, e.g.  between handle_stripe5 and handle_stripe6.
This will probably be cleaned up later.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
8932c2e0dc [PATCH] md: remove arbitrary limit on chunk size
The largest chunk size the code can support without substantial surgery is
2^30 bytes, so make that the limit instead of an arbitrary 4Meg.  Some day,
the 'chunksize' should change to a sector-shift instead of a byte-count.  Then
no limit would be needed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:36 -07:00
David Woodhouse
8555255f0b Add generic Kbuild files for 'make headers_install'
This adds the Kbuild files listing the files which are to be installed by
the 'headers_install' make target, in generic directories.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 12:14:01 +01:00
NeilBrown
6f91fe88e4 [PATCH] md: make sure 64bit fields in version-1 metadata are 64-bit aligned
reshape_position is a 64bit field that was not 64bit aligned.  So swap with
new_level.

NOTE: this is a user-visible change.  However:
  - The bad code has not appeared in a released kernel
  - This code is still marked 'experimental'
  - This only affects version-1 superblock, which are not in wide use
  - These field are only used (rather than simply reported) by user-space
    tools in extemely rare circumstances : after a reshape crashes in the
    first second of the reshape process.

So I believe that, at this stage, the change is safe.  Especially if people
heed the 'help' message on use mdadm-2.4.1.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:30 -07:00