Commit Graph

70 Commits

Author SHA1 Message Date
NeilBrown
7be3dfec47 md: reduce CPU wastage on idle md array with a write-intent bitmap
Recent patch titled
  Reduce CPU wastage on idle md array with a write-intent bitmap.

would sometimes leave the array with dirty bitmap bits that stay dirty.  A
subsequent write would sort things out so it isn't a big problem, but should
be fixed nonetheless.

We need to make sure that when the bitmap becomes not "allclean", the
daemon_sleep really does get set to a sensible value.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-10 18:01:19 -07:00
NeilBrown
8311c29d40 md: reduce CPU wastage on idle md array with a write-intent bitmap
On an md array with a write-intent bitmap, a thread wakes up every few seconds
and scans the bitmap looking for work to do.  If the array is idle, there will
be no work to do, but a lot of scanning is done to discover this.

So cache the fact that the bitmap is completely clean, and avoid scanning the
whole bitmap when the cache is known to be clean.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:17 -08:00
Jan Blunck
cf28b4863f d_path: Make d_path() use a struct path
d_path() is used on a <dentry,vfsmount> pair.  Lets use a struct path to
reflect this.

[akpm@linux-foundation.org: fix build in mm/memory.c]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Bryan Wu <bryan.wu@analog.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:09 -08:00
NeilBrown
d089c6af10 md: change ITERATE_RDEV to rdev_for_each
As this is more in line with common practice in the kernel.  Also swap the
args around to be more like list_for_each.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:19 -08:00
NeilBrown
b47490c9bc md: Update md bitmap during resync.
Currently an md array with a write-intent bitmap does not updated that bitmap
to reflect successful partial resync.  Rather the entire bitmap is updated
when the resync completes.

This is because there is no guarentee that resync requests will complete in
order, and tracking each request individually is unnecessarily burdensome.

However there is value in regularly updating the bitmap, so add code to
periodically pause while all pending sync requests complete, then update the
bitmap.  Doing this only every few seconds (the same as the bitmap update
time) does not notciably affect resync performance.

[snitzer@gmail.com: export bitmap_cond_end_sync]
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: "Mike Snitzer" <snitzer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:18 -08:00
Alan D. Brunelle
2ad8b1ef11 Add UNPLUG traces to all appropriate places
Added blk_unplug interface, allowing all invocations of unplugs to result
in a generated blktrace UNPLUG.

Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-09 13:41:32 +01:00
NeilBrown
85bfb4da8c md: fix an unsigned compare to allow creation of bitmaps with v1.0 metadata
As page->index is unsigned, this all becomes an unsigned comparison,
which almost always returns an error.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Stable <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-23 08:32:06 -07:00
NeilBrown
4ad1366376 md: change bitmap_unplug and others to void functions
bitmap_unplug only ever returns 0, so it may as well be void.  Two callers try
to print a message if it returns non-zero, but that message is already printed
by bitmap_file_kick.

write_page returns an error which is not consistently checked.  It always
causes BITMAP_WRITE_ERROR to be set on an error, and that can more
conveniently be checked.

When the return of write_page is checked, an error causes bitmap_file_kick to
be called - so move that call into write_page - and protect against recursive
calls into bitmap_file_kick.

bitmap_update_sb returns an error that is never checked.

So make these 'void' and be consistent about checking the bit.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:15 -07:00
NeilBrown
f0d76d70bc md: check that internal bitmap does not overlap other data
We current completely trust user-space to set up metadata describing an
consistant array.  In particlar, that the metadata, data, and bitmap do not
overlap.

But userspace can be buggy, and it is better to report an error than corrupt
data.  So put in some appropriate checks.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:15 -07:00
NeilBrown
ab6085c795 md: don't write more than is required of the last page of a bitmap
It is possible that real data or metadata follows the bitmap without full page
alignment.

So limit the last write to be only the required number of bytes, rounded up to
the hard sector size of the device.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23 20:14:14 -07:00
Mark Fasheh
ef51c97623 Remove do_sync_file_range()
Remove do_sync_file_range() and convert callers to just use
do_sync_mapping_range().

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Neil Brown
505fa2c4a2 [PATCH] md: fix calculation for size of filemap_attr array in md/bitmap
If 'num_pages' were ever 1 more than a multiple of 8 (32bit platforms)
or of 16 (64 bit platforms).  filemap_attr would be allocated one
'unsigned long' shorter than required.  We need a round-up in there.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-12 15:31:42 -07:00
Andrew Morton
fc0ecff698 [PATCH] remove invalidate_inode_pages()
Convert all calls to invalidate_inode_pages() into open-coded calls to
invalidate_mapping_pages().

Leave the invalidate_inode_pages() wrapper in place for now, marked as
deprecated.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:31 -08:00
Neil Brown
da6e1a32fb [PATCH] md: avoid possible BUG_ON in md bitmap handling
md/bitmap tracks how many active write requests are pending on blocks
associated with each bit in the bitmap, so that it knows when it can clear
the bit (when count hits zero).

The counter has 14 bits of space, so if there are ever more than 16383, we
cannot cope.

Currently the code just calles BUG_ON as "all" drivers have request queue
limits much smaller than this.

However is seems that some don't.  Apparently some multipath configurations
can allow more than 16383 concurrent write requests.

So, in this unlikely situation, instead of calling BUG_ON we now wait
for the count to drop down a bit.  This requires a new wait_queue_head,
some waiting code, and a wakeup call.

Tested by limiting the counter to 20 instead of 16383 (writes go a lot slower
in that case...).

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:25:47 -08:00
NeilBrown
f49d5e62d9 [PATCH] md: avoid reading past the end of a bitmap file
In most cases we check the size of the bitmap file before reading data from
it.  However when reading the superblock, we always read the first PAGE_SIZE
bytes, which might not always be appropriate.  So limit that read to the size
of the file if appropriate.

Also, we get the count of available bytes wrong in one place, so that too can
read past the end of the file.

Cc: "yang yin" <yinyang801120@gmail.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26 13:50:59 -08:00
Josef Sipek
c649bb9c55 [PATCH] struct path: convert md
Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:47 -08:00
NeilBrown
4f2e639af4 [PATCH] md: endian annotations for the bitmap superblock
And a couple of bug fixes found by sparse.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
Alexey Dobriyan
5f6e3c8365 [PATCH] md: use BUILD_BUG_ON
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:26 -07:00
Paul Clements
a638b2dc95 [PATCH] md: use ffz instead of find_first_set to convert multiplier to shift
find_first_set doesn't find the least-significant bit on bigendian machines,
so it is really wrong to use it.

ffs is closer, but takes an 'int' and we have a 'unsigned long'.  So use
ffz(~X) to convert a chunksize into a chunkshift.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:18 -07:00
Paul Clements
9b1d1dac18 [PATCH] md: new sysfs interface for setting bits in the write-intent-bitmap
Add a new sysfs interface that allows the bitmap of an array to be dirtied.
The interface is write-only, and is used as follows:

echo "1000" > /sys/block/md2/md/bitmap

(dirty the bit for chunk 1000 [offset 0] in the in-memory and on-disk
bitmaps of array md2)

echo "1000-2000" > /sys/block/md1/md/bitmap

(dirty the bits for chunks 1000-2000 in md1's bitmap)

This is useful, for example, in cluster environments where you may need to
combine two disjoint bitmaps into one (following a server failure, after a
secondary server has taken over the array).  By combining the bitmaps on
the two servers, a full resync can be avoided (This was discussed on the
list back on March 18, 2005, "[PATCH 1/2] md bitmap bug fixes" thread).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
NeilBrown
ce25c31bdd [PATCH] md: Change md/bitmap file handling to use bmap to file blocks-fix
Fix problems with new bmap based access to bitmap files.

1/ When not using a file based bitmap, attach a NULL list of buffers
   to each page so the common free_buffer routine can cope.
2/ Use submit_bh to read as well as write, rather than vfs_read.
   This makes read and write more symetric.
3/ sync the file before reading, to ensure that the page cache has no
   dirty pages that might get written out later.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
NeilBrown
d785a06a0b [PATCH] md/bitmap: change md/bitmap file handling to use bmap to file blocks
If md is asked to store a bitmap in a file, it tries to hold onto the page
cache pages for that file, manipulate them directly, and call a cocktail of
operations to write the file out.  I don't believe this is a supportable
approach.

This patch changes the approach to use the same approach as swap files.  i.e.
bmap is used to enumerate all the block address of parts of the file and we
write directly to those blocks of the device.

swapfile only uses parts of the file that provide a full pages at contiguous
addresses.  We don't have that luxury so we have to cope with pages that are
non-contiguous in storage.  To handle this we attach buffers to each page, and
store the addresses in those buffers.

With this approach the pagecache may contain data which is inconsistent with
what is on disk.  To alleviate the problems this can cause, md invalidates the
pagecache when releasing the file.  If the file is to be examined while the
array is active (a non-critical but occasionally useful function), O_DIRECT io
must be used.  And new version of mdadm will have support for this.

This approach simplifies a lot of code:
 - we no longer need to keep a list of pages which we need to wait for,
   as the b_endio function can keep track of how many outstanding
   writes there are.  This saves a mempool.
 - -EAGAIN returns from write_page are no longer possible (not sure if
    they ever were actually).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
NeilBrown
acc55e2201 [PATCH] md/bitmap: tidy up i_writecount handling in md/bitmap
md/bitmap modifies i_writecount of a bitmap file to make sure that no-one else
writes to it.  The reverting of the change is sometimes done twice, and there
is one error path where it is omitted.

This patch tidies that up.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
NeilBrown
0cdd02cabd [PATCH] md/bitmap: remove dead code from md/bitmap
bitmap_active is never called, and the BITMAP_ACTIVE flag is never users or
tested, so discard them both.

Also remove some out-of-date 'todo' comments.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00