Commit Graph

848 Commits

Author SHA1 Message Date
Mark McLoughlin
f9cea4f707 [PATCH] dm snapshot: fix metadata error handling
Fix the error handling when store.read_metadata is called: the error should be
returned immediately.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:14 -07:00
Mark McLoughlin
4c7e3bf44d [PATCH] dm snapshot: allow zero chunk_size
The chunk size of snapshots cannot be changed so it is redundant to require it
as a parameter when activating an existing snapshot.  Allow a value of zero in
this case and ignore it.  For a new snapshot, use a default value if zero is
specified.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:14 -07:00
Milan Broz
92c060a692 [PATCH] dm snapshot: fix invalidation ENOMEM
Fix ENOMEM error sign.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:14 -07:00
Ishai Rabinovitz
e3f6ac6123 [PATCH] dm: fix alloc_dev error path
While reading the code I found a bug in the error path in alloc_dev in dm.c

When blk_alloc_queue fails there is no call to free_minor.

This patch fixes the problem.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:14 -07:00
Milan Broz
e90dae1f58 [PATCH] dm: support ioctls on mapped devices: fix with fake file
The new ioctl code passes the wrong file pointer to the underlying device.
No file pointer is available so make a temporary fake one.

ioctl_by_bdev() does set_fs(KERNEL_DS) so it's for ioctls originating
within the kernel and unsuitable here.  We are processing ioctls that
originated in userspace and mapping them to different devices.  Fixing the
existing callers that pass a NULL file struct and consolidating the
fake_file users are separate matters to solve in later patches.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
Alasdair G Kergon
7006f6eca8 [PATCH] dm: export blkdev_driver_ioctl
Export blkdev_driver_ioctl for device-mapper.

If we get as far as the device-mapper ioctl handler, we know the ioctl is not
a standard block layer BLK* one, so we don't need to check for them a second
time and can call blkdev_driver_ioctl() directly.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
Milan Broz
9af4aa30b7 [PATCH] dm mpath: support ioctls
When an ioctl is performed on a multipath device simply pass it on to the
underlying block device through current_path.  If current path is not yet
selected, select it.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
Milan Broz
ab17ffa440 [PATCH] dm linear: support ioctls
When an ioctl is performed on a device with a linear target, simply pass it on
to the underlying block device.

Note that the ioctl will pass through the filtering in blkdev_ioctl() twice.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
Milan Broz
aa129a2247 [PATCH] dm: support ioctls on mapped devices
Extend the core device-mapper infrastructure to accept arbitrary ioctls on a
mapped device provided that it has exactly one target and it is capable of
supporting ioctls.

[We can't use unlocked_ioctl because we need 'inode': 'file' might be NULL.
Is it worth changing this?]

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Arnd Bergmann <arnd@arndb.de> wrote:

> Am Wednesday 21 June 2006 21:31 schrieb Alasdair G Kergon:
> > static struct block_device_operations dm_blk_dops = {
> > .open = dm_blk_open,
> > .release = dm_blk_close,
> > +.ioctl = dm_blk_ioctl,
> > .getgeo = dm_blk_getgeo,
> > .owner = THIS_MODULE
>
> I guess this also needs a ->compat_ioctl method, otherwise it won't
> work for ioctl numbers that have a compat_ioctl implementation in the
> low-level device driver.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
David Howells
9361401eb7 [PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer.  Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.

This patch does the following:

 (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
     support.

 (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
     an item that uses the block layer.  This includes:

     (*) Block I/O tracing.

     (*) Disk partition code.

     (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

     (*) The SCSI layer.  As far as I can tell, even SCSI chardevs use the
     	 block layer to do scheduling.  Some drivers that use SCSI facilities -
     	 such as USB storage - end up disabled indirectly from this.

     (*) Various block-based device drivers, such as IDE and the old CDROM
     	 drivers.

     (*) MTD blockdev handling and FTL.

     (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
     	 taking a leaf out of JFFS2's book.

 (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
     linux/elevator.h contingent on CONFIG_BLOCK being set.  sector_div() is,
     however, still used in places, and so is still available.

 (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
     parts of linux/fs.h.

 (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

 (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

 (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
     is not enabled.

 (*) fs/no-block.c is created to hold out-of-line stubs and things that are
     required when CONFIG_BLOCK is not set:

     (*) Default blockdev file operations (to give error ENODEV on opening).

 (*) Makes some /proc changes:

     (*) /proc/devices does not list any blockdevs.

     (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

 (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

 (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
     given command other than Q_SYNC or if a special device is specified.

 (*) In init/do_mounts.c, no reference is made to the blockdev routines if
     CONFIG_BLOCK is not defined.  This does not prohibit NFS roots or JFFS2.

 (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
     error ENOSYS by way of cond_syscall if so).

 (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
     CONFIG_BLOCK is not set, since they can't then happen.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30 20:52:31 +02:00
Jens Axboe
4aff5e2333 [PATCH] Split struct request ->flags into two parts
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-30 20:23:37 +02:00
Rik Snel
3c164bd815 [BLOCK] dm-crypt: trivial comment improvements
Just some minor comment nits.

- little-endian is better than low-endian
- and since it is called essiv everywere it should also be essiv
  in the comments (and not ess_iv)

Signed-off-by: Rik Snel <rsnel@cube.dyndns.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:27 +10:00
Herbert Xu
3505868791 [CRYPTO] users: Use crypto_hash interface instead of crypto_digest
This patch converts all remaining crypto_digest users to use the new
crypto_hash interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:21 +10:00
Herbert Xu
d1806f6a97 [BLOCK] dm-crypt: Use block ciphers where applicable
This patch converts dm-crypt to use the new block cipher type where
applicable.  It also changes simple cipher operations to use the new
encrypt_one/decrypt_one interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:13 +10:00
NeilBrown
ddac7c7e3a [PATCH] md: Fix issues with referencing rdev in md/raid1
We need to be careful when referencing mirrors[i].rdev.  It can disappear
under us at various times.

So:
  fix a couple of problem places.
  comment a couple of non-problem places
  move an 'atomic_add' which deferences rdev down a little
    way to some where where it is sure to not be NULL.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-01 11:39:08 -07:00
NeilBrown
6394cca548 [PATCH] md: fix recent breakage of md/raid1 array checking
A recent patch broke the ability to do a user-request check of a raid1.
This patch fixes the breakage and also moves a comment that was dislocated
by the same patch.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 11:01:31 -07:00
NeilBrown
8469219596 [PATCH] md: avoid backward event updates in md superblock when degraded.
If we
  - shut down a clean array,
  - restart with one (or more) drive(s) missing
  - make some changes
  - pause, so that they array gets marked 'clean',
the event count on the superblock of included drives
will be the same as that of the removed drives.
So adding the removed drive back in will cause it
to be included with no resync.

To avoid this, we only update the eventcount backwards when the array
is not degraded.  In this case there can (should) be no non-connected
drives that we can get confused with, and this is the particular case
where updating-backwards is valuable.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 11:01:31 -07:00
Daniel Kobras
c06aad854f [PATCH] dm: Fix deadlock under high i/o load in raid1 setup.
On an nForce4-equipped machine with two SATA disk in raid1 setup using dmraid,
we experienced frequent deadlock of the system under high i/o load.  'cat
/dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord.  The functions cp and kjournald were blocked in did vary, but
kmirrord's wchan always pointed to 'mempool_alloc()'.  We've seen this pattern
on 2.6.15 and 2.6.17 kernels.  http://lkml.org/lkml/2005/4/20/142 indicates
that this problem has been around even before.

So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to hand out
preallocated chunks from the mempool.  If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool.  Where the only 'somebody' is kmirrord itself, so we have a
deadlock.

I worked around this problem by falling back to a (blocking) kmalloc when
before kmirrord would have ended up on the waitqueue.  This defeats part of
the benefits of using the mempool, but at least keeps the system running.  And
it could be done with a two-line change.  Note that mempool_alloc() clears the
GFP_NOIO flag internally, and only uses it to decide whether to wait or return
an error if immediate allocation fails, so the attached patch doesn't change
behaviour in the non-deadlocking case.  Path is against current git
(2.6.18-rc4), but should apply to earlier versions as well.  I've tested on
2.6.15, where this patch makes the difference between random lockup and a
stable system.

Signed-off-by: Daniel Kobras <kobras@linux.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 11:01:28 -07:00
Michal Miroslaw
485311a23c [PATCH] dm: BUG/OOPS fix
Fix BUG I tripped on while testing failover and multipathing.

BUG shows up on error path in multipath_ctr() when parse_priority_group()
fails after returning at least once without error.  The fix is to
initialize m->ti early - just after alloc()ing it.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c027c3d2
*pde = 00000000
Oops: 0000 [#3]
Modules linked in: qla2xxx ext3 jbd mbcache sg ide_cd cdrom floppy
CPU:    0
EIP:    0060:[<c027c3d2>]    Not tainted VLI
EFLAGS: 00010202   (2.6.17.3 #1)
EIP is at dm_put_device+0xf/0x3b
eax: 00000001   ebx: ee4fcac0   ecx: 00000000   edx: ee4fcac0
esi: ee4fc4e0   edi: ee4fc4e0   ebp: 00000000   esp: c5db3e78
ds: 007b   es: 007b   ss: 0068
Process multipathd (pid: 15912, threadinfo=c5db2000 task=ef485a90)
Stack: ec4eda40 c02816bd ee4fc4c0 00000000 f7e89498 f883e0bc c02816f6 f7e89480
       f7e8948c c0281801 ffffffea f7e89480 f883e080 c0281ffe 00000001 00000000
       00000004 dfe9cab8 f7a693c0 f883e080 f883e0c0 ca4b99c0 c027c6ee 01400000
Call Trace:
 <c02816bd> free_pgpaths+0x31/0x45  <c02816f6> free_priority_group+0x25/0x2e
 <c0281801> free_multipath+0x35/0x67  <c0281ffe> multipath_ctr+0x123/0x12d
 <c027c6ee> dm_table_add_target+0x11e/0x18b  <c027e5b4> populate_table+0x8a/0xaf
 <c027e62b> table_load+0x52/0xf9  <c027ec23> ctl_ioctl+0xca/0xfc
 <c027e5d9> table_load+0x0/0xf9  <c0152146> do_ioctl+0x3e/0x43
 <c0152360> vfs_ioctl+0x16c/0x178  <c01523b4> sys_ioctl+0x48/0x60
 <c01029b3> syscall_call+0x7/0xb
Code: 97 f0 00 00 00 89 c1 83 c9 01 80 e2 01 0f 44 c1 88 43 14 8b 04 24 59 5b 5e 5f 5d c3 53 89 c1 89 d3 ff 4a 08 0f 94 c0 84 c0 74 2a <8b> 01 8b 10 89 d8 e8 f6 fb ff ff 8b 03 8b 53 04 89 50 04 89 02
EIP: [<c027c3d2>] dm_put_device+0xf/0x3b SS:ESP 0068:c5db3e78

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 12:54:29 -07:00
NeilBrown
f9abd1ace4 [PATCH] md: Fix a bug that recently crept into md/linear
A recent patch that allowed linear arrays to be reconfigured on-line
allowed in a bug which results in divide by zero - not all
mddev->array_size were converted to conf->array_size.

This patch finished the conversion and fixed the bug.

The offending patch was commit 7c7546ccf6.

Thanks to Simon Kirby <sim@netnation.com> for the bug report.

Cc: Simon Kirby <sim@netnation.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06 08:57:46 -07:00
Andrew Morton
d0a0a5ee7a [PATCH] md: fix oops in error-handling
During early MD setup (superblock reading), we don't have a personality yet.
But the error-handling code tries to dereference mddev->pers.  Fix.

Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:17 -07:00
NeilBrown
d695043259 [PATCH] md: include sector number in messages about corrected read errors
This is generally useful, but particularly helps see if it is the same sector
that always needs correcting, or different ones.

[akpm@osdl.org: fix printk warnings]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:17 -07:00
NeilBrown
67463acb64 [PATCH] md: require CAP_SYS_ADMIN for (re-)configuring md devices via sysfs
The ioctl requires CAP_SYS_ADMIN, so sysfs should too.  Note that we don't
require CAP_SYS_ADMIN for reading attributes even though the ioctl does.
There is no reason to limit the read access, and much of the information is
already available via /proc/mdstat

Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:17 -07:00
NeilBrown
80ca3a44f5 [PATCH] md: unify usage of symbolic names for perms
Some places we use number (0660) someplaces names (S_IRUGO).  Change all
numbers to be names, and change 0655 to be what it should be.

Also make some formatting more consistent.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:17 -07:00
NeilBrown
5e3db645f8 [PATCH] md: fix usage of wrong variable in raid1
Though it rarely matters, we should be using 's' rather than r1_bio->sector
here.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:17 -07:00