Commit Graph

108 Commits

Author SHA1 Message Date
Al Viro
47e4491b40 Cleanup after commit 585d3bc06f
fsync_bdev() export and a bunch of stubs for !CONFIG_BLOCK case had
been left behind

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-01 07:07:16 -04:00
Nick Piggin
585d3bc06f fs: move bdev code out of buffer.c
Move some block device related code out from buffer.c and put it in
block_dev.c. I'm trying to move non-buffer_head code out of buffer.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-27 14:44:03 -04:00
Takashi Sato
fcccf50254 filesystem freeze: implement generic freeze feature
The ioctls for the generic freeze feature are below.
o Freeze the filesystem
  int ioctl(int fd, int FIFREEZE, arg)
    fd: The file descriptor of the mountpoint
    FIFREEZE: request code for the freeze
    arg: Ignored
    Return value: 0 if the operation succeeds. Otherwise, -1

o Unfreeze the filesystem
  int ioctl(int fd, int FITHAW, arg)
    fd: The file descriptor of the mountpoint
    FITHAW: request code for unfreeze
    arg: Ignored
    Return value: 0 if the operation succeeds. Otherwise, -1
    Error number: If the filesystem has already been unfrozen,
                  errno is set to EINVAL.

[akpm@linux-foundation.org: fix CONFIG_BLOCK=n]
Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
Cc: <xfs-masters@oss.sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-09 16:54:42 -08:00
Linus Torvalds
2150edc6c5 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits)
  jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs
  ext4: Remove "extents" mount option
  block: Add Kconfig help which notes that ext4 needs CONFIG_LBD
  ext4: Make printk's consistently prefixed with "EXT4-fs: "
  ext4: Add sanity checks for the superblock before mounting the filesystem
  ext4: Add mount option to set kjournald's I/O priority
  jbd2: Submit writes to the journal using WRITE_SYNC
  jbd2: Add pid and journal device name to the "kjournald2 starting" message
  ext4: Add markers for better debuggability
  ext4: Remove code to create the journal inode
  ext4: provide function to release metadata pages under memory pressure
  ext3: provide function to release metadata pages under memory pressure
  add releasepage hooks to block devices which can be used by file systems
  ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
  ext4: Init the complete page while building buddy cache
  ext4: Don't allow new groups to be added during block allocation
  ext4: mark the blocks/inode bitmap beyond end of group as used
  ext4: Use new buffer_head flag to check uninit group bitmaps initialization
  ext4: Fix the race between read_inode_bitmap() and ext4_new_inode()
  ext4: code cleanup
  ...
2009-01-08 17:14:59 -08:00
NeilBrown
d3374825ce md: make devices disappear when they are no longer needed.
Currently md devices, once created, never disappear until the module
is unloaded.  This is essentially because the gendisk holds a
reference to the mddev, and the mddev holds a reference to the
gendisk, this a circular reference.

If we drop the reference from mddev to gendisk, then we need to ensure
that the mddev is destroyed when the gendisk is destroyed.  However it
is not possible to hook into the gendisk destruction process to enable
this.

So we drop the reference from the gendisk to the mddev and destroy the
gendisk when the mddev gets destroyed.  However this has a
complication.
Between the call
   __blkdev_get->get_gendisk->kobj_lookup->md_probe
and the call
   __blkdev_get->md_open

there is no obvious way to hold a reference on the mddev any more, so
unless something is done, it will disappear and gendisk will be
destroyed prematurely.

Also, once we decide to destroy the mddev, there will be an unlockable
moment before the gendisk is unlinked (blk_unregister_region) during
which a new reference to the gendisk can be created.  We need to
ensure that this reference can not be used.  i.e. the ->open must
fail.

So:
 1/  in md_probe we set a flag in the mddev (hold_active) which
     indicates that the array should be treated as active, even
     though there are no references, and no appearance of activity.
     This is cleared by md_release when the device is closed if it
     is no longer needed.
     This ensures that the gendisk will survive between md_probe and
     md_open.

 2/  In md_open we check if the mddev we expect to open matches
     the gendisk that we did open.
     If there is a mismatch we return -ERESTARTSYS and modify
     __blkdev_get to retry from the top in that case.
     In the -ERESTARTSYS sys case we make sure to wait until
     the old gendisk (that we succeeded in opening) is really gone so
     we loop at most once.

Some udev configurations will always open an md device when it first
appears.   If we allow an md device that was just created by an open
to disappear on an immediate close, then this can race with such udev
configurations and result in an infinite loop the device being opened
and closed, then re-open due to the 'ADD' even from the first open,
and then close and so on.
So we make sure an md device, once created by an open, remains active
at least until some md 'ioctl' has been made on it.  This means that
all normal usage of md devices will allow them to disappear promptly
when not needed, but the worst that an incorrect usage will do it
cause an inactive md device to be left in existence (it can easily be
removed).

As an array can be stopped by writing to a sysfs attribute
  echo clear > /sys/block/mdXXX/md/array_state
we need to use scheduled work for deleting the gendisk and other
kobjects.  This allows us to wait for any pending gendisk deletion to
complete by simply calling flush_scheduled_work().



Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:10 +11:00
Randy Dunlap
94e2959e7a fs: fix function param name in kernel-doc
Fix function parameter name in kernel-doc:

Warning(linux-2.6.28-git5//fs/block_dev.c:1272): No description found for parameter 'pathname'
Warning(linux-2.6.28-git5//fs/block_dev.c:1272): Excess function parameter 'path' description in 'lookup_bdev'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:14 -08:00
Theodore Ts'o
87d8fe1ee6 add releasepage hooks to block devices which can be used by file systems
Implement blkdev_releasepage() to release the buffer_heads and pages
after we release private data belonging to a mounted filesystem.

Cc: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-01-03 09:47:09 -05:00
Denis ChengRq
c2acf7b908 fs/block_dev.c: __read_mostly improvement and sb_is_blkdev_sb utilization
- iget5_locked in bdget really needs blockdev_superblock, instead of
  bd_mnt, so bd_mnt could be just a local variable;

- blockdev_superblock really needs __read_mostly, while local var bd_mnt
  not;

- make use of sb_is_blkdev_sb in bd_forget, instead of direct reference
  to blockdev_superblock.

Signed-off-by: Denis ChengRq <crquan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:43 -05:00
Christoph Hellwig
fd4ce1acd0 [PATCH 1/2] kill FMODE_NDELAY_NOW
Update FMODE_NDELAY before each ioctl call so that we can kill the
magic FMODE_NDELAY_NOW.  It would be even better to do this directly
in setfl(), but for that we'd need to have FMODE_NDELAY for all files,
not just block special files.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-04 04:22:57 -05:00
Christoph Hellwig
ebbefc011e [PATCH] clean up blkdev_get a little bit
The way the bd_claim for the FMODE_EXCL case is implemented is rather
confusing.  Clean it up to the most logical style.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-04 04:22:56 -05:00
Tejun Heo
89f97496e8 block: fix __blkdev_get() for removable devices
Commit 0762b8bde9 moved disk_get_part()
in front of recursive get on the whole disk, which caused removable
devices to try disk_get_part() before rescanning after a new media is
inserted, which might fail legit open attempts or give the old
partition.

This patch fixes the problem by moving disk_get_part() after
__blkdev_get() on the whole disk.

This problem was spotted by Borislav Petkov.

Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06 08:41:56 +01:00
Linus Torvalds
2248485640 Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits)
  [PATCH] kill the rest of struct file propagation in block ioctls
  [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET
  [PATCH] get rid of blkdev_locked_ioctl()
  [PATCH] get rid of blkdev_driver_ioctl()
  [PATCH] sanitize blkdev_get() and friends
  [PATCH] remember mode of reiserfs journal
  [PATCH] propagate mode through swsusp_close()
  [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
  [PATCH] pass fmode_t to blkdev_put()
  [PATCH] kill the unused bsize on the send side of /dev/loop
  [PATCH] trim file propagation in block/compat_ioctl.c
  [PATCH] end of methods switch: remove the old ones
  [PATCH] switch sr
  [PATCH] switch sd
  [PATCH] switch ide-scsi
  [PATCH] switch tape_block
  [PATCH] switch dcssblk
  [PATCH] switch dasd
  [PATCH] switch mtd_blkdevs
  [PATCH] switch mmc
  ...
2008-10-23 10:23:07 -07:00
Al Viro
421748ecde [PATCH] assorted path_lookup() -> kern_path() conversions
more nameidata eviction

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:52 -04:00
Al Viro
56b26add02 [PATCH] kill the rest of struct file propagation in block ioctls
Now we can switch blkdev_ioctl() block_device/mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:49:14 -04:00
Al Viro
572c489215 [PATCH] sanitize blkdev_get() and friends
* get rid of fake struct file/struct dentry in __blkdev_get()
* merge __blkdev_get() and do_open()
* get rid of flags argument of blkdev_get()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:49:06 -04:00
Al Viro
30c40d2c01 [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
replace open_bdev_excl/close_bdev_excl with variants taking fmode_t.
superblock gets the value used to mount it stored in sb->s_mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:49:00 -04:00
Al Viro
9a1c354276 [PATCH] pass fmode_t to blkdev_put()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:58 -04:00
Al Viro
90b8f2824c [PATCH] end of methods switch: remove the old ones
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:52 -04:00
Al Viro
d4430d62fa [PATCH] beginning of methods conversion
To keep the size of changesets sane we split the switch by drivers;
to keep the damn thing bisectable we do the following:
	1) rename the affected methods, add ones with correct
prototypes, make (few) callers handle both.  That's this changeset.
	2) for each driver convert to new methods.  *ALL* drivers
are converted in this series.
	3) kill the old (renamed) methods.

Note that it _is_ a flagday; all in-tree drivers are converted and by the
end of this series no trace of old methods remain.  The only reason why
we do that this way is to keep the damn thing bisectable and allow per-driver
debugging if anything goes wrong.

New methods:
	open(bdev, mode)
	release(disk, mode)
	ioctl(bdev, mode, cmd, arg)		/* Called without BKL */
	compat_ioctl(bdev, mode, cmd, arg)
	locked_ioctl(bdev, mode, cmd, arg)	/* Called with BKL, legacy */

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:32 -04:00
Al Viro
86d434dede [PATCH] eliminate use of ->f_flags in block methods
store needed information in f_mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:08 -04:00
Al Viro
aeb5d72706 [PATCH] introduce fmode_t, do annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:06 -04:00
Randy Dunlap
496aa8a98f block: fix current kernel-doc warnings
Fix block kernel-doc warnings:

Warning(linux-2.6.27-git4//fs/block_dev.c:1272): No description found for parameter 'path'
Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'cpu'
Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'part'
Warning(/var/linsrc/linux-2.6.27-git4//block/genhd.c:544): No description found for parameter 'partno'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-17 08:46:57 +02:00
Randy Dunlap
57d1b5366f block_dev: fix kernel-doc in new functions
Fix kernel-doc in new functions:

Error(mmotm-2008-1002-1617//fs/block_dev.c:895): duplicate section name 'Description'
Error(mmotm-2008-1002-1617//fs/block_dev.c:924): duplicate section name 'Description'
Warning(mmotm-2008-1002-1617//fs/block_dev.c:1282): No description found for parameter 'pathname'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 10:42:38 +02:00
Andrew Patterson
608aeef17a Call flush_disk() after detecting an online resize.
We call flush_disk() to make sure the buffer cache for the disk is
flushed after a disk resize. There are two resize cases, growing and
shrinking. Given that users can shrink/then grow a disk before
revalidate_disk() is called, we treat the grow case identically to
shrinking. We need to flush the buffer cache after an online shrink
because, as James Bottomley puts it,

     The two use cases for shrinking I can see are

     1. planned: the fs is already shrunk to within the new boundaries
        and all data is relocated, so invalidate is fine (any dirty
        buffers that might exist in the shrunk region are there only
        because they were relocated but not yet written to their
        original location).
     2. unplanned:  In this case, the fs is probably toast, so whether
        we invalidate or not isn't going to make a whole lot of
        difference; it's still going to try to read or write from
        sectors beyond the new size and get I/O errors.

Immediately invalidating shrunk disks will cause errors for outstanding
I/Os for reads/write beyond the new end of the disk to be generated
earlier then if we waited for the normal buffer cache operation. It also
removes a potential security hole where we might keep old data around
from beyond the end of the shrunk disk if the disk was not invalidated.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00
Andrew Patterson
56ade44b46 Added flush_disk to factor out common buffer cache flushing code.
We need to be able to flush the buffer cache for for more than
just when a disk is changed, so we factor out common cache flush code
in check_disk_change() to an internal flush_disk() routine.  This
routine will then be used for both disk changes and disk resizes (in a
later patch).

Include the disk name in the text indicating that there are busy
inodes on the device and increase the KERN severity of the message.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00