Commit Graph

110319 Commits

Author SHA1 Message Date
Andrew Patterson
9bc3ffbfbd Check for device resize when rescanning partitions
Check for device resize in the rescan_partitions() routine. If the device
has been resized, the bdev size is set to match. The rescan_partitions()
routine is called when opening the device and when calling the
BLKRRPART ioctl.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
Andrew Patterson
c3279d1454 Adjust block device size after an online resize of a disk.
The revalidate_disk routine now checks if a disk has been resized by
comparing the gendisk capacity to the bdev inode size.  If they are
different (usually because the disk has been resized underneath the kernel)
the bdev inode size is adjusted to match the capacity.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
Andrew Patterson
0c002c2f74 Wrapper for lower-level revalidate_disk routines.
This is a wrapper for the lower-level revalidate_disk call-backs such
as sd_revalidate_disk(). It allows us to perform pre and post
operations when calling them.

We will use this wrapper in a later patch to adjust block device sizes
after an online resize (a _post_ operation).

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
Tejun Heo
243294dae0 block: fix duplicate headers for /proc/partitions
seqf can be started multiple times for a read and the header should be
printed only for the initial one.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
FUJITA Tomonori
fad7f01e61 sg: set dxferp to NULL for READ with the older SG interface
With the older SG interface, we don't know a user-space address to
trasfer data when executing a SCSI command. So we can't pass a
user-space address to blk_rq_map_user.

This patch fixes sg to pass a NULL user-space address to
blk_rq_map_user so that it just sets up a request and bios with page
frames propely without data transfer.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
FUJITA Tomonori
818827669d block: make blk_rq_map_user take a NULL user-space buffer
This patch changes blk_rq_map_user to accept a NULL user-space buffer
with a READ command if rq_map_data is not NULL. Thus a caller can pass
page frames to lk_rq_map_user to just set up a request and bios with
page frames propely. bio_uncopy_user (called via blk_rq_unmap_user)
doesn't copy data to user space with such request.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
Jens Axboe
839e96afba block: update comment on end_request()
It refers to functions that no longer exist after the IO completion
changes.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
Tejun Heo
55dc7db70a init: DEBUG_BLOCK_EXT_DEVT requires explicit root= param
DEBUG_BLOCK_EXT_DEVT shuffles SCSI and IDE device numbers and root
device number set using rdev become meaningless.  Root devices should
be explicitly specified using textual names.  Warn about it if root
can't be found and DEBUG_BLOCK_EXT_DEVT is enabled.  Also, add warning
to the help text.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
Tejun Heo
2bbedcb4c1 block: don't test for partition size in bdget_disk() and blk_lookup_devt()
bdget_disk() and blk_lookup_devt() never cared whether the specified
partition (or disk) is zero sized or not.  I got confused while
converting those not to depend on consecutive minor numbers in commit
5a6411b1178baf534aa9138052864dfa89d3eada and later when dev0 was added
it broke callers which expected to get valid return for zero sized
disk devices.

So, they never needed nr_sects checks in the first place.  Kill them.

This problem was spotted and debugged by Bartlmoiej Zolnierkiewicz.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
Jens Axboe
759f8ca304 Change default value of CONFIG_DEBUG_BLOCK_EXT_DEVT to 'n'
It's a debug option that you would explicitly enable to test this
feature, we should default it to 'n' to prevent accidental surprises
for now.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
Harvey Harrison
aeb3d3a81e block: kmalloc args reversed, small function definition fixes
Noticed by sparse:
block/blk-softirq.c:156:12: warning: symbol 'blk_softirq_init' was not declared. Should it be static?
block/genhd.c:583:28: warning: function 'bdget_disk' with external linkage has definition
block/genhd.c:659:17: warning: incorrect type in argument 1 (different base types)
block/genhd.c:659:17:    expected unsigned int [unsigned] [usertype] size
block/genhd.c:659:17:    got restricted gfp_t
block/genhd.c:659:29: warning: incorrect type in argument 2 (different base types)
block/genhd.c:659:29:    expected restricted gfp_t [usertype] flags
block/genhd.c:659:29:    got unsigned int
block: kmalloc args reversed

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
FUJITA Tomonori
01cfcddd98 sg: use blk_rq_aligned helper function
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
FUJITA Tomonori
879040742c block: add blk_rq_aligned helper function
This adds blk_rq_aligned helper function to see if alignment and
padding requirement is satisfied for DMA transfer. This also converts
blk_rq_map_kern and __blk_rq_map_user to use the helper function.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
FUJITA Tomonori
4d8ab62e08 bio: convert bio_copy_kern to use bio_copy_user
bio_copy_kern and bio_copy_user are very similar. This converts
bio_copy_kern to use bio_copy_user.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
FUJITA Tomonori
10db10d144 sg: convert the indirect IO path to use the block layer
This patch converts the indirect IO path (including mmap IO and old
struct sg_header) to use the block layer functions (blk_get_request,
blk_execute_rq_nowait, blk_rq_map_user, etc) instead of
scsi_execute_async().

[Jens: fixed compile error with SCSI logging enabled]

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
FUJITA Tomonori
6e5a30cba5 sg: convert the direct IO path to use the block layer
This patch converts the direct IO path (SG_FLAG_DIRECT_IO) to use the
block layer functions (blk_get_request, blk_execute_rq_nowait,
blk_rq_map_user, etc) instead of scsi_execute_async().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
FUJITA Tomonori
10865dfa34 sg: convert the non-data path to use the block layer
This patch converts the non data path to use the block layer functions
(blk_get_request, blk_execute_rq_nowait, etc) instead of uses
scsi_execute_async().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
FUJITA Tomonori
152e283fdf block: introduce struct rq_map_data to use reserved pages
This patch introduces struct rq_map_data to enable bio_copy_use_iov()
use reserved pages.

Currently, bio_copy_user_iov allocates bounce pages but
drivers/scsi/sg.c wants to allocate pages by itself and use
them. struct rq_map_data can be used to pass allocated pages to
bio_copy_user_iov.

The current users of bio_copy_user_iov simply passes NULL (they don't
want to use pre-allocated pages).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
FUJITA Tomonori
a3bce90edd block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
Currently, blk_rq_map_user and blk_rq_map_user_iov always do
GFP_KERNEL allocation.

This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
so sg can use it (sg always does GFP_ATOMIC allocation).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
Aaron Carroll
45333d5a31 cfq-iosched: fix queue depth detection
CFQ's detection of queueing devices assumes a non-queuing device and detects
if the queue depth reaches a certain threshold.  Under some workloads (e.g.
synchronous reads), CFQ effectively forces a unit queue depth, thus defeating
the detection logic.  This leads to poor performance on queuing hardware,
since the idle window remains enabled.

This patch inverts the sense of the logic: assume a queuing-capable device,
and detect if the depth does not exceed the threshold.

Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
605401618c block: don't use bio_has_data() in the completion path
We should just check for rq->bio, as that is really the information
we are looking for. Even if the bio attached doesn't carry data,
we still need to do IO post processing on it.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
ab780f1ece block: inherit CPU completion on bio->rq and rq->rq merges
Somewhat incomplete, as we do allow merges of requests and bios
that have different completion CPUs given. This is done on the
assumption that a larger IO is still more beneficial than CPU
locality.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
c7c22e4d5c block: add support for IO CPU affinity
This patch adds support for controlling the IO completion CPU of
either all requests on a queue, or on a per-request basis. We export
a sysfs variable (rq_affinity) which, if set, migrates completions
of requests to the CPU that originally submitted it. A bio helper
(bio_set_completion_cpu()) is also added, so that queuers can ask
for completion on that specific CPU.

In testing, this has been show to cut the system time by as much
as 20-40% on synthetic workloads where CPU affinity is desired.

This requires a little help from the architecture, so it'll only
work as designed for archs that are using the new generic smp
helper infrastructure.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
18887ad910 block: make kblockd_schedule_work() take the queue as parameter
Preparatory patch for checking queuing affinity.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
b646fc59b3 block: split softirq handling into blk-softirq.c
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00