Because of the terrible structuring of scsi-bidi-commands
it breaks some of the life time rules of a scsi-command.
It is now not allowed to free up the block-request before
cleanup and partial deallocation of the scsi-command. (Which
is not so for none bidi commands)
The right fix to this problem would be to make bidi command
a first citizen by allocating a scsi_sdb pointer at scsi command
just like cmd->prot_sdb. The bidi sdb should be allocated/deallocated
as part of the get/put_command (Again like the prot_sdb) and the
current decoupling of scsi_cmnd and blk-request should be kept.
For now make sure scsi_release_buffers() is called before the
call to blk_end_request_all() which might cause the suicide of
the block requests. At best the leak of bidi buffers, at worse
a crash, as there is a race between the existence of the bidi_request
and the free of the associated bidi_sdb.
The reason this was never hit before is because only OSD has the potential
of doing asynchronous bidi commands. (So does bsg but it is never used)
And OSD clients just happen to do all their bidi commands synchronously, up
until recently.
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A thin provisioned device may temporarily be out of sufficient
allocation units to fulfill a write request. In that case it will
return a space allocation in progress error. Wait a bit and retry the
write.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Stanse found a potential NULL dereference in scsi_kill_request.
Instead of triggering BUG() in 'if (unlikely(cmd == NULL))' branch,
the kernel will Oops earlier on cmd dereference.
Move the dereferences after the if.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits)
block: use blkdev_issue_discard in blk_ioctl_discard
Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads
block: don't assume device has a request list backing in nr_requests store
block: Optimal I/O limit wrapper
cfq: choose a new next_req when a request is dispatched
Seperate read and write statistics of in_flight requests
aoe: end barrier bios with EOPNOTSUPP
block: trace bio queueing trial only when it occurs
block: enable rq CPU completion affinity by default
cfq: fix the log message after dispatched a request
block: use printk_once
cciss: memory leak in cciss_init_one()
splice: update mtime and atime on files
block: make blk_iopoll_prep_sched() follow normal 0/1 return convention
cfq-iosched: get rid of must_alloc flag
block: use interrupts disabled version of raise_softirq_irqoff()
block: fix comment in blk-iopoll.c
block: adjust default budget for blk-iopoll
block: fix long lines in block/blk-iopoll.c
block: add blk-iopoll, a NAPI like approach for block devices
...
Update scsi_io_completion() such that it only fails requests till the
next error boundary and retry the leftover. This enables block layer
to merge requests with different failfast settings and still behave
correctly on errors. Allow merge of requests of different failfast
settings.
As SCSI is currently the only subsystem which follows failfast status,
there's no need to worry about other block drivers for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
If a SCSI ULD driver sets blk_queue_prep_rq(), it should clean it
up itself on remove(), and not from the bus callbacks. This
removes the need to hook into bus->remove(), which should not
be used at the same time as driver->remove().
[jejb: fix sdkp initialisation problem due to mismerge]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Conflicts:
drivers/message/fusion/mptsas.c
fixed up conflict between req->data_len accessors and mptsas driver updates.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
scsi timeout on two or more devices may cause extremely long execution
time for user applications because SDEV_OFFLINE state is changed to
SDEV_RUNNING state during scsi error recovery procedures triggered by
a bus reset or a host reset of scsi LLD, and scsi timeout can happens
on the same devices many times.
This happens because scsi_internal_device_unblock() changes device's
state to SDEV_RUNNING even if a device in other states than SDEV_BLOCK,
while the following two transitions are required in this function.
SDEV_BLOCK -> SDEV_RUNNING
SDEV_CREATED_BLOCK -> SDEV_CREATED
Otherwise, it returns -EINVAL.
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
[matthew@wil.cx: supplied rewritten base for patch]
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The last request completion cleanup in scsi_lib left an unused
this_count variable in scsi_io_completion().
(It was used before in a code segment that now uses blk_end_request_all())
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Commit c3a4d78c58 introduced
rq->data_len and converted residual count users to it. While
converting, it mistakenly converted scsi_end_request() to finish
requests with residual count when it wants to do is fully complete the
request. Fix it by using blk_end_request_all() instead.
This bug was spotted by Boaz Harrosh.
Signed-off-by: Tejun Heo <tj@kernel.org>
Spotted-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Till now block layer allowed two separate modes of request execution.
A request is always acquired from the request queue via
elv_next_request(). After that, drivers are free to either dequeue it
or process it without dequeueing. Dequeue allows elv_next_request()
to return the next request so that multiple requests can be in flight.
Executing requests without dequeueing has its merits mostly in
allowing drivers for simpler devices which can't do sg to deal with
segments only without considering request boundary. However, the
benefit this brings is dubious and declining while the cost of the API
ambiguity is increasing. Segment based drivers are usually for very
old or limited devices and as converting to dequeueing model isn't
difficult, it doesn't justify the API overhead it puts on block layer
and its more modern users.
Previous patches converted all block low level drivers to dequeueing
model. This patch completes the API transition by...
* renaming elv_next_request() to blk_peek_request()
* renaming blkdev_dequeue_request() to blk_start_request()
* adding blk_fetch_request() which is combination of peek and start
* disallowing completion of queued (not started) requests
* applying new API to all LLDs
Renamings are for consistency and to break out of tree code so that
it's apparent that out of tree drivers need updating.
[ Impact: block request issue API cleanup, no functional change ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: unsik Kim <donari75@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Laurent Vivier <Laurent@lvivier.info>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Implement accessors - blk_rq_pos(), blk_rq_sectors() and
blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
and rq->hard_cur_sectors respectively and convert direct references of
the said fields to the accessors.
This is in preparation of request data length handling cleanup.
Geert : suggested adding const to struct request * parameter to accessors
Sergei : spotted error in patch description
[ Impact: cleanup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Ackec-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
rq->data_len served two purposes - the length of data buffer on issue
and the residual count on completion. This duality creates some
headaches.
First of all, block layer and low level drivers can't really determine
what rq->data_len contains while a request is executing. It could be
the total request length or it coulde be anything else one of the
lower layers is using to keep track of residual count. This
complicates things because blk_rq_bytes() and thus
[__]blk_end_request_all() relies on rq->data_len for PC commands.
Drivers which want to report residual count should first cache the
total request length, update rq->data_len and then complete the
request with the cached data length.
Secondly, it makes requests default to reporting full residual count,
ie. reporting that no data transfer occurred. The residual count is
an exception not the norm; however, the driver should clear
rq->data_len to zero to signify the normal cases while leaving it
alone means no data transfer occurred at all. This reverse default
behavior complicates code unnecessarily and renders block PC on some
drivers (ide-tape/floppy) unuseable.
This patch adds rq->resid_len which is used only for residual count.
While at it, remove now unnecessasry blk_rq_bytes() caching in
ide_pc_intr() as rq->data_len is not changed anymore.
Boaz : spotted missing conversion in osd
Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape
[ Impact: cleanup residual count handling, report 0 resid by default ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Now that all block request data transfer is done via bio, rq->data
isn't used. Kill it.
While at it, make the roles of rq->special and buffer clear.
[ Impact: drop now unncessary field from struct request ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Boaz Harrosh <bharrosh@panasas.com>
There are many [__]blk_end_request() call sites which call it with
full request length and expect full completion. Many of them ensure
that the request actually completes by doing BUG_ON() the return
value, which is awkward and error-prone.
This patch adds [__]blk_end_request_all() which takes @rq and @error
and fully completes the request. BUG_ON() is added to to ensure that
this actually happens.
Most conversions are simple but there are a few noteworthy ones.
* cdrom/viocd: viocd_end_request() replaced with direct calls to
__blk_end_request_all().
* s390/block/dasd: dasd_end_request() replaced with direct calls to
__blk_end_request_all().
* s390/char/tape_block: tapeblock_end_request() replaced with direct
calls to blk_end_request_all().
[ Impact: cleanup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
We cannot call blk_plug_device from scsi_target_queue_ready
because the q lock is not held. And we do not need to call
it from there because when we return 0, the scsi_request_fn
not_ready handling will plug the queue for us if needed.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We have a problem with recovered error handling in that any command
which goes down as BLOCK_PC but which returns a sense code of RECOVERED
ERROR gets completed with -EIO. For actual SG_IO commands, this doesn't
matter at all, since the error return code gets dropped in favour of
req->errors which contain the SCSI completion code.
However, if this command is part of the block system, then it will pay
attention to the returned error code. In particularly if a SYNCHRONIZE
CACHE from a barrier command completes with RECOVERED ERROR, the
resulting -EIO on the barrier causes block to error the request and
return it to the filesystem. Fix this by converting the -EIO for
recovered error to zero, plus remove the printing of this from sd and sr
so the message isn't double printed.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
No one uses scsi_execute_async with data transfer now. We can remove
scsi_req_map_sg.
Only scsi_eh_lock_door uses scsi_execute_async. scsi_eh_lock_door
doesn't handle sense and the callback. So we can remove
scsi_io_context too.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>