It's ok if the read path is a lot more costly, as long as inc/dec is
really cheap. The inc/dec will happen for each created/freed io context,
while the reading only happens when a disk queue exits.
Signed-off-by: Jens Axboe <axboe@suse.de>
cfq_exit_lock is protecting two things now:
- The per-ioc rbtree of cfq_io_contexts
- The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, as it is (by
definition) per cfqd as the queue lock is.
The per-ioc rbtree is mainly used and updated by the process itself only.
The only outside use is the io priority changing. If we move the
priority changing to not browsing the rbtree, we can remove any locking
from the rbtree updates and lookup completely. Let the sys_ioprio syscall
just mark processes as having the iopriority changed and lazily update
the private cfq io contexts the next time io is queued, and we can
remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
A collection of little fixes and cleanups:
- We don't use the 'queued' sysfs exported attribute, since the
may_queue() logic was rewritten. So kill it.
- Remove dead defines.
- cfq_set_active_queue() can be rewritten cleaner with else if conditions.
- Several places had cfq_exit_cfqq() like logic, abstract that out and
use that.
- Annotate the cfqq kmem_cache_alloc() so the allocator knows that this
is a repeat allocation if it fails with __GFP_WAIT set. Allows the
allocator to start freeing some memory, if needed. CFQ already loops for
this condition, so might as well pass the hint down.
- Remove cfqd->rq_starved logic. It's not needed anymore after we dropped
the crq allocation in cfq_set_request().
- Remove uneeded parameter passing.
Signed-off-by: Jens Axboe <axboe@suse.de>
- Don't assign variables that are only used once.
- Kill spin_lock() prefetching, it's opportunistic at best.
Signed-off-by: Jens Axboe <axboe@suse.de>
After Christophs SCSI change, the only usage left is RQ_ACTIVE
and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing
the request, so any check for RQ_INACTIVE in a driver is a bug and
indicates use-after-free.
So kill/clean the remaining users, straight forward.
Signed-off-by: Jens Axboe <axboe@suse.de>
It is always identical to &q->rq, and we only use it for detecting
whether this request came out of our mempool or not. So replace it
with an additional ->flags bit flag.
Signed-off-by: Jens Axboe <axboe@suse.de>
As the comments indicates in blkdev.h, we can fold it into ->end_io_data
usage as that is really what ->waiting is. Fixup the users of
blk_end_sync_rq().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Get rid of the as_rq request type. With the added elevator_private2, we
have enough room in struct request to get rid of any arq allocation/free
for each request.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Get rid of the cfq_rq request type. With the added elevator_private2, we
have enough room in struct request to get rid of any crq allocation/free
for each request.
Signed-off-by: Jens Axboe <axboe@suse.de>
A big win, we now save an allocation/free on each request! With the
previous rb/hash abstractions, we can just reuse queuelist/donelist
for the FIFO data and be done with it.
Signed-off-by: Jens Axboe <axboe@suse.de>
The rbtree sort/lookup/reposition logic is mostly duplicated in
cfq/deadline/as, so move it to the elevator core. The io schedulers
still provide the actual rb root, as we don't want to impose any sort
of specific handling on the schedulers.
Introduce the helpers and rb_node in struct request to help migrate the
IO schedulers.
Signed-off-by: Jens Axboe <axboe@suse.de>
Right now, every IO scheduler implements its own backmerging (except for
noop, which does no merging). That results in duplicated code for
essentially the same operation, which is never a good thing. This patch
moves the backmerging out of the io schedulers and into the elevator
core. We save 1.6kb of text and as a bonus get backmerging for noop as
well. Win-win!
Signed-off-by: Jens Axboe <axboe@suse.de>
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.
Signed-off-by: Jens Axboe <axboe@suse.de>
The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).
This patch:
The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer. Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer. This is just a
cleanup, but it allows us to reuse the union 'u' for something something where
the union will actually be used.
[judith@osdl.org: powerpc build fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Judith Lebzelter <judith@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>