Commit Graph

134 Commits

Author SHA1 Message Date
Jens Axboe c265a7f417 cfq-iosched: get rid of enable_idle being unused warning
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:14 +02:00
Jens Axboe 7b679138b3 cfq-iosched: add message logging through blktrace
Now that blktrace has the ability to carry arbitrary messages in
its stream, use that for some CFQ logging.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:12 +02:00
Jens Axboe 9a11b4ed0e cfq-iosched: properly protect ioc_gone and ioc count
If we have multiple tasks freeing cfq_io_contexts when cfq-iosched
is being unloaded, we could complete() ioc_gone twice. Fix that by
protecting ioc_gone complete() and clearing with a spinlock for
just that purpose. Doesn't matter from a performance perspective,
since it'll only enter that path when ioc_gone != NULL (when cfq-iosched
is being rmmod'ed).

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:12 +02:00
Jens Axboe d6de8be711 cfq-iosched: fix RCU problem in cfq_cic_lookup()
cfq_cic_lookup() needs to properly protect ioc->ioc_data before
dereferencing it and also exclude updaters of ioc->ioc_data as well.

Also add a number of comments documenting why the existing RCU usage
is OK.

Thanks a lot to "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> for
review and comments!

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-28 14:49:28 +02:00
Richard Kennedy be754d2c21 block: reorder cfq_queue to save space on 64bit builds
saves 8 bytes of padding & increases objects/slab from 30 to 32 on my
AMD64 config

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-28 14:49:27 +02:00
Jens Axboe 6d63c27557 cfq-iosched: make io priorities inherit CPU scheduling class as well as nice
We currently set all processes to the best-effort scheduling class,
regardless of what CPU scheduling class they belong to. Improve that
so that we correctly track idle and rt scheduling classes as well.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-07 09:51:23 +02:00
Jens Axboe 07416d29bc cfq-iosched: fix RCU race in the cfq io_context destructor handling
put_io_context() drops the RCU read lock before calling into cfq_dtor(),
however we need to hold off freeing there before grabbing and
dereferencing the first object on the list.

So extend the rcu_read_lock() scope to cover the calling of cfq_dtor(),
and optimize cfq_free_io_context() to use a new variant for
call_for_each_cic() that assumes the RCU read lock is already held.

Hit in the wild by Alexey Dobriyan <adobriyan@gmail.com>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-07 09:28:57 +02:00
Fabio Checconi 4faa3c8150 cfq-iosched: do not leak ioc_data across iosched switches
When switching scheduler from cfq, cfq_exit_queue() does not clear
ioc->ioc_data, leaving a dangling pointer that can deceive the following
lookups when the iosched is switched back to cfq.  The pattern that can
trigger that is the following:

    - elevator switch from cfq to something else;
    - module unloading, with elv_unregister() that calls cfq_free_io_context()
      on ioc freeing the cic (via the .trim op);
    - module gets reloaded and the elevator switches back to cfq;
    - reallocation of a cic at the same address as before (with a valid key).

To fix it just assign NULL to ioc_data in __cfq_exit_single_io_context(),
that is called from the regular exit path and from the elevator switching
code.  The only path that frees a cic and is not covered is the error handling
one, but cic's freed in this way are never cached in ioc_data.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-10 08:28:01 +02:00
Fabio Checconi 34e6bbf23c cfq-iosched: fix rcu freeing of cfq io contexts
SLAB_DESTROY_BY_RCU is not a direct substitute for normal call_rcu()
freeing, since it'll page freeing but NOT object freeing. So change
cfq to do the freeing on its own.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-02 15:42:20 +02:00
Jens Axboe ffc4e75957 cfq-iosched: add hlist for browsing parallel to the radix tree
It's cumbersome to browse a radix tree from start to finish, especially
since we modify keys when a process exits. So add a hlist for the single
purpose of browsing over all known cfq_io_contexts, used for exit,
io prio change, etc.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=9948

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-19 10:04:00 +01:00
Jens Axboe fe094d98e7 cfq-iosched: make checkpatch compliant
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-01 09:26:33 +01:00
Jens Axboe febffd6181 cfq-iosched: kill some big inlines
Use of inlines were a bit over the top, trim them down a bit.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 13:19:43 +01:00
Jens Axboe 0871714e08 cfq-iosched: relax IOPRIO_CLASS_IDLE restrictions
Currently you must be root to set idle io prio class on a process. This
is due to the fact that the idle class is implemented as a true idle
class, meaning that it will not make progress if someone else is
requesting disk access. Unfortunately this means that it opens DOS
opportunities by locking down file system resources, hence it is root
only at the moment.

This patch relaxes the idle class a little, by removing the truly idle
part (which entals a grace period with associated timer). The
modifications make the idle class as close to zero impact as can be done
while still guarenteeing progress. This means we can relax the root only
criteria as well.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 11:38:15 +01:00
Jens Axboe 4ac845a2e9 block: cfq: make the io contect sharing lockless
The io context sharing introduced a per-ioc spinlock, that would protect
the cfq io context lookup. That is a regression from the original, since
we never needed any locking there because the ioc/cic were process private.

The cic lookup is changed from an rbtree construct to a radix tree, which
we can then use RCU to make the reader side lockless. That is the performance
critical path, modifying the radix tree is only done on process creation
(when that process first does IO, actually) and on process exit (if that
process has done IO).

As it so happens, radix trees are also much faster for this type of
lookup where the key is a pointer. It's a very sparse tree.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:33 +01:00
Nikanth Karthikesan 66dac98ed0 io_context sharing - cfq changes
changes in the cfq for io_context sharing

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:32 +01:00
Jens Axboe fd0928df98 ioprio: move io priority from task_struct to io_context
This is where it belongs and then it doesn't take up space for a
process that doesn't do IO.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:29 +01:00
Adrian Bunk 2fdd82bd88 block: let elv_register() return void
elv_register() always returns 0, and there isn't anything it does where
it should return an error (the only error condition is so grave that
it's handled with a BUG_ON).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-12-18 08:29:28 +01:00
Oleg Nesterov 0e7be9edb9 cfq_idle_class_timer: add paranoid checks for jiffies overflow
In theory, if the queue was idle long enough, cfq_idle_class_timer may have
a false (and very long) timeout because jiffies can wrap into the past wrt
->last_end_request.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-07 13:51:35 +01:00
Oleg Nesterov b70c864d3c cfq: fix IOPRIO_CLASS_IDLE delays
After the fresh boot:

	ionice -c3 -p $$
	echo cfq >> /sys/block/XXX/queue/scheduler
	dd if=/dev/XXX of=/dev/null bs=512 count=1

Now dd hangs in D state and the queue is completely stalled for approximately
INITIAL_JIFFIES + CFQ_IDLE_GRACE jiffies. This is because cfq_init_queue()
forgets to initialize cfq_data->last_end_request.

(I guess this patch is not complete, overflow is still possible)

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-07 09:46:13 +01:00
Oleg Nesterov 2389d1ef17 cfq: fix IOPRIO_CLASS_IDLE accounting
Spotted by Nick <gentuu@gmail.com>, hopefully can explain the second trace in
http://bugzilla.kernel.org/show_bug.cgi?id=9180.

If ->async_idle_cfqq != NULL cfq_put_async_queues() puts it IOPRIO_BE_NR times
in a loop. Fix this.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-07 09:45:00 +01:00
Oleg Nesterov 0a0836a09c cfq_get_queue: fix possible NULL pointer access
cfq_get_queue()->cfq_find_alloc_queue() can fail, check the returned value.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

Note that this isn't a bug at the moment, since the regular IO path
does not call this path without __GFP_WAIT set. However, it could be a
future bug, so I've applied it.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-29 11:33:05 +01:00
Oleg Nesterov 4310864b9d cfq_exit_queue() should cancel cfq_data->unplug_work
Spotted by Nick <gentuu@gmail.com>, perhaps explains the first trace in
http://bugzilla.kernel.org/show_bug.cgi?id=9180.

cfq_exit_queue() should cancel cfqd->unplug_work before freeing cfqd.
blk_sync_queue() seems unneeded, removed.

Q: why cfq_exit_queue() calls cfq_shutdown_timer_wq() twice?

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-29 11:33:05 +01:00
Jens Axboe 165125e1e4 [BLOCK] Get rid of request_queue_t typedef
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-24 09:28:11 +02:00
Alexey Dobriyan 8350163a90 cfq: Write-only stuff in CFQ data structures
There are some leftover bits from the task cooperator patch, that was
yanked out again. While it will get reintroduced, no point in having
this write-only stuff in the tree. So yank it.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-20 10:07:50 +02:00
Vasily Tarasov c2dea2d1fd cfq: async queue allocation per priority
If we have two processes with different ioprio_class, but the same
ioprio_data, their async requests will fall into the same queue. I guess
such behavior is not expected, because it's not right to put real-time
requests and best-effort requests in the same queue.

The attached patch fixes the problem by introducing additional *cfqq
fields on cfqd, pointing to per-(class,priority) async queues.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-20 10:06:38 +02:00