When struct cfq_data allocation fails, cic_index need to be freed.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The 'group_changed' variable is initialized to 0 and never changed, so
checking the variable is meaningless.
It is a leftover from 0bbfeb8320 ("cfq-iosched: Always provide group
iosolation."). Let's get rid of it.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Reduce the number of bit operations in cfq_choose_req() on average
(and worst) cases.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Simplify the calculation in cfq_prio_to_maxrq(), plus replace CFQ_PRIO_LISTS to
IOPRIO_BE_NR since they are the same and IOPRIO_BE_NR looks more reasonable in
this context IMHO.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
If we don't explicitly initialize it to zero, CFQ might think that
cgroup of ioc has changed and it generates lots of unnecessary calls
to call_for_each_cic(changed_cgroup). Fix it.
cfq_get_io_context()
cfq_ioc_set_cgroup()
call_for_each_cic(ioc, changed_cgroup)
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Commit 73c1010119 ("block: initial patch for on-stack per-task plugging")
removed calls to elv_bio_merged() when @bio merged with @req. Re-add them.
This in turn will update merged stats in associated group. That
should be safe as long as request has got reference to the blkio_group.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Divyesh Shah <dpshah@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Make BLKIO_STAT_MERGED per cpu hence gettring rid of need of taking
blkg->stats_lock.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
We allocated per cpu stats struct for root group but did not free it.
Fix it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
We don't need them anymore, so kill:
- REQ_ON_PLUG checks in various places
- !rq_mergeable() check in plug merging
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Currently we take a queue lock on each bio to check if there are any
throttling rules associated with the group and also update the stats.
Now access the group under rcu and update the stats without taking
the queue lock. Queue lock is taken only if there are throttling rules
associated with the group.
So the common case of root group when there are no rules, save
unnecessary pounding of request queue lock.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Now dispatch stats update is lock free. But reset of these stats still
takes blkg->stats_lock and is dependent on that. As stats are per cpu,
we should be able to just reset the stats on each cpu without any locks.
(Atleast for 64bit arch).
On 32bit arch there is a small race where 64bit updates are not atomic.
The result of this race can be that in the presence of other writers,
one might not get 0 value after reset of a stat and might see something
intermediate
One can write more complicated code to cover this race like sending IPI
to other cpus to reset stats and for offline cpus, reset these directly.
Right not I am not taking that path because reset_update is more of a
debug feature and it can happen only on 32bit arch and possibility of
it happening is small. Will fix it if it becomes a real problem. For
the time being going for code simplicity.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Some of the stats are 64bit and updation will be non atomic on 32bit
architecture. Use sequence counters on 32bit arch to make reading
of stats safe.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Currently we take blkg_stat lock for even updating the stats. So even if
a group has no throttling rules (common case for root group), we end
up taking blkg_lock, for updating the stats.
Make dispatch stats per cpu so that these can be updated without taking
blkg lock.
If cpu goes offline, these stats simply disappear. No protection has
been provided for that yet. Do we really need anything for that?
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Soon we will allow accessing a throtl_grp under rcu_read_lock(). Hence
start freeing up throtl_grp after one rcu grace period.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Use same helper function for root group as we use with dynamically
allocated groups to add it to various lists.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
A helper function for the code which is used at 2-3 places. Makes reading
code little easier.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Currently, we allocate root throtl_grp statically. But as we will be
introducing per cpu stat pointers and that will be allocated
dynamically even for root group, we might as well make whole root
throtl_grp allocation dynamic and treat it in same manner as other
groups.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Currently, all the cfq_group or throtl_group allocations happen while
we are holding ->queue_lock and sleeping is not allowed.
Soon, we will move to per cpu stats and also need to allocate the
per group stats. As one can not call alloc_percpu() from atomic
context as it can sleep, we need to drop ->queue_lock, allocate the
group, retake the lock and continue processing.
In throttling code, I check the queue DEAD flag again to make sure
that driver did not call blk_cleanup_queue() in the mean time.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
blkg->key = cfqd is an rcu protected pointer and hence we used to do
call_rcu(cfqd->rcu_head) to free up cfqd after one rcu grace period.
The problem here is that even though cfqd is around, there are no
gurantees that associated request queue (td->queue) or q->queue_lock
is still around. A driver might have called blk_cleanup_queue() and
release the lock.
It might happen that after freeing up the lock we call
blkg->key->queue->queue_ock and crash. This is possible in following
path.
blkiocg_destroy()
blkio_unlink_group_fn()
cfq_unlink_blkio_group()
Hence, wait for an rcu peirod if there are groups which have not
been unlinked from blkcg->blkg_list. That way, if there are any groups
which are taking cfq_unlink_blkio_group() path, can safely take queue
lock.
This is how we have taken care of race in throttling logic also.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Nobody seems to be using cfq_find_alloc_cfqg() function parameter "create".
Get rid of that.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
cgroup unaccounted_time file is created only if CONFIG_DEBUG_BLK_CGROUP=y.
there are some fields which are out side this config option. Fix that.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Group initialization code seems to be at two places. root group
initialization in blk_throtl_init() and dynamically allocated group
in throtl_find_alloc_tg(). Create a common function and use at both
the places.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Since for-2.6.40/core was forked off the 2.6.39 devel tree, we've
had churn in the core area that makes it difficult to handle
patches for eg cfq or blk-throttle. Instead of requiring that they
be based in older versions with bugs that have been fixed later
in the rc cycle, merge in 2.6.39 final.
Also fixes up conflicts in the below files.
Conflicts:
drivers/block/paride/pcd.c
drivers/cdrom/viocd.c
drivers/ide/ide-cd.c
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
blk_cleanup_queue() calls elevator_exit() and after this, we can't
touch the elevator without oopsing. __elv_next_request() must check
for this state because in the refcounted queue model, we can still
call it after blk_cleanup_queue() has been called.
This was reported as causing an oops attributable to scsi.
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>