Commit Graph

386804 Commits

Author SHA1 Message Date
Tejun Heo a7c6d554aa cgroup: add/update accessors which obtain subsys specific data from css
css (cgroup_subsys_state) is usually embedded in a subsys specific
data structure.  Subsystems either use container_of() directly to cast
from css to such data structure or has an accessor function wrapping
such cast.  As cgroup as whole is moving towards using css as the main
interface handle, add and update such accessors to ease dealing with
css's.

All accessors explicitly handle NULL input and return NULL in those
cases.  While this looks like an extra branch in the code, as all
controllers specific data structures have css as the first field, the
casting doesn't involve any offsetting and the compiler can trivially
optimize out the branch.

* blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such
  accessor.  Added.

* memory, hugetlb and devices already had one but didn't explicitly
  handle NULL input.  Updated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:23 -04:00
Tejun Heo 72c97e54e0 cgroup: add subsystem pointer to cgroup_subsys_state
Currently, given a cgroup_subsys_state, there's no way to find out
which subsystem the css is for, which we'll need to convert the cgroup
controller API to primarily use @css instead of @cgroup.  This patch
adds cgroup_subsys_state->ss which points to the subsystem the @css
belongs to.

While at it, remove the comment about accessing @css->cgroup to
determine the hierarchy.  cgroup core will provide API to traverse
hierarchy of css'es and we don't want subsystems to directly walk
cgroup hierarchies anymore.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:22 -04:00
Tejun Heo 3f79851831 hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup
cgroup controller API will be converted to primarily use struct
cgroup_subsys_state instead of struct cgroup.  In preparation, make
hugetlb_cgroup functions pass around struct hugetlb_cgroup instead of
struct cgroup.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
2013-08-08 20:11:22 -04:00
Tejun Heo 6d37b97428 netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state
cgroup controller API will be converted to primarily use struct
cgroup_subsys_state instead of struct cgroup.  In preparation, make
the internal functions of netprio_cgroup pass around @css instead of
@cgrp.

While at it, kill struct cgroup_netprio_state which only contained
struct cgroup_subsys_state without serving any purpose.  All functions
are converted to deal with @css directly.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David S. Miller <davem@davemloft.net>
2013-08-08 20:11:22 -04:00
Tejun Heo c9710d8018 cpuset: drop "const" qualifiers from struct cpuset instances
cpuset uses "const" qualifiers on struct cpuset in some functions;
however, it doesn't work well when a value derived from returned const
pointer has to be passed to an accessor.  It's C after all.

Drop the "const" qualifiers except for the trivially leaf ones.  This
patch doesn't make any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:22 -04:00
Tejun Heo 8af01f56a0 cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/
The names of the two struct cgroup_subsys_state accessors -
cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
The former clashes with the type name and the latter doesn't even
indicate it's somehow related to cgroup.

We're about to revamp large portion of cgroup API, so, let's rename
them so that they're less awkward.  Most per-controller usages of the
accessors are localized in accessor wrappers and given the amount of
scheduled changes, this isn't gonna add any noticeable headache.

Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
to task_css().  This patch is pure rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:22 -04:00
Tejun Heo 61584e3f49 cgroup: Merge branch 'for-3.11-fixes' into for-3.12
for-3.12 branch is about to receive invasive updates which are
dependent on da0a12caff ("cgroup: fix a leak when percpu_ref_init()
fails").  Given the amount of scheduled changes, I think it'd less
painful to pull in for-3.11-fixes as preparation.  Pull in
for-3.11-fixes into for-3.12.

Signed-off-by: Tejun Heo <tj@kernel.org>
2013-08-02 16:12:13 -04:00
Li Zefan b395890a09 cgroup: rename cgroup_pidlist->mutex
It's a rw_semaphore not a mutex.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-08-01 09:29:41 -04:00
Li Zefan 876ede8b2b cgroup: restructure the failure path in cgroup_write_event_control()
It uses a single label and checks the validity of each pointer. This
is err-prone, and actually we had a bug because one of the check was
insufficient.

Use multi lables as we do in other places.

v2:
- drop initializations of local variables.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-08-01 09:29:41 -04:00
Li Zefan e14880f7bb cgroup: implement cgroup_from_id()
This will be used as a replacement for css_lookup().

There's a difference with cgroup id and css id. cgroup id starts with 0,
while css id starts with 1.

v4:
- also check if cggroup_mutex is held.
- make it an inline function.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 07:47:34 -04:00
Li Zefan b414dc09a3 cgroup: document how cgroup IDs are assigned
As cgroup id has been used in netprio cgroup and will be used in memcg,
it's important to make it clear how a cgroup id is allocated.

For example, in netprio cgroup, the id is used as index of anarray.

Signed-off-by: Li Zefan <lizefan@huwei.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 07:47:34 -04:00
Li Zefan 4e96ee8e98 cgroup: convert cgroup_ida to cgroup_idr
This enables us to lookup a cgroup by its id.

v4:
- add a comment for idr_remove() in cgroup_offline_fn().

v3:
- on success, idr_alloc() returns the id but not 0, so fix the BUG_ON()
  in cgroup_init().
- pass the right value to idr_alloc() so that the id for dummy cgroup is 0.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 07:47:34 -04:00
Li Zefan 6f4b7e632d cgroup: more naming cleanups
Constantly use @cset for css_set variables and use @cgrp as cgroup
variables.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 06:20:18 -04:00
Li Zefan e0798ce273 cgroup: remove struct cgroup_seqfile_state
We can use struct cfent instead.

v2:
- remove cgroup_seqfile_release().

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 06:20:18 -04:00
Li Zefan 2a4ac63333 cgroup: remove sparse tags from offline_css()
This should have been removed in commit d7eeac1913
("cgroup: hold cgroup_mutex before calling css_offline").

While at it, update the comments.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 06:20:18 -04:00
Li Zefan da0a12caff cgroup: fix a leak when percpu_ref_init() fails
ss->css_free() is not called when perfcpu_ref_init() fails.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-31 06:13:25 -04:00
Zhao Hongjiang 0b9e6965ad cpuset: relocate a misplaced comment
Comment for cpuset_css_offline() was on top of cpuset_css_free().
Move it.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-29 14:13:56 -04:00
Zhao Hongjiang 9ad9d25a1e cpuset: get rid of the useless forward declaration of cpuset
get rid of the useless forward declaration of the struct cpuset cause the 
below define it.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-29 14:08:08 -04:00
Tejun Heo a698b4488a cgroup: remove gratuituous BUG_ON()s from rebind_subsystems()
rebind_subsystems() performs santiy checks even on subsystems which
aren't specified to be added or removed and the checks aren't all that
useful given that these are in a very cold path while the violations
they check would trip up in much hotter paths.

Let's remove these from rebind_subsystems().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-07-16 05:28:24 -06:00
Tejun Heo 1d5be6b287 cgroup: move module ref handling into rebind_subsystems()
Module ref handling in cgroup is rather weird.
parse_cgroupfs_options() grabs all the modules for the specified
subsystems.  A module ref is kept if the specified subsystem is newly
bound to the hierarchy.  If not, or the operation fails, the refs are
dropped.  This scatters module ref handling across multiple functions
making it difficult to track.  It also make the function nasty to use
for dynamic subsystem binding which is necessary for the planned
unified hierarchy.

There's nothing which requires the subsystem modules to be pinned
between parse_cgroupfs_options() and rebind_subsystems() in both mount
and remount paths.  parse_cgroupfs_options() can just parse and
rebind_subsystems() can handle pinning the subsystems that it wants to
bind, which is a natural part of its task - binding - anyway.

Move module ref handling into rebind_subsystems() which makes the code
a lot simpler - modules are gotten iff it's gonna be bound and put iff
unbound or binding fails.

v2: Li pointed out that if a controller module is unloaded between
    parsing and binding, rebind_subsystems() won't notice the missing
    controller as it only iterates through existing controllers.  Fix
    it by updating rebind_subsystems() to compare @added_mask to
    @pinned and fail with -ENOENT if they don't match.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-07-16 05:28:24 -06:00
Tejun Heo 913ffdb543 cgroup: replace task_cgroup_path_from_hierarchy() with task_cgroup_path()
task_cgroup_path_from_hierarchy() was added for the planned new users
and none of the currently planned users wants to know about multiple
hierarchies.  This patch drops the multiple hierarchy part and makes
it always return the path in the first non-dummy hierarchy.

As unified hierarchy will always have id 1, this is guaranteed to
return the path for the unified hierarchy if mounted; otherwise, it
will return the path from the hierarchy which happens to occupy the
lowest hierarchy id, which will usually be the first hierarchy mounted
after boot.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jan Kaluža <jkaluza@redhat.com>
2013-07-12 12:49:05 -07:00
Tejun Heo f172e67cf9 cgroup: move number_of_cgroups test out of rebind_subsystems() into cgroup_remount()
rebind_subsystems() currently fails if the hierarchy has any !root
cgroups; however, on the planned unified hierarchy,
rebind_subsystems() will be used while populated.  Move the test to
cgroup_remount(), which is the only place the test is necessary
anyway.

As it's impossible for the other two callers of rebind_subsystems() to
have populated hierarchy, this doesn't make any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-07-12 12:34:02 -07:00
Tejun Heo 3126121fb3 cgroup: make rebind_subsystems() handle file additions and removals with proper error handling
Currently, creating and removing cgroup files in the root directory
are handled separately from the actual subsystem binding and unbinding
which happens in rebind_subsystems().  Also, rebind_subsystems() users
aren't handling file creation errors properly.  Let's integrate
top_cgroup file handling into rebind_subsystems() so that it's simpler
to use and everyone handles file creation errors correctly.

* On a successful return, rebind_subsystems() is guaranteed to have
  created all files of the new subsystems and deleted the ones
  belonging to the removed subsystems.  After a failure, no file is
  created or removed.

* cgroup_remount() no longer needs to make explicit populate/clear
  calls as it's all handled by rebind_subsystems(), and it gets proper
  error handling automatically.

* cgroup_mount() has been updated such that the root dentry and cgroup
  are linked before rebind_subsystems().  Also, the init_cred dancing
  and base file handling are moved right above rebind_subsystems()
  call and proper error handling for the base files is added.  While
  at it, add a comment explaining what's going on with the cred thing.

* cgroup_kill_sb() calls rebind_subsystems() to unbind all subsystems
  which now implies removing all subsystem files which requires the
  directory's i_mutex.  Grab it.  This means that files on the root
  cgroup are removed earlier - they used to be deleted from generic
  super_block cleanup from vfs.  This doesn't lead to any functional
  difference and it's cleaner to do the clean up explicitly for all
  files.

Combined with the previous changes, this makes all cgroup file
creation errors handled correctly.

v2: Added comment on init_cred.

v3: Li spotted that cgroup_mount() wasn't freeing tmp_links after base
    file addition failure.  Fix it by adding free_tmp_links error
    handling label.

v4: v3 introduced build bugs which got noticed by Fengguang's awesome
    kbuild test robot.  Fixed, and shame on me.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
2013-07-12 12:34:02 -07:00
Tejun Heo b420ba7db1 cgroup: use for_each_subsys() instead of for_each_root_subsys() in cgroup_populate/clear_dir()
rebind_subsystems() will be updated to handle file creations and
removals with proper error handling and to do that will need to
perform file operations before actually adding the subsystem to the
hierarchy.

To enable such usage, update cgroup_populate/clear_dir() to use
for_each_subsys() instead of for_each_root_subsys() so that they
operate on all subsystems specified by @subsys_mask whether that
subsystem is currently bound to the hierarchy or not.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-07-12 12:34:02 -07:00
Tejun Heo bee550994f cgroup: update error handling in cgroup_populate_dir()
cgroup_populate_dir() didn't use to check whether the actual file
creations were successful and could return success with only subset of
the requested files created, which is nasty.

This patch udpates cgroup_populate_dir() so that it either succeeds
with all files or fails with no file.

v2: The original patch also converted for_each_root_subsys() usages to
    for_each_subsys() without explaining why.  That part has been
    moved to a separate patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-07-12 12:34:01 -07:00