Commit Graph

96 Commits

Author SHA1 Message Date
Tom Rix
9daf0a4d32 quota: cleanup double word in comment
Remove the second 'handle'.

Link: https://lore.kernel.org/r/20220116125936.389767-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2022-01-24 14:45:02 +01:00
Roman Anufriev
5190db9fdd fs/quota: update quota state flags scheme with project quota flags
Current quota state flags scheme doesn't include project quota and thus
shows all flags after DQUOT_USAGE_ENABLED wrong. Fix this and also add
DQUOT_NOLIST_DIRTY to the scheme.

Link: https://lore.kernel.org/r/1602989814-28922-1-git-send-email-dotdot@yandex-team.ru
Signed-off-by: Roman Anufriev <dotdot@yandex-team.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-11-03 11:16:04 +01:00
Konstantin Khlebnikov
6fcbcec9cf fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned long
Quota statistics counted as 64-bit per-cpu counter. Reading sums per-cpu
fractions as signed 64-bit int, filters negative values and then reports
lower half as signed 32-bit int.

Result may looks like:

fs.quota.allocated_dquots = 22327
fs.quota.cache_hits = -489852115
fs.quota.drops = -487288718
fs.quota.free_dquots = 22083
fs.quota.lookups = -486883485
fs.quota.reads = 22327
fs.quota.syncs = 335064
fs.quota.writes = 3088689

Values bigger than 2^31-1 reported as negative.

All counters except "allocated_dquots" and "free_dquots" are monotonic,
thus they should be reported as is without filtering negative values.

Kernel doesn't have generic helper for 64-bit sysctl yet,
let's use at least unsigned long.

Link: https://lore.kernel.org/r/157337934693.2078.9842146413181153727.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-11 11:06:27 +01:00
Jeremy Cline
64d9d13828 fs/quota: Replace XQM_MAXQUOTAS usage with MAXQUOTAS
XQM_MAXQUOTAS and MAXQUOTAS are, it appears, equivalent. Replace all
usage of XQM_MAXQUOTAS and remove it along with the unused XQM_*QUOTA
definitions.

Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2018-08-22 18:17:29 +02:00
Ritesh Harjani
b91ed9d808 quota: Kill an unused extern entry form quota.h
Kill an unused extern entry from quota.h
which is leftover of below patch.

[f32764bd2: quota: Convert quota statistics to generic percpu_counter]

Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Jan Kara <jack@suse.cz>
2018-03-26 13:11:35 +02:00
Jan Kara
6c83fd5142 quota: Add lock annotations to struct members
Add annotation which lock protects which struct members to struct dquot
and struct mem_dqinfo.

Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-21 14:06:46 +02:00
Jan Kara
7b9ca4c61b quota: Reduce contention on dq_data_lock
dq_data_lock is currently used to protect all modifications of quota
accounting information, consistency of quota accounting on the inode,
and dquot pointers from inode. As a result contention on the lock can be
pretty heavy.

Reduce the contention on the lock by protecting quota accounting
information by a new dquot->dq_dqb_lock and consistency of quota
accounting with inode usage by inode->i_lock.

This change reduces time to create 500000 files on ext4 on ramdisk by 50
different processes in separate directories by 6% when user quota is
turned on. When those 50 processes belong to 50 different users, the
improvement is about 9%.

Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-17 22:07:59 +02:00
Jan Kara
834057bf84 quota: Allow disabling tracking of dirty dquots in a list
Filesystems that are journalling quotas generally don't need tracking of
dirty dquots in a list since forcing a transaction commit flushes all
quotas anyway. Allow filesystem to say it doesn't want dquots to be
tracked as it reduces contention on the dq_list_lock.

Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-17 22:00:45 +02:00
Jan Kara
503330f382 quota: Remove dq_wait_unused from dquot
Currently every dquot carries a wait_queue_head_t used only when we are
turning quotas off to wait for last users to drop dquot references.
Since such rare case is not performance sensitive in any means, just use
a global waitqueue for this and save space in struct dquot. Also convert
the logic to use wait_event() instead of open-coding it.

Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-17 22:00:40 +02:00
Jan Kara
bc8230ee8e quota: Convert dqio_mutex to rwsem
Convert dqio_mutex to rwsem and call it dqio_sem. No functional changes
yet.

Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-17 18:52:48 +02:00
Tahsin Erdogan
7a9ca53aea quota: add get_inode_usage callback to transfer multi-inode charges
Ext4 ea_inode feature allows storing xattr values in external inodes to
be able to store values that are bigger than a block in size. Ext4 also
has deduplication support for these type of inodes. With deduplication,
the actual storage waste is eliminated but the users of such inodes are
still charged full quota for the inodes as if there was no sharing
happening in the background.

This design requires ext4 to manually charge the users because the
inodes are shared.

An implication of this is that, if someone calls chown on a file that
has such references we need to transfer the quota for the file and xattr
inodes. Current dquot_transfer() function implicitly transfers one inode
charge. With ea_inode feature, we would like to transfer multiple inode
charges.

Add get_inode_usage callback which can interrogate the total number of
inodes that were charged for a given inode.

[ Applied fix from Colin King to make sure the 'ret' variable is
  initialized on the successful return path.  Detected by
  CoverityScan, CID#1446616 ("Uninitialized scalar variable") --tytso]

Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jan Kara <jack@suse.cz>
2017-06-22 11:46:48 -04:00
Linus Torvalds
e93b1cc8a8 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota, fsnotify and ext2 updates from Jan Kara:
 "Changes to locking of some quota operations from dedicated quota mutex
  to s_umount semaphore, a fsnotify fix and a simple ext2 fix"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Fix bogus warning in dquot_disable()
  fsnotify: Fix possible use-after-free in inode iteration on umount
  ext2: reject inodes with negative size
  quota: Remove dqonoff_mutex
  ocfs2: Use s_umount for quota recovery protection
  quota: Remove dqonoff_mutex from dquot_scan_active()
  ocfs2: Protect periodic quota syncing with s_umount semaphore
  quota: Use s_umount protection for quota operations
  quota: Hold s_umount in exclusive mode when enabling / disabling quotas
  fs: Provide function to get superblock with exclusive s_umount
2016-12-19 08:23:53 -08:00
Al Viro
8c54ca9c68 quota: constify struct path in quota_on
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-05 19:03:06 -05:00
Jan Kara
c3b004460d quota: Remove dqonoff_mutex
The only places that were grabbing dqonoff_mutex are functions turning
quotas on and off and these are properly serialized using s_umount
semaphore. Remove dqonoff_mutex.

Signed-off-by: Jan Kara <jack@suse.cz>
2016-11-30 08:38:07 +01:00
Linus Torvalds
a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level what this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     to kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner casees this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permirted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounters credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it make sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and putting up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
Eric W. Biederman
d49d37624a quota: Ensure qids map to the filesystem
Introduce the helper qid_has_mapping and use it to ensure that the
quota system only considers qids that map to the filesystems
s_user_ns.

In practice for quota supporting filesystems today this is the exact
same check as qid_valid.  As only 0xffffffff aka (qid_t)-1 does not
map into init_user_ns.

Replace the qid_valid calls with qid_has_mapping as values come in
from userspace.  This is harmless today and it prepares the quota
system to work on filesystems with quotas but mounted by unprivileged
users.

Call qid_has_mapping from dqget.  This ensures the passed in qid has a
prepresentation on the underlying filesystem.  Previously this was
unnecessary as filesystesm never had qids that could not map.  With
the introduction of filesystems outside of s_user_ns this will not
remain true.

All of this ensures the quota code never has to deal with qids that
don't map to the underlying filesystem.

Cc: Jan Kara <jack@suse.cz>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-07-05 15:12:05 -05:00
Arnd Bergmann
e008bb6134 quota: use time64_t internally
The quota subsystem has two formats, the old v1 format using architecture
specific time_t values on the on-disk format, while the v2 format
(introduced in Linux 2.5.16 and 2.4.22) uses fixed 64-bit little-endian.

While there is no future for the v1 format beyond y2038, the v2 format
is almost there on 32-bit architectures, as both the user interface
and the on-disk format use 64-bit timestamps, just not the time_t
inbetween.

This changes the internal representation to use time64_t, which will
end up doing the right thing everywhere for v2 format.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2016-06-19 18:09:31 +02:00
Jan Kara
be6257b251 quota: Add support for ->get_nextdqblk() for VFS quota
Add infrastructure for supporting get_nextdqblk() callback for VFS
quotas. Translate the operation into a callback to appropriate
filesystem and consequently to quota format callback.

Signed-off-by: Jan Kara <jack@suse.cz>
2016-02-09 13:05:23 +01:00
Eric Sandeen
8b37524962 quota: add new quotactl Q_XGETNEXTQUOTA
Q_XGETNEXTQUOTA is exactly like Q_XGETQUOTA, except that it
will return quota information for the id equal to or greater
than the id requested.  In other words, if the requested id has
no quota, the command will return quota information for the
next higher id which does have a quota set.  If no higher id
has an active quota, -ESRCH is returned.

This allows filesystems to do efficient iteration in kernelspace,
much like extN filesystems do in userspace when asked to report
all active quotas.

The patch adds a d_id field to struct qc_dqblk so that we can
pass back the id of the quota which was found, and return it
to userspace.

Today, filesystems such as XFS require getpwent-style iterations,
and for systems which have i.e. LDAP backends, this can be very
slow, or even impossible if iteration is not allowed in the
configuration.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-08 11:21:50 +11:00
Li Xi
847aac644e vfs: Add general support to enforce project quota limits
This patch adds support for a new quota type PRJQUOTA for project quota
enforcement. Also a new method get_projid() is added into dquot_operations
structure.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-18 21:55:08 +01:00
Jan Kara
7dca0548a2 Merge branch 'quota_interface' into for_next_testing 2015-03-16 10:26:41 +01:00
Konstantin Khlebnikov
781970240a quota: reorder flags in quota state
Flags in struct quota_state keep flags for each quota type and
some common flags. This patch reorders typed flags:

Before:

0 USRQUOTA DQUOT_USAGE_ENABLED
1 USRQUOTA DQUOT_LIMITS_ENABLED
2 USRQUOTA DQUOT_SUSPENDED
3 GRPQUOTA DQUOT_USAGE_ENABLED
4 GRPQUOTA DQUOT_LIMITS_ENABLED
5 GRPQUOTA DQUOT_SUSPENDED
6          DQUOT_QUOTA_SYS_FILE
7          DQUOT_NEGATIVE_USAGE

After:

0 USRQUOTA DQUOT_USAGE_ENABLED
1 GRPQUOTA DQUOT_USAGE_ENABLED
2 USRQUOTA DQUOT_LIMITS_ENABLED
3 GRPQUOTA DQUOT_LIMITS_ENABLED
4 USRQUOTA DQUOT_SUSPENDED
5 GRPQUOTA DQUOT_SUSPENDED
6          DQUOT_QUOTA_SYS_FILE
7          DQUOT_NEGATIVE_USAGE

Now we can get bitmap of all enabled/suspended quota types without loop.
For example suspended: (flags / DQUOT_SUSPENDED) & ((1 << MAXQUOTAS) - 1).

add/remove: 0/1 grow/shrink: 3/11 up/down: 56/-215 (-159)
function                                     old     new   delta
__dquot_initialize                           423     447     +24
dquot_transfer                               181     197     +16
dquot_alloc_inode                            286     302     +16
dquot_reclaim_space_nodirty                  316     313      -3
dquot_claim_space_nodirty                    314     311      -3
dquot_resume                                 286     281      -5
dquot_free_inode                             332     324      -8
__dquot_alloc_space                          500     492      -8
dquot_disable                               1944    1929     -15
dquot_quota_enable                           252     236     -16
__dquot_free_space                           750     734     -16
dquot_writeback_dquots                       625     608     -17
__dquot_transfer                            1186    1154     -32
dquot_quota_sync                             299     261     -38
dquot_active.isra                             54       -     -54

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-04 16:42:46 +01:00
Jan Kara
5eacb2ac02 quota: Make ->set_info use structure with neccesary info to VFS and XFS
Change ->set_info to take new qc_info structure which contains all the
necessary information both for XFS and VFS. Convert Q_SETINFO handler
to use this structure.

Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-04 16:06:38 +01:00
Jan Kara
59b6ba6990 quota: Remove ->get_xstate and ->get_xstatev callbacks
These callbacks are now unused. Remove them.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-04 16:06:37 +01:00
Jan Kara
0a240339a8 quota: Make VFS quotas use new interface for getting quota info
Create new internal interface for getting information about quota which
contains everything needed for both VFS quotas and XFS quotas. Make VFS
use this and hook it up to Q_GETINFO.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-04 16:06:34 +01:00