linux-cix

mirror of https://github.com/armbian/linux-cix.git synced 2026-01-06 12:30:45 -08:00

Author	SHA1	Message	Date
Matthew Wilcox (Oracle)	91abab8383	XArray: Fix xas_next() with a single entry at 0 If there is only a single entry at 0, the first time we call xas_next(), we return the entry. Unfortunately, all subsequent times we call xas_next(), we also return the entry at 0 instead of noticing that the xa_index is now greater than zero. This broke find_get_pages_contig(). Fixes: `64d3e9a9e0` ("xarray: Step through an XArray") Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2019-07-01 17:11:16 -04:00
Johannes Weiner	7b785645e8	mm: fix page cache convergence regression Since `a283348629` ("page cache: Finish XArray conversion"), on most major Linux distributions, the page cache doesn't correctly transition when the hot data set is changing, and leaves the new pages thrashing indefinitely instead of kicking out the cold ones. On a freshly booted, freshly ssh'd into virtual machine with 1G RAM running stock Arch Linux: [root@ham ~]# ./reclaimtest.sh + dd of=workingset-a bs=1M count=0 seek=600 + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + ./mincore workingset-a 153600/153600 workingset-a + dd of=workingset-b bs=1M count=0 seek=600 + cat workingset-b + cat workingset-b + cat workingset-b + cat workingset-b + ./mincore workingset-a workingset-b 104029/153600 workingset-a 120086/153600 workingset-b + cat workingset-b + cat workingset-b + cat workingset-b + cat workingset-b + ./mincore workingset-a workingset-b 104029/153600 workingset-a 120268/153600 workingset-b workingset-b is a 600M file on a 1G host that is otherwise entirely idle. No matter how often it's being accessed, it won't get cached. While investigating, I noticed that the non-resident information gets aggressively reclaimed - /proc/vmstat::workingset_nodereclaim. This is a problem because a workingset transition like this relies on the non-resident information tracked in the page cache tree of evicted file ranges: when the cache faults are refaults of recently evicted cache, we challenge the existing active set, and that allows a new workingset to establish itself. Tracing the shrinker that maintains this memory revealed that all page cache tree nodes were allocated to the root cgroup. 
This is a problem, because 1) the shrinker sizes the amount of non-resident information it keeps to the size of the cgroup's other memory and 2) on most major Linux distributions, only kernel threads live in the root cgroup and everything else gets put into services or session groups: [root@ham ~]# cat /proc/self/cgroup 0::/user.slice/user-0.slice/session-c1.scope As a result, we basically maintain no non-resident information for the workloads running on the system, thus breaking the caching algorithm. Looking through the code, I found the culprit in the above-mentioned patch: when switching from the radix tree to xarray, it dropped the __GFP_ACCOUNT flag from the tree node allocations - the flag that makes sure the allocated memory gets charged to and tracked by the cgroup of the calling process - in this case, the one doing the fault. To fix this, allow xarray users to specify a per-tree flag that makes xarray allocate nodes using __GFP_ACCOUNT. Then restore the page cache tree annotation to request such cgroup tracking for the cache nodes. 
With this patch applied, the page cache correctly converges on new workingsets again after just a few iterations: [root@ham ~]# ./reclaimtest.sh + dd of=workingset-a bs=1M count=0 seek=600 + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + cat workingset-a + ./mincore workingset-a 153600/153600 workingset-a + dd of=workingset-b bs=1M count=0 seek=600 + cat workingset-b + ./mincore workingset-a workingset-b 124607/153600 workingset-a 87876/153600 workingset-b + cat workingset-b + ./mincore workingset-a workingset-b 81313/153600 workingset-a 133321/153600 workingset-b + cat workingset-b + ./mincore workingset-a workingset-b 63036/153600 workingset-a 153600/153600 workingset-b Cc: stable@vger.kernel.org # 4.20+ Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2019-05-31 13:52:41 -04:00
Matthew Wilcox	4a5c8d8989	XArray: Fix xa_reserve for 2-byte aligned entries If we reserve index 0, the next entry to be stored there might be 2-byte aligned. That means we have to create the root xa_node at the time of reserving the initial entry. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-21 17:54:44 -05:00
Matthew Wilcox	2fbe967b3e	XArray: Fix xa_erase of 2-byte aligned entries xas_store() was interpreting the entry it found in the array as a node entry if the bottom two bits had value 2. That's only true if either the entry is in the root node or in a non-leaf node. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-21 17:36:45 -05:00
Matthew Wilcox	962033d55d	XArray: Use xa_cmpxchg to implement xa_reserve Jason feels this is clearer, and it saves a function and an exported symbol. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-20 17:08:54 -05:00
Matthew Wilcox	b38f6c5027	XArray: Fix xa_release in allocating arrays xa_cmpxchg() was a little too magic in turning ZERO entries into NULL, and would leave the entry set to the ZERO entry instead of releasing it for future use. After careful review of existing users of xa_cmpxchg(), change the semantics so that it does not translate either incoming argument from NULL into ZERO entries. Add several tests to the test-suite to make sure this problem doesn't come back. Reported-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-20 17:08:54 -05:00
Matthew Wilcox	2fa044e51a	XArray: Add cyclic allocation This differs slightly from the IDR equivalent in five ways. 1. It can allocate up to UINT_MAX instead of being limited to INT_MAX, like xa_alloc(). Also like xa_alloc(), it will write to the 'id' pointer before placing the entry in the XArray. 2. The 'next' cursor is allocated separately from the XArray instead of being part of the IDR. This saves memory for all the users which do not use the cyclic allocation API and suits some users better. 3. It returns -EBUSY instead of -ENOSPC. 4. It will attempt to wrap back to the minimum value on memory allocation failure as well as on an -EBUSY error, assuming that a user would rather allocate a small ID than suffer an ID allocation failure. 5. It reports whether it has wrapped, which is important to some users. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-06 13:32:25 -05:00
Matthew Wilcox	a3e4d3f97e	XArray: Redesign xa_alloc API It was too easy to forget to initialise the start index. Add an xa_limit data structure which can be used to pass min & max, and define a couple of special values for common cases. Also add some more tests cribbed from the IDR test suite. Change the return value from -ENOSPC to -EBUSY to match xa_insert(). Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-06 13:32:23 -05:00
Matthew Wilcox	3ccaf57a6a	XArray: Add support for 1s-based allocation A lot of places want to allocate IDs starting at 1 instead of 0. While the xa_alloc() API supports this, it's not very efficient if lots of IDs are allocated, due to having to walk down to the bottom of the tree to see if ID 1 is available, then all the way over to the next non-allocated ID. This method marks ID 0 as being occupied which wastes one slot in the XArray, but preserves xa_empty() as working. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-06 13:13:24 -05:00
Matthew Wilcox	fd9dc93e36	XArray: Change xa_insert to return -EBUSY Userspace translates EEXIST to "File exists" which isn't a very good error message for the problem. "Device or resource busy" is a better indication of what went wrong. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-06 13:12:15 -05:00
Matthew Wilcox	809ab9371c	XArray: Update xa_erase family descriptions xa_erase does not allocate memory and doesn't have a gfp parameter. Update the descriptions of all four variants to be more useful. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-02-04 23:16:58 -05:00
Matthew Wilcox	b0606fed6e	XArray: Honour reserved entries in xa_insert xa_insert() should treat reserved entries as occupied, not as available. Also, it should treat requests to insert a NULL pointer as a request to reserve the slot. Add xa_insert_bh() and xa_insert_irq() for completeness. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-01-06 22:12:58 -05:00
Matthew Wilcox	76b4e52995	XArray: Permit storing 2-byte-aligned pointers On m68k, statically allocated pointers may only be two-byte aligned. This clashes with the XArray's method for tagging internal pointers. Permit storing these pointers in single slots (ie not in multislots). Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-01-06 22:12:57 -05:00
Matthew Wilcox	02669b17a4	XArray: Turn xa_init_flags into a static inline A regular xa_init_flags() put all dynamically-initialised XArrays into the same locking class. That leads to lockdep believing that taking one XArray lock while holding another is a deadlock. It's possible to work around some of these situations with separate locking classes for irq/bh/regular XArrays, and SINGLE_DEPTH_NESTING, but that's ugly, and it doesn't work for all situations (where we have completely unrelated XArrays). Signed-off-by: Matthew Wilcox <willy@infradead.org>	2019-01-06 21:24:43 -05:00
Matthew Wilcox	48483614de	XArray: Fix xa_alloc when id exceeds max Specifying a starting ID greater than the maximum ID isn't something attempted very often, but it should fail. It was succeeding due to xas_find_marked() returning the wrong error state, so add tests for both xa_alloc() and xas_find_marked(). Fixes: `b803b42823` ("xarray: Add XArray iterators") Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-12-13 14:07:33 -05:00
Matthew Wilcox	44a4a66b61	XArray: Correct xa_store_range The explicit '64' should have been BITS_PER_LONG, but while looking at this code I realised I meant to use __ffs(), not ilog2(). Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-16 16:27:28 -05:00
Matthew Wilcox	804dfaf01b	XArray: Fix Documentation Minor fixes. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 16:38:10 -05:00
Matthew Wilcox	d9c480435a	XArray: Handle NULL pointers differently for allocation For allocating XArrays, it makes sense to distinguish between erasing an entry and storing NULL. Storing NULL keeps the index allocated with a NULL pointer associated with it while xa_erase() frees the index. Some existing IDR users rely on this ability. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 16:38:09 -05:00
Matthew Wilcox	611f318637	XArray: Unify xa_store and __xa_store Saves around 115 bytes on a tinyconfig build and reduces the amount of code duplication in the XArray implementation. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 16:38:09 -05:00
Matthew Wilcox	9c16bb8890	XArray: Turn xa_erase into an exported function Make xa_erase() take the spinlock and then call __xa_erase(), but make it out of line since it's such a common function. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 16:38:09 -05:00
Matthew Wilcox	c5beb07e7a	XArray: Unify xa_cmpxchg and __xa_cmpxchg xa_cmpxchg() was one of the largest functions in the xarray implementation. By turning it into a wrapper and having the callers take the lock (like several other functions), we save 160 bytes on a tinyconfig build and reduce the duplication in xarray.c. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 16:38:08 -05:00
Matthew Wilcox	4c0608f4a0	XArray: Regularise xa_reserve The xa_reserve() function was a little unusual in that it attempted to be callable for all kinds of locking scenarios. Make it look like the other APIs with __xa_reserve, xa_reserve_bh and xa_reserve_irq variants. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 16:38:08 -05:00
Matthew Wilcox	9ee5a3b7ee	XArray: Export __xa_foo to non-GPL modules Without this, it's not possible to use static inlines like xa_store_bh() and xa_erase_irq(). Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 14:56:58 -05:00
Matthew Wilcox	8229706e03	XArray: Fix xa_for_each with a single element at 0 The following sequence of calls would result in an infinite loop in xa_find_after(): xa_store(xa, 0, x, GFP_KERNEL); index = 0; xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) { } xa_find_after() was confusing the situation where we found no entry in the tree with finding a multiorder entry, so it would look for the successor entry forever. Just check for this case explicitly. Includes a few new checks in the test suite to be sure this doesn't reappear. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-11-05 14:56:46 -05:00
Matthew Wilcox	0e9446c35a	xarray: Add range store functionality This version of xa_store_range() really only supports load and store. Our only user only needs basic load and store functionality, so there's no need to do the extra work to support marking and overlapping stores correctly yet. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-10-21 10:46:46 -04:00
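
The cyclic allocation behaviour described in commit `2fa044e51a` above (a caller-owned 'next' cursor, -EBUSY when the range is full, wrapping back to the minimum, and reporting the wrap) can be illustrated with a small userspace sketch. This is not the kernel implementation: `toy_ids`, `toy_find_free()`, `toy_alloc_cyclic()` and `TOY_SLOTS` are hypothetical names invented for this example, the backing store is a plain boolean array rather than an XArray, and memory allocation failure is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SLOTS 8u /* hypothetical tiny ID space, for illustration only */

/* Toy stand-in for an allocating XArray: one flag per ID. */
struct toy_ids {
    bool used[TOY_SLOTS];
};

/* Find the lowest free ID in [lo, hi]; store it in *id and return true. */
static bool toy_find_free(const struct toy_ids *t, uint32_t lo, uint32_t hi,
                          uint32_t *id)
{
    for (uint32_t i = lo; i <= hi && i < TOY_SLOTS; i++) {
        if (!t->used[i]) {
            *id = i;
            return true;
        }
    }
    return false;
}

/*
 * Allocate an ID in [min, max], starting the search at the caller-owned
 * cursor *next. Returns 0 on success without wrapping, 1 on success after
 * wrapping back to min, and -EBUSY if no ID in the range is free.
 */
static int toy_alloc_cyclic(struct toy_ids *t, uint32_t *id, uint32_t min,
                            uint32_t max, uint32_t *next)
{
    uint32_t start = *next < min ? min : *next;
    uint32_t found = 0;
    int wrapped = 0;

    assert(min <= max);

    /* First pass: search from the cursor up to max. */
    if (!(start <= max && toy_find_free(t, start, max, &found))) {
        /* Nothing free at or above the cursor: retry from min. */
        if (!toy_find_free(t, min, max, &found))
            return -EBUSY;
        wrapped = 1;
    }

    t->used[found] = true;
    *id = found;
    *next = found + 1; /* the cursor lives outside the ID store */
    return wrapped;
}
```

With `min = 1` and `max = 3`, three calls hand out IDs 1, 2 and 3 and return 0. If ID 1 is then released, the next call returns 1 (wrapped) with ID 1, and a further call returns -EBUSY because the range is full again. Keeping `next` in the caller mirrors point 2 of the commit message: users that never allocate cyclically pay nothing for the cursor.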


38 Commits