linux-rockchip

mirror of https://github.com/armbian/linux-rockchip.git synced 2026-01-06 11:08:10 -08:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	aa4cd140bb	Linux 6.1.112 Link: https://lore.kernel.org/r/20240927121719.897851549@linuxfoundation.org Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Allen Pais <apais@linux.microsoft.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: kernelci.org bot <bot@kernelci.org> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Edward Adam Davis	ba6269e187	USB: usbtmc: prevent kernel-usb-infoleak commit 625fa77151f00c1bd00d34d60d6f2e710b3f9aad upstream. The syzbot reported a kernel-usb-infoleak in usbtmc_write, we need to clear the structure before filling fields. Fixes: `4ddc645f40` ("usb: usbtmc: Add ioctl for vendor specific write") Reported-and-tested-by: syzbot+9d34f80f841e948c3fdb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9d34f80f841e948c3fdb Signed-off-by: Edward Adam Davis <eadavis@qq.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/tencent_9649AA6EC56EDECCA8A7D106C792D1C66B06@qq.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Junhao Xie	c74796ff4f	USB: serial: pl2303: add device id for Macrosilicon MS3020 commit 7d47d22444bb7dc1b6d768904a22070ef35e1fc0 upstream. Add the device id for the Macrosilicon MS3020 which is a PL2303HXN based device. Signed-off-by: Junhao Xie <bigfoot@classfun.cn> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Tony Luck	a20eea14a6	x86/mm: Switch to new Intel CPU model defines commit 2eda374e883ad297bd9fe575a16c1dc850346075 upstream. New CPU #defines encode vendor and family as well as model. [ dhansen: vertically align 0's in invlpg_miss_ids[] ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com [ Ricardo: I used the old match macro X86_MATCH_INTEL_FAM6_MODEL() instead of X86_MATCH_VFM() as in the upstream commit. I also kept the ALDERLAKE_N name instead of ATOM_GRACEMONT. Both refer to the same CPU model. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Sumeet Pawnikar	ee8adcb4c0	powercap: RAPL: fix invalid initialization for pl4_supported field commit d05b5e0baf424c8c4b4709ac11f66ab726c8deaf upstream. The current initialization of the struct x86_cpu_id via pl4_support_ids[] is partial and wrong. It is initializing "stepping" field with "X86_FEATURE_ANY" instead of "feature" field. Use X86_MATCH_INTEL_FAM6_MODEL macro instead of initializing each field of the struct x86_cpu_id for pl4_supported list of CPUs. This X86_MATCH_INTEL_FAM6_MODEL macro internally uses another macro X86_MATCH_VENDOR_FAM_MODEL_FEATURE for X86 based CPU matching with appropriate initialized values. Reported-by: Dave Hansen <dave.hansen@intel.com> Link: https://lore.kernel.org/lkml/28ead36b-2d9e-1a36-6f4e-04684e420260@intel.com Fixes: eb52bc2ae5b8 ("powercap: RAPL: Add Power Limit4 support for Meteor Lake SoC") Fixes: `b08b95cf30` ("powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P") Fixes: `5157559069` ("powercap: RAPL: Add Power Limit4 support for RaptorLake") Fixes: `1cc5b9a411` ("powercap: Add Power Limit4 support for Alder Lake SoC") Fixes: `8365a898fe` ("powercap: Add Power Limit4 support") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> [ Ricardo: I removed METEORLAKE and METEORLAKE_L from pl4_support_ids as they are not included in v6.1. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:56 +02:00
Filipe Manana	563df8b411	btrfs: calculate the right space for delayed refs when updating global reserve commit f8f210dc84709804c9f952297f2bfafa6ea6b4bd upstream. When updating the global block reserve, we account for the 6 items needed by an unlink operation and the 6 delayed references for each one of those items. However the calculation for the delayed references is not correct in case we have the free space tree enabled, as in that case we need to touch the free space tree as well and therefore need twice the number of bytes. So use the btrfs_calc_delayed_ref_bytes() helper to calculate the number of bytes need for the delayed references at btrfs_update_global_block_rsv(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [Diogo: this patch has been cherry-picked from the original commit; conflicts included lack of a define (picked from commit 5630e2bcfe223) and lack of btrfs_calc_delayed_ref_bytes (picked from commit 0e55a54502b97) - changed const struct -> struct for compatibility.] Signed-off-by: Diogo Jahchan Koike <djahchankoike@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Matthieu Baerts (NGI0)	2626cbee1f	selftests: mptcp: join: restrict fullmesh endp on 1st sf commit 49ac6f05ace5bb0070c68a0193aa05d3c25d4c83 upstream. A new endpoint using the IP of the initial subflow has been recently added to increase the code coverage. But it breaks the test when using old kernels not having commit `86e39e0448` ("mptcp: keep track of local endpoint still available for each msk"), e.g. on v5.15. Similar to commit d4c81bbb8600 ("selftests: mptcp: join: support local endpoint being tracked or not"), it is possible to add the new endpoint conditionally, by checking if "mptcp_pm_subflow_check_next" is present in kallsyms: this is not directly linked to the commit introducing this symbol but for the parent one which is linked anyway. So we can know in advance what will be the expected behaviour, and add the new endpoint only when it makes sense to do so. Fixes: 4878f9f8421f ("selftests: mptcp: join: validate fullmesh endp on 1st sf") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-1-8f124aa9156d@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ Conflicts in mptcp_join.sh, because the 'run_tests' helper has been modified in multiple commits that are not in this version, e.g. commit e571fb09c893 ("selftests: mptcp: add speed env var"). The conflict was in the context, the new lines can still be added at the same place. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Marc Kleine-Budde	0ba8b599c3	can: mcp251xfd: move mcp251xfd_timestamp_start()/stop() into mcp251xfd_chip_start/stop() commit a7801540f325d104de5065850a003f1d9bdc6ad3 upstream. The mcp251xfd wakes up from Low Power or Sleep Mode when SPI activity is detected. To avoid this, make sure that the timestamp worker is stopped before shutting down the chip. Split the starting of the timestamp worker out of mcp251xfd_timestamp_init() into the separate function mcp251xfd_timestamp_start(). Call mcp251xfd_timestamp_init() before mcp251xfd_chip_start(), move mcp251xfd_timestamp_start() to mcp251xfd_chip_start(). In this way, mcp251xfd_timestamp_stop() can be called unconditionally by mcp251xfd_chip_stop(). Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Marc Kleine-Budde	88047c4b2d	can: mcp251xfd: properly indent labels commit 51b2a721612236335ddec4f3fb5f59e72a204f3a upstream. To fix the coding style, remove the whitespace in front of labels. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Hagar Hemdan	672c19165f	gpio: prevent potential speculation leaks in gpio_device_get_desc() commit d795848ecce24a75dfd46481aee066ae6fe39775 upstream. Userspace may trigger a speculative read of an address outside the gpio descriptor array. Users can do that by calling gpio_ioctl() with an offset out of range. Offset is copied from user and then used as an array index to get the gpio descriptor without sanitization in gpio_device_get_desc(). This change ensures that the offset is sanitized by using array_index_nospec() to mitigate any possibility of speculative information leaks. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Link: https://lore.kernel.org/r/20240523085332.1801-1-hagarhem@amazon.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Hugo SIMELIERE <hsimeliere.opensource@witekio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Kent Gibson	5c3a421c1f	gpiolib: cdev: Ignore reconfiguration without direction commit b440396387418fe2feaacd41ca16080e7a8bc9ad upstream. linereq_set_config() behaves badly when direction is not set. The configuration validation is borrowed from linereq_create(), where, to verify the intent of the user, the direction must be set to in order to effect a change to the electrical configuration of a line. But, when applied to reconfiguration, that validation does not allow for the unset direction case, making it possible to clear flags set previously without specifying the line direction. Adding to the inconsistency, those changes are not immediately applied by linereq_set_config(), but will take effect when the line value is next get or set. For example, by requesting a configuration with no flags set, an output line with GPIO_V2_LINE_FLAG_ACTIVE_LOW and GPIO_V2_LINE_FLAG_OPEN_DRAIN set could have those flags cleared, inverting the sense of the line and changing the line drive to push-pull on the next line value set. Skip the reconfiguration of lines for which the direction is not set, and only reconfigure the lines for which direction is set. Fixes: `a54756cb24` ("gpiolib: cdev: support GPIO_V2_LINE_SET_CONFIG_IOCTL") Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240626052925.174272-3-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Ping-Ke Shih	e388656a85	Revert "wifi: cfg80211: check wiphy mutex is held for wdev mutex" This reverts commit `19d13ec00a` which is commmit 1474bc87fe57deac726cc10203f73daa6c3212f7 upstream. The reverted commit is based on implementation of wiphy locking that isn't planned to redo on a stable kernel, so revert it to avoid warning: WARNING: CPU: 0 PID: 9 at net/wireless/core.h:231 disconnect_work+0xb8/0x144 [cfg80211] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.51-00141-ga1649b6f8ed6 #7 Hardware name: Freescale i.MX6 SoloX (Device Tree) Workqueue: events disconnect_work [cfg80211] unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from __warn+0x70/0x1c0 __warn from warn_slowpath_fmt+0x16c/0x294 warn_slowpath_fmt from disconnect_work+0xb8/0x144 [cfg80211] disconnect_work [cfg80211] from process_one_work+0x204/0x620 process_one_work from worker_thread+0x1b0/0x474 worker_thread from kthread+0x10c/0x12c kthread from ret_from_fork+0x14/0x24 Reported-by: petter@technux.se Closes: https://lore.kernel.org/linux-wireless/9e98937d781c990615ef27ee0c858ff9@technux.se/T/#t Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:55 +02:00
Pablo Neira Ayuso	ddeead4761	netfilter: nf_tables: missing iterator type in lookup walk commit efefd4f00c967d00ad7abe092554ffbb70c1a793 upstream. Add missing decorator type to lookup expression and tighten WARN_ON_ONCE check in pipapo to spot earlier that this is unset. Fixes: 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Pablo Neira Ayuso	52735a010f	netfilter: nft_set_pipapo: walk over current view on netlink dump commit 29b359cf6d95fd60730533f7f10464e95bd17c73 upstream. The generation mask can be updated while netlink dump is in progress. The pipapo set backend walk iterator cannot rely on it to infer what view of the datastructure is to be used. Add notation to specify if user wants to read/update the set. Based on patch from Florian Westphal. Fixes: 2b84e215f874 ("netfilter: nft_set_pipapo: .walk does not deal with generations") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Dan Carpenter	8a64f87e74	netfilter: nft_socket: Fix a NULL vs IS_ERR() bug in nft_socket_cgroup_subtree_level() commit 7052622fccb1efb850c6b55de477f65d03525a30 upstream. The cgroup_get_from_path() function never returns NULL, it returns error pointers. Update the error handling to match. Fixes: 7f3287db6543 ("netfilter: nft_socket: make cgroupsv2 matching work with namespaces") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/bbc0c4e0-05cc-4f44-8797-2f4b3920a820@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Florian Westphal	ace0db36b4	netfilter: nft_socket: make cgroupsv2 matching work with namespaces commit 7f3287db654395f9c5ddd246325ff7889f550286 upstream. When running in container environmment, /sys/fs/cgroup/ might not be the real root node of the sk-attached cgroup. Example: In container: % stat /sys//fs/cgroup/ Device: 0,21 Inode: 2214 .. % stat /sys/fs/cgroup/foo Device: 0,21 Inode: 2264 .. The expectation would be for: nft add rule .. socket cgroupv2 level 1 "foo" counter to match traffic from a process that got added to "foo" via "echo $pid > /sys/fs/cgroup/foo/cgroup.procs". However, 'level 3' is needed to make this work. Seen from initial namespace, the complete hierarchy is: % stat /sys/fs/cgroup/system.slice/docker-.../foo Device: 0,21 Inode: 2264 .. i.e. hierarchy is 0 1 2 3 / -> system.slice -> docker-1... -> foo ... but the container doesn't know that its "/" is the "docker-1.." cgroup. Current code will retrieve the 'system.slice' cgroup node and store its kn->id in the destination register, so compare with 2264 ("foo" cgroup id) will not match. Fetch "/" cgroup from ->init() and add its level to the level we try to extract. cgroup root-level is 0 for the init-namespace or the level of the ancestor that is exposed as the cgroup root inside the container. In the above case, cgrp->level of "/" resolved in the container is 2 (docker-1...scope/) and request for 'level 1' will get adjusted to fetch the actual level (3). v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it. (kernel test robot) Cc: cgroups@vger.kernel.org Fixes: `e0bb96db96` ("netfilter: nft_socket: add support for cgroupsv2") Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Dave Chinner	5899daf1d8	xfs: journal geometry is not properly bounds checked [ Upstream commit f1e1765aad7de7a8b8102044fc6a44684bc36180 ] If the journal geometry results in a sector or log stripe unit validation problem, it indicates that we cannot set the log up to safely write to the the journal. In these cases, we must abort the mount because the corruption needs external intervention to resolve. Similarly, a journal that is too large cannot be written to safely, either, so we shouldn't allow those geometries to mount, either. If the log is too small, we risk having transaction reservations overruning the available log space and the system hanging waiting for space it can never provide. This is purely a runtime hang issue, not a corruption issue as per the first cases listed above. We abort mounts of the log is too small for V5 filesystems, but we must allow v4 filesystems to mount because, historically, there was no log size validity checking and so some systems may still be out there with undersized logs. The problem is that on V4 filesystems, when we discover a log geometry problem, we skip all the remaining checks and then allow the log to continue mounting. This mean that if one of the log size checks fails, we skip the log stripe unit check. i.e. we allow the mount because a "non-fatal" geometry is violated, and then fail to check the hard fail geometries that should fail the mount. Move all these fatal checks to the superblock verifier, and add a new check for the two log sector size geometry variables having the same values. This will prevent any attempt to mount a log that has invalid or inconsistent geometries long before we attempt to mount the log. However, for the minimum log size checks, we can only do that once we've setup up the log and calculated all the iclog sizes and roundoffs. Hence this needs to remain in the log mount code after the log has been initialised. It is also the only case where we should allow a v4 filesystem to continue running, so leave that handling in place, too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	68e6efe0d4	xfs: set bnobt/cntbt numrecs correctly when formatting new AGs [ Upstream commit 8e698ee72c4ecbbf18264568eb310875839fd601 ] Through generic/300, I discovered that mkfs.xfs creates corrupt filesystems when given these parameters: Filesystems formatted with --unsupported are not supported!! meta-data=/dev/sda isize=512 agcount=8, agsize=16352 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=1 = reflink=1 bigtime=1 inobtcount=1 nrext64=1 data = bsize=4096 blocks=130816, imaxpct=25 = sunit=32 swidth=128 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=8192, version=2 = sectsz=512 sunit=32 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 = rgcount=0 rgsize=0 blks Discarding blocks...Done. Phase 1 - find and verify superblock... - reporting progress in intervals of 15 minutes Phase 2 - using internal log - zero log... - 16:30:50: zeroing log - 16320 of 16320 blocks done - scan filesystem freespace and inode maps... agf_freeblks 25, counted 0 in ag 4 sb_fdblocks 8823, counted 8798 The root cause of this problem is the numrecs handling in xfs_freesp_init_recs, which is used to initialize a new AG. Prior to calling the function, we set up the new bnobt block with numrecs == 1 and rely on _freesp_init_recs to format that new record. If the last record created has a blockcount of zero, then it sets numrecs = 0. That last bit isn't correct if the AG contains the log, the start of the log is not immediately after the initial blocks due to stripe alignment, and the end of the log is perfectly aligned with the end of the AG. For this case, we actually formatted a single bnobt record to handle the free space before the start of the (stripe aligned) log, and incremented arec to try to format a second record. That second record turned out to be unnecessary, so what we really want is to leave numrecs at 1. The numrecs handling itself is overly complicated because a different function sets numrecs == 1. Change the bnobt creation code to start with numrecs set to zero and only increment it after successfully formatting a free space extent into the btree block. Fixes: `f327a00745` ("xfs: account for log space when formatting new AGs") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	af871df651	xfs: fix reloading entire unlinked bucket lists [ Upstream commit 537c013b140d373d1ffe6290b841dc00e67effaa ] During review of the patcheset that provided reloading of the incore iunlink list, Dave made a few suggestions, and I updated the copy in my dev tree. Unfortunately, I then got distracted by ... who even knows what ... and forgot to backport those changes from my dev tree to my release candidate branch. I then sent multiple pull requests with stale patches, and that's what was merged into -rc3. So. This patch re-adds the use of an unlocked iunlink list check to determine if we want to allocate the resources to recreate the incore list. Since lost iunlinked inodes are supposed to be rare, this change helps us avoid paying the transaction and AGF locking costs every time we open any inode. This also re-adds the shutdowns on failure, and re-applies the restructuring of the inner loop in xfs_inode_reload_unlinked_bucket, and re-adds a requested comment about the quotachecking code. Retain the original RVB tag from Dave since there's no code change from the last submission. Fixes: 68b957f64fca1 ("xfs: load uncached unlinked inodes into memory on demand") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	62ca591045	xfs: make inode unlinked bucket recovery work with quotacheck [ Upstream commit 49813a21ed57895b73ec4ed3b99d4beec931496f ] Teach quotacheck to reload the unlinked inode lists when walking the inode table. This requires extra state handling, since it's possible that a reloaded inode will get inactivated before quotacheck tries to scan it; in this case, we need to ensure that the reloaded inode does not have dquots attached when it is freed. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:54 +02:00
Darrick J. Wong	e9d1551f80	xfs: reload entire unlinked bucket lists [ Upstream commit 83771c50e42b92de6740a63e152c96c052d37736 ] The previous patch to reload unrecovered unlinked inodes when adding a newly created inode to the unlinked list is missing a key piece of functionality. It doesn't handle the case that someone calls xfs_iget on an inode that is not the last item in the incore list. For example, if at mount time the ondisk iunlink bucket looks like this: AGI -> 7 -> 22 -> 3 -> NULL None of these three inodes are cached in memory. Now let's say that someone tries to open inode 3 by handle. We need to walk the list to make sure that inodes 7 and 22 get loaded cold, and that the i_prev_unlinked of inode 3 gets set to 22. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Darrick J. Wong	8ffd3ae7a0	xfs: use i_prev_unlinked to distinguish inodes that are not on the unlinked list [ Upstream commit f12b96683d6976a3a07fdf3323277c79dbe8f6ab ] Alter the definition of i_prev_unlinked slightly to make it more obvious when an inode with 0 link count is not part of the iunlink bucket lists rooted in the AGI. This distinction is necessary because it is not sufficient to check inode.i_nlink to decide if an inode is on the unlinked list. Updates to i_nlink can happen while holding only ILOCK_EXCL, but updates to an inode's position in the AGI unlinked list (which happen after the nlink update) requires both ILOCK_EXCL and the AGI buffer lock. The next few patches will make it possible to reload an entire unlinked bucket list when we're walking the inode table or performing handle operations and need more than the ability to iget the last inode in the chain. The upcoming directory repair code also needs to be able to make this distinction to decide if a zero link count directory should be moved to the orphanage or allowed to inactivate. An upcoming enhancement to the online AGI fsck code will need this distinction to check and rebuild the AGI unlinked buckets. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Shiyang Ruan	8e2147f37f	xfs: correct calculation for agend and blockcount [ Upstream commit 3c90c01e49342b166e5c90ec2c85b220be15a20e ] The agend should be "start + length - 1", then, blockcount should be "end + 1 - start". Correct 2 calculation mistakes. Also, rename "agend" to "range_agend" because it's not the end of the AG per se; it's the end of the dead region within an AG's agblock space. Fixes: 5cf32f63b0f4 ("xfs: fix the calculation for "end" and "length"") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Dave Chinner	d931b6c6a9	xfs: fix unlink vs cluster buffer instantiation race [ Upstream commit 348a1983cf4cf5099fc398438a968443af4c9f65 ] Luis has been reporting an assert failure when freeing an inode cluster during inode inactivation for a while. The assert looks like: XFS: Assertion failed: bp->b_flags & XBF_DONE, file: fs/xfs/xfs_trans_buf.c, line: 241 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:102! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 4 PID: 73 Comm: kworker/4:1 Not tainted 6.10.0-rc1 #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Workqueue: xfs-inodegc/loop5 xfs_inodegc_worker [xfs] RIP: 0010:assfail (fs/xfs/xfs_message.c:102) xfs RSP: 0018:ffff88810188f7f0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff88816e748250 RCX: 1ffffffff844b0e7 RDX: 0000000000000004 RSI: ffff88810188f558 RDI: ffffffffc2431fa0 RBP: 1ffff11020311f01 R08: 0000000042431f9f R09: ffffed1020311e9b R10: ffff88810188f4df R11: ffffffffac725d70 R12: ffff88817a3f4000 R13: ffff88812182f000 R14: ffff88810188f998 R15: ffffffffc2423f80 FS: 0000000000000000(0000) GS:ffff8881c8400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055fe9d0f109c CR3: 000000014426c002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:241 (discriminator 1)) xfs xfs_imap_to_bp (fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_inode_buf.c:138) xfs xfs_inode_item_precommit (fs/xfs/xfs_inode_item.c:145) xfs xfs_trans_run_precommits (fs/xfs/xfs_trans.c:931) xfs __xfs_trans_commit (fs/xfs/xfs_trans.c:966) xfs xfs_inactive_ifree (fs/xfs/xfs_inode.c:1811) xfs xfs_inactive (fs/xfs/xfs_inode.c:2013) xfs xfs_inodegc_worker (fs/xfs/xfs_icache.c:1841 fs/xfs/xfs_icache.c:1886) xfs process_one_work (kernel/workqueue.c:3231) worker_thread (kernel/workqueue.c:3306 (discriminator 2) kernel/workqueue.c:3393 (discriminator 2)) kthread (kernel/kthread.c:389) ret_from_fork (arch/x86/kernel/process.c:147) ret_from_fork_asm (arch/x86/entry/entry_64.S:257) </TASK> And occurs when the the inode precommit handlers is attempt to look up the inode cluster buffer to attach the inode for writeback. The trail of logic that I can reconstruct is as follows. 1. the inode is clean when inodegc runs, so it is not attached to a cluster buffer when precommit runs. 2. #1 implies the inode cluster buffer may be clean and not pinned by dirty inodes when inodegc runs. 3. #2 implies that the inode cluster buffer can be reclaimed by memory pressure at any time. 4. The assert failure implies that the cluster buffer was attached to the transaction, but not marked done. It had been accessed earlier in the transaction, but not marked done. 5. #4 implies the cluster buffer has been invalidated (i.e. marked stale). 6. #5 implies that the inode cluster buffer was instantiated uninitialised in the transaction in xfs_ifree_cluster(), which only instantiates the buffers to invalidate them and never marks them as done. Given factors 1-3, this issue is highly dependent on timing and environmental factors. Hence the issue can be very difficult to reproduce in some situations, but highly reliable in others. Luis has an environment where it can be reproduced easily by g/531 but, OTOH, I've reproduced it only once in ~2000 cycles of g/531. I think the fix is to have xfs_ifree_cluster() set the XBF_DONE flag on the cluster buffers, even though they may not be initialised. The reasons why I think this is safe are: 1. A buffer cache lookup hit on a XBF_STALE buffer will clear the XBF_DONE flag. Hence all future users of the buffer know they have to re-initialise the contents before use and mark it done themselves. 2. xfs_trans_binval() sets the XFS_BLI_STALE flag, which means the buffer remains locked until the journal commit completes and the buffer is unpinned. Hence once marked XBF_STALE/XFS_BLI_STALE by xfs_ifree_cluster(), the only context that can access the freed buffer is the currently running transaction. 3. #2 implies that future buffer lookups in the currently running transaction will hit the transaction match code and not the buffer cache. Hence XBF_STALE and XFS_BLI_STALE will not be cleared unless the transaction initialises and logs the buffer with valid contents again. At which point, the buffer will be marked marked XBF_DONE again, so having XBF_DONE already set on the stale buffer is a moot point. 4. #2 also implies that any concurrent access to that cluster buffer will block waiting on the buffer lock until the inode cluster has been fully freed and is no longer an active inode cluster buffer. 5. #4 + #1 means that any future user of the disk range of that buffer will always see the range of disk blocks covered by the cluster buffer as not done, and hence must initialise the contents themselves. 6. Setting XBF_DONE in xfs_ifree_cluster() then means the unlinked inode precommit code will see a XBF_DONE buffer from the transaction match as it expects. It can then attach the stale but newly dirtied inode to the stale but newly dirtied cluster buffer without unexpected failures. The stale buffer will then sail through the journal and do the right thing with the attached stale inode during unpin. Hence the fix is just one line of extra code. The explanation of why we have to set XBF_DONE in xfs_ifree_cluster, OTOH, is long and complex.... Fixes: 82842fee6e59 ("xfs: fix AGF vs inode cluster buffer deadlock") Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00
Darrick J. Wong	1486aeb788	xfs: fix negative array access in xfs_getbmap [ Upstream commit 1bba82fe1afac69c85c1f5ea137c8e73de3c8032 ] In commit 8ee81ed581ff, Ye Bin complained about an ASSERT in the bmapx code that trips if we encounter a delalloc extent after flushing the pagecache to disk. The ioctl code does not hold MMAPLOCK so it's entirely possible that a racing write page fault can create a delalloc extent after the file has been flushed. The proposed solution was to replace the assertion with an early return that avoids filling out the bmap recordset with a delalloc entry if the caller didn't ask for it. At the time, I recall thinking that the forward logic sounded ok, but felt hesitant because I suspected that changing this code would cause something /else/ to burst loose due to some other subtlety. syzbot of course found that subtlety. If all the extent mappings found after the flush are delalloc mappings, we'll reach the end of the data fork without ever incrementing bmv->bmv_entries. This is new, since before we'd have emitted the delalloc mappings even though the caller didn't ask for them. Once we reach the end, we'll try to set BMV_OF_LAST on the -1st entry (because bmv_entries is zero) and go corrupt something else in memory. Yay. I really dislike all these stupid patches that fiddle around with debug code and break things that otherwise worked well enough. Nobody was complaining that calling XFS_IOC_BMAPX without BMV_IF_DELALLOC would return BMV_OF_DELALLOC records, and now we've gone from "weird behavior that nobody cared about" to "bad behavior that must be addressed immediately". Maybe I'll just ignore anything from Huawei from now on for my own sake. Reported-by: syzbot+c103d3808a0de5faaf80@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-xfs/20230412024907.GP360889@frogsfrogsfrogs/ Fixes: 8ee81ed581ff ("xfs: fix BUG_ON in xfs_getbmap()") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-30 16:23:53 +02:00

1 2 3 4 5 ...

1157446 Commits