Commit Graph

82804 Commits

Author SHA1 Message Date
Linus Torvalds
57012c5753 Merge tag 'net-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from can, netfilter.

  Current release - regressions:

   - core: fix splice_to_socket() for O_NONBLOCK socket

   - af_unix: fix fortify_panic() in unix_bind_bsd().

   - can: raw: fix lockdep issue in raw_release()

  Previous releases - regressions:

   - tcp: reduce chance of collisions in inet6_hashfn().

   - netfilter: skip immediate deactivate in _PREPARE_ERROR

   - tipc: stop tipc crypto on failure in tipc_node_create

   - eth: igc: fix kernel panic during ndo_tx_timeout callback

   - eth: iavf: fix potential deadlock on allocation failure

  Previous releases - always broken:

   - ipv6: fix bug where deleting a mngtmpaddr can create a new
     temporary address

   - eth: ice: fix memory management in ice_ethtool_fdir.c

   - eth: hns3: fix the imp capability bit cannot exceed 32 bits issue

   - eth: vxlan: calculate correct header length for GPE

   - eth: stmmac: apply redundant write work around on 4.xx too"

* tag 'net-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  tipc: stop tipc crypto on failure in tipc_node_create
  af_unix: Terminate sun_path when bind()ing pathname socket.
  tipc: check return value of pskb_trim()
  benet: fix return value check in be_lancer_xmit_workarounds()
  virtio-net: fix race between set queues and probe
  net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
  splice, net: Fix splice_to_socket() for O_NONBLOCK socket
  net: fec: tx processing does not call XDP APIs if budget is 0
  mptcp: more accurate NL event generation
  selftests: mptcp: join: only check for ip6tables if needed
  tools: ynl-gen: fix parse multi-attr enum attribute
  tools: ynl-gen: fix enum index in _decode_enum(..)
  netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
  netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
  netfilter: nft_set_rbtree: fix overlap expiration walk
  igc: Fix Kernel Panic during ndo_tx_timeout callback
  net: dsa: qca8k: fix mdb add/del case with 0 VID
  net: dsa: qca8k: fix broken search_and_del
  net: dsa: qca8k: fix search_and_insert wrong handling of new rule
  net: dsa: qca8k: enable use_single_write for qca8xxx
  ...
2023-07-27 12:27:37 -07:00
Linus Torvalds
64de76ce8e Merge tag 'for-6.5-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fix accounting of global block reserve size when block group tree is
   enabled

 - the async discard has been enabled in 6.2 unconditionally, but for
   zoned mode it does not make that much sense to do it asynchronously
   as the zones are reset as needed

 - error handling and proper error value propagation fixes

* tag 'for-6.5-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: check for commit error at btrfs_attach_transaction_barrier()
  btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
  btrfs: remove BUG_ON()'s in add_new_free_space()
  btrfs: account block group tree when calculating global reserve size
  btrfs: zoned: do not enable async discard
2023-07-27 11:44:08 -07:00
Jan Stancek
0f0fa27b87 splice, net: Fix splice_to_socket() for O_NONBLOCK socket
LTP sendfile07 [1], which expects sendfile() to return EAGAIN when
transferring data from regular file to a "full" O_NONBLOCK socket,
started failing after commit 2dc334f1a6 ("splice, net: Use
sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()").
sendfile() no longer immediately returns, but now blocks.

Removed sock_sendpage() handled this case by setting a MSG_DONTWAIT
flag, fix new splice_to_socket() to do the same for O_NONBLOCK sockets.

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/sendfile/sendfile07.c

Fixes: 2dc334f1a6 ("splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()")
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Xi Ruoyao <xry111@xry111.site>
Link: https://lore.kernel.org/r/023c0e21e595e00b93903a813bc0bfb9a5d7e368.1690219914.git.jstancek@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-26 21:56:06 -07:00
Linus Torvalds
f40125c0a1 Merge tag '6.5-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull ksmbd server fixes from Steve French:

 - fixes for two possible out of bounds access (in negotiate, and in
   decrypt msg)

 - fix unsigned compared to zero warning

 - fix path lookup crossing a mountpoint

 - fix case when first compound request is a tree connect

 - fix memory leak if reads are compounded

* tag '6.5-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix out of bounds in init_smb2_rsp_hdr()
  ksmbd: no response from compound read
  ksmbd: validate session id and tree id in compound request
  ksmbd: fix out of bounds in smb3_decrypt_req()
  ksmbd: check if a mount point is crossed during path lookup
  ksmbd: Fix unsigned expression compared with zero
2023-07-26 11:20:36 -07:00
Filipe Manana
b28ff3a7d7 btrfs: check for commit error at btrfs_attach_transaction_barrier()
btrfs_attach_transaction_barrier() is used to get a handle pointing to the
current running transaction if the transaction has not started its commit
yet (its state is < TRANS_STATE_COMMIT_START). If the transaction commit
has started, then we wait for the transaction to commit and finish before
returning - however we completely ignore if the transaction was aborted
due to some error during its commit, we simply return ERR_PT(-ENOENT),
which makes the caller assume everything is fine and no errors happened.

This could make an fsync return success (0) to user space when in fact we
had a transaction abort and the target inode changes were therefore not
persisted.

Fix this by checking for the return value from btrfs_wait_for_commit(),
and if it returned an error, return it back to the caller.

Fixes: d4edf39bd5 ("Btrfs: fix uncompleted transaction")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-26 13:57:47 +02:00
Eric Snowberg
18b44bc5a6 ovl: Always reevaluate the file signature for IMA
Commit db1d1e8b98 ("IMA: use vfs_getattr_nosec to get the i_version")
partially closed an IMA integrity issue when directly modifying a file
on the lower filesystem.  If the overlay file is first opened by a user
and later the lower backing file is modified by root, but the extended
attribute is NOT updated, the signature validation succeeds with the old
original signature.

Update the super_block s_iflags to SB_I_IMA_UNVERIFIABLE_SIGNATURE to
force signature reevaluation on every file access until a fine grained
solution can be found.

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-25 15:36:22 -07:00
Linus Torvalds
0b4a9fdc93 Merge tag 'nfsd-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:

 - Fix TEST_STATEID response

* tag 'nfsd-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: Remove incorrect check in nfsd4_validate_stateid
2023-07-25 13:54:04 -07:00
Christian Brauner
20ea1e7d13 file: always lock position for FMODE_ATOMIC_POS
The pidfd_getfd() system call allows a caller with ptrace_may_access()
abilities on another process to steal a file descriptor from this
process. This system call is used by debuggers, container runtimes,
system call supervisors, networking proxies etc. So while it is a
special interest system call it is used in common tools.

That ability ends up breaking our long-time optimization in fdget_pos(),
which "knew" that if we had exclusive access to the file descriptor
nobody else could access it, and we didn't need the lock for the file
position.

That check for file_count(file) was always fairly subtle - it depended
on __fdget() not incrementing the file count for single-threaded
processes and thus included that as part of the rule - but it did mean
that we didn't need to take the lock in all those traditional unix
process contexts.

So it's sad to see this go, and I'd love to have some way to re-instate
the optimization. At the same time, the lock obviously isn't ever
contended in the case we optimized, so all we were optimizing away is
the atomics and the cacheline dirtying. Let's see if anybody even
notices that the optimization is gone.

Link: https://lore.kernel.org/linux-fsdevel/20230724-vfs-fdget_pos-v1-1-a4abfd7103f3@kernel.org/
Fixes: 8649c322f7 ("pid: Implement pidfd_getfd syscall")
Cc: stable@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-24 10:16:36 -07:00
Filipe Manana
bf7ecbe987 btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
At btrfs_wait_for_commit() we wait for a transaction to finish and then
always return 0 (success) without checking if it was aborted, in which
case the transaction didn't happen due to some critical error. Fix this
by checking if the transaction was aborted.

Fixes: 462045928b ("Btrfs: add START_SYNC, WAIT_SYNC ioctls")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-24 18:06:27 +02:00
Filipe Manana
d8ccbd2191 btrfs: remove BUG_ON()'s in add_new_free_space()
At add_new_free_space() we have these BUG_ON()'s that are there to deal
with any failure to add free space to the in memory free space cache.
Such failures are mostly -ENOMEM that should be very rare. However there's
no need to have these BUG_ON()'s, we can just return any error to the
caller and all callers and their upper call chain are already dealing with
errors.

So just make add_new_free_space() return any errors, while removing the
BUG_ON()'s, and returning the total amount of added free space to an
optional u64 pointer argument.

Reported-by: syzbot+3ba856e07b7127889d8c@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/000000000000e9cb8305ff4e8327@google.com/
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-24 18:06:05 +02:00
Linus Torvalds
15b593ba68 Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
  checkpoint code"

* tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
  ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
  ext4: correct inline offset when handling xattrs in inode body
  jbd2: remove __journal_try_to_free_buffer()
  jbd2: fix a race when checking checkpoint buffer busy
  jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
  jbd2: remove journal_clean_one_cp_list()
  jbd2: remove t_checkpoint_io_list
  jbd2: recheck chechpointing non-dirty buffer
2023-07-23 10:21:49 -07:00
Linus Torvalds
8266f53b39 Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fix from Steve French:
 "Add minor debugging improvement.

  The change improves ability to read a network trace to debug problems
  on encrypted connections which are very common (e.g. using wireshark
  or tcpdump).

  That works today with tools like 'smbinfo keys /mnt/file' but requires
  passing in a filename on the mount (see e.g. [1]), but it often makes
  more sense to just pass in the mount point path (ie a directory not a
  filename).

  So this fix was needed to debug some types of problems (an obvious
  example is on an encrypted connection failing operations on an empty
  share or with no files in the root of the directory) - so you can
  simply pass in the 'smbinfo keys <mntpoint>' and get the information
  that wireshark needs"

Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1]

* tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module version number for cifs.ko
  cifs: allow dumping keys for directories too
2023-07-23 10:16:44 -07:00
Namjae Jeon
536bb492d3 ksmbd: fix out of bounds in init_smb2_rsp_hdr()
If client send smb2 negotiate request and then send smb1 negotiate
request, init_smb2_rsp_hdr is called for smb1 negotiate request since
need_neg is set to false. This patch ignore smb1 packets after ->need_neg
is set to false.

Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21541
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-23 10:25:11 -05:00
Namjae Jeon
e202a1e863 ksmbd: no response from compound read
ksmbd doesn't support compound read. If client send read-read in
compound to ksmbd, there can be memory leak from read buffer.
Windows and linux clients doesn't send it to server yet. For now,
No response from compound read. compound read will be supported soon.

Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21587, ZDI-CAN-21588
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-23 10:25:11 -05:00
Namjae Jeon
3df0411e13 ksmbd: validate session id and tree id in compound request
`smb2_get_msg()` in smb2_get_ksmbd_tcon() and smb2_check_user_session()
will always return the first request smb2 header in a compound request.
if `SMB2_TREE_CONNECT_HE` is the first command in compound request, will
return 0, i.e. The tree id check is skipped.
This patch use ksmbd_req_buf_next() to get current command in compound.

Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21506
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-23 10:25:11 -05:00
Namjae Jeon
dc318846f3 ksmbd: fix out of bounds in smb3_decrypt_req()
smb3_decrypt_req() validate if pdu_length is smaller than
smb2_transform_hdr size.

Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21589
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-23 10:25:11 -05:00
Namjae Jeon
2b57a4322b ksmbd: check if a mount point is crossed during path lookup
Since commit 74d7970feb ("ksmbd: fix racy issue from using ->d_parent and
->d_name"), ksmbd can not lookup cross mount points. If last component is
a cross mount point during path lookup, check if it is crossed to follow it
down. And allow path lookup to cross a mount point when a crossmnt
parameter is set to 'yes' in smb.conf.

Cc: stable@vger.kernel.org
Fixes: 74d7970feb ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-23 10:25:11 -05:00
Ojaswin Mujoo
9d3de7ee19 ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
During allocations, while looking for preallocations(PA) in the per
inode rbtree, we can't do a direct traversal of the tree because
ext4_mb_discard_group_preallocation() can paralelly mark the pa deleted
and that can cause direct traversal to skip some entries. This was
leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
our request and ultimately tried to create a new PA that would overlap
with the missed one.

To makes sure we handle that case while still keeping the performance of
the rbtree, we make use of the fact that the only pa that could possibly
overlap the original goal start is the one that satisfies the below
conditions:

  1. It must have it's logical start immediately to the left of
  (ie less than) original logical start.

  2. It must not be deleted

To find this pa we use the following traversal method:

1. Descend into the rbtree normally to find the immediate neighboring
PA. Here we keep descending irrespective of if the PA is deleted or if
it overlaps with our request etc. The goal is to find an immediately
adjacent PA.

2. If the found PA is on right of original goal, use rb_prev() to find
the left adjacent PA.

3. Check if this PA is deleted and keep moving left with rb_prev() until
a non deleted PA is found.

4. This is the PA we are looking for. Now we can check if it can satisfy
the original request and proceed accordingly.

This approach also takes care of having deleted PAs in the tree.

(While we are at it, also fix a possible overflow bug in calculating the
end of a PA)

[1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/

Cc: stable@kernel.org # 6.4
Fixes: 3872778664 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Tested-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23 08:21:14 -04:00
Ojaswin Mujoo
5d5460fa79 ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
In ext4_mb_choose_next_group_best_avail(), we want the start order to be
1 less than goal length and the min_order to be, at max, 1 more than the
original length. This commit fixes an off by one issue that arose due to
the fact that 1 << fls(n) > (n).

After all the processing:

order = 1 order below goal len
min_order = maximum of the three:-
             - order - trim_order
             - 1 order below B2C(s_stripe)
             - 1 order above original len

Cc: stable@kernel.org
Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23 08:21:14 -04:00
Eric Whitney
6909cf5c41 ext4: correct inline offset when handling xattrs in inode body
When run on a file system where the inline_data feature has been
enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
to emit error messages indicating that inline directory entries are
corrupted.  This occurs because the inline offset used to locate
inline directory entries in the inode body is not updated when an
xattr in that shared region is deleted and the region is shifted in
memory to recover the space it occupied.  If the deleted xattr precedes
the system.data attribute, which points to the inline directory entries,
that attribute will be moved further up in the region.  The inline
offset continues to point to whatever is located in system.data's former
location, with unfortunate effects when used to access directory entries
or (presumably) inline data in the inode body.

Cc: stable@kernel.org
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23 08:21:05 -04:00
Steve French
ba61a03af2 cifs: update internal module version number for cifs.ko
From 2.43 to 2.44

Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-22 14:43:13 -05:00
Shyam Prasad N
b3edef6b9c cifs: allow dumping keys for directories too
Dumping the enc/dec keys is a session wide operation.
And it should not matter if the ioctl was run on
a regular file or a directory.

Currently, we obtain the tcon pointer from the
cifs file handle. But since there's no dir open call
in cifs, this is not populated for dirs.

This change allows dumping of session keys using ioctl
even for directories. To do this, we'll now get the
tcon pointer from the superblock, and not from the file
handle.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-22 14:42:54 -05:00
Filipe Manana
8dbfc14fc7 btrfs: account block group tree when calculating global reserve size
When using the block group tree feature, this tree is a critical tree just
like the extent, csum and free space trees, and just like them it uses the
delayed refs block reserve.

So take into account the block group tree, and its current size, when
calculating the size for the global reserve.

CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-20 19:22:54 +02:00
Naohiro Aota
95ca6599a5 btrfs: zoned: do not enable async discard
The zoned mode need to reset a zone before using it. We rely on btrfs's
original discard functionality (discarding unused block group range) to do
the resetting.

While the commit 63a7cb1307 ("btrfs: auto enable discard=async when
possible") made the discard done in an async manner, a zoned reset do not
need to be async, as it is fast enough.

Even worth, delaying zone rests prevents using those zones again. So, let's
disable async discard on the zoned mode.

Fixes: 63a7cb1307 ("btrfs: auto enable discard=async when possible")
CC: stable@vger.kernel.org # 6.3+
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update message text ]
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-20 19:18:14 +02:00
Linus Torvalds
e599e16c16 Merge tag 'iomap-6.5-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap fix from Darrick Wong:
 "Fix partial write regression.

  It turns out that fstests doesn't have any test coverage for short
  writes, but LTP does. Fortunately, this was caught right after -rc1
  was tagged.

  Summary:

   - Fix a bug wherein a failed write could clobber short write status"

* tag 'iomap-6.5-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: micro optimize the ki_pos assignment in iomap_file_buffered_write
  iomap: fix a regression for partial write errors
2023-07-20 10:10:02 -07:00