Commit Graph

387 Commits

Author SHA1 Message Date
Linus Torvalds
fcc79e1714 Merge tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
 "The most significant set of changes is the per netns RTNL. The new
  behavior is disabled by default, regression risk should be contained.

  Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
  default value from PTP_1588_CLOCK_KVM, as the first is intended to be
  a more reliable replacement for the latter.

  Core:

   - Started a very large, in-progress, effort to make the RTNL lock
     scope per network-namespace, thus reducing the lock contention
     significantly in the containerized use-case, comprising:
       - RCU-ified some relevant slices of the FIB control path
       - introduce basic per netns locking helpers
       - namespacified the IPv4 address hash table
       - remove rtnl_register{,_module}() in favour of
         rtnl_register_many()
       - refactor rtnl_{new,del,set}link() moving as much validation as
         possible out of RTNL lock
       - convert all phonet doit() and dumpit() handlers to RCU
       - convert IPv4 addresses manipulation to per-netns RTNL
       - convert virtual interface creation to per-netns RTNL
     the per-netns lock infrastructure is guarded by the
     CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim.

   - Introduce NAPI suspension, to efficiently switching between busy
     polling (NAPI processing suspended) and normal processing.

   - Migrate the IPv4 routing input, output and control path from direct
     ToS usage to DSCP macros. This is a work in progress to make ECN
     handling consistent and reliable.

   - Add drop reasons support to the IPv4 rotue input path, allowing
     better introspection in case of packets drop.

   - Make FIB seqnum lockless, dropping RTNL protection for read access.

   - Make inet{,v6} addresses hashing less predicable.

   - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
     and timestamps

  Things we sprinkled into general kernel code:

   - Add small file operations for debugfs, to reduce the struct ops
     size.

   - Refactoring and optimization for the implementation of page_frag
     API, This is a preparatory work to consolidate the page_frag
     implementation.

  Netfilter:

   - Optimize set element transactions to reduce memory consumption

   - Extended netlink error reporting for attribute parser failure.

   - Make legacy xtables configs user selectable, giving users the
     option to configure iptables without enabling any other config.

   - Address a lot of false-positive RCU issues, pointed by recent CI
     improvements.

  BPF:

   - Put xsk sockets on a struct diet and add various cleanups. Overall,
     this helps to bump performance by 12% for some workloads.

   - Extend BPF selftests to increase coverage of XDP features in
     combination with BPF cpumap.

   - Optimize and homogenize bpf_csum_diff helper for all archs and also
     add a batch of new BPF selftests for it.

   - Extend netkit with an option to delegate skb->{mark,priority}
     scrubbing to its BPF program.

   - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
     programs.

  Protocols:

   - Introduces 4-tuple hash for connected udp sockets, speeding-up
     significantly connected sockets lookup.

   - Add a fastpath for some TCP timers that usually expires after
     close, the socket lock contention.

   - Add inbound and outbound xfrm state caches to speed up state
     lookups.

   - Avoid sending MPTCP advertisements on stale subflows, reducing
     risks on loosing them.

   - Make neighbours table flushing more scalable, maintaining per
     device neigh lists.

  Driver API:

   - Introduce a unified interface to configure transmission H/W
     shaping, and expose it to user-space via generic-netlink.

   - Add support for per-NAPI config via netlink. This makes napi
     configuration persistent across queues removal and re-creation.
     Requires driver updates, currently supported drivers are:
     nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.

   - Add ethtool support for writing SFP / PHY firmware blocks.

   - Track RSS context allocation from ethtool core.

   - Implement support for mirroring to DSA CPU port, via TC mirror
     offload.

   - Consolidate FDB updates notification, to avoid duplicates on
     device-specific entries.

   - Expose DPLL clock quality level to the user-space.

   - Support master-slave PHY config via device tree.

  Tests and tooling:

   - forwarding: introduce deferred commands, to simplify the cleanup
     phase

  Drivers:

   - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
     Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
     IRQs and queues to NAPI IDs, allowing busy polling and better
     introspection.

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - mlx5:
           - a large refactor to implement support for cross E-Switch
             scheduling
           - refactor H/W conter management to let it scale better
           - H/W GRO cleanups
      - Intel (100G, ice)::
         - add support for ethtool reset
         - implement support for per TX queue H/W shaping
      - AMD/Solarflare:
         - implement per device queue stats support
      - Broadcom (bnxt):
         - improve wildcard l4proto on IPv4/IPv6 ntuple rules
      - Marvell Octeon:
         - Add representor support for each Resource Virtualization Unit
           (RVU) device.
      - Hisilicon:
         - add support for the BMC Gigabit Ethernet
      - IBM (EMAC):
         - driver cleanup and modernization
      - Cisco (VIC):
         - raise the queues number limit to 256

   - Ethernet virtual:
      - Google vNIC:
         - implement page pool support
      - macsec:
         - inherit lower device's features and TSO limits when
           offloading
      - virtio_net:
         - enable premapped mode by default
         - support for XDP socket(AF_XDP) zerocopy TX
      - wireguard:
         - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
           packets.

   - Ethernet NICs embedded and virtual:
      - Broadcom ASP:
         - enable software timestamping
      - Freescale:
         - add enetc4 PF driver
      - MediaTek: Airoha SoC:
         - implement BQL support
      - RealTek r8169:
         - enable TSO by default on r8168/r8125
         - implement extended ethtool stats
      - Renesas AVB:
         - enable TX checksum offload
      - Synopsys (stmmac):
         - support header splitting for vlan tagged packets
         - move common code for DWMAC4 and DWXGMAC into a separate FPE
           module.
         - add dwmac driver support for T-HEAD TH1520 SoC
      - Synopsys (xpcs):
         - driver refactor and cleanup
      - TI:
         - icssg_prueth: add VLAN offload support
      - Xilinx emaclite:
         - add clock support

   - Ethernet switches:
      - Microchip:
         - implement support for the lan969x Ethernet switch family
         - add LAN9646 switch support to KSZ DSA driver

   - Ethernet PHYs:
      - Marvel: 88q2x: enable auto negotiation
      - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2

   - PTP:
      - Add support for the Amazon virtual clock device
      - Add PtP driver for s390 clocks

   - WiFi:
      - mac80211
         - EHT 1024 aggregation size for transmissions
         - new operation to indicate that a new interface is to be added
         - support radio separation of multi-band devices
         - move wireless extension spy implementation to libiw
      - Broadcom:
         - brcmfmac: optional LPO clock support
      - Microchip:
         - add support for Atmel WILC3000
      - Qualcomm (ath12k):
         - firmware coredump collection support
         - add debugfs support for a multitude of statistics
      - Qualcomm (ath5k):
         -  Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
      - Realtek:
         - rtw88: 8821au and 8812au USB adapters support
         - rtw89: add thermal protection
         - rtw89: fine tune BT-coexsitence to improve user experience
         - rtw89: firmware secure boot for WiFi 6 chip

   - Bluetooth
      - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
        0x13d3:0x3623
      - add Realtek RTL8852BE support for id Foxconn 0xe123
      - add MediaTek MT7920 support for wireless module ids
      - btintel_pcie: add handshake between driver and firmware
      - btintel_pcie: add recovery mechanism
      - btnxpuart: add GPIO support to power save feature"

* tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits)
  mm: page_frag: fix a compile error when kernel is not compiled
  Documentation: tipc: fix formatting issue in tipc.rst
  selftests: nic_performance: Add selftest for performance of NIC driver
  selftests: nic_link_layer: Add selftest case for speed and duplex states
  selftests: nic_link_layer: Add link layer selftest for NIC driver
  bnxt_en: Add FW trace coredump segments to the coredump
  bnxt_en: Add a new ethtool -W dump flag
  bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr()
  bnxt_en: Add functions to copy host context memory
  bnxt_en: Do not free FW log context memory
  bnxt_en: Manage the FW trace context memory
  bnxt_en: Allocate backing store memory for FW trace logs
  bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem()
  bnxt_en: Refactor bnxt_free_ctx_mem()
  bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type
  bnxt_en: Update firmware interface spec to 1.10.3.85
  selftests/bpf: Add some tests with sockmap SK_PASS
  bpf: fix recursive lock when verdict program return SK_PASS
  wireguard: device: support big tcp GSO
  wireguard: selftests: load nf_conntrack if not present
  ...
2024-11-21 08:28:08 -08:00
Linus Torvalds
364eeb79a2 Merge tag 'locking-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "Lockdep:
   - Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING (Sebastian Andrzej
     Siewior)
   - Add lockdep_cleanup_dead_cpu() (David Woodhouse)

  futexes:
   - Use atomic64_inc_return() in get_inode_sequence_number() (Uros
     Bizjak)
   - Use atomic64_try_cmpxchg_relaxed() in get_inode_sequence_number()
     (Uros Bizjak)

  RT locking:
   - Add sparse annotation PREEMPT_RT's locking (Sebastian Andrzej
     Siewior)

  spinlocks:
   - Use atomic_try_cmpxchg_release() in osq_unlock() (Uros Bizjak)

  atomics:
   - x86: Use ALT_OUTPUT_SP() for __alternative_atomic64() (Uros Bizjak)
   - x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu() (Uros
     Bizjak)

  KCSAN, seqlocks:
   - Support seqcount_latch_t (Marco Elver)

  <linux/cleanup.h>:
   - Add if_not_guard() conditional guard helper (David Lechner)
   - Adjust scoped_guard() macros to avoid potential warning (Przemek
     Kitszel)
   - Remove address space of returned pointer (Uros Bizjak)

  WW mutexes:
   - locking/ww_mutex: Adjust to lockdep nest_lock requirements (Thomas
     Hellström)

  Rust integration:
   - Fix raw_spin_lock initialization on PREEMPT_RT (Eder Zulian)

  Misc cleanups & fixes:
   - lockdep: Fix wait-type check related warnings (Ahmed Ehab)
   - lockdep: Use info level for initial info messages (Jiri Slaby)
   - spinlocks: Make __raw_* lock ops static (Geert Uytterhoeven)
   - pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase
     (Qiuxu Zhuo)
   - iio: magnetometer: Fix if () scoped_guard() formatting (Stephen
     Rothwell)
   - rtmutex: Fix misleading comment (Peter Zijlstra)
   - percpu-rw-semaphores: Fix grammar in percpu-rw-semaphore.rst (Xiu
     Jianfeng)"

* tag 'locking-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  locking/Documentation: Fix grammar in percpu-rw-semaphore.rst
  iio: magnetometer: fix if () scoped_guard() formatting
  rust: helpers: Avoid raw_spin_lock initialization for PREEMPT_RT
  kcsan, seqlock: Fix incorrect assumption in read_seqbegin()
  seqlock, treewide: Switch to non-raw seqcount_latch interface
  kcsan, seqlock: Support seqcount_latch_t
  time/sched_clock: Broaden sched_clock()'s instrumentation coverage
  time/sched_clock: Swap update_clock_read_data() latch writes
  locking/atomic/x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu()
  locking/atomic/x86: Use ALT_OUTPUT_SP() for __alternative_atomic64()
  cleanup: Add conditional guard helper
  cleanup: Adjust scoped_guard() macros to avoid potential warning
  locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
  cleanup: Remove address space of returned pointer
  locking/rtmutex: Fix misleading comment
  locking/rt: Annotate unlock followed by lock for sparse.
  locking/rt: Add sparse annotation for RCU.
  locking/rt: Remove one __cond_lock() in RT's spin_trylock_irqsave()
  locking/rt: Add sparse annotation PREEMPT_RT's sleeping locks.
  locking/pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase
  ...
2024-11-19 12:43:11 -08:00
Linus Torvalds
5bb6ba448f Merge tag 'vfs-6.13.rust.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs rust file abstractions from Christian Brauner:
 "This contains the file abstractions needed by the Rust implementation
  of the Binder driver and other parts of the kernel.

  Let's treat this as a first attempt at getting something working but I
  do expect the actual interfaces to change significantly over time.
  Simply because we are still figuring out what actually works. But
  there's no point in further theorizing. Let's see how it holds up with
  actual users"

* tag 'vfs-6.13.rust.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  rust: task: adjust safety comments in Task methods
  rust: add seqfile abstraction
  rust: file: add abstraction for `poll_table`
  rust: file: add `Kuid` wrapper
  rust: file: add `FileDescriptorReservation`
  rust: security: add abstraction for secctx
  rust: cred: add Rust abstraction for `struct cred`
  rust: file: add Rust abstraction for `struct file`
  rust: task: add `Task::current_raw`
  rust: types: add `NotThreadSafe`
2024-11-18 09:51:32 -08:00
Eder Zulian
5c2e7736e2 rust: helpers: Avoid raw_spin_lock initialization for PREEMPT_RT
When PREEMPT_RT=y, spin locks are mapped to rt_mutex types, so using
spinlock_check() + __raw_spin_lock_init() to initialize spin locks is
incorrect, and would cause build errors.

Introduce __spin_lock_init() to initialize a spin lock with lockdep
rquired information for PREEMPT_RT builds, and use it in the Rust
helper.

Fixes: d2d6422f8b ("x86: Allow to enable PREEMPT_RT.")
Closes: https://lore.kernel.org/oe-kbuild-all/202409251238.vetlgXE9-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Eder Zulian <ezulian@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20241107163223.2092690-2-ezulian@redhat.com
2024-11-11 11:49:46 +01:00
Paolo Abeni
91afa49a3e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc4).

Conflicts:

107a034d5c ("net/mlx5: qos: Store rate groups in a qos domain")
1da9cfd6c4 ("net/mlx5: Unregister notifier on eswitch init failure")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21 09:14:18 +02:00
Alice Ryhl
fe95f58320 rust: task: adjust safety comments in Task methods
The `Task` struct has several safety comments that aren't so great. For
example, the reason that it's okay to read the `pid` is that the field
is immutable, so there is no data race, which is not what the safety
comment says.

Thus, improve the safety comments. Also add an `as_ptr` helper. This
makes it easier to read the various accessors on Task, as `self.0` may
be confusing syntax for new Rust users.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20241015-task-safety-cmnts-v1-1-46ee92c82768@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-15 16:07:02 +02:00
Linus Torvalds
f683c9b134 Merge tag 'driver-core-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
 "Here is a single driver core fix, and a .mailmap update.

  The fix is for the rust driver core bindings, turned out that the
  from_raw binding wasn't a good idea (don't want to pass a pointer to a
  reference counted object without actually incrementing the pointer.)
  So this change fixes it up as the from_raw binding came in in -rc1.

  The other change is a .mailmap update.

  Both have been in linux-next for a while with no reported issues"

* tag 'driver-core-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  mailmap: update mail for Fiona Behrens
  rust: device: change the from_raw() function
2024-10-13 09:10:52 -07:00
Jakub Kicinski
9c0fc36ec4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc3).

No conflicts and no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 13:13:33 -07:00
Alice Ryhl
22018a5a54 rust: add seqfile abstraction
This adds a simple seq file abstraction that lets you print to a seq
file using ordinary Rust printing syntax.

An example user from Rust Binder:

    pub(crate) fn full_debug_print(
        &self,
        m: &SeqFile,
        owner_inner: &mut ProcessInner,
    ) -> Result<()> {
        let prio = self.node_prio();
        let inner = self.inner.access_mut(owner_inner);
        seq_print!(
            m,
            "  node {}: u{:016x} c{:016x} pri {}:{} hs {} hw {} cs {} cw {}",
            self.debug_id,
            self.ptr,
            self.cookie,
            prio.sched_policy,
            prio.prio,
            inner.strong.has_count,
            inner.weak.has_count,
            inner.strong.count,
            inner.weak.count,
        );
        if !inner.refs.is_empty() {
            seq_print!(m, " proc");
            for node_ref in &inner.refs {
                seq_print!(m, " {}", node_ref.process.task.pid());
            }
        }
        seq_print!(m, "\n");
        for t in &inner.oneway_todo {
            t.debug_print_inner(m, "    pending async transaction ");
        }
        Ok(())
    }

The `SeqFile` type is marked not thread safe so that `call_printf` can
be a `&self` method. The alternative is to use `self: Pin<&mut Self>`
which is inconvenient, or to have `SeqFile` wrap a pointer instead of
wrapping the C struct directly.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20241001-seqfile-v1-1-dfcd0fc21e96@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-08 14:32:39 +02:00
Christian Brauner
70d7f7dbd9 Merge patch series "File abstractions needed by Rust Binder"
Alice Ryhl <aliceryhl@google.com> says:

This patchset contains the file abstractions needed by the Rust
implementation of the Binder driver.

Please see the Rust Binder RFC for usage examples:
https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-0-08ba9197f637@google.com

Users of "rust: types: add `NotThreadSafe`":
	[PATCH 5/9] rust: file: add `FileDescriptorReservation`

Users of "rust: task: add `Task::current_raw`":
	[PATCH 7/9] rust: file: add `Kuid` wrapper
	[PATCH 8/9] rust: file: add `DeferredFdCloser`

Users of "rust: file: add Rust abstraction for `struct file`":
	[PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
	[PATCH RFC 03/20] rust_binder: add threading support

Users of "rust: cred: add Rust abstraction for `struct cred`":
	[PATCH RFC 05/20] rust_binder: add nodes and context managers
	[PATCH RFC 06/20] rust_binder: add oneway transactions
	[PATCH RFC 11/20] rust_binder: send nodes in transaction
	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support

Users of "rust: security: add abstraction for secctx":
	[PATCH RFC 06/20] rust_binder: add oneway transactions

Users of "rust: file: add `FileDescriptorReservation`":
	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support

Users of "rust: file: add `Kuid` wrapper":
	[PATCH RFC 05/20] rust_binder: add nodes and context managers
	[PATCH RFC 06/20] rust_binder: add oneway transactions

Users of "rust: file: add abstraction for `poll_table`":
	[PATCH RFC 07/20] rust_binder: add epoll support

This patchset has some uses of read_volatile in place of READ_ONCE.
Please see the following rfc for context on this:
https://lore.kernel.org/all/20231025195339.1431894-1-boqun.feng@gmail.com/

* patches from https://lore.kernel.org/r/20240915-alice-file-v10-0-88484f7a3dcf@google.com:
  rust: file: add abstraction for `poll_table`
  rust: file: add `Kuid` wrapper
  rust: file: add `FileDescriptorReservation`
  rust: security: add abstraction for secctx
  rust: cred: add Rust abstraction for `struct cred`
  rust: file: add Rust abstraction for `struct file`
  rust: task: add `Task::current_raw`
  rust: types: add `NotThreadSafe`

Link: https://lore.kernel.org/r/20240915-alice-file-v10-0-88484f7a3dcf@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-08 14:32:22 +02:00
FUJITA Tomonori
3ed8d344e0 rust: net::phy always define device_table in module_phy_driver macro
device_table in module_phy_driver macro is defined only when the
driver is built as a module. So a PHY driver imports phy::DeviceId
module in the following way then hits `unused import` warning when
it's compiled as built-in:

 use kernel::net::phy::DeviceId;

 kernel::module_phy_driver! {
     drivers: [PhyQT2025],
     device_table: [
        DeviceId::new_with_driver::<PhyQT2025>(),
     ],

Put device_table in a const. It's not included in the kernel image if
unused (when the driver is compiled as built-in), and the compiler
doesn't complain.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20240930134038.1309-1-fujita.tomonori@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 16:42:18 -07:00
Guilherme Giacomo Simoes
cc4332afb5 rust: device: change the from_raw() function
The function Device::from_raw() increments a refcount by a call to
bindings::get_device(ptr). This can be confused because usually
from_raw() functions don't increment a refcount.
Hence, rename Device::from_raw() to avoid confuion with other "from_raw"
semantics.

The new name of function should be "get_device" to be consistent with
the function get_device() already exist in .c files.

This function body also changed, because the `into()` will convert the
`&'a Device` into `ARef<Device>` and also call `inc_ref` from the
`AlwaysRefCounted` trait implemented for Device.

Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Closes: https://github.com/Rust-for-Linux/linux/issues/1088
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20241001205603.106278-1-trintaeoitogc@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-03 13:54:48 +02:00
Miguel Ojeda
05cef2c4a4 rust: kunit: use C-string literals to clean warning
Starting with upstream Rust commit a5e3a3f9b6bd ("move
`manual_c_str_literals` to complexity"), to be released in Rust 1.83.0
[1], Clippy now warns on `manual_c_str_literals` by default, e.g.:

    error: manually constructing a nul-terminated string
      --> rust/kernel/kunit.rs:21:13
       |
    21 |             b"\x013%pA\0".as_ptr() as _,
       |             ^^^^^^^^^^^^^ help: use a `c""` literal: `c"\x013%pA"`
       |
       = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_c_str_literals
       = note: `-D clippy::manual-c-str-literals` implied by `-D warnings`
       = help: to override `-D warnings` add `#[allow(clippy::manual_c_str_literals)]`

Apply the suggestion to clean up the warnings.

Link: https://github.com/rust-lang/rust-clippy/pull/13263 [1]
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20240927164414.560906-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-10-01 23:46:42 +02:00
Alice Ryhl
ac681835b6 rust: file: add abstraction for poll_table
The existing `CondVar` abstraction is a wrapper around
`wait_queue_head`, but it does not support all use-cases of the C
`wait_queue_head` type. To be specific, a `CondVar` cannot be registered
with a `struct poll_table`. This limitation has the advantage that you
do not need to call `synchronize_rcu` when destroying a `CondVar`.

However, we need the ability to register a `poll_table` with a
`wait_queue_head` in Rust Binder. To enable this, introduce a type
called `PollCondVar`, which is like `CondVar` except that you can
register a `poll_table`. We also introduce `PollTable`, which is a safe
wrapper around `poll_table` that is intended to be used with
`PollCondVar`.

The destructor of `PollCondVar` unconditionally calls `synchronize_rcu`
to ensure that the removal of epoll waiters has fully completed before
the `wait_queue_head` is destroyed.

That said, `synchronize_rcu` is rather expensive and is not needed in
all cases: If we have never registered a `poll_table` with the
`wait_queue_head`, then we don't need to call `synchronize_rcu`. (And
this is a common case in Binder - not all processes use Binder with
epoll.) The current implementation does not account for this, but if we
find that it is necessary to improve this, a future patch could store a
boolean next to the `wait_queue_head` to keep track of whether a
`poll_table` has ever been registered.

Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-8-88484f7a3dcf@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:29 +02:00
Alice Ryhl
8ad1a41f7e rust: file: add Kuid wrapper
Adds a wrapper around `kuid_t` called `Kuid`. This allows us to define
various operations on kuids such as equality and current_euid. It also
lets us provide conversions from kuid into userspace values.

Rust Binder needs these operations because it needs to compare kuids for
equality, and it needs to tell userspace about the pid and uid of
incoming transactions.

To read kuids from a `struct task_struct`, you must currently use
various #defines that perform the appropriate field access under an RCU
read lock. Currently, we do not have a Rust wrapper for rcu_read_lock,
which means that for this patch, there are two ways forward:

 1. Inline the methods into Rust code, and use __rcu_read_lock directly
    rather than the rcu_read_lock wrapper. This gives up lockdep for
    these usages of RCU.

 2. Wrap the various #defines in helpers and call the helpers from Rust.

This patch uses the second option. One possible disadvantage of the
second option is the possible introduction of speculation gadgets, but
as discussed in [1], the risk appears to be acceptable.

Of course, once a wrapper for rcu_read_lock is available, it is
preferable to use that over either of the two above approaches.

Link: https://lore.kernel.org/all/202312080947.674CD2DC7@keescook/ [1]
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-7-88484f7a3dcf@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:29 +02:00
Wedson Almeida Filho
5da9857b12 rust: file: add FileDescriptorReservation
Allow for the creation of a file descriptor in two steps: first, we
reserve a slot for it, then we commit or drop the reservation. The first
step may fail (e.g., the current process ran out of available slots),
but commit and drop never fail (and are mutually exclusive).

This is needed by Rust Binder when fds are sent from one process to
another. It has to be a two-step process to properly handle the case
where multiple fds are sent: The operation must fail or succeed
atomically, which we achieve by first reserving the fds we need, and
only installing the files once we have reserved enough fds to send the
files.

Fd reservations assume that the value of `current` does not change
between the call to get_unused_fd_flags and the call to fd_install (or
put_unused_fd). By not implementing the Send trait, this abstraction
ensures that the `FileDescriptorReservation` cannot be moved into a
different process.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-6-88484f7a3dcf@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:29 +02:00
Alice Ryhl
94d356c033 rust: security: add abstraction for secctx
Add an abstraction for viewing the string representation of a security
context.

This is needed by Rust Binder because it has a feature where a process
can view the string representation of the security context for incoming
transactions. The process can use that to authenticate incoming
transactions, and since the feature is provided by the kernel, the
process can trust that the security context is legitimate.

This abstraction makes the following assumptions about the C side:
* When a call to `security_secid_to_secctx` is successful, it returns a
  pointer and length. The pointer references a byte string and is valid
  for reading for that many bytes.
* The string may be referenced until `security_release_secctx` is
  called.
* If CONFIG_SECURITY is set, then the three methods mentioned in
  rust/helpers are available without a helper. (That is, they are not a
  #define or `static inline`.)

Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-5-88484f7a3dcf@google.com
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:28 +02:00
Wedson Almeida Filho
a3df991d3d rust: cred: add Rust abstraction for struct cred
Add a wrapper around `struct cred` called `Credential`, and provide
functionality to get the `Credential` associated with a `File`.

Rust Binder must check the credentials of processes when they attempt to
perform various operations, and these checks usually take a
`&Credential` as parameter. The security_binder_set_context_mgr function
would be one example. This patch is necessary to access these security_*
methods from Rust.

This Rust abstraction makes the following assumptions about the C side:
* `struct cred` is refcounted with `get_cred`/`put_cred`.
* It's okay to transfer a `struct cred` across threads, that is, you do
  not need to call `put_cred` on the same thread as where you called
  `get_cred`.
* The `euid` field of a `struct cred` never changes after
  initialization.
* The `f_cred` field of a `struct file` never changes after
  initialization.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-4-88484f7a3dcf@google.com
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:28 +02:00
Wedson Almeida Filho
851849824b rust: file: add Rust abstraction for struct file
This abstraction makes it possible to manipulate the open files for a
process. The new `File` struct wraps the C `struct file`. When accessing
it using the smart pointer `ARef<File>`, the pointer will own a
reference count to the file. When accessing it as `&File`, then the
reference does not own a refcount, but the borrow checker will ensure
that the reference count does not hit zero while the `&File` is live.

Since this is intended to manipulate the open files of a process, we
introduce an `fget` constructor that corresponds to the C `fget`
method. In future patches, it will become possible to create a new fd in
a process and bind it to a `File`. Rust Binder will use these to send
fds from one process to another.

We also provide a method for accessing the file's flags. Rust Binder
will use this to access the flags of the Binder fd to check whether the
non-blocking flag is set, which affects what the Binder ioctl does.

This introduces a struct for the EBADF error type, rather than just
using the Error type directly. This has two advantages:
* `File::fget` returns a `Result<ARef<File>, BadFdError>`, which the
  compiler will represent as a single pointer, with null being an error.
  This is possible because the compiler understands that `BadFdError`
  has only one possible value, and it also understands that the
  `ARef<File>` smart pointer is guaranteed non-null.
* Additionally, we promise to users of the method that the method can
  only fail with EBADF, which means that they can rely on this promise
  without having to inspect its implementation.
That said, there are also two disadvantages:
* Defining additional error types involves boilerplate.
* The question mark operator will only utilize the `From` trait once,
  which prevents you from using the question mark operator on
  `BadFdError` in methods that return some third error type that the
  kernel `Error` is convertible into. (However, it works fine in methods
  that return `Error`.)

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-3-88484f7a3dcf@google.com
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:28 +02:00
Alice Ryhl
913f8cf4f3 rust: task: add Task::current_raw
Introduces a safe function for getting a raw pointer to the current
task.

When writing bindings that need to access the current task, it is often
more convenient to call a method that directly returns a raw pointer
than to use the existing `Task::current` method. However, the only way
to do that is `bindings::get_current()` which is unsafe since it calls
into C. By introducing `Task::current_raw()`, it becomes possible to
obtain a pointer to the current task without using unsafe.

Link: https://lore.kernel.org/all/CAH5fLgjT48X-zYtidv31mox3C4_Ogoo_2cBOCmX0Ang3tAgGHA@mail.gmail.com/
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-2-88484f7a3dcf@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:28 +02:00
Alice Ryhl
e7572e5dea rust: types: add NotThreadSafe
This introduces a new marker type for types that shouldn't be thread
safe. By adding a field of this type to a struct, it becomes non-Send
and non-Sync, which means that it cannot be accessed in any way from
threads other than the one it was created on.

This is useful for APIs that require globals such as `current` to remain
constant while the value exists.

We update two existing users in the Kernel to use this helper:

 * `Task::current()` - moving the return type of this value to a
   different thread would not be safe as you can no longer be guaranteed
   that the `current` pointer remains valid.
 * Lock guards. Mutexes and spinlocks should be unlocked on the same
   thread as where they were locked, so we enforce this using the Send
   trait.

There are also additional users in later patches of this patchset. See
[1] and [2] for the discussion that led to the introduction of this
patch.

Link: https://lore.kernel.org/all/nFDPJFnzE9Q5cqY7FwSMByRH2OAn_BpI4H53NQfWIlN6I2qfmAqnkp2wRqn0XjMO65OyZY4h6P4K2nAGKJpAOSzksYXaiAK_FoH_8QbgBI4=@proton.me/ [1]
Link: https://lore.kernel.org/all/nFDPJFnzE9Q5cqY7FwSMByRH2OAn_BpI4H53NQfWIlN6I2qfmAqnkp2wRqn0XjMO65OyZY4h6P4K2nAGKJpAOSzksYXaiAK_FoH_8QbgBI4=@proton.me/ [2]
Suggested-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Björn Roy Baron <bjorn3_gh@protonmail.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240915-alice-file-v10-1-88484f7a3dcf@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-30 13:02:27 +02:00
Dirk Behme
d065cc7605 rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT
In case CONFIG_PREEMPT_RT is enabled __mutex_init() becomes a macro
instead of an extern function (simplified from
include/linux/mutex.h):

    #ifndef CONFIG_PREEMPT_RT
    extern void __mutex_init(struct mutex *lock, const char *name,
    			 struct lock_class_key *key);
    #else
    #define __mutex_init(mutex, name, key)		\
    do {						\
	rt_mutex_base_init(&(mutex)->rtmutex);		\
    	__mutex_rt_init((mutex), name, key);		\
    } while (0)
    #endif

The macro isn't resolved by bindgen, then. What results in a build
error:

error[E0425]: cannot find function `__mutex_init` in crate `bindings`
     --> rust/kernel/sync/lock/mutex.rs:104:28
      |
104   |           unsafe { bindings::__mutex_init(ptr, name, key) }
      |                              ^^^^^^^^^^^^ help: a function with a similar name exists: `__mutex_rt_init`
      |
     ::: rust/bindings/bindings_generated.rs:23722:5
      |
23722 | /     pub fn __mutex_rt_init(
23723 | |         lock: *mut mutex,
23724 | |         name: *const core::ffi::c_char,
23725 | |         key: *mut lock_class_key,
23726 | |     );
      | |_____- similarly named function `__mutex_rt_init` defined here

Fix this by adding a helper.

As explained by Gary Guo in [1] no #ifdef CONFIG_PREEMPT_RT
is needed here as rust/bindings/lib.rs prefers externed function to
helpers if an externed function exists.

Reported-by: Conor Dooley <conor@kernel.org>
Link: https://lore.kernel.org/rust-for-linux/20240913-shack-estate-b376a65921b1@spud/
Link: https://lore.kernel.org/rust-for-linux/20240915123626.1a170103.gary@garyguo.net/ [1]
Fixes: 6d20d629c6 ("rust: lock: introduce `Mutex`")
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://lore.kernel.org/r/20240916073752.3123484-1-dirk.behme@de.bosch.com
[ Reworded to include the proper example by Dirk. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-26 21:22:15 +02:00
Gary Guo
732cd686cd rust: fix ARCH_SLAB_MINALIGN multiple definition error
We use const helpers in form of

    const size_t RUST_CONST_HELPER_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;

to aid generation of constants by bindgen because it is otherwise a
macro definition of an expression and bindgen doesn't expand the
constant. The helpers are then have `RUST_CONST_HELPER` prefix stripped
and exposed to Rust code as if `ARCH_SLAB_MISALIGN` is generated
natively by bindgen.

This works well for most constants, but on RISC-V, `ARCH_SLAB_MINALIGN`
is defined directly as literal constant if `!CONFIG_MMU`, and bindgen
would generate `ARCH_SLAB_MINALIGN` directly, thus conflict with the
one generated through the helper.

To fix this, we simply need to block bindgen from generating directly
without going through helper.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409160804.eSg9zh1e-lkp@intel.com/
Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Link: https://lore.kernel.org/r/20240916003347.1744345-1-gary@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-26 17:52:35 +02:00
Alice Ryhl
a8ee30f45d rust: sync: require T: Sync for LockedBy::access
The `LockedBy::access` method only requires a shared reference to the
owner, so if we have shared access to the `LockedBy` from several
threads at once, then two threads could call `access` in parallel and
both obtain a shared reference to the inner value. Thus, require that
`T: Sync` when calling the `access` method.

An alternative is to require `T: Sync` in the `impl Sync for LockedBy`.
This patch does not choose that approach as it gives up the ability to
use `LockedBy` with `!Sync` types, which is okay as long as you only use
`access_mut`.

Cc: stable@vger.kernel.org
Fixes: 7b1f55e3a9 ("rust: sync: introduce `LockedBy`")
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://lore.kernel.org/r/20240915-locked-by-sync-fix-v2-1-1a8d89710392@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-26 17:51:49 +02:00
Miguel Ojeda
ece207a83e rust: kernel: sort Rust modules
Rust modules are intended to be sorted, thus do so.

This makes `rustfmtcheck` to pass again.

Fixes: 5701725692 ("Merge tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240926124751.345471-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-26 17:49:00 +02:00