Commit Graph

87 Commits

Author SHA1 Message Date
Miguel Ojeda
e4fc6580b0 samples: add first Rust examples
The beginning of a set of Rust modules that showcase how Rust
modules look like and how to use the abstracted kernel features.

It also includes an example of a Rust host program with
several modules.

These samples also double as tests in the CI.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com>
Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com>
Co-developed-by: Finn Behrens <me@kloenk.de>
Signed-off-by: Finn Behrens <me@kloenk.de>
Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
Co-developed-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-09-28 09:03:08 +02:00
Linus Torvalds
169e77764a Merge tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "The sprinkling of SPI drivers is because we added a new one and Mark
  sent us a SPI driver interface conversion pull request.

  Core
  ----

   - Introduce XDP multi-buffer support, allowing the use of XDP with
     jumbo frame MTUs and combination with Rx coalescing offloads (LRO).

   - Speed up netns dismantling (5x) and lower the memory cost a little.
     Remove unnecessary per-netns sockets. Scope some lists to a netns.
     Cut down RCU syncing. Use batch methods. Allow netdev registration
     to complete out of order.

   - Support distinguishing timestamp types (ingress vs egress) and
     maintaining them across packet scrubbing points (e.g. redirect).

   - Continue the work of annotating packet drop reasons throughout the
     stack.

   - Switch netdev error counters from an atomic to dynamically
     allocated per-CPU counters.

   - Rework a few preempt_disable(), local_irq_save() and busy waiting
     sections problematic on PREEMPT_RT.

   - Extend the ref_tracker to allow catching use-after-free bugs.

  BPF
  ---

   - Introduce "packing allocator" for BPF JIT images. JITed code is
     marked read only, and used to be allocated at page granularity.
     Custom allocator allows for more efficient memory use, lower iTLB
     pressure and prevents identity mapping huge pages from getting
     split.

   - Make use of BTF type annotations (e.g. __user, __percpu) to enforce
     the correct probe read access method, add appropriate helpers.

   - Convert the BPF preload to use light skeleton and drop the
     user-mode-driver dependency.

   - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
     its use as a packet generator.

   - Allow local storage memory to be allocated with GFP_KERNEL if
     called from a hook allowed to sleep.

   - Introduce fprobe (multi kprobe) to speed up mass attachment (arch
     bits to come later).

   - Add unstable conntrack lookup helpers for BPF by using the BPF
     kfunc infra.

   - Allow cgroup BPF progs to return custom errors to user space.

   - Add support for AF_UNIX iterator batching.

   - Allow iterator programs to use sleepable helpers.

   - Support JIT of add, and, or, xor and xchg atomic ops on arm64.

   - Add BTFGen support to bpftool which allows to use CO-RE in kernels
     without BTF info.

   - Large number of libbpf API improvements, cleanups and deprecations.

  Protocols
  ---------

   - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.

   - Adjust TSO packet sizes based on min_rtt, allowing very low latency
     links (data centers) to always send full-sized TSO super-frames.

   - Make IPv6 flow label changes (AKA hash rethink) more configurable,
     via sysctl and setsockopt. Distinguish between server and client
     behavior.

   - VxLAN support to "collect metadata" devices to terminate only
     configured VNIs. This is similar to VLAN filtering in the bridge.

   - Support inserting IPv6 IOAM information to a fraction of frames.

   - Add protocol attribute to IP addresses to allow identifying where
     given address comes from (kernel-generated, DHCP etc.)

   - Support setting socket and IPv6 options via cmsg on ping6 sockets.

   - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
     Define dscp_t and stop taking ECN bits into account in fib-rules.

   - Add support for locked bridge ports (for 802.1X).

   - tun: support NAPI for packets received from batched XDP buffs,
     doubling the performance in some scenarios.

   - IPv6 extension header handling in Open vSwitch.

   - Support IPv6 control message load balancing in bonding, prevent
     neighbor solicitation and advertisement from using the wrong port.
     Support NS/NA monitor selection similar to existing ARP monitor.

   - SMC
      - improve performance with TCP_CORK and sendfile()
      - support auto-corking
      - support TCP_NODELAY

   - MCTP (Management Component Transport Protocol)
      - add user space tag control interface
      - I2C binding driver (as specified by DMTF DSP0237)

   - Multi-BSSID beacon handling in AP mode for WiFi.

   - Bluetooth:
      - handle MSFT Monitor Device Event
      - add MGMT Adv Monitor Device Found/Lost events

   - Multi-Path TCP:
      - add support for the SO_SNDTIMEO socket option
      - lots of selftest cleanups and improvements

   - Increase the max PDU size in CAN ISOTP to 64 kB.

  Driver API
  ----------

   - Add HW counters for SW netdevs, a mechanism for devices which
     offload packet forwarding to report packet statistics back to
     software interfaces such as tunnels.

   - Select the default NIC queue count as a fraction of number of
     physical CPU cores, instead of hard-coding to 8.

   - Expose devlink instance locks to drivers. Allow device layer of
     drivers to use that lock directly instead of creating their own
     which always runs into ordering issues in devlink callbacks.

   - Add header/data split indication to guide user space enabling of
     TCP zero-copy Rx.

   - Allow configuring completion queue event size.

   - Refactor page_pool to enable fragmenting after allocation.

   - Add allocation and page reuse statistics to page_pool.

   - Improve Multiple Spanning Trees support in the bridge to allow
     reuse of topologies across VLANs, saving HW resources in switches.

   - DSA (Distributed Switch Architecture):
      - replay and offload of host VLAN entries
      - offload of static and local FDB entries on LAG interfaces
      - FDB isolation and unicast filtering

  New hardware / drivers
  ----------------------

   - Ethernet:
      - LAN937x T1 PHYs
      - Davicom DM9051 SPI NIC driver
      - Realtek RTL8367S, RTL8367RB-VB switch and MDIO
      - Microchip ksz8563 switches
      - Netronome NFP3800 SmartNICs
      - Fungible SmartNICs
      - MediaTek MT8195 switches

   - WiFi:
      - mt76: MediaTek mt7916
      - mt76: MediaTek mt7921u USB adapters
      - brcmfmac: Broadcom BCM43454/6

   - Mobile:
      - iosm: Intel M.2 7360 WWAN card

  Drivers
  -------

   - Convert many drivers to the new phylink API built for split PCS
     designs but also simplifying other cases.

   - Intel Ethernet NICs:
      - add TTY for GNSS module for E810T device
      - improve AF_XDP performance
      - GTP-C and GTP-U filter offload
      - QinQ VLAN support

   - Mellanox Ethernet NICs (mlx5):
      - support xdp->data_meta
      - multi-buffer XDP
      - offload tc push_eth and pop_eth actions

   - Netronome Ethernet NICs (nfp):
      - flow-independent tc action hardware offload (police / meter)
      - AF_XDP

   - Other Ethernet NICs:
      - at803x: fiber and SFP support
      - xgmac: mdio: preamble suppression and custom MDC frequencies
      - r8169: enable ASPM L1.2 if system vendor flags it as safe
      - macb/gem: ZynqMP SGMII
      - hns3: add TX push mode
      - dpaa2-eth: software TSO
      - lan743x: multi-queue, mdio, SGMII, PTP
      - axienet: NAPI and GRO support

   - Mellanox Ethernet switches (mlxsw):
      - source and dest IP address rewrites
      - RJ45 ports

   - Marvell Ethernet switches (prestera):
      - basic routing offload
      - multi-chain TC ACL offload

   - NXP embedded Ethernet switches (ocelot & felix):
      - PTP over UDP with the ocelot-8021q DSA tagging protocol
      - basic QoS classification on Felix DSA switch using dcbnl
      - port mirroring for ocelot switches

   - Microchip high-speed industrial Ethernet (sparx5):
      - offloading of bridge port flooding flags
      - PTP Hardware Clock

   - Other embedded switches:
      - lan966x: PTP Hardward Clock
      - qca8k: mdio read/write operations via crafted Ethernet packets

   - Qualcomm 802.11ax WiFi (ath11k):
      - add LDPC FEC type and 802.11ax High Efficiency data in radiotap
      - enable RX PPDU stats in monitor co-exist mode

   - Intel WiFi (iwlwifi):
      - UHB TAS enablement via BIOS
      - band disablement via BIOS
      - channel switch offload
      - 32 Rx AMPDU sessions in newer devices

   - MediaTek WiFi (mt76):
      - background radar detection
      - thermal management improvements on mt7915
      - SAR support for more mt76 platforms
      - MBSSID and 6 GHz band on mt7915

   - RealTek WiFi:
      - rtw89: AP mode
      - rtw89: 160 MHz channels and 6 GHz band
      - rtw89: hardware scan

   - Bluetooth:
      - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS)

   - Microchip CAN (mcp251xfd):
      - multiple RX-FIFOs and runtime configurable RX/TX rings
      - internal PLL, runtime PM handling simplification
      - improve chip detection and error handling after wakeup"

* tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits)
  llc: fix netdevice reference leaks in llc_ui_bind()
  drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
  ice: don't allow to run ice_send_event_to_aux() in atomic ctx
  ice: fix 'scheduling while atomic' on aux critical err interrupt
  net/sched: fix incorrect vlan_push_eth dest field
  net: bridge: mst: Restrict info size queries to bridge ports
  net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init()
  drivers: net: xgene: Fix regression in CRC stripping
  net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT
  net: dsa: fix missing host-filtered multicast addresses
  net/mlx5e: Fix build warning, detected write beyond size of field
  iwlwifi: mvm: Don't fail if PPAG isn't supported
  selftests/bpf: Fix kprobe_multi test.
  Revert "rethook: x86: Add rethook x86 implementation"
  Revert "arm64: rethook: Add arm64 rethook implementation"
  Revert "powerpc: Add rethook support"
  Revert "ARM: rethook: Add rethook arm implementation"
  netdevice: add missing dm_private kdoc
  net: bridge: mst: prevent NULL deref in br_mst_info_size()
  selftests: forwarding: Use same VRF for port and VLAN upper
  ...
2022-03-24 13:13:26 -07:00
Masami Hiramatsu
6ee64cc302 fprobe: Add sample program for fprobe
Add a sample program for the fprobe. The sample_fprobe puts a fprobe on
kernel_clone() by default. This dump stack and some called address info
at the function entry and exit.

The sample_fprobe.ko gets 2 parameters.
- symbol: you can specify the comma separated symbols or wildcard symbol
  pattern (in this case you can not use comma)
- stackdump: a bool value to enable or disable stack dump in the fprobe
  handler.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735291987.1084943.4449670993752806840.stgit@devnote2
2022-03-17 20:16:58 -07:00
Steven Rostedt (Google)
953c2f0521 tracing: Add sample code for custom trace events
Add sample code to show how to create custom trace events in the tracefs
directory that can be enabled and modified like any event in tracefs
(including triggers, histograms, synthetic events and event probes).

The example is creating a custom sched_switch and a sched_waking to limit
what is recorded:

If the custom sched switch only records the prev_prio, next_prio and
next_pid, it can bring the size from 64 bytes per event, down to just 16
bytes!

If sched_waking only records the prio and pid of the woken event, it will
bring the size down from 36 bytes to 12 bytes per event.

This will allow for a much smaller footprint into the ring buffer and keep
more events from dropping.

Link: https://lkml.kernel.org/r/20220303220625.369226746@goodmis.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Suggested-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-11 11:48:43 -05:00
Greg Kroah-Hartman
fdcee305c0 Merge tag 'coresight-next-v5.17' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux into char-misc-next
Mathieu writes:

Coresight changes for v5.17

This pull request includes:

- A patch that uses devm_bitmap_zalloc() instead of the open-coded
equivalent.

- Work to make coresight complex configuration loadable via modules.

- Some coresight documentation updates.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

* tag 'coresight-next-v5.17' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux:
  coresight: core: Fix typo in a comment
  Documentation: coresight: Update coresight configuration docs
  coresight: configfs: Allow configfs to activate configuration
  coresight: syscfg: Example CoreSight configuration loadable module
  coresight: syscfg: Update load API for config loadable modules
  coresight: configuration: Update API to permit dynamic load/unload
  coresight: configuration: Update API to introduce load owner concept
  Documentation: coresight: Fix documentation issue
  coresight: Use devm_bitmap_zalloc when applicable
2021-12-21 10:07:07 +01:00
Mike Leach
ede5bab874 coresight: syscfg: Example CoreSight configuration loadable module
An example of creating a loadable module to add CoreSight configurations
into a system.

In the Kernel samples/coresight directory.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20211124200038.28662-5-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-11-26 11:34:07 -07:00
Heiko Carstens
503e451084 ftrace/samples: add missing Kconfig option for ftrace direct multi sample
Currently it is not possible to build the ftrace direct multi example
anymore due to broken config dependencies. Fix this by adding
SAMPLE_FTRACE_DIRECT_MULTI config option.

This broke when merging s390-5.16-1 due to an incorrect merge conflict
resolution proposed by me.

Also rename SAMPLE_FTRACE_MULTI_DIRECT to SAMPLE_FTRACE_DIRECT_MULTI
so it matches the module name.

Fixes: 0b707e572a ("Merge tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20211115195614.3173346-2-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-18 17:50:54 +01:00
Linus Torvalds
2acda7549e Merge tag 'fsnotify_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
 "Support for reporting filesystem errors through fanotify so that
  system health monitoring daemons can watch for these and act instead
  of scraping system logs"

* tag 'fsnotify_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (34 commits)
  samples: remove duplicate include in fs-monitor.c
  samples: Fix warning in fsnotify sample
  docs: Fix formatting of literal sections in fanotify docs
  samples: Make fs-monitor depend on libc and headers
  docs: Document the FAN_FS_ERROR event
  samples: Add fs error monitoring example
  ext4: Send notifications on error
  fanotify: Allow users to request FAN_FS_ERROR events
  fanotify: Emit generic error info for error event
  fanotify: Report fid info for file related file system errors
  fanotify: WARN_ON against too large file handles
  fanotify: Add helpers to decide whether to report FID/DFID
  fanotify: Wrap object_fh inline space in a creator macro
  fanotify: Support merging of error events
  fanotify: Support enqueueing of error events
  fanotify: Pre-allocate pool of error events
  fanotify: Reserve UAPI bits for FAN_FS_ERROR
  fsnotify: Support FS_ERROR event type
  fanotify: Require fid_mode for any non-fd event
  fanotify: Encode empty file handle when no inode is provided
  ...
2021-11-06 16:43:20 -07:00
Linus Torvalds
0b707e572a Merge tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:

 - Add support for ftrace with direct call and ftrace direct call
   samples.

 - Add support for kernel command lines longer than current 896 bytes
   and make its length configurable.

 - Add support for BEAR enhancement facility to improve last breaking
   event instruction tracking.

 - Add kprobes sanity checks and testcases to prevent kprobe in the mid
   of an instruction.

 - Allow concurrent access to /dev/hwc for the CPUMF users.

 - Various ftrace / jump label improvements.

 - Convert unwinder tests to KUnit.

 - Add s390_iommu_aperture kernel parameter to tweak the limits on
   concurrently usable DMA mappings.

 - Add ap.useirq AP module option which can be used to disable interrupt
   use.

 - Add add_disk() error handling support to block device drivers.

 - Drop arch specific and use generic implementation of strlcpy and
   strrchr.

 - Several __pa/__va usages fixes.

 - Various cio, crypto, pci, kernel doc and other small fixes and
   improvements all over the code.

[ Merge fixup as per https://lore.kernel.org/all/YXAqZ%2FEszRisunQw@osiris/ ]

* tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (63 commits)
  s390: make command line configurable
  s390: support command lines longer than 896 bytes
  s390/kexec_file: move kernel image size check
  s390/pci: add s390_iommu_aperture kernel parameter
  s390/spinlock: remove incorrect kernel doc indicator
  s390/string: use generic strlcpy
  s390/string: use generic strrchr
  s390/ap: function rework based on compiler warning
  s390/cio: make ccw_device_dma_* more robust
  s390/vfio-ap: s390/crypto: fix all kernel-doc warnings
  s390/hmcdrv: fix kernel doc comments
  s390/ap: new module option ap.useirq
  s390/cpumf: Allow multiple processes to access /dev/hwc
  s390/bitops: return true/false (not 1/0) from bool functions
  s390: add support for BEAR enhancement facility
  s390: introduce nospec_uses_trampoline()
  s390: rename last_break to pgm_last_break
  s390/ptrace: add last_break member to pt_regs
  s390/sclp: sort out physical vs virtual pointers usage
  s390/setup: convert start and end initrd pointers to virtual
  ...
2021-11-06 14:48:06 -07:00
Gabriel Krisman Bertazi
8fc70b3a14 samples: Make fs-monitor depend on libc and headers
Prevent build errors when headers or libc are not available, such as on
kernel build bots, like the below:

samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file
or directory
  7 | #include <errno.h>
    |          ^~~~~~~~~

Link: https://lore.kernel.org/r/87fsslasgz.fsf@collabora.com
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-11-01 12:39:53 +01:00
Gabriel Krisman Bertazi
5451093081 samples: Add fs error monitoring example
Introduce an example of a FAN_FS_ERROR fanotify user to track filesystem
errors.

Link: https://lore.kernel.org/r/20211025192746.66445-31-krisman@collabora.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-10-27 12:53:47 +02:00
Heiko Carstens
c316eb4460 samples: add HAVE_SAMPLE_FTRACE_DIRECT config option
Add HAVE_SAMPLE_FTRACE_DIRECT config option which can be selected by
architectures which have support for ftrace direct call samples.

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20211012133802.2460757-4-hca@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-19 15:39:53 +02:00
Jason Gunthorpe
af3ab3f9b9 vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
For some reason the vfio_mdev shim mdev_driver has its own module and
kconfig. As the next patch requires access to it from mdev.ko merge the
two modules together and remove VFIO_MDEV_DEVICE.

A later patch deletes this driver entirely.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Link: https://lore.kernel.org/r/20210617142218.1877096-7-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-21 15:29:25 -06:00
Mickaël Salaün
ba84b0bf5a samples/landlock: Add a sandbox manager example
Add a basic sandbox tool to launch a command which can only access a
list of file hierarchies in a read-only or read-write way.

Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-12-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:11 -07:00
Gabriel Krisman Bertazi
8fe62e0c0e watch_queue: Drop references to /dev/watch_queue
The merged API doesn't use a watch_queue device, but instead relies on
pipes, so let the documentation reflect that.

Fixes: f7e47677e3 ("watch_queue: Add a key/keyring notification facility")
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
2021-01-21 16:16:08 +00:00
Masahiro Yamada
214377e9b7 samples: watch_queue: build sample program for target architecture
This userspace program includes UAPI headers exported to usr/include/.
'make headers' always works for the target architecture (i.e. the same
architecture as the kernel), so the sample program should be built for
the target as well. Kbuild now supports 'userprogs' for that.

I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
$(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-22 01:56:09 +09:00
Linus Torvalds
6adc19fd13 Merge tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:

 - fix build rules in binderfs sample

 - fix build errors when Kbuild recurses to the top Makefile

 - covert '---help---' in Kconfig to 'help'

* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  treewide: replace '---help---' in Kconfig files with 'help'
  kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
  samples: binderfs: really compile this sample and fix build issues
2020-06-13 13:29:16 -07:00
Linus Torvalds
6c32978414 Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull notification queue from David Howells:
 "This adds a general notification queue concept and adds an event
  source for keys/keyrings, such as linking and unlinking keys and
  changing their attributes.

  Thanks to Debarshi Ray, we do have a pull request to use this to fix a
  problem with gnome-online-accounts - as mentioned last time:

     https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

  Without this, g-o-a has to constantly poll a keyring-based kerberos
  cache to find out if kinit has changed anything.

  [ There are other notification pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

  LSM hooks are included:

   - A set of hooks are provided that allow an LSM to rule on whether or
     not a watch may be set. Each of these hooks takes a different
     "watched object" parameter, so they're not really shareable. The
     LSM should use current's credentials. [Wanted by SELinux & Smack]

   - A hook is provided to allow an LSM to rule on whether or not a
     particular message may be posted to a particular queue. This is
     given the credentials from the event generator (which may be the
     system) and the watch setter. [Wanted by Smack]

  I've provided SELinux and Smack with implementations of some of these
  hooks.

  WHY
  ===

  Key/keyring notifications are desirable because if you have your
  kerberos tickets in a file/directory, your Gnome desktop will monitor
  that using something like fanotify and tell you if your credentials
  cache changes.

  However, we also have the ability to cache your kerberos tickets in
  the session, user or persistent keyring so that it isn't left around
  on disk across a reboot or logout. Keyrings, however, cannot currently
  be monitored asynchronously, so the desktop has to poll for it - not
  so good on a laptop. This facility will allow the desktop to avoid the
  need to poll.

  DESIGN DECISIONS
  ================

   - The notification queue is built on top of a standard pipe. Messages
     are effectively spliced in. The pipe is opened with a special flag:

        pipe2(fds, O_NOTIFICATION_PIPE);

     The special flag has the same value as O_EXCL (which doesn't seem
     like it will ever be applicable in this context)[?]. It is given up
     front to make it a lot easier to prohibit splice&co from accessing
     the pipe.

     [?] Should this be done some other way?  I'd rather not use up a new
         O_* flag if I can avoid it - should I add a pipe3() system call
         instead?

     The pipe is then configured::

        ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
        ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

     Messages are then read out of the pipe using read().

   - It should be possible to allow write() to insert data into the
     notification pipes too, but this is currently disabled as the
     kernel has to be able to insert messages into the pipe *without*
     holding pipe->mutex and the code to make this work needs careful
     auditing.

   - sendfile(), splice() and vmsplice() are disabled on notification
     pipes because of the pipe->mutex issue and also because they
     sometimes want to revert what they just did - but one or more
     notification messages might've been interleaved in the ring.

   - The kernel inserts messages with the wait queue spinlock held. This
     means that pipe_read() and pipe_write() have to take the spinlock
     to update the queue pointers.

   - Records in the buffer are binary, typed and have a length so that
     they can be of varying size.

     This allows multiple heterogeneous sources to share a common
     buffer; there are 16 million types available, of which I've used
     just a few, so there is scope for others to be used. Tags may be
     specified when a watchpoint is created to help distinguish the
     sources.

   - Records are filterable as types have up to 256 subtypes that can be
     individually filtered. Other filtration is also available.

   - Notification pipes don't interfere with each other; each may be
     bound to a different set of watches. Any particular notification
     will be copied to all the queues that are currently watching for it
     - and only those that are watching for it.

   - When recording a notification, the kernel will not sleep, but will
     rather mark a queue as having lost a message if there's
     insufficient space. read() will fabricate a loss notification
     message at an appropriate point later.

   - The notification pipe is created and then watchpoints are attached
     to it, using one of:

        keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
        watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
        watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

     where in both cases, fd indicates the queue and the number after is
     a tag between 0 and 255.

   - Watches are removed if either the notification pipe is destroyed or
     the watched object is destroyed. In the latter case, a message will
     be generated indicating the enforced watch removal.

  Things I want to avoid:

   - Introducing features that make the core VFS dependent on the
     network stack or networking namespaces (ie. usage of netlink).

   - Dumping all this stuff into dmesg and having a daemon that sits
     there parsing the output and distributing it as this then puts the
     responsibility for security into userspace and makes handling
     namespaces tricky. Further, dmesg might not exist or might be
     inaccessible inside a container.

   - Letting users see events they shouldn't be able to see.

  TESTING AND MANPAGES
  ====================

   - The keyutils tree has a pipe-watch branch that has keyctl commands
     for making use of notifications. Proposed manual pages can also be
     found on this branch, though a couple of them really need to go to
     the main manpages repository instead.

     If the kernel supports the watching of keys, then running "make
     test" on that branch will cause the testing infrastructure to spawn
     a monitoring process on the side that monitors a notifications pipe
     for all the key/keyring changes induced by the tests and they'll
     all be checked off to make sure they happened.

        https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

   - A test program is provided (samples/watch_queue/watch_test) that
     can be used to monitor for keyrings, mount and superblock events.
     Information on the notifications is simply logged to stdout"

* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  smack: Implement the watch_key and post_notification hooks
  selinux: Implement the watch_key security hook
  keys: Make the KEY_NEED_* perms an enum rather than a mask
  pipe: Add notification lossage handling
  pipe: Allow buffers to be marked read-whole-or-error for notifications
  Add sample notification program
  watch_queue: Add a key/keyring notification facility
  security: Add hooks to rule on setting a watch
  pipe: Add general notification queue support
  pipe: Add O_NOTIFICATION_PIPE
  security: Add a hook for the point of notification insertion
  uapi: General notification queue definitions
2020-06-13 09:56:21 -07:00
Masahiro Yamada
fca5e94921 samples: binderfs: really compile this sample and fix build issues
Even after commit c624adc9cb ("samples: fix binderfs sample"), this
sample is never compiled.

'hostprogs' teaches Kbuild that this is a host program, but not enough
to order to compile it. You must add it to 'always-y' to really compile
it.

Since this sample has never been compiled in upstream, various issues
are left unnoticed.

[1] compilers without <linux/android/binderfs.h> are still widely used

<linux/android/binderfs.h> is only available since commit c13295ad21
("binderfs: rename header to binderfs.h"), i.e., Linux 5.0

If your compiler is based on UAPI headers older than Linux 5.0, you
will see the following error:

  samples/binderfs/binderfs_example.c:16:10: fatal error: linux/android/binderfs.h: No such file or directory
   #include <linux/android/binderfs.h>
            ^~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.

You cannot rely on compilers having such a new header.

The common approach is to install UAPI headers of this kernel into
usr/include, and then add it to the header search path.

I added 'depends on HEADERS_INSTALL' in Kconfig, and '-I usr/include'
compiler flag in Makefile.

[2] compile the sample for target architecture

Because headers_install works for the target architecture, only the
native compiler was able to build sample code that requires
'-I usr/include'.

Commit 7f3a59db27 ("kbuild: add infrastructure to build userspace
programs") added the new syntax 'userprogs' to compile user-space
programs for the target architecture.

Use it, and then 'ifndef CROSS_COMPILE' will go away.

I added 'depends on CC_CAN_LINK' because $(CC) is not necessarily
capable of linking user-space programs.

[3] use subdir-y to descend into samples/binderfs

Since this directory does not contain any kernel-space code, it has no
point in generating built-in.a or modules.order.

Replace obj-$(CONFIG_...) with subdir-$(CONFIG_...).

[4] -Wunused-variable warning

If I compile this, I see the following warning.

  samples/binderfs/binderfs_example.c: In function 'main':
  samples/binderfs/binderfs_example.c:21:9: warning: unused variable 'len' [-Wunused-variable]
     21 |  size_t len;
        |         ^~~

I removed the unused 'len'.

[5] CONFIG_ANDROID_BINDERFS is not required

Since this is a user-space standalone program, it is independent of
the kernel configuration.

Remove 'depends on ANDROID_BINDERFS'.

Fixes: 9762dc1432 ("samples: add binderfs sample program")
Fixes: c624adc9cb ("samples: fix binderfs sample")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-06-11 20:14:41 +09:00
Linus Torvalds
cff11abeca Merge tag 'kbuild-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - fix warnings in 'make clean' for ARCH=um, hexagon, h8300, unicore32

 - ensure to rebuild all objects when the compiler is upgraded

 - exclude system headers from dependency tracking and fixdep processing

 - fix potential bit-size mismatch between the kernel and BPF user-mode
   helper

 - add the new syntax 'userprogs' to build user-space programs for the
   target architecture (the same arch as the kernel)

 - compile user-space sample code under samples/ for the target arch
   instead of the host arch

 - make headers_install fail if a CONFIG option is leaked to user-space

 - sanitize the output format of scripts/checkstack.pl

 - handle ARM 'push' instruction in scripts/checkstack.pl

 - error out before modpost if a module name conflict is found

 - error out when multiple directories are passed to M= because this
   feature is broken for a long time

 - add CONFIG_DEBUG_INFO_COMPRESSED to support compressed debug info

 - a lot of cleanups of modpost

 - dump vmlinux symbols out into vmlinux.symvers, and reuse it in the
   second pass of modpost

 - do not run the second pass of modpost if nothing in modules is
   updated

 - install modules.builtin(.modinfo) by 'make install' as well as by
   'make modules_install' because it is useful even when
   CONFIG_MODULES=n

 - add new command line variables, GZIP, BZIP2, LZOP, LZMA, LZ4, and XZ
   to allow users to use alternatives such as pigz, pbzip2, etc.

* tag 'kbuild-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (96 commits)
  kbuild: add variables for compression tools
  Makefile: install modules.builtin even if CONFIG_MODULES=n
  mksysmap: Fix the mismatch of '.L' symbols in System.map
  kbuild: doc: rename LDFLAGS to KBUILD_LDFLAGS
  modpost: change elf_info->size to size_t
  modpost: remove is_vmlinux() helper
  modpost: strip .o from modname before calling new_module()
  modpost: set have_vmlinux in new_module()
  modpost: remove mod->skip struct member
  modpost: add mod->is_vmlinux struct member
  modpost: remove is_vmlinux() call in check_for_{gpl_usage,unused}()
  modpost: remove mod->is_dot_o struct member
  modpost: move -d option in scripts/Makefile.modpost
  modpost: remove -s option
  modpost: remove get_next_text() and make {grab,release_}file static
  modpost: use read_text_file() and get_line() for reading text files
  modpost: avoid false-positive file open error
  modpost: fix potential mmap'ed file overrun in get_src_version()
  modpost: add read_text_file() and get_line() helpers
  modpost: do not call get_modinfo() for vmlinux(.o)
  ...
2020-06-06 12:00:25 -07:00
David Howells
f5b5a164f9 Add sample notification program
The sample program is run like:

	./samples/watch_queue/watch_test

and watches "/" for mount changes and the current session keyring for key
changes:

	# keyctl add user a a @s
	1035096409
	# keyctl unlink 1035096409 @s

producing:

	# ./watch_test
	read() = 16
	NOTIFY[000]: ty=000001 sy=02 i=00000110
	KEY 2ffc2e5d change=2[linked] aux=1035096409
	read() = 16
	NOTIFY[000]: ty=000001 sy=02 i=00000110
	KEY 2ffc2e5d change=3[unlinked] aux=1035096409

Other events may be produced, such as with a failing disk:

	read() = 22
	NOTIFY[000]: ty=000003 sy=02 i=00000416
	USB 3-7.7 dev-reset e=0 r=0
	read() = 24
	NOTIFY[000]: ty=000002 sy=06 i=00000418
	BLOCK 00800050 e=6[critical medium] s=64000ef8

This corresponds to:

	blk_update_request: critical medium error, dev sdf, sector 1677725432 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

in dmesg.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:38:07 +01:00
Masahiro Yamada
88a8e278ff samples: watchdog: use 'userprogs' syntax
Kbuild now supports the 'userprogs' syntax to compile userspace
programs for the same architecture as the kernel.

Add the entry to samples/Makefile to put this into the build bot
coverage.

I also added the CONFIG option guarded by 'depends on CC_CAN_LINK'
because $(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00
Masahiro Yamada
b98ccc7150 samples: timers: use 'userprogs' syntax
Kbuild now supports the 'userprogs' syntax to compile userspace
programs for the same architecture as the kernel.

Add the entry to samples/Makefile to put this into the build bot
coverage.

I also added the CONFIG option guarded by 'depends on CC_CAN_LINK'
because $(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00
Masahiro Yamada
87ffbba9a9 samples: auxdisplay: use 'userprogs' syntax
Kbuild now supports the 'userprogs' syntax to compile userspace
programs for the same architecture as the kernel.

Add the entry to samples/Makefile to put this into the build bot
coverage.

I also added the CONFIG option guarded by 'depends on CC_CAN_LINK'
because $(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00
Masahiro Yamada
c4c10996b1 samples: mei: build sample program for target architecture
This userspace program includes UAPI headers exported to usr/include/.
'make headers' always works for the target architecture (i.e. the same
architecture as the kernel), so the sample program should be built for
the target as well. Kbuild now supports 'userprogs' for that.

I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
$(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00