Commit Graph

989 Commits

Author SHA1 Message Date
Gustavo A. R. Silva
f2baaff279 samples: mei: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Linus Torvalds
4f9b3a3775 binderfs: add gitignore for generated sample program
Let's keep "git status" happy and quiet.

Fixes: 9762dc1432 ("samples: add binderfs sample program
Fixes: fca5e94921 ("samples: binderfs: really compile this sample and fix build issues")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-13 13:41:24 -07:00
Linus Torvalds
6adc19fd13 Merge tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:

 - fix build rules in binderfs sample

 - fix build errors when Kbuild recurses to the top Makefile

 - covert '---help---' in Kconfig to 'help'

* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  treewide: replace '---help---' in Kconfig files with 'help'
  kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
  samples: binderfs: really compile this sample and fix build issues
2020-06-13 13:29:16 -07:00
Linus Torvalds
298ce0fd50 watch_queue: add gitignore for generated sample program
Let's keep "git status" happy and quiet.

Fixes: f5b5a164f9 ("Add sample notification program")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-13 13:00:54 -07:00
Linus Torvalds
6c32978414 Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull notification queue from David Howells:
 "This adds a general notification queue concept and adds an event
  source for keys/keyrings, such as linking and unlinking keys and
  changing their attributes.

  Thanks to Debarshi Ray, we do have a pull request to use this to fix a
  problem with gnome-online-accounts - as mentioned last time:

     https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

  Without this, g-o-a has to constantly poll a keyring-based kerberos
  cache to find out if kinit has changed anything.

  [ There are other notification pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

  LSM hooks are included:

   - A set of hooks are provided that allow an LSM to rule on whether or
     not a watch may be set. Each of these hooks takes a different
     "watched object" parameter, so they're not really shareable. The
     LSM should use current's credentials. [Wanted by SELinux & Smack]

   - A hook is provided to allow an LSM to rule on whether or not a
     particular message may be posted to a particular queue. This is
     given the credentials from the event generator (which may be the
     system) and the watch setter. [Wanted by Smack]

  I've provided SELinux and Smack with implementations of some of these
  hooks.

  WHY
  ===

  Key/keyring notifications are desirable because if you have your
  kerberos tickets in a file/directory, your Gnome desktop will monitor
  that using something like fanotify and tell you if your credentials
  cache changes.

  However, we also have the ability to cache your kerberos tickets in
  the session, user or persistent keyring so that it isn't left around
  on disk across a reboot or logout. Keyrings, however, cannot currently
  be monitored asynchronously, so the desktop has to poll for it - not
  so good on a laptop. This facility will allow the desktop to avoid the
  need to poll.

  DESIGN DECISIONS
  ================

   - The notification queue is built on top of a standard pipe. Messages
     are effectively spliced in. The pipe is opened with a special flag:

        pipe2(fds, O_NOTIFICATION_PIPE);

     The special flag has the same value as O_EXCL (which doesn't seem
     like it will ever be applicable in this context)[?]. It is given up
     front to make it a lot easier to prohibit splice&co from accessing
     the pipe.

     [?] Should this be done some other way?  I'd rather not use up a new
         O_* flag if I can avoid it - should I add a pipe3() system call
         instead?

     The pipe is then configured::

        ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
        ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

     Messages are then read out of the pipe using read().

   - It should be possible to allow write() to insert data into the
     notification pipes too, but this is currently disabled as the
     kernel has to be able to insert messages into the pipe *without*
     holding pipe->mutex and the code to make this work needs careful
     auditing.

   - sendfile(), splice() and vmsplice() are disabled on notification
     pipes because of the pipe->mutex issue and also because they
     sometimes want to revert what they just did - but one or more
     notification messages might've been interleaved in the ring.

   - The kernel inserts messages with the wait queue spinlock held. This
     means that pipe_read() and pipe_write() have to take the spinlock
     to update the queue pointers.

   - Records in the buffer are binary, typed and have a length so that
     they can be of varying size.

     This allows multiple heterogeneous sources to share a common
     buffer; there are 16 million types available, of which I've used
     just a few, so there is scope for others to be used. Tags may be
     specified when a watchpoint is created to help distinguish the
     sources.

   - Records are filterable as types have up to 256 subtypes that can be
     individually filtered. Other filtration is also available.

   - Notification pipes don't interfere with each other; each may be
     bound to a different set of watches. Any particular notification
     will be copied to all the queues that are currently watching for it
     - and only those that are watching for it.

   - When recording a notification, the kernel will not sleep, but will
     rather mark a queue as having lost a message if there's
     insufficient space. read() will fabricate a loss notification
     message at an appropriate point later.

   - The notification pipe is created and then watchpoints are attached
     to it, using one of:

        keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
        watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
        watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

     where in both cases, fd indicates the queue and the number after is
     a tag between 0 and 255.

   - Watches are removed if either the notification pipe is destroyed or
     the watched object is destroyed. In the latter case, a message will
     be generated indicating the enforced watch removal.

  Things I want to avoid:

   - Introducing features that make the core VFS dependent on the
     network stack or networking namespaces (ie. usage of netlink).

   - Dumping all this stuff into dmesg and having a daemon that sits
     there parsing the output and distributing it as this then puts the
     responsibility for security into userspace and makes handling
     namespaces tricky. Further, dmesg might not exist or might be
     inaccessible inside a container.

   - Letting users see events they shouldn't be able to see.

  TESTING AND MANPAGES
  ====================

   - The keyutils tree has a pipe-watch branch that has keyctl commands
     for making use of notifications. Proposed manual pages can also be
     found on this branch, though a couple of them really need to go to
     the main manpages repository instead.

     If the kernel supports the watching of keys, then running "make
     test" on that branch will cause the testing infrastructure to spawn
     a monitoring process on the side that monitors a notifications pipe
     for all the key/keyring changes induced by the tests and they'll
     all be checked off to make sure they happened.

        https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

   - A test program is provided (samples/watch_queue/watch_test) that
     can be used to monitor for keyrings, mount and superblock events.
     Information on the notifications is simply logged to stdout"

* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  smack: Implement the watch_key and post_notification hooks
  selinux: Implement the watch_key security hook
  keys: Make the KEY_NEED_* perms an enum rather than a mask
  pipe: Add notification lossage handling
  pipe: Allow buffers to be marked read-whole-or-error for notifications
  Add sample notification program
  watch_queue: Add a key/keyring notification facility
  security: Add hooks to rule on setting a watch
  pipe: Add general notification queue support
  pipe: Add O_NOTIFICATION_PIPE
  security: Add a hook for the point of notification insertion
  uapi: General notification queue definitions
2020-06-13 09:56:21 -07:00
Masahiro Yamada
fca5e94921 samples: binderfs: really compile this sample and fix build issues
Even after commit c624adc9cb ("samples: fix binderfs sample"), this
sample is never compiled.

'hostprogs' teaches Kbuild that this is a host program, but not enough
to order to compile it. You must add it to 'always-y' to really compile
it.

Since this sample has never been compiled in upstream, various issues
are left unnoticed.

[1] compilers without <linux/android/binderfs.h> are still widely used

<linux/android/binderfs.h> is only available since commit c13295ad21
("binderfs: rename header to binderfs.h"), i.e., Linux 5.0

If your compiler is based on UAPI headers older than Linux 5.0, you
will see the following error:

  samples/binderfs/binderfs_example.c:16:10: fatal error: linux/android/binderfs.h: No such file or directory
   #include <linux/android/binderfs.h>
            ^~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.

You cannot rely on compilers having such a new header.

The common approach is to install UAPI headers of this kernel into
usr/include, and then add it to the header search path.

I added 'depends on HEADERS_INSTALL' in Kconfig, and '-I usr/include'
compiler flag in Makefile.

[2] compile the sample for target architecture

Because headers_install works for the target architecture, only the
native compiler was able to build sample code that requires
'-I usr/include'.

Commit 7f3a59db27 ("kbuild: add infrastructure to build userspace
programs") added the new syntax 'userprogs' to compile user-space
programs for the target architecture.

Use it, and then 'ifndef CROSS_COMPILE' will go away.

I added 'depends on CC_CAN_LINK' because $(CC) is not necessarily
capable of linking user-space programs.

[3] use subdir-y to descend into samples/binderfs

Since this directory does not contain any kernel-space code, it has no
point in generating built-in.a or modules.order.

Replace obj-$(CONFIG_...) with subdir-$(CONFIG_...).

[4] -Wunused-variable warning

If I compile this, I see the following warning.

  samples/binderfs/binderfs_example.c: In function 'main':
  samples/binderfs/binderfs_example.c:21:9: warning: unused variable 'len' [-Wunused-variable]
     21 |  size_t len;
        |         ^~~

I removed the unused 'len'.

[5] CONFIG_ANDROID_BINDERFS is not required

Since this is a user-space standalone program, it is independent of
the kernel configuration.

Remove 'depends on ANDROID_BINDERFS'.

Fixes: 9762dc1432 ("samples: add binderfs sample program")
Fixes: c624adc9cb ("samples: fix binderfs sample")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-06-11 20:14:41 +09:00
Linus Torvalds
cff11abeca Merge tag 'kbuild-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - fix warnings in 'make clean' for ARCH=um, hexagon, h8300, unicore32

 - ensure to rebuild all objects when the compiler is upgraded

 - exclude system headers from dependency tracking and fixdep processing

 - fix potential bit-size mismatch between the kernel and BPF user-mode
   helper

 - add the new syntax 'userprogs' to build user-space programs for the
   target architecture (the same arch as the kernel)

 - compile user-space sample code under samples/ for the target arch
   instead of the host arch

 - make headers_install fail if a CONFIG option is leaked to user-space

 - sanitize the output format of scripts/checkstack.pl

 - handle ARM 'push' instruction in scripts/checkstack.pl

 - error out before modpost if a module name conflict is found

 - error out when multiple directories are passed to M= because this
   feature is broken for a long time

 - add CONFIG_DEBUG_INFO_COMPRESSED to support compressed debug info

 - a lot of cleanups of modpost

 - dump vmlinux symbols out into vmlinux.symvers, and reuse it in the
   second pass of modpost

 - do not run the second pass of modpost if nothing in modules is
   updated

 - install modules.builtin(.modinfo) by 'make install' as well as by
   'make modules_install' because it is useful even when
   CONFIG_MODULES=n

 - add new command line variables, GZIP, BZIP2, LZOP, LZMA, LZ4, and XZ
   to allow users to use alternatives such as pigz, pbzip2, etc.

* tag 'kbuild-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (96 commits)
  kbuild: add variables for compression tools
  Makefile: install modules.builtin even if CONFIG_MODULES=n
  mksysmap: Fix the mismatch of '.L' symbols in System.map
  kbuild: doc: rename LDFLAGS to KBUILD_LDFLAGS
  modpost: change elf_info->size to size_t
  modpost: remove is_vmlinux() helper
  modpost: strip .o from modname before calling new_module()
  modpost: set have_vmlinux in new_module()
  modpost: remove mod->skip struct member
  modpost: add mod->is_vmlinux struct member
  modpost: remove is_vmlinux() call in check_for_{gpl_usage,unused}()
  modpost: remove mod->is_dot_o struct member
  modpost: move -d option in scripts/Makefile.modpost
  modpost: remove -s option
  modpost: remove get_next_text() and make {grab,release_}file static
  modpost: use read_text_file() and get_line() for reading text files
  modpost: avoid false-positive file open error
  modpost: fix potential mmap'ed file overrun in get_src_version()
  modpost: add read_text_file() and get_line() helpers
  modpost: do not call get_modinfo() for vmlinux(.o)
  ...
2020-06-06 12:00:25 -07:00
Linus Torvalds
cb8e59cc87 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

 2) Add GSO partial support to igc, from Sasha Neftin.

 3) Several cleanups and improvements to r8169 from Heiner Kallweit.

 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

 5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin.

 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

 8) Add sriov and vf support to hinic, from Luo bin.

 9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

21) Add BPF iterators, from Yonghang Song.

22) Add cable test infrastructure, including ethool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
  selftests: net: ip_defrag: ignore EPERM
  net_failover: fixed rollback in net_failover_open()
  Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
  Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
  vmxnet3: allow rx flow hash ops only when rss is enabled
  hinic: add set_channels ethtool_ops support
  selftests/bpf: Add a default $(CXX) value
  tools/bpf: Don't use $(COMPILE.c)
  bpf, selftests: Use bpf_probe_read_kernel
  s390/bpf: Use bcr 0,%0 as tail call nop filler
  s390/bpf: Maintain 8-byte stack alignment
  selftests/bpf: Fix verifier test
  selftests/bpf: Fix sample_cnt shared between two threads
  bpf, selftests: Adapt cls_redirect to call csum_level helper
  bpf: Add csum_level helper for fixing up csum levels
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
  crypto/chtls: IPv6 support for inline TLS
  Crypto/chcr: Fixes a coccinile check error
  Crypto/chcr: Fixes compilations warnings
  ...
2020-06-03 16:27:18 -07:00
Linus Torvalds
f359287765 Merge branch 'from-miklos' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "Assorted patches from Miklos.

  An interesting part here is /proc/mounts stuff..."

The "/proc/mounts stuff" is using a cursor for keeeping the location
data while traversing the mount listing.

Also probably worth noting is the addition of faccessat2(), which takes
an additional set of flags to specify how the lookup is done
(AT_EACCESS, AT_SYMLINK_NOFOLLOW, AT_EMPTY_PATH).

* 'from-miklos' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: add faccessat2 syscall
  vfs: don't parse "silent" option
  vfs: don't parse "posixacl" option
  vfs: don't parse forbidden flags
  statx: add mount_root
  statx: add mount ID
  statx: don't clear STATX_ATIME on SB_RDONLY
  uapi: deprecate STATX_ALL
  utimensat: AT_EMPTY_PATH support
  vfs: split out access_override_creds()
  proc/mounts: add cursor
  aio: fix async fsync creds
  vfs: allow unprivileged whiteout creation
2020-06-01 16:44:06 -07:00
Linus Torvalds
b23c4771ff Merge tag 'docs-5.8' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
 "A fair amount of stuff this time around, dominated by yet another
  massive set from Mauro toward the completion of the RST conversion. I
  *really* hope we are getting close to the end of this. Meanwhile,
  those patches reach pretty far afield to update document references
  around the tree; there should be no actual code changes there. There
  will be, alas, more of the usual trivial merge conflicts.

  Beyond that we have more translations, improvements to the sphinx
  scripting, a number of additions to the sysctl documentation, and lots
  of fixes"

* tag 'docs-5.8' of git://git.lwn.net/linux: (130 commits)
  Documentation: fixes to the maintainer-entry-profile template
  zswap: docs/vm: Fix typo accept_threshold_percent in zswap.rst
  tracing: Fix events.rst section numbering
  docs: acpi: fix old http link and improve document format
  docs: filesystems: add info about efivars content
  Documentation: LSM: Correct the basic LSM description
  mailmap: change email for Ricardo Ribalda
  docs: sysctl/kernel: document unaligned controls
  Documentation: admin-guide: update bug-hunting.rst
  docs: sysctl/kernel: document ngroups_max
  nvdimm: fixes to maintainter-entry-profile
  Documentation/features: Correct RISC-V kprobes support entry
  Documentation/features: Refresh the arch support status files
  Revert "docs: sysctl/kernel: document ngroups_max"
  docs: move locking-specific documents to locking/
  docs: move digsig docs to the security book
  docs: move the kref doc into the core-api book
  docs: add IRQ documentation at the core-api book
  docs: debugging-via-ohci1394.txt: add it to the core-api book
  docs: fix references for ipmi.rst file
  ...
2020-06-01 15:45:27 -07:00
Linus Torvalds
69fc06f70f Merge tag 'objtool-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
 "There are a lot of objtool changes in this cycle, all across the map:

   - Speed up objtool significantly, especially when there are large
     number of sections

   - Improve objtool's understanding of special instructions such as
     IRET, to reduce the number of annotations required

   - Implement 'noinstr' validation

   - Do baby steps for non-x86 objtool use

   - Simplify/fix retpoline decoding

   - Add vmlinux validation

   - Improve documentation

   - Fix various bugs and apply smaller cleanups"

* tag 'objtool-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  objtool: Enable compilation of objtool for all architectures
  objtool: Move struct objtool_file into arch-independent header
  objtool: Exit successfully when requesting help
  objtool: Add check_kcov_mode() to the uaccess safelist
  samples/ftrace: Fix asm function ELF annotations
  objtool: optimize add_dead_ends for split sections
  objtool: use gelf_getsymshndx to handle >64k sections
  objtool: Allow no-op CFI ops in alternatives
  x86/retpoline: Fix retpoline unwind
  x86: Change {JMP,CALL}_NOSPEC argument
  x86: Simplify retpoline declaration
  x86/speculation: Change FILL_RETURN_BUFFER to work with objtool
  objtool: Add support for intra-function calls
  objtool: Move the IRET hack into the arch decoder
  objtool: Remove INSN_STACK
  objtool: Make handle_insn_ops() unconditional
  objtool: Rework allocating stack_ops on decode
  objtool: UNWIND_HINT_RET_OFFSET should not check registers
  objtool: is_fentry_call() crashes if call has no destination
  x86,smap: Fix smap_{save,restore}() alternatives
  ...
2020-06-01 13:13:00 -07:00
Linus Torvalds
0bd957eb11 Merge tag 'core-kprobes-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull kprobes updates from Ingo Molnar:
 "Various kprobes updates, mostly centered around cleaning up the
  no-instrumentation logic.

  Instead of the current per debug facility blacklist, use the more
  generic .noinstr.text approach, combined with a 'noinstr' marker for
  functions.

  Also add instrumentation_begin()/end() to better manage the exact
  place in entry code where instrumentation may be used.

  And add a kprobes blacklist for modules"

* tag 'core-kprobes-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Prevent probes in .noinstr.text section
  vmlinux.lds.h: Create section for protection against instrumentation
  samples/kprobes: Add __kprobes and NOKPROBE_SYMBOL() for handlers.
  kprobes: Support NOKPROBE_SYMBOL() in modules
  kprobes: Support __kprobes blacklist in modules
  kprobes: Lock kprobe_mutex while showing kprobe_blacklist
2020-06-01 12:45:04 -07:00
Josh Poimboeuf
9d907f1ae8 samples/ftrace: Fix asm function ELF annotations
Enable objtool coverage for the sample ftrace modules by adding ELF
annotations to the asm trampoline functions.

  samples/ftrace/ftrace-direct.o: warning: objtool: .text+0x0: unreachable instruction
  samples/ftrace/ftrace-direct-modify.o: warning: objtool: .text+0x0: unreachable instruction
  samples/ftrace/ftrace-direct-too.o: warning: objtool: .text+0x0: unreachable instruction

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-05-20 08:30:43 -05:00
Daniel T. Lee
59929cd1fe samples, bpf: Refactor kprobe, tail call kern progs map definition
Because the previous two commit replaced the bpf_load implementation of
the user program with libbpf, the corresponding kernel program's MAP
definition can be replaced with new BTF-defined map syntax.

This commit only updates the samples which uses libbpf API for loading
bpf program not with bpf_load.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-6-danieltimlee@gmail.com
2020-05-19 17:13:03 +02:00
Daniel T. Lee
14846dda63 samples, bpf: Add tracex7 test file to .gitignore
This commit adds tracex7 test file (testfile.img) to .gitignore which
comes from test_override_return.sh.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-5-danieltimlee@gmail.com
2020-05-19 17:13:00 +02:00
Daniel T. Lee
bc1a85977b samples, bpf: Refactor tail call user progs with libbpf
BPF tail call uses the BPF_MAP_TYPE_PROG_ARRAY type map for calling
into other BPF programs and this PROG_ARRAY should be filled prior to
use. Currently, samples with the PROG_ARRAY type MAP fill this program
array with bpf_load. For bpf_load to fill this map, kernel BPF program
must specify the section with specific format of <prog_type>/<array_idx>
(e.g. SEC("socket/0"))

But by using libbpf instead of bpf_load, user program can specify which
programs should be added to PROG_ARRAY. The advantage of this approach
is that you can selectively add only the programs you want, rather than
adding all of them to PROG_ARRAY, and it's much more intuitive than the
traditional approach.

This commit refactors user programs with the PROG_ARRAY type MAP with
libbpf instead of using bpf_load.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-4-danieltimlee@gmail.com
2020-05-19 17:12:56 +02:00
Daniel T. Lee
63841bc083 samples, bpf: Refactor kprobe tracing user progs with libbpf
Currently, the kprobe BPF program attachment method for bpf_load is
quite old. The implementation of bpf_load "directly" controls and
manages(create, delete) the kprobe events of DEBUGFS. On the other hand,
using using the libbpf automatically manages the kprobe event.
(under bpf_link interface)

By calling bpf_program__attach(_kprobe) in libbpf, the corresponding
kprobe is created and the BPF program will be attached to this kprobe.
To remove this, by simply invoking bpf_link__destroy will clean up the
event.

This commit refactors kprobe tracing programs (tracex{1~7}_user.c) with
libbpf using bpf_link interface and bpf_program__attach.

tracex2_kern.c, which tracks system calls (sys_*), has been modified to
append prefix depending on architecture.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-3-danieltimlee@gmail.com
2020-05-19 17:12:53 +02:00
Daniel T. Lee
0efdcefb00 samples, bpf: Refactor pointer error check with libbpf
Current method of checking pointer error is not user friendly.
Especially the __must_check define makes this less intuitive.

Since, libbpf has an API libbpf_get_error() which checks pointer error,
this commit refactors existing pointer error check logic with libbpf.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-2-danieltimlee@gmail.com
2020-05-19 17:12:49 +02:00
David Howells
e7d553d69c pipe: Add notification lossage handling
Add handling for loss of notifications by having read() insert a
loss-notification message after it has read the pipe buffer that was last
in the ring when the loss occurred.

Lossage can come about either by running out of notification descriptors or
by running out of space in the pipe ring.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:40:28 +01:00
David Howells
8cfba76383 pipe: Allow buffers to be marked read-whole-or-error for notifications
Allow a buffer to be marked such that read() must return the entire buffer
in one go or return ENOBUFS.  Multiple buffers can be amalgamated into a
single read, but a short read will occur if the next "whole" buffer won't
fit.

This is useful for watch queue notifications to make sure we don't split a
notification across multiple reads, especially given that we need to
fabricate an overrun record under some circumstances - and that isn't in
the buffers.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:38:18 +01:00
David Howells
f5b5a164f9 Add sample notification program
The sample program is run like:

	./samples/watch_queue/watch_test

and watches "/" for mount changes and the current session keyring for key
changes:

	# keyctl add user a a @s
	1035096409
	# keyctl unlink 1035096409 @s

producing:

	# ./watch_test
	read() = 16
	NOTIFY[000]: ty=000001 sy=02 i=00000110
	KEY 2ffc2e5d change=2[linked] aux=1035096409
	read() = 16
	NOTIFY[000]: ty=000001 sy=02 i=00000110
	KEY 2ffc2e5d change=3[unlinked] aux=1035096409

Other events may be produced, such as with a failing disk:

	read() = 22
	NOTIFY[000]: ty=000003 sy=02 i=00000416
	USB 3-7.7 dev-reset e=0 r=0
	read() = 24
	NOTIFY[000]: ty=000002 sy=06 i=00000418
	BLOCK 00800050 e=6[critical medium] s=64000ef8

This corresponds to:

	blk_update_request: critical medium error, dev sdf, sector 1677725432 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

in dmesg.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:38:07 +01:00
Masahiro Yamada
88a8e278ff samples: watchdog: use 'userprogs' syntax
Kbuild now supports the 'userprogs' syntax to compile userspace
programs for the same architecture as the kernel.

Add the entry to samples/Makefile to put this into the build bot
coverage.

I also added the CONFIG option guarded by 'depends on CC_CAN_LINK'
because $(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00
Masahiro Yamada
b98ccc7150 samples: timers: use 'userprogs' syntax
Kbuild now supports the 'userprogs' syntax to compile userspace
programs for the same architecture as the kernel.

Add the entry to samples/Makefile to put this into the build bot
coverage.

I also added the CONFIG option guarded by 'depends on CC_CAN_LINK'
because $(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00
Masahiro Yamada
87ffbba9a9 samples: auxdisplay: use 'userprogs' syntax
Kbuild now supports the 'userprogs' syntax to compile userspace
programs for the same architecture as the kernel.

Add the entry to samples/Makefile to put this into the build bot
coverage.

I also added the CONFIG option guarded by 'depends on CC_CAN_LINK'
because $(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00
Masahiro Yamada
c4c10996b1 samples: mei: build sample program for target architecture
This userspace program includes UAPI headers exported to usr/include/.
'make headers' always works for the target architecture (i.e. the same
architecture as the kernel), so the sample program should be built for
the target as well. Kbuild now supports 'userprogs' for that.

I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
$(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2020-05-17 18:52:02 +09:00