Pull networking updates from David Miller:
"Highlights:
1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
Varadhan.
2) Simplify classifier state on sk_buff in order to shrink it a bit.
From Willem de Bruijn.
3) Introduce SIPHASH and it's usage for secure sequence numbers and
syncookies. From Jason A. Donenfeld.
4) Reduce CPU usage for ICMP replies we are going to limit or
suppress, from Jesper Dangaard Brouer.
5) Introduce Shared Memory Communications socket layer, from Ursula
Braun.
6) Add RACK loss detection and allow it to actually trigger fast
recovery instead of just assisting after other algorithms have
triggered it. From Yuchung Cheng.
7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.
8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.
9) Export MPLS packet stats via netlink, from Robert Shearman.
10) Significantly improve inet port bind conflict handling, especially
when an application is restarted and changes it's setting of
reuseport. From Josef Bacik.
11) Implement TX batching in vhost_net, from Jason Wang.
12) Extend the dummy device so that VF (virtual function) features,
such as configuration, can be more easily tested. From Phil
Sutter.
13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
Dumazet.
14) Add new bpf MAP, implementing a longest prefix match trie. From
Daniel Mack.
15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.
16) Add new aquantia driver, from David VomLehn.
17) Add bpf tracepoints, from Daniel Borkmann.
18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
Florian Fainelli.
19) Remove custom busy polling in many drivers, it is done in the core
networking since 4.5 times. From Eric Dumazet.
20) Support XDP adjust_head in virtio_net, from John Fastabend.
21) Fix several major holes in neighbour entry confirmation, from
Julian Anastasov.
22) Add XDP support to bnxt_en driver, from Michael Chan.
23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.
24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.
25) Support GRO in IPSEC protocols, from Steffen Klassert"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
Revert "ath10k: Search SMBIOS for OEM board file extension"
net: socket: fix recvmmsg not returning error from sock_error
bnxt_en: use eth_hw_addr_random()
bpf: fix unlocking of jited image when module ronx not set
arch: add ARCH_HAS_SET_MEMORY config
net: napi_watchdog() can use napi_schedule_irqoff()
tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
net/hsr: use eth_hw_addr_random()
net: mvpp2: enable building on 64-bit platforms
net: mvpp2: switch to build_skb() in the RX path
net: mvpp2: simplify MVPP2_PRS_RI_* definitions
net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
net: mvpp2: remove unused register definitions
net: mvpp2: simplify mvpp2_bm_bufs_add()
net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
net: mvpp2: release reference to txq_cpu[] entry after unmapping
net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
...
Pull security layer updates from James Morris:
"Highlights:
- major AppArmor update: policy namespaces & lots of fixes
- add /sys/kernel/security/lsm node for easy detection of loaded LSMs
- SELinux cgroupfs labeling support
- SELinux context mounts on tmpfs, ramfs, devpts within user
namespaces
- improved TPM 2.0 support"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (117 commits)
tpm: declare tpm2_get_pcr_allocation() as static
tpm: Fix expected number of response bytes of TPM1.2 PCR Extend
tpm xen: drop unneeded chip variable
tpm: fix misspelled "facilitate" in module parameter description
tpm_tis: fix the error handling of init_tis()
KEYS: Use memzero_explicit() for secret data
KEYS: Fix an error code in request_master_key()
sign-file: fix build error in sign-file.c with libressl
selinux: allow changing labels for cgroupfs
selinux: fix off-by-one in setprocattr
tpm: silence an array overflow warning
tpm: fix the type of owned field in cap_t
tpm: add securityfs support for TPM 2.0 firmware event log
tpm: enhance read_log_of() to support Physical TPM event log
tpm: enhance TPM 2.0 PCR extend to support multiple banks
tpm: implement TPM 2.0 capability to get active PCR banks
tpm: fix RC value check in tpm2_seal_trusted
tpm_tis: fix iTPM probe via probe_itpm() function
tpm: Begin the process to deprecate user_read_timer
tpm: remove tpm_read_index and tpm_write_index from tpm.h
...
Pull perf updates from Ingo Molnar:
"On the kernel side the main changes in this cycle were:
- Add Intel Kaby Lake CPU support (Srinivas Pandruvada)
- AMD uncore driver updates for fam17 (Janakarajan Natarajan)
- Intel/PT updates and core events optimizations and cleanups
(Alexander Shishkin)
- cgroups events fixes (David Carrillo-Cisneros)
- kprobes improvements (Masami Hiramatsu)
- ... plus misc fixes and updates.
On the tooling side the main changes were:
- Support clang build in tools/{perf,lib/{bpf,traceevent,api}} with
CC=clang, to, for instance, take advantage of better warnings
(Arnaldo Carvalho de Melo):
- Introduce the 'delta-abs' 'perf diff' compute method, that orders
the histogram entries by the absolute value of the percentage delta
for a function in two perf.data files, i.e. the functions that
changed the most (increase or decrease in samples) comes first
(Namhyung Kim)
- Add support for parsing Intel uncore vendor event files and add
uncore vendor events for the Intel server processors (Haswell,
Broadwell, IvyBridge), Xeon Phi (Knights Landing) and Broadwell DE
(Andi Kleen)
- Introduce 'perf ftrace' a perf front end to the kernel's ftrace
function and function_graph tracer, defaulting to the
"function_graph" tracer, more work will be done in reviving this
effort, forward porting it from its initial patch submission
(Namhyung Kim)
- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
- Account thread wait time (off CPU time) separately: sleep, iowait
and preempt, based on the prev_state of the last event, show the
breakdown when using "perf sched timehist --state" (Namhyumg Kim)
- Add more triggers to switch the output file (perf.data.TIMESTAMP).
Now, in addition to switching to a different output file when
receiving a SIGUSR2, one can also specify file size and time based
triggers:
perf record -a --switch-output=signal
is equivalent to what we had before:
perf record -a --switch-output
While we can also ask for the file to be "sliced" by size, taking
into account that that will happen only when we get woken up by the
kernel, i.e. one has to take into account the --mmap-pages (the
size of the perf mmap ring buffer):
perf record -a --switch-output=2G
will break the perf.data output into multiple files limited to 2GB
of samples, right when generating the output.
For time based samples, alert() will be used, so to have 1 minute
limited perf.data output files:
perf record -a --switch-output=1m
(Jiri Olsa)
- Improve 'perf trace' (Arnaldo Carvalho de Melo)
- 'perf kallsyms' toy tool to look for extended symbol information on
the running kernel and demonstrate the machine/thread/symbol APIs
for use in other tools, such as 'perf probe' (Arnaldo Carvalho de
Melo)
- ... plus tons of other changes, see the shortlog and Git log for
details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (131 commits)
perf tools: Add missing parse_events_error() prototype
perf pmu: Fix check for unset alias->unit array
perf tools: Be consistent on the type of map->symbols[] interator
perf intel pt decoder: clang has no -Wno-override-init
perf evsel: Do not put a variable sized type not at the end of a struct
perf probe: Avoid accessing uninitialized 'map' variable
perf tools: Do not put a variable sized type not at the end of a struct
perf record: Do not put a variable sized type not at the end of a struct
perf tests: Synthesize struct instead of using field after variable sized type
perf bench numa: Make sure dprintf() is not defined
Revert "perf bench futex: Sanitize numeric parameters"
tools lib subcmd: Make it an error to pass a signed value to OPTION_UINTEGER
tools: Set the maximum optimization level according to the compiler being used
tools: Suppress request for warning options not existent in clang
samples/bpf: Reset global variables
samples/bpf: Ignore already processed ELF sections
samples/bpf: Add missing header
perf symbols: dso->name is an array, no need to check it against NULL
perf tests record: No need to test an array against NULL
perf symbols: No need to check if sym->name is NULL
...
If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
to the given cgroup the descendent cgroup will be able to override
effective bpf program that was inherited from this cgroup.
By default it's not passed, therefore override is disallowed.
Examples:
1.
prog X attached to /A with default
prog Y fails to attach to /A/B and /A/B/C
Everything under /A runs prog X
2.
prog X attached to /A with allow_override.
prog Y fails to attach to /A/B with default (non-override)
prog M attached to /A/B with allow_override.
Everything under /A/B runs prog M only.
3.
prog X attached to /A with allow_override.
prog Y fails to attach to /A with default.
The user has to detach first to switch the mode.
In the future this behavior may be extended with a chain of
non-overridable programs.
Also fix the bug where detach from cgroup where nothing is attached
was not throwing error. Return ENOENT in such case.
Add several testcases and adjust libbpf.
Fixes: 3007098494 ("cgroup: add support for eBPF programs")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the map_perf_test_{user,kern}.c infrastructure to stress test
lpm-trie lookups. We hook into the kprobe on sys_gettid() and measure
the latency depending on trie size and lookup count.
On my Intel Haswell i7-6400U, a single gettid() syscall with an empty
bpf program takes roughly 6.5us on my system. Lookups in empty tries
take ~1.8us on first try, ~0.9us on retries. Lookups in tries with 8192
entries take ~7.1us (on the first _and_ any subsequent try).
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Mack <daniel@zonque.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build errors for samples/bpf xdp_tx_iptunnel and tc_l2_redirect,
when dynamic debugging is enabled (CONFIG_DYNAMIC_DEBUG) by defining a
fake KBUILD_MODNAME.
Just like Daniel Borkmann fixed other samples/bpf in commit
96a8eb1eee ("bpf: fix samples to add fake KBUILD_MODNAME").
Fixes: 12d8bb64e3 ("bpf: xdp: Add XDP example for head adjustment")
Fixes: 90e02896f1 ("bpf: Add test for bpf_redirect to ipip/ip6tnl")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf fixes from Ingo Molnar:
"Misc race fixes uncovered by fuzzing efforts, a Sparse fix, two PMU
driver fixes, plus miscellanous tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Reject non sampling events with precise_ip
perf/x86/intel: Account interrupts for PEBS errors
perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race
perf/core: Fix sys_perf_event_open() vs. hotplug
perf/x86/intel: Use ULL constant to prevent undefined shift behaviour
perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
perf/x86: Set pmu->module in Intel PMU modules
perf probe: Fix to probe on gcc generated symbols for offline kernel
perf probe: Fix --funcs to show correct symbols for offline module
perf symbols: Robustify reading of build-id from sysfs
perf tools: Install tools/lib/traceevent plugins with install-bin
tools lib traceevent: Fix prev/next_prio for deadline tasks
perf record: Fix --switch-output documentation and comment
perf record: Make __record_options static
tools lib subcmd: Add OPT_STRING_OPTARG_SET option
perf probe: Fix to get correct modname from elf header
samples/bpf trace_output_user: Remove duplicate sys/ioctl.h include
samples/bpf sock_example: Avoid getting ethhdr from two includes
perf sched timehist: Show total scheduling time
We set info.count to 1 in mtty_get_irq_info() so static checkers
complain that, "Why do we have impossible conditions?" The answer is
that it seems to be left over dead code that can be safely removed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is a sample driver for documentation so the impact is probably
pretty low. But we should check that bar_index is valid so we
don't write beyond the end of the mdev_state->region_info[] array.
Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The copy_to_user() function returns the number of bytes which it wasn't
able to copy but we want to return a negative error code.
Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pull perf/urgent fixes and one improvement from Arnaldo Carvalho de Melo:
Fixes:
- Fix prev/next_prio formatting for deadline tasks in libtraceevent (Daniel Bristot de Oliveira)
- Robustify reading of build-ids from /sys/kernel/note (Arnaldo Carvalho de Melo)
- Fix building some sample/bpf in Alpine Linux 3.4 (Arnaldo Carvalho de Melo)
- Fix 'make install-bin' to install libtraceevent plugins (Arnaldo Carvalho de Melo)
- Fix 'perf record --switch-output' documentation and comment (Jiri Olsa)
- Fix 'perf probe' for cross arch probing (Masami Hiramatsu)
Improvement:
- Show total scheduling time in 'perf sched timehist' (Namhyumg Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid the following build failure on Alpine Linux 3.4, that has
clang-3.8 with the bpf target:
HOSTCC samples/bpf/sock_example.o
In file included from /usr/include/net/ethernet.h:10:0,
from /git/linux/samples/bpf/sock_example.h:7,
from /git/linux/samples/bpf/sock_example.c:30:
/usr/include/netinet/if_ether.h:96:8: error: redefinition of 'struct
ethhdr'
struct ethhdr {
^
In file included from /git/linux/samples/bpf/sock_example.c:26:0:
./usr/include/linux/if_ether.h:144:8: note: originally defined here
struct ethhdr {
^
scripts/Makefile.host:124: recipe for target
'samples/bpf/sock_example.o' failed
make[2]: *** [samples/bpf/sock_example.o] Error 1
/git/linux/Makefile:1658: recipe for target 'samples/bpf/' failed
So include net/if_ether.h for the needs of sock_example.h, using the
same include that sock_example.c uses.
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Joe Stringer <joe@ovn.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m9avekl1b651qe1r1zd5tzz9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf fixes from Ingo Molnar:
"On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
plus on the tooling side there's a number of fixes and some late
updates"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf sched timehist: Fix invalid period calculation
perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
perf sched timehist: Enlarge default 'comm_width'
perf sched timehist: Honour 'comm_width' when aligning the headers
perf/x86: Fix overlap counter scheduling bug
perf/x86/pebs: Fix handling of PEBS buffer overflows
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
tools lib bpf: Add bpf_prog_{attach,detach}
samples/bpf: Switch over to libbpf
perf diff: Do not overwrite valid build id
perf annotate: Don't throw error for zero length symbols
perf bench futex: Fix lock-pi help string
perf trace: Check if MAP_32BIT is defined (again)
samples/bpf: Make perf_event_read() static
uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
samples/bpf: Make samples more libbpf-centric
tools lib bpf: Add flags to bpf_create_map()
tools lib bpf: use __u32 from linux/types.h
...