This patch adds a new NFTA_LIMIT_TYPE netlink attribute to indicate the type of
limiting.
Contrary to per-packet limiting, the cost is calculated from the packet path
since this depends on the packet length.
The burst attribute indicates the number of bytes in which the rate can be
exceeded.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds the burst parameter. This burst indicates the number of packets
that can exceed the limit.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This new expression uses the nf_dup engine to clone packets to a given gateway.
Unlike xt_TEE, we use an index to indicate output interface which should be
fine at this stage.
Moreover, change to the preemtion-safe this_cpu_read(nf_skb_duplicated) from
nf_dup_ipv{4,6} to silence a lockdep splat.
Based on the original tee expression from Arturo Borrero Gonzalez, although
this patch has diverted quite a bit from this initial effort due to the
change to support maps.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This adds the ability audit the actions of a not-yet-running process.
This patch implements the ability to filter on the executable path. Instead of
just hard coding the ino and dev of the executable we care about at the moment
the rule is inserted into the kernel, use the new audit_fsnotify
infrastructure to manage this dynamically. This means that if the filename
does not yet exist but the containing directory does, or if the inode in
question is unlinked and creat'd (aka updated) the rule will just continue to
work. If the containing directory is moved or deleted or the filesystem is
unmounted, the rule is deleted automatically. A future enhancement would be to
have the rule survive across directory disruptions.
This is a heavily modified version of a patch originally submitted by Eric
Paris with some ideas from Peter Moody.
Cc: Peter Moody <peter@hda3.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: minor whitespace clean to satisfy ./scripts/checkpatch]
Signed-off-by: Paul Moore <pmoore@redhat.com>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next, they are:
1) A couple of cleanups for the netfilter core hook from Eric Biederman.
2) Net namespace hook registration, also from Eric. This adds a dependency with
the rtnl_lock. This should be fine by now but we have to keep an eye on this
because if we ever get the per-subsys nfnl_lock before rtnl we have may
problems in the future. But we have room to remove this in the future by
propagating the complexity to the clients, by registering hooks for the init
netns functions.
3) Update nf_tables to use the new net namespace hook infrastructure, also from
Eric.
4) Three patches to refine and to address problems from the new net namespace
hook infrastructure.
5) Switch to alternate jumpstack in xtables iff the packet is reentering. This
only applies to a very special case, the TEE target, but Eric Dumazet
reports that this is slowing down things for everyone else. So let's only
switch to the alternate jumpstack if the tee target is in used through a
static key. This batch also comes with offline precalculation of the
jumpstack based on the callchain depth. From Florian Westphal.
6) Minimal SCTP multihoming support for our conntrack helper, from Michal
Kubecek.
7) Reduce nf_bridge_info per skbuff scratchpad area to 32 bytes, from Florian
Westphal.
8) Fix several checkpatch errors in bridge netfilter, from Bernhard Thaler.
9) Get rid of useless debug message in ip6t_REJECT, from Subash Abhinov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull PCI fix from Bjorn Helgaas:
"This is a trivial fix for a change that broke user program compilation
(QEMU in this case)"
* tag 'pci-v4.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Restore PCI_MSIX_FLAGS_BIRMASK definition
In multiple locations there are checks for whether the label in hand
is a reserved label or not using the arbritray value of 16. Factor
this out into a #define for better maintainability and for
documentation.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ioctl IOCTL_MEI_NOTIFY_SET for enabling and disabling
async event notification.
Add ioctl IOCTL_MEI_NOTIFY_GET for receiving and acking
an event notification.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add skb->hash to the __sk_buff offset map, so it can be accessed from
an eBPF program. We currently already do this for classic BPF filters,
but not yet on eBPF, it might be useful as a demuxer in combination with
helpers like bpf_clone_redirect(), toy example:
__section("cls-lb") int ingress_main(struct __sk_buff *skb)
{
unsigned int which = 3 + (skb->hash & 7);
/* bpf_skb_store_bytes(skb, ...); */
/* bpf_l{3,4}_csum_replace(skb, ...); */
bpf_clone_redirect(skb, which, 0);
return -1;
}
I was thinking whether to add skb_get_hash(), but then concluded the
raw skb->hash seems fine in this case: we can directly access the hash
w/o extra eBPF helper function call, it's filled out by many NICs on
ingress, and in case the entropy level would not be sufficient, people
can still implement their own specific sw fallback hash mix anyway.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
arch/s390/net/bpf_jit_comp.c
drivers/net/ethernet/ti/netcp_ethss.c
net/bridge/br_multicast.c
net/ipv4/ip_fragment.c
All four conflicts were cases of simple overlapping
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
tlb_dynamic_lb could be set only via sysfs, this patch allows it to be
set via netlink.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two vxlan driver flags FLOWBASED and COLLECT_METADATA need to be set to
make use of its new flow mode. The former already exposed. Expose the latter.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce helpers to let eBPF programs attached to TC manipulate tunnel metadata:
bpf_skb_[gs]et_tunnel_key(skb, key, size, flags)
skb: pointer to skb
key: pointer to 'struct bpf_tunnel_key'
size: size of 'struct bpf_tunnel_key'
flags: room for future extensions
First eBPF program that uses these helpers will allocate per_cpu
metadata_dst structures that will be used on TX.
On RX metadata_dst is allocated by tunnel driver.
Typical usage for TX:
struct bpf_tunnel_key tkey;
... populate tkey ...
bpf_skb_set_tunnel_key(skb, &tkey, sizeof(tkey), 0);
bpf_clone_redirect(skb, vxlan_dev_ifindex, 0);
RX:
struct bpf_tunnel_key tkey = {};
bpf_skb_get_tunnel_key(skb, &tkey, sizeof(tkey), 0);
... lookup or redirect based on tkey ...
'struct bpf_tunnel_key' will be extended in the future by adding
elements to the end and the 'size' argument will indicate which fields
are populated, thereby keeping backwards compatibility.
The 'flags' argument may be used as well when the 'size' is not enough or
to indicate completely different layout of bpf_tunnel_key.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6fd99094de ("ipv6: Don't reduce hop limit for an interface")
disabled accept hop limit from RA if it is smaller than the current hop
limit for security stuff. But this behavior kind of break the RFC definition.
RFC 4861, 6.3.4. Processing Received Router Advertisements
A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time,
and Retrans Timer) may contain a value denoting that it is
unspecified. In such cases, the parameter should be ignored and the
host should continue using whatever value it is already using.
If the received Cur Hop Limit value is non-zero, the host SHOULD set
its CurHopLimit variable to the received value.
So add sysctl option accept_ra_min_hop_limit to let user choose the minimum
hop limit value they can accept from RA. And set default to 1 to meet RFC
standards.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently nf_conntrack_proto_sctp module handles only packets between
primary addresses used to establish the connection. Any packets between
secondary addresses are classified as invalid so that usual firewall
configurations drop them. Allowing HEARTBEAT and HEARTBEAT-ACK chunks to
establish a new conntrack would allow traffic between secondary
addresses to pass through. A more sophisticated solution based on the
addresses advertised in the initial handshake (and possibly also later
dynamic address addition and removal) would be much harder to implement.
Moreover, in general we cannot assume to always see the initial
handshake as it can be routed through a different path.
The patch adds two new conntrack states:
SCTP_CONNTRACK_HEARTBEAT_SENT - a HEARTBEAT chunk seen but not acked
SCTP_CONNTRACK_HEARTBEAT_ACKED - a HEARTBEAT acked by HEARTBEAT-ACK
State transition rules:
- HEARTBEAT_SENT responds to usual chunks the same way as NONE (so that
the behaviour changes as little as possible)
- HEARTBEAT_ACKED responds to usual chunks the same way as ESTABLISHED
does, except the resulting state is HEARTBEAT_ACKED rather than
ESTABLISHED
- previously existing states except NONE are preserved when HEARTBEAT or
HEARTBEAT-ACK is seen
- NONE (in the initial direction) changes to HEARTBEAT_SENT on HEARTBEAT
and to CLOSED on HEARTBEAT-ACK
- HEARTBEAT_SENT changes to HEARTBEAT_ACKED on HEARTBEAT-ACK in the
reply direction
- HEARTBEAT_SENT and HEARTBEAT_ACKED are preserved on HEARTBEAT and
HEARTBEAT-ACK otherwise
Normally, vtag is set from the INIT chunk for the reply direction and
from the INIT-ACK chunk for the originating direction (i.e. each of
these defines vtag value for the opposite direction). For secondary
conntracks, we can't rely on seeing INIT/INIT-ACK and even if we have
seen them, we would need to connect two different conntracks. Therefore
simplified logic is applied: vtag of first packet in each direction
(HEARTBEAT in the originating and HEARTBEAT-ACK in reply direction) is
saved and all following packets in that direction are compared with this
saved value. While INIT and INIT-ACK define vtag for the opposite
direction, vtags extracted from HEARTBEAT and HEARTBEAT-ACK are always
for their direction.
Default timeout values for new states are
HEARTBEAT_SENT: 30 seconds (default hb_interval)
HEARTBEAT_ACKED: 210 seconds (hb_interval * path_max_retry + max_rto)
(We cannot expect to see the shutdown sequence so that, unlike
ESTABLISHED, the HEARTBEAT_ACKED timeout shouldn't be too long.)
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
OTG 2.0 introduces bcdOTG in otg descriptor to identify the OTG and EH
supplement release number with which the OTG device is compliant, this
patch adds structure usb_otg20_descriptor for OTG 2.0 and above.
Signed-off-by: Macpaul Lin <macpaul@gmail.com>
Signed-off-by: Li Jun <jun.li@freescale.com>
Reviewed-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
This patch adds names for missing irq types to the trace events.
In order to identify adapter irqs, the define is moved from
interrupt.c to the other basic irq defines in uapi/linux/kvm.h.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Add support for the three ARS DSM commands:
- Query ARS Capabilities - Queries the firmware to check if a given
range supports scrub, and if so, which type (persistent vs. volatile)
- Start ARS - Starts a scrub for a given range/type
- Query ARS Status - Checks status of a previously started scrub, and
provides the error logs if any.
The commands are described by the example DSM spec at:
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Also add these commands to the nfit_test test framework, and return
canned data.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The spec suggests that this is a simple 'length' field, not a mask.
Update the name accordingly.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Other serial driver work wants to build on patches now in 4.2-rc4 so
merge the branch so this can properly happen.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>