Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next tree:
1) Support for matching on ipsec policy already set in the route, from
Florian Westphal.
2) Split set destruction into deactivate and destroy phase to make it
fit better into the transaction infrastructure, also from Florian.
This includes a patch to warn on imbalance when setting the new
activate and deactivate interfaces.
3) Release transaction list from the workqueue to remove expensive
synchronize_rcu() from configuration plane path. This speeds up
configuration plane quite a bit. From Florian Westphal.
4) Add new xfrm/ipsec extension, this new extension allows you to match
for ipsec tunnel keys such as source and destination address, spi and
reqid. From Máté Eckl and Florian Westphal.
5) Add secmark support, this includes connsecmark too, patches
from Christian Gottsche.
6) Allow to specify remaining bytes in xt_quota, from Chenbo Feng.
One follow up patch to calm a clang warning for this one, from
Nathan Chancellor.
7) Flush conntrack entries based on layer 3 family, from Kristian Evensen.
8) New revision for cgroups2 to shrink the path field.
9) Get rid of obsolete need_conntrack(), as a result from recent
demodularization works.
10) Use WARN_ON instead of BUG_ON, from Florian Westphal.
11) Unused exported symbol in nf_nat_ipv4_fn(), from Florian.
12) Remove superfluous check for timeout netlink parser and dump
functions in layer 4 conntrack helpers.
13) Unnecessary redundant rcu read side locks in NAT redirect,
from Taehee Yoo.
14) Pass nf_hook_state structure to error handlers, patch from
Florian Westphal.
15) Remove ->new() interface from layer 4 protocol trackers. Place
them in the ->packet() interface. From Florian.
16) Place conntrack ->error() handling in the ->packet() interface.
Patches from Florian Westphal.
17) Remove unused parameter in the pernet initialization path,
also from Florian.
18) Remove additional parameter to specify layer 3 protocol when
looking up for protocol tracker. From Florian.
19) Shrink array of layer 4 protocol trackers, from Florian.
20) Check for linear skb only once from the ALG NAT mangling
codebase, from Taehee Yoo.
21) Use rhashtable_walk_enter() instead of deprecated
rhashtable_walk_init(), also from Taehee.
22) No need to flush all conntracks when only one single address
is gone, from Tan Hu.
23) Remove redundant check for NAT flags in flowtable code, from
Taehee Yoo.
24) Use rhashtable_lookup() instead of rhashtable_lookup_fast()
from netfilter codebase, since rcu read lock side is already
assumed in this path.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns:
net/netfilter/xt_quota.c:47:44: warning: 'aligned' attribute ignored
when parsing type [-Wignored-attributes]
BUILD_BUG_ON(sizeof(atomic64_t) != sizeof(__aligned_u64));
^~~~~~~~~~~~~
Use 'sizeof(__u64)' instead, as the alignment doesn't affect the size
of the type.
Fixes: e9837e55b0 ("netfilter: xt_quota: fix the behavior of xt_quota module")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
A major flaw of the current xt_quota module is that quota in a specific
rule gets reset every time there is a rule change in the same table. It
makes the xt_quota module not very useful in a table in which iptables
rules are changed at run time. This fix introduces a new counter that is
visible to userspace as the remaining quota of the current rule. When
userspace restores the rules in a table, it can restore the counter to
the remaining quota instead of resetting it to the full quota.
Signed-off-by: Chenbo Feng <fengc@google.com>
Suggested-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Only check for the network namespace if the socket is available.
Fixes: f564650106 ("netfilter: check if the socket netns is correct.")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Internally, rhashtable_lookup_fast() calls rcu_read_lock() then,
calls rhashtable_lookup(). so that in places where are guaranteed
by rcu read lock, rhashtable_lookup() is enough.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_flow_offload_{ip/ipv6}_hook() check nat flag then, call
nf_flow_nat_{ip/ipv6} but that also check nat flag. so that
nat flag check code in nf_flow_offload_{ip/ipv6}_hook() are unnecessary.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add ability to set the connection tracking secmark value.
Add ability to set the meta secmark value.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add the ability to set the security context of packets within the nf_tables framework.
Add a nft_object for holding security contexts in the kernel and manipulating packets on the wire.
Convert the security context strings at rule addition time to security identifiers.
This is the same behavior like in xt_SECMARK and offers better performance than computing it per packet.
Set the maximum security context length to 256.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
else we will oops (null deref) when the attributes aren't present.
Also add back the EOPNOTSUPP in case MARK filtering is requested but
kernel doesn't support it.
Fixes: 59c08c69c2 ("netfilter: ctnetlink: Support L3 protocol-filter on flush")
Reported-by: syzbot+e45eda8eda6e93a03959@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
sizeof(sizeof()) is quite strange and does not seem to be what
is wanted here.
The issue is detected with the help of Coccinelle.
Fixes: 3921584674 ("netfilter: conntrack: remove nlattr_size pointer from l4proto trackers")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The function nft_validate_register_store requires a struct of type
struct nft_data_types. NFTA_DATA_VALUE is of type enum
nft_verdict_attributes. Pass the correct enum type.
This fixes a warning seen with Clang:
net/netfilter/nft_osf.c:52:8: warning: implicit conversion from
enumeration type 'enum nft_data_attributes' to different enumeration
type 'enum nft_data_types' [-Wenum-conversion]
NFTA_DATA_VALUE, NFT_OSF_MAXGENRELEN);
^~~~~~~~~~~~~~~
Fixes: b96af92d6e ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Link: https://github.com/ClangBuiltLinux/linux/issues/103
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
rhashtable_walk_init() is deprecated and rhashtable_walk_enter() can be
used instead. rhashtable_walk_init() is wrapper function of
rhashtable_walk_enter() so that logic is actually same.
But rhashtable_walk_enter() doesn't return error hence error path
code can be removed.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
__nf_nat_mangle_tcp_packet() and nf_nat_mangle_udp_packet() call
mangle_contents(). and __nf_nat_mangle_tcp_packet()
and mangle_contents() call skb_is_nonlinear(). so that
skb_is_nonlinear() in __nf_nat_mangle_tcp_packet() is unnecessary.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
All higher l4proto numbers are handled by the generic tracker; the
l4proto lookup function already returns generic one in case the l4proto
number exceeds max size.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
l4 protocols are demuxed by l3num, l4num pair.
However, almost all l4 trackers are l3 agnostic.
Only exceptions are:
- gre, icmp (ipv4 only)
- icmpv6 (ipv6 only)
This commit gets rid of the l3 mapping, l4 trackers can now be looked up
by their IPPROTO_XXX value alone, which gets rid of the additional l3
indirection.
For icmp, ipcmp6 and gre, add a check on state->pf and
return -NF_ACCEPT in case we're asked to track e.g. icmpv6-in-ipv4,
this seems more fitting than using the generic tracker.
Additionally we can kill the 2nd l4proto definitions that were needed
for v4/v6 split -- they are now the same so we can use single l4proto
struct for each protocol, rather than two.
The EXPORT_SYMBOLs can be removed as all these object files are
part of nf_conntrack with no external references.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Its unused, next patch will remove l4proto->l3proto number to simplify
l4 protocol demuxer lookup.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
icmp(v6) are the only two layer four protocols that need the error()
callback (to handle icmp errors that are related to an established
connections, e.g. packet too big, port unreachable and the like).
Remove the error callback and handle these two special cases from the core.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The error() handler gets called before allocating or looking up a
connection tracking entry.
We can instead use direct calls from the ->packet() handlers which get
invoked for every packet anyway.
Only exceptions are icmp and icmpv6, these two special cases will be
handled in the next patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Only two protocols need the ->error() function: icmp and icmpv6.
This is because icmp error mssages might be RELATED to an existing
connection (e.g. PMTUD, port unreachable and the like), and their
->error() handlers do this.
The error callback is already optional, so remove it for
udp and call them from ->packet() instead.
As the error() callback can call checksum functions that write to
skb->csum*, the const qualifier has to be removed as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
->new() gets invoked after ->error() and before ->packet() if
a conntrack lookup has found no result for the tuple.
We can fold it into ->packet() -- the packet() implementations
can check if the conntrack is confirmed (new) or not
(already in hash).
If its unconfirmed, the conntrack isn't in the hash yet so current
skb created a new conntrack entry.
Only relevant side effect -- if packet() doesn't return NF_ACCEPT
but -NF_ACCEPT (or drop), while the conntrack was just created,
then the newly allocated conntrack is freed right away, rather than not
created in the first place.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_hook_state contains all the hook meta-information: netns, protocol family,
hook location, and so on.
Instead of only passing selected information, pass a pointer to entire
structure.
This will allow to merge the error and the packet handlers and remove
the ->new() function in followup patches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>