Commit Graph

691857 Commits

Author SHA1 Message Date
Geliang Tang 46b20c38f3 netfilter: use audit_log()
Use audit_log() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-19 13:09:31 +02:00
Taehee Yoo 166327d79d netfilter: remove prototype of netfilter_queue_init
netfilter_queue_init() has been removed,
so its prototype can be removed as well.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-19 13:08:25 +02:00
Taehee Yoo a2acc54340 netfilter: connlimit: merge root4 and root6.
The root4 variable is used only when the connlimit extension module has been
stored by the iptables command, and the root6 variable is used only when the
connlimit extension module has been stored by the ip6tables command.
So the root4 and root6 variables are never used at the same time.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-19 13:07:53 +02:00
Julia Lawall 549d2d41c1 netfilter: constify nf_loginfo structures
The nf_loginfo structures are only passed as the seventh argument to
nf_log_trace, which is declared as const or stored in a local const
variable.  Thus the nf_loginfo structures themselves can be const.

Done with the help of Coccinelle.

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct nf_loginfo i@p = { ... };

@ok1@
identifier r.i;
expression list[6] es;
position p;
@@
 nf_log_trace(es,&i@p,...)

@ok2@
identifier r.i;
const struct nf_loginfo *e;
position p;
@@
 e = &i@p

@bad@
position p != {r.p,ok1.p,ok2.p};
identifier r.i;
struct nf_loginfo e;
@@
e@i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct nf_loginfo i = { ... };
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-02 14:25:59 +02:00
Julia Lawall 2a04aabf5c netfilter: constify nf_conntrack_l3/4proto parameters
When a nf_conntrack_l3/4proto parameter is not on the left hand side
of an assignment, its address is not taken, and it is not passed to a
function that may modify its fields, then it can be declared as const.

This change is useful from a documentation point of view, and can
possibly facilitate making some nf_conntrack_l3/4proto structures const
subsequently.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-02 14:25:57 +02:00
Taehee Yoo 5b9ccdcb98 netfilter: xtables: Remove unused variable in compat_copy_entry_from_user()
The target variable is not used in compat_copy_entry_from_user(),
so it can be removed.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-02 14:25:50 +02:00
Florian Westphal 4d3a57f23d netfilter: conntrack: do not enable connection tracking unless needed
Discussion during NFWS 2017 in Faro has shown that the current
conntrack behaviour is unreasonable.

Even if the conntrack module is loaded on behalf of a single net namespace,
it's turned on for all namespaces, which is expensive.  Commit
481fa37347 ("netfilter: conntrack: add nf_conntrack_default_on sysctl")
attempted to provide an alternative to the 'default on' behaviour by
adding a sysctl to change it.

However, as Eric points out, the sysctl only becomes available
once the module is loaded, and then it's too late.

So we either have to move the sysctl to the core, or, alternatively,
change conntrack to become active only once the rule set requires this.

This does the latter: conntrack is only enabled when a rule needs it.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:42:00 +02:00
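
The "only enabled when a rule needs it" behaviour described in the conntrack commit above amounts to per-namespace reference counting. A minimal userspace sketch, assuming hypothetical names (`struct ns_tracking`, `tracking_get()`, `tracking_put()`) that are not taken from the kernel source:

```c
#include <stdbool.h>

/* Hypothetical per-namespace state; not the kernel's data structure. */
struct ns_tracking {
    unsigned int users;   /* number of rules that need tracking */
    bool hooks_active;    /* stands in for "hooks are registered" */
};

/* Called when a rule that needs tracking is added to this namespace. */
void tracking_get(struct ns_tracking *ns)
{
    if (ns->users++ == 0)
        ns->hooks_active = true;   /* first user: turn tracking on */
}

/* Called when such a rule is removed again. */
void tracking_put(struct ns_tracking *ns)
{
    if (ns->users > 0 && --ns->users == 0)
        ns->hooks_active = false;  /* last user gone: turn tracking off */
}
```

A namespace whose rule set never calls `tracking_get()` keeps `hooks_active` false, so it pays nothing for tracking that other namespaces use.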
Florian Westphal 9b7e26aee7 netfilter: nft_set_rbtree: use seqcount to avoid lock in most cases
Switch to lockless lookup. The write side now also increments a sequence
counter.  On lookup, sample the counter value and only take the lock
if we did not find a match and the counter has changed.

This avoids the need to write to the private area in normal (lookup) cases.

In case we detect a writer (seqretry is true) we fall back to taking
the readlock.

The readlock is also used during dumps to ensure we get a consistent
tree walk.

Similar technique (rbtree+seqlock) was used by David Howells in rxrpc.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:41:59 +02:00
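
The lookup scheme from the nft_set_rbtree commit above can be sketched in userspace. This is an illustration under stated assumptions, not the kernel code: C11 atomics stand in for the kernel's seqcount and rwlock, a flat array stands in for the rbtree, and every name (`struct seq_set`, `seq_set_insert()`, `seq_set_lookup()`) is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SET_CAPACITY 8

/* Hypothetical set; a flat array stands in for the rbtree. */
struct seq_set {
    atomic_uint seq;             /* even: stable, odd: write in progress */
    atomic_flag lock;            /* simple spinlock, stands in for the rwlock */
    atomic_size_t len;
    atomic_int keys[SET_CAPACITY];
    unsigned int locked_lookups; /* slow-path counter, for illustration only */
};

static void set_lock(struct seq_set *s)
{
    while (atomic_flag_test_and_set(&s->lock))
        ; /* spin until the lock is free */
}

static void set_unlock(struct seq_set *s)
{
    atomic_flag_clear(&s->lock);
}

void seq_set_init(struct seq_set *s)
{
    atomic_init(&s->seq, 0);
    atomic_flag_clear(&s->lock);
    atomic_init(&s->len, 0);
    s->locked_lookups = 0;
}

/* Writer: take the lock and bump the counter before and after the update. */
bool seq_set_insert(struct seq_set *s, int key)
{
    bool ok = false;

    set_lock(s);
    atomic_fetch_add(&s->seq, 1);          /* counter is now odd */
    size_t n = atomic_load(&s->len);
    if (n < SET_CAPACITY) {
        atomic_store(&s->keys[n], key);
        atomic_store(&s->len, n + 1);
        ok = true;
    }
    atomic_fetch_add(&s->seq, 1);          /* counter is even again */
    set_unlock(s);
    return ok;
}

static bool scan(struct seq_set *s, int key)
{
    size_t n = atomic_load(&s->len);

    for (size_t i = 0; i < n; i++)
        if (atomic_load(&s->keys[i]) == key)
            return true;
    return false;
}

/* Reader: try without the lock, fall back to the lock only on a suspect miss. */
bool seq_set_lookup(struct seq_set *s, int key)
{
    unsigned int seq = atomic_load(&s->seq);
    bool found = scan(s, key);

    /* Accept a hit, or a miss during which no writer was active. */
    if (found || (!(seq & 1) && atomic_load(&s->seq) == seq))
        return found;

    /* A writer may have interfered: repeat the lookup under the lock. */
    set_lock(s);
    s->locked_lookups++;
    found = scan(s, key);
    set_unlock(s);
    return found;
}
```

In this sketch the reader writes to shared state only on the slow path, which is the property the commit message highlights for the normal lookup case.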
Phil Sutter 6150957521 netfilter: nf_tables: Allow object names of up to 255 chars
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:41:59 +02:00
Phil Sutter 387454901b netfilter: nf_tables: Allow set names of up to 255 chars
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:41:58 +02:00
Phil Sutter b7263e071a netfilter: nf_tables: Allow chain name of up to 255 chars
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:41:57 +02:00
Phil Sutter e46abbcc05 netfilter: nf_tables: Allow table names of up to 255 chars
Allocate all table names dynamically to allow for arbitrary lengths but
introduce NFT_NAME_MAXLEN as an upper sanity boundary. Its value was
chosen to allow using a domain name as per RFC 1035.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:41:57 +02:00
Phil Sutter 2cf0c8b3e6 netlink: Introduce nla_strdup()
This is similar to strdup() for netlink string attributes.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 20:41:50 +02:00
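
The idea behind nla_strdup() and the name-length commits above can be illustrated with a userspace analogue. Everything here is a sketch under assumptions: `struct attr` and `attr_strdup()` are hypothetical, `NAME_MAXLEN` is assumed to be 256 (255 characters plus the terminating NUL), and the length check is folded into the helper only for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption: 255 characters plus the terminating NUL byte. */
#define NAME_MAXLEN 256

/* Hypothetical attribute: a length-delimited payload that may or may not
 * carry a trailing NUL byte. */
struct attr {
    const char *data;
    size_t len;
};

/* Duplicate the payload into a freshly allocated, NUL-terminated string.
 * Returns NULL if the name is too long or the allocation fails. */
char *attr_strdup(const struct attr *a)
{
    size_t len = a->len;
    char *dst;

    /* Ignore one trailing NUL so both encodings yield the same string. */
    if (len > 0 && a->data[len - 1] == '\0')
        len--;
    if (len >= NAME_MAXLEN)
        return NULL;

    dst = malloc(len + 1);
    if (dst != NULL) {
        memcpy(dst, a->data, len);
        dst[len] = '\0';
    }
    return dst;
}
```

The caller owns the returned string and releases it with `free()`, which is what makes names of arbitrary length (up to the bound) practical compared with a fixed-size array.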
Phil Sutter 6e692678d7 netfilter: nf_tables: No need to check chain existence when tracing
nft_trace_notify() is called only from __nft_trace_packet(), which
assigns its parameter 'chain' to info->chain. __nft_trace_packet() in
turn later dereferences 'chain' unconditionally, which indicates that
it's never NULL. So does nft_do_chain(), the only user of the tracing
infrastructure. Hence it is safe to assume the check removed here is not
needed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:14:05 +02:00
Florian Westphal 591bb2789b netfilter: nf_hook_ops structs can be const
We no longer place these on a list so they can be const.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:10:44 +02:00
Florian Westphal 5da773a3e8 netfilter: nfnetlink_queue: don't queue dying conntracks to userspace
When skb is queued to userspace it leaves softirq/rcu protection.
skb->nfct (via conntrack extensions such as helper) could then reference
modules that no longer exist if the conntrack was not yet confirmed.

nf_ct_iterate_destroy() will set the DYING bit for unconfirmed
conntracks, we therefore solve this race as follows:

1. take the queue spinlock.
2. check if the conntrack is unconfirmed and has dying bit set.
   In this case, we must discard skb while we're still inside
   rcu read-side section.
3. If nf_ct_iterate_destroy() is called right after the packet is queued
   to userspace, it will be removed from the queue via
   nf_ct_iterate_destroy -> nf_queue_nf_hook_drop.

When userspace sends the verdict (nfnetlink takes rcu read lock), there
are two cases to consider:

1. nf_ct_iterate_destroy() was called while packet was out.
   In this case, skb will have been removed from the queue already
   and no reinject takes place as we won't find a matching entry for the
   packet id.

2. nf_ct_iterate_destroy() gets called right after verdict callback
   found and removed the skb from queue list.

   In this case, skb->nfct is marked as dying but it is still valid.
   The skb will be dropped either in nf_conntrack_confirm (we don't
   insert DYING conntracks into hash table) or when we try to queue
   the skb again, but either events don't occur before the rcu read lock
   is dropped.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:09:39 +02:00
Florian Westphal e2a750070a netfilter: conntrack: destroy functions need to free queued packets
queued skbs might be using conntrack extensions that are being removed,
such as timeout.  This happens for skbs that have a skb->nfct in
unconfirmed state (i.e., not in hash table yet).

This is destructive, but there are only two use cases:
 - module removal (rare)
 - netns cleanup (most likely no conntracks exist, and if they do,
   they are removed anyway later on).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:09:39 +02:00
Florian Westphal 84657984c2 netfilter: add and use nf_ct_unconfirmed_destroy
This also removes __nf_ct_unconfirmed_destroy() call from
nf_ct_iterate_cleanup_net, so that function can be used only
when missing conntracks from unconfirmed list isn't a problem.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:09:39 +02:00
Florian Westphal ac7b848390 netfilter: expect: add and use nf_ct_expect_iterate helpers
We have several spots that open-code an expect walk; add a helper
that is similar to nf_ct_iterate_destroy/nf_ct_iterate_cleanup.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:09:38 +02:00
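
The kind of helper the expect commit above describes, one walk that takes a callback instead of several open-coded loops, might look like the following userspace sketch. The list type and all names (`struct expect`, `expect_push()`, `expect_iterate_destroy()`, `match_helper()`) are hypothetical and much simpler than the kernel's expectation table.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical expectation entry in a singly linked list. */
struct expect {
    int helper_id;
    struct expect *next;
};

/* Prepend a new entry; returns the new head (or the old one on failure). */
struct expect *expect_push(struct expect *head, int helper_id)
{
    struct expect *e = malloc(sizeof(*e));

    if (e == NULL)
        return head;
    e->helper_id = helper_id;
    e->next = head;
    return e;
}

/* Walk the list once and free every entry for which iter() returns true.
 * Returns the number of removed entries. */
size_t expect_iterate_destroy(struct expect **head,
                              bool (*iter)(struct expect *e, void *data),
                              void *data)
{
    struct expect **link = head;
    size_t removed = 0;

    while (*link != NULL) {
        struct expect *e = *link;

        if (iter(e, data)) {
            *link = e->next;   /* unlink before freeing */
            free(e);
            removed++;
        } else {
            link = &e->next;
        }
    }
    return removed;
}

/* Example predicate: match entries that belong to a given helper. */
bool match_helper(struct expect *e, void *data)
{
    return e->helper_id == *(int *)data;
}
```

Each caller then supplies only its own predicate, so the unlink-and-free logic lives in one place.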
subashab@codeaurora.org a232cd0e0c netfilter: conntrack: Change to deferable work queue
A delayed workqueue causes wakeups of idle CPUs. This was
causing a power impact for devices. Use a deferrable work
queue instead so that gc_worker runs only when the CPU is active.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:03:50 +02:00
Pablo M. Bermudo Garay 6392c22603 netfilter: nf_tables: add fib expression to the netdev family
Add fib expression support for netdev family. Like inet family, netdev
delegates the actual decision to the corresponding backend, either ipv4
or ipv6.

This allows performing very early reverse path filtering, among other
things.

You can find more information about fib expression in the f6d0cbcf09
("<netfilter: nf_tables: add fib expression>") commit message.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:01:40 +02:00
Pablo M. Bermudo Garay f347ec852c netfilter: nf_tables: fib: use skb_header_pointer
This is a preparatory patch for adding fib support to the netdev family.

The netdev family receives the packets from ingress hook. At this point
we have no guarantee that the ip header is linear. So this patch
replaces ip_hdr with skb_header_pointer in order to address that
possible situation.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-31 19:01:39 +02:00
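
Why a header accessor that tolerates non-linear data matters, as in the fib commit above, can be shown with a userspace analogue. This is only a sketch: `struct packet` and `header_pointer()` are hypothetical, and a packet is modelled as a linear head plus one extra fragment.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical packet: a linear head followed by one non-linear fragment. */
struct packet {
    const unsigned char *head;
    size_t head_len;
    const unsigned char *frag;
    size_t frag_len;
};

/* Return a pointer to len bytes starting at offset off.
 * If the range lies entirely in the linear head, point straight into it.
 * Otherwise assemble the bytes in scratch (at least len bytes) and return that.
 * Returns NULL if the range does not fit inside the packet. */
const void *header_pointer(const struct packet *p, size_t off, size_t len,
                           void *scratch)
{
    size_t total = p->head_len + p->frag_len;
    unsigned char *out = scratch;
    size_t from_head = 0;
    size_t frag_off = 0;

    if (off > total || len > total - off)
        return NULL;
    if (off + len <= p->head_len)
        return p->head + off;              /* fast path: no copy needed */

    if (off < p->head_len) {
        from_head = p->head_len - off;     /* bytes still in the head */
        memcpy(out, p->head + off, from_head);
    } else {
        frag_off = off - p->head_len;      /* range starts in the fragment */
    }
    memcpy(out + from_head, p->frag + frag_off, len - from_head);
    return scratch;
}
```

A direct cast of the head pointer would read past the linear area whenever the header straddles the boundary; the accessor either avoids the copy or makes it explicit.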
Phil Sutter 784b4e612d netfilter: nf_tables: Attach process info to NFT_MSG_NEWGEN notifications
This is helpful for 'nft monitor' to track which process caused a given
change to the ruleset.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-24 13:25:07 +02:00
Taehee Yoo 0b35f6031a netfilter: Remove duplicated rcu_read_lock.
This patch removes duplicate rcu_read_lock().

1. IPVS part:

According to Julian Anastasov, the ipvs contexts are described
at: http://marc.info/?l=netfilter-devel&m=149562884514072&w=2, in summary:

 - packet RX/TX: does not need locks because packets come from hooks.
 - sync msg RX: backup server uses RCU locks while registering new
   connections.
 - ip_vs_ctl.c: configuration get/set, RCU locks needed.
 - xt_ipvs.c: It is a netfilter match, running from hook context.

As a result, rcu_read_lock and rcu_read_unlock can be removed from:

 - ip_vs_core.c: all
 - ip_vs_ctl.c:
   - only from ip_vs_has_real_service
 - ip_vs_ftp.c: all
 - ip_vs_proto_sctp.c: all
 - ip_vs_proto_tcp.c: all
 - ip_vs_proto_udp.c: all
 - ip_vs_xmit.c: all (contains only packet processing)

2. Netfilter part:

There are three types of functions that are guaranteed to run under rcu_read_lock().
First, functions that are only called by nf_hook():

 - nf_conntrack_broadcast_help(), pptp_expectfn(), set_expected_rtp_rtcp().
 - tcpmss_reverse_mtu(), tproxy_laddr4(), tproxy_laddr6().
 - match_lookup_rt6(), check_hlist(), hashlimit_mt_common().
 - xt_osf_match_packet().

Second, functions whose caller already holds the rcu_read_lock():
 - destroy_conntrack(), ctnetlink_conntrack_event().
 - ctnl_timeout_find_get(), nfqnl_nf_hook_drop().

Third, functions that mix type 1 and type 2.

These functions are called by nf_hook() and also by
ordinary functions that already hold the rcu_read_lock():

 - __ctnetlink_glue_build(), ctnetlink_expect_event().
 - ctnetlink_proto_size().

The affected files are:

- nf_conntrack_broadcast.c, nf_conntrack_core.c, nf_conntrack_netlink.c.
- nf_conntrack_pptp.c, nf_conntrack_sip.c, nfnetlink_cttimeout.c.
- nfnetlink_queue.c, xt_TCPMSS.c, xt_TPROXY.c, xt_addrtype.c.
- xt_connlimit.c, xt_hashlimit.c, xt_osf.c

A detailed calltrace can be found at:
http://marc.info/?l=netfilter-devel&m=149667610710350&w=2

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-24 13:24:46 +02:00
Pablo Neira Ayuso 9f08ea8481 netfilter: nf_tables: keep chain counters away from hot path
These chain counters are only used by the iptables-compat tool, which
allows users to use the x_tables extensions from the existing nf_tables
framework. This patch makes nf_tables ~5% faster for the general use case,
i.e. native nft users, where no chain counters are used at all.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-24 12:23:16 +02:00