Commit Graph

2514 Commits

Author SHA1 Message Date
Eric Dumazet
29e75252da [IPV4] route cache: Introduce rt_genid for smooth cache invalidation
Current ip route cache implementation is not suited to large caches.

We can consume a lot of CPU when cache must be invalidated, since we
currently need to evict all cache entries, and this eviction is
sometimes asynchronous. min_delay & max_delay can somewhat control this
asynchronism behavior, but whole thing is a kludge, regularly triggering
infamous soft lockup messages. When entries are still in use, this also
consumes a lot of ram, filling dst_garbage.list.

A better scheme is to use a generation identifier on each entry,
so that cache invalidation can be performed by changing the table
identifier, without having to scan all entries.
No more delayed flushing, no more stalling when secret_interval expires.

Invalidated entries will then be freed at GC time (controled by
ip_rt_gc_timeout or stress), or when an invalidated entry is found
in a chain when an insert is done.
Thus we keep a normal equilibrium.

This patch :
- renames rt_hash_rnd to rt_genid (and makes it an atomic_t)
- Adds a new rt_genid field to 'struct rtable' (filling a hole on 64bit)
- Checks entry->rt_genid at appropriate places :
2008-01-31 19:28:27 -08:00
Shan Wei
16ca3f9130 [TCP]: Fix a bug in strategy_allowed_congestion_control
In strategy_allowed_congestion_control of the 2.6.24 kernel, when
sysctl_string return 1 on success,it should call
tcp_set_allowed_congestion_control to set the allowed congestion
control.But, it don't.  the sysctl_string return 1 on success,
otherwise return negative, never return 0.The patch fix the problem.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:23 -08:00
Stephen Hemminger
71d67e666e [IPV4] fib_trie: rescan if key is lost during dump
Normally during a dump the key of the last dumped entry is used for
continuation, but since lock is dropped it might be lost. In that case
fallback to the old counter based N^2 behaviour.  This means the dump
will end up skipping some routes which matches what FIB_HASH does.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:23 -08:00
Pavel Emelyanov
fa4d3c6210 [NETNS]: Udp sockets per-net lookup.
Add the net parameter to udp_get_port family of calls and
udp_lookup one and use it to filter sockets.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:21 -08:00
Pavel Emelyanov
d86e0dac2c [NETNS]: Tcp-v6 sockets per-net lookup.
Add a net argument to inet6_lookup and propagate it further.
Actually, this is tcp-v6 implementation of what was done for
tcp-v4 sockets in a previous patch.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:20 -08:00
Pavel Emelyanov
c67499c0e7 [NETNS]: Tcp-v4 sockets per-net lookup.
Add a net argument to inet_lookup and propagate it further
into lookup calls. Plus tune the __inet_check_established.

The dccp and inet_diag, which use that lookup functions
pass the init_net into them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:19 -08:00
Pavel Emelyanov
941b1d22cc [NETNS]: Make bind buckets live in net namespaces.
This tags the inet_bind_bucket struct with net pointer,
initializes it during creation and makes a filtering
during lookup.

A better hashfn, that takes the net into account is to
be done in the future, but currently all bind buckets
with similar port will be in one hash chain.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:18 -08:00
Pavel Emelyanov
5ee31fc1ec [INET]: Consolidate inet(6)_hash_connect.
These two functions are the same except for what they call
to "check_established" and "hash" for a socket.

This saves half-a-kilo for ipv4 and ipv6.

 add/remove: 1/0 grow/shrink: 1/4 up/down: 582/-1128 (-546)
 function                                     old     new   delta
 __inet_hash_connect                            -     577    +577
 arp_ignore                                   108     113      +5
 static.hint                                    8       4      -4
 rt_worker_func                               376     372      -4
 inet6_hash_connect                           584      25    -559
 inet_hash_connect                            586      25    -561

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:17 -08:00
Patrick McHardy
969d71089f [NETFILTER]: nf_nat: fix sparse warning
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:15 -08:00
Patrick McHardy
c392a74018 [NETFILTER]: {ip,ip6}_queue: fix build error
Reported by Ingo Molnar:

 net/built-in.o: In function `ip_queue_init':
 ip_queue.c:(.init.text+0x322c): undefined reference to `net_ipv4_ctl_path'

Fix the build error and also handle CONFIG_PROC_FS=n properly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:14 -08:00
Jan Engelhardt
32948588ac [NETFILTER]: nf_conntrack: annotate l3protos with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:13 -08:00
Jan Engelhardt
7cc3864d39 [NETFILTER]: nf_{conntrack,nat}_icmp: constify and annotate
Constify a few data tables use const qualifiers on variables where
possible in the nf_conntrack_icmp* sources.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:12 -08:00
Jan Engelhardt
dc35dc5a4c [NETFILTER]: nf_{conntrack,nat}_proto_gre: annotate with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:12 -08:00
Jan Engelhardt
da3f13c95a [NETFILTER]: nf_{conntrack,nat}_proto_udp{,lite}: annotate with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:11 -08:00
Jan Engelhardt
82f568fc2f [NETFILTER]: nf_{conntrack,nat}_proto_tcp: constify and annotate TCP modules
Constify a few data tables use const qualifiers on variables where
possible in the nf_*_proto_tcp sources.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:10 -08:00
Jan Engelhardt
9ddd0ed050 [NETFILTER]: nf_{conntrack,nat}_pptp: annotate PPtP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:09 -08:00
Jan Engelhardt
de24b4ebb8 [NETFILTER]: nf_{conntrack,nat}_tftp: annotate TFTP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:08 -08:00
Jan Engelhardt
13f7d63c29 [NETFILTER]: nf_{conntrack,nat}_sip: annotate SIP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:08 -08:00
Jan Engelhardt
905e3e8ec5 [NETFILTER]: nf_conntrack_h323: constify and annotate H.323 helper
Constify data tables (predominantly in nf_conntrack_h323_types.c, but
also a few in nf_conntrack_h323_asn1.c) and use const qualifiers on
variables where possible in the h323 sources.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:07 -08:00
Alexey Dobriyan
3cb609d57c [NETFILTER]: x_tables: create per-netns /proc/net/*_tables_*
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:06 -08:00
Ilpo Järvinen
a38201e3c9 [NETFILTER]: ipt_CLUSTERIP: kill clusterip_config_entry_get
It's unused static inline.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:02 -08:00
Patrick McHardy
02502f6224 [NETFILTER]: nf_nat: switch rwlock to spinlock
Since we're using RCU, all users of nf_nat_lock take a write_lock.
Switch it to a spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:00 -08:00
Patrick McHardy
4d354c5782 [NETFILTER]: nf_nat: use RCU for bysource hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:00 -08:00
Patrick McHardy
c88130bcd5 [NETFILTER]: nf_conntrack: naming unification
Rename all "conntrack" variables to "ct" for more consistency and
avoiding some overly long lines.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:59 -08:00
Patrick McHardy
76507f69c4 [NETFILTER]: nf_conntrack: use RCU for conntrack hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:54 -08:00