The IP tunnel hash heads can be embedded in the per-net structure
since it is a fixed size. Reduce the size so that the total structure
fits in a page size. The original size was overly large, even NETDEV_HASHBITS
is only 8 bits!
Also, add some white space for readability.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>.
Signed-off-by: David S. Miller <davem@davemloft.net>
Move refcnt, pref, suppress_ifgroup, suppress_prefixlen out of first
cache line, as they are not used in fast path.
Make sure ctarget & fr_net are in first cache line.
(Assuming 64 bit arches and 64 bytes cache lines)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change brings the suppressor attribute names into line; it also changes
the data types to provide a more consistent interface.
While -1 indicates that the suppressor is not enabled, values >= 0 for
suppress_prefixlen or suppress_ifgroup reject routing decisions violating the
constraint.
This changes the previously presented behaviour of suppress_prefixlen, where a
prefix length _less_ than the attribute value was rejected. After this change,
a prefix length less than *or* equal to the value is considered a violation of
the rule constraint.
It also changes the default values for default and newly added rules (disabling
any suppression for those).
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds the ability to suppress a routing decision based upon the
interface group the selected interface belongs to. This allows it to
exclude specific devices from a routing decision.
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sctp sits on IPv6, sctp_transport_dst_check pass cookie as ZERO,
as a result ip6_dst_check always fail out. This behaviour makes
transport->dst useless, because every sctp_packet_transmit must look
for valid dst.
Add a dst_cookie into sctp_transport, and set the cookie whenever we
get new dst for sctp_transport. So dst validness could be checked
against it.
Since I have split genid for IPv4 and IPv6, also delete/add IPv6 address
will also bump IPv6 genid. So issues we discussed in:
http://marc.info/?l=linux-netdev&m=137404469219410&w=4
have all been sloved for this patch.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's convenient to have ethernet mac addresses use
ETH_ALEN to be able to grep for them a bit easier and
also to ensure that the addresses are __aligned(2).
Add #include <linux/if_ether.h> as necessary.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When CONFIG_NET_LL_RX_POLL is not set, I got:
net/socket.c: In function ‘sock_poll’:
net/socket.c:1165:4: error: implicit declaration of function ‘sk_busy_loop’ [-Werror=implicit-function-declaration]
Fix this by adding a nop when !CONFIG_NET_LL_RX_POLL.
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.
This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.
Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds a new operation to the fib_rules_ops struct; it allows the
suppression of routing decisions if certain criteria are not met by its
results.
The first implemented constraint is a minimum prefix length added to the
structures of routing rules. If a rule is added with a minimum prefix length
>0, only routes meeting this threshold will be considered. Any other (more
general) routing table entries will be ignored.
When configuring a system with multiple network uplinks and default routes, it
is often convinient to reference the main routing table multiple times - but
omitting the default route. Using this patch and a modified "ip" utility, this
can be achieved by using the following command sequence:
$ ip route add table secuplink default via 10.42.23.1
$ ip rule add pref 100 table main prefixlength 1
$ ip rule add pref 150 fwmark 0xA table secuplink
With this setup, packets marked 0xA will be processed by the additional routing
table "secuplink", but only if no suitable route in the main routing table can
be found. By using a minimal prefixlength of 1, the default route (/0) of the
table "main" is hidden to packets processed by rule 100; packets traveling to
destinations with more specific routing entries are processed as usual.
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Reflow modified prototypes to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 547669d483 ("tcp: xps: fix reordering issues") added
unexpected reorders in case netem is used in a MQ setup for high
performance test bed.
ETH=eth0
tc qd del dev $ETH root 2>/dev/null
tc qd add dev $ETH root handle 1: mq
for i in `seq 1 32`
do
tc qd add dev $ETH parent 1:$i netem delay 100ms
done
As all tcp packets are orphaned by netem, TCP stack believes it can
set skb->ooo_okay on all packets.
In order to allow producers to send more packets, we want to
keep sk_wmem_alloc from reaching sk_sndbuf limit.
We can do that by accounting one byte per skb in netem queues,
so that TCP stack is not fooled too much.
Tested:
With above MQ/netem setup, scaling number of concurrent flows gives
linear results and no reorders/retransmits
lpq83:~# for n in 1 10 20 30 40 50 60 70 80 90 100
do echo -n "n:$n " ; ./super_netperf $n -H 10.7.7.84; done
n:1 198.46
n:10 2002.69
n:20 4000.98
n:30 6006.35
n:40 8020.93
n:50 10032.3
n:60 12081.9
n:70 13971.3
n:80 16009.7
n:90 17117.3
n:100 17425.5
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current net name space has only one genid for both IPv4 and IPv6, it has below
drawbacks:
- Add/delete an IPv4 address will invalidate all IPv6 routing table entries.
- Insert/remove XFRM policy will also invalidate both IPv4/IPv6 routing table
entries even when the policy is only applied for one address family.
Thus, this patch attempt to split one genid for two to cater for IPv4 and IPv6
separately in a fine granularity.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>