This patch replaces unnecessary uses of skb_copy, pskb_copy and
skb_realloc_headroom by functions such as skb_make_writable and
pskb_expand_head.
This allows us to remove the double pointers later.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all callers of netfilter can guarantee that the skb is not shared,
we no longer have to copy the skb in skb_make_writable.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
From RFC 3493, Section 5.2:
IPV6_MULTICAST_IF
Set the interface to use for outgoing multicast packets. The
argument is the index of the interface to use. If the
interface index is specified as zero, the system selects the
interface (for example, by looking up the address in a routing
table and using the resulting interface).
This patch adds support for (index == 0) to reset the value to it's
original state, allowing the system to choose the best interface. IPv4
already behaves this way.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As discussed before, this patch provides userland with a way to access
relevant options in Router Advertisements, after they are processed
and validated by the kernel. Extra options are processed in a generic
way; this patch only exports RDNSS options described in RFC5006, but
support to control which options are exported could be easily added.
A new rtnetlink message type is defined, to transport Neighbor
Discovery options, along with optional context information. At the
moment only the address of the router sending an RDNSS option is
included, but additional attributes may be later defined, if needed by
new use cases.
Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch make processing netlink user -> kernel messages synchronious.
This change was inspired by the talk with Alexey Kuznetsov about current
netlink messages processing. He says that he was badly wrong when introduced
asynchronious user -> kernel communication.
The call netlink_unicast is the only path to send message to the kernel
netlink socket. But, unfortunately, it is also used to send data to the
user.
Before this change the user message has been attached to the socket queue
and sk->sk_data_ready was called. The process has been blocked until all
pending messages were processed. The bad thing is that this processing
may occur in the arbitrary process context.
This patch changes nlk->data_ready callback to get 1 skb and force packet
processing right in the netlink_unicast.
Kernel -> user path in netlink_unicast remains untouched.
EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock
drop, but the process remains in the cycle until the message will be fully
processed. So, there is no need to use this kludges now.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expansion of original idea from Denis V. Lunev <den@openvz.org>
Add robustness and locking to the local_port_range sysctl.
1. Enforce that low < high when setting.
2. Use seqlock to ensure atomic update.
The locking might seem like overkill, but there are
cases where sysadmin might want to change value in the
middle of a DoS attack.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves the setting of the IP length and checksum fields out of
the transforms and into the xfrmX_output functions. This would help future
efforts in merging the transforms themselves.
It also adds an optimisation to ipcomp due to the fact that the transport
offset is guaranteed to be zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the duplicate ipv6_{auth,esp,comp}_hdr structures since
they're identical to the IPv4 versions. Duplicating them would only create
problems for ourselves later when we need to add things like extended
sequence numbers.
I've also added transport header type conversion headers for these types
which are now used by the transforms.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IPv6 calling convention for x->mode->output is more general and could
help an eventual protocol-generic x->type->output implementation. This
patch adopts it for IPv4 as well and modifies the IPv4 type output functions
accordingly.
It also rewrites the IPv6 mac/transport header calculation to be based off
the network header where practical.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the calling convention so that on entry from
x->mode->output and before entry into x->type->output skb->data
will point to the payload instead of the IP header.
This is essentially a redistribution of skb_push/skb_pull calls
with the aim of minimising them on the common path of tunnel +
ESP.
It'll also let us use the same calling convention between IPv4
and IPv6 with the next patch.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The beet output function completely kills any extension headers by replacing
them with the IPv6 header. This is because it essentially ignores the
result of ip6_find_1stfragopt by simply acting as if there aren't any
extension headers.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
To judge the timing for DAD, netif_carrier_ok() is used. However,
there is a possibility that dev->qdisc stays noop_qdisc even if
netif_carrier_ok() returns true. In that case, DAD NS is not sent out.
We need to defer the IPv6 device initialization until a valid qdisc
is specified.
Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This concerns the ipv4 and ipv6 code mostly, but also the netlink
and unix sockets.
The netlink code is an example of how to use the __seq_open_private()
call - it saves the net namespace on this private.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch releases the lock on the state before calling x->type->output.
It also adds the lock to the spots where they're currently needed.
Most of those places (all except mip6) are expected to disappear with
async crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current the x->mode->output functions store the IPv6 nh pointer in the
skb network header. This is inconvenient because the network header then
has to be fixed up before the packet can leave the IPsec stack. The mac
header field is unused on output so we can use that to store this instead.
This patch does that and removes the network header fix-up in xfrm_output.
It also uses ipv6_hdr where appropriate in the x->type->output functions.
There is also a minor clean-up in esp4 to make it use the same code as
esp6 to help any subsequent effort to merge the two.
Lastly it kills two redundant skb_set_* statements in BEET that were
simply copied over from transport mode.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ip6_fib.c, fib6_clean_node() casts a fib6_walker_t pointer to
a fib6_cleaner_t pointer assuming a struct fib6_walker_t (field 'w')
is the first field in struct fib6_walker_t.
To prevent any future problems that may occur if one day a field
is inadvertently inserted before the 'w' field in struct fib6_cleaner_t,
(and to improve readability), this patch uses the container_of() macro.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lastused update check in xfrm_output can be done just as well in
the mode output function which is specific to RO.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The replay counter is one of only two remaining things in the output code
that requires a lock on the xfrm state (the other being the crypto). This
patch moves it into the generic xfrm_output so we can remove the lock from
the transforms themselves.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most of the code in xfrm4_output_one and xfrm6_output_one are identical so
this patch moves them into a common xfrm_output function which will live
in net/xfrm.
In fact this would seem to fix a bug as on IPv4 we never reset the network
header after a transform which may upset netfilter later on.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The keys are only used during initialisation so we don't need to carry them
in esp_data. Since we don't have to allocate them again, there is no need
to place a limit on the authentication key length anymore.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The keys are only used during initialisation so we don't need to carry them
in esp_data. Since we don't have to allocate them again, there is no need
to place a limit on the authentication key length anymore.
This patch also kills the unused auth.icv member.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a bunch of sparse warnings. Mostly about 0 used as
NULL pointer, and shadowed variable declarations.
One notable case was that hash size should have been unsigned.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no struct nfattr anymore, rename functions to 'nlattr'.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of the duplicated rtnetlink macros and use the generic netlink
attribute functions. The old duplicated stuff is moved to a new header
file that exists just for userspace.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>