Commit Graph

1475 Commits

Author SHA1 Message Date
Herbert Xu
e40b328615 [IPSEC]: Forbid BEET + ipcomp for now
While BEET can theoretically work with IPComp the current code can't
do that because it tries to construct a BEET mode tunnel type which
doesn't (and cannot) exist.  In fact as it is it won't even attach a
tunnel object at all for BEET which is bogus.

To support this fully we'd also need to change the policy checks on
input to recognise a plain tunnel as a legal variant of an optional
BEET transform.

This patch simply fails such constructions for now.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:43 -08:00
Herbert Xu
25ee3286dc [IPSEC]: Merge common code into xfrm_bundle_create
Half of the code in xfrm4_bundle_create and xfrm6_bundle_create are
common.  This patch extracts that logic and puts it into
xfrm_bundle_create.  The rest of it are then accessed through afinfo.

As a result this fixes the problem with inter-family transforms where
we treat every xfrm dst in the bundle as if it belongs to the top
family.

This patch also fixes a long-standing error-path bug where we may free
the xfrm states twice.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:43 -08:00
Herbert Xu
66cdb3ca27 [IPSEC]: Move flow construction into xfrm_dst_lookup
This patch moves the flow construction from the callers of
xfrm_dst_lookup into that function.  It also changes xfrm_dst_lookup
so that it takes an xfrm state as its argument instead of explicit
addresses.

This removes any address-specific logic from the callers of
xfrm_dst_lookup which is needed to correctly support inter-family
transforms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:42 -08:00
Herbert Xu
f04e7e8d7f [IPSEC]: Replace x->type->{local,remote}_addr with flags
The functions local_addr and remote_addr are more than what they're
needed for.  The same thing can be done easily with flags on the type
object.  This patch does that and simplifies the wrapper functions in
xfrm6_policy accordingly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:41 -08:00
Herbert Xu
fff6938880 [IPSEC]: Make sure idev is consistent with dev in xfrm_dst
Previously we took the device from the bottom route and idev from the
top route.  This is bad because idev may well point to a different
device.  This patch changes it so that we get the idev from the device
directly.

It also makes it an error if either dev or idev is NULL.  This is
consistent with the rest of the routing code which also treats these
cases as errors.

I've removed the err initialisation in xfrm6_policy.c because it
achieves no purpose and hid a bug when an initial version of this
patch neglected to set err to -ENODEV (fortunately the IPv4 version
warned about it).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:40 -08:00
Herbert Xu
45ff5a3f9a [IPSEC]: Set dst->input to dst_discard
The input function should never be invoked on IPsec dst objects.  This
is because we don't apply IPsec on input until after we've made the
routing decision.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:40 -08:00
Herbert Xu
8ce68ceb55 [IPSEC]: Only set neighbour on top xfrm dst
The neighbour field is only used by dst_confirm which only ever happens on
the top-most xfrm dst.  So it's a waste to duplicate for every other xfrm
dst.  This patch moves its setting out of the loop so that only the top one
gets set.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:39 -08:00
Herbert Xu
352e512c32 [NET]: Eliminate duplicate copies of dst_discard
We have a number of copies of dst_discard scattered around the place
which all do the same thing, namely free a packet on the input or
output paths.

This patch deletes all of them except dst_discard and points all the
users to it.

The only non-trivial bit is decnet where it returns an error.
However, conceptually this is identical to the blackhole functions
used in IPv4 and IPv6 which do not return errors.  So they should
either all return errors or all return zero.  For now I've stuck with
the majority and picked zero as the return value.

It doesn't really matter in practice since few if any driver would
react differently depending on a zero return value or NET_RX_DROP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:37 -08:00
Herbert Xu
b4ce92775c [IPV6]: Move nfheader_len into rt6_info
The dst member nfheader_len is only used by IPv6.  It's also currently
creating a rather ugly alignment hole in struct dst.  Therefore this patch
moves it from there into struct rt6_info.

It also reorders the fields in rt6_info to minimize holes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:37 -08:00
Herbert Xu
0148894223 [IPV6]: Only set nfheader_len for top xfrm dst
We only need to set nfheader_len in the top xfrm dst.  This is because
we only ever read the nfheader_len from the top xfrm dst.

It is also easier to count nfheader_len as part of header_len which
then lets us remove the ugly wrapper functions for incrementing and
decrementing header lengths in xfrm6_policy.c.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:35 -08:00
Pavel Emelyanov
b24b8a247f [NET]: Convert init_timer into setup_timer
Many-many code in the kernel initialized the timer->function
and  timer->data together with calling init_timer(timer). There
is already a helper for this. Use it for networking code.

The patch is HUGE, but makes the code 130 lines shorter
(98 insertions(+), 228 deletions(-)).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:35 -08:00
Wang Chen
a92aa318b4 [IPV6]: Add raw6 drops counter.
Add raw drops counter for IPv6 in /proc/net/raw6 .

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:34 -08:00
Jens Axboe
a0974dd3da [TCP] splice: add tcp_splice_read() to IPV6
Thanks to YOSHIFUJI Hideaki for the hint!

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:32 -08:00
Rolf Manderscheid
a9e527e3f9 IPoIB: improve IPv4/IPv6 to IB mcast mapping functions
An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs.  The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local, and they also leave the partition key field
uninitialised.  This patch adds a parameter (the link-level broadcast
address) to the mapping routines, allowing them to initialise both the
scope and the P_Key appropriately, and fixes up the call sites.

The next step will be to add a way to configure the scope for an IPoIB
interface.

Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25 14:15:37 -08:00
Herbert Xu
f945fa7ad9 [INET]: Fix truesize setting in ip_append_data
As it is ip_append_data only counts page fragments to the skb that
allocated it.  As such it means that the first skb gets hit with a
4K charge even though it might have only used a fraction of it while
all subsequent skb's that use the same page gets away with no charge
at all.

This bug was exposed by the UDP accounting patch.

[ The wmem_alloc bumping needs to be moved with the truesize,
  noticed by Takahiro Yasui.  -DaveM ]

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-23 03:11:43 -08:00
Wang Chen
fa95c28322 [IPV6]: RFC 2011 compatibility broken
The snmp6 entry name was changed, and it broke compatibility
to RFC 2011.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-21 03:05:43 -08:00
Wang Chen
c964ff4ffb [IPV6]: ICMP6_MIB_OUTMSGS increment duplicated
icmpv6_send() calls ip6_push_pending_frames() indirectly.
Both ip6_push_pending_frames() and icmpv6_send() increment
counter ICMP6_MIB_OUTMSGS.

This patch remove the increment from icmpv6_send.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-21 03:05:20 -08:00
YOSHIFUJI Hideaki
398bcbebb6 [IPV6] ROUTE: Make sending algorithm more friendly with RFC 4861.
We omit (or delay) sending NSes for known-to-unreachable routers (in
NUD_FAILED state) according to RFC 4191 (Default Router Preferences
and More-Specific Routes).  But this is not fully compatible with RFC
4861 (Neighbor Discovery Protocol for IPv6), which does not remember
unreachability of neighbors.

So, let's avoid mixing sending algorithm of RFC 4191 and that of RFC
4861, and make the algorithm more friendly with RFC 4861 if RFC 4191
is disabled.

Issue was found by IPv6 Ready Logo Core Self_Test 1.5.0b2 (by TAHI
Project), and has been tracked down by Mitsuru Chinen
<mitch@linux.vnet.ibm.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20 20:31:40 -08:00
Pavel Emelyanov
b3652b2dc5 [IPV6]: Mischecked tw match in __inet6_check_established.
When looking for a conflicting connection the !sk->sk_bound_dev_if
check is performed only for live sockets, but not for timewait-ed.

This is not the case for ipv4, for __inet6_lookup_established in
both ipv4 and ipv6 and for other places that check for tw-s.

Was this missed accidentally? If so, then this patch fixes it and
besides makes use if the dif variable declared in the function.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20 20:31:36 -08:00
Yasuyuki Kozakai
8f41f01786 [NETFILTER]: ip6t_eui64: Fixes calculation of Universal/Local bit
RFC2464 says that the next to lowerst order bit of the first octet
of the Interface Identifier is formed by complementing
the Universal/Local bit of the EUI-64. But ip6t_eui64 uses OR not XOR.

Thanks Peter Ivancik for reporing this bug and posting a patch
for it.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-10 22:40:39 -08:00
Brian Haley
1ac4f00885 [IPV6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()
Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08 23:52:21 -08:00
Joe Perches
bea8519547 [IPV6]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-20 14:01:35 -08:00
Wei Yongjun
cf6fc4a924 [IPV6]: Fix the return value of ipv6_getsockopt
If CONFIG_NETFILTER if not selected when compile the kernel source code, 
ipv6_getsockopt will returen an EINVAL error if optname is not supported by
the kernel. But if CONFIG_NETFILTER is selected, ENOPROTOOPT error will 
be return.

This patch fix to always return ENOPROTOOPT error if optname argument of 
ipv6_getsockopt is not supported by the kernel.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-16 13:39:57 -08:00
Thomas Graf
95a02cfd4d [IPv6] ESP: Discard dummy packets introduced in rfc4303
RFC4303 introduces dummy packets with a nexthdr value of 59
to implement traffic confidentiality. Such packets need to
be dropped silently and the payload may not be attempted to
be parsed as it consists of random chunk.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-11 02:45:27 -08:00
YOSHIFUJI Hideaki
1df2e44560 [IPV6] XFRM: Fix auditing rt6i_flags; use RTF_xxx flags instead of RTCF_xxx.
RTCF_xxx flags, defined in include/linux/in_route.h) are available for
IPv4 route (rtable) entries only.  Use RTF_xxx flags instead, defined
in include/linux/ipv6_route.h, for IPv6 route entries (rt6_info).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-11 02:45:24 -08:00