Both high-sack detection and new lowest seq variables have
unnecessary zero special case which are now removed by setting
safe initial seqnos.
This also fixes problem which caused zero received_upto being
passed to tcp_mark_lost_retrans which confused after relations
within the marker loop causing incorrect TCPCB_SACKED_RETRANS
clearing. The problem was noticed because of a performance
report from TAKANO Ryousei <takano@axe-inc.co.jp>.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Ryousei Takano <takano-ryousei@aist.go.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
While looking at a net driver with the following construct,
if (!netif_carrier_ok(dev))
netif_carrier_on(dev);
it stuck me that the netif_carrier_ok() check was redundant, since
netif_carrier_on() checks bit __LINK_STATE_NOCARRIER anyway. This is
the same reason why netif_queue_stopped() need not be called prior to
netif_wake_queue().
This is true, but there is however an unwanted side effect from assuming
that netif_carrier_on() can be called multiple times: it touches the
watchdog, regardless of pre-existing carrier state.
The fix: move watchdog-up inside the bit-cleared code path.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix uninitialised variable in ip_frag_reasm(). err should be set to
-ENOMEM if the initial call of skb_clone() fails.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new field to xfrm states called inner_mode. The existing
mode object is renamed to outer_mode.
This is the first part of an attempt to fix inter-family transforms. As it
is we always use the outer family when determining which mode to use. As a
result we may end up shoving IPv4 packets into netfilter6 and vice versa.
What we really want is to use the inner family for the first part of outbound
processing and the outer family for the second part. For inbound processing
we'd use the opposite pairing.
I've also added a check to prevent silly combinations such as transport mode
with inter-family transforms.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Combining RO and AH/ESP/IPCOMP does not make sense. So this patch adds a
check in the state initialisation function to prevent this.
This allows us to safely remove the mode input function of RO since it
can never be called anymore. Indeed, if somehow it does get called we'll
know about it through an OOPS instead of it slipping past silently.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
For IPv4 we were using the bottom route's peer instead of the top one.
This is wrong because the peer is only used by TCP to keep track of
information about the TCP destination address which certainly does not
live in the bottom route.
This patch fixes that which allows us to get rid of the family check
since the bottom route could be IPv6 while the top one must always
be IPv4.
I've also changed the other fields which are IPv4-specific to get the
info from the top route instead of potentially bogus data from the
bottom route.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is convenient to have a pointer from xfrm_state to address-specific
functions such as the output function for a family. Currently the
address-specific policy code calls out to the xfrm state code to get
those pointers when we could get it in an easier way via the state
itself.
This patch adds an xfrm_state_afinfo to xfrm_mode (since they're
address-specific) and changes the policy code to use it. I've also
added an owner field to do reference counting on the module providing
the afinfo even though it isn't strictly necessary today since IPv6
can't be unloaded yet.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently BEET mode does not reinject the packet back into the stack
like tunnel mode does. Since BEET should behave just like tunnel mode
this is incorrect.
This patch fixes this by introducing a flags field to xfrm_mode that
tells the IPsec code whether it should terminate and reinject the packet
back into the stack.
It then sets the flag for BEET and tunnel mode.
I've also added a number of missing BEET checks elsewhere where we check
whether a given mode is a tunnel or not.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The type and mode maps are only used by SAs, not policies. So it makes
sense to move them from xfrm_policy.c into xfrm_state.c. This also allows
us to mark xfrm_get_type/xfrm_put_type/xfrm_get_mode/xfrm_put_mode as
static.
The only other change I've made in the move is to get rid of the casts
on the request_module call for types. They're unnecessary because C
will promote them to ints anyway.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently xfrm_parse_spi requires there to be 16 bytes for AH and ESP.
In contrived cases there may not actually be 16 bytes there since the
respective header sizes are less than that (8 and 12 currently).
This patch changes the test to use the actual header length instead of 16.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not every transform needs to zap ip_summed. For example, a pure tunnel
mode encapsulation does not affect the hardware checksum at all. In fact,
every algorithm (that needs this) other than AH6 already does its own
ip_summed zapping.
This patch moves the zapping into AH6 which is in line with what IPv4 does.
Possible future optimisation: Checksum the data as we copy them in IPComp.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently xfrm6_rcv_spi gets the nexthdr value itself from the packet.
This means that we need to fix up the value in case we have a 4-on-6
tunnel. Moving this logic into the caller simplifies things and allows
us to merge the code with IPv4.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves the tunnel parsing for IPv4 out of xfrm4_input and into
xfrm4_tunnel. This change is in line with what IPv6 does and will allow
us to merge the two input functions.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
I noticed that my recent patch broke 6-on-4 pure IPsec tunnels (the ones
that are only used for incompressible IPsec packets). Subsequent reviews
show that I broke 6-on-6 pure tunnels more than three years ago and nobody
ever noticed. I suppose every must be testing 6-on-6 IPComp with large
pings which are very compressible :)
This patch fixes both cases.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This functions is never called with NULL or not setup argument,
so the checks inside are redundant.
Also, the return value is always -ENOMEM, so no need in
additional variable for this.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The proposed fix is to delay the reference counter decrement
until the quiescent state pass. This will give sk_clone() a
chance to get the reference on the cloned filter.
Regular sk_filter_uncharge can happen from the sk_free() only
and there's no need in delaying the put - the socket is dead
anyway and is to be release itself.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sk_filter_uncharge is called for error handling and
for releasing the former filter, but this will have to
be done in a bit different manner, so cleanup the error
path a bit.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is done merely as a preparation for the fix.
The sk_filter_uncharge() unaccounts the filter memory and calls
the sk_filter_release(), which in turn decrements the refcount
anf frees the filter.
The latter function will be required separately.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Filter is attached in a separate function, so do the
same for filter detaching.
This also removes one variable sock_setsockopt().
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous IW_SCAN_THIS_ESSID patch left a hole allowing scan
requests on interfaces in inappropriate modes.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we now allocate the queues in inet_fragment.c, we
can safely free it in the same place. The ->destructor
callback thus becomes optional for inet_frags.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since this callback is used to check for conflicts in
hashtable when inserting a newly created frag queue, we can
do the same by checking for matching the queue with the
argument, used to create one.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here we need another callback ->match to check whether the
entry found in hash matches the key passed. The key used
is the same as the creation argument for inet_frag_create.
Yet again, this ->match is the same for netfilter and ipv6.
Running a frew steps forward - this callback will later
replace the ->equal one.
Since the inet_frag_find() uses the already consolidated
inet_frag_create() remove the xxx_frag_create from protocol
codes.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>