Having walked through the entire skbuff, skb_seq_read would leave the
last fragment mapped. As a consequence, the unwary caller would leak
kmaps, and proceed with preempt_count off by one. The only (kind of
non-intuitive) workaround is to use skb_seq_read_abort.
This patch makes sure skb_seq_read always unmaps frag_data after
having cycled through the skb's paged part.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This moves the local_irq_enable() call in net_rx_action() to before
calling the CONFIG_NET_DMA's dma_async_memcpy_issue_pending() rather
than after. This shortens the irq disabled window and allows for DMA
drivers that need to do their own irq hold.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_read_sock() currently assumes that the recv_actor() only returns
number of bytes copied. For network splice receive, we may have to
return an error in some cases. So allow the actor to return a negative
error value.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tipc netlink config handler uses the nlmsg_pid from the
request header as destination for its reply. If the application
initialized nlmsg_pid to 0, the reply is looped back to the kernel,
causing hangup. Fix: use nlmsg_pid of the skb that triggered the
request.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no realistic situation to change helper (Who wants IRC helper to
track FTP traffic ?). Moreover, if we want to do that, we need to fix race
issue by nfctnetlink and running helper. That will add overhead to packet
processing. It wouldn't pay. So this rejects the request to change
helper. The requests to add or remove helper are accepted as ever.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return the number of bytes buffered in rxrpc_send_data().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_vs currently fails to reset its ip_vs_sync_state variable if the
sync thread fails to start properly. The result is that the kernel
will report a running daemon when their actuall is none.
If you issue the following commands:
1. ipvsadm --start-daemon master --mcast-interface bla
2. ipvsadm -L --daemon
3. ipvsadm --stop-daemon master
Assuming that bla is not an actual interface, step 2 should return no
data, but instead returns:
$ ipvsadm -L --daemon
master sync daemon (mcast=bla, syncid=0)
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My IPsec MTU optimization patch introduced a regression in MTU calculation
for non-ESP SAs, the SA's header_len needs to be subtracted from the MTU if
the transform doesn't provide a ->get_mtu() function.
Reported-and-tested-by: Marco Berizzi <pupilla@hotmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a NULL dereference spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6f74651ae6 is found guilty
of breaking DSACK counting, which should be done only for the
SACK block reported by the DSACK instead of every SACK block
that is received along with DSACK information.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 164891aadf broke RTT
sampling of congestion control modules. Inaccurate timestamps
could be fed to them without providing any way for them to
identify such cases. Previously RTT sampler was called only if
FLAG_RETRANS_DATA_ACKED was not set filtering inaccurate
timestamps nicely. In addition, the new behavior could give an
invalid timestamp (zero) to RTT sampler if only skbs with
TCPCB_RETRANS were ACKed. This solves both problems.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent patch that added ipv6_hwtype is broken on tuntap tunnels.
Indeed, it's broken on any device that does not pass the ipv6_hwtype
test.
The reason is that the original test only applies to autoconfiguration,
not IPv6 support. IPv6 support is allowed on any device. In fact,
even with the ipv6_hwtype patch applied you can still add IPv6 addresses
to any interface that doesn't pass thw ipv6_hwtype test provided that
they have a sufficiently large MTU. This is a serious problem because
come deregistration time these devices won't be cleaned up properly.
I've gone back and looked at the rationale for the patch. It appears
that the real problem is that we were creating IPv6 devices even if the
MTU was too small. So here's a patch which fixes that and reverts the
ipv6_hwtype stuff.
Thanks to Kanru Chen for reporting this issue.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now, when we receive a mtu estimate smaller then minim
threshold in the ICMP message, we disable the path mtu discovery
on the transport. This leads to the never increasing sctp fragmentation
point even when the real path mtu has increased.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Currently, if the socket is owned by the user, we drop the ICMP
message. As a result SCTP forgets that path MTU changed and
never adjusting it's estimate. This causes all subsequent
packets to be fragmented. With this patch, we'll flag the association
that it needs to udpate it's estimate based on the already updated
routing information.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Introduce new function sctp_transport_update_pmtu that updates
the transports and destination caches view of the path mtu.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
If the copy_to_user or copy_user calls fail in sctp_getsockopt_local_addrs(),
the function should free locally allocated storage before returning error.
Spotted by Coverity.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Allow sctp_bindx() to accept multiple address with
unspecified port. In this case, all addresses inherit
the first bound port. We still catch full mis-matches.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
During peeloff of AF_INET6 socket, the inet6_sk(sk)->daddr
wasn't set correctly since the code was assuming IPv4 only.
Now we use a correct call to set the destination address.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Because of the current default of 100, Cubic and BIC perform very
poorly compared to standard Reno.
In the worst case, this change makes Cubic and BIC as aggressive as
Reno. So this change should be very safe.
Signed-off-by: David S. Miller <davem@davemloft.net>
Without FRTO, the tcp_try_to_open is never called with
lost_out > 0 (see tcp_time_to_recover). However, when FRTO is
enabled, the !tp->lost condition is not used until end of FRTO
because that way TCP avoids premature entry to fast recovery
during FRTO.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>