Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued. It
appears that this is an unintended consequence of relinquinshing the queue
lock while transmitting. That in turn is needed for devices that spend a
lot of time in their transmit routine.
There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).
Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.
The solution here is to steal a bit from net_device to prevent this.
BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix endless loop in the SCTP match similar to those already fixed in
the SCTP conntrack helper (was CVE-2006-1527).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits)
[ETHTOOL]: Fix UFO typo
[SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
[SCTP]: Send only 1 window update SACK per message.
[SCTP]: Don't do CRC32C checksum over loopback.
[SCTP] Reset rtt_in_progress for the chunk when processing its sack.
[SCTP]: Reject sctp packets with broadcast addresses.
[SCTP]: Limit association max_retrans setting in setsockopt.
[PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
[IPV6]: Sum real space for RTAs.
[IRDA]: Use put_unaligned() in irlmp_do_discovery().
[BRIDGE]: Add support for NETIF_F_HW_CSUM devices
[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
[TG3]: Convert to non-LLTX
[TG3]: Remove unnecessary tx_lock
[TCP]: Add tcp_slow_start_after_idle sysctl.
[BNX2]: Update version and reldate
[BNX2]: Use CPU native page size
[BNX2]: Use compressed firmware
[BNX2]: Add firmware decompression
[BNX2]: Allow WoL settings on new 5708 chips
...
Manual fixup for conflict in drivers/net/tulip/winbond-840.c
The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of
ETHTOOL_GUFO.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the event that our entire receive buffer is full with a series of
chunks that represent a single gap-ack, and then we accept a chunk
(or chunks) that fill in the gap between the ctsn and the first gap,
we renege chunks from the end of the buffer, which effectively does
nothing but move our gap to the end of our received tsn stream. This
does little but move our missing tsns down stream a little, and, if the
sender is sending sufficiently large retransmit frames, the result is a
perpetual slowdown which can never be recovered from, since the only
chunk that can be accepted to allow progress in the tsn stream necessitates
that a new gap be created to make room for it. This leads to a constant
need for retransmits, and subsequent receiver stalls. The fix I've come up
with is to deliver the frame without reneging if we have a full receive
buffer and the receiving sockets sk_receive_queue is empty(indicating that
the receive buffer is being blocked by a missing tsn).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now, every time we increase our rwnd by more then MTU bytes, we
trigger a SACK. When processing large messages, this will generate a
SACK for almost every other SCTP fragment. However since we are freeing
the entire message at the same time, we might as well collapse the SACK
generation to 1.
Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
irda_device_info->hints[] is byte aligned but is being
accessed as a u16
Based upon a patch by Luke Yang <luke.adi@gmail.com>.
Signed-off-by: David S. Miller <davem@davemloft.net>
As it is the bridge will only ever declare NETIF_F_IP_CSUM even if all
its constituent devices support NETIF_F_HW_CSUM. This patch fixes
this by supporting the first one out of NETIF_F_NO_CSUM,
NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all
constituent devices.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places. For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places, for that purpose I've added NETIF_F_ALL_CSUM.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.
Signed-off-by: David S. Miller <davem@davemloft.net>
RTT_min is updated each time a timeout event occurs
in order to cope with hard handovers in wireless scenarios such as UMTS.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bandwidth estimate filter is now initialized with the first
sample in order to have better performances in the case of small
file transfers.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to update send sequence number tracking after first ack.
Rework of patch from Luca De Cicco.
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>