Commit Graph

361263 Commits

Author SHA1 Message Date
Nandita Dukkipati 6ba8a3b19e tcp: Tail loss probe (TLP)
This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.

TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.

PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.

TLP Algorithm

On transmission of new data in Open state:
  -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
  -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
  -> PTO = min(PTO, RTO)

Conditions for scheduling PTO:
  -> Connection is in Open state.
  -> Connection is either cwnd limited or no new data to send.
  -> Number of probes per tail loss episode is limited to one.
  -> Connection is SACK enabled.

When PTO fires:
  new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

  no_new_packet:
    -> retransmit the last segment.
       Its ACK triggers FACK or early retransmit based recovery.

ACK path:
  -> rearm RTO at start of ACK processing.
  -> reschedule PTO if need be.

In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
		 ==1; enables RFC5827 ER.
		 ==2; delayed ER.
		 ==3; TLP and delayed ER. [DEFAULT]
		 ==4; TLP only.

The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:30:34 -04:00
Fabio Estevam 83e519b634 fec: Use devm_request_and_ioremap()
Using devm_request_and_ioremap() can make the code cleaner and simpler.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:04:09 -04:00
Fabio Estevam a2e4b59a71 fec: Remove unused pci header
PCI header is not needed, so get rid of it.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:04:09 -04:00
Wei Yongjun 74694e7bd0 bridge: using for_each_set_bit to simplify the code
Using for_each_set_bit() to simplify the code.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:04:09 -04:00
Wei Yongjun 5096e3c4b2 bridge: using for_each_set_bit_from to simplify the code
Using for_each_set_bit_from() to simplify the code.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:04:08 -04:00
Wei Yongjun 3a918f4036 bnx2x: use list_move instead of list_del/list_add
Using list_move() instead of list_del() + list_add().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:04:08 -04:00
Wei Yongjun 2e3a118f82 qlcnic: remove duplicated include from qlcnic_sysfs.c
Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:04:08 -04:00
Dmitry Kravkov 704ba4b777 bnx2x: Restore FCoE 4-port devices support
bnx2x FW 1.78.17 properly supports DCBX configuration for 4-port devices,
enabling FCoE support on 57840 boards.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:26 -04:00
Dmitry Kravkov 91226790bb bnx2x: use FW 7.8.17
Update appropriate HSI files and adapt driver accordingly.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:26 -04:00
Yuval Mintz 82594f8f47 bnx2x: Avoid using zero MAC
Prevent bnx2x devices which are used mainly for storage from using zero
MAC addresses as their primary MAC address.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:25 -04:00
Yaniv Rosner e438c5d651 bnx2x: Control SFP+ tap values via nvm config
Configure SFP+ tap values to optimize link signal according to NVRAM setup.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:25 -04:00
Yaniv Rosner 31b958d755 bnx2x: Add EEE support for BCM84834
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:24 -04:00
Yaniv Rosner b807c74855 bnx2x: Add RJ45 SFP module detection
Add RJ45 SFP module detection. In case the user set 10G link speed, and the
module doesn't support it, then force the speed to 1G and notify the user.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:24 -04:00
Yuval Mintz ab5777d748 bnx2x: Get gso_segs from FW
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:24 -04:00
Ariel Elior 3c76feff68 bnx2x: Control number of vfs dynamically
1. Support sysfs interface for getting the maximal number of virtual functions
   of a given physical function.
2. Support sysfs interface for getting and setting the current number of
   virtual functions.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:23 -04:00
Ariel Elior 3ec9f9ca79 bnx2x: Add iproute2 support for vfs
This patch adds support for iproute2 callbacks allowing querying a physical
function as to its child virtual functions, and setting the macs and vlans
of said virtual functions.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:23 -04:00
Ariel Elior 3786b9426d bnx2x: Prevent "Unknown MF" print in SF mode
When using a chip operating in Single Function mode, when the chip is probed
the bnx2x would print a message warning of an unknown Multi Function mode.

This patch prevents said message.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:22 -04:00
Yuval Mintz f22fdf25f4 bnx2x: Take chip version from MFW
In latest boards, the CHIP_METAL register contains an incorrect
revision value, so the correct one needs to be obtained in a
different manner.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:22 -04:00
Ariel Elior 005a07baa1 bnx2x: Set ethtool ops for vfs
Virtual functions don't have access to HW registers, therefore most ethtool ops
are forbidden to them. Instead of checking in each op whether the device being
driven is a virtual function or a physical function, this patch creates a
separate ethtool ops struct for virtual functions and uses it to initialize
the ethtool ops of the driver in case it is driving a virtual function device.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:21 -04:00
Yuval Mintz 07ef7bec68 bnx2x: fix vlan-mac memory leak
Release (previously leaking) memory when elements are removed from pending
execution lists in the bnx2x driver.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 07:54:21 -04:00
Silviu-Mihai Popescu 3ffd880d3c ethernet: amd: use PTR_RET instead of IS_ERR + PTR_ERR
This uses PTR_RET instead of IS_ERR and PTR_ERR in order to increase
readability.

Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
Acked-by: <Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 06:57:22 -04:00
Hector Palacios b6bb4dfcb1 phy/micrel: move flag handling to function for common use
The flag MICREL_PHY_50MHZ_CLK is not of exclusive use of KSZ8051
model. At least KSZ8021 and KSZ8031 models also use it.
This patch moves the handling of this and future flags to a
separate function so that the different PHY models can call it on
their init function, if needed.

Signed-off-by: Hector Palacios <hector.palacios@digi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 06:50:59 -04:00
Hector Palacios b818d1a7f7 phy/micrel: Add support for KSZ8031
Micrel PHY KSZ8031 is similar to KSZ8021 and also requires the special
initialization of "Operation Mode Strap Override" in reg 0x16
introduced in 212ea99 (phy/micrel: Implement support for KSZ8021).

Signed-off-by: Hector Palacios <hector.palacios@digi.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 06:50:58 -04:00
David S. Miller e5f2ef7ab4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/e1000e/netdev.c

Minor conflict in e1000e, a line that got fixed in 'net'
has been removed in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 05:52:22 -04:00
stephen hemminger 3da889b616 bridge: reserve space for IFLA_BRPORT_FAST_LEAVE
The bridge multicast fast leave feature was added sufficient space
was not reserved in the netlink message. This means the flag may be
lost in netlink events and results of queries.

Found by observation while looking up some netlink stuff for discussion with Vlad.
Problem introduced by commit c2d3babfaf
Author: David S. Miller <davem@davemloft.net>
Date:   Wed Dec 5 16:24:45 2012 -0500

    bridge: implement multicast fast leave

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 05:38:29 -04:00