Commit Graph

455686 Commits

Author SHA1 Message Date
Tom Herbert cb1ce2ef38 ipv6: Implement automatic flow label generation on transmit
Automatically generate flow labels for IPv6 packets on transmit.
The flow label is computed based on skb_get_hash. The flow label will
only automatically be set when it is zero otherwise (i.e. flow label
manager hasn't set one). This supports the transmit side functionality
of RFC 6438.

Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
functionality per socket.

By default, auto flowlabels are disabled to avoid possible conflicts
with flow label manager, however if this feature proves useful we
may want to enable it by default.

It should also be noted that FreeBSD has already implemented automatic
flow labels (including the sysctl and socket option). In FreeBSD,
automatic flow labels default to enabled.

Performance impact:

Running super_netperf with 200 flows for TCP_RR and UDP_RR for
IPv6. Note that in UDP case, __skb_get_hash will be called for
every packet with explains slight regression. In the TCP case
the hash is saved in the socket so there is no regression.

Automatic flow labels disabled:

  TCP_RR:
    86.53% CPU utilization
    127/195/322 90/95/99% latencies
    1.40498e+06 tps

  UDP_RR:
    90.70% CPU utilization
    118/168/243 90/95/99% latencies
    1.50309e+06 tps

Automatic flow labels enabled:

  TCP_RR:
    85.90% CPU utilization
    128/199/337 90/95/99% latencies
    1.40051e+06

  UDP_RR
    92.61% CPU utilization
    115/164/236 90/95/99% latencies
    1.4687e+06

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert 19469a873b flow_dissector: Use IPv6 flow label in flow_dissector
This patch implements the receive side to support RFC 6438 which is to
use the flow label as an ECMP hash. If an IPv6 flow label is set
in a packet we can use this as input for computing an L4-hash. There
should be no need to parse any transport headers in this case.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert 535fb8d006 vxlan: Call udp_flow_src_port
In vxlan and OVS vport-vxlan call common function to get source port
for a UDP tunnel. Removed vxlan_src_port since the functionality is
now in udp_flow_src_port.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert b8f1a55639 udp: Add function to make source port for UDP tunnels
This patch adds udp_flow_src_port function which is intended to be
a common function that UDP tunnel implementations call to set the source
port. The source port is chosen so that a hash over the outer headers
(IP addresses and UDP ports) acts as suitable hash for the flow of the
encapsulated packet. In this manner, UDP encapsulation works with RSS
and ECMP based wrt the inner flow.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert 0e001614e8 net: Call skb_get_hash in get_xps_queue and __skb_tx_hash
Call standard function to get a packet hash instead of taking this from
skb->sk->sk_hash or only using skb->protocol.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert b73c3d0e4f net: Save TX flow hash in sock and set in skbuf on xmit
For a connected socket we can precompute the flow hash for setting
in skb->hash on output. This is a performance advantage over
calculating the skb->hash for every packet on the connection. The
computation is done using the common hash algorithm to be consistent
with computations done for packets of the connection in other states
where thers is no socket (e.g. time-wait, syn-recv, syn-cookies).

This patch adds sk_txhash to the sock structure. inet_set_txhash and
ip6_set_txhash functions are added which are called from points in
TCP and UDP where socket moves to established state.

skb_set_hash_from_sk is a function which sets skb->hash from the
sock txhash value. This is called in UDP and TCP transmit path when
transmitting within the context of a socket.

Tested: ran super_netperf with 200 TCP_RR streams over a vxlan
interface (in this case skb_get_hash called on every TX packet to
create a UDP source port).

Before fix:

  95.02% CPU utilization
  154/256/505 90/95/99% latencies
  1.13042e+06 tps

  Time in functions:
    0.28% skb_flow_dissect
    0.21% __skb_get_hash

After fix:

  94.95% CPU utilization
  156/254/485 90/95/99% latencies
  1.15447e+06

  Neither __skb_get_hash nor skb_flow_dissect appear in perf

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert 5ed20a68cd flow_dissector: Abstract out hash computation
Move the hash computation located in __skb_get_hash to be a separate
function which takes flow_keys as input. This will allow flow hash
computation in other contexts where we only have addresses and ports.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:20 -07:00
David S. Miller 081a20ffcc Merge branch 'systemport-next'
Florian Fainelli says:

====================
net: systemport: PM and Wake-on-LAN support

This patchset brings Power Management and Wake-on-LAN support to the
Broadcom SYSTEM PORT driver.

S2 and S3 modes are supported, while we only support Wake-on-LAN using
MagicPackets for now
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:56:55 -07:00
Florian Fainelli 83e82f4c70 net: systemport: add Wake-on-LAN support
Support for Wake-on-LAN using Magic Packet with or without SecureOn
password is implemented doing the following:

- setting the password to the relevant UniMAC registers
- flagging the device as a wakeup source for the system, as well as
  its Wake-on-LAN interrupt
- prepare the hardware for entering WoL mode
- enabling the MPD interrupt to wake us

The Device Tree binding documentation is also reflected to specify the
third optional Wake-on-LAN interrupt line.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:56:47 -07:00
Florian Fainelli 9d34c1cb01 net: systemport: rename rx_csum_en to rx_chk_en
This boolean tells us whether we are using the RXCHK hardware block,
so use a variable name that reflects that. RXCHK might be used in the
future to implement Wake-on-LAN using ARP or unicast packets.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:56:47 -07:00
Florian Fainelli 40755a0fce net: systemport: add suspend and resume support
Implement the hardware recommended suspend/resume procedure for
SYSTEMPORT. We leverage the previous factoring work such that we can
logically break all suspend/resume operations into disctint RX and TX
code paths.

When the system enters S3, we will loose all register contents, so
make sure that we correctly re-program all the hardware and software
views of the RX & TX rings as well.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:56:47 -07:00
Florian Fainelli b02e6d9ba7 net: systemport: add bcm_sysport_netif_{enable,stop}
Factor common code that either enables or disables the network
interface with the networking stack. We are going to reuse these
functions for suspend/resume callbacks.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:56:47 -07:00
Florian Fainelli 18e21b01fb net: systemport: update umac_enable_set to take a bitmask
Quite often we need to enable either the transmitter or the receiver
bits in UMAC_CMD, use umac_enable_set() to do that for us.

This is a preliminary change to introduce suspend/resume support in the
driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:56:47 -07:00
Varka Bhadram 4710d806fc 6lowpan: mac802154: fix coding style issues
This patch fixed the coding style issues reported by checkpatch.pl

following issues fixed:
	CHECK: Alignment should match open parenthesis
	WARNING: line over 80 characters
	CHECK: Blank lines aren't necessary before a close brace '}'
	WARNING: networking block comments don't use an empty /* line, use /* Comment...
	WARNING: Missing a blank line after declarations
	WARNING: networking block comments start with * on subsequent lines
	CHECK: braces {} should be used on all arms of this statement

Signed-off-by: Varka Bhadram <varkab@cdac.in>
Tested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:55:22 -07:00
Rami Rosen 46c9521fc2 netlink: Fix do_one_broadcast() prototype.
This patch changes the prototype of the do_one_broadcast() method so that it will return void.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 20:52:49 -07:00
David S. Miller 8e6e85e606 Merge branch 'tipc-next'
Erik Hugne says:

====================
tipc: link state processing improvements

Message delivery is separated from the link state processing, and
we fix a bug in receive-path triggered acks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:55:49 -07:00
Erik Hugne 3f53bd8f8b tipc: fix link acknowledge logic in receive path
Link state acks triggered from the receive path is done before
the last received packet have been processed by the link layer.
The effect of this is that the last received packet will not be
included in the ack. This causes problems if the link window is
set to TIPC_MIN_LINK_WIN, where the ack interval will be equal to
the link tolerance, and the link enters a stop-and-go behavior.
We move the ack logic to after link state processing, just before
the packet is delivered to higher layers.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Carl Sigurjonsson <carl.sigurjonsson@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:55:07 -07:00
Erik Hugne 7ae934bebe tipc: refactor message delivery out of tipc_rcv
This is a cosmetic change, separating message delivery from the
link state processing.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:55:07 -07:00
Neal Cardwell 86c6a2c75a tcp: switch snt_synack back to measuring transmit time of first SYNACK
Always store in snt_synack the time at which the server received the
first client SYN and attempted to send the first SYNACK.

Recent commit aa27fc501 ("tcp: tcp_v[46]_conn_request: fix snt_synack
initialization") resolved an inconsistency between IPv4 and IPv6 in
the initialization of snt_synack. This commit brings back the idea
from 843f4a55e (tcp: use tcp_v4_send_synack on first SYN-ACK), which
was going for the original behavior of snt_synack from the commit
where it was added in 9ad7c049f0 ("tcp: RFC2988bis + taking RTT
sample from 3WHS for the passive open side") in v3.1.

In addition to being simpler (and probably a tiny bit faster),
unconditionally storing the time of the first SYNACK attempt has been
useful because it allows calculating a performance metric quantifying
how long it took to establish a passive TCP connection.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: Jerry Chu <hkchu@google.com>
Acked-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:26:37 -07:00
David S. Miller 0b88e7042a Merge branch 'tlan-next'
Ondrej Zary says:

====================
tlan: Link handling improvements and Olicom fixes

This patch series improves link handling in tlan driver, allowing the
cable to be (un)plugged anytime and NetworkManager to work properly.

Also there are some bugfixes related to Olicom OC-2326 card.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 17:07:05 -07:00
Ondrej Zary 9162e7e519 tlan: Isolate external PHY when using internal PHY
When using internal 10 Mbps PHY, isolate the external PHY from MII bus.
External PHY must be kept powered up because it passes TX from tlan chip to
network.

This fixes weird link-loss problems under load with OC-2326 card at 10 Mbps.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 17:06:52 -07:00
Ondrej Zary e697b16b47 tlan: Enable device at resume
pci_disable_device() is called in _suspend but there's no corresponding
pci_enable_device() in _resume.
This causes "disabling already-disabled device" warning on 2nd suspend.

Add pci_enable_device() call to _resume to fix this problem.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 17:06:52 -07:00
Ondrej Zary 7a72eddc8e tlan: Don't disable internal PHY on cards that use it in 10 Mbps mode
In tlan_reset_adapter, we disable internal PHY when an external one is used.
On cards which use internal PHY in 10 Mbps mode, we enable it later when
setting 10 Mbps mode but it does not really work (PHY fails to reset).
Leave it enabled instead.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 17:06:52 -07:00
Ondrej Zary 9cff441ed6 tlan: Add PHY reset timeout
Add a timeout to prevent infinite loop waiting for PHY to reset.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 17:06:52 -07:00
Ondrej Zary 278e48b0c4 tlan: Make autonegotiation faster
Reduce the autonegotiation poll interval from 8 seconds to 2.
This greatly reduces the time needed to detect link presence,
especially on Olicom cards at 10 Mbps (two autonegoatiations required).

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 17:06:52 -07:00