Commit Graph

533628 Commits

Author SHA1 Message Date
Rick Jones b56ea2985d net: track success and failure of TCP PMTU probing
Track success and failure of TCP PMTU probing.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:36:33 -07:00
Hariprasad Shenai 0b2c2a931a cxgb4: Add debugfs entry to enable backdoor access
Add debugfs entry 'use_backdoor' to enable backdoor access to read the SGE
context. By default, we read SGE contexts via firmware. In case of FW
issues, one can enable backdoor access via debugfs to dump the SGE context
for debugging purposes.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:33:06 -07:00
Roopa Prabhu 01faef2ceb mpls: make RTA_OIF optional
If the user did not specify an oif, try to get it from the via address.
If no device can be found, return -ENODEV.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:27:19 -07:00
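The fallback described in the mpls commit above can be modelled in a few lines. This is only an illustrative sketch, not the kernel code: `resolve_out_dev` and the `VIA_TO_DEV` lookup table are hypothetical names standing in for whatever device lookup the kernel performs on the via address.

```python
import errno

# Hypothetical stand-in for the kernel's device lookup: maps a via
# address to the name of the device that reaches it.
VIA_TO_DEV = {"10.1.1.1": "swp1"}


def resolve_out_dev(oif, via):
    """Return (0, device) on success or (-ENODEV, None) on failure."""
    if oif is not None:
        # The user supplied an explicit output interface: use it as-is.
        return 0, oif
    # No oif given: try to derive the device from the via address.
    dev = VIA_TO_DEV.get(via)
    if dev is None:
        return -errno.ENODEV, None
    return 0, dev
```

With this model, a route that names only a via address still resolves to a device, and an unreachable via address is reported as -ENODEV.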
David S. Miller fd36ef606c Merge branch 'sfc-filter-chaining'
Edward Cree says:

====================
sfc: support for cascaded multicast filtering

Recent versions of firmware for SFC9100 adapters add support for filter
 chaining, in which packets matching multiple filters are delivered to all
 filters' recipients, rather than only the highest match-priority filter as was
 previously the case.
This patch series enables this feature and redesigns the filter handling code
 to make use of it; in particular, subscribing to a multicast address on one
 function no longer prevents traffic to that address reaching another function
 which is in promiscuous or allmulti mode.
If the firmware does not support filter chaining, the driver will fall back to
 the old behaviour.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:33 -07:00
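The difference between the old behaviour and filter chaining, as described in the cover letter above, can be sketched as follows. This is a toy model under stated assumptions: the dictionary layout of a filter, the numeric priorities and the `recipients` helper are all hypothetical and do not reflect the firmware or driver data structures.

```python
def recipients(filters, dst, chaining):
    """Return the functions that receive a packet addressed to dst."""
    matches = [f for f in filters if f["match"](dst)]
    if not matches:
        return []
    if chaining:
        # Filter chaining: every matching filter's recipient gets the packet.
        return [f["recipient"] for f in matches]
    # Old behaviour: only the highest match-priority filter wins.
    best = max(matches, key=lambda f: f["priority"])
    return [best["recipient"]]


MCAST = "01:00:5e:00:00:01"
filters = [
    # Function A subscribed to one multicast address.
    {"recipient": "A", "priority": 2, "match": lambda d: d == MCAST},
    # Function B is promiscuous: it matches everything at lower priority.
    {"recipient": "B", "priority": 1, "match": lambda d: True},
]
```

In this model, `recipients(filters, MCAST, chaining=False)` delivers only to A, which is the case the series sets out to fix, while `chaining=True` delivers to both A and B.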
Edward Cree 12fb0da45c sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode
Separate functions for inserting individual and promisc filters; explicit
 fallback logic in efx_ef10_filter_sync_rx_mode(), in order not to overload
 the 'promisc' flag as also meaning "fall back to promisc".

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko ab8b1f7cf8 sfc: support cascaded multicast filters
If the workaround to support cascaded multicast filters ("workaround_26807") is
enabled, the broadcast filter and individual multicast filters are not inserted
when in promiscuous or allmulti mode.

There is a race while inserting and removing filters when entering and leaving
promiscuous mode.  When changing promiscuous state with cascaded multicast
filters, the old multicast filters are removed before inserting the new filters
to avoid duplicating packets; this can lead to dropped packets until all
filters have been inserted.

The efx_nic:mc_promisc flag is added to record the presence of a multicast
promiscuous filter; this gives a simple way to tell if the promiscuous state is
changing.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko 822b96f87f sfc: re-factor efx_ef10_filter_sync_rx_mode()
This change is only re-factoring; there are no changes to functionality
 except for a slight elaboration of an error message (on mismatch filter
 insertion failure).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Jon Cooper b6f568e27b sfc: Insert multicast filters as well as mismatch filters in promiscuous mode
If a function is in promiscuous mode and another function has a broadcast or
 multicast filter inserted, the function in promiscuous mode won't see that
 broadcast or multicast traffic.
Most notably this breaks broadcast, which means ARP doesn't work. Less
 show-stoppingly, a function listening on a multicast address that's also in
 promiscuous mode will not see that multicast traffic if another function is
 also listening on that multicast address.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko 5a55a72abe sfc: warn if other functions have been reset by MCFW
When enabling the workaround for cascaded multicast filters, the MC
 can reset other functions if they have already inserted filters.
 In that case, the workaround has been enabled, but print an info
 message in the log recording that other functions had to be reset.

As other functions were reset, the MC will have incremented its boot
 count, so also increment the warm_boot_count on the function which
 enabled the workaround, as that function won't have received an MC
 reboot event and does not need to reset.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko 34ccfe6f8a sfc: add output flag decoding to efx_mcdi_set_workaround
The initial use of this will be to check a flag reporting if an FLR was
performed on other functions when enabling cascaded multicast filters.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Edward Cree 832dc9ed43 sfc: cope with ENOSYS from efx_mcdi_get_workarounds()
GET_WORKAROUNDS was only introduced in May 2014, so not all firmware
 will have it.  Call sites therefore need to handle ENOSYS.
In this case we're probing the bug26807 workaround, which is not
 implemented in any firmware that doesn't have GET_WORKAROUNDS.
 So interpret ENOSYS as 'false'.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:31 -07:00
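The ENOSYS handling described above amounts to treating "command not implemented" as "workaround not implemented". A minimal sketch of that idea, assuming a hypothetical `get_workarounds` callable that returns a `(rc, implemented_mask)` pair and a made-up bit position for the bug26807 flag:

```python
import errno

BUG26807_FLAG = 1 << 0  # hypothetical bit position, for illustration only


def probe_bug26807(get_workarounds):
    """Return (rc, implemented) for the bug26807 workaround."""
    rc, implemented_mask = get_workarounds()
    if rc == -errno.ENOSYS:
        # Firmware too old to know GET_WORKAROUNDS, so it cannot
        # implement the workaround either: report 'false', not an error.
        return 0, False
    if rc != 0:
        # Any other failure is still a real error for the caller.
        return rc, False
    return 0, bool(implemented_mask & BUG26807_FLAG)
```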
Daniel Pieczko 46e612b0fc sfc: enable cascaded multicast filters in MCFW
After creating event queue 0, check to see if the workaround is enabled,
 and enable it if necessary.  This will be called during PCI probe and
 also when coming back up after a reset.  The nic_data->workaround_26807
 will be used in the future to control the filter insertion behaviour
 based on this workaround.

Only the primary PF can enable this workaround, so tolerate an EPERM
 error and continue.  Otherwise, if any step in the checking and enabling
 of the workaround fails, the event queue must be removed.

We check that the workaround is implemented before trying to enable it,
 and store the current workaround setting before trying to change it.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:31 -07:00
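The control flow described in the commit above (check, store, enable, tolerate EPERM, otherwise tear down the event queue) can be sketched like this. `FakeNic` and `setup_workaround` are hypothetical names, the enable request is reduced to a preset return code, and skipping the enable step when the workaround is not implemented is an assumption rather than something the commit message states.

```python
import errno


class FakeNic:
    """Toy stand-in for the NIC state touched while bringing up event queue 0."""

    def __init__(self, implemented, enabled, enable_rc=0):
        self.implemented = implemented  # does firmware implement the workaround?
        self.enabled = enabled          # is it already enabled?
        self.enable_rc = enable_rc      # result an enable request would return
        self.evq_present = True         # event queue 0 has just been created
        self.workaround_26807 = False


def setup_workaround(nic):
    """Return 0 on success or a negative errno after removing the event queue."""
    if not nic.implemented:
        # Assumption: nothing to enable, carry on without the workaround.
        nic.workaround_26807 = False
        return 0
    # Store the current setting before trying to change it.
    nic.workaround_26807 = nic.enabled
    if nic.enabled:
        return 0
    rc = nic.enable_rc
    if rc == 0:
        nic.workaround_26807 = True
        return 0
    if rc == -errno.EPERM:
        # Only the primary PF may enable it: tolerate and continue.
        return 0
    # Any other failure: the event queue must be removed.
    nic.evq_present = False
    return rc
```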
Edward Cree a9196bb048 sfc: update MCDI protocol definitions
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:31 -07:00
Jon Paul Maloy 16040894b2 tipc: fix compatibility bug
In commit d999297c3d
("tipc: reduce locking scope during packet reception") we introduced
a new function tipc_link_proto_rcv(). This function contains a bug
that sometimes causes it to send out a non-zero link priority value
in the protocol messages it creates.

The bug may lead to an extra link reset at initial link establishment
with older nodes. This will never happen more than once, after which
the link will work as intended.

We fix this bug in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:23:50 -07:00
David S. Miller 67b2914b9c Merge branch 'explicit-inbound-link-state'
Florian Fainelli says:

====================
net: enable inband link state negotiation only when explicitly requested

Changes in v5:

- removed an invalid use of the link_update callback in the SF2 driver
  which appeared after merging "net: phy: fixed_phy: handle link-down case"

- reworded the commit message for patch 2 to make it clear what it fixes and
  why this is required

Initial cover letter from Stas:

Hello.

Currently the link status auto-negotiation is enabled
for any SGMII link with fixed-link DT binding.
The regression was reported:
https://lkml.org/lkml/2015/7/8/865
Apparently not all HW that implements SGMII protocol, generates the
inband status for the auto-negotiation to work.
More details here:
https://lkml.org/lkml/2015/7/10/206

The following patches revert to the old behavior by default,
which is to not enable the auto-negotiation for fixed-link.
A new DT property is added that allows the auto-negotiation
to be requested explicitly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:56 -07:00
Stas Sergeev f8af8e6eb9 mvneta: use inband status only when explicitly enabled
The commit 898b2970e2 ("mvneta: implement SGMII-based in-band link state
signaling") implemented the link parameters auto-negotiation unconditionally.
Unfortunately it appears that some HW that implements SGMII protocol,
doesn't generate the inband status, so it is not possible to auto-negotiate
anything with such HW.

This patch enables the auto-negotiation only if explicitly requested with
the 'managed' DT property.

This patch fixes the following regression:
https://lkml.org/lkml/2015/7/8/865

Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>

CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
Stas Sergeev 4cba5c2103 of_mdio: add new DT property 'managed' to specify the PHY management type
Currently the PHY management type is selected arbitrarily by the MAC driver.
The decision is based on the presence of the "fixed-link" node and on the
will of the driver's authors.
This caused a regression recently, when mvneta driver suddenly started
to use the in-band status for auto-negotiation on fixed links.
It appears the auto-negotiation may not work when expected by the MAC driver.
Sebastien Rannou explains:
<< Yes, I confirm that my HW does not generate an in-band status. AFAIK, it's
a PHY that aggregates 4xSGMIIs to 1xQSGMII ; the MAC side of the PHY (with
inband status) is connected to the switch through QSGMII, and in this context
we are on the media side of the PHY. >>
https://lkml.org/lkml/2015/7/10/206

This patch introduces the new string property 'managed' that allows
the user to set the management type explicitly.
The supported values are:
"auto" - default. Uses either MDIO or nothing, depending on the presence
of the fixed-link node
"in-band-status" - use in-band status

Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>

CC: Rob Herring <robh+dt@kernel.org>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Kumar Gala <galak@codeaurora.org>
CC: Florian Fainelli <f.fainelli@gmail.com>
CC: Grant Likely <grant.likely@linaro.org>
CC: devicetree@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
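The selection rule for the 'managed' property described above can be written down as a small decision function. This is a sketch of one reading of the commit message, not the of_mdio code: `phy_management` is a hypothetical helper, the returned labels are made up, the mapping of "auto" to "none" when a fixed-link node is present is an interpretation of "either MDIO or nothing", and rejecting unknown strings is an assumption.

```python
def phy_management(managed=None, has_fixed_link=False):
    """Pick a PHY management type from the 'managed' DT property value."""
    value = "auto" if managed is None else managed  # "auto" is the default
    if value == "in-band-status":
        return "in-band-status"
    if value == "auto":
        # Interpretation: a fixed-link node means nothing manages the PHY,
        # otherwise the PHY is managed over MDIO.
        return "none" if has_fixed_link else "mdio"
    # Assumption: any other string is rejected.
    raise ValueError("unsupported 'managed' value: %r" % (value,))
```

Under this reading, a fixed link only uses in-band status when the device tree asks for it explicitly, which matches the regression fix the series is about.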
Stas Sergeev 868a4215be net: phy: fixed_phy: handle link-down case
fixed_phy_register() currently hardcodes the fixed PHY link to 1, and
expects to find a "speed" parameter to provide correct information
towards the fixed PHY consumer.

In a subsequent change, where we allow "managed" (e.g: (RS)GMII in-band
status auto-negotiation) fixed PHYs, none of these parameters can be
provided since they will be auto-negotiated, hence, we just provide a
zero-initialized fixed_phy_status to fixed_phy_register() which makes it
fail when we call fixed_phy_update_regs() since status.speed = 0 which
makes us hit the "default" label and error out.

Without this change, we would also see potentially inconsistent
speed/duplex parameters for fixed PHYs when the link is DOWN.

CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
[florian: add more background to why this is correct and desirable]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
Florian Fainelli d2eac98f7d net: dsa: bcm_sf2: Do not override speed settings
The SF2 driver currently overrides speed settings for its port
configured using a fixed PHY. This is both unnecessary and incorrect,
because we keep feeding back to the hardware the parameters that we read
from the PHY device, which in the case of a fixed PHY cannot possibly
change speed.

This change is required to let the fixed PHY code register a PHY with
a link configured as DOWN by default, and to avoid a circular
dependency where we require the link_update callback to run to program
the hardware, and we then utilize the fixed PHY parameters to program
the hardware with the same settings.

Fixes: 246d7f773c ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
Mathias Krause e181a54304 net: #ifdefify sk_classid member of struct sock
The sk_classid member is only required when CONFIG_CGROUP_NET_CLASSID is
enabled. #ifdefify it to reduce the size of struct sock on 32 bit
systems, at least.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:04:30 -07:00
David S. Miller e69724f32e Merge branch 'lwtunnel'
Thomas Graf says:

====================
Lightweight & flow based encapsulation

This series combines the work previously posted by Roopa, Robert and
myself. It's according to what we discussed at NFWS. The motivation
of this series is to:

 * Consolidate code between OVS and the rest of the kernel and get
   rid of OVS vports and instead represent them as pure net_devices.
 * Introduce a lightweight tunneling mechanism which enables flow
   based encapsulation to improve scalability on both RX and TX.
 * Do the above in an encapsulation unspecific way so that the
   encapsulation type is eventually abstracted away from the user.
 * Use the same forwarding decision for both native forwarding and
   encapsulation thus allowing to switch between native IPv6 and
   UDP encapsulation based on endpoint without requiring additional
   logic

The fundamental changes introduced in this series are:
 * A new RTA_ENCAP Netlink attribute for routes carrying encapsulation
   instructions. Depending on the specified type, the instructions
   apply to UDP encapsulations, MPLS and possibly others in the future.
 * Depending on the encapsulation type, the output function of the
   dst is directly overwritten or the dst merely attaches metadata and
   relies on a subsequent net_device to apply it to the packet. The
   latter is typically used if an inner and outer IP header exist which
   require two subsequent routing lookups to be performed.
 * A new metadata_dst structure which can be attached to skbs to
   carry metadata in between subsystems. This new metadata transport
   is used to provide a single interface for VXLAN, routing and OVS
   to communicate through metadata.

The OVS interfaces remain as-is but will transparently create a real
VXLAN net_device in the background. iproute2 is extended with new
use cases:

  VXLAN:
  ip route add 40.1.1.1/32 encap vxlan id 10 dst 50.1.1.2 dev vxlan0

  MPLS:
  ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1

Performance implications:
  The additional memory allocation in the receive path should have
  performance implications although it is not observable in standard
  throughput tests if GRO is properly done. The correct net_device
  model outweighs the additional cost of the allocation. Furthermore,
  this implication can be relaxed by reintroducing a direct unqueued
  path from a software device to a consumer like bridge or OVS if
  needed.

    $ netperf  -t TCP_STREAM -H 15.1.1.201
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    15.1.1.201 (15.1.1.201) port 0 AF_INET : demo
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380  16384  16384    10.00    9118.17

Changes since v1:
 * Properly initialize tun_id as reported by Julian
 * Drop duplicate netif_keep_dst() as reported by Alexei
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf 614732eaa1 openvswitch: Use regular VXLAN net_device device
This gets rid of all OVS specific VXLAN code in the receive and
transmit path by using a VXLAN net_device to represent the vport.
Only a small shim layer remains which takes care of handling the
VXLAN specific OVS Netlink configuration.

Unexports vxlan_sock_add(), vxlan_sock_release(), vxlan_xmit_skb()
since they are no longer needed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf c9db965c52 openvswitch: Abstract vport name through ovs_vport_name()
This allows getting rid of the get_name() vport ops later on.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf be4ace6e6b openvswitch: Move dev pointer into vport itself
This is the first step in representing all OVS vports as regular
struct net_devices. Move the net_device pointer into the vport
structure itself to get rid of struct vport_netdev.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf 34ae932a40 openvswitch: Make tunnel set action attach a metadata dst
Utilize the new metadata dst to attach encapsulation instructions to
the skb. The existing egress_tun_info via the OVS_CB() is left in
place until all tunnel vports have been converted to the new method.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00