Commit Graph

634426 Commits

Author SHA1 Message Date
Tomer Tayar 2edbff8dcb qed: Learn resources from management firmware
Currently, each interfaces assumes it receives an equal portion
of HW/FW resources, but this is wasteful - different partitions
[and specifically, parititions exposing different protocol support]
might require different resources.

Implement a new resource learning scheme where the information is
received directly from the management firmware [which has knowledge
of all of the functions and can serve as arbiter].

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:36 -04:00
Mintz, Yuval 5a1f965aac qed: Use VF-queue feature
Driver sets several restrictions about the number of supported VFs
according to available HW/FW resources.
This creates a problem as there are constellations which can't be
supported [as limitation don't accurately describe the resources],
as well as holes where enabling IOV would fail due to supposed
lack of resources.

This introduces a new interal feature - vf-queues, which would
be used to lift some of the restriction and accurately enumerate
the queues that can be used by a given PF's VFs.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:36 -04:00
Mintz, Yuval 6927e82680 qed: Learn of RDMA capabilities per-device
Today, RDMA capabilities are learned from management firmware
which provides a per-device indication for all interfaces.
Newer management firmware is capable of providing a per-device
indication [would later be extended to either RoCE/iWARP].

Try using this newer learning mechanism, but fallback in case
management firmware is too old to retain current functionality.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:36 -04:00
Mintz, Yuval d7455f6e44 qede: Decouple ethtool caps from qed
While the qed_lm_maps is closely tied with the QED_LM_* defines,
when iterating over the array use actual size instead of the qed
define to prevent future possible issues.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:35 -04:00
Mintz, Yuval 14d39648cb qed*: Add support for WoL
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:35 -04:00
Mintz, Yuval 7a4b21b7d1 qed: Add nvram selftest
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:35 -04:00
Sudarsana Kalluru 0fefbfbaad qed*: Management firmware - notifications and defaults
Management firmware is interested in various tidbits about
the driver - including the driver state & several configuration
related fields [MTU, primtary MAC, etc.].
This adds the necessray logic to update MFW with such configurations,
some of which are passed directly via qed while for others APIs
are provide so that qede would be able to later configure if needed.

This also introduces a new default configuration for MTU which would
replace the default inherited by being an ethernet device.

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:52:35 -04:00
Julia Lawall 89d9123e8e solos-pci: use permission-specific DEVICE_ATTR variants
Use DEVICE_ATTR_RW for read-write attributes.  This simplifies the
source code, improves readbility, and reduces the chance of
inconsistencies.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@rw@
declarer name DEVICE_ATTR;
identifier x,x_show,x_store;
@@

DEVICE_ATTR(x, \(0644\|S_IRUGO|S_IWUSR\), x_show, x_store);

@script:ocaml@
x << rw.x;
x_show << rw.x_show;
x_store << rw.x_store;
@@

if not (x^"_show" = x_show && x^"_store" = x_store)
then Coccilib.include_match false

@@
declarer name DEVICE_ATTR_RW;
identifier rw.x,rw.x_show,rw.x_store;
@@

- DEVICE_ATTR(x, \(0644\|S_IRUGO|S_IWUSR\), x_show, x_store);
+ DEVICE_ATTR_RW(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:32:12 -04:00
Julia Lawall 63215705a6 ptp: use permission-specific DEVICE_ATTR variants
Use DEVICE_ATTR_RO for read only attributes.  This simplifies the
source code, improves readbility, and reduces the chance of
inconsistencies.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@ro@
declarer name DEVICE_ATTR;
identifier x,x_show;
@@

DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL);

@script:ocaml@
x << ro.x;
x_show << ro.x_show;
@@

if not (x^"_show" = x_show) then Coccilib.include_match false

@@
declarer name DEVICE_ATTR_RO;
identifier ro.x,ro.x_show;
@@

- DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL);
+ DEVICE_ATTR_RO(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:32:12 -04:00
Daniel Borkmann 0f98621bef bpf, inode: add support for symlinks and fix mtime/ctime
While commit bb35a6ef7d ("bpf, inode: allow for rename and link ops")
added support for hard links that can be used for prog and map nodes,
this work adds simple symlink support, which can be used f.e. for
directories also when unpriviledged and works with cmdline tooling that
understands S_IFLNK anyway. Since the switch in e27f4a942a ("bpf: Use
mount_nodev not mount_ns to mount the bpf filesystem"), there can be
various mount instances with mount_nodev() and thus hierarchy can be
flattened to facilitate object sharing. Thus, we can keep bpf tooling
also working by repointing paths.

Most of the functionality can be used from vfs library operations. The
symlink is stored in the inode itself, that is in i_link, which is
sufficient in our case as opposed to storing it in the page cache.
While at it, I noticed that bpf_mkdir() and bpf_mkobj() don't update
the directories mtime and ctime, so add a common helper for it called
bpf_dentry_finalize() that takes care of it for all cases now.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:28:11 -04:00
Aaron Young 8778b27664 ldmvsw: tx queue stuck in stopped state after LDC reset
The following patch fixes an issue with the ldmvsw driver where
the network connection of a guest domain becomes non-functional after
the guest domain has panic'd and rebooted.

The root cause was determined to be from the following series of
events:

1. Guest domain panics - resulting in the guest no longer processing
   network packets (from ldmvsw driver)
2. The ldmvsw driver (in the control domain) eventually exerts flow
   control due to no more available tx drings and stops the tx queue
   for the guest domain
3. The LDC of the network connection for the guest is reset when
   the guest domain reboots after the panic.
4. The LDC reset event is received by the ldmvsw driver and the ldmvsw
   responds by clearing the tx queue for the guest.
5. ldmvsw waits indefinitely for a DATA ACK from the guest - which is
   the normal method to re-enable the tx queue. But the ACK never comes
   because the tx queue was cleared due to the LDC reset.

To fix this issue, in addition to clearing the tx queue, re-enable the
tx queue on a LDC reset. This prevents the ldmvsw from getting caught in
this deadlocked state of waiting for a DATA ACK which will never come.

Signed-off-by: Aaron Young <Aaron.Young@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:20:29 -04:00
David S. Miller 04f762e8f2 Merge branch 'xps-DCB'
Alexander Duyck says:

====================
Add support for XPS when using DCB

This patch series enables proper isolation between traffic classes when
using XPS while DCB is enabled.  Previously enabling XPS would cause the
traffic to be potentially pulled from one traffic class into another on
egress.  This change essentially multiplies the XPS map by the number of
traffic classes and allows us to do a lookup per traffic class for a given
CPU.

To guarantee the isolation I invalidate the XPS map for any queues that are
moved from one traffic class to another, or if we change the number of
traffic classes.

v2: Added sysfs to display traffic class
    Replaced do/while with for loop
    Cleaned up several other for for loops throughout the patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:00:48 -04:00
Alexander Duyck 184c449f91 net: Add support for XPS with QoS via traffic classes
This patch adds support for setting and using XPS when QoS via traffic
classes is enabled.  With this change we will factor in the priority and
traffic class mapping of the packet and use that information to correctly
select the queue.

This allows us to define a set of queues for a given traffic class via
mqprio and then configure the XPS mapping for those queues so that the
traffic flows can avoid head-of-line blocking between the individual CPUs
if so desired.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:00:48 -04:00
Alexander Duyck 6234f87407 net: Refactor removal of queues from XPS map and apply on num_tc changes
This patch updates the code for removing queues from the XPS map and makes
it so that we can apply the code any time we change either the number of
traffic classes or the mapping of a given block of queues.  This way we
avoid having queues pulling traffic from a foreign traffic class.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:00:48 -04:00
Alexander Duyck 8d059b0f6f net: Add sysfs value to determine queue traffic class
Add a sysfs attribute for a Tx queue that allows us to determine the
traffic class for a given queue.  This will allow us to more easily
determine this in the future.  It is needed as XPS will take the traffic
class for a group of queues into account in order to avoid pulling traffic
from one traffic class into another.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:00:47 -04:00
Alexander Duyck 9cf1f6a8c4 net: Move functions for configuring traffic classes out of inline headers
The functions for configuring the traffic class to queue mappings have
other effects that need to be addressed.  Instead of trying to export a
bunch of new functions just relocate the functions so that we can
instrument them directly with the functionality they will need.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 15:00:47 -04:00
Gao Feng 20861f26e3 driver: tun: Use new macro SOCK_IOC_TYPE instead of literal number 0x89
The current codes use _IOC_TYPE(cmd) == 0x89 to check if the cmd is one
socket ioctl command like SIOCGIFHWADDR. But the literal number 0x89 may
confuse readers. So create one macro SOCK_IOC_TYPE to enhance the readability.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 10:56:47 -04:00
Andrey Vagin c62cce2cae net: add an ioctl to get a socket network namespace
Each socket operates in a network namespace where it has been created,
so if we want to dump and restore a socket, we have to know its network
namespace.

We have a socket_diag to get information about sockets, it doesn't
report sockets which are not bound or connected.

This patch introduces a new socket ioctl, which is called SIOCGSKNS
and used to get a file descriptor for a socket network namespace.

A task must have CAP_NET_ADMIN in a target network namespace to
use this ioctl.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 10:56:36 -04:00
David S. Miller 2a43ca0aa9 mv643xx_eth: Properly resolve merge conflict.
The second SET_NETDEV_DEV() in the hunk should be
removed.

Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 10:33:08 -04:00
David S. Miller d8e4aa0707 mv643xx_eth: Fix merge error.
One merge conflict block wasn't resolved.

Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 08:59:27 -04:00
David S. Miller 0a6ce1e3c1 Merge tag 'shared-for-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma
Saeed Mahameed says:

====================
Mellanox mlx5 core driver updates 2016-10-25

This series contains some updates and fixes of mlx5 core and
IB drivers with the addition of two features that demand
new low level commands and infrastructure updates.
 - SRIOV VF max rate limit support
 - mlx5e tc support for FWD rules with counter.

Needed for both net and rdma subsystems.

Updates and Fixes:
From Saeed Mahameed (2):
  - mlx5 IB: Skip handling unknown mlx5 events
  - Add ConnectX-5 PCIe 4.0 VF device ID

From Artemy Kovalyov (2):
  - Update struct mlx5_ifc_xrqc_bits
  - Ensure SRQ physical address structure endianness

From Eugenia Emantayev (1):
  - Fix length of async_event_mask

New Features:
From Mohamad Haj Yahia (3): mlx5 SRIOV VF max rate limit support
  - Introduce TSAR manipulation firmware commands
  - Introduce E-switch QoS management
  - Add SRIOV VF max rate configuration support

From Mark Bloch (7): mlx5e Tc support for FWD rule with counter
  - Don't unlock fte while still using it
  - Use fte status to decide on firmware command
  - Refactor find_flow_rule
  - Group similar rules under the same fte
  - Add multi dest support
  - Add option to add fwd rule with counter
  - mlx5e tc support for FWD rule with counter
  Mark here fixed two trivial issues with the flow steering core, and did
  some refactoring in the flow steering API to support adding mulit destination
  rules to the same hardware flow table entry at once.  In the last two patches
  added the ability to populate a flow rule with a flow counter to the same flow entry.

V2: Dropped some patches that added new structures without adding any usage of them.
    Added SRIOV VF max rate configuration support patch that introduces
    the usage of the TSAR infrastructure.
    Added flow steering fixes and refactoring in addition to mlx5 tc
    support for forward rule with counter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-30 17:31:12 -04:00
Philippe Reynes d46b634945 net: bonding: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-30 17:23:39 -04:00
David S. Miller 2701c3b39d Merge branch 'mlxsw-IB'
Jiri Pirko says:

====================
mlxsw: Add Infiniband support for Mellanox switches

This patchset adds basic Infiniband support for SwitchX-2, Switch-IB
and Switch-IB-2 ASIC drivers.

SwitchX-2 ASIC is VPI capable, which means each port can be either
Ethernet or Infiniband. When the port is configured as Infiniband,
the Subnet Management Agent (SMA) is managed by the SwitchX-2 firmware
and not by the host. Port configuration, MTU and more are configured
remotely by the Subnet Manager (SM).

Usage:
        $ devlink port show
        pci/0000:03:00.0/1: type eth netdev eth0
        pci/0000:03:00.0/3: type eth netdev eth1
        pci/0000:03:00.0/5: type eth netdev eth2
        pci/0000:03:00.0/6: type eth netdev eth3
        pci/0000:03:00.0/8: type eth netdev eth4

        $ devlink port set pci/0000:03:00.0/1 type ib

        $ devlink port show
        pci/0000:03:00.0/1: type ib

Switch-IB (FDR) and Switch-IB-2 (EDR 100Gbs) ASICs are Infiniband-only
switches. The support provided in the mlxsw_switchib.ko driver is port
initialization only. The firmware running in the Silicon implements
the SMA.

Please note that this patchset does only very basic port initialization.
ib_device or RDMA implementations are not part of this patchset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-30 16:50:20 -04:00
Elad Raz d1ba526384 mlxsw: switchib: Introduce SwitchIB and SwitchIB silicon driver
SwitchIB and SwitchIB-2 are Infiniband switches with up to 36 ports. This
driver initialize the hardware and Firmware which implements the IB
management and connection with the SM.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-30 16:50:17 -04:00
Elad Raz 64b92b0114 mlxsw: switchx2: Add IB port support
SwitchX-2 is IB capable device. This patch add a support to change the
port type between Ethernet and Infiniband.

When the port is set to IB, the FW implements the Subnet Management Agent
(SMA) manage the port. All port attributes can be control remotely by
the SM.

Usage:
	$ devlink port show
	pci/0000:03:00.0/1: type eth netdev eth0
	pci/0000:03:00.0/3: type eth netdev eth1
	pci/0000:03:00.0/5: type eth netdev eth2
	pci/0000:03:00.0/6: type eth netdev eth3
	pci/0000:03:00.0/8: type eth netdev eth4

	$ devlink port set pci/0000:03:00.0/1 type ib

	$ devlink port show
	pci/0000:03:00.0/1: type ib

Signed-off-by: Elad Raz <eladr@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-30 16:50:17 -04:00