Commit Graph

527 Commits

Author SHA1 Message Date
Cris Forno
9aedc6e2f1 net/ethtool: Introduce link_ksettings API for virtual network devices
With the ethtool_virtdev_set_link_ksettings function in core/ethtool.c,
ibmveth, netvsc, and virtio now use the core's helper function.

Funtionality changes that pertain to ibmveth driver include:

  1. Changed the initial hardcoded link speed to 1GB.

  2. Added support for allowing a user to change the reported link
  speed via ethtool.

Functionality changes to the netvsc driver include:

  1. When netvsc_get_link_ksettings is called, it will defer to the VF
  device if it exists to pull accelerated networking values, otherwise
  pull default or user-defined values.

  2. Similarly, if netvsc_set_link_ksettings called and a VF device
  exists, the real values of speed and duplex are changed.

Signed-off-by: Cris Forno <cforno12@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-29 21:48:55 -08:00
David S. Miller
9f6e055907 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The mptcp conflict was overlapping additions.

The SMC conflict was an additional and removal happening at the same
time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 18:31:39 -08:00
Haiyang Zhang
f6f13c125e hv_netvsc: Fix unwanted wakeup in netvsc_attach()
When netvsc_attach() is called by operations like changing MTU, etc.,
an extra wakeup may happen while netvsc_attach() calling
rndis_filter_device_add() which sends rndis messages when queue is
stopped in netvsc_detach(). The completion message will wake up queue 0.

We can reproduce the issue by changing MTU etc., then the wake_queue
counter from "ethtool -S" will increase beyond stop_queue counter:
     stop_queue: 0
     wake_queue: 1
The issue causes queue wake up, and counter increment, no other ill
effects in current code. So we didn't see any network problem for now.

To fix this, initialize tx_disable to true, and set it to false when
the NIC is ready to be attached or registered.

Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:32:37 -08:00
Heiner Kallweit
1eb2c576ef hv_netvsc: use new helper tcp_v6_gso_csum_prep
Use new helper tcp_v6_gso_csum_prep in additional network drivers.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19 11:20:59 -08:00
Haiyang Zhang
184367dce4 hv_netvsc: Fix XDP refcnt for synthetic and VF NICs
The caller of XDP_SETUP_PROG has already incremented refcnt in
__bpf_prog_get(), so drivers should only increment refcnt by
num_queues - 1.

To fix the issue, update netvsc_xdp_set() to add the correct number
to refcnt.

Hold a refcnt in netvsc_xdp_set()’s other caller, netvsc_attach().

And, do the same in netvsc_vf_setxdp(). Otherwise, every time when VF is
removed and added from the host side, the refcnt will be decreased by one,
which may cause page fault when unloading xdp program.

Fixes: 351e158139 ("hv_netvsc: Add XDP support")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07 11:33:01 +01:00
Haiyang Zhang
351e158139 hv_netvsc: Add XDP support
This patch adds support of XDP in native mode for hv_netvsc driver, and
transparently sets the XDP program on the associated VF NIC as well.

Setting / unsetting XDP program on synthetic NIC (netvsc) propagates to
VF NIC automatically. Setting / unsetting XDP program on VF NIC directly
is not recommended, also not propagated to synthetic NIC, and may be
overwritten by setting of synthetic NIC.

The Azure/Hyper-V synthetic NIC receive buffer doesn't provide headroom
for XDP. We thought about re-use the RNDIS header space, but it's too
small. So we decided to copy the packets to a page buffer for XDP. And,
most of our VMs on Azure have Accelerated  Network (SRIOV) enabled, so
most of the packets run on VF NIC. The synthetic NIC is considered as a
fallback data-path. So the data copy on netvsc won't impact performance
significantly.

XDP program cannot run with LRO (RSC) enabled, so you need to disable LRO
before running XDP:
        ethtool -K eth0 lro off

XDP actions not yet supported:
        XDP_REDIRECT

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:43:19 +01:00
Mohammed Gamal
536dc5df28 hv_netvsc: Fix memory leak when removing rndis device
kmemleak detects the following memory leak when hot removing
a network device:

unreferenced object 0xffff888083f63600 (size 256):
  comm "kworker/0:1", pid 12, jiffies 4294831717 (age 1113.676s)
  hex dump (first 32 bytes):
    00 40 c7 33 80 88 ff ff 00 00 00 00 10 00 00 00  .@.3............
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
  backtrace:
    [<00000000d4a8f5be>] rndis_filter_device_add+0x117/0x11c0 [hv_netvsc]
    [<000000009c02d75b>] netvsc_probe+0x5e7/0xbf0 [hv_netvsc]
    [<00000000ddafce23>] vmbus_probe+0x74/0x170 [hv_vmbus]
    [<00000000046e64f1>] really_probe+0x22f/0xb50
    [<000000005cc35eb7>] driver_probe_device+0x25e/0x370
    [<0000000043c642b2>] bus_for_each_drv+0x11f/0x1b0
    [<000000005e3d09f0>] __device_attach+0x1c6/0x2f0
    [<00000000a72c362f>] bus_probe_device+0x1a6/0x260
    [<0000000008478399>] device_add+0x10a3/0x18e0
    [<00000000cf07b48c>] vmbus_device_register+0xe7/0x1e0 [hv_vmbus]
    [<00000000d46cf032>] vmbus_add_channel_work+0x8ab/0x1770 [hv_vmbus]
    [<000000002c94bb64>] process_one_work+0x919/0x17d0
    [<0000000096de6781>] worker_thread+0x87/0xb40
    [<00000000fbe7397e>] kthread+0x333/0x3f0
    [<000000004f844269>] ret_from_fork+0x3a/0x50

rndis_filter_device_add() allocates an instance of struct rndis_device
which never gets deallocated as rndis_filter_device_remove() sets
net_device->extension which points to the rndis_device struct to NULL,
leaving the rndis_device dangling.

Since net_device->extension is eventually freed in free_netvsc_device(),
we refrain from setting it to NULL inside rndis_filter_device_remove()

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15 22:37:45 +01:00
Linus Torvalds
78bac77b52 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
    including adding a missing ipv6 match description.

 2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
    Bhat.

 3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.

 4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.

 5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.

 6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
    Chaignon.

 7) Multicast MAC limit test is off by one in qede, from Manish Chopra.

 8) Fix established socket lookup race when socket goes from
    TCP_ESTABLISHED to TCP_LISTEN, because there lacks an intervening
    RCU grace period. From Eric Dumazet.

 9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.

10) Fix active backup transition after link failure in bonding, from
    Mahesh Bandewar.

11) Avoid zero sized hash table in gtp driver, from Taehee Yoo.

12) Fix wrong interface passed to ->mac_link_up(), from Russell King.

13) Fix DSA egress flooding settings in b53, from Florian Fainelli.

14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost.

15) Fix double free in dpaa2-ptp code, from Ioana Ciornei.

16) Reject invalid MTU values in stmmac, from Jose Abreu.

17) Fix refcount leak in error path of u32 classifier, from Davide
    Caratti.

18) Fix regression causing iwlwifi firmware crashes on boot, from Anders
    Kaseorg.

19) Fix inverted return value logic in llc2 code, from Chan Shu Tak.

20) Disable hardware GRO when XDP is attached to qede, frm Manish
    Chopra.

21) Since we encode state in the low pointer bits, dst metrics must be
    at least 4 byte aligned, which is not necessarily true on m68k. Add
    annotations to fix this, from Geert Uytterhoeven.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits)
  sfc: Include XDP packet headroom in buffer step size.
  sfc: fix channel allocation with brute force
  net: dst: Force 4-byte alignment of dst_metrics
  selftests: pmtu: fix init mtu value in description
  hv_netvsc: Fix unwanted rx_table reset
  net: phy: ensure that phy IDs are correctly typed
  mod_devicetable: fix PHY module format
  qede: Disable hardware gro when xdp prog is installed
  net: ena: fix issues in setting interrupt moderation params in ethtool
  net: ena: fix default tx interrupt moderation interval
  net/smc: unregister ib devices in reboot_event
  net: stmmac: platform: Fix MDIO init for platforms without PHY
  llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
  net: hisilicon: Fix a BUG trigered by wrong bytes_compl
  net: dsa: ksz: use common define for tag len
  s390/qeth: don't return -ENOTSUPP to userspace
  s390/qeth: fix promiscuous mode after reset
  s390/qeth: handle error due to unsupported transport mode
  cxgb4: fix refcount init for TC-MQPRIO offload
  tc-testing: initial tdc selftests for cls_u32
  ...
2019-12-22 09:54:33 -08:00
Haiyang Zhang
b0689faa8e hv_netvsc: Fix unwanted rx_table reset
In existing code, the receive indirection table, rx_table, is in
struct rndis_device, which will be reset when changing MTU, ringparam,
etc. User configured receive indirection table values will be lost.

To fix this, move rx_table to struct net_device_context, and check
netif_is_rxfh_configured(), so rx_table will be set to default only
if no user configured value.

Fixes: ff4a441990 ("netvsc: allow get/set of RSS indirection table")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20 21:50:38 -08:00
Haiyang Zhang
c39ea5cba5 hv_netvsc: Fix tx_table init in rndis_set_subchannel()
Host can provide send indirection table messages anytime after RSS is
enabled by calling rndis_filter_set_rss_param(). So the host provided
table values may be overwritten by the initialization in
rndis_set_subchannel().

To prevent this problem, move the tx_table initialization before calling
rndis_filter_set_rss_param().

Fixes: a6fb6aa3cf ("hv_netvsc: Set tx_table to equal weight after subchannels open")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-12-14 11:27:30 -08:00
Pankaj Bharadiya
c593642c8b treewide: Use sizeof_field() macro
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().

This patch is generated using following script:

EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do

	if [[ "$file" =~ $EXCLUDE_FILES ]]; then
		continue
	fi
	sed -i  -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net
2019-12-09 10:36:44 -08:00
Linus Torvalds
0dd0c8f7db Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Sasha Levin:

 - support for new VMBus protocols (Andrea Parri)

 - hibernation support (Dexuan Cui)

 - latency testing framework (Branden Bonaby)

 - decoupling Hyper-V page size from guest page size (Himadri Pandya)

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits)
  Drivers: hv: vmbus: Fix crash handler reset of Hyper-V synic
  drivers/hv: Replace binary semaphore with mutex
  drivers: iommu: hyperv: Make HYPERV_IOMMU only available on x86
  HID: hyperv: Add the support of hibernation
  hv_balloon: Add the support of hibernation
  x86/hyperv: Implement hv_is_hibernation_supported()
  Drivers: hv: balloon: Remove dependencies on guest page size
  Drivers: hv: vmbus: Remove dependencies on guest page size
  x86: hv: Add function to allocate zeroed page for Hyper-V
  Drivers: hv: util: Specify ring buffer size using Hyper-V page size
  Drivers: hv: Specify receive buffer size using Hyper-V page size
  tools: hv: add vmbus testing tool
  drivers: hv: vmbus: Introduce latency testing
  video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
  video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host
  hv_netvsc: Add the support of hibernation
  hv_sock: Add the support of hibernation
  video: hyperv_fb: Add the support of hibernation
  scsi: storvsc: Add the support of hibernation
  Drivers: hv: vmbus: Add module parameter to cap the VMBus version
  ...
2019-11-30 14:50:51 -08:00
Stephen Hemminger
6f3aeb1ba0 hv_netvsc: make recording RSS hash depend on feature flag
The recording of RSS hash should be controlled by NETIF_F_RXHASH.

Fixes: 1fac7ca4e6 ("hv_netvsc: record hardware hash in skb")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-23 18:42:41 -08:00
Jakub Kicinski
a9f852e92e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock
from commit c8183f5489 ("s390/qeth: fix potential deadlock on
workqueue flush"), removed the code which was removed by commit
9897d583b0 ("s390/qeth: consolidate some duplicated HW cmd code").

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22 16:27:24 -08:00
Haiyang Zhang
171c1fd98d hv_netvsc: Fix send_table offset in case of a host bug
If negotiated NVSP version <= NVSP_PROTOCOL_VERSION_6, the offset may
be wrong (too small) due to a host bug. This can cause missing the
end of the send indirection table, and add multiple zero entries from
leading zeros before the data region. This bug adds extra burden on
channel 0.

So fix the offset by computing it from the data structure sizes. This
will ensure netvsc driver runs normally on unfixed hosts, and future
fixed hosts.

Fixes: 5b54dac856 ("hyperv: Add support for virtual Receive Side Scaling (vRSS)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 19:32:23 -08:00
Haiyang Zhang
71f21959dd hv_netvsc: Fix offset usage in netvsc_send_table()
To reach the data region, the existing code adds offset in struct
nvsp_5_send_indirect_table on the beginning of this struct. But the
offset should be based on the beginning of its container,
struct nvsp_message. This bug causes the first table entry missing,
and adds an extra zero from the zero pad after the data region.
This can put extra burden on the channel 0.

So, correct the offset usage. Also add a boundary check to ensure
not reading beyond data region.

Fixes: 5b54dac856 ("hyperv: Add support for virtual Receive Side Scaling (vRSS)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 19:32:23 -08:00
Dexuan Cui
0efeea5fb1 hv_netvsc: Add the support of hibernation
The existing netvsc_detach() and netvsc_attach() APIs make it easy to
implement the suspend/resume callbacks.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:44 -05:00
Stephen Hemminger
1fac7ca4e6 hv_netvsc: record hardware hash in skb
Since RSS hash is available from the host, record it in
the skb.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 14:01:35 -08:00
Stephen Hemminger
df9f540ca7 hv_netvsc: flag software created hash value
When the driver needs to create a hash value because it
was not done at higher level, then the hash should be marked
as a software not hardware hash.

Fixes: f72860afa2 ("hv_netvsc: Exclude non-TCP port numbers from vRSS hashing")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 14:01:35 -08:00
Haiyang Zhang
719b85c336 hv_netvsc: Fix error handling in netvsc_attach()
If rndis_filter_open() fails, we need to remove the rndis device created
in earlier steps, before returning an error code. Otherwise, the retry of
netvsc_attach() from its callers will fail and hang.

Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30 18:17:36 -07:00
Haiyang Zhang
c4509a5ac0 hv_netvsc: Fix error handling in netvsc_set_features()
When an error is returned by rndis_filter_set_offload_params(), we should
still assign the unaffected features to ndev->features. Otherwise, these
features will be missing.

Fixes: d6792a5a07 ("hv_netvsc: Add handler for LRO setting change")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30 18:17:36 -07:00
Taehee Yoo
ab92d68fc2 net: core: add generic lockdep keys
Some interface types could be nested.
(VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
These interface types should set lockdep class because, without lockdep
class key, lockdep always warn about unexisting circular locking.

In the current code, these interfaces have their own lockdep class keys and
these manage itself. So that there are so many duplicate code around the
/driver/net and /net/.
This patch adds new generic lockdep keys and some helper functions for it.

This patch does below changes.
a) Add lockdep class keys in struct net_device
   - qdisc_running, xmit, addr_list, qdisc_busylock
   - these keys are used as dynamic lockdep key.
b) When net_device is being allocated, lockdep keys are registered.
   - alloc_netdev_mqs()
c) When net_device is being free'd llockdep keys are unregistered.
   - free_netdev()
d) Add generic lockdep key helper function
   - netdev_register_lockdep_key()
   - netdev_unregister_lockdep_key()
   - netdev_update_lockdep_key()
e) Remove unnecessary generic lockdep macro and functions
f) Remove unnecessary lockdep code of each interfaces.

After this patch, each interface modules don't need to maintain
their lockdep keys.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-24 14:53:48 -07:00
Haiyang Zhang
68622d071e hv_netvsc: Sync offloading features to VF NIC
VF NIC may go down then come up during host servicing events. This
causes the VF NIC offloading feature settings to roll back to the
defaults. This patch can synchronize features from synthetic NIC to
the VF NIC during ndo_set_features (ethtool -K),
and netvsc_register_vf when VF comes back after host events.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Mark Bloch <markb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:42:52 +02:00
Haiyang Zhang
b441f79532 hv_netvsc: Allow scatter-gather feature to be tunable
In a previous patch, the NETIF_F_SG was missing after the code changes.
That caused the SG feature to be "fixed". This patch includes it into
hw_features, so it is tunable again.

Fixes: 23312a3be9 ("netvsc: negotiate checksum and segmentation parameters")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:42:52 +02:00
David S. Miller
446bf64b61 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge conflict of mlx5 resolved using instructions in merge
commit 9566e650bf.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-19 11:54:03 -07:00