Commit Graph

678557 Commits

Author SHA1 Message Date
Christian Perle 3500cd73df proc: snmp6: Use correct type in memset
Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
Use "u64" instead of "unsigned long" in sizeof().

Fixes: 4a4857b1c8 ("proc: Reduce cache miss in snmp6_seq_show")
Signed-off-by: Christian Perle <christian.perle@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 09:53:14 -04:00
Donald Sharp 4b1f0d33db net: ipmr: Fix some mroute forwarding issues in vrf's
This patch fixes two issues:

1) When forwarding on *,G mroutes that are in a vrf, the
kernel was dropping information about the actual incoming
interface when calling ip_mr_forward from ip_mr_input.
This caused ip_mr_forward to send the multicast packet
back out the incoming interface.  Fix this by
modifying ip_mr_forward to be handed the correctly
resolved dev.

2) When a unresolved cache entry is created we store
the incoming skb on the unresolved cache entry and
upon mroute resolution from the user space daemon,
we attempt to forward the packet.  Again we were
not resolving to the correct incoming device for
a vrf scenario, before calling ip_mr_forward.
Fix this by resolving to the correct interface
and calling ip_mr_forward with the result.

Fixes: e58e415968 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 18:15:06 -04:00
David S. Miller 062bb997d2 Merge tag 'mlx5-fixes-2017-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
Mellanox mlx5 fixes 2017-06-11

This series contains some fixes for the mlx5 core and netdev driver.

Please pull and let me know if there's any problem.

For -stable:
("net/mlx5e: Added BW check for DIM decision mechanism")              kernels >= 4.9
("net/mlx5e: Fix wrong indications in DIM due to counter wraparound") kernels >= 4.9
("net/mlx5: Remove several module events out of ethtool stats")       kernels >= 4.10
("net/mlx5: Enable 4K UAR only when page size is bigger than 4K")     kernels >= 4.11

*all patches apply with no issue on their -stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:40:52 -04:00
David S. Miller 77a6bb5ac0 Merge branch 'ena-fixes'
Netanel Belgazal says:

====================
Bugs fixes in ena ethernet driver

This patchset contains fixes for the bugs that were discovered so far.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:48 -04:00
Netanel Belgazal e7ff7efae5 net: ena: update ena driver to version 1.1.7
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal 800c55cb76 net: ena: bug fix in lost tx packets detection mechanism
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal a2cc5198da net: ena: disable admin msix while working in polling mode
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal a3af7c18cf net: ena: fix theoretical Rx hang on low memory systems
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel  memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal 0857d92f71 net: ena: add missing unmap bars on device removal
This patch also change the mapping functions to devm_ functions

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal 661d2b0cce net: ena: fix race condition between submit and completion admin command
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).

Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.

This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.

Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal 2d2c600a91 net: ena: add missing return when ena_com_get_io_handlers() fails
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal 418df30f7e net: ena: fix bug that might cause hang after consecutive open/close interface.
Fixing a bug that the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() was called.
If that is followed by ndo open(),
then the MSI-X will be still masked so no interrupt
will be received by the driver.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal a77c1aafcc net: ena: fix rare uncompleted admin command false alarm
The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
		sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:45 -04:00
Majd Dibbiny 91828bd899 net/mlx5: Enable 4K UAR only when page size is bigger than 4K
When the page size isn't bigger than 4K, there is no added value of enabling 4K
UAR feature in the Firmware.

Modified the condition of enabling the 4K UAR accordingly.

Fixes: f502d83495 ("net/mlx5: Activate support for 4K UARs")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Tal Gilboa 53acd76ce5 net/mlx5e: Fix wrong indications in DIM due to counter wraparound
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Each iteration of the algorithm, DIM calculates the difference in
throughput, packet rate and interrupt rate from last iteration in order
to make a decision. DIM relies on counters for each metric. When these
counters get to their type's max value they wraparound. In this case
the delta between 'end' and 'start' samples is negative and when
translated to unsigned integers - very high. This results in a false
indication to the algorithm and might result in a wrong decision.

The fix calculates the 'distance' between 'end' and 'start' samples in a
cyclic way around the relevant type's max value. It can also be viewed as
an absolute value around the type's max value instead of around 0.

Testing show higher stability in DIM profile selection and no wraparound
issues.

Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Tal Gilboa c3164d2fc4 net/mlx5e: Added BW check for DIM decision mechanism
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Until now only interrupt and packet rate were sampled.
We found a scenario on which we get a false indication since a change in
DIM caused more aggregation and reduced packet rate while increasing BW.

We now regard a change as succesfull iff:
current_BW > (prev_BW + threshold) or
current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or
current_BW ~= prev_BW and current_PR ~= prev_PR and
    current_IR < (prev_IR - threshold)
Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate

Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off)
    --------------------------------------------------
    packet size | before[Mb/s] | after[Mb/s] | gain  |
    2B          | 343.4        | 359.4       |  4.5% |
    16B         | 2739.7       | 2814.8      |  2.7% |
    64B         | 9739         | 10185.3     |  4.5% |

Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Huy Nguyen f729860a17 net/mlx5: Remove several module events out of ethtool stats
Remove the following module event counters out of ethtool stats. The
reason for removing these event counters is that these events do not
occur without techinician's intervention.
  module_pwr_budget_exd
  module_long_range
  module_no_eeprom
  module_enforce_part
  module_unknown_id
  module_unknown_status
  module_plug

Fixes: bedb7c909c ("net/mlx5e: Add port module event counters to ethtool stats")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed by: Gal Pressman <galp@mellanox.com>

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Mohamad Haj Yahia 3fece5d676 net/mlx5: Continue health polling until it is explicitly stopped
The issue is that when we get an assert we will stop polling the health
and thus we cant enter error state when we have a real health issue.

Fixes: fd76ee4da5 ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Mohamad Haj Yahia 57f35c93a2 net/mlx5: Fix create vport flow table flow
Send vport number to the create flow table inner method instead of
ignoring the vport argument and sending always 0.

Fixes: b3ba51498b ('net/mlx5: Refactor create flow table method to accept underlay QP')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
David S. Miller b87fa0fafe Merge branch 'mvpp2-fixes'
Thomas Petazzoni says:

====================
net: mvpp2: driver fixes

As requested, here is a series of patches containing only bug fixes
for the mvpp2 driver. It is based on the latest "net" branch.

Changes since v1:

 - Fixed a build breakage that occurred when only PATCH 1 was only,
   and not later patches in the series. Was reported by the kbuild
   report on the first submission.

 - Added Tested-by from Marc Zyngier on PATCH 2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:22:56 -04:00
Thomas Petazzoni a704bb5c05 net: mvpp2: use {get, put}_cpu() instead of smp_processor_id()
smp_processor_id() should not be used in migration-enabled contexts. We
originally thought it was OK in the specific situation of this driver,
but it was wrong, and calling smp_processor_id() in a migration-enabled
context prints a big fat warning when CONFIG_DEBUG_PREEMPT=y.

Therefore, this commit replaces the smp_processor_id() in
migration-enabled contexts by the appropriate get_cpu/put_cpu sections.

Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: a786841df7 ("net: mvpp2: handle register mapping and access for PPv2.2")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:22:55 -04:00
Thomas Petazzoni 56b8aae959 net: mvpp2: remove mvpp2_bm_cookie_{build,pool_get}
This commit removes the useless remove
mvpp2_bm_cookie_{build,pool_get} functions. All what
mvpp2_bm_cookie_build() was doing is compute a 32-bit value by
concatenating the pool number and the CPU number... only to get the pool
number re-extracted by mvpp2_bm_cookie_pool_get() later on.

Instead, just get the pool number directly from RX descriptor status,
and pass it to mvpp2_pool_refill() and mvpp2_rx_refill().

This has the added benefit of dropping a smp_processor_id() call in a
migration-enabled context, which is wrong, and is the original
motivation for making this change.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:22:54 -04:00
Jia-Ju Bai 343eba69c6 net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse
The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
function call path is:
tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
  tipc_rcv
    tipc_sk_rcv
      tipc_msg_reverse
        pskb_expand_head(GFP_KERNEL) --> may sleep
tipc_node_broadcast
  tipc_node_xmit_skb
    tipc_node_xmit
      tipc_sk_rcv
        tipc_msg_reverse
          pskb_expand_head(GFP_KERNEL) --> may sleep

To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:20:38 -04:00
Jia-Ju Bai f146e872eb net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
  cfctrl_linkdown_req
    cfpkt_create
      cfpkt_create_pfx
        alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
  cfpkt_split
    cfpkt_create_pfx
      alloc_skb(GFP_KERNEL) --> may sleep

There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
is called under a rcu read lock, instead in interrupt.

To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:19:45 -04:00
David S. Miller 5aa32f53ab Revert "net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272"
This reverts commit bf292f1b2c.

It belongs in 'net-next' not 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:44:28 -04:00