Commit Graph

107 Commits

Author SHA1 Message Date
Allen Hubbe 9a07826f99 NTB: Fix range check on memory window index
The range check must exclude the upper bound.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:27:12 -04:00
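
The range check described in 9a07826f99 can be illustrated with a minimal userspace sketch. The helper name `mw_idx_valid` and its parameters are hypothetical and are not taken from the driver; the sketch only shows why the upper bound itself must be rejected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when idx names one of mw_count memory windows. */
static bool mw_idx_valid(int idx, int mw_count)
{
	/* Valid indices are 0 .. mw_count - 1, so mw_count itself is rejected. */
	return idx >= 0 && idx < mw_count;
}

/* Small self-check showing the boundary behaviour. */
static void check_examples(void)
{
	assert(mw_idx_valid(0, 4));
	assert(mw_idx_valid(3, 4));
	assert(!mw_idx_valid(4, 4));	/* the upper bound is out of range */
	assert(!mw_idx_valid(-1, 4));
}
```

A check written as `idx > mw_count` would let `idx == mw_count` through and index one element past the last window.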
Allen Hubbe 2aa2a77a48 NTB: Improve index handling in B2B MW workaround
Check that b2b_mw_idx is in range of the number of memory windows when
initializing the device.  The workaround is considered to be in effect
only if the device b2b_idx is exactly UINT_MAX, instead of any index
past the last memory window.

Only print B2B MW workaround information in debugfs if the workaround is
in effect.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:27:12 -04:00
Dave Jiang 569410ca75 NTB: Use unique DMA channels for TX and RX
Allocate two DMA channels, one for TX operation and one for RX
operation, instead of having one DMA channel for everything. This
provides slightly better performance, and also will make error handling
cleaner later on.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:09 -04:00
Allen Hubbe 905921e748 NTB: Remove dma_sync_wait from ntb_async_rx
The dma_sync_wait can hurt the performance of workloads mixed with both
large and small frames.  Large frames will be copied using the dma
engine.  Small frames will be copied by the cpu.  The dma_sync_wait
prevents the cpu and dma engine copying in parallel.

In the period where the cpu is copying, the dma engine is stopped.  The
dma engine does no useful work copying large frames during that time,
and additional time is needed to restart the dma engine for the next
large frame.  This will decrease the throughput for the portion of a
workload with large frames.

In the period where the dma engine is copying, the cpu is held up
waiting for dma to complete.  The small frames processing will be
delayed until the dma is complete.  The RX frames are completed
in-order, and the processing of small frames takes very little time, so
dma_sync_wait may have an insignificant impact on the response time of
frames.  The more significant impact is to the system, because the delay
in dma_sync_wait is implemented as a busy non-blocking wait.  This can
prevent the delayed core from doing any useful work, even if it could be
processing work for other drivers, unrelated to transport RX processing.

After applying the earlier patch to fix out-of-order RX acknowledgement,
the dma_sync_wait is no longer necessary.  Remove it, so that cpu memcpy
will proceed immediately for small frames, in parallel with ongoing dma
for large frames.  Do not hold up the cpu from doing work while dma is
in progress.  The prior fix will continue to ensure in-order completion
of the RX frames to the upper layer, and in-order delivery of the RX
acknowledgement.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang d98ef99e37 NTB: Clean up QP stats info
Make QP stats info more readable for debugging purposes.  Also add an
entry to indicate whether DMA is being used.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang 315100004f NTB: Make the transport list in order of discovery
The list should be added from the bottom and not the top in order to
ensure the transport is provided in the same order to clients as ntb
devices are discovered.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
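
The ordering effect described in 315100004f can be sketched with a tiny circular doubly linked list in userspace C. The kernel provides `list_add` and `list_add_tail` in `<linux/list.h>`; the `add_head` and `add_tail` helpers below are simplified stand-ins written for this illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a kernel list node, carrying a discovery id. */
struct node {
	struct node *prev, *next;
	int id;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void insert_between(struct node *n, struct node *prev, struct node *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Insert right after the head: newest entry is seen first. */
static void add_head(struct node *n, struct node *head)
{
	insert_between(n, head, head->next);
}

/* Insert right before the head: entries stay in insertion order. */
static void add_tail(struct node *n, struct node *head)
{
	insert_between(n, head->prev, head);
}

static void check_examples(void)
{
	struct node head, a = { .id = 0 }, b = { .id = 1 };

	/* Tail insertion preserves discovery order: 0, then 1. */
	list_init(&head);
	add_tail(&a, &head);
	add_tail(&b, &head);
	assert(head.next->id == 0 && head.next->next->id == 1);

	/* Head insertion reverses it: 1, then 0. */
	list_init(&head);
	add_head(&a, &head);
	add_head(&b, &head);
	assert(head.next->id == 1 && head.next->next->id == 0);
}
```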
Dave Jiang 0a5d19d9f0 NTB: Add PCI Device IDs for Broadwell Xeon
Adding PCI Device IDs for B2B (back to back), RP (root port, primary),
and TB (transparent bridge, secondary) devices.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang e74bfeedad NTB: Add flow control to the ntb_netdev
Right now if we push the NTB really hard, we start dropping packets due
to not being able to process the packets fast enough. We need to stop the
upper layer from flooding us when that happens.

A timer is necessary in order to restart the queue once the resource has
been processed on the receive side. Due to the way NTB is set up, the
resources on the tx side are tied to the processing of the rx side and
there's no async way to know when the rx side has released those
resources.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Kees Cook e15f940908 ntb: avoid format string in dev_set_name
Avoid any chance of format string expansion when calling dev_set_name.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
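
The hazard addressed in e15f940908 comes from `dev_set_name` taking a printf-style format followed by variadic arguments. A minimal userspace sketch follows; `set_name` is a hypothetical stand-in built on `vsnprintf`, not the kernel function, and the name `"ntb%d"` is an invented example of a string that happens to contain a conversion specifier.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a printf-style naming function. */
static int set_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

static void check_examples(void)
{
	char buf[32];
	const char *name = "ntb%d";	/* contains a conversion specifier */

	/*
	 * Unsafe form, deliberately not executed here:
	 *     set_name(buf, sizeof(buf), name);
	 * It would treat "%d" as a conversion with no matching argument.
	 */

	/* Safe form: the name is data, copied verbatim through "%s". */
	set_name(buf, sizeof(buf), "%s", name);
	assert(strcmp(buf, "ntb%d") == 0);
}
```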
Allen Hubbe 30a4bb1e5a NTB: Fix dereference before check
Remove early dereference of a pointer that is checked later in the code.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Allen Hubbe 8c9edf63e7 NTB: Fix zero size or integer overflow in ntb_set_mw
A plain 32 bit integer will overflow for values over 4GiB.

Change the plain integer size to the appropriate size type in
ntb_set_mw.  Change the type of the size parameter and two local
variables used for size.

Even if there is no overflow, a size of zero is invalid here.

Reported-by: Juyoung Jung <jjung@micron.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
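
The two failure modes named in 8c9edf63e7 can be shown with a small userspace sketch. The function names are hypothetical and the types are chosen for illustration (`uint32_t` for the narrow variable, `uint64_t` for the wide one); the driver's actual types and signature are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* What a 32-bit variable retains when handed a 64-bit size. */
static uint32_t size_as_u32(uint64_t size)
{
	return (uint32_t)size;	/* the high 32 bits are discarded */
}

/* Hypothetical setter: keeps the full width and rejects a zero size. */
static int set_mw_size(uint64_t size, uint64_t *out)
{
	if (size == 0)
		return -EINVAL;
	*out = size;
	return 0;
}

static void check_examples(void)
{
	const uint64_t four_gib = 4ULL << 30;
	uint64_t out = 0;

	/* Truncation silently turns 4 GiB into 0 and 4 GiB + 4 KiB into 4 KiB. */
	assert(size_as_u32(four_gib) == 0);
	assert(size_as_u32(four_gib + 4096) == 4096);

	/* The wide type keeps the value, and zero is refused explicitly. */
	assert(set_mw_size(four_gib, &out) == 0 && out == four_gib);
	assert(set_mw_size(0, &out) == -EINVAL);
}
```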
Allen Hubbe 8b5a22d8f1 NTB: Schedule to receive on QP link up
Schedule to receive on QP link up, to make sure that the doorbell is
properly cleared for interrupts.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Dave Jiang 260bee9451 NTB: Fix oops in debugfs when transport is half-up
When the remote side is not up, we do not have all the context for the
transport, and that causes a NULL pointer access. Have the debugfs reads
check whether the transport is up before accessing it.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Dave Jiang c8650fd03d NTB: Fix transport stats for multiple devices
Currently the debugfs does not have files for all NTB transport queue
pairs.  When there are multiple NTBs present in a system, the QP names
of the last transport clobber the names of previously added transport
QPs.  Only the last added QPs can be observed via debugfs.

Create a directory per NTB transport to associate the QPs with that
transport.  Name the directory the same as the PCI device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:21 -04:00
Allen Hubbe da2e5ae561 NTB: Fix ntb_transport out-of-order RX update
It was possible for a synchronous update of the RX index in the error
case to get ahead of the asynchronous RX index update in the normal
case.  Change the RX processing to preserve an RX completion order.

There were two error cases.  First, if a buffer is not present to
receive data, there would be no queue entry to preserve the RX
completion order.  Instead of dropping the RX frame, leave the RX frame
in the ring.  Schedule RX processing when RX entries are enqueued, in
case there are RX frames waiting in the ring to be received.

Second, if a buffer is too small to receive data, drop the frame in the
ring, mark the RX entry as done, and indicate the error in the RX entry
length.  Check for a negative length in the receive callback in
ntb_netdev, and count occurrences as rx_length_errors.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:21 -04:00
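
The in-order completion idea in da2e5ae561 can be sketched as follows: entries may finish out of order, but they are handed to the upper layer only from the head of the queue, and only while the head entry is done. The structures and names below (`rx_ring`, `rx_mark_done`, `rx_complete_in_order`) are hypothetical simplifications for illustration and do not mirror the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_LEN 4

struct rx_entry {
	bool done;		/* copy finished, by cpu or by dma */
};

struct rx_ring {
	struct rx_entry entries[RING_LEN];
	size_t head;		/* oldest entry not yet delivered */
};

/* Called when the copy for entry idx finishes, in any order. */
static void rx_mark_done(struct rx_ring *r, size_t idx)
{
	r->entries[idx % RING_LEN].done = true;
}

/* Deliver finished entries strictly from the head; returns how many. */
static size_t rx_complete_in_order(struct rx_ring *r)
{
	size_t n = 0;

	while (n < RING_LEN && r->entries[r->head % RING_LEN].done) {
		r->entries[r->head % RING_LEN].done = false;
		r->head++;
		n++;
	}
	return n;
}

static void check_examples(void)
{
	struct rx_ring ring = { 0 };

	/* Entry 1 (say, a small frame) finishes before entry 0. */
	rx_mark_done(&ring, 1);
	assert(rx_complete_in_order(&ring) == 0);	/* head not done yet */

	/* Once entry 0 finishes, both are delivered, in order. */
	rx_mark_done(&ring, 0);
	assert(rx_complete_in_order(&ring) == 2);
	assert(ring.head == 2);
}
```

With completion gated this way, an early-finishing entry never overtakes an older one, so a blocking wait on the slower copy is not needed to keep the order.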
Dave Jiang bf44fe4671 NTB: Add split BAR output for debugfs stats
When split BAR is enabled, the driver needs to dump out the split BAR
registers rather than the original 64bit BAR registers.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:32 -04:00
Dave Jiang fd839bf884 NTB: Change WARN_ON_ONCE to pr_warn_once on unsafe
The unsafe doorbell and scratchpad access should display a reason when
WARN is called.  Otherwise we get a stack dump without any explanation.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:30 -04:00
Dave Jiang 7eb387813d NTB: Print driver name and version in module init
Print out the driver name and version to indicate what is being loaded.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:28 -04:00
Dave Jiang 9891417de8 NTB: Increase transport MTU to 64k from 16k
Benchmarking showed a significant performance increase with the MTU size
set to 64k instead of 16k.  Change the driver default to 64k.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:27 -04:00
Dave Jiang 2f887b9a44 NTB: Rename Intel code names to platform names
Instead of using the platform code names, use the correct platform names
to identify the respective Intel NTB hardware.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:25 -04:00
Dave Jiang a41ef053f7 NTB: Default to CPU memcpy for performance
Disable DMA usage by default, since the CPU provides much better
performance with write combining.  Provide a module parameter to enable
DMA usage when offloading the memcpy is preferred.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:24 -04:00
Dave Jiang 06917f7535 NTB: Improve performance with write combining
Changing the memory window BAR mappings to write combining significantly
boosts the performance.  We will also use memcpy that uses non-temporal
store, which showed performance improvement when doing non-cached
memcpys.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:21 -04:00
Allen Hubbe 0e041fb536 NTB: Use NUMA memory in Intel driver
Allocate memory for the NUMA node of the NTB device.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:19 -04:00
Allen Hubbe 1199aa6126 NTB: Use NUMA memory and DMA chan in transport
Allocate memory and request the DMA channel for the same NUMA node as
the NTB device.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:33 -04:00
Allen Hubbe 2876228941 NTB: Rate limit ntb_qp_link_work
When the ntb transport is connecting and waiting for the peer, the debug
console receives lots of debug level messages about the remote qp link
status being down.  Rate limit those messages.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:30 -04:00