mirror of
https://github.com/ukui/kernel.git
synced 2026-03-09 10:07:04 -07:00
Merge branch 'af_xdp-common-alloc'
Björn Töpel says:
====================
Overview
========
Driver adoption for AF_XDP has been slow. The amount of code required
to properly support AF_XDP is substantial and the driver/core APIs are
vague or even non-existent. Drivers have to manually adjust data
offsets and update AF_XDP handles differently for different modes
(aligned/unaligned).
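To illustrate the per-mode bookkeeping described above, here is a
minimal user-space sketch of folding a data offset into an AF_XDP
handle. It assumes that aligned mode adds the offset directly to the
handle, while unaligned mode carries it in the upper 16 bits. The
helper name and the shift constant are hypothetical stand-ins, not
kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: unaligned mode keeps the offset in the upper 16 bits. */
#define SKETCH_UNALIGNED_OFFSET_SHIFT 48

/* Hypothetical helper: fold a data offset into an AF_XDP handle. */
static uint64_t sketch_adjust_handle(uint64_t handle, uint64_t offset,
				     bool unaligned)
{
	if (unaligned)
		return handle + (offset << SKETCH_UNALIGNED_OFFSET_SHIFT);
	return handle + offset;
}
```

Every zero-copy driver had to carry some variant of this branch, which
is the kind of duplication the series sets out to remove.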
This series attempts to improve the situation by introducing an AF_XDP
buffer allocation API. The implementation is based on a single core
(single producer/consumer) buffer pool for the AF_XDP UMEM.
A buffer is allocated using the xsk_buff_alloc() function, and
returned using xsk_buff_free(). If a buffer is disassociated from the
pool, e.g. when a buffer is passed to an AF_XDP socket, the buffer is
said to be released. Currently, the release function is only used by
the AF_XDP internals and is not visible to the driver.
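The alloc/free/release lifecycle can be pictured with a toy,
single-threaded pool in user space. This is only a sketch under the
single producer/consumer assumption stated above: every sketch_* name,
the fixed pool size and the free-list layout are hypothetical and do
not mirror the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_POOL_SIZE 4

/* Hypothetical stand-in for a pool buffer; not the kernel's xdp_buff. */
struct sketch_buff {
	uint64_t addr;	/* offset of this chunk inside the UMEM */
};

struct sketch_pool {
	struct sketch_buff bufs[SKETCH_POOL_SIZE];
	struct sketch_buff *free_list[SKETCH_POOL_SIZE];
	unsigned int free_cnt;		/* buffers currently available */
	unsigned int released_cnt;	/* buffers handed over to a socket */
};

static void sketch_pool_init(struct sketch_pool *pool, uint64_t chunk_size)
{
	unsigned int i;

	for (i = 0; i < SKETCH_POOL_SIZE; i++) {
		pool->bufs[i].addr = (uint64_t)i * chunk_size;
		pool->free_list[i] = &pool->bufs[i];
	}
	pool->free_cnt = SKETCH_POOL_SIZE;
	pool->released_cnt = 0;
}

/* Take a buffer from the pool, or NULL when the pool is empty. */
static struct sketch_buff *sketch_buff_alloc(struct sketch_pool *pool)
{
	if (pool->free_cnt == 0)
		return NULL;
	return pool->free_list[--pool->free_cnt];
}

/* Return a buffer to the pool so it can be allocated again. */
static void sketch_buff_free(struct sketch_pool *pool, struct sketch_buff *buf)
{
	pool->free_list[pool->free_cnt++] = buf;
}

/* Release: the buffer leaves the pool's control (e.g. goes to a socket). */
static void sketch_buff_release(struct sketch_pool *pool,
				struct sketch_buff *buf)
{
	(void)buf;
	pool->released_cnt++;
}
```

A freed buffer becomes available to the next allocation, whereas a
released buffer does not return to the free list in this sketch.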
Drivers using this API should register the XDP memory model with the
new MEM_TYPE_XSK_BUFF_POOL type, which will supersede the
MEM_TYPE_ZERO_COPY type.
The buffer type is struct xdp_buff, and follows the lifetime of
regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to
a NAPI context. In other words, the API is not replacing xdp_frames.
DMA mapping/syncing is folded into the buffer handling as well.
@JeffK The Intel driver changes should go through the bpf-next tree,
and not your regular Intel tree, since multiple (non-Intel)
drivers are affected.
The outline of the series is as follows:
Patch 1 is a fix for xsk_umem_xdp_frame_sz().
Patches 2 to 4 are restructurings/cleanups. The XSKMAP implementation is
moved to net/xdp/. Functions/defines/enums that are only used by the
AF_XDP internals are moved from the global include/net/xdp_sock.h to
net/xdp/xsk.h. We are also introducing a new "driver include file",
include/net/xdp_sock_drv.h, which is the only file NIC driver
developers adding AF_XDP zero-copy support should care about.
Patch 5 adds the new API, and migrates the "copy-mode"/skb-mode AF_XDP
path to the new API.
Patches 6 to 11 migrate the existing zero-copy drivers to the new API.
Patch 12 removes the MEM_TYPE_ZERO_COPY memory type, and the "handle"
member of struct xdp_buff.
Patch 13 simplifies the xdp_return_{frame,frame_rx_napi,buff}
functions.
Patch 14 is a performance patch, where some functions are inlined.
Finally, patch 15 updates the MAINTAINERS file to correctly mirror the
new file layout.
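As a concrete picture of what the driver migration buys, the drivers
in this series stop open-coding the Rx buffer length as
"chunk_size_nohr - XDP_PACKET_HEADROOM" and call a getter instead. The
user-space sketch below mimics that getter. The struct, the function
name and the headroom value of 256 bytes are assumptions made for
illustration, not the kernel definitions.

```c
#include <stdint.h>

/* Assumption: stand-in for the kernel's XDP_PACKET_HEADROOM. */
#define SKETCH_XDP_PACKET_HEADROOM 256u

/* Hypothetical, trimmed-down view of the UMEM. */
struct sketch_umem {
	uint32_t chunk_size_nohr;	/* chunk size minus user headroom */
};

/* One shared getter replaces the arithmetic repeated in every driver. */
static uint32_t sketch_get_rx_frame_size(const struct sketch_umem *umem)
{
	return umem->chunk_size_nohr - SKETCH_XDP_PACKET_HEADROOM;
}
```

With the computation behind a single helper, a later change to how
headroom is accounted for would only touch the core, not each driver.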
Note that this series removes the "handle" member from struct
xdp_buff, which reduces the xdp_buff size.
After this series, the diff stat of drivers/net/ is:
27 files changed, 419 insertions(+), 1288 deletions(-)
This series is a first step toward simplifying the driver side of
AF_XDP. I think more of the AF_XDP logic can be moved from the drivers
to the AF_XDP core, e.g. the "need wakeup" set/clear functionality.
Statistics when allocation fails can now be added to the socket
statistics via the XDP_STATISTICS getsockopt(). This will be added in
a follow-up series.
Performance
===========
As a nice side effect, performance is up a bit as well.
* i40e: 3% higher pps for rxdrop, zero-copy, aligned and unaligned
(40 GbE, 64B packets).
* mlx5: RX +0.8 Mpps, TX +0.4 Mpps
Changelog
=========
v4->v5:
* Fix various kdoc and GCC warnings (W=1). (Jakub)
v3->v4:
* mlx5: Remove unused variable num_xsk_frames. (Jakub)
* i40e: Made i40e_fd_handle_status() static. (kbuild test robot)
v2->v3:
* Added xsk_umem_xdp_frame_sz() fix to the series. (Björn)
* Initialize struct xdp_buff member frame_sz. (Björn)
* Add API to query the DMA address of a frame. (Maxim)
* Do DMA sync for CPU till the end of the frame to handle possible
growth (frame_sz). (Maxim)
* mlx5: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use
xsk_buff API for DMA sync on TX, add performance numbers. (Maxim)
v1->v2:
* mlx5: Fix DMA address handling, set XDP metadata to invalid. (Maxim)
* ixgbe: Fixed xdp_buff data_end update. (Björn)
* Swapped SoBs in patch 4. (Maxim)
rfc->v1:
* Fixed build errors/warnings for m68k and riscv. (kbuild test
robot)
* Added headroom/chunk size getter. (Maxim/Björn)
* mlx5: Put back the sanity check for XSK params, use XSK API to get
the total headroom size. (Maxim)
* Fixed spelling in commit message. (Björn)
* Make sure xp_validate_desc() is inlined for Tx perf. (Maxim)
* Sorted file entries. (Joe)
* Added xdp_return_{frame,frame_rx_napi,buff} simplification (Björn)
Thanks for all the comments/input/help!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
@@ -18443,8 +18443,12 @@ R: Jonathan Lemon <jonathan.lemon@gmail.com>
 L: netdev@vger.kernel.org
 L: bpf@vger.kernel.org
 S: Maintained
-F: kernel/bpf/xskmap.c
+F: include/net/xdp_sock*
+F: include/net/xsk_buffer_pool.h
+F: include/uapi/linux/if_xdp.h
 F: net/xdp/
+F: samples/bpf/xdpsock*
+F: tools/lib/bpf/xsk*

 XEN BLOCK SUBSYSTEM
 M: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
@@ -11,7 +11,7 @@
 #include "i40e_diag.h"
 #include "i40e_xsk.h"
 #include <net/udp_tunnel.h>
-#include <net/xdp_sock.h>
+#include <net/xdp_sock_drv.h>
 /* All i40e tracepoints are defined by the include below, which
  * must be included exactly once across the whole kernel with
  * CREATE_TRACE_POINTS defined
@@ -3260,26 +3260,31 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring)
 	if (ring->vsi->type == I40E_VSI_MAIN)
 		xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq);

+	kfree(ring->rx_bi);
 	ring->xsk_umem = i40e_xsk_umem(ring);
 	if (ring->xsk_umem) {
-		ring->rx_buf_len = ring->xsk_umem->chunk_size_nohr -
-				   XDP_PACKET_HEADROOM;
+		ret = i40e_alloc_rx_bi_zc(ring);
+		if (ret)
+			return ret;
+		ring->rx_buf_len = xsk_umem_get_rx_frame_size(ring->xsk_umem);
 		/* For AF_XDP ZC, we disallow packets to span on
 		 * multiple buffers, thus letting us skip that
 		 * handling in the fast-path.
 		 */
 		chain_len = 1;
-		ring->zca.free = i40e_zca_free;
 		ret = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
-						 MEM_TYPE_ZERO_COPY,
-						 &ring->zca);
+						 MEM_TYPE_XSK_BUFF_POOL,
+						 NULL);
 		if (ret)
 			return ret;
 		dev_info(&vsi->back->pdev->dev,
-			 "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n",
+			 "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
 			 ring->queue_index);

 	} else {
+		ret = i40e_alloc_rx_bi(ring);
+		if (ret)
+			return ret;
 		ring->rx_buf_len = vsi->rx_buf_len;
 		if (ring->vsi->type == I40E_VSI_MAIN) {
 			ret = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
@@ -3344,9 +3349,12 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring)
 	ring->tail = hw->hw_addr + I40E_QRX_TAIL(pf_q);
 	writel(0, ring->tail);

-	ok = ring->xsk_umem ?
-	     i40e_alloc_rx_buffers_zc(ring, I40E_DESC_UNUSED(ring)) :
-	     !i40e_alloc_rx_buffers(ring, I40E_DESC_UNUSED(ring));
+	if (ring->xsk_umem) {
+		xsk_buff_set_rxq_info(ring->xsk_umem, &ring->xdp_rxq);
+		ok = i40e_alloc_rx_buffers_zc(ring, I40E_DESC_UNUSED(ring));
+	} else {
+		ok = !i40e_alloc_rx_buffers(ring, I40E_DESC_UNUSED(ring));
+	}
 	if (!ok) {
 		/* Log this in case the user has forgotten to give the kernel
 		 * any buffers, even later in the application.
@@ -521,28 +521,29 @@ int i40e_add_del_fdir(struct i40e_vsi *vsi,
 /**
  * i40e_fd_handle_status - check the Programming Status for FD
  * @rx_ring: the Rx ring for this descriptor
- * @rx_desc: the Rx descriptor for programming Status, not a packet descriptor.
+ * @qword0_raw: qword0
+ * @qword1: qword1 after le_to_cpu
  * @prog_id: the id originally used for programming
  *
  * This is used to verify if the FD programming or invalidation
  * requested by SW to the HW is successful or not and take actions accordingly.
  **/
-void i40e_fd_handle_status(struct i40e_ring *rx_ring,
-			   union i40e_rx_desc *rx_desc, u8 prog_id)
+static void i40e_fd_handle_status(struct i40e_ring *rx_ring, u64 qword0_raw,
+				  u64 qword1, u8 prog_id)
 {
 	struct i40e_pf *pf = rx_ring->vsi->back;
 	struct pci_dev *pdev = pf->pdev;
+	struct i40e_32b_rx_wb_qw0 *qw0;
 	u32 fcnt_prog, fcnt_avail;
 	u32 error;
-	u64 qw;

-	qw = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
-	error = (qw & I40E_RX_PROG_STATUS_DESC_QW1_ERROR_MASK) >>
+	qw0 = (struct i40e_32b_rx_wb_qw0 *)&qword0_raw;
+	error = (qword1 & I40E_RX_PROG_STATUS_DESC_QW1_ERROR_MASK) >>
 		I40E_RX_PROG_STATUS_DESC_QW1_ERROR_SHIFT;

 	if (error == BIT(I40E_RX_PROG_STATUS_DESC_FD_TBL_FULL_SHIFT)) {
-		pf->fd_inv = le32_to_cpu(rx_desc->wb.qword0.hi_dword.fd_id);
-		if ((rx_desc->wb.qword0.hi_dword.fd_id != 0) ||
+		pf->fd_inv = le32_to_cpu(qw0->hi_dword.fd_id);
+		if (qw0->hi_dword.fd_id != 0 ||
 		    (I40E_DEBUG_FD & pf->hw.debug_mask))
 			dev_warn(&pdev->dev, "ntuple filter loc = %d, could not be added\n",
 				 pf->fd_inv);
@@ -560,7 +561,7 @@ void i40e_fd_handle_status(struct i40e_ring *rx_ring,
 		/* store the current atr filter count */
 		pf->fd_atr_cnt = i40e_get_current_atr_cnt(pf);

-		if ((rx_desc->wb.qword0.hi_dword.fd_id == 0) &&
+		if (qw0->hi_dword.fd_id == 0 &&
 		    test_bit(__I40E_FD_SB_AUTO_DISABLED, pf->state)) {
 			/* These set_bit() calls aren't atomic with the
 			 * test_bit() here, but worse case we potentially
@@ -589,7 +590,7 @@ void i40e_fd_handle_status(struct i40e_ring *rx_ring,
 	} else if (error == BIT(I40E_RX_PROG_STATUS_DESC_NO_FD_ENTRY_SHIFT)) {
 		if (I40E_DEBUG_FD & pf->hw.debug_mask)
 			dev_info(&pdev->dev, "ntuple filter fd_id = %d, could not be removed\n",
-				 rx_desc->wb.qword0.hi_dword.fd_id);
+				 qw0->hi_dword.fd_id);
 	}
 }

@@ -1195,6 +1196,11 @@ clear_counts:
 	rc->total_packets = 0;
 }

+static struct i40e_rx_buffer *i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx)
+{
+	return &rx_ring->rx_bi[idx];
+}
+
 /**
  * i40e_reuse_rx_page - page flip buffer and store it back on the ring
  * @rx_ring: rx descriptor ring to store buffers on
@@ -1208,7 +1214,7 @@ static void i40e_reuse_rx_page(struct i40e_ring *rx_ring,
 	struct i40e_rx_buffer *new_buff;
 	u16 nta = rx_ring->next_to_alloc;

-	new_buff = &rx_ring->rx_bi[nta];
+	new_buff = i40e_rx_bi(rx_ring, nta);

 	/* update, and store next to alloc */
 	nta++;
@@ -1227,29 +1233,10 @@ static void i40e_reuse_rx_page(struct i40e_ring *rx_ring,
 }

 /**
- * i40e_rx_is_programming_status - check for programming status descriptor
- * @qw: qword representing status_error_len in CPU ordering
- *
- * The value of in the descriptor length field indicate if this
- * is a programming status descriptor for flow director or FCoE
- * by the value of I40E_RX_PROG_STATUS_DESC_LENGTH, otherwise
- * it is a packet descriptor.
- **/
-static inline bool i40e_rx_is_programming_status(u64 qw)
-{
-	/* The Rx filter programming status and SPH bit occupy the same
-	 * spot in the descriptor. Since we don't support packet split we
-	 * can just reuse the bit as an indication that this is a
-	 * programming status descriptor.
-	 */
-	return qw & I40E_RXD_QW1_LENGTH_SPH_MASK;
-}
-
-/**
- * i40e_clean_programming_status - try clean the programming status descriptor
+ * i40e_clean_programming_status - clean the programming status descriptor
  * @rx_ring: the rx ring that has this descriptor
- * @rx_desc: the rx descriptor written back by HW
- * @qw: qword representing status_error_len in CPU ordering
+ * @qword0_raw: qword0
+ * @qword1: qword1 representing status_error_len in CPU ordering
  *
  * Flow director should handle FD_FILTER_STATUS to check its filter programming
  * status being successful or not and take actions accordingly. FCoE should
@@ -1257,34 +1244,16 @@ static inline bool i40e_rx_is_programming_status(u64 qw)
  *
  * Returns an i40e_rx_buffer to reuse if the cleanup occurred, otherwise NULL.
  **/
-struct i40e_rx_buffer *i40e_clean_programming_status(
-	struct i40e_ring *rx_ring,
-	union i40e_rx_desc *rx_desc,
-	u64 qw)
+void i40e_clean_programming_status(struct i40e_ring *rx_ring, u64 qword0_raw,
+				   u64 qword1)
 {
-	struct i40e_rx_buffer *rx_buffer;
-	u32 ntc;
 	u8 id;

-	if (!i40e_rx_is_programming_status(qw))
-		return NULL;
-
-	ntc = rx_ring->next_to_clean;
-
-	/* fetch, update, and store next to clean */
-	rx_buffer = &rx_ring->rx_bi[ntc++];
-	ntc = (ntc < rx_ring->count) ? ntc : 0;
-	rx_ring->next_to_clean = ntc;
-
-	prefetch(I40E_RX_DESC(rx_ring, ntc));
-
-	id = (qw & I40E_RX_PROG_STATUS_DESC_QW1_PROGID_MASK) >>
+	id = (qword1 & I40E_RX_PROG_STATUS_DESC_QW1_PROGID_MASK) >>
 		  I40E_RX_PROG_STATUS_DESC_QW1_PROGID_SHIFT;

 	if (id == I40E_RX_PROG_STATUS_DESC_FD_FILTER_STATUS)
-		i40e_fd_handle_status(rx_ring, rx_desc, id);
-
-	return rx_buffer;
+		i40e_fd_handle_status(rx_ring, qword0_raw, qword1, id);
 }

 /**
@@ -1336,13 +1305,25 @@ err:
 	return -ENOMEM;
 }

+int i40e_alloc_rx_bi(struct i40e_ring *rx_ring)
+{
+	unsigned long sz = sizeof(*rx_ring->rx_bi) * rx_ring->count;
+
+	rx_ring->rx_bi = kzalloc(sz, GFP_KERNEL);
+	return rx_ring->rx_bi ? 0 : -ENOMEM;
+}
+
+static void i40e_clear_rx_bi(struct i40e_ring *rx_ring)
+{
+	memset(rx_ring->rx_bi, 0, sizeof(*rx_ring->rx_bi) * rx_ring->count);
+}
+
 /**
  * i40e_clean_rx_ring - Free Rx buffers
  * @rx_ring: ring to be cleaned
  **/
 void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
 {
-	unsigned long bi_size;
 	u16 i;

 	/* ring already cleared, nothing to do */
@@ -1361,7 +1342,7 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)

 	/* Free all the Rx ring sk_buffs */
 	for (i = 0; i < rx_ring->count; i++) {
-		struct i40e_rx_buffer *rx_bi = &rx_ring->rx_bi[i];
+		struct i40e_rx_buffer *rx_bi = i40e_rx_bi(rx_ring, i);

 		if (!rx_bi->page)
 			continue;
@@ -1388,8 +1369,10 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
 	}

 skip_free:
-	bi_size = sizeof(struct i40e_rx_buffer) * rx_ring->count;
-	memset(rx_ring->rx_bi, 0, bi_size);
+	if (rx_ring->xsk_umem)
+		i40e_clear_rx_bi_zc(rx_ring);
+	else
+		i40e_clear_rx_bi(rx_ring);

 	/* Zero out the descriptor ring */
 	memset(rx_ring->desc, 0, rx_ring->size);
@@ -1430,15 +1413,7 @@ void i40e_free_rx_resources(struct i40e_ring *rx_ring)
 int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)
 {
 	struct device *dev = rx_ring->dev;
-	int err = -ENOMEM;
-	int bi_size;
-
-	/* warn if we are about to overwrite the pointer */
-	WARN_ON(rx_ring->rx_bi);
-	bi_size = sizeof(struct i40e_rx_buffer) * rx_ring->count;
-	rx_ring->rx_bi = kzalloc(bi_size, GFP_KERNEL);
-	if (!rx_ring->rx_bi)
-		goto err;
+	int err;

 	u64_stats_init(&rx_ring->syncp);

@@ -1451,7 +1426,7 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)
 	if (!rx_ring->desc) {
 		dev_info(dev, "Unable to allocate memory for the Rx descriptor ring, size=%d\n",
 			 rx_ring->size);
-		goto err;
+		return -ENOMEM;
 	}

 	rx_ring->next_to_alloc = 0;
@@ -1463,16 +1438,12 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)
 		err = xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev,
 				       rx_ring->queue_index);
 		if (err < 0)
-			goto err;
+			return err;
 	}

 	rx_ring->xdp_prog = rx_ring->vsi->xdp_prog;

 	return 0;
-err:
-	kfree(rx_ring->rx_bi);
-	rx_ring->rx_bi = NULL;
-	return err;
 }

 /**
@@ -1592,7 +1563,7 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
 		return false;

 	rx_desc = I40E_RX_DESC(rx_ring, ntu);
-	bi = &rx_ring->rx_bi[ntu];
+	bi = i40e_rx_bi(rx_ring, ntu);

 	do {
 		if (!i40e_alloc_mapped_page(rx_ring, bi))
@@ -1614,7 +1585,7 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
 		ntu++;
 		if (unlikely(ntu == rx_ring->count)) {
 			rx_desc = I40E_RX_DESC(rx_ring, 0);
-			bi = rx_ring->rx_bi;
+			bi = i40e_rx_bi(rx_ring, 0);
 			ntu = 0;
 		}

@@ -1981,7 +1952,7 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
 {
 	struct i40e_rx_buffer *rx_buffer;

-	rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean];
+	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
 	prefetchw(rx_buffer->page);

 	/* we are reusing so sync this buffer for CPU use */
@@ -2382,9 +2353,12 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 		 */
 		dma_rmb();

-		rx_buffer = i40e_clean_programming_status(rx_ring, rx_desc,
-							  qword);
-		if (unlikely(rx_buffer)) {
+		if (i40e_rx_is_programming_status(qword)) {
+			i40e_clean_programming_status(rx_ring,
+						      rx_desc->raw.qword[0],
+						      qword);
+			rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
+			i40e_inc_ntc(rx_ring);
 			i40e_reuse_rx_page(rx_ring, rx_buffer);
 			cleaned_count++;
 			continue;
@@ -296,17 +296,9 @@ struct i40e_tx_buffer {

 struct i40e_rx_buffer {
 	dma_addr_t dma;
-	union {
-		struct {
-			struct page *page;
-			__u32 page_offset;
-			__u16 pagecnt_bias;
-		};
-		struct {
-			void *addr;
-			u64 handle;
-		};
-	};
+	struct page *page;
+	__u32 page_offset;
+	__u16 pagecnt_bias;
 };

 struct i40e_queue_stats {
@@ -358,6 +350,7 @@ struct i40e_ring {
 	union {
 		struct i40e_tx_buffer *tx_bi;
 		struct i40e_rx_buffer *rx_bi;
+		struct xdp_buff **rx_bi_zc;
 	};
 	DECLARE_BITMAP(state, __I40E_RING_STATE_NBITS);
 	u16 queue_index; /* Queue number of ring */
@@ -419,7 +412,6 @@ struct i40e_ring {
 	struct i40e_channel *ch;
 	struct xdp_rxq_info xdp_rxq;
 	struct xdp_umem *xsk_umem;
-	struct zero_copy_allocator zca; /* ZC allocator anchor */
 } ____cacheline_internodealigned_in_smp;

 static inline bool ring_uses_build_skb(struct i40e_ring *ring)
@@ -495,6 +487,7 @@ int __i40e_maybe_stop_tx(struct i40e_ring *tx_ring, int size);
 bool __i40e_chk_linearize(struct sk_buff *skb);
 int i40e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
 		  u32 flags);
+int i40e_alloc_rx_bi(struct i40e_ring *rx_ring);

 /**
  * i40e_get_head - Retrieve head from head writeback
@@ -4,13 +4,9 @@
 #ifndef I40E_TXRX_COMMON_
 #define I40E_TXRX_COMMON_

-void i40e_fd_handle_status(struct i40e_ring *rx_ring,
-			   union i40e_rx_desc *rx_desc, u8 prog_id);
 int i40e_xmit_xdp_tx_ring(struct xdp_buff *xdp, struct i40e_ring *xdp_ring);
-struct i40e_rx_buffer *i40e_clean_programming_status(
-	struct i40e_ring *rx_ring,
-	union i40e_rx_desc *rx_desc,
-	u64 qw);
+void i40e_clean_programming_status(struct i40e_ring *rx_ring, u64 qword0_raw,
+				   u64 qword1);
 void i40e_process_skb_fields(struct i40e_ring *rx_ring,
 			     union i40e_rx_desc *rx_desc, struct sk_buff *skb);
 void i40e_xdp_ring_update_tail(struct i40e_ring *xdp_ring);
@@ -84,6 +80,38 @@ static inline void i40e_arm_wb(struct i40e_ring *tx_ring,
 	}
 }

+/**
+ * i40e_rx_is_programming_status - check for programming status descriptor
+ * @qword1: qword1 representing status_error_len in CPU ordering
+ *
+ * The value of in the descriptor length field indicate if this
+ * is a programming status descriptor for flow director or FCoE
+ * by the value of I40E_RX_PROG_STATUS_DESC_LENGTH, otherwise
+ * it is a packet descriptor.
+ **/
+static inline bool i40e_rx_is_programming_status(u64 qword1)
+{
+	/* The Rx filter programming status and SPH bit occupy the same
+	 * spot in the descriptor. Since we don't support packet split we
+	 * can just reuse the bit as an indication that this is a
+	 * programming status descriptor.
+	 */
+	return qword1 & I40E_RXD_QW1_LENGTH_SPH_MASK;
+}
+
+/**
+ * i40e_inc_ntc: Advance the next_to_clean index
+ * @rx_ring: Rx ring
+ **/
+static inline void i40e_inc_ntc(struct i40e_ring *rx_ring)
+{
+	u32 ntc = rx_ring->next_to_clean + 1;
+
+	ntc = (ntc < rx_ring->count) ? ntc : 0;
+	rx_ring->next_to_clean = ntc;
+	prefetch(I40E_RX_DESC(rx_ring, ntc));
+}
+
 void i40e_xsk_clean_rx_ring(struct i40e_ring *rx_ring);
 void i40e_xsk_clean_tx_ring(struct i40e_ring *tx_ring);
 bool i40e_xsk_any_rx_ring_enabled(struct i40e_vsi *vsi);
@@ -689,7 +689,7 @@ union i40e_32byte_rx_desc {
 		__le64 rsvd2;
 	} read;
 	struct {
-		struct {
+		struct i40e_32b_rx_wb_qw0 {
 			struct {
 				union {
 					__le16 mirroring_status;
@@ -727,6 +727,9 @@ union i40e_32byte_rx_desc {
 			} hi_dword;
 		} qword3;
 	} wb; /* writeback */
+	struct {
+		u64 qword[4];
+	} raw;
 };

 enum i40e_rx_desc_status_bits {
File diff suppressed because it is too large
@@ -12,12 +12,13 @@ int i40e_queue_pair_disable(struct i40e_vsi *vsi, int queue_pair);
 int i40e_queue_pair_enable(struct i40e_vsi *vsi, int queue_pair);
 int i40e_xsk_umem_setup(struct i40e_vsi *vsi, struct xdp_umem *umem,
 			u16 qid);
-void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle);
 bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
 int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);

 bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi,
 			   struct i40e_ring *tx_ring, int napi_budget);
 int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
+int i40e_alloc_rx_bi_zc(struct i40e_ring *rx_ring);
+void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);

 #endif /* _I40E_XSK_H_ */
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019, Intel Corporation. */

+#include <net/xdp_sock_drv.h>
 #include "ice_base.h"
 #include "ice_dcb_lib.h"

@@ -308,24 +309,23 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
 		if (ring->xsk_umem) {
 			xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq);

-			ring->rx_buf_len = ring->xsk_umem->chunk_size_nohr -
-					   XDP_PACKET_HEADROOM;
+			ring->rx_buf_len =
+				xsk_umem_get_rx_frame_size(ring->xsk_umem);
 			/* For AF_XDP ZC, we disallow packets to span on
 			 * multiple buffers, thus letting us skip that
 			 * handling in the fast-path.
 			 */
 			chain_len = 1;
-			ring->zca.free = ice_zca_free;
 			err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
-							 MEM_TYPE_ZERO_COPY,
-							 &ring->zca);
+							 MEM_TYPE_XSK_BUFF_POOL,
+							 NULL);
 			if (err)
 				return err;
+			xsk_buff_set_rxq_info(ring->xsk_umem, &ring->xdp_rxq);

-			dev_info(ice_pf_to_dev(vsi->back), "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n",
+			dev_info(ice_pf_to_dev(vsi->back), "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
 				 ring->q_index);
 		} else {
-			ring->zca.free = NULL;
 			if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))
 				/* coverity[check_return] */
 				xdp_rxq_info_reg(&ring->xdp_rxq,
@@ -426,7 +426,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
 	writel(0, ring->tail);

 	err = ring->xsk_umem ?
-	      ice_alloc_rx_bufs_slow_zc(ring, ICE_DESC_UNUSED(ring)) :
+	      ice_alloc_rx_bufs_zc(ring, ICE_DESC_UNUSED(ring)) :
 	      ice_alloc_rx_bufs(ring, ICE_DESC_UNUSED(ring));
 	if (err)
 		dev_info(ice_pf_to_dev(vsi->back), "Failed allocate some buffers on %sRx ring %d (pf_q %d)\n",
@@ -155,17 +155,16 @@ struct ice_tx_offload_params {
 };

 struct ice_rx_buf {
-	struct sk_buff *skb;
-	dma_addr_t dma;
 	union {
 		struct {
+			struct sk_buff *skb;
+			dma_addr_t dma;
 			struct page *page;
 			unsigned int page_offset;
 			u16 pagecnt_bias;
 		};
 		struct {
-			void *addr;
-			u64 handle;
+			struct xdp_buff *xdp;
 		};
 	};
 };
@@ -289,7 +288,6 @@ struct ice_ring {
 	struct rcu_head rcu; /* to avoid race on free */
 	struct bpf_prog *xdp_prog;
 	struct xdp_umem *xsk_umem;
-	struct zero_copy_allocator zca;
 	/* CL3 - 3rd cacheline starts here */
 	struct xdp_rxq_info xdp_rxq;
 	/* CLX - the below items are only accessed infrequently and should be
File diff suppressed because it is too large
@@ -10,11 +10,10 @@ struct ice_vsi;

 #ifdef CONFIG_XDP_SOCKETS
 int ice_xsk_umem_setup(struct ice_vsi *vsi, struct xdp_umem *umem, u16 qid);
-void ice_zca_free(struct zero_copy_allocator *zca, unsigned long handle);
 int ice_clean_rx_irq_zc(struct ice_ring *rx_ring, int budget);
 bool ice_clean_tx_irq_zc(struct ice_ring *xdp_ring, int budget);
 int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags);
-bool ice_alloc_rx_bufs_slow_zc(struct ice_ring *rx_ring, u16 count);
+bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count);
 bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi);
 void ice_xsk_clean_rx_ring(struct ice_ring *rx_ring);
 void ice_xsk_clean_xdp_ring(struct ice_ring *xdp_ring);
@@ -27,12 +26,6 @@ ice_xsk_umem_setup(struct ice_vsi __always_unused *vsi,
 	return -EOPNOTSUPP;
 }

-static inline void
-ice_zca_free(struct zero_copy_allocator __always_unused *zca,
-	     unsigned long __always_unused handle)
-{
-}
-
 static inline int
 ice_clean_rx_irq_zc(struct ice_ring __always_unused *rx_ring,
 		    int __always_unused budget)
@@ -48,8 +41,8 @@ ice_clean_tx_irq_zc(struct ice_ring __always_unused *xdp_ring,
 }

 static inline bool
-ice_alloc_rx_bufs_slow_zc(struct ice_ring __always_unused *rx_ring,
-			  u16 __always_unused count)
+ice_alloc_rx_bufs_zc(struct ice_ring __always_unused *rx_ring,
+		     u16 __always_unused count)
 {
 	return false;
 }
@@ -224,17 +224,17 @@ struct ixgbe_tx_buffer {
 };

 struct ixgbe_rx_buffer {
-	struct sk_buff *skb;
-	dma_addr_t dma;
 	union {
 		struct {
+			struct sk_buff *skb;
+			dma_addr_t dma;
 			struct page *page;
 			__u32 page_offset;
 			__u16 pagecnt_bias;
 		};
 		struct {
-			void *addr;
-			u64 handle;
+			bool discard;
+			struct xdp_buff *xdp;
 		};
 	};
 };
@@ -351,7 +351,6 @@ struct ixgbe_ring {
 	};
 	struct xdp_rxq_info xdp_rxq;
 	struct xdp_umem *xsk_umem;
-	struct zero_copy_allocator zca; /* ZC allocator anchor */
 	u16 ring_idx; /* {rx,tx,xdp}_ring back reference idx */
 	u16 rx_buf_len;
 } ____cacheline_internodealigned_in_smp;
@@ -35,7 +35,7 @@
 #include <net/tc_act/tc_mirred.h>
 #include <net/vxlan.h>
 #include <net/mpls.h>
-#include <net/xdp_sock.h>
+#include <net/xdp_sock_drv.h>
 #include <net/xfrm.h>

 #include "ixgbe.h"
@@ -3745,8 +3745,7 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter,

 	/* configure the packet buffer length */
 	if (rx_ring->xsk_umem) {
-		u32 xsk_buf_len = rx_ring->xsk_umem->chunk_size_nohr -
-				  XDP_PACKET_HEADROOM;
+		u32 xsk_buf_len = xsk_umem_get_rx_frame_size(rx_ring->xsk_umem);

 		/* If the MAC support setting RXDCTL.RLPML, the
 		 * SRRCTL[n].BSIZEPKT is set to PAGE_SIZE and
@@ -4093,11 +4092,10 @@ void ixgbe_configure_rx_ring(struct ixgbe_adapter *adapter,
 	xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq);
 	ring->xsk_umem = ixgbe_xsk_umem(adapter, ring);
 	if (ring->xsk_umem) {
-		ring->zca.free = ixgbe_zca_free;
 		WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
-						   MEM_TYPE_ZERO_COPY,
-						   &ring->zca));
-
+						   MEM_TYPE_XSK_BUFF_POOL,
+						   NULL));
+		xsk_buff_set_rxq_info(ring->xsk_umem, &ring->xdp_rxq);
 	} else {
 		WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
 						   MEM_TYPE_PAGE_SHARED, NULL));
@@ -4153,8 +4151,7 @@ void ixgbe_configure_rx_ring(struct ixgbe_adapter *adapter,
 	}

 	if (ring->xsk_umem && hw->mac.type != ixgbe_mac_82599EB) {
-		u32 xsk_buf_len = ring->xsk_umem->chunk_size_nohr -
-				  XDP_PACKET_HEADROOM;
+		u32 xsk_buf_len = xsk_umem_get_rx_frame_size(ring->xsk_umem);

 		rxdctl &= ~(IXGBE_RXDCTL_RLPMLMASK |
 			    IXGBE_RXDCTL_RLPML_EN);
@@ -35,7 +35,7 @@ int ixgbe_xsk_umem_setup(struct ixgbe_adapter *adapter, struct xdp_umem *umem,

 void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle);

-void ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count);
+bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count);
 int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 			  struct ixgbe_ring *rx_ring,
 			  const int budget);
@@ -2,7 +2,7 @@
 /* Copyright(c) 2018 Intel Corporation. */

 #include <linux/bpf_trace.h>
-#include <net/xdp_sock.h>
+#include <net/xdp_sock_drv.h>
 #include <net/xdp.h>

 #include "ixgbe.h"
@@ -20,54 +20,11 @@ struct xdp_umem *ixgbe_xsk_umem(struct ixgbe_adapter *adapter,
 	return xdp_get_umem_from_qid(adapter->netdev, qid);
 }

-static int ixgbe_xsk_umem_dma_map(struct ixgbe_adapter *adapter,
-				  struct xdp_umem *umem)
-{
-	struct device *dev = &adapter->pdev->dev;
-	unsigned int i, j;
-	dma_addr_t dma;
-
-	for (i = 0; i < umem->npgs; i++) {
-		dma = dma_map_page_attrs(dev, umem->pgs[i], 0, PAGE_SIZE,
-					 DMA_BIDIRECTIONAL, IXGBE_RX_DMA_ATTR);
-		if (dma_mapping_error(dev, dma))
-			goto out_unmap;
-
-		umem->pages[i].dma = dma;
-	}
-
-	return 0;
-
-out_unmap:
-	for (j = 0; j < i; j++) {
-		dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE,
-				     DMA_BIDIRECTIONAL, IXGBE_RX_DMA_ATTR);
-		umem->pages[i].dma = 0;
-	}
-
-	return -1;
-}
-
-static void ixgbe_xsk_umem_dma_unmap(struct ixgbe_adapter *adapter,
-				     struct xdp_umem *umem)
-{
-	struct device *dev = &adapter->pdev->dev;
-	unsigned int i;
-
-	for (i = 0; i < umem->npgs; i++) {
-		dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE,
-				     DMA_BIDIRECTIONAL, IXGBE_RX_DMA_ATTR);
-
-		umem->pages[i].dma = 0;
-	}
-}
-
 static int ixgbe_xsk_umem_enable(struct ixgbe_adapter *adapter,
 				 struct xdp_umem *umem,
 				 u16 qid)
 {
 	struct net_device *netdev = adapter->netdev;
-	struct xdp_umem_fq_reuse *reuseq;
 	bool if_running;
 	int err;

@@ -78,13 +35,7 @@ static int ixgbe_xsk_umem_enable(struct ixgbe_adapter *adapter,
 	    qid >= netdev->real_num_tx_queues)
 		return -EINVAL;
 
-	reuseq = xsk_reuseq_prepare(adapter->rx_ring[0]->count);
-	if (!reuseq)
-		return -ENOMEM;
-
-	xsk_reuseq_free(xsk_reuseq_swap(umem, reuseq));
-
-	err = ixgbe_xsk_umem_dma_map(adapter, umem);
+	err = xsk_buff_dma_map(umem, &adapter->pdev->dev, IXGBE_RX_DMA_ATTR);
 	if (err)
 		return err;
 
@@ -124,7 +75,7 @@ static int ixgbe_xsk_umem_disable(struct ixgbe_adapter *adapter, u16 qid)
 		ixgbe_txrx_ring_disable(adapter, qid);
 
 	clear_bit(qid, adapter->af_xdp_zc_qps);
-	ixgbe_xsk_umem_dma_unmap(adapter, umem);
+	xsk_buff_dma_unmap(umem, IXGBE_RX_DMA_ATTR);
 
 	if (if_running)
 		ixgbe_txrx_ring_enable(adapter, qid);
@@ -143,19 +94,14 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
 			    struct ixgbe_ring *rx_ring,
 			    struct xdp_buff *xdp)
 {
-	struct xdp_umem *umem = rx_ring->xsk_umem;
 	int err, result = IXGBE_XDP_PASS;
 	struct bpf_prog *xdp_prog;
 	struct xdp_frame *xdpf;
-	u64 offset;
 	u32 act;
 
 	rcu_read_lock();
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
-	offset = xdp->data - xdp->data_hard_start;
-
-	xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
 
 	switch (act) {
 	case XDP_PASS:
@@ -186,140 +132,16 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
 	return result;
 }
 
-static struct
-ixgbe_rx_buffer *ixgbe_get_rx_buffer_zc(struct ixgbe_ring *rx_ring,
-					unsigned int size)
-{
-	struct ixgbe_rx_buffer *bi;
-
-	bi = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
-
-	/* we are reusing so sync this buffer for CPU use */
-	dma_sync_single_range_for_cpu(rx_ring->dev,
-				      bi->dma, 0,
-				      size,
-				      DMA_BIDIRECTIONAL);
-
-	return bi;
-}
-
-static void ixgbe_reuse_rx_buffer_zc(struct ixgbe_ring *rx_ring,
-				     struct ixgbe_rx_buffer *obi)
-{
-	u16 nta = rx_ring->next_to_alloc;
-	struct ixgbe_rx_buffer *nbi;
-
-	nbi = &rx_ring->rx_buffer_info[rx_ring->next_to_alloc];
-	/* update, and store next to alloc */
-	nta++;
-	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
-
-	/* transfer page from old buffer to new buffer */
-	nbi->dma = obi->dma;
-	nbi->addr = obi->addr;
-	nbi->handle = obi->handle;
-
-	obi->addr = NULL;
-	obi->skb = NULL;
-}
-
-void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle)
-{
-	struct ixgbe_rx_buffer *bi;
-	struct ixgbe_ring *rx_ring;
-	u64 hr, mask;
-	u16 nta;
-
-	rx_ring = container_of(alloc, struct ixgbe_ring, zca);
-	hr = rx_ring->xsk_umem->headroom + XDP_PACKET_HEADROOM;
-	mask = rx_ring->xsk_umem->chunk_mask;
-
-	nta = rx_ring->next_to_alloc;
-	bi = rx_ring->rx_buffer_info;
-
-	nta++;
-	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
-
-	handle &= mask;
-
-	bi->dma = xdp_umem_get_dma(rx_ring->xsk_umem, handle);
-	bi->dma += hr;
-
-	bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle);
-	bi->addr += hr;
-
-	bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle,
-					    rx_ring->xsk_umem->headroom);
-}
-
-static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring,
-				  struct ixgbe_rx_buffer *bi)
-{
-	struct xdp_umem *umem = rx_ring->xsk_umem;
-	void *addr = bi->addr;
-	u64 handle, hr;
-
-	if (addr)
-		return true;
-
-	if (!xsk_umem_peek_addr(umem, &handle)) {
-		rx_ring->rx_stats.alloc_rx_page_failed++;
-		return false;
-	}
-
-	hr = umem->headroom + XDP_PACKET_HEADROOM;
-
-	bi->dma = xdp_umem_get_dma(umem, handle);
-	bi->dma += hr;
-
-	bi->addr = xdp_umem_get_data(umem, handle);
-	bi->addr += hr;
-
-	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
-
-	xsk_umem_release_addr(umem);
-	return true;
-}
-
-static bool ixgbe_alloc_buffer_slow_zc(struct ixgbe_ring *rx_ring,
-				       struct ixgbe_rx_buffer *bi)
-{
-	struct xdp_umem *umem = rx_ring->xsk_umem;
-	u64 handle, hr;
-
-	if (!xsk_umem_peek_addr_rq(umem, &handle)) {
-		rx_ring->rx_stats.alloc_rx_page_failed++;
-		return false;
-	}
-
-	handle &= rx_ring->xsk_umem->chunk_mask;
-
-	hr = umem->headroom + XDP_PACKET_HEADROOM;
-
-	bi->dma = xdp_umem_get_dma(umem, handle);
-	bi->dma += hr;
-
-	bi->addr = xdp_umem_get_data(umem, handle);
-	bi->addr += hr;
-
-	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
-
-	xsk_umem_release_addr_rq(umem);
-	return true;
-}
-
-static __always_inline bool
-__ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count,
-			    bool alloc(struct ixgbe_ring *rx_ring,
-				       struct ixgbe_rx_buffer *bi))
+bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count)
 {
 	union ixgbe_adv_rx_desc *rx_desc;
 	struct ixgbe_rx_buffer *bi;
 	u16 i = rx_ring->next_to_use;
+	dma_addr_t dma;
 	bool ok = true;
 
 	/* nothing to do */
-	if (!cleaned_count)
+	if (!count)
 		return true;
 
 	rx_desc = IXGBE_RX_DESC(rx_ring, i);
@@ -327,21 +149,18 @@ __ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count,
 	i -= rx_ring->count;
 
 	do {
-		if (!alloc(rx_ring, bi)) {
+		bi->xdp = xsk_buff_alloc(rx_ring->xsk_umem);
+		if (!bi->xdp) {
 			ok = false;
 			break;
 		}
 
-		/* sync the buffer for use by the device */
-		dma_sync_single_range_for_device(rx_ring->dev, bi->dma,
-						 bi->page_offset,
-						 rx_ring->rx_buf_len,
-						 DMA_BIDIRECTIONAL);
+		dma = xsk_buff_xdp_get_dma(bi->xdp);
 
 		/* Refresh the desc even if buffer_addrs didn't change
 		 * because each write-back erases this info.
 		 */
-		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma);
+		rx_desc->read.pkt_addr = cpu_to_le64(dma);
 
 		rx_desc++;
 		bi++;
@@ -355,17 +174,14 @@ __ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count,
 		/* clear the length for the next_to_use descriptor */
 		rx_desc->wb.upper.length = 0;
 
-		cleaned_count--;
-	} while (cleaned_count);
+		count--;
+	} while (count);
 
 	i += rx_ring->count;
 
 	if (rx_ring->next_to_use != i) {
 		rx_ring->next_to_use = i;
-
-		/* update next to alloc since we have filled the ring */
-		rx_ring->next_to_alloc = i;
 
 		/* Force memory writes to complete before letting h/w
 		 * know there are new descriptors to fetch. (Only
 		 * applicable for weak-ordered memory model archs,
@@ -378,40 +194,27 @@ __ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count,
 	return ok;
 }
 
-void ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count)
-{
-	__ixgbe_alloc_rx_buffers_zc(rx_ring, count,
-				    ixgbe_alloc_buffer_slow_zc);
-}
-
-static bool ixgbe_alloc_rx_buffers_fast_zc(struct ixgbe_ring *rx_ring,
-					   u16 count)
-{
-	return __ixgbe_alloc_rx_buffers_zc(rx_ring, count,
-					   ixgbe_alloc_buffer_zc);
-}
-
 static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
-					      struct ixgbe_rx_buffer *bi,
-					      struct xdp_buff *xdp)
+					      struct ixgbe_rx_buffer *bi)
 {
-	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
+	unsigned int metasize = bi->xdp->data - bi->xdp->data_meta;
+	unsigned int datasize = bi->xdp->data_end - bi->xdp->data;
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
 	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+			       bi->xdp->data_end - bi->xdp->data_hard_start,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
+	skb_reserve(skb, bi->xdp->data - bi->xdp->data_hard_start);
+	memcpy(__skb_put(skb, datasize), bi->xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
 
-	ixgbe_reuse_rx_buffer_zc(rx_ring, bi);
+	xsk_buff_free(bi->xdp);
+	bi->xdp = NULL;
 	return skb;
 }
 
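The refill loop in ixgbe_alloc_rx_buffers_zc() biases its index with `i -= rx_ring->count` before the loop and undoes the bias with `i += rx_ring->count` afterwards. The wrap check itself sits between the hunks and is not shown here, so the user-space sketch below assumes the common form (wrap to slot 0 when the biased index reaches zero). `struct toy_ring` and `toy_refill()` are hypothetical names for illustration, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy ring with only the fields the walk needs (not struct ixgbe_ring). */
struct toy_ring {
	uint16_t count;       /* number of slots in the ring, at most 8 here */
	uint16_t next_to_use; /* first slot the refill loop will fill */
	uint8_t filled[8];    /* set to 1 once a slot has been refilled */
};

/* Refill `count` slots starting at next_to_use; returns slots filled. */
static uint16_t toy_refill(struct toy_ring *r, uint16_t count)
{
	uint16_t i = r->next_to_use;
	uint16_t slot = i;
	uint16_t done = 0;

	assert(r->count > 0 && r->count <= 8 && i < r->count);

	/* nothing to do */
	if (!count)
		return 0;

	/* Bias the index by -count (mod 2^16): it hits 0 at the ring end. */
	i -= r->count;

	do {
		r->filled[slot] = 1;
		done++;

		slot++;
		i++;
		if (!i) {
			/* Walked off the end: wrap to slot 0, re-apply bias. */
			slot = 0;
			i -= r->count;
		}

		count--;
	} while (count);

	/* Undo the bias to recover a plain slot index. */
	i += r->count;
	r->next_to_use = i;
	return done;
}
```

With eight slots and next_to_use at 6, refilling five slots touches 6, 7, 0, 1 and 2 and leaves next_to_use at 3. The bias turns the end-of-ring test into a comparison against zero, which is the only point of the trick.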
@@ -431,14 +234,9 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	struct ixgbe_adapter *adapter = q_vector->adapter;
 	u16 cleaned_count = ixgbe_desc_unused(rx_ring);
-	struct xdp_umem *umem = rx_ring->xsk_umem;
 	unsigned int xdp_res, xdp_xmit = 0;
 	bool failure = false;
 	struct sk_buff *skb;
-	struct xdp_buff xdp;
-
-	xdp.rxq = &rx_ring->xdp_rxq;
-	xdp.frame_sz = xsk_umem_xdp_frame_sz(umem);
 
 	while (likely(total_rx_packets < budget)) {
 		union ixgbe_adv_rx_desc *rx_desc;
@@ -448,8 +246,8 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 		/* return some buffers to hardware, one at a time is too slow */
 		if (cleaned_count >= IXGBE_RX_BUFFER_WRITE) {
 			failure = failure ||
-				  !ixgbe_alloc_rx_buffers_fast_zc(rx_ring,
-								 cleaned_count);
+				  !ixgbe_alloc_rx_buffers_zc(rx_ring,
+							     cleaned_count);
 			cleaned_count = 0;
 		}
 
@@ -464,42 +262,40 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 		 */
 		dma_rmb();
 
-		bi = ixgbe_get_rx_buffer_zc(rx_ring, size);
+		bi = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
 
 		if (unlikely(!ixgbe_test_staterr(rx_desc,
 						 IXGBE_RXD_STAT_EOP))) {
 			struct ixgbe_rx_buffer *next_bi;
 
-			ixgbe_reuse_rx_buffer_zc(rx_ring, bi);
+			xsk_buff_free(bi->xdp);
+			bi->xdp = NULL;
 			ixgbe_inc_ntc(rx_ring);
 			next_bi =
 			       &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
-			next_bi->skb = ERR_PTR(-EINVAL);
+			next_bi->discard = true;
 			continue;
 		}
 
-		if (unlikely(bi->skb)) {
-			ixgbe_reuse_rx_buffer_zc(rx_ring, bi);
+		if (unlikely(bi->discard)) {
+			xsk_buff_free(bi->xdp);
+			bi->xdp = NULL;
+			bi->discard = false;
 			ixgbe_inc_ntc(rx_ring);
 			continue;
 		}
 
-		xdp.data = bi->addr;
-		xdp.data_meta = xdp.data;
-		xdp.data_hard_start = xdp.data - XDP_PACKET_HEADROOM;
-		xdp.data_end = xdp.data + size;
-		xdp.handle = bi->handle;
-
-		xdp_res = ixgbe_run_xdp_zc(adapter, rx_ring, &xdp);
+		bi->xdp->data_end = bi->xdp->data + size;
+		xsk_buff_dma_sync_for_cpu(bi->xdp);
+		xdp_res = ixgbe_run_xdp_zc(adapter, rx_ring, bi->xdp);
 
 		if (xdp_res) {
-			if (xdp_res & (IXGBE_XDP_TX | IXGBE_XDP_REDIR)) {
+			if (xdp_res & (IXGBE_XDP_TX | IXGBE_XDP_REDIR))
 				xdp_xmit |= xdp_res;
-				bi->addr = NULL;
-				bi->skb = NULL;
-			} else {
-				ixgbe_reuse_rx_buffer_zc(rx_ring, bi);
-			}
+			else
+				xsk_buff_free(bi->xdp);
+
+			bi->xdp = NULL;
 			total_rx_packets++;
 			total_rx_bytes += size;
 
@@ -509,7 +305,7 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 		}
 
 		/* XDP_PASS path */
-		skb = ixgbe_construct_skb_zc(rx_ring, bi, &xdp);
+		skb = ixgbe_construct_skb_zc(rx_ring, bi);
 		if (!skb) {
 			rx_ring->rx_stats.alloc_rx_buff_failed++;
 			break;
@@ -561,17 +357,17 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 
 void ixgbe_xsk_clean_rx_ring(struct ixgbe_ring *rx_ring)
 {
-	u16 i = rx_ring->next_to_clean;
-	struct ixgbe_rx_buffer *bi = &rx_ring->rx_buffer_info[i];
+	struct ixgbe_rx_buffer *bi;
+	u16 i;
 
-	while (i != rx_ring->next_to_alloc) {
-		xsk_umem_fq_reuse(rx_ring->xsk_umem, bi->handle);
-		i++;
-		bi++;
-		if (i == rx_ring->count) {
-			i = 0;
-			bi = rx_ring->rx_buffer_info;
-		}
+	for (i = 0; i < rx_ring->count; i++) {
+		bi = &rx_ring->rx_buffer_info[i];
+
+		if (!bi->xdp)
+			continue;
+
+		xsk_buff_free(bi->xdp);
+		bi->xdp = NULL;
 	}
 }
 
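After this change each Rx slot either owns a buffer (`bi->xdp` is non-NULL) or owns nothing, and ixgbe_xsk_clean_rx_ring() simply frees whatever is still owned. The following user-space sketch models only that ownership convention. `toy_pool`, `toy_buff_alloc()` and `toy_buff_free()` are hypothetical stand-ins written for this example; they are not the kernel's xsk_buff_alloc()/xsk_buff_free() and model no DMA handling.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_SIZE 4

struct toy_pool;

struct toy_buff {
	struct toy_pool *pool; /* owner, so freeing needs only the buffer */
	int in_use;
};

struct toy_pool {
	struct toy_buff bufs[TOY_POOL_SIZE];
	int free_count;
};

struct toy_slot {
	struct toy_buff *xdp; /* NULL means the slot owns no buffer */
};

static void toy_pool_init(struct toy_pool *p)
{
	for (int i = 0; i < TOY_POOL_SIZE; i++) {
		p->bufs[i].pool = p;
		p->bufs[i].in_use = 0;
	}
	p->free_count = TOY_POOL_SIZE;
}

/* Returns NULL when the pool is empty; callers must check the result. */
static struct toy_buff *toy_buff_alloc(struct toy_pool *p)
{
	for (int i = 0; i < TOY_POOL_SIZE; i++) {
		if (!p->bufs[i].in_use) {
			p->bufs[i].in_use = 1;
			p->free_count--;
			return &p->bufs[i];
		}
	}
	return NULL;
}

static void toy_buff_free(struct toy_buff *b)
{
	assert(b && b->in_use); /* catches double frees in this toy model */
	b->in_use = 0;
	b->pool->free_count++;
}

/* Same shape as the new cleanup loop: visit every slot, free what is owned. */
static void toy_clean_ring(struct toy_slot *ring, int count)
{
	for (int i = 0; i < count; i++) {
		if (!ring[i].xdp)
			continue;

		toy_buff_free(ring[i].xdp);
		ring[i].xdp = NULL;
	}
}
```

Clearing the slot pointer right after the free is what makes the cleanup safe to run over the whole ring, and safe to run twice: a slot that was already handed off or freed is skipped.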
@@ -594,10 +390,9 @@ static bool ixgbe_xmit_zc(struct ixgbe_ring *xdp_ring, unsigned int budget)
 		if (!xsk_umem_consume_tx(xdp_ring->xsk_umem, &desc))
 			break;
 
-		dma = xdp_umem_get_dma(xdp_ring->xsk_umem, desc.addr);
-
-		dma_sync_single_for_device(xdp_ring->dev, dma, desc.len,
-					   DMA_BIDIRECTIONAL);
+		dma = xsk_buff_raw_get_dma(xdp_ring->xsk_umem, desc.addr);
+		xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_umem, dma,
+						 desc.len);
 
 		tx_bi = &xdp_ring->tx_buffer_info[xdp_ring->next_to_use];
 		tx_bi->bytecount = desc.len;

@@ -407,10 +407,7 @@ struct mlx5e_dma_info {
 	dma_addr_t addr;
 	union {
 		struct page *page;
-		struct {
-			u64 handle;
-			void *data;
-		} xsk;
+		struct xdp_buff *xsk;
 	};
 };
 
@@ -623,7 +620,6 @@ struct mlx5e_rq {
 		} mpwqe;
 	};
 	struct {
-		u16 umem_headroom;
 		u16 headroom;
 		u32 frame0_sz;
 		u8 map_dir; /* dma map direction */
@@ -656,7 +652,6 @@ struct mlx5e_rq {
 	struct page_pool *page_pool;
 
 	/* AF_XDP zero-copy */
-	struct zero_copy_allocator zca;
 	struct xdp_umem *umem;
 
 	struct work_struct recover_work;

@@ -12,15 +12,16 @@ static inline bool mlx5e_rx_is_xdp(struct mlx5e_params *params,
 u16 mlx5e_get_linear_rq_headroom(struct mlx5e_params *params,
 				 struct mlx5e_xsk_param *xsk)
 {
-	u16 headroom = NET_IP_ALIGN;
+	u16 headroom;
 
-	if (mlx5e_rx_is_xdp(params, xsk)) {
+	if (xsk)
+		return xsk->headroom;
+
+	headroom = NET_IP_ALIGN;
+	if (mlx5e_rx_is_xdp(params, xsk))
 		headroom += XDP_PACKET_HEADROOM;
-		if (xsk)
-			headroom += xsk->headroom;
-	} else {
+	else
 		headroom += MLX5_RX_HEADROOM;
-	}
 
 	return headroom;
 }

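The mlx5e_get_linear_rq_headroom() change alters the arithmetic for XSK queues: before the patch the XSK headroom was added on top of NET_IP_ALIGN and XDP_PACKET_HEADROOM, afterwards the XSK headroom is returned unchanged. The sketch below places both versions side by side in user space. The `toy_` names and the three constant values are placeholders chosen for this example; the real constants depend on architecture and configuration, and `is_xdp` stands in for the result of mlx5e_rx_is_xdp().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
#define TOY_NET_IP_ALIGN        2
#define TOY_XDP_PACKET_HEADROOM 256
#define TOY_RX_HEADROOM         64

struct toy_xsk_param {
	uint16_t headroom;
};

/* Pre-patch logic: XSK headroom is stacked on top of the XDP headroom. */
static uint16_t toy_headroom_old(bool is_xdp, const struct toy_xsk_param *xsk)
{
	uint16_t headroom = TOY_NET_IP_ALIGN;

	if (is_xdp) {
		headroom += TOY_XDP_PACKET_HEADROOM;
		if (xsk)
			headroom += xsk->headroom;
	} else {
		headroom += TOY_RX_HEADROOM;
	}

	return headroom;
}

/* Post-patch logic: an XSK queue uses exactly the headroom it was given. */
static uint16_t toy_headroom_new(bool is_xdp, const struct toy_xsk_param *xsk)
{
	uint16_t headroom;

	if (xsk)
		return xsk->headroom;

	headroom = TOY_NET_IP_ALIGN;
	if (is_xdp)
		headroom += TOY_XDP_PACKET_HEADROOM;
	else
		headroom += TOY_RX_HEADROOM;

	return headroom;
}
```

Only the XSK case differs between the two functions; for a NULL `xsk` they compute the same value. Whether the caller is expected to pre-compute the full offset into `xsk->headroom` is not visible in this hunk.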
@@ -31,7 +31,7 @@
  */
 
 #include <linux/bpf_trace.h>
-#include <net/xdp_sock.h>
+#include <net/xdp_sock_drv.h>
 #include "en/xdp.h"
 #include "en/params.h"
 
@@ -71,7 +71,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 	xdptxd.data = xdpf->data;
 	xdptxd.len = xdpf->len;
 
-	if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY) {
+	if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) {
 		/* The xdp_buff was in the UMEM and was copied into a newly
 		 * allocated page. The UMEM page was returned via the ZCA, and
 		 * this new page has to be mapped at this point and has to be
@@ -119,50 +119,33 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 
 /* returns true if packet was consumed by xdp */
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
-		      void *va, u16 *rx_headroom, u32 *len, bool xsk)
+		      u32 *len, struct xdp_buff *xdp)
 {
 	struct bpf_prog *prog = READ_ONCE(rq->xdp_prog);
-	struct xdp_umem *umem = rq->umem;
-	struct xdp_buff xdp;
 	u32 act;
 	int err;
 
 	if (!prog)
 		return false;
 
-	xdp.data = va + *rx_headroom;
-	xdp_set_data_meta_invalid(&xdp);
-	xdp.data_end = xdp.data + *len;
-	xdp.data_hard_start = va;
-	if (xsk)
-		xdp.handle = di->xsk.handle;
-	xdp.rxq = &rq->xdp_rxq;
-	xdp.frame_sz = rq->buff.frame0_sz;
-
-	act = bpf_prog_run_xdp(prog, &xdp);
-	if (xsk) {
-		u64 off = xdp.data - xdp.data_hard_start;
-
-		xdp.handle = xsk_umem_adjust_offset(umem, xdp.handle, off);
-	}
+	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
 	case XDP_PASS:
-		*rx_headroom = xdp.data - xdp.data_hard_start;
-		*len = xdp.data_end - xdp.data;
+		*len = xdp->data_end - xdp->data;
 		return false;
 	case XDP_TX:
-		if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, di, &xdp)))
+		if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, di, xdp)))
 			goto xdp_abort;
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); /* non-atomic */
 		return true;
 	case XDP_REDIRECT:
 		/* When XDP enabled then page-refcnt==1 here */
-		err = xdp_do_redirect(rq->netdev, &xdp, prog);
+		err = xdp_do_redirect(rq->netdev, xdp, prog);
 		if (unlikely(err))
 			goto xdp_abort;
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
 		__set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
-		if (!xsk)
+		if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL)
 			mlx5e_page_dma_unmap(rq, di);
 		rq->stats->xdp_redirect++;
 		return true;

@@ -63,7 +63,7 @@
 struct mlx5e_xsk_param;
 int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
-		      void *va, u16 *rx_headroom, u32 *len, bool xsk);
+		      u32 *len, struct xdp_buff *xdp);
 void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq);
 bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq);
 void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq);
