mirror of
https://github.com/Dasharo/linux.git
synced 2026-03-06 15:25:10 -08:00
Merge branch 'Atomics for eBPF'
Brendan Jackman says:

====================
There's still one unresolved review comment from John[3] which I will
resolve with a followup patch.

Differences from v6->v7 [1]:

* Fixed riscv build error detected by 0-day robot.

Differences from v5->v6 [1]:

* Carried Björn Töpel's ack for RISC-V code, plus a couple more acks from
  Yonghong.
* Doc fixups.
* Trivial cleanups.

Differences from v4->v5 [1]:

* Fixed bogus type casts in interpreter that led to warnings from the
  0-day robot.
* Dropped feature-detection for Clang per Andrii's suggestion in [4]. The
  selftests will now fail to build unless you have llvm-project commit
  286daafd6512. The ENABLE_ATOMICS_TEST macro is still needed to support
  the no_alu32 tests.
* Carried some Acks from John and Yonghong.
* Dropped confusing usage of __atomic_exchange from prog_test in favour
  of __sync_lock_test_and_set.
* [Really] got rid of all the forest of instruction macros
  (BPF_ATOMIC_FETCH_ADD and friends); now there's just BPF_ATOMIC_OP to
  define all the instructions as we use them in the verifier tests. This
  makes the atomic ops less special in that API, and I don't think the
  resulting usage is actually any harder to read.

Differences from v3->v4 [1]:

* Added one Ack from Yonghong. He acked some other patches but those have
  now changed non-trivially so I didn't add those acks.
* Fixups to commit messages.
* Fixed disassembly and comments: first arg to atomic_fetch_* is a
  pointer.
* Improved prog_test efficiency. BPF progs are now all loaded in a single
  call, then the skeleton is re-used for each subtest.
* Dropped use of tools/build/feature in favour of a one-liner in the
  Makefile.
* Dropped the commit that created an emit_neg helper in the x86 JIT. It's
  not used any more (it wasn't used in v3 either).
* Combined all the different filter.h macros (used to be BPF_ATOMIC_ADD,
  BPF_ATOMIC_FETCH_ADD, BPF_ATOMIC_AND, etc) into just BPF_ATOMIC32 and
  BPF_ATOMIC64.
* Removed some references to BPF_STX_XADD from tools/, samples/ and lib/
  that I missed before.

Differences from v2->v3 [1]:

* More minor fixes and naming/comment changes.
* Dropped atomic subtract: compilers can implement this by preceding an
  atomic add with a NEG instruction (which is what the x86 JIT did under
  the hood anyway).
* Dropped the use of -mcpu=v4 in the Clang BPF command-line; there is no
  longer an architecture version bump. Instead a feature test is added to
  Kbuild - it builds a source file to check if Clang supports BPF atomics.
* Fixed the prog_test so it no longer breaks test_progs-no_alu32. This
  requires some ifdef acrobatics to avoid complicating the prog_tests
  model where the same userspace code exercises both the normal and
  no_alu32 BPF test objects, using the same skeleton header.

Differences from v1->v2 [1]:

* Fixed mistakes in the netronome driver.
* Added sub, add, or, xor operations.
* The above led to some refactors to keep things readable. (Maybe I
  should have just waited until I'd implemented these before starting the
  review...)
* Replaced BPF_[CMP]SET | BPF_FETCH with just BPF_[CMP]XCHG, which
  include the BPF_FETCH flag.
* Added a bit of documentation. Suggestions welcome for more places to
  dump this info...

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184).

This only includes a JIT implementation for x86_64 - I don't plan to
implement JIT support myself for other architectures.

Operations
==========

This patchset adds atomic operations to the eBPF instruction set. The
use-case that motivated this work was a trivial and efficient way to
generate globally-unique cookies in BPF progs, but I think it's obvious
that these features are pretty widely applicable.

The instructions that are added here can be summarised with this list of
kernel operations:

* atomic[64]_[fetch_]add
* atomic[64]_[fetch_]and
* atomic[64]_[fetch_]or
* atomic[64]_xchg
* atomic[64]_cmpxchg

The following are left out of scope for this effort:

* 16 and 8 bit operations
* Explicit memory barriers

Encoding
========

I originally planned to add new values for bpf_insn.opcode. This was
rather unpleasant: the opcode space has holes in it but no entire
instruction classes[2]. Yonghong Song had a better idea: use the
immediate field of the existing STX XADD instruction to encode the
operation. This works nicely, without breaking existing programs, because
the immediate field is currently reserved-must-be-zero, and extra-nicely
because BPF_ADD happens to be zero.

Note that this of course makes immediate-source atomic operations
impossible. It's hard to imagine a measurable speedup from such
instructions, and if it existed it would certainly not benefit x86, which
has no support for them.

The BPF_OP opcode fields are re-used in the immediate, and an additional
flag BPF_FETCH is used to mark instructions that should fetch a
pre-modification value from memory.

So, BPF_XADD is now called BPF_ATOMIC (the old name is kept to avoid
breaking userspace builds), and where we previously had .imm = 0, we now
have .imm = BPF_ADD (which is 0).

Operands
========

Reg-source eBPF instructions only have two operands, while these atomic
operations have up to four. To avoid needing to encode additional
operands, then:

- One of the input registers is re-used as an output register (e.g.
  atomic_fetch_add both reads from and writes to the source register).
- Where necessary (i.e. for cmpxchg), R0 is "hard-coded" as one of the
  operands.

This approach also allows the new eBPF instructions to map directly to
single x86 instructions.

[1] Previous iterations:
    v1: https://lore.kernel.org/bpf/20201123173202.1335708-1-jackmanb@google.com/
    v2: https://lore.kernel.org/bpf/20201127175738.1085417-1-jackmanb@google.com/
    v3: https://lore.kernel.org/bpf/X8kN7NA7bJC7aLQI@google.com/
    v4: https://lore.kernel.org/bpf/20201207160734.2345502-1-jackmanb@google.com/
    v5: https://lore.kernel.org/bpf/20201215121816.1048557-1-jackmanb@google.com/
    v6: https://lore.kernel.org/bpf/20210112154235.2192781-1-jackmanb@google.com/

[2] Visualisation of eBPF opcode space:
    https://gist.github.com/bjackman/00fdad2d5dfff601c1918bc29b16e778

[3] Comment from John about propagating bounds in verifier:
    https://lore.kernel.org/bpf/5fcf0fbcc8aa8_9ab320853@john-XPS-13-9370.notmuch/

[4] Mail from Andrii about not supporting old Clang in selftests:
    https://lore.kernel.org/bpf/CAEf4BzYBddPaEzRUs=jaWSo5kbf=LZdb7geAUVj85GxLQztuAQ@mail.gmail.com/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
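The encoding described in the cover letter can be sketched in a few lines of C. This is an illustration only, not code from the series: `struct insn_sketch`, `make_atomic()` and `atomic_op_fetches()` are hypothetical names, and the numeric constants are copied here on the assumption that they match the kernel's UAPI headers (`linux/bpf_common.h` and `linux/bpf.h`).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel UAPI values; not taken from this page. */
#define BPF_STX     0x03
#define BPF_W       0x00
#define BPF_DW      0x18
#define BPF_ATOMIC  0xc0               /* same value as the legacy BPF_XADD */
#define BPF_ADD     0x00
#define BPF_OR      0x40
#define BPF_AND     0x50
#define BPF_XOR     0xa0
#define BPF_FETCH   0x01               /* flag: also return the old value */
#define BPF_XCHG    (0xe0 | BPF_FETCH)
#define BPF_CMPXCHG (0xf0 | BPF_FETCH)

/* Hypothetical stand-in for struct bpf_insn; the real struct packs
 * dst_reg and src_reg into 4-bit fields. */
struct insn_sketch {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Build an atomic instruction: the opcode stays STX | ATOMIC | size,
 * and the operation is selected by the immediate field. */
static struct insn_sketch make_atomic(uint8_t size, uint8_t dst, uint8_t src,
				      int16_t off, int32_t op)
{
	/* 1 and 2 byte atomic operations are out of scope. */
	assert(size == BPF_W || size == BPF_DW);

	struct insn_sketch insn = {
		.code = BPF_STX | BPF_ATOMIC | size,
		.dst_reg = dst,
		.src_reg = src,
		.off = off,
		.imm = op,
	};
	return insn;
}

/* True if the instruction writes a pre-modification value to a register. */
static int atomic_op_fetches(const struct insn_sketch *insn)
{
	return (insn->imm & BPF_FETCH) != 0;
}
```

Because BPF_ADD is zero, `make_atomic(BPF_DW, dst, src, off, BPF_ADD)` produces the same bytes as the old 64-bit XADD instruction, which is why existing programs keep working.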
@@ -1006,13 +1006,13 @@ Size modifier is one of ...
 
 Mode modifier is one of::
 
-  BPF_IMM  0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
-  BPF_ABS  0x20
-  BPF_IND  0x40
-  BPF_MEM  0x60
-  BPF_LEN  0x80  /* classic BPF only, reserved in eBPF */
-  BPF_MSH  0xa0  /* classic BPF only, reserved in eBPF */
-  BPF_XADD 0xc0  /* eBPF only, exclusive add */
+  BPF_IMM     0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
+  BPF_ABS     0x20
+  BPF_IND     0x40
+  BPF_MEM     0x60
+  BPF_LEN     0x80  /* classic BPF only, reserved in eBPF */
+  BPF_MSH     0xa0  /* classic BPF only, reserved in eBPF */
+  BPF_ATOMIC  0xc0  /* eBPF only, atomic operations */
 
 eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
 (BPF_IND | <size> | BPF_LD) which are used to access packet data.
@@ -1044,11 +1044,50 @@ Unlike classic BPF instruction set, eBPF has generic load/store operations::
   BPF_MEM | <size> | BPF_STX:  *(size *) (dst_reg + off) = src_reg
   BPF_MEM | <size> | BPF_ST:   *(size *) (dst_reg + off) = imm32
   BPF_MEM | <size> | BPF_LDX:  dst_reg = *(size *) (src_reg + off)
-  BPF_XADD | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
-  BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
 
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
-2 byte atomic increments are not supported.
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
+
+It also includes atomic operations, which use the immediate field for extra
+encoding.
+
+  .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
+  .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+
+The basic atomic operations supported are:
+
+  BPF_ADD
+  BPF_AND
+  BPF_OR
+  BPF_XOR
+
+Each having equivalent semantics with the ``BPF_ADD`` example, that is: the
+memory location addressed by ``dst_reg + off`` is atomically modified, with
+``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
+immediate, then these operations also overwrite ``src_reg`` with the
+value that was in memory before it was modified.
+
+The more special operations are:
+
+  BPF_XCHG
+
+This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
+off``.
+
+  BPF_CMPXCHG
+
+This atomically compares the value addressed by ``dst_reg + off`` with
+``R0``. If they match it is replaced with ``src_reg``. The value that was there
+before is loaded back to ``R0``.
+
+Note that 1 and 2 byte atomic operations are not supported.
+
+Except ``BPF_ADD`` _without_ ``BPF_FETCH`` (for legacy reasons), all 4 byte
+atomic operations require alu32 mode. Clang enables this mode by default in
+architecture v3 (``-mcpu=v3``). For older versions it can be enabled with
+``-Xclang -target-feature -Xclang +alu32``.
+
+You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to
+the exclusive-add operation encoded when the immediate field is zero.
 
 eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
 of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
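The semantics documented in the hunk above can be modelled in userspace with C11 atomics. This is a rough sketch, not kernel code: the `model_*` function names are hypothetical, only the 64-bit case is shown, and the default sequentially consistent ordering of `<stdatomic.h>` is an assumption made for the illustration (the documentation text does not state an ordering).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* BPF_ADD | BPF_FETCH: memory is updated and the caller (playing the
 * role of src_reg) receives the value that was there before. */
static uint64_t model_fetch_add(_Atomic uint64_t *mem, uint64_t src)
{
	return atomic_fetch_add(mem, src);
}

/* BPF_XCHG: src is stored and the old memory value is returned, to be
 * written back to src_reg. */
static uint64_t model_xchg(_Atomic uint64_t *mem, uint64_t src)
{
	return atomic_exchange(mem, src);
}

/* BPF_CMPXCHG: compare memory with r0; if equal, store src. The return
 * value is the old memory value, which becomes the new R0 whether or
 * not the store happened. */
static uint64_t model_cmpxchg(_Atomic uint64_t *mem, uint64_t r0, uint64_t src)
{
	uint64_t expected = r0;

	/* On failure, expected is overwritten with the current value;
	 * on success it still holds the old value (equal to r0). */
	atomic_compare_exchange_strong(mem, &expected, src);
	return expected;
}
```

A caller can tell whether the compare-and-exchange succeeded by comparing the returned value with the `r0` it passed in, which mirrors how a BPF program would inspect R0 afterwards.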
@@ -1620,10 +1620,9 @@ exit:
|
||||
}
|
||||
emit_str_r(dst_lo, tmp2, off, ctx, BPF_SIZE(code));
|
||||
break;
|
||||
/* STX XADD: lock *(u32 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_W:
|
||||
/* STX XADD: lock *(u64 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_DW:
|
||||
/* Atomic ops */
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
goto notyet;
|
||||
/* STX: *(size *)(dst + off) = src */
|
||||
case BPF_STX | BPF_MEM | BPF_W:
|
||||
|
||||
@@ -875,10 +875,18 @@ emit_cond_jmp:
|
||||
}
|
||||
break;
|
||||
|
||||
/* STX XADD: lock *(u32 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_W:
|
||||
/* STX XADD: lock *(u64 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_DW:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err_once("unknown atomic op code %02x\n", insn->imm);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/* STX XADD: lock *(u32 *)(dst + off) += src
|
||||
* and
|
||||
* STX XADD: lock *(u64 *)(dst + off) += src
|
||||
*/
|
||||
|
||||
if (!off) {
|
||||
reg = dst;
|
||||
} else {
|
||||
|
||||
@@ -1423,8 +1423,8 @@ jeq_common:
|
||||
case BPF_STX | BPF_H | BPF_MEM:
|
||||
case BPF_STX | BPF_W | BPF_MEM:
|
||||
case BPF_STX | BPF_DW | BPF_MEM:
|
||||
case BPF_STX | BPF_W | BPF_XADD:
|
||||
case BPF_STX | BPF_DW | BPF_XADD:
|
||||
case BPF_STX | BPF_W | BPF_ATOMIC:
|
||||
case BPF_STX | BPF_DW | BPF_ATOMIC:
|
||||
if (insn->dst_reg == BPF_REG_10) {
|
||||
ctx->flags |= EBPF_SEEN_FP;
|
||||
dst = MIPS_R_SP;
|
||||
@@ -1438,7 +1438,12 @@ jeq_common:
|
||||
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
|
||||
if (src < 0)
|
||||
return src;
|
||||
if (BPF_MODE(insn->code) == BPF_XADD) {
|
||||
if (BPF_MODE(insn->code) == BPF_ATOMIC) {
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err("ATOMIC OP %02x NOT HANDLED\n", insn->imm);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/*
|
||||
* If mem_off does not fit within the 9 bit ll/sc
|
||||
* instruction immediate field, use a temp reg.
|
||||
|
||||
@@ -683,10 +683,18 @@ emit_clear:
|
||||
break;
|
||||
|
||||
/*
|
||||
* BPF_STX XADD (atomic_add)
|
||||
* BPF_STX ATOMIC (atomic ops)
|
||||
*/
|
||||
/* *(u32 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_W:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err_ratelimited(
|
||||
"eBPF filter atomic op code %02x (@%d) unsupported\n",
|
||||
code, i);
|
||||
return -ENOTSUPP;
|
||||
}
|
||||
|
||||
/* *(u32 *)(dst + off) += src */
|
||||
|
||||
/* Get EA into TMP_REG_1 */
|
||||
EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], dst_reg, off));
|
||||
tmp_idx = ctx->idx * 4;
|
||||
@@ -699,8 +707,15 @@ emit_clear:
|
||||
/* we're done if this succeeded */
|
||||
PPC_BCC_SHORT(COND_NE, tmp_idx);
|
||||
break;
|
||||
/* *(u64 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_DW:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err_ratelimited(
|
||||
"eBPF filter atomic op code %02x (@%d) unsupported\n",
|
||||
code, i);
|
||||
return -ENOTSUPP;
|
||||
}
|
||||
/* *(u64 *)(dst + off) += src */
|
||||
|
||||
EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], dst_reg, off));
|
||||
tmp_idx = ctx->idx * 4;
|
||||
EMIT(PPC_RAW_LDARX(b2p[TMP_REG_2], 0, b2p[TMP_REG_1], 0));
|
||||
|
||||
@@ -881,7 +881,7 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off,
|
||||
const s8 *rd = bpf_get_reg64(dst, tmp1, ctx);
|
||||
const s8 *rs = bpf_get_reg64(src, tmp2, ctx);
|
||||
|
||||
if (mode == BPF_XADD && size != BPF_W)
|
||||
if (mode == BPF_ATOMIC && size != BPF_W)
|
||||
return -1;
|
||||
|
||||
emit_imm(RV_REG_T0, off, ctx);
|
||||
@@ -899,7 +899,7 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off,
|
||||
case BPF_MEM:
|
||||
emit(rv_sw(RV_REG_T0, 0, lo(rs)), ctx);
|
||||
break;
|
||||
case BPF_XADD:
|
||||
case BPF_ATOMIC: /* Only BPF_ADD supported */
|
||||
emit(rv_amoadd_w(RV_REG_ZERO, lo(rs), RV_REG_T0, 0, 0),
|
||||
ctx);
|
||||
break;
|
||||
@@ -1260,7 +1260,6 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
|
||||
case BPF_STX | BPF_MEM | BPF_H:
|
||||
case BPF_STX | BPF_MEM | BPF_W:
|
||||
case BPF_STX | BPF_MEM | BPF_DW:
|
||||
case BPF_STX | BPF_XADD | BPF_W:
|
||||
if (BPF_CLASS(code) == BPF_ST) {
|
||||
emit_imm32(tmp2, imm, ctx);
|
||||
src = tmp2;
|
||||
@@ -1271,8 +1270,21 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
|
||||
return -1;
|
||||
break;
|
||||
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_info_once(
|
||||
"bpf-jit: not supported: atomic operation %02x ***\n",
|
||||
insn->imm);
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
if (emit_store_r64(dst, src, off, ctx, BPF_SIZE(code),
|
||||
BPF_MODE(code)))
|
||||
return -1;
|
||||
break;
|
||||
|
||||
/* No hardware support for 8-byte atomics in RV32. */
|
||||
case BPF_STX | BPF_XADD | BPF_DW:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
/* Fallthrough. */
|
||||
|
||||
notsupported:
|
||||
|
||||
@@ -1027,10 +1027,18 @@ out_be:
|
||||
emit_add(RV_REG_T1, RV_REG_T1, rd, ctx);
|
||||
emit_sd(RV_REG_T1, 0, rs, ctx);
|
||||
break;
|
||||
/* STX XADD: lock *(u32 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_W:
|
||||
/* STX XADD: lock *(u64 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_DW:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err("bpf-jit: not supported: atomic operation %02x ***\n",
|
||||
insn->imm);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/* atomic_add: lock *(u32 *)(dst + off) += src
|
||||
* atomic_add: lock *(u64 *)(dst + off) += src
|
||||
*/
|
||||
|
||||
if (off) {
|
||||
if (is_12b_int(off)) {
|
||||
emit_addi(RV_REG_T1, rd, off, ctx);
|
||||
|
||||
@@ -1205,18 +1205,23 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
|
||||
jit->seen |= SEEN_MEM;
|
||||
break;
|
||||
/*
|
||||
* BPF_STX XADD (atomic_add)
|
||||
* BPF_ATOMIC
|
||||
*/
|
||||
case BPF_STX | BPF_XADD | BPF_W: /* *(u32 *)(dst + off) += src */
|
||||
/* laal %w0,%src,off(%dst) */
|
||||
EMIT6_DISP_LH(0xeb000000, 0x00fa, REG_W0, src_reg,
|
||||
dst_reg, off);
|
||||
jit->seen |= SEEN_MEM;
|
||||
break;
|
||||
case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */
|
||||
/* laalg %w0,%src,off(%dst) */
|
||||
EMIT6_DISP_LH(0xeb000000, 0x00ea, REG_W0, src_reg,
|
||||
dst_reg, off);
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err("Unknown atomic operation %02x\n", insn->imm);
|
||||
return -1;
|
||||
}
|
||||
|
||||
/* *(u32/u64 *)(dst + off) += src
|
||||
*
|
||||
* BPF_W: laal %w0,%src,off(%dst)
|
||||
* BPF_DW: laalg %w0,%src,off(%dst)
|
||||
*/
|
||||
EMIT6_DISP_LH(0xeb000000,
|
||||
BPF_SIZE(insn->code) == BPF_W ? 0x00fa : 0x00ea,
|
||||
REG_W0, src_reg, dst_reg, off);
|
||||
jit->seen |= SEEN_MEM;
|
||||
break;
|
||||
/*
|
||||
|
||||
@@ -1366,12 +1366,18 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
||||
break;
|
||||
}
|
||||
|
||||
/* STX XADD: lock *(u32 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_W: {
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W: {
|
||||
const u8 tmp = bpf2sparc[TMP_REG_1];
|
||||
const u8 tmp2 = bpf2sparc[TMP_REG_2];
|
||||
const u8 tmp3 = bpf2sparc[TMP_REG_3];
|
||||
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err_once("unknown atomic op %02x\n", insn->imm);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/* lock *(u32 *)(dst + off) += src */
|
||||
|
||||
if (insn->dst_reg == BPF_REG_FP)
|
||||
ctx->saw_frame_pointer = true;
|
||||
|
||||
@@ -1390,11 +1396,16 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
||||
break;
|
||||
}
|
||||
/* STX XADD: lock *(u64 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_DW: {
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW: {
|
||||
const u8 tmp = bpf2sparc[TMP_REG_1];
|
||||
const u8 tmp2 = bpf2sparc[TMP_REG_2];
|
||||
const u8 tmp3 = bpf2sparc[TMP_REG_3];
|
||||
|
||||
if (insn->imm != BPF_ADD) {
|
||||
pr_err_once("unknown atomic op %02x\n", insn->imm);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
if (insn->dst_reg == BPF_REG_FP)
|
||||
ctx->saw_frame_pointer = true;
|
||||
|
||||
|
||||
@@ -205,6 +205,18 @@ static u8 add_2reg(u8 byte, u32 dst_reg, u32 src_reg)
|
||||
return byte + reg2hex[dst_reg] + (reg2hex[src_reg] << 3);
|
||||
}
|
||||
|
||||
/* Some 1-byte opcodes for binary ALU operations */
|
||||
static u8 simple_alu_opcodes[] = {
|
||||
[BPF_ADD] = 0x01,
|
||||
[BPF_SUB] = 0x29,
|
||||
[BPF_AND] = 0x21,
|
||||
[BPF_OR] = 0x09,
|
||||
[BPF_XOR] = 0x31,
|
||||
[BPF_LSH] = 0xE0,
|
||||
[BPF_RSH] = 0xE8,
|
||||
[BPF_ARSH] = 0xF8,
|
||||
};
|
||||
|
||||
static void jit_fill_hole(void *area, unsigned int size)
|
||||
{
|
||||
/* Fill whole space with INT3 instructions */
|
||||
@@ -681,6 +693,42 @@ static void emit_mov_reg(u8 **pprog, bool is64, u32 dst_reg, u32 src_reg)
|
||||
*pprog = prog;
|
||||
}
|
||||
|
||||
/* Emit the suffix (ModR/M etc) for addressing *(ptr_reg + off) and val_reg */
|
||||
static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off)
|
||||
{
|
||||
u8 *prog = *pprog;
|
||||
int cnt = 0;
|
||||
|
||||
if (is_imm8(off)) {
|
||||
/* 1-byte signed displacement.
|
||||
*
|
||||
* If off == 0 we could skip this and save one extra byte, but
|
||||
* special case of x86 R13 which always needs an offset is not
|
||||
* worth the hassle
|
||||
*/
|
||||
EMIT2(add_2reg(0x40, ptr_reg, val_reg), off);
|
||||
} else {
|
||||
/* 4-byte signed displacement */
|
||||
EMIT1_off32(add_2reg(0x80, ptr_reg, val_reg), off);
|
||||
}
|
||||
*pprog = prog;
|
||||
}
|
||||
|
||||
/*
|
||||
* Emit a REX byte if it will be necessary to address these registers
|
||||
*/
|
||||
static void maybe_emit_mod(u8 **pprog, u32 dst_reg, u32 src_reg, bool is64)
|
||||
{
|
||||
u8 *prog = *pprog;
|
||||
int cnt = 0;
|
||||
|
||||
if (is64)
|
||||
EMIT1(add_2mod(0x48, dst_reg, src_reg));
|
||||
else if (is_ereg(dst_reg) || is_ereg(src_reg))
|
||||
EMIT1(add_2mod(0x40, dst_reg, src_reg));
|
||||
*pprog = prog;
|
||||
}
|
||||
|
||||
/* LDX: dst_reg = *(u8*)(src_reg + off) */
|
||||
static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
|
||||
{
|
||||
@@ -708,15 +756,7 @@ static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
|
||||
EMIT2(add_2mod(0x48, src_reg, dst_reg), 0x8B);
|
||||
break;
|
||||
}
|
||||
/*
|
||||
* If insn->off == 0 we can save one extra byte, but
|
||||
* special case of x86 R13 which always needs an offset
|
||||
* is not worth the hassle
|
||||
*/
|
||||
if (is_imm8(off))
|
||||
EMIT2(add_2reg(0x40, src_reg, dst_reg), off);
|
||||
else
|
||||
EMIT1_off32(add_2reg(0x80, src_reg, dst_reg), off);
|
||||
emit_insn_suffix(&prog, src_reg, dst_reg, off);
|
||||
*pprog = prog;
|
||||
}
|
||||
|
||||
@@ -751,13 +791,53 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
 		EMIT2(add_2mod(0x48, dst_reg, src_reg), 0x89);
 		break;
 	}
-	if (is_imm8(off))
-		EMIT2(add_2reg(0x40, dst_reg, src_reg), off);
-	else
-		EMIT1_off32(add_2reg(0x80, dst_reg, src_reg), off);
+	emit_insn_suffix(&prog, dst_reg, src_reg, off);
 	*pprog = prog;
 }
 
+static int emit_atomic(u8 **pprog, u8 atomic_op,
+		       u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	EMIT1(0xF0); /* lock prefix */
+
+	maybe_emit_mod(&prog, dst_reg, src_reg, bpf_size == BPF_DW);
+
+	/* emit opcode */
+	switch (atomic_op) {
+	case BPF_ADD:
+	case BPF_SUB:
+	case BPF_AND:
+	case BPF_OR:
+	case BPF_XOR:
+		/* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */
+		EMIT1(simple_alu_opcodes[atomic_op]);
+		break;
+	case BPF_ADD | BPF_FETCH:
+		/* src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
+		EMIT2(0x0F, 0xC1);
+		break;
+	case BPF_XCHG:
+		/* src_reg = atomic_xchg(dst_reg + off, src_reg); */
+		EMIT1(0x87);
+		break;
+	case BPF_CMPXCHG:
+		/* r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg); */
+		EMIT2(0x0F, 0xB1);
+		break;
+	default:
+		pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op);
+		return -EFAULT;
+	}
+
+	emit_insn_suffix(&prog, dst_reg, src_reg, off);
+
+	*pprog = prog;
+	return 0;
+}
+
 static bool ex_handler_bpf(const struct exception_table_entry *x,
 			   struct pt_regs *regs, int trapnr,
 			   unsigned long error_code, unsigned long fault_addr)
@@ -802,6 +882,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
||||
int i, cnt = 0, excnt = 0;
|
||||
int proglen = 0;
|
||||
u8 *prog = temp;
|
||||
int err;
|
||||
|
||||
detect_reg_usage(insn, insn_cnt, callee_regs_used,
|
||||
&tail_call_seen);
|
||||
@@ -837,17 +918,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
||||
case BPF_ALU64 | BPF_AND | BPF_X:
|
||||
case BPF_ALU64 | BPF_OR | BPF_X:
|
||||
case BPF_ALU64 | BPF_XOR | BPF_X:
|
||||
switch (BPF_OP(insn->code)) {
|
||||
case BPF_ADD: b2 = 0x01; break;
|
||||
case BPF_SUB: b2 = 0x29; break;
|
||||
case BPF_AND: b2 = 0x21; break;
|
||||
case BPF_OR: b2 = 0x09; break;
|
||||
case BPF_XOR: b2 = 0x31; break;
|
||||
}
|
||||
if (BPF_CLASS(insn->code) == BPF_ALU64)
|
||||
EMIT1(add_2mod(0x48, dst_reg, src_reg));
|
||||
else if (is_ereg(dst_reg) || is_ereg(src_reg))
|
||||
EMIT1(add_2mod(0x40, dst_reg, src_reg));
|
||||
maybe_emit_mod(&prog, dst_reg, src_reg,
|
||||
BPF_CLASS(insn->code) == BPF_ALU64);
|
||||
b2 = simple_alu_opcodes[BPF_OP(insn->code)];
|
||||
EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
|
||||
break;
|
||||
|
||||
@@ -1027,12 +1100,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
||||
else if (is_ereg(dst_reg))
|
||||
EMIT1(add_1mod(0x40, dst_reg));
|
||||
|
||||
switch (BPF_OP(insn->code)) {
|
||||
case BPF_LSH: b3 = 0xE0; break;
|
||||
case BPF_RSH: b3 = 0xE8; break;
|
||||
case BPF_ARSH: b3 = 0xF8; break;
|
||||
}
|
||||
|
||||
b3 = simple_alu_opcodes[BPF_OP(insn->code)];
|
||||
if (imm32 == 1)
|
||||
EMIT2(0xD1, add_1reg(b3, dst_reg));
|
||||
else
|
||||
@@ -1066,11 +1134,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
||||
else if (is_ereg(dst_reg))
|
||||
EMIT1(add_1mod(0x40, dst_reg));
|
||||
|
||||
switch (BPF_OP(insn->code)) {
|
||||
case BPF_LSH: b3 = 0xE0; break;
|
||||
case BPF_RSH: b3 = 0xE8; break;
|
||||
case BPF_ARSH: b3 = 0xF8; break;
|
||||
}
|
||||
b3 = simple_alu_opcodes[BPF_OP(insn->code)];
|
||||
EMIT2(0xD3, add_1reg(b3, dst_reg));
|
||||
|
||||
if (src_reg != BPF_REG_4)
|
||||
@@ -1230,21 +1294,56 @@ st: if (is_imm8(insn->off))
 			}
 			break;
 
-			/* STX XADD: lock *(u32*)(dst_reg + off) += src_reg */
-		case BPF_STX | BPF_XADD | BPF_W:
-			/* Emit 'lock add dword ptr [rax + off], eax' */
-			if (is_ereg(dst_reg) || is_ereg(src_reg))
-				EMIT3(0xF0, add_2mod(0x40, dst_reg, src_reg), 0x01);
-			else
-				EMIT2(0xF0, 0x01);
-			goto xadd;
-		case BPF_STX | BPF_XADD | BPF_DW:
-			EMIT3(0xF0, add_2mod(0x48, dst_reg, src_reg), 0x01);
-xadd:			if (is_imm8(insn->off))
-				EMIT2(add_2reg(0x40, dst_reg, src_reg), insn->off);
-			else
-				EMIT1_off32(add_2reg(0x80, dst_reg, src_reg),
-					    insn->off);
+		case BPF_STX | BPF_ATOMIC | BPF_W:
+		case BPF_STX | BPF_ATOMIC | BPF_DW:
+			if (insn->imm == (BPF_AND | BPF_FETCH) ||
+			    insn->imm == (BPF_OR | BPF_FETCH) ||
+			    insn->imm == (BPF_XOR | BPF_FETCH)) {
+				u8 *branch_target;
+				bool is64 = BPF_SIZE(insn->code) == BPF_DW;
+
+				/*
+				 * Can't be implemented with a single x86 insn.
+				 * Need to do a CMPXCHG loop.
+				 */
+
+				/* Will need RAX as a CMPXCHG operand so save R0 */
+				emit_mov_reg(&prog, true, BPF_REG_AX, BPF_REG_0);
+				branch_target = prog;
+				/* Load old value */
+				emit_ldx(&prog, BPF_SIZE(insn->code),
+					 BPF_REG_0, dst_reg, insn->off);
+				/*
+				 * Perform the (commutative) operation locally,
+				 * put the result in the AUX_REG.
+				 */
+				emit_mov_reg(&prog, is64, AUX_REG, BPF_REG_0);
+				maybe_emit_mod(&prog, AUX_REG, src_reg, is64);
+				EMIT2(simple_alu_opcodes[BPF_OP(insn->imm)],
+				      add_2reg(0xC0, AUX_REG, src_reg));
+				/* Attempt to swap in new value */
+				err = emit_atomic(&prog, BPF_CMPXCHG,
+						  dst_reg, AUX_REG, insn->off,
+						  BPF_SIZE(insn->code));
+				if (WARN_ON(err))
+					return err;
+				/*
+				 * ZF tells us whether we won the race. If it's
+				 * cleared we need to try again.
+				 */
+				EMIT2(X86_JNE, -(prog - branch_target) - 2);
+				/* Return the pre-modification value */
+				emit_mov_reg(&prog, is64, src_reg, BPF_REG_0);
+				/* Restore R0 after clobbering RAX */
+				emit_mov_reg(&prog, true, BPF_REG_0, BPF_REG_AX);
+				break;
+
+			}
+
+			err = emit_atomic(&prog, insn->imm, dst_reg, src_reg,
+					  insn->off, BPF_SIZE(insn->code));
+			if (err)
+				return err;
 			break;
 
 			/* call */
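The retry loop that the hunk above emits for the fetching AND/OR/XOR variants has the same shape as the following userspace sketch in C11. It illustrates the technique only and is not the JIT's code: `enum bitop`, `apply_bitop()` and `fetch_op_via_cmpxchg()` are hypothetical names, and only the 64-bit case is shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum bitop { OP_AND, OP_OR, OP_XOR };

/* Compute the new value locally, as the JIT does in its scratch register. */
static uint64_t apply_bitop(enum bitop op, uint64_t old, uint64_t src)
{
	switch (op) {
	case OP_AND:
		return old & src;
	case OP_OR:
		return old | src;
	default:
		return old ^ src;
	}
}

/* Fetching bitwise op built from compare-and-exchange: load the old
 * value, compute the result, and try to swap it in. If another writer
 * changed memory in between, the swap fails and the loop starts over. */
static uint64_t fetch_op_via_cmpxchg(_Atomic uint64_t *mem, enum bitop op,
				     uint64_t src)
{
	uint64_t old, desired;

	do {
		old = atomic_load(mem);               /* load old value */
		desired = apply_bitop(op, old, src);  /* operate locally */
		/* attempt to swap in the new value; retry if we lost the race */
	} while (!atomic_compare_exchange_strong(mem, &old, desired));

	return old; /* the pre-modification value, destined for src_reg */
}
```

The JIT additionally has to save and restore R0, because the x86 CMPXCHG instruction uses RAX as an implicit operand; the C version has no equivalent of that step.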
@@ -1295,20 +1394,16 @@ xadd: if (is_imm8(insn->off))
|
||||
case BPF_JMP32 | BPF_JSGE | BPF_X:
|
||||
case BPF_JMP32 | BPF_JSLE | BPF_X:
|
||||
/* cmp dst_reg, src_reg */
|
||||
if (BPF_CLASS(insn->code) == BPF_JMP)
|
||||
EMIT1(add_2mod(0x48, dst_reg, src_reg));
|
||||
else if (is_ereg(dst_reg) || is_ereg(src_reg))
|
||||
EMIT1(add_2mod(0x40, dst_reg, src_reg));
|
||||
maybe_emit_mod(&prog, dst_reg, src_reg,
|
||||
BPF_CLASS(insn->code) == BPF_JMP);
|
||||
EMIT2(0x39, add_2reg(0xC0, dst_reg, src_reg));
|
||||
goto emit_cond_jmp;
|
||||
|
||||
case BPF_JMP | BPF_JSET | BPF_X:
|
||||
case BPF_JMP32 | BPF_JSET | BPF_X:
|
||||
/* test dst_reg, src_reg */
|
||||
if (BPF_CLASS(insn->code) == BPF_JMP)
|
||||
EMIT1(add_2mod(0x48, dst_reg, src_reg));
|
||||
else if (is_ereg(dst_reg) || is_ereg(src_reg))
|
||||
EMIT1(add_2mod(0x40, dst_reg, src_reg));
|
||||
maybe_emit_mod(&prog, dst_reg, src_reg,
|
||||
BPF_CLASS(insn->code) == BPF_JMP);
|
||||
EMIT2(0x85, add_2reg(0xC0, dst_reg, src_reg));
|
||||
goto emit_cond_jmp;
|
||||
|
||||
@@ -1344,10 +1439,8 @@ xadd: if (is_imm8(insn->off))
|
||||
case BPF_JMP32 | BPF_JSLE | BPF_K:
|
||||
/* test dst_reg, dst_reg to save one extra byte */
|
||||
if (imm32 == 0) {
|
||||
if (BPF_CLASS(insn->code) == BPF_JMP)
|
||||
EMIT1(add_2mod(0x48, dst_reg, dst_reg));
|
||||
else if (is_ereg(dst_reg))
|
||||
EMIT1(add_2mod(0x40, dst_reg, dst_reg));
|
||||
maybe_emit_mod(&prog, dst_reg, dst_reg,
|
||||
BPF_CLASS(insn->code) == BPF_JMP);
|
||||
EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
|
||||
goto emit_cond_jmp;
|
||||
}
|
||||
|
||||
@@ -2243,10 +2243,8 @@ emit_jmp:
|
||||
return -EFAULT;
|
||||
}
|
||||
break;
|
||||
/* STX XADD: lock *(u32 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_W:
|
||||
/* STX XADD: lock *(u64 *)(dst + off) += src */
|
||||
case BPF_STX | BPF_XADD | BPF_DW:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_W:
|
||||
case BPF_STX | BPF_ATOMIC | BPF_DW:
|
||||
goto notyet;
|
||||
case BPF_JMP | BPF_EXIT:
|
||||
if (seen_exit) {
|
||||
|
||||
@@ -3109,13 +3109,19 @@ mem_xadd(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, bool is64)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int mem_xadd4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
|
||||
static int mem_atomic4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
|
||||
{
|
||||
if (meta->insn.imm != BPF_ADD)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
return mem_xadd(nfp_prog, meta, false);
|
||||
}
|
||||
|
||||
static int mem_xadd8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
|
||||
static int mem_atomic8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
|
||||
{
|
||||
if (meta->insn.imm != BPF_ADD)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
return mem_xadd(nfp_prog, meta, true);
|
||||
}
|
||||
|
||||
@@ -3475,8 +3481,8 @@ static const instr_cb_t instr_cb[256] = {
|
||||
[BPF_STX | BPF_MEM | BPF_H] = mem_stx2,
|
||||
[BPF_STX | BPF_MEM | BPF_W] = mem_stx4,
|
||||
[BPF_STX | BPF_MEM | BPF_DW] = mem_stx8,
|
||||
[BPF_STX | BPF_XADD | BPF_W] = mem_xadd4,
|
||||
[BPF_STX | BPF_XADD | BPF_DW] = mem_xadd8,
|
||||
[BPF_STX | BPF_ATOMIC | BPF_W] = mem_atomic4,
|
||||
[BPF_STX | BPF_ATOMIC | BPF_DW] = mem_atomic8,
|
||||
[BPF_ST | BPF_MEM | BPF_B] = mem_st1,
|
||||
[BPF_ST | BPF_MEM | BPF_H] = mem_st2,
|
||||
[BPF_ST | BPF_MEM | BPF_W] = mem_st4,
|
||||
|
||||
@@ -428,9 +428,9 @@ static inline bool is_mbpf_classic_store_pkt(const struct nfp_insn_meta *meta)
|
||||
return is_mbpf_classic_store(meta) && meta->ptr.type == PTR_TO_PACKET;
|
||||
}
|
||||
|
||||
static inline bool is_mbpf_xadd(const struct nfp_insn_meta *meta)
|
||||
static inline bool is_mbpf_atomic(const struct nfp_insn_meta *meta)
|
||||
{
|
||||
return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_XADD);
|
||||
return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_ATOMIC);
|
||||
}
|
||||
|
||||
static inline bool is_mbpf_mul(const struct nfp_insn_meta *meta)
|
||||
|
||||
@@ -479,7 +479,7 @@ nfp_bpf_check_ptr(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 			pr_vlog(env, "map writes not supported\n");
 			return -EOPNOTSUPP;
 		}
-		if (is_mbpf_xadd(meta)) {
+		if (is_mbpf_atomic(meta)) {
 			err = nfp_bpf_map_mark_used(env, meta, reg,
 						    NFP_MAP_USE_ATOMIC_CNT);
 			if (err)
@@ -523,12 +523,17 @@ exit_check_ptr:
 }
 
 static int
-nfp_bpf_check_xadd(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
-		   struct bpf_verifier_env *env)
+nfp_bpf_check_atomic(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+		     struct bpf_verifier_env *env)
 {
 	const struct bpf_reg_state *sreg = cur_regs(env) + meta->insn.src_reg;
 	const struct bpf_reg_state *dreg = cur_regs(env) + meta->insn.dst_reg;
 
+	if (meta->insn.imm != BPF_ADD) {
+		pr_vlog(env, "atomic op not implemented: %d\n", meta->insn.imm);
+		return -EOPNOTSUPP;
+	}
+
 	if (dreg->type != PTR_TO_MAP_VALUE) {
 		pr_vlog(env, "atomic add not to a map value pointer: %d\n",
 			dreg->type);
@@ -655,8 +660,8 @@ int nfp_verify_insn(struct bpf_verifier_env *env, int insn_idx,
 	if (is_mbpf_store(meta))
 		return nfp_bpf_check_store(nfp_prog, meta, env);
 
-	if (is_mbpf_xadd(meta))
-		return nfp_bpf_check_xadd(nfp_prog, meta, env);
+	if (is_mbpf_atomic(meta))
+		return nfp_bpf_check_atomic(nfp_prog, meta, env);
 
 	if (is_mbpf_alu(meta))
 		return nfp_bpf_check_alu(nfp_prog, meta, env);
@@ -259,15 +259,32 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
 		.off   = OFF,					\
 		.imm   = 0 })
 
-/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */
-
-#define BPF_STX_XADD(SIZE, DST, SRC, OFF)			\
+/*
+ * Atomic operations:
+ *
+ *   BPF_ADD                  *(uint *) (dst_reg + off16) += src_reg
+ *   BPF_AND                  *(uint *) (dst_reg + off16) &= src_reg
+ *   BPF_OR                   *(uint *) (dst_reg + off16) |= src_reg
+ *   BPF_XOR                  *(uint *) (dst_reg + off16) ^= src_reg
+ *   BPF_ADD | BPF_FETCH      src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
+ *   BPF_AND | BPF_FETCH      src_reg = atomic_fetch_and(dst_reg + off16, src_reg);
+ *   BPF_OR | BPF_FETCH       src_reg = atomic_fetch_or(dst_reg + off16, src_reg);
+ *   BPF_XOR | BPF_FETCH      src_reg = atomic_fetch_xor(dst_reg + off16, src_reg);
+ *   BPF_XCHG                 src_reg = atomic_xchg(dst_reg + off16, src_reg)
+ *   BPF_CMPXCHG              r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
+ */
+
+#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF)			\
 	((struct bpf_insn) {					\
-		.code  = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD,	\
+		.code  = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC,	\
 		.dst_reg = DST,					\
 		.src_reg = SRC,					\
 		.off   = OFF,					\
-		.imm   = 0 })
+		.imm   = OP })
+
+/* Legacy alias */
+#define BPF_STX_XADD(SIZE, DST, SRC, OFF) BPF_ATOMIC_OP(SIZE, BPF_ADD, DST, SRC, OFF)
 
 /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
 
@@ -19,7 +19,8 @@
 
 /* ld/ldx fields */
 #define BPF_DW		0x18	/* double word (64-bit) */
-#define BPF_XADD	0xc0	/* exclusive add */
+#define BPF_ATOMIC	0xc0	/* atomic memory ops - op type in immediate */
+#define BPF_XADD	0xc0	/* exclusive add - legacy name */
 
 /* alu/jmp fields */
 #define BPF_MOV		0xb0	/* mov reg to reg */
@@ -43,6 +44,11 @@
 #define BPF_CALL	0x80	/* function call */
 #define BPF_EXIT	0x90	/* function return */
 
+/* atomic op type fields (stored in immediate) */
+#define BPF_FETCH	0x01	/* not an opcode on its own, used to build others */
+#define BPF_XCHG	(0xe0 | BPF_FETCH)	/* atomic exchange */
+#define BPF_CMPXCHG	(0xf0 | BPF_FETCH)	/* atomic compare-and-write */
+
 /* Register numbers */
 enum {
 	BPF_REG_0 = 0,
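As a rough illustration of how the atomic op type is packed into the immediate, the standalone C sketch below recomputes the values implied by the definitions in this hunk. The constants are copied locally so the snippet builds outside the kernel tree. `atomic_imm_has_fetch`, `atomic_imm_base_op` and `atomic_imm_self_check` are hypothetical helper names chosen for this example; they are not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the UAPI values so the sketch builds standalone. */
#define BPF_ADD		0x00
#define BPF_OR		0x40
#define BPF_AND		0x50
#define BPF_XOR		0xa0
#define BPF_OP(code)	((code) & 0xf0)

#define BPF_FETCH	0x01	/* modifier bit, not an opcode on its own */
#define BPF_XCHG	(0xe0 | BPF_FETCH)
#define BPF_CMPXCHG	(0xf0 | BPF_FETCH)

/* Hypothetical helper: does this immediate ask for the old value back? */
static inline bool atomic_imm_has_fetch(int32_t imm)
{
	return (imm & BPF_FETCH) != 0;
}

/* Hypothetical helper: mask off the low nibble to get the base operation. */
static inline int32_t atomic_imm_base_op(int32_t imm)
{
	return BPF_OP(imm);
}

/* Hypothetical self-check of the encodings derived from the defines above. */
static inline void atomic_imm_self_check(void)
{
	assert(BPF_XCHG == 0xe1);
	assert(BPF_CMPXCHG == 0xf1);
	assert((BPF_ADD | BPF_FETCH) == 0x01);
	assert(atomic_imm_has_fetch(BPF_XCHG));
	assert(!atomic_imm_has_fetch(BPF_AND));
}
```

Because BPF_XCHG and BPF_CMPXCHG already include BPF_FETCH, a single test of the low bit is enough to tell whether an instruction writes a register.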
@@ -2448,7 +2454,7 @@ union bpf_attr {
  *		running simultaneously.
  *
  *		A user should care about the synchronization by himself.
- *		For example, by using the **BPF_STX_XADD** instruction to alter
+ *		For example, by using the **BPF_ATOMIC** instructions to alter
  *		the shared data.
  *	Return
  *		A pointer to the local storage area.
@@ -1309,8 +1309,8 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
 	INSN_3(STX, MEM, H),			\
 	INSN_3(STX, MEM, W),			\
 	INSN_3(STX, MEM, DW),			\
-	INSN_3(STX, XADD, W),			\
-	INSN_3(STX, XADD, DW),			\
+	INSN_3(STX, ATOMIC, W),			\
+	INSN_3(STX, ATOMIC, DW),		\
 	/* Immediate based. */			\
 	INSN_3(ST, MEM, B),			\
 	INSN_3(ST, MEM, H),			\
@@ -1618,13 +1618,59 @@ out:
 	LDX_PROBE(DW, 8)
 #undef LDX_PROBE
 
-	STX_XADD_W: /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
-		atomic_add((u32) SRC, (atomic_t *)(unsigned long)
-			   (DST + insn->off));
-		CONT;
-	STX_XADD_DW: /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */
-		atomic64_add((u64) SRC, (atomic64_t *)(unsigned long)
-			     (DST + insn->off));
+#define ATOMIC_ALU_OP(BOP, KOP)						\
+		case BOP:						\
+			if (BPF_SIZE(insn->code) == BPF_W)		\
+				atomic_##KOP((u32) SRC, (atomic_t *)(unsigned long) \
+					     (DST + insn->off));	\
+			else						\
+				atomic64_##KOP((u64) SRC, (atomic64_t *)(unsigned long) \
+					       (DST + insn->off));	\
+			break;						\
+		case BOP | BPF_FETCH:					\
+			if (BPF_SIZE(insn->code) == BPF_W)		\
+				SRC = (u32) atomic_fetch_##KOP(		\
+					(u32) SRC,			\
+					(atomic_t *)(unsigned long) (DST + insn->off)); \
+			else						\
+				SRC = (u64) atomic64_fetch_##KOP(	\
+					(u64) SRC,			\
+					(atomic64_t *)(unsigned long) (DST + insn->off)); \
+			break;
+
+	STX_ATOMIC_DW:
+	STX_ATOMIC_W:
+		switch (IMM) {
+		ATOMIC_ALU_OP(BPF_ADD, add)
+		ATOMIC_ALU_OP(BPF_AND, and)
+		ATOMIC_ALU_OP(BPF_OR, or)
+		ATOMIC_ALU_OP(BPF_XOR, xor)
+#undef ATOMIC_ALU_OP
+
+		case BPF_XCHG:
+			if (BPF_SIZE(insn->code) == BPF_W)
+				SRC = (u32) atomic_xchg(
+					(atomic_t *)(unsigned long) (DST + insn->off),
+					(u32) SRC);
+			else
+				SRC = (u64) atomic64_xchg(
+					(atomic64_t *)(unsigned long) (DST + insn->off),
+					(u64) SRC);
+			break;
+		case BPF_CMPXCHG:
+			if (BPF_SIZE(insn->code) == BPF_W)
+				BPF_R0 = (u32) atomic_cmpxchg(
+					(atomic_t *)(unsigned long) (DST + insn->off),
+					(u32) BPF_R0, (u32) SRC);
+			else
+				BPF_R0 = (u64) atomic64_cmpxchg(
+					(atomic64_t *)(unsigned long) (DST + insn->off),
+					(u64) BPF_R0, (u64) SRC);
+			break;
+
+		default:
+			goto default_label;
+		}
 		CONT;
 
 	default_label:
@@ -1634,7 +1680,8 @@ out:
 		 *
 		 * Note, verifier whitelists all opcodes in bpf_opcode_in_insntable().
 		 */
-		pr_warn("BPF interpreter: unknown opcode %02x\n", insn->code);
+		pr_warn("BPF interpreter: unknown opcode %02x (imm: 0x%x)\n",
+			insn->code, insn->imm);
 		BUG_ON(1);
 		return 0;
 }
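The register effects of the fetch variants in the interpreter hunk can be modelled in userspace. The sketch below is only an approximation under stated assumptions: it uses C11 `<stdatomic.h>` in place of the kernel's `atomic64_*` helpers, covers the 64-bit case only, and the `model_*` function names are invented for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Models BPF_ADD | BPF_FETCH: memory += src, and the value that was in
 * memory before the add is what the interpreter stores back into src_reg. */
static inline uint64_t model_fetch_add64(_Atomic uint64_t *mem, uint64_t src)
{
	return atomic_fetch_add(mem, src);
}

/* Models BPF_XCHG: memory = src, old memory value goes to src_reg. */
static inline uint64_t model_xchg64(_Atomic uint64_t *mem, uint64_t src)
{
	return atomic_exchange(mem, src);
}

/* Models BPF_CMPXCHG: memory becomes src only if it currently equals r0.
 * The old memory value is returned either way and becomes the new r0. */
static inline uint64_t model_cmpxchg64(_Atomic uint64_t *mem, uint64_t r0,
				       uint64_t src)
{
	uint64_t expected = r0;

	/* On failure, C11 writes the current memory value into 'expected'. */
	atomic_compare_exchange_strong(mem, &expected, src);
	return expected;
}

/* Hypothetical demo mirroring the "0x12 + 0x10 = 0x22" case in test_bpf. */
static inline void model_demo(void)
{
	_Atomic uint64_t slot = 0x10;
	uint64_t old = model_fetch_add64(&slot, 0x12);

	assert(old == 0x10);
	assert(atomic_load(&slot) == 0x22);
}
```

The non-fetch forms (plain BPF_ADD, BPF_AND, BPF_OR, BPF_XOR) update memory the same way but leave every register untouched, which is why they need no separate model here.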
@@ -80,6 +80,13 @@ const char *const bpf_alu_string[16] = {
 	[BPF_END >> 4]  = "endian",
 };
 
+static const char *const bpf_atomic_alu_string[16] = {
+	[BPF_ADD >> 4]  = "add",
+	[BPF_AND >> 4]  = "and",
+	[BPF_OR >> 4]  = "or",
+	[BPF_XOR >> 4]  = "xor",
+};
+
 static const char *const bpf_ldst_string[] = {
 	[BPF_W >> 3]  = "u32",
 	[BPF_H >> 3]  = "u16",
@@ -153,14 +160,44 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
 				insn->dst_reg,
 				insn->off, insn->src_reg);
-		else if (BPF_MODE(insn->code) == BPF_XADD)
-			verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) += r%d\n",
+		else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+			 (insn->imm == BPF_ADD || insn->imm == BPF_AND ||
+			  insn->imm == BPF_OR || insn->imm == BPF_XOR)) {
+			verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) %s r%d\n",
 				insn->code,
 				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
 				insn->dst_reg, insn->off,
+				bpf_alu_string[BPF_OP(insn->imm) >> 4],
 				insn->src_reg);
-		else
+		} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+			   (insn->imm == (BPF_ADD | BPF_FETCH) ||
+			    insn->imm == (BPF_AND | BPF_FETCH) ||
+			    insn->imm == (BPF_OR | BPF_FETCH) ||
+			    insn->imm == (BPF_XOR | BPF_FETCH))) {
+			verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_%s((%s *)(r%d %+d), r%d)\n",
+				insn->code, insn->src_reg,
+				BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+				bpf_atomic_alu_string[BPF_OP(insn->imm) >> 4],
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->dst_reg, insn->off, insn->src_reg);
+		} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+			   insn->imm == BPF_CMPXCHG) {
+			verbose(cbs->private_data, "(%02x) r0 = atomic%s_cmpxchg((%s *)(r%d %+d), r0, r%d)\n",
+				insn->code,
+				BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->dst_reg, insn->off,
+				insn->src_reg);
+		} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+			   insn->imm == BPF_XCHG) {
+			verbose(cbs->private_data, "(%02x) r%d = atomic%s_xchg((%s *)(r%d %+d), r%d)\n",
+				insn->code, insn->src_reg,
+				BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->dst_reg, insn->off, insn->src_reg);
+		} else {
 			verbose(cbs->private_data, "BUG_%02x\n", insn->code);
+		}
 	} else if (class == BPF_ST) {
 		if (BPF_MODE(insn->code) != BPF_MEM) {
 			verbose(cbs->private_data, "BUG_st_%02x\n", insn->code);
@@ -3604,13 +3604,30 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 	return err;
 }
 
-static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
+static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
 {
+	int load_reg;
 	int err;
 
-	if ((BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) ||
-	    insn->imm != 0) {
-		verbose(env, "BPF_XADD uses reserved fields\n");
+	switch (insn->imm) {
+	case BPF_ADD:
+	case BPF_ADD | BPF_FETCH:
+	case BPF_AND:
+	case BPF_AND | BPF_FETCH:
+	case BPF_OR:
+	case BPF_OR | BPF_FETCH:
+	case BPF_XOR:
+	case BPF_XOR | BPF_FETCH:
+	case BPF_XCHG:
+	case BPF_CMPXCHG:
+		break;
+	default:
+		verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
+		return -EINVAL;
+	}
+
+	if (BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) {
+		verbose(env, "invalid atomic operand size\n");
 		return -EINVAL;
 	}
 
@@ -3624,6 +3641,13 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins
 	if (err)
 		return err;
 
+	if (insn->imm == BPF_CMPXCHG) {
+		/* Check comparison of R0 with memory location */
+		err = check_reg_arg(env, BPF_REG_0, SRC_OP);
+		if (err)
+			return err;
+	}
+
 	if (is_pointer_value(env, insn->src_reg)) {
 		verbose(env, "R%d leaks addr into mem\n", insn->src_reg);
 		return -EACCES;
@@ -3633,21 +3657,38 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins
 	    is_pkt_reg(env, insn->dst_reg) ||
 	    is_flow_key_reg(env, insn->dst_reg) ||
 	    is_sk_reg(env, insn->dst_reg)) {
-		verbose(env, "BPF_XADD stores into R%d %s is not allowed\n",
+		verbose(env, "BPF_ATOMIC stores into R%d %s is not allowed\n",
 			insn->dst_reg,
 			reg_type_str[reg_state(env, insn->dst_reg)->type]);
 		return -EACCES;
 	}
 
-	/* check whether atomic_add can read the memory */
+	/* check whether we can read the memory */
 	err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
 			       BPF_SIZE(insn->code), BPF_READ, -1, true);
 	if (err)
 		return err;
 
-	/* check whether atomic_add can write into the same memory */
-	return check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
-				BPF_SIZE(insn->code), BPF_WRITE, -1, true);
+	/* check whether we can write into the same memory */
+	err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
+			       BPF_SIZE(insn->code), BPF_WRITE, -1, true);
+	if (err)
+		return err;
+
+	if (!(insn->imm & BPF_FETCH))
+		return 0;
+
+	if (insn->imm == BPF_CMPXCHG)
+		load_reg = BPF_REG_0;
+	else
+		load_reg = insn->src_reg;
+
+	/* check and record load of old value */
+	err = check_reg_arg(env, load_reg, DST_OP);
+	if (err)
+		return err;
+
+	return 0;
 }
 
 static int __check_stack_boundary(struct bpf_verifier_env *env, u32 regno,
@@ -9524,14 +9565,19 @@ static int do_check(struct bpf_verifier_env *env)
 		} else if (class == BPF_STX) {
 			enum bpf_reg_type *prev_dst_type, dst_reg_type;
 
-			if (BPF_MODE(insn->code) == BPF_XADD) {
-				err = check_xadd(env, env->insn_idx, insn);
+			if (BPF_MODE(insn->code) == BPF_ATOMIC) {
+				err = check_atomic(env, env->insn_idx, insn);
 				if (err)
 					return err;
 				env->insn_idx++;
 				continue;
 			}
 
+			if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) {
+				verbose(env, "BPF_STX uses reserved fields\n");
+				return -EINVAL;
+			}
+
 			/* check src1 operand */
 			err = check_reg_arg(env, insn->src_reg, SRC_OP);
 			if (err)
@@ -10008,13 +10054,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			return -EINVAL;
 		}
 
-		if (BPF_CLASS(insn->code) == BPF_STX &&
-		    ((BPF_MODE(insn->code) != BPF_MEM &&
-		      BPF_MODE(insn->code) != BPF_XADD) || insn->imm != 0)) {
-			verbose(env, "BPF_STX uses reserved fields\n");
-			return -EINVAL;
-		}
-
 		if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) {
 			struct bpf_insn_aux_data *aux;
 			struct bpf_map *map;
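The tail of check_atomic() decides which register, if any, is marked as written with the old memory value. A minimal standalone sketch of that decision follows, assuming the encodings from the UAPI hunk. `atomic_load_reg` is a hypothetical name used only here, and the -1 return for "no register written" is a convention of this example rather than of the verifier.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the relevant UAPI values. */
#define BPF_FETCH	0x01
#define BPF_XCHG	(0xe0 | BPF_FETCH)
#define BPF_CMPXCHG	(0xf0 | BPF_FETCH)
#define BPF_REG_0	0
#define BPF_REG_10	10

/*
 * Hypothetical helper: return the register that receives the old memory
 * value for a given atomic immediate, or -1 when the op writes no register.
 */
static inline int atomic_load_reg(int32_t imm, int src_reg)
{
	/* Precondition for this sketch: a valid eBPF register number. */
	assert(src_reg >= BPF_REG_0 && src_reg <= BPF_REG_10);

	if (!(imm & BPF_FETCH))
		return -1;		/* plain add/and/or/xor */
	if (imm == BPF_CMPXCHG)
		return BPF_REG_0;	/* cmpxchg always returns via r0 */
	return src_reg;			/* fetch variants and xchg */
}
```

For example, `atomic_load_reg(BPF_CMPXCHG, 3)` yields 0, while `atomic_load_reg(BPF_XCHG, 3)` yields 3.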
@@ -4295,13 +4295,13 @@ static struct bpf_test tests[] = {
 		{ { 0, 0xffffffff } },
 		.stack_depth = 40,
 	},
-	/* BPF_STX | BPF_XADD | BPF_W/DW */
+	/* BPF_STX | BPF_ATOMIC | BPF_W/DW */
 	{
 		"STX_XADD_W: Test: 0x12 + 0x10 = 0x22",
 		.u.insns_int = {
 			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
 			BPF_ST_MEM(BPF_W, R10, -40, 0x10),
-			BPF_STX_XADD(BPF_W, R10, R0, -40),
+			BPF_ATOMIC_OP(BPF_W, BPF_ADD, R10, R0, -40),
 			BPF_LDX_MEM(BPF_W, R0, R10, -40),
 			BPF_EXIT_INSN(),
 		},
@@ -4316,7 +4316,7 @@ static struct bpf_test tests[] = {
 			BPF_ALU64_REG(BPF_MOV, R1, R10),
 			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
 			BPF_ST_MEM(BPF_W, R10, -40, 0x10),
-			BPF_STX_XADD(BPF_W, R10, R0, -40),
+			BPF_ATOMIC_OP(BPF_W, BPF_ADD, R10, R0, -40),
 			BPF_ALU64_REG(BPF_MOV, R0, R10),
 			BPF_ALU64_REG(BPF_SUB, R0, R1),
 			BPF_EXIT_INSN(),
@@ -4331,7 +4331,7 @@ static struct bpf_test tests[] = {
 		.u.insns_int = {
 			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
 			BPF_ST_MEM(BPF_W, R10, -40, 0x10),
-			BPF_STX_XADD(BPF_W, R10, R0, -40),
+			BPF_ATOMIC_OP(BPF_W, BPF_ADD, R10, R0, -40),
 			BPF_EXIT_INSN(),
 		},
 		INTERNAL,
@@ -4352,7 +4352,7 @@ static struct bpf_test tests[] = {
 		.u.insns_int = {
 			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
 			BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
-			BPF_STX_XADD(BPF_DW, R10, R0, -40),
+			BPF_ATOMIC_OP(BPF_DW, BPF_ADD, R10, R0, -40),
 			BPF_LDX_MEM(BPF_DW, R0, R10, -40),
 			BPF_EXIT_INSN(),
 		},
@@ -4367,7 +4367,7 @@ static struct bpf_test tests[] = {
 			BPF_ALU64_REG(BPF_MOV, R1, R10),
 			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
 			BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
-			BPF_STX_XADD(BPF_DW, R10, R0, -40),
+			BPF_ATOMIC_OP(BPF_DW, BPF_ADD, R10, R0, -40),
 			BPF_ALU64_REG(BPF_MOV, R0, R10),
 			BPF_ALU64_REG(BPF_SUB, R0, R1),
 			BPF_EXIT_INSN(),
@@ -4382,7 +4382,7 @@ static struct bpf_test tests[] = {
 		.u.insns_int = {
 			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
 			BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
-			BPF_STX_XADD(BPF_DW, R10, R0, -40),
+			BPF_ATOMIC_OP(BPF_DW, BPF_ADD, R10, R0, -40),
 			BPF_EXIT_INSN(),
 		},
 		INTERNAL,