Commit Graph

277 Commits

Author SHA1 Message Date
Daniel Borkmann 42984d7c1e bpf: add sample usages for persistent maps/progs
This patch adds a couple of stand-alone examples on how BPF_OBJ_PIN
and BPF_OBJ_GET commands can be used.

Example with maps:

  # ./fds_example -F /sys/fs/bpf/m -P -m -k 1 -v 42
  bpf: map fd:3 (Success)
  bpf: pin ret:(0,Success)
  bpf: fd:3 u->(1:42) ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/m -G -m -k 1
  bpf: get fd:3 (Success)
  bpf: fd:3 l->(1):42 ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/m -G -m -k 1 -v 24
  bpf: get fd:3 (Success)
  bpf: fd:3 u->(1:24) ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/m -G -m -k 1
  bpf: get fd:3 (Success)
  bpf: fd:3 l->(1):24 ret:(0,Success)

  # ./fds_example -F /sys/fs/bpf/m2 -P -m
  bpf: map fd:3 (Success)
  bpf: pin ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/m2 -G -m -k 1
  bpf: get fd:3 (Success)
  bpf: fd:3 l->(1):0 ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/m2 -G -m
  bpf: get fd:3 (Success)

Example with progs:

  # ./fds_example -F /sys/fs/bpf/p -P -p
  bpf: prog fd:3 (Success)
  bpf: pin ret:(0,Success)
  bpf sock:4 <- fd:3 attached ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/p -G -p
  bpf: get fd:3 (Success)
  bpf: sock:4 <- fd:3 attached ret:(0,Success)

  # ./fds_example -F /sys/fs/bpf/p2 -P -p -o ./sockex1_kern.o
  bpf: prog fd:5 (Success)
  bpf: pin ret:(0,Success)
  bpf: sock:3 <- fd:5 attached ret:(0,Success)
  # ./fds_example -F /sys/fs/bpf/p2 -G -p
  bpf: get fd:3 (Success)
  bpf: sock:4 <- fd:3 attached ret:(0,Success)

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-02 22:48:39 -05:00
Chunyan Zhang 67aedeb857 Sample: Trace_event: Correct the comments
The commit 889204278c ("tracing: Update trace-event-sample with
TRACE_SYSTEM_VAR documentation") changed TRACE_SYSTEM to 'sample-trace',
but didn't make the according change of its name in the comments.

Link: http://lkml.kernel.org/r/1443599650-23680-1-git-send-email-zhang.chunyan@linaro.org

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-02 14:09:15 -05:00
David S. Miller b75ec3af27 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-01 00:15:30 -04:00
Yang Shi 85ff8a43f3 bpf: sample: define aarch64 specific registers
Define aarch64 specific registers for building bpf samples correctly.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:51:10 -07:00
Alexei Starovoitov 39111695b1 samples: bpf: add bpf_perf_event_output example
Performance test and example of bpf_perf_event_output().
kprobe is attached to sys_write() and trivial bpf program streams
pid+cookie into userspace via PERF_COUNT_SW_BPF_OUTPUT event.

Usage:
$ sudo ./bld_x64/samples/bpf/trace_output
recv 2968913 events per sec

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22 06:42:15 -07:00
Christoph Hellwig 517982229f configfs: remove old API
Remove the old show_attribute and store_attribute methods and update
the documentation.  Also replace the two C samples with a single new
one in the proper samples directory where people expect to find it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-10-13 22:17:57 -07:00
Alexei Starovoitov bf5088773f bpf: add unprivileged bpf tests
Add new tests samples/bpf/test_verifier:

unpriv: return pointer
  checks that pointer cannot be returned from the eBPF program

unpriv: add const to pointer
unpriv: add pointer to pointer
unpriv: neg pointer
  checks that pointer arithmetic is disallowed

unpriv: cmp pointer with const
unpriv: cmp pointer with pointer
  checks that comparison of pointers is disallowed
  Only one case allowed 'void *value = bpf_map_lookup_elem(..); if (value == 0) ...'

unpriv: check that printk is disallowed
  since bpf_trace_printk is not available to unprivileged

unpriv: pass pointer to helper function
  checks that pointers cannot be passed to functions that expect integers
  If function expects a pointer the verifier allows only that type of pointer.
  Like 1st argument of bpf_map_lookup_elem() must be pointer to map.
  (applies to non-root as well)

unpriv: indirectly pass pointer on stack to helper function
  checks that pointer stored into stack cannot be used as part of key
  passed into bpf_map_lookup_elem()

unpriv: mangle pointer on stack 1
unpriv: mangle pointer on stack 2
  checks that writing into stack slot that already contains a pointer
  is disallowed

unpriv: read pointer from stack in small chunks
  checks that < 8 byte read from stack slot that contains a pointer is
  disallowed

unpriv: write pointer into ctx
  checks that storing pointers into skb->fields is disallowed

unpriv: write pointer into map elem value
  checks that storing pointers into element values is disallowed
  For example:
  int bpf_prog(struct __sk_buff *skb)
  {
    u32 key = 0;
    u64 *value = bpf_map_lookup_elem(&map, &key);
    if (value)
       *value = (u64) skb;
  }
  will be rejected.

unpriv: partial copy of pointer
  checks that doing 32-bit register mov from register containing
  a pointer is disallowed

unpriv: pass pointer to tail_call
  checks that passing pointer as an index into bpf_tail_call
  is disallowed

unpriv: cmp map pointer with zero
  checks that comparing map pointer with constant is disallowed

unpriv: write into frame pointer
  checks that frame pointer is read-only (applies to root too)

unpriv: cmp of frame pointer
  checks that R10 cannot be using in comparison

unpriv: cmp of stack pointer
  checks that Rx = R10 - imm is ok, but comparing Rx is not

unpriv: obfuscate stack pointer
  checks that Rx = R10 - imm is ok, but Rx -= imm is not

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:13:37 -07:00
David S. Miller f6d3125fa3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/dsa/slave.c

net/dsa/slave.c simply had overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-02 07:21:25 -07:00
Petr Mladek 54aea45429 kprobes: use _do_fork() in samples to make them work again
Commit 3033f14ab7 ("clone: support passing tls argument via C rather
than pt_regs magic") introduced _do_fork() that allowed to pass @tls
parameter.

The old do_fork() is defined only for architectures that are not ready
to use this way and do not define HAVE_COPY_THREAD_TLS.

Let's use _do_fork() in the kprobe examples to make them work again on
all architectures.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Cc: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-01 21:42:35 -04:00
Alexei Starovoitov 27b29f6305 bpf: add bpf_redirect() helper
Existing bpf_clone_redirect() helper clones skb before redirecting
it to RX or TX of destination netdev.
Introduce bpf_redirect() helper that does that without cloning.

Benchmarked with two hosts using 10G ixgbe NICs.
One host is doing line rate pktgen.
Another host is configured as:
$ tc qdisc add dev $dev ingress
$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
so it receives the packet on $dev and immediately xmits it on $dev + 1
The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
that does bpf_clone_redirect() and performance is 2.0 Mpps

$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
which is using bpf_redirect() - 2.4 Mpps

and using cls_bpf with integrated actions as:
$ tc filter add dev $dev root pref 10 \
  bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
performance is 2.5 Mpps

To summarize:
u32+act_bpf using clone_redirect - 2.0 Mpps
u32+act_bpf using redirect - 2.4 Mpps
cls_bpf using redirect - 2.5 Mpps

For comparison linux bridge in this setup is doing 2.1 Mpps
and ixgbe rx + drop in ip_rcv - 7.8 Mpps

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 21:09:07 -07:00
Kaixu Xia 5ed3ccbd5a bpf: fix build warnings and add function read_trace_pipe()
There are two improvements in this patch:
 1. Fix the build warnings;
 2. Add function read_trace_pipe() to print the result on
    the screen;

Before this patch, we can get the result through /sys/kernel/de
bug/tracing/trace_pipe and get nothing on the screen.
By applying this patch, the result can be printed on the screen.
  $ ./tracex6
	...
         tracex6-705   [003] d..1   131.428593: : CPU-3   19981414
            sshd-683   [000] d..1   131.428727: : CPU-0   221682321
            sshd-683   [000] d..1   131.428821: : CPU-0   221808766
            sshd-683   [000] d..1   131.428950: : CPU-0   221982984
            sshd-683   [000] d..1   131.429045: : CPU-0   222111851
         tracex6-705   [003] d..1   131.429168: : CPU-3   20757551
            sshd-683   [000] d..1   131.429170: : CPU-0   222281240
            sshd-683   [000] d..1   131.429261: : CPU-0   222403340
            sshd-683   [000] d..1   131.429378: : CPU-0   222561024
	...

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:39:12 -07:00
Kaixu Xia 47efb30274 samples/bpf: example of get selected PMU counter value
This is a simple example and shows how to use the new ability
to get the selected Hardware PMU counter value.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:50:06 -07:00
Alex Gartrell 24b4d2abd0 ebpf: Allow dereferences of PTR_TO_STACK registers
mov %rsp, %r1           ; r1 = rsp
        add $-8, %r1            ; r1 = rsp - 8
        store_q $123, -8(%rsp)  ; *(u64*)r1 = 123  <- valid
        store_q $123, (%r1)     ; *(u64*)r1 = 123  <- previously invalid
        mov $0, %r0
        exit                    ; Always need to exit

And we'd get the following error:

	0: (bf) r1 = r10
	1: (07) r1 += -8
	2: (7a) *(u64 *)(r10 -8) = 999
	3: (7a) *(u64 *)(r1 +0) = 999
	R1 invalid mem access 'fp'

	Unable to load program

We already know that a register is a stack address and the appropriate
offset, so we should be able to validate those references as well.

Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27 00:54:10 -07:00
David S. Miller c5e40ee287 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/bridge/br_mdb.c

br_mdb.c conflict was a function call being removed to fix a bug in
'net' but whose signature was changed in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-23 00:41:16 -07:00
Steven Rostedt (Red Hat) d6726c8145 tracing: Fix sample output of dynamic arrays
He Kuang noticed that the trace event samples for arrays was broken:

"The output result of trace_foo_bar event in traceevent samples is
 wrong. This problem can be reproduced as following:

  (Build kernel with SAMPLE_TRACE_EVENTS=m)

  $ insmod trace-events-sample.ko

  $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable

  $ cat /sys/kernel/debug/tracing/trace

  event-sample-980 [000] ....  43.649559: foo_bar: foo hello 21 0x15
  BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 The array length is not right, should be {0x1}.
  (ffffffff,ffffffff)

  event-sample-980 [000] ....  44.653827: foo_bar: foo hello 22 0x16
  BIT2|BIT3|0x10
  {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 The array length is not right, should be {0x1,0x2}.
  Gandalf (ffffffff,ffffffff)"

This was caused by an update to have __print_array()'s second parameter
be the count of items in the array and not the size of the array.

As there is already users of __print_array(), it can not change. But
the sample code can and we can also improve on the documentation about
__print_array() and __get_dynamic_array_len().

Link: http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-hekuang@huawei.com

Fixes: ac01ce1410 ("tracing: Make ftrace_print_array_seq compute buf_len")
Reported-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-07-17 14:15:13 -04:00
Michael Holzheu d912557b34 samples: bpf: enable trace samples for s390x
The trace bpf samples do not compile on s390x because they use x86
specific fields from the "pt_regs" structure.

Fix this and access the fields via new PT_REGS macros.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 15:17:45 -07:00
Daniel Wagner 0fb1170ee6 bpf: BPF based latency tracing
BPF offers another way to generate latency histograms. We attach
kprobes at trace_preempt_off and trace_preempt_on and calculate the
time it takes to from seeing the off/on transition.

The first array is used to store the start time stamp. The key is the
CPU id. The second array stores the log2(time diff). We need to use
static allocation here (array and not hash tables). The kprobes
hooking into trace_preempt_on|off should not calling any dynamic
memory allocation or free path. We need to avoid recursivly
getting called. Besides that, it reduces jitter in the measurement.

CPU 0
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 166723   |*************************************** |
    4096 -> 8191     : 19870    |***                                     |
    8192 -> 16383    : 6324     |                                        |
   16384 -> 32767    : 1098     |                                        |
   32768 -> 65535    : 190      |                                        |
   65536 -> 131071   : 179      |                                        |
  131072 -> 262143   : 18       |                                        |
  262144 -> 524287   : 4        |                                        |
  524288 -> 1048575  : 1363     |                                        |
CPU 1
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 114042   |*************************************** |
    4096 -> 8191     : 9587     |**                                      |
    8192 -> 16383    : 4140     |                                        |
   16384 -> 32767    : 673      |                                        |
   32768 -> 65535    : 179      |                                        |
   65536 -> 131071   : 29       |                                        |
  131072 -> 262143   : 4        |                                        |
  262144 -> 524287   : 1        |                                        |
  524288 -> 1048575  : 364      |                                        |
CPU 2
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 40147    |*************************************** |
    4096 -> 8191     : 2300     |*                                       |
    8192 -> 16383    : 828      |                                        |
   16384 -> 32767    : 178      |                                        |
   32768 -> 65535    : 59       |                                        |
   65536 -> 131071   : 2        |                                        |
  131072 -> 262143   : 0        |                                        |
  262144 -> 524287   : 1        |                                        |
  524288 -> 1048575  : 174      |                                        |
CPU 3
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 29626    |*************************************** |
    4096 -> 8191     : 2704     |**                                      |
    8192 -> 16383    : 1090     |                                        |
   16384 -> 32767    : 160      |                                        |
   32768 -> 65535    : 72       |                                        |
   65536 -> 131071   : 32       |                                        |
  131072 -> 262143   : 26       |                                        |
  262144 -> 524287   : 12       |                                        |
  524288 -> 1048575  : 298      |                                        |

All this is based on the trace3 examples written by
Alexei Starovoitov <ast@plumgrid.com>.

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23 06:09:58 -07:00
Alexei Starovoitov ffeedafbf0 bpf: introduce current->pid, tgid, uid, gid, comm accessors
eBPF programs attached to kprobes need to filter based on
current->pid, uid and other fields, so introduce helper functions:

u64 bpf_get_current_pid_tgid(void)
Return: current->tgid << 32 | current->pid

u64 bpf_get_current_uid_gid(void)
Return: current_gid << 32 | current_uid

bpf_get_current_comm(char *buf, int size_of_buf)
stores current->comm into buf

They can be used from the programs attached to TC as well to classify packets
based on current task fields.

Update tracex2 example to print histogram of write syscalls for each process
instead of aggregated for all.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15 15:53:50 -07:00
Alexei Starovoitov d691f9e8d4 bpf: allow programs to write to certain skb fields
allow programs read/write skb->mark, tc_index fields and
((struct qdisc_skb_cb *)cb)->data.

mark and tc_index are generically useful in TC.
cb[0]-cb[4] are primarily used to pass arguments from one
program to another called via bpf_tail_call() which can
be seen in sockex3_kern.c example.

All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs.
mark, tc_index are writeable from tc_cls_act only.
cb[0]-cb[4] are writeable by both sockets and tc_cls_act.

Add verifier tests and improve sample code.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 02:01:33 -07:00
Alexei Starovoitov 3431205e03 bpf: make programs see skb->data == L2 for ingress and egress
eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data.
For ingress L2 header is already pulled, whereas for egress it's present.
This is known to program writers which are currently forced to use
BPF_LL_OFF workaround.
Since programs don't change skb internal pointers it is safe to do
pull/push right around invocation of the program and earlier taps and
later pt->func() will not be affected.
Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick
around run_filter/BPF_PROG_RUN even if skb_shared.

This fix finally allows programs to use optimized LD_ABS/IND instructions
without BPF_LL_OFF for higher performance.
tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o
       w/o JIT   w/JIT
before  20.5     23.6 Mpps
after   21.8     26.6 Mpps

Old programs with BPF_LL_OFF will still work as-is.

We can now undo most of the earlier workaround commit:
a166151cbe ("bpf: fix bpf helpers to use skb->mac_header relative offsets")

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 02:01:33 -07:00
Jesper Dangaard Brouer 05a14d5e17 pktgen: add benchmark script pktgen_bench_xmit_mode_netif_receive.sh
This script pktgen_bench_xmit_mode_netif_receive.sh is a benchmark
script, which can be used for benchmarking part of the network stack.
This can be used for performance improving or catching regression in
that area.

The script is developed for benchmarking ingress qdisc path, original
idea by Alexei Starovoitov.  This script don't really need any
hardware.  This is achieved via the recently introduced stack inject
feature "xmit_mode netif_receive". See commit 62f64aed62 ("pktgen:
introduce xmit_mode '<start_xmit|netif_receive>'").

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer 1d73ba16ad pktgen: add sample script pktgen_sample03_burst_single_flow.sh
Add the pktgen samples script pktgen_sample03_burst_single_flow.sh
that demonstrates how to acheive maximum performance.

If correctly tuned[1] single CPU 10Gbit/s wirespeed small pkts is
possible[2] which is 14.88Mpps.  The trick is to take advantage of the
"burst" feature introduced in commit 38b2cf2982 ("net: pktgen:
packet bursting via skb->xmit_more").

[1] http://netoptimizer.blogspot.dk/2014/06/pktgen-for-network-overload-testing.html
[2] http://netoptimizer.blogspot.dk/2014/10/unlocked-10gbps-tx-wirespeed-smallest.html

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer 282fb58947 pktgen: add sample script pktgen_sample02_multiqueue.sh
Add the pktgen samples script pktgen_sample02_multiqueue.sh that
demonstrates generating packets on multiqueue NICs.

Specifically notice the options "-t" that specifies how many
kernel threads to activate.  Also notice the flag QUEUE_MAP_CPU,
which cause the SKB TX queue to be mapped to the CPU running the
kernel thread.  For best scalability people are also encourage to
map NIC IRQ /proc/irq/*/smp_affinity to CPU number.

Usage example with "-t" 4 threads and help:
 ./pktgen_sample02_multiqueue.sh -i eth4 -m 00:1B:21:3C:9D:F8 -t 4

Usage: ./pktgen_sample02_multiqueue.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -t : ($THREADS)   threads to start
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -b : ($BURST)     HW level bursting of SKBs
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

Removing pktgen.conf-2-1 and pktgen.conf-2-2 as these examples
should be covered now.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer 6f09479758 pktgen: add sample script pktgen_sample01_simple.sh
Add the first basic pktgen samples script pktgen_sample01_simple.sh,
which demonstrates the a simple use of the helper functions.
Removing pktgen.conf-1-1 as that example should be covered now.

The naming scheme pktgen_sampleNN, where NN is a number, should encourage
reading the samples in a specific order.

Script cause pktgen sending with a single thread and single interface,
and introduce flow variation via random UDP source port.

Usage example and help:
 ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2

Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer b64b0d1e64 pktgen: new pktgen helper functions for samples scripts
Preparing for removing existing samples/pktgen/ scripts, and
replacing these with easier to use samples.

This commit provides two helper shell files, that can
be "included" by shell source'ing. Namely "functions.sh"
and "parameters.sh".

The parameters.sh file support easy and consistant parameter
parsing across the sample scripts.  Usage example is printed on
errors.

The functions.sh file provides, three new shell functions for
configuring the different components of pktgen: pg_ctrl(),
pg_thread() and pg_set().  A slightly improved version of the old
pgset() function is also provided for backwards compat.

The new functions correspond to pktgens different components.
 * pg_ctrl()   control "pgctrl" (/proc/net/pktgen/pgctrl)
 * pg_thread() control the kernel threads and binding to devices
 * pg_set()    control setup of individual devices

These changes are borrowed from:
 https://github.com/netoptimizer/network-testing/tree/master/pktgen

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00