519862 Commits

Author SHA1 Message Date
Jesper Dangaard Brouer 05a14d5e17 pktgen: add benchmark script pktgen_bench_xmit_mode_netif_receive.sh
The script pktgen_bench_xmit_mode_netif_receive.sh is a benchmark
script that can be used for benchmarking part of the network stack.
It can be used for improving performance or catching regressions in
that area.

The script was developed for benchmarking the ingress qdisc path; the
original idea is by Alexei Starovoitov.  The script doesn't really need
any hardware.  This is achieved via the recently introduced stack inject
feature "xmit_mode netif_receive". See commit 62f64aed62 ("pktgen:
introduce xmit_mode '<start_xmit|netif_receive>'").

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer 1d73ba16ad pktgen: add sample script pktgen_sample03_burst_single_flow.sh
Add the pktgen sample script pktgen_sample03_burst_single_flow.sh
that demonstrates how to achieve maximum performance.

If correctly tuned[1], a single CPU can reach 10Gbit/s wirespeed with
small packets[2], which is 14.88Mpps.  The trick is to take advantage of the
"burst" feature introduced in commit 38b2cf2982 ("net: pktgen:
packet bursting via skb->xmit_more").

[1] http://netoptimizer.blogspot.dk/2014/06/pktgen-for-network-overload-testing.html
[2] http://netoptimizer.blogspot.dk/2014/10/unlocked-10gbps-tx-wirespeed-smallest.html

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
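The 14.88Mpps figure in commit 1d73ba16ad follows from simple wire arithmetic, sketched below. The function name wirespeed_pps is made up for illustration, and the 20 bytes of per-frame overhead (preamble, start-of-frame delimiter and inter-frame gap) come from standard Ethernet framing rather than from the commit message.

```shell
# Packets per second at line rate for a given link speed and frame size.
# Assumption: each Ethernet frame costs 20 extra bytes on the wire
# (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap).
wirespeed_pps() {
    local link_bps=$1     # link speed in bits per second
    local frame_bytes=$2  # frame size in bytes, including the 4-byte FCS
    echo $(( link_bps / ((frame_bytes + 20) * 8) ))
}

# 10 Gbit/s with minimum-size 64-byte frames
wirespeed_pps 10000000000 64   # prints 14880952, i.e. about 14.88 Mpps
```

Each minimum-size frame occupies 84 bytes (672 bits) on the wire, so a 10Gbit/s link carries at most 10^10 / 672 of them per second.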
Jesper Dangaard Brouer 282fb58947 pktgen: add sample script pktgen_sample02_multiqueue.sh
Add the pktgen sample script pktgen_sample02_multiqueue.sh that
demonstrates generating packets on multiqueue NICs.

Specifically notice the option "-t", which specifies how many
kernel threads to activate.  Also notice the flag QUEUE_MAP_CPU,
which causes the SKB TX queue to be mapped to the CPU running the
kernel thread.  For best scalability, people are also encouraged to
map the NIC IRQs to CPU numbers via /proc/irq/*/smp_affinity.

Usage example with 4 threads ("-t 4") and help:
 ./pktgen_sample02_multiqueue.sh -i eth4 -m 00:1B:21:3C:9D:F8 -t 4

Usage: ./pktgen_sample02_multiqueue.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -t : ($THREADS)   threads to start
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -b : ($BURST)     HW level bursting of SKBs
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

Removing pktgen.conf-2-1 and pktgen.conf-2-2 as these examples
should be covered now.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
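The IRQ-to-CPU mapping recommended in commit 282fb58947 amounts to writing a one-bit hex mask into each IRQ's smp_affinity file. The sketch below only prints the commands instead of running them, since the writes need root. The function names and the IRQ numbers 64 to 67 are hypothetical; the real IRQ numbers of a NIC would have to be looked up in /proc/interrupts.

```shell
# Hex bitmask that selects a single CPU, in the format accepted by
# /proc/irq/<irq>/smp_affinity. This simple form is only valid for CPU
# numbers below 32; larger masks are written as comma-separated
# 32-bit groups.
cpu_to_affinity_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# Print (do not execute) the commands that would pin the given IRQs to
# CPUs 0, 1, 2, ... in order.
print_irq_pinning() {
    local cpu=0
    local irq
    for irq in "$@"; do
        echo "echo $(cpu_to_affinity_mask "$cpu") > /proc/irq/$irq/smp_affinity"
        cpu=$(( cpu + 1 ))
    done
}

# Hypothetical IRQ numbers for four NIC queues
print_irq_pinning 64 65 66 67
```

With one pktgen thread per CPU and QUEUE_MAP_CPU set, pinning queue N's IRQ to CPU N keeps TX completion work on the CPU that generated the packets.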
Jesper Dangaard Brouer 6f09479758 pktgen: add sample script pktgen_sample01_simple.sh
Add the first basic pktgen sample script pktgen_sample01_simple.sh,
which demonstrates a simple use of the helper functions.
Removing pktgen.conf-1-1 as that example should be covered now.

The naming scheme pktgen_sampleNN, where NN is a number, should encourage
reading the samples in a specific order.

The script makes pktgen send with a single thread and a single interface,
and introduces flow variation via a random UDP source port.

Usage example and help:
 ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2

Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer b64b0d1e64 pktgen: new pktgen helper functions for samples scripts
Preparing for removing existing samples/pktgen/ scripts, and
replacing these with easier to use samples.

This commit provides two helper shell files, that can
be "included" by shell source'ing. Namely "functions.sh"
and "parameters.sh".

The parameters.sh file supports easy and consistent parameter
parsing across the sample scripts.  A usage example is printed on
errors.

The functions.sh file provides three new shell functions for
configuring the different components of pktgen: pg_ctrl(),
pg_thread() and pg_set().  A slightly improved version of the old
pgset() function is also provided for backwards compat.

The new functions correspond to pktgen's different components.
 * pg_ctrl()   control "pgctrl" (/proc/net/pktgen/pgctrl)
 * pg_thread() control the kernel threads and binding to devices
 * pg_set()    control setup of individual devices

These changes are borrowed from:
 https://github.com/netoptimizer/network-testing/tree/master/pktgen

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
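As a rough illustration of the split described in commit b64b0d1e64, the three helpers could be thin wrappers that each write a command string to a different file under /proc/net/pktgen. This is a minimal sketch, not the contents of functions.sh: the proc_write helper and the PROC_DIR override are assumptions added so the code can be tried against a scratch directory, and the kpktgend_N file name is pktgen's per-thread proc file.

```shell
# Directory holding the pktgen control files. Overridable (an assumption
# of this sketch) so it can be exercised without root or the module.
PROC_DIR=${PROC_DIR:-/proc/net/pktgen}

# Hypothetical helper: write one command string to a pktgen control
# file. The exit status is that of the write.
proc_write() {
    local file=$1
    shift
    echo "$*" > "$PROC_DIR/$file"
}

# pgctrl: global control (start, stop, reset)
pg_ctrl()   { proc_write pgctrl "$@"; }

# kpktgend_N: per-thread control, e.g. binding devices to a thread
pg_thread() { local thread=$1; shift; proc_write "kpktgend_$thread" "$@"; }

# per-device file: packet and flow configuration
pg_set()    { local dev=$1; shift; proc_write "$dev" "$@"; }

# Typical call order (needs root and the pktgen module loaded):
#   pg_thread 0 "add_device eth4@0"
#   pg_set eth4@0 "count 100000"
#   pg_ctrl "start"
```

Keeping one function per component means a typo in a device name or thread number shows up as a failed write to a specific file rather than a silently ignored command.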
Jesper Dangaard Brouer 4020726479 pktgen: make /proc/net/pktgen/pgctrl report fail on invalid input
Giving /proc/net/pktgen/pgctrl an invalid command just returns shell
success and prints a warning in dmesg.  This is not very useful for
shell scripting, as a script can only detect the error by parsing dmesg.

Instead return -EINVAL when the command is unknown, as this gives
userspace shell scripts a way of detecting the error.

Also bump the version tag to 2.75, because (1) reading
/proc/net/pktgen/pgctrl outputs this version number, which allows
detecting this small semantic change, and (2) the pktgen version tag
has not been updated since 2010.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
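With the -EINVAL return from commit 4020726479, a script can check the exit status of the write instead of parsing dmesg. A minimal sketch, assuming a hypothetical wrapper named pgctrl_cmd that takes the control file path as its first argument:

```shell
# Send a command to a pktgen control file and report failure through
# the exit status. When the kernel rejects the write (for example with
# EINVAL for an unknown command), echo fails and so does this function.
pgctrl_cmd() {
    local ctrl_file=$1
    local cmd=$2
    if ! echo "$cmd" > "$ctrl_file" 2>/dev/null; then
        echo "pktgen: command '$cmd' failed" >&2
        return 1
    fi
}

# Usage (needs root and the pktgen module loaded):
#   pgctrl_cmd /proc/net/pktgen/pgctrl start || exit 1
```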
Jesper Dangaard Brouer 2a1ddf27e8 pktgen: document ability to add same device to several threads
The pktgen.txt documentation still claimed that adding the same device to
multiple threads was not supported, but it has been since 2008 via
commit e6fce5b916 ("pktgen: multiqueue etc.").

Document this and describe the naming scheme dev@X, as the procfile name
still needs to be unique.

Fixes: e6fce5b916 ("pktgen: multiqueue etc.")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
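The dev@X naming scheme documented in commit 2a1ddf27e8 can be illustrated by generating one unique name per thread for the same underlying device. The function name below is made up for this sketch, and using the thread number as the X suffix is only one possible convention.

```shell
# Print one unique pktgen device name per thread using the dev@X scheme,
# so the same interface can be added to several threads.
pktgen_dev_names() {
    local dev=$1
    local threads=$2
    local t=0
    while [ "$t" -lt "$threads" ]; do
        echo "${dev}@${t}"
        t=$(( t + 1 ))
    done
}

pktgen_dev_names eth4 4   # prints eth4@0, eth4@1, eth4@2, eth4@3 (one per line)
```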
Jesper Dangaard Brouer 91db4b3c89 pktgen: doc were missing several config options
The list of available config options in the pktgen.txt documentation was
not complete.  Make the list complete by adding the following.

Pgcontrol commands:
 reset

Device commands:
 burst
 queue_map_min
 queue_map_max
 skb_priority
 tos
 traffic_class
 node
 spi
 dst6_max
 dst6_min
 vlan_cfi
 vlan_id
 vlan_p
 svlan_cfi
 svlan_id
 svlan_p

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer d079abd181 pktgen: adjust spacing in proc file interface output
Too many spaces were introduced in commit 63adc6fb8a ("pktgen: cleanup
checkpatch warnings"), thus misaligning "src_min:" relative to the other columns.

Fixes: 63adc6fb8a ("pktgen: cleanup checkpatch warnings")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer d012827e81 pktgen: remove obsolete "max_before_softirq" from pktgen doc
Also clean up some whitespace in pktgen.txt.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Michael Holzheu fe59384495 test_bpf: Add backward jump test case
Currently the testsuite does not have a test case with a backward jump.
The s390x JIT (kernel 4.0) had a bug in that area.
So add one new test case for this now.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 15:10:51 -04:00
Jiri Pirko 12c227ec89 flow_dissector: do not break if ports are not needed in flowlabel
This restores the previous behaviour. If the caller does not want ports
to be filled, we should not break.

Fixes: 06635a35d1 ("flow_dissect: use programable dissector in skb_flow_dissect and friends")
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 13:59:02 -04:00
Florian Westphal bd5850d39f net: sched: pkt_cls: remove unused macros from uapi
Jamal points out that this header also contains kernel internal magic that
cannot be used from userspace for anything meaningful.

Let's remove what the kernel doesn't use anymore and wrap the remainder with
__KERNEL__.

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 23:26:51 -04:00
Marcelo Ricardo Leitner 2efd055c53 tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info
This patch tracks the total number of inbound and outbound segments on a
TCP socket. One may use these numbers to get an idea of connection
quality when compared against the retransmissions.

RFC 4898 names these tcpEStatsPerfSegsIn and tcpEStatsPerfSegsOut.

These are 32-bit fields and can be fetched either from the TCP_INFO
getsockopt() if one has a handle on a TCP socket, or from the inet_diag
netlink facility (an iproute2/ss patch will follow).

Note that tp->segs_out was placed near tp->snd_nxt for good data
locality and minimal performance impact, while tp->segs_in was placed
near tp->bytes_received for the same reason.

Joint work with Eric Dumazet.

Note that received SYN are accounted on the listener, but sent SYNACK
are not accounted.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 23:25:21 -04:00
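The comparison against retransmissions suggested in commit 2efd055c53 could be as simple as a ratio. The sketch below assumes the two inputs were already read from struct tcp_info (tcpi_total_retrans and the new tcpi_segs_out); the function name and the sample values are hypothetical.

```shell
# Retransmitted segments per 1000 segments sent, using integer arithmetic.
#   $1 = tcpi_total_retrans, $2 = tcpi_segs_out (both assumed inputs)
retrans_per_mille() {
    local retrans=$1
    local segs_out=$2
    if [ "$segs_out" -eq 0 ]; then
        echo 0          # nothing sent yet, avoid dividing by zero
        return
    fi
    echo $(( retrans * 1000 / segs_out ))
}

# Hypothetical sample: 24 retransmissions out of 4800 segments sent
retrans_per_mille 24 4800   # prints 5
```

Because the counters are 32-bit, they can wrap on long-lived busy connections, so a monitoring tool would want to work on deltas between two samples rather than on absolute values.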
Florian Westphal 48ed7b26fa ipv6: reject locally assigned nexthop addresses
ip -6 addr add dead::1/128 dev eth0
sleep 5
ip -6 route add default via dead::1/128
-> fails
ip -6 addr add dead::1/128 dev eth0
ip -6 route add default via dead::1/128
-> succeeds

The reason is that if the (nonsensical) route above is added,
dead::1 is still subject to DAD, so the route lookup will
pick eth0 as outdev due to the prefix route that is added before
DAD work is started.

Add an explicit test that checks whether the nexthop gateway is a local address.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1167969
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 23:23:38 -04:00
David S. Miller b66ba8d5a4 Merge branch 'stmmac-probe-refactoring'
Joachim Eastwood says:

====================
stmmac: probe code refactoring and clean up part 1

This patch set refactors the code in stmmac_pci_probe and stmmac_pltfr_probe
and moves the common bits into stmmac_dvr_probe. Along the way, some
cleanups are applied to stmmac_pltfr_probe.

The code has been tested on the LPC18xx platform.

I am still working on more refactoring of the platform probe code, hence
part 1, but I need some more time on this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:57:26 -04:00
Joachim Eastwood def5cd3cfd stmmac: drop unnecessary dt checks in stmmac_probe_config_dt
Since the caller already checks for the presence of an of_node, there
is no need to repeat the check in stmmac_probe_config_dt.

There is also no point in checking the return value of the
of_match_device function, since if there wasn't a match in the
first place we would never be in this function.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:57:26 -04:00
Joachim Eastwood 15ffac73bb stmmac: change the stmmac_dvr_probe return type to int
Since stmmac_dvr_probe takes care of setting driver data and
assigning resources to the priv structure, there is no need to
access the priv structure from the other probe functions.
This means that this function can be changed to just return
an int, thus simplifying the callers.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:57:26 -04:00
Joachim Eastwood e56788cf13 stmmac: let stmmac_dvr_probe take a struct of resources
Create a struct that contains all the resources that need to be
assigned to the priv struct in stmmac_dvr_probe. This makes it
possible to factor out more common code from the other probe
functions and also to use this struct to hold the resources as
they are fetched.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:57:26 -04:00
Joachim Eastwood 803f8fc462 stmmac: move driver data setting into stmmac_dvr_probe
Move setting of driver data into stmmac_dvr_probe so the
other probe functions don't have to. This will help to
simplify the other probe functions later.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:57:26 -04:00
David S. Miller 614919c3d9 Merge branch 'tcp_src_port_selection'
Eric Dumazet says:

====================
tcp: improve source port selection

With the increasing number of TCP sockets on hosts, we often hit
limitations caused by port selection, due to randomization and a poor
strategy.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:55:32 -04:00
Eric Dumazet 946f9eb226 tcp: improve REUSEADDR/NOREUSEADDR cohabitation
The inet_csk_get_port() randomization effort tends to spread
sockets over the whole available range (ip_local_port_range).

This is unfortunate because SO_REUSEADDR sockets have
fewer requirements than non-SO_REUSEADDR ones.

If an application uses the SO_REUSEADDR hint, it is to try to
allow source ports to be shared.

So instead of picking a random port number in ip_local_port_range,
let's first try the first half of the range.

This leaves a better chance of using the upper half of the range for
the sockets with strong requirements (not using SO_REUSEADDR).

Note this patch does not add a new sysctl, and only changes
the way we try to pick a port number.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
Cc: Flavio Leitner <fbl@redhat.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:55:32 -04:00
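The "first half of the range" idea in commit 946f9eb226 can be pictured with a little arithmetic. This is not the kernel code: the function name is made up, the upper bound is treated as exclusive for simplicity, and the sample range is only an example.

```shell
# Given a port range [low, high), print the bounds of its lower half,
# i.e. the sub-range that would be tried first for SO_REUSEADDR sockets
# in this simplified picture.
lower_half_range() {
    local low=$1
    local high=$2      # exclusive upper bound (assumption of this sketch)
    local half
    half=$(( low + (high - low) / 2 ))
    echo "$low $half"
}

lower_half_range 32768 61000   # prints "32768 46884"
```

On a running system the configured bounds can be read with `read low high < /proc/sys/net/ipv4/ip_local_port_range`.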
Eric Dumazet f5af1f57a2 inet_hashinfo: remove bsocket counter
We no longer need the bsocket atomic counter, as inet_csk_get_port()
calls bind_conflict() regardless of its value, after commit
2b05ad33e1 ("tcp: bind() fix autoselection to share ports").

This patch removes the overhead of maintaining this counter and
of double inet_csk_get_port() calls under pressure.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
Cc: Flavio Leitner <fbl@redhat.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:55:32 -04:00
Jason Baron ce5ec44099 tcp: ensure epoll edge trigger wakeup when write queue is empty
We currently rely on the setting of SOCK_NOSPACE in the write()
path to ensure that we wake up any epoll edge trigger waiters when
acks return to free space in the write queue. However, if we fail
to allocate even a single skb in the write queue, we could end up
waiting indefinitely.

Fix this by explicitly issuing a wakeup when we detect the condition
of an empty write queue and a return value of -EAGAIN. This allows
userspace to re-try as we expect this to be a temporary failure.

I've tested this approach by artificially making
sk_stream_alloc_skb() return NULL periodically. In that case,
epoll edge trigger waiters will hang indefinitely in epoll_wait()
without this patch.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:52:47 -04:00
David S. Miller b92d581499 Merge branch 'cxgb4-next'
Hariprasad Shenai says:

====================
cxgb4: Cleanup and update T4/T5 register ranges

This series cleans up and optimizes the setup_memwin function and also
updates the T4/T5 adapter register ranges by removing incorrect register
addresses.

This patch series has been created against the net-next tree and includes
patches for the cxgb4 driver.

We have included all the maintainers of the respective drivers. Kindly
review the changes and let us know if you have any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-21 18:46:36 -04:00