Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.
Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.
When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.
When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.
The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.
IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.
IP frag handling contributed by Andy Zhou.
Based on original design by Justin Pettit.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, we used the kernel-internal netlink actions length to
calculate the size of messages to serialize back to userspace.
However,the sw_flow_actions may not be formatted exactly the same as the
actions on the wire, so store the original actions length when
de-serializing and re-use the original length when serializing.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Broadcom buses may have more than 1 Ethernet device. This is used e.g.
to have few interfaces connected to different switch ports. So far we
saw chipsets with only 2 devices (e.g. BCM4706) but recent ones have
up to 3 (e.g. Netgear R8000 uses 3rd interface for most of switch
traffic, lower interfaces are for some kind of offloading).
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo says:
====================
Major changes:
iwlwifi:
* new Tx power firmware API
* bump max firmware API to 17
* fix bug in debug prints
* static checker fix
* fix unused defines
* fix command list on newest firmware
brcmfmac:
* support NVRAM loading for bcm47xx platform
* new debugfs entry for msgbuf protocol layer used with PCIe devices
ath10k:
* add spectral scan support for qca99x0
* add qca6164 support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to indicate to tunnel driver that key.tun_id is set,
otherwise gre won't recognize the metadata.
Fixes: d3aa45ce6b ("bpf: add helpers to access tunnel metadata")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
act_bpf: remove spinlock in fast path
v1 version had a race condition in cleanup path of bpf_prog.
I tried to fix it by adding new callback 'cleanup_rcu' to 'struct tcf_common'
and call it out of act_api cleanup path, but Daniel noticed
(thanks for the idea!) that most of the classifiers already do action cleanup
out of rcu callback.
So instead this set of patches converts tcindex and rsvp classifiers to call
tcf_exts_destroy() after rcu grace period and since action cleanup logic
in __tcf_hash_release() is only called when bind and refcnt goes to zero,
it's guaranteed that cleanup() callback is called from rcu callback.
More specifically:
patches 1 and 2 - simple fixes
patches 2 and 3 - convert tcf_exts_destroy in tcindex and rsvp to call_rcu
patch 5 - removes spin_lock from act_bpf
The cleanup of actions is now universally done after rcu grace period
and in the future we can drop (now unnecessary) call_rcu from tcf_hash_destroy()
patch 5 is using synchronize_rcu() in act_bpf replacement path, since it's
very rare and alternative of dynamically allocating 'struct tcf_bpf_cfg' just
to pass it to call_rcu looks even less appealing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to act_gact/act_mirred, act_bpf can be lockless in packet processing
with extra care taken to free bpf programs after rcu grace period.
Replacement of existing act_bpf (very rare) is done with synchronize_rcu()
and final destruction is done from tc_action_ops->cleanup() callback that is
called from tcf_exts_destroy()->tcf_action_destroy()->__tcf_hash_release() when
bind and refcnt reach zero which is only possible when classifier is destroyed.
Previous two patches fixed the last two classifiers (tcindex and rsvp) to
call tcf_exts_destroy() from rcu callback.
Similar to gact/mirred there is a race between prog->filter and
prog->tcf_action. Meaning that the program being replaced may use
previous default action if it happened to return TC_ACT_UNSPEC.
act_mirred race betwen tcf_action and tcfm_dev is similar.
In all cases the race is harmless.
Long term we may want to improve the situation by replacing the whole
tc_action->priv as single pointer instead of updating inner fields one by one.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust destroy path of cls_rsvp to call tcf_exts_destroy() after
rcu grace period.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust destroy path of cls_tcindex to call tcf_exts_destroy() after
rcu grace period.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix harmless typo and avoid unnecessary copy of empty 'prog' into
unused 'strcut tcf_bpf_cfg old'.
Fixes: f4eaed28c7 ("act_bpf: fix memory leaks when replacing bpf programs")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Kconfig option AVERAGE and its implementation has been removed by
commit f4e774f55f ("average: remove out-of-line implementation").
Remove the dead build rule in lib/Makefile.
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Below compilation warnings are observed in gcc version 4.8.2.
Even though it's not seen in bit older gcc versions (for ex, 4.7.3),
It's good to fix it by changing format specifier from %d to
%zd in wmi pull phyerr functions.
wmi.c: In function 'ath10k_wmi_op_pull_phyerr_ev':
wmi.c:3567:8: warning: format '%d' expects argument of type 'int',
but argument 4 has type 'long unsigned int' [-Wformat=]
left_len, sizeof(*phyerr));
^
wmi.c: In function 'ath10k_wmi_10_4_op_pull_phyerr_ev':
wmi.c:3612:8: warning: format '%d' expects argument of type 'int',
but argument 4 has type 'long unsigned int' [-Wformat=]
left_len, sizeof(*phyerr));
^
Fixes: 991adf71a6 ("ath10k: refactor phyerr event handlers")
Fixes: 2b0a2e0d7c ("ath10k: handle 10.4 firmware phyerr event")
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This adds additional 0x0041 PCI Device ID
definition to ath10k for QCA6164 which is a 1
spatial stream sibling of the QCA6174 (which is 2
spatial stream chip).
The QCA6164 needs a dedicated board.bin file which
is different than the one used for QCA6174. If the
board.bin is wrong the device will crash early
while trying to boot firmware. The register dump
will look like this:
ath10k_pci 0000:02:00.0: firmware register dump:
ath10k_pci 0000:02:00.0: [00]: 0x05010000 0x000015B3 0x000A012D 0x00955B31
...
Note the value 0x000A012D.
Special credit goes to Alan Liu
<alanliu@qca.qualcomm.com> for providing support
help which enabled me to come up with this patch.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
To enable/configure spectral scan parameters in 10.4 firmware, existing
wmi spectral related functions can be reused. Link those functions in
10.4 wmi ops table.
In addition, adjust bin size (only when size is 68 bytes) before reporting
bin samples to user space. The background for this adjustment is that
qca99x0 reports bin size as 68 bytes (64 bytes + 4 bytes) in report
mode 2. First 64 bytes carries in-band tones (-32 to +31) and last 4 byte
carries band edge detection data (+32) mainly used in radar detection
purpose. Additional last 4 bytes are stripped to make bin size valid one.
This bin size adjustment will happen only for qca99x0, all other chipsets
will report proper bin sizes (64/128) without extra 4 bytes being added
at the end. The changes are validated in qca99x0 using 10.4 firmware.
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The function returns 1 when DMA mapping fails. The
driver would return bogus values and could
possibly confuse itself if DMA failed.
Fixes: 767d34fc67 ("ath10k: remove DMA mapping wrappers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kernel would complain about leaving a held lock
after going back to userspace and would
subsequently deadlock.
Fixes: e04cafbc38 ("ath10k: fix peer limit enforcement")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Florian Fainelli says:
====================
Documentation: dsa
This patch series adds some documentation about DSA as a subsystem as well
as the SF2 driver since it slightly diverges from your average DSA driver ;)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a document describing the Broadcom Starfigther 2 switch hardware,
its specifics, and how the driver is implemented and its specifics.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Shilimkar says:
====================
RDS: Few more fixes
As indicated in the earlier series [1], this is a follow-up series which
addresses few issues around the RDS FMR code. With [1] and the subject
series, now I can run many parallel threads with multiple sockets with
N x N traffic. The stress tests has survived overnight runs.
[1] https://lkml.org/lkml/2015/8/22/127
====================
Signed-off-by: David S. Miller <davem@davemloft.net>