Userspace can now indicate that it can cope with larger-than-mtu sized
packets and packets that have invalid ipv4/tcp checksums.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Once we allow userspace to receive gso/gro packets, userspace
needs to be able to determine when checksums appear to be
broken, but are not.
NFQA_SKB_CSUMNOTREADY means 'checksums will be fixed in kernel
later, pretend they are ok'.
NFQA_SKB_GSO could be used for statistics, or to determine when
packet size exceeds mtu.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The new revision of the set match supports to match the counters
and to suppress updating the counters at matching too.
At the set:list types, the updating of the subcounters can be
suppressed as well.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows to dump BPF filters attached to a socket with
SO_ATTACH_FILTER.
Note that we check CAP_SYS_ADMIN before allowing to dump this info.
For now, only AF_PACKET sockets use this feature.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_rmem_alloc is disclosed via /proc/net/packet but not via netlink messages.
The goal is to have the same level of information.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This value is disclosed via /proc/net/packet but not via netlink messages.
The goal is to have the same level of information.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull PCI updates from Bjorn Helgaas:
"PCI changes for the v3.10 merge window:
PCI device hotplug
- Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe)
- Make acpiphp builtin only, not modular (Jiang Liu)
- Add acpiphp mutual exclusion (Jiang Liu)
Power management
- Skip "PME enabled/disabled" messages when not supported (Rafael
Wysocki)
- Fix fallback to PCI_D0 (Rafael Wysocki)
Miscellaneous
- Factor quirk_io_region (Yinghai Lu)
- Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas)
- Clean up EISA resource initialization and logging (Bjorn Helgaas)
- Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas)
- MIPS: Initialize of_node before scanning bus (Gabor Juhos)
- Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor
Juhos)
- Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang)
- Fix aer_inject return values (Prarit Bhargava)
- Remove PME/ACPI dependency (Andrew Murray)
- Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)"
* tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits)
vfio-pci: Use cached MSI/MSI-X capabilities
vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI: Remove "extern" from function declarations
PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
PCI: Use msix_table_size() directly, drop multi_msix_capable()
PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
PCI: Drop is_64bit_address() and is_mask_bit_support() macros
PCI: Drop msi_data_reg() macro
PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
PCI: Clean up MSI/MSI-X capability #defines
PCI: Use cached MSI-X cap while enabling MSI-X
PCI: Use cached MSI cap while enabling MSI interrupts
PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
PCI: Use u8, not int, for PM capability offset
[SCSI] megaraid_sas: Use correct #define for MSI-X capability
PCI: Remove "extern" from function declarations
...
Allow configuring the default destination port on a per-device basis.
Adds new netlink paramater IFLA_VXLAN_PORT to allow setting destination
port when creating new vxlan.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Source compatiability for build iproute2 was broken by:
commit c7995c43fa
Author: Atzm Watanabe <atzm@stratosphere.co.jp>
vxlan: Allow setting destination to unicast address.
Since this commit has not made it upstream (still net-next),
and better to avoid gratitious changes to exported API's;
go back to original definition, and add a comment.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We hope to at some point deprecate KVM legacy device assignment in
favor of VFIO-based assignment. Towards that end, allow legacy
device assignment to be deconfigured.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
For pseries machine emulation, in order to move the interrupt
controller code to the kernel, we need to intercept some RTAS
calls in the kernel itself. This adds an infrastructure to allow
in-kernel handlers to be registered for RTAS services by name.
A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to
associate token values with those service names. Then, when the
guest requests an RTAS service with one of those token values, it
will be handled by the relevant in-kernel handler rather than being
passed up to userspace as at present.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix warning]
Signed-off-by: Alexander Graf <agraf@suse.de>
Enabling this capability connects the vcpu to the designated in-kernel
MPIC. Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
Hook the MPIC code up to the KVM interfaces, add locking, etc.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it. If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.
This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device. Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc. It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.
Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
We have a capability enquire system that allows user space to ask kvm
whether a feature is available.
The point behind this system is that we can have different kernel
configurations with different capabilities and user space can adjust
accordingly.
Because features can always be non existent, we can drop any #ifdefs
on CAP defines that could be used generically, like the irq routing
bits. These can be easily reused for non-IOAPIC systems as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
We are currently out of free bits in AT_HWCAP. With POWER8, we have
several hardware features that we need to advertise.
Tested on POWER and x86.
Signed-off-by: Michael Neuling <michael@neuling.org>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, there is no way to find out which timestamp is reported in
tpacket{,2,3}_hdr's tp_sec, tp_{n,u}sec members. It can be one of
SOF_TIMESTAMPING_SYS_HARDWARE, SOF_TIMESTAMPING_RAW_HARDWARE,
SOF_TIMESTAMPING_SOFTWARE, or a fallback variant late call from the
PF_PACKET code in software.
Therefore, report in the tp_status member of the ring buffer which
timestamp has been reported for RX and TX path. This should not break
anything for the following reasons: i) in RX ring path, the user needs
to test for tp_status & TP_STATUS_USER, and later for other flags as
well such as TP_STATUS_VLAN_VALID et al, so adding other flags will
do no harm; ii) in TX ring path, time stamps with PACKET_TIMESTAMP
socketoption are not available resp. had no effect except that the
application setting this is buggy. Next to TP_STATUS_AVAILABLE, the
user also should check for other flags such as TP_STATUS_WRONG_FORMAT
to reclaim frames to the application. Thus, in case TX ts are turned
off (default case), nothing happens to the application logic, and in
case we want to use this new feature, we now can also check which of
the ts source is reported in the status field as provided in the docs.
Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes it more readable and clearer what bits are still free to
use. The compiler reduces this to a constant for us anyway.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains fixes for recently applied
Netfilter/IPVS updates to the net-next tree, most relevantly
they are:
* Fix sparse warnings introduced in the RCU conversion, from
Julian Anastasov.
* Fix wrong endianness in the size field of IPVS sync messages,
from Simon Horman.
* Fix missing if checking in nf_xfrm_me_harder, from Dan Carpenter.
* Fix off by one access in the IPVS SCTP tracking code, again from
Dan Carpenter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove my soon bouncing email address.
Also remove the "Contact:" line in file header.
The MAINTAINERS file is a better place to find the
contact person anyway.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>