Commit Graph

1338 Commits

Author SHA1 Message Date
Alexei Starovoitov
db20fd2b01 bpf: add lookup/update/delete/iterate methods to BPF maps
'maps' is a generic storage of different types for sharing data between kernel
and userspace.

The maps are accessed from user space via BPF syscall, which has commands:

- create a map with given type and attributes
  fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
  returns fd or negative error

- lookup key in a given map referenced by fd
  err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->value
  returns zero and stores found elem into value or negative error

- create or update key/value pair in a given map
  err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->value
  returns zero or negative error

- find and delete element by key in a given map
  err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key

- iterate map elements (based on input key return next_key)
  err = bpf(BPF_MAP_GET_NEXT_KEY, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->next_key

- close(fd) deletes the map

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
Alexei Starovoitov
99c55f7d47 bpf: introduce BPF syscall and maps
BPF syscall is a multiplexor for a range of different operations on eBPF.
This patch introduces syscall with single command to create a map.
Next patch adds commands to access maps.

'maps' is a generic storage of different types for sharing data between kernel
and userspace.

Userspace example:
/* this syscall wrapper creates a map with given type and attributes
 * and returns map_fd on success.
 * use close(map_fd) to delete the map
 */
int bpf_create_map(enum bpf_map_type map_type, int key_size,
                   int value_size, int max_entries)
{
    union bpf_attr attr = {
        .map_type = map_type,
        .key_size = key_size,
        .value_size = value_size,
        .max_entries = max_entries
    };

    return bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}

'union bpf_attr' is backwards compatible with future extensions.

More details in Documentation/networking/filter.txt and in manpage

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
David S. Miller
1f6d80358d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	arch/mips/net/bpf_jit.c
	drivers/net/can/flexcan.c

Both the flexcan and MIPS bpf_jit conflicts were cases of simple
overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-23 12:09:27 -04:00
Tom Herbert
4565e9919c gre: Setup and TX path for gre/UDP foo-over-udp encapsulation
Added netlink attrs to configure FOU encapsulation for GRE, netlink
handling of these flags, and properly adjust MTU for encapsulation.
ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU
encapsulation.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19 17:15:32 -04:00
Tom Herbert
5632848653 net: Changes to ip_tunnel to support foo-over-udp encapsulation
This patch changes IP tunnel to support (secondary) encapsulation,
Foo-over-UDP. Changes include:

1) Adding tun_hlen as the tunnel header length, encap_hlen as the
   encapsulation header length, and hlen becomes the grand total
   of these.
2) Added common netlink define to support FOU encapsulation.
3) Routines to perform FOU encapsulation.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19 17:15:32 -04:00
Tom Herbert
23461551c0 fou: Support for foo-over-udp RX path
This patch provides a receive path for foo-over-udp. This allows
direct encapsulation of IP protocols over UDP. The bound destination
port is used to map to an IP protocol, and the XFRM framework
(udp_encap_rcv) is used to receive encapsulated packets. Upon
reception, the encapsulation header is logically removed (pointer
to transport header is advanced) and the packet is reinjected into
the receive path with the IP protocol indicated by the mapping.

Netlink is used to configure FOU ports. The configuration information
includes the port number to bind to and the IP protocol corresponding
to that port.

This should support GRE/UDP
(http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02),
as will as the other IP tunneling protocols (IPIP, SIT).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19 17:15:31 -04:00
Andy Zhou
971427f353 openvswitch: Add recirc and hash action.
Recirc action allows a packet to reenter openvswitch processing.
currently openvswitch lookup flow for packet received and execute
set of actions on that packet, with help of recirc action we can
process/modify the packet and recirculate it back in openvswitch
for another pass.

OVS hash action calculates 5-tupple hash and set hash in flow-key
hash. This can be used along with recirculation for distributing
packets among different ports for bond devices.
For example:
OVS bonding can use following actions:
Match on: bond flow; Action: hash, recirc(id)
Match on: recirc-id == id and hash lower bits == a;
          Action: output port_bond_a

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15 23:28:14 -07:00
Linus Torvalds
90a3c48fbf Merge tag 'usb-3.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some USB and PHY fixes for 3.17-rc5.

  Nothing major here, just a number of tiny fixes for reported issues,
  and some new device ids as well.

  All have been tested in linux-next"

* tag 'usb-3.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (46 commits)
  xhci: fix oops when xhci resumes from hibernate with hw lpm capable devices
  usb: xhci: Fix OOPS in xhci error handling code
  xhci: Fix null pointer dereference if xhci initialization fails
  storage: Add single-LUN quirk for Jaz USB Adapter
  uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check
  usb: chipidea: msm: Initialize PHY on reset event
  usb: chipidea: msm: Use USB PHY API to control PHY state
  usb: hub: take hub->hdev reference when processing from eventlist
  uas: Disable uas on ASM1051 devices
  usb: dwc2/gadget: avoid disabling ep0
  usb: dwc2/gadget: delay enabling irq once hardware is configured properly
  usb: dwc2/gadget: do not call disconnect method in pullup
  usb: dwc2/gadget: break infinite loop in endpoint disable code
  usb: dwc2/gadget: fix phy initialization sequence
  usb: dwc2/gadget: fix phy disable sequence
  uwb: init beacon cache entry before registering uwb device
  USB: ftdi_sio: Add support for GE Healthcare Nemo Tracker device
  USB: document the 'u' flag for usb-storage quirks parameter
  usb: host: xhci: fix compliance mode workaround
  usb: dwc3: fix TRB completion when multiple TRBs are started
  ...
2014-09-12 11:59:10 -07:00
Linus Torvalds
c8c16e3624 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "An update to Synaptics PS/2 driver to handle "ForcePads" (currently
  found in HP EliteBook 1040 laptops), a change for Elan PS/2 driver to
  detect newer touchpads, bunch of devices get annotated as Trackpoint
  and/or Pointer to help userspace classify and handle them, plus
  assorted driver fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: serport - add compat handling for SPIOCSTYPE ioctl
  Input: atmel_mxt_ts - fix double free of input device
  Input: synaptics - add support for ForcePads
  Input: matrix_keypad - use request_any_context_irq()
  Input: atmel_mxt_ts - downgrade warning about empty interrupts
  Input: wm971x - fix typo in module parameter description
  Input: cap1106 - fix register definition
  Input: add missing POINTER / DIRECT properties to a bunch of drivers
  Input: add INPUT_PROP_POINTING_STICK property
  Input: elantech - fix detection of touchpad on ASUS s301l
2014-09-11 10:08:36 -07:00
David Drysdale
b01d072065 shm: add memfd.h to UAPI export list
The new header file memfd.h from commit 9183df25fe ("shm: add
memfd_create() syscall") should be exported.

Signed-off-by: David Drysdale <drysdale@google.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-10 15:42:12 -07:00
David S. Miller
0aac383353 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
nf-next pull request

The following patchset contains Netfilter/IPVS updates for your
net-next tree. Regarding nf_tables, most updates focus on consolidating
the NAT infrastructure and adding support for masquerading. More
specifically, they are:

1) use __u8 instead of u_int8_t in arptables header, from
   Mike Frysinger.

2) Add support to match by skb->pkttype to the meta expression, from
   Ana Rey.

3) Add support to match by cpu to the meta expression, also from
   Ana Rey.

4) A smatch warning about IPSET_ATTR_MARKMASK validation, patch from
   Vytas Dauksa.

5) Fix netnet and netportnet hash types the range support for IPv4,
   from Sergey Popovich.

6) Fix missing-field-initializer warnings resolved, from Mark Rustad.

7) Dan Carperter reported possible integer overflows in ipset, from
   Jozsef Kadlecsick.

8) Filter out accounting objects in nfacct by type, so you can
   selectively reset quotas, from Alexey Perevalov.

9) Move specific NAT IPv4 functions to the core so x_tables and
   nf_tables can share the same NAT IPv4 engine.

10) Use the new NAT IPv4 functions from nft_chain_nat_ipv4.

11) Move specific NAT IPv6 functions to the core so x_tables and
    nf_tables can share the same NAT IPv4 engine.

12) Use the new NAT IPv6 functions from nft_chain_nat_ipv6.

13) Refactor code to add nft_delrule(), which can be reused in the
    enhancement of the NFT_MSG_DELTABLE to remove a table and its
    content, from Arturo Borrero.

14) Add a helper function to unregister chain hooks, from
    Arturo Borrero.

15) A cleanup to rename to nft_delrule_by_chain for consistency with
    the new nft_*() functions, also from Arturo.

16) Add support to match devgroup to the meta expression, from Ana Rey.

17) Reduce stack usage for IPVS socket option, from Julian Anastasov.

18) Remove unnecessary textsearch state initialization in xt_string,
    from Bojan Prtvar.

19) Add several helper functions to nf_tables, more work to prepare
    the enhancement of NFT_MSG_DELTABLE, again from Arturo Borrero.

20) Enhance NFT_MSG_DELTABLE to delete a table and its content, from
    Arturo Borrero.

21) Support NAT flags in the nat expression to indicate the flavour,
    eg. random fully, from Arturo.

22) Add missing audit code to ebtables when replacing tables, from
    Nicolas Dichtel.

23) Generalize the IPv4 masquerading code to allow its re-use from
    nf_tables, from Arturo.

24) Generalize the IPv6 masquerading code, also from Arturo.

25) Add the new masq expression to support IPv4/IPv6 masquerading
    from nf_tables, also from Arturo.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-10 12:46:32 -07:00
Jiri Pirko
e5c3ea5c66 bridge: implement rtnl_link_ops->get_size and rtnl_link_ops->fill_info
Allow rtnetlink users to get bridge master info in IFLA_INFO_DATA attr
This initial part implements forward_delay, hello_time, max_age options.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-09 11:29:55 -07:00
Alexei Starovoitov
daedfb2245 net: filter: split filter.h and expose eBPF to user space
allow user space to generate eBPF programs

uapi/linux/bpf.h: eBPF instruction set definition

linux/filter.h: the rest

This patch only moves macro definitions, but practically it freezes existing
eBPF instruction set, though new instructions can still be added in the future.

These eBPF definitions cannot go into uapi/linux/filter.h, since the names
may conflict with existing applications.

Full eBPF ISA description is in Documentation/networking/filter.txt

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-09 10:26:47 -07:00
Arturo Borrero
9ba1f726be netfilter: nf_tables: add new nft_masq expression
The nft_masq expression is intended to perform NAT in the masquerade flavour.

We decided to have the masquerade functionality in a separated expression other
than nft_nat.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:31:30 +02:00
Arturo Borrero
e42eff8a32 netfilter: nft_nat: include a flag attribute
Both SNAT and DNAT (and the upcoming masquerade) can have additional
configuration parameters, such as port randomization and NAT addressing
persistence. We can cover these scenarios by simply adding a flag
attribute for userspace to fill when needed.

The flags to use are defined in include/uapi/linux/netfilter/nf_nat.h:

 NF_NAT_RANGE_MAP_IPS
 NF_NAT_RANGE_PROTO_SPECIFIED
 NF_NAT_RANGE_PROTO_RANDOM
 NF_NAT_RANGE_PERSISTENT
 NF_NAT_RANGE_PROTO_RANDOM_FULLY
 NF_NAT_RANGE_PROTO_RANDOM_ALL

The caller must take care of not messing up with the flags, as they are
added unconditionally to the final resulting nf_nat_range.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:31:27 +02:00
Ana Rey
3045d76070 netfilter: nf_tables: add devgroup support in meta expresion
Add devgroup support to let us match device group of a packets incoming
or outgoing interface.

Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:31:23 +02:00
David S. Miller
5b4c314575 Merge tag 'master-2014-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-09-08

Please pull this batch of updates intended for the 3.18 stream...

For the mac80211 bits, Johannes says:

"Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
rather than listing them one might as well look into the
git log instead."

For the Bluetooth bits, Gustavo says:

"The changes consists of:

        - Coding style fixes to HCI drivers
        - Corrupted ack value fix for the H5 HCI driver
        - A couple of Enhanced L2CAP fixes
        - Conversion of SMP code to use common L2CAP channel API
        - Page scan optimizations when using the kernel-side whitelist
        - Various mac802154 and and ieee802154 6lowpan cleanups
        - One new Atheros USB ID"

For the iwlwifi bits, Emmanuel says:

"We have a new big thing coming up which is called Dynamic Queue
Allocation (or DQA).  This is a completely new way to work with the
Tx queues and it requires major refactoring.  This is being done by
Johannes and Avri.  Besides this, Johannes disables U-APSD by default
because of APs that would disable A-MPDU if the association supports
U-ASPD.  Luca contributed to the power area which he was cleaning
up on the way while working on CSA.  A few more random things here
and there."

For the Atheros bits, Kalle says:

"For ath6kl we had two small fixes and a new SDIO device id.

For ath10k the bigger changes are:

 * support for new firmware version 10.2 (Michal)

 * spectral scan support (Simon, Sven & Mathias)

 * export a firmware crash dump file (Ben & me)

 * cleaning up of pci.c (Michal)

 * print pci id in all messages, which causes most of the churn (Michal)"

Beyond that, we have the usual collection of various updates to ath9k,
b43, mwifiex, and wil6210, as well as a few other bits here and there.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-08 16:43:58 -07:00
Hans de Goede
7611392fe8 Input: add INPUT_PROP_POINTING_STICK property
It is useful for userspace to know that there not dealing with a regular
mouse but rather with a pointing stick (e.g. a trackpoint) so that
userspace can e.g. automatically enable middle button scrollwheel
emulation.

It is impossible to tell the difference from the evdev info without
resorting to putting a list of device / driver names in userspace, this is
undesirable.

Add a property which allows userspace to see if a device is a pointing
stick, and set it on all the pointing stick drivers.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2014-09-08 14:58:11 -07:00
David S. Miller
eb84d6b604 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-09-07 21:41:53 -07:00
Govindarajulu Varadarajan
f0db9b0734 ethtool: Add generic options for tunables
This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting
tunable values from driver.

Add get_tunable and set_tunable to ethtool_ops. Driver implements these
functions for getting/setting tunable value.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 12:12:20 -07:00
Piotr Król
6fa9e1be7f usb: usbip: fix usbip.h path in userspace tool
Fixes: 588b48caf6 ("usbip: move usbip userspace code out of staging")
which introduced build failure by not changing uapi/usbip.h include path
according to new location.

Signed-off-by: Piotr Król <piotr.krol@3mdeb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-04 16:25:30 -07:00
Linus Torvalds
10f3291a1d Merge branch 'akpm' (fixes from Andrew Morton)
Merge patches from Andrew Morton:
 "22 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
  kexec: purgatory: add clean-up for purgatory directory
  Documentation/kdump/kdump.txt: add ARM description
  flush_icache_range: export symbol to fix build errors
  tools: selftests: fix build issue with make kselftests target
  ocfs2: quorum: add a log for node not fenced
  ocfs2: o2net: set tcp user timeout to max value
  ocfs2: o2net: don't shutdown connection when idle timeout
  ocfs2: do not write error flag to user structure we cannot copy from/to
  x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatory
  drivers/rtc/rtc-s5m.c: re-add support for devices without irq specified
  xattr: fix check for simultaneous glibc header inclusion
  kexec: remove CONFIG_KEXEC dependency on crypto
  kexec: create a new config option CONFIG_KEXEC_FILE for new syscall
  x86,mm: fix pte_special versus pte_numa
  hugetlb_cgroup: use lockdep_assert_held rather than spin_is_locked
  mm/zpool: use prefixed module loading
  zram: fix incorrect stat with failed_reads
  lib: turn CONFIG_STACKTRACE into an actual option.
  mm: actually clear pmd_numa before invalidating
  memblock, memhotplug: fix wrong type in memblock_find_in_range_node().
  ...
2014-08-29 16:28:29 -07:00
Filipe Brandenburger
bfcfd44cce xattr: fix check for simultaneous glibc header inclusion
The guard was introduced in commit ea1a8217b0 ("xattr: guard against
simultaneous glibc header inclusion") but it is using #ifdef to check
for a define that is either set to 1 or 0.  Fix it to use #if instead.

* Without this patch:

  $ { echo "#include <sys/xattr.h>"; echo "#include <linux/xattr.h>"; } | gcc -E -Iinclude/uapi - >/dev/null
  include/uapi/linux/xattr.h:19:0: warning: "XATTR_CREATE" redefined [enabled by default]
   #define XATTR_CREATE 0x1 /* set value, fail if attr already exists */
   ^
  /usr/include/x86_64-linux-gnu/sys/xattr.h:32:0: note: this is the location of the previous definition
   #define XATTR_CREATE XATTR_CREATE
   ^

* With this patch:

  $ { echo "#include <sys/xattr.h>"; echo "#include <linux/xattr.h>"; } | gcc -E -Iinclude/uapi - >/dev/null
  (no warnings)

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Allan McRae <allan@archlinux.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-29 16:28:16 -07:00
Florian Fainelli
3e8a72d1da net: dsa: reduce number of protocol hooks
DSA is currently registering one packet_type function per EtherType it
needs to intercept in the receive path of a DSA-enabled Ethernet device.
Right now we have three of them: trailer, DSA and eDSA, and there might
be more in the future, this will not scale to the addition of new
protocols.

This patch proceeds with adding a new layer of abstraction and two new
functions:

dsa_switch_rcv() which will dispatch into the tag-protocol specific
receive function implemented by net/dsa/tag_*.c

dsa_slave_xmit() which will dispatch into the tag-protocol specific
transmit function implemented by net/dsa/tag_*.c

When we do create the per-port slave network devices, we iterate over
the switch protocol to assign the DSA-specific receive and transmit
operations.

A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact
that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER
like it used to be.

This allows us to greatly simplify the check in eth_type_trans() and
always override the skb->protocol with ETH_P_XDSA for Ethernet switches
tagged protocol, while also reducing the number repetitive slave
netdevice_ops assignments.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:39 -07:00
Alexey Perevalov
f111f780ae netfilter: nfnetlink_acct: add filter support to nfacct counter list/reset
You can use this to skip accounting objects when listing/resetting
via NFNL_MSG_ACCT_GET/NFNL_MSG_ACCT_GET_CTRZERO messages with the
NLM_F_DUMP netlink flag. The filtering covers the following cases:

1. No filter specified. In this case, the client will get old behaviour,
2. List/reset counter object only: In this case, you have to use
   NFACCT_F_QUOTA as mask and value 0.
3. List/reset quota objects only: You have to use NFACCT_F_QUOTA_PKTS
   as mask and value - the same, for byte based quota mask should be
   NFACCT_F_QUOTA_BYTES and value - the same.

If you want to obtain the object with any quota type
(ie. NFACCT_F_QUOTA_PKTS|NFACCT_F_QUOTA_BYTES), you need to perform
two dump requests, one to obtain NFACCT_F_QUOTA_PKTS objects and
another for NFACCT_F_QUOTA_BYTES.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-08-26 21:36:19 +02:00