Commit Graph

46701 Commits

Author SHA1 Message Date
Dave Watson
3c4d755915 tls: kernel TLS support
Software implementation of transport layer security, implemented using ULP
infrastructure.  tcp proto_ops are replaced with tls equivalents of sendmsg and
sendpage.

Only symmetric crypto is done in the kernel, keys are passed by setsockopt
after the handshake is complete.  All control messages are supported via CMSG
data - the actual symmetric encryption is the same, just the message type needs
to be passed separately.

For user API, please see Documentation patch.

Pieces that can be shared between hw and sw implementation
are in tls_main.c

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 12:12:40 -04:00
Dave Watson
e3b5616a34 tcp: export do_tcp_sendpages and tcp_rate_check_app_limited functions
Export do_tcp_sendpages and tcp_rate_check_app_limited, since tls will need to
sendpages while the socket is already locked.

tcp_sendpage is exported, but requires the socket lock to not be held already.

Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 12:12:40 -04:00
Dave Watson
734942cc4e tcp: ULP infrastructure
Add the infrustructure for attaching Upper Layer Protocols (ULPs) over TCP
sockets. Based on a similar infrastructure in tcp_cong.  The idea is that any
ULP can add its own logic by changing the TCP proto_ops structure to its own
methods.

Example usage:

setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));

modules will call:
tcp_register_ulp(&tcp_tls_ulp_ops);

to register/unregister their ulp, with an init function and name.

A list of registered ulps will be returned by tcp_get_available_ulp, which is
hooked up to /proc.  Example:

$ cat /proc/sys/net/ipv4/tcp_available_ulp
tls

There is currently no functionality to remove or chain ULPs, but
it should be possible to add these in the future if needed.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 12:12:40 -04:00
David S. Miller
0ddead90b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The conflicts were two cases of overlapping changes in
batman-adv and the qed driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 11:59:32 -04:00
David Howells
f7aec129a3 rxrpc: Cache the congestion window setting
Cache the congestion window setting that was determined during a call's
transmission phase when it finishes so that it can be used by the next call
to the same peer, thereby shortcutting the slow-start algorithm.

The value is stored in the rxrpc_peer struct and is accessed without
locking.  Each call takes the value that happens to be there when it starts
and just overwrites the value when it finishes.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:42:45 -04:00
Jesper Dangaard Brouer
849a44de91 net: don't global ICMP rate limit packets originating from loopback
Florian Weimer seems to have a glibc test-case which requires that
loopback interfaces does not get ICMP ratelimited.  This was broken by
commit c0303efeab ("net: reduce cycles spend on ICMP replies that
gets rate limited").

An ICMP response will usually be routed back-out the same incoming
interface.  Thus, take advantage of this and skip global ICMP
ratelimit when the incoming device is loopback.  In the unlikely event
that the outgoing it not loopback, due to strange routing policy
rules, ICMP rate limiting still works via peer ratelimiting via
icmpv4_xrlim_allow().  Thus, we should still comply with RFC1812
(section 4.3.2.8 "Rate Limiting").

This seems to fix the reproducer given by Florian.  While still
avoiding to perform expensive and unneeded outgoing route lookup for
rate limited packets (in the non-loopback case).

Fixes: c0303efeab ("net: reduce cycles spend on ICMP replies that gets rate limited")
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: "H.J. Lu" <hjl.tools@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:33:58 -04:00
Dan Carpenter
c4f65b09b4 net/act_pedit: fix an error code
I'm reviewing static checker warnings where we do ERR_PTR(0), which is
the same as NULL.  I'm pretty sure we intended to return ERR_PTR(-EINVAL)
here.  Sometimes these bugs lead to a NULL dereference but I don't
immediately see that problem here.

Fixes: 71d0ed7079 ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:24:18 -04:00
Paolo Abeni
7608894e43 net: use skb_unref() in napi_consume_skb()
The commit 83ada39bb79d ("net: factor out a helper to decrement the
skb refcount") provided and used a helper for decrementing skb usage,
but I missed at least a spot for it.

This change remove some more duplicated code reusing skb_unref() in
napi_consume_skb(), too. The helper uses an additional, unneeded
unlikely(!skb) test - napi_consume_skb() already check it a few lines
above - but the compiler is smart enough to optimize the duplicated
test out.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:23:51 -04:00
David S. Miller
4cbf87c789 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2017-06-14

Here's another batch of Bluetooth patches for the 4.13 kernel:

 - Fix for Broadcom controllers not supporting Event Mask Page 2
 - New QCA ROME USB ID for btusb
 - Fix for Security Manager Protocol to use constant-time memcmp
 - Improved support for TI WiLink chips

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:22:17 -04:00
Yonghong Song
31fd85816d bpf: permits narrower load from bpf program context fields
Currently, verifier will reject a program if it contains an
narrower load from the bpf context structure. For example,
        __u8 h = __sk_buff->hash, or
        __u16 p = __sk_buff->protocol
        __u32 sample_period = bpf_perf_event_data->sample_period
which are narrower loads of 4-byte or 8-byte field.

This patch solves the issue by:
  . Introduce a new parameter ctx_field_size to carry the
    field size of narrower load from prog type
    specific *__is_valid_access validator back to verifier.
  . The non-zero ctx_field_size for a memory access indicates
    (1). underlying prog type specific convert_ctx_accesses
         supporting non-whole-field access
    (2). the current insn is a narrower or whole field access.
  . In verifier, for such loads where load memory size is
    less than ctx_field_size, verifier transforms it
    to a full field load followed by proper masking.
  . Currently, __sk_buff and bpf_perf_event_data->sample_period
    are supporting narrowing loads.
  . Narrower stores are still not allowed as typical ctx stores
    are just normal stores.

Because of this change, some tests in verifier will fail and
these tests are removed. As a bonus, rename some out of bound
__sk_buff->cb access to proper field name and remove two
redundant "skb cb oob" tests.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 14:56:25 -04:00
WANG Cong
74030603df net_sched: move tcf_lock down after gen_replace_estimator()
Laura reported a sleep-in-atomic kernel warning inside
tcf_act_police_init() which calls gen_replace_estimator() with
spinlock protection.

It is not necessary in this case, we already have RTNL lock here
so it is enough to protect concurrent writers. For the reader,
i.e. tcf_act_police(), it needs to make decision based on this
rate estimator, in the worst case we drop more/less packets than
necessary while changing the rate in parallel, it is still acceptable.

Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Nick Huber <nicholashuber@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 14:39:19 -04:00
Florian Fainelli
3cc9f2573c net: dsa: Introduce dsa_get_cpu_port()
Introduce a helper function which will return a reference to the CPU
port used in a dsa_switch_tree. Right now this is a singleton, but this
will change once we introduce multi-CPU port support, so ease the
transition by converting the affected code paths.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 16:35:03 -04:00
Florian Fainelli
a29342e739 net: dsa: Associate slave network device with CPU port
In preparation for supporting multiple CPU ports with DSA, have the
dsa_port structure know which CPU it is associated with. This will be
important in order to make sure the correct CPU is used for transmission
of the frames. If not for functional reasons, for performance (e.g: load
balancing) and forwarding decisions.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 16:35:03 -04:00
Florian Fainelli
67dbb9d433 net: dsa: Relocate master ethtool operations
Relocate master_ethtool_ops and master_orig_ethtool_ops into struct
dsa_port in order to be both consistent, and make things self contained
within the dsa_port structure.

This is a preliminary change to supporting multiple CPU port interfaces.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 16:35:02 -04:00
Florian Fainelli
6d3c8c0dd8 net: dsa: Remove master_netdev and use dst->cpu_dp->netdev
In preparation for supporting multiple CPU ports, remove
dst->master_netdev and ds->master_netdev and replace them with only one
instance of the common object we have for a port: struct
dsa_port::netdev. ds->master_netdev is currently write only and would be
helpful in the case where we have two switches, both with CPU ports, and
also connected within each other, which the multi-CPU port patch series
would address.

While at it, introduce a helper function used in net/dsa/slave.c to
immediately get a reference on the master network device called
dsa_master_netdev().

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 16:35:02 -04:00
Mateusz Jurczyk
20a3d5bf5e caif: Add sockaddr length check before accessing sa_family in connect handler
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in the connect()
handler of the AF_CAIF socket. Since the syscall doesn't enforce a minimum
size of the corresponding memory region, very short sockaddrs (zero or one
byte long) result in operating on uninitialized memory while referencing
sa_family.

Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 16:16:11 -04:00
Johannes Berg
aa9f979c41 networking: use skb_put_zero()
Use the recently introduced helper to replace the pattern of
skb_put() && memset(), this transformation was done with the
following spatch:

@@
identifier p;
expression len;
expression skb;
@@
-p = skb_put(skb, len);
-memset(p, 0, len);
+p = skb_put_zero(skb, len);

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 13:54:03 -04:00
David S. Miller
0e74008b66 Merge tag 'mac80211-next-for-davem-2017-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
A couple of weeks worth of updates - looks like things are quiet:
 * merged net-next back to get a patch from net that another patch
   here depends on
 * various small improvements/cleanups across the board
 * 4-way handshake offload (many thanks to Arend for shepherding that)
 * mesh CSA/DFS support in mac80211
 * the skb_put_zero() we discussed previously
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 13:52:37 -04:00
David S. Miller
5952b0200e Merge tag 'batadv-next-for-davem-20170613' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - decrease maximum fragment size, by Matthias Schiffer

 - Clean up seqfile writing, by Markus Elfring (2 patches)

 - use __func__ in debug messages, by Sven Eckelmann

 - Mark tpmeter initializers with __init, by Antonio Quartulli

 - ignore loop detection MAC addresses, by Simon Wunderlich

 - clean up some return handling, by Simon Wunderlich

 - improve ELP throughput value handling for WiFi neighbors
   in BATMAN V/ELP, by Sven Eckelmann (2 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 13:47:16 -04:00
David S. Miller
7de84403a2 Merge tag 'batadv-net-for-davem-20170613' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
Here are two batman-adv bugfixes:

 - fix rx packet counters for local ARP replies, by Sven Eckelmann

 - fix memory leaks for unicast packetes received from another gateway
   in bridge loop avoidance, by Andreas Pape
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 13:46:01 -04:00
David S. Miller
c5549ee401 Merge tag 'mac80211-for-davem-2017-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
Some fixes:
 * Avi fixes some fallout from my mac80211 RX flags changes
 * Emmanuel fixes an issue with adhering to the spec, and
   an oversight in the SMPS management code
 * Jason's patch makes mac80211 use constant-time memory
   comparisons for message authentication, to avoid having
   potentially observable timing differences
 * my fix makes mac80211 set the basic rates bitmap before
   the channel so the next update to the driver has more
   consistent data - this required another rework patch to
   remove some useless 5/10 MHz code that can never be hit
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 13:34:13 -04:00
yuval.shaia@oracle.com
5514174fe9 net: phy: Make phy_ethtool_ksettings_get return void
Make return value void since function never return meaningfull value

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:59:06 -04:00
WANG Cong
c38b7d327a igmp: acquire pmc lock for ip_mc_clear_src()
Andrey reported a use-after-free in add_grec():

        for (psf = *psf_list; psf; psf = psf_next) {
		...
                psf_next = psf->sf_next;

where the struct ip_sf_list's were already freed by:

 kfree+0xe8/0x2b0 mm/slub.c:3882
 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
 sock_release+0x8d/0x1e0 net/socket.c:597
 sock_close+0x16/0x20 net/socket.c:1072

This happens because we don't hold pmc->lock in ip_mc_clear_src()
and a parallel mr_ifc_timer timer could jump in and access them.

The RCU lock is there but it is merely for pmc itself, this
spinlock could actually ensure we don't access them in parallel.

Thanks to Eric and Long for discussion on this bug.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:51:37 -04:00
Ashwanth Goli
97d8b6e3b8 net: rps: fix uninitialized symbol warning
This patch fixes uninitialized symbol warning that
got introduced by the following commit
773fc8f6e8 ("net: rps: send out pending IPI's on CPU hotplug")

Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:31:22 -04:00
Sven Eckelmann
d62890885e batman-adv: Accept only filled wifi station info
The wifi driver can decide to not provide parts of the station info. For
example, the expected throughput of the station can be omitted when the
used rate control doesn't provide this kind of information.

The B.A.T.M.A.N. V implementation must therefore check the filled bitfield
before it tries to access the expected_throughput of the returned
station_info.

Reported-by: Alvaro Antelo <alvaro.antelo@gmail.com>
Fixes: c833484e5f ("batman-adv: ELP - compute the metric based on the estimated throughput")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Reviewed-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-06-13 12:25:43 +02:00