Merge fixes from Andrew Morton:
"27 fixes.
There are three patches that aren't actually fixes. They're simple
function renamings which are nice-to-have in mainline as ongoing net
development depends on them."
* akpm: (27 commits)
timerfd: export defines to userspace
mm/hugetlb.c: fix reservation race when freeing surplus pages
mm/slab.c: fix SLAB freelist randomization duplicate entries
zram: support BDI_CAP_STABLE_WRITES
zram: revalidate disk under init_lock
mm: support anonymous stable page
mm: add documentation for page fragment APIs
mm: rename __page_frag functions to __page_frag_cache, drop order from drain
mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
mm: don't dereference struct page fields of invalid pages
mailmap: add codeaurora.org names for nameless email commits
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
mm: pmd dirty emulation in page fault handler
ipc/sem.c: fix incorrect sem_lock pairing
lib/Kconfig.debug: fix frv build failure
mm: get rid of __GFP_OTHER_NODE
mm: fix remote numa hits statistics
mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
ocfs2: fix crash caused by stale lvb with fsdlm plugin
...
fib_select_path does not call fib_select_multipath if oif is set in the
flow struct. For VRF use cases oif is always set, so multipath route
selection is bypassed. Use the FLOWI_FLAG_SKIP_NH_OIF to skip the oif
check similar to what is done in fib_table_lookup.
Add saddr and proto to the flow struct for the fib lookup done by the
VRF driver to better match hash computation for a flow.
Fixes: 613d09b30f ("net: Use VRF device index for lookups on TX")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
not right when CONFIG_NET is disabled and there is no socket interface:
warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)
I don't know what the correct solution for this is, but simply removing
the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
'if NET' section avoids the warning and does not produce other build
errors.
Fixes: 483c4933ea ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 32bit arches, (skb->end - skb->data) is not 'unsigned int',
so we shall use min_t() instead of min() to avoid a compiler error.
Fixes: 1272ce87fa ("gro: Enter slow-path if there is no tailroom")
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch series "Page fragment updates", v4.
This patch series takes care of a few cleanups for the page fragments
API.
First we do some renames so that things are much more consistent. First
we move the page_frag_ portion of the name to the front of the functions
names. Secondly we split out the cache specific functions from the
other page fragment functions by adding the word "cache" to the name.
Finally I added a bit of documentation that will hopefully help to
explain some of this. I plan to revisit this later as we get things
more ironed out in the near future with the changes planned for the DMA
setup to support eXpress Data Path.
This patch (of 3):
This patch renames the page frag functions to be more consistent with
other APIs. Specifically we place the name page_frag first in the name
and then have either an alloc or free call name that we append as the
suffix. This makes it a bit clearer in terms of naming.
In addition we drop the leading double underscores since we are
technically no longer a backing interface and instead the front end that
is called from the networking APIs.
Link: http://lkml.kernel.org/r/20170104023854.13451.67390.stgit@localhost.localdomain
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The GRO fast path caches the frag0 address. This address becomes
invalid if frag0 is modified by pskb_may_pull or its variants.
So whenever that happens we must disable the frag0 optimization.
This is usually done through the combination of gro_header_hard
and gro_header_slow, however, the IPv6 extension header path did
the pulling directly and would continue to use the GRO fast path
incorrectly.
This patch fixes it by disabling the fast path when we enter the
IPv6 extension header path.
Fixes: 78a478d0ef ("gro: Inline skb_gro_header and cache frag0 virtual address")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The GRO path has a fast-path where we avoid calling pskb_may_pull
and pskb_expand by directly accessing frag0. However, this should
only be done if we have enough tailroom in the skb as otherwise
we'll have to expand it later anyway.
This patch adds the check by capping frag0_len with the skb tailroom.
Fixes: cb18978cbf ("gro: Open-code final pskb_may_pull")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit e53743994e
("af_iucv: use paged SKBs for big outbound messages"),
we transmit paged skbs for both of AF_IUCV's transport modes
(IUCV or HiperSockets).
The qeth driver for Layer 3 HiperSockets currently doesn't
support NETIF_F_SG, so these skbs would just be linearized again
by the stack.
Avoid that overhead by using paged skbs only for IUCV transport.
cc stable, since this also circumvents a significant skb leak when
sending large messages (where the skb then needs to be linearized).
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.8+
Fixes: e53743994e ("af_iucv: use paged SKBs for big outbound messages")
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bdabad3e36 ("net: Add Qualcomm IPC router") introduced a
new address family. Update the family name tables accordingly so
that the lockdep initialization can use the proper names for this
family.
Cc: Courtney Cavin <courtney.cavin@sonymobile.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Failure to mark this pointer as __le32 causes checkers like
sparse to complain:
net/qrtr/qrtr.c:274:16: warning: incorrect type in assignment (different base types)
net/qrtr/qrtr.c:274:16: expected unsigned int [unsigned] [usertype] <noident>
net/qrtr/qrtr.c:274:16: got restricted __le32 [usertype] <noident>
net/qrtr/qrtr.c:275:16: warning: incorrect type in assignment (different base types)
net/qrtr/qrtr.c:275:16: expected unsigned int [unsigned] [usertype] <noident>
net/qrtr/qrtr.c:275:16: got restricted __le32 [usertype] <noident>
net/qrtr/qrtr.c:276:16: warning: incorrect type in assignment (different base types)
net/qrtr/qrtr.c:276:16: expected unsigned int [unsigned] [usertype] <noident>
net/qrtr/qrtr.c:276:16: got restricted __le32 [usertype] <noident>
Silence it.
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is perfectly possible to have non zero indexed switches being present
in a DSA switch tree, in such a case, we will be deferencing a NULL
pointer while dsa_cpu_port_ethtool_{setup,restore}. Be more defensive
and ensure that dst->ds[0] is valid before doing anything with it.
Fixes: 0c73c523cf ("net: dsa: Initialize CPU port ethtool ops per tree")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes following sparse complain :
net/core/flow_dissector.c:70:8: warning: symbol 'skb_flow_get_be16'
was not declared. Should it be static?
Fixes: 972d3876fa ("flow dissector: ICMP support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sockfs_setattr() static as it is not used outside of net/socket.c
This fixes the following GCC warning:
net/socket.c:534:5: warning: no previous prototype for ‘sockfs_setattr’ [-Wmissing-prototypes]
Fixes: 86741ec254 ("net: core: Add a UID field to struct sock.")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 1fb6f159fd ("tcp: add tcp_conn_request"),
tcp_peer_is_proven() no longer needs to be exported.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
> cat /proc/sys/net/ipv4/tcp_notsent_lowat
-1
> echo 4294967295 > /proc/sys/net/ipv4/tcp_notsent_lowat
-bash: echo: write error: Invalid argument
> echo -2147483648 > /proc/sys/net/ipv4/tcp_notsent_lowat
> cat /proc/sys/net/ipv4/tcp_notsent_lowat
-2147483648
but in documentation we have "tcp_notsent_lowat - UNSIGNED INTEGER"
v2: simplify to just proc_douintvec
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we added CALIPSO support in Linux v4.8 we forgot to add it to the
list of supported protocols with display at boot.
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Another single fix, to correctly handle destruction of a
single netlink socket having ownership of multiple objects
(scheduled scan requests and interfaces.)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
vti6 interface is registered before the rtnl_link_ops block
is attached. As a result the resulting RTM_NEWLINK is missing
IFLA_INFO_KIND. Re-order attachment of rtnl_link_ops block to fix.
Signed-off-by: Dave Forster <dforster@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains accumulated Netfilter fixes for your
net tree:
1) Ensure quota dump and reset happens iff we can deliver numbers to
userspace.
2) Silence splat on incorrect use of smp_processor_id() from nft_queue.
3) Fix an out-of-bound access reported by KASAN in
nf_tables_rule_destroy(), patch from Florian Westphal.
4) Fix layer 4 checksum mangling in the nf_tables payload expression
with IPv6.
5) Fix a race in the CLUSTERIP target from control plane path when two
threads run to add a new configuration object. Serialize invocations
of clusterip_config_init() using spin_lock. From Xin Long.
6) Call br_nf_pre_routing_finish_bridge_finish() once we are done with
the br_nf_pre_routing_finish() hook. From Artur Molchanov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A single netlink socket might own multiple interfaces *and* a
scheduled scan request (which might belong to another interface),
so when it goes away both may need to be destroyed.
Remove the schedule_scan_stop indirection to fix this - it's only
needed for interface destruction because of the way this works
right now, with a single work taking care of all interfaces.
Cc: stable@vger.kernel.org
Fixes: 93a1e86ce1 ("nl80211: Stop scheduled scan if netlink client disappears")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Final nlmsg_len field update must reflect inserted net_dm_drop_point
data.
This patch depends on previous patch:
"drop_monitor: add missing call to genlmsg_end"
Signed-off-by: Reiter Wolfgang <wr0112358@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of custom rules being present we need to handle the case of the
LOCAL table being intialized after the new rule has been added. To address
that I am adding a new check so that we can make certain we don't use an
alias of MAIN for LOCAL when allocating a new table.
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Oliver Brunel <jjk@jjacky.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5.2. Action on Reception of a Query
When a system receives a Query, it does not respond immediately.
Instead, it delays its response by a random amount of time, bounded
by the Max Resp Time value derived from the Max Resp Code in the
received Query message. A system may receive a variety of Queries on
different interfaces and of different kinds (e.g., General Queries,
Group-Specific Queries, and Group-and-Source-Specific Queries), each
of which may require its own delayed response.
Before scheduling a response to a Query, the system must first
consider previously scheduled pending responses and in many cases
schedule a combined response. Therefore, the system must be able to
maintain the following state:
o A timer per interface for scheduling responses to General Queries.
o A per-group and interface timer for scheduling responses to Group-
Specific and Group-and-Source-Specific Queries.
o A per-group and interface list of sources to be reported in the
response to a Group-and-Source-Specific Query.
When a new Query with the Router-Alert option arrives on an
interface, provided the system has state to report, a delay for a
response is randomly selected in the range (0, [Max Resp Time]) where
Max Resp Time is derived from Max Resp Code in the received Query
message. The following rules are then used to determine if a Report
needs to be scheduled and the type of Report to schedule. The rules
are considered in order and only the first matching rule is applied.
1. If there is a pending response to a previous General Query
scheduled sooner than the selected delay, no additional response
needs to be scheduled.
2. If the received Query is a General Query, the interface timer is
used to schedule a response to the General Query after the
selected delay. Any previously pending response to a General
Query is canceled.
--8<--
Currently the timer is rearmed with new random expiration time for
every incoming query regardless of possibly already pending report.
Which is not aligned with the above RFE.
It also might happen that higher rate of incoming queries can
postpone the report after the expiration time of the first query
causing group membership loss.
Now the per interface general query timer is rearmed only
when there is no pending report already scheduled on that interface or
the newly selected expiration time is before the already pending
scheduled report.
Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>