Commit Graph

3893 Commits

Author SHA1 Message Date
Jakub Kicinski
5c0ca3f566 test_bpf: fix the dummy skb after dissector changes
Commit d5709f7ab7 ("flow_dissector: For stripped vlan, get vlan
info from skb->vlan_tci") made flow dissector look at vlan_proto
when vlan is present.  Since test_bpf sets skb->vlan_tci to ~0
(including VLAN_TAG_PRESENT) we have to populate skb->vlan_proto.

Fixes false negative on test #24:
test_bpf: #24 LD_PAYLOAD_OFF jited:0 175 ret 0 != 42 FAIL (1 times)

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 19:17:15 -04:00
David S. Miller
b20b378d49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mediatek/mtk_eth_soc.c
	drivers/net/ethernet/qlogic/qed/qed_dcbx.c
	drivers/net/phy/Kconfig

All conflicts were cases of overlapping commits.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-12 15:52:44 -07:00
David S. Miller
60175ccdf4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree.  Most relevant updates are the removal of per-conntrack timers to
use a workqueue/garbage collection approach instead from Florian
Westphal, the hash and numgen expression for nf_tables from Laura
Garcia, updates on nf_tables hash set to honor the NLM_F_EXCL flag,
removal of ip_conntrack sysctl and many other incremental updates on our
Netfilter codebase.

More specifically, they are:

1) Retrieve only 4 bytes to fetch ports in case of non-linear skb
   transport area in dccp, sctp, tcp, udp and udplite protocol
   conntrackers, from Gao Feng.

2) Missing whitespace on error message in physdev match, from Hangbin Liu.

3) Skip redundant IPv4 checksum calculation in nf_dup_ipv4, from Liping Zhang.

4) Add nf_ct_expires() helper function and use it, from Florian Westphal.

5) Replace opencoded nf_ct_kill() call in IPVS conntrack support, also
   from Florian.

6) Rename nf_tables set implementation to nft_set_{name}.c

7) Introduce the hash expression to allow arbitrary hashing of selector
   concatenations, from Laura Garcia Liebana.

8) Remove ip_conntrack sysctl backward compatibility code, this code has
   been around for long time already, and we have two interfaces to do
   this already: nf_conntrack sysctl and ctnetlink.

9) Use nf_conntrack_get_ht() helper function whenever possible, instead
   of opencoding fetch of hashtable pointer and size, patch from Liping Zhang.

10) Add quota expression for nf_tables.

11) Add number generator expression for nf_tables, this supports
    incremental and random generators that can be combined with maps,
    very useful for load balancing purpose, again from Laura Garcia Liebana.

12) Fix a typo in a debug message in FTP conntrack helper, from Colin Ian King.

13) Introduce a nft_chain_parse_hook() helper function to parse chain hook
    configuration, this is used by a follow up patch to perform better chain
    update validation.

14) Add rhashtable_lookup_get_insert_key() to rhashtable and use it from the
    nft_set_hash implementation to honor the NLM_F_EXCL flag.

15) Missing nulls check in nf_conntrack from nf_conntrack_tuple_taken(),
    patch from Florian Westphal.

16) Don't use the DYING bit to know if the conntrack event has been already
    delivered, instead a state variable to track event re-delivery
    states, also from Florian.

17) Remove the per-conntrack timer, use the workqueue approach that was
    discussed during the NFWS, from Florian Westphal.

18) Use the netlink conntrack table dump path to kill stale entries,
    again from Florian.

19) Add a garbage collector to get rid of stale conntracks, from
    Florian.

20) Reschedule garbage collector if eviction rate is high.

21) Get rid of the __nf_ct_kill_acct() helper.

22) Use ARPHRD_ETHER instead of hardcoded 1 from ARP logger.

23) Make nf_log_set() interface assertive on unsupported families.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-06 12:45:26 -07:00
Geert Uytterhoeven
e6173ba42b lib/test_hash.c: fix warning in preprocessor symbol evaluation
Some versions of gcc don't like tests for the value of an undefined
preprocessor symbol, even in the #else branch of an #ifndef:

    lib/test_hash.c:224:7: warning: "HAVE_ARCH__HASH_32" is not defined [-Wundef]
     #elif HAVE_ARCH__HASH_32 != 1
	   ^
    lib/test_hash.c:229:7: warning: "HAVE_ARCH_HASH_32" is not defined [-Wundef]
     #elif HAVE_ARCH_HASH_32 != 1
	   ^
    lib/test_hash.c:234:7: warning: "HAVE_ARCH_HASH_64" is not defined [-Wundef]
     #elif HAVE_ARCH_HASH_64 != 1
	   ^

Seen with gcc 4.9, not seen with 4.1.2.

Change the logic to only check the value inside an #ifdef to fix this.

Fixes: 468a942852 ("<linux/hash.h>: Add support for architecture-specific functions")
Link: http://lkml.kernel.org/r/20160829214952.1334674-4-arnd@arndb.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: George Spelvin <linux@sciencehorizons.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-01 17:52:01 -07:00
Geert Uytterhoeven
ed76b7a131 lib/test_hash.c: fix warning in two-dimensional array init
lib/test_hash.c: In function 'test_hash_init':
  lib/test_hash.c:146:2: warning: missing braces around initializer [-Wmissing-braces]

Fixes: 468a942852 ("<linux/hash.h>: Add support for architecture-specific functions")
Link: http://lkml.kernel.org/r/20160829214952.1334674-3-arnd@arndb.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: George Spelvin <linux@sciencehorizons.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-01 17:52:01 -07:00
Josh Poimboeuf
0d025d271e mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

   This is a static warning which happens when object size and copy size
   are both const, and copy size > object size.  I didn't see any false
   positives for this one.  So the function warning attribute seems to
   be working fine here.

   Note this scenario is always a bug and so I think it should be
   changed to *always* be an error, regardless of
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

   This is another static warning which happens when I enable
   __compiletime_object_size() for new compilers (and
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
   is const, but copy size is *not*.  In this case there's no way to
   compare the two at build time, so it gives the warning.  (Note the
   warning is a byproduct of the fact that gcc has no way of knowing
   whether the overflow function will be called, so the call isn't dead
   code and the warning attribute is activated.)

   So this warning seems to only indicate "this is an unusual pattern,
   maybe you should check it out" rather than "this is a bug".

   I get 102(!) of these warnings with allyesconfig and the
   __compiletime_object_size() gcc check removed.  I don't know if there
   are any real bugs hiding in there, but from looking at a small
   sample, I didn't see any.  According to Kees, it does sometimes find
   real bugs.  But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

   This is a runtime warning where object size is const, and copy size >
   object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

  2fb0815c9e ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size().  But in fact,
__compiletime_object_size() seems to be working fine.  The false
positives were instead triggered by #2 above.  (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.

Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-30 10:10:21 -07:00
David S. Miller
6abdd5f593 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
All three conflicts were cases of simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-30 00:54:02 -04:00
Eric Dumazet
9dbeea7f08 rhashtable: fix a memory leak in alloc_bucket_locks()
If vmalloc() was successful, do not attempt a kmalloc_array()

Fixes: 4cf0b354d9 ("rhashtable: avoid large lock-array allocations")
Reported-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26 21:59:53 -07:00
Pablo Neira Ayuso
5ca8cc5bf1 rhashtable: add rhashtable_lookup_get_insert_key()
This patch modifies __rhashtable_insert_fast() so it returns the
existing object that clashes with the one that you want to insert.
In case the object is successfully inserted, NULL is returned.
Otherwise, you get an error via ERR_PTR().

This patch adapts the existing callers of __rhashtable_insert_fast()
so they handle this new logic, and it adds a new
rhashtable_lookup_get_insert_key() interface to fetch this existing
object.

nf_tables needs this change to improve handling of EEXIST cases via
honoring the NLM_F_EXCL flag and by checking if the data part of the
mapping matches what we have.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-08-26 17:29:41 +02:00
Herbert Xu
246779dd09 rhashtable: Remove GFP flag from rhashtable_walk_init
The commit 8f6fd83c6c ("rhashtable:
accept GFP flags in rhashtable_walk_init") added a GFP flag argument
to rhashtable_walk_init because some users wish to use the walker
in an unsleepable context.

In fact we don't need to allocate memory in rhashtable_walk_init
at all.  The walker is always paired with an iterator so we could
just stash ourselves there.

This patch does that by introducing a new enter function to replace
the existing init function.  This way we don't have to churn all
the existing users again.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-19 14:40:24 -07:00
Linus Torvalds
184ca82348 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
    Fietkau.

 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

 3) Fix some tg3 ethtool logic bugs, and one that would cause no
    interrupts to be generated when rx-coalescing is set to 0.  From
    Satish Baddipadige and Siva Reddy Kallam.

 4) QLCNIC mailbox corruption and napi budget handling fix from Manish
    Chopra.

 5) Fix fib_trie logic when walking the trie during /proc/net/route
    output than can access a stale node pointer.  From David Forster.

 6) Several sctp_diag fixes from Phil Sutter.

 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

 8) Checksum fixup fixes in bpf from Daniel Borkmann.

 9) Memork leaks in nfnetlink, from Liping Zhang.

10) Use after free in rxrpc, from David Howells.

11) Use after free in new skb_array code of macvtap driver, from Jason
    Wang.

12) Calipso resource leak, from Colin Ian King.

13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

15) Fix lockdep splats in macsec, from Sabrina Dubroca.

16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
    handling.

17) Various tc-action bug fixes, from CONG Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net_sched: allow flushing tc police actions
  net_sched: unify the init logic for act_police
  net_sched: convert tcf_exts from list to pointer array
  net_sched: move tc offload macros to pkt_cls.h
  net_sched: fix a typo in tc_for_each_action()
  net_sched: remove an unnecessary list_del()
  net_sched: remove the leftover cleanup_a()
  mlxsw: spectrum: Allow packets to be trapped from any PG
  mlxsw: spectrum: Unmap 802.1Q FID before destroying it
  mlxsw: spectrum: Add missing rollbacks in error path
  mlxsw: reg: Fix missing op field fill-up
  mlxsw: spectrum: Trap loop-backed packets
  mlxsw: spectrum: Add missing packet traps
  mlxsw: spectrum: Mark port as active before registering it
  mlxsw: spectrum: Create PVID vPort before registering netdevice
  mlxsw: spectrum: Remove redundant errors from the code
  mlxsw: spectrum: Don't return upon error in removal path
  i40e: check for and deal with non-contiguous TCs
  ixgbe: Re-enable ability to toggle VLAN filtering
  ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
  ...
2016-08-17 17:26:58 -07:00
Vegard Nossum
12311959ec rhashtable: fix shift by 64 when shrinking
I got this:

    ================================================================================
    UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
    shift exponent 64 is too large for 64-bit type 'long unsigned int'
    CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    Workqueue: events rht_deferred_worker
     0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
     ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
     0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
    Call Trace:
     [<ffffffff82344f50>] dump_stack+0xac/0xfc
     [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
     [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a
     [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a
     [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180
     [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0
     [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160
     [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380
     [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790
     [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790
     [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370
     [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
     [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970
     [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
     [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0
     [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340
     [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340
     [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40
     [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
     [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
     [<ffffffff813642f7>] kthread+0x237/0x390
     [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
     [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50
     [<ffffffff845f95df>] ret_from_fork+0x1f/0x40
     [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
    ================================================================================

roundup_pow_of_two() is undefined when called with an argument of 0, so
let's avoid the call and just fall back to ht->p.min_size (which should
never be smaller than HASH_MIN_SIZE).

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-15 11:10:09 -07:00
Florian Westphal
4cf0b354d9 rhashtable: avoid large lock-array allocations
Sander reports following splat after netfilter nat bysrc table got
converted to rhashtable:

swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..]
 [<ffffffff811633ed>] warn_alloc_failed+0xdd/0x140
 [<ffffffff811638b1>] __alloc_pages_nodemask+0x3e1/0xcf0
 [<ffffffff811a72ed>] alloc_pages_current+0x8d/0x110
 [<ffffffff8117cb7f>] kmalloc_order+0x1f/0x70
 [<ffffffff811aec19>] __kmalloc+0x129/0x140
 [<ffffffff8146d561>] bucket_table_alloc+0xc1/0x1d0
 [<ffffffff8146da1d>] rhashtable_insert_rehash+0x5d/0xe0
 [<ffffffff819fcfff>] nf_nat_setup_info+0x2ef/0x400

The failure happens when allocating the spinlock array.
Even with GFP_KERNEL its unlikely for such a large allocation
to succeed.

Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition
to adding NOWARN for atomic allocations this also makes the bucket-array
sizing more conservative.

In commit 095dc8e0c3 ("tcp: fix/cleanup inet_ehash_locks_alloc()"),
Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'".
IOW, consider size needed by a single spinlock when determining
number of locks per cpu.  So with 64 byte per cacheline and 4 byte per
spinlock this gives 32 locks per cpu.

Resulting size of the lock-array (sizeof(spinlock) == 4):

cpus:    1   2   4   8   16   32   64
old:    1k  1k  4k  8k  16k  16k  16k
new:   128 256 512  1k   2k   4k   8k

8k allocation should have decent chance of success even
with GFP_ATOMIC, and should not fail with GFP_KERNEL.

With 72-byte spinlock (LOCKDEP):
cpus :   1   2
old:    9k 18k
new:   ~2k ~4k

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:12:57 -07:00
Linus Torvalds
1bd4403d86 unsafe_[get|put]_user: change interface to use a error target label
When I initially added the unsafe_[get|put]_user() helpers in commit
5b24a7a2aa ("Add 'unsafe' user access functions for batched
accesses"), I made the mistake of modeling the interface on our
traditional __[get|put]_user() functions, which return zero on success,
or -EFAULT on failure.

That interface is fairly easy to use, but it's actually fairly nasty for
good code generation, since it essentially forces the caller to check
the error value for each access.

In particular, since the error handling is already internally
implemented with an exception handler, and we already use "asm goto" for
various other things, we could fairly easily make the error cases just
jump directly to an error label instead, and avoid the need for explicit
checking after each operation.

So switch the interface to pass in an error label, rather than checking
the error value in the caller.  Best do it now before we start growing
more users (the signal handling code in particular would be a good place
to use the new interface).

So rather than

	if (unsafe_get_user(x, ptr))
		... handle error ..

the interface is now

	unsafe_get_user(x, ptr, label);

where an error during the user mode fetch will now just cause a jump to
'label' in the caller.

Right now the actual _implementation_ of this all still ends up being a
"if (err) goto label", and does not take advantage of any exception
label tricks, but for "unsafe_put_user()" in particular it should be
fairly straightforward to convert to using the exception table model.

Note that "unsafe_get_user()" is much harder to convert to a clever
exception table model, because current versions of gcc do not allow the
use of "asm goto" (for the exception) with output values (for the actual
value to be fetched).  But that is hopefully not a limitation in the
long term.

[ Also note that it might be a good idea to switch unsafe_get_user() to
  actually _return_ the value it fetches from user space, but this
  commit only changes the error handling semantics ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-08 13:02:01 -07:00
Phil Sutter
3b3bf80b99 rhashtable-test: Fix max_size parameter description
Looks like a simple copy'n'paste error.

Fixes: 1aa661f5c3 ("rhashtable-test: Measure time to insert, remove & traverse entries")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 12:52:42 -07:00
Jason Baron
9049fc7453 dynamic_debug: add jump label support
Although dynamic debug is often only used for debug builds, sometimes
its enabled for production builds as well.  Minimize its impact by using
jump labels.  This reduces the text section by 7000+ bytes in the kernel
image below.  It does increase data, but this should only be referenced
when changing the direction of the branches, and hence usually not in
cache.

     text     data     bss       dec     hex  filename
  8194852  4879776  925696  14000324  d5a0c4  vmlinux.pre
  8187337  4960224  925696  14073257  d6bda9  vmlinux.post

Link: http://lkml.kernel.org/r/d165b465e8c89bc582d973758d40be44c33f018b.1467837322.git.jbaron@akamai.com
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Joe Perches <joe@perches.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Krzysztof Kozlowski
00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Linus Torvalds
d52bd54db8 Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:

 - the rest of ocfs2

 - various hotfixes, mainly MM

 - quite a bit of misc stuff - drivers, fork, exec, signals, etc.

 - printk updates

 - firmware

 - checkpatch

 - nilfs2

 - more kexec stuff than usual

 - rapidio updates

 - w1 things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
  ipc: delete "nr_ipc_ns"
  kcov: allow more fine-grained coverage instrumentation
  init/Kconfig: add clarification for out-of-tree modules
  config: add android config fragments
  init/Kconfig: ban CONFIG_LOCALVERSION_AUTO with allmodconfig
  relay: add global mode support for buffer-only channels
  init: allow blacklisting of module_init functions
  w1:omap_hdq: fix regression
  w1: add helper macro module_w1_family
  w1: remove need for ida and use PLATFORM_DEVID_AUTO
  rapidio/switches: add driver for IDT gen3 switches
  powerpc/fsl_rio: apply changes for RIO spec rev 3
  rapidio: modify for rev.3 specification changes
  rapidio: change inbound window size type to u64
  rapidio/idt_gen2: fix locking warning
  rapidio: fix error handling in mbox request/release functions
  rapidio/tsi721_dma: advance queue processing from transfer submit call
  rapidio/tsi721: add messaging mbox selector parameter
  rapidio/tsi721: add PCIe MRRS override parameter
  rapidio/tsi721_dma: add channel mask and queue size parameters
  ...
2016-08-02 21:08:07 -04:00
Vegard Nossum
a4691deabf kcov: allow more fine-grained coverage instrumentation
For more targeted fuzzing, it's better to disable kernel-wide
instrumentation and instead enable it on a per-subsystem basis.  This
follows the pattern of UBSAN and allows you to compile in the kcov
driver without instrumenting the whole kernel.

To instrument a part of the kernel, you can use either

    # for a single file in the current directory
    KCOV_INSTRUMENT_filename.o := y

or

    # for all the files in the current directory (excluding subdirectories)
    KCOV_INSTRUMENT := y

or

    # (same as above)
    ccflags-y += $(CFLAGS_KCOV)

or

    # for all the files in the current directory (including subdirectories)
    subdir-ccflags-y += $(CFLAGS_KCOV)

Link: http://lkml.kernel.org/r/1464008380-11405-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:43 -04:00
Arnd Bergmann
a9bfd33217 crc32: use ktime_get_ns() for measurement
The crc32 test function measures the elapsed time in nanoseconds, but
uses 'struct timespec' for that.  We want to remove timespec from the
kernel for y2038 compatibility, and ktime_get_ns() also helps make the
code simpler here.

It is also slightly better to use monontonic time, as we are only
interested in the time difference.

Link: http://lkml.kernel.org/r/20160617143932.3289626-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "David S . Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:08 -04:00
Sebastian Ott
f003a1f182 lib/iommu-helper: skip to next segment
When a large enough area in the iommu bitmap is found but would span a
boundary we continue the search starting from the next bit position.
For large allocations this can lead to several useless invocations of
bitmap_find_next_zero_area() and iommu_is_span_boundary().

Continue the search from the start of the next segment (which is the
next bit position such that we'll not cross the same segment boundary
again).

Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1606081910070.3211@schleppi
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:07 -04:00
Borislav Petkov
6b1d174b0c ratelimit: extend to print suppressed messages on release
Extend the ratelimiting facility to print the amount of suppressed lines
when it is being released.

This use case is aimed at short-termed, burst-like users for which we
want to output the suppressed lines stats only once, after it has been
disposed of.  For an example, see /dev/kmsg usage in a follow-on patch.

Also, change the printk() line we issue on release to not use
"callbacks" as it is misleading: we're not suppressing callbacks but
printk() calls.

This has been separated from a previous patch by Linus.

Link: http://lkml.kernel.org/r/20160716061745.15795-2-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:06 -04:00
Nicolas Iooss
901d805c33 UBSAN: fix typo in format string
handle_object_size_mismatch() used %pk to format a kernel pointer with
pr_err().  This seemed to be a misspelling for %pK, but using this to
format a kernel pointer does not make much sence here.

Therefore use %p instead, like in handle_missaligned_access().

Link: http://lkml.kernel.org/r/20160730083010.11569-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Vladimir Davydov
05eb6e7263 radix-tree: account nodes to memcg only if explicitly requested
Radix trees may be used not only for storing page cache pages, so
unconditionally accounting radix tree nodes to the current memory cgroup
is bad: if a radix tree node is used for storing data shared among
different cgroups we risk pinning dead memory cgroups forever.

So let's only account radix tree nodes if it was explicitly requested by
passing __GFP_ACCOUNT to INIT_RADIX_TREE.  Currently, we only want to
account page cache entries, so mark mapping->page_tree so.

Fixes: 58e698af4c ("radix-tree: account radix_tree_node to memory cgroup")
Link: http://lkml.kernel.org/r/1470057188-7864-1-git-send-email-vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>	[4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Linus Torvalds
f716a85cd6 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - GCC plugin support by Emese Revfy from grsecurity, with a fixup from
   Kees Cook.  The plugins are meant to be used for static analysis of
   the kernel code.  Two plugins are provided already.

 - reduction of the gcc commandline by Arnd Bergmann.

 - IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada

 - bin2c fix by Michael Tautschnig

 - setlocalversion fix by Wolfram Sang

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  gcc-plugins: disable under COMPILE_TEST
  kbuild: Abort build on bad stack protector flag
  scripts: Fix size mismatch of kexec_purgatory_size
  kbuild: make samples depend on headers_install
  Kbuild: don't add obj tree in additional includes
  Kbuild: arch: look for generated headers in obtree
  Kbuild: always prefix objtree in LINUXINCLUDE
  Kbuild: avoid duplicate include path
  Kbuild: don't add ../../ to include path
  vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
  kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
  kconfig.h: use already defined macros for IS_REACHABLE() define
  export.h: use __is_defined() to check if __KSYM_* is defined
  kconfig.h: use __is_defined() to check if MODULE is defined
  kbuild: setlocalversion: print error to STDERR
  Add sancov plugin
  Add Cyclomatic complexity GCC plugin
  GCC plugin infrastructure
  Shared library support
2016-08-02 16:37:12 -04:00