Commit Graph

23134 Commits

Author SHA1 Message Date
Linus Torvalds cb60e3e65c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "New notable features:
   - The seccomp work from Will Drewry
   - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski
   - Longer security labels for Smack from Casey Schaufler
   - Additional ptrace restriction modes for Yama by Kees Cook"

Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
  apparmor: fix long path failure due to disconnected path
  apparmor: fix profile lookup for unconfined
  ima: fix filename hint to reflect script interpreter name
  KEYS: Don't check for NULL key pointer in key_validate()
  Smack: allow for significantly longer Smack labels v4
  gfp flags for security_inode_alloc()?
  Smack: recursive tramsmute
  Yama: replace capable() with ns_capable()
  TOMOYO: Accept manager programs which do not start with / .
  KEYS: Add invalidation support
  KEYS: Do LRU discard in full keyrings
  KEYS: Permit in-place link replacement in keyring list
  KEYS: Perform RCU synchronisation on keys prior to key destruction
  KEYS: Announce key type (un)registration
  KEYS: Reorganise keys Makefile
  KEYS: Move the key config into security/keys/Kconfig
  KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat
  Yama: remove an unused variable
  samples/seccomp: fix dependencies on arch macros
  Yama: add additional ptrace scopes
  ...
2012-05-21 20:27:36 -07:00
Linus Torvalds 99262a3daf Merge tag 'virtio-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Pull virtio updates from Rusty Russell.

* tag 'virtio-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: fix typo in comment
  virtio-mmio: Devices parameter parsing
  virtio_blk: Drop unused request tracking list
  virtio-blk: Fix hot-unplug race in remove method
  virtio: Use ida to allocate virtio index
  virtio: balloon: separate out common code between remove and freeze functions
  virtio: balloon: drop restore_common()
  9p: disconnect channel when PCI device is removed
  virtio: update documentation to v0.9.5 of spec
2012-05-21 20:20:23 -07:00
Sasha Levin 991ad9ec39 9p: disconnect channel when PCI device is removed
When a virtio_9p PCI device is being removed, we should close down any
active channels and free up resources. We're not supposed to BUG() if there's
still an open channel, since that is a valid case when removing the PCI device.

Otherwise, removing the PCI device with an open channel would cause the
following BUG():

[ 1184.671416] ------------[ cut here ]------------
[ 1184.672057] kernel BUG at net/9p/trans_virtio.c:618!
[ 1184.672057] invalid opcode: 0000 [#1] PREEMPT SMP
[ 1184.672057] CPU 3
[ 1184.672057] Pid: 5, comm: kworker/u:0 Tainted: G        W    3.4.0-rc2-next-20120413-sasha-dirty #76
[ 1184.672057] RIP: 0010:[<ffffffff825c9116>]  [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90
[ 1184.672057] RSP: 0018:ffff88000d653ac0  EFLAGS: 00010202
[ 1184.672057] RAX: ffffffff836bfb40 RBX: ffff88000c9b2148 RCX: ffff88000d658978
[ 1184.672057] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff880028868000
[ 1184.672057] RBP: ffff88000d653ad0 R08: 0000000000000000 R09: 0000000000000000
[ 1184.672057] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880028868000
[ 1184.672057] R13: ffffffff835aa7c0 R14: ffff880041630000 R15: ffff88000d653da0
[ 1184.672057] FS:  0000000000000000(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000
[ 1184.672057] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1184.672057] CR2: 0000000001181000 CR3: 000000000eba1000 CR4: 00000000000406e0
[ 1184.672057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1184.672057] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1184.672057] Process kworker/u:0 (pid: 5, threadinfo ffff88000d652000, task ffff88000d658000)
[ 1184.672057] Stack:
[ 1184.672057]  ffff880028868000 ffffffff836bfb40 ffff88000d653af0 ffffffff8193661b
[ 1184.672057]  ffff880028868008 ffffffff836bfb40 ffff88000d653b10 ffffffff81af1c81
[ 1184.672057]  ffff880028868068 ffff880028868008 ffff88000d653b30 ffffffff81af257a
[ 1184.795301] Call Trace:
[ 1184.795301]  [<ffffffff8193661b>] virtio_dev_remove+0x1b/0x60
[ 1184.795301]  [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0
[ 1184.795301]  [<ffffffff81af257a>] device_release_driver+0x2a/0x40
[ 1184.795301]  [<ffffffff81af0d48>] bus_remove_device+0x138/0x150
[ 1184.795301]  [<ffffffff81aef08d>] device_del+0x14d/0x1b0
[ 1184.795301]  [<ffffffff81aef138>] device_unregister+0x48/0x60
[ 1184.795301]  [<ffffffff8193694d>] unregister_virtio_device+0xd/0x10
[ 1184.795301]  [<ffffffff8265fc74>] virtio_pci_remove+0x2a/0x6c
[ 1184.795301]  [<ffffffff818a95ad>] pci_device_remove+0x4d/0x110
[ 1184.795301]  [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0
[ 1184.795301]  [<ffffffff81af257a>] device_release_driver+0x2a/0x40
[ 1184.795301]  [<ffffffff81af0d48>] bus_remove_device+0x138/0x150
[ 1184.795301]  [<ffffffff81aef08d>] device_del+0x14d/0x1b0
[ 1184.795301]  [<ffffffff81aef138>] device_unregister+0x48/0x60
[ 1184.795301]  [<ffffffff818a36fa>] pci_stop_bus_device+0x6a/0x90
[ 1184.795301]  [<ffffffff818a3791>] pci_stop_and_remove_bus_device+0x11/0x20
[ 1184.795301]  [<ffffffff818c21d9>] remove_callback+0x9/0x10
[ 1184.795301]  [<ffffffff81252d91>] sysfs_schedule_callback_work+0x21/0x60
[ 1184.795301]  [<ffffffff810cb1a1>] process_one_work+0x281/0x430
[ 1184.795301]  [<ffffffff810cb140>] ? process_one_work+0x220/0x430
[ 1184.795301]  [<ffffffff81252d70>] ? sysfs_read_file+0x1c0/0x1c0
[ 1184.795301]  [<ffffffff810cc613>] worker_thread+0x1f3/0x320
[ 1184.795301]  [<ffffffff810cc420>] ? manage_workers.clone.13+0x130/0x130
[ 1184.795301]  [<ffffffff810d30b2>] kthread+0xb2/0xc0
[ 1184.795301]  [<ffffffff826783f4>] kernel_thread_helper+0x4/0x10
[ 1184.795301]  [<ffffffff810deb18>] ? finish_task_switch+0x78/0xf0
[ 1184.795301]  [<ffffffff82676574>] ? retint_restore_args+0x13/0x13
[ 1184.795301]  [<ffffffff810d3000>] ? kthread_flush_work_fn+0x10/0x10
[ 1184.795301]  [<ffffffff826783f0>] ? gs_change+0x13/0x13
[ 1184.795301] Code: c1 9e 0a 00 48 83 c4 08 5b c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 9f a8 04 00 00 80 3b 00 74 0a <0f> 0b 0f 1f 84 00 00 00 00 00 48 8b 87 88 04 00 00 ff 50 30 31
[ 1184.795301] RIP  [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90
[ 1184.795301]  RSP <ffff88000d653ac0>
[ 1184.952618] ---[ end trace a307b3ed40206b4c ]---

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-05-22 12:16:10 +09:30
James Morris ff2bb047c4 Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next
Per pull request, for 3.5.
2012-05-22 11:21:06 +10:00
Sam Ravnborg e47b65b032 net: drop NET dependency from HAVE_BPF_JIT
There is no point having the NET dependency on the select target, as it
forces all users to depend on NET to tell they support BPF_JIT.  Move
the config option to the bottom of the file - this could be a nice place
also for future "selectable" config symbols.

Fix up all users to drop the dependency on NET now that it is not
required to suppress warnings for non-NET builds.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-21 12:50:12 -07:00
Linus Torvalds cb62ab71fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David Miller:

 1) Get rid of the error prone NLA_PUT*() macros that used an embedded
    goto.

 2) Kill off the token-ring and MCA networking drivers, from Paul
    Gortmaker.

 3) Reduce high-order allocations made by datagram AF_UNIX sockets, from
    Eric Dumazet.

 4) Add PTP hardware clock support to IGB and IXGBE, from Richard
    Cochran and Jacob Keller.

 5) Allow users to query timestamping capabilities of a card via
    ethtool, from Richard Cochran.

 6) Add loadbalance mode to the teaming driver, from Jiri Pirko.  Part
    of this is that we can now have BPF filters not attached to sockets,
    and the loadbalancing function is calculated using one.

 7) Francois Romieu went through the network drivers removing gratuitous
    uses of netdev->base_addr, perhaps some day we can remove it
    completely but it's used for ISA probing still.

 8) Add a BPF JIT for sparc.  I know, who cares, right? :-)

 9) Move networking sysctl registry away from using the compatibility
    mode interfaces in the sysctl code.  From Eric W Biederman.

10) Pavel Emelyanov added a way to save and restore TCP socket state via
    TCP_REPAIR, TCP_REPAIR_QUEUE, and TCP_QUEUE_SEQ socket options as
    well as a way to forcefully bind a socket to a port via the
    sk->sk_reuse value SK_FORCE_REUSE.  There is also a
    TCP_REPAIR_OPTIONS which allows reinstating the TCP options
    enabled on the connection.

11) Several enhancements from Eric Dumazet that, in particular, can
    enhance splice performance on TCP sockets significantly.

     a) Reset the offset of the per-socket sendmsg page when we know
        we're the only use of the page in linear_to_page().

     b) Add facilities such that skb->data can be backed by a page rather
        than SLAB kmalloc'd memory.  In particular devices which were
        receiving into linear RX buffers can now end up providing paged
        data.

    The big result is that code like splice and GRO do not have to copy
    any more.

12) Allow a pure sender to more gracefully handle ACK backlogs in TCP.
    What can happen at high rates is that the sender hasn't grown his
    receive buffer limits at all (he's not receiving data so really
    doesn't need to), but the non-data ACKs consume receive buffer
    space.

    sk_add_backlog() is too aggressive in dropping frames in this case,
    so relax its requirements by using the receive buffer plus the send
    buffer limit as the backlog limit instead of just the former.

    Also from Eric Dumazet.

13) Add ipv6 support to L2TP, from Benjamin LaHaise, James Chapman, and
    Chris Elston.

14) Implement TCP early retransmit (RFC 5827), from Yuchung Cheng.
    Basically, we can start fast retransmit before hitting the dupack
    threshold under certain conditions.

15) New CODEL active queue management packet scheduler, from Eric
    Dumazet based upon initial work by Dave Taht.

    Basically, the big feature is that packets are dropped (or ECN bits
    are set) based upon how long packets live in the queue, rather than
    the queue length (which is what RED uses).
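
The sojourn-time idea in item 15 can be sketched in a few lines. This is a simplified user-space Python model, not the kernel scheduler: it only shows a drop decision driven by how long a packet waited, and it leaves out the control law that real CoDel uses to pace successive drops. The class name, the method name, and the 5 ms / 100 ms values are assumptions chosen for illustration.

```python
TARGET_US = 5_000      # assumed acceptable standing delay: 5 ms
INTERVAL_US = 100_000  # assumed observation window: 100 ms


class SojournDropper:
    """Hypothetical helper: decide drops from queueing delay, not queue length."""

    def __init__(self, target_us=TARGET_US, interval_us=INTERVAL_US):
        self.target_us = target_us
        self.interval_us = interval_us
        self.drop_after_us = None  # set once delay first exceeds the target

    def should_drop(self, enqueue_us, now_us):
        sojourn_us = now_us - enqueue_us
        if sojourn_us < self.target_us:
            # Delay is fine again: forget any pending deadline.
            self.drop_after_us = None
            return False
        if self.drop_after_us is None:
            # First packet above target: start the observation window.
            self.drop_after_us = now_us + self.interval_us
            return False
        # Drop only if the delay stayed above target for a whole window.
        return now_us >= self.drop_after_us
```

A queue-length scheme would need the number of queued packets as input; here the only inputs are two timestamps per packet, which is the distinction the pull message draws against RED.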

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1341 commits)
  drivers/net/stmmac: seq_file fix memory leak
  ipv6/exthdrs: strict Pad1 and PadN check
  USB: qmi_wwan: Add ZTE (Vodafone) K3520-Z
  USB: qmi_wwan: Add ZTE (Vodafone) K3765-Z
  USB: qmi_wwan: Make forced int 4 whitelist generic
  net/ipv4: replace simple_strtoul with kstrtoul
  net/ipv4/ipconfig: neaten __setup placement
  net: qmi_wwan: Add Vodafone/Huawei K5005 support
  net: cdc_ether: Add ZTE WWAN matches before generic Ethernet
  ipv6: use skb coalescing in reassembly
  ipv4: use skb coalescing in defragmentation
  net: introduce skb_try_coalesce()
  net:ipv6:fixed space issues relating to operators.
  net:ipv6:fixed a trailing white space issue.
  ipv6: disable GSO on sockets hitting dst_allfrag
  tg3: use netdev_alloc_frag() API
  net: napi_frags_skb() is static
  ppp: avoid false drop_monitor false positives
  ipv6: bool/const conversions phase2
  ipx: Remove spurious NULL checking in ipx_ioctl().
  ...
2012-05-21 10:03:46 -07:00
Linus Torvalds 31ed8e6f93 Merge branch 'dentry-cleanups' (dcache access cleanups and optimizations)
This branch simplifies and clarifies the dcache lookup, and allows us to
do certain nice optimizations when comparing dentries.  It also cleans
up the interface to __d_lookup_rcu(), especially around passing the
inode information around.

* dentry-cleanups:
  vfs: make it possible to access the dentry hash/len as one 64-bit entry
  vfs: move dentry name length comparison from dentry_cmp() into callers
  vfs: do the careful dentry name access for all dentry_cmp cases
  vfs: remove unnecessary d_unhashed() check from __d_lookup_rcu
  vfs: clean up __d_lookup_rcu() and dentry_cmp() interfaces
2012-05-21 08:50:57 -07:00
David S. Miller 17eea0df5f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-05-20 21:53:04 -04:00
Eldad Zack 9b905fe684 ipv6/exthdrs: strict Pad1 and PadN check
The following tightens the padding check from commit
c1412fce7e :

* Take into account combinations of consecutive Pad1 and PadN.

* Catch the corner case of when only padding is present in the
  header, when the extension header length is 0 (i.e., 8 bytes).
  In this case, the header would have exactly 6 bytes of padding:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:  Next Header  : Hdr Ext Len=0 :                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
:                        Padding (Pad1 or PadN)                 :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-20 16:58:39 -04:00
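
The consecutive-padding check described in the commit above can be modelled in user space. The Python sketch below is not the kernel code: it walks the option area of an extension header (the bytes after Next Header and Hdr Ext Len) and rejects any unbroken run of Pad1/PadN bytes longer than a limit. The function name and the limit of 7 bytes are assumptions for illustration; the commit message itself does not state the limit.

```python
PAD1 = 0         # Pad1: a single zero byte, no length field
PADN = 1         # PadN: type byte, length byte, then that many data bytes
MAX_PAD_RUN = 7  # assumed maximum bytes of consecutive padding


def padding_ok(options):
    """Return True if no run of consecutive padding exceeds MAX_PAD_RUN.

    `options` is the option area as bytes (header minus its first 2 bytes).
    """
    i = 0
    run = 0
    while i < len(options):
        opt_type = options[i]
        if opt_type == PAD1:
            size = 1
        else:
            if i + 1 >= len(options):
                return False  # truncated option: no length byte
            size = 2 + options[i + 1]
            if i + size > len(options):
                return False  # option runs past the end of the header
        if opt_type in (PAD1, PADN):
            run += size       # Pad1 and PadN add to the same run
            if run > MAX_PAD_RUN:
                return False
        else:
            run = 0           # a real option ends the padding run
        i += size
    return True
```

With Hdr Ext Len = 0 the option area is 6 bytes, so a header holding only padding, for example `bytes([1, 4, 0, 0, 0, 0])`, is the corner case drawn in the diagram.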
Eldad Zack 413c27d869 net/ipv4: replace simple_strtoul with kstrtoul
Replace simple_strtoul with kstrtoul in three similar occurrences, all setup
handlers:
* route.c: set_rhash_entries
* tcp.c: set_thash_entries
* udp.c: set_uhash_entries

Also check if the conversion failed.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-20 04:06:17 -04:00
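
The practical difference behind the kstrtoul conversion above can be shown with a small user-space model. This Python sketch is not kernel code: `lenient_parse` mimics a parser that stops at the first non-digit and reports nothing, while `strict_parse` mimics one that rejects the whole string and returns an error code. Both function names are hypothetical, only base 10 is handled, and overflow checking is left out.

```python
import errno

DIGITS = "0123456789"


def lenient_parse(text):
    """Consume leading digits and silently ignore whatever follows."""
    digits = ""
    for ch in text:
        if ch not in DIGITS:
            break
        digits += ch
    return int(digits) if digits else 0


def strict_parse(text):
    """Return (error, value); any stray character fails the conversion."""
    if text.endswith("\n"):
        text = text[:-1]          # tolerate a single trailing newline
    if not text or any(ch not in DIGITS for ch in text):
        return -errno.EINVAL, 0   # caller can now detect the bad input
    return 0, int(text)
```

Given the input "4096abc", the lenient model yields 4096 with no indication that anything was wrong, whereas the strict model returns an error the setup handler can act on.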
Eldad Zack b37f4d7b01 net/ipv4/ipconfig: neaten __setup placement
The __setup macro should follow the corresponding setup handler.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-20 04:06:16 -04:00
Eric Dumazet ec16439e17 ipv6: use skb coalescing in reassembly
ip6_frag_reasm() can use skb_try_coalesce() to build optimized skb,
reducing memory used by them (truesize), and reducing number of cache
line misses and overhead for the consumer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 18:34:57 -04:00
Eric Dumazet 3cc4949269 ipv4: use skb coalescing in defragmentation
ip_frag_reasm() can use skb_try_coalesce() to build optimized skb,
reducing memory used by them (truesize), and reducing number of cache
line misses and overhead for the consumer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 18:34:57 -04:00
Eric Dumazet bad43ca832 net: introduce skb_try_coalesce()
Move tcp_try_coalesce() protocol independent part to
skb_try_coalesce().

skb_try_coalesce() can be used in IPv4 defrag and IPv6 reassembly,
to build optimized skbs (less sk_buff, and possibly less 'headers')

skb_try_coalesce() is zero copy, unless the copy can fit in destination
header (it's a rare case)

kfree_skb_partial() is also moved to net/core/skbuff.c and exported,
because IPv6 will need it in patch (ipv6: use skb coalescing in
reassembly).

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 18:34:57 -04:00
Jeffrin Jose 3dde259882 net:ipv6:fixed space issues relating to operators.
Fixed space issues relating to operators found by
checkpatch.pl tool in net/ipv6/udp.c

Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 18:34:57 -04:00
Jeffrin Jose 9a52e97e24 net:ipv6:fixed a trailing white space issue.
Fixed a trailing white space issue found by
checkpatch.pl tool in net/ipv6/udp.c

Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 18:34:57 -04:00
Eric Dumazet a34a101e1e ipv6: disable GSO on sockets hitting dst_allfrag
If the allfrag feature has been set on a host route (due to an ICMPv6
Packet Too Big received indicating a MTU of less than 1280), we hit a
very slow behavior in TCP stack, because all big packets are dropped and
only a retransmit timer is able to push one MSS frame every 200 ms.

One way to handle this is to disable GSO on the socket the first time a
super packet is dropped. Adding a specific dst_allfrag() in the fast
path is probably overkill since the dst_allfrag() case almost never
happens.

Result on netperf TCP_STREAM, one flow :

Before : 60 kbit/sec
After : 1.6 Gbit/sec

Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 04:02:12 -04:00
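
The "Before" figure in the commit above is roughly what one segment per retransmit-timer tick would give. The short Python calculation below is only a plausibility check: the 200 ms interval comes from the commit message, while the 1440-byte MSS and the function name are assumptions made for illustration.

```python
def timer_limited_rate_bps(mss_bytes, timer_interval_ms):
    """Throughput in bit/s when one MSS-sized segment gets through per tick."""
    return mss_bytes * 8 * 1000 // timer_interval_ms


ASSUMED_MSS_BYTES = 1440  # assumption, not taken from the commit
rate = timer_limited_rate_bps(ASSUMED_MSS_BYTES, 200)
print(rate)  # 57600 bit/s, i.e. about 58 kbit/s
```

That lands in the same range as the reported 60 kbit/sec, which fits the explanation that only the retransmit timer was moving data.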
Eric Dumazet 4adb9c4ac8 net: napi_frags_skb() is static
No need to export napi_frags_skb()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 02:51:00 -04:00
Eric Dumazet a50feda546 ipv6: bool/const conversions phase2
Mostly bool conversions, some inline removals and const additions.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 01:08:16 -04:00
David S. Miller 32e9072b92 ipx: Remove spurious NULL checking in ipx_ioctl().
We already unconditionally dereference 'sk' via lock_sock(sk) earlier
in this function, and our caller (sock_do_ioctl()) takes similar
liberties.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-19 00:51:04 -04:00
Eric Dumazet 72e843bb09 ipv6: ip6_fragment() should check CHECKSUM_PARTIAL
Quoting Tore Anderson from :

If the allfrag feature has been set on a host route (due to an ICMPv6
Packet Too Big received indicating a MTU of less than 1280),
TCP SYN/ACK packets to that destination appears to get an incorrect
TCP checksum. This in turn means they are thrown away as invalid.

In the case of an IPv4 client behind a link with a MTU of less than
1260, accessing an IPv6 server through a stateless translator,
this means that the client can only download a single large file
from the server, because once it is in the server's routing cache
with the allfrag feature set, new TCP connections can no longer
be established.

</endquote>

It appears ip6_fragment() doesn't handle CHECKSUM_PARTIAL properly.

As network drivers are not prepared to fetch correct transport header, a
safe fix is to call skb_checksum_help() before fragmenting packet.

Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-18 23:49:33 -04:00
Eric Dumazet d4b1133558 pktgen: fix module unload for good
commit c57b546840 (pktgen: fix crash at module unload) did a very poor
job with list primitives.

1) list_splice() arguments were in the wrong order

2) list_splice(list, head) has undefined behavior if head is not
initialized.

3) We should use the list_splice_init() variant to clear pktgen_threads
list.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-18 13:54:33 -04:00
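
The list_splice() points in the commit above are easier to follow with a model of the circular doubly linked list. The Python sketch below imitates the kernel helpers in user space; it is an illustration rather than the kernel implementation, and the `ListHead` class and `payloads` helper are hypothetical. Point 2 cannot be reproduced here, because the Python constructor always initialises a head to point at itself.

```python
class ListHead:
    """A node that is also usable as a list head; an empty head points to itself."""

    def __init__(self, payload=None):
        self.payload = payload
        self.next = self
        self.prev = self


def list_empty(head):
    return head.next is head


def list_add_tail(new, head):
    prev = head.prev
    new.prev, new.next = prev, head
    prev.next = new
    head.prev = new


def _splice(src, prev, nxt):
    first, last = src.next, src.prev
    first.prev, prev.next = prev, first
    last.next, nxt.prev = nxt, last


def list_splice(src, head):
    """Move the entries of src to the front of head; src is left stale."""
    if not list_empty(src):
        _splice(src, head, head.next)


def list_splice_init(src, head):
    """Like list_splice(), but also reset src to an empty list."""
    if not list_empty(src):
        _splice(src, head, head.next)
        src.next = src.prev = src


def payloads(head):
    out, node = [], head.next
    while node is not head:
        out.append(node.payload)
        node = node.next
    return out
```

The first argument is the source and the second is the destination, so swapping them moves entries the wrong way (point 1). After a plain list_splice() the source head still points into the moved entries, which is why the commit switches to the list_splice_init() variant when the global list must end up empty (point 3).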
Eric Dumazet d7f7c0ac11 ipv6: remove csummode in ip6_append_data()
csummode variable is always CHECKSUM_NONE in ip6_append_data()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-18 13:31:25 -04:00
Eric Dumazet 6f532612cc net: introduce netdev_alloc_frag()
Fix two issues introduced in commit a1c7fff7e1
( net: netdev_alloc_skb() use build_skb() )

- Must be IRQ safe (non NAPI drivers can use it)
- Must not leak the frag if build_skb() fails to allocate sk_buff

This patch introduces netdev_alloc_frag() for drivers willing to
use build_skb() instead of __netdev_alloc_skb() variants.

Factorize code so that :
__dev_alloc_skb() is a wrapper around __netdev_alloc_skb(), and
dev_alloc_skb() a wrapper around netdev_alloc_skb()

Use __GFP_COLD flag.

Almost all network drivers now benefit from skb->head_frag
infrastructure.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-18 13:31:25 -04:00
Eric Dumazet 92113bfde2 ipv6: bool conversions phase1
ipv6_opt_accepted() returns a bool, and can use const pointers

ipv6_addr_equal(), ipv6_addr_any(), ipv6_addr_loopback(),
ipv6_addr_orchid() return a bool.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-18 02:24:13 -04:00