Commit Graph

14645 Commits

Author SHA1 Message Date
Al Viro
b2ddedcd21 x32: fix sigtimedwait
It needs 64bit timespec.  As it is, we end up truncating the timeout
to whole seconds; usually it doesn't matter, but for having all
sub-second timeouts truncated to one jiffy is visibly wrong.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26 01:15:03 -05:00
Al Viro
a566c28882 x32: fix waitid()
It needs 64bit rusage and 32bit siginfo.  glibc never calls it with
non-NULL rusage pointer, or we would've seen breakage already...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26 01:15:03 -05:00
Al Viro
8d9807b109 switch compat_sys_wait4() and compat_sys_waitid() to COMPAT_SYSCALL_DEFINE
Strictly speaking, ppc64 needs it for C ABI compliance.  Realistically
I would be very surprised if e.g. passing 0xffffffff as 'options'
argument to waitid() from 32bit task would cause problems, but yes,
it puts us into undefined behaviour territory.  ppc64 expects int
argument to be passed in 64bit register with bits 31..63 containing
the same value.  SYSCALL_DEFINE on ppc provides a wrapper that normalizes
the value passed from userland; so does COMPAT_SYSCALL_DEFINE.  Plain
declaration of compat_sys_something() with an int argument obviously
doesn't.  Again, for wait4 and waitid I would be extremely surprised
if gcc started to produce code depending on that value having been
properly sign-extended - the argument(s) in question end up passed
blindly to sys_wait4 and sys_waitid resp. and normalization for native
syscalls takes care of their use there.  Still, better to use
COMPAT_SYSCALL_DEFINE here than worry about nasal daemons...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26 01:15:02 -05:00
Al Viro
90228fc110 switch compat_sys_sigaltstack() to COMPAT_SYSCALL_DEFINE
Makes sigaltstack conversion easier to split into per-architecture
parts.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26 01:15:02 -05:00
Linus Torvalds
96680d2b91 Merge branch 'for-next' of git://git.infradead.org/users/eparis/notify
Pull filesystem notification updates from Eric Paris:
 "This pull mostly is about locking changes in the fsnotify system.  By
  switching the group lock from a spin_lock() to a mutex() we can now
  hold the lock across things like iput().  This fixes a problem
  involving unmounting a fs and having inodes be busy, first pointed out
  by FAT, but reproducible with tmpfs.

  This also restores signal driven I/O for inotify, which has been
  broken since about 2.6.32."

Ugh.  I *hate* the timing of this.  It was rebased after the merge
window opened, and then left to sit with the pull request coming the day
before the merge window closes.  That's just crap.  But apparently the
patches themselves have been around for over a year, just gathering
dust, so now it's suddenly critical.

Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen
Rothwell's fixes from -next.

* 'for-next' of git://git.infradead.org/users/eparis/notify:
  inotify: automatically restart syscalls
  inotify: dont skip removal of watch descriptor if creation of ignored event failed
  fanotify: dont merge permission events
  fsnotify: make fasync generic for both inotify and fanotify
  fsnotify: change locking order
  fsnotify: dont put marks on temporary list when clearing marks by group
  fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
  fsnotify: pass group to fsnotify_destroy_mark()
  fsnotify: use a mutex instead of a spinlock to protect a groups mark list
  fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed
  fsnotify: take groups mark_lock before mark lock
  fsnotify: use reference counting for groups
  fsnotify: introduce fsnotify_get_group()
  inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()
2012-12-20 20:11:52 -08:00
Linus Torvalds
4c9a44aebe Merge branch 'akpm' (Andrew's patch-bomb)
Merge the rest of Andrew's patches for -rc1:
 "A bunch of fixes and misc missed-out-on things.

  That'll do for -rc1.  I still have a batch of IPC patches which still
  have a possible bug report which I'm chasing down."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (25 commits)
  keys: use keyring_alloc() to create module signing keyring
  keys: fix unreachable code
  sendfile: allows bypassing of notifier events
  SGI-XP: handle non-fatal traps
  fat: fix incorrect function comment
  Documentation: ABI: remove testing/sysfs-devices-node
  proc: fix inconsistent lock state
  linux/kernel.h: fix DIV_ROUND_CLOSEST with unsigned divisors
  memcg: don't register hotcpu notifier from ->css_alloc()
  checkpatch: warn on uapi #includes that #include <uapi/...
  revert "rtc: recycle id when unloading a rtc driver"
  mm: clean up transparent hugepage sysfs error messages
  hfsplus: add error message for the case of failure of sync fs in delayed_sync_fs() method
  hfsplus: rework processing of hfs_btree_write() returned error
  hfsplus: rework processing errors in hfsplus_free_extents()
  hfsplus: avoid crash on failed block map free
  kcmp: include linux/ptrace.h
  drivers/rtc/rtc-imxdi.c: must include <linux/spinlock.h>
  mm: cma: WARN if freed memory is still in use
  exec: do not leave bprm->interp on stack
  ...
2012-12-20 20:00:43 -08:00
Linus Torvalds
54d46ea993 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "sigaltstack infrastructure + conversion for x86, alpha and um,
  COMPAT_SYSCALL_DEFINE infrastructure.

  Note that there are several conflicts between "unify
  SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
  resolution is trivial - just remove definitions of SS_ONSTACK and
  SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and
  include/uapi/linux/signal.h contains the unified variant."

Fixed up conflicts as per Al.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to generic sigaltstack
  new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
  generic compat_sys_sigaltstack()
  introduce generic sys_sigaltstack(), switch x86 and um to it
  new helper: compat_user_stack_pointer()
  new helper: restore_altstack()
  unify SS_ONSTACK/SS_DISABLE definitions
  new helper: current_user_stack_pointer()
  missing user_stack_pointer() instances
  Bury the conditionals from kernel_thread/kernel_execve series
  COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20 18:05:28 -08:00
David Howells
cfde819088 keys: use keyring_alloc() to create module signing keyring
Use keyring_alloc() to create special keyrings now that it has
a permissions parameter rather than using key_alloc() +
key_instantiate_and_link().

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 17:40:21 -08:00
Cyrill Gorcunov
44fd07e989 kcmp: include linux/ptrace.h
This makes it compile on s390. After all the ptrace_may_access
(which we use this file) is declared exactly in linux/ptrace.h.

This is preparatory work to wire this syscall up on all archs.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 17:40:19 -08:00
Hugh Dickins
2832bc19f6 sched: numa: ksm: fix oops in task_numa_placment()
task_numa_placement() oopsed on NULL p->mm when task_numa_fault() got
called in the handling of break_ksm() for ksmd.  That might be a
peculiar case, which perhaps KSM could takes steps to avoid? but it's
more robust if task_numa_placement() allows for such a possibility.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 07:06:56 -08:00
Linus Torvalds
7005cd3970 Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random updates from Ted Ts'o:
 "A few /dev/random improvements for the v3.8 merge window."

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  random: Mix cputime from each thread that exits to the pool
  random: prime last_data value per fips requirements
  random: fix debug format strings
  random: make it possible to enable debugging without rebuild
2012-12-19 20:23:37 -08:00
Al Viro
c40702c49f new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
note that they are relying on access_ok() already checked by caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:41 -05:00
Al Viro
9026843952 generic compat_sys_sigaltstack()
Again, conditional on CONFIG_GENERIC_SIGALTSTACK

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:41 -05:00
Al Viro
6bf9adfc90 introduce generic sys_sigaltstack(), switch x86 and um to it
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not
select it are completely unaffected

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro
5c49574ffd new helper: restore_altstack()
to be used by rt_sigreturn instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro
ae903caae2 Bury the conditionals from kernel_thread/kernel_execve series
All architectures have
	CONFIG_GENERIC_KERNEL_THREAD
	CONFIG_GENERIC_KERNEL_EXECVE
	__ARCH_WANT_SYS_EXECVE
None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
Kill the conditionals and make both callers use do_execve().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:38 -05:00
Bjørn Mork
3935e89505 watchdog: Fix disable/enable regression
Commit 8d4516904b ("watchdog: Fix CPU hotplug regression") causes an
oops or hard lockup when doing

 echo 0 > /proc/sys/kernel/nmi_watchdog
 echo 1 > /proc/sys/kernel/nmi_watchdog

and the kernel is booted with nmi_watchdog=1 (default)

Running laptop-mode-tools and disconnecting/connecting AC power will
cause this to trigger, making it a common failure scenario on laptops.

Instead of bailing out of watchdog_disable() when !watchdog_enabled we
can initialize the hrtimer regardless of watchdog_enabled status.  This
makes it safe to call watchdog_disable() in the nmi_watchdog=0 case,
without the negative effect on the enabled => disabled => enabled case.

All these tests pass with this patch:
- nmi_watchdog=1
  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 1 > /proc/sys/kernel/nmi_watchdog

- nmi_watchdog=0
  echo 0 > /sys/devices/system/cpu/cpu1/online

- nmi_watchdog=0
  echo mem > /sys/power/state

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661

Cc: <stable@vger.kernel.org> # v3.7
Cc: Norbert Warmuth <nwarmuth@t-online.de>
Cc: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-19 12:10:33 -08:00
Linus Torvalds
7a684c452e Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
 "Nothing all that exciting; a new module-from-fd syscall for those who
  want to verify the source of the module (ChromeOS) and/or use standard
  IMA on it or other security hooks."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Fix kbuild output when using default extra_certificates
  MODSIGN: Avoid using .incbin in C source
  modules: don't hand 0 to vmalloc.
  module: Remove a extra null character at the top of module->strtab.
  ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
  ASN.1: Define indefinite length marker constant
  moduleparam: use __UNIQUE_ID()
  __UNIQUE_ID()
  MODSIGN: Add modules_sign make target
  powerpc: add finit_module syscall.
  ima: support new kernel module syscall
  add finit_module syscall to asm-generic
  ARM: add finit_module syscall to ARM
  security: introduce kernel_module_from_file hook
  module: add flags arg to sys_finit_module()
  module: add syscall to load module from fd
2012-12-19 07:55:08 -08:00
Linus Torvalds
673ab8783b Merge branch 'akpm' (more patches from Andrew)
Merge patches from Andrew Morton:
 "Most of the rest of MM, plus a few dribs and drabs.

  I still have quite a few irritating patches left around: ones with
  dubious testing results, lack of review, ones which should have gone
  via maintainer trees but the maintainers are slack, etc.

  I need to be more activist in getting these things wrapped up outside
  the merge window, but they're such a PITA."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits)
  mm/vmscan.c: avoid possible deadlock caused by too_many_isolated()
  vmscan: comment too_many_isolated()
  mm/kmemleak.c: remove obsolete simple_strtoul
  mm/memory_hotplug.c: improve comments
  mm/hugetlb: create hugetlb cgroup file in hugetlb_init
  mm/mprotect.c: coding-style cleanups
  Documentation: ABI: /sys/devices/system/node/
  slub: drop mutex before deleting sysfs entry
  memcg: add comments clarifying aspects of cache attribute propagation
  kmem: add slab-specific documentation about the kmem controller
  slub: slub-specific propagation changes
  slab: propagate tunable values
  memcg: aggregate memcg cache values in slabinfo
  memcg/sl[au]b: shrink dead caches
  memcg/sl[au]b: track all the memcg children of a kmem_cache
  memcg: destroy memcg caches
  sl[au]b: allocate objects from memcg cache
  sl[au]b: always get the cache from its page in kmem_cache_free()
  memcg: skip memcg kmem allocations in specified code regions
  memcg: infrastructure to match an allocation to the right cache
  ...
2012-12-18 15:08:12 -08:00
Glauber Costa
2ad306b17c fork: protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombs
Because those architectures will draw their stacks directly from the page
allocator, rather than the slab cache, we can directly pass __GFP_KMEMCG
flag, and issue the corresponding free_pages.

This code path is taken when the architecture doesn't define
CONFIG_ARCH_THREAD_INFO_ALLOCATOR (only ia64 seems to), and has
THREAD_SIZE >= PAGE_SIZE.  Luckily, most - if not all - of the remaining
architectures fall in this category.

This will guarantee that every stack page is accounted to the memcg the
process currently lives on, and will have the allocations to fail if they
go over limit.

For the time being, I am defining a new variant of THREADINFO_GFP, not to
mess with the other path.  Once the slab is also tracked by memcg, we can
get rid of that flag.

Tested to successfully protect against :(){ :|:& };:

Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Frederic Weisbecker <fweisbec@redhat.com>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18 15:02:13 -08:00
Glauber Costa
50bdd430c2 res_counter: return amount of charges after res_counter_uncharge()
It is useful to know how many charges are still left after a call to
res_counter_uncharge.  While it is possible to issue a res_counter_read
after uncharge, this can be racy.

If we need, for instance, to take some action when the counters drop down
to 0, only one of the callers should see it.  This is the same semantics
as the atomic variables in the kernel.

Since the current return value is void, we don't need to worry about
anything breaking due to this change: nobody relied on that, and only
users appearing from now on will be checking this value.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18 15:02:12 -08:00
Alan Cox
19af395d7c irq: tsk->comm is an array
The array check is useless so remove it.

[akpm@linux-foundation.org: remove comment, per David]
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18 15:02:11 -08:00
Linus Torvalds
758338e960 Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull minor tracing updates and fixes from Steven Rostedt:
 "It seems that one of my old pull requests have slipped through.

  The changes are contained to just the files that I maintain, and are
  changes from others that I told I would get into this merge window.

  They have already been in linux-next for several weeks, and should be
  well tested."

* 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Remove unnecessary WARN_ONCE's from tracing_buffers_splice_read
  tracing: Remove unneeded checks from the stack tracer
  tracing: Add a resize function to make one buffer equivalent to another buffer
2012-12-18 12:28:39 -08:00
Linus Torvalds
a2faf2fc53 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull (again) user namespace infrastructure changes from Eric Biederman:
 "Those bugs, those darn embarrasing bugs just want don't want to get
  fixed.

  Linus I just updated my mirror of your kernel.org tree and it appears
  you successfully pulled everything except the last 4 commits that fix
  those embarrasing bugs.

  When you get a chance can you please repull my branch"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns: Fix typo in description of the limitation of userns_install
  userns: Add a more complete capability subset test to commit_creds
  userns: Require CAP_SYS_ADMIN for most uses of setns.
  Fix cap_capable to only allow owners in the parent user namespace to have caps.
2012-12-18 10:55:28 -08:00
Linus Torvalds
848b81415c Merge branch 'akpm' (Andrew's patch-bomb)
Merge misc patches from Andrew Morton:
 "Incoming:

   - lots of misc stuff

   - backlight tree updates

   - lib/ updates

   - Oleg's percpu-rwsem changes

   - checkpatch

   - rtc

   - aoe

   - more checkpoint/restart support

  I still have a pile of MM stuff pending - Pekka should be merging
  later today after which that is good to go.  A number of other things
  are twiddling thumbs awaiting maintainer merges."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits)
  scatterlist: don't BUG when we can trivially return a proper error.
  docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output
  fs, fanotify: add @mflags field to fanotify output
  docs: add documentation about /proc/<pid>/fdinfo/<fd> output
  fs, notify: add procfs fdinfo helper
  fs, exportfs: add exportfs_encode_inode_fh() helper
  fs, exportfs: escape nil dereference if no s_export_op present
  fs, epoll: add procfs fdinfo helper
  fs, eventfd: add procfs fdinfo helper
  procfs: add ability to plug in auxiliary fdinfo providers
  tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test
  breakpoint selftests: print failure status instead of cause make error
  kcmp selftests: print fail status instead of cause make error
  kcmp selftests: make run_tests fix
  mem-hotplug selftests: print failure status instead of cause make error
  cpu-hotplug selftests: print failure status instead of cause make error
  mqueue selftests: print failure status instead of cause make error
  vm selftests: print failure status instead of cause make error
  ubifs: use prandom_bytes
  mtd: nandsim: use prandom_bytes
  ...
2012-12-17 20:58:12 -08:00