Commit Graph

24391 Commits

Author SHA1 Message Date
Andrea Arcangeli
78f11a2557 mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups
The huge_memory.c THP page fault was allowed to run if vm_ops was null
(which would succeed for /dev/zero MAP_PRIVATE, as the f_op->mmap wouldn't
setup a special vma->vm_ops and it would fallback to regular anonymous
memory) but other THP logics weren't fully activated for vmas with vm_file
not NULL (/dev/zero has a not NULL vma->vm_file).

So this removes the vm_file checks so that /dev/zero also can safely use
THP (the other albeit safer approach to fix this bug would have been to
prevent the THP initial page fault to run if vm_file was set).

After removing the vm_file checks, this also makes huge_memory.c stricter
in khugepaged for the DEBUG_VM=y case.  It doesn't replace the vm_file
check with a is_pfn_mapping check (but it keeps checking for VM_PFNMAP
under VM_BUG_ON) because for a is_cow_mapping() mapping VM_PFNMAP should
only be allowed to exist before the first page fault, and in turn when
vma->anon_vma is null (so preventing khugepaged registration).  So I tend
to think the previous comment saying if vm_file was set, VM_PFNMAP might
have been set and we could still be registered in khugepaged (despite
anon_vma was not NULL to be registered in khugepaged) was too paranoid.
The is_linear_pfn_mapping check is also I think superfluous (as described
by comment) but under DEBUG_VM it is safe to stay.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=33682

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Caspar Zhang <bugs@casparzhang.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>		[2.6.38.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-28 11:28:20 -07:00
Linus Torvalds
7fcaa9aaea Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (42 commits)
  [media] media: vb2: correct queue initialization order
  [media] media: vb2: fix incorrect v4l2_buffer->flags handling
  [media] s5p-fimc: Add support for the buffer timestamps and sequence
  [media] s5p-fimc: Fix bytesperline and plane payload setup
  [media] s5p-fimc: Do not allow changing format after REQBUFS
  [media] s5p-fimc: Fix FIMC3 pixel limits on Exynos4
  [media] tda18271: update tda18271c2_rf_cal as per NXP's rev.04 datasheet
  [media] tda18271: update tda18271_rf_band as per NXP's rev.04 datasheet
  [media] tda18271: fix bad calculation of main post divider byte
  [media] tda18271: prog_cal and prog_tab variables should be s32, not u8
  [media] tda18271: fix calculation bug in tda18271_rf_tracking_filters_init
  [media] omap3isp: queue: Don't corrupt buf->npages when get_user_pages() fails
  [media] v4l: Don't register media entities for subdev device nodes
  [media] omap3isp: Don't increment node entity use count when poweron fails
  [media] omap3isp: lane shifter support
  [media] omap3isp: ccdc: support Y10/12, 8-bit bayer fmts
  [media] media: add missing 8-bit bayer formats and Y12
  [media] v4l: add V4L2_PIX_FMT_Y12 format
  cx23885: Fix stv0367 Kconfig dependency
  [media] omap3isp: Use isp xclk defines
  ...

Fix up trivial conflict (spelink errurs) in drivers/media/video/omap3isp/isp.c
2011-04-27 15:17:52 -07:00
Christoph Hellwig
1879fd6a26 add hlist_bl_lock/unlock helpers
Now that the whole dcache_hash_bucket crap is gone, go all the way and
also remove the weird locking layering violations for locking the hash
buckets.  Add hlist_bl_lock/unlock helpers to move the locking into the
list abstraction instead of requiring each caller to open code it.
After all allowing for the bit locks is the whole point of these helpers
over the plain hlist variant.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-25 18:14:10 -07:00
Linus Torvalds
3dd2ee4824 bit_spinlock: don't play preemption games inside the busy loop
When we are waiting for the bit-lock to be released, and are looping
over the 'cpu_relax()' should not be doing anything else - otherwise we
miss the point of trying to do the whole 'cpu_relax()'.

Do the preemption enable/disable around the loop, rather than inside of
it.

Noticed when I was looking at the code generation for the dcache
__d_drop usage, and the code just looked very odd.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-25 18:10:58 -07:00
Linus Torvalds
5dd12af05c Merge branch 'dcache-cleanup'
* dcache-cleanup:
  vfs: get rid of insane dentry hashing rules
2011-04-24 08:51:15 -07:00
Tejun Heo
ae01b2493c libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65
NVIDIA mcp65 familiy of controllers cause command timeouts when DIPM
is used.  Implement ATA_FLAG_NO_DIPM and apply it.

This problem was reported by Stefan Bader in the following thread.

 http://thread.gmane.org/gmane.linux.ide/48841

stable: applicable to 2.6.37 and 38.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24 11:32:16 -04:00
Tejun Heo
3f7ac1d667 libata: Kill unused ATA_DFLAG_{H|D}IPM flags
ATA_DFLAG_{H|D}IPM flags are no longer used.  Kill them.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24 11:32:03 -04:00
Linus Torvalds
dea3667bc3 vfs: get rid of insane dentry hashing rules
The dentry hashing rules have been really quite complicated for a long
while, in odd ways.  That made functions like __d_drop() very fragile
and non-obvious.

In particular, whether a dentry was hashed or not was indicated with an
explicit DCACHE_UNHASHED bit.  That's despite the fact that the hash
abstraction that the dentries use actually have a 'is this entry hashed
or not' model (which is a simple test of the 'pprev' pointer).

The reason that was done is because we used the normal 'is this entry
unhashed' model to mark whether the dentry had _ever_ been hashed in the
dentry hash tables, and that logic goes back many years (commit
b3423415fb: "dcache: avoid RCU for never-hashed dentries").

That, in turn, meant that __d_drop had totally different unhashing logic
for the dentry hash table case and for the anonymous dcache case,
because in order to use the "is this dentry hashed" logic as a flag for
whether it had ever been on the RCU hash table, we had to unhash such a
dentry differently so that we'd never think that it wasn't 'unhashed'
and wouldn't be free'd correctly.

That's just insane.  It made the logic really hard to follow, when there
were two different kinds of "unhashed" states, and one of them (the one
that used "list_bl_unhashed()") really had nothing at all to do with
being unhashed per se, but with a very subtle lifetime rule instead.

So turn all of it around, and make it logical.

Instead of having a DENTRY_UNHASHED bit in d_flags to indicate whether
the dentry is on the hash chains or not, use the hash chain unhashed
logic for that.  Suddenly "d_unhashed()" just uses "list_bl_unhashed()",
and everything makes sense.

And for the lifetime rule, just use an explicit DENTRY_RCUACCEES bit.
If we ever insert the dentry into the dentry hash table so that it is
visible to RCU lookup, we mark it DENTRY_RCUACCESS to show that it now
needs the RCU lifetime rules.  Now suddently that test at dentry free
time makes sense too.

And because unhashing now is sane and doesn't depend on where the dentry
got unhashed from (because the dentry hash chain details doesn't have
some subtle side effects), we can re-unify the __d_drop() logic and use
common code for the unhashing.

Also fix one more open-coded hash chain bit_spin_lock() that I missed in
the previous chain locking cleanup commit.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-24 07:58:46 -07:00
Andi Kleen
8c9e80ed27 SECURITY: Move exec_permission RCU checks into security modules
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.

Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-22 16:17:29 -07:00
Linus Torvalds
73aa86825f Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Remove the extra check in queue_requests_store
  block, blk-sysfs: Fix an err return path in blk_register_queue()
  block: remove stale kerneldoc member from __blk_run_queue()
  block: get rid of QUEUE_FLAG_REENTER
  cfq-iosched: read_lock() does not always imply rcu_read_lock()
  block: kill blk_flush_plug_list() export
2011-04-20 09:48:52 -07:00
Linus Torvalds
6cf544377f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits)
  netfilter: ipset: Fix the order of listing of sets
  ip6_pol_route panic: Do not allow VLAN on loopback
  bnx2x: Fix port identification problem
  r8169: add Realtek as maintainer.
  ip: ip_options_compile() resilient to NULL skb route
  bna: fix memory leak during RX path cleanup
  bna: fix for clean fw re-initialization
  usbnet: Fix up 'FLAG_POINTTOPOINT' and 'FLAG_MULTI_PACKET' overlaps.
  iwlegacy: fix tx_power initialization
  Revert "tcp: disallow bind() to reuse addr/port"
  qlcnic: limit skb frags for non tso packet
  net: can: mscan: fix build breakage in mpc5xxx_can
  netfilter: ipset: set match and SET target fixes
  netfilter: ipset: bitmap:ip,mac type requires "src" for MAC
  sctp: fix oops while removed transport still using as retran path
  sctp: fix oops when updating retransmit path with DEBUG on
  net: Disable NETIF_F_TSO_ECN when TSO is disabled
  net: Disable all TSO features when SG is disabled
  sfc: Use rmb() to ensure reads occur in order
  ieee802154: Remove hacked CFLAGS in net/ieee802154/Makefile
  ...
2011-04-19 15:16:41 -07:00
Linus Torvalds
4ae0ff16ef Merge branch 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  RTC: rtc-omap: Fix a leak of the IRQ during init failure
  posix clocks: Replace mutex with reader/writer semaphore
2011-04-19 10:56:46 -07:00
Michael Jones
cbbc69a4a9 [media] media: add missing 8-bit bayer formats and Y12
8-bit SGBRG and SRGGB media bus formats are missing, as well as the
12-bit grey format. Add them.

Signed-off-by: Michael Jones <michael.jones@matrix-vision.de>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-04-19 11:23:31 -03:00
Michael Jones
d924de09ca [media] v4l: add V4L2_PIX_FMT_Y12 format
Y12 is a grey-scale format with a depth of 12 bits per pixel stored in
16-bit words.

Signed-off-by: Michael Jones <michael.jones@matrix-vision.de>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-04-19 11:20:56 -03:00
Jens Axboe
c21e6beba8 block: get rid of QUEUE_FLAG_REENTER
We are currently using this flag to check whether it's safe
to call into ->request_fn(). If it is set, we punt to kblockd.
But we get a lot of false positives and excessive punts to
kblockd, which hurts performance.

The only real abuser of this infrastructure is SCSI. So export
the async queue run and convert SCSI over to use that. There's
room for improvement in that SCSI need not always use the async
call, but this fixes our performance issue and they can fix that
up in due time.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-19 13:32:46 +02:00
Linus Torvalds
96fd2d57b8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xen-kbdfront - fix mouse getting stuck after save/restore
  Input: estimate number of events per packet
  Input: evdev - indicate buffer overrun with SYN_DROPPED
  Input: document event types and codes and their intended use
  Input: add KEY_IMAGES specifically for AL Image Browser
  Input: twl4030_keypad - fix potential NULL dereference in twl4030_kp_probe()
  Input: h3600_ts - fix error handling at connect
  Input: twl4030_keypad - avoid potential NULL-pointer dereference
2011-04-18 13:29:03 -07:00
Linus Torvalds
8a83f33100 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: add blk_run_queue_async
  block: blk_delay_queue() should use kblockd workqueue
  md: fix up raid1/raid10 unplugging.
  md: incorporate new plugging into raid5.
  md: provide generic support for handling unplug callbacks.
  md - remove old plugging code.
  md/dm - remove remains of plug_fn callback.
  md: use new plugging interface for RAID IO.
  block: drop queue lock before calling __blk_run_queue() for kblockd punt
  Revert "block: add callback function for unplug notification"
  block: Enhance new plugging support to support general callbacks
2011-04-18 13:21:18 -07:00
Linus Torvalds
c78193e9c7 next_pidmap: fix overflow condition
next_pidmap() just quietly accepted whatever 'last' pid that was passed
in, which is not all that safe when one of the users is /proc.

Admittedly the proc code should do some sanity checking on the range
(and that will be the next commit), but that doesn't mean that the
helper functions should just do that pidmap pointer arithmetic without
checking the range of its arguments.

So clamp 'last' to PID_MAX_LIMIT.  The fact that we then do "last+1"
doesn't really matter, the for-loop does check against the end of the
pidmap array properly (it's only the actual pointer arithmetic overflow
case we need to worry about, and going one bit beyond isn't going to
overflow).

[ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ]

Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Analyzed-by: Robert Święcki <robert@swiecki.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-18 10:35:30 -07:00
Jeff Brown
80b4895aa4 Input: estimate number of events per packet
Calculate a default based on the number of ABS axes, REL axes,
and MT slots for the device during input device registration.

Signed-off-by: Jeff Brown <jeffbrown@android.com>
Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-04-18 10:15:43 -07:00
Christoph Hellwig
24ecfbe27f block: add blk_run_queue_async
Instead of overloading __blk_run_queue to force an offload to kblockd
add a new blk_run_queue_async helper to do it explicitly.  I've kept
the blk_queue_stopped check for now, but I suspect it's not needed
as the check we do when the workqueue items runs should be enough.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-18 11:41:33 +02:00
Richard Cochran
1791f88143 posix clocks: Replace mutex with reader/writer semaphore
A dynamic posix clock is protected from asynchronous removal by a mutex.
However, using a mutex has the unwanted effect that a long running clock
operation in one process will unnecessarily block other processes.

For example, one process might call read() to get an external time stamp
coming in at one pulse per second. A second process calling clock_gettime
would have to wait for almost a whole second.

This patch fixes the issue by using a reader/writer semaphore instead of
a mutex.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/%3C20110330132421.GA31771%40riccoc20.at.omicron.at%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-04-18 10:39:38 +02:00
NeilBrown
af1db72d8b md/dm - remove remains of plug_fn callback.
Now that unplugging is done differently, the unplug_fn callback is
never called, so it can be completely discarded.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-04-18 18:25:41 +10:00
Jens Axboe
b4cb290e0a Revert "block: add callback function for unplug notification"
MD can't use this since it really requires us to be able to
keep more than a single piece of state for the unplug. Commit
048c9374 added the required support for MD, so get rid of this
now unused code.

This reverts commit f75664570d.

Conflicts:

	block/blk-core.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-18 09:54:05 +02:00
NeilBrown
048c9374a7 block: Enhance new plugging support to support general callbacks
md/raid requires an unplug callback, but as it does not uses
requests the current code cannot provide one.

So allow arbitrary callbacks to be attached to the blk_plug.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-18 09:52:22 +02:00
Linus Torvalds
d733ed6c34 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: make unplug timer trace event correspond to the schedule() unplug
  block: let io_schedule() flush the plug inline
2011-04-16 10:33:41 -07:00