Add an FS-Cache helper to bulk uncache pages on an inode. This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.
This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages. This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress. Fix it by uncaching mapped pages when we disable the inode
cookie.
This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.
------------[ cut here ]------------
kernel BUG at fs/cachefiles/namei.c:201!
invalid opcode: 0000 [#1] SMP
Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
RSP: 0018:ffff88002ce6dd00 EFLAGS: 00010282
RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
FS: 00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
Stack:
0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
Call Trace:
cachefiles_lookup_object+0x78/0xd4 [cachefiles]
fscache_lookup_object+0x131/0x16d [fscache]
fscache_object_work_func+0x1bc/0x669 [fscache]
process_one_work+0x186/0x298
worker_thread+0xda/0x15d
kthread+0x84/0x8c
kernel_thread_helper+0x4/0x10
RIP cachefiles_walk_to_object+0x436/0x745 [cachefiles]
---[ end trace 1d481c9af1804caa ]---
I tested the uncaching by the following means:
(1) Create a big file on my NFS server (104857600 bytes).
(2) Read the file into the cache with md5sum on the NFS client. Look in
/proc/fs/fscache/stats:
Pages : mrk=25601 unc=0
(3) Open the file for read/write ("bash 5<>/warthog/bigfile"). Look in proc
again:
Pages : mrk=25601 unc=25601
Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The list of available general purpose memory allocators in
Documentation/CodingStyle chapter 14 is incomplete. This patch adds
the missing vzalloc() to the list.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (k10temp) Update documentation for Fam12h
hwmon-vid: Fix typo in VIA CPU name
hwmon: (f71882fg) Add support for the F71869A
hwmon: Use <> rather than () around my e-mail address
hwmon: (emc6w201) Properly handle all errors
Add some CPU series IDs and links to the Fam12h datasheets.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The F71869A is almost the same as the F71869F/E, except that it has
the normal number of temp and pwm zones for a F71882FG derived chip,
rather then the limited number of the F71869F/E.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Max Baldwin <archerseven@gmail.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Commit e1866b33b1 (PM / Runtime: Rework
runtime PM handling during driver removal) forgot to update the
documentation in Documentation/power/runtime_pm.txt to match the new
code in drivers/base/dd.c. Update that documentation to match the
code it describes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Replace reference to pm_runtime_idle_sync() in the driver core with
pm_runtime_put_sync() which is used in the code.
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
MAINTAINERS: add myself as maintainer of USB/IP
usb: r8a66597-hcd: fix cannot detect low/full speed device
USB: ehci-ath79: fix a NULL pointer dereference
USB: Add new FT232H chip to drivers/usb/serial/ftdi_sio.c
usb/isp1760: Fix bug preventing the unlinking of control urbs
USB: Fix up URB error codes to reflect implementation.
xhci: Always set urb->status to zero for isoc endpoints.
xhci: Add reset on resume quirk for asrock p67 host
xHCI 1.0: Incompatible Device Error
USB: don't let errors prevent system sleep
USB: don't let the hub driver prevent system sleep
USB: change maintainership of ohci-hcd and ehci-hcd
xHCI 1.0: Force Stopped Event(FSE)
xhci: Don't warn about zeroed bMaxBurst descriptor field.
USB: Free bandwidth when usb_disable_device is called.
xhci: Reject double add of active endpoints.
USB: TI 3410/5052 USB Serial Driver: Fix mem leak when firmware is too big.
usb: musb: gadget: clear TXPKTRDY flag when set FLUSHFIFO
usb: musb: host: compare status for negative error values
Commit 4d27e9dcff (PM: Make power
domain callbacks take precedence over subsystem ones) forgot to
update the device power management documentation to take changes
made by it into account. Correct that mistake.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The part of Documentation/power/devices.txt regarding sysdevs is not
valid any more after commit 2e711c04db
(PM: Remove sysdev suspend, resume and shutdown operations), so
remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Commit e866500247 (PM: Allow
pm_runtime_suspend() to succeed during system suspend) removed usage
count increment across system PM.
Update doc to reflect this.
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Move RCU_BOOST #ifdefs to header file
rcu: use softirq instead of kthreads except when RCU_BOOST=y
rcu: Use softirq to address performance regression
rcu: Simplify curing of load woes
Documentation/usb/error-codes.txt mentions that urb->status can be set to
-EXDEV, if the isochronous transfer was not fully completed. However, in
practice, EHCI, UHCI, and OHCI all only set -EXDEV in the individual frame
status, never in the URB status. Those host controller actually always
pass in a zero status to usb_hcd_giveback_urb, and rely on the core to set
the appropriate status value.
The xHCI driver ran into issues with the uvcvideo driver when it tried to
set -EXDEV in urb->status, because the driver refused to submit URBs, and
the userspace camera application's video froze.
Clean up the documentation to reflect the actual implementation.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
According to commit 676db4af04 ("cgroupfs: create /sys/fs/cgroup to
mount cgroupfs on") the canonical mountpoint for the cgroup filesystem
is /sys/fs/cgroup. Hence, this should be used in the documentation.
Signed-off-by: Jörg Sommer <joerg@alea.gnuu.de>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of listing the architectures that are supported by
kmemleak in Documentation/kmemleak.txt, just refer people to
the list of supported architecutures in lib/Kconfig.debug so
that Documentation/kmemleak.txt does not need more updates
for this.
Signed-off-by: Maxin B. John <maxin.john@gmail.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.
The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread. A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.
Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler. Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling. (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime. And "the meantime" might well be forever.)
This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work. RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case. This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* 'for-linus' of git://neil.brown.name/md:
md/raid5: remove unusual use of bio_iovec_idx()
md/raid5: fix FUA request handling in ops_run_io()
md/raid5: fix raid5_set_bi_hw_segments
md:Documentation/md.txt - fix typo
md/bitmap: remove unused fields from struct bitmap
md/bitmap: use proper accessor macro
md: check ->hot_remove_disk when removing disk
md: Using poll /proc/mdstat can monitor the events of adding a spare disks
MD: use is_power_of_2 macro
MD: raid5 do not set fullsync
MD: support initial bitmap creation in-kernel
MD: add sync_super to mddev_t struct
MD: raid1 changes to allow use by device mapper
MD: move thread wakeups into resume
MD: possible typo
MD: no sync IO while suspended
MD: no integrity register if no gendisk