.set_page_dirty() is one of those a_ops that defaults to the
buffer implementation when not set. Therefore provide a dummy
function to make it do nothing.
(Uncovered by perfcounters fd's which can now be writable-mmap-ed.)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Authentication error abort codes should be translated to appropriate
Linux error codes, rather than all being translated to EREMOTEIO - which
indicates that the server had internal problems.
Additionally, a server shouldn't be marked unavailable and the next
server tried if an authentication error occurs. This will quickly make
all the servers unavailable to the client. Instead the error should be
returned straight to the user.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* akpm: (182 commits)
fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
fbdev: *bfin*: fix __dev{init,exit} markings
fbdev: *bfin*: drop unnecessary calls to memset
fbdev: bfin-t350mcqb-fb: drop unused local variables
fbdev: blackfin has __raw I/O accessors, so use them in fb.h
fbdev: s1d13xxxfb: add accelerated bitblt functions
tcx: use standard fields for framebuffer physical address and length
fbdev: add support for handoff from firmware to hw framebuffers
intelfb: fix a bug when changing video timing
fbdev: use framebuffer_release() for freeing fb_info structures
radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
s3c-fb: CPUFREQ frequency scaling support
s3c-fb: fix resource releasing on error during probing
carminefb: fix possible access beyond end of carmine_modedb[]
acornfb: remove fb_mmap function
mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
mb862xxfb: restrict compliation of platform driver to PPC
Samsung SoC Framebuffer driver: add Alpha Channel support
atmel-lcdc: fix pixclock upper bound detection
offb: use framebuffer_alloc() to allocate fb_info struct
...
Manually fix up conflicts due to kmemcheck in mm/slab.c
CONFIG_FILE_LOCKING should not depend on CONFIG_BLOCK.
This makes it possible to run complete systems out of a CONFIG_BLOCK=n
initramfs on current kernels again (this last worked on 2.6.27.*).
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
put_cpu_no_resched() is an optimization of put_cpu() which unfortunately
can cause high latencies.
The nfs iostats code uses put_cpu_no_resched() in a code sequence where a
reschedule request caused by an interrupt between the get_cpu() and the
put_cpu_no_resched() can delay the reschedule for at least HZ.
The other users of put_cpu_no_resched() optimize correctly in interrupt
code, but there is no real harm in using the put_cpu() function which is
an alias for preempt_enable(). The extra check of the preemmpt count is
not as critical as the potential source of missing a reschedule.
Debugged in the preempt-rt tree and verified in mainline.
Impact: remove a high latency source
[akpm@linux-foundation.org: build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After introduction of keyed wakeups Davide Libenzi did on epoll, we are
able to avoid spurious wakeups in poll()/select() code too.
For example, typical use of poll()/select() is to wait for incoming
network frames on many sockets. But TX completion for UDP/TCP frames call
sock_wfree() which in turn schedules thread.
When scheduled, thread does a full scan of all polled fds and can sleep
again, because nothing is really available. If number of fds is large,
this cause significant load.
This patch makes select()/poll() aware of keyed wakeups and useless
wakeups are avoided. This reduces number of context switches by about 50%
on some setups, and work performed by sofirq handlers.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1) I_FREEING tests should be coupled with I_CLEAR
The two I_FREEING tests are racy because clear_inode() can set i_state to
I_CLEAR between the clear of I_SYNC and the test of I_FREEING.
2) skip I_WILL_FREE inodes in generic_sync_sb_inodes() to avoid possible
races with generic_forget_inode()
generic_forget_inode() sets I_WILL_FREE call writeback on its own, so
generic_sync_sb_inodes() shall not try to step in and create possible races:
generic_forget_inode
inode->i_state |= I_WILL_FREE;
spin_unlock(&inode_lock);
generic_sync_sb_inodes()
spin_lock(&inode_lock);
__iget(inode);
__writeback_single_inode
// see non zero i_count
may WARN here ==> WARN_ON(inode->i_state & I_WILL_FREE);
spin_unlock(&inode_lock);
may call generic_forget_inode again ==> iput(inode);
The above race and warning didn't turn up because writeback_inodes() holds
the s_umount lock, so generic_forget_inode() finds MS_ACTIVE and returns
early. But we are not sure the UBIFS calls and future callers will
guarantee that. So skip I_WILL_FREE inodes for the sake of safety.
Cc: Eric Sandeen <sandeen@sandeen.net>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The per-task oom_adj value is a characteristic of its mm more than the
task itself since it's not possible to oom kill any thread that shares the
mm. If a task were to be killed while attached to an mm that could not be
freed because another thread were set to OOM_DISABLE, it would have
needlessly been terminated since there is no potential for future memory
freeing.
This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct mm_struct. This requires task_lock() on a
task to check its oom_adj value to protect against exec, but it's already
necessary to take the lock when dereferencing the mm to find the total VM
size for the badness heuristic.
This fixes a livelock if the oom killer chooses a task and another thread
sharing the same memory has an oom_adj value of OOM_DISABLE. This occurs
because oom_kill_task() repeatedly returns 1 and refuses to kill the
chosen task while select_bad_process() will repeatedly choose the same
task during the next retry.
Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
oom_kill_task() to check for threads sharing the same memory will be
removed in the next patch in this series where it will no longer be
necessary.
Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
these threads are immune from oom killing already. They simply report an
oom_adj value of OOM_DISABLE.
Cc: Nick Piggin <npiggin@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Export all page flags faithfully in /proc/kpageflags.
11. KPF_MMAP (pseudo flag) memory mapped page
12. KPF_ANON (pseudo flag) memory mapped page (anonymous)
13. KPF_SWAPCACHE page is in swap cache
14. KPF_SWAPBACKED page is swap/RAM backed
15. KPF_COMPOUND_HEAD (*)
16. KPF_COMPOUND_TAIL (*)
17. KPF_HUGE hugeTLB pages
18. KPF_UNEVICTABLE page is in the unevictable LRU list
19. KPF_HWPOISON(TBD) hardware detected corruption
20. KPF_NOPAGE (pseudo flag) no page frame at the address
32-39. more obscure flags for kernel developers
(*) For compound pages, exporting _both_ head/tail info enables
users to tell where a compound page starts/ends, and its order.
The accompanying page-types tool will handle the details like decoupling
overloaded flags and hiding obscure flags to normal users.
Thanks to KOSAKI and Andi for their valuable recommendations!
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
send_sigio_to_task() reads fown->signum several times, we can race with
F_SETSIG which changes ->signum lockless. In theory, this can fool
security checks or we can call group_send_sig_info() with the wrong
->si_signo which does not match "int sig".
Change the code to cache ->signum.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shift current_cred() from __f_setown() to f_modown(). This reduces
the number of arguments and saves 48 bytes from fs/fcntl.o.
[ Note: this doesn't clear euid/uid when pid is set to NULL. But if
f_owner.pid == NULL we never use f_owner.uid/euid. Otherwise we'd
have a bug anyway: we must not send signals if pid was reset to NULL. ]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
debugfs: use specified mode to possibly mark files read/write only
debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
xen: remove driver_data direct access of struct device from more drivers
usb: gadget: at91_udc: remove driver_data direct access of struct device
uml: remove driver_data direct access of struct device
block/ps3: remove driver_data direct access of struct device
s390: remove driver_data direct access of struct device
parport: remove driver_data direct access of struct device
parisc: remove driver_data direct access of struct device
of_serial: remove driver_data direct access of struct device
mips: remove driver_data direct access of struct device
ipmi: remove driver_data direct access of struct device
infiniband: ehca: remove driver_data direct access of struct device
ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
hvcs: remove driver_data direct access of struct device
xen block: remove driver_data direct access of struct device
thermal: remove driver_data direct access of struct device
scsi: remove driver_data direct access of struct device
pcmcia: remove driver_data direct access of struct device
PCIE: remove driver_data direct access of struct device
...
Manually fix up trivial conflicts due to different direct driver_data
direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
ocfs2/net: Use wait_event() in o2net_send_message_vec()
ocfs2: Adjust rightmost path in ocfs2_add_branch.
ocfs2: fdatasync should skip unimportant metadata writeout
ocfs2: Remove redundant gotos in ocfs2_mount_volume()
ocfs2: Add statistics for the checksum and ecc operations.
ocfs2 patch to track delayed orphan scan timer statistics
ocfs2: timer to queue scan of all orphan slots
ocfs2: Correct ordering of ip_alloc_sem and localloc locks for directories
ocfs2: Fix possible deadlock in quota recovery
ocfs2: Fix possible deadlock with quotas in ocfs2_setattr()
ocfs2: Fix lock inversion in ocfs2_local_read_info()
ocfs2: Fix possible deadlock in ocfs2_global_read_dquot()
ocfs2: update comments in masklog.h
ocfs2: Don't printk the error when listing too many xattrs.
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: remove some includings of blktrace_api.h
mg_disk: seperate mg_disk.h again
block: Introduce helper to reset queue limits to default values
cfq: remove extraneous '\n' in blktrace output
ubifs: register backing_dev_info
btrfs: properly register fs backing device
block: don't overwrite bdi->state after bdi_init() has been run
cfq: cleanup for last_end_request in cfq_data
Commit fec1878fe9 caused a regression in
which contiguous blocks being allocated to the end of an extent were
getting a new extent created. This typically results in files entirely
made up of 1-block extents even though the blocks are contiguous on
disk.
Apparently grub doesn't handle a jfs file being fragmented into too many
extents, since it refuses to boot a kernel from jfs that was created by
the 2.6.30 kernel.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Reported-by: Alex <alevkovich@tut.by>
When porting blktrace to tracepoints, we changed to trace/block.h
for trace prober declarations.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>