Simplify page cache zeroing of segments of pages through 3 functions
zero_user_segments(page, start1, end1, start2, end2)
Zeros two segments of the page. It takes the position where to
start and end the zeroing which avoids length calculations and
makes code clearer.
zero_user_segment(page, start, end)
Same for a single segment.
zero_user(page, start, length)
Length variant for the case where we know the length.
We remove the zero_user_page macro. Issues:
1. Its a macro. Inline functions are preferable.
2. The KM_USER0 macro is only defined for HIGHMEM.
Having to treat this special case everywhere makes the
code needlessly complex. The parameter for zeroing is always
KM_USER0 except in one single case that we open code.
Avoiding KM_USER0 makes a lot of code not having to be dealing
with the special casing for HIGHMEM anymore. Dealing with
kmap is only necessary for HIGHMEM configurations. In those
configurations we use KM_USER0 like we do for a series of other
functions defined in highmem.h.
Since KM_USER0 is depends on HIGHMEM the existing zero_user_page
function could not be a macro. zero_user_* functions introduced
here can be be inline because that constant is not used when these
functions are called.
Also extract the flushing of the caches to be outside of the kmap.
[akpm@linux-foundation.org: fix nfs and ntfs build]
[akpm@linux-foundation.org: fix ntfs build some more]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Do a quick signedness check for block numbers. There are a number of places
where signed integers are used for block numbers, which limits the usable file
system size to 8 TiB. The disk format, excepting a problem which will be
fixed in the following patch, supports file systems up to 16 TiB in size.
This patch cleans up those sites so that we can enable the full usable size.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- remove the following no longer used functions:
- bitmap.c: reiserfs_claim_blocks_to_be_allocated()
- bitmap.c: reiserfs_release_claimed_blocks()
- bitmap.c: reiserfs_can_fit_pages()
- make the following functions static:
- inode.c: restart_transaction()
- journal.c: reiserfs_async_progress_wait()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Vladimir V. Saveliev <vs@namesys.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
currently the export_operation structure and helpers related to it are in
fs.h. fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.
[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes a confusion reiserfs has for a long time.
On release file operation reiserfs used to try to pack file data stored in
last incomplete page of some files into metadata blocks. After packing the
page got cleared with clear_page_dirty. It did not take into account that
the page may be mmaped into other process's address space. Recent
replacement for clear_page_dirty cancel_dirty_page found the confusion with
sanity check that page has to be not mapped.
The patch fixes the confusion by making reiserfs avoid tail packing if an
inode was ever mmapped. reiserfs_mmap and reiserfs_file_release are
serialized with mutex in reiserfs specific inode. reiserfs_mmap locks the
mutex and sets a bit in reiserfs specific inode flags.
reiserfs_file_release checks the bit having the mutex locked. If bit is
set - tail packing is avoided. This eliminates a possibility that mmapped
page gets cancel_page_dirty-ed.
Signed-off-by: Vladimir Saveliev <vs@namesys.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename Reiserfs's struct path to struct treepath to prevent name collision
between it and struct path from fs/namei.c.
Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One of our test team hit a reiserfs_panic while running fsstress tests on
2.6.19-rc1. The message looks like :
REISERFS: panic(device Null superblock):
reiserfs[5676]: assertion !(p->path_length != 1 ) failed at
fs/reiserfs/stree.c:397:reiserfs_check_path: path not properly relsed.
The backtrace looked :
kernel BUG in reiserfs_panic at fs/reiserfs/prints.c:361!
.reiserfs_check_path+0x58/0x74
.reiserfs_get_block+0x1444/0x1508
.__block_prepare_write+0x1c8/0x558
.block_prepare_write+0x34/0x64
.reiserfs_prepare_write+0x118/0x1d0
.generic_file_buffered_write+0x314/0x82c
.__generic_file_aio_write_nolock+0x350/0x3e0
.__generic_file_write_nolock+0x78/0xb0
.generic_file_write+0x60/0xf0
.reiserfs_file_write+0x198/0x2038
.vfs_write+0xd0/0x1b4
.sys_write+0x4c/0x8c
syscall_exit+0x0/0x4
Upon debugging I found that the restart_transaction was not releasing
the path if the th->refcount was > 1.
/*static*/
int restart_transaction(struct reiserfs_transaction_handle *th,
struct inode *inode, struct path *path)
{
[...]
/* we cannot restart while nested */
if (th->t_refcount > 1) { <<- Path is not released in this case!
return 0;
}
pathrelse(path); <<- Path released here.
[...]
This could happen in such a situation :
In reiserfs/inode.c: reiserfs_get_block() ::
if (repeat == NO_DISK_SPACE || repeat == QUOTA_EXCEEDED) {
/* restart the transaction to give the journal a chance to free
** some blocks. releases the path, so we have to go back to
** research if we succeed on the second try
*/
SB_JOURNAL(inode->i_sb)->j_next_async_flush = 1;
-->> retval = restart_transaction(th, inode, &path); <<--
We are supposed to release the path, no matter we succeed or fail. But
if the th->refcount is > 1, the path is still valid. And,
if (retval)
goto failure;
repeat =
_allocate_block(th, block, inode,
&allocated_block_nr, NULL, create);
If the above allocate_block fails with NO_DISK_SPACE or QUOTA_EXCEEDED,
we would have path which is not released.
if (repeat != NO_DISK_SPACE && repeat != QUOTA_EXCEEDED) {
goto research;
}
if (repeat == QUOTA_EXCEEDED)
retval = -EDQUOT;
else
retval = -ENOSPC;
goto failure;
[...]
failure:
[...]
reiserfs_check_path(&path); << Panics here !
Attached here is a patch which could fix the issue.
fix reiserfs/inode.c : restart_transaction() to release the path in all
cases.
The restart_transaction() doesn't release the path when the the journal
handle has a refcount > 1. This would trigger a reiserfs_panic() if we
encounter an -ENOSPC / -EDQUOT in reiserfs_get_block().
Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: <reiserfs-dev@namesys.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This eliminates the i_blksize field from struct inode. Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.
Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.
[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
reiserfs_write_full_page does zero bytes in the file past eof, but it may
call get_block on those buffers as well. On machines where the page size
is larger than the blocksize, this can result in mmaped files incorrectly
growing up to a block boundary during writepage.
The fix is to avoid calling get_block for any blocks that are entirely past
eof
Signed-off-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that get_block() can handle mapping multiple disk blocks, no need to have
->get_blocks(). This patch removes fs specific ->get_blocks() added for DIO
and makes it users use get_block() instead.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The return value of this function is never used, so let's be honest and
declare it as void.
Some places where invalidatepage returned 0, I have inserted comments
suggesting a BUG_ON.
[akpm@osdl.org: JBD BUG fix]
[akpm@osdl.org: rework for git-nfs]
[akpm@osdl.org: don't go BUG in block_invalidate_page()]
Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>