Commit Graph

62 Commits

Author SHA1 Message Date
Andrew Price
ad99f77778 [GFS2] Fix cast from unsigned int to s64
This fixes bz 444829 where allocating a new block caused gfs2 file systems to
report 0 bytes used in df. It was caused by a broken cast from an unsigned int
in gfs2_block_alloc() to a negative s64 in gfs2_statfs_change(). This patch
casts the unsigned int to an s64 before the unary minus is applied.

Signed-off-by: Andrew Price <andy@andrewprice.me.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-05-12 08:54:56 +01:00
Bob Peterson
1f466a47e8 [GFS2] Faster gfs2_bitfit algorithm
This version of the gfs2_bitfit algorithm includes the latest
suggestions from Steve Whitehouse.  It is typically eight to
ten times faster than the version we're using today.  If there
is a lot of metadata mixed in (lots of small files) the
algorithm is often 15 times faster, and given the right
conditions, I've seen peaks of 20 times faster.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:39 +01:00
Steven Whitehouse
9b8c81d1de [GFS2] Allow bmap to allocate extents
We've supported mapping of extents when no block allocation is required
for some time. This patch extends that to mapping of extents when an
allocation has been requested. In that case we try to allocate as many
blocks as are requested, but we might return fewer in case there is
something preventing us from returning the complete amount (e.g. an
already allocated block is in the way).

Currently the only code path which can actually request multiple data
blocks in a single bmap call is the page_mkwrite path and even then it
only happens if there are multiple blocks per page. What this patch does
do however, is merge the allocation requests for metadata (growing the
metadata tree in either height or depth) with the allocation of the data
blocks in the case that both are needed. This results in lower overheads
even in the single block allocation case.

The one thing which we can't handle here at the moment is unstuffing. I
would like to be able to do that, but the problem which arises is that
in order to unstuff one has to get a locked page from the page cache
which results in locking problems in the (usual) case that the caller is
holding the page lock on the page it wishes to map. So that case will
have to be addressed in future patches.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:14 +01:00
Steven Whitehouse
b45e41d7d5 [GFS2] Add extent allocation to block allocator
Rather than having to allocate a single block at a time, this patch
allows the block allocator to allocate an extent. Since there is
no difference (so far as the block allocator is concerned) between
data blocks and indirect blocks, it is posible to allocate a single
extent and for the caller to unrevoke just the blocks required
for indirect blocks.

Currently the only bit of GFS2 to make use of this feature is the
build height function. The intention is that gfs2_block_map will
be changed to make use of this feature in future patches.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:47 +01:00
Steven Whitehouse
1639431a3f [GFS2] Merge gfs2_alloc_meta and gfs2_alloc_data
Thanks to the preceeding patches, the only difference between
these two functions is their name. We can thus merge them
and call the new function gfs2_alloc_block to reflect the
fact that it can allocate either kind of block.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:45 +01:00
Steven Whitehouse
5731be53e3 [GFS2] Update gfs2_trans_add_unrevoke to accept extents
By adding an extra argument to gfs2_trans_add_unrevoke we can now
specify an extent length of blocks to unrevoke. This means that
we only need to make one pass through the list for each extent
rather than each block. Currently the only extent length which
is used is 1, but that will change in the future.

Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
since its the only difference between this and gfs2_alloc_data
which is left. This will allow a future patch to merge these
two functions into one (i.e. one call to allocate both data
and metadata in a single extent in the future).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:42 +01:00
Steven Whitehouse
ac576cc5be [GFS2] Merge the rd_last_alloc_meta and rd_last_alloc_data fields
We don't need to keep track of when we last allocated data
and metadata separately since the only thing thats important
when searching for a free block is whether its free or not,
which is independent from what type of block it is.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:39 +01:00
Steven Whitehouse
ce276b06e8 [GFS2] Reduce inode size by merging fields
There were three fields being used to keep track of the location
of the most recently allocated block for each inode. These have
been merged into a single field in order to better keep the
data and metadata for an inode close on disk, and also to reduce
the space required for storage.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:37 +01:00
Bob Peterson
cf45b752c9 [GFS2] Remove rgrp and glock version numbers
This patch further reduces GFS2's memory requirements by
eliminating the 64-bit version number fields in lieu of
a couple bits.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:29 +01:00
Steven Whitehouse
110acf3837 [GFS2] Add consts to various bits of rgrp.c
There are a couple of routines which scan bitmaps where we can
mark the bitmaps const, plus a couple of call sites that can
be updated too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:21 +01:00
Bob Peterson
42d52e3818 [GFS2] Combine rg_flags and rd_flags
This patch reduces the memory required by GFS2 by combining
the rd_flags and rg_flags (in core only).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:10 +01:00
Bob Peterson
6bdd9be628 [GFS2] Allocate gfs2_rgrpd from slab memory
This patch moves the gfs2_rgrpd structure to its own slab
memory.  This makes it easier to control and monitor, and
yields less memory fragmentation.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:07 +01:00
Bob Peterson
fe6c991c52 [GFS2] Get rid of unneeded parameter in gfs2_rlist_alloc
This patch removed the unnecessary parameter from function
gfs2_rlist_alloc.  The parameter was always passed in as 0.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:49 +01:00
Steven Whitehouse
6dbd822487 [GFS2] Reduce inode size by moving i_alloc out of line
It is possible to reduce the size of GFS2 inodes by taking the i_alloc
structure out of the gfs2_inode. This patch allocates the i_alloc
structure whenever its needed, and frees it afterward. This decreases
the amount of low memory we use at the expense of requiring a memory
allocation for each page or partial page that we write. A quick test
with postmark shows that the overhead is not measurable and I also note
that OCFS2 use the same approach.

In the future I'd like to solve the problem by shrinking down the size
of the members of the i_alloc structure, but for now, this reduces the
immediate problem of using too much low-memory on x86 and doesn't add
too much overhead.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:18:25 +00:00
Bob Peterson
b3513fca7e [GFS2] Incremental patch to fix compiler warning
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:14:53 +00:00
Bob Peterson
5fdc2eeb5d [GFS2] Run through full bitmaps quicker in gfs2_bitfit
I eliminated the passing of an unused parameter into gfs2_bitfit called rgd.

This also changes the gfs2_bitfit code that searches for free (or used) blocks.
Before, the code was trying to check for bytes that indicated 4 blocks in
the undesired state.  The problem is, it was spending more time trying to
do this than it actually was saving.  This version only optimizes the case
where we're looking for free blocks, and it checks a machine word at a time.
So on 32-bit machines, it will check 32-bits (16 blocks) and on 64-bit
machines, it will check 64-bits (32 blocks) at a time.  The compiler
optimizes that quite well and we save some time, especially when running
through full bitmaps (like the bitmaps allocated for the journals).

There's probably a more elegant or optimized way to do this, but I haven't
thought of it yet.  I'm open to suggestions.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:13:31 +00:00
Abhijith Das
292c8c14ca [GFS2] patch to check for recursive lock requests in gfs2_rename code path
A certain scenario in the rename code path triggers a kernel BUG()
because it accidentally does recursive locking The first lock is
requested to unlink an already existing inode (replacing a file) and the
second lock is requested when the destination directory needs to alloc
some space. It is rare that these two
events happen during the same rename call, and even more rare that these
two instances try to lock the same rgrp. It is, however, possible.
https://bugzilla.redhat.com/show_bug.cgi?id=404711

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:08:18 +00:00
Steven Whitehouse
51ff87bdd9 [GFS2] Clean up internal read function
As requested by Christoph, this patch cleans up GFS2's internal
read function so that it no longer uses the do_generic_mapping_read
function. This function is obsolete and GFS2 is the last user of it.

As a side effect the internal read code gets smaller and easier
to read and gfs2_readpage is split into two. One function has the locking
and the other function has the rest of the logic.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
2008-01-25 08:07:11 +00:00
Benjamin Marzinski
7a9f53b3c1 [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed
There is a possible deadlock between two processes on the same node, where one
process is deleting an inode, and another process is looking for allocated but
unused inodes to delete in order to create more space.

process A does an iput() on inode X, and it's i_count drops to 0. This causes
iput_final() to be called, which puts an inode into state I_FREEING at
generic_delete_inode(). There no point between when iput_final() is called, and
when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
set, no other process on that node can successfully look up that inode until
the delete finishes.

process B locks the the resource group for the same inode in get_local_rgrp(),
which is called by gfs2_inplace_reserve_i()

process A tries to lock the resource group for the inode in
gfs2_dinode_dealloc(), but it's already locked by process B

process B waits in find_inode for the inode to have the I_FREEING state cleared.

Deadlock.

This patch solves the problem by adding an alternative to gfs2_iget(),
gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
state.o The alternate test function is just like the original one, except that
it fails if the inode is being freed, and sets a skipped flag. The alternate
set function is just like the original, except that it fails if the skipped
flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
gfs2_iget().

Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:56:29 +01:00
Bob Peterson
5f3eae7546 [GFS2] invalid metadata block - REVISED
This is for bugzilla bug #248176: GFS2: invalid metadata block

Patches 1 thru 3 were accepted upstream, but there were problems
with 4 and 5.  Those issues have been resolved and now the recovery
tests are passing without errors.  This code has gone through
41 * 3 successful gfs2 recovery tests before it hit an
unrelated (openais) problem.

This is a complete rewrite of patch 4 for bug #248176.

Part of the problem was that inodes were being recycled
before their buffers were flushed to the journal logs.
Another problem was that the clone bitmaps were being
searched for deleted inodes to recycle, but only the
"real" bitmaps should be searched for that purpose.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:10 +01:00
Bob Peterson
6760bdcd03 [GFS2] Prevent infinite loop in try_rgrp_unlink()
This is patch three of five for bug #248176.

The try_rgrp_unlink code in rgrp.c had an infinite loop.  This was
caused because the bitmap function rgblk_search can return a block
less than the "goal" block, in which case it was looping.  The fix is
to make it always march forward as needed.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:54:56 +01:00
Steven Whitehouse
6eefaf61f6 [GFS2] Fix incorrect return code in rgrp.c
The following patch fixes a bug where 0 was being used as a return code
to indicate "nothing to do" when in fact 0 was a valid block location
which might be returned by the function.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:33:15 +01:00
Bob Peterson
24c7387333 [GFS2] soft lockup in rgblk_search
This patch seems to fix the problem described in bugzilla bug 246114.
It was written by Steve Whitehouse with some tweaking by me.

The code was looping in the relatively new section of code designed to
search for and reuse unlinked inodes.  In cases where it was finding an
appropriate inode to reuse, it was looping around and finding the same
block over and over because a "<=" check should have been a "<" when
comparing the goal block to the last unlinked block found.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:32:43 +01:00
Wendy Cheng
35dcc52e3a [GFS2] Remove i_mode passing from NFS File Handle
GFS2 has been passing i_mode within NFS File Handle. Other than the
wrong assumption that there is always room for this extra 16 bit value,
the current gfs2_get_dentry doesn't really need the i_mode to work
correctly. Note that GFS2 NFS code does go thru the same lookup code
path as direct file access route (where the mode is obtained from name
lookup) but gfs2_get_dentry() is coded for different purpose. It is not
used during lookup time. It is part of the file access procedure call.
When the call is invoked, if on-disk inode is not in-memory, it has to
be read-in. This makes i_mode passing a useless overhead.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:24:11 +01:00
Wendy Cheng
bb9bcf0616 [GFS2] Obtaining no_formal_ino from directory entry
GFS2 lookup code doesn't ask for inode shared glock. This implies during
in-memory inode creation for existing file, GFS2 will not disk-read in
the inode contents. This leaves no_formal_ino un-initialized during
lookup time. The un-initialized no_formal_ino is subsequently encoded
into file handle. Clients will get ESTALE error whenever it tries to
access these files.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:24:08 +01:00