* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
quota: stop using QUOTA_OK / NO_QUOTA
dquot: cleanup dquot initialize routine
dquot: move dquot initialization responsibility into the filesystem
dquot: cleanup dquot drop routine
dquot: move dquot drop responsibility into the filesystem
dquot: cleanup dquot transfer routine
dquot: move dquot transfer responsibility into the filesystem
dquot: cleanup inode allocation / freeing routines
dquot: cleanup space allocation / freeing routines
ext3: add writepage sanity checks
ext3: Truncate allocated blocks if direct IO write fails to update i_size
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
quota: generalize quota transfer interface
quota: sb_quota state flags cleanup
jbd: Delay discarding buffers in journal_unmap_buffer
ext3: quota_write cross block boundary behaviour
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
quota: split out compat_sys_quotactl support from quota.c
quota: split out netlink notification support from quota.c
quota: remove invalid optimization from quota_sync_all
...
Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
Get rid of the initialize dquot operation - it is now always called from
the filesystem and if a filesystem really needs it's own (which none
currently does) it can just call into it's own routine directly.
Rename the now static low-level dquot_initialize helper to __dquot_initialize
and vfs_dq_init to dquot_initialize to have a consistent namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Currently various places in the VFS call vfs_dq_init directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the initialization. For most metadata operations
this is a straight forward move into the methods, but for truncate and
open it's a bit more complicated.
For truncate we currently only call vfs_dq_init for the sys_truncate case
because open already takes care of it for ftruncate and open(O_TRUNC) - the
new code causes an additional vfs_dq_init for those which is harmless.
For open the initialization is moved from do_filp_open into the open method,
which means it happens slightly earlier now, and only for regular files.
The latter is fine because we don't need to initialize it for operations
on special files, and we already do it as part of the namespace operations
for directories.
Add a dquot_file_open helper that filesystems that support generic quotas
can use to fill in ->open.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Get rid of the drop dquot operation - it is now always called from
the filesystem and if a filesystem really needs it's own (which none
currently does) it can just call into it's own routine directly.
Rename the now static low-level dquot_drop helper to __dquot_drop
and vfs_dq_drop to dquot_drop to have a consistent namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Currently clear_inode calls vfs_dq_drop directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the drop inside the ->clear_inode
superblock operation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Get rid of the transfer dquot operation - it is now always called from
the filesystem and if a filesystem really needs it's own (which none
currently does) it can just call into it's own routine directly.
Rename the now static low-level dquot_transfer helper to __dquot_transfer
and vfs_dq_transfer to dquot_transfer to have a consistent namespace,
and make the new dquot_transfer return a normal negative errno value
which all callers expect.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Get rid of the alloc_inode and free_inode dquot operations - they are
always called from the filesystem and if a filesystem really needs
their own (which none currently does) it can just call into it's
own routine directly.
Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
call the lowlevel dquot_alloc_inode / dqout_free_inode routines
directly, which now lose the number argument which is always 1.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Get rid of the alloc_space, free_space, reserve_space, claim_space and
release_rsv dquot operations - they are always called from the filesystem
and if a filesystem really needs their own (which none currently does)
it can just call into it's own routine directly.
Move shared logic into the common __dquot_alloc_space,
dquot_claim_space_nodirty and __dquot_free_space low-level methods,
and rationalize the wrappers around it to move as much as possible
code into the common block for CONFIG_QUOTA vs not. Also rename
all these helpers to be named dquot_* instead of vfs_dq_*.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Currently we were adding ioctl cmds/structures for ocfs2 into ocfs2_fs.h
which was used for define ocfs2 on-disk layout. That sounds a little bit
confusing, and it may be quickly polluted espcially when growing the
ocfs2_info_request ioctls afterwards(it will grow i bet).
As a result, such OCFS2 IOCs do need to be placed somewhere other than
ocfs2_fs.h, a separated ocfs2_ioctl.h will be added to store such ioctl
structures and definitions which could also be used from userspace to
invoke ioctls call.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
This patch makes ocfs2 send SIGXFSZ if new file size exceeds the rlimit.
Processes may get SIGXFSZ on one node (in the cluster) while others will
not on another if file size limits are different on the two nodes.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Make use of the newly added BASTS masklog to trace ASTs and BASTs in userdlm.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
This patch adds a new masklog and uses it allow tracing ASTs and BASTs
in the dlmglue layer. This has been found to be very useful in debugging
cluster locking issues.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
If a node down event happens while dlm shutdown in progress, dlm recovery
should be done before dlm is shutdown. We can't migrate unrecovered locks,
obviously. But dlm_reco_thread only does recovery if the dlm_state is
in DLM_CTXT_JOINED.
dlm_reco_thread should do recovery if dlm_state is in DLM_CTXT_JOINED or
DLM_CTXT_IN_SHUTDOWN.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
In ocfs2_direct_IO_get_blocks, we only need to bug out
in case of we are going to write a recounted extent rec.
What a silly bug introduced by me!
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Cc: stable@kernel.org
This patch fixes a compiling warning in ocfs2_file_aio_write().
Signed-off-by: Coly Li <coly.li@suse.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Unlike ocfs2, dlmfs has no permanent storage. It can't store off a
cluster stack it is supposed to be using. So it can't specify the stack
name in ocfs2_cluster_connect().
Instead, we create ocfs2_cluster_connect_agnostic(), which simply uses
the stack that is currently enabled. This is find for dlmfs, which will
rely on the stack initialization.
We add the "stackglue" capability to dlmfs's capability list. This lets
userspace know dlmfs can be used with all cluster stacks.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Rather than directly using o2dlm, dlmfs can now use the stackglue. This
allows it to use userspace cluster stacks and fs/dlm. This commit
forces o2cb for now. A latter commit will bump the protocol version and
allow non-o2cb stacks.
This is one big sed, really. LKM_xxMODE becomes DLM_LOCK_xx. LKM_flag
becomes DLM_LKF_flag.
We also learn to check that the LVB is valid before reading it. Any DLM
can lose the contents of the LVB during a complicated recovery. userdlm
should be checking this. Now it does. dlmfs will return 0 from read(2)
if the LVB was invalid.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
We want folks using dlmfs to be able to use the LVB in places other than
just write(2)/read(2). By ignoring truncate requests, we allow 'echo
"contents" > /dlm/space/lockname' to work.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Inside the stackglue, the locking protocol structure is hanging off of
the ocfs2_cluster_connection. This takes it one further; the locking
protocol is passed into ocfs2_cluster_connect(). Now different cluster
connections can have different locking protocols with distinct asts.
Note that all locking protocols have to keep their maximum protocol
version in lock-step.
With the protocol structure set in ocfs2_cluster_connect(), there is no
need for the stackglue to have a static pointer to a specific protocol
structure. We can change initialization to only pass in the maximum
protocol version.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
With the full ocfs2_locking_protocol hanging off of the
ocfs2_cluster_connection, ast wrappers can get the ast/bast pointers
there. They don't need to get them from their plugin structure.
The user plugin still needs the maximum locking protocol version,
though. This changes the plugin structure so that it only holds the max
version, not the entire ocfs2_locking_protocol pointer.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
With the ocfs2_cluster_connection hanging off of the ocfs2_dlm_lksb, we
have access to it in the ast and bast wrapper functions. Attach the
ocfs2_locking_protocol to the conn.
Now, instead of refering to a static variable for ast/bast pointers, the
wrappers can look at the connection. This means different connections
can have different ast/bast pointers, and it reduces the need for the
static pointer.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
We're going to want it in the ast functions, so we convert union
ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
The stackglue ast and bast functions tried to maintain the fiction that
their arguments were void pointers. In reality, stack_user.c had to
know that the argument was an ocfs2_lock_res in order to get the status
off of the lksb. That's ugly.
This changes stackglue to always pass the lksb as the argument to ast
and bast functions. The caller can always use container_of() to get the
ocfs2_lock_res or user_dlm_lock_res. The net effect to the caller is
zero. They still get back the lockres in their ast. stackglue gets
cleaner, and now can use the lksb itself.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
We're going to remove the tie between ocfs2_dlmfs and o2dlm.
ocfs2_dlmfs doesn't belong in the fs/ocfs2/dlm directory anymore. Here
we move it to fs/ocfs2/dlmfs.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
o2dlm's userspace filesystem is an easy way to use the DLM from
userspace. It is intentionally simple. For example, it does not allow
for asynchronous behavior or lock conversion. This is intentional to
keep the interface simple.
Because there is no asynchronous notification, there is no way for a
process holding a lock to know another node needs the lock. This is the
number one complaint of ocfs2_dlmfs users. Turns out, we can solve this
very easily. We add poll() support to ocfs2_dlmfs. When a BAST is
received, the lock's file descriptor will receive POLLIN.
This is trivial to implement. Userdlm already has an appropriate
waitqueue, and the lock knows when it is blocked.
We add the "bast" capability to tell userspace this is available.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>