Merge tag 'afs-next-20200604' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS updates from David Howells:
 "There's some core VFS changes which affect a couple of filesystems:

   - Make the inode hash table RCU safe and providing some RCU-safe
     accessor functions. The search can then be done without taking the
     inode_hash_lock. Care must be taken because the object may be being
     deleted and no wait is made.

   - Allow iunique() to avoid taking the inode_hash_lock.

   - Allow AFS's callback processing to avoid taking the inode_hash_lock
     when using the inode table to find an inode to notify.

   - Improve Ext4's time updating. Konstantin Khlebnikov said "For now,
     I've plugged this issue with try-lock in ext4 lazy time update.
     This solution is much better."

  Then there's a set of changes to make a number of improvements to the
  AFS driver:

   - Improve callback (ie. third party change notification) processing
     by:

      (a) Relying more on the fact we're doing this under RCU and by
          using fewer locks. This makes use of the RCU-based inode
          searching outlined above.

      (b) Moving to keeping volumes in a tree indexed by volume ID
          rather than a flat list.

      (c) Making the server and volume records logically part of the
          cell. This means that a server record now points directly at
          the cell and the tree of volumes is there. This removes an N:M
          mapping table, simplifying things.

   - Improve keeping NAT or firewall channels open for the server
     callbacks to reach the client by actively polling the fileserver on
     a timed basis, instead of only doing it when we have an operation
     to process.

   - Improving detection of delayed or lost callbacks by including the
     parent directory in the list of file IDs to be queried when doing a
     bulk status fetch from lookup. We can then check to see if our copy
     of the directory has changed under us without us getting notified.

   - Determine aliasing of cells (such as a cell that is pointed to be a
     DNS alias). This allows us to avoid having ambiguity due to
     apparently different cells using the same volume and file servers.

   - Improve the fileserver rotation to do more probing when it detects
     that all of the addresses to a server are listed as non-responsive.
     It's possible that an address that previously stopped responding
     has become responsive again.

  Beyond that, lay some foundations for making some calls asynchronous:

   - Turn the fileserver cursor struct into a general operation struct
     and hang the parameters off of that rather than keeping them in
     local variables and hang results off of that rather than the call
     struct.

   - Implement some general operation handling code and simplify the
     callers of operations that affect a volume or a volume component
     (such as a file). Most of the operation is now done by core code.

   - Operations are supplied with a table of operations to issue
     different variants of RPCs and to manage the completion, where all
     the required data is held in the operation object, thereby allowing
     these to be called from a workqueue.

   - Put the standard "if (begin), while(select), call op, end" sequence
     into a canned function that just emulates the current behaviour for
     now.

  There are also some fixes interspersed:

   - Don't let the EACCES from ICMP6 mapping reach the user as such,
     since it's confusing as to whether it's a filesystem error. Convert
     it to EHOSTUNREACH.

   - Don't use the epoch value acquired through probing a server. If we
     have two servers with the same UUID but in different cells, it's
     hard to draw conclusions from them having different epoch values.

   - Don't interpret the argument to the CB.ProbeUuid RPC as a
     fileserver UUID and look up a fileserver from it.

   - Deal with servers in different cells having the same UUIDs. In the
     event that a CB.InitCallBackState3 RPC is received, we have to
     break the callback promises for every server record matching that
     UUID.

   - Don't let afs_statfs return values that go below 0.

   - Don't use running fileserver probe state to make server selection
     and address selection decisions on. Only make decisions on final
     state as the running state is cleared at the start of probing"

Acked-by: Al Viro <viro@zeniv.linux.org.uk> (fs/inode.c part)

* tag 'afs-next-20200604' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (27 commits)
  afs: Adjust the fileserver rotation algorithm to reprobe/retry more quickly
  afs: Show more a bit more server state in /proc/net/afs/servers
  afs: Don't use probe running state to make decisions outside probe code
  afs: Fix afs_statfs() to not let the values go below zero
  afs: Fix the by-UUID server tree to allow servers with the same UUID
  afs: Reorganise volume and server trees to be rooted on the cell
  afs: Add a tracepoint to track the lifetime of the afs_volume struct
  afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op
  afs: Detect cell aliases 2 - Cells with no root volumes
  afs: Detect cell aliases 1 - Cells with root volumes
  afs: Implement client support for the YFSVL.GetCellName RPC op
  afs: Retain more of the VLDB record for alias detection
  afs: Fix handling of CB.ProbeUuid cache manager op
  afs: Don't get epoch from a server because it may be ambiguous
  afs: Build an abstraction around an "operation" concept
  afs: Rename struct afs_fs_cursor to afs_operation
  afs: Remove the error argument from afs_protocol_error()
  afs: Set error flag rather than return error from file status decode
  afs: Make callback processing more efficient.
  afs: Show more information in /proc/net/afs/servers
  ...
This commit is contained in:
Linus Torvalds
2020-06-05 16:26:36 -07:00
38 changed files with 4453 additions and 3939 deletions

View File

@@ -18,6 +18,7 @@ kafs-y := \
file.o \
flock.o \
fsclient.o \
fs_operation.o \
fs_probe.o \
inode.o \
main.o \
@@ -30,6 +31,7 @@ kafs-y := \
server_list.o \
super.o \
vlclient.o \
vl_alias.o \
vl_list.o \
vl_probe.o \
vl_rotate.o \

View File

@@ -10,7 +10,7 @@
#include <linux/in.h>
#define AFS_MAXCELLNAME 64 /* Maximum length of a cell name */
#define AFS_MAXCELLNAME 256 /* Maximum length of a cell name */
#define AFS_MAXVOLNAME 64 /* Maximum length of a volume name */
#define AFS_MAXNSERVERS 8 /* Maximum servers in a basic volume record */
#define AFS_NMAXNSERVERS 13 /* Maximum servers in a N/U-class volume record */
@@ -146,7 +146,6 @@ struct afs_file_status {
struct afs_status_cb {
struct afs_file_status status;
struct afs_callback callback;
unsigned int cb_break; /* Pre-op callback break counter */
bool have_status; /* True if status record was retrieved */
bool have_cb; /* True if cb record was retrieved */
bool have_error; /* True if status.abort_code indicates an error */

View File

@@ -22,6 +22,7 @@ enum AFSVL_Operations {
VLGETENTRYBYNAMEU = 527, /* AFS Get VLDB entry by name (UUID-variant) */
VLGETADDRSU = 533, /* AFS Get addrs for fileserver */
YVLGETENDPOINTS = 64002, /* YFS Get endpoints for file/volume server */
YVLGETCELLNAME = 64014, /* YFS Get actual cell name */
VLGETCAPABILITIES = 65537, /* AFS Get server capabilities */
};

View File

@@ -21,192 +21,17 @@
#include "internal.h"
/*
* Create volume and callback interests on a server.
*/
static struct afs_cb_interest *afs_create_interest(struct afs_server *server,
struct afs_vnode *vnode)
{
struct afs_vol_interest *new_vi, *vi;
struct afs_cb_interest *new;
struct hlist_node **pp;
new_vi = kzalloc(sizeof(struct afs_vol_interest), GFP_KERNEL);
if (!new_vi)
return NULL;
new = kzalloc(sizeof(struct afs_cb_interest), GFP_KERNEL);
if (!new) {
kfree(new_vi);
return NULL;
}
new_vi->usage = 1;
new_vi->vid = vnode->volume->vid;
INIT_HLIST_NODE(&new_vi->srv_link);
INIT_HLIST_HEAD(&new_vi->cb_interests);
refcount_set(&new->usage, 1);
new->sb = vnode->vfs_inode.i_sb;
new->vid = vnode->volume->vid;
new->server = afs_get_server(server, afs_server_trace_get_new_cbi);
INIT_HLIST_NODE(&new->cb_vlink);
write_lock(&server->cb_break_lock);
for (pp = &server->cb_volumes.first; *pp; pp = &(*pp)->next) {
vi = hlist_entry(*pp, struct afs_vol_interest, srv_link);
if (vi->vid < new_vi->vid)
continue;
if (vi->vid > new_vi->vid)
break;
vi->usage++;
goto found_vi;
}
new_vi->srv_link.pprev = pp;
new_vi->srv_link.next = *pp;
if (*pp)
(*pp)->pprev = &new_vi->srv_link.next;
*pp = &new_vi->srv_link;
vi = new_vi;
new_vi = NULL;
found_vi:
new->vol_interest = vi;
hlist_add_head(&new->cb_vlink, &vi->cb_interests);
write_unlock(&server->cb_break_lock);
kfree(new_vi);
return new;
}
/*
* Set up an interest-in-callbacks record for a volume on a server and
* register it with the server.
* - Called with vnode->io_lock held.
*/
int afs_register_server_cb_interest(struct afs_vnode *vnode,
struct afs_server_list *slist,
unsigned int index)
{
struct afs_server_entry *entry = &slist->servers[index];
struct afs_cb_interest *cbi, *vcbi, *new, *old;
struct afs_server *server = entry->server;
again:
vcbi = rcu_dereference_protected(vnode->cb_interest,
lockdep_is_held(&vnode->io_lock));
if (vcbi && likely(vcbi == entry->cb_interest))
return 0;
read_lock(&slist->lock);
cbi = afs_get_cb_interest(entry->cb_interest);
read_unlock(&slist->lock);
if (vcbi) {
if (vcbi == cbi) {
afs_put_cb_interest(afs_v2net(vnode), cbi);
return 0;
}
/* Use a new interest in the server list for the same server
* rather than an old one that's still attached to a vnode.
*/
if (cbi && vcbi->server == cbi->server) {
write_seqlock(&vnode->cb_lock);
old = rcu_dereference_protected(vnode->cb_interest,
lockdep_is_held(&vnode->cb_lock.lock));
rcu_assign_pointer(vnode->cb_interest, cbi);
write_sequnlock(&vnode->cb_lock);
afs_put_cb_interest(afs_v2net(vnode), old);
return 0;
}
/* Re-use the one attached to the vnode. */
if (!cbi && vcbi->server == server) {
write_lock(&slist->lock);
if (entry->cb_interest) {
write_unlock(&slist->lock);
afs_put_cb_interest(afs_v2net(vnode), cbi);
goto again;
}
entry->cb_interest = cbi;
write_unlock(&slist->lock);
return 0;
}
}
if (!cbi) {
new = afs_create_interest(server, vnode);
if (!new)
return -ENOMEM;
write_lock(&slist->lock);
if (!entry->cb_interest) {
entry->cb_interest = afs_get_cb_interest(new);
cbi = new;
new = NULL;
} else {
cbi = afs_get_cb_interest(entry->cb_interest);
}
write_unlock(&slist->lock);
afs_put_cb_interest(afs_v2net(vnode), new);
}
ASSERT(cbi);
/* Change the server the vnode is using. This entails scrubbing any
* interest the vnode had in the previous server it was using.
*/
write_seqlock(&vnode->cb_lock);
old = rcu_dereference_protected(vnode->cb_interest,
lockdep_is_held(&vnode->cb_lock.lock));
rcu_assign_pointer(vnode->cb_interest, cbi);
vnode->cb_s_break = cbi->server->cb_s_break;
vnode->cb_v_break = vnode->volume->cb_v_break;
clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
write_sequnlock(&vnode->cb_lock);
afs_put_cb_interest(afs_v2net(vnode), old);
return 0;
}
/*
* Remove an interest on a server.
*/
void afs_put_cb_interest(struct afs_net *net, struct afs_cb_interest *cbi)
{
struct afs_vol_interest *vi;
if (cbi && refcount_dec_and_test(&cbi->usage)) {
if (!hlist_unhashed(&cbi->cb_vlink)) {
write_lock(&cbi->server->cb_break_lock);
hlist_del_init(&cbi->cb_vlink);
vi = cbi->vol_interest;
cbi->vol_interest = NULL;
if (--vi->usage == 0)
hlist_del(&vi->srv_link);
else
vi = NULL;
write_unlock(&cbi->server->cb_break_lock);
if (vi)
kfree_rcu(vi, rcu);
afs_put_server(net, cbi->server, afs_server_trace_put_cbi);
}
kfree_rcu(cbi, rcu);
}
}
/*
* allow the fileserver to request callback state (re-)initialisation
* Allow the fileserver to request callback state (re-)initialisation.
* Unfortunately, UUIDs are not guaranteed unique.
*/
void afs_init_callback_state(struct afs_server *server)
{
server->cb_s_break++;
rcu_read_lock();
do {
server->cb_s_break++;
server = rcu_dereference(server->uuid_next);
} while (0);
rcu_read_unlock();
}
/*
@@ -237,70 +62,110 @@ void afs_break_callback(struct afs_vnode *vnode, enum afs_cb_break_reason reason
write_sequnlock(&vnode->cb_lock);
}
/*
* Look up a volume by volume ID under RCU conditions.
*/
static struct afs_volume *afs_lookup_volume_rcu(struct afs_cell *cell,
afs_volid_t vid)
{
struct afs_volume *volume = NULL;
struct rb_node *p;
int seq = 0;
do {
/* Unfortunately, rbtree walking doesn't give reliable results
* under just the RCU read lock, so we have to check for
* changes.
*/
read_seqbegin_or_lock(&cell->volume_lock, &seq);
p = rcu_dereference_raw(cell->volumes.rb_node);
while (p) {
volume = rb_entry(p, struct afs_volume, cell_node);
if (volume->vid < vid)
p = rcu_dereference_raw(p->rb_left);
else if (volume->vid > vid)
p = rcu_dereference_raw(p->rb_right);
else
break;
volume = NULL;
}
} while (need_seqretry(&cell->volume_lock, seq));
done_seqretry(&cell->volume_lock, seq);
return volume;
}
/*
* allow the fileserver to explicitly break one callback
* - happens when
* - the backing file is changed
* - a lock is released
*/
static void afs_break_one_callback(struct afs_server *server,
static void afs_break_one_callback(struct afs_volume *volume,
struct afs_fid *fid)
{
struct afs_vol_interest *vi;
struct afs_cb_interest *cbi;
struct afs_iget_data data;
struct super_block *sb;
struct afs_vnode *vnode;
struct inode *inode;
read_lock(&server->cb_break_lock);
hlist_for_each_entry(vi, &server->cb_volumes, srv_link) {
if (vi->vid < fid->vid)
continue;
if (vi->vid > fid->vid) {
vi = NULL;
break;
}
//atomic_inc(&vi->usage);
break;
if (fid->vnode == 0 && fid->unique == 0) {
/* The callback break applies to an entire volume. */
write_lock(&volume->cb_v_break_lock);
volume->cb_v_break++;
trace_afs_cb_break(fid, volume->cb_v_break,
afs_cb_break_for_volume_callback, false);
write_unlock(&volume->cb_v_break_lock);
return;
}
/* See if we can find a matching inode - even an I_NEW inode needs to
* be marked as it can have its callback broken before we finish
* setting up the local inode.
*/
sb = rcu_dereference(volume->sb);
if (!sb)
return;
inode = find_inode_rcu(sb, fid->vnode, afs_ilookup5_test_by_fid, fid);
if (inode) {
vnode = AFS_FS_I(inode);
afs_break_callback(vnode, afs_cb_break_for_callback);
} else {
trace_afs_cb_miss(fid, afs_cb_break_for_callback);
}
}
static void afs_break_some_callbacks(struct afs_server *server,
struct afs_callback_break *cbb,
size_t *_count)
{
struct afs_callback_break *residue = cbb;
struct afs_volume *volume;
afs_volid_t vid = cbb->fid.vid;
size_t i;
volume = afs_lookup_volume_rcu(server->cell, vid);
/* TODO: Find all matching volumes if we couldn't match the server and
* break them anyway.
*/
if (!vi)
goto out;
/* Step through all interested superblocks. There may be more than one
* because of cell aliasing.
*/
hlist_for_each_entry(cbi, &vi->cb_interests, cb_vlink) {
if (fid->vnode == 0 && fid->unique == 0) {
/* The callback break applies to an entire volume. */
struct afs_super_info *as = AFS_FS_S(cbi->sb);
struct afs_volume *volume = as->volume;
write_lock(&volume->cb_v_break_lock);
volume->cb_v_break++;
trace_afs_cb_break(fid, volume->cb_v_break,
afs_cb_break_for_volume_callback, false);
write_unlock(&volume->cb_v_break_lock);
for (i = *_count; i > 0; cbb++, i--) {
if (cbb->fid.vid == vid) {
_debug("- Fid { vl=%08llx n=%llu u=%u }",
cbb->fid.vid,
cbb->fid.vnode,
cbb->fid.unique);
--*_count;
if (volume)
afs_break_one_callback(volume, &cbb->fid);
} else {
data.volume = NULL;
data.fid = *fid;
inode = ilookup5_nowait(cbi->sb, fid->vnode,
afs_iget5_test, &data);
if (inode) {
vnode = AFS_FS_I(inode);
afs_break_callback(vnode, afs_cb_break_for_callback);
iput(inode);
} else {
trace_afs_cb_miss(fid, afs_cb_break_for_callback);
}
*residue++ = *cbb;
}
}
out:
read_unlock(&server->cb_break_lock);
}
/*
@@ -313,29 +178,11 @@ void afs_break_callbacks(struct afs_server *server, size_t count,
ASSERT(server != NULL);
/* TODO: Sort the callback break list by volume ID */
rcu_read_lock();
for (; count > 0; callbacks++, count--) {
_debug("- Fid { vl=%08llx n=%llu u=%u }",
callbacks->fid.vid,
callbacks->fid.vnode,
callbacks->fid.unique);
afs_break_one_callback(server, &callbacks->fid);
}
while (count > 0)
afs_break_some_callbacks(server, callbacks, &count);
_leave("");
rcu_read_unlock();
return;
}
/*
* Clear the callback interests in a server list.
*/
void afs_clear_callback_interests(struct afs_net *net, struct afs_server_list *slist)
{
int i;
for (i = 0; i < slist->nr_servers; i++) {
afs_put_cb_interest(net, slist->servers[i].cb_interest);
slist->servers[i].cb_interest = NULL;
}
}

View File

@@ -161,9 +161,13 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net,
atomic_set(&cell->usage, 2);
INIT_WORK(&cell->manager, afs_manage_cell);
INIT_LIST_HEAD(&cell->proc_volumes);
rwlock_init(&cell->proc_lock);
cell->volumes = RB_ROOT;
INIT_HLIST_HEAD(&cell->proc_volumes);
seqlock_init(&cell->volume_lock);
cell->fs_servers = RB_ROOT;
seqlock_init(&cell->fs_lock);
rwlock_init(&cell->vl_servers_lock);
cell->flags = (1 << AFS_CELL_FL_CHECK_ALIAS);
/* Provide a VL server list, filling it in if we were given a list of
* addresses to use.
@@ -481,7 +485,9 @@ static void afs_cell_destroy(struct rcu_head *rcu)
ASSERTCMP(atomic_read(&cell->usage), ==, 0);
afs_put_volume(cell->net, cell->root_volume, afs_volume_trace_put_cell_root);
afs_put_vlserverlist(cell->net, rcu_access_pointer(cell->vl_servers));
afs_put_cell(cell->net, cell->alias_of);
key_put(cell->anonymous_key);
kfree(cell);

View File

@@ -118,8 +118,6 @@ bool afs_cm_incoming_call(struct afs_call *call)
{
_enter("{%u, CB.OP %u}", call->service_id, call->operation_ID);
call->epoch = rxrpc_kernel_get_epoch(call->net->socket, call->rxcall);
switch (call->operation_ID) {
case CBCallBack:
call->type = &afs_SRXCBCallBack;
@@ -149,49 +147,6 @@ bool afs_cm_incoming_call(struct afs_call *call)
}
}
/*
* Record a probe to the cache manager from a server.
*/
static int afs_record_cm_probe(struct afs_call *call, struct afs_server *server)
{
_enter("");
if (test_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags) &&
!test_bit(AFS_SERVER_FL_PROBING, &server->flags)) {
if (server->cm_epoch == call->epoch)
return 0;
if (!server->probe.said_rebooted) {
pr_notice("kAFS: FS rebooted %pU\n", &server->uuid);
server->probe.said_rebooted = true;
}
}
spin_lock(&server->probe_lock);
if (!test_and_set_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags)) {
server->cm_epoch = call->epoch;
server->probe.cm_epoch = call->epoch;
goto out;
}
if (server->probe.cm_probed &&
call->epoch != server->probe.cm_epoch &&
!server->probe.said_inconsistent) {
pr_notice("kAFS: FS endpoints inconsistent %pU\n",
&server->uuid);
server->probe.said_inconsistent = true;
}
if (!server->probe.cm_probed || call->epoch == server->cm_epoch)
server->probe.cm_epoch = server->cm_epoch;
out:
server->probe.cm_probed = true;
spin_unlock(&server->probe_lock);
return 0;
}
/*
* Find the server record by peer address and record a probe to the cache
* manager from a server.
@@ -210,7 +165,7 @@ static int afs_find_cm_server_by_peer(struct afs_call *call)
}
call->server = server;
return afs_record_cm_probe(call, server);
return 0;
}
/*
@@ -231,7 +186,7 @@ static int afs_find_cm_server_by_uuid(struct afs_call *call,
}
call->server = server;
return afs_record_cm_probe(call, server);
return 0;
}
/*
@@ -268,7 +223,9 @@ static void SRXAFSCB_CallBack(struct work_struct *work)
* to maintain cache coherency.
*/
if (call->server) {
trace_afs_server(call->server, atomic_read(&call->server->usage),
trace_afs_server(call->server,
atomic_read(&call->server->ref),
atomic_read(&call->server->active),
afs_server_trace_callback);
afs_break_callbacks(call->server, call->count, call->request);
}
@@ -305,8 +262,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
call->count = ntohl(call->tmp);
_debug("FID count: %u", call->count);
if (call->count > AFSCBMAX)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_cb_fid_count);
return afs_protocol_error(call, afs_eproto_cb_fid_count);
call->buffer = kmalloc(array3_size(call->count, 3, 4),
GFP_KERNEL);
@@ -351,8 +307,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
call->count2 = ntohl(call->tmp);
_debug("CB count: %u", call->count2);
if (call->count2 != call->count && call->count2 != 0)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_cb_count);
return afs_protocol_error(call, afs_eproto_cb_count);
call->iter = &call->def_iter;
iov_iter_discard(&call->def_iter, READ, call->count2 * 3 * 4);
call->unmarshall++;
@@ -509,7 +464,8 @@ static int afs_deliver_cb_probe(struct afs_call *call)
}
/*
* allow the fileserver to quickly find out if the fileserver has been rebooted
* Allow the fileserver to quickly find out if the cache manager has been
* rebooted.
*/
static void SRXAFSCB_ProbeUuid(struct work_struct *work)
{
@@ -581,7 +537,7 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call)
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
return afs_io_error(call, afs_io_error_cm_reply);
return afs_find_cm_server_by_uuid(call, call->request);
return afs_find_cm_server_by_peer(call);
}
/*
@@ -672,8 +628,7 @@ static int afs_deliver_yfs_cb_callback(struct afs_call *call)
call->count = ntohl(call->tmp);
_debug("FID count: %u", call->count);
if (call->count > YFSCBMAX)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_cb_fid_count);
return afs_protocol_error(call, afs_eproto_cb_fid_count);
size = array_size(call->count, sizeof(struct yfs_xdr_YFSFid));
call->buffer = kmalloc(size, GFP_KERNEL);

File diff suppressed because it is too large Load Diff

View File

@@ -12,6 +12,47 @@
#include <linux/fsnotify.h>
#include "internal.h"
static void afs_silly_rename_success(struct afs_operation *op)
{
_enter("op=%08x", op->debug_id);
afs_vnode_commit_status(op, &op->file[0]);
}
static void afs_silly_rename_edit_dir(struct afs_operation *op)
{
struct afs_vnode_param *dvp = &op->file[0];
struct afs_vnode *dvnode = dvp->vnode;
struct afs_vnode *vnode = AFS_FS_I(d_inode(op->dentry));
struct dentry *old = op->dentry;
struct dentry *new = op->dentry_2;
spin_lock(&old->d_lock);
old->d_flags |= DCACHE_NFSFS_RENAMED;
spin_unlock(&old->d_lock);
if (dvnode->silly_key != op->key) {
key_put(dvnode->silly_key);
dvnode->silly_key = key_get(op->key);
}
down_write(&dvnode->validate_lock);
if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags) &&
dvnode->status.data_version == dvp->dv_before + dvp->dv_delta) {
afs_edit_dir_remove(dvnode, &old->d_name,
afs_edit_dir_for_silly_0);
afs_edit_dir_add(dvnode, &new->d_name,
&vnode->fid, afs_edit_dir_for_silly_1);
}
up_write(&dvnode->validate_lock);
}
static const struct afs_operation_ops afs_silly_rename_operation = {
.issue_afs_rpc = afs_fs_rename,
.issue_yfs_rpc = yfs_fs_rename,
.success = afs_silly_rename_success,
.edit_dir = afs_silly_rename_edit_dir,
};
/*
* Actually perform the silly rename step.
*/
@@ -19,56 +60,22 @@ static int afs_do_silly_rename(struct afs_vnode *dvnode, struct afs_vnode *vnode
struct dentry *old, struct dentry *new,
struct key *key)
{
struct afs_fs_cursor fc;
struct afs_status_cb *scb;
afs_dataversion_t dir_data_version;
int ret = -ERESTARTSYS;
struct afs_operation *op;
_enter("%pd,%pd", old, new);
scb = kzalloc(sizeof(struct afs_status_cb), GFP_KERNEL);
if (!scb)
return -ENOMEM;
op = afs_alloc_operation(key, dvnode->volume);
if (IS_ERR(op))
return PTR_ERR(op);
afs_op_set_vnode(op, 0, dvnode);
op->dentry = old;
op->dentry_2 = new;
op->ops = &afs_silly_rename_operation;
trace_afs_silly_rename(vnode, false);
if (afs_begin_vnode_operation(&fc, dvnode, key, true)) {
dir_data_version = dvnode->status.data_version + 1;
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(dvnode);
afs_fs_rename(&fc, old->d_name.name,
dvnode, new->d_name.name,
scb, scb);
}
afs_vnode_commit_status(&fc, dvnode, fc.cb_break,
&dir_data_version, scb);
ret = afs_end_vnode_operation(&fc);
}
if (ret == 0) {
spin_lock(&old->d_lock);
old->d_flags |= DCACHE_NFSFS_RENAMED;
spin_unlock(&old->d_lock);
if (dvnode->silly_key != key) {
key_put(dvnode->silly_key);
dvnode->silly_key = key_get(key);
}
down_write(&dvnode->validate_lock);
if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags) &&
dvnode->status.data_version == dir_data_version) {
afs_edit_dir_remove(dvnode, &old->d_name,
afs_edit_dir_for_silly_0);
afs_edit_dir_add(dvnode, &new->d_name,
&vnode->fid, afs_edit_dir_for_silly_1);
}
up_write(&dvnode->validate_lock);
}
kfree(scb);
_leave(" = %d", ret);
return ret;
return afs_do_sync_operation(op);
}
/**
@@ -139,65 +146,66 @@ out:
return ret;
}
static void afs_silly_unlink_success(struct afs_operation *op)
{
struct afs_vnode *vnode = op->file[1].vnode;
_enter("op=%08x", op->debug_id);
afs_check_for_remote_deletion(op, op->file[0].vnode);
afs_vnode_commit_status(op, &op->file[0]);
afs_vnode_commit_status(op, &op->file[1]);
afs_update_dentry_version(op, &op->file[0], op->dentry);
drop_nlink(&vnode->vfs_inode);
if (vnode->vfs_inode.i_nlink == 0) {
set_bit(AFS_VNODE_DELETED, &vnode->flags);
clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
}
}
static void afs_silly_unlink_edit_dir(struct afs_operation *op)
{
struct afs_vnode_param *dvp = &op->file[0];
struct afs_vnode *dvnode = dvp->vnode;
_enter("op=%08x", op->debug_id);
down_write(&dvnode->validate_lock);
if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags) &&
dvnode->status.data_version == dvp->dv_before + dvp->dv_delta)
afs_edit_dir_remove(dvnode, &op->dentry->d_name,
afs_edit_dir_for_unlink);
up_write(&dvnode->validate_lock);
}
static const struct afs_operation_ops afs_silly_unlink_operation = {
.issue_afs_rpc = afs_fs_remove_file,
.issue_yfs_rpc = yfs_fs_remove_file,
.success = afs_silly_unlink_success,
.edit_dir = afs_silly_unlink_edit_dir,
};
/*
* Tell the server to remove a sillyrename file.
*/
static int afs_do_silly_unlink(struct afs_vnode *dvnode, struct afs_vnode *vnode,
struct dentry *dentry, struct key *key)
{
struct afs_fs_cursor fc;
struct afs_status_cb *scb;
int ret = -ERESTARTSYS;
struct afs_operation *op;
_enter("");
scb = kcalloc(2, sizeof(struct afs_status_cb), GFP_KERNEL);
if (!scb)
return -ENOMEM;
op = afs_alloc_operation(NULL, dvnode->volume);
if (IS_ERR(op))
return PTR_ERR(op);
afs_op_set_vnode(op, 0, dvnode);
afs_op_set_vnode(op, 1, vnode);
op->dentry = dentry;
op->ops = &afs_silly_unlink_operation;
trace_afs_silly_rename(vnode, true);
if (afs_begin_vnode_operation(&fc, dvnode, key, false)) {
afs_dataversion_t dir_data_version = dvnode->status.data_version + 1;
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(dvnode);
if (test_bit(AFS_SERVER_FL_IS_YFS, &fc.cbi->server->flags) &&
!test_bit(AFS_SERVER_FL_NO_RM2, &fc.cbi->server->flags)) {
yfs_fs_remove_file2(&fc, vnode, dentry->d_name.name,
&scb[0], &scb[1]);
if (fc.ac.error != -ECONNABORTED ||
fc.ac.abort_code != RXGEN_OPCODE)
continue;
set_bit(AFS_SERVER_FL_NO_RM2, &fc.cbi->server->flags);
}
afs_fs_remove(&fc, vnode, dentry->d_name.name, false, &scb[0]);
}
afs_vnode_commit_status(&fc, dvnode, fc.cb_break,
&dir_data_version, &scb[0]);
ret = afs_end_vnode_operation(&fc);
if (ret == 0) {
drop_nlink(&vnode->vfs_inode);
if (vnode->vfs_inode.i_nlink == 0) {
set_bit(AFS_VNODE_DELETED, &vnode->flags);
clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
}
}
if (ret == 0) {
down_write(&dvnode->validate_lock);
if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags) &&
dvnode->status.data_version == dir_data_version)
afs_edit_dir_remove(dvnode, &dentry->d_name,
afs_edit_dir_for_unlink);
up_write(&dvnode->validate_lock);
}
}
kfree(scb);
_leave(" = %d", ret);
return ret;
return afs_do_sync_operation(op);
}
/*

View File

@@ -10,6 +10,99 @@
#include <linux/dns_resolver.h>
#include "internal.h"
static atomic_t afs_autocell_ino;
/*
* iget5() comparator for inode created by autocell operations
*
* These pseudo inodes don't match anything.
*/
static int afs_iget5_pseudo_test(struct inode *inode, void *opaque)
{
return 0;
}
/*
* iget5() inode initialiser
*/
static int afs_iget5_pseudo_set(struct inode *inode, void *opaque)
{
struct afs_super_info *as = AFS_FS_S(inode->i_sb);
struct afs_vnode *vnode = AFS_FS_I(inode);
struct afs_fid *fid = opaque;
vnode->volume = as->volume;
vnode->fid = *fid;
inode->i_ino = fid->vnode;
inode->i_generation = fid->unique;
return 0;
}
/*
* Create an inode for a dynamic root directory or an autocell dynamic
* automount dir.
*/
struct inode *afs_iget_pseudo_dir(struct super_block *sb, bool root)
{
struct afs_super_info *as = AFS_FS_S(sb);
struct afs_vnode *vnode;
struct inode *inode;
struct afs_fid fid = {};
_enter("");
if (as->volume)
fid.vid = as->volume->vid;
if (root) {
fid.vnode = 1;
fid.unique = 1;
} else {
fid.vnode = atomic_inc_return(&afs_autocell_ino);
fid.unique = 0;
}
inode = iget5_locked(sb, fid.vnode,
afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid);
if (!inode) {
_leave(" = -ENOMEM");
return ERR_PTR(-ENOMEM);
}
_debug("GOT INODE %p { ino=%lu, vl=%llx, vn=%llx, u=%x }",
inode, inode->i_ino, fid.vid, fid.vnode, fid.unique);
vnode = AFS_FS_I(inode);
/* there shouldn't be an existing inode */
BUG_ON(!(inode->i_state & I_NEW));
inode->i_size = 0;
inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO;
if (root) {
inode->i_op = &afs_dynroot_inode_operations;
inode->i_fop = &simple_dir_operations;
} else {
inode->i_op = &afs_autocell_inode_operations;
}
set_nlink(inode, 2);
inode->i_uid = GLOBAL_ROOT_UID;
inode->i_gid = GLOBAL_ROOT_GID;
inode->i_ctime = inode->i_atime = inode->i_mtime = current_time(inode);
inode->i_blocks = 0;
inode->i_generation = 0;
set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags);
if (!root) {
set_bit(AFS_VNODE_MOUNTPOINT, &vnode->flags);
inode->i_flags |= S_AUTOMOUNT;
}
inode->i_flags |= S_NOATIME;
unlock_new_inode(inode);
_leave(" = %p", inode);
return inode;
}
/*
* Probe to see if a cell may exist. This prevents positive dentries from
* being created unnecessarily.

View File

@@ -69,7 +69,7 @@ static const struct vm_operations_struct afs_vm_ops = {
*/
void afs_put_wb_key(struct afs_wb_key *wbk)
{
if (refcount_dec_and_test(&wbk->usage)) {
if (wbk && refcount_dec_and_test(&wbk->usage)) {
key_put(wbk->key);
kfree(wbk);
}
@@ -220,14 +220,35 @@ static void afs_file_readpage_read_complete(struct page *page,
}
#endif
static void afs_fetch_data_success(struct afs_operation *op)
{
struct afs_vnode *vnode = op->file[0].vnode;
_enter("op=%08x", op->debug_id);
afs_check_for_remote_deletion(op, vnode);
afs_vnode_commit_status(op, &op->file[0]);
afs_stat_v(vnode, n_fetches);
atomic_long_add(op->fetch.req->actual_len, &op->net->n_fetch_bytes);
}
static void afs_fetch_data_put(struct afs_operation *op)
{
afs_put_read(op->fetch.req);
}
static const struct afs_operation_ops afs_fetch_data_operation = {
.issue_afs_rpc = afs_fs_fetch_data,
.issue_yfs_rpc = yfs_fs_fetch_data,
.success = afs_fetch_data_success,
.put = afs_fetch_data_put,
};
/*
* Fetch file data from the volume.
*/
int afs_fetch_data(struct afs_vnode *vnode, struct key *key, struct afs_read *req)
{
struct afs_fs_cursor fc;
struct afs_status_cb *scb;
int ret;
struct afs_operation *op;
_enter("%s{%llx:%llu.%u},%x,,,",
vnode->volume->name,
@@ -236,34 +257,15 @@ int afs_fetch_data(struct afs_vnode *vnode, struct key *key, struct afs_read *re
vnode->fid.unique,
key_serial(key));
scb = kzalloc(sizeof(struct afs_status_cb), GFP_KERNEL);
if (!scb)
return -ENOMEM;
op = afs_alloc_operation(key, vnode->volume);
if (IS_ERR(op))
return PTR_ERR(op);
ret = -ERESTARTSYS;
if (afs_begin_vnode_operation(&fc, vnode, key, true)) {
afs_dataversion_t data_version = vnode->status.data_version;
afs_op_set_vnode(op, 0, vnode);
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(vnode);
afs_fs_fetch_data(&fc, scb, req);
}
afs_check_for_remote_deletion(&fc, vnode);
afs_vnode_commit_status(&fc, vnode, fc.cb_break,
&data_version, scb);
ret = afs_end_vnode_operation(&fc);
}
if (ret == 0) {
afs_stat_v(vnode, n_fetches);
atomic_long_add(req->actual_len,
&afs_v2net(vnode)->n_fetch_bytes);
}
kfree(scb);
_leave(" = %d", ret);
return ret;
op->fetch.req = afs_get_read(req);
op->ops = &afs_fetch_data_operation;
return afs_do_sync_operation(op);
}
/*

View File

@@ -70,7 +70,8 @@ static void afs_schedule_lock_extension(struct afs_vnode *vnode)
*/
void afs_lock_op_done(struct afs_call *call)
{
struct afs_vnode *vnode = call->lvnode;
struct afs_operation *op = call->op;
struct afs_vnode *vnode = op->lock.lvnode;
if (call->error == 0) {
spin_lock(&vnode->lock);
@@ -172,15 +173,28 @@ static void afs_kill_lockers_enoent(struct afs_vnode *vnode)
vnode->lock_key = NULL;
}
static void afs_lock_success(struct afs_operation *op)
{
struct afs_vnode *vnode = op->file[0].vnode;
_enter("op=%08x", op->debug_id);
afs_check_for_remote_deletion(op, vnode);
afs_vnode_commit_status(op, &op->file[0]);
}
static const struct afs_operation_ops afs_set_lock_operation = {
.issue_afs_rpc = afs_fs_set_lock,
.issue_yfs_rpc = yfs_fs_set_lock,
.success = afs_lock_success,
};
/*
* Get a lock on a file
*/
static int afs_set_lock(struct afs_vnode *vnode, struct key *key,
afs_lock_type_t type)
{
struct afs_status_cb *scb;
struct afs_fs_cursor fc;
int ret;
struct afs_operation *op;
_enter("%s{%llx:%llu.%u},%x,%u",
vnode->volume->name,
@@ -189,35 +203,29 @@ static int afs_set_lock(struct afs_vnode *vnode, struct key *key,
vnode->fid.unique,
key_serial(key), type);
scb = kzalloc(sizeof(struct afs_status_cb), GFP_KERNEL);
if (!scb)
return -ENOMEM;
op = afs_alloc_operation(key, vnode->volume);
if (IS_ERR(op))
return PTR_ERR(op);
ret = -ERESTARTSYS;
if (afs_begin_vnode_operation(&fc, vnode, key, true)) {
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(vnode);
afs_fs_set_lock(&fc, type, scb);
}
afs_op_set_vnode(op, 0, vnode);
afs_check_for_remote_deletion(&fc, vnode);
afs_vnode_commit_status(&fc, vnode, fc.cb_break, NULL, scb);
ret = afs_end_vnode_operation(&fc);
}
kfree(scb);
_leave(" = %d", ret);
return ret;
op->lock.type = type;
op->ops = &afs_set_lock_operation;
return afs_do_sync_operation(op);
}
static const struct afs_operation_ops afs_extend_lock_operation = {
.issue_afs_rpc = afs_fs_extend_lock,
.issue_yfs_rpc = yfs_fs_extend_lock,
.success = afs_lock_success,
};
/*
* Extend a lock on a file
*/
static int afs_extend_lock(struct afs_vnode *vnode, struct key *key)
{
struct afs_status_cb *scb;
struct afs_fs_cursor fc;
int ret;
struct afs_operation *op;
_enter("%s{%llx:%llu.%u},%x",
vnode->volume->name,
@@ -226,35 +234,29 @@ static int afs_extend_lock(struct afs_vnode *vnode, struct key *key)
vnode->fid.unique,
key_serial(key));
scb = kzalloc(sizeof(struct afs_status_cb), GFP_KERNEL);
if (!scb)
return -ENOMEM;
op = afs_alloc_operation(key, vnode->volume);
if (IS_ERR(op))
return PTR_ERR(op);
ret = -ERESTARTSYS;
if (afs_begin_vnode_operation(&fc, vnode, key, false)) {
while (afs_select_current_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(vnode);
afs_fs_extend_lock(&fc, scb);
}
afs_op_set_vnode(op, 0, vnode);
afs_check_for_remote_deletion(&fc, vnode);
afs_vnode_commit_status(&fc, vnode, fc.cb_break, NULL, scb);
ret = afs_end_vnode_operation(&fc);
}
kfree(scb);
_leave(" = %d", ret);
return ret;
op->flags |= AFS_OPERATION_UNINTR;
op->ops = &afs_extend_lock_operation;
return afs_do_sync_operation(op);
}
static const struct afs_operation_ops afs_release_lock_operation = {
.issue_afs_rpc = afs_fs_release_lock,
.issue_yfs_rpc = yfs_fs_release_lock,
.success = afs_lock_success,
};
/*
* Release a lock on a file
*/
static int afs_release_lock(struct afs_vnode *vnode, struct key *key)
{
struct afs_status_cb *scb;
struct afs_fs_cursor fc;
int ret;
struct afs_operation *op;
_enter("%s{%llx:%llu.%u},%x",
vnode->volume->name,
@@ -263,25 +265,15 @@ static int afs_release_lock(struct afs_vnode *vnode, struct key *key)
vnode->fid.unique,
key_serial(key));
scb = kzalloc(sizeof(struct afs_status_cb), GFP_KERNEL);
if (!scb)
return -ENOMEM;
op = afs_alloc_operation(key, vnode->volume);
if (IS_ERR(op))
return PTR_ERR(op);
ret = -ERESTARTSYS;
if (afs_begin_vnode_operation(&fc, vnode, key, false)) {
while (afs_select_current_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(vnode);
afs_fs_release_lock(&fc, scb);
}
afs_op_set_vnode(op, 0, vnode);
afs_check_for_remote_deletion(&fc, vnode);
afs_vnode_commit_status(&fc, vnode, fc.cb_break, NULL, scb);
ret = afs_end_vnode_operation(&fc);
}
kfree(scb);
_leave(" = %d", ret);
return ret;
op->flags |= AFS_OPERATION_UNINTR;
op->ops = &afs_release_lock_operation;
return afs_do_sync_operation(op);
}
/*

239
fs/afs/fs_operation.c Normal file
View File

@@ -0,0 +1,239 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/* Fileserver-directed operation handling.
*
* Copyright (C) 2020 Red Hat, Inc. All Rights Reserved.
* Written by David Howells (dhowells@redhat.com)
*/
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/fs.h>
#include "internal.h"
static atomic_t afs_operation_debug_counter;
/*
* Create an operation against a volume.
*/
struct afs_operation *afs_alloc_operation(struct key *key, struct afs_volume *volume)
{
struct afs_operation *op;
_enter("");
op = kzalloc(sizeof(*op), GFP_KERNEL);
if (!op)
return ERR_PTR(-ENOMEM);
if (!key) {
key = afs_request_key(volume->cell);
if (IS_ERR(key)) {
kfree(op);
return ERR_CAST(key);
}
} else {
key_get(key);
}
op->key = key;
op->volume = afs_get_volume(volume, afs_volume_trace_get_new_op);
op->net = volume->cell->net;
op->cb_v_break = volume->cb_v_break;
op->debug_id = atomic_inc_return(&afs_operation_debug_counter);
op->error = -EDESTADDRREQ;
op->ac.error = SHRT_MAX;
_leave(" = [op=%08x]", op->debug_id);
return op;
}
/*
* Lock the vnode(s) being operated upon.
*/
static bool afs_get_io_locks(struct afs_operation *op)
{
struct afs_vnode *vnode = op->file[0].vnode;
struct afs_vnode *vnode2 = op->file[1].vnode;
_enter("");
if (op->flags & AFS_OPERATION_UNINTR) {
mutex_lock(&vnode->io_lock);
op->flags |= AFS_OPERATION_LOCK_0;
_leave(" = t [1]");
return true;
}
if (!vnode2 || !op->file[1].need_io_lock || vnode == vnode2)
vnode2 = NULL;
if (vnode2 > vnode)
swap(vnode, vnode2);
if (mutex_lock_interruptible(&vnode->io_lock) < 0) {
op->error = -EINTR;
op->flags |= AFS_OPERATION_STOP;
_leave(" = f [I 0]");
return false;
}
op->flags |= AFS_OPERATION_LOCK_0;
if (vnode2) {
if (mutex_lock_interruptible_nested(&vnode2->io_lock, 1) < 0) {
op->error = -EINTR;
op->flags |= AFS_OPERATION_STOP;
mutex_unlock(&vnode->io_lock);
op->flags &= ~AFS_OPERATION_LOCK_0;
_leave(" = f [I 1]");
return false;
}
op->flags |= AFS_OPERATION_LOCK_1;
}
_leave(" = t [2]");
return true;
}
static void afs_drop_io_locks(struct afs_operation *op)
{
struct afs_vnode *vnode = op->file[0].vnode;
struct afs_vnode *vnode2 = op->file[1].vnode;
_enter("");
if (op->flags & AFS_OPERATION_LOCK_1)
mutex_unlock(&vnode2->io_lock);
if (op->flags & AFS_OPERATION_LOCK_0)
mutex_unlock(&vnode->io_lock);
}
static void afs_prepare_vnode(struct afs_operation *op, struct afs_vnode_param *vp,
unsigned int index)
{
struct afs_vnode *vnode = vp->vnode;
if (vnode) {
vp->fid = vnode->fid;
vp->dv_before = vnode->status.data_version;
vp->cb_break_before = afs_calc_vnode_cb_break(vnode);
if (vnode->lock_state != AFS_VNODE_LOCK_NONE)
op->flags |= AFS_OPERATION_CUR_ONLY;
}
if (vp->fid.vnode)
_debug("PREP[%u] {%llx:%llu.%u}",
index, vp->fid.vid, vp->fid.vnode, vp->fid.unique);
}
/*
* Begin an operation on the fileserver.
*
* Fileserver operations are serialised on the server by vnode, so we serialise
* them here also using the io_lock.
*/
bool afs_begin_vnode_operation(struct afs_operation *op)
{
struct afs_vnode *vnode = op->file[0].vnode;
ASSERT(vnode);
_enter("");
if (op->file[0].need_io_lock)
if (!afs_get_io_locks(op))
return false;
afs_prepare_vnode(op, &op->file[0], 0);
afs_prepare_vnode(op, &op->file[1], 1);
op->cb_v_break = op->volume->cb_v_break;
_leave(" = true");
return true;
}
/*
* Tidy up a filesystem cursor and unlock the vnode.
*/
static void afs_end_vnode_operation(struct afs_operation *op)
{
_enter("");
if (op->error == -EDESTADDRREQ ||
op->error == -EADDRNOTAVAIL ||
op->error == -ENETUNREACH ||
op->error == -EHOSTUNREACH)
afs_dump_edestaddrreq(op);
afs_drop_io_locks(op);
if (op->error == -ECONNABORTED)
op->error = afs_abort_to_error(op->ac.abort_code);
}
/*
* Wait for an in-progress operation to complete.
*/
void afs_wait_for_operation(struct afs_operation *op)
{
_enter("");
while (afs_select_fileserver(op)) {
op->cb_s_break = op->server->cb_s_break;
if (test_bit(AFS_SERVER_FL_IS_YFS, &op->server->flags) &&
op->ops->issue_yfs_rpc)
op->ops->issue_yfs_rpc(op);
else
op->ops->issue_afs_rpc(op);
op->error = afs_wait_for_call_to_complete(op->call, &op->ac);
}
if (op->error == 0) {
_debug("success");
op->ops->success(op);
}
afs_end_vnode_operation(op);
if (op->error == 0 && op->ops->edit_dir) {
_debug("edit_dir");
op->ops->edit_dir(op);
}
_leave("");
}
/*
* Dispose of an operation.
*/
int afs_put_operation(struct afs_operation *op)
{
int i, ret = op->error;
_enter("op=%08x,%d", op->debug_id, ret);
if (op->ops && op->ops->put)
op->ops->put(op);
if (op->file[0].put_vnode)
iput(&op->file[0].vnode->vfs_inode);
if (op->file[1].put_vnode)
iput(&op->file[1].vnode->vfs_inode);
if (op->more_files) {
for (i = 0; i < op->nr_files - 2; i++)
if (op->more_files[i].put_vnode)
iput(&op->more_files[i].vnode->vfs_inode);
kfree(op->more_files);
}
afs_end_cursor(&op->ac);
afs_put_serverlist(op->net, op->server_list);
afs_put_volume(op->net, op->volume, afs_volume_trace_put_put_op);
kfree(op);
return ret;
}
int afs_do_sync_operation(struct afs_operation *op)
{
afs_begin_vnode_operation(op);
afs_wait_for_operation(op);
return afs_put_operation(op);
}

View File

@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/* AFS fileserver probing
*
* Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
* Copyright (C) 2018, 2020 Red Hat, Inc. All Rights Reserved.
* Written by David Howells (dhowells@redhat.com)
*/
@@ -11,15 +11,86 @@
#include "internal.h"
#include "protocol_yfs.h"
static bool afs_fs_probe_done(struct afs_server *server)
{
if (!atomic_dec_and_test(&server->probe_outstanding))
return false;
static unsigned int afs_fs_probe_fast_poll_interval = 30 * HZ;
static unsigned int afs_fs_probe_slow_poll_interval = 5 * 60 * HZ;
wake_up_var(&server->probe_outstanding);
clear_bit_unlock(AFS_SERVER_FL_PROBING, &server->flags);
wake_up_bit(&server->flags, AFS_SERVER_FL_PROBING);
return true;
/*
* Start the probe polling timer. We have to supply it with an inc on the
* outstanding server count.
*/
static void afs_schedule_fs_probe(struct afs_net *net,
struct afs_server *server, bool fast)
{
unsigned long atj;
if (!net->live)
return;
atj = server->probed_at;
atj += fast ? afs_fs_probe_fast_poll_interval : afs_fs_probe_slow_poll_interval;
afs_inc_servers_outstanding(net);
if (timer_reduce(&net->fs_probe_timer, atj))
afs_dec_servers_outstanding(net);
}
/*
* Handle the completion of a set of probes.
*/
static void afs_finished_fs_probe(struct afs_net *net, struct afs_server *server)
{
bool responded = server->probe.responded;
write_seqlock(&net->fs_lock);
if (responded) {
list_add_tail(&server->probe_link, &net->fs_probe_slow);
} else {
server->rtt = UINT_MAX;
clear_bit(AFS_SERVER_FL_RESPONDING, &server->flags);
list_add_tail(&server->probe_link, &net->fs_probe_fast);
}
write_sequnlock(&net->fs_lock);
afs_schedule_fs_probe(net, server, !responded);
}
/*
* Handle the completion of a probe.
*/
static void afs_done_one_fs_probe(struct afs_net *net, struct afs_server *server)
{
_enter("");
if (atomic_dec_and_test(&server->probe_outstanding))
afs_finished_fs_probe(net, server);
wake_up_all(&server->probe_wq);
}
/*
* Handle inability to send a probe due to ENOMEM when trying to allocate a
* call struct.
*/
static void afs_fs_probe_not_done(struct afs_net *net,
struct afs_server *server,
struct afs_addr_cursor *ac)
{
struct afs_addr_list *alist = ac->alist;
unsigned int index = ac->index;
_enter("");
trace_afs_io_error(0, -ENOMEM, afs_io_error_fs_probe_fail);
spin_lock(&server->probe_lock);
server->probe.local_failure = true;
if (server->probe.error == 0)
server->probe.error = -ENOMEM;
set_bit(index, &alist->failed);
spin_unlock(&server->probe_lock);
return afs_done_one_fs_probe(net, server);
}
/*
@@ -30,10 +101,8 @@ void afs_fileserver_probe_result(struct afs_call *call)
{
struct afs_addr_list *alist = call->alist;
struct afs_server *server = call->server;
unsigned int server_index = call->server_index;
unsigned int index = call->addr_ix;
unsigned int rtt_us = 0;
bool have_result = false;
int ret = call->error;
_enter("%pU,%u", &server->uuid, index);
@@ -52,8 +121,9 @@ void afs_fileserver_probe_result(struct afs_call *call)
goto responded;
case -ENOMEM:
case -ENONET:
clear_bit(index, &alist->responded);
server->probe.local_failure = true;
afs_io_error(call, afs_io_error_fs_probe_fail);
trace_afs_io_error(call->debug_id, ret, afs_io_error_fs_probe_fail);
goto out;
case -ECONNRESET: /* Responded, but call expired. */
case -ERFKILL:
@@ -72,12 +142,11 @@ void afs_fileserver_probe_result(struct afs_call *call)
server->probe.error == -ETIMEDOUT ||
server->probe.error == -ETIME))
server->probe.error = ret;
afs_io_error(call, afs_io_error_fs_probe_fail);
trace_afs_io_error(call->debug_id, ret, afs_io_error_fs_probe_fail);
goto out;
}
responded:
set_bit(index, &alist->responded);
clear_bit(index, &alist->failed);
if (call->service_id == YFS_FS_SERVICE) {
@@ -95,39 +164,34 @@ responded:
rtt_us = rxrpc_kernel_get_srtt(call->net->socket, call->rxcall);
if (rtt_us < server->probe.rtt) {
server->probe.rtt = rtt_us;
server->rtt = rtt_us;
alist->preferred = index;
have_result = true;
}
smp_wmb(); /* Set rtt before responded. */
server->probe.responded = true;
set_bit(AFS_SERVER_FL_PROBED, &server->flags);
set_bit(index, &alist->responded);
set_bit(AFS_SERVER_FL_RESPONDING, &server->flags);
out:
spin_unlock(&server->probe_lock);
_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
server_index, index, &alist->addrs[index].transport, rtt_us, ret);
_debug("probe %pU [%u] %pISpc rtt=%u ret=%d",
&server->uuid, index, &alist->addrs[index].transport,
rtt_us, ret);
have_result |= afs_fs_probe_done(server);
if (have_result)
wake_up_all(&server->probe_wq);
return afs_done_one_fs_probe(call->net, server);
}
/*
* Probe all of a fileserver's addresses to find out the best route and to
* query its capabilities.
* Probe one or all of a fileserver's addresses to find out the best route and
* to query its capabilities.
*/
static int afs_do_probe_fileserver(struct afs_net *net,
struct afs_server *server,
struct key *key,
unsigned int server_index,
struct afs_error *_e)
void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server,
struct key *key, bool all)
{
struct afs_addr_cursor ac = {
.index = 0,
};
struct afs_call *call;
bool in_progress = false;
_enter("%pU", &server->uuid);
@@ -137,50 +201,25 @@ static int afs_do_probe_fileserver(struct afs_net *net,
afs_get_addrlist(ac.alist);
read_unlock(&server->fs_lock);
atomic_set(&server->probe_outstanding, ac.alist->nr_addrs);
server->probed_at = jiffies;
atomic_set(&server->probe_outstanding, all ? ac.alist->nr_addrs : 1);
memset(&server->probe, 0, sizeof(server->probe));
server->probe.rtt = UINT_MAX;
for (ac.index = 0; ac.index < ac.alist->nr_addrs; ac.index++) {
call = afs_fs_get_capabilities(net, server, &ac, key, server_index);
if (!IS_ERR(call)) {
afs_put_call(call);
in_progress = true;
} else {
afs_prioritise_error(_e, PTR_ERR(call), ac.abort_code);
}
ac.index = ac.alist->preferred;
if (ac.index < 0 || ac.index >= ac.alist->nr_addrs)
all = true;
if (all) {
for (ac.index = 0; ac.index < ac.alist->nr_addrs; ac.index++)
if (!afs_fs_get_capabilities(net, server, &ac, key))
afs_fs_probe_not_done(net, server, &ac);
} else {
if (!afs_fs_get_capabilities(net, server, &ac, key))
afs_fs_probe_not_done(net, server, &ac);
}
if (!in_progress)
afs_fs_probe_done(server);
afs_put_addrlist(ac.alist);
return in_progress;
}
/*
* Send off probes to all unprobed servers.
*/
int afs_probe_fileservers(struct afs_net *net, struct key *key,
struct afs_server_list *list)
{
struct afs_server *server;
struct afs_error e;
bool in_progress = false;
int i;
e.error = 0;
e.responded = false;
for (i = 0; i < list->nr_servers; i++) {
server = list->servers[i].server;
if (test_bit(AFS_SERVER_FL_PROBED, &server->flags))
continue;
if (!test_and_set_bit_lock(AFS_SERVER_FL_PROBING, &server->flags) &&
afs_do_probe_fileserver(net, server, key, i, &e))
in_progress = true;
}
return in_progress ? 0 : e.error;
}
/*
@@ -190,7 +229,7 @@ int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
{
struct wait_queue_entry *waits;
struct afs_server *server;
unsigned int rtt = UINT_MAX;
unsigned int rtt = UINT_MAX, rtt_s;
bool have_responders = false;
int pref = -1, i;
@@ -200,7 +239,7 @@ int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
for (i = 0; i < slist->nr_servers; i++) {
if (test_bit(i, &untried)) {
server = slist->servers[i].server;
if (!test_bit(AFS_SERVER_FL_PROBING, &server->flags))
if (!atomic_read(&server->probe_outstanding))
__clear_bit(i, &untried);
if (server->probe.responded)
have_responders = true;
@@ -230,7 +269,7 @@ int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
server = slist->servers[i].server;
if (server->probe.responded)
goto stop;
if (test_bit(AFS_SERVER_FL_PROBING, &server->flags))
if (atomic_read(&server->probe_outstanding))
still_probing = true;
}
}
@@ -246,10 +285,11 @@ stop:
for (i = 0; i < slist->nr_servers; i++) {
if (test_bit(i, &untried)) {
server = slist->servers[i].server;
if (server->probe.responded &&
server->probe.rtt < rtt) {
rtt_s = READ_ONCE(server->rtt);
if (test_bit(AFS_SERVER_FL_RESPONDING, &server->flags) &&
rtt_s < rtt) {
pref = i;
rtt = server->probe.rtt;
rtt = rtt_s;
}
remove_wait_queue(&server->probe_wq, &waits[i]);
@@ -265,3 +305,156 @@ stop:
slist->preferred = pref;
return 0;
}
/*
* Probe timer. We have an increment on fs_outstanding that we need to pass
* along to the work item.
*/
void afs_fs_probe_timer(struct timer_list *timer)
{
struct afs_net *net = container_of(timer, struct afs_net, fs_probe_timer);
if (!queue_work(afs_wq, &net->fs_prober))
afs_dec_servers_outstanding(net);
}
/*
* Dispatch a probe to a server.
*/
static void afs_dispatch_fs_probe(struct afs_net *net, struct afs_server *server, bool all)
__releases(&net->fs_lock)
{
struct key *key = NULL;
/* We remove it from the queues here - it will be added back to
* one of the queues on the completion of the probe.
*/
list_del_init(&server->probe_link);
afs_get_server(server, afs_server_trace_get_probe);
write_sequnlock(&net->fs_lock);
afs_fs_probe_fileserver(net, server, key, all);
afs_put_server(net, server, afs_server_trace_put_probe);
}
/*
* Probe a server immediately without waiting for its due time to come
* round. This is used when all of the addresses have been tried.
*/
void afs_probe_fileserver(struct afs_net *net, struct afs_server *server)
{
write_seqlock(&net->fs_lock);
if (!list_empty(&server->probe_link))
return afs_dispatch_fs_probe(net, server, true);
write_sequnlock(&net->fs_lock);
}
/*
* Probe dispatcher to regularly dispatch probes to keep NAT alive.
*/
void afs_fs_probe_dispatcher(struct work_struct *work)
{
struct afs_net *net = container_of(work, struct afs_net, fs_prober);
struct afs_server *fast, *slow, *server;
unsigned long nowj, timer_at, poll_at;
bool first_pass = true, set_timer = false;
if (!net->live)
return;
_enter("");
if (list_empty(&net->fs_probe_fast) && list_empty(&net->fs_probe_slow)) {
_leave(" [none]");
return;
}
again:
write_seqlock(&net->fs_lock);
fast = slow = server = NULL;
nowj = jiffies;
timer_at = nowj + MAX_JIFFY_OFFSET;
if (!list_empty(&net->fs_probe_fast)) {
fast = list_first_entry(&net->fs_probe_fast, struct afs_server, probe_link);
poll_at = fast->probed_at + afs_fs_probe_fast_poll_interval;
if (time_before(nowj, poll_at)) {
timer_at = poll_at;
set_timer = true;
fast = NULL;
}
}
if (!list_empty(&net->fs_probe_slow)) {
slow = list_first_entry(&net->fs_probe_slow, struct afs_server, probe_link);
poll_at = slow->probed_at + afs_fs_probe_slow_poll_interval;
if (time_before(nowj, poll_at)) {
if (time_before(poll_at, timer_at))
timer_at = poll_at;
set_timer = true;
slow = NULL;
}
}
server = fast ?: slow;
if (server)
_debug("probe %pU", &server->uuid);
if (server && (first_pass || !need_resched())) {
afs_dispatch_fs_probe(net, server, server == fast);
first_pass = false;
goto again;
}
write_sequnlock(&net->fs_lock);
if (server) {
if (!queue_work(afs_wq, &net->fs_prober))
afs_dec_servers_outstanding(net);
_leave(" [requeue]");
} else if (set_timer) {
if (timer_reduce(&net->fs_probe_timer, timer_at))
afs_dec_servers_outstanding(net);
_leave(" [timer]");
} else {
afs_dec_servers_outstanding(net);
_leave(" [quiesce]");
}
}
/*
* Wait for a probe on a particular fileserver to complete for 2s.
*/
int afs_wait_for_one_fs_probe(struct afs_server *server, bool is_intr)
{
struct wait_queue_entry wait;
unsigned long timo = 2 * HZ;
if (atomic_read(&server->probe_outstanding) == 0)
goto dont_wait;
init_wait_entry(&wait, 0);
for (;;) {
prepare_to_wait_event(&server->probe_wq, &wait,
is_intr ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
if (timo == 0 ||
server->probe.responded ||
atomic_read(&server->probe_outstanding) == 0 ||
(is_intr && signal_pending(current)))
break;
timo = schedule_timeout(timo);
}
finish_wait(&server->probe_wq, &wait);
dont_wait:
if (server->probe.responded)
return 0;
if (is_intr && signal_pending(current))
return -ERESTARTSYS;
if (timo == 0)
return -ETIME;
return -EDESTADDRREQ;
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -82,12 +82,14 @@ static int __net_init afs_net_init(struct net *net_ns)
INIT_WORK(&net->cells_manager, afs_manage_cells);
timer_setup(&net->cells_timer, afs_cells_timer, 0);
mutex_init(&net->cells_alias_lock);
mutex_init(&net->proc_cells_lock);
INIT_HLIST_HEAD(&net->proc_cells);
seqlock_init(&net->fs_lock);
net->fs_servers = RB_ROOT;
INIT_LIST_HEAD(&net->fs_updates);
INIT_LIST_HEAD(&net->fs_probe_fast);
INIT_LIST_HEAD(&net->fs_probe_slow);
INIT_HLIST_HEAD(&net->fs_proc);
INIT_HLIST_HEAD(&net->fs_addresses4);
@@ -96,6 +98,8 @@ static int __net_init afs_net_init(struct net *net_ns)
INIT_WORK(&net->fs_manager, afs_manage_servers);
timer_setup(&net->fs_timer, afs_servers_timer, 0);
INIT_WORK(&net->fs_prober, afs_fs_probe_dispatcher);
timer_setup(&net->fs_probe_timer, afs_fs_probe_timer, 0);
ret = -ENOMEM;
sysnames = kzalloc(sizeof(*sysnames), GFP_KERNEL);

View File

@@ -38,7 +38,7 @@ static int afs_proc_cells_show(struct seq_file *m, void *v)
if (v == SEQ_START_TOKEN) {
/* display header on line 1 */
seq_puts(m, "USE TTL SV NAME\n");
seq_puts(m, "USE TTL SV ST NAME\n");
return 0;
}
@@ -46,10 +46,11 @@ static int afs_proc_cells_show(struct seq_file *m, void *v)
vllist = rcu_dereference(cell->vl_servers);
/* display one cell per line on subsequent lines */
seq_printf(m, "%3u %6lld %2u %s\n",
seq_printf(m, "%3u %6lld %2u %2u %s\n",
atomic_read(&cell->usage),
cell->dns_expiry - ktime_get_real_seconds(),
vllist->nr_servers,
cell->state,
cell->name);
return 0;
}
@@ -208,11 +209,10 @@ static const char afs_vol_types[3][3] = {
*/
static int afs_proc_cell_volumes_show(struct seq_file *m, void *v)
{
struct afs_cell *cell = PDE_DATA(file_inode(m->file));
struct afs_volume *vol = list_entry(v, struct afs_volume, proc_link);
struct afs_volume *vol = hlist_entry(v, struct afs_volume, proc_link);
/* Display header on line 1 */
if (v == &cell->proc_volumes) {
if (v == SEQ_START_TOKEN) {
seq_puts(m, "USE VID TY NAME\n");
return 0;
}
@@ -230,8 +230,8 @@ static void *afs_proc_cell_volumes_start(struct seq_file *m, loff_t *_pos)
{
struct afs_cell *cell = PDE_DATA(file_inode(m->file));
read_lock(&cell->proc_lock);
return seq_list_start_head(&cell->proc_volumes, *_pos);
rcu_read_lock();
return seq_hlist_start_head_rcu(&cell->proc_volumes, *_pos);
}
static void *afs_proc_cell_volumes_next(struct seq_file *m, void *v,
@@ -239,15 +239,13 @@ static void *afs_proc_cell_volumes_next(struct seq_file *m, void *v,
{
struct afs_cell *cell = PDE_DATA(file_inode(m->file));
return seq_list_next(v, &cell->proc_volumes, _pos);
return seq_hlist_next_rcu(v, &cell->proc_volumes, _pos);
}
static void afs_proc_cell_volumes_stop(struct seq_file *m, void *v)
__releases(cell->proc_lock)
{
struct afs_cell *cell = PDE_DATA(file_inode(m->file));
read_unlock(&cell->proc_lock);
rcu_read_unlock();
}
static const struct seq_operations afs_proc_cell_volumes_ops = {
@@ -378,20 +376,26 @@ static int afs_proc_servers_show(struct seq_file *m, void *v)
int i;
if (v == SEQ_START_TOKEN) {
seq_puts(m, "UUID USE ADDR\n");
seq_puts(m, "UUID REF ACT\n");
return 0;
}
server = list_entry(v, struct afs_server, proc_link);
alist = rcu_dereference(server->addresses);
seq_printf(m, "%pU %3d %pISpc%s\n",
seq_printf(m, "%pU %3d %3d\n",
&server->uuid,
atomic_read(&server->usage),
&alist->addrs[0].transport,
alist->preferred == 0 ? "*" : "");
for (i = 1; i < alist->nr_addrs; i++)
seq_printf(m, " %pISpc%s\n",
&alist->addrs[i].transport,
atomic_read(&server->ref),
atomic_read(&server->active));
seq_printf(m, " - info: fl=%lx rtt=%u brk=%x\n",
server->flags, server->rtt, server->cb_s_break);
seq_printf(m, " - probe: last=%d out=%d\n",
(int)(jiffies - server->probed_at) / HZ,
atomic_read(&server->probe_outstanding));
seq_printf(m, " - ALIST v=%u rsp=%lx f=%lx\n",
alist->version, alist->responded, alist->failed);
for (i = 0; i < alist->nr_addrs; i++)
seq_printf(m, " [%x] %pISpc%s\n",
i, &alist->addrs[i].transport,
alist->preferred == i ? "*" : "");
return 0;
}

View File

@@ -8,7 +8,7 @@
#define YFS_FS_SERVICE 2500
#define YFS_CM_SERVICE 2501
#define YFSCBMAX 1024
#define YFSCBMAX 1024
enum YFS_CM_Operations {
YFSCBProbe = 206, /* probe client */

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff Show More