Commit Graph

348 Commits

Author SHA1 Message Date
David Teigland
96006ea6d4 dlm: fix missing dir remove
I don't know exactly how, but in some cases, a dir
record is not removed, or a new one is created when
it shouldn't be.  The result is that the dir node
lookup returns a master node where the rsb does not
exist.  In this case, the master node will repeatedly
return -EBADR for requests, and the lock requests will
be stuck.

Until all possible ways for this to happen can be
eliminated, a simple and effective way to recover from
this situation is for the supposed master node to send
a standard remove message to the dir node when it
receives a request for a resource it has no rsb for.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:24:43 -05:00
David Teigland
c503a62103 dlm: fix conversion deadlock from recovery
The process of rebuilding locks on a new master during
recovery could re-order the locks on the convert queue,
creating an "in place" conversion deadlock that would
not be resolved.  Fix this by not considering queue
order when granting conversions after recovery.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:18:22 -05:00
David Teigland
6d768177c2 dlm: use wait_event_timeout
Use wait_event_timeout to avoid using a timer
directly.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:18:12 -05:00
David Teigland
05c32f47bf dlm: fix race between remove and lookup
It was possible for a remove message on an old
rsb to be sent after a lookup message on a new
rsb, where the rsbs were for the same resource
name.  This could lead to a missing directory
entry for the new rsb.

It is fixed by keeping a copy of the resource
name being removed until after the remove has
been sent.  A lookup checks if this in-progress
remove matches the name it is looking up.
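
The check described above can be sketched in user-space C roughly as
follows.  This is only an illustration, not the kernel code: the struct
layout, the NAME_MAX_LEN limit and all function names are hypothetical,
and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 64	/* hypothetical limit for this sketch */

/* Hypothetical per-lockspace record of the remove currently in flight. */
struct lockspace {
	char   remove_name[NAME_MAX_LEN];
	size_t remove_len;	/* 0 means no remove is in progress */
};

/* Save a copy of the name before the remove message is sent. */
static bool remove_begin(struct lockspace *ls, const char *name, size_t len)
{
	assert(ls != NULL && name != NULL);
	if (len == 0 || len > NAME_MAX_LEN)
		return false;
	memcpy(ls->remove_name, name, len);
	ls->remove_len = len;
	return true;
}

/* Forget the name once the remove message has been sent. */
static void remove_end(struct lockspace *ls)
{
	assert(ls != NULL);
	ls->remove_len = 0;
}

/* A lookup asks this first; true means the same name is being removed. */
static bool lookup_must_wait(const struct lockspace *ls,
			     const char *name, size_t len)
{
	assert(ls != NULL && name != NULL);
	return len != 0 && ls->remove_len == len &&
	       memcmp(ls->remove_name, name, len) == 0;
}
```

A lookup that gets true would wait and retry, so its message cannot
overtake the pending remove for the same resource name.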

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:18:01 -05:00
David Teigland
1d7c484eeb dlm: use idr instead of list for recovered rsbs
When a large number of resources are being recovered,
a linear search of the recover_list takes a long time.
Use an idr in place of a list.
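
As a rough illustration of why an id-indexed structure helps, the
sketch below replaces a list walk with a direct slot lookup.  It is a
simplified user-space stand-in, not the kernel idr (which is not a flat
array); MAX_IDS, struct id_table and the function names are assumptions
made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 64	/* hypothetical fixed capacity */

struct rsb {
	int id;
	const char *name;
};

/* Flat table indexed by id; slot 0 is left unused. */
struct id_table {
	struct rsb *slot[MAX_IDS];
};

/* Store r in the lowest free slot and return its id, or -1 if full. */
static int id_table_add(struct id_table *t, struct rsb *r)
{
	assert(t != NULL && r != NULL);
	for (int id = 1; id < MAX_IDS; id++) {
		if (!t->slot[id]) {
			t->slot[id] = r;
			r->id = id;
			return id;
		}
	}
	return -1;
}

/* Lookup by id without walking a list; NULL if absent or out of range. */
static struct rsb *id_table_find(const struct id_table *t, int id)
{
	assert(t != NULL);
	if (id < 1 || id >= MAX_IDS)
		return NULL;
	return t->slot[id];
}
```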

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:17:52 -05:00
David Teigland
c04fecb4d9 dlm: use rsbtbl as resource directory
Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  The dirtbl has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-07-16 14:16:19 -05:00
Dan Carpenter
75af271ed5 dlm: NULL dereference on failure in kmem_cache_create()
We aren't allowed to pass NULL pointers to kmem_cache_destroy() so if
both allocations fail, it leads to a NULL dereference.
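
The failure path can be illustrated with a user-space analogue.
cache_create() and cache_destroy() below are hypothetical stand-ins for
the kernel calls, the fail flags exist only to force the error paths,
and the cache sizes are arbitrary.  The point is that the destroy call
is reached only for a cache that was actually created.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct cache {
	size_t obj_size;
};

/* Stand-in for kmem_cache_create(); 'fail' forces an allocation failure. */
static struct cache *cache_create(size_t obj_size, int fail)
{
	struct cache *c;

	if (fail)
		return NULL;
	c = malloc(sizeof(*c));
	if (c)
		c->obj_size = obj_size;
	return c;
}

/* Stand-in for a destroy function that must never be handed NULL. */
static void cache_destroy(struct cache *c)
{
	assert(c != NULL);
	free(c);
}

static struct cache *lkb_cache, *rsb_cache;

/* Create both caches; on failure undo only what was created. */
static int memory_init(int fail_lkb, int fail_rsb)
{
	lkb_cache = cache_create(64, fail_lkb);
	if (!lkb_cache)
		return -ENOMEM;		/* nothing to clean up yet */

	rsb_cache = cache_create(128, fail_rsb);
	if (!rsb_cache) {
		cache_destroy(lkb_cache);	/* known to be non-NULL here */
		lkb_cache = NULL;
		return -ENOMEM;
	}
	return 0;
}
```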

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2012-05-15 10:39:28 -05:00
David Teigland
4875647a08 dlm: fixes for nodir mode
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progress locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-05-02 14:15:27 -05:00
David Teigland
6d40c4a708 dlm: improve error and debug messages
Change some existing error/debug messages to
collect more useful information, and add
some new error/debug messages to address
recently found problems.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-04-26 15:41:46 -05:00
David Teigland
57638bf3aa dlm: avoid unnecessary search in search_rsb
If the rsb is found in the "keep" tree, but is
not the right type (i.e. not MASTER), we can
return immediately with the result.  There's
no point in going on to search the "toss" list
as if we hadn't found it.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-04-26 15:37:56 -05:00
David Teigland
d6e24788d2 dlm: limit rcom debug messages
Unify the checking for both types of ignored
rcom messages, and replace the two log_debug
statements with a single, rate limited debug
message.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-04-26 15:37:37 -05:00
David Teigland
13ef11110f dlm: fix waiter recovery
An outstanding remote operation (an lkb on the "waiter"
list) could sometimes miss being resent during recovery.
The decision was based on the lkb_nodeid field, which
could have changed during an earlier aborted recovery,
so it no longer represents the actual remote destination.
The lkb_wait_nodeid is always the actual remote node,
so it is the best value to use.
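
A minimal sketch of the decision, assuming a reduced struct with only
the two fields named above.  The struct layout, the helper name and the
single failed_nodeid argument are simplifications for illustration; the
real recovery code considers more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for an lkb; only the two node ids matter here. */
struct lkb {
	int nodeid;		/* may be stale after an aborted recovery */
	int wait_nodeid;	/* node the outstanding message was sent to */
};

/* Resend if the node we are actually waiting on is the one that failed. */
static bool waiter_needs_resend(const struct lkb *lkb, int failed_nodeid)
{
	assert(lkb != NULL);
	return lkb->wait_nodeid == failed_nodeid;
}
```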

Signed-off-by: David Teigland <teigland@redhat.com>
2012-04-26 15:36:04 -05:00
David Teigland
513ef596d4 dlm: prevent connections during shutdown
During lowcomms shutdown, a new connection could possibly
be created, and attempt to use a workqueue that's been
destroyed.  Similarly, during startup, a new connection
could attempt to use a workqueue that's not been set up
yet.  Add a global variable to indicate when new connections
are allowed.

Based on patch by: Christine Caulfield <ccaulfie@redhat.com>

Reported-by: dann frazier <dann.frazier@canonical.com>
Reviewed-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2012-04-26 15:35:38 -05:00
Linus Torvalds
721b024bd4 Merge tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm fixes from David Teigland:
 "This includes one short patch fixing the behavior of the QUECVT flag,
  which the gfs2 folks are waiting on."

* tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: fix QUECVT when convert queue is empty
2012-04-23 18:22:42 -07:00
David Teigland
53ad1c980d dlm: fix QUECVT when convert queue is empty
The QUECVT flag should not prevent conversions from
being granted immediately when the convert queue is
empty.
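
The rule can be sketched as a small predicate.  This is a
simplification under stated assumptions: 'compatible' stands for the
result of the usual mode compatibility check, the function name is
hypothetical, and every other grant rule is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a conversion may be granted immediately.
 * compatible:        requested mode is compatible with granted locks
 * quecvt:            the request carries the QUECVT flag
 * convert_queue_len: number of conversions already queued
 */
static bool can_grant_now(bool compatible, bool quecvt, int convert_queue_len)
{
	assert(convert_queue_len >= 0);
	if (!compatible)
		return false;
	/* QUECVT only forces queueing behind conversions that exist. */
	if (quecvt && convert_queue_len > 0)
		return false;
	return true;
}
```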

Signed-off-by: David Teigland <teigland@redhat.com>
2012-04-23 11:30:59 -05:00
Stephen Boyd
234e340582 simple_open: automatically convert to simple_open()
Many users of debugfs copy the implementation of default_open() when
they want to support a custom read/write function op.  This leads to a
proliferation of the default_open() implementation across the entire
tree.

Now that the common implementation has been consolidated into libfs we
can replace all the users of this function with simple_open().

This replacement was done with the following semantic patch:

<smpl>
@ open @
identifier open_f != simple_open;
identifier i, f;
@@
-int open_f(struct inode *i, struct file *f)
-{
(
-if (i->i_private)
-f->private_data = i->i_private;
|
-f->private_data = i->i_private;
)
-return 0;
-}

@ has_open depends on open @
identifier fops;
identifier open.open_f;
@@
struct file_operations fops = {
...
-.open = open_f,
+.open = simple_open,
...
};
</smpl>
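
For reference, the behaviour that the first branch of the removed
open_f() implements can be sketched in user space as below.  The two
structs are minimal stand-ins for the kernel types with only the fields
used here, and the function is named simple_open_sketch to make clear
it is an illustration rather than the libfs source.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types. */
struct inode {
	void *i_private;
};

struct file {
	void *private_data;
};

/* Copy the inode's private pointer into the file, if one is set. */
static int simple_open_sketch(struct inode *inode, struct file *file)
{
	assert(inode != NULL && file != NULL);
	if (inode->i_private)
		file->private_data = inode->i_private;
	return 0;
}
```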

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05 15:25:50 -07:00
Linus Torvalds
30d73f3752 Merge tag 'dlm-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates for 3.4 from David Teigland:
 "This set includes one trivial fix, and one simple recovery speed up.
  Directory recovery can use the standard hash table to find resources
  rather than always searching the linear recovery list."

* tag 'dlm-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: last element of dlm_local_addr[] never used
  dlm: fix slow rsb search in dir recovery
2012-03-21 13:54:22 -07:00
David Teigland
1b189b8889 dlm: last element of dlm_local_addr[] never used
The last element of dlm_local_addr[DLM_MAX_ADDR_COUNT]
was not used because the loop ended at COUNT - 1.
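
The off-by-one can be reproduced with a small stand-alone example.
MAX_ADDR_COUNT and fill_addrs() are hypothetical names for this sketch;
passing COUNT - 1 as the limit mimics the old loop bound, and passing
COUNT mimics the fix.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ADDR_COUNT 3	/* hypothetical small value for the sketch */

/* Copy up to 'limit' entries from src into dst; return how many were stored. */
static int fill_addrs(int *dst, int limit, const int *src, int nsrc)
{
	int n = 0;

	assert(dst != NULL && src != NULL);
	for (int i = 0; i < limit && i < nsrc; i++)
		dst[n++] = src[i];
	return n;
}
```

With limit set to MAX_ADDR_COUNT - 1 the last array element is never
written, even when more source entries are available.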

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2012-03-21 09:18:34 -05:00
Benjamin Poirier
2f2d76cc3e dlm: Do not allocate a fd for peeloff
Avoid allocating an fd that (a) propagates to every kernel thread and
usermodehelper and (b) is not properly released.

References: http://article.gmane.org/gmane.linux.network.drbd/22529
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08 13:52:09 -08:00
David Teigland
7210cb7a72 dlm: fix slow rsb search in dir recovery
The function used to find an rsb during directory
recovery was searching the single linear list of
rsb's.  This wasted a lot of time compared to
using the standard hash table to find the rsb.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-03-08 14:46:30 -06:00
Linus Torvalds
49d41bae46 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: add recovery callbacks
  dlm: add node slots and generation
  dlm: move recovery barrier calls
  dlm: convert rsb list to rb_tree
2012-01-10 14:55:55 -08:00
David Teigland
60f98d1839 dlm: add recovery callbacks
These new callbacks notify the dlm user about lock recovery.
GFS2, and possibly others, need to be aware of when the dlm
will be doing lock recovery for a failed lockspace member.

In the past, this coordination has been done between dlm and
file system daemons in userspace, which then direct their
kernel counterparts.  These callbacks allow the same
coordination directly, and more simply.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-01-04 08:56:31 -06:00
David Teigland
757a427196 dlm: add node slots and generation
Slot numbers are assigned to nodes when they join the lockspace.
The slot number chosen is the minimum unused value starting at 1.
Once a node is assigned a slot, that slot number will not change
while the node remains a lockspace member.  If the node leaves
and rejoins it can be assigned a new slot number.
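
The assignment rule (lowest unused value, starting at 1) can be
sketched as follows.  MAX_SLOTS, the 'used' array and both function
names are assumptions for this example and do not reflect how the dlm
stores slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 8	/* hypothetical upper bound */

/* used[i] is true while slot i is taken; index 0 is never assigned. */
static int assign_slot(bool used[MAX_SLOTS + 1])
{
	assert(used != NULL);
	for (int slot = 1; slot <= MAX_SLOTS; slot++) {
		if (!used[slot]) {
			used[slot] = true;
			return slot;
		}
	}
	return 0;	/* no free slot */
}

/* Called when a node leaves the lockspace. */
static void release_slot(bool used[MAX_SLOTS + 1], int slot)
{
	assert(used != NULL && slot >= 1 && slot <= MAX_SLOTS);
	used[slot] = false;
}
```

A node that leaves and rejoins simply gets whatever the lowest free
slot is at that time, which may differ from its previous one.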

A new generation number is also added to a lockspace.  It is
set and incremented during each recovery along with the slot
collection/assignment.

The slot numbers will be passed to gfs2 which will use them as
journal id's.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-01-04 08:55:57 -06:00
David Teigland
f95a34c665 dlm: move recovery barrier calls
Put all the calls to recovery barriers in the same function
to clarify where they each happen.  This should not change any behavior.
Also modify some recovery debug lines to make them consistent.

Signed-off-by: David Teigland <teigland@redhat.com>
2012-01-04 08:53:27 -06:00
Alexey Dobriyan
4e3fd7a06d net: remove ipv6_addr_copy()
C assignment can handle struct in6_addr copying.
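
A short demonstration of the point, using a hypothetical 16-byte struct
(addr128) as a stand-in for struct in6_addr so the example builds
without network headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a 128-bit address held in a struct. */
struct addr128 {
	unsigned char bytes[16];
};

/* Plain struct assignment copies every byte; no helper is needed. */
static void copy_addr(struct addr128 *dst, const struct addr128 *src)
{
	assert(dst != NULL && src != NULL);
	*dst = *src;
}
```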

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22 16:43:32 -05:00