Commit Graph

3248 Commits

Author SHA1 Message Date
Philipp Reisner 07fc96197a drbd: Do not check aspects that are not subject to change in _conn_requests_state()
When _conn_requests_state() is used to change other parts of the state
than the connection, do not check for a valid connection transition.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:23 +01:00
Philipp Reisner 892fdd1aee drbd: Improve readability of IO resuming after freeze due to no data access
The previous way of doing the state change was also okay since the
state change on the susp flag gets propagated from the mdev
to the tconn.

Fortunately all this goes away in drbd-9.0

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:22 +01:00
Philipp Reisner 88f79ec4ae drbd: Fix IO resuming after connection was established while executing the fence handler
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:22 +01:00
Lars Ellenberg b792b655cd drbd: fix potential list_add corruption
If the md_sync_timer triggers a second time,
while the work queued during the first time is still pending,
this could result in list_add() of an already added item,
and corrupt the work item list.

This likely only triggered because of the erroneous
batch-dequeueing of work items fixed with
  drbd: dequeue single work items in wait_for_work()

Still, skip queueing if md_sync_work is already queued.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:21 +01:00
Lars Ellenberg bc317a9ecd drbd: dequeue single work items in wait_for_work()
As long as we still use drbd_queue_work_front(),
we must only dequeue the single first item during normal operation.

The comment in drbd_worker() even says so,
but bc8a5a1 drbd: remove struct drbd_tl_epoch objects (barrier works)
introduced the batch dequeueing again via list_splice_init() in
wait_for_work().

Change back to list_move() of the first item, if any.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:21 +01:00
Lars Ellenberg c02abda2b2 drbd: mutex_unlock "... must no be used in interrupt context"
Documentation of mutex_unlock says
we must not use it in interrupt context.
So do not call it while holding the spin_lock_irq,
but give up the spinlock temporarily.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:21 +01:00
Philipp Reisner c1fd29a11f drbd: Fix a race condition that can lead to a BUG()
If the preconditions for a state change change after the wait_event() we
might hit the BUG() statement in conn_set_state().

With holding the spin_lock while evaluating the condition AND until the
actual state change we ensure the the preconditions can not change anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:20 +01:00
Lars Ellenberg 0ee98e2eb0 drbd: temporarily suspend io in drbd_adm_disk_opts
drbd_adm_disk_opts() does
	wait_event(mdev->al_wait, lc_try_lock(mdev->act_log));
	drbd_al_shrink(mdev);

If the device is very busy, this can take a very long time to succeed.
Fix this by temporarily suspending IO,
then quickly change the settings, and resume.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:20 +01:00
Lars Ellenberg 4eb9b3cba0 drbd: don't send out P_BARRIER with stale information
We must only send P_BARRIER for epochs we actually sent P_DATA in.

If we (re-)establish a connection, we reinitialized the
send.current_epoch_nr, but forgot to reset send.current_epoch_writes.

This could result in a spurious P_BARRIER with stale epoch information,
and a disconnect/reconnect cycle once the then "unexpected"
P_BARRIER_ACK is received:
  BAD! BarrierAck #28823 received, expected #28829!

Introduce re_init_if_first_write() and maybe_send_barrier() helpers,
and call them appropriately for read/write/set-out-of-sync requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:19 +01:00
Lars Ellenberg 08332d7325 drbd: properly call drbd_rs_cancel_all() in drbd_disconnected()
drbd_disconnected() is supposed to clear the resync lru cache,
by calling drbd_rs_cancel_all().

We must do so before we call drbd_flush_workqueue(), as at least the
callback w_restart_disk_io() may wait for resync progres, and would
otherwise deadlock.

drbd_finish_peer_reqs() may again populate that cache, which will
then potentially be stale after the next resync handshake and bitmap
exchange, we have to do it again after that.

A stale resync lru cache causes no harm but ugly messages like this:
 BAD! sector=196608s enr=6 rs_left=-256 rs_failed=0 count=256 cstate=SyncTarget

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:19 +01:00
Philipp Reisner 155522df5b drbd: Remove dead code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:19 +01:00
Philipp Reisner b66623e33e drbd: Avoid NetworkFailure state during disconnect
Disconnecting is a cluster wide state change. In case the peer node agrees
to the state transition, it sends back the fact on the meta-data connection
and closes both sockets.

In case the node node that initiated the state transfer sees the closing
action on the data-socket, before the P_STATE_CHG_REPLY packet, it was
going into one of the network failure states.

At least with the fencing option set to something else thatn "dont-care",
the unclean shutdown of the connection causes a short IO freeze or
a fence operation.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:18 +01:00
Philipp Reisner 39a1aa7f49 drbd: Protect accesses to the uuid set with a spinlock
There is at least the worker context, the receiver context, the context of
receiving netlink packts.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:04 +01:00
Philipp Reisner fef45d297e drbd: Write all pages of the bitmap after an online resize
We need to write the whole bitmap after we moved the meta data
due to an online resize operation.

With the support for one peta byte devices bitmap IO was optimized
to only write out touched pages. This optimization must be turned
off when writing the bitmap after an online resize.

This issue was introduced with drbd-8.3.10.

The impact of this bug is that after an online resize, the next
resync could become larger than expected.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:51 +01:00
Philipp Reisner 5af2e8ce2b drbd: Fix completion of requests while the device is suspended
In various places (E.g. CONNECTION_LOST_WHILE_PENDING) the
RQ_COMPLETION_SUSP mask is passed in the clear set to mod_rq_state().

The issue was that it tried to clear the RQ_COMPLETION_SUSP bit
out of the state mask first, and eventuelly set it afterwards,
in the drbd_req_put_completion_ref() function.

Fixed that by moving the reference getting out of
drbd_req_put_completion_ref() into the mod_rq_state(), before the place
where the extra reference might be put.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:50 +01:00
Andreas Gruenbacher 715306f69d drbd: Don't unregister socket state_change callback from within the callback
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:50 +01:00
Lars Ellenberg eb12010e9a drbd: disambiguation, s/ERR_DISCARD/ERR_DISCARD_IMPOSSIBLE/
If for some reason (typically "split-brained" cluster manager)
drbd replica data has diverged, we can chose a victim,
and reconnect using "--discard-my-data", causing the victim
to become sync-target, fetching all changed blocks from the peer.

If we are Primary, we are potentially in use, and we refuse to
"roll back" changes to the data below the page cache and other users.

Rename the error symbol for this to ERR_DISCARD_IMPOSSIBLE.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:50 +01:00
Lars Ellenberg 427c0434fc drbd: disambiguation, s/DISCARD_CONCURRENT/RESOLVE_CONFLICTS/
We don't discard anything here, really.
We resolve conflicting, concurrent writes to overlapping data blocks.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:49 +01:00
Lars Ellenberg d4dabbe22d drbd: disambiguation, s/P_DISCARD_WRITE/P_SUPERSEDED/
To avoid confusion with REQ_DISCARD aka TRIM, rename our
"discard concurrent write acks" from P_DISCARD_WRITE to P_SUPERSEDED.

At the same time, rename the drbd request event DISCARD_WRITE
to CONFLICT_RESOLVED. It already triggers both successful completion
or restart of the request, depending on our RQ_POSTPONED flag.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:49 +01:00
Lars Ellenberg 232fd3f4a0 drbd: cleanup, drop unused struct
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:48 +01:00
Lars Ellenberg 46e21bbadb drbd: NEG_ACK does not imply a barrier-ack
Don't drop a request from the transfer log just because it was NEG_ACKED.
We need it around to be able to verify P_BARRIER_ACKs against the
transver log.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:48 +01:00
Lars Ellenberg 99b4d8fe6d drbd: only start a new epoch, if the current epoch contains writes
Almost all code paths calling start_new_tl_epoch() guarded it with
	if (... current_tle_writes > 0 ... ).
Just move that inside start_new_tl_epoch().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:47 +01:00
Philipp Reisner 8a0bab2a6d drbd: Finish requests that completed while IO was frozen
Requests of an acked epoch are stored on the barrier_acked_requests list. In
case the private bio of such a request completes while IO on the drbd device
is suspended [req_mod(completed_ok)] then the request stays there.

When thawing IO because the fence_peer handler returned, then we use
tl_clear() to apply the connection_lost_while_pending event to all requests
on the transfer-log and the barrier_acked_requests list.

Up to now the connection_lost_while_pending event was not applied
on requests on the barrier_acked_requests list. Fixed that.

I.e. now the connection_lost_while_pending and resend events are
applied to requests on the barrier_acked_requests list. For that
it is necessary that the resend event finishes (local only)
READS correctly.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:47 +01:00
Lars Ellenberg e959d08d3e drbd: Fix a potential issue with the DISCARD_CONCURRENT flag
The DISCARD_CONCURRENT flag should be set on one node and cleared on the
other node.
As the code was before it was theoretical possible that a node accepts the
meta socket, but has to close it later on, and keeps the DISCARD_CONCURRENT
flag.
Correct this by moving the clear_bit(DISCARD_CONCURRENT) where the packet
gets sent.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:47 +01:00
Lars Ellenberg 519b6d3eac drbd: fix drbd wire compatibility for empty flushes
DRBD has a concept of request epochs or reorder-domains,
which are separated on the wire by P_BARRIER packets.

Older DRBD is not able to handle zero-sized requests at all,
so we need to map empty flushes to these drbd barriers.

These are the equivalent of empty flushes, and
by default trigger flushes on the receiving side anyways
(unless not supported or explicitly disabled),
so there is no need to handle this differently in newer drbd either.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:46 +01:00