Once a packet has been posted to a connection in the data_ready handler, we
mustn't try reposting if we then find that the connection is dying as the
refcount has been given over to the dying connection and the packet might
no longer exist.
Losing the packet isn't a problem as the peer will retransmit.
Signed-off-by: David Howells <dhowells@redhat.com>
The call state machine processor sets up the message parameters for a UDP
message that it might need to transmit in advance on the basis that there's
a very good chance it's going to have to transmit either an ACK or an
ABORT. This requires it to look in the connection struct to retrieve some
of the parameters.
However, if the call is complete, the call connection pointer may be NULL
to dissuade the processor from transmitting a message. However, there are
some situations where the processor is still going to be called - and it's
still going to set up message parameters whether it needs them or not.
This results in a NULL pointer dereference at:
net/rxrpc/call_event.c:837
To fix this, skip the message pre-initialisation if there's no connection
attached.
Signed-off-by: David Howells <dhowells@redhat.com>
If rxrpc_new_client_call() fails to make a connection, the call record that
it allocated needs to be marked as RXRPC_CALL_RELEASED before it is passed
to rxrpc_put_call() to indicate that it no longer has any attachment to the
AF_RXRPC socket.
Without this, an assertion failure may occur at:
net/rxrpc/call_object:635
Signed-off-by: David Howells <dhowells@redhat.com>
A newly added bugfix caused an uninitialized variable to be
used for printing debug output. This is harmless as long
as the debug setting is disabled, but otherwise leads to an
immediate crash.
gcc warns about this when -Wmaybe-uninitialized is enabled:
net/rxrpc/call_object.c: In function 'rxrpc_release_call':
net/rxrpc/call_object.c:496:163: error: 'sp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The initialization was removed but one of the users remains.
This adds back the initialization.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 372ee16386 ("rxrpc: Fix races between skb free, ACK generation and replying")
Signed-off-by: David Howells <dhowells@redhat.com>
Inside the kafs filesystem it is possible to occasionally have a call
processed and terminated before we've had a chance to check whether we need
to clean up the rx queue for that call because afs_send_simple_reply() ends
the call when it is done, but this is done in a workqueue item that might
happen to run to completion before afs_deliver_to_call() completes.
Further, it is possible for rxrpc_kernel_send_data() to be called to send a
reply before the last request-phase data skb is released. The rxrpc skb
destructor is where the ACK processing is done and the call state is
advanced upon release of the last skb. ACK generation is also deferred to
a work item because it's possible that the skb destructor is not called in
a context where kernel_sendmsg() can be invoked.
To this end, the following changes are made:
(1) kernel_rxrpc_data_consumed() is added. This should be called whenever
an skb is emptied so as to crank the ACK and call states. This does
not release the skb, however. kernel_rxrpc_free_skb() must now be
called to achieve that. These together replace
rxrpc_kernel_data_delivered().
(2) kernel_rxrpc_data_consumed() is wrapped by afs_data_consumed().
This makes afs_deliver_to_call() easier to work as the skb can simply
be discarded unconditionally here without trying to work out what the
return value of the ->deliver() function means.
The ->deliver() functions can, via afs_data_complete(),
afs_transfer_reply() and afs_extract_data() mark that an skb has been
consumed (thereby cranking the state) without the need to
conditionally free the skb to make sure the state is correct on an
incoming call for when the call processor tries to send the reply.
(3) rxrpc_recvmsg() now has to call kernel_rxrpc_data_consumed() when it
has finished with a packet and MSG_PEEK isn't set.
(4) rxrpc_packet_destructor() no longer calls rxrpc_hard_ACK_data().
Because of this, we no longer need to clear the destructor and put the
call before we free the skb in cases where we don't want the ACK/call
state to be cranked.
(5) The ->deliver() call-type callbacks are made to return -EAGAIN rather
than 0 if they expect more data (afs_extract_data() returns -EAGAIN to
the delivery function already), and the caller is now responsible for
producing an abort if that was the last packet.
(6) There are many bits of unmarshalling code where:
ret = afs_extract_data(call, skb, last, ...);
switch (ret) {
case 0: break;
case -EAGAIN: return 0;
default: return ret;
}
is to be found. As -EAGAIN can now be passed back to the caller, we
now just return if ret < 0:
ret = afs_extract_data(call, skb, last, ...);
if (ret < 0)
return ret;
(7) Checks for trailing data and empty final data packets has been
consolidated as afs_data_complete(). So:
if (skb->len > 0)
return -EBADMSG;
if (!last)
return 0;
becomes:
ret = afs_data_complete(call, skb, last);
if (ret < 0)
return ret;
(8) afs_transfer_reply() now checks the amount of data it has against the
amount of data desired and the amount of data in the skb and returns
an error to induce an abort if we don't get exactly what we want.
Without these changes, the following oops can occasionally be observed,
particularly if some printks are inserted into the delivery path:
general protection fault: 0000 [#1] SMP
Modules linked in: kafs(E) af_rxrpc(E) [last unloaded: af_rxrpc]
CPU: 0 PID: 1305 Comm: kworker/u8:3 Tainted: G E 4.7.0-fsdevel+ #1303
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Workqueue: kafsd afs_async_workfn [kafs]
task: ffff88040be041c0 ti: ffff88040c070000 task.ti: ffff88040c070000
RIP: 0010:[<ffffffff8108fd3c>] [<ffffffff8108fd3c>] __lock_acquire+0xcf/0x15a1
RSP: 0018:ffff88040c073bc0 EFLAGS: 00010002
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff88040d29a710
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88040d29a710
RBP: ffff88040c073c70 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffff88040be041c0 R15: ffffffff814c928f
FS: 0000000000000000(0000) GS:ffff88041fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa4595f4750 CR3: 0000000001c14000 CR4: 00000000001406f0
Stack:
0000000000000006 000000000be04930 0000000000000000 ffff880400000000
ffff880400000000 ffffffff8108f847 ffff88040be041c0 ffffffff81050446
ffff8803fc08a920 ffff8803fc08a958 ffff88040be041c0 ffff88040c073c38
Call Trace:
[<ffffffff8108f847>] ? mark_held_locks+0x5e/0x74
[<ffffffff81050446>] ? __local_bh_enable_ip+0x9b/0xa1
[<ffffffff8108f9ca>] ? trace_hardirqs_on_caller+0x16d/0x189
[<ffffffff810915f4>] lock_acquire+0x122/0x1b6
[<ffffffff810915f4>] ? lock_acquire+0x122/0x1b6
[<ffffffff814c928f>] ? skb_dequeue+0x18/0x61
[<ffffffff81609dbf>] _raw_spin_lock_irqsave+0x35/0x49
[<ffffffff814c928f>] ? skb_dequeue+0x18/0x61
[<ffffffff814c928f>] skb_dequeue+0x18/0x61
[<ffffffffa009aa92>] afs_deliver_to_call+0x344/0x39d [kafs]
[<ffffffffa009ab37>] afs_process_async_call+0x4c/0xd5 [kafs]
[<ffffffffa0099e9c>] afs_async_workfn+0xe/0x10 [kafs]
[<ffffffff81063a3a>] process_one_work+0x29d/0x57c
[<ffffffff81064ac2>] worker_thread+0x24a/0x385
[<ffffffff81064878>] ? rescuer_thread+0x2d0/0x2d0
[<ffffffff810696f5>] kthread+0xf3/0xfb
[<ffffffff8160a6ff>] ret_from_fork+0x1f/0x40
[<ffffffff81069602>] ? kthread_create_on_node+0x1cf/0x1cf
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rxrpc_lookup_peer() function returns NULL on error, it never returns
error pointers.
Fixes: 8496af50eb ('rxrpc: Use RCU to access a peer's service connection tree')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
security initialized after alloc workqueue, so we should exit security
before destroy workqueue in the error handing.
Fixes: 648af7fca1 ("rxrpc: Absorb the rxkad security module")
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
The call hash table is now no longer used as calls are looked up directly
by channel slot on the connection, so kill it off.
Signed-off-by: David Howells <dhowells@redhat.com>
Move to using RCU access to a peer's service connection tree when routing
an incoming packet. This is done using a seqlock to trigger retrying of
the tree walk if a change happened.
Further, we no longer get a ref on the connection looked up in the
data_ready handler unless we queue the connection's work item - and then
only if the refcount > 0.
Note that I'm avoiding the use of a hash table for service connections
because each service connection is addressed by a 62-bit number
(constructed from epoch and connection ID >> 2) that would allow the client
to engage in bucket stuffing, given knowledge of the hash algorithm.
Peers, however, are hashed as the network address is less controllable by
the client. The total number of peers will also be limited in a future
commit.
Signed-off-by: David Howells <dhowells@redhat.com>
Overhaul the usage count accounting for the rxrpc_connection struct to make
it easier to implement RCU access from the data_ready handler.
The problem is that currently we're using a lock to prevent the garbage
collector from trying to clean up a connection that we're contemplating
unidling. We could just stick incoming packets on the connection we find,
but we've then got a problem that we may race when dispatching a work item
to process it as we need to give that a ref to prevent the rxrpc_connection
struct from disappearing in the meantime.
Further, incoming packets may get discarded if attached to an
rxrpc_connection struct that is going away. Whilst this is not a total
disaster - the client will presumably resend - it would delay processing of
the call. This would affect the AFS client filesystem's service manager
operation.
To this end:
(1) We now maintain an extra count on the connection usage count whilst it
is on the connection list. This mean it is not in use when its
refcount is 1.
(2) When trying to reuse an old connection, we only increment the refcount
if it is greater than 0. If it is 0, we replace it in the tree with a
new candidate connection.
(3) Two connection flags are added to indicate whether or not a connection
is in the local's client connection tree (used by sendmsg) or the
peer's service connection tree (used by data_ready). This makes sure
that we don't try and remove a connection if it got replaced.
The flags are tested under lock with the removal operation to prevent
the reaper from killing the rxrpc_connection struct whilst someone
else is trying to effect a replacement.
This could probably be alleviated by using memory barriers between the
flag set/test and the rb_tree ops. The rb_tree op would still need to
be under the lock, however.
(4) When trying to reap an old connection, we try to flip the usage count
from 1 to 0. If it's not 1 at that point, then it must've come back
to life temporarily and we ignore it.
Signed-off-by: David Howells <dhowells@redhat.com>
Move the lookup of a peer from a call that's being accepted into the
function that creates a new incoming connection. This will allow us to
avoid incrementing the peer's usage count in some cases in future.
Note that I haven't bother to integrate rxrpc_get_addr_from_skb() with
rxrpc_extract_addr_from_skb() as I'm going to delete the former in the very
near future.
Signed-off-by: David Howells <dhowells@redhat.com>
Split the service-specific connection code out into into its own file. The
client-specific code has already been split out. This will leave just the
common code in the original file.
Signed-off-by: David Howells <dhowells@redhat.com>
Split the client-specific connection code out into its own file. It will
behave somewhat differently from the service-specific connection code, so
it makes sense to separate them.
Signed-off-by: David Howells <dhowells@redhat.com>
Each channel on a connection has a separate, independent number space from
which to allocate callNumber values. It is entirely possible, for example,
to have a connection with four active calls, each with call number 1.
Note that the callNumber values for any particular channel don't have to
start at 1, but they are supposed to increment monotonically for that
channel from a client's perspective and may not be reused once the call
number is transmitted (until the epoch cycles all the way back round).
Currently, however, call numbers are allocated on a per-connection basis
and, further, are held in an rb-tree. The rb-tree is redundant as the four
channel pointers in the rxrpc_connection struct are entirely capable of
pointing to all the calls currently in progress on a connection.
To this end, make the following changes:
(1) Handle call number allocation independently per channel.
(2) Get rid of the conn->calls rb-tree. This is overkill as a connection
may have a maximum of four calls in progress at any one time. Use the
pointers in the channels[] array instead, indexed by the channel
number from the packet.
(3) For each channel, save the result of the last call that was in
progress on that channel in conn->channels[] so that the final ACK or
ABORT packet can be replayed if necessary. Any call earlier than that
is just ignored. If we've seen the next call number in a packet, the
last one is most definitely defunct.
(4) When generating a RESPONSE packet for a connection, the call number
counter for each channel must be included in it.
(5) When parsing a RESPONSE packet for a connection, the call number
counters contained therein should be used to set the minimum expected
call numbers on each channel.
To do in future commits:
(1) Replay terminal packets based on the last call stored in
conn->channels[].
(2) Connections should be retired before the callNumber space on any
channel runs out.
(3) A server is expected to disregard or reject any new incoming call that
has a call number less than the current call number counter. The call
number counter for that channel must be advanced to the new call
number.
Note that the server cannot just require that the next call that it
sees on a channel be exactly the call number counter + 1 because then
there's a scenario that could cause a problem: The client transmits a
packet to initiate a connection, the network goes out, the server
sends an ACK (which gets lost), the client sends an ABORT (which also
gets lost); the network then reconnects, the client then reuses the
call number for the next call (it doesn't know the server already saw
the call number), but the server thinks it already has the first
packet of this call (it doesn't know that the client doesn't know that
it saw the call number the first time).
Signed-off-by: David Howells <dhowells@redhat.com>
The socket's accept queue (socket->acceptq) should be accessed under
socket->call_lock, not under the connection lock.
Signed-off-by: David Howells <dhowells@redhat.com>
Add RCU destruction for connections and calls as the RCU lookup from the
transport socket data_ready handler is going to come along shortly.
Whilst we're at it, move the cleanup workqueue flushing and RCU barrierage
into the destruction code for the objects that need it (locals and
connections) and add the extra RCU barrier required for connection cleanup.
Signed-off-by: David Howells <dhowells@redhat.com>
When a call is disconnected, clear the call's pointer to the connection and
release the associated ref on that connection. This means that the call no
longer pins the connection and the connection can be discarded even before
the call is.
As the code currently stands, the call struct is effectively pinned by
userspace until userspace has enacted a recvmsg() to retrieve the final
call state as sk_buffs on the receive queue pin the call to which they're
related because:
(1) The rxrpc_call struct contains the userspace ID that recvmsg() has to
include in the control message buffer to indicate which call is being
referred to. This ID must remain valid until the terminal packet is
completely read and must be invalidated immediately at that point as
userspace is entitled to immediately reuse it.
(2) The final ACK to the reply to a client call isn't sent until the last
data packet is entirely read (it's probably worth altering this in
future to be send the ACK as soon as all the data has been received).
This change requires a bit of rearrangement to make sure that the call
isn't going to try and access the connection again after protocol
completion:
(1) Delete the error link earlier when we're releasing the call. Possibly
network errors should be distributed via connections at the cost of
adding in an access to the rxrpc_connection struct.
(2) Remove the call from the connection's call tree before disconnecting
the call. The call tree needs to be removed anyway and incoming
packets delivered by channel pointer instead.
(3) The release call event should be considered last after all other
events have been processed so that we don't need access to the
connection again.
(4) Move the channel_lock taking from rxrpc_release_call() to
rxrpc_disconnect_call() where it will be required in future.
Signed-off-by: David Howells <dhowells@redhat.com>
If rxrpc_connect_call() fails during the creation of a client connection,
there are two bugs that we can hit that need fixing:
(1) The call state should be moved to RXRPC_CALL_DEAD before the call
cleanup phase is invoked. If not, this can cause an assertion failure
later.
(2) call->link should be reinitialised after being deleted in
rxrpc_new_client_call() - which otherwise leads to a failure later
when the call cleanup attempts to delete the link again.
Signed-off-by: David Howells <dhowells@redhat.com>
Rather than calling rxrpc_get_connection() manually before calling
rxrpc_queue_conn(), do it inside the queue wrapper.
This allows us to do some important fixes:
(1) If the usage count is 0, do nothing. This prevents connections from
being reanimated once they're dead.
(2) If rxrpc_queue_work() fails because the work item is already queued,
retract the usage count increment which would otherwise be lost.
(3) Don't take a ref on the connection in the work function. By passing
the ref through the work item, this is unnecessary. Doing it in the
work function is too late anyway. Previously, connection-directed
packets held a ref on the connection, but that's not really the best
idea.
And another useful changes:
(*) Don't need to take a refcount on the connection in the data_ready
handler unless we invoke the connection's work item. We're using RCU
there so that's otherwise redundant.
Signed-off-by: David Howells <dhowells@redhat.com>
Check that the client conns cache is empty before module removal and bug if
not, listing any offending connections that are still present. Unfortunately,
if there are connections still around, then the transport socket is still
unexpectedly open and active, so we can't just unallocate the connections.
Signed-off-by: David Howells <dhowells@redhat.com>
Turn the connection event and state #define lists into enums and move
outside of the struct definition.
Whilst we're at it, change _SERVER to _SERVICE in those identifiers and add
EV_ into the event name to distinguish them from flags and states.
Also add a symbol indicating the number of states and use that in the state
text array.
Signed-off-by: David Howells <dhowells@redhat.com>