Clean up some unused macros in net/*.
1. be left for code change. e.g. PGV_FROM_VMALLOC, PGV_FROM_VMALLOC, KMEM_SAFETYZONE.
2. never be used since introduced to kernel.
e.g. P9_RDMA_MAX_SGE, UTIL_CTRL_PKT_SIZE.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RDS protocol has lots of functions that should be
declared static. rds_message_get/add_version_extension is
removed since it defined but never used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two CMSGs for masked versions of cswp and fadd. args
struct modified to use a union for different atomic op type's
arguments. Change IB to do masked atomic ops. Atomic op type
in rds_message similarly unionized.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
This prints the constant identifier for work completion status and rdma
cm event types, like we already do for IB event types.
A core string array helper is added that each string type uses.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Right now there's nothing to stop the various paths that use
rs->rs_transport from racing with rmmod and executing freed transport
code. The simple fix is to have binding to a transport also hold a
reference to the transport's module, removing this class of races.
We already had an unused t_owner field which was set for the modular
transports and which wasn't set for the built-in loop transport.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
rs_transport is now also used by the rdma paths once the socket is
bound. We don't need this stale comment to tell us what cscope can.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
rds_send_xmit() was changed to hold an interrupt masking spinlock instead of a
mutex so that it could be called from the IB receive tasklet path. This broke
the TCP transport because its xmit method can block and masks and unmasks
interrupts.
This patch serializes callers to rds_send_xmit() with a simple bit instead of
the current spinlock or previous mutex. This enables rds_send_xmit() to be
called from any context and to call functions which block. Getting rid of the
c_send_lock exposes the bare c_lock acquisitions which are changed to block
interrupts.
A waitqueue is added so that rds_conn_shutdown() can wait for callers to leave
rds_send_xmit() before tearing down partial send state. This lets us get rid
of c_senders.
rds_send_xmit() is changed to check the conn state after acquiring the
RDS_IN_XMIT bit to resolve races with the shutdown path. Previously both
worked with the conn state and then the lock in the same order, allowing them
to race and execute the paths concurrently.
rds_send_reset() isn't racing with rds_send_xmit() now that rds_conn_shutdown()
properly ensures that rds_send_xmit() can't start once the conn state has been
changed. We can remove its previous use of the spinlock.
Finally, c_send_generation is redundant. Callers can race to test the c_flags
bit by simply retrying instead of racing to test the c_send_generation atomic.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
rds_send_acked_before() wasn't blocking interrupts when acquiring c_lock from
user context but nothing calls it. Rather than fix its use of c_lock we just
remove the function.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
A few paths had the same block of code to queue a connection's connect work if
it was in the right state. Let's move this in to a helper function.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
This is the first in a long line of patches that tries to fix races
between RDS connection shutdown and RDS traffic.
Here we are maintaining a count of active senders to make sure
the connection doesn't go away while they are using it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The RDS bind lookups are somewhat expensive in terms of CPU
time and locking overhead. This commit changes them into a
faster RCU based hash tree instead of the rbtrees they were using
before.
On large NUMA systems it is a significant improvement.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This removes a global waitqueue used to wait for rds messages
and replaces it with a waitqueue inside the rds_message struct.
The global waitqueue turns into a global lock and significantly
bottlenecks operations on large machines.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
rds_send_xmit is required to loop around after it releases the lock
because someone else could done a trylock, found someone working on the
list and backed off.
But, once we drop our lock, it is possible that someone else does come
in and make progress on the list. We should detect this and not loop
around if another process is actually working on the list.
This patch adds a generation counter that is bumped every time we
get the lock and do some send work. If the retry notices someone else
has bumped the generation counter, it does not need to loop around and
continue working.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
This change allows us to call rds_send_xmit() from a tasklet,
which is crucial to our new operating model.
* Change c_send_lock to a spinlock
* Update stats fields "sem_" to "_lock"
* Remove unneeded rds_conn_is_sending()
About locking between shutdown and send -- send checks if the
connection is up. Shutdown puts the connection into
DISCONNECTING. After this, all threads entering send will exit
immediately. However, a thread could be *in* send_xmit(), so
shutdown acquires the c_send_lock to ensure everyone is out
before proceeding with connection shutdown.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
We now ask the transport to give us a rm for the congestion
map, and then we handle it normally. Previously, the
transport defined a function that we would call to send
a congestion map.
Convert TCP and loop transports to new cong map method.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Previously, RDS would wait until the final send WR had completed
and then handle cleanup. With silent ops, we do not know
if an atomic, rdma, or data op will be last. This patch
handles any of these cases by keeping a pointer to the last
op in the message in m_last_op.
When the TX completion event fires, rds dispatches to per-op-type
cleanup functions, and then does whole-message cleanup, if the
last op equalled m_last_op.
This patch also moves towards having op-specific functions take
the op struct, instead of the overall rm struct.
rds_ib_connection has a pointer to keep track of a a partially-
completed data send operation. This patch changes it from an
rds_message pointer to the narrower rm_data_op pointer, and
modifies places that use this pointer as needed.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Add a flag to the API so users can indicate they want
silent operations. This is needed because silent ops
cannot be used with USE_ONCE MRs, so we can't just
assume silent.
Also, change send_xmit to do atomic op before rdma op if
both are present, and centralize the hairy logic to determine if
we want to attempt silent, or not.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Simplify rds_send_xmit().
Send a congestion map (via xmit_cong_map) without
decrementing send_quota.
Move resetting of conn xmit variables to end of loop.
Update comments.
Implement a special case to turn off sending an rds header
when there is an atomic op and no other data.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
A big changeset, but it's all pretty dumb.
struct rds_rdma_op was already embedded in struct rm_rdma_op.
Remove rds_rdma_op and put its members in rm_rdma_op. Rename
members with "op_" prefix instead of "r_", for consistency.
Of course this breaks a lot, so fixup the code accordingly.
Signed-off-by: Andy Grover <andy.grover@oracle.com>