Commit Graph

37 Commits

Author SHA1 Message Date
santosh.shilimkar@oracle.com 3f6b314303 RDS: Fix rds MR reference count in rds_rdma_unuse()
rds_rdma_unuse() drops the mr reference count which it hasn't
taken. Correct way of removing mr is to remove mr from the tree
and then rdma_destroy_mr() it first, then rds_mr_put() to decrement
its reference count. Whichever thread holds last reference will free
the mr via rds_mr_put()

This bug was triggering weird null pointer crashes. One if the trace
for it is captured below.

BUG: unable to handle kernel NULL pointer dereference at
0000000000000104
IP: [<ffffffffa0899471>] rds_ib_free_mr+0x31/0x130 [rds_rdma]
PGD 4366fa067 PUD 4366f9067 PMD 0
Oops: 0000 [#1] SMP

[...]

task: ffff88046da6a000 ti: ffff88046da6c000 task.ti: ffff88046da6c000
RIP: 0010:[<ffffffffa0899471>]  [<ffffffffa0899471>]
rds_ib_free_mr+0x31/0x130 [rds_rdma]
RSP: 0018:ffff88046fa43bd8  EFLAGS: 00010286
RAX: 0000000071d38b80 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880079e7ff40
RBP: ffff88046fa43bf8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88046fa43ca8 R11: ffff88046a802ed8 R12: ffff880079e7fa40
R13: 0000000000000000 R14: ffff880079e7ff40 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88046fa40000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000104 CR3: 00000004366fb000 CR4: 00000000000006e0
Stack:
 ffff880079e7fa40 ffff880671d38f08 ffff880079e7ff40 0000000000000296
 ffff88046fa43c28 ffffffffa087a38b ffff880079e7fa40 ffff880671d38f10
 0000000000000000 0000000000000292 ffff88046fa43c48 ffffffffa087a3b6
Call Trace:
 <IRQ>
 [<ffffffffa087a38b>] rds_destroy_mr+0x8b/0xa0 [rds]
 [<ffffffffa087a3b6>] __rds_put_mr_final+0x16/0x30 [rds]
 [<ffffffffa087a492>] rds_rdma_unuse+0xc2/0x120 [rds]
 [<ffffffffa08766d3>] rds_recv_incoming_exthdrs+0x83/0xa0 [rds]
 [<ffffffffa0876782>] rds_recv_incoming+0x92/0x200 [rds]
 [<ffffffffa0895269>] rds_ib_process_recv+0x259/0x320 [rds_rdma]
 [<ffffffffa08962a8>] rds_ib_recv_tasklet_fn+0x1a8/0x490 [rds_rdma]
 [<ffffffff810dcd78>] ? __remove_hrtimer+0x58/0x90
 [<ffffffff810799e1>] tasklet_action+0xb1/0xc0
 [<ffffffff81079b52>] __do_softirq+0xe2/0x290
 [<ffffffff81079df6>] irq_exit+0xa6/0xb0
 [<ffffffff81613915>] do_IRQ+0x65/0xf0
 [<ffffffff816118ab>] common_interrupt+0x6b/0x6b

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 16:28:10 -07:00
santosh.shilimkar@oracle.com 5c240fa2ab RDS: Fix assertion level from fatal to warning
Fix the asserion level since its not fatal and can be hit
in normal execution paths. There is no need to take the
system down.

We keep the WARN_ON() to detect the condition if we get
here with bad pages.

Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 13:35:31 -07:00
santosh.shilimkar@oracle.com 1d2e3f396c RDS: restore return value in rds_cmsg_rdma_args()
In rds_cmsg_rdma_args() 'ret' is used by rds_pin_pages() which returns
number of pinned pages on success. And the same value is returned to the
caller of rds_cmsg_rdma_args() on success which is not intended.

Commit f4a3fc03c1 ("RDS: Clean up error handling in rds_cmsg_rdma_args")
removed the 'ret = 0' line which broke RDS RDMA mode.

Fix it by restoring the return value on rds_pin_pages() success
keeping the clean-up in place.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 13:35:29 -07:00
Cong Wang dee49f203a rds: avoid calling sock_kfree_s() on allocation failure
It is okay to free a NULL pointer but not okay to mischarge the socket optmem
accounting. Compile test only.

Reported-by: rucsoftsec@gmail.com
Cc: Chien Yen <chien.yen@oracle.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 17:00:19 -04:00
Dan Rosenberg 218854af84 rds: Integer overflow in RDS cmsg handling
In rds_cmsg_rdma_args(), the user-provided args->nr_local value is
restricted to less than UINT_MAX.  This seems to need a tighter upper
bound, since the calculation of total iov_size can overflow, resulting
in a small sock_kmalloc() allocation.  This would probably just result
in walking off the heap and crashing when calling rds_rdma_pages() with
a high count value.  If it somehow doesn't crash here, then memory
corruption could occur soon after.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17 12:20:52 -08:00
Andy Grover d139ff0907 RDS: Let rds_message_alloc_sgs() return NULL
Even with the previous fix, we still are reading the iovecs once
to determine SGs needed, and then again later on. Preallocating
space for sg lists as part of rds_message seemed like a good idea
but it might be better to not do this. While working to redo that
code, this patch attempts to protect against userspace rewriting
the rds_iovec array between the first and second accesses.

The consequences of this would be either a too-small or too-large
sg list array. Too large is not an issue. This patch changes all
callers of message_alloc_sgs to handle running out of preallocated
sgs, and fail gracefully.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30 16:34:18 -07:00
Andy Grover fc8162e3c0 RDS: Copy rds_iovecs into kernel memory instead of rereading from userspace
Change rds_rdma_pages to take a passed-in rds_iovec array instead
of doing copy_from_user itself.

Change rds_cmsg_rdma_args to copy rds_iovec array once only. This
eliminates the possibility of userspace changing it after our
sanity checks.

Implement stack-based storage for small numbers of iovecs, based
on net/socket.c, to save an alloc in the extremely common case.

Although this patch reduces iovec copies in cmsg_rdma_args to 1,
we still do another one in rds_rdma_extra_size. Getting rid of
that one will be trickier, so it'll be a separate patch.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30 16:34:17 -07:00
Andy Grover f4a3fc03c1 RDS: Clean up error handling in rds_cmsg_rdma_args
We don't need to set ret = 0 at the end -- it's initialized to 0.

Also, don't increment s_send_rdma stat if we're exiting with an
error.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30 16:34:17 -07:00
Andy Grover a09f69c49b RDS: Return -EINVAL if rds_rdma_pages returns an error
rds_cmsg_rdma_args would still return success even if rds_rdma_pages
returned an error (or overflowed).

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30 16:34:16 -07:00
Linus Torvalds 1b1f693d7a net: fix rds_iovec page count overflow
As reported by Thomas Pollet, the rdma page counting can overflow.  We
get the rdma sizes in 64-bit unsigned entities, but then limit it to
UINT_MAX bytes and shift them down to pages (so with a possible "+1" for
an unaligned address).

So each individual page count fits comfortably in an 'unsigned int' (not
even close to overflowing into signed), but as they are added up, they
might end up resulting in a signed return value. Which would be wrong.

Catch the case of tot_pages turning negative, and return the appropriate
error code.

Reported-by: Thomas Pollet <thomas.pollet@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30 16:34:16 -07:00
Dan Carpenter 9b9d2e00bf rds: signedness bug
In the original code if the copy_from_user() fails in rds_rdma_pages()
then the error handling fails and we get a stack trace from kmalloc().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-19 11:59:43 -07:00
Andy Grover 20c72bd5f5 RDS: Implement masked atomic operations
Add two CMSGs for masked versions of cswp and fadd. args
struct modified to use a union for different atomic op type's
arguments. Change IB to do masked atomic ops. Atomic op type
in rds_message similarly unionized.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:16:51 -07:00
Andy Grover 2c3a5f9abb RDS: Add flag for silent ops. Do atomic op before RDMA
Add a flag to the API so users can indicate they want
silent operations. This is needed because silent ops
cannot be used with USE_ONCE MRs, so we can't just
assume silent.

Also, change send_xmit to do atomic op before rdma op if
both are present, and centralize the hairy logic to determine if
we want to attempt silent, or not.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:12:06 -07:00
Andy Grover 7e3bd65ebf RDS: Move some variables around for consistency
Also, add a comment.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:12:05 -07:00
Andy Grover f8b3aaf2ba RDS: Remove struct rds_rdma_op
A big changeset, but it's all pretty dumb.

struct rds_rdma_op was already embedded in struct rm_rdma_op.
Remove rds_rdma_op and put its members in rm_rdma_op. Rename
members with "op_" prefix instead of "r_", for consistency.

Of course this breaks a lot, so fixup the code accordingly.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:58 -07:00
Andy Grover d0ab25a83c RDS: purge atomic resources too in rds_message_purge()
Add atomic_free_op function, analogous to rdma_free_op,
and call it in rds_message_purge().

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:57 -07:00
Andy Grover 4324879df0 RDS: Inline rdma_prepare into cmsg_rdma_args
cmsg_rdma_args just calls rdma_prepare and does a little
arg checking -- not quite enough to justify its existence.
Plus, it is the only caller of rdma_prepare().

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:56 -07:00
Andy Grover 6200ed7799 RDS: Whitespace
Tidy up some whitespace issues.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:44 -07:00
Andy Grover d22faec22c RDS: Do not mask address when pinning pages
This does not appear to be necessary.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:43 -07:00
Andy Grover 40589e74f7 RDS: Base init_depth and responder_resources on hw values
Instead of using a constant for initiator_depth and
responder_resources, read the per-QP values when the
device is enumerated, and then use these values when creating
the connection.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:42 -07:00
Andy Grover 15133f6e67 RDS: Implement atomic operations
Implement a CMSG-based interface to do FADD and CSWP ops.

Alter send routines to handle atomic ops.

Add atomic counters to stats.

Add xmit_atomic() to struct rds_transport

Inline rds_ib_send_unmap_rdma into unmap_rm

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:41 -07:00
Andy Grover f4dd96f7b2 RDS: make sure all sgs alloced are initialized
rds_message_alloc_sgs() now returns correctly-initialized
sg lists, so calleds need not do this themselves.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:39 -07:00
Andy Grover ff87e97a9d RDS: make m_rdma_op a member of rds_message
This eliminates a separate memory alloc, although
it is now necessary to add an "r_active" flag, since
it is no longer to use the m_rdma_op pointer as an
indicator of if an rdma op is present.

rdma SGs allocated from rm sg pool.

rds_rm_size also gets bigger. It's a little inefficient to
run through CMSGs twice, but it makes later steps a lot smoother.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:38 -07:00
Andy Grover 21f79afa5f RDS: fold rdma.h into rds.h
RDMA is now an intrinsic part of RDS, so it's easier to just have
a single header.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:37 -07:00
Andy Grover 3ef13f3c22 RDS: cleanup/fix rds_rdma_unuse
First, it looks to me like the atomic_inc is wrong.
We should be decrementing refcount only once here, no? It's
already being done by the mr_put() at the end.

Second, simplify the logic a bit by bailing early (with a warning)
if !mr.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
2010-09-08 18:11:35 -07:00