Commit Graph

103 Commits

Author SHA1 Message Date
Eric Dumazet
f037590fff rds: fix a leak of kernel memory
struct rds_rdma_notify contains a 32 bits hole on 64bit arches,
make sure it is zeroed before copying it to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 23:40:03 -07:00
Julia Lawall
5daf47bb4e net/rds: Add missing mutex_unlock
Add a mutex_unlock missing on the error path.  In each case, whenever the
label out is reached from elsewhere in the function, mutex is not locked.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E1;
@@

* mutex_lock(E1);
  <+... when != E1
  if (...) {
    ... when != E1
*   return ...;
  }
  ...+>
* mutex_unlock(E1);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Reviewed-by: Zach Brown <zach.brown@oracle.com>
Acked-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-29 00:18:48 -07:00
Joe Perches
ccbd6a5a4f net: Remove unnecessary semicolons after switch statements
Also added an explicit break; to avoid
a fallthrough in net/ipv4/tcp_input.c

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:44:35 -07:00
David S. Miller
e1703b36c3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/e100.c
	drivers/net/e1000e/netdev.c
2010-04-27 12:49:13 -07:00
Dan Carpenter
24acc68956 rdma: potential ERR_PTR dereference
In the original code, the "goto out" calls "rdma_destroy_id(cm_id);"
That isn't needed here and would cause problems because "cm_id" is an
ERR_PTR.  The new code just returns directly.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-22 15:57:26 -07:00
Eric Dumazet
aa39514516 net: sk_sleep() helper
Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
	return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-20 16:37:13 -07:00
David S. Miller
871039f02f Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/stmmac/stmmac_main.c
	drivers/net/wireless/wl12xx/wl1271_cmd.c
	drivers/net/wireless/wl12xx/wl1271_main.c
	drivers/net/wireless/wl12xx/wl1271_spi.c
	net/core/ethtool.c
	net/mac80211/scan.c
2010-04-11 14:53:53 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Dan Carpenter
18062ca947 rds: cleanup: remove unneeded variable
We never use "sk" so this patch removes it.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24 13:34:09 -07:00
Tina Yang
768bbedf9c RDS: Enable per-cpu workqueue threads
Create per-cpu workqueue threads instead of a single
krdsd thread. This is a step towards better scalability.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:17:02 -07:00
Andy Grover
561c7df63e RDS: Do not call set_page_dirty() with irqs off
set_page_dirty() unconditionally re-enables interrupts, so
if we call it with irqs off, they will be on after the call,
and that's bad. This patch moves the call after we've re-enabled
interrupts in send_drop_to(), so it's safe.

Also, add BUG_ONs to let us know if we ever do call set_page_dirty
with interrupts off.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:17:01 -07:00
Sherman Pun
450d06c020 RDS: Properly unmap when getting a remote access error
If the RDMA op has aborted with a remote access error,
in addition to what we already do (tell userspace it has
completed with an error) also unmap it and put() the rm.

Otherwise, hangs may occur on arches that track maps and
will not exit without proper cleanup.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:17:00 -07:00
Andy Grover
b98ba52f96 RDS: only put sockets that have seen congestion on the poll_waitq
rds_poll_waitq's listeners will be awoken if we receive a congestion
notification. Bad performance may result because *all* polled sockets
contend for this single lock. However, it should not be necessary to
wake pollers when a congestion update arrives if they have never
experienced congestion, and not putting these on the waitq will
hopefully greatly reduce contention.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:59 -07:00
Tina Yang
550a8002e4 RDS: Fix locking in rds_send_drop_to()
It seems rds_send_drop_to() called
__rds_rdma_send_complete(rs, rm, RDS_RDMA_CANCELED)
with only rds_sock lock, but not rds_message lock. It raced with
other threads that is attempting to modify the rds_message as well,
such as from within rds_rdma_send_complete().

Signed-off-by: Tina Yang <tina.yang@oracle.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:58 -07:00
Andy Grover
97069788d6 RDS: Turn down alarming reconnect messages
RDS's error messages when a connection goes down are a little
extreme. A connection may go down, and it will be re-established,
and everything is fine. This patch links these messages through
rdsdebug(), instead of to printk directly.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:57 -07:00
Andy Grover
571c02fa81 RDS: Workaround for in-use MRs on close causing crash
if a machine is shut down without closing sockets properly, and
freeing all MRs, then a BUG_ON will bring it down. This patch
changes these to WARN_ONs -- leaking MRs is not fatal (although
not ideal, and there is more work to do here for a proper fix.)

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:56 -07:00
Tina Yang
048c15e641 RDS: Fix send locking issue
Fix a deadlock between rds_rdma_send_complete() and
rds_send_remove_from_sock() when rds socket lock and
rds message lock are acquired out-of-order.

Signed-off-by: Tina Yang <Tina.Yang@oracle.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:55 -07:00
Andy Grover
2e7b3b9945 RDS: Fix congestion issues for loopback
We have two kinds of loopback: software (via loop transport)
and hardware (via IB). sw is used for 127.0.0.1, and doesn't
support rdma ops. hw is used for sends to local device IPs,
and supports rdma. Both are used in different cases.

For both of these, when there is a congestion map update, we
want to call rds_cong_map_updated() but not actually send
anything -- since loopback local and foreign congestion maps
point to the same spot, they're already in sync.

The old code never called sw loop's xmit_cong_map(),so
rds_cong_map_updated() wasn't being called for it. sw loop
ports would not work right with the congestion monitor.

Fixing that meant that hw loopback now would send congestion maps
to itself. This is also undesirable (racy), so we check for this
case in the ib-specific xmit code.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:55 -07:00
Andy Grover
8e82376e5f RDS/TCP: Wait to wake thread when write space available
Instead of waking the send thread whenever any send space is available,
wait until it is at least half empty. This is modeled on how
sock_def_write_space() does it, and may help to minimize context
switches.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:55 -07:00
Andy Grover
b075cfdb66 RDS: update copy_to_user state in tcp transport
Other transports use rds_page_copy_user, which updates our
s_copy_to_user counter. TCP doesn't, so it needs to explicity
call rds_stats_add().

Reported-by: Richard Frank <richard.frank@oracle.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:54 -07:00
Andy Grover
1123fd734d RDS: sendmsg() should check sndtimeo, not rcvtimeo
Most likely cut n paste error - sendmsg() was checking sock_rcvtimeo.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:54 -07:00
Andy Grover
735f61e626 RDS: Do not BUG() on error returned from ib_post_send
BUGging on a runtime error code should be avoided. This
patch also eliminates all other BUG()s that have no real
reason to exist.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 21:16:53 -07:00
Joe Perches
6884b348ed net/rds: remove uses of NIPQUAD, use %pI4
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Grover <andy.grover@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-03 20:16:48 -08:00
Linus Torvalds
e69381b417 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits)
  RDMA/cxgb3: Fix error paths in post_send and post_recv
  RDMA/nes: Fix stale ARP issue
  RDMA/nes: FIN during MPA startup causes timeout
  RDMA/nes: Free kmap() resources
  RDMA/nes: Check for zero STag
  RDMA/nes: Fix Xansation test crash on cm_node ref_count
  RDMA/nes: Abnormal listener exit causes loopback node crash
  RDMA/nes: Fix crash in nes_accept()
  RDMA/nes: Resource not freed for REJECTed connections
  RDMA/nes: MPA request/response error checking
  RDMA/nes: Fix query of ORD values
  RDMA/nes: Fix MAX_CM_BUFFER define
  RDMA/nes: Pass correct size to ioremap_nocache()
  RDMA/nes: Update copyright and branding string
  RDMA/nes: Add max_cqe check to nes_create_cq()
  RDMA/nes: Clean up struct nes_qp
  RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension
  RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset
  RDMA/nes: Correct fast memory registration implementation
  IB/ehca: Fix error paths in post_send and post_recv
  ...
2009-12-16 10:32:31 -08:00
Linus Torvalds
d7fc02c7ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
  mac80211: fix reorder buffer release
  iwmc3200wifi: Enable wimax core through module parameter
  iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
  iwmc3200wifi: Coex table command does not expect a response
  iwmc3200wifi: Update wiwi priority table
  iwlwifi: driver version track kernel version
  iwlwifi: indicate uCode type when fail dump error/event log
  iwl3945: remove duplicated event logging code
  b43: fix two warnings
  ipw2100: fix rebooting hang with driver loaded
  cfg80211: indent regulatory messages with spaces
  iwmc3200wifi: fix NULL pointer dereference in pmkid update
  mac80211: Fix TX status reporting for injected data frames
  ath9k: enable 2GHz band only if the device supports it
  airo: Fix integer overflow warning
  rt2x00: Fix padding bug on L2PAD devices.
  WE: Fix set events not propagated
  b43legacy: avoid PPC fault during resume
  b43: avoid PPC fault during resume
  tcp: fix a timewait refcnt race
  ...

Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
	kernel/sysctl_check.c
	net/ipv4/sysctl_net_ipv4.c
	net/ipv6/addrconf.c
	net/sctp/sysctl.c
2009-12-08 07:55:01 -08:00