Commit Graph

9722 Commits

Author SHA1 Message Date
Jarek Poplawski d4766692e7 pkt_sched: Protect gen estimators under est_lock.
gen_kill_estimator() required rtnl_lock() protection, but since it is
moved to an RCU callback __qdisc_destroy() let's use est_lock instead.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 15:20:24 -07:00
David S. Miller b9a3b1102b pkt_sched: Fix queue quiescence testing in dev_deactivate().
Based upon discussions with Jarek P. and Herbert Xu.

First, we're testing the wrong qdisc.  We just reset the device
queue qdiscs to &noop_qdisc and checking its state is completely
pointless here.

We want to wait until the previous qdisc that was sitting at
the ->qdisc pointer is not busy any more.  And that would be
->qdisc_sleeping.

Because of how we propagate the sampled qdisc pointer down into
qdisc_run and friends via per-cpu ->output_queue and netif_schedule,
we also have to wait for the __QDISC_STATE_SCHED bit to clear.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 15:18:38 -07:00
Jarek Poplawski 26b284de54 pkt_sched: Fix oops in htb_delete.
Recent changes introduced a bug in htb_delete(): cl->parent->children
counter update misses checking cl->parent for NULL, which is used for
root classes, so deleting them causes an oops.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 15:16:43 -07:00
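
The htb_delete() fix above comes down to checking the parent pointer before touching the parent's child counter. A minimal userspace sketch follows, assuming a simplified, hypothetical struct (demo_class) in place of the real struct htb_class; it illustrates the idea and is not the actual patch.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct htb_class. */
struct demo_class {
    struct demo_class *parent;  /* NULL for a root class */
    unsigned int children;      /* number of direct child classes */
};

/*
 * Drop cl from its parent's child count.
 * Root classes have no parent, so the counter update is skipped for them.
 * Returns 1 if a parent counter was updated, 0 otherwise.
 */
int demo_class_unlink(struct demo_class *cl)
{
    if (cl->parent == NULL)
        return 0;

    cl->parent->children--;
    return 1;
}
```

Without the NULL check, unlinking a root class would dereference a NULL parent pointer, which matches the oops the commit message describes.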
Andrew Gallatin 64c00d81b5 pktgen: prevent pktgen from using bad tx queue
With the new multi-queue transmit code, it is possible to accidentally
make pktgen pick a non-existing tx queue simply by using a stale
script to drive pktgen.  Access to this non-existing tx queue will
then trigger a bad memory access and kill the machine.

For example, setting "queue_map_max 2" will cause my machine to die
when accessing a garbage spinlock in the non-existing tx queue:

BUG: spinlock bad magic on CPU#0, kpktgend_0/564
  lock: ffff88001ddf6718, .magic: ffffffff, .owner: /-1, .owner_cpu: 0
Pid: 564, comm: kpktgend_0 Not tainted 2.6.27-rc3 #35

Call Trace:
  [<ffffffff803a1228>] spin_bug+0xa4/0xac
  [<ffffffff803a1253>] _raw_spin_lock+0x23/0x123
  [<ffffffff8055b06f>] _spin_lock_bh+0x17/0x1b
  [<ffffffff804cb57d>] pktgen_thread_worker+0xa97/0x1002
  [<ffffffff8022874d>] ? finish_task_switch+0x38/0x97
  [<ffffffff80242077>] ? autoremove_wake_function+0x0/0x36
  [<ffffffff80242077>] ? autoremove_wake_function+0x0/0x36
  [<ffffffff804caae6>] ? pktgen_thread_worker+0x0/0x1002
  [<ffffffff80241a40>] kthread+0x44/0x6d
  [<ffffffff8020c399>] child_rip+0xa/0x11
  [<ffffffff802419fc>] ? kthread+0x0/0x6d
  [<ffffffff8020c38f>] ? child_rip+0x0/0x11

The attached patch adds some sanity checking to prevent
these sorts of configuration errors.

Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 15:16:00 -07:00
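
One plausible form of the sanity checking mentioned above is to clamp the requested queue range to the queues the device actually has. The sketch below is a userspace illustration under that assumption; demo_queue_range and demo_clamp_queue_map are hypothetical names, and the real pktgen change may differ.

```c
/* Hypothetical pair of queue indices, as set via queue_map_min/queue_map_max. */
struct demo_queue_range {
    unsigned int min;
    unsigned int max;
};

/*
 * Clamp a requested [min, max] queue range to the valid indices
 * 0 .. num_tx_queues - 1, so a stale script cannot select a
 * non-existing tx queue.
 */
struct demo_queue_range demo_clamp_queue_map(unsigned int min,
                                             unsigned int max,
                                             unsigned int num_tx_queues)
{
    struct demo_queue_range r;
    unsigned int last = num_tx_queues ? num_tx_queues - 1 : 0;

    r.min = (min > last) ? last : min;
    r.max = (max > last) ? last : max;
    return r;
}
```

With a single tx queue, a request for "queue_map_max 2" would be reduced to queue 0 in this sketch instead of indexing past the end of the queue array.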
Arnaldo Carvalho de Melo 3e8a0a559c dccp: change L/R must have at least one byte in the dccpsf_val field
Thanks to Eugene Teo for reporting this problem.
    
Signed-off-by: Eugene Teo <eugenete@kernel.sg>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 13:48:39 -07:00
Jean-Christophe DUBOIS c1e24df27f xfrm: remove unnecessary variable in xfrm_output_resume() 2nd try
Small fix removing an unnecessary intermediate variable.

Signed-off-by: Jean-Christophe DUBOIS <jcd@tribudubois.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 13:35:37 -07:00
Jamal Hadi Salim 36723873b6 net-sched: fix Action flushing return code
Flushing must consistently return ENOMEM on failure of any allocation.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:41:45 -07:00
Jamal Hadi Salim f97017cdef net-sched: Fix actions flushing
Flushing of actions has been broken since we changed
the semantics of netlink parsed tb[X] to mean X is an attribute type.
This makes the flushing work.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:41:22 -07:00
Julien Brunel 34093d055e net/rxrpc: Use an IS_ERR test rather than a NULL test
In case of error, the function rxrpc_get_transport returns an ERR
pointer, but never returns a NULL pointer. So after a call to this
function, a NULL test should be replaced by an IS_ERR test.

A simplified version of the semantic patch that makes this change is
as follows: 
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@correct_null_test@
expression x,E;
statement S1, S2;
@@
x =  rxrpc_get_transport(...)
<... when != x = E
if (
(
- x@p2 != NULL
+ ! IS_ERR ( x )
|
- x@p2 == NULL
+ IS_ERR( x )
)
 )
S1
else S2
...>
? x = E;
// </smpl>

Signed-off-by: Julien Brunel <brunel@diku.dk>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:40:48 -07:00
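
To show why a NULL test misses the failure case described above, here is a userspace sketch of the error-pointer convention. The demo_* helpers are hypothetical re-implementations modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); demo_get_transport() is an invented stand-in for a function that reports failure through an error pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if p lies in the top DEMO_MAX_ERRNO addresses, i.e. encodes an error. */
int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_transport;

/* Hypothetical lookup: returns a valid pointer or an error pointer, never NULL. */
void *demo_get_transport(int succeed)
{
    if (!succeed)
        return demo_err_ptr(-ENOMEM);
    return (void *)&demo_transport;
}
```

A caller that only checks the result of demo_get_transport(0) against NULL would treat the error pointer as a valid object, which is the bug pattern the semantic patch rewrites.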
Jamal Hadi Salim 317900cb01 wext: Send name on events
At a minimum, the wireless extensions ought to send
the name in addition to the ifindex.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:39:56 -07:00
Andrew Morton 6ced0b3f1e net/tipc/subscr.c: don't use ___constant_swab32
It's an internal implementation detail which we _should_ be free to change. 
So we did, and it promptly broke.

The compiler should be able to work out when to use the __constant version
anyway.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:32:06 -07:00
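
The point about the compiler can be illustrated with a plain 32-bit byte swap. The sketch below is a userspace stand-in (demo_swab32 is a hypothetical name, not the kernel's swab32 implementation): one function serves both constant and run-time arguments, and an optimizing compiler is normally able to evaluate the constant case at build time without a separate internal variant.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t demo_swab32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}
```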
Brian Haley 5e0115e500 ipv6: Fix OOPS, ip -f inet6 route get fec0::1, linux-2.6.26, ip6_route_output, rt6_fill_node+0x175
Alexey Dobriyan wrote:
> On Thu, Aug 07, 2008 at 07:00:56PM +0200, John Gumb wrote:
>> Scenario: no ipv6 default route set.
> 
>> # ip -f inet6 route get fec0::1
>>
>> BUG: unable to handle kernel NULL pointer dereference at 00000000
>> IP: [<c0369b85>] rt6_fill_node+0x175/0x3b0
>> EIP is at rt6_fill_node+0x175/0x3b0
> 
> 0xffffffff80424dd3 is in rt6_fill_node (net/ipv6/route.c:2191).
> 2186                    } else
> 2187    #endif
> 2188                            NLA_PUT_U32(skb, RTA_IIF, iif);
> 2189            } else if (dst) {
> 2190                    struct in6_addr saddr_buf;
> 2191      ====>         if (ipv6_dev_get_saddr(ip6_dst_idev(&rt->u.dst)->dev,
>					       ^^^^^^^^^^^^^^^^^^^^^^^^
>											NULL
> 
> 2192                                           dst, 0, &saddr_buf) == 0)
> 2193                            NLA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf);
> 2194            }

The commit that changed this can't be reverted easily, but the patch
below works for me.

Fix NULL de-reference in rt6_fill_node() when there's no IPv6 input
device present in the dst entry.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 01:58:57 -07:00
Jarek Poplawski 1cfa26661a pkt_sched: Add BH protection for qdisc_stab_lock.
Since qdisc_stab_lock is used in qdisc_put_stab(), which is called in
BH context from __qdisc_destroy() RCU callback, softirq safe locking
is needed.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-11 18:11:06 -07:00
David S. Miller 0a37c10ed4 Merge branch 'stealer/ipvs/for-davem' of git://git.stealer.net/linux-2.6 2008-08-11 18:04:35 -07:00
Simon Horman e93615d086 ipvs: Explicitly clear ip_vs_stats members
In order to align the coding styles of ip_vs_zero_stats() and
its child-function ip_vs_zero_estimator(), clear ip_vs_stats
members explicitly rather than doing a limited memset().

This was chosen over modifying ip_vs_zero_estimator() to use
memset() as it is more robust against changes in members
in the relevant structures. memset() would be preferred if
all members of the structure were to be cleared.

Cc: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
2008-08-11 14:00:55 +02:00
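
The trade-off described above can be sketched in userspace C. The struct below is a hypothetical, much reduced stand-in for ip_vs_stats: it holds counters that should be cleared next to a member that must survive (an int stands in for the lock). Clearing members by name keeps working if fields are added or reordered, whereas a limited memset() depends on the exact layout.

```c
/* Hypothetical, reduced stand-in for struct ip_vs_stats. */
struct demo_stats {
    unsigned int conns;    /* counter: cleared */
    unsigned int inpkts;   /* counter: cleared */
    unsigned int outpkts;  /* counter: cleared */
    int lock;              /* stand-in for the lock: must not be cleared */
};

/* Clear only the counters, one member at a time. */
void demo_zero_stats(struct demo_stats *s)
{
    s->conns = 0;
    s->inpkts = 0;
    s->outpkts = 0;
}
```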
Sven Wegener 519e49e888 ipvs: No need to zero out ip_vs_stats during initialization
It's a global variable and automatically initialized to zero. And now we can
also initialize the lock at compile time.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 14:00:46 +02:00
Sven Wegener 3a14a313f9 ipvs: Embed estimator object into stats object
There's no reason for dynamically allocating an estimator object for every
stats object. Directly embed an estimator object into every stats object and
switch to using the kernel-provided list implementation. This makes the code
much simpler and faster, as we do not need to traverse the list of all
estimators to find the one belonging to a stats object. There's no need to use
an rwlock, as we only have one reader. Also reorder the members of the
estimator structure slightly to avoid padding overhead. This can't be done
with the stats object as the members are currently copied to our user space
object via memcpy() and changing it would break ABI.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 14:00:43 +02:00
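
Embedding one object in another, as described above, means the owner can be found by pointer arithmetic instead of a list search. The following userspace sketch assumes hypothetical, simplified structs and a local demo_container_of macro modelled on the kernel's container_of(); it is an illustration of the technique, not the code from this commit.

```c
#include <stddef.h>

/* Map a pointer to an embedded member back to its enclosing object. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified estimator state. */
struct demo_estimator {
    unsigned int last_conns;
    unsigned int cps;
};

/* Hypothetical stats object with the estimator embedded directly. */
struct demo_stats {
    unsigned int conns;
    struct demo_estimator est;
};

/* Find the stats object that owns a given estimator, in constant time. */
struct demo_stats *demo_est_to_stats(struct demo_estimator *e)
{
    return demo_container_of(e, struct demo_stats, est);
}
```

Because the estimator lives inside the stats object, there is also no separate allocation that could fail or need freeing.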
Sven Wegener 5587da55fb ipvs: Mark net_vs_ctl_path const
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:46:27 +02:00
Sven Wegener 048cf48b89 ipvs: Annotate init functions with __init
Being able to discard these functions saves a couple of bytes at runtime. The
cleanup functions can't be annotated with __exit as they are also called from
init functions.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:46:18 +02:00
Sven Wegener d149ccc9cf ipvs: Initialize schedulers' struct list_head at compile time
No need to do it at runtime and this saves a couple of bytes in the text
section.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:46:06 +02:00
Sven Wegener 66a0be4720 ipvs: Use list_empty() instead of open-coding the same functionality
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:45:57 +02:00
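
The two list-related commits above rely on properties of circular doubly linked lists: a head that points to itself can be set up by a static initializer, and emptiness is a single pointer comparison. A minimal userspace sketch, using hypothetical demo_* names modelled on the kernel's struct list_head, LIST_HEAD_INIT() and list_empty():

```c
/* Hypothetical userspace copy of the circular doubly linked list node. */
struct demo_list_head {
    struct demo_list_head *next;
    struct demo_list_head *prev;
};

/* Static initializer: an empty list head points to itself. */
#define DEMO_LIST_HEAD_INIT(name) { &(name), &(name) }

/* Initialized at compile time, so no run-time init call is needed. */
struct demo_list_head demo_schedulers = DEMO_LIST_HEAD_INIT(demo_schedulers);

/* A list is empty when the head's next pointer refers back to the head. */
int demo_list_empty(const struct demo_list_head *head)
{
    return head->next == head;
}

/* Insert entry right after head. */
void demo_list_add(struct demo_list_head *entry, struct demo_list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}
```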
Sven Wegener 8ab19ea36c ipvs: Fix possible deadlock in estimator code
There is a slight chance for a deadlock in the estimator code. We can't call
del_timer_sync() while holding our lock, as the timer might be active and
spinning for the lock on another cpu. Work around this issue by using
try_to_del_timer_sync() and releasing the lock. We could actually delete the
timer outside of our lock, as the add and kill functions are only ever called
from userspace via [gs]etsockopt() and are serialized by a mutex, but better
make this explicit.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: stable <stable@kernel.org>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:45:40 +02:00
Sven Wegener bc0fde2fad ipvs: Fix possible deadlock in sync code
Commit 998e7a7680 ("ipvs: Use kthread_run()
instead of doing a double-fork via kernel_thread()") introduced a possible
deadlock in the sync code. We need to use the _bh versions for the lock, as the
lock is also accessed from a bottom half.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:44:38 +02:00
Herbert Xu d97106ea52 udp: Drop socket lock for encapsulated packets
The socket lock is there to protect the normal UDP receive path.
Encapsulation UDP sockets don't need that protection.  In fact
the locking is deadly for them as they may contain another UDP
packet within, possibly with the same addresses.

Also the nested bit was copied from TCP.  TCP needs it because
of accept(2) spawning sockets.  This simply doesn't apply to UDP
so I've removed it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-09 00:35:05 -07:00
David S. Miller 8123b421e8 pkt_sched: Fix ingress deletion and filter attachment.
Based upon bug reports by Stephen Hemminger.

We still had some cases using ->qdisc instead of ->qdisc_sleeping.

Also, qdisc_lookup() should return ingress qdiscs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-08 23:23:39 -07:00