Commit Graph

13290 Commits

Author SHA1 Message Date
Krishna Kumar fd3ae5e8fc Speed-up pfifo_fast lookup using a private bitmap
Maintain a per-qdisc bitmap for pfifo_fast giving availability
of skbs for each band. This allows faster lookup for a skb when
there are no high priority skbs. Also, it helps in (rare) cases
when there are no skbs on the list, where an immediate lookup is
faster than iterating through the three bands.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:21 -07:00
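The per-band availability bitmap described in the pfifo_fast commit above can be modelled in a few lines. This is a minimal Python sketch of the idea, not the kernel code: the class name `BitmapFifo` and its methods are hypothetical, and the lowest-set-bit computation stands in for whatever lookup the patch actually uses.

```python
from collections import deque

NUM_BANDS = 3  # pfifo_fast has three priority bands; band 0 is the highest


class BitmapFifo:
    """Toy model of a three-band FIFO with an occupancy bitmap (hypothetical)."""

    def __init__(self):
        self.bands = [deque() for _ in range(NUM_BANDS)]
        self.bitmap = 0  # bit i is set while band i holds at least one packet

    def enqueue(self, band, pkt):
        self.bands[band].append(pkt)
        self.bitmap |= 1 << band

    def dequeue(self):
        if self.bitmap == 0:
            return None  # empty queue: answered without touching any band
        # The lowest set bit identifies the highest-priority non-empty band.
        band = (self.bitmap & -self.bitmap).bit_length() - 1
        pkt = self.bands[band].popleft()
        if not self.bands[band]:
            self.bitmap &= ~(1 << band)  # band drained, clear its bit
        return pkt


q = BitmapFifo()
q.enqueue(2, "bulk")
q.enqueue(0, "interactive")
q.enqueue(2, "bulk-2")
order = [q.dequeue() for _ in range(4)]
```

Here `order` ends up as the interactive packet first, then the two bulk packets, then `None` for the empty queue, and no dequeue call ever iterates over the bands.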
David Ward 31ce8c71a3 ipv6: Update Neighbor Cache when IPv6 RA is received on a router
When processing a received IPv6 Router Advertisement, the kernel
creates or updates an IPv6 Neighbor Cache entry for the sender --
but presently this does not occur if IPv6 forwarding is enabled
(net.ipv6.conf.*.forwarding = 1), or if IPv6 Router Advertisements
are not accepted (net.ipv6.conf.*.accept_ra = 0), because in these
cases processing of the Router Advertisement has already halted.

This patch allows the Neighbor Cache to be updated in these cases,
while still avoiding any modification to routes or link parameters.

This continues to satisfy RFC 4861, since any entry created in the
Neighbor Cache as the result of a received Router Advertisement is
still placed in the STALE state.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:04:09 -07:00
Octavian Purdila 80a1096bac tcp: fix premature termination of FIN_WAIT2 time-wait sockets
There is a race condition in the time-wait sockets code that can lead
to premature termination of FIN_WAIT2 and, subsequently, to RST
generation when the FIN,ACK from the peer finally arrives:

Time     TCP header
0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
0.000008 http > 30755 [SYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0

Say twdr->slot = 1 and we are running inet_twdr_hangman and in this
instance inet_twdr_do_twkill_work returns 1. At that point we will
mark slot 1 and schedule inet_twdr_twkill_work. We will also make
twdr->slot = 2.

Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
is called which will create a new FIN_WAIT2 time-wait socket and will
place it in the last to be reached slot, i.e. twdr->slot = 1.

At this point say inet_twdr_twkill_work will run which will start
destroying the time-wait sockets in slot 1, including the just added
TCP_FIN_WAIT2 one.

To avoid this issue we increment the slot only if all entries in the
slot have been purged.

This change may delay the slots cleanup by a time-wait death row
period but only if the worker thread didn't have the time to run/purge
the current slot in the next period (6 seconds with default sysctl
settings). However, on such a busy system even without this change we
would probably see delays...

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:00:35 -07:00
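The race in the FIN_WAIT2 commit above, and the effect of advancing the slot only after a full purge, can be reproduced with a toy model. This is a simplified Python sketch under stated assumptions: `DeathRow`, its methods, the slot count of 8 and the quota of 2 are hypothetical stand-ins for inet_twdr_hangman, inet_twdr_twkill_work and tcp_time_wait, and no real timers or locking are modelled.

```python
SLOTS = 8  # hypothetical slot count for this toy model


class DeathRow:
    def __init__(self, advance_only_when_purged):
        self.fixed = advance_only_when_purged
        self.cells = [[] for _ in range(SLOTS)]
        self.slot = 1
        self.pending = set()  # slots handed over to the deferred worker

    def hangman(self, quota):
        """Timer tick: kill up to `quota` sockets in the current slot."""
        cell = self.cells[self.slot]
        del cell[:quota]
        purged = not cell
        if not purged:
            self.pending.add(self.slot)  # the worker finishes this slot later
        if purged or not self.fixed:
            self.slot = (self.slot + 1) % SLOTS

    def schedule(self, sock):
        """Full-length timeout: use the slot the timer reaches last."""
        self.cells[(self.slot - 1) % SLOTS].append(sock)

    def worker(self):
        """Deferred work: destroy everything left in the marked slots."""
        for s in self.pending:
            self.cells[s].clear()
        self.pending.clear()

    def alive(self, sock):
        return any(sock in cell for cell in self.cells)


def run(fixed):
    row = DeathRow(fixed)
    row.cells[1] = ["old-a", "old-b", "old-c"]
    row.hangman(quota=2)       # cannot finish slot 1 within its quota
    row.schedule("fin_wait2")  # a connection closes right after the tick
    row.worker()               # the deferred purge runs
    return row.alive("fin_wait2")


killed_early_without_fix = not run(fixed=False)
survives_with_fix = run(fixed=True)
```

Without the fix the new socket lands in slot 1, which the worker is about to purge. With the fix the slot pointer stays at 1, so the new socket goes to slot 0 and survives the purge.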
Jens Låås 80b71b80df fib_trie: resize rework
Here is rework and cleanup of the resize function.

This fixes some bugs. We were using ->parent where node_parent()
should be used. We also used ->parent in the inflate loop, although
inflate does not assign it.

Also includes a fix that sets the thresholds to powers of 2 to fit
the halve-and-double strategy.

max_resize is renamed to max_work which better indicates
its function.

Reaching max_work is not an error, so warning is removed. 
max_work only limits amount of work done per resize.
(limits CPU-usage, outstanding memory etc).

The clean-up makes it relatively easy to add fixed sized 
root-nodes if we would like to decrease the memory pressure
on routers with large routing tables and dynamic routing.
If we'll need that...

It's been tested with 280k routes.

Work done together with Robert Olsson.

Signed-off-by: Jens Låås <jens.laas@its.uu.se>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:57:15 -07:00
Sascha Hlusiak 8945a808f7 sit: allow ip fragmentation when using nopmtudisc to fix package loss
If tunnel parameters have frag_off set to IP_DF, pmtudisc on the ipv4 link
will be performed by deriving the mtu from the ipv4 link and setting the
DF-Flag of the encapsulating IPv4 Header. If fragmentation is needed on the
way, the IPv4 pmtu gets adjusted, the ipv6 packet will be resent eventually,
using the new and lower mtu and everyone is happy.

If the frag_off parameter is unset, the mtu for the tunnel will be derived
from the tunnel device or the ipv6 pmtu, which might be higher than the ipv4
pmtu. In that case we must allow the fragmentation of the IPv4 packet because
the IPv6 mtu wouldn't 'learn' from the adjusted IPv4 pmtu, resulting in
frequent icmp_frag_needed and packet loss on the IPv6 layer.

This patch allows fragmentation when tunnel was created with parameter
nopmtudisc, like in ipip/gre tunnels.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:53:53 -07:00
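The two cases in the sit commit above reduce to one decision on the tunnel's frag_off parameter. The following Python sketch illustrates that decision only; the function name `sit_encap_params`, its parameters, and the subtraction of a 20-byte outer header are assumptions made for illustration, not the kernel's code.

```python
IP_DF = 0x4000        # "don't fragment" bit of the IPv4 frag_off field (host order)
IPV4_HEADER_LEN = 20  # assumed size of the outer IPv4 header, without options


def sit_encap_params(tunnel_frag_off, ipv4_path_mtu, tunnel_or_ipv6_mtu):
    """Return (df_bit, mtu) to use for the encapsulating IPv4 packet."""
    if tunnel_frag_off & IP_DF:
        # pmtudisc: follow the IPv4 path MTU and forbid fragmentation, so a
        # too-large packet triggers an MTU adjustment instead of fragments.
        return IP_DF, ipv4_path_mtu - IPV4_HEADER_LEN
    # nopmtudisc: the MTU comes from the tunnel device or the IPv6 side and
    # may exceed the IPv4 path MTU, so the outer packet must be fragmentable.
    return 0, tunnel_or_ipv6_mtu


with_pmtudisc = sit_encap_params(IP_DF, 1400, 1480)
with_nopmtudisc = sit_encap_params(0, 1400, 1480)
```

In the second call the DF bit is cleared, which is the behaviour the patch enables for tunnels created with nopmtudisc.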
Eric Dumazet 30038fc61a net: ip_rt_send_redirect() optimization
While doing some forwarding benchmarks, I noticed
ip_rt_send_redirect() is rather expensive, even if send_redirects is
false for the device.

The fix is to avoid two atomic ops; we don't really need to take a
reference on in_dev.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:52:01 -07:00
Eric Dumazet df19a62677 tcp: keepalive cleanups
Introduce keepalive_probes(tp) helper, and use it, like 
keepalive_time_when(tp) and keepalive_intvl_when(tp)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:48:54 -07:00
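The helper pattern named in the keepalive commit above is a per-socket value with a fallback to the system-wide setting. A minimal Python sketch of that pattern follows; the `TcpSock` class is hypothetical, the constants mirror the usual Linux defaults for net.ipv4.tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes, and times are kept in seconds here rather than in jiffies.

```python
from dataclasses import dataclass

# System-wide defaults (assumed to be the stock sysctl values).
SYSCTL_KEEPALIVE_TIME = 7200   # seconds of idle time before the first probe
SYSCTL_KEEPALIVE_INTVL = 75    # seconds between probes
SYSCTL_KEEPALIVE_PROBES = 9    # unanswered probes before giving up


@dataclass
class TcpSock:
    """Hypothetical stand-in for the per-socket TCP state."""
    keepalive_time: int = 0    # 0 means "not set on this socket"
    keepalive_intvl: int = 0
    keepalive_probes: int = 0


def keepalive_time_when(tp):
    return tp.keepalive_time or SYSCTL_KEEPALIVE_TIME


def keepalive_intvl_when(tp):
    return tp.keepalive_intvl or SYSCTL_KEEPALIVE_INTVL


def keepalive_probes(tp):
    # Same shape as the two helpers above: per-socket override, else default.
    return tp.keepalive_probes or SYSCTL_KEEPALIVE_PROBES


default_sock = TcpSock()
tuned_sock = TcpSock(keepalive_probes=3)
```

With these definitions `keepalive_probes(default_sock)` falls back to the system default, while `keepalive_probes(tuned_sock)` returns the per-socket override.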
Eric Dumazet 3d1427f870 ipv4: af_inet.c cleanups
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:45:21 -07:00
Alexey Dobriyan 2975315b79 pktgen: use proc_create_data()
It looks like the device proc entry is unusable after a rename,
because it has no ->read_proc or ->proc_fops.

And create_proc_entry() is deprecated.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:41:43 -07:00
Stephen Hemminger c3d2f52dd4 pktgen: increase version
Increase module version, and cleanup module info.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:41:39 -07:00
Stephen Hemminger 63adc6fb8a pktgen: cleanup checkpatch warnings
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:41:37 -07:00
Stephen Hemminger 64e8ff5ef2 pktgen: use common idle routine
Simpler to have one place that spins and accounts for delays,
this will also make the last packet be detected faster for more
repeatable timing.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:41:36 -07:00
Stephen Hemminger 2bc481cf43 pktgen: spin using hrtimer
This changes how the pktgen thread spins/waits between
packets if delay is configured. It uses a high res timer to
wait for time to arrive.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:41:29 -07:00
Stephen Hemminger fd29cf7262 pktgen: convert to use ktime_t
The kernel ktime_t is a nice generic infrastructure for managing
high resolution times, as is done in pktgen.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:32:12 -07:00
Stephen Hemminger 5c9d191c16 pktgen: avoid calling gettimeofday
If not using delay then no need to update next_tx after
each packet sent. This allows pktgen to send faster especially
on systems with slower clock sources.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:32:09 -07:00
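The saving described in the commit above comes from reading the clock only when a delay is configured. A small Python sketch of that control flow, assuming a hypothetical `send_burst` function with an injectable clock (pktgen itself is kernel C and is structured differently):

```python
import time


def send_burst(count, delay_ns, xmit, clock=time.monotonic_ns):
    """Send `count` packets, pacing them only if `delay_ns` is non-zero."""
    next_tx = clock() if delay_ns else 0
    for seq in range(count):
        if delay_ns:
            while clock() < next_tx:  # spin until the packet is due
                pass
        xmit(seq)
        if delay_ns:
            # Only a paced run pays for a clock read per packet.
            next_tx = clock() + delay_ns


sent = []
send_burst(3, 0, sent.append)  # no delay: the clock is never consulted
```

With `delay_ns` set to 0 the loop body is just the transmit call, which matters most when the clock source is slow to read.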
Stephen Hemminger 5b8db2f568 pktgen: reorganize transmit loop
Handle standard (and non-standard) return values in a switch.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:32:07 -07:00
Stephen Hemminger e470757d61 pktgen: use netdev_alloc_skb
netdev_alloc_skb is NUMA node aware.
Also, don't exhaust atomic emergency pool. Don't want pktgen
to cause OOM behaviour.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:32:04 -07:00
Stephen Hemminger 7d7bb1cf0e pktgen: cleanup clone count test
The if statement to test for "should a new packet be used"
can be simplified.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:32:00 -07:00
Stephen Hemminger 3791decb5a pktgen: xmit logic reorganization
Do some reorganization of transmit logic path:
   * move transmit queue full idle to separate routine
   * add a cpu_relax()
   * eliminate some of the unneeded gotos
   * if queue is still stopped, go back to main thread loop.
   * don't give up transmitting if quantum is exhausted (be greedy)

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:31:55 -07:00
Stephen Hemminger 3bda06a3d7 pktgen: stop_device cleanup
All the callers were freeing skb after stopping device.
Remove unneeded forward decl.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:31:53 -07:00
Stephen Hemminger 65c5b786a3 pktgen: mark read-only/mostly variables
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:31:51 -07:00
Stephen Hemminger 475ac1e409 pktgen: change inlining
Don't force inlining where not needed. GCC does a better job
of deciding to inline local functions.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:31:47 -07:00
Stephen Hemminger 648fda7404 pktgen: minor cleanup
A couple of minor functions can be written more compactly.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:31:45 -07:00
Jouni Malinen 5bf6fcc2bb mac80211: Check pending scan request after having processed mgd work
When the queued management work items are processed in
ieee80211_sta_work() an item could be removed. This could change the
anybusy from true to false, so we better check whether we can start a
new scan only after having processed the pending work first.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-08-28 14:40:46 -04:00
Johannes Berg 15db0b7fd8 mac80211: fix scan cancel on ifdown
When an interface is taken down while a scan is
pending -- i.e. a scan request was accepted but
not yet acted upon due to other work being in
progress -- we currently do not properly cancel
that scan and end up getting stuck. Fix this by
doing better checks when an interface is taken
down.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-08-28 14:40:45 -04:00