Commit Graph

19566 Commits

Author SHA1 Message Date
Eric Dumazet 4b9d9be839 inetpeer: remove unused list
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on memcached workload on a 40 core machine, with
disabled route cache.

It appears we constantly flip peers refcnt between 0 and 1 values, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.

Remove this list completely and perform a garbage collection on-the-fly,
at lookup time, using the expired nodes we met during the tree
traversal.

This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in inet_peer structure.

There is still a false sharing effect because refcnt is in first cache
line of object [were the links and keys used by lookups are located], we
might move it at the end of inet_peer structure to let this first cache
line mostly read by cpus.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
Jerry Chu 9ad7c049f0 tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side
This patch lowers the default initRTO from 3secs to 1sec per
RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
has been retransmitted, AND the TCP timestamp option is not on.

It also adds support to take RTT sample during 3WHS on the passive
open side, just like its active open counterpart, and uses it, if
valid, to seed the initRTO for the data transmission phase.

The patch also resets ssthresh to its initial default at the
beginning of the data transmission phase, and reduces cwnd to 1 if
there has been MORE THAN ONE retransmission during 3WHS per RFC5681.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
stephen hemminger aee80b54b2 ipv6: generate link local address for GRE tunnel
Use same logic as SIT tunnel to handle link local address
for GRE tunnel. OSPFv3 requires link-local address to function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
Alexander Duyck bff55273f9 v2 ethtool: remove support for ETHTOOL_GRXNTUPLE
This change is meant to remove all support for displaying an ntuple as
strings via ETHTOOL_GRXNTUPLE.  The reason for this change is due to the
fact that multiple issues have been found including:
 - Multiple buffer overruns for strings being displayed.
 - Incorrect filters displayed, cleared filters with ring of -2 are displayed
 - Setting get_rx_ntuple displays no rules if defined.
 - Endianess wrong on displayed values.
 - Hard limit of 1024 filters makes display functionality extremely limited

The only driver that had supported this interface was ixgbe.  Since it no
longer uses the interface and due to the issues mentioned above I am
submitting this patch to remove it.

v2:
Updated based on comments from Ben Hutchings
 - Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated
 - Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container
 - Left ETHTOOL_GRXNTUPLE but commented it as deprecated

Also cleaned up set_rx_ntuple since there is no flow spec container to
maintain we can drop all the code for the alloc and free of it and just
return ops->set_rx_ntuple().
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 16:45:31 -07:00
John W. Linville c0c33addcb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-06-08 13:44:21 -04:00
Shahar Levi f41ccd71d8 mac80211: Stop BA session event from device
Some devices support BT/WLAN co-existence algorigthms.
In order not to harm the system performance and user experience, the device
requests not to allow any RX BA session and tear down existing RX BA sessions
based on system constraints such as periodic BT activity that needs to limit
WLAN activity (eg.SCO or A2DP).
In such cases, the intention is to limit the duration of the RX PPDU and
therefore prevent the peer device to use A-MPDU aggregation.

Adding ieee80211_stop_rx_ba_session() callback
that can be used by the driver to stop existing BA sessions.

Signed-off-by: Shahar Levi <shahar_levi@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-07 14:41:36 -04:00
John W. Linville 41bfce8ede Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-06-07 14:07:11 -04:00
Alexey Dobriyan a6b7a40786 net: remove interrupt.h inclusion from netdevice.h
* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:55:11 -07:00
David S. Miller 5d0c90cf4d sctp: Guard IPV6 specific code properly.
Outside of net/sctp/ipv6.c, IPV6 specific code needs to
be ifdef guarded.

This fixes build failures with IPV6 disabled.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 13:05:55 -07:00
John W. Linville ab6a44ce1d Revert "mac80211: Skip tailroom reservation for full HW-crypto devices"
This reverts commit aac6af5534.

Conflicts:

	net/mac80211/key.c

That commit has a race that causes a warning, as documented in the thread
here:

	http://marc.info/?l=linux-wireless&m=130717684914101&w=2

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-06 15:23:53 -04:00
Joe Perches f81c622420 net: Remove unnecessary semicolons
Semicolons are not necessary after switch/while/for/if braces
so remove them.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:33:39 -07:00
Ben Greear 827d978037 af-packet: Use existing netdev reference for bound sockets.
This saves a network device lookup on each packet transmitted,
for sockets that are bound to a network device.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:16:28 -07:00
Ben Greear 160ff18a07 af-packet: Hold reference to bound network devices.
Old code was probably safe, but with this change we
can actually use the netdev object, not just compare
the pointer values.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:16:28 -07:00
David S. Miller e990b37b90 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-06-04 13:38:31 -07:00
Linus Torvalds 0e833d8cfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
  tg3: Fix tg3_skb_error_unmap()
  net: tracepoint of net_dev_xmit sees freed skb and causes panic
  drivers/net/can/flexcan.c: add missing clk_put
  net: dm9000: Get the chip in a known good state before enabling interrupts
  drivers/net/davinci_emac.c: add missing clk_put
  af-packet: Add flag to distinguish VID 0 from no-vlan.
  caif: Fix race when conditionally taking rtnl lock
  usbnet/cdc_ncm: add missing .reset_resume hook
  vlan: fix typo in vlan_dev_hard_start_xmit()
  net/ipv4: Check for mistakenly passed in non-IPv4 address
  iwl4965: correctly validate temperature value
  bluetooth l2cap: fix locking in l2cap_global_chan_by_psm
  ath9k: fix two more bugs in tx power
  cfg80211: don't drop p2p probe responses
  Revert "net: fix section mismatches"
  drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run()
  sctp: stop pending timers and purge queues when peer restart asoc
  drivers/net: ks8842 Fix crash on received packet when in PIO mode.
  ip_options_compile: properly handle unaligned pointer
  iwlagn: fix incorrect PCI subsystem id for 6150 devices
  ...
2011-06-04 23:16:00 +09:00
John W. Linville 7b29dc21ea Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem 2011-06-03 14:31:50 -04:00
Thadeu Lima de Souza Cascardo 59e7e7078d mac80211: call dev_alloc_name before copying name to sdata
This partially reverts 1c5cae815d, because
the netdev name is copied into sdata->name, which is used for debugging
messages, for example. Otherwise, we get messages like this:

wlan%d: authenticated

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 14:22:06 -04:00
Koki Sanagi ec764bf083 net: tracepoint of net_dev_xmit sees freed skb and causes panic
Because there is a possibility that skb is kfree_skb()ed and zero cleared
after ndo_start_xmit, we should not see the contents of skb like skb->len and
skb->dev->name after ndo_start_xmit. But trace_net_dev_xmit does that
and causes panic by NULL pointer dereference.
This patch fixes trace_net_dev_xmit not to see the contents of skb directly.

If you want to reproduce this panic,

1. Get tracepoint of net_dev_xmit on
2. Create 2 guests on KVM
2. Make 2 guests use virtio_net
4. Execute netperf from one to another for a long time as a network burden
5. host will panic(It takes about 30 minutes)

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 14:06:31 -07:00
Joe Perches afab2d2999 net: 8021q: Add pr_fmt
Use the current logging style.

Add #define pr_fmt and remove embedded prefix from formats.

Not converting the current pr_<level> uses to netdev_<level>
because all the output here is nicely prefaced with "8021q: ".

Remove __func__ use from proc registration failure message.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 14:04:39 -07:00
Michio Honda 8a07eb0a50 sctp: Add ASCONF operation on the single-homed host
In this case, the SCTP association transmits an ASCONF packet
including addition of the new IP address and deletion of the old
address.  This patch implements this functionality.
In this case, the ASCONF chunk is added to the beginning of the
queue, because the other chunks cannot be transmitted in this state.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 02:04:53 -07:00
Michio Honda 7dc04d7122 sctp: Add socket option operation for Auto-ASCONF.
This patch allows the application to operate Auto-ASCONF on/off
behavior via setsockopt() and getsockopt().

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 02:04:53 -07:00
Michio Honda dd51be0f54 sctp: Add sysctl support for Auto-ASCONF.
This patch allows the system administrator to change default
Auto-ASCONF on/off behavior via an sysctl value.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 02:04:53 -07:00
Michio Honda 9f7d653b67 sctp: Add Auto-ASCONF support (core).
SCTP reconfigure the IP addresses in the association by using
ASCONF chunks as mentioned in RFC5061.  For example, we can
start to use the newly configured IP address in the existing
association.  This patch implements automatic ASCONF operation
in the SCTP stack with address events in the host computer,
which is called auto_asconf.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 02:04:53 -07:00
Michio Honda b1364104e3 sctp: Add ADD/DEL ASCONF handling at the receiver.
This patch fixes the problem that the original code cannot delete
the remote address where the corresponding transport is currently
directed, even when the ASCONF is sent from the other address (this
situation happens when the single-homed sender transmits  ASCONF
with ADD and DEL.)

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 02:04:52 -07:00
Ben Greear a3bcc23e89 af-packet: Add flag to distinguish VID 0 from no-vlan.
Currently, user-space cannot determine if a 0 tcp_vlan_tci
means there is no VLAN tag or the VLAN ID was zero.

Add flag to make this explicit.  User-space can check for
TP_STATUS_VLAN_VALID || tp_vlan_tci > 0, which will be backwards
compatible. Older could would have just checked for tp_vlan_tci,
so it will work no worse than before.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-01 21:18:03 -07:00