Commit Graph

53248 Commits

Author SHA1 Message Date
Vlad Yasevich 07d9396771 [SCTP]: Set assoc_id correctly during INIT collision.
During the INIT/COOKIE-ACK collision cases, it's possible to get
into a situation where the association id is not yet set at the time
of the user event generation.  As a result, user events have an
association id set to 0 which will confuse applications.

This happens if we hit case B of duplicate cookie processing.
In the particular example found and provided by Oscar Isaula
<Oscar.Isaula@motorola.com>, flow looks like this:
A				B
---- INIT------->  (lost)
	    <---------INIT------
---- INIT-ACK--->
	    <------ Cookie ECHO

When the Cookie Echo is received, we end up trying to update the
association that was created on A as a result of the (lost) INIT,
but that association doesn't have the ID set yet.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 13:55:27 -07:00
Sridhar Samudrala 827bf12236 [SCTP]: Re-order SCTP initializations to avoid race with sctp_rcv()
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 13:36:30 -07:00
Vlad Yasevich ce5325c133 [SCTP]: Fix the SO_REUSEADDR handling to be similar to TCP.
Update the SO_REUSEADDR handling to also check for listen state.  This
was muliple listening server sockets can't be created and they will
not steal packets from each other.

Reported by Paolo Galtieri <pgaltieri@mvista.com>

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 13:34:49 -07:00
Vlad Yasevich 16d00fb776 [SCTP]: Verify all destination ports in sctp_connectx.
We need to make sure that all destination ports are the same, since
the association really must not connect to multiple different ports
at once.  This was reported on the sctp-impl list.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 13:34:09 -07:00
Jamal Hadi Salim 5a6d34162f [XFRM] SPD info TLV aggregation
Aggregate the SPD info TLVs.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:55:39 -07:00
Jamal Hadi Salim af11e31609 [XFRM] SAD info TLV aggregationx
Aggregate the SAD info TLVs.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:55:13 -07:00
David Howells 224711df5c [AF_RXRPC]: Sort out MTU handling.
Sort out the MTU determination and handling in AF_RXRPC:

 (1) If it's present, parse the additional information supplied by the peer at
     the end of the ACK packet (struct ackinfo) to determine the MTU sizes
     that peer is willing to support.

 (2) Initialise the MTU size to that peer from the kernel's routing records.

 (3) Send ACKs rather than ACKALLs as the former carry the additional info,
     and the latter do not.

 (4) Declare the interface MTU size in outgoing ACKs as a maximum amount of
     data that can be stuffed into an RxRPC packet without it having to be
     fragmented to come in this computer's NIC.

 (5) If sendmsg() is given MSG_MORE then it should allocate an skb of the
     maximum size rather than one just big enough for the data it's got left
     to process on the theory that there is more data to come that it can
     append to that packet.

     This means, for example, that if AFS does a large StoreData op, all the
     packets barring the last will be filled to the maximum unfragmented size.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:41:11 -07:00
Heiko Carstens da99f05654 [AF_IUCV/IUCV] : Add missing section annotations
Add missing section annotations and found and fixed some
Coding Style issues.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
2007-05-04 12:23:27 -07:00
Jennifer Hunt 561e036006 [AF_IUCV]: Implementation of a skb backlog queue
With the inital implementation we missed to implement a skb backlog
queue . The result is that socket receive processing tossed packets.
Since AF_IUCV connections are working synchronously it leads to
connection hangs. Problems with read, close and select also occured.

Using a skb backlog queue is fixing all of these problems .

Signed-off-by: Jennifer Hunt <jenhunt@us.ibm.com>
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:22:07 -07:00
Patrick McHardy 9e71efcd6d [NETLINK]: Remove bogus BUG_ON
Remove bogus BUG_ON(mutex_is_locked(nlk_sk(sk)->cb_mutex)), when the
netlink_kernel_create caller specifies an external mutex it might
validly be locked.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:15:11 -07:00
Eric Dumazet db3459d1a7 [IPV6]: Some cleanups in include/net/ipv6.h
1) struct ip6_flowlabel : moves 'users' field to avoid two 32bits
   holes for 64bit arches. Shrinks by 8 bytes sizeof(struct
   ip6_flowlabel)

2) ipv6_addr_cmp() and ipv6_addr_copy() dont need (void *) casts :
   Compiler might take into account natural alignement of in6_addr
   structs to emit better code for memcpy()/memcmp() Casts to (void *)
   force byte accesses.

3) ipv6_addr_prefix() optimization :

Better to clear whole struct, as compiler can emit better code for
memset(addr, 0, 16) (2 stores on x86_64), and avoid some conditional
branches.

# size vmlinux.after vmlinux.before
   text    data     bss     dec     hex filename
5262262  647612  557432 6467306  62aeea vmlinux.after
5262550  647612  557432 6467594  62b00a vmlinux.before

thats 288 bytes saved.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 17:39:04 -07:00
Srinivas Aji b40b4f79ce [TCP]: zero out rx_opt in tcp_disconnect()
When the server drops its connection, NFS client reconnects using the
same socket after disconnecting. If the new connection's SYN,ACK
doesn't contain the TCP timestamp option and the old connection's did,
tp->tcp_header_len is recomputed assuming no timestamp header but
tp->rx_opt.tstamp_ok remains set. Then tcp_build_and_update_options()
adds in a timestamp option past the end of the allocated TCP header,
overwriting TCP data, or when the data is in skb_shinfo(skb)->frags[],
overwriting skb_shinfo(skb) causing a crash soon after. (The issue was
debugged from such a crash.)

Similarly, wscale_ok and sack_ok also get set based on the SYN,ACK
packet but not reset on disconnect, since they are zeroed out at
initialization. The patch zeroes out the entire tp->rx_opt struct in
tcp_disconnect() to avoid this sort of problem.

Signed-off-by: Srinivas Aji <Aji_Srinivas@emc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 17:32:28 -07:00
Michael Chan fde82055c1 [BNX2]: Fix TSO problem with small MSS.
Remove the check for skb->len greater than MTU when doing TSO.  When
the destination has a smaller MSS than the source, a TSO packet may
be smaller than the MTU at the source and we still need to process it
as a TSO packet.

Thanks to Brian Ristuccia <bristuccia@starentnetworks.com> for
reporting the problem.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 17:23:35 -07:00
Pavel Emelianov 7562f876cd [NET]: Rework dev_base via list_head (v3)
Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 15:13:45 -07:00
Ilpo Järvinen 03fba04796 [TCP] Highspeed: Limited slow-start is nowadays in tcp_slow_start
Reuse limited slow-start (RFC3742) included into tcp_cong instead
of having another implementation in High Speed TCP.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:28:35 -07:00
Michael Chan 72fbaeb623 [BNX2]: Update version and reldate.
Update version to 1.5.10.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:25:32 -07:00
Michael Chan 883e515118 [BNX2]: Print bus information for PCIE devices.
Fix the code to print PCI or PCIE bus information for all devices.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:25:11 -07:00
Michael Chan 8e6a72c435 [BNX2]: Add 1-shot MSI handler for 5709.
The 5709 supports the one-shot MSI handler similar to some of the tg3
chips.  In this mode, the MSI disables itself automatically until it
is re-enabled at the end of NAPI poll.

Put the request_irq/free_irq logic in common procedures.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:24:48 -07:00
Michael Chan da3e4fbed2 [BNX2]: Restructure PHY event handling.
Restructure by adding bnx2_phy_event_is_set() to make code cleaner
and easier to understand.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:24:23 -07:00
Michael Chan 1b8227c48e [BNX2]: Add indirect spinlock.
The indirect register access method will be used by more than one
caller in BH context (NAPI poll and timer), so a spinlock is required.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:24:05 -07:00
Michael Chan 27a005b883 [BNX2]: Add support for 5709 Serdes.
Add PCI ID and code to support the 5709 Serdes PHY.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:23:41 -07:00
Michael Chan 605a9e20aa [BNX2]: Re-structure the 2.5G Serdes code.
Add some common procedures to handle enabling and disabling 2.5G.
Add some missing code to resolve flow control.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:23:13 -07:00
Michael Chan ca58c3af99 [BNX2]: Put MII register offsets in the bnx2 struct.
The 5709 Serdes device uses non-standard MII register offsets.  This
re-structuring will make it easier to support 5709 Serdes.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:22:52 -07:00
Michael Chan 4666f87a82 [BNX2]: Add ipv6 TSO and checksum for 5709.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:22:28 -07:00
Michael Chan 874bb672fd [BNX2]: Update 5709 firmware.
Add ipv6 TSO support in firmware. 

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:21:48 -07:00