Commit Graph

194 Commits

Author SHA1 Message Date
David S. Miller 05e3aa0949 net: Create and use new helper, neigh_output().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-16 17:26:00 -07:00
David S. Miller fec8292d9c ipv4: Use calculated 'neigh' instead of re-evaluating dst->neighbour
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-16 14:25:54 -07:00
David S. Miller f6b72b6217 net: Embed hh_cache inside of struct neighbour.
Now that there is a one-to-one correspondance between neighbour
and hh_cache entries, we no longer need:

1) dynamic allocation
2) attachment to dst->hh
3) refcounting

Initialization of the hh_cache entry is indicated by hh_len
being non-zero, and such initialization is always done with
the neighbour's lock held as a writer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 07:53:20 -07:00
David S. Miller e12fe68ce3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-07-05 23:23:37 -07:00
Steffen Klassert c146066ab8 ipv4: Don't use ufo handling on later transformed packets
We might call ip_ufo_append_data() for packets that will be IPsec
transformed later. This function should be used just for real
udp packets. So we check for rt->dst.header_len which is only
nonzero on IPsec handling and call ip_ufo_append_data() just
if rt->dst.header_len is zero.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-01 17:33:19 -07:00
Steffen Klassert 353e5c9abd ipv4: Fix IPsec slowpath fragmentation problem
ip_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
On IPsec the effective mtu is lower because we need to add the protocol
headers and trailers later when we do the IPsec transformations. So after
the IPsec transformations the packet might be too big, which leads to a
slowpath fragmentation then. This patch fixes this by building the packets
based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
handling to this.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-27 20:34:26 -07:00
Steffen Klassert 33f99dc7fd ipv4: Fix packet size calculation in __ip_append_data
Git commit 59104f06 (ip: take care of last fragment in ip_append_data)
added a check to see if we exceed the mtu when we add trailer_len.
However, the mtu is already subtracted by the trailer length when the
xfrm transfomation bundles are set up. So IPsec packets with mtu
size get fragmented, or if the DF bit is set the packets will not
be send even though they match the mtu perfectly fine. This patch
actually reverts commit 59104f06.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-27 20:34:25 -07:00
Paul Gortmaker 56f8a75c17 ip: introduce ip_is_fragment helper inline function
There are enough instances of this:

    iph->frag_off & htons(IP_MF | IP_OFFSET)

that a helper function is probably warranted.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 20:33:34 -07:00
Steffen Klassert 96d7303e9c ipv4: Fix packet size calculation for raw IPsec packets in __ip_append_data
We assume that transhdrlen is positive on the first fragment
which is wrong for raw packets. So we don't add exthdrlen to the
packet size for raw packets. This leads to a reallocation on IPsec
because we have not enough headroom on the skb to place the IPsec
headers. This patch fixes this by adding exthdrlen to the packet
size whenever the send queue of the socket is empty. This issue was
introduced with git commit 1470ddf7 (inet: Remove explicit write
references to sk/inet in ip_append_data)

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 14:49:59 -07:00
David S. Miller 22f728f8f3 ipv4: Always call ip_options_build() after rest of IP header is filled in.
This will allow ip_options_build() to reliably look at the values of
iph->{daddr,saddr}

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 17:21:27 -04:00
David S. Miller 0a5ebb8000 ipv4: Pass explicit daddr arg to ip_send_reply().
This eliminates an access to rt->rt_src.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-10 13:32:46 -07:00
David S. Miller f5fca60865 ipv4: Pass flow key down into ip_append_*().
This way rt->rt_dst accesses are unnecessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-08 21:24:07 -07:00
David S. Miller 77968b7824 ipv4: Pass flow keys down into datagram packet building engine.
This way ip_output.c no longer needs rt->rt_{src,dst}.

We already have these keys sitting, ready and waiting, on the stack or
in a socket structure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-08 21:24:06 -07:00
David S. Miller ea4fc0d619 ipv4: Don't use rt->rt_{src,dst} in ip_queue_xmit().
Now we can pick it out of the provided flow key.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-08 15:28:28 -07:00
David S. Miller d9d8da805d inet: Pass flowi to ->queue_xmit().
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.

It handles all of the possibilities, from the simplest case of storing
the key in inet->cork.fl to the more complex setup SCTP has where
individual transports determine the flow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-08 15:28:28 -07:00
David S. Miller b57ae01a8a ipv4: Use cork flow in ip_queue_xmit()
All invokers of ip_queue_xmit() must make certain that the
socket is locked.  All of SCTP, TCP, DCCP, and L2TP now make
sure this is the case.

Therefore we can use the cork flow during output route lookup in
ip_queue_xmit() when the socket route check fails.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-08 14:05:14 -07:00
David S. Miller 706527280e ipv4: Initialize cork->opt using NULL not 0.
Noticed by Joe Perches.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-06 16:01:15 -07:00
David S. Miller b80d72261a ipv4: Initialize on-stack cork more efficiently.
ip_setup_cork() explicitly initializes every member of
inet_cork except flags, addr, and opt.  So we can simply
set those three members to zero instead of using a
memset() via an empty struct assignment.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-05-06 15:37:57 -07:00
David S. Miller bdc712b4c2 inet: Decrease overhead of on-stack inet_cork.
When we fast path datagram sends to avoid locking by putting
the inet_cork on the stack we use up lots of space that isn't
necessary.

This is because inet_cork contains a "struct flowi" which isn't
used in these code paths.

Split inet_cork to two parts, "inet_cork" and "inet_cork_full".
Only the latter of which has the "struct flowi" and is what is
stored in inet_sock.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-05-06 15:37:57 -07:00
David S. Miller dd927a2694 ipv4: In ip_build_and_send_pkt() use 'saddr' and 'daddr' args passed in.
Instead of rt->rt_{dst,src}

The only tricky part is source route option handling.

If the source route option is enabled we can't just use plain 'daddr',
we have to use opt->opt.faddr.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-04 12:03:30 -07:00
David S. Miller 31e4543db2 ipv4: Make caller provide on-stack flow key to ip_route_output_ports().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-03 20:25:42 -07:00
Eric Dumazet f6d8bd051c inet: add RCU protection to inet->opt
We lack proper synchronization to manipulate inet->opt ip_options

Problem is ip_make_skb() calls ip_setup_cork() and
ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options),
without any protection against another thread manipulating inet->opt.

Another thread can change inet->opt pointer and free old one under us.

Use RCU to protect inet->opt (changed to inet->inet_opt).

Instead of handling atomic refcounts, just copy ip_options when
necessary, to avoid cache line dirtying.

We cant insert an rcu_head in struct ip_options since its included in
skb->cb[], so this patch is large because I had to introduce a new
ip_options_rcu structure.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-28 13:16:35 -07:00
David S. Miller 1c01a80cfe Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/smsc911x.c
2011-04-11 13:44:25 -07:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
David S. Miller 538de0e01f ipv4: Use flowi4_init_output() in ip_send_reply()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:53:37 -07:00