Realising the following call pattern,
* first dccp_entail() is called to enqueue a new skb and
* then skb_clone() is called to transmit a clone of that skb,
this patch integrates both into the same function.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch rearranges the order of statements of the slow-path input processing
(i.e. any other state than OPEN), to resolve the following issues.
1. Dependencies: the order of statements now better matches RFC 4340, 8.5, i.e.
step 7 is before step 9 (previously 9 was before 7), and parsing options in
step 8 (which may consume resources) now comes after step 7.
2. Sequence number checks are omitted if in state LISTEN/REQUEST, due to the
note underneath the table in RFC 4340, 7.5.3.
As a result, CCID processing is now indeed confined to OPEN/PARTOPEN states,
i.e. congestion control is performed only on the flow of data packets. This
avoids pathological cases of doing congestion control on those messages
which set up and terminate the connection.
3. Packets are now passed on to Ack Vector / CCID processing only after
- step 7 (receive unexpected packets),
- step 9 (receive Reset),
- step 13 (receive CloseReq),
- step 14 (receive Close)
and only if the state is PARTOPEN. This simplifies CCID processing:
- in LISTEN/CLOSED the CCIDs are non-existent;
- in RESPOND/REQUEST the CCIDs have not yet been negotiated;
- in CLOSEREQ and active-CLOSING the node has already closed this socket;
- in passive-CLOSING the client is waiting for its Reset.
In the last case, RFC 4340, 8.3 leaves it open to ignore further incoming
data, which is the approach taken here.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.
It handles all of the possibilities, from the simplest case of storing
the key in inet->cork.fl to the more complex setup SCTP has where
individual transports determine the flow.
Signed-off-by: David S. Miller <davem@davemloft.net>
Operation order is now transposed, we first create the child
socket then we try to hook up the route.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since this is invoked from inet_stream_connect() the socket is locked
and therefore this usage is safe.
Signed-off-by: David S. Miller <davem@davemloft.net>
A length of zero (after subtracting two for the type and len fields) for
the DCCPO_{CHANGE,CONFIRM}_{L,R} options will cause an underflow due to
the subtraction. The subsequent code may read past the end of the
options value buffer when parsing. I'm unsure of what the consequences
of this might be, but it's probably not good.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that output route lookups update the flow with
destination address selection, we can fetch it from
fl4->daddr instead of rt->rt_dst
Signed-off-by: David S. Miller <davem@davemloft.net>
We lack proper synchronization to manipulate inet->opt ip_options
Problem is ip_make_skb() calls ip_setup_cork() and
ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options),
without any protection against another thread manipulating inet->opt.
Another thread can change inet->opt pointer and free old one under us.
Use RCU to protect inet->opt (changed to inet->inet_opt).
Instead of handling atomic refcounts, just copy ip_options when
necessary, to avoid cache line dirtying.
We cant insert an rcu_head in struct ip_options since its included in
skb->cb[], so this patch is large because I had to introduce a new
ip_options_rcu structure.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
These functions are used together as a unit for route resolution
during connect(). They address the chicken-and-egg problem that
exists when ports need to be allocated during connect() processing,
yet such port allocations require addressing information from the
routing code.
It's currently more heavy handed than it needs to be, and in
particular we allocate and initialize a flow object twice.
Let the callers provide the on-stack flow object. That way we only
need to initialize it once in the ip_route_connect() call.
Later, if ip_route_newports() needs to do anything, it re-uses that
flow object as-is except for the ports which it updates before the
route re-lookup.
Also, describe why this set of facilities are needed and how it works
in a big comment.
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*
This will let us to create AF optimal flow instances.
It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.
Signed-off-by: David S. Miller <davem@davemloft.net>
I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.
This is the first step to move in that direction.
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a bug in the order of dccp_rcv_state_process() that still permitted
reception even after closing the socket. A Reset after close thus causes a NULL
pointer dereference by not preventing operations on an already torn-down socket.
dccp_v4_do_rcv()
|
| state other than OPEN
v
dccp_rcv_state_process()
|
| DCCP_PKT_RESET
v
dccp_rcv_reset()
|
v
dccp_time_wait()
WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
[<c0038850>] (unwind_backtrace+0x0/0xec) from [<c0055364>] (warn_slowpath_common)
[<c0055364>] (warn_slowpath_common+0x4c/0x64) from [<c0055398>] (warn_slowpath_n)
[<c0055398>] (warn_slowpath_null+0x1c/0x24) from [<c02b72d0>] (__inet_twsk_hashd)
[<c02b72d0>] (__inet_twsk_hashdance+0x48/0x128) from [<c031caa0>] (dccp_time_wai)
[<c031caa0>] (dccp_time_wait+0x40/0xc8) from [<c031c15c>] (dccp_rcv_state_proces)
[<c031c15c>] (dccp_rcv_state_process+0x120/0x538) from [<c032609c>] (dccp_v4_do_)
[<c032609c>] (dccp_v4_do_rcv+0x11c/0x14c) from [<c0286594>] (release_sock+0xac/0)
[<c0286594>] (release_sock+0xac/0x110) from [<c031fd34>] (dccp_close+0x28c/0x380)
[<c031fd34>] (dccp_close+0x28c/0x380) from [<c02d9a78>] (inet_release+0x64/0x70)
The fix is by testing the socket state first. Receiving a packet in Closed state
now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.
Reported-and-tested-by: Johan Hovold <jhovold@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>