This patch moves secid and peer_secid from the endpoint to the
association, and passes asoc to sctp_assoc_request and sctp_sk_clone
instead of ep. ep is the local endpoint while asoc represents a
connection, and in SCTP one sk/ep can have multiple asocs/connections,
so saving secid/peer_secid in ep for a new asoc would overwrite the
values of the old asoc.
Note that since asoc can be passed as NULL, security_sctp_assoc_request()
is moved to the place right after the new_asoc is created in
sctp_sf_do_5_1B_init() and sctp_sf_do_unexpected_init().
v1->v2:
- fix the description of selinux_netlbl_skbuff_setsid(), as Jakub noticed.
- fix the annotation in selinux_sctp_assoc_request(), as Richard noticed.
Fixes: 72e89f5008 ("security: Add support for SCTP security hooks")
Reported-by: Prashanth Prahlad <pprahlad@redhat.com>
Reviewed-by: Richard Haines <richard_c_haines@btinternet.com>
Tested-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_transport_pl_hlen() is called to calculate the outer header length
for PL. However, as shown in the figure in rfc8899#section-4.4:
  Any additional
    headers         .--- MPS -----.
           |        |             |
           v        v             v
    +------------------------------+
    | IP | ** | PL | protocol data |
    +------------------------------+
               <----- PLPMTU ----->
    <---------- PMTU -------------->
The outer headers are IP + any additional headers, which do not include
the Packetization Layer's own header, namely sctphdr; sctphdr is already
counted by __sctp_mtu_payload().
The incorrect calculation caused the link pathmtu, computed as
t->pl.pmtu + sctp_transport_pl_hlen(), to be set larger than expected.
This patch fixes it by subtracting the sctphdr length in
sctp_transport_pl_hlen().
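The relationship in the figure can be sketched in plain C as below. This
is only an illustration under stated assumptions: the sketch_* names are
hypothetical, the header sizes assume IPv4 without options and an
optional UDP encapsulation header, and it is not the kernel
implementation of sctp_transport_pl_hlen().

```c
#include <stddef.h>

/* Illustrative header sizes in bytes (assumed: IPv4 without options). */
enum {
	SKETCH_IPV4_HLEN = 20,
	SKETCH_UDP_HLEN  = 8,
};

/* Outer header length as in rfc8899#section-4.4: IP plus any additional
 * headers (here an optional UDP header). The SCTP common header belongs
 * to the PL itself and is deliberately not counted.
 */
static size_t sketch_pl_hlen(int udp_encap)
{
	return SKETCH_IPV4_HLEN + (udp_encap ? SKETCH_UDP_HLEN : 0);
}

/* Link path MTU derived from the probed PLPMTU. */
static size_t sketch_pathmtu(size_t plpmtu, int udp_encap)
{
	return plpmtu + sketch_pl_hlen(udp_encap);
}
```

With these assumed sizes, a PLPMTU of 1480 without encapsulation maps to
a path MTU of 1500. Counting sctphdr in the outer length as well would
inflate that result by the size of the SCTP common header.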
Fixes: d9e2e410ae ("sctp: add the constants/variables and states and some APIs for transport")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_transport_pl_update() is called when a transport updates its dst
and pathmtu. Instead of stopping the PLPMTUD probe timer, PLPMTUD should
start over and reset the probe timer; otherwise, the PLPMTUD service
would stop.
Fixes: 92548ec2f1 ("sctp: add the probe timer in transport for PLPMTUD")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The transport encap_port should be updated when sctp_vtag_verify()
succeeds, namely when it returns 1, not when it returns 0. Correct it in
this patch.
While at it, also fix the indentation.
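The intended control flow can be sketched as below. This is a simplified
userspace illustration: sketch_vtag_verify() is a hypothetical stand-in
that only compares two tags, and the structure and helper names are
assumptions, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_transport {
	uint16_t encap_port;
};

/* Hypothetical stand-in for sctp_vtag_verify(): returns 1 on success. */
static int sketch_vtag_verify(uint32_t pkt_vtag, uint32_t my_vtag)
{
	return pkt_vtag == my_vtag;
}

/* Update the encap port only for a packet whose vtag verified. */
static bool sketch_rx_update_encap(struct sketch_transport *t,
				   uint32_t pkt_vtag, uint32_t my_vtag,
				   uint16_t udp_src_port)
{
	if (sketch_vtag_verify(pkt_vtag, my_vtag) != 1)
		return false;	/* unverified packet: leave encap_port alone */

	t->encap_port = udp_src_port;
	return true;
}
```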
Fixes: a1dd2cf2f1 ("sctp: allow changing transport encap_port by peer packets")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces last_rtx_chunks into sctp_transport to detect
whether any packet retransmission/loss has happened, by checking it
against asoc's rtx_data_chunks in sctp_transport_pl_send().
If it has, namely transport->last_rtx_chunks != asoc->rtx_data_chunks,
the pmtu probe will be sent out. Otherwise, in Search Complete state,
pl.raise_count is incremented and the function returns.
With this patch, Search Complete state, which lasts a long period, no
longer needs to keep probing the current pmtu unless there is data
packet loss. This saves quite some traffic.
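A minimal sketch of the check described above, with hypothetical
sketch_* names and a flattened structure. It assumes the transport
remembers the counter once a change has been seen; the real function
handles more state than shown here.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_transport {
	bool complete;			/* in Search Complete state? */
	unsigned int raise_count;
	uint32_t last_rtx_chunks;	/* snapshot of the asoc counter */
};

/* Returns true if a probe should be sent now. */
static bool sketch_pl_send(struct sketch_transport *t,
			   uint32_t asoc_rtx_data_chunks)
{
	if (t->last_rtx_chunks != asoc_rtx_data_chunks) {
		/* Retransmission seen since the last check: probe.
		 * Assumption: the snapshot is refreshed right here.
		 */
		t->last_rtx_chunks = asoc_rtx_data_chunks;
		return true;
	}

	if (t->complete) {
		/* No loss in Search Complete: just count towards the raise. */
		t->raise_count++;
		return false;
	}

	return true;	/* other states keep probing as before */
}
```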
v1->v2:
- add the missing Fixes tag.
Fixes: 0dac127c05 ("sctp: do black hole detection in search complete state")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does 3 things:
- make sctp_transport_pl_send() and sctp_transport_pl_recv()
  return bool to indicate whether another probe needs to be sent.
- call pr_debug() only when a probe really needs to be sent.
- count pl.raise_count in sctp_transport_pl_send() instead of
  sctp_transport_pl_recv(), and only increment it for the
  1st probe of the same size.
These are preparations for the next patch, which makes probes happen
only when there's packet loss in Search Complete state.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial conflict in net/netfilter/nf_tables_api.c.
Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.
skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ilja reported that, simply put, nothing was validating that the
from_addr_param functions were operating on initialized memory. That is,
the parameter itself was being validated by sctp_walk_params, but that
doesn't check for types and their specific sizes, so the parameter could
be a 0-length one, causing from_addr_param to potentially work over the
next parameter or even uninitialized memory.
The fix here is to check, in all calls to from_addr_param, that there is
enough space for the wanted IP address type.
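The kind of length check this calls for can be sketched as below. The
names are hypothetical and the sizes are assumptions based on the SCTP
address parameter layout (a 4-byte parameter header followed by a
4-byte IPv4 or 16-byte IPv6 address); it does not reproduce the
from_addr_param signature.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_addr_type { SKETCH_ADDR_V4, SKETCH_ADDR_V6 };

enum {
	SKETCH_PARAM_HDR_LEN = 4,	/* type (2 bytes) + length (2 bytes) */
	SKETCH_IPV4_ADDR_LEN = 4,
	SKETCH_IPV6_ADDR_LEN = 16,
};

/* Return true only if the parameter is long enough to hold a full
 * address of the wanted type, so the caller never reads past it.
 */
static bool sketch_addr_param_fits(enum sketch_addr_type type,
				   size_t param_len)
{
	size_t need = SKETCH_PARAM_HDR_LEN;

	need += (type == SKETCH_ADDR_V4) ? SKETCH_IPV4_ADDR_LEN
					 : SKETCH_IPV6_ADDR_LEN;
	return param_len >= need;
}
```

A 0-length or header-only parameter fails this check for both address
types, which is the case the report describes.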
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the PLPMTUD probe will stop for a long period (interval * 30)
after it enters search complete state. If there's a pmtu change on the
route path, it takes a long time to notice it if the ICMP TooBig packet
is lost or filtered.
As it says in rfc8899#section-4.3:
"A DPLPMTUD method MUST NOT rely solely on this method."
(ICMP PTB message).
This patch is to enable the other method for search complete state:
"A PL can use the DPLPMTUD probing mechanism to periodically
generate probe packets of the size of the current PLPMTU."
With this patch, the probe continues with the current pmtu every
'interval' until the PMTU_RAISE_TIMER 'timeout'. This is implemented by
adding raise_count, which raises the probe size when it counts to 30,
and by removing the SCTP_PL_COMPLETE check for PMTU_RAISE_TIMER.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, sctp over udp was using the udp tunnel's icmp err
processing, which only does the sk lookup on the sctp side. However,
sctp's icmp error processing has more to do, like syncing assoc pmtu and
retransmitting packets for toobig type err, and starting
proto_unreach_timer for unreach type err, etc.
Now that PLPMTUD has been added, it also requires processing toobig type
err on the sctp side. This patch processes icmp err on the sctp side by
parsing the type/code/info in .encap_err_lookup and calling sctp's icmp
processing functions. Note that as the 'redirect' err processing needs
to know the outer ip(v6) header, we have to leave it to udp(v6)_err to
handle.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As described in rfc8899#section-5.2, when a probe succeeds, there might
be the following state transitions:
- Base -> Search, occurs when probe succeeds with BASE_PLPMTU,
pl.pmtu is not changing,
pl.probe_size increases by SCTP_PL_BIG_STEP.
- Error -> Search, occurs when probe succeeds with BASE_PLPMTU,
pl.pmtu is changed from SCTP_MIN_PLPMTU to SCTP_BASE_PLPMTU,
pl.probe_size increases by SCTP_PL_BIG_STEP.
- Search -> Search Complete, occurs when probe succeeds with the probe
size SCTP_MAX_PLPMTU less than pl.probe_high,
pl.pmtu is not changing, but update *pathmtu* with it,
pl.probe_size is set back to pl.pmtu to double check it.
- Search Complete -> Search, occurs when probe succeeds with the probe
size equal to pl.pmtu,
pl.pmtu is not changing,
pl.probe_size increases by SCTP_PL_MIN_STEP.
So the search process can be described as:
1. When it just enters 'Search' state, *pathmtu* is not updated with
   pl.pmtu, and probe_size increases by a big step (SCTP_PL_BIG_STEP)
   each round.
2. This continues until a probe fails, at which point pl.probe_high is
   set and probe_size decreases back to pl.pmtu, as described in the
   last patch.
3. When the probe with the new size succeeds, probe_size switches to
   increasing by a small step (SCTP_PL_MIN_STEP) because pl.probe_high
   is set.
4. Once probe_size is next to pl.probe_high, the search finishes: it
   goes to 'Complete' state and updates *pathmtu* with pl.pmtu, and
   then probe_size is set to pl.pmtu to confirm it with one more probe.
5. This probe occurs after "30 * probe_interval", a much longer time
   than that in Search state. Once it is done, it goes to 'Search' state
   again with probe_size increased by SCTP_PL_MIN_STEP.
As we can see above, during the search pl.pmtu changes while *pathmtu*
doesn't. *pathmtu* is only updated when the search finishes, by which
point it has an optimal value. A big step is used at the beginning until
it gets close to the optimal value, then it changes to a small step until
it reaches this optimal value.
The small step is also used in 'Complete' until it goes to 'Search' state
again and the probe with 'pmtu + the small step' succeeds, which means a
higher size could be used. Then probe_size changes to increase by a big
step again until it gets close to the next optimal value.
Note that any time a black hole is detected, it goes directly to 'Base'
state with pl.pmtu set to SCTP_BASE_PLPMTU, as described in the last patch.
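The transitions above can be sketched as a small userspace state machine.
This is an illustration of the description only, not the kernel code:
all sketch_* names are hypothetical, the numeric constants are
illustrative assumptions, and the outer header length that the real
*pathmtu* includes is ignored.

```c
enum {
	SKETCH_BASE_PLPMTU = 1200,	/* illustrative values only */
	SKETCH_MAX_PLPMTU  = 9000,
	SKETCH_BIG_STEP    = 32,
	SKETCH_MIN_STEP    = 4,
};

enum sketch_state {
	SKETCH_ST_BASE, SKETCH_ST_ERROR, SKETCH_ST_SEARCH, SKETCH_ST_COMPLETE,
};

struct sketch_pl {
	enum sketch_state state;
	unsigned int pmtu;		/* largest size confirmed so far */
	unsigned int probe_size;	/* size of the next probe */
	unsigned int probe_high;	/* size of a failed probe, 0 if none */
	unsigned int pathmtu;		/* only updated when a search finishes */
};

/* Apply one successful probe of pl->probe_size bytes. */
static void sketch_probe_ok(struct sketch_pl *pl)
{
	switch (pl->state) {
	case SKETCH_ST_BASE:		/* Base -> Search */
		pl->state = SKETCH_ST_SEARCH;
		pl->probe_size += SKETCH_BIG_STEP;
		break;
	case SKETCH_ST_ERROR:		/* Error -> Search */
		pl->state = SKETCH_ST_SEARCH;
		pl->pmtu = SKETCH_BASE_PLPMTU;
		pl->probe_size += SKETCH_BIG_STEP;
		break;
	case SKETCH_ST_SEARCH:
		pl->pmtu = pl->probe_size;
		if (!pl->probe_high) {	/* no failure yet: big steps */
			pl->probe_size += SKETCH_BIG_STEP;
			if (pl->probe_size > SKETCH_MAX_PLPMTU)
				pl->probe_size = SKETCH_MAX_PLPMTU;
			break;
		}
		pl->probe_size += SKETCH_MIN_STEP;
		if (pl->probe_size >= pl->probe_high) {
			/* Search -> Search Complete */
			pl->probe_high = 0;
			pl->state = SKETCH_ST_COMPLETE;
			pl->probe_size = pl->pmtu;
			pl->pathmtu = pl->pmtu;
		}
		break;
	case SKETCH_ST_COMPLETE:	/* Search Complete -> Search */
		pl->state = SKETCH_ST_SEARCH;
		pl->probe_size += SKETCH_MIN_STEP;
		break;
	}
}
```

Starting from Base with pl.pmtu and probe_size at the base value, each
call models one acknowledged probe; probe_high would be set by the
failure path described in the previous patch.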
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The state transition is described in rfc8899#section-5.2.
PROBE_COUNT == MAX_PROBES means the probe has failed MAX_PROBES times,
and the state transitions include:
- Base -> Error, occurs when BASE_PLPMTU Confirmation Fails,
  pl.pmtu is set to SCTP_MIN_PLPMTU,
  probe_size is still SCTP_BASE_PLPMTU;
- Search -> Base, occurs when Black Hole Detected,
  pl.pmtu is set to SCTP_BASE_PLPMTU,
  probe_size is set back to SCTP_BASE_PLPMTU;
- Search Complete -> Base, occurs when Black Hole Detected,
  pl.pmtu is set to SCTP_BASE_PLPMTU,
  probe_size is set back to SCTP_BASE_PLPMTU.
Note that a black hole is encountered when a sender is unaware that
packets are not being delivered to the destination endpoint. So it
includes probe failures with probe_size equal to pl.pmtu, and definitely
does not include those with probe_size greater than pl.pmtu. The latter
is the normal probe failure, where probe_size should decrease back to
pl.pmtu and pl.probe_high is set. pl.probe_high will be used on the HB
ACK recv path in the next patch.
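The failure side can be sketched in the same way. Again this only
illustrates the text above: the sketch_* names are hypothetical, the
constant values are illustrative assumptions, and clearing probe_high on
a black hole is an assumption of this sketch rather than something
stated in the description.

```c
enum {
	SKETCH_MAX_PROBES  = 3,		/* illustrative values only */
	SKETCH_MIN_PLPMTU  = 512,
	SKETCH_BASE_PLPMTU = 1200,
};

enum sketch_state {
	SKETCH_ST_BASE, SKETCH_ST_ERROR, SKETCH_ST_SEARCH, SKETCH_ST_COMPLETE,
};

struct sketch_pl {
	enum sketch_state state;
	unsigned int pmtu;
	unsigned int probe_size;
	unsigned int probe_high;
	int probe_count;
};

/* Record one unanswered probe; act once SKETCH_MAX_PROBES were lost. */
static void sketch_probe_lost(struct sketch_pl *pl)
{
	if (++pl->probe_count < SKETCH_MAX_PROBES)
		return;
	pl->probe_count = 0;

	switch (pl->state) {
	case SKETCH_ST_BASE:		/* Base -> Error */
		pl->state = SKETCH_ST_ERROR;
		pl->pmtu = SKETCH_MIN_PLPMTU;
		break;			/* probe_size stays at the base size */
	case SKETCH_ST_SEARCH:
	case SKETCH_ST_COMPLETE:
		if (pl->probe_size == pl->pmtu) {
			/* Black hole: even the confirmed size is lost. */
			pl->state = SKETCH_ST_BASE;
			pl->pmtu = SKETCH_BASE_PLPMTU;
			pl->probe_size = SKETCH_BASE_PLPMTU;
			pl->probe_high = 0;	/* assumption, see lead-in */
		} else {
			/* Normal failure: remember the upper bound. */
			pl->probe_high = pl->probe_size;
			pl->probe_size = pl->pmtu;
		}
		break;
	case SKETCH_ST_ERROR:
		break;			/* keep probing with the base size */
	}
}
```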
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does exactly what rfc8899#section-6.2.1.2 says:
The SCTP sender needs to be able to determine the total size of a
probe packet. The HEARTBEAT chunk could carry a Heartbeat
Information parameter that includes, besides the information
suggested in [RFC4960], the probe size to help an implementation
associate a HEARTBEAT ACK with the size of probe that was sent. The
sender could also use other methods, such as sending a nonce and
verifying the information returned also contains the corresponding
nonce. The length of the PAD chunk is computed by reducing the
probing size by the size of the SCTP common header and the HEARTBEAT
chunk.
Note that the HB ACK chunk will carry back whatever the HB chunk carried,
including the probe_size we put in it. We also check hbinfo->probe_size
in the HB ACK against link->pl.probe_size to validate this HB ACK chunk.
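The arithmetic in the quoted RFC text and the HB ACK check can be
sketched as below. The sketch_* names are hypothetical, the 12-byte SCTP
common header size is the only protocol constant assumed, and the
HEARTBEAT chunk length is passed in by the caller because it depends on
the heartbeat information carried.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_SCTP_HDR_LEN = 12 };	/* SCTP common header */

/* PAD chunk length: the probe size minus the SCTP common header and
 * the HEARTBEAT chunk. Returns 0 if the probe is too small to pad.
 */
static size_t sketch_pad_len(size_t probe_size, size_t hb_chunk_len)
{
	size_t used = SKETCH_SCTP_HDR_LEN + hb_chunk_len;

	return probe_size > used ? probe_size - used : 0;
}

/* An HB ACK only confirms a probe if it echoes the outstanding size. */
static bool sketch_hb_ack_matches(uint32_t echoed_probe_size,
				  uint32_t outstanding_probe_size)
{
	return echoed_probe_size == outstanding_probe_size;
}
```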
v1->v2:
- Remove the unused 'sp' and add static for sctp_packet_bundle_pad().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are 3 timers described in rfc8899#section-5.1.1:
PROBE_TIMER, PMTU_RAISE_TIMER, CONFIRMATION_TIMER
This patch adds a 'probe_timer' in transport, and it works as either
PROBE_TIMER or PMTU_RAISE_TIMER. Most of the time, it works as
PROBE_TIMER and expires every 'probe_interval' to send the HB probe
packet. When transport pl enters COMPLETE state, it works as
PMTU_RAISE_TIMER and expires after 'probe_interval * 30' to go back to
SEARCH state and search again.
As SCTP HB is an acknowledged packet, CONFIRMATION_TIMER is not needed.
The timer will start when transport pl enters BASE state and stop
when it enters DISABLED state.
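How one timer can play both roles can be sketched as below; the function
and constant names are hypothetical, and only the factor of 30 comes
from the description above.

```c
enum { SKETCH_RAISE_FACTOR = 30 };

/* Next expiry of the single probe timer, in milliseconds: it acts as
 * PROBE_TIMER normally and as PMTU_RAISE_TIMER in COMPLETE state.
 */
static unsigned long sketch_probe_timeout_ms(int in_complete_state,
					     unsigned long probe_interval_ms)
{
	if (in_complete_state)
		return probe_interval_ms * SKETCH_RAISE_FACTOR;
	return probe_interval_ms;
}
```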
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the 4 constants described in rfc8899#section-5.1.2:
MAX_PROBES, MIN_PLPMTU, MAX_PLPMTU, BASE_PLPMTU;
the 2 variables described in rfc8899#section-5.1.3:
PROBED_SIZE, PROBE_COUNT;
and the 5 states described in rfc8899#section-5.2:
DISABLED, BASE, SEARCH, SEARCH_COMPLETE, ERROR.
It also adds 4 APIs, which are used to reset/update PLPMTUD, check if
PLPMTUD is enabled, and calculate the additional headers length for a
transport.
Note that the member 'probe_high' in transport will be set, in the next
patches, to the probe size with which a probe fails.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PLPMTUD can be enabled by doing 'sysctl -w net.sctp.probe_interval=n'.
'n' is the interval for PLPMTUD probe timer in milliseconds, and it
can't be less than 5000 if it's not 0.
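The accepted range can be expressed as the check below. This is a
sketch of the rule as stated, with hypothetical names; it is not the
sysctl handler itself.

```c
#include <stdbool.h>

enum { SKETCH_PROBE_INTERVAL_MIN_MS = 5000 };

/* 0 disables PLPMTUD; any other value must be at least 5000 ms. */
static bool sketch_probe_interval_valid(int interval_ms)
{
	return interval_ms == 0 ||
	       interval_ms >= SKETCH_PROBE_INTERVAL_MIN_MS;
}
```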
All asoc/transport's PLPMTUD in a new socket will be enabled by default.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This chunk is defined in rfc4820#section-3, and used to pad an
SCTP packet. The receiver must discard this chunk and continue
processing the rest of the chunks in the packet.
Add it now, as it will be bundled with a heartbeat chunk to probe
pmtu in the following patches.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same thing should be done for sctp_sf_do_dupcook_b().
Meanwhile, SCTP_CMD_UPDATE_ASSOC cmd can be removed.
v1->v2:
- Fix the return value in sctp_sf_do_assoc_update().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet(6)_skb_parm was removed from sctp_input_cb by commit a1dd2cf2f1
("sctp: allow changing transport encap_port by peer packets"), on the
assumption that sctp_input_cb->header was no longer used in SCTP.
syzbot reported a crash:
[ ] BUG: KASAN: use-after-free in decode_session6+0xe7c/0x1580
[ ]
[ ] Call Trace:
[ ] <IRQ>
[ ] dump_stack+0x107/0x163
[ ] kasan_report.cold+0x1f/0x37
[ ] decode_session6+0xe7c/0x1580
[ ] __xfrm_policy_check+0x2fa/0x2850
[ ] sctp_rcv+0x12b0/0x2e30
[ ] sctp6_rcv+0x22/0x40
[ ] ip6_protocol_deliver_rcu+0x2e8/0x1680
[ ] ip6_input_finish+0x7f/0x160
[ ] ip6_input+0x9c/0xd0
[ ] ipv6_rcv+0x28e/0x3c0
It was caused by sctp_input_cb->header/IP6CB(skb) still being used by
decode_session6() in the sctp rx path, while some of its members had
been overwritten by sctp6_rcv().
This patch fixes it by bringing inet(6)_skb_parm back to sctp_input_cb
and not overwriting it in sctp4/6_rcv() and sctp_udp_rcv().
Reported-by: syzbot+5be8aebb1b7dfa90ef31@syzkaller.appspotmail.com
Fixes: a1dd2cf2f1 ("sctp: allow changing transport encap_port by peer packets")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/136c1a7a419341487c504be6d1996928d9d16e02.1604472932.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds the function to make the abort chunk with
the error cause for new encapsulation port restart, defined
in Section 4.4 of draft-tuexen-tsvwg-sctp-udp-encaps-cons-03.
v1->v2:
- no change.
v2->v3:
- no need to call htons() when setting nep.cur_port/new_port.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sctp_mtu_payload() calculates the frag size before making chunks from
a msg. So we should only add the udphdr size to the overhead when udp
socks are listening, as only then can sctp handle incoming sctp over
udp packets, and only then are outgoing sctp over udp packets possible.
Note that we can't do this according to transport->encap_port, as
different transports may be set to different values, while the
chunks are made before choosing the transport, so we are not able
to meet everything that rfc6951#section-5.6 recommends.
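The overhead calculation can be sketched as below. The names are
hypothetical; the 12-byte SCTP common header and 8-byte UDP header sizes
are the assumed constants, and 'extra' stands for whatever chunk header
overhead the caller wants to reserve.

```c
enum {
	SKETCH_SCTP_HLEN = 12,	/* SCTP common header */
	SKETCH_UDP_HLEN  = 8,	/* UDP header */
};

/* Payload left in one packet. The UDP header is only reserved when the
 * UDP encapsulation socket is listening, not per transport.
 */
static unsigned int sketch_mtu_payload(unsigned int mtu,
				       unsigned int net_hlen,
				       int udp_sock_listening,
				       unsigned int extra)
{
	unsigned int overhead = net_hlen + SKETCH_SCTP_HLEN + extra;

	if (udp_sock_listening)
		overhead += SKETCH_UDP_HLEN;

	return mtu > overhead ? mtu - overhead : 0;
}
```

Reserving the UDP header whenever the socket is listening may waste 8
bytes on transports that end up not encapsulating, which is the
trade-off described above.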
v1->v2:
- Add udp_port for sctp_sock to avoid a potential race issue, it
will be used in xmit path in the next patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As rfc6951#section-5.4 says:
"After finding the SCTP association (which
includes checking the verification tag), the UDP source port MUST be
stored as the encapsulation port for the destination address the SCTP
packet is received from (see Section 5.1).
When a non-encapsulated SCTP packet is received by the SCTP stack,
the encapsulation of outgoing packets belonging to the same
association and the corresponding destination address MUST be
disabled."
The transport encap_port should be updated by a validated incoming
packet's udp src port.
We save the udp src port in sctp_input_cb->encap_port, and then update
the transport in three places:
1. right after vtag is verified, which is required by RFC, and this
   allows the existing transports to be updated by the chunks that
   can only be processed on an asoc.
2. right before processing the 'init' where the transports are added,
   and this allows building a sctp over udp connection by client with
   the server not knowing the remote encap port.
3. when processing ootb_pkt and creating the temporary transport for
   the reply pkt.
Note that sctp_input_cb->header is removed, as it's not used any more
in sctp.
v1->v2:
- Change encap_port as __be16 for sctp_input_cb.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
encap_port is added per netns/sock/assoc/transport, and by default
each level inherits its encap_port from the one before it.
The transport's encap_port value mostly decides whether a packet
goes out udp encapsulated or not.
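The inheritance chain can be sketched as below. All types and helpers
are hypothetical simplifications, the port is kept in host byte order
for readability, and treating a nonzero transport encap_port as "use udp
encapsulation" is a simplifying assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_net       { uint16_t encap_port; };
struct sketch_sock      { uint16_t encap_port; };
struct sketch_asoc      { uint16_t encap_port; };
struct sketch_transport { uint16_t encap_port; };

/* Each level copies the value of the level above when it is created. */
static struct sketch_sock sketch_sock_init(const struct sketch_net *net)
{
	struct sketch_sock sk = { net->encap_port };
	return sk;
}

static struct sketch_asoc sketch_asoc_init(const struct sketch_sock *sk)
{
	struct sketch_asoc asoc = { sk->encap_port };
	return asoc;
}

static struct sketch_transport
sketch_transport_init(const struct sketch_asoc *asoc)
{
	struct sketch_transport t = { asoc->encap_port };
	return t;
}

/* Simplified decision: encapsulate when the transport has a port set. */
static bool sketch_use_udp_encap(const struct sketch_transport *t)
{
	return t->encap_port != 0;
}
```

Because each level holds its own copy, changing the value on a sock only
affects associations created afterwards, not existing transports.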
This patch also allows users to set netns' encap_port by sysctl.
v1->v2:
- Change to define encap_port as __be16 for sctp_sock, asoc and
transport.
v2->v3:
- No change.
v3->v4:
- Add 'encap_port' entry in ip-sysctl.rst.
v4->v5:
- Improve the description of encap_port in ip-sysctl.rst.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds the functions to create/release the udp4 sock,
and sets the sock's encap_rcv to process incoming udp encap
sctp packets. In sctp_udp_rcv(), all we need to do is fix the
transport header for sctp_rcv(), which then implements this
part of rfc6951#section-5.4:
"When an encapsulated packet is received, the UDP header is removed.
Then, the generic lookup is performed, as done by an SCTP stack
whenever a packet is received, to find the association for the
received SCTP packet"
Note that these functions will be called in the last patch of
this patchset when enabling this feature.
v1->v2:
- Add pr_err() when it fails to create the udp v4 sock.
v2->v3:
- Add 'select NET_UDP_TUNNEL' in sctp Kconfig.
v3->v4:
- No change.
v4->v5:
- Change to set udp_port to 0 by default.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>