Commit Graph

172 Commits

Author SHA1 Message Date
Wei Yongjun
ad025a56a5 tipc: remove duplicated include from socket.c
Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 15:51:14 -07:00
Wei Yongjun
52f50ce556 tipc: fix sparse non static symbol warnings
Fixes the following sparse warnings:

net/tipc/socket.c:545:5: warning:
 symbol 'tipc_sk_proto_rcv' was not declared. Should it be static?
net/tipc/socket.c:2015:5: warning:
 symbol 'tipc_ioctl' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 22:19:04 -07:00
Jon Paul Maloy
9fbfb8b120 tipc: rename temporarily named functions
After the previous commit, we can now give the functions with temporary
names, such as tipc_link_xmit2(), tipc_msg_build2() etc., their proper
names.

There are no functional changes in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:19 -07:00
Jon Paul Maloy
0abd8ff21f tipc: start using the new multicast functions
In this commit, we convert the socket multicast send function to
directly call the new multicast/broadcast function (tipc_bclink_xmit2())
introduced in the previous commit. We do this instead of letting the
call go via the now obsolete tipc_port_mcast_xmit(), hence saving
a call level and some code complexity.

We also remove the initial destination lookup at the message sending
side, and replace that with an unconditional lookup at the receiving
side, including on the sending node itself. This makes the destination
lookup and message transfer more uniform than before.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:18 -07:00
Jon Paul Maloy
078bec826f tipc: add new functions for multicast and broadcast distribution
We add a new broadcast link transmit function in bclink.c and a new
receive function in socket.c. The purpose is to move the branching
between external and internal destination down to the link layer,
just as we have done with unicast in earlier commits. We also make
use of the new link-independent fragmentation support that was
introduced in an earlier commit series.

This gives a shorter and simpler code path, and makes it possible
to obtain copy-free buffer delivery to all node local destination
sockets.

The new transmission code is added in parallel with the existing one,
and will be used by the socket multicast send function in the next
commit in this series.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:18 -07:00
Fabian Frederick
7ceaa583be tipc: remove unnecessary break after return
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:26:59 -07:00
Erik Hugne
70452dcb6d tipc: fix a memleak when sending data
This fixes a regression bug caused by:
067608e9d0 ("tipc: introduce direct
iovec to buffer chain fragmentation function")

If data is sent on a nonblocking socket and the destination link
is congested, the buffer chain is leaked. We fix this by freeing
the chain in this case.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 16:10:01 -07:00
Jon Paul Maloy
60120526c2 tipc: simplify connection congestion handling
As a consequence of the recently introduced serialized access
to the socket in commit 8d94168a761819d10252bab1f8de6d7b202c3baa
("tipc: same receive code path for connection protocol and data
messages") we can make a number of simplifications in the
detection and handling of connection congestion situations.

- We don't need to keep two counters, one for sent messages and one
  for acked messages. There is no longer any risk for races between
  acknowledge messages arriving in BH and data message sending
  running in user context. So we merge this into one counter,
  'sent_unacked', which is incremented at sending and subtracted
  from at acknowledge reception.

- We don't need to set the 'congested' field in tipc_port to
  true before we sent the message, and clear it when sending
  is successful. (As a matter of fact, it was never necessary;
  the field was set in link_schedule_port() before any wakeup
  could arrive anyway.)

- We keep the conditions for link congestion and connection connection
  congestion separated. There would otherwise be a risk that an arriving
  acknowledge message may wake up a user sleeping because of link
  congestion.

- We can simplify reception of acknowledge messages.

We also make some cosmetic/structural changes:

- We rename the 'congested' field to the more correct 'link_cong´.

- We rename 'conn_unacked' to 'rcv_unacked'

- We move the above mentioned fields from struct tipc_port to
  struct tipc_sock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy
ac0074ee70 tipc: clean up connection protocol reception function
We simplify the code for receiving connection probes, leveraging the
recently introduced tipc_msg_reverse() function. We also stick to
the principle of sending a possible response message directly from
the calling (tipc_sk_rcv or backlog_rcv) functions, hence making
the call chain shallower and easier to follow.

We make one small protocol change here, allowed according to
the spec. If a protocol message arrives from a remote socket that
is not the one we are connected to, we are currently generating a
connection abort message and send it to the source. This behavior
is unnecessary, and might even be a security risk, so instead we
now choose to only ignore the message. The consequnce for the sender
is that he will need longer time to discover his mistake (until the
next timeout), but this is an extreme corner case, and may happen
anyway under other circumstances, so we deem this change acceptable.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy
ec8a2e5621 tipc: same receive code path for connection protocol and data messages
As a preparation to eliminate port_lock we need to bring reception
of connection protocol messages under proper protection of bh_lock_sock
or socket owner.

We fix this by letting those messages follow the same code path as
incoming data messages.

As a side effect of this change, the last reference to the function
net_route_msg() disappears, and we can eliminate that function.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy
4ccfe5e041 tipc: connection oriented transport uses new send functions
We move the message sending across established connections
to use the message preparation and send functions introduced
earlier in this series. We now do the message preparation
and call to the link send function directly from the socket,
instead of going via the port layer.

As a consequence of this change, the functions tipc_send(),
tipc_port_iovec_rcv(), tipc_port_iovec_reject() and tipc_reject_msg()
become unreferenced and can be eliminated from port.c. For the same
reason, the functions tipc_link_xmit_fast(), tipc_link_iovec_xmit_long()
and tipc_link_iovec_fast() can be eliminated from link.c.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
e2dafe87d3 tipc: RDM/DGRAM transport uses new fragmenting and sending functions
We merge the code for sending port name and port identity addressed
messages into the corresponding send functions in socket.c, and start
using the new fragmenting and transmit functions we just have introduced.

This saves a call level and quite a few code lines, as well as making
this part of the code easier to follow. As a consequence, the functions
tipc_send2name() and tipc_send2port() in port.c can be removed.

For practical reasons, we break out the code for sending multicast messages
from tipc_sendmsg() and move it into a separate function, tipc_sendmcast(),
but we do not yet convert it into using the new build/send functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
5a379074a7 tipc: introduce message evaluation function
When a message arrives in a node and finds no destination
socket, we may need to drop it, reject it, or forward it after
a secondary destination lookup. The latter two cases currently
results in a code path that is perceived as complex, because it
follows a deep call chain via obscure functions such as
net_route_named_msg() and net_route_msg().

We now introduce a function, tipc_msg_eval(), that takes the
decision about whether such a message should be rejected or
forwarded, but leaves it to the caller to actually perform
the indicated action.

If the decision is 'reject', it is still the task of the recently
introduced function tipc_msg_reverse() to take the final decision
about whether the message is rejectable or not. In the latter case
it drops the message.

As a result of this change, we can finally eliminate the function
net_route_named_msg(), and hence become independent of net_route_msg().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
8db1bae30b tipc: separate building and sending of rejected messages
The way we build and send rejected message is currenty perceived as
hard to follow, partly because we let the transmission go via deep
call chains through functions such as tipc_reject_msg() and
net_route_msg().

We want to remove those functions, and make the call sequences shallower
and simpler. For this purpose, we separate building and sending of
rejected messages. We build the reject message using the new function
tipc_msg_reverse(), and let the transmission go via the newly introduced
tipc_link_xmit2() function, as all transmission eventually will do. We
also ensure that all calls to tipc_link_xmit2() are made outside
port_lock/bh_lock_sock.

Finally, we replace all calls to tipc_reject_msg() with the two new
calls at all locations in the code that we want to keep. The remaining
calls are made from code that we are planning to remove, along with
tipc_reject_msg() itself.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
e4de5fab80 tipc: use negative error return values in functions
In some places, TIPC functions returns positive integers as return
codes. This goes against standard Linux coding practice, and may
even cause problems in some cases.

We now change the return values of the functions filter_rcv()
and filter_connect() to become signed integers, and return
negative error codes when needed. The codes we use in these
particular cases are still TIPC specific, since they are both
part of the TIPC API and have no correspondence in errno.h

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:54 -07:00
Jon Paul Maloy
02c00c2ab0 tipc: fix potential bug in function tipc_backlog_rcv
In commit 4f4482dcd9 ("tipc: compensate
for double accounting in socket rcv buffer") we access 'truesize' of
a received buffer after it might have been released by the function
filter_rcv().

In this commit we correct this by reading the value of 'truesize' to
the stack before delivering the buffer to filter_rcv().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:01:30 -07:00
Arnaldo Carvalho de Melo
85d3fc9418 tipc: Don't reset the timeout when restarting
As it may then take longer than what the user specified using
setsockopt(SO_RCVTIMEO).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 14:11:41 -04:00
Jon Paul Maloy
9816f0615d tipc: merge port message reception into socket reception function
In order to reduce complexity and save a call level during message
reception at port/socket level, we remove the function tipc_port_rcv()
and merge its functionality into tipc_sk_rcv().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
4f4482dcd9 tipc: compensate for double accounting in socket rcv buffer
The function net/core/sock.c::__release_sock() runs a tight loop
to move buffers from the socket backlog queue to the receive queue.

As a security measure, sk_backlog.len of the receiving socket
is not set to zero until after the loop is finished, i.e., until
the whole backlog queue has been transferred to the receive queue.
During this transfer, the data that has already been moved is counted
both in the backlog queue and the receive queue, hence giving an
incorrect picture of the available queue space for new arriving buffers.

This leads to unnecessary rejection of buffers by sk_add_backlog(),
which in TIPC leads to unnecessarily broken connections.

In this commit, we compensate for this double accounting by adding
a counter that keeps track of it. The function socket.c::backlog_rcv()
receives buffers one by one from __release_sock(), and adds them to the
socket receive queue. If the transfer is successful, it increases a new
atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the
transferred buffer. If a new buffer arrives during this transfer and
finds the socket busy (owned), we attempt to add it to the backlog.
However, when sk_add_backlog() is called, we adjust the 'limit'
parameter with the value of the new counter, so that the risk of
inadvertent rejection is eliminated.

It should be noted that this change does not invalidate the original
purpose of zeroing 'sk_backlog.len' after the full transfer. We set an
upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that
doesn't respect the send window) keeps pumping in buffers to
sk_add_backlog(), he will eventually reach an upper limit,
(2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added
to the backlog, and the connection will be broken. Ordinary, well-
behaved senders will never reach this buffer limit at all.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:47 -04:00
Jon Paul Maloy
6163a194e0 tipc: decrease connection flow control window
Memory overhead when allocating big buffers for data transfer may
be quite significant. E.g., truesize of a 64 KB buffer turns out
to be 132 KB, 2 x the requested size.

This invalidates the "worst case" calculation we have been
using to determine the default socket receive buffer limit,
which is based on the assumption that 1024x64KB = 67MB buffers
may be queued up on a socket.

Since TIPC connections cannot survive hitting the buffer limit,
we have to compensate for this overhead.

We do that in this commit by dividing the fix connection flow
control window from 1024 (2*512) messages to 512 (2*256). Since
older version nodes send out acks at 512 message intervals,
compatibility with such nodes is guaranteed, although performance
may be non-optimal in such cases.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:47 -04:00
Erik Hugne
78acb1f9b8 tipc: add ioctl to fetch link names
We add a new ioctl for AF_TIPC that can be used to fetch the
logical name for a link to a remote node on a given bearer. This
should be used in combination with link state subscriptions.
The logical name size limit definitions are moved to tipc.h, as
they are now also needed by the new ioctl.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-26 12:13:24 -04:00
David S. Miller
676d23690f net: Fix use after free by removing length arg from sk_data_ready callbacks.
Several spots in the kernel perform a sequence like:

	skb_queue_tail(&sk->s_receive_queue, skb);
	sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11 16:15:36 -04:00
Geert Uytterhoeven
065d7e3956 tipc: Let tipc_release() return 0
net/tipc/socket.c: In function ‘tipc_release’:
net/tipc/socket.c:352: warning: ‘res’ is used uninitialized in this function

Introduced by commit 24be34b5a0 ("tipc:
eliminate upcall function pointers between port and socket"), which
removed the sole initializer of "res".

Just return 0 to fix it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 15:08:17 -04:00
David S. Miller
85dcce7a73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:31:55 -04:00
Jon Paul Maloy
5c311421a2 tipc: eliminate redundant lookups in registry
As an artefact from the native interface, the message sending functions
in the port takes a port ref as first parameter, and then looks up in
the registry to find the corresponding port pointer. This despite the
fact that the only currently existing caller, tipc_sock, already knows
this pointer.

We change the signature of these functions to take a struct tipc_port*
argument, and remove the redundant lookups.

We also remove an unmotivated extra lookup in the function
socket.c:auto_connect(), and, as the lookup functions tipc_port_deref()
and ref_deref() now become unused, we remove these two functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00