Commit Graph

810 Commits

Author SHA1 Message Date
Jon Paul Maloy
25b660c7e2 tipc: let internal link users call the new link send function
We convert the link internal users (changeover protocol, broadcast
synchronization) to use the new packet send function.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:18 -07:00
Jon Paul Maloy
dbdf6d24ad tipc: make name table distributor use new send function
In a previous commit series ("tipc: new unicast transmission code")
we introduced a new message sending function, tipc_link_xmit2(),
and moved the unicast data users over to use that function. We now
let the internal name table distributor do the same.

The interaction between the name distributor and the node/link
layer also becomes significantly simpler, so we can eliminate
the function tipc_link_names_xmit().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:18 -07:00
David S. Miller
1a98c69af1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 14:09:34 -07:00
Fabian Frederick
7ceaa583be tipc: remove unnecessary break after return
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:26:59 -07:00
Jon Paul Maloy
999417549c tipc: clear 'next'-pointer of message fragments before reassembly
If the 'next' pointer of the last fragment buffer in a message is not
zeroed before reassembly, we risk ending up with a corrupt message,
since the reassembly function itself isn't doing this.

Currently, when a buffer is retrieved from the deferred queue of the
broadcast link, the next pointer is not cleared, with the result as
described above.

This commit corrects this, and thereby fixes a bug that may occur when
long broadcast messages are transmitted across dual interfaces. The bug
has been present since 40ba3cdf54 ("tipc:
message reassembly using fragment chain")

This commit should be applied to both net and net-next.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-11 15:02:10 -07:00
Erik Hugne
70452dcb6d tipc: fix a memleak when sending data
This fixes a regression bug caused by:
067608e9d0 ("tipc: introduce direct
iovec to buffer chain fragmentation function")

If data is sent on a nonblocking socket and the destination link
is congested, the buffer chain is leaked. We fix this by freeing
the chain in this case.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 16:10:01 -07:00
Jon Paul Maloy
29322d0db9 tipc: fix bug in multicast/broadcast message reassembly
Since commit 37e22164a8 ("tipc: rename and
move message reassembly function") reassembly of long broadcast messages
has been broken. This is because we test for a non-NULL return value
of the *buf parameter as criteria for succesful reassembly. However, this
parameter is left defined even after reception of the first fragment,
when reassebly is still incomplete. This leads to a kernel crash as soon
as a the first fragment of a long broadcast message is received.

We fix this with this commit, by implementing a stricter behavior of the
function and its return values.

This commit should be applied to both net and net-next.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 15:55:09 -07:00
Erik Hugne
3f53bd8f8b tipc: fix link acknowledge logic in receive path
Link state acks triggered from the receive path is done before
the last received packet have been processed by the link layer.
The effect of this is that the last received packet will not be
included in the ack. This causes problems if the link window is
set to TIPC_MIN_LINK_WIN, where the ack interval will be equal to
the link tolerance, and the link enters a stop-and-go behavior.
We move the ack logic to after link state processing, just before
the packet is delivered to higher layers.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Carl Sigurjonsson <carl.sigurjonsson@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:55:07 -07:00
Erik Hugne
7ae934bebe tipc: refactor message delivery out of tipc_rcv
This is a cosmetic change, separating message delivery from the
link state processing.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:55:07 -07:00
Jon Paul Maloy
60120526c2 tipc: simplify connection congestion handling
As a consequence of the recently introduced serialized access
to the socket in commit 8d94168a761819d10252bab1f8de6d7b202c3baa
("tipc: same receive code path for connection protocol and data
messages") we can make a number of simplifications in the
detection and handling of connection congestion situations.

- We don't need to keep two counters, one for sent messages and one
  for acked messages. There is no longer any risk for races between
  acknowledge messages arriving in BH and data message sending
  running in user context. So we merge this into one counter,
  'sent_unacked', which is incremented at sending and subtracted
  from at acknowledge reception.

- We don't need to set the 'congested' field in tipc_port to
  true before we sent the message, and clear it when sending
  is successful. (As a matter of fact, it was never necessary;
  the field was set in link_schedule_port() before any wakeup
  could arrive anyway.)

- We keep the conditions for link congestion and connection connection
  congestion separated. There would otherwise be a risk that an arriving
  acknowledge message may wake up a user sleeping because of link
  congestion.

- We can simplify reception of acknowledge messages.

We also make some cosmetic/structural changes:

- We rename the 'congested' field to the more correct 'link_cong´.

- We rename 'conn_unacked' to 'rcv_unacked'

- We move the above mentioned fields from struct tipc_port to
  struct tipc_sock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy
ac0074ee70 tipc: clean up connection protocol reception function
We simplify the code for receiving connection probes, leveraging the
recently introduced tipc_msg_reverse() function. We also stick to
the principle of sending a possible response message directly from
the calling (tipc_sk_rcv or backlog_rcv) functions, hence making
the call chain shallower and easier to follow.

We make one small protocol change here, allowed according to
the spec. If a protocol message arrives from a remote socket that
is not the one we are connected to, we are currently generating a
connection abort message and send it to the source. This behavior
is unnecessary, and might even be a security risk, so instead we
now choose to only ignore the message. The consequnce for the sender
is that he will need longer time to discover his mistake (until the
next timeout), but this is an extreme corner case, and may happen
anyway under other circumstances, so we deem this change acceptable.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy
ec8a2e5621 tipc: same receive code path for connection protocol and data messages
As a preparation to eliminate port_lock we need to bring reception
of connection protocol messages under proper protection of bh_lock_sock
or socket owner.

We fix this by letting those messages follow the same code path as
incoming data messages.

As a side effect of this change, the last reference to the function
net_route_msg() disappears, and we can eliminate that function.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy
b786e2b0fa tipc: let port protocol senders use new link send function
Several functions in port.c, related to the port protocol and
connection shutdown, need to send messages. We now convert them
to use the new link send function.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
4ccfe5e041 tipc: connection oriented transport uses new send functions
We move the message sending across established connections
to use the message preparation and send functions introduced
earlier in this series. We now do the message preparation
and call to the link send function directly from the socket,
instead of going via the port layer.

As a consequence of this change, the functions tipc_send(),
tipc_port_iovec_rcv(), tipc_port_iovec_reject() and tipc_reject_msg()
become unreferenced and can be eliminated from port.c. For the same
reason, the functions tipc_link_xmit_fast(), tipc_link_iovec_xmit_long()
and tipc_link_iovec_fast() can be eliminated from link.c.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
e2dafe87d3 tipc: RDM/DGRAM transport uses new fragmenting and sending functions
We merge the code for sending port name and port identity addressed
messages into the corresponding send functions in socket.c, and start
using the new fragmenting and transmit functions we just have introduced.

This saves a call level and quite a few code lines, as well as making
this part of the code easier to follow. As a consequence, the functions
tipc_send2name() and tipc_send2port() in port.c can be removed.

For practical reasons, we break out the code for sending multicast messages
from tipc_sendmsg() and move it into a separate function, tipc_sendmcast(),
but we do not yet convert it into using the new build/send functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
5a379074a7 tipc: introduce message evaluation function
When a message arrives in a node and finds no destination
socket, we may need to drop it, reject it, or forward it after
a secondary destination lookup. The latter two cases currently
results in a code path that is perceived as complex, because it
follows a deep call chain via obscure functions such as
net_route_named_msg() and net_route_msg().

We now introduce a function, tipc_msg_eval(), that takes the
decision about whether such a message should be rejected or
forwarded, but leaves it to the caller to actually perform
the indicated action.

If the decision is 'reject', it is still the task of the recently
introduced function tipc_msg_reverse() to take the final decision
about whether the message is rejectable or not. In the latter case
it drops the message.

As a result of this change, we can finally eliminate the function
net_route_named_msg(), and hence become independent of net_route_msg().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
8db1bae30b tipc: separate building and sending of rejected messages
The way we build and send rejected message is currenty perceived as
hard to follow, partly because we let the transmission go via deep
call chains through functions such as tipc_reject_msg() and
net_route_msg().

We want to remove those functions, and make the call sequences shallower
and simpler. For this purpose, we separate building and sending of
rejected messages. We build the reject message using the new function
tipc_msg_reverse(), and let the transmission go via the newly introduced
tipc_link_xmit2() function, as all transmission eventually will do. We
also ensure that all calls to tipc_link_xmit2() are made outside
port_lock/bh_lock_sock.

Finally, we replace all calls to tipc_reject_msg() with the two new
calls at all locations in the code that we want to keep. The remaining
calls are made from code that we are planning to remove, along with
tipc_reject_msg() itself.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
067608e9d0 tipc: introduce direct iovec to buffer chain fragmentation function
Fragmentation at message sending is currently performed in two
places in link.c, depending on whether data to be transmitted
is delivered in the form of an iovec or as a big sk_buff. Those
functions are also tightly entangled with the send functions
that are using them.

We now introduce a re-entrant, standalone function, tipc_msg_build2(),
that builds a packet chain directly from an iovec. Each fragment is
sized according to the MTU value given by the caller, and is prepended
with a correctly built fragment header, when needed. The function is
independent from who is calling and where the chain will be delivered,
as long as the caller is able to indicate a correct MTU.

The function is tested, but not called by anybody yet. Since it is
incompatible with the existing tipc_msg_build(), and we cannot yet
remove that function, we have given it a temporary name.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
16e166b88c tipc: make link mtu easily accessible from socket
Message fragmentation is currently performed at link level, inside
the protection of node_lock. This potentially binds up the sending
link structure for a long time, instead of letting it do other tasks,
such as handle reception of new packets.

In this commit, we make the MTUs of each active link become easily
accessible from the socket level, i.e., without taking any spinlock
or dereferencing the target link pointer. This way, we make it possible
to perform fragmentation in the sending socket, before sending the
whole fragment chain to the link for transport.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy
4f1688b2c6 tipc: introduce send functions for chained buffers in link
The current link implementation provides several different transmit
functions, depending on the characteristics of the message to be
sent: if it is an iovec or an sk_buff, if it needs fragmentation or
not, if the caller holds the node_lock or not. The permutation of
these options gives us an unwanted amount of unnecessarily complex
code.

As a first step towards simplifying the send path for all messages,
we introduce two new send functions at link level, tipc_link_xmit2()
and __tipc_link_xmit2(). The former looks up a link to the message
destination, and if one is found, it grabs the node lock and calls
the second function, which works exclusively inside the node lock
protection. If no link is found, and the destination is on the same
node, it delivers the message directly to the local destination
socket.

The new functions take a buffer chain where all packet headers are
already prepared, and the correct MTU has been used. These two
functions will later replace all other link-level transmit functions.

The functions are not backwards compatible, so we have added them
as new functions with temporary names. They are tested, but have no
users yet. Those will be added later in this series.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:54 -07:00
Jon Paul Maloy
e4de5fab80 tipc: use negative error return values in functions
In some places, TIPC functions returns positive integers as return
codes. This goes against standard Linux coding practice, and may
even cause problems in some cases.

We now change the return values of the functions filter_rcv()
and filter_connect() to become signed integers, and return
negative error codes when needed. The codes we use in these
particular cases are still TIPC specific, since they are both
part of the TIPC API and have no correspondence in errno.h

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:54 -07:00
Jon Paul Maloy
3d09fc4244 tipc: eliminate case of writing to freed memory
In the function tipc_nodesub_notify() we call a function pointer
aggregated into the object to be notified, whereafter we set
the function pointer to NULL. However, in some cases the function
pointed to will free the struct containing the function pointer,
resulting in a write to already freed memory.

This bug seems to always have been there, without causing any
notable harm.

In this commit we fix the problem by inverting the order of the
zeroing and the function call.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:54 -07:00
Octavian Purdila
bad93e9d4e net: add __pskb_copy_fclone and pskb_copy_for_clone
There are several instances where a pskb_copy or __pskb_copy is
immediately followed by an skb_clone.

Add a couple of new functions to allow the copy skb to be allocated
from the fclone cache and thus speed up subsequent skb_clone calls.

Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <antonio@meshcoding.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Arvid Brodin <arvid.brodin@alten.se>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:38:02 -07:00
Jon Paul Maloy
02c00c2ab0 tipc: fix potential bug in function tipc_backlog_rcv
In commit 4f4482dcd9 ("tipc: compensate
for double accounting in socket rcv buffer") we access 'truesize' of
a received buffer after it might have been released by the function
filter_rcv().

In this commit we correct this by reading the value of 'truesize' to
the stack before delivering the buffer to filter_rcv().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:01:30 -07:00
Arnaldo Carvalho de Melo
85d3fc9418 tipc: Don't reset the timeout when restarting
As it may then take longer than what the user specified using
setsockopt(SO_RCVTIMEO).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 14:11:41 -04:00
Jon Paul Maloy
9816f0615d tipc: merge port message reception into socket reception function
In order to reduce complexity and save a call level during message
reception at port/socket level, we remove the function tipc_port_rcv()
and merge its functionality into tipc_sk_rcv().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
c82910e2a8 tipc: clean up neigbor discovery message reception
The function tipc_disc_rcv(), which is handling received neighbor
discovery messages, is perceived as messy, and it is hard to verify
its correctness by code inspection. The fact that the task it is set
to resolve is fairly complex does not make the situation better.

In this commit we try to take a more systematic approach to the
problem. We define a decision machine which takes three state flags
 as input, and produces three action flags as output. We then walk
through all permutations of the state flags, and for each of them we
describe verbally what is going on, plus that we set zero or more of
the action flags. The action flags indicate what should be done once
the decision machine has finished its job, while the last part of the
function deals with performing those actions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
38504c28a2 tipc: improve and extend media address conversion functions
TIPC currently handles two media specific addresses: Ethernet MAC
addresses and InfiniBand addresses. Those are kept in three different
formats:

1) A "raw" format as obtained from the device. This format is known
   only by the media specific adapter code in eth_media.c and
   ib_media.c.
2) A "generic" internal format, in the form of struct tipc_media_addr,
   which can be referenced and passed around by the generic media-
   unaware code.
3) A serialized version of the latter, to be conveyed in neighbor
   discovery messages.

Conversion between the three formats can only be done by the media
specific code, so we have function pointers for this purpose in
struct tipc_media. Here, the media adapters can install their own
conversion functions at startup.

We now introduce a new such function, 'raw2addr()', whose purpose
is to convert from format 1 to format 2 above. We also try to as far
as possible uniform commenting, variable names and usage of these
functions, with the purpose of making them more comprehensible.

We can now also remove the function tipc_l2_media_addr_set(), whose
job is done better by the new function.

Finally, we expand the field for serialized addresses (format 3)
in discovery messages from 20 to 32 bytes. This is permitted
according to the spec, and reduces the risk of problems when we
add new media in the future.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
37e22164a8 tipc: rename and move message reassembly function
The function tipc_link_frag_rcv() is in reality a re-entrant generic
message reassemby function that has nothing in particular to do with
the link, where it is defined now. This becomes obvious when we see
the need to call the function from other places in the code.

In this commit rename it to tipc_buf_append() and move it to the file
msg.c. We also simplify its signature by moving the tail pointer to
the control block of the head buffer, hence making the head buffer
self-contained.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
5074ab89c5 tipc: mark head of reassembly buffer as non-linear
The message reassembly function does not update the 'len' and 'data_len'
fields of the head skbuff correctly when fragments are chained to it.
This may sometimes lead to obsure errors, such as fragment reordering
when we receive fragments which are cloned buffers.

This commit fixes this, by ensuring that the two fields are updated
correctly.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
ec37dcd382 tipc: don't record link RESET or ACTIVATE messages as traffic
In the current code, all incoming LINK_PROTOCOL messages, irrespective
of type, nudge the "last message received" checkpoint, informing the
link state machine that a message was received from the peer since last
supervision timeout event. This inhibits the link from starting probing
the peer unnecessarily.

However, not only STATE messages are recorded as legitimate incoming
traffic this way, but even RESET and ACTIVATE messages, which in
reality are there to inform the link that the peer endpoint has been
reset. At the same time, some RESET messages may be dropped instead
of causing a link reset. This happens when the link endpoint thinks
it is fully up and working, and the session number of the RESET is
lower than or equal to the current link session. In such cases the
RESET is perceived as a delayed remnant from an earlier session, or
the current one, and dropped.

Now, if a TIPC module is removed and then immediately reinserted, e.g.
when using a script, RESET messages may arrive at the peer link endpoint
before this one has had time to discover the failure. The RESET may be
dropped because of the session number, but only after it has been
recorded as a legitimate traffic event. Hence, the receiving link will
not start probing, and not discover that the peer endpoint is down, at
the same time ignoring the periodic RESET messages coming from that
endpoint. We have ended up in a stale state where a failed link cannot
be re-established.

In this commit, we remedy this by nudging the checkpoint only for
received STATE messages, not for RESET or ACTIVATE messages.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:48 -04:00
Jon Paul Maloy
4f4482dcd9 tipc: compensate for double accounting in socket rcv buffer
The function net/core/sock.c::__release_sock() runs a tight loop
to move buffers from the socket backlog queue to the receive queue.

As a security measure, sk_backlog.len of the receiving socket
is not set to zero until after the loop is finished, i.e., until
the whole backlog queue has been transferred to the receive queue.
During this transfer, the data that has already been moved is counted
both in the backlog queue and the receive queue, hence giving an
incorrect picture of the available queue space for new arriving buffers.

This leads to unnecessary rejection of buffers by sk_add_backlog(),
which in TIPC leads to unnecessarily broken connections.

In this commit, we compensate for this double accounting by adding
a counter that keeps track of it. The function socket.c::backlog_rcv()
receives buffers one by one from __release_sock(), and adds them to the
socket receive queue. If the transfer is successful, it increases a new
atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the
transferred buffer. If a new buffer arrives during this transfer and
finds the socket busy (owned), we attempt to add it to the backlog.
However, when sk_add_backlog() is called, we adjust the 'limit'
parameter with the value of the new counter, so that the risk of
inadvertent rejection is eliminated.

It should be noted that this change does not invalidate the original
purpose of zeroing 'sk_backlog.len' after the full transfer. We set an
upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that
doesn't respect the send window) keeps pumping in buffers to
sk_add_backlog(), he will eventually reach an upper limit,
(2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added
to the backlog, and the connection will be broken. Ordinary, well-
behaved senders will never reach this buffer limit at all.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:47 -04:00
Jon Paul Maloy
6163a194e0 tipc: decrease connection flow control window
Memory overhead when allocating big buffers for data transfer may
be quite significant. E.g., truesize of a 64 KB buffer turns out
to be 132 KB, 2 x the requested size.

This invalidates the "worst case" calculation we have been
using to determine the default socket receive buffer limit,
which is based on the assumption that 1024x64KB = 67MB buffers
may be queued up on a socket.

Since TIPC connections cannot survive hitting the buffer limit,
we have to compensate for this overhead.

We do that in this commit by dividing the fix connection flow
control window from 1024 (2*512) messages to 512 (2*256). Since
older version nodes send out acks at 512 message intervals,
compatibility with such nodes is guaranteed, although performance
may be non-optimal in such cases.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:19:47 -04:00
David S. Miller
5f013c9bc7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/altera/altera_sgdma.c
	net/netlink/af_netlink.c
	net/sched/cls_api.c
	net/sched/sch_api.c

The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces.  These were simple transformations from
netlink_capable to netlink_ns_capable.

The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 13:19:14 -04:00
Ying Xue
ca9cf06a06 tipc: don't directly overwrite node action_flags
Each node action flag should be set or cleared separately, instead
we now set the whole flags variable in one shot, and it's turned
out to be hard to see which other flags are affected. Therefore,
for instance, we explicitly clear TIPC_WAIT_OWN_LINKS_DOWN bit in
node_lost_contact().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-09 01:41:01 -04:00
Ying Xue
aecb9bb89c tipc: rename enum names of node flags
Rename node flags to action_flags as well as its enum names so
that they can reflect its real meanings.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-09 01:41:01 -04:00
Ying Xue
52ff872055 tipc: purge signal handler infrastructure
In the previous commits of this series, we removed all asynchronous
actions which were based on the tasklet handler - "tipc_k_signal()".

So the moment has now come when we can completely remove the tasklet
handler infrastructure. That is done with this commit.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:45 -04:00
Ying Xue
3f5a12bd9f tipc: avoid to asynchronously reset all links
Postpone the actions of resetting all links until after bclink
lock is released, avoiding to asynchronously reset all links.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:45 -04:00
Ying Xue
eb8b00f5f2 tipc: convert allocations of global variables associated with bclink
Convert allocations of global variables associated with bclink from
static way to dynamical way for the convenience of bclink instance
initialisation. Meanwhile, this also helps TIPC support name space
in the future easily.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:45 -04:00
Ying Xue
d69afc90b8 tipc: define new functions to operate bc_lock
As we are going to do more jobs when bc_lock is released, the two
operations of holding/releasing the lock should be encapsulated with
functions. In addition, we move bc_lock spin lock into tipc_bclink
structure avoiding to define the global variable.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:44 -04:00
Ying Xue
ca0c42732c tipc: avoid to asynchronously deliver name tables to peer node
Postpone the actions of delivering name tables until after node
lock is released, avoiding to do it under asynchronous context.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:44 -04:00
Ying Xue
9d56194968 tipc: remove TIPC_NAMES_GONE node flag
Since previously what all publications pertaining to the lost node
were removed from name table was finished in tasklet context
asynchronously, we need to TIPC_NAMES_GONE flag indicating whether
the node cleanup work is finished or not. But now as the cleanup work
has been finished when node lock is released, the flag becomes
meaningless for us.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:44 -04:00
Ying Xue
9db9fdd198 tipc: avoid to asynchronously notify subscriptions
Postpone the actions of notifying subscriptions until after node lock
is released, avoiding to asynchronously execute registered handlers
when node is lost.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:44 -04:00
Ying Xue
10f465c496 tipc: rename setup_blocked variable of node struct to flags
Rename setup_blocked variable of node struct to a more common
name called "flags", which will be used to represent kinds of
node states.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:44 -04:00
Ying Xue
486f930ac5 tipc: adjust order of variables in tipc_node structure
Move more frequently used variables up to the head of tipc_node
structure, hopefully improving a bit performance.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:44 -04:00
Ying Xue
5356f3d7d4 tipc: always use tipc_node_lock() to hold node lock
Although we obtain node lock with tipc_node_lock() in most time, there
are still places where we directly use native spin lock interface
to grab node lock. But as we will do more jobs in the future when node
lock is released, we should ensure that tipc_node_lock() is always
called when node lock is taken.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 17:26:43 -04:00
Ying Xue
1621b94d2a tipc: fix memory leak of publications
Commit 1bb8dce57f ("tipc: fix memory
leak during module removal") introduced a memory leak issue: when
name table is stopped, it's forgotten that publication instances are
freed properly. Additionally the useless "continue" statement in
tipc_nametbl_stop() is removed as well.

Reported-by: Jason <huzhijiang@gmail.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-30 13:31:26 -04:00
Ying Xue
eab8c04573 tipc: move the delivery of named messages out of nametbl lock
Commit a89778d8ba ("tipc: add support
for link state subscriptions") introduced below possible deadlock
scenario:

       CPU0                          CPU1
T0:   tipc_publish()                 link_timeout()
T1:   tipc_nametbl_publish()         [grab node lock]*
T2:   [grab nametbl write lock]*     link_state_event()
T3:   named_cluster_distribute()     link_activate()
T4:   [grab node lock]*              tipc_node_link_up()
T5:                                  tipc_nametbl_publish()
T6:                                  [grab nametble write lock]*

The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the delivery of named messages via link out of name nametbl lock,
the reverse order of holding locks will be eliminated, as a result,
the deadlock will be killed as well.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-28 14:49:54 -04:00
Erik Hugne
d7bb74c38c tipc: fix out of bounds indexing
Commit 78acb1f9b8 ("tipc: add
ioctl to fetch link names") introduced a buffer overflow bug where
specially crafted ioctl requests could cause out-of-bounds indexing
of the node->links array. This was caused by an incorrect check vs
MAX_BEARERS, and the static code checker complaint is:
net/tipc/node.c:459 tipc_node_get_linkname() error: buffer overflow 'node->links' 2 <= 2

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-28 14:43:35 -04:00
Ying Xue
22e7987ae7 tipc: fix a possible memory leak
The commit a8b9b96e95 ("tipc: fix race
in disc create/delete") leads to the following static checker warning:

	net/tipc/discover.c:352 tipc_disc_create()
		warn: possible memory leak of 'req'

The risk of memory leak really exists in practice. Especially when
it's failed to allocate memory for "req->buf", tipc_disc_create()
doesn't free its allocated memory, instead just directly returns
with ENOMEM error code. In this situation, memory leak, of course,
happens.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-27 19:08:06 -04:00
Erik Hugne
78acb1f9b8 tipc: add ioctl to fetch link names
We add a new ioctl for AF_TIPC that can be used to fetch the
logical name for a link to a remote node on a given bearer. This
should be used in combination with link state subscriptions.
The logical name size limit definitions are moved to tipc.h, as
they are now also needed by the new ioctl.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-26 12:13:24 -04:00
Erik Hugne
a89778d8ba tipc: add support for link state subscriptions
When links are established over a bearer plane, we create a node
local publication containing information about the peer node and
bearer plane. This allows TIPC applications to use the standard
TIPC topology server subscription mechanism to get notifications
when a link goes up or down.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-26 12:13:24 -04:00
Eric W. Biederman
90f62cf30a net: Use netlink_ns_capable to verify the permisions of netlink messages
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.

To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24 13:44:54 -04:00
Ying Xue
a8b9b96e95 tipc: fix race in disc create/delete
Commit a21a584d67 (tipc: fix neighbor
detection problem after hw address change) introduces a race condition
involving tipc_disc_delete() and tipc_disc_add/remove_dest that can
cause TIPC to dereference the pointer to the bearer discovery request
structure after it has been freed since a stray pointer is left in the
bearer structure.

In order to fix the issue, the process of resetting the discovery
request handler is optimized: the discovery request handler and request
buffer are just reset instead of being freed, allocated and initialized.
As the request point is always valid and the request's lock is taken
while the request handler is reset, the race doesn't happen any more.

Reported-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
28dd94187a tipc: use bc_lock to protect node map in bearer structure
The node map variable - 'nodes' in bearer structure is only used by
bclink. When bclink accesses it, bc_lock is held. But when change it,
for instance, in tipc_bearer_add_dest() or tipc_bearer_remove_dest()
the bc_lock is not taken at all. To avoid any inconsistent data, we
should always grab bc_lock while accessing node map variable.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
4ae88c94d3 tipc: use bearer_disable to disable bearer in tipc_l2_device_event
As bearer pointer is known in tipc_l2_device_event(), it's unnecessary
to search it again in tipc_disable_bearer(). If tipc_disable_bearer()
is replaced with bearer_disable() in tipc_l2_device_event(), this will
help us save a bit time when bearer is disabled.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
f1c8d8cb82 tipc: make media_ptr pointed netdevice valid
The 'media_ptr' pointer in bearer structure which points to network
device, is protected by RCU. So, before netdevice is released,
synchronize_net() should be involved to prevent no any user of
the netdevice on read side from accessing it after it is freed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
7216cd949c tipc: purge tipc_net_lock lock
Now tipc routing hierarchy comprises the structures 'node', 'link'and
'bearer'. The whole hierarchy is protected by a big read/write lock,
tipc_net_lock, to ensure that nothing is added or removed while code
is accessing any of these structures. Obviously the locking policy
makes node, link and bearer components closely bound together so that
their relationship becomes unnecessarily complex. In the worst case,
such locking policy not only has a negative influence on performance,
but also it's prone to lead to deadlock occasionally.

In order o decouple the complex relationship between bearer and node
as well as link, the locking policy is adjusted as follows:

- Bearer level
  RTNL lock is used on update side, and RCU is used on read side.
  Meanwhile, all bearer instances including broadcast bearer are
  saved into bearer_list array.

- Node and link level
  All node instances are saved into two tipc_node_list and node_htable
  lists. The two lists are protected by node_list_lock on write side,
  and they are guarded with RCU lock on read side. All members in node
  structure including link instances are protected by node spin lock.

- The relationship between bearer and node
  When link accesses bearer, it first needs to find the bearer with
  its bearer identity from the bearer_list array. When bearer accesses
  node, it can iterate the node_htable hash list with the node
  address to find the corresponding node.

In the new locking policy, every component has its private locking
solution and the relationship between bearer and node is very simple,
that is, they can find each other with node address or bearer identity
from node_htable hash list or bearer_list array.

Until now above all changes have been done, so tipc_net_lock can be
removed safely.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
2231c5af45 tipc: use RCU to protect media_ptr pointer
Now the media_ptr pointer is protected with tipc_net_lock write lock
on write side; tipc_net_lock read lock is used to read side. As the
part of effort of eliminating tipc_net_lock, we decide to adjust the
locking policy of media_ptr pointer protection: on write side, RTNL
lock is use while on read side RCU read lock is applied.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
7a2f7d18e7 tipc: decouple the relationship between bearer and link
Currently on both paths of message transmission and reception, the
read lock of tipc_net_lock must be held before bearer is accessed,
while the write lock of tipc_net_lock has to be taken before bearer
is configured. Although it can ensure that bearer is always valid on
the two data paths, link and bearer is closely bound together.

So as the part of effort of removing tipc_net_lock, the locking
policy of bearer protection will be adjusted as below: on the two
data paths, RCU is used, and on the configuration path of bearer,
RTNL lock is applied.

Now RCU just covers the path of message reception. To make it possible
to protect the path of message transmission with RCU, link should not
use its stored bearer pointer to access bearer, but it should use the
bearer identity of its attached bearer as index to get bearer instance
from bearer_list array, which can help us decouple the relationship
between bearer and link. As a result, bearer on the path of message
transmission can be safely protected by RCU when we access bearer_list
array within RCU lock protection.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:53 -04:00
Ying Xue
f8322dfce5 tipc: convert bearer_list to RCU list
Convert bearer_list to RCU list. It's protected by RTNL lock on
update side, and RCU read lock is applied to read side.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:52 -04:00
Ying Xue
f97e455abf tipc: use RTNL lock to protect tipc_net_stop routine
As the tipc network initialization(ie, tipc_net_start routine) is
under RTNL protection, its corresponding deinitialization part(ie,
tipc_net_stop routine) should be protected by RTNL too.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:52 -04:00
Ying Xue
ca07fb07c9 tipc: adjust locking policy of protecting tipc_ptr pointer of net_device
Currently the 'tipc_ptr' pointer is protected by tipc_net_lock
write lock on write side, and RCU read lock is applied to read
side. In addition, there have two paths on write side where we
may change variables pointed by the 'tipc_ptr' pointer: one is
to configure bearer by tipc-config tool and another one is that
bearer status is changed by notification events of its attached
interface. But on the latter path, we improperly deem that
accessing 'tipc_ptr' pointer happens on read side with
rcu_read_lock() although some variables pointed by the 'tipc_ptr'
pointer are changed possibly.

Moreover, as now the both paths are guarded by RTNL lock, it's
better to adjust the locking policy of 'tipc_ptr' pointer
protection, allowing RTNL instead of tipc_net_lock write lock to
protect it on write side, which will help us purge tipc_net_lock
in the future.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:52 -04:00
Ying Xue
ef13a262c3 tipc: replace config_mutex lock with RTNL lock
There have two paths where we can configure or change bearer status:
one is that bearer is configured from user space with tipc-config
tool; another one is that bearer is changed by notification events
from its attached interface. On the first path, one dedicated
config_mutex lock is guarded; on the latter path, RTNL lock has been
placed to serialize the process of dealing with interface events.
So, if RTNL lock is also used to protect the first path, this will
not only extremely help us simplify current locking policy, but also
config_mutex lock can be deleted as well.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:17:52 -04:00
David S. Miller
676d23690f net: Fix use after free by removing length arg from sk_data_ready callbacks.
Several spots in the kernel perform a sequence like:

	skb_queue_tail(&sk->s_receive_queue, skb);
	sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11 16:15:36 -04:00
Geert Uytterhoeven
065d7e3956 tipc: Let tipc_release() return 0
net/tipc/socket.c: In function ‘tipc_release’:
net/tipc/socket.c:352: warning: ‘res’ is used uninitialized in this function

Introduced by commit 24be34b5a0 ("tipc:
eliminate upcall function pointers between port and socket"), which
removed the sole initializer of "res".

Just return 0 to fix it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 15:08:17 -04:00
Erik Hugne
a5e7ac5ce1 tipc: fix regression bug where node events are not being generated
Commit 5902385a24 ("tipc: obsolete
the remote management feature") introduces a regression where node
topology events are not being generated because the publication
that triggers this: {0, <z.c.n>, <z.c.n>} is no longer available.
This will break applications that rely on node events to discover
when nodes join/leave a cluster.

We fix this by advertising the node publication when TIPC enters
networking mode, and withdraws it upon shutdown.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 16:03:57 -04:00
Erik Hugne
16470111ed tipc: make discovery domain a bearer attribute
The node discovery domain is assigned when a bearer is enabled.
In the previous commit we reflect this attribute directly in the
bearer structure since it's needed to reinitialize the node
discovery mechanism after a hardware address change.

There's no need to replicate this attribute anywhere else, so we
remove it from the tipc_link_req structure.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 14:46:29 -04:00
Erik Hugne
a21a584d67 tipc: fix neighbor detection problem after hw address change
If the hardware address of a underlying netdevice is changed, it is
not enough to simply reset the bearer/links over this device. We
also need to reflect this change in the TIPC bearer and node
discovery structures aswell.

This patch adds the necessary reinitialization of the node disovery
mechanism following a hardware address change so that the correct
originating media address is advertised in the discovery messages.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Dong Liu <dliu.cn@gmail.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 14:46:29 -04:00
Ying Xue
dde2026608 tipc: use node list lock to protect tipc_num_links variable
Without properly implicit or explicit read memory barrier, it's
unsafe to read an atomic variable with atomic_read() from another
thread which is different with the thread of changing the atomic
variable with atomic_inc() or atomic_dec(). So a stale tipc_num_links
may be got with atomic_read() in tipc_node_get_links(). If the
tipc_num_links variable type is converted from atomic to unsigned
integer and node list lock is used to protect it, the issue would
be avoided.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:38 -04:00
Ying Xue
2220646a53 tipc: use node_list_lock to protect tipc_num_nodes variable
As tipc_node_list is protected by rcu read lock on read side, it's
unnecessary to hold node_list_lock to protect tipc_node_list in
tipc_node_get_links(). Instead, node_list_lock should just protects
tipc_num_nodes in the function.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:37 -04:00
Ying Xue
6c7a762e70 tipc: tipc: convert node list and node hlist to RCU lists
Convert tipc_node_list list and node_htable hash list to RCU lists.
On read side, the two lists are protected with RCU read lock, and
on update side, node_list_lock is applied to them.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:37 -04:00
Ying Xue
46651c59c4 tipc: rename node create lock to protect node list and hlist
When a node is created, tipc_net_lock read lock is first held and
then node_create_lock is grabbed in order to prevent the same node
from being created and inserted into both node list and hlist twice.
But when we query node from the two node lists, we only hold
tipc_net_lock read lock without grabbing node_create_lock. Obviously
this locking policy is unable to guarantee that the two node lists
are always synchronized especially when the operation of changing
and accessing them occurs in different contexts like currently doing.

Therefore, rename node_create_lock to node_list_lock to protect the
two node lists, that is, whenever node is inserted into them or node
is queried from them, the node_list_lock should be always held. As a
result, tipc_net_lock read lock becomes redundant and then can be
removed from the node query functions.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:37 -04:00
Ying Xue
987b58be37 tipc: make broadcast bearer store in bearer_list array
Now unicast bearer is dynamically allocated and placed into its
identity specified slot of bearer_list array. When we search
bearer_list array with a bearer identity, the corresponding bearer
instance can be found. But broadcast bearer is statically allocated
and it is not located in the bearer_list array yet. So we decide to
enlarge bearer_list array into MAX_BEARERS + 1 slots, and its last
slot stores the broadcast bearer so that the broadcast bearer can
be found from bearer_list array with MAX_BEARERS as index. The
change will help us reduce the complex relationship between bearer
and link in the future.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:37 -04:00
Ying Xue
f47de12b06 tipc: remove active flag from tipc_bearer structure
After the allocation of tipc_bearer structure instance is converted
from statical way to dynamical way, we identify whether a certain
tipc_bearer structure pointer is valid by checking whether the pointer
is NULL or not. So the active flag in tipc_bearer structure becomes
redundant.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:37 -04:00
Ying Xue
3874ccbba8 tipc: convert tipc_bearers array to pointer list
As part of the effort to introduce RCU protection for the bearer
list, we first need to change it to a list of pointers.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:37 -04:00
Ying Xue
78dfb789b6 tipc: acquire necessary locks in named_cluster_distribute routine
The 'tipc_node_list' is guarded by tipc_net_lock and 'links' array
defined in 'tipc_node' structure is protected by node lock as well.
Without acquiring the two locks in named_cluster_distribute() a fatal
oops may happen in case that a destroyed link might be got and then
accessed. Therefore, above mentioned two locks must be held in
named_cluster_distribute() to prevent the issue from happening
accidentally.

As 'links' array in node struct must be protected by node lock,
we have to move the code of selecting an active link from
tipc_link_xmit() to named_cluster_distribute() and then call
__tipc_link_xmit() with the selected link to deliver name messages.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:36 -04:00
Ying Xue
5902385a24 tipc: obsolete the remote management feature
Due to the lacking of any credential, it's allowed to accept commands
requested from remote nodes to query the local node status, which is
prone to involve potential security risks. Instead, if we login to
a remote node with ssh command, this approach is not only more safe
than the remote management feature, but also it can give us more
permissions like changing the remote node configuration. So it's
reasonable for us to obsolete the remote management feature now.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:36 -04:00
Ying Xue
76d7882420 tipc: remove unnecessary checking for node object
tipc_node_create routine doesn't need to check whether a node
object specified with a node address exists or not because its
caller(ie, tipc_disc_recv_msg routine) has checked this before
calling it.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 13:08:36 -04:00
David S. Miller
04f58c8854 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	Documentation/devicetree/bindings/net/micrel-ks8851.txt
	net/core/netpoll.c

The net/core/netpoll.c conflict is a bug fix in 'net' happening
to code which is completely removed in 'net-next'.

In micrel-ks8851.txt we simply have overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-25 20:29:20 -04:00
Erik Hugne
a5d0e7c037 tipc: fix spinlock recursion bug for failed subscriptions
If a topology event subscription fails for any reason, such as out
of memory, max number reached or because we received an invalid
request the correct behavior is to terminate the subscribers
connection to the topology server. This is currently broken and
produces the following oops:

[27.953662] tipc: Subscription rejected, illegal request
[27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
[27.957066]  lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
[27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5
[27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
[27.961430]  ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
[27.962292]  ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
[27.963152]  ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
[27.964023] Call Trace:
[27.964292]  [<ffffffff815c0207>] dump_stack+0x45/0x56
[27.964874]  [<ffffffff815beec5>] spin_dump+0x8c/0x91
[27.965420]  [<ffffffff815beeeb>] spin_bug+0x21/0x26
[27.965995]  [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140
[27.966631]  [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20
[27.967256]  [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
[27.968051]  [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc]
[27.968722]  [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc]
[27.969436]  [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
[27.970209]  [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc]
[27.970972]  [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc]
[27.971633]  [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0
[27.972267]  [<ffffffff8105e869>] worker_thread+0x119/0x3a0
[27.972896]  [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0
[27.973622]  [<ffffffff810648af>] kthread+0xdf/0x100
[27.974168]  [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
[27.974893]  [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0
[27.975466]  [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0

The recursion occurs when subscr_terminate tries to grab the
subscriber lock, which is already taken by subscr_conn_msg_event.
We fix this by checking if the request to establish a new
subscription was successful, and if not we initiate termination of
the subscriber after we have released the subscriber lock.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-24 15:36:56 -04:00
David S. Miller
85dcce7a73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:31:55 -04:00
Jon Paul Maloy
5c311421a2 tipc: eliminate redundant lookups in registry
As an artefact from the native interface, the message sending functions
in the port takes a port ref as first parameter, and then looks up in
the registry to find the corresponding port pointer. This despite the
fact that the only currently existing caller, tipc_sock, already knows
this pointer.

We change the signature of these functions to take a struct tipc_port*
argument, and remove the redundant lookups.

We also remove an unmotivated extra lookup in the function
socket.c:auto_connect(), and, as the lookup functions tipc_port_deref()
and ref_deref() now become unused, we remove these two functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy
58ed944241 tipc: align usage of variable names and macros in socket
The practice of naming variables in TIPC is inconistent, sometimes
even within the same file.

In this commit we align variable names and declarations within
socket.c, and function and macro names within socket.h. We also
reduce the number of conversion macros to two, in order to make
usage less obsure.

These changes are purely cosmetic.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy
3b4f302d85 tipc: eliminate redundant locking
The three functions tipc_portimportance(), tipc_portunreliable() and
tipc_portunreturnable() and their corresponding tipc_set* functions,
are all grabbing port_lock when accessing the targeted port. This is
unnecessary in the current code, since these calls only are made from
within socket downcalls, already protected by sock_lock.

We remove the redundant locking. Also, since the functions now become
trivial one-liners, we move them to port.h and make them inline.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy
24be34b5a0 tipc: eliminate upcall function pointers between port and socket
Due to the original one-to-many relation between port and user API
layers, upcalls to the API have been performed via function pointers,
installed in struct tipc_port at creation. Since this relation now
always is one-to-one, we can instead use ordinary function calls.

We remove the function pointers 'dispatcher' and ´wakeup' from
struct tipc_port, and replace them with calls to the renamed
functions tipc_sk_rcv() and tipc_sk_wakeup().

At the same time we change the name and signature of the functions
tipc_createport() and tipc_deleteport() to reflect their new role
as mere initialization/destruction functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy
8826cde655 tipc: aggregate port structure into socket structure
After the removal of the tipc native API the relation between
a tipc_port and its API types is strictly one-to-one, i.e, the
latter can now only be a socket API. There is therefore no need
to allocate struct tipc_port and struct sock independently.

In this commit, we aggregate struct tipc_port into struct tipc_sock,
hence saving both CPU cycles and structure complexity.

There are no functional changes in this commit, except for the
elimination of the separate allocation/freeing of tipc_port.
All other changes are just adaptatons to the new data structure.

This commit also opens up for further code simplifications and
code volume reduction, something we will do in later commits.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy
f9fef18c6d tipc: remove redundant 'peer_name' field in struct tipc_sock
The field 'peer_name' in struct tipc_sock is redundant, since
this information already is available from tipc_port, to which
tipc_sock has a reference.

We remove the field, and ensure that peer node and peer port
info instead is fetched via the functions that already exist
for this purpose.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy
978813ee89 tipc: replace reference table rwlock with spinlock
The lock for protecting the reference table is declared as an
RWLOCK, although it is only used in write mode, never in read
mode.

We redefine it to become a spinlock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Joe Perches
184593c734 tipc: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Erik Hugne
2892505ea1 tipc: don't log disabled tasklet handler errors
Failure to schedule a TIPC tasklet with tipc_k_signal because the
tasklet handler is disabled is not an error. It means TIPC is
currently in the process of shutting down. We remove the error
logging in this case.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:24 -05:00
Erik Hugne
1bb8dce57f tipc: fix memory leak during module removal
When the TIPC module is removed, the tasklet handler is disabled
before all other subsystems. This will cause lingering publications
in the name table because the node_down tasklets responsible to
clean up publications from an unreachable node will never run.
When the name table is shut down, these publications are detected
and an error message is logged:
tipc: nametbl_stop(): orphaned hash chain detected
This is actually a memory leak, introduced with commit
993b858e37 ("tipc: correct the order
of stopping services at rmmod")

Instead of just logging an error and leaking memory, we free
the orphaned entries during nametable shutdown.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:24 -05:00
Erik Hugne
edcc0511b5 tipc: drop subscriber connection id invalidation
When a topology server subscriber is disconnected, the associated
connection id is set to zero. A check vs zero is then done in the
subscription timeout function to see if the subscriber have been
shut down. This is unnecessary, because all subscription timers
will be cancelled when a subscriber terminates. Setting the
connection id to zero is actually harmful because id zero is the
identity of the topology server listening socket, and can cause a
race that leads to this socket being closed instead.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Ying Xue
fe8e464939 tipc: avoid to unnecessary process switch under non-block mode
When messages are received via tipc socket under non-block mode,
schedule_timeout() is called in tipc_wait_for_rcvmsg(), that is,
the process of receiving messages will be scheduled once although
timeout value passed to schedule_timeout() is 0. The same issue
exists in accept()/wait_for_accept(). To avoid this unnecessary
process switch, we only call schedule_timeout() if the timeout
value is non-zero.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Ying Xue
4652edb70e tipc: fix connection refcount leak
When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
connection instance, its reference count value is increased if
it's found. But subsequently if it's found that the connection is
closed, the work of sending message is not queued into its server
send workqueue, and the connection reference count is not decreased.
This will cause a reference count leak. To reproduce this problem,
an application would need to open and closes topology server
connections with high intensity.

We fix this by immediately decrementing the connection reference
count if a send fails due to the connection being closed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Ying Xue
6d4ebeb4df tipc: allow connection shutdown callback to be invoked in advance
Currently connection shutdown callback function is called when
connection instance is released in tipc_conn_kref_release(), and
receiving packets and sending packets are running in different
threads. Even if connection is closed by the thread of receiving
packets, its shutdown callback may not be called immediately as
the connection reference count is non-zero at that moment. So,
although the connection is shut down by the thread of receiving
packets, the thread of sending packets doesn't know it. Before
its shutdown callback is invoked to tell the sending thread its
connection has been closed, the sending thread may deliver
messages by tipc_conn_sendmsg(), this is why the following error
information appears:

"Sending subscription event failed, no memory"

To eliminate it, allow connection shutdown callback function to
be called before connection id is removed in tipc_close_conn(),
which makes the sending thread know the truth in time that its
socket is closed so that it doesn't send message to it. We also
remove the "Sending XXX failed..." error reporting for topology
and config services.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
David S. Miller
67ddc87f16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
	drivers/net/wireless/mwifiex/pcie.c
	net/ipv6/sit.c

The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.

The two wireless conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:32:02 -05:00
Ying Xue
970122fdf4 tipc: make bearer set up in module insertion stage
Accidentally a side effect is involved by commit 6e967adf7(tipc:
relocate common functions from media to bearer). Now tipc stack
handler of receiving packets from netdevices as well as netdevice
notification handler are registered when bearer is enabled rather
than tipc module initialization stage, but the two handlers are
both unregistered in tipc module exit phase. If tipc module is
inserted and then immediately removed, the following warning
message will appear:

"dev_remove_pack: ffffffffa0380940 not found"

This is because in module insertion stage tipc stack packet handler
is not registered at all, but in module exit phase dev_remove_pack()
needs to remove it. Of course, dev_remove_pack() cannot find tipc
protocol handler from the kernel protocol handler list so that the
warning message is printed out.

But if registering the two handlers is adjusted from enabling bearer
phase into inserting module stage, the warning message will be
eliminated. Due to this change, tipc_core_start_net() and
tipc_core_stop_net() can be deleted as well.

Reported-by: Wang Weidong <wangweidong1@huawei.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-22 00:00:15 -05:00
Ying Xue
9fe7ed4749 tipc: remove all enabled flags from all tipc components
When tipc module is inserted, many tipc components are initialized
one by one. During the initialization period, if one of them is
failed, tipc_core_stop() will be called to stop all components
whatever corresponding components are created or not. To avoid to
release uncreated ones, relevant components have to add necessary
enabled flags indicating whether they are created or not.

But in the initialization stage, if one component is unsuccessfully
created, we will just destroy successfully created components before
the failed component instead of all components. All enabled flags
defined in components, in turn, become redundant. Additionally it's
also unnecessary to identify whether table.types is NULL in
tipc_nametbl_stop() because name stable has been definitely created
successfully when tipc_nametbl_stop() is called.

Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-22 00:00:15 -05:00
Erik Hugne
63fa01c147 tipc: failed transmissions should return error
When a message could not be sent out because the destination node
or link could not be found, the full message size is returned from
sendmsg() as if it had been sent successfully. An application will
then get a false indication that it's making forward progress. This
problem has existed since the initial commit in 2.6.16.

We change this to return -ENETUNREACH if the message cannot be
delivered due to the destination node/link being unavailable. We
also get rid of the redundant tipc_reject_msg call since freeing
the buffer and doing a tipc_port_iovec_reject accomplishes exactly
the same thing.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19 16:40:57 -05:00