linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-02 10:11:36 +00:00

Author	SHA1	Message	Date
Hagen Paul Pfeifer	4a3ad7b3ea	netem: markov loss model transition fix The transition from markov state "3 => lost packets within a burst period" to "1 => successfully transmitted packets within a gap period" has no additional loss event. The loss already happen for transition from 1 -> 3, this additional loss will make things go wild. E.g. transition probabilities: p13: 10% p31: 100% Expected: Ploss = p13 / (p13 + p31) Ploss = ~9.09% ... but it isn't. Even worse: we get a double loss - each time. So simple don't return true to indicate loss, rather break and return false. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Stefano Salsano <stefano.salsano@uniroma2.it> Cc: Fabio Ludovici <fabio.ludovici@yahoo.it> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-25 19:03:39 -04:00
David S. Miller	b45bd46dec	Included changes: - data structure reshaping to accommodate multiple routing protocol implementations - routing protocol API enhancement - send to userspace the event "batman-adv Gateway loss" in case of soft-iface destruction and a "batman-adv Gateway" was configured - improve the TT component to support and advertise runtime flag changes - minor code refactoring - make the ICMP kernel-to-userspace communication more generic -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJSZ+VyAAoJEADl0hg6qKeOccIP/1RE68wFqd20BTh6WtnKo4s6 H/F5WMUrk/HCNe3p4JnIOYv5WESR3tqTylCqBIl3yzcsks2KsLm4zEG5FIx/0J3Y 1Mgy8V/xlGVN7M+wFSLCgzpTnJ3aCvy7+ied2qoz9KsC4vgiKUimDkPTsbUL7NUp OBmJGYfe/nLDoPI/CXu3nJCtNXcDgv5a5Z8ZMBuipeK++JsBMZRLfJEIM+7Q4Ouc KNLZFXavwtXAxsHLpWDS48MkVAz0tbyy4P6e2k7iQQq+W1WZjCoFMz0xLIKRFf+Y yOOZXItpTTX7rzLxFHkLAopZo9UMPsFjm/OceFBlnAbp24ftfR0b4gjFAUQiKFsq lGlGIXVkhsR6arfQ4SIlrrGOW7h+Kea2I1aPWC7yoi/97+22Nrr/a903p+kkhP4t sAoMk7DbbdajcV01RULF+xjaBFvEdaSfSBVB5j76Gf9AxNZfSGd7wH1qPG7O7HFT jO6Z4fbG6bbHBHMt9j4o9oGg4h5X8epbhZB7e8rwoBe/dSzw+B4CK2Y+j1/QW3C4 PDqL3t5yi0O0+dhkI2G8DmhTOm+ZKVZt60WIMM+G5T6DiLyECveexGWIbjjZR67w qgVXsvE0PwHabY8Ne/z+IlnHY8zegUFYVusQ0lQfLpkKhdjoYXLF709RT42r2QIN 8/6WlNGD2YG/0sWEDF7H =sYKR -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Antonio Quartulli says: ==================== this is another set of changes intended for net-next/linux-3.13. (probably our last pull request for this cycle) Patches 1 and 2 reshape two of our main data structures in a way that they can easily be extended in the future to accommodate new routing protocols. Patches from 3 to 9 improve our routing protocol API and its users so that all the protocol-related code is not mixed up with the other components anymore. Patch 10 limits the local Translation Table maximum size to a value such that it can be fully transfered over the air if needed. This value depends on fragmentation being enabled or not and on the mtu values. Patch 11 makes batman-adv send a uevent in case of soft-interface destruction while a "bat-Gateway" was configured (this informs userspace about the GW not being available anymore). Patches 13 and 14 enable the TT component to detect non-mesh client flag changes at runtime (till now those flags where set upon client detection and were not changed anymore). Patch 16 is a generalisation of our user-to-kernel space communication (and viceversa) used to exchange ICMP packets to send/received to/from the mesh network. Now it can easily accommodate new ICMP packet types without breaking the existing userspace API anymore. Remaining patches are minor changes and cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-23 17:12:33 -04:00
Hannes Frederic Sowa	7088ad74e6	inet: remove old fragmentation hash initializing All fragmentation hash secrets now get initialized by their corresponding hash function with net_get_random_once. Thus we can eliminate the initial seeding. Also provide a comment that hash secret seeding happens at the first call to the corresponding hashing function. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-23 17:01:41 -04:00
Hannes Frederic Sowa	b1190570b4	ipv6: split inet6_hash_frag for netfilter and initialize secrets with net_get_random_once Defer the fragmentation hash secret initialization for IPv6 like the previous patch did for IPv4. Because the netfilter logic reuses the hash secret we have to split it first. Thus introduce a new nf_hash_frag function which takes care to seed the hash secret. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-23 17:01:40 -04:00
Hannes Frederic Sowa	e7b519ba55	ipv4: initialize ip4_frags hash secret as late as possible Defer the generation of the first hash secret for the ipv4 fragmentation cache as late as possible. ip4_frags.rnd gets initial seeded by inet_frags_init and regulary reseeded by inet_frag_secret_rebuild. Either we call ipqhashfn directly from ip_fragment.c in which case we initialize the secret directly. If we first get called by inet_frag_secret_rebuild we install a new secret by a manual call to get_random_bytes. This secret will be overwritten as soon as the first call to ipqhashfn happens. This is safe because we won't race while publishing the new secrets with anyone else. Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-23 17:01:40 -04:00
David S. Miller	c3fa32b976	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/qmi_wwan.c include/net/dst.h Trivial merge conflicts, both were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-23 16:49:34 -04:00
Hannes Frederic Sowa	34d92d5315	net: always inline net_secret_init Currently net_secret_init does not get inlined, so we always have a call to net_secret_init even in the fast path. Let's specify net_secret_init as __always_inline so we have the nop in the fast-path without the call to net_secret_init and the unlikely path at the epilogue of the function. jump_labels handle the inlining correctly. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-23 16:26:46 -04:00
Simon Wunderlich	da6b8c20a5	batman-adv: generalize batman-adv icmp packet handling Instead of handling icmp packets only up to length of icmp_packet_rr, the code should handle any icmp length size. Therefore the length truncating is moved to when the packet is actually sent to userspace (this does not support lengths longer than icmp_packet_rr yet). Longer packets are forwarded without truncating. This patch also cleans up some parts where the icmp header struct could be used instead of other icmp_packet(_rr) structs to make the code more readable. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-23 17:03:47 +02:00
Simon Wunderlich	15c33da6e8	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-23 17:03:46 +02:00
Antonio Quartulli	0eb01568f0	batman-adv: include the sync-flags when compute the global/local table CRC Flags covered by TT_SYNC_MASK are kept in sync among the nodes in the network and therefore they have to be considered while computing the global/local table CRC. In this way a generic originator is able to understand if its table contains the correct flags or not. Bits from 4 to 7 in the TT flags fields are now reserved for "synchronized" flags only. This allows future developers to add more flags of this type without breaking compatibility. It's important to note that not all the remote TT flags are synchronised. This comes from the fact that some flags are used to inject an information once only. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-10-23 17:03:46 +02:00
Antonio Quartulli	3c4f7ab60c	batman-adv: improve the TT component to support runtime flag changes Some flags (i.e. the WIFI flag) may change after that the related client has already been announced. However it is useful to informa the rest of the network about this change. Add a runtime-flag-switch detection mechanism and re-announce the related TT entry to advertise the new flag value. This mechanism can be easily exploited by future flags that may need the same treatment. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-10-23 17:03:45 +02:00
Antonio Quartulli	0c69aecc5b	batman-adv: invoke dev_get_by_index() outside of is_wifi_iface() Upcoming changes need to perform other checks on the incoming net_device struct. To avoid performing dev_get_by_index() for each and every check, it is better to move it outside of is_wifi_iface() and search the netdev object once only. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-10-23 17:03:44 +02:00
Antonio Quartulli	8257f55ae2	batman-adv: send GW_DEL event in case of soft-iface destruction In case of soft_iface destruction send a GW DEL event to userspace so that applications which are listening for GW events are informed about the lost of connectivity and can react accordingly. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-10-23 17:03:44 +02:00
Marek Lindner	a19d3d85e1	batman-adv: limit local translation table max size The local translation table size is limited by what can be transferred from one node to another via a full table request. The number of entries fitting into a full table request depend on whether the fragmentation is enabled or not. Therefore this patch introduces a max table size check and refuses to add more local clients when that size is reached. Moreover, if the max full table packet size changes (MTU change or fragmentation is disabled) the local table is downsized instantaneously. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Antonio Quartulli <ordex@autistici.org>	2013-10-23 17:03:43 +02:00
Antonio Quartulli	4627456a77	batman-adv: adapt the TT component to use the new API functions Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 17:03:42 +02:00
Antonio Quartulli	d0015fdd3d	batman-adv: provide orig_node routing API Some operations executed on an orig_node depends on the current routing algorithm being used. To easily make this mechanism routing algorithm agnostic add a orig_node specific API that each algorithm can populate with its own routines. Such routines are then invoked by the code when needed, without knowing which routing algorithm is currently in use With this patch 3 API functions are added: - orig_free (to free routing depending internal structs) - orig_add_if (to change the inner state of an orig_node when a new hard interface is added) - orig_del_if (to change the inner state of an orig_node when an hard interface is removed) Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 17:03:21 +02:00
Antonio Quartulli	81e26b1a1c	batman-adv: adapt the neighbor purging routine to use the new API functions Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:12 +02:00
Antonio Quartulli	6680a1249f	batman-adv: adapt bonding to use the new API functions Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:12 +02:00
Antonio Quartulli	c43c981e50	batman-adv: add bat_neigh_is_equiv_or_better API function Each routing protocol has its own metric semantic and therefore is the protocol itself the only component able to compare two metrics to check their "similarity". This new API allows each routing protocol to implement its own logic and make the external code protocol agnostic. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:11 +02:00
Antonio Quartulli	a3285a8f20	batman-adv: add bat_neigh_cmp API function This new API allows to compare the two neighbours based on the metric avoiding the user to deal with any routing algorithm specific detail Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:10 +02:00
Antonio Quartulli	737a2a2297	batman-adv: add bat_orig_print API function Each routing protocol has its own metric and private variables, therefore it is useful to introduce a new API for originator information printing. This API needs to be implemented by each protocol in order to provide its specific originator table output. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:10 +02:00
Antonio Quartulli	bbad0a5e36	batman-adv: make struct batadv_orig_node algorithm agnostic some of the struct batadv_orig_node members are B.A.T.M.A.N. IV specific and therefore they are moved in a algorithm specific substruct in order to make batadv_orig_node routing algorithm agnostic Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:09 +02:00
Antonio Quartulli	0538f75991	batman-adv: make struct batadv_neigh_node algorithm agnostic some of the fields in struct batadv_neigh_node are strictly related to the B.A.T.M.A.N. IV algorithm. In order to make the struct usable by any routing algorithm it has to be split and made more generic Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-23 15:33:08 +02:00
Linus Lüssing	454594f3b9	Revert "bridge: only expire the mdb entry when query is received" While this commit was a good attempt to fix issues occuring when no multicast querier is present, this commit still has two more issues: 1) There are cases where mdb entries do not expire even if there is a querier present. The bridge will unnecessarily continue flooding multicast packets on the according ports. 2) Never removing an mdb entry could be exploited for a Denial of Service by an attacker on the local link, slowly, but steadily eating up all memory. Actually, this commit became obsolete with "bridge: disable snooping if there is no querier" (`b00589af3b`) which included fixes for a few more cases. Therefore reverting the following commits (the commit stated in the commit message plus three of its follow up fixes): ==================== Revert "bridge: update mdb expiration timer upon reports." This reverts commit `f144febd93`. Revert "bridge: do not call setup_timer() multiple times" This reverts commit `1faabf2aab`. Revert "bridge: fix some kernel warning in multicast timer" This reverts commit `c7e8e8a8f7`. Revert "bridge: only expire the mdb entry when query is received" This reverts commit `9f00b2e7cf`. ==================== CC: Cong Wang <amwang@redhat.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Reviewed-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-22 14:41:02 -04:00
ZHAO Gang	0a6957e7d4	net: remove function sk_reset_txq() What sk_reset_txq() does is just calls function sk_tx_queue_reset(), and sk_reset_txq() is used only in sock.h, by dst_negative_advice(). Let dst_negative_advice() calls sk_tx_queue_reset() directly so we can remove unneeded sk_reset_txq(). Signed-off-by: ZHAO Gang <gamerh2o@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-22 14:00:21 -04:00
Neal Cardwell	02cf4ebd82	tcp: initialize passive-side sk_pacing_rate after 3WHS For passive TCP connections, upon receiving the ACK that completes the 3WHS, make sure we set our pacing rate after we get our first RTT sample. On passive TCP connections, when we receive the ACK completing the 3WHS we do not take an RTT sample in tcp_ack(), but rather in tcp_synack_rtt_meas(). So upon receiving the ACK that completes the 3WHS, tcp_ack() leaves sk_pacing_rate at its initial value. Originally the initial sk_pacing_rate value was 0, so passive-side connections defaulted to sysctl_tcp_min_tso_segs (2 segs) in skbuffs made in the first RTT. With a default initial cwnd of 10 packets, this happened to be correct for RTTs 5ms or bigger, so it was hard to see problems in WAN or emulated WAN testing. Since `7eec4174ff` ("pkt_sched: fq: fix non TCP flows pacing"), the initial sk_pacing_rate is 0xffffffff. So after that change, passive TCP connections were keeping this value (and using large numbers of segments per skbuff) until receiving an ACK for data. Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:56:23 -04:00
Hannes Frederic Sowa	c2f17e827b	ipv6: probe routes asynchronous in rt6_probe Routes need to be probed asynchronous otherwise the call stack gets exhausted when the kernel attemps to deliver another skb inline, like e.g. xt_TEE does, and we probe at the same time. We update neigh->updated still at once, otherwise we would send to many probes. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:56:22 -04:00
Eric Dumazet	61c1db7fae	ipv6: sit: add GSO/TSO support Now ipv6_gso_segment() is stackable, its relatively easy to implement GSO/TSO support for SIT tunnels Performance results, when segmentation is done after tunnel device (as no NIC is yet enabled for TSO SIT support) : Before patch : lpq84:~# ./netperf -H 2002:af6:1153:: -Cc MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6 Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 3168.31 4.81 4.64 2.988 2.877 After patch : lpq84:~# ./netperf -H 2002:af6:1153:: -Cc MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6 Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 5525.00 7.76 5.17 2.763 1.840 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:49:39 -04:00
Eric Dumazet	d3e5e0062d	ipv6: gso: make ipv6_gso_segment() stackable In order to support GSO on SIT tunnels, we need to make inet_gso_segment() stackable. It should not assume network header starts right after mac header. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:49:39 -04:00
Eric W. Biederman	fd2d5356d9	ipv4: Allow unprivileged users to use per net sysctls Allow unprivileged users to use: /proc/sys/net/ipv4/icmp_echo_ignore_all /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts /proc/sys/net/ipv4/icmp_ignore_bogus_error_response /proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr /proc/sys/net/ipv4/icmp_ratelimit /proc/sys/net/ipv4/icmp_ratemask /proc/sys/net/ipv4/ping_group_range /proc/sys/net/ipv4/tcp_ecn /proc/sys/net/ipv4/ip_local_ports_range These are occassionally handy and after a quick review I don't see any problems with unprivileged users using them. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:43:03 -04:00
Eric W. Biederman	0a6fa23dcb	ipv4: Use math to point per net sysctls into the appropriate struct net. Simplify maintenance of ipv4_net_table by using math to point the per net sysctls into the appropriate struct net, instead of manually reassinging all of the variables into hard coded table slots. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:43:02 -04:00
Eric W. Biederman	2e685cad57	tcp_memcontrol: Kill struct tcp_memcontrol Replace the pointers in struct cg_proto with actual data fields and kill struct tcp_memcontrol as it is not fully redundant. This removes a confusing, unnecessary layer of abstraction. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:43:02 -04:00
Eric W. Biederman	a4fe34bf90	tcp_memcontrol: Remove the per netns control. The code that is implemented is per memory cgroup not per netns, and having per netns bits is just confusing. Remove the per netns bits to make it easier to see what is really going on. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:43:02 -04:00
Eric W. Biederman	f594d63199	tcp_memcontrol: Remove setting cgroup settings via sysctl The code is broken and does not constrain sysctl_tcp_mem as tcp_update_limit does. With the result that it allows the cgroup tcp memory limits to be bypassed. The semantics are broken as the settings are not per netns and are in a per netns table, and instead looks at current. Since the code is broken in both design and implementation and does not implement the functionality for which it was written remove it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:43:02 -04:00
Eric W. Biederman	cd91cce620	tcp_memcontrol: Remove tcp_max_memory This function is never called. Remove it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:43:02 -04:00
Julian Anastasov	56e42441ed	netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper Now when rt6_nexthop() can return nexthop address we can use it for proper nexthop comparison of directly connected destinations. For more information refer to commit `bbb5823cf7` ("netfilter: nf_conntrack: fix rt_gateway checks for H.323 helper"). Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:37:01 -04:00
Julian Anastasov	550bab42f8	ipv6: fill rt6i_gateway with nexthop address Make sure rt6i_gateway contains nexthop information in all routes returned from lookup or when routes are directly attached to skb for generated ICMP packets. The effect of this patch should be a faster version of rt6_nexthop() and the consideration of local addresses as nexthop. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-21 18:37:01 -04:00
David S. Miller	5bf47256f5	Included changed: - email addresses update in documentation, source files and MAINTAINERS - make the TT component distinguish non-mesh clients based on the VLAN they belong to - improve all the internal components to properly work on a per-VLAN basis (enabled by the new TT-VLAN feature) - enhance the sysfs interface in order to provide behaviour switches on a per-VLAN basis (enabled by the new TT-VLAN feature) - improve TT lock mechanism - improve unicast transmission APIs -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJSYvnjAAoJEADl0hg6qKeOGFAQAK+P3XkProP+MWhgpQzWDc+F TCVvi3eQ9NKVzxVnglRTznEVqXePVArK9oWb39KbCeguoqsuo7T8I+oNK06qPCH6 1aodkBqJLq0OT1EWIxMo+1eOHCevRRqBjS1Jh0DMxuugMsKZZu3/DrHK59ay/4y7 8wRb8CqQrKpILsh43cKRm9SPNJj0nmFPIwoWmgu++ffPyfIPMnBHSowMEgxqJ3h4 Vp4adjJQU2D3qa1Vln99MYzdJaUhRDVDjxdAroCbuk6M1bl9o88UjhFxRvZJN8JN HdxiMN1hvlDJ7OsiBGw42RROnibyqkui8BZl5hP85sjbKSSU9lCqMJ1XWW+gVNhx sKA7LIm7NPNW9Ysvgd3FhjX/cg18WgjC2HHU26uMhYmGrGUfP8eBw55XidabApgb TpGhKjFxhYqfGnPhAtarsqLYfxWh6vbb1G6cyaC5jJ4baIa5YKqt8tejHCNiFJLI WrVnmi0TJfGjdoULfUdkBOx/pI6zyZ3PWPISbIDUIslQXrnEzKUj37VbN3N0Qlj1 QcVcC+iVd3/gJ/dnvKzmeGjsm2nKK5eEwewRtNuPkQSaM13/dN2CUQ4/+/6BSY1D wODn+Wc5zCoi8sxvVb7TcT+NLO27QkH0REJh4W9KxJx9NSws0BfiVBcTKPFCra+x gMsgNYNdgoCTLQBNj8KK =sEEg -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Antonio Quartulli says: ==================== this is another batch intended for net-next/linux-3.13. This pull request is a bit bigger than usual, but 6 patches are very small (three of them are about email updates).. Patch 1 is fixing a previous merge conflict resolution that went wrong (I realised that only now while checking other patches..). Patches from 2 to 4 that are updating our emails in all the proper files (Documentation/, headers and MAINTAINERS). Patches 5, 6 and 7 are bringing a big improvement to the TranslationTable component: it is now able to group non-mesh clients based on the VLAN they belong to. In this way a lot a new enhancements are now possible thanks to the fact that each batman-adv behaviour can be applied on a per VLAN basis. And, of course, in patches from 8 to 12 you have some of the enhancements I was talking about: - make the batman-Gateway selection VLAN dependent - make DAT (Distributed ARP Table) group ARP entries on a VLAN basis (this allows DAT to work even when the admin decided to use the same IP subnet on different VLANs) - make the AP-Isolation behaviour switchable on each VLAN independently - export VLAN specific attributes via sysfs. Switches like the AP-Isolation are now exported once per VLAN (backward compatibility of the sysfs interface has been preserved) Patches 13 and 14 are small code cleanups. Patch 15 is a minor improvement in the TT locking mechanism. Patches 16 and 17 are other enhancements to the TT component. Those allow a node to parse a "non-mesh client announcement message" and accept only those TT entries belonging to certain VLANs. Patch 18 exploits this parse&accept mechanism to make the Bridge Loop Avoidance component reject only TT entries connected to the VLAN where it is operating. Previous to this change, BLA was rejecting all the entries coming from any other Backbone node, regardless of the VLAN (for more details about how the Bridge Loop Avoidance works please check [1]). [1] http://www.open-mesh.org/projects/batman-adv/wiki/Bridge-loop-avoidance-II ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:52:42 -04:00
Hannes Frederic Sowa	e34c9a6997	net: switch net_secret key generation to net_get_random_once Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:36 -04:00
Hannes Frederic Sowa	222e83d2e0	tcp: switch tcp_fastopen key generation to net_get_random_once Changed key initialization of tcp_fastopen cookies to net_get_random_once. If the user sets a custom key net_get_random_once must be called at least once to ensure we don't overwrite the user provided key when the first cookie is generated later on. Cc: Yuchung Cheng <ycheng@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:35 -04:00
Hannes Frederic Sowa	1bbdceef1e	inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once Initialize the ehash and ipv6_hash_secrets with net_get_random_once. Each compilation unit gets its own secret now: ipv4/inet_hashtables.o ipv4/udp.o ipv6/inet6_hashtables.o ipv6/udp.o rds/connection.o The functions still get inlined into the hashing functions. In the fast path we have at most two (needed in ipv6) if (unlikely(...)). Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:35 -04:00
Hannes Frederic Sowa	b23a002fc6	inet: split syncookie keys for ipv4 and ipv6 and initialize with net_get_random_once This patch splits the secret key for syncookies for ipv4 and ipv6 and initializes them with net_get_random_once. This change was the reason I did this series. I think the initialization of the syncookie_secret is way to early. Cc: Florian Westphal <fw@strlen.de> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:35 -04:00
Hannes Frederic Sowa	a48e42920f	net: introduce new macro net_get_random_once net_get_random_once is a new macro which handles the initialization of secret keys. It is possible to call it in the fast path. Only the initialization depends on the spinlock and is rather slow. Otherwise it should get used just before the key is used to delay the entropy extration as late as possible to get better randomness. It returns true if the key got initialized. The usage of static_keys for net_get_random_once is a bit uncommon so it needs some further explanation why this actually works: === In the simple non-HAVE_JUMP_LABEL case we actually have === no constrains to use static_key_(true\|false) on keys initialized with STATIC_KEY_INIT_(FALSE\|TRUE). So this path just expands in favor of the likely case that the initialization is already done. The key is initialized like this: ___done_key = { .enabled = ATOMIC_INIT(0) } The check if (!static_key_true(&___done_key)) \ expands into (pseudo code) if (!likely(___done_key > 0)) , so we take the fast path as soon as ___done_key is increased from the helper function. === If HAVE_JUMP_LABELs are available this depends === on patching of jumps into the prepared NOPs, which is done in jump_label_init at boot-up time (from start_kernel). It is forbidden and dangerous to use net_get_random_once in functions which are called before that! At compilation time NOPs are generated at the call sites of net_get_random_once. E.g. net/ipv6/inet6_hashtable.c:inet6_ehashfn (we need to call net_get_random_once two times in inet6_ehashfn, so two NOPs): 71: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 76: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) Both will be patched to the actual jumps to the end of the function to call __net_get_random_once at boot time as explained above. arch_static_branch is optimized and inlined for false as return value and actually also returns false in case the NOP is placed in the instruction stream. So in the fast case we get a "return false". But because we initialize ___done_key with (enabled != (entries & 1)) this call-site will get patched up at boot thus returning true. The final check looks like this: if (!static_key_true(&___done_key)) \ ___ret = __net_get_random_once(buf, \ expands to if (!!static_key_false(&___done_key)) \ ___ret = __net_get_random_once(buf, \ So we get true at boot time and as soon as static_key_slow_inc is called on the key it will invert the logic and return false for the fast path. static_key_slow_inc will change the branch because it got initialized with .enabled == 0. After static_key_slow_inc is called on the key the branch is replaced with a nop again. === Misc: === The helper defers the increment into a workqueue so we don't have problems calling this code from atomic sections. A seperate boolean (___done) guards the case where we enter net_get_random_once again before the increment happend. Cc: Ingo Molnar <mingo@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:35 -04:00
Hannes Frederic Sowa	b50026b5ac	ipv6: split inet6_ehashfn to hash functions per compilation unit This patch splits the inet6_ehashfn into separate ones in ipv6/inet6_hashtables.o and ipv6/udp.o to ease the introduction of seperate secrets keys later. Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:34 -04:00
Hannes Frederic Sowa	65cd8033ff	ipv4: split inet_ehashfn to hash functions per compilation unit This duplicates a bit of code but let's us easily introduce separate secret keys later. The separate compilation units are ipv4/inet_hashtabbles.o, ipv4/udp.o and rds/connection.o. Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:45:34 -04:00
Eric Dumazet	cb32f511a7	ipip: add GSO/TSO support Now inet_gso_segment() is stackable, its relatively easy to implement GSO/TSO support for IPIP Performance results, when segmentation is done after tunnel device (as no NIC is yet enabled for TSO IPIP support) : Before patch : lpq83:~# ./netperf -H 7.7.9.84 -Cc MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 3357.88 5.09 3.70 2.983 2.167 After patch : lpq83:~# ./netperf -H 7.7.9.84 -Cc MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 7710.19 4.52 6.62 1.152 1.687 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:36:19 -04:00
Eric Dumazet	3347c96029	ipv4: gso: make inet_gso_segment() stackable In order to support GSO on IPIP, we need to make inet_gso_segment() stackable. It should not assume network header starts right after mac header. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:36:18 -04:00
Eric Dumazet	2d26f0a3c0	ipv4: generalize gre_handle_offloads This patch makes gre_handle_offloads() more generic and rename it to iptunnel_handle_offloads() This will be used to add GSO/TSO support to IPIP tunnels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:36:18 -04:00
Eric Dumazet	030737bcc3	net: generalize skb_segment() While implementing GSO/TSO support for IPIP, I found skb_segment() was assuming network header was immediately following mac header. Its not really true in the case inet_gso_segment() is stacked : By the time tcp_gso_segment() is called, network header points to the inner IP header. Let's instead assume nothing and pick the current offsets found in original skb, we have skb_headers_offset_update() helper for that. Also move the csum_start update inside skb_headers_offset_update() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:36:18 -04:00
Jiri Pirko	e93b7d748b	ip_output: do skb ufo init for peeked non ufo skb as well Now, if user application does: sendto len<mtu flag MSG_MORE sendto len>mtu flag 0 The skb is not treated as fragmented one because it is not initialized that way. So move the initialization to fix this. introduced by: commit `e89e9cf539` "[IPv4/IPv6]: UFO Scatter-gather approach" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-19 19:20:52 -04:00

1 2 3 4 5 ...

29841 Commits