linux

Author	SHA1	Message	Date
wenxu	6e6b904ad4	ip_tunnel: Fix route fl4 init in ip_md_tunnel_xmit Init the gre_key from tuninfo->key.tun_id and init the mark from the skb->mark, set the oif to zero in the collect metadata mode. Fixes: `cfc7381b30` ("ip_tunnel: add collect_md mode to IPIP tunnel") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-26 09:43:03 -08:00
wenxu	c8b34e680a	ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit Add tnl_update_pmtu in ip_md_tunnel_xmit to dynamic modify the pmtu which packet send through collect_metadata mode ip tunnel Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-26 09:43:03 -08:00
wenxu	f46fe4f8d7	ip_tunnel: Add ip tunnel dst_cache in ip_md_tunnel_xmit Add ip tunnel dst cache in ip_md_tunnel_xmit to make more efficient for the route lookup. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-26 09:43:03 -08:00
Willem de Bruijn	f859a44847	tcp: allow zerocopy with fastopen Accept MSG_ZEROCOPY in all the TCP states that allow sendmsg. Remove the explicit check for ESTABLISHED and CLOSE_WAIT states. This requires correctly handling zerocopy state (uarg, sk_zckey) in all paths reachable from other TCP states. Such as the EPIPE case in sk_stream_wait_connect, which a sendmsg() in incorrect state will now hit. Most paths are already safe. Only extension needed is for TCP Fastopen active open. This can build an skb with data in tcp_send_syn_data. Pass the uarg along with other fastopen state, so that this skb also generates a zerocopy notification on release. Tested with active and passive tcp fastopen packetdrill scripts at `1747eef03d` Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-25 22:41:08 -08:00
Peter Oskolkov	997dd96471	net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit `0ed4229b08` ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implemenations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patch re-uses common IP defragmentation queueing and reassembly code in IP6 defragmentation in nf_conntrack, removing the 1280 byte restriction. Signed-off-by: Peter Oskolkov <posk@google.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-25 21:37:11 -08:00
Peter Oskolkov	d4289fcc9b	net: IP6 defrag: use rbtrees for IPv6 defrag Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit `0ed4229b08` ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implemenations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patch re-uses common IP defragmentation queueing and reassembly code in IPv6, removing the 1280 byte restriction. Signed-off-by: Peter Oskolkov <posk@google.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-25 21:37:11 -08:00
Peter Oskolkov	c23f35d19d	net: IP defrag: encapsulate rbtree defrag code into callable functions This is a refactoring patch: without changing runtime behavior, it moves rbtree-related code from IPv4-specific files/functions into .h/.c defrag files shared with IPv6 defragmentation code. Signed-off-by: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-25 21:37:11 -08:00
Liangwei Dong	6c900360e7	nl80211: Allow set/del pmksa operations for AP Host drivers may offload authentication to the user space through the commit ("cfg80211: Authentication offload to user space in AP mode"). This interface can be used to implement SAE by having the userspace do authentication/PMKID key derivation and driver handle the association. A step ahead, this interface can get further optimized if the PMKID is passed to the host driver and also have it respond to the association request by the STA on a valid PMKID. This commit enables the userspace to pass the PMKID to the host drivers through the set/del pmksa operations in AP mode. Set/Del pmksa is now restricted to STA/P2P client mode only and thus the drivers might not expect them in any other(AP) mode. This commit also introduces a feature flag NL80211_EXT_FEATURE_AP_PMKSA_CACHING (johannes: renamed) to maintain the backward compatibility of such an expectation by the host drivers. These operations are allowed in AP mode only when the drivers advertize the capability through this flag. Signed-off-by: Liangwei Dong <liangwei@codeaurora.org> Signed-off-by: Srinivas Dasari <dasaris@codeaurora.org> [rename flag to NL80211_EXT_FEATURE_AP_PMKSA_CACHING] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 21:09:21 +01:00
Srinivas Dasari	fe4943702c	cfg80211: Authentication offload to user space in AP mode commit `40cbfa9021` ("cfg80211/nl80211: Optional authentication offload to userspace")' introduced authentication offload to user space by the host drivers in station mode. This commit extends the same for the AP mode too. Extend NL80211_ATTR_EXTERNAL_AUTH_SUPPORT to also claim the support of external authentication from the user space in AP mode. A new flag parameter is introduced in cfg80211_ap_settings to intend the same while "start ap". Host driver to use NL80211_CMD_FRAME interface to transmit and receive the authentication frames to / from the user space. Host driver to indicate the flag NL80211_RXMGMT_FLAG_EXTERNAL_AUTH while sending the authentication frame to the user space. This intends to the user space that the driver wishes it to process the authentication frame for certain protocols, though it had initially advertised the support for SME functionality. User space shall accordingly do the authentication and indicate its final status through the command NL80211_CMD_EXTERNAL_AUTH. Allow the command even if userspace doesn't include the attribute NL80211_ATTR_SSID for AP interface. Host driver shall continue with the association sequence and indicate the STA connection status through cfg80211_new_sta. To facilitate the host drivers in AP mode for matching the pmkid by the stations during the association, NL80211_CMD_EXTERNAL_AUTH is also enhanced to include the pmkid to drivers after the authentication. This pmkid can also be used in the STA mode to include in the association request. Also modify nl80211_external_auth to not mandate SSID in AP mode. Signed-off-by: Srinivas Dasari <dasaris@codeaurora.org> [remove useless nla_get_flag() usage] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 21:08:05 +01:00
David S. Miller	517952756e	Merge tag 'mac80211-for-davem-2019-01-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a few small fixes: * avoid trying to operate TDLS when not connection, this is not valid and led to issues * count TTL-dropped frames in mesh better * deal with new WiGig channels in regulatory code * remove a WARN_ON() that can trigger due to benign races during device/driver registration * fix nested netlink policy maxattrs (syzkaller) * fix hwsim n_limits (syzkaller) * propagate __aligned(2) to a surrounding struct * return proper error in virt_wifi error path ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-25 10:59:36 -08:00
David S. Miller	30e5c2c6bf	net: Revert devlink health changes. This reverts the devlink health changes from 9/17/2019, Jiri wants things to be designed differently and it was agreed that the easiest way to do this is start from the beginning again. Commits reverted: `cb5ccfbe73` `880ee82f03` `c7af343b4e` `ff253fedab` `6f9d56132e` `fcd852c69d` `8a66704a13` `12bd0dcefe` `aba25279c1` `ce019faa70` `b8c45a033a` And the follow-on build fix: o33a0efa4baecd689da9474ce0e8b673eb6931c60 Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-25 10:53:23 -08:00
Veerendranath Jakkam	ab4dfa2053	cfg80211: Allow drivers to advertise supported AKM suites There was no such capability advertisement from the driver and thus the current user space has to assume the driver to support all the AKMs. While that may be the case with some drivers (e.g., mac80211-based ones), there are cfg80211-based drivers that implement SME and have constraints on which AKMs can be supported (e.g., such drivers may need an update to support SAE AKM using NL80211_CMD_EXTERNAL_AUTH). Allow such drivers to advertise the exact set of supported AKMs so that user space tools can determine what network profile options should be allowed to be configured. Signed-off-by: Veerendranath Jakkam <vjakkam@codeaurora.org> [pmsr data might be big, start a new netlink message section] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 14:05:31 +01:00
Sriram R	c82c06ce43	cfg80211: Notify all User Hints To self managed wiphys Currently Self Managed WIPHY's are not notified on any hints other than user cell base station hints. Self Managed wiphy's basically rely on hints from firmware and its local regdb for regulatory management, so hints from wireless core can be ignored. But all user hints needs to be notified to them to provide flexibility to these drivers to honour or ignore these user hints. Currently none of the drivers supporting self managed wiphy register a notifier with cfg80211. Hence this change does not affect any other driver behavior. Signed-off-by: Sriram R <srirrama@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 14:05:31 +01:00
Gustavo A. R. Silva	4af217500e	cfg80211: mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warnings: net/wireless/wext-compat.c:1327:6: warning: this statement may fall through [-Wimplicit-fallthrough=] net/wireless/wext-compat.c:1341:6: warning: this statement may fall through [-Wimplicit-fallthrough=] Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 14:05:31 +01:00
Toke Høiland-Jørgensen	390298e86f	mac80211: Expose ieee80211_schedule_txq() function Since we reworked ieee80211_return_txq() so it assumes that the caller takes care of logging, we need another function that can be called without holding any locks. Introduce ieee80211_schedule_txq() which serves this purpose. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 14:05:27 +01:00
Chaitanya Tata	93183bdbe7	cfg80211: extend range deviation for DMG Recently, DMG frequency bands have been extended till 71GHz, so extend the range check till 20GHz (45-71GHZ), else some channels will be marked as disabled. Signed-off-by: Chaitanya Tata <Chaitanya.Tata@bluwireless.co.uk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 10:18:51 +01:00
Chaitanya Tata	faae54ad41	cfg80211: reg: remove warn_on for a normal case If there are simulatenous queries of regdb, then there might be a case where multiple queries can trigger request_firmware_no_wait and can have parallel callbacks being executed asynchronously. In this scenario we might hit the WARN_ON. So remove the warn_on, as the code already handles multiple callbacks gracefully. Signed-off-by: Chaitanya Tata <chaitanya.tata@bluwireless.co.uk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 10:18:02 +01:00
Mathieu Malaterre	7c53eb5d87	mac80211: Add attribute aligned(2) to struct 'action' During refactor in commit `9e478066ea` ("mac80211: fix MU-MIMO follow-MAC mode") a new struct 'action' was declared with packed attribute as: struct { struct ieee80211_hdr_3addr hdr; u8 category; u8 action_code; } __packed action; But since struct 'ieee80211_hdr_3addr' is declared with an aligned keyword as: struct ieee80211_hdr { __le16 frame_control; __le16 duration_id; u8 addr1[ETH_ALEN]; u8 addr2[ETH_ALEN]; u8 addr3[ETH_ALEN]; __le16 seq_ctrl; u8 addr4[ETH_ALEN]; } __packed __aligned(2); Solve the ambiguity of placing aligned structure in a packed one by adding the aligned(2) attribute to struct 'action'. This removes the following warning (W=1): net/mac80211/rx.c:234:2: warning: alignment 1 of 'struct <anonymous>' is less than 2 [-Wpacked-not-aligned] Cc: Johannes Berg <johannes.berg@intel.com> Suggested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 10:17:25 +01:00
Balaji Pothunoori	7ed5285396	mac80211: don't initiate TDLS connection if station is not associated to AP Following call trace is observed while adding TDLS peer entry in driver during TDLS setup. Call Trace: [<c1301476>] dump_stack+0x47/0x61 [<c10537d2>] __warn+0xe2/0x100 [<fa22415f>] ? sta_apply_parameters+0x49f/0x550 [mac80211] [<c1053895>] warn_slowpath_null+0x25/0x30 [<fa22415f>] sta_apply_parameters+0x49f/0x550 [mac80211] [<fa20ad42>] ? sta_info_alloc+0x1c2/0x450 [mac80211] [<fa224623>] ieee80211_add_station+0xe3/0x160 [mac80211] [<c1876fe3>] nl80211_new_station+0x273/0x420 [<c170f6d9>] genl_rcv_msg+0x219/0x3c0 [<c170f4c0>] ? genl_rcv+0x30/0x30 [<c170ee7e>] netlink_rcv_skb+0x8e/0xb0 [<c170f4ac>] genl_rcv+0x1c/0x30 [<c170e8aa>] netlink_unicast+0x13a/0x1d0 [<c170ec18>] netlink_sendmsg+0x2d8/0x390 [<c16c5acd>] sock_sendmsg+0x2d/0x40 [<c16c6369>] ___sys_sendmsg+0x1d9/0x1e0 Fixing this by allowing TDLS setup request only when we have completed association. Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 10:13:22 +01:00
Johannes Berg	a8b5c6d692	nl80211: fix NLA_POLICY_NESTED() arguments syzbot reported an out-of-bounds read when passing certain malformed messages into nl80211. The specific place where this happened isn't interesting, the problem is that nested policy parsing was referring to the wrong maximum attribute and thus the policy wasn't long enough. Fix this by referring to the correct attribute. Since this is really not necessary, I'll come up with a separate patch to just pass the policy instead of both, in the common case we can infer the maxattr from the size of the policy array. Reported-by: syzbot+4157b036c5f4713b1f2f@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Fixes: `9bb7e0f24e` ("cfg80211: add peer measurement with FTM initiator API") Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2019-01-25 09:26:32 +01:00
Felix Fietkau	7d652669b6	batman-adv: release station info tidstats With the addition of TXQ stats in the per-tid statistics the struct station_info grew significantly. This resulted in stack size warnings due to the structure itself being above the limit for the warnings. To work around this, the TID array was allocated dynamically. Also a function to free this content was introduced with commit `7ea3e110f2` ("cfg80211: release station info tidstats where needed") but the necessary changes were not provided for batman-adv's B.A.T.M.A.N. V implementation. Signed-off-by: Felix Fietkau <nbd@nbd.name> Fixes: `8689c051a2` ("cfg80211: dynamically allocate per-tid stats for station info") [sven@narfation.org: add commit message] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2019-01-25 09:04:41 +01:00
Colin Ian King	1e4b6e91b4	Bluetooth: make hw_err static, reduces object code size Don't populate the const array hw_err on the stack but instead make it static. Makes the object code smaller by 45 bytes: Before: text data bss dec hex filename 100880 21090 1088 123058 1e0b2 linux/net/bluetooth/hci_core.o After: text data bss dec hex filename 100739 21186 1088 123013 1e085 linux/net/bluetooth/hci_core.o (gcc version 8.2.0 x86_64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-01-25 08:53:58 +01:00
Rajat Jain	e2bef3847e	Bluetooth: Allow driver specific cmd timeout handling Add a hook to allow the BT driver to do device or command specific handling in case of timeouts. This is to be used by Intel driver to reset the device after certain number of timeouts. Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-01-25 08:46:32 +01:00
YueHaibing	0ba9480cff	bridge: remove duplicated include from br_multicast.c Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 22:49:57 -08:00
Zhaolong Zhang	8eab6dac8d	tipc: remove dead code in struct tipc_topsrv max_rcvbuf_size is no longer used since commit "414574a0af36". Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 22:31:23 -08:00
Priyaranjan Jha	78dc70ebaa	tcp_bbr: adapt cwnd based on ack aggregation estimation Aggregation effects are extremely common with wifi, cellular, and cable modem link technologies, ACK decimation in middleboxes, and LRO and GRO in receiving hosts. The aggregation can happen in either direction, data or ACKs, but in either case the aggregation effect is visible to the sender in the ACK stream. Previously BBR's sending was often limited by cwnd under severe ACK aggregation/decimation because BBR sized the cwnd at 2BDP. If packets were acked in bursts after long delays (e.g. one ACK acking 5BDP after 5RTT), BBR's sending was halted after sending 2BDP over 2RTT, leaving the bottleneck idle for potentially long periods. Note that loss-based congestion control does not have this issue because when facing aggregation it continues increasing cwnd after bursts of ACKs, growing cwnd until the buffer is full. To achieve good throughput in the presence of aggregation effects, this algorithm allows the BBR sender to put extra data in flight to keep the bottleneck utilized during silences in the ACK stream that it has evidence to suggest were caused by aggregation. A summary of the algorithm: when a burst of packets are acked by a stretched ACK or a burst of ACKs or both, BBR first estimates the expected amount of data that should have been acked, based on its estimated bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max filter to estimate the recent level of observed ACK aggregation. Then cwnd is increased by the ACK aggregation estimate. The larger cwnd avoids BBR being cwnd-limited in the face of ACK silences that recent history suggests were caused by aggregation. As a sanity check, the ACK aggregation degree is upper-bounded by the cwnd (at the time of measurement) and a global max of BW 100ms. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 In our internal testing, we observed a significant increase in BBR throughput (measured using netperf), in a basic wifi setup. - Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi) - 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps - 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps Also, this code is running globally on YouTube TCP connections and produced significant bandwidth increases for YouTube traffic. This is based on Ian Swett's max_ack_height_ algorithm from the QUIC BBR implementation. Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 22:27:27 -08:00
Priyaranjan Jha	232aa8ec3e	tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioning Because bbr_target_cwnd() is really a general-purpose BBR helper for computing some volume of inflight data as a function of the estimated BDP, refactor it into following helper functions: - bbr_bdp() - bbr_quantization_budget() - bbr_inflight() Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 22:27:26 -08:00
David S. Miller	9620d6f683	Merge tag 'linux-can-fixes-for-5.0-20190122' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2019-01-22 this is a pull request of 4 patches for net/master. The first patch by is by Manfred Schlaegl and reverts a patch that caused wrong warning messages in certain use cases. The next patch is by Oliver Hartkopp for the bcm that adds sanity checks for the timer value before using it to detect potential interger overflows. The last two patches are for the flexcan driver, YueHaibing's patch fixes the the return value in the error path of the flexcan_setup_stop_mode() function. The second patch is by Uwe Kleine-König and fixes a NULL pointer deref on older flexcan cores in flexcan_chip_start(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 21:52:37 -08:00
Xin Long	ecf938fe7d	sctp: set flow sport from saddr only when it's 0 Now sctp_transport_pmtu() passes transport->saddr into .get_dst() to set flow sport from 'saddr'. However, transport->saddr is set only when transport->dst exists in sctp_transport_route(). If sctp_transport_pmtu() is called without transport->saddr set, like when transport->dst doesn't exists, the flow sport will be set to 0 from transport->saddr, which will cause a wrong route to be got. Commit `6e91b578bf` ("sctp: re-use sctp_transport_pmtu in sctp_transport_route") made the issue be triggered more easily since sctp_transport_pmtu() would be called in sctp_transport_route() after that. In gerneral, fl4->fl4_sport should always be set to htons(asoc->base.bind_addr.port), unless transport->asoc doesn't exist in sctp_v4/6_get_dst(), which is the case: sctp_ootb_pkt_new() -> sctp_transport_route() For that, we can simply handle it by setting flow sport from saddr only when it's 0 in sctp_v4/6_get_dst(). Fixes: `6e91b578bf` ("sctp: re-use sctp_transport_pmtu in sctp_transport_route") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 18:13:57 -08:00
Xin Long	4ff40b8626	sctp: set chunk transport correctly when it's a new asoc In the paths: sctp_sf_do_unexpected_init() -> sctp_make_init_ack() sctp_sf_do_dupcook_a/b()() -> sctp_sf_do_5_1D_ce() The new chunk 'retval' transport is set from the incoming chunk 'chunk' transport. However, 'retval' transport belong to the new asoc, which is a different one from 'chunk' transport's asoc. It will cause that the 'retval' chunk gets set with a wrong transport. Later when sending it and because of Commit `b9fd683982` ("sctp: add sctp_packet_singleton"), sctp_packet_singleton() will set some fields, like vtag to 'retval' chunk from that wrong transport's asoc. This patch is to fix it by setting 'retval' transport correctly which belongs to the right asoc in sctp_make_init_ack() and sctp_sf_do_5_1D_ce(). Fixes: `b9fd683982` ("sctp: add sctp_packet_singleton") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 18:13:57 -08:00
Xin Long	8220c870cb	sctp: improve the events for sctp stream adding This patch is to improve sctp stream adding events in 2 places: 1. In sctp_process_strreset_addstrm_out(), move up SCTP_MAX_STREAM and in stream allocation failure checks, as the adding has to succeed after reconf_timer stops for the in stream adding request retransmission. 3. In sctp_process_strreset_addstrm_in(), no event should be sent, as no in or out stream is added here. Fixes: `50a41591f1` ("sctp: implement receiver-side procedures for the Add Outgoing Streams Request Parameter") Fixes: `c5c4ebb3ab` ("sctp: implement receiver-side procedures for the Add Incoming Streams Request Parameter") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 18:13:57 -08:00
Xin Long	2e6dc4d951	sctp: improve the events for sctp stream reset This patch is to improve sctp stream reset events in 4 places: 1. In sctp_process_strreset_outreq(), the flag should always be set with SCTP_STREAM_RESET_INCOMING_SSN instead of OUTGOING, as receiver's in stream is reset here. 2. In sctp_process_strreset_outreq(), move up SCTP_STRRESET_ERR_WRONG_SSN check, as the reset has to succeed after reconf_timer stops for the in stream reset request retransmission. 3. In sctp_process_strreset_inreq(), no event should be sent, as no in or out stream is reset here. 4. In sctp_process_strreset_resp(), SCTP_STREAM_RESET_INCOMING_SSN or OUTGOING event should always be sent for stream reset requests, no matter it fails or succeeds to process the request. Fixes: `8105447645` ("sctp: implement receiver-side procedures for the Outgoing SSN Reset Request Parameter") Fixes: `16e1a91965` ("sctp: implement receiver-side procedures for the Incoming SSN Reset Request Parameter") Fixes: `11ae76e67a` ("sctp: implement receiver-side procedures for the Reconf Response Parameter") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 18:13:57 -08:00
wenxu	d71b57532d	ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel ip l add dev tun type gretap key 1000 ip a a dev tun 10.0.0.1/24 Packets with tun-id 1000 can be recived by tun dev. But packet can't be sent through dev tun for non-tunnel-dst With this patch: tunnel-dst can be get through lwtunnel like beflow: ip r a 10.0.0.7 encap ip dst 172.168.0.11 dev tun Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-24 17:54:12 -08:00
Björn Töpel	a36b38aa2a	xsk: add sock_diag interface for AF_XDP This patch adds the sock_diag interface for querying sockets from user space. Tools like iproute2 ss(8) can use this interface to list open AF_XDP sockets. The user-space ABI is defined in linux/xdp_diag.h and includes netlink request and response structs. The request can query sockets and the response contains socket information about the rings, umems, inode and more. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-25 01:50:03 +01:00
Björn Töpel	50e74c0131	xsk: add id to umem This commit adds an id to the umem structure. The id uniquely identifies a umem instance, and will be exposed to user-space via the socket monitoring interface. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-25 01:50:03 +01:00
Björn Töpel	1d0dc06930	net: xsk: track AF_XDP sockets on a per-netns list Track each AF_XDP socket in a per-netns list. This will be used later by the sock_diag interface for querying sockets from userspace. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-25 01:50:03 +01:00
ZhangXiaoxu	53ab60baa1	ipvs: Fix signed integer overflow when setsockopt timeout There is a UBSAN bug report as below: UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21 signed integer overflow: -2147483647 * 1000 cannot be represented in type 'int' Reproduce program: #include <stdio.h> #include <sys/types.h> #include <sys/socket.h> #define IPPROTO_IP 0 #define IPPROTO_RAW 255 #define IP_VS_BASE_CTL (64+1024+64) #define IP_VS_SO_SET_TIMEOUT (IP_VS_BASE_CTL+10) /* The argument to IP_VS_SO_GET_TIMEOUT / struct ipvs_timeout_t { int tcp_timeout; int tcp_fin_timeout; int udp_timeout; }; int main() { int ret = -1; int sockfd = -1; struct ipvs_timeout_t to; sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); if (sockfd == -1) { printf("socket init error\n"); return -1; } to.tcp_timeout = -2147483647; to.tcp_fin_timeout = -2147483647; to.udp_timeout = -2147483647; ret = setsockopt(sockfd, IPPROTO_IP, IP_VS_SO_SET_TIMEOUT, (char )(&to), sizeof(to)); printf("setsockopt return %d\n", ret); return ret; } Return -EINVAL if the timeout value is negative or max than 'INT_MAX / HZ'. Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-24 13:38:54 +01:00
Eric Dumazet	d9ff286a0f	bpf: allow BPF programs access skb_shared_info->gso_segs field This adds the ability to read gso_segs from a BPF program. v3: Use BPF_REG_AX instead of BPF_REG_TMP for the temporary register, as suggested by Martin. v2: refined Eddie Hao patch to address Alexei feedback. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eddie Hao <eddieh@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-24 10:49:05 +01:00
Eric Dumazet	63530aba78	ax25: fix possible use-after-free syzbot found that ax25 routes where not properly protected against concurrent use [1]. In this particular report the bug happened while copying ax25->digipeat. Fix this problem by making sure we call ax25_get_route() while ax25_route_lock is held, so that no modification could happen while using the route. The current two ax25_get_route() callers do not sleep, so this change should be fine. Once we do that, ax25_get_route() no longer needs to grab a reference on the found route. [1] ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline] BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113 Read of size 66 at addr ffff888066641a80 by task syz-executor2/531 ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x24/0x50 mm/kasan/common.c:130 memcpy include/linux/string.h:352 [inline] kmemdup+0x42/0x60 mm/util.c:113 kmemdup include/linux/string.h:425 [inline] ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224 __sys_connect+0x357/0x490 net/socket.c:1664 __do_sys_connect net/socket.c:1675 [inline] __se_sys_connect net/socket.c:1672 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1672 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458099 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099 RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4 R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff Allocated by task 526: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc mm/kasan/common.c:496 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504 ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609 kmalloc include/linux/slab.h:545 [inline] ax25_rt_add net/ax25/ax25_route.c:95 [inline] ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de Freed by task 550: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466 __cache_free mm/slab.c:3487 [inline] kfree+0xcf/0x230 mm/slab.c:3806 ax25_rt_add net/ax25/ax25_route.c:92 [inline] ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888066641a80 which belongs to the cache kmalloc-96 of size 96 The buggy address is located 0 bytes inside of 96-byte region [ffff888066641a80, ffff888066641ae0) The buggy address belongs to the page: page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0 flags: 0x1fffc0000000200(slab) ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0 raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-23 11:18:00 -08:00
Gustavo A. R. Silva	6317950c1b	Bluetooth: Mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warnings: net/bluetooth/rfcomm/core.c:479:6: warning: this statement may fall through [-Wimplicit-fallthrough=] net/bluetooth/l2cap_core.c:4223:6: warning: this statement may fall through [-Wimplicit-fallthrough=] Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-01-23 18:27:16 +01:00
Gustavo A. R. Silva	f79e3365bc	tipc: mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warnings: net/tipc/link.c:1125:6: warning: this statement may fall through [-Wimplicit-fallthrough=] net/tipc/socket.c:736:6: warning: this statement may fall through [-Wimplicit-fallthrough=] net/tipc/socket.c:2418:7: warning: this statement may fall through [-Wimplicit-fallthrough=] Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-23 09:06:35 -08:00
Marcel Holtmann	7c9cbd0b5e	Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len as length value. The opt->len however is in control over the remote user and can be used by an attacker to gain access beyond the bounds of the actual packet. To prevent any potential leak of heap memory, it is enough to check that the resulting len calculation after calling l2cap_get_conf_opt is not below zero. A well formed packet will always return >= 0 here and will end with the length value being zero after the last option has been parsed. In case of malformed packets messing with the opt->len field the length value will become negative. If that is the case, then just abort and ignore the option. In case an attacker uses a too short opt->len value, then garbage will be parsed, but that is protected by the unknown option handling and also the option parameter size checks. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2019-01-23 13:35:07 +02:00
Yafang Shao	c9e4576743	bpf: sock recvbuff must be limited by rmem_max in bpf_setsockopt() When sock recvbuff is set by bpf_setsockopt(), the value must by limited by rmem_max. It is the same with sendbuff. Fixes: `8c4b4c7e9f` ("bpf: Add setsockopt helper function to bpf") Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-23 12:22:22 +01:00
Marcel Holtmann	af3d5d1c87	Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt When doing option parsing for standard type values of 1, 2 or 4 octets, the value is converted directly into a variable instead of a pointer. To avoid being tricked into being a pointer, check that for these option types that sizes actually match. In L2CAP every option is fixed size and thus it is prudent anyway to ensure that the remote side sends us the right option size along with option paramters. If the option size is not matching the option type, then that option is silently ignored. It is a protocol violation and instead of trying to give the remote attacker any further hints just pretend that option is not present and proceed with the default values. Implementation following the specification and its qualification procedures will always use the correct size and thus not being impacted here. To keep the code readable and consistent accross all options, a few cosmetic changes were also required. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2019-01-23 12:53:20 +02:00
Gustavo A. R. Silva	3bbe8b1a4a	9p: mark expected switch fall-through In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: net/9p/trans_xen.c:514:6: warning: this statement may fall through [-Wimplicit-fallthrough=] Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough Link: http://lkml.kernel.org/r/20190123071632.GA8039@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>	2019-01-23 16:29:54 +09:00
Nathan Chancellor	33a0efa4ba	devlink: Use DIV_ROUND_UP_ULL in DEVLINK_HEALTH_SIZE_TO_BUFFERS When building this code on a 32-bit platform such as ARM, there is a link time error (lld error shown, happpens with ld.bfd too): ld.lld: error: undefined symbol: __aeabi_uldivmod >>> referenced by devlink.c >>> net/core/devlink.o:(devlink_health_buffers_create) in archive built-in.a This happens when using a regular division symbol with a u64 dividend. Use DIV_ROUND_UP_ULL, which wraps do_div, to avoid this situation. Fixes: `cb5ccfbe73` ("devlink: Add health buffer support") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-22 21:00:37 -08:00
Lubomir Rintel	7c62b8dd5c	net/ipv6: lower the level of "link is not ready" messages This message gets logged far too often for how interesting is it. Most distributions nowadays configure NetworkManager to use randomly generated MAC addresses for Wi-Fi network scans. The interfaces end up being periodically brought down for the address change. When they're subsequently brought back up, the message is logged, eventually flooding the log. Perhaps the message is not all that helpful: it seems to be more interesting to hear when the addrconf actually start, not when it does not. Let's lower its level. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-By: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-22 20:42:39 -08:00
YueHaibing	ed175d9c6f	devlink: Add missing check of nlmsg_put nlmsg_put may fail, this fix add a check of its return value. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-22 20:08:23 -08:00
Jakub Kicinski	1518039f6b	net/ipv6: don't return positive numbers when nothing was dumped in6_dump_addrs() returns a positive 1 if there was nothing to dump. This return value can not be passed as return from inet6_dump_addr() as is, because it will confuse rtnetlink, resulting in NLMSG_DONE never getting set: $ ip addr list dev lo EOF on netlink Dump terminated v2: flip condition to avoid a new goto (DaveA) Fixes: `7c1e8a3817` ("netlink: fixup regression in RTM_GETADDR") Reported-by: Brendan Galloway <brendan.galloway@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-22 17:24:18 -08:00
Linus Lüssing	4b3087c7e3	bridge: Snoop Multicast Router Advertisements When multiple multicast routers are present in a broadcast domain then only one of them will be detectable via IGMP/MLD query snooping. The multicast router with the lowest IP address will become the selected and active querier while all other multicast routers will then refrain from sending queries. To detect such rather silent multicast routers, too, RFC4286 ("Multicast Router Discovery") provides a standardized protocol to detect multicast routers for multicast snooping switches. This patch implements the necessary MRD Advertisement message parsing and after successful processing adds such routers to the internal multicast router list. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-22 17:18:09 -08:00

... 13 14 15 16 17 ...

55319 Commits