linux

Author	SHA1	Message	Date
Cong Wang	95278ddaa1	net_sched: convert idrinfo->lock from spinlock to a mutex In commit `ec3ed293e7` ("net_sched: change tcf_del_walker() to take idrinfo->lock") we move fl_hw_destroy_tmplt() to a workqueue to avoid blocking with the spinlock held. Unfortunately, this causes a lot of troubles here: 1. tcf_chain_destroy() could be called right after we queue the work but before the work runs. This is a use-after-free. 2. The chain refcnt is already 0, we can't even just hold it again. We can check refcnt==1 but it is ugly. 3. The chain with refcnt 0 is still visible in its block, which means it could be still found and used! 4. The block has a refcnt too, we can't hold it without introducing a proper API either. We can make it working but the end result is ugly. Instead of wasting time on reviewing it, let's just convert the troubling spinlock to a mutex, which allows us to use non-atomic allocations too. Fixes: `ec3ed293e7` ("net_sched: change tcf_del_walker() to take idrinfo->lock") Reported-by: Ido Schimmel <idosch@idosch.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Vlad Buslov <vladbu@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-05 00:36:31 -07:00
Magnus Karlsson	a41b4f3c58	xsk: simplify xdp_clear_umem_at_qid implementation As we now do not allow ethtool to deactivate the queue id we are running an AF_XDP socket on, we can simplify the implementation of xdp_clear_umem_at_qid(). Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-05 09:31:01 +02:00
Jakub Kicinski	1661d34662	ethtool: don't allow disabling queues with umem installed We already check the RSS indirection table does not use queues which would be disabled by channel reconfiguration. Make sure user does not try to disable queues which have a UMEM and zero-copy AF_XDP socket installed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-05 09:31:01 +02:00
Jakub Kicinski	b8c8a2e2e3	ethtool: rename local variable max -> curr ethtool_set_channels() validates the config against driver's max settings. It retrieves the current config and stores it in a variable called max. This was okay when only max settings were accessed but we will soon want to access current settings as well, so calling the entire structure max makes the code less readable. While at it drop unnecessary parenthesis. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-05 09:31:00 +02:00
Magnus Karlsson	c9b47cc1fa	xsk: fix bug when trying to use both copy and zero-copy on one queue id Previously, the xsk code did not record which umem was bound to a specific queue id. This was not required if all drivers were zero-copy enabled as this had to be recorded in the driver anyway. So if a user tried to bind two umems to the same queue, the driver would say no. But if copy-mode was first enabled and then zero-copy mode (or the reverse order), we mistakenly enabled both of them on the same umem leading to buggy behavior. The main culprit for this is that we did not store the association of umem to queue id in the copy case and only relied on the driver reporting this. As this relation was not stored in the driver for copy mode (it does not rely on the AF_XDP NDOs), this obviously could not work. This patch fixes the problem by always recording the umem to queue id relationship in the netdev_queue and netdev_rx_queue structs. This way we always know what kind of umem has been bound to a queue id and can act appropriately at bind time. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-05 09:31:00 +02:00
David Ahern	6f52f80e85	net/neigh: Extend dump filter to proxy neighbor dumps Move the attribute parsing from neigh_dump_table to neigh_dump_info, and pass the filter arguments down to neigh_dump_table in a new struct. Add the filter option to proxy neigh dumps as well to make them consistent. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-05 00:28:45 -07:00
Jianfeng Tan	9d2f67e43b	net/packet: fix packet drop as of virtio gso When we use raw socket as the vhost backend, a packet from virito with gso offloading information, cannot be sent out in later validaton at xmit path, as we did not set correct skb->protocol which is further used for looking up the gso function. To fix this, we set this field according to virito hdr information. Fixes: `e858fae2b0` ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion") Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 22:23:15 -07:00
David Ahern	1620a33695	net: Move free of dst_metrics to helper Move the refcounting and potential free of dst metrics associated for ipv4 and ipv6 to a common helper. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 21:54:25 -07:00
David Ahern	e1255ed4b6	net: common metrics init helper for dst_entry ipv4 and ipv6 both use refcounted metrics if FIB entries have metrics set. Move the common initialization code to a helper and use for both protocols. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 21:54:19 -07:00
David Ahern	cc5f0eb216	net: Move free of fib_metrics to helper Move the refcounting and potential free of dst metrics associated with a fib entry to a helper and use it in both ipv4 and ipv6. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 21:54:10 -07:00
David Ahern	767a221753	net: common metrics init helper for FIB entries Consolidate initialization of ipv4 and ipv6 metrics when fib entries are created into a single helper, ip_fib_metrics_init, that handles the call to ip_metrics_convert. If no metrics are defined for the fib entry, then the metrics is set to dst_default_metrics. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 21:54:03 -07:00
Flavio Leitner	17c357efe5	openvswitch: load NAT helper Load the respective NAT helper module if the flow uses it. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 21:45:16 -07:00
Vinicius Costa Gomes	5a781ccbd1	tc: Add support for configuring the taprio scheduler This traffic scheduler allows traffic classes states (transmission allowed/not allowed, in the simplest case) to be scheduled, according to a pre-generated time sequence. This is the basis of the IEEE 802.1Qbv specification. Example configuration: tc qdisc replace dev enp3s0 parent root handle 100 taprio \ num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 2@2 \ base-time 1528743495910289987 \ sched-entry S 01 300000 \ sched-entry S 02 300000 \ sched-entry S 04 300000 \ clockid CLOCK_TAI The configuration format is similar to mqprio. The main difference is the presence of a schedule, built by multiple "sched-entry" definitions, each entry has the following format: sched-entry <CMD> <GATE MASK> <INTERVAL> The only supported <CMD> is "S", which means "SetGateStates", following the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK> is a bitmask where each bit is a associated with a traffic class, so bit 0 (the least significant bit) being "on" means that traffic class 0 is "active" for that schedule entry. <INTERVAL> is a time duration in nanoseconds that specifies for how long that state defined by <CMD> and <GATE MASK> should be held before moving to the next entry. This schedule is circular, that is, after the last entry is executed it starts from the first one, indefinitely. The other parameters can be defined as follows: - base-time: specifies the instant when the schedule starts, if 'base-time' is a time in the past, the schedule will start at base-time + (N * cycle-time) where N is the smallest integer so the resulting time is greater than "now", and "cycle-time" is the sum of all the intervals of the entries in the schedule; - clockid: specifies the reference clock to be used; The parameters should be similar to what the IEEE 802.1Q family of specification defines. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 13:52:23 -07:00
Vasundhara Volam	16511789b9	devlink: Add generic parameter msix_vec_per_pf_min msix_vec_per_pf_min - This param sets the number of minimal MSIX vectors required for the device initialization. This value is set in the device which limits MSIX vectors per PF. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 13:49:42 -07:00
Vasundhara Volam	f61cba4291	devlink: Add generic parameter msix_vec_per_pf_max msix_vec_per_pf_max - This param sets the number of MSIX vectors that the device requests from the host on driver initialization. This value is set in the device which is applicable per PF. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 13:49:42 -07:00
Vasundhara Volam	e3b5106162	devlink: Add generic parameter ignore_ari ignore_ari - Device ignores ARI(Alternate Routing ID) capability, even when platforms has the support and creates same number of partitions when platform does not support ARI capability. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 13:49:42 -07:00
David S. Miller	9e15ff7b89	Merge tag 'mac80211-for-davem-2018-10-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just three small fixes: * fix use-after-free in regulatory code * fix rx-mgmt key flag in AP mode (mac80211) * fix wireless extensions compat code memory leak ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 13:47:39 -07:00
David S. Miller	9e50727f0e	Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-10-03 mlx5 core driver and ethernet netdev updates, please note there is a small devlink releated update to allow extack argument to eswitch operations. From Eli Britstein, 1) devlink: Add extack argument to the eswitch related operations 2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks 3) net/mlx5e: Add extack messages for TC offload failures From Eran Ben Elisha, 4) mlx5e: Add counter for aRFS rule insertion failures From Feras Daoud 5) Fast teardown support for mlx5 device This change introduces the enhanced version of the "Force teardown" that allows SW to perform teardown in a faster way without the need to reclaim all the FW pages. Fast teardown provides the following advantages: 1- Fix a FW race condition that could cause command timeout 2- Avoid moving to polling mode 3- Close the vport to prevent PCI ACK to be sent without been scatter to memory ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 09:48:37 -07:00
David S. Miller	f0e834e17f	Merge tag 'rxrpc-next-20181004' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Development Here are some development patches for AF_RXRPC. The most significant points are: (1) Change the tracepoint that indicates a packet has been transmitted into one that indicates a packet is about to be transmitted. Without this, the response tracepoint may occur first if the round trip is fast enough. (2) Sort out AFS address list handling to better enforce maximum capacity to use helper functions to fill them and to do an insertion sort to order them. This is here to make (3) easier. (3) Keep AF_INET addresses as AF_INET addresses rather than converting them to AF_INET6 in both AF_RXRPC and kAFS. I hadn't realised that a UDP6 socket would just call down into UDP4 if given an AF_INET address. (4) Allow the timestamp on the first DATA packet of a reply to be retrieved by a kernel service. This will give the kAFS a more accurate base from which to calculate the callback promise expiration. (5) Allow the rxrpc protocol epoch value to be retrieved from an incoming call. This will allow kAFS to determine if the fileserver restarted and if two addresses apparently assigned to the same fileserver actually are different boxes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 09:44:15 -07:00
David Howells	bbb4c4323a	dns: Allow the dns resolver to retrieve a server set Allow the DNS resolver to retrieve a set of servers and their associated addresses, ports, preference and weight ratings. In terms of communication with userspace, "srv=1" is added to the callout string (the '1' indicating the maximum data version supported by the kernel) to ask the userspace side for this. If the userspace side doesn't recognise it, it will ignore the option and return the usual text address list. If the userspace side does recognise it, it will return some binary data that begins with a zero byte that would cause the string parsers to give an error. The second byte contains the version of the data in the blob (this may be between 1 and the version specified in the callout data). The remainder of the payload is version-specific. In version 1, the payload looks like (note that this is packed): u8 Non-string marker (ie. 0) u8 Content (0 => Server list) u8 Version (ie. 1) u8 Source (eg. DNS_RECORD_FROM_DNS_SRV) u8 Status (eg. DNS_LOOKUP_GOOD) u8 Number of servers foreach-server { u16 Name length (LE) u16 Priority (as per SRV record) (LE) u16 Weight (as per SRV record) (LE) u16 Port (LE) u8 Source (eg. DNS_RECORD_FROM_NSS) u8 Status (eg. DNS_LOOKUP_GOT_NOT_FOUND) u8 Protocol (eg. DNS_SERVER_PROTOCOL_UDP) u8 Number of addresses char[] Name (not NUL-terminated) foreach-address { u8 Family (AF_INET{,6}) union { u8[4] ipv4_addr u8[16] ipv6_addr } } } This can then be used to fetch a whole cell's VL-server configuration for AFS, for example. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 09:40:52 -07:00
David Howells	e908bcf4f1	rxrpc: Allow the reply time to be obtained on a client call Allow the epoch value to be queried on a server connection. This is in the rxrpc header of every packet for use in routing and is derived from the client's state. It's also not supposed to change unless the client gets restarted. AFS can make use of this information to deduce whether a fileserver has been restarted because the fileserver makes client calls to the filesystem driver's cache manager to send notifications (ie. callback breaks) about conflicting changes from other clients. These convey the fileserver's own epoch value back to the filesystem. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-04 09:54:29 +01:00
David Howells	2070a3e449	rxrpc: Allow the reply time to be obtained on a client call Allow the timestamp on the sk_buff holding the first DATA packet of a reply to be queried. This can then be used as a base for the expiry time calculation on the callback promise duration indicated by an operation result. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-04 09:42:29 +01:00
Joe Stringer	d71019b54b	net: core: Fix build with CONFIG_IPV6=m Stephen Rothwell reports the following link failure with IPv6 as module: x86_64-linux-gnu-ld: net/core/filter.o: in function `sk_lookup': (.text+0x19219): undefined reference to `__udp6_lib_lookup' Fix the build by only enabling the IPv6 socket lookup if IPv6 support is compiled into the kernel. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-04 10:32:56 +02:00
David Howells	5a790b7375	rxrpc: Drop the local endpoint arg from rxrpc_extract_addr_from_skb() rxrpc_extract_addr_from_skb() doesn't use the argument that points to the local endpoint, so remove the argument. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-04 09:32:28 +01:00
David Howells	46894a1359	rxrpc: Use IPv4 addresses throught the IPv6 AF_RXRPC opens an IPv6 socket through which to send and receive network packets, both IPv6 and IPv4. It currently turns AF_INET addresses into AF_INET-as-AF_INET6 addresses based on an assumption that this was necessary; on further inspection of the code, however, it turns out that the IPv6 code just farms packets aimed at AF_INET addresses out to the IPv4 code. Fix AF_RXRPC to use AF_INET addresses directly when given them. Fixes: `7b674e390e` ("rxrpc: Fix IPv6 support") Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-04 09:32:28 +01:00
David Howells	b3cfb6f567	rxrpc: Emit the data Tx trace line before transmitting Print the data Tx trace line before transmitting so that it appears before the trace lines indicating success or failure of the transmission. This makes the trace log less confusing. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-04 09:32:27 +01:00
David Howells	d2944b1c66	rxrpc: Use rxrpc_free_skb() rather than rxrpc_lose_skb() rxrpc_lose_skb() is now exactly the same as rxrpc_free_skb(), so remove it and use the latter instead. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-04 09:32:27 +01:00
David S. Miller	6f41617bf2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net' overlapped the renaming of a netlink attribute in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-03 21:00:17 -07:00
Eli Britstein	db7ff19e7b	devlink: Add extack for eswitch operations Add extack argument to the eswitch related operations. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-03 16:17:58 -07:00
Chuck Lever	ad0911802c	xprtrdma: Clean up xprt_rdma_disconnect_inject Clean up: Use the appropriate C macro instead of open-coding container_of() . Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 14:19:31 -04:00
Chuck Lever	f26c32fa5c	xprtrdma: Add documenting comments Clean up: fill in or update documenting comments for transport switch entry points. For xprt_rdma_allocate: The first paragraph is no longer true since commit `5a6d1db455` ("SUNRPC: Add a transport-specific private field in rpc_rqst"). The second paragraph is no longer true since commit `54cbd6b0c6` ("xprtrdma: Delay DMA mapping Send and Receive buffers"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 14:10:55 -04:00
Chuck Lever	61c208a5ca	xprtrdma: Report when there were zero posted Receives To show that a caller did attempt to allocate and post more Receive buffers, the trace point in rpcrdma_post_recvs() should report when rpcrdma_post_recvs() was invoked but no new Receive buffers were posted. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 14:02:01 -04:00
Chuck Lever	512ccfb61a	xprtrdma: Move rb_flags initialization Clean up: rb_flags might be used for other things besides RPCRDMA_BUF_F_EMPTY_SCQ, so initialize it in a generic spot instead of in a send-completion-queue-related helper. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 13:39:02 -04:00
Gustavo A. R. Silva	2cc543f5cd	sctp: fix fall-through annotation Replace "fallthru" with a proper "fall through" annotation. This fix is part of the ongoing efforts to enabling -Wimplicit-fallthrough Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-03 09:33:13 -07:00
Trond Myklebust	b92a8fabab	SUNRPC: Refactor sunrpc_cache_lookup This is a trivial split into lookup and insert functions, no change in behavior. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-10-03 11:54:34 -04:00
Trond Myklebust	608a0ab2f5	SUNRPC: Add lockless lookup of the server's auth domain Avoid taking the global auth_domain_lock in most lookups of the auth domain by adding an RCU protected lookup. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-10-03 11:32:59 -04:00
Trond Myklebust	30382d6ce5	SUNRPC: Remove the server 'authtab_lock' and just use RCU Module removal is RCU safe by design, so we really have no need to lock the 'authtab[]' array. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-10-03 11:32:59 -04:00
Chuck Lever	f7d4668155	xprtrdma: Don't disable BH's in backchannel server Clean up: This code was copied from xprtsock.c and backchannel_rqst.c. For rpcrdma, the backchannel server runs exclusively in process context, thus disabling bottom-halves is unnecessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 10:23:45 -04:00
Chuck Lever	83e301dd13	xprtrdma: Remove memory address of "ep" from an error message Clean up: Replace the hashed memory address of the target rpcrdma_ep with the server's IP address and port. The server address is more useful in an administrative error message. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 10:05:37 -04:00
Chuck Lever	f9521d53e8	xprtrdma: Rename rpcrdma_qp_async_error_upcall Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outside the rpcrdma.ko module, add an appropriate documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 09:14:57 -04:00
Chuck Lever	31e62d25b5	xprtrdma: Simplify RPC wake-ups on connect Currently, when a connection is established, rpcrdma_conn_upcall invokes rpcrdma_conn_func and then wake_up_all(&ep->rep_connect_wait). The former wakes waiting RPCs, but the connect worker is not done yet, and that leads to races, double wakes, and difficulty understanding how this logic is supposed to work. Instead, collect all the "connection established" logic in the connect worker (xprt_rdma_connect_worker). A disconnect worker is retained to handle provider upcalls safely. Fixes: `254f91e2fa` ("xprtrdma: RPC/RDMA must invoke ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 08:58:41 -04:00
Chuck Lever	316a616e78	xprtrdma: Re-organize the switch() in rpcrdma_conn_upcall Clean up: Eliminate the FALLTHROUGH into the default arm to make the switch easier to understand. Also, as long as I'm here, do not display the memory address of the target rpcrdma_ep. A hashed memory address is of marginal use here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-10-03 08:39:39 -04:00
Chenbo Feng	e9837e55b0	netfilter: xt_quota: fix the behavior of xt_quota module A major flaw of the current xt_quota module is that quota in a specific rule gets reset every time there is a rule change in the same table. It makes the xt_quota module not very useful in a table in which iptables rules are changed at run time. This fix introduces a new counter that is visible to userspace as the remaining quota of the current rule. When userspace restores the rules in a table, it can restore the counter to the remaining quota instead of resetting it to the full quota. Signed-off-by: Chenbo Feng <fengc@google.com> Suggested-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-10-03 11:32:54 +02:00
Vakul Garg	4e6d47206c	tls: Add support for inplace records encryption Presently, for non-zero copy case, separate pages are allocated for storing plaintext and encrypted text of records. These pages are stored in sg_plaintext_data and sg_encrypted_data scatterlists inside record structure. Further, sg_plaintext_data & sg_encrypted_data are passed to cryptoapis for record encryption. Allocating separate pages for plaintext and encrypted text is inefficient from both required memory and performance point of view. This patch adds support of inplace encryption of records. For non-zero copy case, we reuse the pages from sg_encrypted_data scatterlist to copy the application's plaintext data. For the movement of pages from sg_encrypted_data to sg_plaintext_data scatterlists, we introduce a new function move_to_plaintext_sg(). This function add pages into sg_plaintext_data from sg_encrypted_data scatterlists. tls_do_encryption() is modified to pass the same scatterlist as both source and destination into aead_request_set_crypt() if inplace crypto has been enabled. A new ariable 'inplace_crypto' has been introduced in record structure to signify whether the same scatterlist can be used. By default, the inplace_crypto is enabled in get_rec(). If zero-copy is used (i.e. plaintext data is not copied), inplace_crypto is set to '0'. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Reviewed-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 23:03:47 -07:00
David S. Miller	00538ba915	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2018-09-30 Here's the first bluetooth-next pull request for the 4.20 kernel. - Fixes & cleanups to hci_qca driver - NULL dereference fix to debugfs - Improved L2CAP Connection-oriented Channel MTU & MPS handling - Added support for USB-based RTL8822C controller - Added device ID for BCM4335C0 UART-based controller - Various other smaller cleanups & fixes Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 22:34:54 -07:00
Eric Dumazet	64199fc0a4	ipv4: fix use-after-free in ip_cmsg_recv_dstaddr() Caching ip_hdr(skb) before a call to pskb_may_pull() is buggy, do not do it. Fixes: `2efd4fca70` ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 22:32:05 -07:00
Patrick Ruddy	e4a38c0c4b	ipv6: add vrf table handling code for ipv6 mcast The code to obtain the correct table for the incoming interface was missing for IPv6. This has been added along with the table creation notification to fib rules for the RTNL_FAMILY_IP6MR address family. Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com> Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 22:29:08 -07:00
Robert Shearman	854da99173	ipv4: Allow sending multicast packets on specific i/f using VRF socket It is useful to be able to use the same socket for listening in a specific VRF, as for sending multicast packets out of a specific interface. However, the bound device on the socket currently takes precedence and results in the packets not being sent. Relax the condition on overriding the output interface to use for sending packets out of UDP, raw and ping sockets to allow multicast packets to be sent using the specified multicast interface. Signed-off-by: Robert Shearman <rshearma@vyatta.att-mail.com> Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 22:28:17 -07:00
Ido Schimmel	6919622af3	bridge: mcast: Default back to multicast enabled state Commit `13cefad2f2` ("net: bridge: convert and rename mcast disabled") converted the 'multicast_disabled' field to an option bit named 'BROPT_MULTICAST_ENABLED'. While the old field was implicitly initialized to 0, the new field is not initialized, resulting in the bridge defaulting to multicast disabled state and breaking existing applications. Fix this by explicitly initializing the option. Fixes: `13cefad2f2` ("net: bridge: convert and rename mcast disabled") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 22:27:36 -07:00
Eric Dumazet	8873c064d1	tcp: do not release socket ownership in tcp_close() syzkaller was able to hit the WARN_ON(sock_owned_by_user(sk)); in tcp_close() While a socket is being closed, it is very possible other threads find it in rtnetlink dump. tcp_get_info() will acquire the socket lock for a short amount of time (slow = lock_sock_fast(sk)/unlock_sock_fast(sk, slow);), enough to trigger the warning. Fixes: `67db3e4bfb` ("tcp: no longer hold ehash lock while calling tcp_get_info()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 22:17:35 -07:00

... 168 169 170 171 172 ...

61760 Commits