linux

Author	SHA1	Message	Date
Yunsheng Lin	aa9d22dd45	net: hns3: fix error handling for desc filling When desc filling fails in hns3_nic_net_xmit, it will call hns3_clear_desc to unmap the dma mapping. But currently the ring->next_to_use points to the desc where the desc filling or dma mapping return error, which means the desc that ring->next_to_use points to has not done the dma mapping, the desc that need unmapping is before the ring->next_to_use. This patch fixes it by calling ring_ptr_move_bw(next_to_use) before doing unmapping operation, and set desc_cb->dma to zero to avoid freeing it again when unloading. Also, when filling skb head or frag fails, both need to unmap all the way back to next_to_use_head, so remove one desc filling error handling. Fixes: `76ad4f0ee7` ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	757cd1e4a4	net: hns3: combine len and checksum handling for inner and outer header. When filling len and checksum info to description, there is some similar checking or calculation. So this patch adds hns3_set_l2l3l4 to fill the inner(/normal) header's len and checksum info. If it is a encapsulation skb, it calls hns3_set_outer_l2l3l4 to handle the outer header's len and checksum info, in order to avoid some similar checking or calculation. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	07918fcde1	net: hns3: refactor BD filling for l2l3l4 info This patch separates the inner and outer l2l3l4 len handling in hns3_set_l2l3l4_len, this is a preparation to combine the l2l3l4 len and checksum handling for inner and outer header. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	39c38824c2	net: hns3: fix for tunnel type handling in hns3_rx_checksum According to hardware user manual, the tunnel packet type is available in the rx.ol_info field of struct hns3_desc. Currently the tunnel packet type is decided by the rx.l234_info, which may cause RX checksum handling error. This patch fixes it by using the correct field in struct hns3_desc to decide the tunnel packet type. Fixes: `76ad4f0ee7` ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	db4970aa92	net: hns3: add linearizing checking for TSO case HW requires every continuous 8 buffer data to be larger than MSS, we simplify it by ensuring skb_headlen + the first continuous 7 frags to to be larger than GSO header len + mss, and the remaining continuous 7 frags to be larger than MSS except the last 7 frags. This patch adds hns3_skb_need_linearized to handle it for TSO case. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	d21ff4f90d	net: hns3: add counter for times RX pages gets allocated Currently, using "ethtool --statistics" can show how many time RX page have been reused, but there is no counter for RX page not being reused. This patch adds non_reuse_pg counter to better debug the performance issue, because it is hard to determine when the RX page is reused or not if there is no such counter. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	fb00331bb8	net: hns3: use napi_schedule_irqoff in hard interrupts handlers napi_schedule_irqoff is introduced to be used from hard interrupts handlers or when irqs are already masked, see: https://lists.openwall.net/netdev/2014/10/29/2 So this patch replaces napi_schedule with napi_schedule_irqoff. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Yunsheng Lin	3d5f374189	net: hns3: unify maybe_stop_tx for TSO and non-TSO case Currently, maybe_stop_tx ops for TSO and non-TSO case share some BD calculation code, so this patch unifies the maybe_stop_tx by removing the maybe_stop_tx ops. skb_is_gso() can be used to differentiate the case between TSO and non-TSO case if there is need to handle special case for TSO case. This patch also add tx_copy field in "ethtool --statistics" to help better debug the performance issue caused by calling skb_copy. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:37:13 -07:00
Max Filippov	a5944195d0	xtensa: implement initialize_cacheattr for MPU cores Use CONFIG_MEMMAP_CACHEATTR to initialize MPU as described in the Xtensa LSP RM document. Coalesce adjacent regions with the same cacheattr. Update Kconfig help text. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-05-07 10:36:34 -07:00
Max Filippov	f7c34874f0	xtensa: add exclusive atomics support Implement atomic primitives using exclusive access opcodes available in the recent xtensa cores. Since l32ex/s32ex don't have any memory ordering guarantees don't define __smp_mb__before_atomic/__smp_mb__after_atomic to make them use memw. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-05-07 10:36:31 -07:00
David S. Miller	09934b0363	Merge branch 'net-dsa-lantiq-Add-bridge-offloading' Hauke Mehrtens says: ==================== net: dsa: lantiq: Add bridge offloading This adds bridge offloading for the Intel / Lantiq GSWIP 2.1 switch. Changes since: v2: - Added Fixes tag to patch 1 - Fixed typo - added GSWIP_TABLE_MAC_BRIDGE_STATIC and made use of it - used GSWIP_TABLE_MAC_BRIDGE in more places v1: - fix typo signle -> single ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:34:45 -07:00
Hauke Mehrtens	58c59ef9e9	net: dsa: lantiq: Add Forwarding Database access This adds functions to add and remove static entries to and from the forwarding database and dump the full forwarding database. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:34:45 -07:00
Hauke Mehrtens	4581348199	net: dsa: lantiq: Add fast age function Fast aging per port is not supported directly by the hardware, it is only possible to configure a global aging time. Do the fast aging by iterating over the MAC forwarding table and remove all dynamic entries for a given port. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:34:45 -07:00
Hauke Mehrtens	9bbb1c053b	net: dsa: lantiq: Add VLAN aware bridge offloading The VLAN aware bridge offloading is similar to the VLAN unaware offloading, this makes it possible to offload the VLAN bridge functionalities. The hardware supports up to 64 VLAN bridge entries, we already use one entry for each LAN port to prevent forwarding of packets between the ports when the ports are not in a bridge, so in the end we have 57 possible VLANs. The VLAN filtering is currently only active when the ports are in a bridge, VLAN filtering for ports not in a bridge is not implemented. It is currently not possible to change between VLAN filtering and not filtering while the port is already in a bridge, this would make the driver more complicated. The VLANs are only defined on bridge entries, so we will not add anything into the hardware when the port joins a bridge if it is doing VLAN filtering, but only when an allowed VLAN is added. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:34:45 -07:00
Hauke Mehrtens	8206e0ce96	net: dsa: lantiq: Add VLAN unaware bridge offloading This allows to offload bridges with DSA to the switch hardware and do the packet forwarding in hardware. This implements generic functions to access the switch hardware tables, which are used to control many features of the switch. This patch activates the MAC learning by removing the MAC address table lock, to prevent uncontrolled forwarding of packets between all the LAN ports, they are added into individual bridge tables entries with individual flow ids and the switch will do the MAC learning for each port separately before they are added to a real bridge. Each bridge consist of an entry in the active VLAN table and the VLAN mapping table, table entries with the same index are matching. In the VLAN unaware mode we configure everything with VLAN ID 0, but we use different flow IDs, the switch should handle all VLANs as normal payload and ignore them. When the hardware looks for the port of the destination MAC address it only takes the entries which have the same flow ID of the ingress packet. The bridges are configured with 64 possible entries with these information: Table Index, 0...63 VLAN ID, 0...4095: VLAN ID 0 is untagged flow ID, 0..63: Same flow IDs share entries in MAC learning table port map, one bit for each port number tagged port map, one bit for each port number Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:34:44 -07:00
Hauke Mehrtens	30d8938384	net: dsa: lantiq: Allow special tags only on CPU port Allow the special tag in ingress only on the CPU port and not on all ports. A packet with a special tag could circumvent the hardware forwarding and should only be allowed on the CPU port where Linux controls the port. Fixes: `14fceff477` ("net: dsa: Add Lantiq / Intel DSA driver for vrx200)" Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 10:34:44 -07:00
Max Filippov	d065fcf12c	xtensa: clean up inline assembly in futex.h Replace numeric inline assembly parameters with named parameters. Drop unused parameters from the futex_atomic_cmpxchg_inatomic. Use new temporary variable to hold target address in the fixup code in futex_atomic_cmpxchg_inatomic. Conditionalize function bodies so that only 'return -ENOSYS' is left in configurations without futex support. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-05-07 10:30:39 -07:00
Linus Torvalds	8ff468c29e	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU state handling updates from Borislav Petkov: "This contains work started by Rik van Riel and brought to fruition by Sebastian Andrzej Siewior with the main goal to optimize when to load FPU registers: only when returning to userspace and not on every context switch (while the task remains in the kernel). In addition, this optimization makes kernel_fpu_begin() cheaper by requiring registers saving only on the first invocation and skipping that in following ones. What is more, this series cleans up and streamlines many aspects of the already complex FPU code, hopefully making it more palatable for future improvements and simplifications. Finally, there's a __user annotations fix from Jann Horn" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails x86/pkeys: Add PKRU value to init_fpstate x86/fpu: Restore regs in copy_fpstate_to_sigframe() in order to use the fastpath x86/fpu: Add a fastpath to copy_fpstate_to_sigframe() x86/fpu: Add a fastpath to __fpu__restore_sig() x86/fpu: Defer FPU state load until return to userspace x86/fpu: Merge the two code paths in __fpu__restore_sig() x86/fpu: Restore from kernel memory on the 64-bit path too x86/fpu: Inline copy_user_to_fpregs_zeroing() x86/fpu: Update xstate's PKRU value on write_pkru() x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD x86/fpu: Always store the registers in copy_fpstate_to_sigframe() x86/entry: Add TIF_NEED_FPU_LOAD x86/fpu: Eager switch PKRU state x86/pkeys: Don't check if PKRU is zero before writing it x86/fpu: Only write PKRU if it is different from current x86/pkeys: Provide *pkru() helpers x86/fpu: Use a feature number instead of mask in two more helpers x86/fpu: Make __raw_xsave_addr() use a feature number instead of mask x86/fpu: Add an __fpregs_load_activate() internal helper ...	2019-05-07 10:24:10 -07:00
Parav Pandit	405ecbf72f	vfio/mdev: Avoid inline get and put parent helpers As section 15 of Documentation/process/coding-style.rst clearly describes that compiler will be able to optimize code. Hence drop inline for get and put helpers for parent. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Parav Pandit	6093e348a5	vfio/mdev: Fix aborting mdev child device removal if one fails device_for_each_child() stops executing callback function for remaining child devices, if callback hits an error. Each child mdev device is independent of each other. While unregistering parent device, mdev core must remove all child mdev devices. Therefore, mdev_device_remove_cb() always returns success so that device_for_each_child doesn't abort if one child removal hits error. While at it, improve remove and unregister functions for below simplicity. There isn't need to pass forced flag pointer during mdev parent removal which invokes mdev_device_remove(). So simplify the flow. mdev_device_remove() is called from two paths. 1. mdev_unregister_driver() mdev_device_remove_cb() mdev_device_remove() 2. remove_store() mdev_device_remove() Fixes: `7b96953bc6` ("vfio: Mediated device Core driver") Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Parav Pandit	a6d6f4f160	vfio/mdev: Follow correct remove sequence mdev_remove_sysfs_files() should follow exact mirror sequence of a create, similar to what is followed in error unwinding path of mdev_create_sysfs_files(). Fixes: `6a62c1dfb5` ("vfio/mdev: Re-order sysfs attribute creation") Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Parav Pandit	d300046350	vfio/mdev: Avoid masking error code to EBUSY Instead of masking return error to -EBUSY, return actual error returned by the driver. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Parav Pandit	50732af3b6	vfio/mdev: Drop redundant extern for exported symbols There is no need use 'extern' for exported functions. Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Parav Pandit	f707d837b6	vfio/mdev: Removed unused kref Remove unused kref from the mdev_device structure. Fixes: `7b96953bc6` ("vfio: Mediated device Core driver") Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Parav Pandit	60e7f2c3fe	vfio/mdev: Avoid release parent reference during error path During mdev parent registration in mdev_register_device(), if parent device is duplicate, it releases the reference of existing parent device. This is incorrect. Existing parent device should not be touched. Fixes: `7b96953bc6` ("vfio: Mediated device Core driver") Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-05-07 11:23:13 -06:00
Jeff Layton	5ddc61fc14	ceph: print inode number in __caps_issued_mask debugging messages To make it easier to correlate with MDS logs. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:38 +02:00
Jeff Layton	488f5284e2	ceph: just call get_session in __ceph_lookup_mds_session I originally thought there was a potential race here, but the fact that this is called with the mdsc->mutex held, ensures that the last reference to the session can't be put here. Still, it's clearer to just return the value from get_session here, and may prevent a bug later if we ever rework this code to be less reliant on mutexes. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:38 +02:00
Jeff Layton	1199d7da2d	ceph: simplify arguments and return semantics of try_get_cap_refs The return of this function is rather complex. It can return 0 or 1, and in the case of a 1 return, the "err" pointer will be filled out. This necessitates a lot of copying of values. We can achieve the same effect by just returning 0, 1 or a negative error code, and drop the "err" argument from this function. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:38 +02:00
Jeff Layton	a452bc0636	ceph: fix comment over ceph_drop_caps_for_unlink It's not clear what AUTH_RDCACHE means in this context, and we're clearly just dropping LINK caps here. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:38 +02:00
Jeff Layton	8340f22ce5	ceph: move wait for mds request into helper function Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:38 +02:00
Jeff Layton	86bda539fa	ceph: have ceph_mdsc_do_request call ceph_mdsc_submit_request Nothing calls ceph_mdsc_submit_request today, but in later patches we'll need to be able to call this separately. Have the helper return an int so we can check the r_err under the mutex, and have the caller just check the error code from the submit. Also move the acquisition of CEPH_CAP_PIN references into the same function. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	111c708104	ceph: after an MDS request, do callback and completions No MDS requests use r_callback today, but that will change in the future. The OSD client always does r_callback and then completes r_completion. Let's have the MDS client do the same. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	c1dfc27723	ceph: use pathlen values returned by set_request_path_attr We make copies of the dentry name in set_request_path_attr, but then create_request_message re-fetches the lengths out of the dentry. While we don't currently set the *_drop fields unless the parents are locked, it's still better not to rely on that sort of implicit assumption. Use the pathlen values that set_request_path_attr returned instead, as they will always be correct for the returned paths themselves. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	f77f21bb28	ceph: use __getname/__putname in ceph_mdsc_build_path Al suggested we get rid of the kmalloc here and just use __getname and __putname to get a full PATH_MAX pathname buffer. Since we build the path in reverse, we continue to return a pointer to the beginning of the string and the length, and add a new helper to free the thing at the end. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	964fff7491	ceph: use ceph_mdsc_build_path instead of clone_dentry_name While it may be slightly more efficient, it's probably not worthwhile to optimize for the case that clone_dentry_name handles. We can get the same result by just calling ceph_mdsc_build_path when the parent isn't locked, with less code duplication. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	69a10fb3f4	ceph: fix potential use-after-free in ceph_mdsc_build_path temp is not defined outside of the RCU critical section here. Ensure we grab that value before we drop the rcu_read_lock. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	ff4a80bf2d	ceph: dump granular cap info in "caps" debugfs file We have a "caps" file already that gives statistics on the caps cache as a whole. Add another section to that output and dump a line for each individual cap record. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	f5d7726900	ceph: make iterate_session_caps a public symbol Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	40e7e2c0e8	ceph: fix NULL pointer deref when debugging is enabled Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	428bb68ad9	ceph: properly handle granular statx requests cephfs can benefit from statx. We can have the client just request caps sufficient for the needed attributes and leave off the rest. Also, recognize when AT_STATX_DONT_SYNC is set, and just scrape the inode without doing any call in that case. Force a call to the MDS in the event that AT_STATX_FORCE_SYNC is set. Link: http://tracker.ceph.com/issues/39258 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Jeff Layton	ffb61c55b2	ceph: remove superfluous inode_lock in ceph_fsync Originally, filemap_write_and_wait took the i_mutex internally, but commit `02c24a8218` pushed the mutex acquisition into the individual fsync routines, leaving it up to the subsystem maintainers to remove it if it wasn't needed. For ceph, I see no reason to take the inode_lock here. All of the operations inside that lock are protected by their own locking. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Arnd Bergmann	0384892c2d	libceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK clang complains about assigning a variable to itself during the declaration: fs/ceph/ioctl.c:187:26: error: variable 'oid' is uninitialized when used within its own initialization [-Werror,-Wuninitialized] CEPH_DEFINE_OID_ONSTACK(oid); ^~~ include/linux/ceph/osdmap.h:122:52: note: expanded from macro 'CEPH_DEFINE_OID_ONSTACK' struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid) ~~~ ^~~ include/linux/ceph/osdmap.h:120:29: note: expanded from macro 'CEPH_OID_INIT_ONSTACK' ({ ceph_oid_init(&oid); oid; }) ^~~ We use this trick in other places, but it is completely unnecessary here, as we can just use a regular struct initializer. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Arnd Bergmann	1680937266	rbd: convert all rbd_assert(0) to BUG() rbd_assert(0) has caused different issues depending on the compiler version in the past, so it seems better to avoid it completely. Replace the remaining instances. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:37 +02:00
Arnd Bergmann	d342a15b1e	rbd: avoid clang -Wuninitialized warning clang fails to see that rbd_assert(0) ends in an unreachable code path and warns about a subsequent use of an uninitialized variable when CONFIG_PROFILE_ANNOTATED_BRANCHES is set: drivers/block/rbd.c:2402:4: error: variable 'ret' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] rbd_assert(0); ^~~~~~~~~~~~~ drivers/block/rbd.c:563:7: note: expanded from macro 'rbd_assert' if (unlikely(!(expr))) { \ ^~~~~~~~~~~~~~~~~ include/linux/compiler.h:48:23: note: expanded from macro 'unlikely' # define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x))) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/block/rbd.c:2410:6: note: uninitialized use occurs here if (ret) { ^~~ drivers/block/rbd.c:2402:4: note: remove the 'if' if its condition is always true rbd_assert(0); ^ drivers/block/rbd.c:563:3: note: expanded from macro 'rbd_assert' if (unlikely(!(expr))) { \ ^ drivers/block/rbd.c:2376:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 1 error generated. This seems to be a bug in clang, but is easy to work around by using an unconditional BUG(). Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:36 +02:00
Yan, Zheng	570df4e9c2	ceph: snapshot nfs re-export To support snapshot nfs re-export, we need a way to lookup snapped inode by file handle. For directory inode, snapped metadata are always stored together with head inode. Client just need to pass vinodeno_t to MDS. For non-directory inode, there can be multiple version of snapped inodes and they can be stored in different dirfrags. Besides vinodeno_t, client also need to tell mds from which dirfrag it got the snapped inode. Another problem of supporting snapshot nfs re-export is that there can be multiple paths to access a snapped inode. For example: mkdir -p d1/d2/d3 mkdir d1/.snap/s1 Paths 'd1/.snap/s1/d2/d3', 'd1/d2/.snap/_s1_<inode number of d1>/d3' and 'd1/d2/d3/.snap/_s1_<inode number of d1>' are all reference to the same snapped inode. For a given snapped inode, There is no convenient way to get the first form and the second form paths. For simplicity, ceph_get_parent() return snapdir for snapped directory inode. Furthermore, client may access snapshot of deleted directory. For example: mkdir -p d1/d2 mkdir d1/.snap/s1 open d1/.snap/s1/d2 rm -rf d1/d2 <nfs server restart> The path constucted by ceph_get_parent() and ceph_get_name() is '<inode of d2>/.snap/_s1_<inode number of d1>'. Futher lookup parent of <inode of d2> will fail. To workaround this case, this patch uses d_obtain_root() to get dentry for snapdir of deleted directory. snapdir dentry has no DCACHE_DISCONNECTED flag set, reconnect_path() stops when it reaches snapdir dentry. Link: http://tracker.ceph.com/issues/22105 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:36 +02:00
Luis Henriques	0c44a8e0fc	ceph: quota: fix quota subdir mounts The CephFS kernel client does not enforce quotas set in a directory that isn't visible from the mount point. For example, given the path '/dir1/dir2', if quotas are set in 'dir1' and the filesystem is mounted with mount -t ceph <server>:<port>:/dir1/ /mnt then the client won't be able to access 'dir1' inode, even if 'dir2' belongs to a quota realm that points to it. This patch fixes this issue by simply doing an MDS LOOKUPINO operation for unknown inodes. Any inode reference obtained this way will be added to a list in ceph_mds_client, and will only be released when the filesystem is umounted. Link: https://tracker.ceph.com/issues/38482 Reported-by: Hendrik Peyerl <hpeyerl@plusline.net> Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:36 +02:00
Luis Henriques	3886274adf	ceph: factor out ceph_lookup_inode() This function will be used by __fh_to_dentry and by the quotas code, to find quota realm inodes that are not visible in the mountpoint. Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:36 +02:00
Zhi Zhang	1b52931ca9	ceph: remove duplicated filelock ref increase Inode i_filelock_ref is increased in ceph_lock or ceph_flock, but it is increased again in ceph_lock_message. This results in this ref won't become zero. If CEPH_I_ERROR_FILELOCK flag is set in remove_session_caps once, this flag can't be cleared even if client is back to normal. So further file lock will return EIO. Signed-off-by: Zhi Zhang <zhang.david2011@gmail.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2019-05-07 19:22:36 +02:00
Linus Torvalds	68253e718c	Minor updates to ktest.pl - Handle meta characters in grub memu - Use configurable reboot return code for handling ssh reboots - Display names and iteration number on error message -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXNBwDBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qquvAQCRNDWDS0+w2bh8X2eKVIbn6OAc+r0b IQsNZ0Ytk34lCwEA6PmkROmYLKH+p5Hv7Ohz1pvABcWxAyEZZ+lG00IFYwQ= =y0LU -----END PGP SIGNATURE----- Merge tag 'ktest-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest Pull ktest updates from Steven Rostedt: "Minor updates to ktest.pl - Handle meta characters in grub memu - Use configurable reboot return code for handling ssh reboots - Display names and iteration number on error message" * tag 'ktest-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest: introduce REBOOT_RETURN_CODE to confirm the result of REBOOT ktest: Add support for meta characters in GRUB_MENU ktest: Show name and iteration on errors	2019-05-07 10:18:57 -07:00
David S. Miller	14cfbdac66	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2019-05-06 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Two AF_XDP libbpf fixes for socket teardown; first one an invalid munmap and the other one an invalid skmap cleanup, both from Björn. 2) More graceful CONFIG_DEBUG_INFO_BTF handling when pahole is not present in the system to generate vmlinux btf info, from Andrii. 3) Fix libbpf and thus fix perf build error with uClibc on arc architecture, from Vineet. 4) Fix missing libbpf_util.h header install in libbpf, from William. 5) Exclude bash-completion/bpftool from .gitignore pattern, from Masahiro. 6) Fix up rlimit in test_libbpf_open kselftest test case, from Yonghong. 7) Minor misc cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-07 09:29:16 -07:00

... 36 37 38 39 40 ...

840049 Commits