linux

Author	SHA1	Message	Date
Linus Torvalds	1ed2d76e02	Merge branch 'work.sock_recvmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull kern_recvmsg reduction from Al Viro: "kernel_recvmsg() is a set_fs()-using wrapper for sock_recvmsg(). In all but one case that is not needed - use of ITER_KVEC for ->msg_iter takes care of the data and does not care about set_fs(). The only exception is svc_udp_recvfrom() where we want cmsg to be stored into kernel object; everything else can just use sock_recvmsg() and be done with that. A followup converting svc_udp_recvfrom() away from set_fs() (and killing kernel_recvmsg() off) is NOT in here - I'd like to hear what netdev folks think of the approach proposed in that followup)" * 'work.sock_recvmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: tipc: switch to sock_recvmsg() smc: switch to sock_recvmsg() ipvs: switch to sock_recvmsg() mISDN: switch to sock_recvmsg() drbd: switch to sock_recvmsg() lustre lnet_sock_read(): switch to sock_recvmsg() ocfs2: switch to sock_recvmsg() ncpfs: switch to sock_recvmsg() dlm: switch to sock_recvmsg() svc_recvfrom(): switch to sock_recvmsg()	2018-01-30 18:59:03 -08:00
Al Viro	c8c7840ea9	dlm: switch to sock_recvmsg() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-12-02 20:37:47 -05:00
Al Viro	076ccb76e1	fs: annotate ->poll() instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-11-27 16:20:05 -05:00
Linus Torvalds	abc36be236	Merge tag 'configfs-for-4.15' of git://git.infradead.org/users/hch/configfs Pull configfs updates from Christoph Hellwig: "A couple of configfs cleanups: - proper use of the bool type (Thomas Meyer) - constification of struct config_item_type (Bhumika Goyal)" * tag 'configfs-for-4.15' of git://git.infradead.org/users/hch/configfs: RDMA/cma: make config_item_type const stm class: make config_item_type const ACPI: configfs: make config_item_type const nvmet: make config_item_type const usb: gadget: configfs: make config_item_type const PCI: endpoint: make config_item_type const iio: make function argument and some structures const usb: gadget: make config_item_type structures const dlm: make config_item_type const netconsole: make config_item_type const nullb: make config_item_type const ocfs2/cluster: make config_item_type const target: make config_item_type const configfs: make ci_type field, some pointers and function arguments const configfs: make config_item_type const configfs: Fix bool initialization/comparison	2017-11-14 14:44:04 -08:00
Linus Torvalds	f0b60bfa95	Merge tag 'dlm-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set focuses, as usual, on fixes to the comms layer. New testing of the dlm with ocfs2 uncovered a number of bugs in the TCP connection handling during recovery, starting, and stopping" * tag 'dlm-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: remove dlm_send_rcom_lookup_dump dlm: recheck kthread_should_stop() before schedule() DLM: fix NULL pointer dereference in send_to_sock() DLM: fix to reschedule rwork DLM: fix to use sk_callback_lock correctly DLM: fix overflow dlm_cb_seq DLM: fix memory leak in tcp_accept_from_sock() DLM: fix conversion deadlock when DLM_LKF_NODLCKWT flag is set DLM: use CF_CLOSE flag to stop dlm_send correctly DLM: Reanimate CF_WRITE_PENDING flag DLM: fix race condition between dlm_recoverd_stop and dlm_recoverd DLM: close othercon at send/receive error DLM: retry rcom when dlm_wait_function is timed out. DLM: fix to use sock_mutex correctly in xxx_accept_from_sock DLM: fix race condition between dlm_send and dlm_recv DLM: fix double list_del() DLM: fix remove save_cb argument from add_sock() DLM: Fix saving of NULL callbacks DLM: Eliminate CF_WRITE_PENDING flag DLM: Eliminate CF_CONNECT_PENDING flag	2017-11-14 14:06:51 -08:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. 
- Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). 
- when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. 
Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Bhumika Goyal	761594b741	dlm: make config_item_type const Make config_item_type structures const as they are either passed to a function having the argument as const or stored in the const "ci_type" field of a config_item structure. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-10-19 16:15:22 +02:00
David Teigland	9250e52359	dlm: remove dlm_send_rcom_lookup_dump This function was only for debugging. It would be called in a condition that should not happen, and should probably have been removed from the final version of the original commit. Remove it because it does mutex lock under spin lock. Signed-off-by: David Teigland <teigland@redhat.com>	2017-10-09 09:29:31 -05:00
Guoqing Jiang	9e1b0211c5	dlm: recheck kthread_should_stop() before schedule() Calling schedule() here could make the thread miss the wakeup from kthread_stop(), so it is better to recheck kthread_should_stop() before calling schedule(). The symptom appeared when I ran an indefinite test of clustered raid (which mostly created a clustered raid1, assembled it on other nodes, then stopped them). $ ps aux|grep md|grep D root 4211 0.0 0.0 19760 2220 ? Ds 02:58 0:00 mdadm -Ssq $ cat /proc/4211/stack kthread_stop+0x4d/0x150 dlm_recoverd_stop+0x15/0x20 [dlm] dlm_release_lockspace+0x2ab/0x460 [dlm] leave+0xbf/0x150 [md_cluster] md_cluster_stop+0x18/0x30 [md_mod] bitmap_free+0x12e/0x140 [md_mod] bitmap_destroy+0x7f/0x90 [md_mod] __md_stop+0x21/0xa0 [md_mod] do_md_stop+0x15f/0x5c0 [md_mod] md_ioctl+0xa65/0x18a0 [md_mod] blkdev_ioctl+0x49e/0x8d0 block_ioctl+0x41/0x50 do_vfs_ioctl+0x96/0x5b0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x1e/0xad This may not resolve the issue completely, since the KTHREAD_SHOULD_STOP flag could be set between "break" and "schedule", but at least the chance of the symptom occurring is reduced a lot (the indefinite test ran for more than 20 hours without a problem, whereas the symptom happens easily without the change). Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:48:10 -05:00
tsutomu.owa@toshiba.co.jp	26b41099e7	DLM: fix NULL pointer dereference in send_to_sock() The writequeue and writequeue_lock members of othercon were not initialized. If lowcomms_state_change() is called from the network layer, othercon->swork may be scheduled. In this case, send_to_sock() will dereference a NULL pointer. We avoid this problem by correctly initializing the writequeue and writequeue_lock members of othercon. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	0aa18464c8	DLM: fix to reschedule rwork When an error occurs in kernel_recvmsg or kernel_sendpage, close_connection is called, and if receive work is already scheduled, that receive work is canceled. In that case, the receive work will never be scheduled again after reconnection, because the CF_READ_PENDING flag remains set. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	93eaadebe9	DLM: fix to use sk_callback_lock correctly In the current implementation, mutual exclusion was insufficient between the code that sets the callback functions in the connection structure and the callback functions that refer to the connection structure. We fix this. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	ccbbea0432	DLM: fix overflow dlm_cb_seq dlm_cb_seq is 64 bits. If dlm_cb_seq overflows and returns to 0, dlm_rem_lkb_callback() will not work properly. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	3421fb15be	DLM: fix memory leak in tcp_accept_from_sock() The sk member of the socket generated by sock_create_kern() is overwritten by ops->accept(). So the previous sk will not be released. We use kernel_accept() instead of sock_create_kern() and ops->accept(). Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	294e7e4587	DLM: fix conversion deadlock when DLM_LKF_NODLCKWT flag is set When the DLM_LKF_NODLCKWT flag was set, the caller of can_be_granted() was not informed even if a conversion deadlock was detected. We change can_be_granted() to detect conversion deadlock regardless of whether the DLM_LKF_NODLCKWT flag is set, and change the behavior in the caller of can_be_granted() depending on whether the flag is set. This fix has no effect except when the DLM_LKF_NODLCKWT flag is used. Currently, ocfs2 uses the DLM_LKF_NODLCKWT flag and does not expect a cancel operation from conversion deadlock when calling dlm_lock(). ocfs2 is implemented to perform a cancel operation by requesting BASTs (callback). Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	173a31fe2b	DLM: use CF_CLOSE flag to stop dlm_send correctly If reconnection fails while executing dlm_lowcomms_stop, dlm_send will not stop. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	8a4abb0819	DLM: Reanimate CF_WRITE_PENDING flag CF_WRITE_PENDING flag has been reanimated to make dlm_send stop properly when running dlm_lowcomms_stop. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	e412f9201d	DLM: fix race condition between dlm_recoverd_stop and dlm_recoverd When dlm_recoverd_stop() is called between kthread_should_stop() and set_task_state(TASK_INTERRUPTIBLE), dlm_recoverd will not wake up. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	c553e173b0	DLM: close othercon at send/receive error If an error occurs while sending or receiving and othercon exists, sending or receiving through othercon may also result in an error. We fix this by closing othercon in advance as well. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	5966121241	DLM: retry rcom when dlm_wait_function is timed out. If a node sends a DLM_RCOM_STATUS command and an error occurs on the receiving side, the DLM_RCOM_STATUS_REPLY response may not be returned. We now retransmit the DLM_RCOM_STATUS command so that we do not wait forever for a response. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	c7355827b2	DLM: fix to use sock_mutex correctly in xxx_accept_from_sock In the current implementation, mutual exclusion for othercon in tcp_accept_from_sock() and sctp_accept_from_sock() was insufficient. We fix this. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	b2a6662932	DLM: fix race condition between dlm_send and dlm_recv When kernel_sendpage (in send_to_sock) and kernel_recvmsg (in receive_from_sock) return errors, close_connection may run in both paths at the same time. In that case, they may wait for each other in cancel_work_sync. Signed-off-by: Tadashi Miyauchi <miayuchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	f0fb83cb92	DLM: fix double list_del() dlm_lowcomms_stop() was not functioning properly. Correctly, we have to wait until all processing is finished with send_workqueue and recv_workqueue. This problem causes the following issue. Senario is 1. dlm_send thread: send_to_sock refers con->writequeue 2. main thread: dlm_lowcomms_stop calls list_del 3. dlm_send thread: send_to_sock calls list_del in writequeue_entry_complete [ 1925.770305] dlm: canceled swork for node 4 [ 1925.772374] general protection fault: 0000 [#1] SMP [ 1925.777930] Modules linked in: ocfs2_stack_user ocfs2 ocfs2_nodemanager ocfs2_stackglue dlm fmxnet(O) fmx_api(O) fmx_cu(O) igb(O) kvm_intel kvm irqbypass autofs4 [ 1925.794131] CPU: 3 PID: 6994 Comm: kworker/u8:0 Tainted: G O 4.4.39 #1 [ 1925.802684] Hardware name: TOSHIBA OX/OX, BIOS OX-P0015 12/03/2015 [ 1925.809595] Workqueue: dlm_send process_send_sockets [dlm] [ 1925.815714] task: ffff8804398d3c00 ti: ffff88046910c000 task.ti: ffff88046910c000 [ 1925.824072] RIP: 0010:[<ffffffffa04bd158>] [<ffffffffa04bd158>] process_send_sockets+0xf8/0x280 [dlm] [ 1925.834480] RSP: 0018:ffff88046910fde0 EFLAGS: 00010246 [ 1925.840411] RAX: dead000000000200 RBX: 0000000000000001 RCX: 000000000000000a [ 1925.848372] RDX: ffff88046bd980c0 RSI: 0000000000000000 RDI: ffff8804673c5670 [ 1925.856341] RBP: ffff88046910fe20 R08: 00000000000000c9 R09: 0000000000000010 [ 1925.864311] R10: ffffffff81e22fc0 R11: 0000000000000000 R12: ffff8804673c56d8 [ 1925.872281] R13: ffff8804673c5660 R14: ffff88046bd98440 R15: 0000000000000058 [ 1925.880251] FS: 0000000000000000(0000) GS:ffff88047fd80000(0000) knlGS:0000000000000000 [ 1925.889280] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1925.895694] CR2: 00007fff09eadf58 CR3: 00000004690f5000 CR4: 00000000001006e0 [ 1925.903663] Stack: [ 1925.905903] ffff8804673c5630 ffff8804673c5620 ffff8804673c5670 ffff88007d219b40 [ 1925.914181] ffff88046f095800 0000000000000100 ffff8800717a1400 ffff8804673c56d8 [ 
1925.922459] ffff88046910fe60 ffffffff81073db2 00ff880400000000 ffff88007d219b40 [ 1925.930736] Call Trace: [ 1925.933468] [<ffffffff81073db2>] process_one_work+0x162/0x450 [ 1925.939983] [<ffffffff81074459>] worker_thread+0x69/0x4a0 [ 1925.946109] [<ffffffff810743f0>] ? rescuer_thread+0x350/0x350 [ 1925.952622] [<ffffffff8107956f>] kthread+0xef/0x110 [ 1925.958165] [<ffffffff81079480>] ? kthread_park+0x60/0x60 [ 1925.964283] [<ffffffff8186ab2f>] ret_from_fork+0x3f/0x70 [ 1925.970312] [<ffffffff81079480>] ? kthread_park+0x60/0x60 [ 1925.976436] Code: 01 00 00 48 8b 7d d0 e8 07 d3 3a e1 45 01 7e 18 45 29 7e 1c 75 ab 41 8b 46 24 85 c0 75 a3 49 8b 16 49 8b 46 08 31 f6 48 89 42 08 <48> 89 10 48 b8 00 01 00 00 00 00 ad de 49 8b 7e 10 49 89 06 66 [ 1925.997791] RIP [<ffffffffa04bd158>] process_send_sockets+0xf8/0x280 [dlm] [ 1926.005577] RSP <ffff88046910fde0> Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	988419a9de	DLM: fix remove save_cb argument from add_sock() The save_cb argument is not used, so we remove it. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
Bob Peterson	cc661fc934	DLM: Fix saving of NULL callbacks In a previous patch I noted that accept() often copies the struct sock (sk) which overwrites the sock callbacks. However, in testing we discovered that the dlm connection structures (con) are sometimes deleted and recreated as connections come and go, and since they're zeroed out by kmem_cache_zalloc, the saved callback pointers are also initialized to zero. But with today's DLM code, the callbacks are only saved when a socket is added. During recovery testing, we discovered a common situation in which the new con is initialized to zero, then a socket is added after accept(). In this case, the sock's saved values are all NULL, but the saved values are wiped out, due to accept(). Therefore, we don't have a known good copy of the callbacks from which we can restore. Since the struct sock callbacks are always good after listen(), this patch saves the known good values after listen(). These good values are then used for subsequent restores. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
Bob Peterson	01da24d3fb	DLM: Eliminate CF_WRITE_PENDING flag Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
Bob Peterson	61d9102b62	DLM: Eliminate CF_CONNECT_PENDING flag Before this patch, there was a flag in the con structure that was used to determine whether or not a connect was needed. The bit was set here and there, and cleared here and there, so it left some race conditions: the bit was set, work was queued, then the worker cleared the bit, allowing someone else to set it while the worker ran. For the most part, this worked okay, but we got into trouble if connections were lost and it needed to reconnect. This patch eliminates the flag in favor of simply checking if we actually have a sock pointer while protected by the mutex. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
Linus Torvalds	066dea8c30	Merge tag 'locks-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "This pile just has a few file locking fixes from Ben Coddington. There are a couple of cleanup patches + an attempt to bring sanity to the l_pid value that is reported back to userland on an F_GETLK request. After a few gyrations, he came up with a way for filesystems to communicate to the VFS layer code whether the pid should be translated according to the namespace or presented as-is to userland" * tag 'locks-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: restore a warn for leaked locks on close fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks fs/locks: Use allocation rather than the stack in fcntl_getlk()	2017-09-06 13:43:26 -07:00
Guoqing Jiang	1c24285372	dlm: use sock_create_lite inside tcp_accept_from_sock With commit `0ffdaf5b41` ("net/sock: add WARN_ON(parent->sk) in sock_graft()"), the following call trace appeared: [ 457.018340] WARNING: CPU: 0 PID: 15623 at ./include/net/sock.h:1703 inet_accept+0x135/0x140 ... [ 457.018381] RIP: 0010:inet_accept+0x135/0x140 [ 457.018381] RSP: 0018:ffffc90001727d18 EFLAGS: 00010286 [ 457.018383] RAX: 0000000000000001 RBX: ffff880012413000 RCX: 0000000000000001 [ 457.018384] RDX: 000000000000018a RSI: 00000000fffffe01 RDI: ffffffff8156fae8 [ 457.018384] RBP: ffffc90001727d38 R08: 0000000000000000 R09: 0000000000004305 [ 457.018385] R10: 0000000000000001 R11: 0000000000004304 R12: ffff880035ae7a00 [ 457.018386] R13: ffff88001282af10 R14: ffff880034e4e200 R15: 0000000000000000 [ 457.018387] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 457.018388] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 457.018389] CR2: 00007fdec22f9000 CR3: 0000000002b5a000 CR4: 00000000000006f0 [ 457.018395] Call Trace: [ 457.018402] tcp_accept_from_sock.part.8+0x12d/0x449 [dlm] [ 457.018405] ? vprintk_emit+0x248/0x2d0 [ 457.018409] tcp_accept_from_sock+0x3f/0x50 [dlm] [ 457.018413] process_recv_sockets+0x3b/0x50 [dlm] [ 457.018415] process_one_work+0x138/0x370 [ 457.018417] worker_thread+0x4d/0x3b0 [ 457.018419] kthread+0x109/0x140 [ 457.018421] ? rescuer_thread+0x320/0x320 [ 457.018422] ? kthread_park+0x60/0x60 [ 457.018424] ret_from_fork+0x25/0x30 The new socket created by sock_create_kern gets its sk set along the path: sock_create_kern -> __sock_create -> pf->create => inet_create -> sock_init_data. WARN_ON is then triggered by "con->sock->ops->accept => inet_accept -> sock_graft", which also means newsock->sk is leaked, since sock_graft will replace it with a new sk. To resolve the issue, we need to use sock_create_lite instead of sock_create_kern, like commit `0933a578cd` ("rds: tcp: use sock_create_lite() to create the accept socket") did. Reported-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Edwin Török	55acdd926f	dlm: avoid double-free on error path in dlm_device_{register,unregister} Can be reproduced when running dlm_controld (tested on 4.4.x, 4.12.4): # seq 1 100 \| xargs -P0 -n1 dlm_tool join # seq 1 100 \| xargs -P0 -n1 dlm_tool leave misc_register fails due to duplicate sysfs entry, which causes dlm_device_register to free ls->ls_device.name. In dlm_device_deregister the name was freed again, causing memory corruption. According to the comment in dlm_device_deregister the name should've been set to NULL when registration fails, so this patch does that. sysfs: cannot create duplicate filename '/dev/char/10:1' ------------[ cut here ]------------ warning: cpu: 1 pid: 4450 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x56/0x70 modules linked in: msr rfcomm dlm ccm bnep dm_crypt uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev btusb media btrtl btbcm btintel bluetooth ecdh_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_hdmi irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel thinkpad_acpi pcbc nvram snd_seq_midi snd_seq_midi_event aesni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_hda_intel snd_hda_codec cryptd intel_cstate arc4 snd_hda_core snd_seq snd_seq_device snd_hwdep iwldvm intel_rapl_perf mac80211 joydev input_leds iwlwifi serio_raw cfg80211 snd_pcm shpchp snd_timer snd mac_hid mei_me lpc_ich mei soundcore sunrpc parport_pc ppdev lp parport autofs4 i915 psmouse e1000e ahci libahci i2c_algo_bit sdhci_pci ptp drm_kms_helper sdhci pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops drm wmi video cpu: 1 pid: 4450 comm: dlm_test.exe not tainted 4.12.4-041204-generic hardware name: lenovo 232425u/232425u, bios g2et82ww (2.02 ) 09/11/2012 task: ffff96b0cbabe140 task.stack: ffffb199027d0000 rip: 0010:sysfs_warn_dup+0x56/0x70 rsp: 0018:ffffb199027d3c58 eflags: 00010282 rax: 0000000000000038 rbx: ffff96b0e2c49158 
rcx: 0000000000000006 rdx: 0000000000000000 rsi: 0000000000000086 rdi: ffff96b15e24dcc0 rbp: ffffb199027d3c70 r08: 0000000000000001 r09: 0000000000000721 r10: ffffb199027d3c00 r11: 0000000000000721 r12: ffffb199027d3cd1 r13: ffff96b1592088f0 r14: 0000000000000001 r15: ffffffffffffffef fs: 00007f78069c0700(0000) gs:ffff96b15e240000(0000) knlgs:0000000000000000 cs: 0010 ds: 0000 es: 0000 cr0: 0000000080050033 cr2: 000000178625ed28 cr3: 0000000091d3e000 cr4: 00000000001406e0 call trace: sysfs_do_create_link_sd.isra.2+0x9e/0xb0 sysfs_create_link+0x25/0x40 device_add+0x5a9/0x640 device_create_groups_vargs+0xe0/0xf0 device_create_with_groups+0x3f/0x60 ? snprintf+0x45/0x70 misc_register+0x140/0x180 device_write+0x6a8/0x790 [dlm] __vfs_write+0x37/0x160 ? apparmor_file_permission+0x1a/0x20 ? security_file_permission+0x3b/0xc0 vfs_write+0xb5/0x1a0 sys_write+0x55/0xc0 ? sys_fcntl+0x5d/0xb0 entry_syscall_64_fastpath+0x1e/0xa9 rip: 0033:0x7f78083454bd rsp: 002b:00007f78069bbd30 eflags: 00000293 orig_rax: 0000000000000001 rax: ffffffffffffffda rbx: 0000000000000006 rcx: 00007f78083454bd rdx: 000000000000009c rsi: 00007f78069bee00 rdi: 0000000000000005 rbp: 00007f77f8000a20 r08: 000000000000fcf0 r09: 0000000000000032 r10: 0000000000000024 r11: 0000000000000293 r12: 00007f78069bde00 r13: 00007f78069bee00 r14: 000000000000000a r15: 00007f78069bbd70 code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef e8 2c c8 ff ff 4c 89 e2 48 89 de 48 c7 c7 b0 8e 0c a8 e8 41 e8 ed ff <0f> ff 48 89 df e8 00 d5 f4 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84 ---[ end trace 40412246357cc9e0 ]--- dlm: 59f24629-ae39-44e2-9030-397ebc2eda26: leaving the lockspace group... 
bug: unable to handle kernel null pointer dereference at 0000000000000001 ip: [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140 pgd 0 oops: 0000 [#1] smp modules linked in: dlm 8021q garp mrp stp llc openvswitch nf_defrag_ipv6 nf_conntrack libcrc32c iptable_filter dm_multipath crc32_pclmul dm_mod aesni_intel psmouse aes_x86_64 sg ablk_helper cryptd lrw gf128mul glue_helper i2c_piix4 nls_utf8 tpm_tis tpm isofs nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc xen_wdt ip_tables x_tables autofs4 hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi 8139too serio_raw ata_piix 8139cp mii uhci_hcd ehci_pci ehci_hcd libata scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6 cpu: 0 pid: 394 comm: systemd-udevd tainted: g w 4.4.0+0 #1 hardware name: xen hvm domu, bios 4.7.2-2.2 05/11/2017 task: ffff880002410000 ti: ffff88000243c000 task.ti: ffff88000243c000 rip: e030:[<ffffffff811a3b4a>] [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140 rsp: e02b:ffff88000243fd90 eflags: 00010202 rax: 0000000000000000 rbx: ffff8800029864d0 rcx: 000000000007b36c rdx: 000000000007b36b rsi: 00000000024000c0 rdi: ffff880036801c00 rbp: ffff88000243fdc0 r08: 0000000000018880 r09: 0000000000000054 r10: 000000000000004a r11: ffff880034ace6c0 r12: 00000000024000c0 r13: ffff880036801c00 r14: 0000000000000001 r15: ffffffff8118dcc2 fs: 00007f0ab77548c0(0000) gs:ffff880036e00000(0000) knlgs:0000000000000000 cs: e033 ds: 0000 es: 0000 cr0: 0000000080050033 cr2: 0000000000000001 cr3: 000000000332d000 cr4: 0000000000040660 stack: ffffffff8118dc90 ffff8800029864d0 0000000000000000 ffff88003430b0b0 ffff880034b78320 ffff88003430b0b0 ffff88000243fdf8 ffffffff8118dcc2 ffff8800349c6700 ffff8800029864d0 000000000000000b 00007f0ab7754b90 call trace: [<ffffffff8118dc90>] ? 
anon_vma_fork+0x60/0x140 [<ffffffff8118dcc2>] anon_vma_fork+0x92/0x140 [<ffffffff8107033e>] copy_process+0xcae/0x1a80 [<ffffffff8107128b>] _do_fork+0x8b/0x2d0 [<ffffffff81071579>] sys_clone+0x19/0x20 [<ffffffff815a30ae>] entry_syscall_64_fastpath+0x12/0x71 ] code: f6 75 1c 4c 89 fa 44 89 e6 4c 89 ef e8 a7 e4 00 00 41 f7 c4 00 80 00 00 49 89 c6 74 47 eb 32 49 63 45 20 48 8d 4a 01 4d 8b 45 00 <49> 8b 1c 06 4c 89 f0 65 49 0f c7 08 0f 94 c0 84 c0 74 ac 49 63 rip [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140 rsp <ffff88000243fd90> cr2: 0000000000000001 --[ end trace 70cb9fd1b164a0e8 ]-- CC: stable@vger.kernel.org Signed-off-by: Edwin Török <edvin.torok@citrix.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
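The double-free described in this commit can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct fake_device`, `fake_register()` and `fake_deregister()` are hypothetical names, and the `should_fail` flag merely simulates `misc_register()` failing. The point it shows is that the error path resets the pointer to NULL after freeing it, so the later teardown has nothing left to free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the lockspace's misc device state. */
struct fake_device {
    char *name;   /* NULL means "not registered" */
};

/* Hypothetical registration; should_fail simulates misc_register() failing. */
static int fake_register(struct fake_device *dev, const char *name, int should_fail)
{
    size_t len = strlen(name) + 1;

    dev->name = malloc(len);
    if (!dev->name)
        return -1;
    memcpy(dev->name, name, len);

    if (should_fail) {
        free(dev->name);
        dev->name = NULL;   /* never leave a dangling pointer behind */
        return -1;
    }
    return 0;
}

/* Returns 1 if a name was released, 0 if there was nothing to release. */
static int fake_deregister(struct fake_device *dev)
{
    if (!dev->name)
        return 0;
    free(dev->name);
    dev->name = NULL;
    return 1;
}
```

Without the `dev->name = NULL` assignment in the failure branch, `fake_deregister()` would free the same buffer a second time.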
Bhumika Goyal	417f7c59ed	dlm: constify kset_uevent_ops structure Declare kset_uevent_ops structure as const as it is only passed as an argument to the function kset_create_and_add. This argument is of type const, so declare the structure as const. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Zhu Lingshan	3b0e761ba8	dlm: print log message when cluster name is not set Print a message when a cluster name is not specified by the caller. In this case the cluster name configured for the dlm is used without any validation that it is the cluster expected by the application. Signed-off-by: Zhu Lingshan <lszhu@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	2ab93ae138	dlm: Delete an unnecessary variable initialisation in dlm_ls_start() The local variable "rv" is reassigned by a statement at the beginning. Thus omit the explicit initialisation. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	d12ad1a964	dlm: Improve a size determination in two functions Replace the specification of two data structures by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	2f48e06102	dlm: Use kcalloc() in two functions * Multiplications for the size determination of memory allocations indicated that array data structures should be processed. Thus reuse the corresponding function "kcalloc". This issue was detected by using the Coccinelle software. * Replace the specification of data structures by pointer dereferences to make the corresponding size determinations a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	790854becc	dlm: Use kmalloc_array() in make_member_array() * A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. * Replace the specification of a data type by a pointer dereference to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
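As a rough userspace analogue of what `kmalloc_array()` and `kcalloc()` add over an open-coded multiplication, the sketch below rejects a request whose element count times element size would overflow. `alloc_array()` and `struct member` are names invented for this illustration, and `malloc()` stands in for the kernel allocator. The call site also uses `sizeof(*m)` rather than naming the type, which is the style the neighbouring commits move to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace analogue of kmalloc_array(): refuse if n * size would overflow. */
static void *alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    return malloc(n * size);
}

/* Hypothetical element type, used only for the example. */
struct member {
    int nodeid;
    int weight;
};
```

A caller would write `struct member *m = alloc_array(count, sizeof(*m));`, so the size expression stays correct even if the type of `m` changes later.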
Markus Elfring	0d37eca752	dlm: Delete an error message for a failed memory allocation in dlm_recover_waiters_pre() Omit an extra message for a memory allocation failure in this function. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	102e67d4e3	dlm: Improve a size determination in dlm_recover_waiters_pre() Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	fbb1008151	dlm: Use kcalloc() in dlm_scan_waiters() A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kcalloc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	2c257e96df	dlm: Improve a size determination in table_seq_start() Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	41922ce831	dlm: Add spaces for better code readability The script "checkpatch.pl" pointed information out like the following. CHECK: spaces preferred around that '+' (ctx:VxV) Thus fix the affected source code places. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Markus Elfring	653996ca8d	dlm: Replace six seq_puts() calls by seq_putc() Six single characters (line breaks) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Gang He	8e1743748b	dlm: Make dismatch error message more clear Make this error message clearer: upper-layer applications (e.g. ocfs2) call dlm_new_lockspace to create a new lockspace and pass a cluster name. When dlm_new_lockspace fails because the two cluster names do not match, the user is left confused because the current error message is not obvious enough. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Vlad Tsyrklevich	8286d6b14c	dlm: Fix kernel memory disclosure Clear the 'unused' field and the uninitialized padding in 'lksb' to avoid leaking memory to userland in copy_result_to_user(). Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
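The general pattern behind this kind of fix can be sketched in userspace C. This is an assumption-laden illustration, not the dlm code: `struct result` and `fill_result()` are hypothetical, and the layout is chosen only so that a typical compiler inserts padding after `status`. Zeroing the whole object before setting individual fields ensures that neither the unused field nor the padding carries stale memory when the object is later copied out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result record; a compiler will usually pad after 'status'. */
struct result {
    char status;
    long id;
    int unused;   /* never assigned by the caller */
};

static void fill_result(struct result *out, char status, long id)
{
    memset(out, 0, sizeof(*out));   /* clear fields and padding first */
    out->status = status;
    out->id = id;
}
```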
Benjamin Coddington	9d5b86ac13	fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks Since commit `c69899a17c` "NFSv4: Update of VFS byte range lock must be atomic with the stateid update", NFSv4 has been inserting locks in rpciod worker context. The result is that the file_lock's fl_nspid is the kworker's pid instead of the original userspace pid. The fl_nspid is only used to represent the namespaced virtual pid number when displaying locks or returning from F_GETLK. There's no reason to set it for every inserted lock, since we can usually just look it up from fl_pid. So, instead of looking up and holding struct pid for every lock, let's just look up the virtual pid number from fl_pid when it is needed. That means we can remove fl_nspid entirely. The translation and presentation of fl_pid should handle the following four cases: 1 - F_GETLK on a remote file with a remote lock: In this case, the filesystem should determine the l_pid to return here. Filesystems should indicate that the fl_pid represents a non-local pid value that should not be translated by returning an fl_pid <= 0. 2 - F_GETLK on a local file with a remote lock: This should be the l_pid of the lock manager process, and translated. 3 - F_GETLK on a remote file with a local lock, and 4 - F_GETLK on a local file with a local lock: These should be the translated l_pid of the local locking process. Fuse was already doing the correct thing by translating the pid into the caller's namespace. With this change we must update fuse to translate to init's pid namespace, so that the locks API can then translate from init's pid namespace into the pid namespace of the caller. With this change, the locks API will expect that if a filesystem returns a remote pid as opposed to a local pid for F_GETLK, that remote pid will be <= 0. This signifies that the pid is remote, and the locks API will forego translating that pid into the pid namespace of the local calling process. 
Finally, we convert remote filesystems to present remote pids using negative numbers. Have lustre, 9p, ceph, cifs, and dlm negate the remote pid returned for F_GETLK lock requests. Since local pids will never be larger than PID_MAX_LIMIT (which is currently defined as <= 4 million), but pid_t is a signed int, we should have plenty of room to represent remote pids with negative numbers if we assume that remote pid numbers are similarly limited. If this is not the case, then we run the risk of having a remote pid returned for which there is also a corresponding local pid. This is a problem we have now, but this patch should reduce the chances of that occurring, while also returning those remote pid numbers, for whatever that may be worth. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com>	2017-07-16 10:28:22 -04:00
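The sign convention described in this commit can be sketched as follows. All three function names are hypothetical, and `translate_to_caller_ns()` uses a fixed offset purely as a placeholder for real pid-namespace translation. The sketch only shows the decision the message describes: a value `<= 0` is treated as a remote pid and passed through untranslated, while a positive value is translated for the caller.

```c
#include <assert.h>

/* Placeholder for pid-namespace translation: a fixed offset, for illustration only. */
static int translate_to_caller_ns(int init_ns_pid)
{
    return init_ns_pid - 1000;
}

/* A remote filesystem stores the remote owner's pid negated. */
static int encode_remote_pid(int remote_pid)
{
    return -remote_pid;
}

/* fl_pid <= 0 marks a remote pid, which is reported as-is. */
static int pid_for_getlk(int fl_pid)
{
    if (fl_pid <= 0)
        return fl_pid;
    return translate_to_caller_ns(fl_pid);
}
```

With this convention a caller can tell the two cases apart from the sign alone, provided remote pid numbers fit in the positive range of a signed int.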
David Howells	cdfbabfb2f	net: Work around lockdep limitation in sockets that use sockets Lockdep issues a circular dependency warning when AFS issues an operation through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem. The theory lockdep comes up with is as follows: (1) If the pagefault handler decides it needs to read pages from AFS, it calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but creating a call requires the socket lock: mmap_sem must be taken before sk_lock-AF_RXRPC (2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind() binds the underlying UDP socket whilst holding its socket lock. inet_bind() takes its own socket lock: sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET (3) Reading from a TCP socket into a userspace buffer might cause a fault and thus cause the kernel to take the mmap_sem, but the TCP socket is locked whilst doing this: sk_lock-AF_INET must be taken before mmap_sem However, lockdep's theory is wrong in this instance because it deals only with lock classes and not individual locks. The AF_INET lock in (2) isn't really equivalent to the AF_INET lock in (3) as the former deals with a socket entirely internal to the kernel that never sees userspace. This is a limitation in the design of lockdep. Fix the general case by: (1) Double up all the locking keys used in sockets so that one set are used if the socket is created by userspace and the other set is used if the socket is created by the kernel. (2) Store the kern parameter passed to sk_alloc() in a variable in the sock struct (sk_kern_sock). This informs sock_lock_init(), sock_init_data() and sk_clone_lock() as to the lock keys to be used. Note that the child created by sk_clone_lock() inherits the parent's kern setting. (3) Add a 'kern' parameter to ->accept() that is analogous to the one passed in to ->create() that distinguishes whether kernel_accept() or sys_accept4() was the caller and can be passed to sk_alloc(). 
Note that a lot of accept functions merely dequeue an already allocated socket. I haven't touched these as the new socket already exists before we get the parameter. Note also that there are a couple of places where I've made the accepted socket unconditionally kernel-based: irda_accept() rds_tcp_accept_one() tcp_accept_from_sock() because they follow a sock_create_kern() and accept off of that. Whilst creating this, I noticed that lustre and ocfs don't create sockets through sock_create_kern() and thus they aren't marked as for-kernel, though they appear to be internal. I wonder if these should do that so that they use the new set of lock keys. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 18:23:27 -08:00
Ingo Molnar	174cd4b1e5	sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:32 +01:00
Thomas Gleixner	1f3a8e49d8	ktime: Get rid of ktime_equal() No point in going through loops and hoops instead of just comparing the values. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2016-12-25 17:21:23 +01:00
Thomas Gleixner	8b0e195314	ktime: Cleanup ktime_set() usage ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a Seconds and Nanoseconds part of a time value needs to be converted. For anything where the Seconds argument is 0, this is pointless and can be replaced with a simple assignment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2016-12-25 17:21:22 +01:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Linus Torvalds	19d37ce2a7	dlm for 4.10 This set fixes error reporting for dlm sockets, removes the unbound property on the dlm callback workqueue to improve performance, and includes a couple trivial changes. Merge tag 'dlm-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm fixes from David Teigland: "This set fixes error reporting for dlm sockets, removes the unbound property on the dlm callback workqueue to improve performance, and includes a couple trivial changes" * tag 'dlm-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: fix error return code in sctp_accept_from_sock() dlm: don't specify WQ_UNBOUND for the ast callback workqueue dlm: remove lock_sock to avoid scheduling while atomic dlm: don't save callbacks after accept dlm: audit and remove any unnecessary uses of module.h dlm: make genl_ops const	2016-12-14 08:31:37 -08:00
Johannes Berg	56989f6d85	genetlink: mark families as __ro_after_init Now genl_register_family() is the only thing (other than the users themselves, perhaps, but I didn't find any doing that) writing to the family struct. In all families that I found, genl_register_family() is only called from __init functions (some indirectly, in which case I've added __init annotations to clarify things), so all can actually be marked __ro_after_init. This protects the data structure from accidental corruption. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	489111e5c2	genetlink: statically initialize families Instead of providing macros/inline functions to initialize the families, make all users initialize them statically and get rid of the macros. This reduces the kernel code size by about 1.6k on x86-64 (with allyesconfig). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	a07ea4d994	genetlink: no longer support using static family IDs Static family IDs have never really been used, the only use case was the workaround I introduced for those users that assumed their family ID was also their multicast group ID. Additionally, because static family IDs would never be reserved by the generic netlink code, using a relatively low ID would only work for built-in families that can be registered immediately after generic netlink is started, which is basically only the control family (apart from the workaround code, which I also had to add code for so it would reserve those IDs) Thus, anything other than GENL_ID_GENERATE is flawed and luckily not used except in the cases I mentioned. Move those workarounds into a few lines of code, and then get rid of GENL_ID_GENERATE entirely, making it more robust. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Wei Yongjun	26c1ec2fe4	dlm: fix error return code in sctp_accept_from_sock() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-24 10:01:51 -05:00
Bob Peterson	aa9f101285	dlm: don't specify WQ_UNBOUND for the ast callback workqueue This patch removes the WQ_UNBOUND flag (which implies WQ_HIGHPRI) from the DLM's ast work queue, in favor of just WQ_HIGHPRI. This has been shown to cause a 19 percent performance increase for simultaneous inode creates on GFS2 with fs_mark. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-19 11:13:04 -05:00
Bob Peterson	d2fee58a3b	dlm: remove lock_sock to avoid scheduling while atomic Before this patch, functions save_callbacks and restore_callbacks called function lock_sock and release_sock to prevent other processes from messing with the struct sock while the callbacks were saved and restored. However, function add_sock calls write_lock_bh prior to calling save_callbacks, which disables preemption. So the call to lock_sock would try to schedule when we can't schedule. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-19 11:00:03 -05:00
Bob Peterson	3735b4b9f1	dlm: don't save callbacks after accept When DLM calls accept() on a socket, the comm code copies the sk after we've saved its callbacks. Afterward, it calls add_sock which saves the callbacks a second time. Since the error reporting function lowcomms_error_report calls the previous callback too, this results in a recursive call to itself. This patch adds a new parameter to function add_sock to tell whether to save the callbacks. Function tcp_accept_from_sock (and its sctp counterpart) then calls it with false to avoid the recursion. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-19 11:00:03 -05:00
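The recursion described in this commit can be modelled with a simplified userspace sketch. Everything here is hypothetical (`struct fake_sock`, `our_report()`, and this `add_sock()` are illustrative stand-ins, not the lowcomms code), and it assumes the second `add_sock()` call sees a socket that already carries the replacement callback. Saving callbacks again at that point would store `our_report` as the "previous" handler, so chaining would call itself; passing a false `save_cb` keeps the original handler in place.

```c
#include <assert.h>
#include <stddef.h>

struct fake_sock;
typedef void (*report_fn)(struct fake_sock *);

/* Hypothetical socket with one callback slot and one saved slot. */
struct fake_sock {
    report_fn error_report;   /* currently installed callback */
    report_fn saved_report;   /* callback that was installed before ours */
    int original_calls;       /* how often the original handler ran */
};

static void original_report(struct fake_sock *sk)
{
    sk->original_calls++;
}

/* Replacement handler: does its own work, then chains to the saved one. */
static void our_report(struct fake_sock *sk)
{
    if (sk->saved_report)
        sk->saved_report(sk);
}

/* save_cb mirrors the new parameter described in the commit message. */
static void add_sock(struct fake_sock *sk, int save_cb)
{
    if (save_cb)
        sk->saved_report = sk->error_report;
    sk->error_report = our_report;
}
```

Calling `add_sock(&sk, 1)` twice in this model would leave `saved_report == our_report`, which is the self-recursive state the patch avoids.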
Paul Gortmaker	7963b8a598	dlm: audit and remove any unnecessary uses of module.h Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. In the case of some code where it is modular, we can extend that to also include files that are building basic support functionality but not related to loading or registering the final module; such files also have no need whatsoever for module.h The advantage in removing such instances is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h might have been the implicit source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each instance for the presence of either and replace as needed. In the dlm case, we remove module.h from a global header and only introduce it in the files where it is explicitly required, since there is nothing modular in dlm_internal.h itself. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-19 11:00:03 -05:00
Stephen Hemminger	dbef1c0534	dlm: make genl_ops const This table contains function pointers and should be const. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-19 11:00:03 -05:00
Marcelo Ricardo Leitner	3a8db79889	dlm: free workqueues after the connections After backporting commit `ee44b4bc05` ("dlm: use sctp 1-to-1 API") series to a kernel with an older workqueue which didn't use RCU yet, it was noticed that we are freeing the workqueues in dlm_lowcomms_stop() too early as free_conn() will try to access that memory for canceling the queued works if any. This issue was introduced by commit `0d737a8cfd` as before it such attempt to cancel the queued works wasn't performed, so the issue was not present. This patch fixes it by simply inverting the free order. Cc: stable@vger.kernel.org Fixes: `0d737a8cfd` ("dlm: fix race while closing connections") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-10-10 09:54:00 -05:00
Eric Ren	079d37df33	dlm: fix malfunction of dlm_tool caused by debugfs changes With the current kernel, `dlm_tool lockdebug` fails as below: "dlm_tool lockdebug ED0BD86DCE724393918A1AE8FDBF1EE3 can't open /sys/kernel/debug/dlm/ED0BD86DCE724393918A1AE8FDBF1EE3: Operation not permitted" This is because table_open() depends on file->f_op to tell which seq_file ops should be passed down. But, the original file ops in file->f_op is replaced by "debugfs_full_proxy_file_operations" with commit `49d200deaa` ("debugfs: prevent access to removed files' private data"). Currently, I can think up 2 solutions: 1st, replace debugfs_create_file() with debugfs_create_file_unsafe(); 2nd, make different table_open#() accordingly. The 1st one is neat, but I don't thoroughly understand its risk. Maybe someone has a better one. Signed-off-by: Eric Ren <zren@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-08-26 13:22:14 -05:00
Amitoj Kaur Chawla	5c93f56f77	dlm: Use kmemdup instead of kmalloc and memcpy Replace calls to kmalloc followed by a memcpy with a direct call to kmemdup. The Coccinelle semantic patch used to make this change is as follows: @@ expression from,to,size,flag; statement S; @@ - to = \(kmalloc\|kzalloc\)(size,flag); + to = kmemdup(from,size,flag); if (to==NULL || ...) S - memcpy(to, from, size); Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-06-23 11:55:58 -05:00
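For readers unfamiliar with `kmemdup()`, the transformation amounts to folding an allocate-then-copy pair into one helper. The sketch below is a userspace approximation: `dup_bytes()` is a made-up name and `malloc()` replaces the kernel allocator, so there is no flags argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace approximation of kmemdup(): allocate len bytes and copy src into them. */
static void *dup_bytes(const void *src, size_t len)
{
    void *dst = malloc(len);

    if (dst)
        memcpy(dst, src, len);
    return dst;   /* NULL on allocation failure, like the kernel helper */
}
```

The caller still checks the result for NULL, but no longer needs a separate `memcpy()` after the check.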
Zhilong Liu	505ee5283c	dlm: add log_info config option This config option can be used to disable the LOG_INFO recovery messages. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-06-21 09:04:24 -05:00
Kirill A. Shutemov	09cbfeaf1a	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a long time ago with the promise that one day it would be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And it is unlikely it ever will. We have many places where PAGE_CACHE_SIZE is assumed to be equal to PAGE_SIZE. And it's a constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using the script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is a revert of the changes to the PAGE_CACHE_ALIGN definition: we are going to drop it later. There are a few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation will also be addressed in a separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-04 10:41:08 -07:00
Andrew Price	82c7d823cc	dlm: config: Fix ENOMEM failures in make_cluster() Commit `1ae1602de0` "configfs: switch ->default groups to a linked list" left the NULL gps pointer behind after removing the kcalloc() call which made it non-NULL. It also left the !gps check in place so make_cluster() now fails with ENOMEM. Remove the remaining uses of the gps variable to fix that. Reviewed-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-03-29 10:28:08 -05:00
Linus Torvalds	d77bed0d4c	dlm for 4.6 Previous changes introduced the use of socket error reporting for dlm sockets. This set includes two fixes in how the socket error callbacks are used. Merge tag 'dlm-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "Previous changes introduced the use of socket error reporting for dlm sockets. This set includes two fixes in how the socket error callbacks are used" * tag 'dlm-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: DLM: Save and restore socket callbacks properly DLM: Replace nodeid_to_addr with kernel_getpeername	2016-03-17 16:38:36 -07:00
Christoph Hellwig	1ae1602de0	configfs: switch ->default groups to a linked list Replace the current NULL-terminated array of default groups with a linked list. This gets rid of lots of nasty code to size and/or dynamically allocate the array. While we're at it also provide a convenient helper to remove the default groups. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felipe Balbi <balbi@kernel.org> [drivers/usb/gadget] Acked-by: Joel Becker <jlbec@evilplan.org> Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com>	2016-03-06 16:11:24 +01:00
Bob Peterson	b81171cb68	DLM: Save and restore socket callbacks properly This patch fixes the problems with patch `b3a5bbfd7`. 1. It removes a return statement from lowcomms_error_report because it needs to call the original error report in all paths through the function. 2. All socket callbacks are saved and restored, not just the sk_error_report, and that's done so with proper locking like sunrpc does. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-02-22 14:02:17 -06:00
Bob Peterson	1a31833d08	DLM: Replace nodeid_to_addr with kernel_getpeername This patch replaces the call to nodeid_to_addr with a call to kernel_getpeername. This avoids taking a spinlock because it may potentially be called from a softirq context. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2016-02-22 14:02:11 -06:00
Al Viro	117aa41e80	[regression] fix braino in fs/dlm/user.c it's "bugger off if we got ERR_PTR", not the other way round... Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-01-21 17:45:15 -05:00
Al Viro	16e5c1fc36	convert a bunch of open-coded instances of memdup_user_nul() A _lot_ of ->write() instances were open-coding it; some are converted to memdup_user_nul(), a lot more remain... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-01-04 10:26:58 -05:00
Eric Dumazet	9cd3e072b0	net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA This patch is a cleanup to make following patch easier to review. Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA from (struct socket)->flags to a (struct socket_wq)->flags to benefit from RCU protection in sock_wake_async() To ease backports, we rename both constants. Two new helpers, sk_set_bit(int nr, struct sock *sk) and sk_clear_bit(int nr, struct sock *sk) are added so that following patch can change their implementation. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-01 15:45:05 -05:00
Linus Torvalds	9aa3d651a9	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "This series contains HCH's changes to absorb configfs attribute ->show() + ->store() function pointer usage from its original tree-wide consumers, into common configfs code. It includes usb-gadget, target w/ drivers, netconsole and ocfs2 changes to realize the improved simplicity, that now renders the original include/target/configfs_macros.h CPP magic for fabric drivers and others, unnecessary and obsolete. And with common code in place, new configfs attributes can be added easier than ever before. Note, there are further improvements in-flight from other folks for v4.5 code in configfs land, plus number of target fixes for post -rc1 code" In the meantime, a new user of the now-removed old configfs API came in through the char/misc tree in commit `7bd1d4093c` ("stm class: Introduce an abstraction for System Trace Module devices"). This merge resolution comes from Alexander Shishkin, who updated his stm class tracing abstraction to account for the removal of the old show_attribute and store_attribute methods in commit `517982229f` ("configfs: remove old API") from this pull. As Alexander says about that patch: "There's no need to keep an extra wrapper structure per item and the awkward show_attribute/store_attribute item ops are no longer needed. This patch converts policy code to the new api, all the while making the code quite a bit smaller and easier on the eyes. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>" That patch was folded into the merge so that the tree should be fully bisectable. 
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits) configfs: remove old API ocfs2/cluster: use per-attribute show and store methods ocfs2/cluster: move locking into attribute store methods netconsole: use per-attribute show and store methods target: use per-attribute show and store methods spear13xx_pcie_gadget: use per-attribute show and store methods dlm: use per-attribute show and store methods usb-gadget/f_serial: use per-attribute show and store methods usb-gadget/f_phonet: use per-attribute show and store methods usb-gadget/f_obex: use per-attribute show and store methods usb-gadget/f_uac2: use per-attribute show and store methods usb-gadget/f_uac1: use per-attribute show and store methods usb-gadget/f_mass_storage: use per-attribute show and store methods usb-gadget/f_sourcesink: use per-attribute show and store methods usb-gadget/f_printer: use per-attribute show and store methods usb-gadget/f_midi: use per-attribute show and store methods usb-gadget/f_loopback: use per-attribute show and store methods usb-gadget/ether: use per-attribute show and store methods usb-gadget/f_acm: use per-attribute show and store methods usb-gadget/f_hid: use per-attribute show and store methods ...	2015-11-13 20:04:17 -08:00
Linus Torvalds	d000f8d67f	dlm for 4.4 This includes one simple fix to make posix locks interruptible by signals in cases where a signal handler is used. Merge tag 'dlm-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm update from David Teigland: "This includes one simple fix to make posix locks interruptible by signals in cases where a signal handler is used" * tag 'dlm-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: make posix locks interruptible	2015-11-05 11:15:25 -08:00
Eric Ren	a6b1533e9a	dlm: make posix locks interruptible Replace wait_event_killable with wait_event_interruptible so that a program waiting for a posix lock can be interrupted by a signal. With the killable version, a program was not interruptible by a signal if it had a signal handler set for it, overriding the default action of terminating the process. Signed-off-by: Eric Ren <zren@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-11-03 10:38:22 -06:00
Benjamin Coddington	4f6563677a	Move locks API users to locks_lock_inode_wait() Instead of having users check for FL_POSIX or FL_FLOCK to call the correct locks API function, use the check within locks_lock_inode_wait(). This allows for some later cleanup. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>	2015-10-22 14:57:36 -04:00
Christoph Hellwig	9ae0f367df	dlm: use per-attribute show and store methods To simplify the configfs interface and remove boilerplate code that also causes binary bloat. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Teigland <teigland@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-10-13 22:16:18 -07:00
Linus Torvalds	9cbf22b37a	dlm for 4.3 This set mainly includes a change to the way the dlm uses the SCTP API in the kernel, removing the direct dependency on the sctp module. Other odd SCTP-related fixes are also included. The other notable fix is for a long standing regression in the behavior of lock value blocks for user space locks. Merge tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set mainly includes a change to the way the dlm uses the SCTP API in the kernel, removing the direct dependency on the sctp module. Other odd SCTP-related fixes are also included. 
The other notable fix is for a long standing regression in the behavior of lock value blocks for user space locks" * tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: print error from kernel_sendpage dlm: fix lvb copy for user locks dlm: sctp_accept_from_sock() can be static dlm: fix reconnecting but not sending data dlm: replace BUG_ON with a less severe handling dlm: use sctp 1-to-1 API dlm: fix not reconnecting on connecting error handling dlm: fix race while closing connections dlm: fix connection stealing if using SCTP	2015-09-03 12:57:48 -07:00
Bob Peterson	b3a5bbfd78	dlm: print error from kernel_sendpage Print a dlm-specific error when a socket error occurs when sending a dlm message. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-27 09:34:47 -05:00
David Teigland	b96f465035	dlm: fix lvb copy for user locks For a userland lock request, the previous and current lock modes are used to decide when the lvb should be copied back to the user. The wrong previous value was used, so that it always matched the current value. This caused the lvb to be copied back to the user in the wrong cases. Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-25 14:41:50 -05:00
kbuild test robot	18df8a87ba	dlm: sctp_accept_from_sock() can be static Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:23:09 -05:00
Marcelo Ricardo Leitner	00dcffaebf	dlm: fix reconnecting but not sending data There are cases in which lowcomms_connect_sock() is called directly, which caused the CF_WRITE_PENDING flag to not be set upon reconnect, especially in send_to_sock() error handling. In that case, the flag was already cleared and no further attempt at transmitting would be made. As dlm tends to connect when it needs to transmit something, it makes sense to always mark this flag right after the connect. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:22:21 -05:00
Marcelo Ricardo Leitner	acee4e527d	dlm: replace BUG_ON with a less severe handling BUG_ON() is a severe action for this case, especially now that DLM with SCTP will use 1 socket per association. Instead, we can just close the socket on this error condition and return from the function. Also move the check to an earlier stage as it won't change and thus we can abort as soon as possible. Although this issue was reported when still using SCTP with 1-to-many API, this cleanup wouldn't be that simple back then because we couldn't close the socket and making sure such event would cease would be hard. And actually, previous code was closing the association, yet SCTP layer is still raising the new data event. Probably a bug to be fixed in SCTP. Reported-by: <tan.hu@zte.com.cn> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:22:21 -05:00
Marcelo Ricardo Leitner	ee44b4bc05	dlm: use sctp 1-to-1 API DLM is using 1-to-many API but in a 1-to-1 fashion. That is, it's not needed but this causes it to use sctp_do_peeloff() to mimic a kernel_accept() and this causes a symbol dependency on sctp module. By switching it to 1-to-1 API we can avoid this dependency and also reduce quite a lot of SCTP-specific code in lowcomms.c. The caveat is that now DLM won't always use the same src port. It will choose a random one, just like TCP code. This allows the peers to attempt simultaneous connections, which now are handled just like for TCP. Even more sharing between TCP and SCTP code on DLM is possible, but it is intentionally left for a later commit. Note that for using nodes with this commit, you have to have at least the early fixes on this patchset otherwise it will trigger some issues on old nodes. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:22:20 -05:00
Marcelo Ricardo Leitner	356344c4c3	dlm: fix not reconnecting on connecting error handling If we don't clear that bit, lowcomms_connect_sock() will not schedule another attempt, and no further attempt will be done. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:22:19 -05:00
Marcelo Ricardo Leitner	0d737a8cfd	dlm: fix race while closing connections When a connection has issues DLM may need to close it. Therefore we should also cancel pending workqueues for such connection at that time, and not just when dlm is not willing to use this connection anymore. Also, if we don't clear CF_CONNECT_PENDING flag, the error handling routines won't be able to re-connect as lowcomms_connect_sock() will check for it. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:22:19 -05:00
Marcelo Ricardo Leitner	28926a0965	dlm: fix connection stealing if using SCTP When using SCTP and accepting a new connection, DLM currently validates if the peer trying to connect to it is one of the cluster nodes, but it doesn't check if it already has a connection to it or not. If it already had a connection, it will be overwritten, and the new one will be used for writes, possibly causing the node to leave the cluster due to communication breakage. Still, one could DoS the node by attempting N connections and keeping them open. As said, but being explicit, both situations are only triggerable from other cluster nodes, but are doable with only user-level perms. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2015-08-17 16:22:15 -05:00
Greg Kroah-Hartman	f368ed6088	char: make misc_deregister a void function With well over 200+ users of this api, there are a mere 12 users that actually checked the return value of this function. And all of them really didn't do anything with that information as the system or module was shutting down no matter what. So stop pretending like it matters, and just return void from misc_deregister(). If something goes wrong in the call, you will get a WARNING splat in the syslog so you know how to fix up your driver. Other than that, there's nothing that can go wrong. Cc: Alasdair Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Andreas Dilger <andreas.dilger@intel.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 10:35:49 -07:00
Eric W. Biederman	eeb1bd5c40	net: Add a struct net parameter to sock_create_kern This is long overdue, and is part of cleaning up how we allocate kernel sockets that don't reference count struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-11 10:50:17 -04:00
Johannes Berg	053c095a82	netlink: make nlmsg_end() and genlmsg_end() void Contrary to common expectations for an "int" return, these functions return only a positive value -- if used correctly they cannot even return 0 because the message header will necessarily be in the skb. This makes the very common pattern of if (genlmsg_end(...) < 0) { ... } be a whole bunch of dead code. Many places also simply do return nlmsg_end(...); and the caller is expected to deal with it. This also commonly (at least for me) causes errors, because it is very common to write if (my_function(...)) /* error condition */ and if my_function() does "return nlmsg_end()" this is of course wrong. Additionally, there's not a single place in the kernel that actually needs the message length returned, and if anyone needs it later then it'll be very easy to just use skb->len there. Remove this, and make the functions void. This removes a bunch of dead code as described above. The patch adds lines because I did - return nlmsg_end(...); + nlmsg_end(...); + return 0; I could have preserved all the function's return values by returning skb->len, but instead I've audited all the places calling the affected functions and found that none cared. A few places actually compared the return value with <= 0 in dump functionality, but that could just be changed to < 0 with no change in behaviour, so I opted for the more efficient version. One instance of the error I've made numerous times now is also present in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't check for <0 or <=0 and thus broke out of the loop every single time. I've preserved this since it will (I think) have caused the messages to userspace to be formatted differently with just a single message for every SKB returned to userspace. It's possible that this isn't needed for the tools that actually use this, but I don't even know what they are so couldn't test that changing this behaviour would be acceptable. 
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 01:03:45 -05:00
Linus Torvalds	cbfe0de303	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS changes from Al Viro: "First pile out of several (there _definitely_ will be more). Stuff in this one: - unification of d_splice_alias()/d_materialize_unique() - iov_iter rewrite - killing a bunch of ->f_path.dentry users (and f_dentry macro). Getting that completed will make life much simpler for unionmount/overlayfs, since then we'll be able to limit the places sensitive to file _dentry_ to reasonably few. Which allows to have file_inode(file) pointing to inode in a covered layer, with dentry pointing to (negative) dentry in union one. Still not complete, but much closer now. - crapectomy in lustre (dead code removal, mostly) - "let's make seq_printf return nothing" preparations - assorted cleanups and fixes There _definitely_ will be more piles" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) copy_from_iter_nocache() new helper: iov_iter_kvec() csum_and_copy_..._iter() iov_iter.c: handle ITER_KVEC directly iov_iter.c: convert copy_to_iter() to iterate_and_advance iov_iter.c: convert copy_from_iter() to iterate_and_advance iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() iov_iter.c: convert iov_iter_zero() to iterate_and_advance iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds iov_iter.c: convert iov_iter_npages() to iterate_all_kinds iov_iter.c: iterate_and_advance iov_iter.c: macros for iterating over iov_iter kill f_dentry macro dcache: fix kmemcheck warning in switch_names new helper: audit_file() nfsd_vfs_write(): use file_inode() ncpfs: use file_inode() kill f_dentry uses lockd: get rid of ->f_path.dentry->d_sb ...	2014-12-10 16:10:49 -08:00
David Teigland	2ab4bd8ea3	dlm: adopt orphan locks A process may exit, leaving an orphan lock in the lockspace. This adds the capability for another process to acquire the orphan lock. Acquiring the orphan just moves the lock from the orphan list onto the acquiring process's list of locks. An adopting process must specify the resource name and mode of the lock it wants to adopt. If a matching lock is found, the lock is moved to the caller's list of locks, and the lkid of the lock is returned like the lkid of a new lock. If an orphan with a different mode is found, then -EAGAIN is returned. If no orphan lock is found on the resource, then -ENOENT is returned. No async completion is used because the result is immediately available. Also, when orphans are purged, allow a zero nodeid to refer to the local nodeid so the caller does not need to look up the local nodeid. Signed-off-by: David Teigland <teigland@redhat.com>	2014-11-19 14:48:02 -06:00
Joe Perches	f365ef9b79	dlm: Use seq_puts() instead of seq_printf() for constant strings Convert the seq_printf output with constant strings to seq_puts. Link: http://lkml.kernel.org/p/b416b016f4a6e49115ba736cad6ea2709a8bc1c4.1412031505.git.joe@perches.com Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Cc: cluster-devel@redhat.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-11-05 14:13:09 -05:00
Joe Perches	d6d906b234	dlm: Remove seq_printf() return checks and use seq_has_overflowed() The seq_printf() return is going away soon and users of it should check seq_has_overflowed() to see if the buffer is full and will not accept any more data. Convert functions returning int to void where seq_printf() is used. Link: http://lkml.kernel.org/p/43590057bcb83846acbbcc1fe641f792b2fb7773.1412031505.git.joe@perches.com Link: http://lkml.kernel.org/r/20141029220107.939492048@goodmis.org Acked-by: David Teigland <teigland@redhat.com> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: cluster-devel@redhat.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-11-05 14:12:38 -05:00
Neale Ferguson	c07127b48c	dlm: fix missing endian conversion of rcom_status flags The flags are already converted to le when being sent, but are not being converted back to cpu when received. Signed-off-by: Neale Ferguson <neale@sinenomine.net> Signed-off-by: David Teigland <teigland@redhat.com>	2014-10-14 15:11:48 -05:00
Joe Perches	d0449b90f8	locks: Remove unused conf argument from lm_grant This argument is always NULL so don't pass it around. [jlayton: remove dependencies on previous patches in series] Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-09 16:01:06 -04:00
Fabian Frederick	e0d9bf4cc0	fs/dlm/debug_fs.c: remove unnecessary null test before debugfs_remove This fixes checkpatch warning: WARNING: debugfs_remove(NULL) is safe this check is probably not required Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:27 -07:00
Lidong Zhong	883854c545	dlm: keep listening connection alive with sctp mode The connection struct with nodeid 0 is the listening socket, not a connection to another node. The sctp resend function was not checking that the nodeid was valid (non-zero), so it would mistakenly get and resend on the listening connection when nodeid was zero. Signed-off-by: Lidong Zhong <lzhong@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2014-06-12 10:26:14 -05:00
Fabian Frederick	c1d4518c4e	fs/dlm/debug_fs.c: replace seq_printf by seq_puts Replace seq_printf where possible. This patch also fixes the following checkpatch warning "unnecessary whitespace before a quoted newline" Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:18 -07:00
Fabian Frederick	6edb56871a	fs/dlm/lockspace.c: convert simple_str to kstr Replace obsolete functions. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:18 -07:00
Fabian Frederick	4f4c337fb7	fs/dlm/config.c: convert simple_str to kstr Replace obsolete functions simple_strtoul/kstrtouint simple_strtol/kstrtoint (kstr __must_check requires the right function to be applied) Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:17 -07:00
David S. Miller	676d23690f	net: Fix use after free by removing length arg from sk_data_ready callbacks. Several spots in the kernel perform a sequence like: skb_queue_tail(&sk->sk_receive_queue, skb); sk->sk_data_ready(sk, skb->len); But at the moment we place the SKB onto the socket receive queue it can be consumed and freed up. So this skb->len access is potentially to freed up memory. Furthermore, the skb->len can be modified by the consumer so it is possible that the value isn't accurate. And finally, no actual implementation of this callback actually uses the length argument. And since nobody actually cared about its value, lots of call sites pass arbitrary values in such as '0' and even '1'. So just remove the length argument from the callback, that way there is no confusion whatsoever and all of these use-after-free cases get fixed as a side effect. Based upon a patch by Eric Dumazet and his suggestion to audit this issue tree-wide. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-11 16:15:36 -04:00
David Teigland	075f01775f	dlm: use INFO for recovery messages The log messages relating to the progress of recovery are minimal and very often useful. Change these to the KERN_INFO level so they are always available. Signed-off-by: David Teigland <teigland@redhat.com>	2014-02-14 11:54:44 -06:00
Rashika Kheria	9505857103	fs: Include appropriate header file in dlm/ast.c Include appropriate header file fs/dlm/ast.h in fs/dlm/ast.c because it contains function prototypes of some functions defined in fs/dlm/ast.c. This also eliminates the following warning in fs/dlm/ast: fs/dlm/ast.c:52:5: warning: no previous prototype for ‘dlm_add_lkb_callback’ [-Wmissing-prototypes] fs/dlm/ast.c:113:5: warning: no previous prototype for ‘dlm_rem_lkb_callback’ [-Wmissing-prototypes] fs/dlm/ast.c:174:6: warning: no previous prototype for ‘dlm_add_cb’ [-Wmissing-prototypes] fs/dlm/ast.c:212:6: warning: no previous prototype for ‘dlm_callback_work’ [-Wmissing-prototypes] fs/dlm/ast.c:267:5: warning: no previous prototype for ‘dlm_callback_start’ [-Wmissing-prototypes] fs/dlm/ast.c:278:6: warning: no previous prototype for ‘dlm_callback_stop’ [-Wmissing-prototypes] fs/dlm/ast.c:284:6: warning: no previous prototype for ‘dlm_callback_suspend’ [-Wmissing-prototypes] fs/dlm/ast.c:292:6: warning: no previous prototype for ‘dlm_callback_resume’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: David Teigland <teigland@redhat.com>	2014-02-12 15:44:19 -06:00
Dan Carpenter	e8243f32f2	dlm: silence a harmless use after free warning We pass the freed "r" pointer back to the caller. It's harmless but it upsets the static checkers. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Teigland <teigland@redhat.com>	2014-02-12 15:44:03 -06:00
Linus Torvalds	4ba9920e5e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. 
The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...	2014-01-25 11:17:34 -08:00
wangweidong	048ed4b626	sctp: remove macros sctp_{lock|release}_sock Redefined {lock|release}_sock to sctp_{lock|release}_sock for user space friendly code which we haven't used in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:41:36 -08:00
Dongmao Zhang	ece35848c1	dlm: set zero linger time on sctp socket The recovery time for a failed node was taking a long time because the failed node could not perform the full shutdown process. Removing the linger time speeds this up. The dlm does not care what happens to messages to or from the failed node. Signed-off-by: Dongmao Zhang <dmzhang@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2013-12-16 09:52:34 -06:00
Johannes Berg	c53ed74236	genetlink: only pass array to genl_register_family_with_ops() As suggested by David Miller, make genl_register_family_with_ops() a macro and pass only the array, evaluating ARRAY_SIZE() in the macro, this is a little safer. The openvswitch has some indirection, assign ops/n_ops directly in that code. This might ultimately just assign the pointers in the family initializations, saving the struct genl_family_and_ops and code (once mcast groups are handled differently.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-19 16:39:05 -05:00
Bart Van Assche	a97f4a66d8	dlm: Avoid that dlm_release_lockspace() incorrectly returns -EBUSY When dlm_release_lockspace(ls, 1) is invoked on a busy system immediately after the last dlm_unlock() AST has finished it can occur that lkb_idr_is_local() is invoked for the unlocked LKB since removal from ls_lkbidr only occurs after the AST has returned. If that happens dlm_release_lockspace(ls, 1) will return -EBUSY instead of releasing the lockspace. Fix this race condition by changing lkb_idr_is_local() such that it only returns true for LKB's that have not yet been unlocked. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Teigland <teigland@redhat.com>	2013-10-16 10:32:42 -05:00
David Teigland	c6ca7bc91d	dlm: remove signal blocking The signal blocking was incorrect and unnecessary so just remove it. Signed-off-by: David Teigland <teigland@redhat.com>	2013-08-12 15:22:43 -05:00
Tejun Heo	ededf305a8	dlm: WQ_NON_REENTRANT is meaningless and going away `dbf2576e37` ("workqueue: make all workqueues non-reentrant") made WQ_NON_REENTRANT no-op and the flag is going away. Remove its usages. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Teigland <teigland@redhat.com>	2013-07-30 09:24:24 -05:00
Bart Van Assche	cfa805f6f1	dlm: Avoid LVB truncation For lockspaces with an LVB length above 64 bytes, avoid truncating the LVB while exchanging it with another node in the cluster. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-26 11:38:02 -05:00
David Teigland	696b3d8460	dlm: log an error for unmanaged lockspaces Log an error message if the dlm user daemon exits before all the lockspaces have been removed. Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-25 12:53:20 -05:00
Zhao Hongjiang	ad917e7f82	dlm: config: using strlcpy instead of strncpy For a NUL-terminated string, we always need to set '\0' at the end. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-25 12:53:06 -05:00
Wei Yongjun	06452eb053	dlm: remove duplicated include from lowcomms.c Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-19 09:52:09 -05:00
Mike Christie	86e92ad299	dlm: disable nagle for SCTP For TCP we disable Nagle and I cannot think of why it would be needed for SCTP. When disabled it seems to improve dlm_lock operations like it does for TCP. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-14 13:07:11 -05:00
Mike Christie	5d6898714f	dlm: retry failed SCTP sends Currently, if an SCTP send fails, we lose the data we were trying to send because the writequeue_entry is released when we do the send. When this happens other nodes will then hang waiting for a reply. This adds support for SCTP to retry the send operation. I also removed the retry limit for SCTP use, because we want to make sure we try every path during init time and for longer failures we want to continually retry in case paths come back up while trying other paths. We will do this until userspace tells us to stop. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-14 13:07:11 -05:00
Mike Christie	98e1b60ecc	dlm: try other IPs when sctp init assoc fails Currently, if we cannot create an association to the first IP addr that is added to DLM, the SCTP init assoc code will just retry the same IP. This patch adds a simple failover scheme where we will try one of the addresses that were passed into DLM. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-14 13:07:11 -05:00
Mike Christie	b390ca38d2	dlm: clear correct bit during sctp init failure handling We should be testing and clearing the init pending bit because later when sctp_init_assoc is recalled it will be checking that it is not set and set the bit. We do not want to touch CF_CONNECT_PENDING here because we will queue swork and process_send_sockets will then call the connect_action function. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-14 13:07:11 -05:00
Mike Christie	e1631d0c48	dlm: set sctp assoc id during setup sctp_assoc was not getting set so later lookups failed. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-14 13:07:10 -05:00
Mike Christie	efad7e6b1a	dlm: clear correct init bit during sctp setup We were clearing the base con's init pending flags, but the con for the node was the one with the pending bit set. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Teigland <teigland@redhat.com>	2013-06-14 13:07:10 -05:00
Linus Torvalds	73287a43cc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights (1721 non-merge commits, this has to be a record of some sort): 1) Add 'random' mode to team driver, from Jiri Pirko and Eric Dumazet. 2) Make it so that any driver that supports configuration of multiple MAC addresses can provide the forwarding database add and del calls by providing a default implementation and hooking that up if the driver doesn't have an explicit set of handlers. From Vlad Yasevich. 3) Support GSO segmentation over tunnels and other encapsulating devices such as VXLAN, from Pravin B Shelar. 4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton. 5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita Dukkipati. 6) In the PHY layer, allow supporting wake-on-lan in situations where the PHY registers have to be written for it to be configured. Use it to support wake-on-lan in mv643xx_eth. From Michael Stapelberg. 7) Significantly improve firewire IPV6 support, from YOSHIFUJI Hideaki. 8) Allow multiple packets to be sent in a single transmission using network coding in batman-adv, from Martin Hundebøll. 9) Add support for T5 cxgb4 chips, from Santosh Rastapur. 10) Generalize the VXLAN forwarding tables so that there is more flexibility in configuring various aspects of the endpoints. From David Stevens. 11) Support RSS and TSO in hardware over GRE tunnels in bnx2x driver, from Dmitry Kravkov. 12) Zero copy support in nfnetlink_queue, from Eric Dumazet and Pablo Neira Ayuso. 13) Start adding networking selftests. 14) In situations of overload on the same AF_PACKET fanout socket, or per-cpu packet receive queue, minimize drop by distributing the load to other cpus/fanouts. From Willem de Bruijn and Eric Dumazet. 15) Add support for new payload offset BPF instruction, from Daniel Borkmann. 16) Convert several drivers over to module_platform_driver(), from Sachin Kamat. 
17) Provide a minimal BPF JIT image disassembler userspace tool, from Daniel Borkmann. 18) Rewrite F-RTO implementation in TCP to match the final specification of it in RFC4138 and RFC5682. From Yuchung Cheng. 19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear you like netlink, so I implemented netlink dumping of netlink sockets.") From Andrey Vagin. 20) Remove ugly passing of rtnetlink attributes into rtnl_doit functions, from Thomas Graf. 21) Allow userspace to be able to see if a configuration change occurs in the middle of an address or device list dump, from Nicolas Dichtel. 22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes Frederic Sowa. 23) Increase accuracy of packet length used by packet scheduler, from Jason Wang. 24) Beginning set of changes to make ipv4/ipv6 fragment handling more scalable and less susceptible to overload and locking contention, from Jesper Dangaard Brouer. 25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_() instead. From Hong Zhiguo. 26) Optimize route usage in IPVS by avoiding reference counting where possible, from Julian Anastasov. 27) Convert IPVS schedulers to RCU, also from Julian Anastasov. 28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger Eitzenberger. 29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG, nfnetlink_log, and nfnetlink_queue. From Gao feng. 30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa. 31) Support several new r8169 chips, from Hayes Wang. 32) Support tokenized interface identifiers in ipv6, from Daniel Borkmann. 33) Use usbnet_link_change() helper in USB net driver, from Ming Lei. 34) Add 802.1ad vlan offload support, from Patrick McHardy. 35) Support mmap() based netlink communication, also from Patrick McHardy. 36) Support HW timestamping in mlx4 driver, from Amir Vadai. 37) Rationalize AF_PACKET packet timestamping when transmitting, from Willem de Bruijn and Daniel Borkmann. 
38) Bring parity to what's provided by /proc/net/packet socket dumping and the info provided by netlink socket dumping of AF_PACKET sockets. From Nicolas Dichtel. 39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin Poirier" git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits) filter: fix va_list build error af_unix: fix a fatal race with bit fields bnx2x: Prevent memory leak when cnic is absent bnx2x: correct reading of speed capabilities net: sctp: attribute printl with __printf for gcc fmt checks netlink: kconfig: move mmap i/o into netlink kconfig netpoll: convert mutex into a semaphore netlink: Fix skb ref counting. net_sched: act_ipt forward compat with xtables mlx4_en: fix a build error on 32bit arches Revert "bnx2x: allow nvram test to run when device is down" bridge: avoid OOPS if root port not found drivers: net: cpsw: fix kernel warn on cpsw irq enable sh_eth: use random MAC address if no valid one supplied 3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA) tg3: fix to append hardware time stamping flags unix/stream: fix peeking with an offset larger than data in queue unix/dgram: fix peeking with an offset larger than data in queue unix/dgram: peek beyond 0-sized skbs openvswitch: Remove unneeded ovs_netdev_get_ifindex() ...	2013-05-01 14:08:52 -07:00
Daniel Borkmann	1b86643411	net: sctp: introduce uapi header for sctp This patch introduces an UAPI header for the SCTP protocol, so that we can facilitate the maintenance and development of user land applications or libraries, in particular in terms of header synchronization. To not break compatibility, some fragments from lksctp-tools' netinet/sctp.h have been carefully included, while taking care that neither kernel nor user land breaks, so both compile fine with this change (for lksctp-tools I tested with the old netinet/sctp.h header and with a newly adapted one that includes the uapi sctp header). lksctp-tools smoke test run through successfully as well in both cases. Suggested-by: Neil Horman <nhorman@tuxdriver.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-09 13:19:39 -04:00
David Teigland	9000831839	dlm: avoid unnecessary posix unlock When the kernel clears flocks/plocks during close, it calls posix unlock when there are flocks but no posix locks. Without this patch, that unnecessary posix unlock is passed to userland (dlm_controld), across the cluster, and back to the kernel. This can create a lot of plock activity, even when no posix locks had been used. This patch copies the nfs approach, and skips the full posix unlock if there is no plock found during the vfs unlock phase. Signed-off-by: David Teigland <teigland@redhat.com>	2013-04-08 12:03:15 -05:00
Sasha Levin	b67bfe0d42	hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... 
when != b ( hlist_for_each_entry(a, - b, c, d) S \| hlist_for_each_entry_continue(a, - b, c) S \| hlist_for_each_entry_from(a, - b, c) S \| hlist_for_each_entry_rcu(a, - b, c, d) S \| hlist_for_each_entry_rcu_bh(a, - b, c, d) S \| hlist_for_each_entry_continue_rcu_bh(a, - b, c) S \| for_each_busy_worker(a, c, - b, d) S \| ax25_uid_for_each(a, - b, c) S \| ax25_for_each(a, - b, c) S \| inet_bind_bucket_for_each(a, - b, c) S \| sctp_for_each_hentry(a, - b, c) S \| sk_for_each(a, - b, c) S \| sk_for_each_rcu(a, - b, c) S \| sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S \| sk_for_each_safe(a, - b, c, d) S \| sk_for_each_bound(a, - b, c) S \| hlist_for_each_entry_safe(a, - b, c, d, e) S \| hlist_for_each_entry_continue_rcu(a, - b, c) S \| nr_neigh_for_each(a, - b, c) S \| nr_neigh_for_each_safe(a, - b, c, d) S \| nr_node_for_each(a, - b, c) S \| nr_node_for_each_safe(a, - b, c, d) S \| - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S \| - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S \| for_each_host(a, - b, c) S \| for_each_host_safe(a, - b, c, d) S \| for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:24 -08:00
Tejun Heo	2a86b3e74f	dlm: convert to idr_alloc() Convert to the much saner new idr interface. Error return values from recover_idr_add() mix -1 and -errno. The conversion doesn't change that but it looks iffy. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:19 -08:00
Tejun Heo	a67a380e6f	dlm: don't use idr_remove_all() idr_destroy() can destroy idr by itself and idr_remove_all() is being deprecated. The conversion isn't completely trivial for recover_idr_clear() as it's the only place in kernel which makes legitimate use of idr_remove_all() w/o idr_destroy(). Replace it with idr_remove() call inside idr_for_each_entry() loop. It goes on top so that it matches the operation order in recover_idr_del(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:13 -08:00
Tejun Heo	cda95406c8	dlm: use idr_for_each_entry() in recover_idr_clear() error path Convert recover_idr_clear() to use idr_for_each_entry() instead of idr_for_each(). It's somewhat less efficient this way but it shouldn't matter in an error path. This is to help with deprecation of idr_remove_all(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:13 -08:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Zhao Hongjiang	4173581876	fs: change return values from -EACCES to -EPERM According to SUSv3: [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions. [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource. So -EPERM should be returned if capability checks fails. Strictly speaking this is an API change since the error code user sees is altered. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-26 02:46:14 -05:00
Linus Torvalds	850cb82b75	dlm for 3.9 This includes a single patch to avoid excessive and unnecessary scanning of rsbs to free. Merge tag 'dlm-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm update from David Teigland: "This includes a single patch to avoid excessive and unnecessary scanning of rsbs to free." * tag 'dlm-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: avoid scanning unchanged toss lists	2013-02-21 09:25:23 -08:00
David Teigland	d4b0bcf32b	dlm: check the write size from user Return EINVAL from write if the size is larger than allowed. Do this before allocating kernel memory for the bogus size, which could lead to OOM. Reported-by: Sasha Levin <levinsasha928@gmail.com> Tested-by: Jana Saout <jana@saout.de> Signed-off-by: David Teigland <teigland@redhat.com>	2013-02-04 15:31:22 -06:00
David Teigland	f117228346	dlm: avoid scanning unchanged toss lists Keep track of whether a toss list contains any shrinkable rsbs. If not, dlm_scand can avoid scanning the list for rsbs to shrink. Unnecessary scanning can otherwise waste a lot of time because the toss lists can contain a large number of rsbs that are non-shrinkable (directory records). Signed-off-by: David Teigland <teigland@redhat.com>	2013-01-07 12:02:49 -06:00
David Teigland	da8c66638a	dlm: fix lvb invalidation conditions When a node is removed that held a PW/EX lock, the existing master node should invalidate the lvb on the resource due to the purged lock. Previously, the existing master node was invalidating the lvb if it found only NL/CR locks on the resource during recovery for the removed node. This could lead to cases where it invalidated the lvb and shouldn't have, or cases where it should have invalidated and didn't. When recovery selects a new master node for a resource, and that new master finds only NL/CR locks on the resource after lock recovery, it should invalidate the lvb. This case was handled correctly (but was incorrectly applied to the existing master case also.) When a process exits while holding a PW/EX lock, the lvb on the resource should be invalidated. This was not happening. The lvb contents and VALNOTVALID flag should be recovered before granting locks in recovery so that the recovered lvb state is provided in the callback. The lvb was being recovered after the lock was granted. Signed-off-by: David Teigland <teigland@redhat.com>	2012-11-16 11:20:42 -06:00
Kees Cook	a3de56bdb9	fs/dlm: remove CONFIG_EXPERIMENTAL This config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it. CC: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Teigland <teigland@redhat.com>	2012-11-01 15:27:24 -05:00
Wei Yongjun	eeee2b5fe1	dlm: remove unused variable in *dlm_lowcomms_get_buffer() The variable users is initialized but never used otherwise, so remove the unused variable. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David Teigland <teigland@redhat.com>	2012-11-01 15:27:13 -05:00
Linus Torvalds	aecdc33e11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increase the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. 
hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...	2012-10-02 13:38:27 -07:00
Eric W. Biederman	15e473046c	netlink: Rename pid to portid to avoid confusion It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 15:30:41 -04:00
Sasha Levin	2b75bc9121	dlm: check the maximum size of a request from user device_write only checks whether the request size is big enough, but it doesn't check if the size is too big. At that point, it also tries to allocate as much memory as the user has requested even if it's too much. This can lead to OOM killer kicking in, or memory corruption if (count + 1) overflows. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-09-10 09:50:27 -05:00
Ying Xue	9c5bef5849	dlm: cleanup send_to_sock routine Remove unnecessary code from send_to_sock routine. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-08-13 10:03:18 -05:00
Ying Xue	4dd40f0cd9	dlm: convert add_sock routine return value type to void Since add_sock() always returns a success code - 0, its return value type should be changed from integer to void. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-08-10 09:10:10 -05:00
Xue Ying	b4c798cf69	dlm: remove redundant variable assignments Once the tcp_create_listen_sock() is returned successfully, we will invoke add_sock() immediately. In add_sock(), the 'con' variable is assigned to 'sk_user_data', meanwhile, the 'sock' is also set to 'con->sock'. So it's unnecessary to do the same thing in tcp_create_listen_sock(). Signed-off-by: Xue Ying <ying.xue@windriver.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-08-10 09:10:10 -05:00
David Teigland	475f230c60	dlm: fix unlock balance warnings The in_recovery rw_semaphore has always been acquired and released by different threads by design. To work around the "BUG: bad unlock balance detected!" messages, adjust things so the dlm_recoverd thread always does both down_write and up_write. Signed-off-by: David Teigland <teigland@redhat.com>	2012-08-08 11:33:49 -05:00
David Teigland	6ad2291624	dlm: fix uninitialized spinlock Use DEFINE_SPINLOCK for global dlm_cb_seq_spin. Signed-off-by: David Teigland <teigland@redhat.com>	2012-08-08 11:33:43 -05:00
David Teigland	36b71a8bfb	dlm: fix deadlock between dlm_send and dlm_controld A deadlock sometimes occurs between dlm_controld closing a lowcomms connection through configfs and dlm_send looking up the address for a new connection in configfs. dlm_controld does a configfs rmdir which calls dlm_lowcomms_close which waits for dlm_send to cancel work on the workqueues. The dlm_send workqueue thread has called tcp_connect_to_sock which calls dlm_nodeid_to_addr which does a configfs lookup and blocks on a lock held by dlm_controld in the rmdir path. The solution here is to save the node addresses within the lowcomms code so that the lowcomms workqueue does not need to step through configfs to get a node address. dlm_controld: wait_for_completion+0x1d/0x20 __cancel_work_timer+0x1b3/0x1e0 cancel_work_sync+0x10/0x20 dlm_lowcomms_close+0x4c/0xb0 [dlm] drop_comm+0x22/0x60 [dlm] client_drop_item+0x26/0x50 [configfs] configfs_rmdir+0x180/0x230 [configfs] vfs_rmdir+0xbd/0xf0 do_rmdir+0x103/0x120 sys_rmdir+0x16/0x20 dlm_send: mutex_lock+0x2b/0x50 get_comm+0x34/0x140 [dlm] dlm_nodeid_to_addr+0x18/0xd0 [dlm] tcp_connect_to_sock+0xf4/0x2d0 [dlm] process_send_sockets+0x1d2/0x260 [dlm] worker_thread+0x170/0x2a0 Signed-off-by: David Teigland <teigland@redhat.com>	2012-08-08 11:33:35 -05:00
David Teigland	96006ea6d4	dlm: fix missing dir remove I don't know exactly how, but in some cases, a dir record is not removed, or a new one is created when it shouldn't be. The result is that the dir node lookup returns a master node where the rsb does not exist. In this case, the master node will repeatedly return -EBADR for requests, and the lock requests will be stuck. Until all possible ways for this to happen can be eliminated, a simple and effective way to recover from this situation is for the supposed master node to send a standard remove message to the dir node when it receives a request for a resource it has no rsb for. Signed-off-by: David Teigland <teigland@redhat.com>	2012-07-16 14:24:43 -05:00
David Teigland	c503a62103	dlm: fix conversion deadlock from recovery The process of rebuilding locks on a new master during recovery could re-order the locks on the convert queue, creating an "in place" conversion deadlock that would not be resolved. Fix this by not considering queue order when granting conversions after recovery. Signed-off-by: David Teigland <teigland@redhat.com>	2012-07-16 14:18:22 -05:00
David Teigland	6d768177c2	dlm: use wait_event_timeout Use wait_event_timeout to avoid using a timer directly. Signed-off-by: David Teigland <teigland@redhat.com>	2012-07-16 14:18:12 -05:00
David Teigland	05c32f47bf	dlm: fix race between remove and lookup It was possible for a remove message on an old rsb to be sent after a lookup message on a new rsb, where the rsbs were for the same resource name. This could lead to a missing directory entry for the new rsb. It is fixed by keeping a copy of the resource name being removed until after the remove has been sent. A lookup checks if this in-progress remove matches the name it is looking up. Signed-off-by: David Teigland <teigland@redhat.com>	2012-07-16 14:18:01 -05:00
David Teigland	1d7c484eeb	dlm: use idr instead of list for recovered rsbs When a large number of resources are being recovered, a linear search of the recover_list takes a long time. Use an idr in place of a list. Signed-off-by: David Teigland <teigland@redhat.com>	2012-07-16 14:17:52 -05:00
David Teigland	c04fecb4d9	dlm: use rsbtbl as resource directory Remove the dir hash table (dirtbl), and use the rsb hash table (rsbtbl) as the resource directory. It has always been an unnecessary duplication of information. This improves efficiency by using a single rsbtbl lookup in many cases where both rsbtbl and dirtbl lookups were needed previously. This eliminates the need to handle cases of rsbtbl and dirtbl being out of sync. In many cases there will be memory savings because the dir hash table no longer exists. Signed-off-by: David Teigland <teigland@redhat.com>	2012-07-16 14:16:19 -05:00
Dan Carpenter	75af271ed5	dlm: NULL dereference on failure in kmem_cache_create() We aren't allowed to pass NULL pointers to kmem_cache_destroy() so if both allocations fail, it leads to a NULL dereference. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-05-15 10:39:28 -05:00
David Teigland	4875647a08	dlm: fixes for nodir mode The "nodir" mode (statically assign master nodes instead of using the resource directory) has always been highly experimental, and never seriously used. This commit fixes a number of problems, making nodir much more usable. - Major change to recovery: recover all locks and restart all in-progress operations after recovery. In some cases it's not possible to know which in-progress locks to recover, so recover all. (Most require recovery in nodir mode anyway since rehashing changes most master nodes.) - Change the way nodir mode is enabled, from a command line mount arg passed through gfs2, into a sysfs file managed by dlm_controld, consistent with the other config settings. - Allow recovering MSTCPY locks on an rsb that has not yet been turned into a master copy. - Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages from a previous, aborted recovery cycle. Base this on the local recovery status not being in the state where any nodes should be sending LOCK messages for the current recovery cycle. - Hold rsb lock around dlm_purge_mstcpy_locks() because it may run concurrently with dlm_recover_master_copy(). - Maintain highbast on process-copy lkb's (in addition to the master as is usual), because the lkb can switch back and forth between being a master and being a process copy as the master node changes in recovery. - When recovering MSTCPY locks, flag rsb's that have non-empty convert or waiting queues for granting at the end of recovery. (Rename flag from LOCKS_PURGED to RECOVER_GRANT and similar for the recovery function, because it's not only resources with purged locks that need a grant attempt.) - Replace a couple of unnecessary assertion panics with error messages. Signed-off-by: David Teigland <teigland@redhat.com>	2012-05-02 14:15:27 -05:00
David Teigland	6d40c4a708	dlm: improve error and debug messages Change some existing error/debug messages to collect more useful information, and add some new error/debug messages to address recently found problems. Signed-off-by: David Teigland <teigland@redhat.com>	2012-04-26 15:41:46 -05:00
David Teigland	57638bf3aa	dlm: avoid unnecessary search in search_rsb If the rsb is found in the "keep" tree, but is not the right type (i.e. not MASTER), we can return immediately with the result. There's no point in going on to search the "toss" list as if we hadn't found it. Signed-off-by: David Teigland <teigland@redhat.com>	2012-04-26 15:37:56 -05:00
David Teigland	d6e24788d2	dlm: limit rcom debug messages Unify the checking for both types of ignored rcom messages, and replace the two log_debug statements with a single, rate limited debug message. Signed-off-by: David Teigland <teigland@redhat.com>	2012-04-26 15:37:37 -05:00
David Teigland	13ef11110f	dlm: fix waiter recovery An outstanding remote operation (an lkb on the "waiter" list) could sometimes miss being resent during recovery. The decision was based on the lkb_nodeid field, which could have changed during an earlier aborted recovery, so it no longer represents the actual remote destination. The lkb_wait_nodeid is always the actual remote node, so it is the best value to use. Signed-off-by: David Teigland <teigland@redhat.com>	2012-04-26 15:36:04 -05:00
David Teigland	513ef596d4	dlm: prevent connections during shutdown During lowcomms shutdown, a new connection could possibly be created, and attempt to use a workqueue that's been destroyed. Similarly, during startup, a new connection could attempt to use a workqueue that's not been set up yet. Add a global variable to indicate when new connections are allowed. Based on patch by: Christine Caulfield <ccaulfie@redhat.com> Reported-by: dann frazier <dann.frazier@canonical.com> Reviewed-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-04-26 15:35:38 -05:00
Linus Torvalds	721b024bd4	dlm fixes for 3.4 This includes one short patch fixing the behavior of the QUECVT flag, which the gfs2 folks are waiting on. Merge tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm fixes from David Teigland: "This includes one short patch fixing the behavior of the QUECVT flag, which the gfs2 folks are waiting on." * tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: fix QUECVT when convert queue is empty	2012-04-23 18:22:42 -07:00
David Teigland	53ad1c980d	dlm: fix QUECVT when convert queue is empty The QUECVT flag should not prevent conversions from being granted immediately when the convert queue is empty. Signed-off-by: David Teigland <teigland@redhat.com>	2012-04-23 11:30:59 -05:00
Stephen Boyd	234e340582	simple_open: automatically convert to simple_open() Many users of debugfs copy the implementation of default_open() when they want to support a custom read/write function op. This leads to a proliferation of the default_open() implementation across the entire tree. Now that the common implementation has been consolidated into libfs we can replace all the users of this function with simple_open(). This replacement was done with the following semantic patch: <smpl> @ open @ identifier open_f != simple_open; identifier i, f; @@ -int open_f(struct inode *i, struct file *f) -{ ( -if (i->i_private) -f->private_data = i->i_private; | -f->private_data = i->i_private; ) -return 0; -} @ has_open depends on open @ identifier fops; identifier open.open_f; @@ struct file_operations fops = { ... -.open = open_f, +.open = simple_open, ... }; </smpl> [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-05 15:25:50 -07:00
Linus Torvalds	30d73f3752	dlm for 3.4 This set includes one trivial fix, and one simple recovery speed up. Directory recovery can use the standard hash table to find resources rather than always searching the linear recovery list. Merge tag 'dlm-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates for 3.4 from David Teigland: "This set includes one trivial fix, and one simple recovery speed up. Directory recovery can use the standard hash table to find resources rather than always searching the linear recovery list." * tag 'dlm-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: last element of dlm_local_addr[] never used dlm: fix slow rsb search in dir recovery	2012-03-21 13:54:22 -07:00
David Teigland	1b189b8889	dlm: last element of dlm_local_addr[] never used The last element of dlm_local_addr[DLM_MAX_ADDR_COUNT] was not used because the loop ended at COUNT - 1. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Teigland <teigland@redhat.com>	2012-03-21 09:18:34 -05:00
Benjamin Poirier	2f2d76cc3e	dlm: Do not allocate a fd for peeloff avoids allocating a fd that a) propagates to every kernel thread and usermodehelper b) is not properly released. References: http://article.gmane.org/gmane.linux.network.drbd/22529 Signed-off-by: Benjamin Poirier <bpoirier@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-08 13:52:09 -08:00
David Teigland	7210cb7a72	dlm: fix slow rsb search in dir recovery The function used to find an rsb during directory recovery was searching the single linear list of rsb's. This wasted a lot of time compared to using the standard hash table to find the rsb. Signed-off-by: David Teigland <teigland@redhat.com>	2012-03-08 14:46:30 -06:00
Linus Torvalds	49d41bae46	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: add recovery callbacks dlm: add node slots and generation dlm: move recovery barrier calls dlm: convert rsb list to rb_tree	2012-01-10 14:55:55 -08:00
David Teigland	60f98d1839	dlm: add recovery callbacks These new callbacks notify the dlm user about lock recovery. GFS2, and possibly others, need to be aware of when the dlm will be doing lock recovery for a failed lockspace member. In the past, this coordination has been done between dlm and file system daemons in userspace, which then direct their kernel counterparts. These callbacks allow the same coordination directly, and more simply. Signed-off-by: David Teigland <teigland@redhat.com>	2012-01-04 08:56:31 -06:00
David Teigland	757a427196	dlm: add node slots and generation Slot numbers are assigned to nodes when they join the lockspace. The slot number chosen is the minimum unused value starting at 1. Once a node is assigned a slot, that slot number will not change while the node remains a lockspace member. If the node leaves and rejoins it can be assigned a new slot number. A new generation number is also added to a lockspace. It is set and incremented during each recovery along with the slot collection/assignment. The slot numbers will be passed to gfs2 which will use them as journal id's. Signed-off-by: David Teigland <teigland@redhat.com>	2012-01-04 08:55:57 -06:00
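The slot rule described in 757a427196 ("the minimum unused value starting at 1") can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: `pick_min_unused_slot()` and the `used[]` array layout are hypothetical names invented here, not the dlm's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not kernel code): used[i] records whether slot
 * number i + 1 is already assigned to a lockspace member.
 * Returns the smallest unused slot number, starting at 1, or 0 when all
 * max_slots slots are taken.
 */
int pick_min_unused_slot(const bool *used, size_t max_slots)
{
    assert(used != NULL || max_slots == 0);

    for (size_t i = 0; i < max_slots; i++) {
        if (!used[i])
            return (int)(i + 1);   /* slot numbers are 1-based */
    }
    return 0;                      /* no free slot */
}
```

A node that leaves frees its slot, so a later joiner (or the same node rejoining) may be handed that number again, which matches the commit's note that a rejoining node can get a new slot.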
David Teigland	f95a34c665	dlm: move recovery barrier calls Put all the calls to recovery barriers in the same function to clarify where they each happen. Should not change any behavior. Also modify some recovery debug lines to make them consistent. Signed-off-by: David Teigland <teigland@redhat.com>	2012-01-04 08:53:27 -06:00
Alexey Dobriyan	4e3fd7a06d	net: remove ipv6_addr_copy() C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-22 16:43:32 -05:00
Bob Peterson	9beb3bf5a9	dlm: convert rsb list to rb_tree Change the linked lists to rb_tree's in the rsb hash table to speed up searches. Slow rsb searches were having a large impact on gfs2 performance due to the large number of dlm locks gfs2 uses. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2011-11-18 10:20:15 -06:00
Linus Torvalds	2dad3206db	Merge branch 'for-3.1' of git://linux-nfs.org/~bfields/linux * 'for-3.1' of git://linux-nfs.org/~bfields/linux: nfsd: don't break lease on CLAIM_DELEGATE_CUR locks: rename lock-manager ops nfsd4: update nfsv4.1 implementation notes nfsd: turn on reply cache for NFSv4 nfsd4: call nfsd4_release_compoundargs from pc_release nfsd41: Deny new lock before RECLAIM_COMPLETE done fs: locks: remove init_once nfsd41: check the size of request nfsd41: error out when client sets maxreq_sz or maxresp_sz too small nfsd4: fix file leak on open_downgrade nfsd4: remember to put RW access on stateid destruction NFSD: Added TEST_STATEID operation NFSD: added FREE_STATEID operation svcrpc: fix list-corrupting race on nfsd shutdown rpc: allow autoloading of gss mechanisms svcauth_unix.c: quiet sparse noise svcsock.c: include sunrpc.h to quiet sparse noise nfsd: Remove deprecated nfsctl system call and related code. NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND Fix up trivial conflicts in Documentation/feature-removal-schedule.txt	2011-07-25 22:49:19 -07:00
J. Bruce Fields	8fb47a4fbf	locks: rename lock-manager ops Both the filesystem and the lock manager can associate operations with a lock. Confusingly, one of them (fl_release_private) actually has the same name in both operation structures. It would save some confusion to give the lock-manager ops different names. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-20 20:23:19 -04:00
David Teigland	10d1459faf	dlm: don't limit active work items Allow multiple workqueue items (locks with callbacks) to be processed concurrently. There should be no reason not to take advantage of this workqueue feature. Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-19 14:22:32 -05:00
David Teigland	23e8e1aaac	dlm: use workqueue for callbacks Instead of creating our own kthread (dlm_astd) to deliver callbacks for all lockspaces, use a per-lockspace workqueue to deliver the callbacks. This eliminates complications and slowdowns from many lockspaces sharing the same thread. Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-15 12:30:43 -05:00
David Teigland	883ba74f43	dlm: remove deadlock debug print gfs2 recently began using this feature heavily, creating more debug output than we want to see. Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-14 12:31:49 -05:00
David Teigland	3881ac04eb	dlm: improve rsb searches By pre-allocating rsb structs before searching the hash table, they can be inserted immediately. This avoids always having to repeat the search when adding the struct to hash list. This also adds space to the rsb struct for a max resource name, so an rsb allocation can be used by any request. The constant size also allows us to finally use a slab for the rsb structs. Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-12 16:02:09 -05:00
David Teigland	3d6aa675ff	dlm: keep lkbs in idr This is simpler and quicker than the hash table, and avoids needing to search the hash list for every new lkid to check if it's used. Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-11 08:43:45 -05:00
David Teigland	a22ca48068	dlm: fix kmalloc args The gfp and size args were switched. Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-11 08:40:53 -05:00
Jesper Juhl	5d70828a77	dlm: don't do pointless NULL check, use kzalloc and fix order of arguments In fs/dlm/lock.c in the dlm_scan_waiters() function there are 3 small issues: 1) There's no need to test the return value of the allocation and do a memset if it succeeds. Just use kzalloc() to obtain zeroed memory. 2) Since kfree() handles NULL pointers gracefully, the test of 'warned' against NULL before the kfree() after the loop is completely pointless. Remove it. 3) The arguments to kmalloc() (now kzalloc()) were swapped. Thanks to Dr. David Alan Gilbert for pointing this out. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-11 08:39:42 -05:00
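Points 1 and 2 of 5d70828a77 have a close userspace analogue, sketched below. It assumes `calloc()`/`free()` as stand-ins for `kzalloc()`/`kfree()`; the function name `warned_table_is_zeroed()` is made up for this example and the code is not taken from fs/dlm/lock.c.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace analogue (assumption: calloc/free stand in for kzalloc/kfree).
 * Allocates a zeroed table of num_nodes ints, checks that every entry is 0,
 * and frees it. Returns 1 if the table was all zero (or the allocation
 * failed and there was nothing to check), 0 otherwise.
 */
int warned_table_is_zeroed(size_t num_nodes)
{
    int *warned;
    int ok = 1;

    assert(num_nodes > 0);  /* calloc(0, ...) is implementation-defined */

    /* One call yields zeroed memory: no separate NULL test plus memset. */
    warned = calloc(num_nodes, sizeof(*warned));
    if (warned) {
        for (size_t i = 0; i < num_nodes; i++) {
            if (warned[i] != 0)
                ok = 0;
        }
    }

    /* free(NULL) is a no-op, so no "if (warned)" guard is needed here. */
    free(warned);
    return ok;
}
```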
Masatake YAMATO	bcaadf5c1a	dlm: dump address of unknown node When the dlm fails to make a network connection to another node, include the address of the node in the error message. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-06 16:37:23 -05:00
Bryn M. Reeves	c282af4990	dlm: use vmalloc for hash tables Allocate dlm hash tables in the vmalloc area to allow a greater maximum size without restructuring of the hash table code. Signed-off-by: Bryn M. Reeves <bmr@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2011-07-01 15:49:23 -05:00
Masatake YAMATO	55b3286d3d	dlm: show addresses in configfs Display all addresses the dlm is using for the local node from the configfs file config/dlm/<cluster>/comms/<comm>/addr_list Also make the addr file write only. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2011-06-30 14:45:28 -05:00
Linus Torvalds	b7c2f03628	Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: gfs2: Drop __TIME__ usage isdn/diva: Drop __TIME__ usage atm: Drop __TIME__ usage dlm: Drop __TIME__ usage wan/pc300: Drop __TIME__ usage parport: Drop __TIME__ usage hdlcdrv: Drop __TIME__ usage baycom: Drop __TIME__ usage pmcraid: Drop __DATE__ usage edac: Drop __DATE__ usage rio: Drop __DATE__ usage scsi/wd33c93: Drop __TIME__ usage scsi/in2000: Drop __TIME__ usage aacraid: Drop __TIME__ usage media/cx231xx: Drop __TIME__ usage media/radio-maxiradio: Drop __TIME__ usage nozomi: Drop __TIME__ usage cyclades: Drop __TIME__ usage	2011-05-26 13:19:00 -07:00
Michal Marek	75ce481e15	dlm: Drop __TIME__ usage The kernel already prints its build timestamp during boot, no need to repeat it in random drivers and produce different object files each time. Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Cc: cluster-devel@redhat.com Signed-off-by: Michal Marek <mmarek@suse.cz>	2011-05-26 09:46:17 +02:00
Linus Torvalds	df3256f9ab	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: make plock operation killable dlm: remove shared message stub for recovery dlm: delayed reply message warning dlm: Remove superfluous call to recalc_sigpending()	2011-05-24 15:04:00 -07:00
David Teigland	901025d2f3	dlm: make plock operation killable Allow processes blocked on plock requests to be interrupted when they are killed. This leaves the problem of cleaning up the lock state in userspace. This has three parts: 1. Add a flag to unlock operations sent to userspace indicating the file is being closed. Userspace will then look for and clear any waiting plock operations that were abandoned by an interrupted process. 2. Queue an unlock-close operation (like in 1) to clean up userspace from an interrupted plock request. This is needed because the vfs will not send a cleanup-unlock if it sees no locks on the file, which it won't if the interrupted operation was the only one. 3. Do not use replies from userspace for unlock-close operations because they are unnecessary (they are just cleaning up for the process which did not make an unlock call). This also simplifies the new unlock-close generated from point 2. Signed-off-by: David Teigland <teigland@redhat.com>	2011-05-23 10:47:06 -05:00
David Teigland	2a7ce0edd6	dlm: remove shared message stub for recovery kmalloc a stub message struct during recovery instead of sharing the struct in the lockspace. This leaves the lockspace stub_ms only for faking downconvert replies, where it is never modified and sharing is not a problem. Also improve the debug messages in the same recovery function. Signed-off-by: David Teigland <teigland@redhat.com>	2011-04-05 10:54:47 -05:00
David Teigland	c6ff669bac	dlm: delayed reply message warning Add an option (disabled by default) to print a warning message when a lock has been waiting a configurable amount of time for a reply message from another node. This is mainly for debugging. Signed-off-by: David Teigland <teigland@redhat.com>	2011-04-01 14:19:06 -05:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Matt Fleming	4bcad6c1ef	dlm: Remove superfluous call to recalc_sigpending() recalc_sigpending() is called within sigprocmask(), so there is no need to call it again after sigprocmask() has returned. Signed-off-by: Matt Fleming <matt.fleming@linux.intel.com> Signed-off-by: David Teigland <teigland@redhat.com>	2011-03-28 10:20:17 -05:00
David Teigland	e43f055a95	dlm: use alloc_workqueue function Replaces deprecated create_singlethread_workqueue(). Signed-off-by: David Teigland <teigland@redhat.com>	2011-03-10 13:22:34 -06:00
David Teigland	e3853a90e2	dlm: increase default hash table sizes Make all three hash tables a consistent size of 1024 rather than 1024, 512, 256. All three tables, for resources, locks, and lock dir entries, will generally be filled to the same order of magnitude. Signed-off-by: David Teigland <teigland@redhat.com>	2011-03-10 13:08:22 -06:00
David Teigland	8304d6f24c	dlm: record full callback state Change how callbacks are recorded for locks. Previously, information about multiple callbacks was combined into a couple of variables that indicated what the end result should be. In some situations, we could not tell from this combined state what the exact sequence of callbacks were, and would end up either delivering the callbacks in the wrong order, or suppress redundant callbacks incorrectly. This new approach records all the data for each callback, leaving no uncertainty about what needs to be delivered. Signed-off-by: David Teigland <teigland@redhat.com>	2011-03-10 10:40:00 -06:00
David Teigland	6b155c8fd4	dlm: use single thread workqueues The recent commit to use cmwq for send and recv threads `dcce240ead` introduced problems, apparently due to multiple workqueue threads. Single threads make the problems go away, so return to that until we fully understand the concurrency issues with multiple threads. Signed-off-by: David Teigland <teigland@redhat.com>	2011-02-11 16:50:47 -06:00
Nicholas Bellinger	86c747d2a4	dlm: Make DLM depend on CONFIGFS_FS This patch fixes the following kconfig error after changing CONFIGFS_FS -> select SYSFS: fs/sysfs/Kconfig:1:error: recursive dependency detected! fs/sysfs/Kconfig:1: symbol SYSFS is selected by CONFIGFS_FS fs/configfs/Kconfig:1: symbol CONFIGFS_FS is selected by DLM fs/dlm/Kconfig:1: symbol DLM depends on SYSFS Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Cc: Joel Becker <jlbec@evilplan.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: James Bottomley <James.Bottomley@suse.de>	2011-01-16 21:22:37 +00:00
Namhyung Kim	b9d4105279	dlm: sanitize work_start() in lowcomms.c The create_workqueue() returns NULL if failed rather than ERR_PTR(). Fix error checking and remove unnecessary variable 'error'. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David Teigland <teigland@redhat.com>	2010-12-13 13:42:24 -06:00
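The distinction behind b9d4105279 can be sketched in userspace C. The `err_ptr()`, `is_err()` and `alloc_failed()` helpers below are simplified stand-ins written for this illustration, modeled on the kernel's `ERR_PTR()`/`IS_ERR()` convention of encoding error codes in the top 4095 addresses; they are not the kernel's definitions. The point is that a NULL return does not look like an error to an `IS_ERR()`-style test, so an API that fails with NULL needs a plain NULL check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095  /* bound used by the error-pointer convention */

/* Illustrative stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static inline void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Illustrative stand-in for IS_ERR(): true only for the top MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* The check that fits an API which reports failure by returning NULL. */
static inline int alloc_failed(const void *ptr)
{
    return ptr == NULL;
}
```

With these helpers, `is_err(NULL)` is false while `alloc_failed(NULL)` is true, which is why an `IS_ERR()`-style test on a NULL-returning function silently misses the failure.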
Bob Peterson	f92c8dd7a0	dlm: reduce cond_resched during send Calling cond_resched() after every send can unnecessarily degrade performance. Go back to an old method of scheduling after 25 messages. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2010-11-12 11:15:20 -06:00
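The batching idea in f92c8dd7a0 (yield after 25 messages instead of after every send) can be sketched as a counter that resets on each yield. This is a userspace illustration only: `count_yields()` and `YIELD_INTERVAL` are hypothetical names, and the place where the kernel code would call `cond_resched()` is marked with a comment rather than reproduced.

```c
#include <assert.h>

#define YIELD_INTERVAL 25  /* interval named in the commit message */

/*
 * Hypothetical stand-in for the send loop: returns how many times the
 * loop would have yielded the CPU while sending nmsgs messages.
 */
int count_yields(int nmsgs)
{
    int count = 0;   /* messages sent since the last yield */
    int yields = 0;

    assert(nmsgs >= 0);

    for (int i = 0; i < nmsgs; i++) {
        /* ... send one message here ... */
        if (++count >= YIELD_INTERVAL) {
            yields++;    /* the kernel code would call cond_resched() here */
            count = 0;
        }
    }
    return yields;
}
```

Under this scheme a burst of 60 messages yields twice rather than 60 times, which is the performance difference the commit message is pointing at.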
David Teigland	cb2d45da81	dlm: use TCP_NODELAY Nagling doesn't help and can sometimes hurt dlm comms. Signed-off-by: David Teigland <teigland@redhat.com>	2010-11-12 11:12:55 -06:00
Steven Whitehouse	dcce240ead	dlm: Use cmwq for send and receive workqueues So far as I can tell, there is no reason to use a single-threaded send workqueue for dlm, since it may need to send to several sockets concurrently. Both workqueues are set to WQ_MEM_RECLAIM to avoid any possible deadlocks, WQ_HIGHPRI since locking traffic is highly latency sensitive (and to avoid a priority inversion wrt GFS2's glock_workqueue) and WQ_FREEZABLE just in case someone needs to do that (even though with current cluster infrastructure, it doesn't make sense as the node will most likely land up ejected from the cluster) in the future. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David Teigland <teigland@redhat.com>	2010-11-12 11:08:03 -06:00
David Miller	b36930dd50	dlm: Handle application limited situations properly. In the normal regime where an application uses non-blocking I/O writes on a socket, they will handle -EAGAIN and use poll() to wait for send space. They don't actually sleep on the socket I/O write. But kernel level RPC layers that do socket I/O operations directly and key off of -EAGAIN on the write() to "try again later" don't use poll(), they instead have their own sleeping mechanism and rely upon ->sk_write_space() to trigger the wakeup. So they do effectively sleep on the write(), but this mechanism alone does not let the socket layers know what's going on. Therefore they must emulate what would have happened, otherwise TCP cannot possibly see that the connection is application window size limited. Handle this, therefore, like SUNRPC by setting SOCK_NOSPACE and bumping the ->sk_write_count as needed when we hit the send buffer limits. This should make TCP send buffer size auto-tuning and the ->sk_write_space() callback invocations actually happen. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: David Teigland <teigland@redhat.com>	2010-11-11 13:05:12 -06:00
Linus Torvalds	2c15bd00a5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: Fix dlm lock status block comment in dlm.h dlm: Don't send callback to node making lock request when "try 1cb" fails	2010-10-22 17:33:16 -07:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) 
\| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... 
+.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... 
+.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
Steven Whitehouse	314dd2a053	dlm: Don't send callback to node making lock request when "try 1cb" fails When converting a lock, an lkb is in the granted state and also being used to request a new state. In the case that the conversion was a "try 1cb" type which has failed, and if the new state was incompatible with the old state, a callback was being generated to the requesting node. This is incorrect as callbacks should only be sent to all the other nodes holding blocking locks. The requesting node should receive the normal (failed) response to its "try 1cb" conversion request only. This was discovered while debugging a performance problem on GFS2, however this fix also speeds up GFS as well. In the GFS2 case the performance gain is over 10x for cases of write activity to an inode whose glock is cached on another, idle (wrt that glock) node. (comment added, dct) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Tested-by: Abhijith Das <adas@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2010-09-03 10:10:47 -05:00
Julia Lawall	f70cb33b9c	fs/dlm: Drop unnecessary null test hlist_for_each_entry binds its first argument to a non-null value, and thus any null test on the value of that argument is superfluous. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ iterator I; expression x,E,E1,E2; statement S,S1,S2; @@ I(x,...) { <... - (x != NULL) && E ...> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David Teigland <teigland@redhat.com>	2010-08-05 14:23:45 -05:00
Changli Gao	a4d935bd97	dlm: use genl_register_family_with_ops() Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2010-08-05 14:22:01 -05:00
David Teigland	89d799d008	dlm: fix ast ordering for user locks Commit `7fe2b3190b` fixed possible misordering of completion asts (casts) and blocking asts (basts) for kernel locks. This patch does the same for locks taken by user space applications. Signed-off-by: David Teigland <teigland@redhat.com>	2010-04-30 14:52:51 -05:00
Dan Carpenter	99fb19d49e	dlm: cleanup remove unused code Smatch complains because "lkb" is never NULL. Looking at it, the original code actually adds the new element to the end of the list fine, so we can just get rid of the if condition. This code is four years old and no one has complained so it must work. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>	2010-04-30 14:52:28 -05:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. 
This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Linus Torvalds	c32da02342	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits) doc: fix typo in comment explaining rb_tree usage Remove fs/ntfs/ChangeLog doc: fix console doc typo doc: cpuset: Update the cpuset flag file Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed Remove drivers/parport/ChangeLog Remove drivers/char/ChangeLog doc: typo - Table 1-2 should refer to "status", not "statm" tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h devres/irq: Fix devm_irq_match comment Remove reference to kthread_create_on_cpu tree-wide: Assorted spelling fixes tree-wide: fix 'lenght' typo in comments and code drm/kms: fix spelling in error message doc: capitalization and other minor fixes in pnp doc devres: typo fix s/dev/devm/ Remove redundant trailing semicolons from macros fix typo "definetly" -> "definitely" in comment tree-wide: s/widht/width/g typo in comments ... Fix trivial conflict in Documentation/laptops/00-INDEX	2010-03-12 16:04:50 -08:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
Emese Revfy	52cf25d0ab	Driver core: Constify struct sysfs_ops in struct kobj_type Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:49 -08:00
David Teigland	b6fa8796b2	dlm: use bastmode in debugfs output The bast mode that appears in the debugfs output should be useful on both master and process nodes. lkb_highbast is currently printed, and is only useful on the master node. lkb_bastmode is only useful on the process node. This patch sets lkb_bastmode on the master node as well, and uses that value in the debugfs print. Signed-off-by: David Teigland <teigland@redhat.com>	2010-02-26 12:15:54 -06:00
Steven Whitehouse	b4a5d4bc37	dlm: Send lockspace name with uevents Although it is possible to get this information from the path, it's much easier to provide the lockspace as a separate env variable. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2010-02-26 12:14:25 -06:00
David Teigland	cf6620acc0	dlm: send reply before bast When the lock master processes a successful operation (request, convert, cancel, or unlock), it will process the effects of the change before sending the reply for the operation. The "effects" of the operation are: - blocking callbacks (basts) for any newly granted locks - waiting or converting locks that can now be granted The cast is queued on the local node when the reply from the lock master is received. This means that a lock holder can receive a bast for a lock mode that it doesn't yet know has been granted. Signed-off-by: David Teigland <teigland@redhat.com>	2010-02-26 11:57:37 -06:00
David Teigland	7fe2b3190b	dlm: fix ordering of bast and cast When both blocking and completion callbacks are queued for lock, the dlm would always deliver the completion callback (cast) first. In some cases the blocking callback (bast) is queued before the cast, though, and should be delivered first. This patch keeps track of the order in which they were queued and delivers them in that order. This patch also keeps track of the granted mode in the last cast and eliminates the following bast if the bast mode is compatible with the preceding cast mode. This happens when a remotely mastered lock is demoted, e.g. EX->NL, in which case the local node queues a cast immediately after sending the demote message. In this way a cast can be queued for a mode, e.g. NL, that makes an in-transit bast extraneous. Signed-off-by: David Teigland <teigland@redhat.com>	2010-02-24 11:46:53 -06:00
Adam Buchbinder	c41b20e721	Fix misspellings of "truly" in comments. Some comments misspell "truly"; this fixes them. No code changes. Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-04 11:55:45 +01:00
Linus Torvalds	02412f49f6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: always use GFP_NOFS	2009-12-10 09:33:59 -08:00
André Goddard Rosa	af901ca181	tree-wide: fix assorted typos all over the place That is "success", "unknown", "through", "performance", "[re\|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over\|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-12-04 15:39:55 +01:00
David Teigland	573c24c4af	dlm: always use GFP_NOFS Replace all GFP_KERNEL and ls_allocation with GFP_NOFS. ls_allocation would be GFP_KERNEL for userland lockspaces and GFP_NOFS for file system lockspaces. It was discovered that any lockspaces on the system can affect all others by triggering memory reclaim in the file system which could in turn call back into the dlm to acquire locks, deadlocking dlm threads that were shared by all lockspaces, like dlm_recv. Signed-off-by: David Teigland <teigland@redhat.com>	2009-11-30 16:34:43 -06:00
David Teigland	6861f35078	dlm: fix socket fd translation The code to set up sctp sockets was not using the sockfd_lookup() and sockfd_put() routines to translate an fd to a socket. The direct fget and fput calls were resulting in error messages from alloc_fd(). Also clean up two log messages and remove a third, related to setting up sctp associations. Signed-off-by: David Teigland <teigland@redhat.com>	2009-09-30 12:19:44 -05:00
David Teigland	04bedd79a7	dlm: fix lowcomms_connect_node for sctp The recently added dlm_lowcomms_connect_node() from `391fbdc5d5` does not work when using SCTP instead of TCP. The sctp connection code has nothing to do without data to send. Check for no data in the sctp connection code and do nothing instead of triggering a BUG. Also have connect_node() do nothing when the protocol is sctp. Signed-off-by: David Teigland <teigland@redhat.com>	2009-09-30 12:19:44 -05:00
James Morris	88e9d34c72	seq_file: constify seq_operations Make all seq_operations structs const, to help mitigate against revectoring user-triggerable function pointers. This is derived from the grsecurity patch, although generated from scratch because it's simpler than extracting the changes from there. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:29 -07:00
Linus Torvalds	5ce0028987	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: use kernel_sendpage dlm: fix connection close handling dlm: fix double-release of socket in error exit path	2009-09-18 09:19:10 -07:00
Paolo Bonzini	1329e3f2c8	dlm: use kernel_sendpage Using kernel_sendpage() is cleaner and safer than following sock->ops ourselves. Signed-off-by: Paolo Bonzini <bonzini@gnu.org> Signed-off-by: David Teigland <teigland@redhat.com>	2009-08-24 13:18:04 -05:00
Lars Marowsky-Bree	063c4c9963	dlm: fix connection close handling Closing a connection to a node can create problems if there are outstanding messages for that node. The problems include dlm_send spinning attempting to reconnect, or BUG from tcp_connect_to_sock() attempting to use a partially closed connection. To cleanly close a connection, we now first attempt to send any pending messages, cancel any remaining workqueue work, and flag the connection as closed to avoid reconnect attempts. Signed-off-by: Lars Marowsky-Bree <lmb@suse.de> Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-08-24 13:13:56 -05:00
Casey Dahlin	b5711b8e5a	dlm: fix double-release of socket in error exit path The last correction to the tcp_connect_to_sock error exit path, commit `a89d63a159`, can free an already freed socket, due to collision with a previous (incomplete) attempt to fix the same issue, commit `311f6fc77c`. Signed-off-by: Casey Dahlin <cdahlin@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-08-18 15:09:24 -05:00
David S. Miller	aa11d958d1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: arch/microblaze/include/asm/socket.h	2009-08-12 17:44:53 -07:00
Casey Dahlin	a89d63a159	dlm: free socket in error exit path In the tcp_connect_to_sock() error exit path, the socket allocated at the top of the function was not being freed. Signed-off-by: Casey Dahlin <cdahlin@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-07-14 12:28:43 -05:00
Johannes Berg	134e63756d	genetlink: make netns aware This makes generic netlink network namespace aware. No generic netlink families except for the controller family are made namespace aware, they need to be checked one by one and then set the family->netnsok member to true. A new function genlmsg_multicast_netns() is introduced to allow sending a multicast message in a given namespace, for example when it applies to an object that lives in that namespace, a new function genlmsg_multicast_allns() to send a message to all network namespaces (for objects that do not have an associated netns). The function genlmsg_multicast() is changed to multicast the message in just init_net, which is currently correct for all generic netlink families since they only work in init_net right now. Some will later want to work in all net namespaces because they do not care about the netns at all -- those will have to be converted to use one of the new functions genlmsg_multicast_allns() or genlmsg_multicast_netns() whenever they are made netns aware in some way. After this patch families can easily decide whether or not they should be available in all net namespaces. Many genl families use it for objects not related to networking and should therefore be available in all namespaces, but that will have to be done on a per family basis. Note that this doesn't touch on the checkpoint/restart problem where network namespaces could be used, genl families and multicast groups are numbered globally and I see no easy way of changing that, especially since it must be possible to multicast to all network namespaces for those families that do not care about netns. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-12 14:03:27 -07:00
David Teigland	c78a87d0a1	dlm: fix plock use-after-free Fix a regression from the original addition of nfs lock support `586759f03e`. When a synchronous (non-nfs) plock completes, the waiting thread will wake up and free the op struct. This races with the user thread in dev_write() which goes on to read the op's callback field to check if the lock is async and needs a callback. This check can happen on the freed op. The fix is to note the callback value before the op can be freed. Signed-off-by: David Teigland <teigland@redhat.com>	2009-06-18 13:42:42 -05:00
Steven Whitehouse	a566a6b11c	dlm: Fix uninitialised variable warning in lock.c CC [M] fs/dlm/lock.o fs/dlm/lock.c: In function ‘find_rsb’: fs/dlm/lock.c:438: warning: ‘r’ may be used uninitialized in this function Since r is used on the error path to set r_ret, set it to NULL. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-06-17 11:31:32 -05:00
David Teigland	748285ccf7	dlm: use more NOFS allocation Change some GFP_KERNEL allocations to use either GFP_NOFS or ls_allocation (when available) which the fs sets to GFP_NOFS. The point is to prevent allocations from going back into the cluster fs in places where that might lead to deadlock. Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-15 11:24:59 -05:00
Christine Caulfield	391fbdc5d5	dlm: connect to nodes earlier Make network connections to other nodes earlier, in the context of dlm_recoverd. This avoids connecting to nodes from dlm_send where we try to avoid allocations which could possibly deadlock if memory reclaim goes into the cluster fs which may try to do a dlm operation. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-15 09:34:12 -05:00
David Teigland	8511a2728a	dlm: fix use count with multiple joins When a lockspace was joined multiple times, the global dlm use count was incremented when it should not have been. This caused the global dlm threads to not be stopped when all lockspaces were eventually removed. Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-07 10:14:42 -05:00
Geert Uytterhoeven	08ce4c91e4	dlm: Make name input parameter of {,dlm_}new_lockspace() const \| fs/gfs2/lock_dlm.c:207: warning: passing argument 1 of 'dlm_new_lockspace' discards qualifiers from pointer target type Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-07 10:14:26 -05:00
David Teigland	1fecb1c4b6	dlm: fix length calculation in compat code Using offsetof() to calculate name length does not work because it does not produce consistent results with structure packing. This caused memcpy to corrupt memory by copying 4 extra bytes off the end of the buffer on 64 bit kernels with 32 bit userspace (the only case where this 32/64 compat code is used). The fix is to calculate name length directly from the start instead of trying to derive it later using count and offsetof. Signed-off-by: David Teigland <teigland@redhat.com>	2009-03-11 12:23:59 -05:00
David Teigland	a536e38125	dlm: ignore cancel on granted lock Return immediately from dlm_unlock(CANCEL) if the lock is granted and not being converted; there's nothing to cancel. Signed-off-by: David Teigland <teigland@redhat.com>	2009-03-11 12:23:58 -05:00
David Teigland	43279e5376	dlm: clear defunct cancel state When a conversion completes successfully and finds that a cancel of the convert is still in progress (which is now a moot point), preemptively clear the state associated with outstanding cancel. That state could cause a subsequent conversion to be ignored. Also, improve the consistency and content of error and debug messages in this area. Signed-off-by: David Teigland <teigland@redhat.com>	2009-03-11 12:23:39 -05:00
Christine Caulfield	5e9ccc372d	dlm: replace idr with hash table for connections Integer nodeids can be too large for the idr code; use a hash table instead. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-03-11 12:20:58 -05:00
Joe Perches	2cf12c0bf2	dlm: comment typo fixes Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-28 12:56:07 -06:00
Joe Perches	44ad532b32	dlm: use ipv6_addr_copy Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-28 12:56:02 -06:00
Steven Whitehouse	305a47b17c	dlm: Change rwlock which is only used in write mode to a spinlock The ls_dirtbl[].lock was an rwlock, but since it was only used in write mode a spinlock will suffice. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-28 12:55:55 -06:00
Jeff Layton	20d5a39929	dlm: initialize file_lock struct in GETLK before copying conflicting lock dlm_posix_get fills out the relevant fields in the file_lock before returning when there is a lock conflict, but doesn't clean out any of the other fields in the file_lock. When nfsd does a NFSv4 lockt call, it sets the fl_lmops to nfsd_posix_mng_ops before calling the lower fs. When the lock comes back after testing a lock on GFS2, it still has that field set. This confuses nfsd into thinking that the file_lock is a nfsd4 lock. Fix this by making DLM reinitialize the file_lock before copying the fields from the conflicting lock. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-21 15:28:45 -06:00
David Teigland	24179f4880	dlm: fix plock notify callback to lockd We should use the original copy of the file_lock, fl, instead of the copy, flc in the lockd notify callback. The range in flc has been modified by posix_lock_file(), so it will not match a copy of the lock in lockd. Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-21 15:28:45 -06:00
David Teigland	c7be761a81	dlm: change rsbtbl rwlock to spinlock The rwlock is almost always used in write mode, so there's no reason to not use a spinlock instead. Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-08 15:12:39 -06:00
David Teigland	892c4467e3	dlm: fix seq_file usage in debugfs lock dump The old code would leak iterators and leave reference counts on rsbs because it was ignoring the "stop" seq callback. The code followed an example that used the seq operations differently. This new code is based on actually understanding how the seq operations work. It also improves things by saving the hash bucket in the position to avoid cycling through completed buckets in start. Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-08 15:12:31 -06:00
Linus Torvalds	7d8a804c59	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: fs/dlm/ast.c: fix warning dlm: add new debugfs entry dlm: add time stamp of blocking callback dlm: change lock time stamping dlm: improve how bast mode handling dlm: remove extra blocking callback check dlm: replace schedule with cond_resched dlm: remove kmap/kunmap dlm: trivial annotation of be16 value dlm: fix up memory allocation flags	2009-01-05 19:02:09 -08:00


695 Commits