linux

Author	SHA1	Message	Date
Quinn Tran	83949613fa	scsi: qla2xxx: Fix null pointer access during disconnect from subsystem NVMEAsync command is being submitted to QLA while the same NVMe controller is in the middle of reset. The reset path has deleted the association and freed aen_op->fcp_req.private. Add a check for this private pointer before issuing the command. ... 6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e [exception RIP: qla_nvme_post_cmd+394] RIP: ffffffffc0d012ba RSP: ffffb656ca11fd98 RFLAGS: 00010206 RAX: ffff8fb039eda228 RBX: ffff8fb039eda200 RCX: 00000000000da161 RDX: ffffffffc0d4d0f0 RSI: ffffffffc0d26c9b RDI: ffff8fb039eda220 RBP: 0000000000000013 R8: ffff8fb47ff6aa80 R9: 0000000000000002 R10: 0000000000000000 R11: ffffb656ca11fdc8 R12: ffff8fb27d04a3b0 R13: ffff8fc46dd98a58 R14: 0000000000000000 R15: ffff8fc4540f0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc] 8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc] 9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core] 10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437 11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef 12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402 13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255 -- PID: 37824 TASK: ffff8fb033063d80 CPU: 20 COMMAND: "kworker/u97:451" 0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3 1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8 2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed 3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf 4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5 5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core] 6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc] 7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437 8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50 9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402 10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255 Link: https://lore.kernel.org/r/20200806111014.28434-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:15 -04:00
Saurav Kashyap	dffa114533	scsi: qla2xxx: Check if FW supports MQ before enabling OS boot during Boot from SAN was stuck at dracut emergency shell after enabling NVMe driver parameter. For non-MQ support the driver was enabling MQ. Add a check to confirm if FW supports MQ. Link: https://lore.kernel.org/r/20200806111014.28434-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:15 -04:00
Arun Easi	897d68eb81	scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hba qla_nvme_register_hba() puts out a warning when there are not enough queue pairs available for FC-NVME. Just fail the NVME registration rather than a WARNING + call Trace. Link: https://lore.kernel.org/r/20200806111014.28434-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:14 -04:00
Arun Easi	49030003a3	scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set anytime ql2xextended_error_logging can now be set to 1 to get the default mask value, as opposed to at module load time only. Link: https://lore.kernel.org/r/20200806111014.28434-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:13 -04:00
Quinn Tran	81b9d1e19d	scsi: qla2xxx: Reduce noisy debug message Update debug level and message for ELS IOCB done. Link: https://lore.kernel.org/r/20200806111014.28434-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:13 -04:00
Quinn Tran	abb31aeaa9	scsi: qla2xxx: Fix login timeout Multipath errors were seen during failback due to login timeout. The remote device sent LOGO, the local host tore down the session and did relogin. The RSCN arrived indicates remote device is going through failover after which the relogin is in a 20s timeout phase. At this point the driver is stuck in the relogin process. Add a fix to delete the session as part of abort/flush the login. Link: https://lore.kernel.org/r/20200806111014.28434-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:12 -04:00
Quinn Tran	4709272f63	scsi: qla2xxx: Indicate correct supported speeds for Mezz card Correct the supported speeds for 16G Mezz card. Link: https://lore.kernel.org/r/20200806111014.28434-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:11 -04:00
Quinn Tran	a117579d02	scsi: qla2xxx: Flush I/O on zone disable Perform implicit logout to flush I/O on zone disable. Link: https://lore.kernel.org/r/20200806111014.28434-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:11 -04:00
Quinn Tran	10ae30ba66	scsi: qla2xxx: Flush all sessions on zone disable On Zone Disable, certain switches would ignore all commands. This causes timeout for both switch scan command and abort of that command. On detection of this condition, all sessions will be shutdown. Link: https://lore.kernel.org/r/20200806111014.28434-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:40:10 -04:00
Matthew Wilcox (Oracle)	d81665198b	block: Fix page_is_mergeable() for compound pages If we pass in an offset which is larger than PAGE_SIZE, then page_is_mergeable() thinks it's not mergeable with the previous bio_vec, leading to a large number of bio_vecs being used. Use a slightly more obvious test that the two pages are compatible with each other. Fixes: `52d52d1c98` ("block: only allow contiguous page structs in a bio_vec") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-08-17 19:35:53 -07:00
Enzo Matsumiya	c314a014b1	scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout values Improves readability of qla_mbx.c. Link: https://lore.kernel.org/r/20200805200546.22497-1-ematsumiya@suse.de Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:19:06 -04:00
Douglas Gilbert	223f91b480	scsi: scsi_debug: Fix scp is NULL errors John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The problem was tracked down to a missing critical section on a "short circuit" path. Namely, the time to process the current command so far has already exceeded the requested command duration (i.e. the number of nanoseconds in the ndelay parameter). The random=1 parameter setting was pivotal in finding this error. The failure scenario involved first taking that "short circuit" path (due to a very short command duration) and then taking the more likely hrtimer_start() path (due to a longer command duration). With random=1 each command's duration is taken from the uniformly distributed [0..ndelay) interval. The fio utility also helped by reliably generating the error scenario at about once per minute on a RPi 4 (64 bit OS). Link: https://lore.kernel.org/r/20200813155738.109298-1-dgilbert@interlog.com Reported-by: John Garry <john.garry@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:13:17 -04:00
Steffen Maier	2d9a2c5f58	scsi: zfcp: Fix use-after-free in request timeout handlers Before v4.15 commit `75492a5156` ("s390/scsi: Convert timers to use timer_setup()"), we intentionally only passed zfcp_adapter as context argument to zfcp_fsf_request_timeout_handler(). Since we only trigger adapter recovery, it was unnecessary to sync against races between timeout and (late) completion. Likewise, we only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action, it was unnecessary to sync against races between timeout and (late) completion. Meanwhile the timeout handlers get timer_list as context argument and do a timer-specific container-of to zfcp_fsf_req which can have been freed. Fix it by making sure that any request timeout handlers, that might just have started before del_timer(), are completed by using del_timer_sync() instead. This ensures the request free happens afterwards. Space time diagram of potential use-after-free: Basic idea is to have 2 or more pending requests whose timeouts run out at almost the same time. req 1 timeout ERP thread req 2 timeout ---------------- ---------------- --------------------------------------- zfcp_fsf_request_timeout_handler fsf_req = from_timer(fsf_req, t, timer) adapter = fsf_req->adapter zfcp_qdio_siosl(adapter) zfcp_erp_adapter_reopen(adapter,...) zfcp_erp_strategy ... zfcp_fsf_req_dismiss_all list_for_each_entry_safe zfcp_fsf_req_complete 1 del_timer 1 zfcp_fsf_req_free 1 zfcp_fsf_req_complete 2 zfcp_fsf_request_timeout_handler del_timer 2 fsf_req = from_timer(fsf_req, t, timer) zfcp_fsf_req_free 2 adapter = fsf_req->adapter ^^^^^^^ already freed Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com Fixes: `75492a5156` ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:12:09 -04:00
Bean Huo	d87a1f6d02	scsi: ufs: No need to send Abort Task if the task in DB was cleared If the bit corresponding to a task in the Doorbell register has been cleared, no need to poll the status of the task on the device side and to send an Abort Task TM. Instead, let it directly goto cleanup. In addition, to keep original debug output, move the goto below the debug print. Link: https://lore.kernel.org/r/20200811141859.27399-3-huobean@gmail.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:09:14 -04:00
Stanley Chu	b10178ee7f	scsi: ufs: Clean up completed request without interrupt notification If somehow no interrupt notification is raised for a completed request and its doorbell bit is cleared by host, UFS driver needs to cleanup its outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally in the following scenario: After ufshcd_abort() returns, this request will be requeued by SCSI layer with its outstanding bit set. Any future completed request will trigger ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At this time the "abnormal outstanding bit" will be detected and the "requeued request" will be chosen to execute request post-processing flow. This is wrong because this request is still "alive". Link: https://lore.kernel.org/r/20200811141859.27399-2-huobean@gmail.com Reviewed-by: Can Guo <cang@codeaurora.org> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:09:13 -04:00
Adrian Hunter	127d5f7c4b	scsi: ufs: Improve interrupt handling for shared interrupts For shared interrupts, the interrupt status might be zero, so check that first. Link: https://lore.kernel.org/r/20200811133936.19171-2-adrian.hunter@intel.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:03:29 -04:00
Adrian Hunter	6337f58cec	scsi: ufs: Fix interrupt error message for shared interrupts The interrupt might be shared, in which case it is not an error for the interrupt handler to be called when the interrupt status is zero, so don't print the message unless there was enabled interrupt status. Link: https://lore.kernel.org/r/20200811133936.19171-1-adrian.hunter@intel.com Fixes: `9333d77573` ("scsi: ufs: Fix irq return code") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:03:29 -04:00
Adrian Hunter	8da76f71fe	scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL Intel EHL UFS host controller advertises auto-hibernate capability but it does not work correctly. Add a quirk for that. [mkp: checkpatch fix] Link: https://lore.kernel.org/r/20200810141024.28859-1-adrian.hunter@intel.com Fixes: `8c09d75276` ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL") Acked-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:00:26 -04:00
Stanley Chu	215d326702	scsi: ufs-mediatek: Fix incorrect time to wait link status Fix incorrect calculation of "ms" based waiting time in function ufs_mtk_setup_clocks(). Link: https://lore.kernel.org/r/20200809055702.20140-1-stanley.chu@mediatek.com Fixes: `9006e3986f` ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 21:53:37 -04:00
Stanley Chu	93b6c5db06	scsi: ufs: Fix possible infinite loop in ufshcd_hold In ufshcd_suspend(), after clk-gating is suspended and link is set as Hibern8 state, ufshcd_hold() is still possibly invoked before ufshcd_suspend() returns. For example, MediaTek's suspend vops may issue UIC commands which would call ufshcd_hold() during the command issuing flow. Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled, then ufshcd_hold() may enter infinite loops because there is no clk-ungating work scheduled or pending. In this case, ufshcd_hold() shall just bypass, and keep the link as Hibern8 state. Link: https://lore.kernel.org/r/20200809050734.18740-1-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Co-developed-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 21:52:45 -04:00
Mike Christie	fa39ab5184	scsi: fcoe: Fix I/O path allocation ixgbe_fcoe_ddp_setup() can be called from the main I/O path and is called with a spin_lock held, so we have to use GFP_ATOMIC allocation instead of GFP_KERNEL. Link: https://lore.kernel.org/r/1596831813-9839-1-git-send-email-michael.christie@oracle.com cc: Hannes Reinecke <hare@suse.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 21:52:39 -04:00
Jing Xiangfeng	2138d1c918	scsi: ufs: ti-j721e-ufs: Fix error return in ti_j721e_ufs_probe() Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200806070135.67797-1-jingxiangfeng@huawei.com Fixes: `22617e2163` ("scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 21:48:45 -04:00
Linus Torvalds	06a4ec1d9d	mailmap alphabetizing and addition -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl87EbYWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJseID/44aCrTkW6B0z7i3+AwLBnQnY6p g5QI1lipnAcna9RUu5JcIcbXu4p+cMcFf3Ewj6Ohcc4dWAAoSrwdo4uNO6pFvjtB yKyX4MYdN55ZAHLroRD1yOn+TSFPGx66VhwVUxL4fwCewRA8w33ZP6bUY3I9+iJX v86SOxVdWwf2TW9rTh6wM2lMAwVGhqdG1pt660+smw3NJSQCrjrdFN/ZURoPud5n 2UJimgNwqrwvRd23oX5J4HXvyjNGIkhfJdGwkw10jQVLuZgsLgy6rx7JFGOuDBJi LL8QsKRzJfpysLQOaBnQJrYcHvRCEkUSXq7NvTqxsStK7z40cQTx73yL9FxS3rKT KGa7uvqazVhnS2AD+sQnzkQ9TgGtPZDfXRkN0MLU/PtSx6LXm+k0GxxVdwXstOKR ax8X0XqmkZOZdX0E0d906aRnkvpVMBzcke0P1NxUN/N+LH1vrQjnx7FCnRzh+eri KNxGazpYCbbBCrsjqHtfyjkDGasZrjWiV07++dEd1CYG9/Gbrx9OjhC6Ic4IhMsg 7Udy3ODZrJsbtH4vUoPfY2/r8rX1YMtbPKWBVB3v8mMmgbpGPcFquuuVytwCpVMj GdG28mVsskXPBuGe+FBNcPv+zZu3L0Zj/SRPkvT6/khqSWV1ws1B57A4rKa5gV9j agEKTCgIHcip3Pn+BQ== =OLbG -----END PGP SIGNATURE----- Merge tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull mailmap update from Kees Cook: "This was originally part of my pstore tree, but when I realized that mailmap needed re-alphabetizing, I decided to wait until -rc1 to send this, as I saw a lot of mailmap additions pending in -next for the merge window. It's a programmatic reordering and the addition of a pstore contributor's preferred email address" * tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: mailmap: Add WeiXiong Liao mailmap: Restore dictionary sorting	2020-08-17 17:15:23 -07:00
Linus Torvalds	4cf7562190	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: "Another batch of fixes: 1) Remove nft_compat counter flush optimization, it generates warnings from the refcount infrastructure. From Florian Westphal. 2) Fix BPF to search for build id more robustly, from Jiri Olsa. 3) Handle bogus getopt lengths in ebtables, from Florian Westphal. 4) Infoleak and other fixes to j1939 CAN driver, from Eric Dumazet and Oleksij Rempel. 5) Reset iter properly on mptcp sendmsg() error, from Florian Westphal. 6) Show a saner speed in bonding broadcast mode, from Jarod Wilson. 7) Various kerneldoc fixes in bonding and elsewhere, from Lee Jones. 8) Fix double unregister in bonding during namespace tear down, from Cong Wang. 9) Disable RP filter during icmp_redirect selftest, from David Ahern" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits) otx2_common: Use devm_kcalloc() in otx2_config_npa() net: qrtr: fix usage of idr in port assignment to socket selftests: disable rp_filter for icmp_redirect.sh Revert "net: xdp: pull ethernet header off packet after computing skb->protocol" phylink: <linux/phylink.h>: fix function prototype kernel-doc warning mptcp: sendmsg: reset iter on error redux net: devlink: Remove overzealous WARN_ON with snapshots tipc: not enable tipc when ipv6 works as a module tipc: fix uninit skb->data in tipc_nl_compat_dumpit() net: Fix potential wrong skb->protocol in skb_vlan_untag() net: xdp: pull ethernet header off packet after computing skb->protocol ipvlan: fix device features bonding: fix a potential double-unregister can: j1939: add rxtimer for multipacket broadcast session can: j1939: abort multipacket broadcast session when timeout occurs can: j1939: cancel rxtimer on multipacket broadcast session complete can: j1939: fix support for multipacket broadcast message net: fddi: skfp: cfm: Remove seemingly unused variable 'ID_sccs' net: fddi: skfp: cfm: Remove set but unused variable 'oldstate' net: fddi: skfp: smt: Remove seemingly unused variable 'ID_sccs' ...	2020-08-17 17:09:50 -07:00
Xu Wang	bf2bcd6f1a	otx2_common: Use devm_kcalloc() in otx2_config_npa() A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "devm_kcalloc". Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-17 15:08:39 -07:00
Christoph Hellwig	7c2308f79f	PCI/P2PDMA: Fix build without DMA ops My commit to make DMA ops support optional missed the reference in the p2pdma code. And while the build bot didn't manage to find a config where this can happen, Matthew did. Fix this by replacing two IS_ENABLED checks with ifdefs. Fixes: `2f9237d4f6` ("dma-mapping: make support for dma ops optional") Link: https://lore.kernel.org/r/20200810124843.1532738-1-hch@lst.de Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>	2020-08-17 17:08:21 -05:00
Necip Fazil Yildiran	8dfddfb796	net: qrtr: fix usage of idr in port assignment to socket Passing large uint32 sockaddr_qrtr.port numbers for port allocation triggers a warning within idr_alloc() since the port number is cast to int, and thus interpreted as a negative number. This leads to the rejection of such valid port numbers in qrtr_port_assign() as idr_alloc() fails. To avoid the problem, switch to idr_alloc_u32() instead. Fixes: `bdabad3e36` ("net: Add Qualcomm IPC router") Reported-by: syzbot+f31428628ef672716ea8@syzkaller.appspotmail.com Signed-off-by: Necip Fazil Yildiran <necip@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-17 15:00:41 -07:00
David Ahern	bcf7ddb018	selftests: disable rp_filter for icmp_redirect.sh h1 is initially configured to reach h2 via r1 rather than the more direct path through r2. If rp_filter is set and inherited for r2, forwarding fails since the source address of h1 is reachable from eth0 vs the packet coming to it via r1 and eth1. Since rp_filter setting affects the test, explicitly reset it. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-17 14:59:24 -07:00
Yonghong Song	cf28f3bbfc	bpf: Use get_file_rcu() instead of get_file() for task_file iterator With latest `bpftool prog` command, we observed the following kernel panic. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD dfe894067 P4D dfe894067 PUD deb663067 PMD 0 Oops: 0010 [#1] SMP CPU: 9 PID: 6023 ... RIP: 0010:0x0 Code: Bad RIP value. RSP: 0000:ffffc900002b8f18 EFLAGS: 00010286 RAX: ffff8883a405f400 RBX: ffff888e46a6bf00 RCX: 000000008020000c RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8883a405f400 RBP: ffff888e46a6bf50 R08: 0000000000000000 R09: ffffffff81129600 R10: ffff8883a405f300 R11: 0000160000000000 R12: 0000000000002710 R13: 000000e9494b690c R14: 0000000000000202 R15: 0000000000000009 FS: 00007fd9187fe700(0000) GS:ffff888e46a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000de5d33002 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> rcu_core+0x1a4/0x440 __do_softirq+0xd3/0x2c8 irq_exit+0x9d/0xa0 smp_apic_timer_interrupt+0x68/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0033:0x47ce80 Code: Bad RIP value. RSP: 002b:00007fd9187fba40 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000002 RBX: 00007fd931789160 RCX: 000000000000010c RDX: 00007fd9308cdfb4 RSI: 00007fd9308cdfb4 RDI: 00007ffedd1ea0a8 RBP: 00007fd9187fbab0 R08: 000000000000000e R09: 000000000000002a R10: 0000000000480210 R11: 00007fd9187fc570 R12: 00007fd9316cc400 R13: 0000000000000118 R14: 00007fd9308cdfb4 R15: 00007fd9317a9380 After further analysis, the bug is triggered by Commit `eaaacd2391` ("bpf: Add task and task/file iterator targets") which introduced task_file bpf iterator, which traverses all open file descriptors for all tasks in the current namespace. The latest `bpftool prog` calls a task_file bpf program to traverse all files in the system in order to associate processes with progs/maps, etc. When traversing files for a given task, rcu read_lock is taken to access all files in a file_struct. But it used get_file() to grab a file, which is not right. It is possible file->f_count is 0 and get_file() will unconditionally increase it. Later put_file() may cause all kind of issues with the above as one of sympotoms. The failure can be reproduced with the following steps in a few seconds: $ cat t.c #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> #define N 10000 int fd[N]; int main() { int i; for (i = 0; i < N; i++) { fd[i] = open("./note.txt", 'r'); if (fd[i] < 0) { fprintf(stderr, "failed\n"); return -1; } } for (i = 0; i < N; i++) close(fd[i]); return 0; } $ gcc -O2 t.c $ cat run.sh #/bin/bash for i in {1..100} do while true; do ./a.out; done & done $ ./run.sh $ while true; do bpftool prog >& /dev/null; done This patch used get_file_rcu() which only grabs a file if the file->f_count is not zero. This is to ensure the file pointer is always valid. The above reproducer did not fail for more than 30 minutes. Fixes: `eaaacd2391` ("bpf: Add task and task/file iterator targets") Suggested-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/bpf/20200817174214.252601-1-yhs@fb.com	2020-08-17 14:42:58 -07:00
Kees Cook	5a4fe06246	mailmap: Add WeiXiong Liao WeiXiong Liao noted to me offlist that his preference for email address had changed and that he'd like it updated in the mailmap so people discussing pstore/blk would be able to reach him. Cc: WeiXiong Liao <gmpy.liaowx@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2020-08-17 14:32:44 -07:00
Kees Cook	d6bd5201f7	mailmap: Restore dictionary sorting Several names had been recently appended (instead of inserted). While git-shortlog doesn't need this file to be sorted, it helps humans to keep it organized this way. Sort the entire file (which includes some minor shuffling for dictionary order). Done with the following commands: grep -E '^(#\|$)' .mailmap > .mailmap.head grep -Ev '^(#\|$)' .mailmap > .mailmap.body sort -f .mailmap.body > .mailmap.body.sort cat .mailmap.head .mailmap.body.sort > .mailmap rm .mailmap.head .mailmap.body.sort Signed-off-by: Kees Cook <keescook@chromium.org>	2020-08-17 14:32:44 -07:00
Zqiang	62c789270c	libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group Because the last member of the "nvdimm_firmware_attributes" array was not assigned a null ptr, when traversal of "grp->attrs" array is out of bounds in "create_files" func. func: create_files: ->for (i = 0, attr = grp->attrs; *attr && !error; i++, attr++) ->.... BUG: KASAN: global-out-of-bounds in create_files fs/sysfs/group.c:43 [inline] BUG: KASAN: global-out-of-bounds in internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149 Read of size 8 at addr ffffffff8a2e4cf0 by task kworker/u17:10/959 CPU: 2 PID: 959 Comm: kworker/u17:10 Not tainted 5.8.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound async_run_entry_fn Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0x5/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 create_files fs/sysfs/group.c:43 [inline] internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149 internal_create_groups.part.0+0x90/0x140 fs/sysfs/group.c:189 internal_create_groups fs/sysfs/group.c:185 [inline] sysfs_create_groups+0x25/0x50 fs/sysfs/group.c:215 device_add_groups drivers/base/core.c:2024 [inline] device_add_attrs drivers/base/core.c:2178 [inline] device_add+0x7fd/0x1c40 drivers/base/core.c:2881 nd_async_device_register+0x12/0x80 drivers/nvdimm/bus.c:506 async_run_entry_fn+0x121/0x530 kernel/async.c:123 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 The buggy address belongs to the variable: nvdimm_firmware_attributes+0x10/0x40 Link: https://lore.kernel.org/r/20200812085501.30963-1-qiang.zhang@windriver.com Link: https://lore.kernel.org/r/20200814150509.225615-1-vaibhav@linux.ibm.com Fixes: `48001ea50d` ("PM, libnvdimm: Add runtime firmware activation support") Reported-by: syzbot+1cf0ffe61aecf46f588f@syzkaller.appspotmail.com Reported-by: Sandipan Das <sandipan@linux.ibm.com> Reported-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>	2020-08-17 14:47:38 -06:00
Sharat Masetty	20925fe844	drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR This patches replaces the previously used static DDR vote and uses dev_pm_opp_set_bw() to scale GPU->DDR bandwidth along with scaling GPU frequency. Also since the icc path voting is handled completely in the opp driver, remove the icc_path handle and its usage in the drm driver. Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>	2020-08-17 12:31:44 -07:00
Colin Ian King	f49c7faf77	of/address: check for invalid range.cpu_addr Currently invalid CPU addresses are not being sanity checked resulting in SATA setup failure on a SynQuacer SC2A11 development machine. The original check was removed by and earlier commit, so add a sanity check back in to avoid this regression. Fixes: `7a8b64d17e` ("of/address: use range parser for of_dma_get_range") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200817113208.523805-1-colin.king@canonical.com Signed-off-by: Rob Herring <robh@kernel.org>	2020-08-17 13:30:09 -06:00
Rob Clark	352c83fb39	drm/msm/gpu: make ringbuffer readonly The GPU has no business writing into the ringbuffer, let's make it readonly to the GPU. Fixes: `7198e6b031` ("drm/msm: add a3xx gpu support") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>	2020-08-17 12:25:22 -07:00
Rob Clark	f228af11df	drm/msm/adreno: fix updating ring fence We need to set it to the most recent completed fence, not the most recent submitted. Otherwise we have races where we think we can retire submits that the GPU is not finished with, if the GPU doesn't manage to overwrite the seqno before we look at it. This can show up with hang recovery if one of the submits after the crashing submit also hangs after it is replayed. Fixes: `f97decac5f` ("drm/msm: Support multiple ringbuffers") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>	2020-08-17 12:24:41 -07:00
Jim Mattson	cb957adb4e	kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode See the SDM, volume 3, section 4.4.1: If PAE paging would be in use following an execution of MOV to CR0 or MOV to CR4 (see Section 4.1.1) and the instruction is modifying any of CR0.CD, CR0.NW, CR0.PG, CR4.PAE, CR4.PGE, CR4.PSE, or CR4.SMEP; then the PDPTEs are loaded from the address in CR3. Fixes: `b9baba8614` ("KVM, pkeys: expose CPUID/CR4 to guest") Cc: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Peter Shier <pshier@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Message-Id: <20200817181655.3716509-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-17 15:24:08 -04:00
Rob Clark	35c719da95	drm/msm/dpu: fix unitialized variable error drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c:817 dpu_crtc_enable() error: uninitialized symbol 'request_bandwidth'. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>	2020-08-17 12:24:07 -07:00
Jim Mattson	427890aff8	kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode See the SDM, volume 3, section 4.4.1: If PAE paging would be in use following an execution of MOV to CR0 or MOV to CR4 (see Section 4.1.1) and the instruction is modifying any of CR0.CD, CR0.NW, CR0.PG, CR4.PAE, CR4.PGE, CR4.PSE, or CR4.SMEP; then the PDPTEs are loaded from the address in CR3. Fixes: `0be0226f07` ("KVM: MMU: fix SMAP virtualization") Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Peter Shier <pshier@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Message-Id: <20200817181655.3716509-2-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-17 15:23:50 -04:00
Kalyan Thota	4c978caf08	drm/msm/dpu: Fix scale params in plane validation Plane validation uses an API drm_calc_scale which will return src/dst value as a scale ratio. when viewing the range on a scale the values should fall in as Upscale ratio < Unity scale < Downscale ratio for src/dst formula Fix the min and max scale ratios to suit the API accordingly. Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org> Tested-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Signed-off-by: Rob Clark <robdclark@chromium.org>	2020-08-17 12:23:47 -07:00
Kalyan Thota	ccc862b957	drm/msm/dpu: Fix reservation failures in modeset In TEST_ONLY commit, rm global_state will duplicate the object and request for new reservations, once they pass then the new state will be swapped with the old and will be available for the Atomic Commit. This patch fixes some of missing links in the resource reservation sequence mentioned above. 1) Creation of duplicate state in test_only commit (Rob) 2) Allocate and release the resources on every modeset. 3) Avoid allocation only when active is false. In a modeset operation, swap state happens well before disable. Hence clearing reservations in disable will cause failures in modeset enable. Allow reservations to be cleared/allocated before swap, such that only newly committed resources are pushed to HW. Changes in v1: - Move the rm release to atomic_check. - Ensure resource allocation and free happens when active is not changed i.e only when mode is changed.(Rob) Changes in v2: - Handle dpu_kms_get_global_state API failure as it may return EDEADLK (swboyd). Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>	2020-08-17 12:23:10 -07:00
Paolo Bonzini	19cf4b7eef	KVM: x86: fix access code passed to gva_to_gpa The PK bit of the error code is computed dynamically in permission_fault and therefore need not be passed to gva_to_gpa: only the access bits (fetch, user, write) need to be passed down. Not doing so causes a splat in the pku test: WARNING: CPU: 25 PID: 5465 at arch/x86/kvm/mmu.h:197 paging64_walk_addr_generic+0x594/0x750 [kvm] Hardware name: Intel Corporation WilsonCity/WilsonCity, BIOS WLYDCRB1.SYS.0014.D62.2001092233 01/09/2020 RIP: 0010:paging64_walk_addr_generic+0x594/0x750 [kvm] Code: <0f> 0b e9 db fe ff ff 44 8b 43 04 4c 89 6c 24 30 8b 13 41 39 d0 89 RSP: 0018:ff53778fc623fb60 EFLAGS: 00010202 RAX: 0000000000000001 RBX: ff53778fc623fbf0 RCX: 0000000000000007 RDX: 0000000000000001 RSI: 0000000000000002 RDI: ff4501efba818000 RBP: 0000000000000020 R08: 0000000000000005 R09: 00000000004000e7 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000007 R13: ff4501efba818388 R14: 10000000004000e7 R15: 0000000000000000 FS: 00007f2dcf31a700(0000) GS:ff4501f1c8040000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000001dea475005 CR4: 0000000000763ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: paging64_gva_to_gpa+0x3f/0xb0 [kvm] kvm_fixup_and_inject_pf_error+0x48/0xa0 [kvm] handle_exception_nmi+0x4fc/0x5b0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x911/0x1c10 [kvm] kvm_vcpu_ioctl+0x23e/0x5d0 [kvm] ksys_ioctl+0x92/0xb0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x3e/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ---[ end trace d17eb998aee991da ]--- Reported-by: Sean Christopherson <sean.j.christopherson@intel.com> Fixes: `897861479c` ("KVM: x86: Add helper functions for illegal GPA checking and page fault injection") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-17 15:21:43 -04:00
Jessica Clarke	bd05220c7b	arch/ia64: Restore arch-specific pgd_offset_k implementation IA-64 is special and treats pgd_offset_k() differently to pgd_offset(), using different formulae to calculate the indices into the kernel and user PGDs. The index into the user PGDs takes into account the region number, but the index into the kernel (init_mm) PGD always assumes a predefined kernel region number. Commit `974b9b2c68` ("mm: consolidate pte_index() and pte_offset_() definitions") made IA-64 use a generic pgd_offset_k() which incorrectly used pgd_index() for kernel page tables. As a result, the index into the kernel PGD was going out of bounds and the kernel hung during early boot. Allow overrides of pgd_offset_k() and override it on IA-64 with the old implementation that will correctly index the kernel PGD. Fixes: `974b9b2c68` ("mm: consolidate pte_index() and pte_offset_() definitions") Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>	2020-08-17 21:50:54 +03:00
David S. Miller	7f9bf6e824	Revert "net: xdp: pull ethernet header off packet after computing skb->protocol" This reverts commit `f8414a8d88`. eth_type_trans() does the necessary pull on the skb. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-17 11:48:05 -07:00
Randy Dunlap	0b76e642f9	phylink: <linux/phylink.h>: fix function prototype kernel-doc warning Fix a kernel-doc warning for the pcs_config() function prototype: ../include/linux/phylink.h:406: warning: Excess function parameter 'permit_pause_to_mac' description in 'pcs_config' Fixes: `7137e18f6f` ("net: phylink: add struct phylink_pcs") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-17 11:46:22 -07:00
Tom Yan	d8d0db7bb3	ALSA: usb-audio: ignore broken processing/extension unit Some devices have broken extension unit where getting current value doesn't work. Attempt that once when creating mixer control for it. If it fails, just ignore it, so that it won't cripple the device entirely (and/or make the error floods). Signed-off-by: Tom Yan <tom.ty89@gmail.com> Link: https://lore.kernel.org/r/5f3abc52.1c69fb81.9cf2.fe91@mx.google.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2020-08-17 19:58:29 +02:00
Yang Weijiang	98b0bf0273	selftests: kvm: Use a shorter encoding to clear RAX If debug_regs.c is built with newer binutils, the resulting binary is "optimized" by the assembler: asm volatile("ss_start: " "xor %%rax,%%rax\n\t" "cpuid\n\t" "movl $0x1a0,%%ecx\n\t" "rdmsr\n\t" : : : "rax", "ecx"); is translated to : 000000000040194e <ss_start>: 40194e: 31 c0 xor %eax,%eax <----- rax->eax? 401950: 0f a2 cpuid 401952: b9 a0 01 00 00 mov $0x1a0,%ecx 401957: 0f 32 rdmsr As you can see rax is replaced with eax in target binary code. This causes a difference is the length of xor instruction (2 Byte vs 3 Byte), and makes the hard-coded instruction length check fail: /* Instruction lengths starting at ss_start / int ss_size[4] = { 3, / xor / <-------- 2 or 3? 2, / cpuid / 5, / mov / 2, / rdmsr */ }; Encode the shorter version directly and, while at it, fix the "clobbers" of the asm. Cc: stable@vger.kernel.org Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-17 13:45:22 -04:00
Daniel Vetter	77ef38574b	drm/modeset-lock: Take the modeset BKL for legacy drivers This fell off in the conversion in commit `9bcaa3fe58` Author: Michal Orzel <michalorzel.eng@gmail.com> Date: Tue Apr 28 19:10:04 2020 +0200 drm: Replace drm_modeset_lock/unlock_all with DRM_MODESET_LOCK_ALL_* helpers but it's caught by the drm_warn_on_modeset_not_all_locked() that the legacy modeset code uses. Since this is the bkl and it's unclear what's all protected, play it safe and grab it again for legacy drivers. Unfortunately this means we need to sprinkle a few more #includes around. Also we need to add the drm_device as a parameter to the _END macro. Finally remove the mute_lock() from setcrtc, since that's now done by the macro. Cc: Alex Deucher <alexdeucher@gmail.com> References: https://gitlab.freedesktop.org/drm/amd/-/issues/1224 Fixes: `9bcaa3fe58` ("drm: Replace drm_modeset_lock/unlock_all with DRM_MODESET_LOCK_ALL_* helpers") Cc: Michal Orzel <michalorzel.eng@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200814093842.3048472-1-daniel.vetter@ffwll.ch	2020-08-17 13:41:50 -04:00
Alex Williamson	aae7a75a82	vfio/type1: Add proper error unwind for vfio_iommu_replay() The vfio_iommu_replay() function does not currently unwind on error, yet it does pin pages, perform IOMMU mapping, and modify the vfio_dma structure to indicate IOMMU mapping. The IOMMU mappings are torn down when the domain is destroyed, but the other actions go on to cause trouble later. For example, the iommu->domain_list can be empty if we only have a non-IOMMU backed mdev attached. We don't currently check if the list is empty before getting the first entry in the list, which leads to a bogus domain pointer. If a vfio_dma entry is erroneously marked as iommu_mapped, we'll attempt to use that bogus pointer to retrieve the existing physical page addresses. This is the scenario that uncovered this issue, attempting to hot-add a vfio-pci device to a container with an existing mdev device and DMA mappings, one of which could not be pinned, causing a failure adding the new group to the existing container and setting the conditions for a subsequent attempt to explode. To resolve this, we can first check if the domain_list is empty so that we can reject replay of a bogus domain, should we ever encounter this inconsistent state again in the future. The real fix though is to add the necessary unwind support, which means cleaning up the current pinning if an IOMMU mapping fails, then walking back through the r-b tree of DMA entries, reading from the IOMMU which ranges are mapped, and unmapping and unpinning those ranges. To be able to do this, we also defer marking the DMA entry as IOMMU mapped until all entries are processed, in order to allow the unwind to know the disposition of each entry. Fixes: `a54eb55045` ("vfio iommu type1: Add support for mediated devices") Reported-by: Zhiyi Guo <zhguo@redhat.com> Tested-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-08-17 11:09:13 -06:00
Alex Williamson	bc93b9ae01	vfio-pci: Avoid recursive read-lock usage A down_read on memory_lock is held when performing read/write accesses to MMIO BAR space, including across the copy_to/from_user() callouts which may fault. If the user buffer for these copies resides in an mmap of device MMIO space, the mmap fault handler will acquire a recursive read-lock on memory_lock. Avoid this by reducing the lock granularity. Sequential accesses requiring multiple ioread/iowrite cycles are expected to be rare, therefore typical accesses should not see additional overhead. VGA MMIO accesses are expected to be non-fatal regardless of the PCI memory enable bit to allow legacy probing, this behavior remains with a comment added. ioeventfds are now included in memory access testing, with writes dropped while memory space is disabled. Fixes: `abafbc551f` ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Reported-by: Zhiyi Guo <zhguo@redhat.com> Tested-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-08-17 11:08:18 -06:00

... 10 11 12 13 14 ...

949707 Commits