Commit Graph

948892 Commits

Author SHA1 Message Date
Russell King
5c05c1dbb1 net: phylink, dsa: eliminate phylink_fixed_state_cb()
Move the callback into the phylink_config structure, rather than
providing a callback to set this up.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 16:45:37 -07:00
David S. Miller
6861d6d9cf Merge branch 'qdisc-noop'
Jesper Dangaard Brouer says:

====================
Fix qdisc noop issue caused by driver and identify future bugs

I've been very puzzled why networking on my NXP development board,
using driver dpaa2-eth, stopped working when I updated the kernel
version >= 5.3.  The observable issue was that the interface would drop
all TX packets, because it had been assigned the qdisc noop.

This turned out to be a NIC driver bug that would only get triggered
when using sysctl net/core/default_qdisc=fq_codel. It was non-trivial
to find out[1] this was driver related. Thus this patchset, besides
fixing the driver bug, also helps end-users identify the issue.

[1]: https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 16:44:55 -07:00
Jesper Dangaard Brouer
b89c1e6bdc dpaa2-eth: fix return codes used in ndo_setup_tc
A driver's ndo_setup_tc call should return -EOPNOTSUPP when it cannot
support the qdisc type. Other return values will result in failing the
qdisc setup.  This led to qdisc noop getting assigned, which will
drop all TX packets on the interface.
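
A minimal userspace sketch of the return-code rule described above. The
enum and function names are hypothetical stand-ins for illustration and are
not the dpaa2-eth code or the kernel's tc setup types:

```c
#include <errno.h>

/* Hypothetical stand-ins for qdisc/offload types; not the kernel enum. */
enum demo_tc_type { DEMO_TC_MQPRIO, DEMO_TC_OTHER };

/*
 * Sketch of an ndo_setup_tc-style handler: a type the driver cannot
 * support is reported with -EOPNOTSUPP rather than any other error code.
 */
static int demo_setup_tc(enum demo_tc_type type)
{
	switch (type) {
	case DEMO_TC_MQPRIO:
		return 0;		/* supported: configure and report success */
	default:
		return -EOPNOTSUPP;	/* unsupported qdisc type */
	}
}
```

Calling demo_setup_tc(DEMO_TC_OTHER) yields -EOPNOTSUPP, the one value the
message above says the qdisc setup path accepts for an unsupported type.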

Fixes: ab1e6de2bd ("dpaa2-eth: Add mqprio support")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 16:44:54 -07:00
Jesper Dangaard Brouer
b70ba69ef1 net: sched: report ndo_setup_tc failures via extack
Help end-users of the 'tc' command to see if the driver's ndo_setup_tc
function call fails. Troubleshooting when this happens is non-trivial
(see full process here[1]), and results in net_device getting assigned
the 'qdisc noop', which will drop all TX packets on the interface.

[1]: https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 16:44:54 -07:00
Taehee Yoo
7f32708036 macsec: avoid to set wrong mtu
When a macsec interface is created, its mtu is calculated from the lower
interface's mtu value.
If the lower interface's mtu is smaller than the length needed by the
macsec interface, the macsec mtu calculation wraps around to a huge value.
So, if the lower interface's mtu is too low, the macsec interface's mtu
should be set to 0.

Test commands:
    ip link add dummy0 mtu 10 type dummy
    ip link add macsec0 link dummy0 type macsec
    ip link show macsec0

Before:
    11: macsec0@dummy0: <BROADCAST,MULTICAST,M-DOWN> mtu 4294967274
After:
    11: macsec0@dummy0: <BROADCAST,MULTICAST,M-DOWN> mtu 0
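
The wrap-around in the "Before" output can be reproduced with plain unsigned
arithmetic. This is a sketch, not the macsec driver code; the overhead of 32
bytes is an assumption chosen because 10 - 32 wraps to the 4294967274 shown
above:

```c
#include <stdint.h>

/* Assumed overhead, picked to match the "Before" value; illustration only. */
#define DEMO_MACSEC_OVERHEAD 32u

/* Unchecked: wraps around when the lower mtu is smaller than the overhead. */
static uint32_t demo_mtu_unchecked(uint32_t lower_mtu)
{
	return lower_mtu - DEMO_MACSEC_OVERHEAD;
}

/* Checked: report 0 when the lower mtu cannot cover the overhead. */
static uint32_t demo_mtu_checked(uint32_t lower_mtu)
{
	if (lower_mtu <= DEMO_MACSEC_OVERHEAD)
		return 0;
	return lower_mtu - DEMO_MACSEC_OVERHEAD;
}
```

With a lower mtu of 10, the unchecked helper returns 4294967274 and the
checked helper returns 0, matching the two outputs above.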

Fixes: c09440f7dc ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 16:42:40 -07:00
Kuppuswamy Sathyanarayanan
af03958da0 PCI/EDR: Log only ACPI_NOTIFY_DISCONNECT_RECOVER events
Previously we logged *all* ACPI SYSTEM-level events, which may include lots
of non-EDR events.  Move the message so we only log those related to EDR.

Link: https://lore.kernel.org/r/01afb4e01efbe455de0c445bef6cf3ffc59340d2.1586996350.git.sathyanarayanan.kuppuswamy@linux.intel.com
[bhelgaas: drop the pci_dbg() of all events since ACPI can log those
already]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-04-24 18:33:29 -05:00
Linus Torvalds
5ef58e2907 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Two minor fixes: one to update a Kconfig reference and the other to
  fix a resource leak on an error path in sg"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: Update referenced link to cdrtools
  scsi: sg: add sg_remove_request in sg_write
2020-04-24 16:23:24 -07:00
Rob Herring
adc9fbcd7d PCI: Use of_node_name_eq() for node name comparisons
Convert string compares of DT node names to use of_node_name_eq() helper
instead. This removes direct access to the node name pointer.

Link: https://lore.kernel.org/r/20200416215114.7715-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
2020-04-24 18:02:17 -05:00
Douglas Anderson
4bc77b2d21 dt-bindings: phy: qcom-qusb2: Fix defaults
The defaults listed in the bindings don't match what the code is
actually doing.  Presumably existing users care more about keeping
existing behavior the same, so change the bindings to match the code
in Linux.

The "qcom,preemphasis-level" default has been wrong for quite a long
time (May 2018).  The other two were recently added.

As some evidence that these values are wrong, this is from the Linux
driver:
- qcom,preemphasis-level: sets "PORT_TUNE1", lower 2 bits.  Driver
  programs PORT_TUNE1 to 0x30 by default and (0x30 & 0x3) = 0.
- qcom,bias-ctrl-value: sets "PLL_BIAS_CONTROL_2", lower 6 bits.
  Driver programs PLL_BIAS_CONTROL_2 to 0x20 by default and (0x20 &
  0x3f) = 0x20 = 32.
- qcom,hsdisc-trim-value: sets "PORT_TUNE2", lower 2 bits.  Driver
  programs PORT_TUNE2 to 0x29 by default and (0x29 & 0x3) = 1.
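
The three mask computations above can be checked mechanically. A small
sketch; the helper name is hypothetical and only the register defaults and
masks come from the list above:

```c
/* Extract a field from a register default by masking its low bits. */
static unsigned int demo_field(unsigned int reg_default, unsigned int mask)
{
	return reg_default & mask;
}
```

demo_field(0x30, 0x3) gives 0, demo_field(0x20, 0x3f) gives 32 and
demo_field(0x29, 0x3) gives 1, which are the defaults listed above.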

Fixes: 1e6f134eb6 ("dt-bindings: phy: qcom-qusb2: Add support for overriding Phy tuning parameters")
Fixes: a8b70ccf10 ("dt-bindings: phy-qcom-usb2: Add support to override tuning values")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-24 17:53:57 -05:00
David S. Miller
92dc39fd40 Merge branch 'mlxsw-Mirroring-cleanups'
Ido Schimmel says:

====================
mlxsw: Mirroring cleanups

This patch set contains various cleanups in SPAN (mirroring) code
noticed by Amit and me while working on future enhancements in this area.
No functional changes intended. Tested by current mirroring selftests.

Patches #1-#2 from Amit reduce nesting in a certain function and rename
a callback to a more meaningful name.

Patch #3 removes debug prints that have little value.

Patch #4 converts a reference count to 'refcount_t' in order to catch
over/under flows.

Patch #5 replaces a zero-length array with a flexible-array member in
order to get a compiler warning in case the flexible array does not
occur last in the structure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Ido Schimmel
4780dbdbd9 mlxsw: spectrum_span: Replace zero-length array with flexible-array member
In a similar fashion to commit e99f8e7f88 ("mlxsw: Replace zero-length
array with flexible-array member"), use a flexible-array member to get a
compiler warning in case the flexible array does not occur last in the
structure.
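
As an illustration of the pattern, a userspace sketch with a hypothetical
structure (not the mlxsw SPAN structure). The flexible-array member is
declared with empty brackets and has to be the last member:

```c
#include <stdlib.h>

/* Hypothetical structure for illustration only. */
struct demo_span {
	size_t count;
	int entries[];	/* flexible-array member: must be the last member */
};

/* Allocate the header plus room for 'count' trailing entries, zeroed. */
static struct demo_span *demo_span_alloc(size_t count)
{
	struct demo_span *span;

	span = calloc(1, sizeof(*span) + count * sizeof(span->entries[0]));
	if (span)
		span->count = count;
	return span;
}
```

Placing another member after 'entries' makes the declaration invalid, so
the compiler flags it. A zero-length array ('int entries[0]') does not give
that diagnostic.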

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Ido Schimmel
4c00dafc59 mlxsw: spectrum_span: Use 'refcount_t' for reference counting
'refcount_t' is very useful for catching over/under flows. Convert the
SPAN agent objects to use it instead of 'int' for their reference count.
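
To show the kind of mistake a checked counter catches, here is a simplified
userspace sketch. It is not the kernel's refcount_t implementation, which
has its own semantics; all names are hypothetical:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical checked counter for illustration only. */
struct demo_refcount {
	unsigned int refs;
};

/* Take a reference; refuse to increment from 0 or past UINT_MAX. */
static bool demo_ref_get(struct demo_refcount *r)
{
	if (r->refs == 0 || r->refs == UINT_MAX)
		return false;	/* overflow or use-after-release attempt */
	r->refs++;
	return true;
}

/* Drop a reference; returns true only when the last reference went away. */
static bool demo_ref_put(struct demo_refcount *r)
{
	if (r->refs == 0)
		return false;	/* underflow attempt: counter stays at 0 */
	return --r->refs == 0;
}
```

A plain 'int' counter would accept both the extra put and the get after
release silently. The checked version makes them visible to the caller.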

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Ido Schimmel
c0c2899cf6 mlxsw: spectrum_span: Remove unnecessary debug prints
To the best of my knowledge, these debug prints were never used. Remove
them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Amit Cohen
7f9b099bd9 mlxsw: spectrum_span: Rename parms() to parms_set()
Use a more meaningful name for parms() function.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Amit Cohen
8146458fcd mlxsw: spectrum_span: Reduce nesting in mlxsw_sp_span_entry_configure()
Use early return to avoid unnecessary nesting.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Jason Yan
f371d53453 scsi: sgiwd93: Remove unneeded semicolon in sgiwd93.c
Fix the following coccicheck warning:

drivers/scsi/sgiwd93.c:190:2-3: Unneeded semicolon

Link: https://lore.kernel.org/r/20200421034029.28030-1-yanaijie@huawei.com
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:29 -04:00
Jason Yan
9b77c9da6a scsi: qla4xxx: Remove unneeded semicolon in ql4_os.c
Fix the following coccicheck warning:

drivers/scsi/qla4xxx/ql4_os.c:969:3-4: Unneeded semicolon

Link: https://lore.kernel.org/r/20200421034038.28113-1-yanaijie@huawei.com
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:16 -04:00
Jason Yan
8d5e202802 scsi: isci: Use true, false for bool variables
Fix the following coccicheck warning:

drivers/scsi/isci/isci.h:515:1-12: WARNING: Assignment of 0/1 to bool
variable
drivers/scsi/isci/isci.h:503:1-12: WARNING: Assignment of 0/1 to bool
variable
drivers/scsi/isci/isci.h:509:1-12: WARNING: Assignment of 0/1 to bool
variable

Link: https://lore.kernel.org/r/20200421034050.28193-1-yanaijie@huawei.com
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
acfcb728bd scsi: bnx2fc: Remove unneeded semicolon in bnx2fc_fcoe.c
Fix the following coccicheck warning:

drivers/scsi/bnx2fc/bnx2fc_fcoe.c:948:4-5: Unneeded semicolon
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:968:4-5: Unneeded semicolon

Link: https://lore.kernel.org/r/20200421034019.27949-1-yanaijie@huawei.com
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
f71ded01cc scsi: bfa: Remove unneeded semicolon in bfa_fcs_rport.c
Fix the following coccicheck warning:

drivers/scsi/bfa/bfa_fcs_rport.c:2452:2-3: Unneeded semicolon
drivers/scsi/bfa/bfa_fcs_rport.c:1578:3-4: Unneeded semicolon

Link: https://lore.kernel.org/r/20200421033957.27783-1-yanaijie@huawei.com
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
YueHaibing
0745c834f7 scsi: bfa: Remove set but not used variable 'fchs'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/bfa/bfa_svc.c: In function 'uf_recv':
drivers/scsi/bfa/bfa_svc.c:5520:17: warning:
 variable 'fchs' set but not used [-Wunused-but-set-variable]
  struct fchs_s *fchs;
                 ^

Link: https://lore.kernel.org/r/20200418071057.96699-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
6942d531e2 scsi: snic: Make snic_io_exch_ver_cmpl_handler() return void
This function does not need a return value since no callers depend on
it. Make it return void.

This also fixes the coccicheck warning:

drivers/scsi/snic/snic_ctl.c:163:5-8: Unneeded variable: "ret". Return
"0" on line 228

Link: https://lore.kernel.org/r/20200418070615.11603-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
baf3fbf26c scsi: mpt3sas: Remove NULL check before freeing function
Fix the following coccicheck warning:

drivers/scsi/mpt3sas/mpt3sas_base.c:4906:3-19: WARNING: NULL check
before some freeing functions is not needed.
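
A userspace analogue of this cleanup, using free() from the C library
rather than the kernel freeing function touched by the patch. The structure
and function names are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical owner of a heap buffer. */
struct demo_buf {
	char *data;
};

static void demo_buf_release(struct demo_buf *buf)
{
	/* free(NULL) is a no-op, so no "if (buf->data)" guard is needed. */
	free(buf->data);
	buf->data = NULL;
}
```

The function is safe to call whether or not 'data' was ever allocated.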

Link: https://lore.kernel.org/r/20200418095850.34883-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
2e9ef0fcac scsi: ipr: Remove NULL check before freeing function
Fix the following coccicheck warning:

drivers/scsi/ipr.c:9533:2-18: WARNING: NULL check before some freeing
functions is not needed.

Link: https://lore.kernel.org/r/20200418095903.35118-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
f166021c0f scsi: bfa: Remove unneeded semicolon in bfa_fcs_lport_ns_sm_online()
Fix the following coccicheck warning:

drivers/scsi/bfa/bfa_fcs_lport.c:4361:3-4: Unneeded semicolon

Link: https://lore.kernel.org/r/20200418070553.11262-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Wu Bo
f8f794a15a scsi: pmcraid: Replace dma_pool_malloc with dma_pool_zalloc
Replace dma_pool_malloc with dma_pool_zalloc to make the code more concise
in pmcraid_allocate_control_blocks() function.
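
The same simplification in userspace terms, as a sketch. malloc(), memset()
and calloc() stand in for the DMA pool API, and the function names are
hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Before: allocate, then clear in a separate step. */
static void *demo_alloc_then_clear(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0, size);
	return p;
}

/* After: a zeroing allocator does both in one call. */
static void *demo_zalloc(size_t size)
{
	return calloc(1, size);
}
```

Both helpers return zero-filled memory. The second one simply has less code
to get wrong.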

Link: https://lore.kernel.org/r/1587197241-274646-1-git-send-email-wubo40@huawei.com
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Maurizio Lombardi
7c59dace7e scsi: target: iscsi: Remove the iscsi_data_count structure
This patch removes the iscsi_data_count structure and the
iscsit_do_rx_data() function because they are used only by rx_data()

Link: https://lore.kernel.org/r/20200424113913.17237-1-mlombard@redhat.com
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Ming Lei
f983622ae6 scsi: core: Avoid calling synchronize_rcu() for each device in scsi_host_block()
scsi_host_block() calls scsi_internal_device_block() for each scsi_device and
scsi_internal_device_block() calls blk_mq_quiesce_queue() for each LUN.

Since synchronize_rcu() is called from blk_mq_quiesce_queue(), this can cause
substantial slowdowns on systems with many LUNs.

Use scsi_internal_device_block_nowait() to implement scsi_host_block() so it
is sufficient to run synchronize_rcu() once. This is safe since SCSI does not
set the BLK_MQ_F_BLOCKING flag.
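
The cost difference can be sketched with a counter standing in for the
expensive wait. This is a simplified model with hypothetical names, not the
SCSI or block layer code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts calls to the stand-in for an expensive grace-period wait. */
static unsigned int demo_sync_calls;

static void demo_synchronize(void)
{
	demo_sync_calls++;
}

/* Per-device variant: one wait for every device. */
static void demo_block_each(bool *blocked, size_t ndev)
{
	for (size_t i = 0; i < ndev; i++) {
		blocked[i] = true;
		demo_synchronize();
	}
}

/* Batched variant: mark every device without waiting, then wait once. */
static void demo_block_all(bool *blocked, size_t ndev)
{
	for (size_t i = 0; i < ndev; i++)
		blocked[i] = true;
	demo_synchronize();
}
```

For 8 devices the first variant performs 8 waits and the second performs 1,
which is the saving the patch is after on hosts with many LUNs.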

[mkp: commit desc and comment tweaks]

Link: https://lore.kernel.org/r/20200423020713.332743-1-ming.lei@redhat.com
Cc: Steffen Maier <maier@linux.ibm.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:15 -04:00
Jason Yan
3fa65812c2 scsi: BusLogic: Remove conversion to bool in blogic_inquiry()
The '!=' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/scsi/BusLogic.c:2240:46-51: WARNING: conversion to bool not
needed here
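
A minimal sketch of the pattern coccicheck flags, with hypothetical function
names rather than the BusLogic code:

```c
#include <stdbool.h>

/* Flagged form: the comparison is already a truth value. */
static bool demo_differs_verbose(int a, int b)
{
	return (a != b) ? true : false;
}

/* Cleaned-up form: return the comparison result directly. */
static bool demo_differs(int a, int b)
{
	return a != b;
}
```

Both functions return the same result for every input.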

Link: https://lore.kernel.org/r/20200421034120.28433-1-yanaijie@huawei.com
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:14 -04:00
Jason Yan
1a5d1d940b scsi: megaraid: Use true, false for bool variables
Fix the following coccicheck warning:

drivers/scsi/megaraid/megaraid_sas_fusion.c:4242:6-16: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4786:1-29: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4791:1-29: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4716:1-29: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4721:1-29: WARNING:
Assignment of 0/1 to bool variable

Link: https://lore.kernel.org/r/20200421034111.28353-1-yanaijie@huawei.com
Acked-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:08 -04:00
Eric W. Biederman
3147d8aaa0 proc: Use PIDTYPE_TGID in next_tgid
Combine the pid_task and the test has_group_leader_pid into a single
dereference by using pid_task(PIDTYPE_TGID).

This makes the code simpler and proof against needing to even think
about any shenanigans that de_thread might get up to.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-04-24 17:16:35 -05:00
Eric W. Biederman
0fb5ce62c5 proc: modernize proc to support multiple private instances
Alexey Gladkov <gladkov.alexey@gmail.com> writes:
 Procfs modernization:
 ---------------------
 Historically procfs was always tied to pid namespaces, during pid
 namespace creation we internally create a procfs mount for it. However,
 this has the effect that all new procfs mounts are just a mirror of the
 internal one, any change, any mount option update, any new future
 introduction will propagate to all other procfs mounts that are in the
 same pid namespace.

 This may have solved several use cases in that time. However today we
 face new requirements, and making procfs able to support new private
 instances inside same pid namespace seems a major point. If we want to
 introduce new features and security mechanisms we have to make sure
 first that we do not break existing usecases. Supporting private procfs
 instances will allow to support new features and behaviour without
 propagating it to all other procfs mounts.

 Today procfs is more of a burden especially to some Embedded, IoT,
 sandbox, container use cases. In user space we are over-mounting null
 or inaccessible files on top to hide files and information. If we want
 to hide pids we have to create PID namespaces otherwise mount options
 propagate to all other proc mounts, changing a mount option value in one
 mount will propagate to all other proc mounts. If we want to introduce
 new features, then they will propagate to all other mounts too, resulting
 either maybe new useful functionality or maybe breaking stuff. We have
 also to note that userspace should not workaround procfs, the kernel
 should just provide a sane simple interface.

 In this regard several developers and maintainers pointed out that
 there are problems with procfs and it has to be modernized:

 "Here's another one: split up and modernize /proc." by Andy Lutomirski [1]

 Discussion about kernel pointer leaks:

 "And yes, as Kees and Daniel mentioned, it's definitely not just dmesg.
 In fact, the primary things tend to be /proc and /sys, not dmesg
 itself." By Linus Torvalds [2]

 A lot of other areas in the kernel and filesystems have been updated to be
 able to support private instances, devpts is one major example [3].

 Which will be used for:

 1) Embedded systems and IoT: usually we have one supervisor for
 apps, we have some lightweight sandbox support, however if we create
 pid namespaces we have to manage all the processes inside too,
 where our goal is to be able to run a bunch of apps each one inside
 its own mount namespace, maybe use network namespaces for vlans
 setups, but right now we only want mount namespaces, without all the
 other complexity. We want procfs to behave more like a real file system,
 and block access to inodes that belong to other users. The 'hidepid=' will
 not work since it is a shared mount option.

 2) Containers, sandboxes and Private instances of file systems - devpts case
 Historically, a lot of file systems inside the Linux kernel, when instantiated,
 were just a mirror of an already created and mounted filesystem. This was the
 case of devpts filesystem, it seems at that time the requirements were to
 optimize things and reuse the same memory, etc. This design used to work but not
 anymore with today's containers, IoT, hostile environments and all the privacy
 challenges that Linux faces.

 In that regard, devpts was updated so that each new mount is a totally
 independent file system by the following patches:

 "devpts: Make each mount of devpts an independent filesystem" by
 Eric W. Biederman [3] [4]

 3) Linux Security Modules have multiple ptrace paths inside some
 subsystems, however inside procfs, the implementation does not guarantee
 that the ptrace() check which triggers the security_ptrace_check() hook
 will always run. We have the 'hidepid' mount option that can be used to
 force the ptrace_may_access() check inside has_pid_permissions() to run.
 The problem is that 'hidepid' is per pid namespace and not attached to
 the mount point, any remount or modification of 'hidepid' will propagate
 to all other procfs mounts.

 This also does not allow to support Yama LSM easily in desktop and user
 sessions. Yama ptrace scope which restricts ptrace and some other
 syscalls to be allowed only on inferiors, can be updated to have a
 per-task context, where the context will be inherited during fork(),
 clone() and preserved across execve(). If we support multiple private
 procfs instances, then we may force the ptrace_may_access() on
 /proc/<pids>/ to always run inside those new procfs instances. This will
 allow specifying on user sessions if we should populate procfs with
 pids that the user can ptrace or not.

 By using Yama ptrace scope, some restricted users will only be able to see
 inferiors inside /proc, they won't even be able to see their other
 processes. Some software like Chromium, Firefox's crash handler, Wine
 and others are already using Yama to restrict which processes can be
 ptraceable. This change will give the possibility to restrict
 /proc/<pids>/ but more importantly this will give desktop users a
 generic and usable way to specify which users should see all processes
 and which users cannot.

 Side notes:

 * This covers the lack of seccomp where it is not able to parse
 arguments, it is easy to install a seccomp filter on direct syscalls
 that operate on pids, however /proc/<pid>/ is a Linux ABI using
 filesystem syscalls. With this change all LSMs should be able to analyze
 open/read/write/close... on /proc/<pid>/

 4) This will allow to implement new features either in kernel or
 userspace without having to worry about procfs.
 In containers, sandboxes, etc we have workarounds to hide some /proc
 inodes, this should be supported natively without doing extra complex
 work, the kernel should be able to support sane options that work with
 today and future Linux use cases.

 5) Creation of new superblock with all procfs options for each procfs
 mount will fix the ignoring of mount options. The problem is that the
 second mount of procfs in the same pid namespace ignores the mount
 options. The mount options are ignored without error until procfs is
 remounted.

 Before:

 proc /proc proc rw,relatime,hidepid=2 0 0

 mount("proc", "/tmp/proc", "proc", 0, "hidepid=1") = 0
 +++ exited with 0 +++

 proc /proc proc rw,relatime,hidepid=2 0 0
 proc /tmp/proc proc rw,relatime,hidepid=2 0 0

 proc /proc proc rw,relatime,hidepid=1 0 0
 proc /tmp/proc proc rw,relatime,hidepid=1 0 0

 After:

 proc /proc proc rw,relatime,hidepid=ptraceable 0 0

 proc /proc proc rw,relatime,hidepid=ptraceable 0 0
 proc /tmp/proc proc rw,relatime,hidepid=invisible 0 0

 Introduced changes:
 -------------------
 Each mount of procfs creates a separate procfs instance with its own
 mount options.

 This series adds few new mount options:

 * New 'hidepid=ptraceable' or 'hidepid=4' mount option to show only ptraceable
 processes in the procfs. This allows to support lightweight sandboxes in
 Embedded Linux, also solves the case for LSM where now with this mount option,
 we make sure that they have a ptrace path in procfs.

 * 'subset=pid' that allows to hide non-pid inodes from procfs. It can be used
 in containers and sandboxes, as these are already trying to hide and block
 access to procfs inodes anyway.
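
A sketch of how a program might request such a private instance, modelled on
the mount() call in the "Before" trace. The helper names and the target path
are hypothetical. The call needs mount privileges, and the options are only
understood by kernels that carry this series:

```c
#include <stdio.h>
#include <sys/mount.h>

/* Build an option string such as "hidepid=ptraceable,subset=pid". */
static int demo_proc_opts(char *buf, size_t len, const char *hidepid,
			  int pid_only)
{
	return snprintf(buf, len, "hidepid=%s%s", hidepid,
			pid_only ? ",subset=pid" : "");
}

/* Mount a procfs instance at 'target' with its own options. */
static int demo_mount_proc(const char *target, const char *opts)
{
	/* Returns 0 on success, -1 with errno set on failure. */
	return mount("proc", target, "proc", 0, opts);
}
```

For example, build the options with demo_proc_opts(buf, sizeof(buf),
"ptraceable", 1) and pass them to demo_mount_proc("/tmp/proc", buf).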

 ChangeLog:
 ----------
 * Rebase on top of v5.7-rc1.
 * Fix a resource leak if proc is not mounted or if proc is simply reconfigured.
 * Add few selftests.

 * After a discussion with Eric W. Biederman, the numerical values for hidepid
   parameter have been removed from uapi.
 * Remove proc_self and proc_thread_self from the pid_namespace struct.
 * I took into account the comment of Kees Cook.
 * Update Reviewed-by tags.

 * 'subset=pidfs' renamed to 'subset=pid' as suggested by Alexey Dobriyan.
 * Include Reviewed-by tags.

 * Rebase on top of Eric W. Biederman's procfs changes.
 * Add human readable values of 'hidepid' as suggested by Andy Lutomirski.

 * Started using RCU lock to clean dcache entries as suggested by Linus Torvalds.

 * 'pidonly=1' renamed to 'subset=pidfs' as suggested by Alexey Dobriyan.
 * HIDEPID_* moved to uapi/ as they are user interface to mount().
   Suggested-by Alexey Dobriyan <adobriyan@gmail.com>

 * 'hidepid=' and 'gid=' mount options are moved from pid namespace to superblock.
 * 'newinstance' mount option removed as suggested by Eric W. Biederman.
    Mount of procfs always creates a new instance.
 * 'limit_pids' renamed to 'hidepid=3'.
 * I took into account the comment of Linus Torvalds [7].
 * Documentation added.

 * Fixed a bug that caused a problem with the Fedora boot.
 * The 'pidonly' option is visible among the mount options.

 * Renamed mount options to 'newinstance' and 'pids='
    Suggested-by: Andy Lutomirski <luto@kernel.org>
 * Fixed order of commit, Suggested-by: Andy Lutomirski <luto@kernel.org>
 * Many bug fixes.

 * Removed 'unshared' mount option and replaced it with 'limit_pids'
    which is attached to the current procfs mount.
    Suggested-by Andy Lutomirski <luto@kernel.org>
 * Do not fill dcache with pid entries that we can not ptrace.
 * Many bug fixes.

 References:
 -----------
 [1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2017-January/004215.html
 [2] http://www.openwall.com/lists/kernel-hardening/2017/10/05/5
 [3] https://lwn.net/Articles/689539/
 [4] http://lxr.free-electrons.com/source/Documentation/filesystems/devpts.txt?v=3.14
 [5] https://lkml.org/lkml/2017/5/2/407
 [6] https://lkml.org/lkml/2017/5/3/357
 [7] https://lkml.org/lkml/2018/5/11/505

 Alexey Gladkov (7):
   proc: rename struct proc_fs_info to proc_fs_opts
   proc: allow to mount many instances of proc in one pid namespace
   proc: instantiate only pids that we can ptrace on 'hidepid=4' mount
     option
   proc: add option to mount only a pids subset
   docs: proc: add documentation for "hidepid=4" and "subset=pid" options
     and new mount behavior
   proc: use human-readable values for hidepid
   proc: use named enums for better readability

  Documentation/filesystems/proc.rst            |  92 +++++++++---
  fs/proc/base.c                                |  48 +++++--
  fs/proc/generic.c                             |   9 ++
  fs/proc/inode.c                               |  30 +++-
  fs/proc/root.c                                | 131 +++++++++++++-----
  fs/proc/self.c                                |   6 +-
  fs/proc/thread_self.c                         |   6 +-
  fs/proc_namespace.c                           |  14 +-
  include/linux/pid_namespace.h                 |  12 --
  include/linux/proc_fs.h                       |  30 +++-
  tools/testing/selftests/proc/.gitignore       |   2 +
  tools/testing/selftests/proc/Makefile         |   2 +
  .../selftests/proc/proc-fsconfig-hidepid.c    |  50 +++++++
  .../selftests/proc/proc-multiple-procfs.c     |  48 +++++++
  14 files changed, 384 insertions(+), 96 deletions(-)
  create mode 100644 tools/testing/selftests/proc/proc-fsconfig-hidepid.c
  create mode 100644 tools/testing/selftests/proc/proc-multiple-procfs.c

Link: https://lore.kernel.org/lkml/20200419141057.621356-1-gladkov.alexey@gmail.com/
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-04-24 17:04:06 -05:00
Alexey Gladkov
c59f415a7c Use proc_pid_ns() to get pid_namespace from the proc superblock
A special helper should be used to get the pid_namespace from the procfs
superblock. This avoids errors when the type of s_fs_info changes.

Link: https://lore.kernel.org/lkml/20200423200316.164518-3-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/20200423112858.95820-1-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/06B50A1C-406F-4057-BFA8-3A7729EA7469@lca.pw/
Signed-off-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-04-24 16:38:30 -05:00
Rob Herring
2bdfd4fbcb dt-bindings: Fix erroneous 'additionalProperties'
There are several cases of json-schema 'additionalProperties' at the wrong
indentation level which has the effect of making them DT properties. This
is harmless, but let's fix them so a meta-schema check for this can be
added.

In all the cases, either the 'additionalProperties' was extra or doesn't
work because there's a $ref to more properties. In the latter case, we
can use 'unevaluatedProperties' instead.

Reported-by: Iskren Chernev <iskren.chernev@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Saravanan Sekar <sravanhome@gmail.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-24 16:36:48 -05:00
Chris Wilson
9669a50799 drm/i915: Drop rq->ring->vma peeking from error capture
We only hold the active spinlock while dumping the error state, and this
does not prevent another thread from retiring the request -- as it is
quite possible that despite us capturing the current state, the GPU has
completed the request. As such, it is dangerous to dereference state
below the request as it may already be freed, and the simplest way to
avoid the danger is not to include it in the error state.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424191410.27570-1-chris@chris-wilson.co.uk
2020-04-24 22:14:35 +01:00
Eric W. Biederman
6ade99ec61 proc: Put thread_pid in release_task not proc_flush_pid
Oleg pointed out that in the unlikely event the kernel is compiled
with CONFIG_PROC_FS unset that release_task will now leak the pid.

Move the put_pid out of proc_flush_pid into release_task to fix this
and to guarantee I don't make that mistake again.

When possible it makes sense to keep get and put in the same function
so it can easily be seen how they pair up.

Fixes: 7bc3e6e55a ("proc: Use a list of inodes to flush from proc")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-04-24 15:49:00 -05:00
Linus Torvalds
8e9ccd0f26 Merge tag 'pm-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "Restore an optimization related to asynchronous suspend and resume of
  devices during system-wide power transitions that was disabled by
  mistake (Kai-Heng Feng) and update the pm-graph suite of power
  management utilities (Todd Brandt)"

* tag 'pm-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: core: Switch back to async_schedule_dev()
  pm-graph v5.6
2020-04-24 13:43:37 -07:00
Linus Torvalds
5be35f7ffc Merge tag 'pnp-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull PNP cleanup from Rafael Wysocki:
 "Make the PNP code use list_for_each_entry() in a few places instead of
  open-coding it (Jason Gunthorpe)"

* tag 'pnp-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  pnp: Use list_for_each_entry() instead of open coding
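
For readers unfamiliar with the helper, here is a userspace sketch of what replacing an open-coded loop with list_for_each_entry() looks like. The list_head, container_of and list_for_each_entry definitions below are simplified re-implementations written for this example (they assume GCC or Clang for __typeof__), and struct dev is a hypothetical type, not anything from the PNP code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified version of the kernel macro; relies on __typeof__. */
#define list_for_each_entry(pos, head, member)                         \
    for (pos = container_of((head)->next, __typeof__(*pos), member);   \
         &pos->member != (head);                                       \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Hypothetical entry type with an embedded list node. */
struct dev {
    int id;
    struct list_head node;
};

/* Open-coded: walk raw nodes and convert each one by hand. */
static int sum_open_coded(struct list_head *head)
{
    struct list_head *p;
    int sum = 0;

    for (p = head->next; p != head; p = p->next) {
        struct dev *d = container_of(p, struct dev, node);

        sum += d->id;
    }
    return sum;
}

/* Same traversal with the iterator macro: no manual conversion needed. */
static int sum_with_macro(struct list_head *head)
{
    struct dev *d;
    int sum = 0;

    list_for_each_entry(d, head, node)
        sum += d->id;
    return sum;
}
```

Both functions visit the same entries in the same order; the macro form just hides the node-to-entry conversion and the termination test.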
2020-04-24 13:41:29 -07:00
Linus Torvalds
9dc5d985fd Merge tag 'acpi-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "Drop a lid status quirk for Asus T200TA that is not necessary any more
  and clean up a resource management inconsistency in the PCI IRQ link
  configuration code.

  Both changes from Hans de Goede"

* tag 'acpi-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: button: Drop no longer necessary Asus T200TA lid_init_state quirk
  ACPI/PCI: pci_link: use extended_irq union member when setting ext-irq shareable
2020-04-24 13:37:19 -07:00
Linus Torvalds
bc0c4d1e17 mm: check that mm is still valid in madvise()
IORING_OP_MADVISE can end up basically doing mprotect() on the VM of
another process, which means that it can race with our crazy core dump
handling which accesses the VM state without holding the mmap_sem
(because it incorrectly thinks that it is the final user).

This is clearly a core dumping problem, but we've never fixed it the
right way, and instead have the notion of "check that the mm is still
ok" using mmget_still_valid() after getting the mmap_sem for writing in
any situation where we're not the original VM thread.

See commit 04f5866e41 ("coredump: fix race condition between
mmget_not_zero()/get_task_mm() and core dumping") for more background on
this whole mmget_still_valid() thing.  You might want to have a barf bag
handy when you do.

We're discussing just fixing this properly in the only remaining core
dumping routines.  But even if we do that, let's make do_madvise() do
the right thing, and then when we fix core dumping, we can remove all
these mmget_still_valid() checks.
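
The shape of the check described above can be sketched as a toy userspace model. Nothing here is the actual do_madvise() change: struct mm_model, mm_still_valid() and madvise_model() are hypothetical, the core_state field is an assumption about what mmget_still_valid() tests, the write lock is reduced to a flag, and the error code returned on failure is chosen purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an address space; not the kernel's struct mm_struct. */
struct mm_model {
    void *core_state;   /* assumed: non-NULL while a core dump is running */
    int write_locked;   /* stands in for holding mmap_sem for writing */
    int advice_applied; /* counts how often the VM state was modified */
};

/* Models the "is the mm still ok" test done after taking the lock. */
static bool mm_still_valid(const struct mm_model *mm)
{
    return mm->core_state == NULL;
}

static int madvise_model(struct mm_model *mm)
{
    int ret = 0;

    mm->write_locked = 1; /* take the lock for writing first */

    /* Only then check validity; bail out without touching VM state. */
    if (!mm_still_valid(mm)) {
        ret = -EINTR; /* illustrative error code, an assumption */
        goto out;
    }

    mm->advice_applied++; /* the actual modification */
out:
    mm->write_locked = 0; /* always drop the lock */
    return ret;
}
```

The ordering is the important part: the validity test happens after the write lock is taken, and the failure path still releases the lock.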

Reported-and-tested-by: Jann Horn <jannh@google.com>
Fixes: c1ca757bd6 ("io_uring: add IORING_OP_MADVISE")
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-24 13:28:03 -07:00
David S. Miller
c651b461b5 Merge tag 'mac80211-for-net-2020-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
Just three changes:
 * fix a wrong GFP_KERNEL in hwsim
 * fix the debugfs mess after the mac80211 registration race fix
 * suppress false-positive RCU list lockdep warnings
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 13:17:01 -07:00
David S. Miller
0303b3a168 Merge tag 'wireless-drivers-2020-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:

====================
wireless-drivers fixes for v5.7

Second set of fixes for v5.7. Quite a few iwlwifi fixes and some
maintainers file updates.

iwlwifi

* fix a bug with kmemdup() error handling

* fix a DMA pool warning about unfreed memory

* fix beacon statistics

* fix a theoretical bug in device initialisation

* fix queue limit handling and inactive TID removal

* disable ACK Enabled Aggregation which was enabled by accident

* fix transmit power setting reading from BIOS with certain versions
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 13:14:05 -07:00
Linus Torvalds
aee1a009c9 Merge tag 'io_uring-5.7-2020-04-24' of git://git.kernel.dk/linux-block
Pull io_uring fix from Jens Axboe:
 "Single fixup for a change that went into -rc2"

* tag 'io_uring-5.7-2020-04-24' of git://git.kernel.dk/linux-block:
  io_uring: only restore req->work for req that needs do completion
2020-04-24 12:58:22 -07:00
Linus Torvalds
81da3d3c10 Merge tag 'libata-5.7-2020-04-24' of git://git.kernel.dk/linux-block
Pull libata fixlet from Jens Axboe:
 "Minor spelling error fix for libata"

* tag 'libata-5.7-2020-04-24' of git://git.kernel.dk/linux-block:
  ata: sata_inic162x fix a spelling issue
2020-04-24 12:54:13 -07:00
Linus Torvalds
3d29cb17ba Merge tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few fixes/changes that should go into this release:

   - null_blk zoned fixes (Damien)

   - blkdev_close() sync improvement (Douglas)

   - Fix regression in blk-iocost that impacted (at least) systemtap
     (Waiman)

   - Comment fix, header removal (Zhiqiang, Jianpeng)"

* tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block:
  null_blk: Cleanup zoned device initialization
  null_blk: Fix zoned command handling
  block: remove unused header
  blk-iocost: Fix error on iocost_ioc_vrate_adj
  bdev: Reduce time holding bd_mutex in sync in blkdev_close()
  buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.
2020-04-24 12:44:19 -07:00
Takashi Iwai
4b63340b9b Merge branch 'topic/pcm-oss-fix' into for-linus
An empty merge of PCM OSS fix for 5.6 code base.
The fix for 5.7 was already applied.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-04-24 21:39:31 +02:00
Linus Torvalds
da5de55d17 Merge tag 'trace-v5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "A few tracing fixes:

   - Two fixes for memory leaks detected by kmemleak

   - Removal of some dead code

   - A few local functions turned static"

* tag 'trace-v5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Convert local functions in tracing_map.c to static
  tracing: Remove DECLARE_TRACE_NOARGS
  ftrace: Fix memory leak caused by not freeing entry in unregister_ftrace_direct()
  tracing: Fix memory leaks in trace_events_hist.c
2020-04-24 12:39:21 -07:00
Takashi Iwai
ac957e8c54 ALSA: pcm: oss: Place the plugin buffer overflow checks correctly (for 5.7)
[ This is again a forward-port of the fix applied for 5.6-base code
  (commit 4285de0725) to 5.7-base, hence neither Fixes nor
  Cc-to-stable tags are included here -- tiwai ]

The checks of the plugin buffer overflow in the previous fix by commit
  f2ecf903ef ("ALSA: pcm: oss: Avoid plugin buffer overflow")
are put in the wrong places mistakenly, which leads to the unexpected
(repeated) sound when the rate plugin is involved.  Fix in the right
places.

Also, at those right places, the zero check is needed for the
termination node, so added there as well, and let's get it done,
finally.

Link: https://lore.kernel.org/r/20200424193843.20397-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-04-24 21:39:09 +02:00
Rafael J. Wysocki
edb7f9d6b5 Merge back system-wide PM updates for v5.8.
2020-04-24 21:37:01 +02:00
Bjorn Helgaas
8c8ff55b4d PCI/AER: Don't select CONFIG_PCIEAER by default
PCIe Advanced Error Reporting (AER) is optional and there's no need for it
to be selected by default.

Remove the "default y" for CONFIG_PCIEAER.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Russell Currey <ruscur@russell.cc>
Cc: Sam Bobroff <sbobroff@linux.ibm.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
2020-04-24 14:35:55 -05:00