PCI: Restore ARI Capable Hierarchy before setting numVFs

In the restore path, we previously read PCI_SRIOV_VF_OFFSET and
PCI_SRIOV_VF_STRIDE before restoring PCI_SRIOV_CTRL_ARI:

  pci_restore_state
    pci_restore_iov_state
      sriov_restore_state
        pci_iov_set_numvfs
          pci_read_config_word(... PCI_SRIOV_VF_OFFSET, &iov->offset)
          pci_read_config_word(... PCI_SRIOV_VF_STRIDE, &iov->stride)
        pci_write_config_word(... PCI_SRIOV_CTRL, iov->ctrl)

But per SR-IOV r1.1, sec 3.3.3.5, the device can use PCI_SRIOV_CTRL_ARI to
determine PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE.  Therefore, this
path, which is used for suspend/resume and AER recovery, can corrupt
iov->offset and iov->stride.

Since the iov state is associated with the device, not the driver, if we
reload the driver, it will use the the corrupted data, which may cause
crashes like this:

  kernel BUG at drivers/pci/iov.c:157!
  RIP: 0010:pci_iov_add_virtfn+0x2eb/0x350
  Call Trace:
   pci_enable_sriov+0x353/0x440
   ixgbe_pci_sriov_configure+0xd5/0x1f0 [ixgbe]
   sriov_numvfs_store+0xf7/0x170
   dev_attr_store+0x18/0x30
   sysfs_kf_write+0x37/0x40
   kernfs_fop_write+0x120/0x1b0
   vfs_write+0xb5/0x1a0
   SyS_write+0x55/0xc0

Restore PCI_SRIOV_CTRL_ARI before calling pci_iov_set_numvfs(), then
restore the rest of PCI_SRIOV_CTRL (which may set PCI_SRIOV_CTRL_VFE)
afterwards.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[bhelgaas: changelog, add comment, also clear ARI if necessary]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Emil Tantilov <emil.s.tantilov@intel.com>
This commit is contained in:
Tony Nguyen 2017-10-04 08:52:58 -07:00 committed by Bjorn Helgaas
parent 27d6162944
commit ff26449e41

View File

@ -498,6 +498,14 @@ static void sriov_restore_state(struct pci_dev *dev)
if (ctrl & PCI_SRIOV_CTRL_VFE)
return;
/*
* Restore PCI_SRIOV_CTRL_ARI before pci_iov_set_numvfs() because
* it reads offset & stride, which depend on PCI_SRIOV_CTRL_ARI.
*/
ctrl &= ~PCI_SRIOV_CTRL_ARI;
ctrl |= iov->ctrl & PCI_SRIOV_CTRL_ARI;
pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, ctrl);
for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++)
pci_update_resource(dev, i);