On PowerNV platform, resource position in M64 BAR implies the PE# the
resource belongs to. In some cases, adjustment of a resource is necessary
to locate it to a correct position in M64 BAR .
This patch adds pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR
address according to an offset.
Note:
After doing so, there would be a "hole" in the /proc/iomem when offset
is a positive value. It looks like the device return some mmio back to
the system, which actually no one could use it.
[bhelgaas: rework loops, rework overlap check, index resource[]
conventionally, remove pci_regs.h include, squashed with next patch]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Implement pcibios_iov_resource_alignment() on powernv platform.
On PowerNV platform, there are 3 cases for the IOV BAR:
1. initial state, the IOV BAR size is multiple times of VF BAR size
2. after expanded, the IOV BAR size is expanded to meet the M64 segment size
3. sizing stage, the IOV BAR is truncated to 0
pnv_pci_iov_resource_alignment() handle these three cases respectively.
[bhelgaas: adjust to drop "align" parameter, return pci_iov_resource_size()
if no ppc_md machdep_call version]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PHB3, PF IOV BAR will be covered by M64 BAR to have better PE isolation.
M64 BAR is a type of hardware resource in PHB3, which could map a range of
MMIO to PE numbers on powernv platform. And this range is divided equally
by the number of total_pe with each divided range mapping to a PE number.
Also, the M64 BAR must map a MMIO range with power-of-two size.
The total_pe number is usually different from total_VFs, which can lead to
a conflict between MMIO space and the PE number.
For example, if total_VFs is 128 and total_pe is 256, the second half of
M64 BAR will be part of other PCI device, which may already belong to other
PEs.
This patch prevents the conflict by reserving additional space for the PF
IOV BAR, which is total_pe number of VF's BAR size.
[bhelgaas: make dev_printk() output more consistent, index resource[]
conventionally]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Previously the iommu_table had the same lifetime as a struct pnv_ioda_pe
and was embedded in it. The pnv_ioda_pe was assigned to a PE on the bootup
stage. Since PEs are based on the hardware layout which is static in the
system, they will never get released. This means the iommu_table in the
pnv_ioda_pe will never get released either.
This no longer works for VF PE. VF PEs are created and released dynamically
when VFs are created and released. So we need to assign pnv_ioda_pe to VF
PEs respectively when VFs are enabled and clean up those resources for VF
PE when VFs are disabled. And iommu_table is one of the resources we need
to handle dynamically.
Current iommu_table is a static field in pnv_ioda_pe, which will face a
problem when freeing it. During the disabling of a VF,
pnv_pci_ioda2_release_dma_pe will call iommu_free_table to release the
iommu_table for this PE. A static iommu_table will fail in
iommu_free_table.
According to these requirement, this patch allocates iommu_table
dynamically.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Flag PCI_REASSIGN_ALL_RSRC is used to ignore resources information setup by
firmware, so that kernel would re-assign all resources of pci devices.
On powerpc arch, this happens in a header fixup function
pcibios_fixup_resources(), which will clean up the resources if this flag
is set. This works fine for PFs, since after clean up, kernel will
re-assign the resources in pcibios_resource_survey().
Below is a simple call flow on how it works:
pcibios_init
pcibios_scan_phb
pci_scan_child_bus
...
pci_device_add
pci_fixup_device(pci_fixup_header)
pcibios_fixup_resources # header fixup
for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
dev->resource[i].start = 0
pcibios_resource_survey # re-assign
pcibios_allocate_resources
However, the VF resources won't be re-assigned, since the VF resources are
completely determined by the PF resources, and the PF resources have
already been reassigned. This means we need to leave VF's resources
un-cleared in pcibios_fixup_resources().
In this patch, we skip the resource unset process in
pcibios_fixup_resources(), if the pci_dev is a VF.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pci_dn is the extension of PCI device node and is created from device node.
Unfortunately, VFs are enabled dynamically by PF's driver and they don't
have corresponding device nodes and pci_dn, which is required to access
VFs' config spaces.
The patch creates pci_dn for VFs in pcibios_sriov_enable() on their PF,
and removes pci_dn for VFs in pcibios_sriov_disable() on their PF. When
VF's pci_dn is created, it's put to the child list of the pci_dn of PF's
upstream bridge. The pci_dn is linked to pci_dev during early fixup time
to setup the fast path.
[bhelgaas: add ifdef around add_one_dev_pci_info(), use dev_printk()]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch removes struct eeh_dev::dn and the corresponding helper
functions: eeh_dev_to_of_node() and of_node_to_eeh_dev(). Instead,
eeh_dev_to_pdn() and pdn_to_eeh_dev() should be used to get the
pdn, which might contain device_node on PowerNV platform.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There are 3 EEH operations whose arguments contain device_node:
read_config(), write_config() and restore_config(). The patch
replaces device_node with pci_dn.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Originally, EEH core probes on device_node or pci_dev to populate
EEH devices and PEs, which conflicts with the fact: SRIOV VFs are
usually enabled and created by PF's driver and they don't have the
corresponding device_nodes. Instead, SRIOV VFs have dynamically
created pci_dn, which can be used for EEH probe.
The patch reworks EEH probe for PowerNV and pSeries platforms to
do probing based on pci_dn, instead of pci_dev or device_node any
more.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch adds function traverse_pci_dn(), which is similar to
traverse_pci_devices() except it takes pci_dn, not device_node
as parameter. The pci_dev.c has been reworked to create eeh_dev
from pci_dn, instead of device_node.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Originally, EEH probes on device_node or pci_dev and populates the
corresponding eeh_dev. In the subsequent patches, EEH will probes
on pci_dn and populates the corresponding eeh_dev. So we have to
cache some information in pci_dn, either from device_node or SRIOV
PF's enablement platform hook, to populate the eeh_dev properly.
The motivation to probe pci_dn, instead of device node or pci_dev,
to populate eeh_dev is SRIOV VFs are dynamically created and we
don't have the corresponding device nodes for them.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The PCI config accessors previously relied on device_node. Unfortunately,
VFs don't have a corresponding device_node, so change the accessors to use
pci_dn instead.
[bhelgaas: changelog]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, the PCI config accessors are implemented based on device node.
Unfortunately, SRIOV VFs won't have the corresponding device nodes. pci_dn
will be used in replacement with device node for SRIOV VFs. So we have to
use pci_dn in PCI config accessors.
The patch refactors pci_dn in following aspects to make it ready to be used
in PCI config accessors as we do in subsequent patch:
* pci_dn is organized as a hierarchy tree. PCI device's pci_dn is
put to the child list of pci_dn of its upstream bridge or PHB. VF's
pci_dn will be put to the child list of pci_dn of PF's bridge.
* For one particular PCI device (VF or not), its pci_dn can be
found from pdev->dev.archdata.pci_data, PCI_DN(devnode), or
parent's list. The fast path (fetching pci_dn through PCI device
instance) is populated during early fixup time.
[bhelgaas: changelog]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch removes unused file eeh-ioda.c and updates makefile
accordingly. Besides, the definition of "struct pnv_eeh_ops" and
the instances are all removed. Until now, the chip layer of EEH
implementation for PowerNV platform is removed completely.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation reset() and merges its logic to
eeh_ops::reset().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation next_error() and merges its
logic to eeh_ops::next_error().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation get_state() and merges its logic
to eeh_ops::get_state().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation set_option() and merges its
logic to eeh_ops::set_option().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation configure_bridge() and merges
its logic to eeh_ops::configure_bridge().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB operation get_log() and merges its logic to
eeh_ops::get_log().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation post_init() and merge its logic
to eeh_ops::post_init().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch drops PHB EEH operation err_inject() and merge its logic
to eeh_ops::err_inject().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch shortens names of EEH functions in powernv-eeh.c and no
logic change introduced by this patch.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch fixes the comments about ppc_md.pcibios_fixup(), which
should be called after allocating resources.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Function pcibios_set_pcie_reset_state() is possibly called by
pci_reset_function(), on which VFIO infrastructure depends to
issue reset. pcibios_set_pcie_reset_state() is issuing reset
on the parent PE of the indicated PCI device. The reset causes
state lost on all PCI devices except the indicated one as the
argument to pcibios_set_pcie_reset_state(). Also, sideband
MMIO access from guest when issuing reset would cause unexpected
EEH error.
For above two issues, the patch applies following enhancements
to pcibios_set_pcie_reset_state():
* For all PCI devices except the indicated one, save their
state prior to reset and restore state after that.
* Explicitly freeze PE prior to reset and unfreeze it after
that, in order to avoid unexpected EEH error.
Tested-by: Priya M. A <priyama2@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
- Fix ACPI resources management problems introduced by the recent
rework of the code in question (Jiang Liu) and a build issue
introduced by those changes (Joachim Nilsson).
- Fix a recent suspend-to-idle regression on systems where entering
idle states causes local timers to stop, prevent suspend-to-idle
from crashing in restricted configurations (no cpuidle driver,
cpuidle disabled etc.) and clean up the idle loop somewhat while
at it (Rafael J Wysocki).
- Fix build problem in the cpufreq ppc driver (Geert Uytterhoeven).
- Allow the ACPI backlight driver module to be loaded if ACPI is
disabled which helps the i915 driver in those configurations
(stable-candidate) and change the code to help debug unusual use
cases (Chris Wilson).
- Wakeup IRQ management changes in v3.18 caused some drivers on the
at91 platform to trigger a warning from the IRQ core related to
an unexpected combination of interrupt action handler flags.
However, on at91 a timer IRQ is shared with some other devices
(including system wakeup ones) and that leads to the unusual
combination of flags in question. To make it possible to avoid
the warning introduce a new interrupt action handler flag (which
can be used by drivers to indicate the special case to the core)
and rework the problematic at91 drivers to use it and work as
expected during system suspend/resume. From Boris Brezillon,
Rafael J Wysocki and Mark Rutland.
- Clean up the generic power domains subsystem's debugfs interface
(Kevin Hilman).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJU+cBpAAoJEILEb/54YlRxb+8P+weKzn3Lim4R86ZkYjUjSr+P
Y+1d9CvQETsGMqaJRssBQ8npSaXqGF7kDjj3a4WIONxrgIs9k/7wZmNtTDYC2C7T
flxQQunlaHrELFqguowSq2pLxDTbWIe1lF7vtPwv/Xn7bOd755NrnAPgITseuxh5
ggoZg4gWnfHL6THnnOY8Dw6ZciCe7/lxfdAQavL+0xYybvG8/0/Urn+CsA/Q4Oz7
S9g7OLuK5LOlgE8f14TvLykHCVrluGKXMaulDUqx0z4DqOS+OP+Dp65bLGAf6faE
kYmfnJfN5vcfARxvBHyYCKuQAviMxhbS3R4fqO15SbRws4hLHL7IEmuuBAuEbPES
oIXLR2OBHAWeyiStHxEOZ0yxwhK2KjCOks/dPPPGtK2ZF4PAmCsOk0cxh6WdnzH3
g50Tg5ebPFjnyT8OCFNFm1g1pAoKjt2RuN8OGcKwChYjek3Yk5fCrkty7jkJYtQE
xcfXwaDPwolZbo3X0yGrchbqJYmOU16Kuu1U20L80uL/1TxmzlF27pUyLj4BbJxW
co+cxumF4WA6lixfNOcVil4PEBgh3lhCD5FzkGOiE0CI/l3omVdmR40nPN++IllD
O7QxFVGxSRZfEeIP0ujjB6rwxJ8JsK3vwlUngommby7KFtssh9/VZ8l4FbjXnDXl
qLVbX2fxxSD3j8U9aEov
=nc5T
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes for recent regressions (ACPI resources management,
suspend-to-idle), stable-candidate fixes (ACPI backlight), fixes
related to the wakeup IRQ management changes made in v3.18, other
fixes (suspend-to-idle, cpufreq ppc driver) and a couple of cleanups
(suspend-to-idle, generic power domains, ACPI backlight).
Specifics:
- Fix ACPI resources management problems introduced by the recent
rework of the code in question (Jiang Liu) and a build issue
introduced by those changes (Joachim Nilsson).
- Fix a recent suspend-to-idle regression on systems where entering
idle states causes local timers to stop, prevent suspend-to-idle
from crashing in restricted configurations (no cpuidle driver,
cpuidle disabled etc.) and clean up the idle loop somewhat while at
it (Rafael J Wysocki).
- Fix build problem in the cpufreq ppc driver (Geert Uytterhoeven).
- Allow the ACPI backlight driver module to be loaded if ACPI is
disabled which helps the i915 driver in those configurations
(stable-candidate) and change the code to help debug unusual use
cases (Chris Wilson).
- Wakeup IRQ management changes in v3.18 caused some drivers on the
at91 platform to trigger a warning from the IRQ core related to an
unexpected combination of interrupt action handler flags. However,
on at91 a timer IRQ is shared with some other devices (including
system wakeup ones) and that leads to the unusual combination of
flags in question.
To make it possible to avoid the warning introduce a new interrupt
action handler flag (which can be used by drivers to indicate the
special case to the core) and rework the problematic at91 drivers
to use it and work as expected during system suspend/resume. From
Boris Brezillon, Rafael J Wysocki and Mark Rutland.
- Clean up the generic power domains subsystem's debugfs interface
(Kevin Hilman)"
* tag 'pm+acpi-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
genirq / PM: describe IRQF_COND_SUSPEND
tty: serial: atmel: rework interrupt and wakeup handling
watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND
cpuidle / sleep: Use broadcast timer for states that stop local timer
clk: at91: implement suspend/resume for the PMC irqchip
rtc: at91rm9200: rework wakeup and interrupt handling
rtc: at91sam9: rework wakeup and interrupt handling
PM / wakeup: export pm_system_wakeup symbol
genirq / PM: Add flag for shared NO_SUSPEND interrupt lines
ACPI / video: Propagate the error code for acpi_video_register
ACPI / video: Load the module even if ACPI is disabled
PM / Domains: cleanup: rename gpd -> genpd in debugfs interface
cpufreq: ppc: Add missing #include <asm/smp.h>
x86/PCI/ACPI: Relax ACPI resource descriptor checks to work around BIOS bugs
x86/PCI/ACPI: Ignore resources consumed by host bridge itself
cpuidle: Clean up fallback handling in cpuidle_idle_call()
cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too
idle / sleep: Avoid excessive disabling and enabling interrupts
PCI: versatile: Update for list_for_each_entry() API change
genirq / PM: better describe IRQF_NO_SUSPEND semantics
The set_memory_* functions currently only support module
addresses. The addresses are validated using is_module_addr.
That function is special though and relies on internal state
in the module subsystem to work properly. At the time of
module initialization and calling set_memory_*, it's too early
for is_module_addr to work properly so it always returns
false. Rather than be subject to the whims of the module state,
just bounds check against the module virtual address range.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* acpi-resources:
x86/PCI/ACPI: Relax ACPI resource descriptor checks to work around BIOS bugs
x86/PCI/ACPI: Ignore resources consumed by host bridge itself
PCI: versatile: Update for list_for_each_entry() API change
Pull x86 fixes from Ingo Molnar:
"Misc fixes: EFI fixes, an Intel Quark fix, an asm fix and an FPU
handling fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/xsaves: Fix improper uses of __ex_table
x86/intel/quark: Select COMMON_CLK
x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization
firmware: dmi_scan: Fix dmi_len type
efi/libstub: Fix boundary checking in efi_high_alloc()
firmware: dmi_scan: Fix dmi scan to handle "End of Table" structure
Commit:
f31a9f7c71 ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
introduced alternative instructions for XSAVES/XRSTORS and commit:
adb9d526e9 ("x86/xsaves: Add xsaves and xrstors support for booting time")
added support for the XSAVES/XRSTORS instructions at boot time.
Unfortunately both failed to properly protect them against faulting:
The 'xstate_fault' macro will use the closest label named '1'
backward and that ends up in the .altinstr_replacement section
rather than in .text. This means that the kernel will never find
in the __ex_table the .text address where this instruction might
fault, leading to serious problems if userspace manages to
trigger the fault.
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
[ Improved the changelog, fixed some whitespace noise. ]
Acked-by: Borislav Petkov <bp@alien8.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Allan Xavier <mr.a.xavier@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: adb9d526e9 ("x86/xsaves: Add xsaves and xrstors support for booting time")
Fixes: f31a9f7c71 ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The commit 8bbc2a135b ("x86/intel/quark: Add Intel Quark
platform support") introduced a minimal support of Intel Quark
SoC. That allows to use core parts of the SoC. However, the SPI,
I2C, and GPIO drivers can't be selected by kernel configuration
because they depend on COMMON_CLK. The patch adds a COMMON_CLK
selection to the platfrom definition to allow user choose the drivers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Ong, Boon Leong <boon.leong.ong@intel.com>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Darren Hart <dvhart@linux.intel.com>
Fixes: 8bbc2a135b ("x86/intel/quark: Add Intel Quark platform support")
Link: http://lkml.kernel.org/r/1425569044-2867-1-git-send-email-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and
the related state make sense for 'ret_from_sys_call'. This is
entirely the wrong check. TS_COMPAT would make a little more
sense, but there's really no point in keeping this optimization
at all.
This fixes a return to the wrong user CS if we came from int
0x80 in a 64-bit task.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net
[ Backported from tip:x86/asm. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Fix for dynticks.
- Fix for smpboot bug.
- Fix for IOMMU group refcounting.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU9mwsAAoJEFHr6jzI4aWAS28QALF6rTNK0lywqO+eV75BRLco
z8eINZozRJBBXjt5ATp4zJSAjOpnRNFlclfDPEcYav51cyxZRREbD322rMYR5uOL
9ixnOTDGpOFE9Sgt562OEVmRadAVrJf9gKw7/Cjebxb9XgfwpepvwBfN8Xxl1yc+
LnvnkJN327nDaBnWP7eTSkTeiznhDCdXz0HYmO1zUnrico2AZ/9qVdd+XD/QTk6o
t6m+aTowy9+4Ps7B4Y/l5dmKuOruFMxwSWFhUZrC6huvArJrYAMGvMT7++r2JTW4
tyHBCuB3pWHBm3BjlcR23sZoi8jPVkulKKLShGiXwaZxCSOhwMLwwFFVBHw2gJIW
grzzAtkM5cXACmGKiCWzAc8WMM0+u4vz/w1md+Nt4XvtZssRJwbXdJJHh7nXq+RE
M7IDnyvLI5m8Ae7qAnAts2Lf1SRZALY0zOUyS0jsn6jDJUw49BbKD54RuFqNzuhL
70dEsHy+awHEagIO47q+JEFWPaG20HzupVDe33j7KEx20xROvC4R2u+LIEG8YiJH
ve4xDVwTM7rVqvIvb17Rs0LLqKs8P9O7sQFoQf2V/xie/X1QwHDZsC24C4r07M/+
7ZqL9og3/nMo9SIKE1eNgBHq0FmUgOYkoaF+biDuzava9JYNxeT7nOhFfOqNz4RO
M+tgZwHU1NDpVHJerdP3
=8wky
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
- Fix for dynticks.
- Fix for smpboot bug.
- Fix for IOMMU group refcounting.
* tag 'powerpc-4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc/iommu: Remove IOMMU device references via bus notifier
powerpc/smp: Wait until secondaries are active & online
powerpc: Re-enable dynticks
When parsing resources for PCI host bridge, we should ignore resources
consumed by host bridge itself and only report window resources available
to child PCI busses.
Fixes: 593669c2ac (x86/PCI/ACPI: Use common ACPI resource interfaces ...)
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After d905c5df9a ("PPC: POWERNV: move iommu_add_device earlier"), the
refcnt on the kobject backing the IOMMU group for a PCI device is
elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
set_iommu_table_base_and_group). When we go to dlpar a multi-function
PCI device out:
iommu_reconfig_notifier ->
iommu_free_table ->
iommu_group_put
BUG_ON(tbl->it_group)
We trip this BUG_ON, because there are still references on the table, so
it is not freed. Fix this by moving the powernv bus notifier to common
code and calling it for both powernv and pseries.
Fixes: d905c5df9a ("PPC: POWERNV: move iommu_add_device earlier")
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
"kernel BUG at kernel/smpboot.c:134!" issue during boot:
BUG_ON(td->cpu != smp_processor_id());
Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
output confirms it:
CPU: 0
Comm: watchdog/130
The problem is that we aren't ensuring the CPU active bit is set for the
secondary before allowing the master to continue on. The master unparks
the secondary CPU's kthreads and the scheduler looks for a CPU to run
on. It calls select_task_rq() and realises the suggested CPU is not in
the cpus_allowed mask. It then ends up in select_fallback_rq(), and
since the active bit isnt't set we choose some other CPU to run on.
This seems to have been introduced by 6acbfb9697 "sched: Fix hotplug
vs. set_cpus_allowed_ptr()", which changed from setting active before
online to setting active after online. However that was in turn fixing a
bug where other code assumed an active CPU was also online, so we can't
just revert that fix.
The simplest fix is just to spin waiting for both active & online to be
set. We already have a barrier prior to set_cpu_online() (which also
sets active), to ensure all other setup is completed before online &
active are set.
Fixes: 6acbfb9697 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull networking fixes from David Miller:
1) If an IPVS tunnel is created with a mixed-family destination
address, it cannot be removed. Fix from Alexey Andriyanov.
2) Fix module refcount underflow in netfilter's nft_compat, from Pablo
Neira Ayuso.
3) Generic statistics infrastructure can reference variables sitting on
a released function stack, therefore use dynamic allocation always.
Fix from Ignacy Gawędzki.
4) skb_copy_bits() return value test is inverted in ip_check_defrag().
5) Fix network namespace exit in openvswitch, we have to release all of
the per-net vports. From Pravin B Shelar.
6) Fix signedness bug in CAIF's cfpkt_iterate(), from Dan Carpenter.
7) Fix rhashtable grow/shrink behavior, only expand during inserts and
shrink during deletes. From Daniel Borkmann.
8) Netdevice names with semicolons should never be allowed, because
they serve as a separator. From Matthew Thode.
9) Use {,__}set_current_state() where appropriate, from Fabian
Frederick.
10) Revert byte queue limits support in r8169 driver, it's causing
regressions we can't figure out.
11) tcp_should_expand_sndbuf() erroneously uses tp->packets_out to
measure packets in flight, properly use tcp_packets_in_flight()
instead. From Neal Cardwell.
12) Fix accidental removal of support for bluetooth in CSR based Intel
wireless cards. From Marcel Holtmann.
13) We accidently added a behavioral change between native and compat
tasks, wrt testing the MSG_CMSG_COMPAT bit. Just ignore it if the
user happened to set it in a native binary as that was always the
behavior we had. From Catalin Marinas.
14) Check genlmsg_unicast() return valud in hwsim netlink tx frame
handling, from Bob Copeland.
15) Fix stale ->radar_required setting in mac80211 that can prevent
starting new scans, from Eliad Peller.
16) Fix memory leak in nl80211 monitor, from Johannes Berg.
17) Fix race in TX index handling in xen-netback, from David Vrabel.
18) Don't enable interrupts in amx-xgbe driver until all software et al.
state is ready for the interrupt handler to run. From Thomas
Lendacky.
19) Add missing netlink_ns_capable() checks to rtnl_newlink(), from Eric
W Biederman.
20) The amount of header space needed in macvtap was not calculated
properly, fix it otherwise we splat past the beginning of the
packet. From Eric Dumazet.
21) Fix bcmgenet TCP TX perf regression, from Jaedon Shin.
22) Don't raw initialize or mod timers, use setup_timer() and
mod_timer() instead. From Vaishali Thakkar.
23) Fix software maintained statistics in bcmgenet and systemport
drivers, from Florian Fainelli.
24) DMA descriptor updates in sh_eth need proper memory barriers, from
Ben Hutchings.
25) Don't do UDP Fragmentation Offload on RAW sockets, from Michal
Kubecek.
26) Openvswitch's non-masked set actions aren't constructed properly
into netlink messages, fix from Joe Stringer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
openvswitch: Fix serialization of non-masked set actions.
gianfar: Reduce logging noise seen due to phy polling if link is down
ibmveth: Add function to enable live MAC address changes
net: bridge: add compile-time assert for cb struct size
udp: only allow UFO for packets from SOCK_DGRAM sockets
sh_eth: Really fix padding of short frames on TX
Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"
sh_eth: Fix RX recovery on R-Car in case of RX ring underrun
sh_eth: Ensure proper ordering of descriptor active bit write/read
net/mlx4_en: Disbale GRO for incoming loopback/selftest packets
net/mlx4_core: Fix wrong mask and error flow for the update-qp command
net: systemport: fix software maintained statistics
net: bcmgenet: fix software maintained statistics
rxrpc: don't multiply with HZ twice
rxrpc: terminate retrans loop when sending of skb fails
net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface.
net: pasemi: Use setup_timer and mod_timer
net: stmmac: Use setup_timer and mod_timer
net: 8390: axnet_cs: Use setup_timer and mod_timer
net: 8390: pcnet_cs: Use setup_timer and mod_timer
...
Enable disabled interrupt, on unsuccessful operation.
Found by Coccinelle.
Signed-off-by: Tapasweni Pathak <tapaswenipathak@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.
Lets save the actual PC in the structure so that the correct value is
accessible later.
Fixes: 669e846e6c ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # v3.10+
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
In commit b4eef9b36d, we started to use hwapic_isr_update() != NULL
instead of kvm_apic_vid_enabled(vcpu->kvm). This didn't work because
SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
change apic->isr_count, because it should always be 1. The initial
value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
which is always 0 for SVM, so KVM could have injected interrupts when it
shouldn't.
Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the
initial isr_count depend on hwapic_isr_update() for good measure.
Fixes: b4eef9b36d ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This is just a single patch to fix the KSTK_EIP() and KSTK_ESP() macros
for metag which have always been erronously returning the PC and stack
pointer of the task's kernel context rather than from its user context
saved at entry from userland into the kernel, which affects the contents
of /proc/<pid>/maps and /proc/<pid>/stat.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJU9EnnAAoJEGwLaZPeOHZ67uYQALvIb6R/Webca0hEPuHa95ic
4mfKzj7iD4DZTBLsuYyq9+NIt2g24mCd5vbDLfH63PpEnRwjdt7y+n4xhRVoQS87
ZC2aLx3Ry6NC7ByGhEq0n4aTOaKYffe9y4XE8FZddn/9rZUIE2sEdXGtyrg6rsYf
Eb/e/senPBRPNT8LpurSmYcPsCB2q2yo+0503aj41VjCjPbYe92/QrIDU8Ag3R5y
c5C0btD9NOcB4xt/vIGU7H0OH85Q+OvLHBzu/5aVFyPelPtIE4xpYP1fRyd/P002
Jmm6KH52ILMArgqB3KavKMvCebQBwwf92LLUtQ5ZhdeX9TMYzgG22P3CmZcS49Ha
xwkIgDbeI1BQeMoVgTgVRMDnAOXmF/HdzxlbILHonaptiHDEOj3izdWpfubrIGi7
9/69L/hF3DY5udt8qBQ4fWDJrvBYQpoqyUEiv/eFfyhFpVaCxKQ0YQgWto3UujWG
7ESNkNp3kTTlo4NeUh47x1TE0CBiNHAGU+r72Uysb/u9N3Aya8b/jy8x4wCBemLs
vHL3bfgg7Pee067/O+w9GTQoe7ldzifcSrTGV3s7wpUqKPBGUdq4MtPaXDvJqk/W
uqnjoH1+/juvBpjwNwoavCXAO5CI6j19kKQH9iCc3v3YizSRtCG4VwfnVd2HgSc4
LtUrSZkfkwQvBn1oc4mg
=1+sf
-----END PGP SIGNATURE-----
Merge tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull arch/metag fix from James Hogan:
"This is just a single patch to fix the KSTK_EIP() and KSTK_ESP()
macros for metag which have always been erronously returning the PC
and stack pointer of the task's kernel context rather than from its
user context saved at entry from userland into the kernel, which
affects the contents of /proc/<pid>/maps and /proc/<pid>/stat"
* tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
metag: Fix KSTK_EIP() and KSTK_ESP() macros
Pull x86 fixes from Ingo Molnar:
"A CR4-shadow 32-bit init fix, plus two typo fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too
x86/platform/intel-mid: Fix trivial printk message typo in intel_mid_arch_setup()
x86/cpu/intel: Fix trivial typo in intel_tlb_table[]
Pull perf fixes from Ingo Molnar:
"Two kprobes fixes and a handful of tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Make sparc64 arch point to sparc
perf symbols: Define EM_AARCH64 for older OSes
perf top: Fix SIGBUS on sparc64
perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag
perf tools: Fix pthread_attr_setaffinity_np build error
perf tools: Define _GNU_SOURCE on pthread_attr_setaffinity_np feature check
perf bench: Fix order of arguments to memcpy_alloc_mem
kprobes/x86: Check for invalid ftrace location in __recover_probed_insn()
kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace
Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
table levels folded. Usually, these defines are provided by
<asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.
But some architectures fold page table levels in a custom way. They
need to define these macros themself. This patch adds missing defines.
The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
and __pud_alloc() on architectures without these page table levels.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The smc91x driver traditionally gets configured at compile-time
for whichever hardware it runs on. This no longer works on
ARM as we continue to move to building all-in-one kernels.
Most ARM configurations with this driver already use run-time
configuration through DT or through platform_data, but a
few have not been converted yet.
I've checked all ARM boards that use this driver in their
legacy board files, and converted the ones that were using
compile-time configuration in smc91x.h to behave like the
other ones and provide the interrupt polarity along with
the MMIO configuration (width, stride) at platform device
creation time.
In particular, these combinations were previously selectable
in Kconfig but in fact broken:
- sa1100 assabet plus pleb
- msm combined with any other armv6/v7 platform
- pxa-idp combined with any non-DMA pxa variant
- LogicPD PXA270 combined with any other pxa
- nomadik combined with any other armv4/v5 platform,
e.g. versatile.
None of these seem critical enough to warrant a backport
to stable, but it would be nice to clean this up for good.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
----
I would like the patch to get merged through netdev, after
Robert and/or Linus have verified it on at least some hardware.
There are a few other non-ARM platforms using this driver,
I could do the same patch for those if we want to take
it further.
arch/arm/mach-msm/board-halibut.c | 8 ++++-
arch/arm/mach-msm/board-qsd8x50.c | 8 ++++-
arch/arm/mach-pxa/idp.c | 5 +++
arch/arm/mach-pxa/lpd270.c | 8 ++++-
arch/arm/mach-realview/core.c | 7 ++++
arch/arm/mach-realview/realview_eb.c | 2 +-
arch/arm/mach-sa1100/neponset.c | 6 ++++
arch/arm/mach-sa1100/pleb.c | 7 ++++
drivers/net/ethernet/smsc/smc91x.c | 9 +++--
drivers/net/ethernet/smsc/smc91x.h | 114 ++----------------------------------------------------------
10 files changed, 57 insertions(+), 117 deletions(-)
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>