When domain is related to multiple iommus, need to check if the minimum agaw is sufficient for the mapped memory
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
vm_domid won't be set in context, find available domain id for a device from its iommu.
For a virtual machine domain, a default agaw will be set, and skip top levels of page tables for iommu which has less agaw than default.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
virtual machine domain is different from native DMA-API domain, implement separate allocation and free functions for virtual machine domain.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Because virtual machine domain may have multiple devices from different iommus, it cannot use __iommu_flush_cache.
In some common low level functions, use domain_flush_cache instead of __iommu_flush_cache. On the other hand, in some functions, iommu can is specified or domain cannot be got, still use __iommu_flush_cache
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add iommu reference count in domain, and add a lock to protect iommu setting including iommu_bmp, iommu_count and iommu_coherency.
virtual machine domain may have multiple devices from different iommus, so it needs to do more things when add/remove domain device info. Thus implement separate these functions for virtual machine domain.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add this flag for VT-d used in virtual machine, like KVM.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
In dmar_domain, more than one iommus may be included in iommu_bmp. Due to "Coherency" capability may be different across iommus, set this variable to indicate iommu access is coherent or not. Only when all related iommus in a dmar_domain are all coherent, iommu access of this domain is coherent.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
"SAGAW" capability may be different across iommus. Use a default agaw, but if default agaw is not supported in some iommus, choose a less supported agaw.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
In order to support assigning multiple devices from different iommus to a domain, iommu bitmap is used to keep all iommus the domain are related to.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
deferred_flush[] uses the iommu seq_id to index, so its iommu is fixed and can get it from g_iommus.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
It's random number after the domain is allocated by kmem_cache_alloc
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
On platforms with multiple PCI segments, any of the segments can have a DRHD
with INCLUDE_PCI_ALL flag. So need to check the DRHD's segment number against
the PCI device's when searching its DRHD.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Some macros were unused, so I just dropped them:
context_fault_disable
context_translation_type
context_address_root
context_address_width
context_domain_id
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We keep the struct root_entry forward declaration for the
pointer in struct intel_iommu.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
init_dmars() is not used outside of drivers/pci/intel-iommu.c
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The seg, saved_msg and sysdev fields appear to be unused since
before the code was first merged.
linux/msi.h is not needed in linux/intel-iommu.h anymore since
there is no longer a reference to struct msi_msg. The MSI code
in drivers/pci/intel-iommu.c still has linux/msi.h included
via linux/dmar.h.
linux/sysdev.h isn't needed because there is no reference to
struct sys_device.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
x86: export vector_used_by_percpu_irq
x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
sched: nominate preferred wakeup cpu, fix
x86: fix lguest used_vectors breakage, -v2
x86: fix warning in arch/x86/kernel/io_apic.c
sched: fix warning in kernel/sched.c
sched: move test_sd_parent() to an SMP section of sched.h
sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
sched: activate active load balancing in new idle cpus
sched: bias task wakeups to preferred semi-idle packages
sched: nominate preferred wakeup cpu
sched: favour lower logical cpu number for sched_mc balance
sched: framework for sched_mc/smt_power_savings=N
sched: convert BALANCE_FOR_xx_POWER to inline functions
x86: use possible_cpus=NUM to extend the possible cpus allowed
x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
x86: update io_apic.c to the new cpumask code
x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
x86: xen: use smp_call_function_many()
x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
...
Fixed up trivial conflict in kernel/time/tick-sched.c manually
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, sparseirq: clean up Kconfig entry
x86: turn CONFIG_SPARSE_IRQ off by default
sparseirq: fix numa_migrate_irq_desc dependency and comments
sparseirq: add kernel-doc notation for new member in irq_desc, -v2
locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
sparseirq, xen: make sure irq_desc is allocated for interrupts
sparseirq: fix !SMP building, #2
x86, sparseirq: move irq_desc according to smp_affinity, v7
proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
sparse irqs: add irqnr.h to the user headers list
sparse irqs: handle !GENIRQ platforms
sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
sparseirq: fix Alpha build failure
sparseirq: fix typo in !CONFIG_IO_APIC case
x86, MSI: pass irq_cfg and irq_desc
x86: MSI start irq numbering from nr_irqs_gsi
x86: use NR_IRQS_LEGACY
sparse irq_desc[] array: core kernel and x86 changes
genirq: record IRQ_LEVEL in irq_desc[]
irq.h: remove padding from irq_desc on 64bits
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits)
powerpc/44x: Support 16K/64K base page sizes on 44x
powerpc: Force memory size to be a multiple of PAGE_SIZE
powerpc/32: Wire up the trampoline code for kdump
powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
powerpc/32: Allow __ioremap on RAM addresses for kdump kernel
powerpc/32: Setup OF properties for kdump
powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
powerpc: Prepare xmon_save_regs for use with kdump
powerpc: Remove default kexec/crash_kernel ops assignments
powerpc: Make default kexec/crash_kernel ops implicit
powerpc: Setup OF properties for ppc32 kexec
powerpc/pseries: Fix cpu hotplug
powerpc: Fix KVM build on ppc440
powerpc/cell: add QPACE as a separate Cell platform
powerpc/cell: fix build breakage with CONFIG_SPUFS disabled
powerpc/mpc5200: fix error paths in PSC UART probe function
powerpc/mpc5200: add rts/cts handling in PSC UART driver
powerpc/mpc5200: Make PSC UART driver update serial errors counters
powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver
powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver
...
Fix trivial conflict in drivers/char/Makefile as per Paul's directions
I happened to notice that the ibmphp hotplug driver does something
rather silly in its init routine. It purposely calls module_put so as
to underflow its module ref count to avoid being removed from the
kernel. This is bad practice, and wrong, since it provides a window for
subsequent module_gets to reset the refcount to zero, allowing an unload
to race in and cause all sorts of mysterious panics. If the module is
unsafe to load, simply omitting the module_exit parameter is sufficient
to prevent the kernel from allowing the unload.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: build fix
make intr_remapping.c to include smp.h, so could use boot_cpu_id there
also remove old change that disabling sparseirq with !SMP
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Certain HP machines require the full 64 bits of _SUN as allowed
by the ACPI spec. Without this change, we get name collisions in
the lower 32 bits of the _SUN returned by firmware.
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Justin Chen <justin.chen@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch fixes the problem that causes an occupied slot to be turned
off even if it has a working device.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
What we have to check here before calling is err_handler->resume, not
->slot_reset. Looks like a copy & paste error from report_slot_reset.
Acked-by: Yanmin Zhang <yanmin.zhang@intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: change calling convention of existing cpumask APIs
Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.
These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
We merge the irq/sparseirq, x86/quirks and x86/reboot trees into the
cpus4096 tree because the io-apic changes in the sparseirq change
conflict with the cpumask changes in the cpumask tree, and we
want to resolve those.
Makes a Compaq 6735s boot reliably again. It used to hang in the loop
on some boots. Give the link one second to train, otherwise break out
of the loop and reset the previously set clock bits.
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In pci_create_slot(), the local variable 'slot_name' is allocated by
make_slot_name(), but never freed. We never use it after passing it to
the kobject core, so we should free it upon function exit.
Cc: stable@kernel.org
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: simplify code
Pass irq_desc and cfg around, instead of raw IRQ numbers - this way
we dont have to look it up again and again.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new feature
Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
NR_CPUS set to large values. The goal is to be able to scale up to much
larger NR_IRQS value without impacting the (important) common case.
To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
irq_desc pointers.
When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc,
this also makes the IRQ descriptors NUMA-local (to the site that calls
request_irq()).
This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now
uses desc->chip_data for x86 to store irq_cfg.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
for fsck sake, it's used only when parsing kernel command line...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>