linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-28 05:41:55 +00:00

Author	SHA1	Message	Date
Joerg Roedel	ec04b2604c	KVM: Prepare memslot data structures for multiple hugepage sizes [avi: fix build on non-x86] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:02 +03:00
Andre Przywara	4668f05078	KVM: x86 emulator: Add sysexit emulation Handle #UD intercept of the sysexit instruction in 64bit mode returning to 32bit compat mode on an AMD host. Setup the segment descriptors for CS and SS and the EIP/ESP registers according to the manual. Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:01 +03:00
Andre Przywara	8c60435261	KVM: x86 emulator: Add sysenter emulation Handle #UD intercept of the sysenter instruction in 32bit compat mode on an AMD host. Setup the segment descriptors for CS and SS and the EIP/ESP registers according to the manual. Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:01 +03:00
Andre Przywara	e66bb2ccdc	KVM: x86 emulator: add syscall emulation Handle #UD intercept of the syscall instruction in 32bit compat mode on an Intel host. Setup the segment descriptors for CS and SS and the EIP/ESP registers according to the manual. Save the RIP and EFLAGS to the correct registers. [avi: fix build on i386 due to missing R11] Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:00 +03:00
Andre Przywara	e99f050712	KVM: x86 emulator: Prepare for emulation of syscall instructions Add the flags needed for syscall, sysenter and sysexit to the opcode table. Catch (but for now ignore) the opcodes in the emulation switch/case. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:00 +03:00
Andre Przywara	b1d861431e	KVM: x86 emulator: Add missing EFLAGS bit definitions Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:00 +03:00
Andre Przywara	0cb5762ed2	KVM: Allow emulation of syscalls instructions on #UD Add the opcodes for syscall, sysenter and sysexit to the list of instructions handled by the undefined opcode handler. Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:59 +03:00
Marcelo Tosatti	229456fc34	KVM: convert custom marker based tracing to event traces This allows use of the powerful ftrace infrastructure. See Documentation/trace/ for usage information. [avi, stephen: various build fixes] [sheng: fix control register breakage] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:59 +03:00
Alexander Graf	219b65dcf6	KVM: SVM: Improve nested interrupt injection While trying to get Hyper-V running, I realized that the interrupt injection mechanisms that are in place right now are not 100% correct. This patch makes nested SVM's interrupt injection behave more like on a real machine. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:59 +03:00
Alexander Graf	ff092385e8	KVM: SVM: Implement INVLPGA SVM adds another way to do INVLPG by ASID which Hyper-V makes use of, so let's implement it! For now we just do the same thing invlpg does, as asid switching means we flush the mmu anyways. That might change one day though. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:58 +03:00
Alexander Graf	3c5d0a44b0	KVM: Implement MSRs used by Hyper-V Hyper-V uses some MSRs, some of which are actually reserved for BIOS usage. But let's be nice today and have it its way, because otherwise it fails terribly. [jaswinder: fix build for linux-next changes] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:58 +03:00
Alexander Graf	0367b4330e	x86: Add definition for IGNNE MSR Hyper-V accesses MSR_IGNNE while running under KVM. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:58 +03:00
Avi Kivity	b3dbf89e67	KVM: SVM: Don't save/restore host cr2 The host never reads cr2 in process context, so are free to clobber it. The vmx code does this, so we can safely remove the save/restore code. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:58 +03:00
Avi Kivity	d3edefc003	KVM: VMX: Only reload guest cr2 if different from host cr2 cr2 changes only rarely, and writing it is expensive. Avoid the costly cr2 writes by checking if it does not already hold the desired value. Shaves 70 cycles off the vmexit latency. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:57 +03:00
Jan Kiszka	681405bfc7	KVM: Drop useless atomic test from timer function The current code tries to optimize the setting of KVM_REQ_PENDING_TIMER but used atomic_inc_and_test - which always returns true unless pending had the invalid value of -1 on entry. This patch drops the test part preserving the original semantic but expressing it less confusingly. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:57 +03:00
Jan Kiszka	f7104db26a	KVM: Fix racy event propagation in timer Minor issue that likely had no practical relevance: the kvm timer function so far incremented the pending counter and then may reset it again to 1 in case reinjection was disabled. This opened a small racy window with the corresponding VCPU loop that may have happened to run on another (real) CPU and already consumed the value. Fix it by skipping the incrementation in case pending is already > 0. This opens a different race windows, but may only rarely cause lost events in case we do not care about them anyway (!reinject). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:57 +03:00
Gleb Natapov	33e4c68656	KVM: Optimize searching for highest IRR Most of the time IRR is empty, so instead of scanning the whole IRR on each VM entry keep a variable that tells us if IRR is not empty. IRR will have to be scanned twice on each IRQ delivery, but this is much more rare than VM entry. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:57 +03:00
Gleb Natapov	6edf14d8d0	KVM: Replace pending exception by PF if it happens serially Replace previous exception with a new one in a hope that instruction re-execution will regenerate lost exception. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:56 +03:00
Marcelo Tosatti	54dee9933e	KVM: VMX: conditionally disable 2M pages Disable usage of 2M pages if VMX_EPT_2MB_PAGE_BIT (bit 16) is clear in MSR_IA32_VMX_EPT_VPID_CAP and EPT is enabled. [avi: s/largepages_disabled/largepages_enabled/ to avoid negative logic] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:56 +03:00
Marcelo Tosatti	68f89400bc	KVM: VMX: EPT misconfiguration handler Handler for EPT misconfiguration which checks for valid state in the shadow pagetables, printing the spte on each level. The separate WARN_ONs are useful for kerneloops.org. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:56 +03:00
Marcelo Tosatti	94d8b056a2	KVM: MMU: add kvm_mmu_get_spte_hierarchy helper Required by EPT misconfiguration handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:56 +03:00
Marcelo Tosatti	4d88954d62	KVM: MMU: make for_each_shadow_entry aware of largepages This way there is no need to add explicit checks in every for_each_shadow_entry user. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:55 +03:00
Marcelo Tosatti	e799794e02	KVM: VMX: more MSR_IA32_VMX_EPT_VPID_CAP capability bits Required for EPT misconfiguration handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:55 +03:00
Andre Przywara	71db602322	KVM: Move performance counter MSR access interception to generic x86 path The performance counter MSRs are different for AMD and Intel CPUs and they are chosen mainly by the CPUID vendor string. This patch catches writes to all addresses (regardless of VMX/SVM path) and handles them in the generic MSR handler routine. Writing a 0 into the event select register is something we perfectly emulate ;-), so don't print out a warning to dmesg in this case. This fixes booting a 64bit Windows guest with an AMD CPUID on an Intel host. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:54 +03:00
Marcelo Tosatti	2920d72857	KVM: MMU audit: largepage handling Make the audit code aware of largepages. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:54 +03:00
Marcelo Tosatti	2aaf65e8c4	KVM: MMU audit: audit_mappings tweaks - Fail early in case gfn_to_pfn returns is_error_pfn. - For the pre pte write case, avoid spurious "gva is valid but spte is notrap" messages (the emulation code does the guest write first, so this particular case is OK). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:54 +03:00
Marcelo Tosatti	48fc03174b	KVM: MMU audit: nontrapping ptes in nonleaf level It is valid to set non leaf sptes as notrap. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:54 +03:00
Marcelo Tosatti	e58b0f9e0e	KVM: MMU audit: update audit_write_protection - Unsync pages contain writable sptes in the rmap. - rmaps do not exclusively contain writable sptes anymore. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:53 +03:00
Marcelo Tosatti	08a3732bf2	KVM: MMU audit: update count_writable_mappings / count_rmaps Under testing, count_writable_mappings returns a value that is 2 integers larger than what count_rmaps returns. Suspicion is that either of the two functions is counting a duplicate (either positively or negatively). Modifying check_writable_mappings_rmap to check for rmap existance on all present MMU pages fails to trigger an error, which should keep Avi happy. Also introduce mmu_spte_walk to invoke a callback on all present sptes visible to the current vcpu, might be useful in the future. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:53 +03:00
Marcelo Tosatti	776e663336	KVM: MMU: introduce is_last_spte helper Hiding some of the last largepage / level interaction (which is useful for gbpages and for zero based levels). Also merge the PT_PAGE_TABLE_LEVEL clearing loop in unlink_children. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:53 +03:00
Avi Kivity	3f5d18a965	KVM: Return to userspace on emulation failure Instead of mindlessly retrying to execute the instruction, report the failure to userspace. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:52 +03:00
Gleb Natapov	988a2cae6a	KVM: Use macro to iterate over vcpus. [christian: remove unused variables on s390] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:52 +03:00
Gleb Natapov	73880c80aa	KVM: Break dependency between vcpu index in vcpus array and vcpu_id. Archs are free to use vcpu_id as they see fit. For x86 it is used as vcpu's apic id. New ioctl is added to configure boot vcpu id that was assumed to be 0 till now. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:52 +03:00
Gleb Natapov	1ed0ce000a	KVM: Use pointer to vcpu instead of vcpu_id in timer code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:52 +03:00
Gleb Natapov	c5af89b68a	KVM: Introduce kvm_vcpu_is_bsp() function. Use it instead of open code "vcpu_id zero is BSP" assumption. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:51 +03:00
Avi Kivity	d555c333aa	KVM: MMU: s/shadow_pte/spte/ We use shadow_pte and spte inconsistently, switch to the shorter spelling. Rename set_shadow_pte() to __set_spte() to avoid a conflict with the existing set_spte(), and to indicate its lowlevelness. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:51 +03:00
Avi Kivity	43a3795a3a	KVM: MMU: Adjust pte accessors to explicitly indicate guest or shadow pte Since the guest and host ptes can have wildly different format, adjust the pte accessor names to indicate on which type of pte they operate on. No functional changes. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:51 +03:00
Avi Kivity	439e218a6f	KVM: MMU: Fix is_dirty_pte() is_dirty_pte() is used on guest ptes, not shadow ptes, so it needs to avoid shadow_dirty_mask and use PT_DIRTY_MASK instead. Misdetecting dirty pages could lead to unnecessarily setting the dirty bit under EPT. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:50 +03:00
Avi Kivity	7ffd92c53c	KVM: VMX: Move rmode structure to vmx-specific code rmode is only used in vmx, so move it to vmx.c Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:50 +03:00
Nitin A Kamble	3a624e29c7	KVM: VMX: Support Unrestricted Guest feature "Unrestricted Guest" feature is added in the VMX specification. Intel Westmere and onwards processors will support this feature. It allows kvm guests to run real mode and unpaged mode code natively in the VMX mode when EPT is turned on. With the unrestricted guest there is no need to emulate the guest real mode code in the vm86 container or in the emulator. Also the guest big real mode code works like native. The attached patch enhances KVM to use the unrestricted guest feature if available on the processor. It also adds a new kernel/module parameter to disable the unrestricted guest feature at the boot time. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:49 +03:00
Marcelo Tosatti	fa40a8214b	KVM: switch irq injection/acking data structures to irq_lock Protect irq injection/acking data structures with a separate irq_lock mutex. This fixes the following deadlock: CPU A CPU B kvm_vm_ioctl_deassign_dev_irq() mutex_lock(&kvm->lock); worker_thread() -> kvm_deassign_irq() -> kvm_assigned_dev_interrupt_work_handler() -> deassign_host_irq() mutex_lock(&kvm->lock); -> cancel_work_sync() [blocked] [gleb: fix ia64 path] Reported-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:49 +03:00
Marcelo Tosatti	9f4cc12765	KVM: Grab pic lock in kvm_pic_clear_isr_ack isr_ack is protected by kvm_pic->lock. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:48 +03:00
Jan Kiszka	238adc7705	KVM: Cleanup LAPIC interface None of the interface services the LAPIC emulation provides need to be exported to modules, and kvm_lapic_get_base is even totally unused today. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:48 +03:00
Liu Yu	06579dd9c1	KVM: ppc: e500: Add MMUCFG and PVR emulation Latest kernel started to use these two registers. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:47 +03:00
Liu Yu	5b7c1a2c17	KVM: ppc: e500: Directly pass pvr to guest Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:47 +03:00
Liu Yu	0cfb50e50d	KVM: ppc: e500: Move to Book-3e MMU definitions According to commit `70fe3af840`. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:47 +03:00
Avi Kivity	596ae89565	KVM: VMX: Fix reporting of unhandled EPT violations Instead of returning -ENOTSUPP, exit normally but indicate the hardware exit reason. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:46 +03:00
Avi Kivity	6de4f3ada4	KVM: Cache pdptrs Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx, guest memory access on svm) extract them on demand. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:46 +03:00
Avi Kivity	8f5d549f02	KVM: VMX: Simplify pdptr and cr3 management Instead of reading the PDPTRs from memory after every exit (which is slow and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from memory to the VMCS before entry, and from the VMCS to memory after exit. Do the same for cr3. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:46 +03:00
Avi Kivity	2d84e993a8	KVM: VMX: Avoid duplicate ept tlb flush when setting cr3 vmx_set_cr3() will call vmx_tlb_flush(), which will flush the ept context. So there is no need to call ept_sync_context() explicitly. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:46 +03:00
Gregory Haskins	6b66ac1ae3	KVM: do not register i8254 PIO regions until we are initialized We currently publish the i8254 resources to the pio_bus before the devices are fully initialized. Since we hold the pit_lock, its probably not a real issue. But lets clean this up anyway. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:45 +03:00
Gregory Haskins	d76685c4a0	KVM: cleanup io_device code We modernize the io_device code so that we use container_of() instead of dev->private, and move the vtable to a separate ops structure (theoretically allows better caching for multiple instances of the same ops structure) Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:45 +03:00
Stephen Rothwell	2986b8c72c	KVM: powerpc: fix some init/exit annotations Fixes a couple of warnings like this one: WARNING: arch/powerpc/kvm/kvm-440.o(.text+0x1e8c): Section mismatch in reference from the function kvmppc_44x_exit() to the function .exit.text:kvmppc_booke_exit() The function kvmppc_44x_exit() references a function in an exit section. Often the function kvmppc_booke_exit() has valid usage outside the exit section and the fix is to remove the __exit annotation of kvmppc_booke_exit. Also add some __init annotations on obvious routines. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:44 +03:00
Avi Kivity	6c8166a77c	KVM: SVM: Fold kvm_svm.h info svm.c kvm_svm.h is only included from svm.c, so fold it in. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:44 +03:00
Andre Przywara	017cb99e87	KVM: SVM: use explicit 64bit storage for sysenter values Since AMD does not support sysenter in 64bit mode, the VMCB fields storing the MSRs are truncated to 32bit upon VMRUN/#VMEXIT. So store the values in a separate 64bit storage to avoid truncation. [andre: fix amd->amd migration] Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:43 +03:00
Christian Ehrhardt	628eb9b8a8	KVM: s390: streamline memslot handling This patch relocates the variables kvm-s390 uses to track guest mem addr/size. As discussed dropping the variables at struct kvm_arch level allows to use the common vcpu->request based mechanism to reload guest memory if e.g. changes via set_memory_region. The kick mechanism introduced in this series is used to ensure running vcpus leave guest state to catch the update. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:42 +03:00
Christian Ehrhardt	b1d16c495d	KVM: s390: fix signal handling If signal pending is true we exit without updating kvm_run, userspace currently just does nothing and jumps to kvm_run again. Since we did not set an exit_reason we might end up with a random one (whatever was the last exit). Therefore it was possible to e.g. jump to the psw position the last real interruption set. Setting the INTR exit reason ensures that no old psw data is swapped in on reentry. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:42 +03:00
Christian Ehrhardt	9ace903d17	KVM: s390: infrastructure to kick vcpus out of guest state To ensure vcpu's come out of guest context in certain cases this patch adds a s390 specific way to kick them out of guest context. Currently it kicks them out to rerun the vcpu_run path in the s390 code, but the mechanism itself is expandable and with a new flag we could also add e.g. kicks to userspace etc. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:42 +03:00
Jes Sorensen	3032b925f0	KVM: ia64: Correct itc_offset calculations Init the itc_offset for all possible vCPUs. The current code by mistake ends up only initializing the offset on vCPU 0. Spotted by Gleb Natapov. Signed-off-by: Jes Sorensen <jes@sgi.com> Acked-by : Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:42 +03:00
Jan Kiszka	c5ff41ce66	KVM: Allow PIT emulation without speaker port The in-kernel speaker emulation is only a dummy and also unneeded from the performance point of view. Rather, it takes user space support to generate sound output on the host, e.g. console beeps. To allow this, introduce KVM_CREATE_PIT2 which controls in-kernel speaker port emulation via a flag passed along the new IOCTL. It also leaves room for future extensions of the PIT configuration interface. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:41 +03:00
Gregory Haskins	721eecbf4f	KVM: irqfd KVM provides a complete virtual system environment for guests, including support for injecting interrupts modeled after the real exception/interrupt facilities present on the native platform (such as the IDT on x86). Virtual interrupts can come from a variety of sources (emulated devices, pass-through devices, etc) but all must be injected to the guest via the KVM infrastructure. This patch adds a new mechanism to inject a specific interrupt to a guest using a decoupled eventfd mechnanism: Any legal signal on the irqfd (using eventfd semantics from either userspace or kernel) will translate into an injected interrupt in the guest at the next available interrupt window. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:41 +03:00
Avi Kivity	0ba12d1081	KVM: Move common KVM Kconfig items to new file virt/kvm/Kconfig Reduce Kconfig code duplication. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:41 +03:00
Gleb Natapov	787ff73637	KVM: Drop interrupt shadow when single stepping should be done only on VMX The problem exists only on VMX. Also currently we skip this step if there is pending exception. The patch fixes this too. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:41 +03:00
Christoph Hellwig	284e9b0f5a	KVM: cleanup arch/x86/kvm/Makefile Use proper foo-y style list additions to cleanup all the conditionals, move module selection after compound object selection and remove the superflous comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:40 +03:00
Avi Kivity	ee3d29e8be	KVM: x86 emulator: fix jmp far decoding (opcode 0xea) The jump target should not be sign extened; use an unsigned decode flag. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:40 +03:00
Avi Kivity	c9eaf20f26	KVM: x86 emulator: Implement zero-extended immediate decoding Absolute jumps use zero extended immediate operands. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:39 +03:00
Mark McLoughlin	cb007648de	KVM: fix cpuid E2BIG handling for extended request types If we run out of cpuid entries for extended request types we should return -E2BIG, just like we do for the standard request types. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:39 +03:00
Jaswinder Singh Rajput	60af2ecdc5	KVM: Use MSR names in place of address Replace 0xc0010010 with MSR_K8_SYSCFG and 0xc0010015 with MSR_K7_HWCR. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:39 +03:00
Huang Ying	890ca9aefa	KVM: Add MCE support The related MSRs are emulated. MCE capability is exported via extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation such as the mcg_cap. MCE is injected via vcpu ioctl command KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are not implemented. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:39 +03:00
Jaswinder Singh Rajput	af24a4e4ae	KVM: Replace MSR_IA32_TIME_STAMP_COUNTER with MSR_IA32_TSC of msr-index.h Use standard msr-index.h's MSR declaration. MSR_IA32_TSC is better than MSR_IA32_TIME_STAMP_COUNTER as it also solves 80 column issue. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:38 +03:00
Gleb Natapov	ae0bb3e011	KVM: VMX: Properly handle software interrupt re-injection in real mode When reinjecting a software interrupt or exception, use the correct instruction length provided by the hardware instead of a hardcoded 1. Fixes problems running the suse 9.1 livecd boot loader. Problem introduced by commit f0a3602c20 ("KVM: Move interrupt injection logic to x86.c"). Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:38 +03:00
Yang Xiaowei	2496afbf1e	xen: use stronger barrier after unlocking lock We need to have a stronger barrier between releasing the lock and checking for any waiting spinners. A compiler barrier is not sufficient because the CPU's ordering rules do not prevent the read xl->spinners from happening before the unlock assignment, as they are different memory locations. We need to have an explicit barrier to enforce the write-read ordering to different memory locations. Because of it, I can't bring up > 4 HVM guests on one SMP machine. [ Code and commit comments expanded -J ] [ Impact: avoid deadlock when using Xen PV spinlocks ] Signed-off-by: Yang Xiaowei <xiaowei.yang@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-09-09 16:38:44 -07:00
Jeremy Fitzhardinge	4d576b57b5	xen: only enable interrupts while actually blocking for spinlock Where possible we enable interrupts while waiting for a spinlock to become free, in order to reduce big latency spikes in interrupt handling. However, at present if we manage to pick up the spinlock just before blocking, we'll end up holding the lock with interrupts enabled for a while. This will cause a deadlock if we recieve an interrupt in that window, and the interrupt handler tries to take the lock too. Solve this by shrinking the interrupt-enabled region to just around the blocking call. [ Impact: avoid race/deadlock when using Xen PV spinlocks ] Reported-by: "Yang, Xiaowei" <xiaowei.yang@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-09-09 16:38:11 -07:00
Jeremy Fitzhardinge	577eebeae3	xen: make -fstack-protector work under Xen -fstack-protector uses a special per-cpu "stack canary" value. gcc generates special code in each function to test the canary to make sure that the function's stack hasn't been overrun. On x86-64, this is simply an offset of %gs, which is the usual per-cpu base segment register, so setting it up simply requires loading %gs's base as normal. On i386, the stack protector segment is %gs (rather than the usual kernel percpu %fs segment register). This requires setting up the full kernel GDT and then loading %gs accordingly. We also need to make sure %gs is initialized when bringing up secondary cpus too. To keep things consistent, we do the full GDT/segment register setup on both architectures. Because we need to avoid -fstack-protected code before setting up the GDT and because there's no way to disable it on a per-function basis, several files need to have stack-protector inhibited. [ Impact: allow Xen booting with stack-protector enabled ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-09-09 16:37:39 -07:00
Rafael J. Wysocki	bf992fa2bc	Merge branch 'master' into for-linus	2009-09-10 00:02:02 +02:00
Jack Steiner	fa526d0d64	x86, pat: Fix cacheflush address in change_page_attr_set_clr() Fix address passed to cpa_flush_range() when changing page attributes from WB to UC. The address (*addr) is modified by __change_page_attr_set_clr(). The result is that the pages being flushed start at the _end_ of the changed range instead of the beginning. This should be considered for 2.6.30-stable and 2.6.31-stable. Signed-off-by: Jack Steiner <steiner@sgi.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Stable team <stable@kernel.org>	2009-09-09 14:05:24 -07:00
David Howells	733e5e4b4e	KEYS: Add missing linux/tracehook.h #inclusions Add #inclusions of linux/tracehook.h to those arch files that had the tracehook call for TIF_NOTIFY_RESUME added when support for that flag was added to that arch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-09-09 18:30:02 +10:00
David S. Miller	d89be56b21	sparc64: Make touch_nmi_watchdog() actually work. It guards it's actions on nmi_watchdog_active, but nothing ever sets that and it's initial value is zero. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-08 23:29:16 -07:00
David S. Miller	4e85f5915d	sparc64: Kill unnecessary cast in profile_timer_exceptions_notify(). Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-08 23:20:39 -07:00
David S. Miller	a8f2226455	sparc64: Manage NMI watchdog enabling like x86. Use a per-cpu 'wd_enabled' boolean and a global atomic_t count of watchdog NMI enabled cpus which is set to '-1' if something is wrong with the watchdog and it can't be used. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-08 23:16:06 -07:00
Tejun Heo	3e5cd1f257	dmi: extend dmi_get_year() to dmi_get_date() There are cases where full date information is required instead of just the year. Add month and day parsing to dmi_get_year() and rename it to dmi_get_date(). As the original function only required '/' followed by any number of parseable characters at the end of the string, keep that behavior to avoid upsetting existing users. The new function takes dates of format [mm[/dd]]/yy[yy]. Year, month and date are checked to be in the ranges of [1-9999], [1-12] and [1-31] respectively and any invalid or out-of-range component is returned as zero. The dummy implementation is updated accordingly but the return value is updated to indicate field not found which is consistent with how other dummy functions behave. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-09-08 21:17:48 -04:00
Russell King	4abf27add8	Merge branch 'for-rmk' of git://git.marvell.com/orion into devel-stable	2009-09-08 21:21:15 +01:00
Simon Guinot	5478267408	[ARM] orion5x: Add LaCie NAS 2Big Network support This patch add support for the 2Big Network LaCie boards. Signed-off-by: Simon Guinot <sguinot@lacie.com> Signed-off-by: Nicolas Pitre <nico@marvell.com>	2009-09-08 14:10:35 -04:00
Russell King	5eb38f4483	Merge branch 'fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6	2009-09-08 11:23:24 +01:00
Peter Zijlstra	a8fae3ec5f	sched: enable SD_WAKE_IDLE Now that SD_WAKE_IDLE doesn't make pipe-test suck anymore, enable it by default for MC, CPU and NUMA domains. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-07 22:00:17 +02:00
Pavel Machek	99f329a2ba	[ARM] pxa/sharpsl_pm: zaurus c3000 aka spitz: fix resume sharpsl_pm.c code tries to read battery state very early during resume, but those battery meters are connected on SPI and that's only resumed way later. Replace the check with simple checking of battery fatal signal, that actually works at this stage. Signed-off-by: Pavel Machek <pavel@ucw.cz> Tested-by: Stanislav Brabec <utx@penguin.cz> Signed-off-by: Eric Miao <eric.y.miao@gmail.com>	2009-09-07 23:14:59 +08:00
Krzysztof Halasa	5dbc46506a	IXP42x HSS support for setting internal clock rate HSS usually uses external clocks, so it's not a big deal. Internal clock is used for direct DTE-DTE connections and when the DCE doesn't provide it's own clock. This also depends on the oscillator frequency. Intel seems to have calculated the clock register settings for 33.33 MHz (66.66 MHz timer base). Their settings seem quite suboptimal both in terms of average frequency (60 ppm is unacceptable for G.703 applications, their primary intended usage(?)) and jitter. Many (most?) platforms use a 33.333 MHz oscillator, a 10 ppm difference from Intel's base. Instead of creating static tables, I've created a procedure to program the HSS clock register. The register consists of 3 parts (A, B, C). The average frequency (= bit rate) is: 66.66x MHz / (A + (B + 1) / (C + 1)) The procedure aims at the closest average frequency, possibly at the cost of increased jitter. Nobody would be able to directly drive an unbufferred transmitter with a HSS anyway, and the frequency error is what it really counts. I've verified the above with an oscilloscope on IXP425. It seems IXP46x and possibly IXP43x use a bit different clock generation algorithm - it looks like the avg frequency is: (on IXP465) 66.66x MHz / (A + B / (C + 1)). Also they use much greater precomputed A and B - on IXP425 it would simply result in more jitter, but I don't know how does it work on IXP46x (perhaps 3 least significant bits aren't used?). Anyway it looks that they were aiming for exactly +60 ppm or -60 ppm, while <1 ppm is typically possible (with a synchronized clock, of course). The attached patch makes it possible to set almost any bit rate (my IXP425 533 MHz quits at > 22 Mb/s if a single port is used, and the minimum is ca. 65 Kb/s). This is independent of MVIP (multi-E1/T1 on one HSS) mode. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-07 01:56:49 -07:00
Nicolas Ferre	8c3cbd5a2b	ARM: 5686/1: at91: Correct AC97 reset line in at91sam9263ek board Board code was wrongly setting up the reset pin for AC97 on at91sam9263ek. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Andrew Victor <linux@maxim.org.za> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-06 20:57:27 +01:00
sedji gaouaou	d656f07a74	ARM: 5640/1: This patch modifies the support of AC97 on the at91sam9263 ek board This patch modifies the support of AC97 on the at91sam9263 ek board, so it would share the code with AVR32. Plus it removes a typo in at91sam9263_devices.c. Signed-off-by: Sedji Gaouaou <sedji.gaouaou@atmel.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-06 20:57:26 +01:00
Kristoffer Ericson	56ddf7e6d4	ARM: 5689/1: Update default config of HP Jornada 700-series machines This patch updates the default config for HP Jornada 700-series handhelds. Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-06 20:54:41 +01:00
Tobias Klauser	d535e4319a	x86: Make memtype_seq_ops const Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-06 06:33:33 +02:00
Rafael J. Wysocki	23b6c52cf5	x86: Decrease the level of some NUMA messages to KERN_DEBUG Some NUMA messages in srat_32.c are confusing to users, because they seem to indicate errors, while in fact they reflect normal behaviour. Decrease the level of these messages to KERN_DEBUG so that they don't show up unnecessarily. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> LKML-Reference: <200909050107.45175.rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-06 06:32:23 +02:00
Ingo Molnar	ed011b22ce	Merge commit 'v2.6.31-rc9' into tracing/core Merge reason: move from -rc5 to -rc9. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-06 06:11:42 +02:00
Roderick Colenbrander	74a01180db	powerpc: Fix i8259 interrupt driver kernel crash on ML510 This patch fixes a null pointer exception caused by removal of 'ack()' for level interrupts in the Xilinx interrupt driver. A recent change to the xilinx interrupt controller removed the ack hook for level irqs. Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-05 14:58:07 -07:00
Linus Torvalds	535e0c1726	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] fix csum_ipv6_magic() [IA64] Fix warning in dma-mapping.c	2009-09-05 14:50:53 -07:00
Linus Torvalds	d3acd16cda	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix bootup with mcount in some configs. sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.	2009-09-05 13:49:06 -07:00
Linus Torvalds	93697a3cab	Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter/powerpc: Fix cache event codes for POWER7 perf_counter: Fix /0 bug in swcounters perf_counters: Increase paranoia level	2009-09-05 13:48:37 -07:00
Christof Schmitt	b592e89ac9	[SCSI] zfcp: Remove duplicated code for debug timestamps The timestamp calculation used for s390dbf output is the same in a private zfcp function and in debug.c. Replace both with a common inline function. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-09-05 08:49:48 -05:00
Jan Glauber	81bd5f6c96	crypto: sha-s390 - Fix warnings in import function That patch should fix the warnings. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-09-05 16:27:35 +10:00
Nicolas Pitre	7929eb9cf6	ARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem Let's suppose a highmem page is kmap'd with kmap(). A pkmap entry is used, the page mapped to it, and the virtual cache is dirtied. Then kunmap() is used which does virtually nothing except for decrementing a usage count. Then, let's suppose the _same_ page gets mapped using kmap_atomic(). It is therefore mapped onto a fixmap entry instead, which has a different virtual address unaware of the dirty cache data for that page sitting in the pkmap mapping. Fortunately it is easy to know if a pkmap mapping still exists for that page and use it directly with kmap_atomic(), thanks to kmap_high_get(). And actual testing with a printk in the added code path shows that this condition is actually met extremely frequently. Seems that we've been quite lucky that things have worked so well with highmem so far. Cc: stable@kernel.org Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-04 19:20:07 +01:00
H. Peter Anvin	b19ae39998	x86, msr: change msr-reg.o to obj-y, and export its symbols Change msr-reg.o to obj-y (it will be included in virtually every kernel since it is used by the initialization code for AMD processors) and add a separate C file to export its symbols to modules, so that msr.ko can use them; on uniprocessors we bypass the helper functions in msr.o and use the accessor functions directly via inlines. Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20090904140834.GA15789@elte.hu> Cc: Borislav Petkov <petkovbb@googlemail.com>	2009-09-04 10:00:09 -07:00
Pekka Enberg	8e019366ba	kmemleak: Don't scan uninitialized memory when kmemcheck is enabled Ingo Molnar reported the following kmemcheck warning when running both kmemleak and kmemcheck enabled: PM: Adding info for No Bus:vcsa7 WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f6f6e1a4) d873f9f600000000c42ae4c1005c87f70000000070665f666978656400000000 i i i i u u u u i i i i i i i i i i i i i i i i i i i i i u u u ^ Pid: 3091, comm: kmemleak Not tainted (2.6.31-rc7-tip #1303) P4DC6 EIP: 0060:[<c110301f>] EFLAGS: 00010006 CPU: 0 EIP is at scan_block+0x3f/0xe0 EAX: f40bd700 EBX: f40bd780 ECX: f16b46c0 EDX: 00000001 ESI: f6f6e1a4 EDI: 00000000 EBP: f10f3f4c ESP: c2605fcc DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: e89a4844 CR3: 30ff1000 CR4: 000006f0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff4ff0 DR7: 00000400 [<c110313c>] scan_object+0x7c/0xf0 [<c1103389>] kmemleak_scan+0x1d9/0x400 [<c1103a3c>] kmemleak_scan_thread+0x4c/0xb0 [<c10819d4>] kthread+0x74/0x80 [<c10257db>] kernel_thread_helper+0x7/0x3c [<ffffffff>] 0xffffffff kmemleak: 515 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 42 new suspected memory leaks (see /sys/kernel/debug/kmemleak) The problem here is that kmemleak will scan partially initialized objects that makes kmemcheck complain. Fix that up by skipping uninitialized memory regions when kmemcheck is enabled. Reported-by: Ingo Molnar <mingo@elte.hu> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2009-09-04 16:05:55 +01:00
Ingo Molnar	695a461296	Merge branch 'amd-iommu/2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu	2009-09-04 14:44:16 +02:00
David S. Miller	bd4352cadf	sparc64: Fix bootup with mcount in some configs. Functions invoked early when booting up a cpu can't use tracing because mcount requires a valid 'current_thread_info()' and TLB mappings to be setup. The code path of sun4v_register_mondo_queues --> register_one_mondo is one such case. sun4v_register_mondo_queues already has the necessary 'notrace' annotation, but register_one_mondo does not. Normally register_one_mondo is inlined so the bug doesn't trigger, but with some config/compiler combinations, it won't be so we must properly mark it notrace. While we're here, add 'notrace' annoations to prom_printf and prom_halt so that early error handling won't have the same problem. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Reported-by: Leif Sawyer <lsawyer@gci.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-04 03:39:45 -07:00
Jens Axboe	825c9fb47a	sparc: add basic support for 'perf' This wires up the perf_counter_open() syscall so that basic software support for perf is working. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-04 02:56:22 -07:00
Ingo Molnar	840a065310	sched: Turn on SD_BALANCE_NEWIDLE Start the re-tuning of the balancer by turning on newidle. It improves hackbench performance and parallelism on a 4x4 box. The "perf stat --repeat 10" measurements give us: domain0 domain1 ....................................... -SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2041.273208 task-clock-msecs # 9.354 CPUs ( +- 0.363% ) +SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2086.326925 task-clock-msecs # 11.934 CPUs ( +- 0.301% ) +SD_BALANCE_NEWIDLE +SD_BALANCE_NEWIDLE: 2115.289791 task-clock-msecs # 12.158 CPUs ( +- 0.263% ) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 11:52:54 +02:00
Ingo Molnar	47734f89be	sched: Clean up topology.h Re-organize the flag settings so that it's visible at a glance which sched-domains flags are set and which not. With the new balancer code we'll need to re-tune these details anyway, so make it cleaner to make fewer mistakes down the road ;-) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 11:52:53 +02:00
David S. Miller	a29889a536	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	2009-09-04 02:22:21 -07:00
Yinghai Lu	0d96b9ff74	x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus Otherwise, system with apci id lifting will have wrong apicid in /proc/cpuinfo. and use that in srat_detect_node(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <4A998CCA.1040407@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 09:55:29 +02:00
Ingo Molnar	29e2035bdd	Merge branch 'linus' into core/rcu Merge reason: Avoid fuzz in init/main.c and update from rc6 to rc8. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 09:29:05 +02:00
markus.t.metzger@intel.com	1653192f51	x86, perf_counter, bts: Do not allow kernel BTS tracing for now Kernel BTS tracing generates too much data too fast for us to handle, causing the kernel to hang. Fail for BTS requests for kernel code. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zjilstra@chello.nl> LKML-Reference: <20090902140616.901253000@intel.com> [ This is really a workaround - but we want BTS tracing in .32 so make sure we dont regress. The lockup should be fixed ASAP. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 09:26:40 +02:00
markus.t.metzger@intel.com	596da17f94	x86, perf_counter, bts: Correct pointer-to-u64 casts On 32bit, pointers in the DS AREA configuration are cast to u64. The current (long) cast to avoid compiler warnings results in a signed 64bit address. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090902140615.305889000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 09:26:39 +02:00
markus.t.metzger@intel.com	747b50aaf7	x86, perf_counter, bts: Fail if BTS is not available Reserve PERF_COUNT_HW_BRANCH_INSTRUCTIONS with sample_period == 1 for BTS tracing and fail, if BTS is not available. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090902140612.943801000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 09:26:39 +02:00
Jeremy Fitzhardinge	53f824520b	x86/i386: Put aligned stack-canary in percpu shared_aligned section Pack aligned things together into a special section to minimize padding holes. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Tejun Heo <tj@kernel.org> LKML-Reference: <4AA035C0.9070202@goop.org> [ queued up in tip:x86/asm because it depends on this commit: x86/i386: Make sure stack-protector segment base is cache aligned ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 07:10:31 +02:00
Andreas Herrmann	5a925b4282	x86, sched: Workaround broken sched domain creation for AMD Magny-Cours Current sched domain creation code can't handle multi-node processors. When switching to power_savings scheduling errors show up and system might hang later on (due to broken sched domain hierarchy): # echo 0 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-5 level MC groups: 0 1 2 3 4 5 domain 1: span 0-23 level NODE groups: 0-5 6-11 18-23 12-17 ... # echo 1 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-11 level MC groups: 0 1 2 3 4 5 6 7 8 9 10 11 ERROR: parent span is not a superset of domain->span domain 1: span 0-5 level CPU ERROR: domain->groups does not contain CPU0 groups: 6-11 (__cpu_power = 12288) ERROR: groups don't span domain->span domain 2: span 0-23 level NODE groups: ERROR: domain->cpu_power not set ERROR: groups don't span domain->span ... Fixing all aspects of power-savings scheduling for Magny-Cours needs some larger changes in the sched domain creation code. As a short-term and temporary workaround avoid the problems by extending "the worst possible hack" ;-( and always use llc_shared_map on AMD Magny-Cours when MC domain span is calculated. With this I get: # echo 1 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-5 level MC groups: 0 1 2 3 4 5 domain 1: span 0-5 level CPU groups: 0-5 (__cpu_power = 6144) domain 2: span 0-23 level NODE groups: 0-5 (__cpu_power = 6144) 6-11 (__cpu_power = 6144) 18-23 (__cpu_power = 6144) 12-17 (__cpu_power = 6144) ... I.e. no errors during sched domain creation, no system hangs, and also mc_power_savings scheduling works to a certain extend. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-03 15:10:14 -07:00
Andreas Herrmann	cb9805ab5b	x86, mcheck: Use correct cpumask for shared bank4 This fixes threshold_bank4 support on multi-node processors. The correct mask to use is llc_shared_map, representing an internal node on Magny-Cours. We need to create 2 sets of symlinks for sibling shared banks -- one set for each internal node, symlinks of each set should target the first core on same internal node. Currently only one set is created where all symlinks are targeting the first core of the entire socket. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-03 15:10:08 -07:00
Andreas Herrmann	a326e948c5	x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors L3 cache size, associativity and shared_cpu information need to be adapted to show information for an internal node instead of the entire physical package. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-03 15:10:03 -07:00
Andreas Herrmann	4a376ec3a2	x86: Fix CPU llc_shared_map information for AMD Magny-Cours Construct entire NodeID and use it as cpu_llc_id. Thus internal node siblings are stored in llc_shared_map. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-03 15:09:59 -07:00
Jeremy Fitzhardinge	1ea0d14e48	x86/i386: Make sure stack-protector segment base is cache aligned The Intel Optimization Reference Guide says: In Intel Atom microarchitecture, the address generation unit assumes that the segment base will be 0 by default. Non-zero segment base will cause load and store operations to experience a delay. - If the segment base isn't aligned to a cache line boundary, the max throughput of memory operations is reduced to one [e]very 9 cycles. [...] Assembly/Compiler Coding Rule 15. (H impact, ML generality) For Intel Atom processors, use segments with base set to 0 whenever possible; avoid non-zero segment base address that is not aligned to cache line boundary at all cost. We can't avoid having a non-zero base for the stack-protector segment, but we can make it cache-aligned. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: <stable@kernel.org> LKML-Reference: <4AA01893.6000507@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-03 21:30:51 +02:00
Ingo Molnar	8adf65cfae	x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too The macro was defined in the 32-bit path as well - breaking the build on 32-bit platforms: arch/x86/lib/msr-reg.S: Assembler messages: arch/x86/lib/msr-reg.S:53: Error: Bad macro parameter list arch/x86/lib/msr-reg.S💯 Error: invalid character '_' in mnemonic arch/x86/lib/msr-reg.S:101: Error: invalid character '_' in mnemonic Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <tip-f6909f394c2d4a0a71320797df72d54c49c5927e@git.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-03 21:26:34 +02:00
Joerg Roedel	2b681fafcc	Merge branch 'amd-iommu/pagetable' into amd-iommu/2.6.32 Conflicts: arch/x86/kernel/amd_iommu.c	2009-09-03 17:14:57 +02:00
Joerg Roedel	03362a05c5	Merge branch 'amd-iommu/passthrough' into amd-iommu/2.6.32 Conflicts: arch/x86/kernel/amd_iommu.c arch/x86/kernel/amd_iommu_init.c	2009-09-03 16:34:23 +02:00
Joerg Roedel	85da07c409	Merge branches 'gart/fixes', 'amd-iommu/fixes+cleanups' and 'amd-iommu/fault-handling' into amd-iommu/2.6.32	2009-09-03 16:32:00 +02:00
Pavel Vasilyev	6ac162d6c0	x86/gart: Do not select AGP for GART_IOMMU There is no dependency from the gart code to the agp code. And since a lot of systems today do not have agp anymore remove this dependency from the kernel configuration. Signed-off-by: Pavel Vasilyev <pavel@pavlinux.ru> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:20:55 +02:00
Joerg Roedel	4751a95134	x86/amd-iommu: Initialize passthrough mode when requested This patch enables the passthrough mode for AMD IOMMU by running the initialization function when iommu=pt is passed on the kernel command line. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:46 +02:00
Joerg Roedel	a1ca331c8a	x86/amd-iommu: Don't detach device from pt domain on driver unbind This patch makes sure a device is not detached from the passthrough domain when the device driver is unloaded or does otherwise release the device. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:46 +02:00
Joerg Roedel	21129f786f	x86/amd-iommu: Make sure a device is assigned in passthrough mode When the IOMMU driver runs in passthrough mode it has to make sure that every device not assigned to an IOMMU-API domain must be put into the passthrough domain instead of keeping it unassigned. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:45 +02:00
Joerg Roedel	eba6ac60ba	x86/amd-iommu: Align locking between attach_device and detach_device This patch makes the locking behavior between the functions attach_device and __attach_device consistent with the locking behavior between detach_device and __detach_device. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:44 +02:00
Joerg Roedel	aa879fff5d	x86/amd-iommu: Fix device table write order The V bit of the device table entry has to be set after the rest of the entry is written to not confuse the hardware. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:43 +02:00
Joerg Roedel	0feae533dd	x86/amd-iommu: Add passthrough mode initialization functions When iommu=pt is passed on kernel command line the devices should run untranslated. This requires the allocation of a special domain for that purpose. This patch implements the allocation and initialization path for iommu=pt. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:42 +02:00
Joerg Roedel	2650815fb0	x86/amd-iommu: Add core functions for pd allocation/freeing This patch factors some code of protection domain allocation into seperate functions. This way the logic can be used to allocate the passthrough domain later. As a side effect this patch fixes an unlikely domain id leakage bug. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:34 +02:00
Joerg Roedel	ac0101d396	x86/dma: Mark iommu_pass_through as __read_mostly This variable is read most of the time. This patch marks it as such. It also documents the meaning the this variable while at it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:13:46 +02:00
Joerg Roedel	abdc5eb3d6	x86/amd-iommu: Change iommu_map_page to support multiple page sizes This patch adds a map_size parameter to the iommu_map_page function which makes it generic enough to handle multiple page sizes. This also requires a change to alloc_pte which is also done in this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:11:17 +02:00
Joerg Roedel	a6b256b413	x86/amd-iommu: Support higher level PTEs in iommu_page_unmap This patch changes fetch_pte and iommu_page_unmap to support different page sizes too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:11:08 +02:00
Joerg Roedel	674d798a80	x86/amd-iommu: Remove old page table handling macros These macros are not longer required. So remove them. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:49 +02:00
Joerg Roedel	8f7a017ce0	x86/amd-iommu: Use 2-level page tables for dma_ops domains The driver now supports a dynamic number of levels for IO page tables. This allows to reduce the number of levels for dma_ops domains by one because a dma_ops domain has usually an address space size between 128MB and 4G. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:49 +02:00
Joerg Roedel	bad1cac28a	x86/amd-iommu: Remove bus_addr check in iommu_map_page The driver now supports full 64 bit device address spaces. So this check is not longer required. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:48 +02:00
Joerg Roedel	8c8c143cdc	x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX This change allows to remove these old macros later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:47 +02:00
Joerg Roedel	8bc3e12742	x86/amd-iommu: Change alloc_pte to support 64 bit address space This patch changes the alloc_pte function to be able to map pages into the whole 64 bit address space supported by AMD IOMMU hardware from the old limit of 2**39 bytes. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:46 +02:00
Joerg Roedel	50020fb632	x86/amd-iommu: Introduce increase_address_space function This function will be used to increase the address space size of a protection domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:46 +02:00
Joerg Roedel	04bfdd8406	x86/amd-iommu: Flush domains if address space size was increased Thist patch introduces the update_domain function which propagates the larger address space of a protection domain to the device table and flushes all relevant DTEs and the domain TLB. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:45 +02:00
Joerg Roedel	407d733e30	x86/amd-iommu: Introduce set_dte_entry function This function factors out some logic of attach_device to a seperate function. This new function will be used to update device table entries when necessary. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:44 +02:00
Joerg Roedel	6a0dbcbe4e	x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices This patch adds a generic variant of amd_iommu_flush_all_devices function which flushes only the DTEs for a given protection domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:43 +02:00
Joerg Roedel	a6d41a4027	x86/amd-iommu: Use fetch_pte in amd_iommu_iova_to_phys Don't reimplement the page table walker in this function. Use the generic one. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:42 +02:00
Joerg Roedel	38a76eeeaf	x86/amd-iommu: Use fetch_pte in iommu_unmap_page Instead of reimplementing existing logic use fetch_pte to walk the page table in iommu_unmap_page. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:41 +02:00
Joerg Roedel	9355a08186	x86/amd-iommu: Make fetch_pte aware of dynamic mapping levels This patch changes the fetch_pte function in the AMD IOMMU driver to support dynamic mapping levels. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:34 +02:00
Joerg Roedel	6a1eddd2f9	x86/amd-iommu: Reset command buffer if wait loop fails Instead of a panic on an comletion wait loop failure, try to recover from that event from resetting the command buffer. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:56:19 +02:00
Joerg Roedel	b26e81b871	x86/amd-iommu: Panic if IOMMU command buffer reset fails To prevent the driver from doing recursive command buffer resets, just panic when that recursion happens. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:55:35 +02:00
Joerg Roedel	a345b23b79	x86/amd-iommu: Reset command buffer on ILLEGAL_COMMAND_ERROR On an ILLEGAL_COMMAND_ERROR the IOMMU stops executing further commands. This patch changes the code to handle this case better by resetting the command buffer in the IOMMU. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:55:34 +02:00
Joerg Roedel	93f1cc67cf	x86/amd-iommu: Add reset function for command buffers This patch factors parts of the command buffer initialization code into a seperate function which can be used to reset the command buffer later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:55:34 +02:00
Joerg Roedel	d586d7852c	x86/amd-iommu: Add function to flush all DTEs on one IOMMU This function flushes all DTE entries on one IOMMU for all devices behind this IOMMU. This is required for command buffer resetting later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:55:23 +02:00
Joerg Roedel	e0faf54ee8	x86/amd-iommu: fix broken check in amd_iommu_flush_all_devices The amd_iommu_pd_table is indexed by protection domain number and not by device id. So this check is broken and must be removed. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:49:56 +02:00
Joerg Roedel	ae908c22aa	x86/amd-iommu: Remove redundant 'IOMMU' string The 'IOMMU: ' prefix is not necessary because the DUMP_printk macro already prints its own prefix. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:49:56 +02:00
Joerg Roedel	4c6f40d4e0	x86/amd-iommu: replace "AMD IOMMU" by "AMD-Vi" This patch replaces the "AMD IOMMU" printk strings with the official name for the hardware: "AMD-Vi". Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:49:56 +02:00
Joerg Roedel	f2430bd104	x86/amd-iommu: Remove some merge helper code This patch removes some left-overs which where put into the code to simplify merging code which also depends on changes in other trees. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:49:55 +02:00
Joerg Roedel	e394d72aa8	x86/amd-iommu: Introduce function for iommu-local domain flush This patch introduces a function to flush all domain tlbs for on one given IOMMU. This is required later to reset the command buffer on one IOMMU. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:41:34 +02:00
Joerg Roedel	945b4ac44e	x86/amd-iommu: Dump illegal command on ILLEGAL_COMMAND_ERROR This patch adds code to dump the command which caused an ILLEGAL_COMMAND_ERROR raised by the IOMMU hardware. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 14:28:04 +02:00
Joerg Roedel	e3e59876e8	x86/amd-iommu: Dump fault entry on DTE error This patch adds code to dump the content of the device table entry which caused an ILLEGAL_DEV_TABLE_ENTRY error from the IOMMU hardware. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 14:17:08 +02:00
David S. Miller	e6617c6ec2	sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds. This is a compromise and a temporary workaround for bootup NMI watchdog triggers some people see with qla2xxx devices present. This happens when, for example: CPU 0 is in the driver init and looping submitting mailbox commands to load the firmware, then waiting for completion. CPU 1 is receiving the device interrupts. CPU 1 is where the NMI watchdog triggers. CPU 0 is submitting mailbox commands fast enough that by the time CPU 1 returns from the device interrupt handler, a new one is pending. This sequence runs for more than 5 seconds. The problematic case is CPU 1's timer interrupt running when the barrage of device interrupts begin. Then we have: timer interrupt return for softirq checking pending, thus enable interrupts qla2xxx interrupt return qla2xxx interrupt return ... 5+ seconds pass final qla2xxx interrupt for fw load return run timer softirq return At some point in the multi-second qla2xxx interrupt storm we trigger the NMI watchdog on CPU 1 from the NMI interrupt handler. The timer softirq, once we get back to running it, is smart enough to run the timer work enough times to make up for the missed timer interrupts. However, the NMI watchdogs (both x86 and sparc) use the timer interrupt count to notice the cpu is wedged. But in the above scenerio we'll receive only one such timer interrupt even if we last all the way back to running the timer softirq. The default watchdog trigger point is only 5 seconds, which is pretty low (the softwatchdog triggers at 60 seconds). So increase it to 30 seconds for now. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-03 02:35:20 -07:00
Paul Mackerras	a3df6f7d30	perf_counter/powerpc: Fix cache event codes for POWER7 I had the codes for L1 D-cache load accesses and misses swapped around, and the wrong codes for LL-cache accesses and misses. This corrects them. Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <19103.8514.709300.585484@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-03 08:41:53 +02:00
Ingo Molnar	f76bd108e5	Merge branch 'perfcounters/urgent' into perfcounters/core Merge reason: We are going to modify a place modified by perfcounters/urgent. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 21:42:59 +02:00
Jiri Bohac	5afe18d2f5	[IA64] fix csum_ipv6_magic() The 32-bit parameters (len and csum) of csum_ipv6_magic() are passed in 64-bit registers in2 and in4. The high order 32 bits of the registers were never cleared, and garbage was sometimes calculated into the checksum. Fix this by clearing the high order 32 bits of these registers. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-02 09:14:48 -07:00
Luck, Tony	f2486f2669	[IA64] Fix warning in dma-mapping.c arch/ia64/kernel/dma-mapping.c:14: warning: control reaches end of non-void function arch/ia64/kernel/dma-mapping.c:14: warning: no return statement in function returning non-void This warning was introduced by commit: `390bd132b2` Add dma_debug_init() for ia64 Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-02 09:12:21 -07:00
David Howells	ee18d64c1f	KEYS: Add a keyctl to install a process's session keyring on its parent [try #6 ] Add a keyctl to install a process's session keyring onto its parent. This replaces the parent's session keyring. Because the COW credential code does not permit one process to change another process's credentials directly, the change is deferred until userspace next starts executing again. Normally this will be after a wait() syscall. To support this, three new security hooks have been provided: cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in the blank security creds and key_session_to_parent() - which asks the LSM if the process may replace its parent's session keyring. The replacement may only happen if the process has the same ownership details as its parent, and the process has LINK permission on the session keyring, and the session keyring is owned by the process, and the LSM permits it. Note that this requires alteration to each architecture's notify_resume path. This has been done for all arches barring blackfin, m68k and xtensa, all of which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the replacement to be performed at the point the parent process resumes userspace execution. This allows the userspace AFS pioctl emulation to fully emulate newpag() and the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to alter the parent process's PAG membership. However, since kAFS doesn't use PAGs per se, but rather dumps the keys into the session keyring, the session keyring of the parent must be replaced if, for example, VIOCSETTOK is passed the newpag flag. This can be tested with the following program: #include <stdio.h> #include <stdlib.h> #include <keyutils.h> #define KEYCTL_SESSION_TO_PARENT 18 #define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0) int main(int argc, char **argv) { key_serial_t keyring, key; long ret; keyring = keyctl_join_session_keyring(argv[1]); OSERROR(keyring, "keyctl_join_session_keyring"); key = add_key("user", "a", "b", 1, keyring); OSERROR(key, "add_key"); ret = keyctl(KEYCTL_SESSION_TO_PARENT); OSERROR(ret, "KEYCTL_SESSION_TO_PARENT"); return 0; } Compiled and linked with -lkeyutils, you should see something like: [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: _ses 355907932 --alswrv 4043 -1 \_ keyring: _uid.4043 [dhowells@andromeda ~]$ /tmp/newpag [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: _ses 1055658746 --alswrv 4043 4043 \_ user: a [dhowells@andromeda ~]$ /tmp/newpag hello [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: hello 340417692 --alswrv 4043 4043 \_ user: a Where the test program creates a new session keyring, sticks a user key named 'a' into it and then installs it on its parent. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-09-02 21:29:22 +10:00
David Howells	d0420c83f3	KEYS: Extend TIF_NOTIFY_RESUME to (almost) all architectures [try #6 ] Implement TIF_NOTIFY_RESUME for most of those architectures in which isn't yet available, and, whilst we're at it, have it call the appropriate tracehook. After this patch, blackfin, m68k* and xtensa still lack support and need alteration of assembly code to make it work. Resume notification can then be used (by a later patch) to install a new session keyring on the parent of a process. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> cc: linux-arch@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>	2009-09-02 21:29:19 +10:00
Nicolas Pitre	13f96d8f4c	ARM: 5687/1: fix an oops with highmem In xdr_partial_copy_from_skb() there is that sequence: kaddr = kmap_atomic(ppage, KM_SKB_SUNRPC_DATA); [...] flush_dcache_page(ppage); kunmap_atomic(kaddr, KM_SKB_SUNRPC_DATA); Mixing flush_dcache_page() and kmap_atomic() is a bit odd, especially since kunmap_atomic() must deal with cache issues already. OTOH the non-highmem case must use flush_dcache_page() as kunmap_atomic() becomes a no op with no cache maintenance. Problem is that with highmem the implementation of kmap_atomic() doesn't set page->virtual, and page_address(page) returns 0 in that case. Here flush_dcache_page() calls __flush_dcache_page() which calls __cpuc_flush_dcache_page(page_address(page)) resulting in a kernel oops. None of the kmap_atomic() implementations uses set_page_address(). Hence we can assume page_address() is always expected to return 0 in that case. Let's conditionally call __cpuc_flush_dcache_page() only when the page address is non zero, and perform that test only when highmem is configured. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-02 11:33:24 +01:00
wanzongshun	8e22676e56	ARM: 5684/1: Add nuc960 platform to w90x900 Add nuc960 platform to w90x900. Signed-off-by: Wan ZongShun <mcuos.com@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-02 11:22:24 +01:00
wanzongshun	936fbe9efc	ARM: 5683/1: Add nuc950 platform to w90x900 Add nuc950 platform to w90x900. Signed-off-by: Wan ZongShun <mcuos.com@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-02 11:22:24 +01:00
wanzongshun	35c9221acb	ARM: 5682/1: Add cpu.c and dev.c and modify some files of w90p910 platform Add the cpu.c and dev.c and modify w90p910 platform to apply to use the common API(provided by cpu.c and dev.c) at the same time, I renamed all w90x900 to nuc900 in every c file of w90x900 platform and touchscreen's driver name. Signed-off-by: Wan ZongShun <mcuos.com@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-02 11:22:23 +01:00
Russell King	98b0979f02	MMC: MMCI: convert realview MMC to use gpiolib Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-02 11:21:14 +01:00
Stephen Hemminger	0fc0b732ea	netdev: drivers should make ethtool_ops const No need to put ethtool_ops in data, they should be const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-02 01:03:33 -07:00
Benjamin Herrenschmidt	cede3930f0	powerpc: Fix some late PowerMac G5 with PCIe ATI graphics A misconfiguration by the firmware of the U4 PCIe bridge on PowerMac G5 with the U4 bridge (latest generations, may also affect the iMac G5 "iSight") is causing us to re-assign the PCI BARs of the video card, which can get it out of sync with the firmware, thus breaking offb. This works around it by fixing up the bridge configuration properly at boot time. It also fixes a bug where the firmware provides us with an incorrect set of accessible regions in the device-tree. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-02 16:20:42 +10:00
Kumar Gala	76acc2c1a7	powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT Switch to using the Power ISA defined PTE format when we have a 64-bit PTE. This makes the code handling between fsl-booke and book3e-64 similiar for TLB faults. Additionally this lets use take advantage of the page size encodings and full permissions that the HW PTE defines. Also defined _PMD_PRESENT, _PMD_PRESENT_MASK, and _PMD_BAD since the 32-bit ppc arch code expects them. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-02 16:20:41 +10:00
Kumar Gala	1d5d9527d8	powerpc/book3e: Add missing page sizes Add defines for the other page sizes. Even if HW doesn't support them we made them use them for hugetlbfs support. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-02 16:20:40 +10:00
Brian King	46db2f86a3	powerpc/pseries: Fix to handle slb resize across migration The SLB can change sizes across a live migration, which was not being handled, resulting in possible machine crashes during migration if migrating to a machine which has a smaller max SLB size than the source machine. Fix this by first reducing the SLB size to the minimum possible value, which is 32, prior to migration. Then during the device tree update which occurs after migration, we make the call to ensure the SLB gets updated. Also add the slb_size to the lparcfg output so that the migration tools can check to make sure the kernel has this capability before allowing migration in scenarios where the SLB size will change. BenH: Fixed #include <asm/mmu-hash64.h> -> <asm/mmu.h> to avoid breaking ppc32 build Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-02 16:19:01 +10:00
Ingo Molnar	936e894a97	Merge commit 'v2.6.31-rc8' into x86/txt Conflicts: arch/x86/kernel/reboot.c security/Kconfig Merge reason: resolve the conflicts, bump up from rc3 to rc8. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 08:17:56 +02:00
Grant Likely	0ed2c722c6	powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan() The two versions are doing almost exactly the same thing. No need to maintain them as separate files. This patch also has the side effect of making the PCI device tree scanning code available to 32 bit powerpc machines, but no board ports actually make use of this feature at this point. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-02 15:45:53 +10:00
Huang Ying	ae4b688db2	x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h This function measures whether the FPU/SSE state can be touched in interrupt context. If the interrupted code is in user space or has no valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ or soft_irq context too. This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-01 21:39:15 -07:00
Shane Wang	69575d3886	x86, intel_txt: clean up the impact on generic code, unbreak non-x86 Move tboot.h from asm to linux to fix the build errors of intel_txt patch on non-X86 platforms. Remove the tboot code from generic code init/main.c and kernel/cpu.c. Signed-off-by: Shane Wang <shane.wang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-01 18:25:07 -07:00
Alexey Dobriyan	e7a088f935	sparc: convert /proc/io_map, /proc/dvma_map to seq_file Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-01 17:54:07 -07:00
H. Peter Anvin	f6909f394c	x86, msr: fix msr-reg.S compilation with gas 2.16.1 msr-reg.S used the :req option on a macro argument, which wasn't supported by gas 2.16.1 (but apparently by some earlier versions of gas, just to be confusing.) It isn't necessary, so just remove it. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@googlemail.com>	2009-09-01 13:37:21 -07:00
Ingo Molnar	c931aaf0e1	Merge branch 'x86/paravirt' into x86/cpu Conflicts: arch/x86/include/asm/paravirt.h Manual merge: arch/x86/include/asm/paravirt_types.h Merge reason: x86/paravirt conflicts non-trivially with x86/cpu, resolve it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-01 12:13:30 +02:00
Catalin Marinas	acde31dc46	kmemleak: Ignore the aperture memory hole on x86_64 This block is allocated with alloc_bootmem() and scanned by kmemleak but the kernel direct mapping may no longer exist. This patch tells kmemleak to ignore this memory hole. The dma32_bootmem_ptr in dma32_reserve_bootmem() is also ignored. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Ingo Molnar <mingo@elte.hu>	2009-09-01 11:12:32 +01:00
Heiko Carstens	96910b6dc8	locking, m68k/asm-offsets: Rename signal defines In order to be able to use asm-offsets.h in C files the existing namespace conflicts must be solved first. In asm-offsets.h there are defines for signal constants, so they can be used in assembler files. Unfortunately the existing defines use a 1:1 mapping for the macro names which results in name space conflicts if the header file would also be used in C files. So rename the created defines and add an "L" prefix to each one since that has already been done for the SIGTRAP define in entry_mm. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Horst Hartmann <horsth@linux.vnet.ibm.com> Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <20090831124416.998821502@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-01 09:38:03 +02:00
H. Peter Anvin	ff55df53df	x86, msr: Export the register-setting MSR functions via /dev/*/msr Make it possible to access the all-register-setting/getting MSR functions via the MSR driver. This is implemented as an ioctl() on the standard MSR device node. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>	2009-08-31 16:16:04 -07:00
H. Peter Anvin	8b956bf1f0	x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() Create _on_cpu helpers for {rw,wr}msr_safe_regs() analogously with the other MSR functions. This will be necessary to add support for these to the MSR driver. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>	2009-08-31 16:15:57 -07:00
H. Peter Anvin	0cc0213e73	x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT For some reason, the _safe MSR functions returned -EFAULT, not -EIO. However, the only user which cares about the return code as anything other than a boolean is the MSR driver, which wants -EIO. Change it to -EIO across the board. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-08-31 15:15:23 -07:00
H. Peter Anvin	79c5dca361	x86, msr: CFI annotations, cleanups for msr-reg.S Add CFI annotations for native_{rd,wr}msr_safe_regs(). Simplify the 64-bit implementation: we don't allow the upper half registers to be set, and so we can use them to carry state across the operation. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>	2009-08-31 15:14:47 -07:00
H. Peter Anvin	709972b1f6	x86, asm: Make _ASM_EXTABLE() usable from assembly code We have had this convenient macro _ASM_EXTABLE() to generate exception table entry in inline assembly. Make it also usable for pure assembly. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:30 -07:00
H. Peter Anvin	fe9b4e4e40	x86, asm: Add 32-bit versions of the combined CFI macros Add 32-bit versions of the combined CFI macros, equivalent to the 64-bit ones except, obviously, operating on 32-bit stack words. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:29 -07:00
Borislav Petkov	6b0f43ddfa	x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit `fbd8b1819e` turns off the bit for /proc/cpuinfo. However, a proper/full fix would be to additionally turn off the bit in the CPUID output so that future callers get correct CPU features info. Do that by basically reversing what the BIOS wrongfully does at boot. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1251705011-18636-3-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:29 -07:00
Borislav Petkov	177fed1ee8	x86, msr: Rewrite AMD rd/wrmsr variants Switch them to native_{rd,wr}msr_safe_regs and remove pv_cpu_ops.read_msr_amd. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-2-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:28 -07:00
Borislav Petkov	132ec92f3f	x86, msr: Add rd/wrmsr interfaces with preset registers native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow presetting of a subset of eight x86 GPRs before executing the rd/wrmsr instructions. This is needed at least on AMD K8 for accessing an erratum workaround MSR. Originally based on an idea by H. Peter Anvin. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:26 -07:00
Heiko Carstens	b62e180cae	locking: Inline spinlock code for all locking variants on s390 Speeds up several benchmarks in a measurable way, so inline all spin-lock variants by default. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Horst Hartmann <horsth@linux.vnet.ibm.com> Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <20090831124419.319518405@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-31 18:08:51 +02:00
Heiko Carstens	0ee000e5e8	locking, m68k: Calculate thread_info offset with asm offset m68k has the thread_info structure embedded in its task struct. Therefore its not possible to implement current_thread_info() by looking at the stack pointer and do some simple calculations like most other architectures do it. To return the thread_info pointer for a task two defines are used. This works until the spinlock function bodies get moved into an own header file and CONFIG_SPINLOCK_DEBUG is turned on. That results into this compile error: In file included from include/linux/spinlock.h:378, from include/linux/seqlock.h:29, from include/linux/time.h:8, from include/linux/timex.h:56, from include/linux/sched.h:54, from arch/m68k/kernel/asm-offsets.c:12: include/linux/spinlock_api_smp.h: In function '__spin_unlock_irq': include/linux/spinlock_api_smp.h:371: error: 'current' undeclared (first use in this function) include/linux/spinlock_api_smp.h:371: error: (Each undeclared identifier is reported only once include/linux/spinlock_api_smp.h:371: error: for each function it appears in.) Including asm/current.h to asm-offsets.c wouldn't help since the definition of struct task is needed. So we end up with ugly header file include dependencies. To solve this calculate the offset of the thread_info structure into the task struct in asm-offsets.h and use the offset in task_thread_info(). This works just like it does for IA64 as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Horst Hartmann <horsth@linux.vnet.ibm.com> Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <20090831124417.329662275@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-31 18:08:49 +02:00
Heiko Carstens	f159ee7829	locking, m68k/asm-offsets: Rename pt_regs offset defines In order to be able to use asm-offsets.h in C files the existing namespace conflicts must be solved first. In asm-offsets.h e.g. PT_D0 gets defined which is the offset of the d0 member of the pt_regs structure. However a same define (with a different meaning) exists in asm/ptregs.h. So rename the defines created with the asm-offset mechanism to PT_OFF_D0 etc. There also already exist a few defines with these names that have the same meaning. So remove the existing defines and use the asm-offset generated ones. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Horst Hartmann <horsth@linux.vnet.ibm.com> Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <20090831124416.666403991@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-31 18:08:49 +02:00
Heiko Carstens	9f34ceb603	locking, sparc: Rename __spin_try_lock() and friends Needed to avoid namespace conflicts when the common code function bodies of _spin_try_lock() etc. are moved to a header file where the function name would be __spin_try_lock(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Horst Hartmann <horsth@linux.vnet.ibm.com> Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <20090831124416.306495811@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-31 18:08:48 +02:00
Heiko Carstens	8307a98097	locking, powerpc: Rename __spin_try_lock() and friends Needed to avoid namespace conflicts when the common code function bodies of _spin_try_lock() etc. are moved to a header file where the function name would be __spin_try_lock(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Horst Hartmann <horsth@linux.vnet.ibm.com> Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <20090831124415.918799705@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-31 18:08:48 +02:00
Tiejun Chen	c5b20d3926	powerpc/405ex: support cuImage via included dtb To support cuImage, we need to initialize the required sections and ensure that it is built. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2009-08-31 09:23:22 -04:00
Tiejun Chen	0484c1df47	powerpc/405ex: provide necessary fixup function to support cuImage For cuImage format it's necessary to provide clock fixups since u-boot will not pass necessary clock frequency into the dtb included into cuImage so we implement the clock fixups as defined in the technical documentation for the board and update header file with the basic register definitions. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2009-08-31 09:20:55 -04:00
Solomon Peachy	0cdf50a7c6	powerpc/40x: Add support for the ESTeem 195E (PPC405EP) SBC This patch adds support for the ESTeem 195E Hotfoot SBC. There are several variants of the SBC deployed, single/dual ethernet+serial, and also 4MB/8MB flash variations. In the interest of having a single kernel image boot on all boards, the cuboot shim detects the differences and mangles the DTS tree appropriately. With the exception of the CF interface that was never populated on production boards, this code/DTS supports all boardpop options. Signed-off-by: Solomon Peachy <solomon@linux-wlan.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2009-08-31 09:15:51 -04:00
fkan@amcc.com	c9f75093a4	powerpc/44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support. This patch adds support for the AMCC (AppliedMicro) PPC460SX Eiger evaluation board. Signed-off-by: Tai Tri Nguyen <ttnguyen@amcc.com> Acked-by: Feng Kan <fkan@amcc.com> Acked-by: Tirumala Marri <tmarri@amcc.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2009-08-31 08:24:15 -04:00
Stefan Roese	2417492613	powerpc/44x: Update Arches defconfig This patch adds NOR MTD support and I2C HWMON support for the AD7414 to the AMCC Arches defconfig. Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2009-08-31 08:24:14 -04:00
Stefan Roese	f661be6c8a	powerpc/44x: Update Arches dts This patch adds some nodes to the AMCC Arches dts: - L2 cache support - NOR FLASH mapping with default partitioning - I2C HWMON device (AD7414) Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2009-08-31 08:24:14 -04:00
Anton Vorontsov	ed24157ede	powerpc/qe: Implement qe_alive_during_sleep() helper function In some CPUs (i.e. MPC8569) QE shuts down completely during sleep, drivers may want to know that to reinitialize registers and buffer descriptors. This patch implements qe_alive_during_sleep() helper function, so far it just checks if MPC8569-compatible power management controller is present, which is a sign that QE turns off during sleep. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-08-30 21:51:33 -07:00
Michal Schmidt	23386d63bb	x86: Detect stack protector for i386 builds on x86_64 Stack protector support was not detected when building with ARCH=i386 on x86_64 systems: arch/x86/Makefile:80: stack protector enabled but no compiler support The "-m32" argument needs to be passed to the detection script. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20090829182718.10f566b1@leela> Signed-off-by: Ingo Molnar <mingo@elte.hu> --	2009-08-30 20:39:48 +02:00
Jan Beulich	47d25003cb	x86: Fix earlyprintk=dbgp for machines without NX Since parse_early_param() may (e.g. for earlyprintk=dbgp) involve calls to page table manipulation functions (here set_fixmap_nocache()), NX hardware support must be determined before calling that function (so that __supported_pte_mask gets properly set up). But the call after parse_early_param() can also not go away, as that will honor eventual command line specified disabling of the NX functionality. ( This will then just result in whatever mappings got established during parse_early_param() having the NX bit set despite it being disabled on the command line, but I think that's tolerable). Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> LKML-Reference: <4A97F3BD02000078000121B9@vpn.id2.novell.com> [ merged to x86/pat to resolve a conflict. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-29 15:47:32 +02:00
Ingo Molnar	eebc57f73d	Merge branch 'for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6 into x86/apic Merge reason: the SFI (Simple Firmware Interface) feature in the ACPI tree needs this cleanup, pull it into the APIC branch as well so that there's no interactions. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-29 09:31:47 +02:00
Yoshihiro Shimoda	4923576b8a	net: sh_eth: add value of ether_link pin in platform_data The method of ETHER_LINK pin is board dependence. This patch adding paramters are: - no_ether_link : If set to 1, do not use ETHER_LINK - ether_link_active_low : If set to 1, ETHER_LINK is active low. Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-08-29 00:19:35 -07:00
Grant Grundler	825e1e2391	parisc: fix warning in traps.c On Tue, Aug 18, 2009 at 01:45:17PM -0400, John David Anglin wrote: > CC arch/parisc/kernel/traps.o > arch/parisc/kernel/traps.c: In function 'handle_interruption': > arch/parisc/kernel/traps.c:535:18: warning: operation on 'regs->iasq[0]' > may be undefined Yes - Line 535 should use both [0] and [1]. Reported-by: John David Anglin <dave@hiauly1.hia.nrc.ca> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-28 19:37:20 -10:00
Linus Torvalds	4ed86af67e	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix vSMP boot crash x86, xen: Initialize cx to suppress warning x86, xen: Suppress WP test on Xen	2009-08-28 19:32:32 -10:00
Feng Tang	2a4ab640d3	ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n Some IO-APIC routines are ACPI specific now, but need to be exposed when CONFIG_ACPI=n for the benefit of SFI. Remove #ifdef ACPI around these routines: io_apic_get_unique_id(int ioapic, int apic_id); io_apic_get_version(int ioapic); io_apic_get_redir_entries(int ioapic); Move these routines from ACPI-specific boot.c to io_apic.c: uniq_ioapic_id(u8 id) mp_find_ioapic() mp_find_ioapic_pin() mp_register_ioapic() Also, since uniq_ioapic_id() is now no longer static, re-name it to io_apic_unique_id() for consistency with the other public io_apic routines. For simplicity, do not #ifdef the resulting code ACPI \|\| SFI, thought that could be done in the future if it is important to optimize the !ACPI !SFI IO-APIC x86 kernel for size. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org	2009-08-28 19:57:27 -04:00
Dmitry Torokhov	4b61bb575b	Merge commit 'v2.6.31-rc8' into next	2009-08-27 22:00:20 -07:00
Benjamin Herrenschmidt	77c0a700c1	powerpc: Properly start decrementer on BookE secondary CPUs This moves the code to start the decrementer on 40x and BookE into a separate function which is now called from time_init() and secondary_time_init(), before the respective clock sources are registered. We also remove the 85xx specific code for doing it from the platform code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:25:04 +10:00
Paul Gortmaker	e5a6a1c909	powerpc: derive COMMAND_LINE_SIZE from asm-generic The default COMMAND_LINE_SIZE in asm-generic is 512, so the net effect of this change is nil, aside from the cleanup factor. See also commit `2b74b8569`. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:16 +10:00
Kumar Gala	89c2dd62a3	powerpc/pci: Pull ppc32 PCI features into common Some of the PCI features we have in ppc32 we will need on ppc64 platforms in the future. These include support for: * ppc_md.pci_exclude_device * indirect config cycles * early config cycles We also simplified the logic in fake_pci_bus() to assume it will always get a valid pci_controller. Since all current callers seem to pass it one. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:15 +10:00
Grant Likely	fbe6544719	powerpc/pci: move pci_64.c device tree scanning code into pci-common.c The PCI device tree scanning code in pci_64.c is some useful functionality. It allows PCI devices to be described in the device tree instead of being probed for, which in turn allows pci devices to use all of the device tree facilities to describe complex PCI bus architectures like GPIO and IRQ routing (perhaps not a common situation for desktop or server systems, but useful for embedded systems with on-board PCI devices). This patch moves the device tree scanning into pci-common.c so it is available for 32-bit powerpc machines too. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:15 +10:00
Grant Likely	ae14e13a4c	powerpc/pci: Remove dead checks for CONFIG_PPC_OF PPC_OF is always selected for arch/powerpc. This patch removes the stale #defines Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:14 +10:00
Kumar Gala	bb1af71ecb	powerpc/book3e-64: Add support to initial_tlb_book3e for non-HES TLB We now search through TLBnCFG looking for the first array that has IPROT support (we assume that there is only one). If that TLB has hardware entry select (HES) support we use the existing code and with the proper TLB select (the HES code still needs to clean up bolted entries from firmware). The non-HES code is pretty similiar to the 32-bit FSL Book-E code but does make some new assumtions (like that we have tlbilx) and simplifies things down a bit. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:14 +10:00
Kumar Gala	4b98d9e713	powerpc/book3e-64: Add helper function to setup IVORs Not all 64-bit Book-3E parts will have fixed IVORs so add a function that cpusetup code can call to setup the base IVORs (0..15) to match the fixed offsets. We need to 'or' part of interrupt_base_book3e into the IVORs since on parts that have them the IVPR doesn't extend as far down. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:13 +10:00
Kumar Gala	6c188829d2	powerpc/book3e-64: Wait til generic_calibrate_decr to enable decrementer Match what we do on 32-bit Book-E processors and enable the decrementer in generic_calibrate_decr. We need to make sure we disable the decrementer early in boot since we currently use lazy (soft) interrupt on 64-bit Book-E and possible get a decrementer exception before we are ready for it. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:13 +10:00
Kumar Gala	f45c4486f7	powerpc/book3e-64: Move the default cpu table entry Move the default cpu entry table for CONFIG_PPC_BOOK3E_64 to the very end since we will probably want to support both 32-bit and 64-bit kernels for some processors that are higher up in the list. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:13 +10:00
Kumar Gala	df5d6ecf81	powerpc/mm: Add MMU features for TLB reservation & Paired MAS registers Support for TLB reservation (or TLB Write Conditional) and Paired MAS registers are optional for a processor implementation so we handle them via MMU feature sections. We currently only used paired MAS registers to access the full RPN + perm bits that are kept in MAS7\|\|MAS3. We assume that if an implementation has hardware page table at this time it also implements in TLB reservations. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:12 +10:00
Michael Wolf	23e55f92d4	powerpc: Adjust base and index registers in Altivec macros On POWER6 systems RA needs to be the base and RB the index. If they are reversed you take a misdirect hit. Signed-off-by: Mike Wolf <mjwolf@us.ibm.com> ---- Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:12 +10:00
Becky Bruce	e3e1d15855	powerpc: Name xpn & x fields in HW Hash PTE format Previously, the 36-bit code was using these bits, but they had never been named in the pte format definition. This patch just gives those fields their proper names and adds a comment that they are only present on some processors. There is no functional code change. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:12 +10:00
FUJITA Tomonori	80d3e8abb7	powerpc: Add CONFIG_DMA_API_DEBUG support Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:11 +10:00
FUJITA Tomonori	4a9a6bfe70	powerpc: Handle SWIOTLB mapping error properly Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:11 +10:00
FUJITA Tomonori	46bab4e4b4	powerpc: Use asm-generic/dma-mapping-common.h Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:10 +10:00
FUJITA Tomonori	45223c5492	powerpc: use dma_map_ops struct This converts uses dma_map_ops struct (in include/linux/dma-mapping.h) instead of POWERPC homegrown dma_mapping_ops. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:10 +10:00
FUJITA Tomonori	3702977fa7	powerpc: Remove swiotlb_pci_dma_ops Now swiotlb_pci_dma_ops is identical to swiotlb_dma_ops; we can use swiotlb_dma_ops with any devices. This removes swiotlb_pci_dma_ops. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:09 +10:00
FUJITA Tomonori	762afb7317	powerpc: Remove addr_needs_map in struct dma_mapping_ops This patch adds max_direct_dma_addr to struct dev_archdata to remove addr_needs_map in struct dma_mapping_ops. It also converts dma_capable() to use max_direct_dma_addr. max_direct_dma_addr is initialized in pci_dma_dev_setup_swiotlb(), called via ppc_md.pci_dma_dev_setup hook. For further information: http://marc.info/?t=124719060200001&r=1&w=2 Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-28 14:24:09 +10:00
Benjamin Herrenschmidt	2864697cef	Merge commit 'tip/iommu-for-powerpc' into next	2009-08-28 14:23:06 +10:00
Linus Torvalds	e99b1f22f9	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/ps3: Update ps3_defconfig powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration	2009-08-26 20:39:31 -07:00
Geoff Levand	b080f187ad	powerpc/ps3: Update ps3_defconfig Update ps3_defconfig. o Refresh for 2.6.31. o Remove MTD support. o Add more HID drivers. Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-27 13:27:59 +10:00
Geert Uytterhoeven	7b6a09f3d6	powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration On non-PS3, we get: \| kernel BUG at drivers/rtc/rtc-ps3.c:36! because the rtc-ps3 platform device is registered unconditionally in a kernel with builtin support for PS3. Reported-by: Sachin Sant <sachinp@in.ibm.com> Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-27 13:25:46 +10:00
Benjamin Herrenschmidt	3c2ee2d9f4	Merge commit 'kumar/next' into next	2009-08-27 13:13:41 +10:00
Gautham R Shenoy	6776426320	powerpc/pseries: Reduce the polling interval in __cpu_up() Time time taken for a single cpu online operation on a pseries machine is as follows: Dedicated LPAR (POWER6): ~220ms. Shared LPAR (POWER5) : ~240ms. Of this time, approximately 200ms is taken up by __cpu_up(). This is because we poll every 200ms to check if the new cpu has notified it's presence through the cpu_callin_map. We repeat this operation until the new cpu sets the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier. However, using completion_structs instead of polling loops, the time taken by the new processor to indicate it's presence has found to be less than 1ms on pseries. This method however may not work on all powerpc platforms due to the time-base synchronization code. Keeping this in mind, we could reduce msleep polling interval from 200ms to 1ms while retaining the 5 second timeout. With this, the time taken for a cpu online operation changes as follows: Dedicated LPAR (POWER6): 20-25ms. Shared LPAR (POWER5) : 60-80ms. In both these cases, it was found that the code polls through the loop only once indicating that 1ms is a reasonable value, atleast on pseries. The code needs testing on other powerpc platforms. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-27 13:12:54 +10:00
Bastian Blank	6fdc31a2b8	powerpc: Remove SMP warning from PowerMac cpufreq On Thu, Aug 13, 2009 at 04:14:58PM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2009-08-11 at 11:39 +0200, Bastian Blank wrote: > > This patch just disables this driver on SMP kernels, as it is obviously > > not supported. > Why not remove the #error instead ? :-) I don't think it's still > meaningful, especially since we use the timebase for delays nowadays > which doesn't depend on the CPU frequency... Your call. Take this one: The build of a PowerMac 32bit kernel currently fails with error: #warning "WARNING, CPUFREQ not recommended on SMP kernels" Thie patch removes the not longer applicable SMP warning from the PowerMac cpufreq code. Signed-off-by: Bastian Blank <waldi@debian.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-27 13:12:53 +10:00
Josh Boyer	14d757520a	powerpc: Fix __flush_icache_range on 44x The ptrace POKETEXT interface allows a process to modify the text pages of a child process being ptraced, usually to insert breakpoints via trap instructions. The kernel eventually calls copy_to_user_page, which in turn calls __flush_icache_range to invalidate the icache lines for the child process. However, this function does not work on 44x due to the icache being virtually indexed. This was noticed by a breakpoint being triggered after it had been cleared by ltrace on a 440EPx board. The convenient solution is to do a flash invalidate of the icache in the __flush_icache_range function. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-27 13:12:52 +10:00
Benjamin Herrenschmidt	ea3cc330ac	powerpc/mm: Cleanup handling of execute permission This is an attempt at cleaning up a bit the way we handle execute permission on powerpc. _PAGE_HWEXEC is gone, _PAGE_EXEC is now only defined by CPUs that can do something with it, and the myriad of #ifdef's in the I$/D$ coherency code is reduced to 2 cases that hopefully should cover everything. The logic on BookE is a little bit different than what it was though not by much. Since now, _PAGE_EXEC will be set by the generic code for executable pages, we need to filter out if they are unclean and recover it. However, I don't expect the code to be more bloated than it already was in that area due to that change. I could boast that this brings proper enforcing of per-page execute permissions to all BookE and 40x but in fact, we've had that now for some time as a side effect of my previous rework in that area (and I didn't even know it :-) We would only enable execute permission if the page was cache clean and we would only cache clean it if we took and exec fault. Since we now enforce that the later only work if VM_EXEC is part of the VMA flags, we de-fact already enforce per-page execute permissions... Unless I missed something Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-08-27 13:12:51 +10:00
Benjamin Herrenschmidt	f480fe3916	Merge commit 'origin/master' into next	2009-08-27 13:12:40 +10:00
H. Peter Anvin	b855192c08	Merge branch 'x86/urgent' into x86/pat Reason: Change to is_new_memtype_allowed() in x86/urgent Resolved semantic conflicts in: arch/x86/mm/pat.c arch/x86/mm/ioremap.c Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 17:24:28 -07:00
Venkatesh Pallipadi	d886c73cd4	x86, pat: Sanity check remap_pfn_range for RAM region Add sanity check for remap_pfn_range of RAM regions using lookup_memtype(). Previously, we did not have anyway to get the type of RAM memory regions as they were tracked using a single bit in page_struct (WB, nonWB). Now we can get the actual type from page struct (WB, WC, UC_MINUS) and make sure the requester gets that type. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:35 -07:00
Venkatesh Pallipadi	1087637616	x86, pat: Lookup the protection from memtype list on vm_insert_pfn() Lookup the reserved memtype during vm_insert_pfn and use that memtype for the new mapping. This takes care or handling of vm_insert_pfn() interface in track_pfn_vma*/untrack_pfn_vma. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:32 -07:00
Venkatesh Pallipadi	637b86e75f	x86, pat: Add lookup_memtype to get the current memtype of a paddr Add a new routine lookup_memtype() to get the current memtype based on the PAT reserves and frees. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:28 -07:00
Venkatesh Pallipadi	f584174096	x86, pat: Use page flags to track memtypes of RAM pages Change reserve_ram_pages_type and free_ram_pages_type to use 2 page flags to track UC_MINUS, WC, WB and default types. Previous RAM tracking just tracked WB or NonWB, which was not complete and did not allow tracking of RAM fully and there was no way to get the actual type reserved by looking at the page flags. We use the memtype_lock spinlock for atomicity in dealing with memtype tracking in struct page. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:24 -07:00
Venkatesh Pallipadi	46cf98cdae	x86, pat: Generalize the use of page flag PG_uncached Only IA64 was using PG_uncached as of now. We now intend to use this bit in x86 as well, to keep track of memory type of those addresses that have page struct for them. So, generalize the use of that bit across ia64 and x86. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:22 -07:00
Venkatesh Pallipadi	335ef896d4	x86, pat: Add rbtree to do quick lookup in memtype tracking PAT memtype tracking uses a linear link list to keep track of IO (non-RAM) regions and their memtypes. The code used a last_accessed pointer as a cache to speedup the lookup. As per discussions with H. Peter Anvin a while back, having a rbtree here will avoid bad performances in pathological cases where we may end up with huge linked list. This may not add any noticable performance speedup in normal case as the number of entires in PAT memtype list tend to be ~20-30 range. The patch removes the "cached_entry" logic as with rbtree we have more generic way of speeding up the lookup. With this patch, we use rbtree to do the quick lookup. We still use linked list as the memtype range tracked can be of different sizes and can overlap in different ways. We also keep track of usage counts with linked list. Example: Multiple ioremaps with different sizes uncached-minus @ 0xfffff00000-0xfffff04000 uncached-minus @ 0xfffff02000-0xfffff03000 And one userlevel mmap and the thread forks a new process uncached-minus @ 0xbf453000-0xbf454000 uncached-minus @ 0xbf453000-0xbf454000 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:19 -07:00
Venkatesh Pallipadi	9e36fda0b3	x86, pat: Add PAT reserve free to io_mapping* APIs io_mapping_* interfaces were added, mainly for graphics drivers. Make this interface go through the PAT reserve/free, instead of hardcoding WC mapping. This makes sure that there are no aliases due to unconditional WC setting. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:16 -07:00
Venkatesh Pallipadi	9fd126bc74	x86, pat: New i/f for driver to request memtype for IO regions Add new routines to request memtype for IO regions. This will currently be a backend for io_mapping_* routines. But, it can also be made available to drivers directly in future, in case it is needed. reserve interface reserves the memory, makes sure we have a compatible memory type available and keeps the identity map in sync when needed. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:10 -07:00

... 3 4 5 6 7 ...

38466 Commits