linux

Author	SHA1	Message	Date
Linus Torvalds	b840d79631	Merge branch 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits) x86: export vector_used_by_percpu_irq x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and() sched: nominate preferred wakeup cpu, fix x86: fix lguest used_vectors breakage, -v2 x86: fix warning in arch/x86/kernel/io_apic.c sched: fix warning in kernel/sched.c sched: move test_sd_parent() to an SMP section of sched.h sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 sched: activate active load balancing in new idle cpus sched: bias task wakeups to preferred semi-idle packages sched: nominate preferred wakeup cpu sched: favour lower logical cpu number for sched_mc balance sched: framework for sched_mc/smt_power_savings=N sched: convert BALANCE_FOR_xx_POWER to inline functions x86: use possible_cpus=NUM to extend the possible cpus allowed x86: fix cpu_mask_to_apicid_and to include cpu_online_mask x86: update io_apic.c to the new cpumask code x86: Introduce topology_core_cpumask()/topology_thread_cpumask() x86: xen: use smp_call_function_many() x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c ... Fixed up trivial conflict in kernel/time/tick-sched.c manually	2009-01-02 11:44:09 -08:00
Linus Torvalds	5f34fe1cfc	Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits) stacktrace: provide save_stack_trace_tsk() weak alias rcu: provide RCU options on non-preempt architectures too printk: fix discarding message when recursion_bug futex: clean up futex_(un)lock_pi fault handling "Tree RCU": scalable classic RCU implementation futex: rename field in futex_q to clarify single waiter semantics x86/swiotlb: add default swiotlb_arch_range_needs_mapping x86/swiotlb: add default phys<->bus conversion x86: unify pci iommu setup and allow swiotlb to compile for 32 bit x86: add swiotlb allocation functions swiotlb: consolidate swiotlb info message printing swiotlb: support bouncing of HighMem pages swiotlb: factor out copy to/from device swiotlb: add arch hook to force mapping swiotlb: allow architectures to override phys<->bus<->phys conversions swiotlb: add comment where we handle the overflow of a dma mask on 32 bit rcu: fix rcutorture behavior during reboot resources: skip sanity check of busy resources swiotlb: move some definitions to header swiotlb: allow architectures to override swiotlb pool allocation ... Fix up trivial conflicts in arch/x86/kernel/Makefile arch/x86/mm/init_32.c include/linux/hardirq.h as per Ingo's suggestions.	2008-12-30 16:10:19 -08:00
Sebastien Dugue	b906cfa397	powerpc/pseries: Fix cpu hotplug Currently, pseries_cpu_die() calls msleep() while polling RTAS for the status of the dying cpu. However, if the cpu that is going down also happens to be the one doing the tick then we're hosed as the tick_do_timer_cpu 'baton' is only passed later on in tick_shutdown() when _cpu_down() does the CPU_DEAD notification. Therefore jiffies won't be updated anymore. This replaces that msleep() with a cpu_relax() to make sure we're not going to schedule at that point. With this patch my test box survives a 100k iterations hotplug stress test on _all_ cpus, whereas without it, it quickly dies after ~50 iterations. Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-23 15:13:27 +11:00
Brian King	fecba96268	powerpc: Add reboot notifier to Collaborative Memory Manager When running Active Memory Sharing, pages can get marked as "loaned" with the hypervisor by the CMM driver. This state gets cleared by the system firmware when rebooting the partition. When using kexec to boot a new kernel, this state never gets cleared and the hypervisor and CMM driver can get out of sync with respect to the number of pages currently marked "loaned". Fix this by adding a reboot notifier to the CMM driver to deflate the balloon and mark all pages as active. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:15 +11:00
Brian King	2218108e18	powerpc: Disable Collaborative Memory Manager for kdump When running Active Memory Sharing, the Collaborative Memory Manager (CMM) may mark some pages as "loaned" with the hypervisor. Periodically, the CMM will query the hypervisor for a loan request, which is a single signed value. When kexec'ing into a kdump kernel, the CMM driver in the kdump kernel is not aware of the pages the previous kernel had marked as "loaned", so the hypervisor and the CMM driver are out of sync. This results in the CMM driver getting a negative loan request, which can then get treated as a large unsigned value and can cause kdump to hang due to the CMM driver inflating too large. Since there really is no clean way for the CMM driver in the kdump kernel to clean this up, simply disable CMM in the kdump kernel. This fixes hangs we were seeing doing kdump with AMS. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:15 +11:00
Tony Breeds	532774ec7f	powerpc: Pass a valid token to rtas_call() in phyp-dump code ibm_configure_kernel_dump is passed as the token to rtas_call() is never initialised. This sets it to something sane. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Acked-by: Nathan Lynch <ntl@pobox.com> Acked-by: Manish Ahuja <mahujam@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:15 +11:00
Tony Breeds	7a2eab0d4e	powerpc: Protect against NULL pointer deref in phyp-dump code print_dump_header() will be called at least once with a NULL pointer in a normal boot sequence. If DEBUG is defined then we will dereference the pointer and crash. Add a quick fix to exit early in the NULL pointer case. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Acked-by: Manish Ahuja <mahujam@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:14 +11:00
Paul E. McKenney	64db4cfff9	"Tree RCU": scalable classic RCU implementation This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 21:56:04 +01:00
Nathan Lynch	edc72ac4a0	powerpc/pseries: Check for GIQ indicator before calling set-indicator Since "Factor out cpu joining/unjoining the GIQ" (`b4963255ad`) the WARN_ON in xics_set_cpu_giq() is being triggered during boot on JS20 because the GIQ indicator is not available on that platform. While the warning is harmless and the system runs normally, it's nicer to check for the existence of the indicator before trying to manipulate it. Implement rtas_indicator_present(), which searches the /rtas/rtas-indicators property for the given indicator token, and use this function in xics_set_cpu_giq(). Also use a WARN statement in xics_set_cpu_giq to get better information on failure. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-16 15:53:13 +11:00
Rusty Russell	0de26520c7	cpumask: make irq_set_affinity() take a const struct cpumask Impact: change existing irq_chip API Not much point with gentle transition here: the struct irq_chip's setaffinity method signature needs to change. Fortunately, not widely used code, but hits a few architectures. Note: In irq_select_affinity() I save a temporary in by mangling irq_desc[irq].affinity directly. Ingo, does this break anything? (Folded in fix from KOSAKI Motohiro) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: jeremy@xensource.com Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>	2008-12-13 21:20:26 +10:30
Rusty Russell	29c0177e6a	cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers. Impact: change calling convention of existing cpumask APIs Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected. These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: paulus@samba.org Cc: mingo@redhat.com Cc: tony.luck@intel.com Cc: ralf@linux-mips.org Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: cl@linux-foundation.org Cc: srostedt@redhat.com	2008-12-13 21:20:25 +10:30
Benjamin Herrenschmidt	fd6852c8fa	powerpc/pci: Fix various pseries PCI hotplug issues The pseries PCI hotplug code has a number of issues, ranging from incorrect resource setup to crashes, depending on what is added, when, whether it contains a bridge, etc etc.... This fixes a whole bunch of these, while actually simplifying the code a bit, using more generic code in the process and factoring out common code between adding of a PHB, a slot or a device. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-06 09:31:52 +11:00
Benjamin Herrenschmidt	57b066ff4e	powerpc/eeh: Make EEH device add/remove more robust To properly fix PCI hotplug, it's useful to be able to make the fixup passes on all devices whether they were just hot plugged or already there. The EEH code however used to not be very friendly with calling eeh_add_device_late() multiple time, and not very rebust in the way it generally tests whether a device is in the expected state vs. the EEH code. This improves it, along with cleaning up a couple of debug printk's. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-06 09:25:15 +11:00
Sebastien Dugue	1ef8014deb	powerpc/pseries: Fix getting the server number size The 'ibm,interrupt-server#-size' properties are not in the cpu nodes, which is where we currently look for them, but rather live under the interrupt source controller nodes (which have "ibm,ppc-xics" in their compatible property). This moves the code that looks for the ibm,interrupt-server#-size properties from xics_update_irq_servers() into xics_init_IRQ(). Also this adds a check for mismatched sizes across the interrupt source controller nodes. Not sure this is necessary as in this case the firmware might be seriously busted. This property only appears on POWER6 boxes and is only used in the set-indicator(gqirm) call, and apparently firmware currently ignores the value we pass. Nevertheless we need to fix it in case future firmware versions use it. Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-05 22:08:28 +11:00
Stephen Rothwell	454666eb78	powerpc: Fix "unused variable" warning in pci_dlpar.c This gets rid of this build warning: arch/powerpc/platforms/pseries/pci_dlpar.c: In function 'init_phb_dynamic': arch/powerpc/platforms/pseries/pci_dlpar.c:192: warning: unused variable 'b' This is one of the very few warnings left in a ppc64_defconfig build and getting rid of it will make it easier to see future introduced ones (in fact this was introduced very recently). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-05 19:59:08 +11:00
Nathan Fontenot	e90a131846	powerpc/pci: Properly allocate bus resources for hotplug PHBs Resources for PHB's that are dynamically added to a system are not properly allocated in the resource tree. Not having these resources allocated causes an oops when removing the PHB when we try to release them. The diff appears a bit messy, this is mainly due to moving everything one tab to the left in the pcibios_allocate_bus_resources routine. The functionality change in this routine is only that the list_for_each_entry() loop is pulled out and moved to the necessary calling routine. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-10-31 16:12:03 +11:00
Milton Miller	62a8bd6c92	powerpc: Use is_kdump_kernel() linux/crash_dump.h defines is_kdump_kernel() to be used by code that needs to know if the previous kernel crashed instead of a (clean) boot or reboot. This updates the just added powerpc code to use it. This is needed for the next commit, which will remove __kdump_flag. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-10-31 16:11:47 +11:00
Mohan Kumar M	54622f10a6	powerpc: Support for relocatable kdump kernel This adds relocatable kernel support for kdump. With this one can use the same regular kernel to capture the kdump. A signature (0xfeed1234) is passed in r6 from panic code to the next kernel through kexec_sequence and purgatory code. The signature is used to differentiate between kdump kernel and non-kdump kernels. The purgatory code compares the signature and sets the __kdump_flag in head_64.S. During the boot up, kernel code checks __kdump_flag and if it is set, the kernel will behave as relocatable kdump kernel. This kernel will boot at the address where it was loaded by kexec-tools ie. at the address reserved through crashkernel boot parameter. CONFIG_CRASH_DUMP depends on CONFIG_RELOCATABLE option to build kdump kernel as relocatable. So the same kernel can be used as production and kdump kernel. This patch incorporates the changes suggested by Paul Mackerras to avoid GOT use and to avoid two copies of the code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Mohan Kumar M <mohan@in.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-22 15:01:22 +11:00
Milton Miller	6a75a6b8e8	powerpc: Use cpu_thread_in_core in smp_init for of_spin_map We used to assume that even numbered threads were the primary threads, ie those that would be listed and started as a cpu from open firmware. Replace a left over is even (% 2) check with a check for it being a primary thread and update the comments. Tested with a debug print on pseries, identical code found for cell. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-21 15:19:12 +11:00
Nathan Fontenot	04badfd293	powerpc/pseries: Validate PFN in pseries_remove_lmb() The pfn of the memory to be removed should be validated prior to attempting to remove the memory. In cases where the probe of a memory section fails during hotplug add, the pfn for the lmb may not be valid. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-21 15:17:47 +11:00
Benjamin Herrenschmidt	6dc6472581	Merge commit 'origin' Manual fixup of conflicts on: arch/powerpc/include/asm/dcr-regs.h drivers/net/ibm_newemac/core.h	2008-10-15 11:31:54 +11:00
Milton Miller	199f45c45e	powerpc/xics: Reduce and comment xics IPI use of memory barriers A single full sync (mb()) is requrired to order the mmio to the qirr reg with the set or clear of the message word. However, test_and_clear_bit has the effect of smp_mb() and we are not doing any other io from here, so we don't need a mb per bit processed. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:19 +11:00
Milton Miller	2172fe8704	powerpc/xics: Make printk format strings fit on one line Several printks were broken at word boundaries for line length. Some even referred to old function names. Using __func__ and changing the text slightly for the format allows these printk formats to fit on one line. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:19 +11:00
Milton Miller	d879f3849c	powerpc/xics: Mark xics IPI interrupt as per-cpu It is physically per-cpu, and we want the irq layer to treat it that way. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:19 +11:00
Milton Miller	1a57c926b6	powerpc/xics: EOI xics ipi by hand in kexec EOI normally has the side effect of returning the cpu to the base priority to recieve the next interrupt. This is actually controlled by the top byte of the xirr register. When we are exiting the kernel in kexec we must eoi the ipi for the next kernel because we never return from the handler, but we want to leave interrupt delivery blocked until the next kernel takes action. Since the hardware ipi vector is fixed, its easiest to just do the eoi explicitly. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:18 +11:00
Milton Miller	b4963255ad	powerpc/xics: Factor out cpu joining/unjoining the GIQ This factors out processors joining and unjoining the Global Interrupt Queue into a separate function. There is a bit of math to calculate the arguments to rtas to join or leave the global interrupt queue, and a warning on failure afterwards. Make a helper for the 3 callers. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:18 +11:00
Milton Miller	a244a957ab	powerpc/xics: Initialization code cleanups We only need to check the ibm,interrupt-server#-size property once, not once per global server and thread. We can use !CONFIG_SMP cpu masks and hard_smp_processor_id() to avoid an ifdef. Put the node when breaking out of the loop on lpar systems. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:18 +11:00
Milton Miller	188bdddd24	powerpc/xics: Trim #include list Trim unneeded includes from xics.c. We don't use signals or gfp flags, we use only OF functions and don't need prom, and the 8259 is now handled by our caller. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:17 +11:00
Milton Miller	9dc2d44113	powerpc/xics: Change *_xirr_info_set() prototype to avoid casts The xirr is 32 bits in hardware, but the hypervisor requries the upper bits of the register to be clear on the hcall. By changing the type from signed to unsigned int we can drop masking it back to 32 bits. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:17 +11:00
Milton Miller	0641cc91b0	powerpc/xics: Rearrange file to group code by function Now that xics_update_irq_servers is called only from init and hotplug code, it becomes possible to clean up the ordering of functions in the file, grouping them but the interfaces they implement. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:17 +11:00
Milton Miller	d13f7208b2	powerpc/xics: Consolidate ipi message encode and decode xics supports only one ipi per cpu, and expects software to use some queue to know why the interrupt was sent. In Linux, we use a an array of bitmaps indexed by cpu to identify the message. Currently the bits are set in smp.c and decoded in xics.c, with the data structure in a header file. Consolidate the code in xics.c similar to mpic and other interrupt controllers. Also, while making the the array static, the message word doesn't need to be volatile as set_bit and test_clear_bit take care of it for us, and put it under ifdef smp. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 16:24:16 +11:00
Milton Miller	302905a347	powerpc/xics: Update default_server during migrate_irqs_away Currently, every time we determine which irq server to use, we check if default_server, which is the id of the bootcpu, is still online. But default_server is a hardware cpu, not the logical cpu id needed to index cpu_online_map. Since the default server can only go offline during a cpu hotplug event, explicitly check the default server and choose the new one when we move irqs away from the cpu being offlined. This has the added benefit of only needing the boot_cpuid to be updated and not relying on the cpu being marked offline during migrate_irqs_away. Also, since xics_update_irq_servers only reads device tree information, we can call it before xics_init_host in xics_init_IRQ and then default_server will always be valid when we can reach get_irq_server via the host ops. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 10:55:47 +11:00
Milton Miller	8767e9badc	powerpc/xics: EOI unmapped irqs after disabling them When reciving an irq vector that does not have a linux mapping, the kernel prints a message and calls RTAS to disable the irq source. Previously the kernel did not EOI the interrupt, causing the source to think it is still being processed by software. While this does add an additional layer of protection against interrupt storms had RTAS failed to disable the source, it also prevents the interrupt from working when a driver later enables it. (We could alternatively send an EOI on startup, but that strategy would likely fail on an emulated xics.) All interrupts should be disabled when the kernel starts, but this can be observed if a driver does not shutdown an interrupt in its reboot hook before starting a new kernel with kexec. Michael reports this can be reproduced trivially by banging the keyboard while kexec'ing on a P5 LPAR: even though the hvc_console driver request's the console irq later in boot, the console is non-functional because we're receiving no console interrupts. Reported-By: Michael Ellerman Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-13 10:55:47 +11:00
Nathan Fontenot	9fd3f88cb6	powerpc: Oops in pseries_lmb_remove() Testing hotplug memory remove has revealed that we can oops in pseries_lmb_remove(). The incorrect shift causes a NULL pointer dereference in the page_zone() inline routine. I have only been able to reproduce the oops on kernels with large pages enabled. Tested on Power5 and Power6 with and without large pages enabled. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-10 15:55:17 +11:00
Vitaly Mayatskikh	76c31f239e	powerpc: Honor O_NONBLOCK flag when reading RTAS log rtas_log_read() doesn't check file flags for O_NONBLOCK and blocks non-blocking readers of /proc/ppc64/rtas/error_log when there is no data available. This fixes it. Also rtas_log_read() returns now with ENODATA to prevent suspending of process in wait_event_interruptible() when logging facility was switched off and log is already empty. Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-07 14:26:21 +11:00
Sebastien Dugue	967e012ef3	powerpc: Separate the irq radix tree insertion and lookup irq_radix_revmap() currently serves 2 purposes, irq mapping lookup and insertion which happen in interrupt and process context respectively. Separate the function into its 2 components, one for lookup only and one for insertion only. Fix the only user of the revmap tree (XICS) to use the new functions. Also, move the insertion into the radix tree of those irqs that were requested before it was initialized at said tree initialization. Mutual exclusion between the tree initialization and readers/writers is handled via a state variable (revmap_trees_allocated) set to 1 when the tree has been initialized and set to 2 after the already requested irqs have been inserted in the tree by the init path. This state is checked before any reader or writer access just like we used to check for tree.gfp_mask != 0 before. Finally, now that we're not any longer inserting nodes into the radix-tree in interrupt context, turn the GFP_ATOMIC allocations into GFP_KERNEL ones. Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-09-15 11:08:44 -07:00
Nathan Fontenot	525c411d40	powerpc: Check rc of notifier chain for memory remove The return code from invocation of the notifier for pSeries_reconfig_chain during update of the device tree is not checked. This causes writes to /proc/ppc64/ofdt to update memory properties (i.e. ibm,dyamic-reconfiguration-memory) to always return success, instead of the result of the notifier chain. This happens specifically when we remove/add memory from the device tree on machines using memory specified in the ibm,dynamic-reconfiguration-memory property of the device tree. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-09-15 11:07:52 -07:00
Paul Mackerras	7e392f8c29	Merge branch 'linux-2.6'	2008-09-10 11:36:13 +10:00
Adrian Bunk	9d5a9e7465	Remove asm/a.out.h files for all architectures without a.out support. This patch also includes the required removal of (unused) inclusion of <asm/a.out.h> <linux/a.out.h>'s in the arch/ code for these architectures. [dwmw2: updated for 2.6.27-rc] Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-09-06 19:30:24 +01:00
Andrew Morton	d617a40227	powerpc: Export CMO_PageSize This fixes an error building powerpc allmodconfig: ERROR: "CMO_PageSize" [arch/powerpc/platforms/pseries/cmm.ko] undefined! Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-08-26 10:24:47 +10:00
Tony Breeds	dcfcfe7567	powerpc: Guard print_device_node_tree() with #if 0 Currently print_device_node_tree() isn't called but it can be useful for debugging. Leave the function there but hide it behind '#if 0' to save it being rewritten. If you want to call it you're already editing this file anyway. ;P Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-08-20 16:34:57 +10:00
Harvey Harrison	5df72bf3f7	powerpc: Replace __FUNCTION__ with __func__ __FUNCTION__ is gcc-specific, use __func__ [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-08-20 16:34:57 +10:00
Brian King	370e4587d0	powerpc: Fix CMM page loaning on 64k page kernel with 4k hardware pages If the firmware page size used for collaborative memory overcommit is 4k, but the kernel is using 64k pages, the page loaning is currently broken as it only marks the first 4k page of each 64k page as loaned. This fixes this to iterate through each 4k page and mark them all as loaned/active. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-08-18 14:22:35 +10:00
Robert Jennings	81f14997e8	powerpc: Make CMO paging space pool ID and page size available During platform setup, save off the primary/secondary paging space pool IDs and the page size. Added accessors in hvcall.h for these variables. This is needed for a subsequent fix. Submitted-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-08-18 14:22:34 +10:00
Stephen Rothwell	3cee67f779	powerpc/pseries: Fix CMO sysdev attribute API change fallout Noticed due to these wanings: arch/powerpc/platforms/pseries/cmm.c:298: warning: initialization from incompatible pointer type arch/powerpc/platforms/pseries/cmm.c:299: warning: initialization from incompatible pointer type arch/powerpc/platforms/pseries/cmm.c:320: warning: initialization from incompatible pointer type arch/powerpc/platforms/pseries/cmm.c:320: warning: initialization from incompatible pointer type Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-28 16:30:50 +10:00
Robert Jennings	6490c4903d	powerpc/pseries: iommu enablement for CMO To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error. Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build_pSeriesLP functions need to change to indicate failure. all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:43 +10:00
Brian King	84af458bb2	powerpc/pseries: Add collaborative memory manager Adds a collaborative memory manager, which acts as a simple balloon driver for System p machines that support cooperative memory overcommitment (CMO). Adds a platform configuration option for CMO called PPC_SMLPAR. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Brian King	86630a3232	powerpc/pseries: Utilities to set firmware page state Newer versions of firmware support page states, which are used by the collaborative memory manager (future patch) to "loan" pages to the hypervisor for use by other partitions. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Robert Jennings	e46de429cb	powerpc/pseries: Enable CMO feature during platform setup For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO flag in powerpc_firmware_features from the rtas ibm,get-system-parameters table prior to calling iommu_init_early_pSeries. With this, any CMO specific functionality can be controlled by checking: firmware_has_feature(FW_FEATURE_CMO) Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Mike Mason	f36c5227cd	powerpc/eeh: Don't panic when EEH_MAX_FAILS is exceeded This patch changes the EEH_MAX_FAILS action from panic to printing an error message. Panicking under under this condition is too harsh. Although performance will be affected and the device may not recover, the system is still running, which at the very least should allow for a more graceful shutdown. The patch also removes the msleep() within a spinlock, which can lead to a deadlock and is not recommended. Signed-off-by: Mike Mason <mmlnx@us.ibm.com> Acked-by: Linas Vepstas <linasvepstas@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-22 10:39:37 +10:00

1 2 3 4 5 ...

429 Commits