linux

Author	SHA1	Message	Date
Vaidyanathan Srinivasan	5742bd8595	powerpc: Add support for new hcall H_BEST_ENERGY Create sysfs interface to export data from H_BEST_ENERGY hcall that can be used by administrative tools on supported pseries platforms for energy management optimizations. sys/device/system/cpu/pseries_(de)activate_hint_list and sys/device/system/cpu/cpuN/pseries_(de)activate_hint will provide hints for activation and deactivation of cpus respectively. These hints are abstract number given by the hypervisor based on the extended knowledge the hypervisor has regarding the system topology and resource mappings. The activate and the deactivate sysfs entry is for the two distinct operations that we could do for energy savings. When we have more capacity than required, we could deactivate few core to save energy. The choice of the core to deactivate will be based on /sys/devices/system/cpu/deactivate_hint_list. The comma separated list of cpus (cores) will be the preferred choice. If we have to activate some of the deactivated cores, then /sys/devices/system/cpu/activate_hint_list will be used. The per-cpu file /sys/device/system/cpu/cpuN/pseries_(de)activate_hint further provide more fine grain information by exporting the value of the hint itself. Added new driver module arch/powerpc/platforms/pseries/pseries_energy.c under new config option CONFIG_PSERIES_ENERGY Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-11-29 15:48:19 +11:00
Will Schmidt	4e89a2d8e2	powerpc/pseries: Add kernel parameter to disable batched hcalls This introduces a pair of kernel parameters that can be used to disable the MULTITCE and BULK_REMOVE h-calls. By default, those hcalls are enabled, active, and good for throughput and performance. The ability to disable them will be useful for some of the PREEMPT_RT related investigation and work occurring on Power. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> cc: Olof Johansson <olof@lixom.net> cc: Anton Blanchard <anton@samba.org> cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-11-29 15:48:18 +11:00
Nishanth Aravamudan	01cf6fe855	powerpc/pseries: Don't override CONFIG_PPC_PSERIES_DEBUG EEH and pci_dlpar #undef DEBUG, but I think they were added before the ability to control this from Kconfig. It's really annoying to only get some of the debug messages from these files. Leave the lpar.c #undef alone as it produces so much output as to make the kernel unusable. Update the Kconfig text to indicate this particular quirk :) Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-11-18 14:54:22 +11:00
Lionel Debroux	2f55ac072f	suspend: constify platform_suspend_ops While at it, fix two checkpatch errors. Several non-const struct instances constified by this patch were added after the introduction of platform_suspend_ops in checkpatch.pl's list of "should be const" structs (`79404849e9`). Patch against mainline. Inspired by hunks of the grsecurity patch, updated for newer kernels. Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-11-16 14:14:02 +01:00
matt mooney	6d2ad1e318	powerpc: remove cast from void* Unnecessary cast from void* in assignment. Signed-off-by: matt mooney <mfm@muteddisk.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-11-03 10:23:26 -04:00
Linus Torvalds	092e0e7e52	Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek	2010-10-22 10:52:56 -07:00
Linus Torvalds	d4429f608a	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (71 commits) powerpc/44x: Update ppc44x_defconfig powerpc/watchdog: Make default timeout for Book-E watchdog a Kconfig option fsl_rio: Add comments for sRIO registers. powerpc/fsl-booke: Add e55xx (64-bit) smp defconfig powerpc/fsl-booke: Add p5020 DS board support powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips powerpc/fsl-booke: Add support for FSL Arch v1.0 MMU in setup_page_sizes powerpc/fsl-booke: Add support for FSL 64-bit e5500 core powerpc/85xx: add cache-sram support powerpc/85xx: add ngPIXIS FPGA device tree node to the P1022DS board powerpc: Fix compile error with paca code on ppc64e powerpc/fsl-booke: Add p3041 DS board support oprofile/fsl emb: Don't set MSR[PMM] until after clearing the interrupt. powerpc/fsl-booke: Add PCI device ids for P2040/P3041/P5010/P5020 QoirQ chips powerpc/mpc8xxx_gpio: Add support for 'qoriq-gpio' controllers powerpc/fsl_booke: Add support to boot from core other than 0 powerpc/p1022: Add probing for individual DMA channels powerpc/fsl_soc: Search all global-utilities nodes for rstccr powerpc: Fix invalid page flags in create TLB CAM path for PTE_64BIT powerpc/mpc83xx: Support for MPC8308 P1M board ... Fix up conflict with the generic irq_work changes in arch/powerpc/kernel/time.c	2010-10-21 21:19:54 -07:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
matt mooney	4108d9ba90	powerpc/Makefiles: Change to new flag variables Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y. Signed-off-by: matt mooney <mfm@muteddisk.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-10-13 16:19:22 +11:00
Nishanth Aravamudan	f56029b5ea	powerpc/pseries/xics: Use cpu_possible_mask rather than cpu_all_mask Current firmware only allows us to send IRQs to the first processor or all processors. We currently check to see if the passed in mask is equal to the all_mask, but the firmware is only considering whether the request is for the equivalent of the possible_mask. Thus, we think the request is for some subset of CPUs and only assign IRQs to the first CPU (on systems without irqbalance running) as evidenced by /proc/interrupts. By using possible_mask instead, we account for this and proper interleaving of interrupts occurs. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-10-13 16:19:22 +11:00
Nishanth Aravamudan	e72ed6b509	powerpc/pseries: Use kmemdup While looking at some code paths I came across this code that zeros memory then copies over the entire length. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-10-13 16:19:22 +11:00
Nathan Fontenot	410bccf978	powerpc/pseries: Partition migration in the kernel Enable partition migration in the kernel. To do this a new sysfs file, /sys/kernel/mobility/migration, is created. In order to initiate a migration the stream id (generated by the HMC managing the system) is written to this file. After a migration occurs, and what is the majority of this code, the device tree needs to be updated for the new system the partition is running on. This is done via the ibm,update-nodes and ibm,update-properties rtas calls which return information regarding which nodes and properties of the device tree are to be added/removed/updated. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-10-13 16:19:03 +11:00
Nathan Fontenot	206489748b	powerpc/pseries: Export device tree updating routines Export routines associated with adding and removing device tree nodes on pseries needed for device tree updating. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-10-13 16:19:02 +11:00
Thomas Gleixner	1c9db52534	pci: Convert msi to new irq_chip functions Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Russell King <linux@arm.linux.org.uk>	2010-10-12 16:53:34 +02:00
Paul Mackerras	872e439a45	powerpc/pseries: Re-enable dispatch trace log userspace interface Since the cpu accounting code uses the hypervisor dispatch trace log now when CONFIG_VIRT_CPU_ACCOUNTING = y, the previous commit disabled access to it via files in the /sys/kernel/debug/powerpc/dtl/ directory in that case. This restores those files. To do this, we now have a hook that the cpu accounting code will call as it processes each entry from the hypervisor dispatch trace log. The code in dtl.c now uses that to fill up its ring buffer, rather than having the hypervisor fill the ring buffer directly. This also fixes dtl_file_read() to handle overflow conditions a bit better and adds a spinlock to ensure that race conditions (multiple processes opening or reading the file concurrently) are handled correctly. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-09-02 14:07:32 +10:00
Paul Mackerras	cf9efce0ce	powerpc: Account time using timebase rather than PURR Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the PURR register for measuring the user and system time used by processes, as well as other related times such as hardirq and softirq times. This turns out to be quite confusing for users because it means that a program will often be measured as taking less time when run on a multi-threaded processor (SMT2 or SMT4 mode) than it does when run on a single-threaded processor (ST mode), even though the program takes longer to finish. The discrepancy is accounted for as stolen time, which is also confusing, particularly when there are no other partitions running. This changes the accounting to use the timebase instead, meaning that the reported user and system times are the actual number of real-time seconds that the program was executing on the processor thread, regardless of which SMT mode the processor is in. Thus a program will generally show greater user and system times when run on a multi-threaded processor than on a single-threaded processor. On pSeries systems on POWER5 or later processors, we measure the stolen time (time when this partition wasn't running) using the hypervisor dispatch trace log. We check for new entries in the log on every entry from user mode and on every transition from kernel process context to soft or hard IRQ context (i.e. when account_system_vtime() gets called). So that we can correctly distinguish time stolen from user time and time stolen from system time, without having to check the log on every exit to user mode, we store separate timestamps for exit to user mode and entry from user mode. On systems that have a SPURR (POWER6 and POWER7), we read the SPURR in account_system_vtime() (as before), and then apportion the SPURR ticks since the last time we read it between scaled user time and scaled system time according to the relative proportions of user time and system time over the same interval. This avoids having to read the SPURR on every kernel entry and exit. On systems that have PURR but not SPURR (i.e., POWER5), we do the same using the PURR rather than the SPURR. This disables the DTL user interface in /sys/debug/kernel/powerpc/dtl for now since it conflicts with the use of the dispatch trace log by the time accounting code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-09-02 14:07:31 +10:00
Paul Mackerras	8154c5d22d	powerpc: Abstract indexing of lppaca structs Currently we have the lppaca structs as a simple array of NR_CPUS entries, taking up space in the data section of the kernel image. In future we would like to allocate them dynamically, so this abstracts out the accesses to the array, making it easier to change how we locate the lppaca for a given cpu in future. Specifically, lppaca[cpu] changes to lppaca_of(cpu). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-09-02 14:07:31 +10:00
Nathan Fontenot	93f68f1ef7	powerpc/pseries: Correct rtas_data_buf locking in dlpar code The dlpar code can cause a deadlock to occur when making the RTAS configure-connector call. This occurs because we make kmalloc calls, which can block, while parsing the rtas_data_buf and holding the rtas_data_buf_lock. This an cause issues if someone else attempts to grab the rtas_data_bug_lock. This patch alleviates this issue by copying the contents of the rtas_data_buf to a local buffer before parsing. This allows us to only hold the rtas_data_buf_lock around the RTAS configure-connector calls. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-09-02 10:07:38 +10:00
Anton Blanchard	7aa241fdce	powerpc: Fix bogus it_blocksize in VIO iommu code When looking at some issues with the virtual ethernet driver I noticed that TCE allocation was following a very strange pattern: address 00e9000 length 2048 address 0409000 length 2048 <----- address `0429000` length 2048 address 0449000 length 2048 address 0469000 length 2048 address 0489000 length 2048 address 04a9000 length 2048 address 04c9000 length 2048 address 04e9000 length 2048 address 4009000 length 2048 <----- address 4029000 length 2048 Huge unexplained gaps in what should be an empty TCE table. It turns out it_blocksize, the amount we want to align the next allocation to, was c0000000fe903b20. Completely bogus. Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc everywhere to protect against this when we next add a non compulsary field to iommu code and forget to initialise it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-08-24 15:26:31 +10:00
Nathan Fontenot	954e6da54b	powerpc: Correct smt_enabled=X boot option for > 2 threads per core The 'smt_enabled=X' boot option does not handle values of X > 2. For Power 7 processors with smt modes of 0,1,2,3, and 4 this does not work. This patch allows the smt_enabled option to be set to any value limited to a max equal to the number of threads per core. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-08-24 15:26:30 +10:00
Signed-off-by: Darren Hart	1afb56cf97	powerpc: Silence xics_migrate_irqs_away() during cpu offline All IRQs are migrated away from a CPU that is being offlined so the following messages suggest a problem when the system is behaving as designed: IRQ 262 affinity broken off cpu 1 IRQ 17 affinity broken off cpu 0 IRQ 18 affinity broken off cpu 0 IRQ 19 affinity broken off cpu 0 IRQ 256 affinity broken off cpu 0 IRQ 261 affinity broken off cpu 0 IRQ 262 affinity broken off cpu 0 Don't print these messages when the CPU is not online. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-08-24 15:26:30 +10:00
Linus Torvalds	cdd854bc42	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (79 commits) powerpc/8xx: Add support for the MPC8xx based boards from TQC powerpc/85xx: Introduce support for the Freescale P1022DS reference board powerpc/85xx: Adding DTS for the STx GP3-SSA MPC8555 board powerpc/85xx: Change deprecated binding for 85xx-based boards powerpc/tqm85xx: add a quirk for ti1520 PCMCIA bridge powerpc/tqm85xx: update PCI interrupt-map attribute powerpc/mpc8308rdb: support for MPC8308RDB board from Freescale powerpc/fsl_pci: add quirk for mpc8308 pcie bridge powerpc/85xx: Cleanup QE initialization for MPC85xxMDS boards powerpc/85xx: Fix booting for P1021MDS boards powerpc/85xx: Fix SWIOTLB initalization for MPC85xxMDS boards powerpc/85xx: kexec for SMP 85xx BookE systems powerpc/5200/i2c: improve i2c bus error recovery of/xilinxfb: update tft compatible versions powerpc/fsl-diu-fb: Support setting display mode using EDID powerpc/5121: doc/dts-bindings: update doc of FSL DIU bindings powerpc/5121: shared DIU framebuffer support powerpc/5121: move fsl-diu-fb.h to include/linux powerpc/5121: fsl-diu-fb: fix issue with re-enabling DIU area descriptor powerpc/512x: add clock structure for Video-IN (VIU) unit ...	2010-08-05 09:03:46 -07:00
Benjamin Herrenschmidt	412a4ac5e9	Merge commit 'gcl/next' into next	2010-08-04 10:26:03 +10:00
Robert Jennings	ceddee23be	powerpc: ONLINE to OFFLINE CPU state transition during removal If a CPU remove is attempted using the 'release' interface on hardware which supports extended cede, the CPU will be put in the INACTIVE state rather than the OFFLINE state due to the default preferred_offline_state in that situation. In the INACTIVE state it will fail to be removed. This patch changes the preferred offline state to OFFLINE when an CPU is in the ONLINE state. After cpu_down() is called in dlpar_offline_cpu() the CPU will be OFFLINE and CPU removal can continue. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-31 15:05:19 +10:00
Benjamin Herrenschmidt	940ce422a3	powerpc/pseries: Increase cpu die timeout In testing SMT disable, we have been regularly seeing the following message: Querying DEAD? cpu %i (%i) shows %i This indicates the current delay in pseries_cpu_die where we wait for the specified CPU to die, is insufficient. Usually, this does not cause a problem, but we've seen this result in BUG_ON's going off in the timer code when we try to migrate the timers off the dead cpu while a timer is still running. Increasing this delay, as is done in this patch, seems to resolve this issue. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-31 15:04:15 +10:00
Brian King	e9ae9dabfc	powerpc: Remove redundant xics badness warning While testing cpu offlining, we are regularly seeing the WARN_ON go off in xics_ipi_dispatch. It can occur when an IPI gets sent to the CPU while it is going offline. There is already a similar WARN_ON in the handlers for PPC_MSG_CALL_FUNCTION and PPC_MSG_CALL_FUNC_SINGLE, so the warning is not needed in that path. The debugger handler handles this case by simply ignoring IPIs for offline CPUs, so no warning is needed there. And the reschedule IPI, which is what is occurring in our test environment, can be safely ignored, so we can simply remove the WARN_ON from xics_ipi_dispatch. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-31 14:56:31 +10:00
Benjamin Herrenschmidt	3fdfd99051	powerpc: Fix erroneous lmb->memblock conversions Oooops... we missed these. We incorrectly converted strings used when parsing the device-tree on pseries, thus breaking access to drconf memory and hotplug memory. While at it, also revert some variable names that represent something the FW calls "lmb" and thus don't need to be converted to "memblock". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> ---	2010-07-23 12:56:57 +10:00
Yinghai Lu	95f72d1ed4	lmb: rename to memblock via following scripts FILES=$(find * -type f \| grep -vE 'oprofile\|[^K]config') sed -i \ -e 's/lmb/memblock/g' \ -e 's/LMB/MEMBLOCK/g' \ $FILES for N in $(find . -name lmb.[ch]); do M=$(echo $N \| sed 's/lmb/memblock/g') mv $N $M done and remove some wrong change like lmbench and dlmb etc. also move memblock.c from lib/ to mm/ Suggested-by: Ingo Molnar <mingo@elte.hu> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-14 17:14:00 +10:00
Julia Lawall	7405217317	powerpc/pseries: Use kstrdup Use kstrdup when the goal of an allocation is copy a string into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to; expression flag,E1,E2; statement S; @@ - to = kmalloc(strlen(from) + 1,flag); + to = kstrdup(from, flag); ... when != $from = E1 \\| to = E1 $ if (to==NULL \|\| ...) S ... when != $from = E2 \\| to = E2 $ - strcpy(to, from); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-09 11:28:37 +10:00
Mark Nelson	68581e9350	powerpc/pseries: Add WARN_ON() to request_event_sources_irqs() on irq allocation/request failure At the moment if request_event_sources_irqs() can't allocate or request the interrupt, it just does a KERN_ERR printk. This may be fine for the existing RAS code where if we miss an EPOW event it just means that the event won't be logged and if we miss one of the RAS errors then we could miss an event that we perhaps should take action on. But, for the upcoming IO events code that will use event-sources if we can't allocate or request the interrupt it means we'd potentially miss an interrupt from the device. So, let's add a WARN_ON() in this error case so that we're a bit more vocal when something's amiss. While we're at it, also use pr_err() to neaten the code up a bit. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-09 11:28:32 +10:00
Mark Nelson	b08e281b5a	powerpc/pseries: Rename RAS_VECTOR_OFFSET to RTAS_VECTOR_EXTERNAL_INTERRUPT and move to rtas.h The RAS code has a #define, RAS_VECTOR_OFFSET, that's used in the check-exception RTAS call for the vector offset of the exception. We'll be using this same vector offset for the upcoming IO Event interrupts code (0x500) so let's move it to include/asm/rtas.h and call it RTAS_VECTOR_EXTERNAL_INTERRUPT. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-09 11:28:31 +10:00
Kulikov Vasiliy	6901c6cca1	powerpc/pseries/eeh: Use for_each_pci_dev() Use for_each_pci_dev() to simplify the code. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-09 11:28:25 +10:00
Brian King	32d8ad4e62	powerpc/pseries: Partition hibernation support Enables support for HMC initiated partition hibernation. This is a firmware assisted hibernation, since the firmware handles writing the memory out to disk, along with other partition information, so we just mimic suspend to ram. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-07-09 11:26:17 +10:00
Stephen Rothwell	969ea5c5ad	tracing: fix for tracepoint API change Commit `38516ab59f` ("tracing: Let tracepoints have data passed to tracepoint callbacks") requires this fixup to the powerpc code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-28 10:20:14 -07:00
Grant Likely	cf9b59e9d3	Merge remote branch 'origin' into secretlab/next-devicetree Merging in current state of Linus' tree to deal with merge conflicts and build failures in vio.c after merge. Conflicts: drivers/i2c/busses/i2c-cpm.c drivers/i2c/busses/i2c-mpc.c drivers/net/gianfar.c Also fixed up one line in arch/powerpc/kernel/vio.c to use the correct node pointer. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2010-05-22 00:36:56 -06:00
Mark Nelson	32c96f7765	powerpc/pseries: Make request_ras_irqs() available to other pseries code At the moment only the RAS code uses event-sources interrupts (for EPOW events and internal errors) so request_ras_irqs() (which actually requests the event-sources interrupts) is found in ras.c and is static. We want to be able to use event-sources interrupts in other pseries code, so let's rename request_ras_irqs() to request_event_sources_irqs() and move it to event_sources.c. This will be used in an upcoming patch that adds support for IO Event interrupts that come through as event sources. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-21 17:31:12 +10:00
Anton Blanchard	b878dc0059	powerpc: Use smt_snooze_delay=-1 to always busy loop Right now if we want to busy loop and not give up any time to the hypervisor we put a very large value into smt_snooze_delay. This is sometimes useful when running a single partition and you want to avoid any latencies due to the hypervisor or CPU power state transitions. While this works, it's a bit ugly - how big a number is enough now we have NO_HZ and can be idle for a very long time. The patch below makes smt_snooze_delay signed, and a negative value means loop forever: echo -1 > /sys/devices/system/cpu/cpu0/smt_snooze_delay This change shouldn't affect the existing userspace tools (eg ppc64_cpu), but I'm cc-ing Nathan just to be sure. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-21 17:31:12 +10:00
Michael Neuling	d504bed676	powerpc/kexec: Speedup kexec hash PTE tear down Currently for kexec the PTE tear down on 1TB segment systems normally requires 3 hcalls for each PTE removal. On a machine with 32GB of memory it can take around a minute to remove all the PTEs. This optimises the path so that we only remove PTEs that are valid. It also uses the read 4 PTEs at once HCALL. For the common case where a PTEs is invalid in a 1TB segment, this turns the 3 HCALLs per PTE down to 1 HCALL per 4 PTEs. This gives an > 10x speedup in kexec times on PHYP, taking a 32GB machine from around 1 minute down to a few seconds. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-21 17:31:11 +10:00
Michael Neuling	f90ece28c1	powerpc/pseries: Add hcall to read 4 ptes at a time in real mode This adds plpar_pte_read_4_raw() which can be used read 4 PTEs from PHYP at a time, while in real mode. It also creates a new hcall9 which can be used in real mode. It's the same as plpar_hcall9 but minus the tracing hcall statistics which may require variables outside the RMO. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-21 17:31:10 +10:00
Anton Blanchard	ce47c1c45b	powerpc/eeh: Fix oops when probing in early boot If we take an EEH error early enough, we oops: Call Trace: [c000000010483770] [c000000000013ee4] .show_stack+0xd8/0x218 (unreliable) [c000000010483850] [c000000000658940] .dump_stack+0x28/0x3c [c0000000104838d0] [c000000000057a68] .eeh_dn_check_failure+0x2b8/0x304 [c000000010483990] [c0000000000259c8] .rtas_read_config+0x120/0x168 [c000000010483a40] [c000000000025af4] .rtas_pci_read_config+0xe4/0x124 [c000000010483af0] [c00000000037af18] .pci_bus_read_config_word+0xac/0x104 [c000000010483bc0] [c0000000008fec98] .pcibios_allocate_resources+0x7c/0x220 [c000000010483c90] [c0000000008feed8] .pcibios_resource_survey+0x9c/0x418 [c000000010483d80] [c0000000008fea10] .pcibios_init+0xbc/0xf4 [c000000010483e20] [c000000000009844] .do_one_initcall+0x98/0x1d8 [c000000010483ed0] [c0000000008f0560] .kernel_init+0x228/0x2e8 [c000000010483f90] [c000000000031a08] .kernel_thread+0x54/0x70 EEH: Detected PCI bus error on device <null> EEH: This PCI device has failed 1 times in the last hour: EEH: location=U78A5.001.WIH8464-P1 driver= pci addr=0001:00:01.0 EEH: of node=/pci@800000020000209/usb@1 EEH: PCI device/vendor: 00351033 EEH: PCI cmd/status register: 12100146 Unable to handle kernel paging request for data at address 0x00000468 Oops: Kernel access of bad area, sig: 11 [#1] .... NIP [c000000000057610] .rtas_set_slot_reset+0x38/0x10c LR [c000000000058724] .eeh_reset_device+0x5c/0x124 Call Trace: [c00000000bc6bd00] [c00000000005a0e0] .pcibios_remove_pci_devices+0x7c/0xb0 (unreliable) [c00000000bc6bd90] [c000000000058724] .eeh_reset_device+0x5c/0x124 [c00000000bc6be40] [c0000000000589c0] .handle_eeh_events+0x1d4/0x39c [c00000000bc6bf00] [c000000000059124] .eeh_event_handler+0xf0/0x188 [c00000000bc6bf90] [c000000000031a08] .kernel_thread+0x54/0x70 We called rtas_set_slot_reset while scanning the bus and before the pci_dn to pcidev mapping has been created. Since we only need the pcidev to work out the type of reset and that only gets set after the module for the device loads, lets just do a hot reset if the pcidev is NULL. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Linas Vepstas <linasvepstas@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-21 17:31:09 +10:00
Grant Likely	58f9b0b024	of: eliminate of_device->node and dev_archdata->{of,prom}_node This patch eliminates the node pointer from struct of_device and the of_node (or prom_node) pointer from struct dev_archdata since the node pointer is now part of struct device proper when CONFIG_OF is set, and all users of the old pointer locations have already been converted over to use device->of_node. Also remove dev_archdata_{get,set}_node() as it is no longer used by anything. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2010-05-18 16:10:45 -06:00
Benjamin Herrenschmidt	1ed31d6db9	Merge commit 'origin/master' into next	2010-05-07 11:29:25 +10:00
Anton Blanchard	828a69869b	powerpc/cpumask: Update some comments Since the _map cpumask variants are deprecated, change the comments to instead refer to _mask. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-06 17:41:59 +10:00
Anton Blanchard	8729faaa5e	powerpc/cpumask: Convert hotplug-cpu code to new cpumask API Convert hotplug-cpu code to new cpumask API. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-06 17:41:57 +10:00
Anton Blanchard	64fe220c13	powerpc/cpumask: Convert xics driver to new cpumask API Use the new cpumask API and add some comments to clarify how get_irq_server works. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-06 17:41:53 +10:00
Anton Blanchard	af831e1e44	powerpc/cpumask: Convert pseries SMP code to new cpumask API Use new cpumask functions in pseries SMP startup code. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-06 17:41:50 +10:00
Michael Neuling	aef40e87d8	powerpc/pseries: Only call start-cpu when a CPU is stopped Currently we always call start-cpu irrespective of if the CPU is stopped or not. Unfortunatley on POWER7, firmware seems to not like start-cpu being called when a cpu already been started. This was not the case on POWER6 and earlier. This patch checks to see if the CPU is stopped or not via an query-cpu-stopped-state call, and only calls start-cpu on CPUs which are stopped. This fixes a bug with kexec on POWER7 on PHYP where only the primary thread would make it to the second kernel. Reported-by: Ankita Garg <ankita@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-06 16:49:25 +10:00
Michael Neuling	f8b6769182	powerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu This moves query_cpu_stopped() out of the hotplug cpu code and into smp.c so it can called in other places and renames it to smp_query_cpu_stopped(). It also cleans up the return values by adding some #defines Cc: <stable@kernel.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-05-06 16:49:25 +10:00
Benjamin Herrenschmidt	b4a26be9f6	powerpc/pseries: Flush lazy kernel mappings after unplug operations This ensures that the translations for unmapped IO mappings or unmapped memory are properly removed from the MMU hash table before such an unplug. Without this, the hypervisor refuses the unplug operations due to those resources still being mapped by the partition. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-04-28 16:23:24 +10:00
Julia Lawall	43caa61f15	powerpc/pseries/dlpar: Use kasprintf kasprintf combines kmalloc and sprintf, and takes care of the size calculation itself. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression a,flag; expression list args; statement S; @@ a = - $kmalloc\\|kzalloc$(...,flag) + kasprintf(flag,args) <... when != a if (a == NULL \|\| ...) S ...> - sprintf(a,args); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-04-07 18:00:42 +10:00
Julia Lawall	a7df5c5e52	powerpc/pseries/dlpar: Eliminate use after free dlpar_free_cc_nodes frees its argument, so dlpar_online_cpu should not be called on the same value. Skip over the call to dlpar_online_cpu by jumping directly to out. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E,E2; @@ dlpar_free_cc_nodes(E) ... ( E = E2 \| * E ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-04-07 18:00:41 +10:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Vaidyanathan Srinivasan	a8e6da093e	powerpc: Reduce printk from pseries_mach_cpu_die() Remove debug printks in pseries_mach_cpu_die(). These are noisy at runtime. Traceevents can be added to instrument this section of code. The following KERN_INFO printks are removed: cpu 62 (hwid 62) returned from cede. Decrementer value = b2802fff Timebase value = 2fa8f95035f4a cpu 62 (hwid 62) got prodded to go online cpu 58 (hwid 58) ceding for offline with hint 2 Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-03-09 11:57:10 +11:00
Vaidyanathan Srinivasan	0212f2602a	powerpc: Move checks in pseries_mach_cpu_die() Rearrange condition checks for better code readability and prevention of possible race conditions when preferred_offline_state can potentially change during the execution of pseries_mach_cpu_die(). The patch will make pseries_mach_cpu_die() put cpu in one of the consistent states and not hit the run over BUG() Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-03-09 11:57:10 +11:00
Vaidyanathan Srinivasan	8dbce53cc2	powerpc: Reset kernel stack on cpu online from cede state Cpu hotplug (offline) without dlpar operation will place cpu in cede state and the extended_cede_processor() function will return when resumed. Kernel stack pointer needs to be reset before start_secondary() is called to continue the online operation. Added new function start_secondary_resume() to do the above steps. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-03-09 11:57:10 +11:00
Mark Nelson	f09b7b2a11	powerpc/pseries: Pass CPPR value to H_XIRR hcall Now that we properly keep track of the CPPR value (since `49bd364713`, "powerpc/pseries: Track previous CPPR values to correctly EOI interrupts") we can pass it to the H_XIRR hcall. This is needed because the Partition Adjunct Option of new versions of pHyp extend the H_XIRR hcall to include the CPPR as an input parameter. Earlier versions not supporting this option just disregard the extra input parameter, so this doesn't cause any problems for existing systems. The Partition Adjunct Option is required for future systems that will support SR-IOV capable devices. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-03-09 11:55:26 +11:00
Adam Lackorzynski	5b72d74ce2	powerpc: Fix SMP build with disabled CPU hotplugging. Compiling 2.6.33 with SMP enabled and HOTPLUG_CPU disabled gives me the following link errors: LD init/built-in.o LD .tmp_vmlinux1 arch/powerpc/platforms/built-in.o: In function `.smp_xics_setup_cpu': smp.c:(.devinit.text+0x88): undefined reference to `.set_cpu_current_state' smp.c:(.devinit.text+0x94): undefined reference to `.set_default_offline_state' arch/powerpc/platforms/built-in.o: In function `.smp_pSeries_kick_cpu': smp.c:(.devinit.text+0x13c): undefined reference to `.set_preferred_offline_state' smp.c:(.devinit.text+0x148): undefined reference to `.get_cpu_current_state' smp.c:(.devinit.text+0x1a8): undefined reference to `.get_cpu_current_state' make: *** [.tmp_vmlinux1] Error 1 The following change fixes that for me and seems to work as expected. Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-03-09 11:52:52 +11:00
Thomas Gleixner	3d37262828	powerpc: Convert confirm_error_lock to raw_spinlock confirm_error_lock needs to be a real spinlock in RT. Convert it to raw_spinlock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-19 14:52:31 +11:00
Anton Blanchard	fc380c0c8a	powerpc: Remove whitespace in irq chip name fields Now we use printf style alignment there is no need to manually space these fields. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-17 14:02:48 +11:00
Anton Blanchard	fda9d86100	powerpc: Reduce footprint of xics_ipi_struct Right now we allocate a cacheline sized NR_CPUS array for xics IPI communication. Use DECLARE_PER_CPU_SHARED_ALIGNED to put it in percpu data in its own cacheline since it is written to by other cpus. On a kernel with NR_CPUS=1024, this saves quite a lot of memory: text data bss dec hex filename 8767779 2944260 1505724 13217763 c9afe3 vmlinux.irq_cpustat 8767555 2813444 1505724 13086723 c7b003 vmlinux.xics A saving of around 128kB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-17 14:02:48 +11:00
Breno Leitao	8d3d50bf19	powerpc/eeh: Fix a bug when pci structure is null During a EEH recover, the pci_dev structure can be null, mainly if an eeh event is detected during cpi config operation. In this case, the pci_dev will not be known (and will be null) the kernel will crash with the following message: Unable to handle kernel paging request for data at address 0x000000a0 Faulting instruction address: 0xc00000000006b8b4 Oops: Kernel access of bad area, sig: 11 [#1] NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0 LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0 Call Trace: [c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0 [c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70 The bug occurs because pci_name() tries to access a null pointer. This patch just guarantee that pci_name() is not called on Null pointers. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: Linas Vepstas <linasvepstas@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-17 14:02:47 +11:00
Benjamin Herrenschmidt	ec144a81ad	Merge commit 'origin/master' into next	2010-02-17 10:00:42 +11:00
Frans Pop	8354be9c10	powerpc: Remove trailing space in messages Signed-off-by: Frans Pop <elendil@planet.nl> Cc: linuxppc-dev@ozlabs.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-09 13:56:23 +11:00
Anton Blanchard	20a8ab9737	powerpc/pseries: Quieten cede latency printk The cede latency stuff is relatively new and we don't need to complain about it not working on older firmware. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-09 13:56:06 +11:00
Joe Perches	5a2ad98e92	arch/powerpc: Fix continuation line formats String constants that are continued on subsequent lines with \ are not good. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-09 13:55:05 +11:00
Will Schmidt	25ef231de2	powerpc/pseries: Hypervisor call tracepoints hcall_stats touchup The tb_total and purr_total values reported via the hcall_stats code should be cumulative, rather than being replaced by the latest delta tb or purr value. Tested-by: Will Schmidt <will_schmidt@vnet.ibm.com> Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-09 13:55:05 +11:00
Mark Nelson	36350e0069	powerpc/pseries: Fix kexec regression caused by CPPR tracking The code to track the CPPR values added by commit `49bd364713` ("powerpc/pseries: Track previous CPPR values to correctly EOI interrupts") broke kexec on pseries because the kexec code in xics.c calls xics_set_cpu_priority() before the IPI has been EOI'ed. This wasn't a problem previously but it now triggers a BUG_ON in xics_set_cpu_priority() because os_cppr->index isn't 0. Fix this problem by setting the index on the CPPR stack to 0 before calling xics_set_cpu_priority() in xics_teardown_cpu(). Also make it clear that we only want to set the priority when there's just one CPPR value in the stack, and enforce it by updating the value of os_cppr->stack[0] rather than os_cppr->stack[os_cppr->index]. While we're at it change the BUG_ON to a WARN_ON. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-08 15:29:19 +11:00
Benjamin Herrenschmidt	bf647fafda	powerpc/pseries: Fix xics build without CONFIG_SMP desc->affinity doesn't exit in that case. Let's use a macro for the UP variant of get_irq_server(), it's the easiest way, avoids evaluating arguments. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-02-01 13:32:41 +11:00
Nathan Fontenot	d0174c7219	powerpc: Move cpu hotplug driver lock from pseries to powerpc Move the defintion and lock helper routines for the cpu hotplug driver lock from pseries to powerpc code to avoid build breaks for platforms other than pseries that use cpu hotplug. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-01-15 13:26:18 +11:00
FUJITA Tomonori	46150a050f	powerpc/pseries: Fix dlpar compile warning without CONFIG_PROC_DEVICETREE cc1: warnings being treated as errors arch/powerpc/platforms/pseries/dlpar.c: In function 'dlpar_attach_node': arch/powerpc/platforms/pseries/dlpar.c:239: error: unused variable 'ent' arch/powerpc/platforms/pseries/dlpar.c: In function 'dlpar_detach_node': arch/powerpc/platforms/pseries/dlpar.c:271: error: unused variable 'prop' arch/powerpc/platforms/pseries/dlpar.c:270: error: unused variable 'parent' make[3]: *** [arch/powerpc/platforms/pseries/dlpar.o] Error 1 Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-01-15 13:20:07 +11:00
Anton Blanchard	92cb3694dd	powerpc/pseries: Fix xics interrupt affinity Commit `57b150cce8` (irq: only update affinity if ->set_affinity() is sucessfull) broke xics irq affinity. We need to use the cpumask passed in, instead of accessing ->affinity directly. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2010-01-15 13:20:07 +11:00
Gautham R Shenoy	3d9b740b2d	powerpc/pseries: Make declarations of cpu_hotplug_driver_lock() ANSI compatible. And add the __acquires() and __releases() annotations, while at it. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-18 14:54:26 +11:00
Gautham R Shenoy	e9edb232d3	powerpc/pseries: Don't panic when H_PROD fails during cpu-online. If an online-attempt on a CPU which has been offlined using H_CEDE with an appropriate cede latency hint fails, don't panic. Instead print the error message and let the __cpu_up() code notify the CPU Hotplug framework of the failure, which in turn can notify the other subsystem through CPU_UP_CANCELED. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-18 14:54:26 +11:00
Mel Gorman	8a55c4ba3e	powerpc/pseries: Select XICS and PCI_MSI PSERIES It's possible to set CONFIG_XICS without CONFIG_PCI_MSI. When that happens, the kernel fails to build with arch/powerpc/platforms/built-in.o: In function `.xics_startup': xics.c:(.text+0x12f60): undefined reference to `.unmask_msi_irq' make: *** [.tmp_vmlinux1] Error 1 Furthermore, as noted by Benjamin Herrenschmidt, "CONFIG_XICS should be made invisible and selected by PSERIES." This patch fixes PSERIES to select both options Signed-off-by: Mel Gorman <mel[at]csn.ul.ie> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-18 14:54:25 +11:00
Robert Jennings	14b8a76b9d	powerpc: Make the CMM memory hotplug aware The Collaborative Memory Manager (CMM) module allocates individual pages over time that are not migratable. On a long running system this can severely impact the ability to find enough pages to support a hotplug memory remove operation. This patch adds a memory isolation notifier and a memory hotplug notifier. The memory isolation notifier will return the number of pages found in the range specified. This is used to determine if all of the used pages in a pageblock are owned by the balloon (or other entities in the notifier chain). The hotplug notifier will free pages in the range which is to be removed. The priority of this hotplug notifier is low so that it will be called near last, this helps avoids removing loaned pages in operations that fail due to other handlers. CMM activity will be halted when hotplug remove operations are active and resume activity after a delay period to allow the hypervisor time to adjust. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Ingo Molnar <mingo@elte.hu> Cc: Brian King <brking@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Gerald Schaefer <geralds@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-18 14:53:36 +11:00
Thomas Gleixner	239007b844	genirq: Convert irq_desc.lock to raw_spinlock Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>	2009-12-14 23:55:33 +01:00
Linus Torvalds	d0316554d3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c	2009-12-14 09:58:24 -08:00
Benjamin Herrenschmidt	bcd6acd51f	Merge commit 'origin/master' into next Conflicts: include/linux/kvm.h	2009-12-09 17:14:38 +11:00
Mark Nelson	49bd364713	powerpc/pseries: Track previous CPPR values to correctly EOI interrupts At the moment when we EOI an interrupt we set the CPPR back to 0xFF regardless of its previous value. This could lead to problems if we take an interrupt with a priority of 5, but before EOIing it we get an IPI which has a priority of 4. The problem is that at the moment when we EOI the IPI we will set the CPPR to 0xFF, but it should really be set back to 5 (the previous priority). To keep track of the previous CPPR values we create the xics_cppr structure that has an array for CPPR values and an index pointing to the current priority. This can easily grow if new priorities get added in the future. This will also be useful because the partition adjunct option of upcoming machines will update the H_XIRR hcall to accept the CPPR as a parameter. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:10:38 +11:00
Nathan Fontenot	275a64f604	powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP The recent patch to add cpu offline/online as part of the DLPAR process for pseries causes a build break if CONFIG_SMP is not defined. Original patch here; http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-November/078299.html This corrects the build break by moving the online_node_cpus and offline_node_cpus under the #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE portions of dlpar.c. This patch also slightly modifies the online_node_cpus and offline_node_cpus routines to prepend dlpar_ to the them and make them static. These two routine are only used in the dlpar add/remove of cpus and these changes should help clarify that. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:10:38 +11:00
Roman Fietze	40d50cf7ca	powerpc: Make "intspec" pointers in irq_host->xlate() const Writing a driver using SCLPC on the MPC5200B I detected, that the intspec arrays to map irqs to Linux virq cannot be const, because the mapping and xlate functions only take non const pointers. All those functions do not modify the intspec, so a const pointer could be used. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:10:37 +11:00
Gautham R Shenoy	51badebdcf	powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate Currently the cpu-allocation/deallocation process comprises of two steps: - Set the indicators and to update the device tree with DLPAR node information. - Online/offline the allocated/deallocated CPU. This is achieved by writing to the sysfs tunables "probe" during allocation and "release" during deallocation. At the sametime, the userspace can independently online/offline the CPUs of the system using the sysfs tunable "online". It is quite possible that when a userspace tool offlines a CPU for the purpose of deallocation and is in the process of updating the device tree, some other userspace tool could bring the CPU back online by writing to the "online" sysfs tunable thereby causing the deallocate process to fail. The solution to this is to serialize writes to the "probe/release" sysfs tunable with the writes to the "online" sysfs tunable. This patch employs a mutex to provide this serialization, which is a no-op on all architectures except PPC_PSERIES Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:09:36 +11:00
Gautham R Shenoy	b6db63d1a7	pseries/pseries: Add code to online/offline CPUs of a DLPAR node Currently the cpu-allocation/deallocation on pSeries is a two step process from the Userspace. - Set the indicators and update the device tree by writing to the sysfs tunable "probe" during allocation and "release" during deallocation. - Online / Offline the CPUs of the allocated/would_be_deallocated node by writing to the sysfs tunable "online". This patch adds kernel code to online/offline the CPUs soon_after/just_before they have been allocated/would_be_deallocated. This way, the userspace tool that performs DLPAR operations would only have to deal with one set of sysfs tunables namely "probe" and release". Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:09:35 +11:00
Nathan Fontenot	1a8061c46c	powerpc/pseries: Add kernel based CPU DLPAR handling This patch adds the specific routines to probe and release (add and remove) cpu resource for the powerpc pseries platform and registers these handlers with the ppc_md callout structure. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:09:34 +11:00
Nathan Fontenot	ab519a011c	powerpc/pseries: Kernel DLPAR Infrastructure The Dynamic Logical Partitioning capabilities of the powerpc pseries platform allows for the addition and removal of resources (i.e. CPU's, memory, and PCI devices) from a partition. The removal of a resource involves removing the resource's node from the device tree and then returning the resource to firmware via the rtas set-indicator call. To add a resource, it is first obtained from firmware via the rtas set-indicator call and then a new device tree node is created using the ibm,configure-coinnector rtas call and added to the device tree. This patch provides the kernel DLPAR infrastructure in a new filed named dlpar.c. The functionality provided is for acquiring and releasing a resource from firmware and the parsing of information returned from the ibm,configure-connector rtas call. Additionally this exports the pSeries reconfiguration notifier chain so that it can be invoked when device tree updates are made. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-12-09 17:09:32 +11:00
Gautham R Shenoy	3aa565f53c	powerpc/pseries: Add hooks to put the CPU into an appropriate offline state When a CPU is offlined on POWER currently, we call rtas_stop_self() and hand the CPU back to the resource pool. This path is used for DLPAR which will cause a change in the LPAR configuration which will be visible outside. This patch changes the default state a CPU is put into when it is offlined. On platforms which support ceding the processor to the hypervisor with latency hint specifier value, during a cpu offline operation, instead of calling rtas_stop_self(), we cede the vCPU to the hypervisor while passing a latency hint specifier value. The Hypervisor can use this hint to provide better energy savings. Also, during the offline operation, the control of the vCPU remains with the LPAR as oppposed to returning it to the resource pool. The patch achieves this by creating an infrastructure to set the preferred_offline_state() which can be either - CPU_STATE_OFFLINE: which is the current behaviour of calling rtas_stop_self() - CPU_STATE_INACTIVE: which cedes the vCPU to the hypervisor with the latency hint specifier. The codepath which wants to perform a DLPAR operation can set the preferred_offline_state() of a CPU to CPU_STATE_OFFLINE before invoking cpu_down(). The patch also provides a boot-time command line argument to disable/enable CPU_STATE_INACTIVE. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-11-24 14:33:04 +11:00
Gautham R Shenoy	69ddb57cbe	powerpc/pseries: Add extended_cede_processor() helper function. This patch provides an extended_cede_processor() helper function which takes the cede latency hint as an argument. This hint is to be passed on to the hypervisor to cede to the corresponding state on platforms which support it. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-11-24 14:33:03 +11:00
Thomas Gleixner	b27df67248	powerpc: Fixup last users of irq_chip->typename The typename member of struct irq_chip was kept for migration purposes and is obsolete since more than 2 years. Fix up the leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@ozlabs.org Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-11-24 14:32:45 +11:00
Ingo Molnar	0ffa798d94	Merge branches 'perf/powerpc' and 'perf/bench' into perf/core Merge reason: Both 'perf bench' and the pending PowerPC changes are now ready for the next merge window. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:51:24 +01:00
Benjamin Herrenschmidt	0526484aa3	Merge commit 'origin/master' into next	2009-11-12 10:59:04 +11:00
Andre Detsch	8435b027b8	powerpc/pci: Fix regression in powerpc MSI-X Patch `f598282f51` exposed a problem in powerpc MSI-X functionality, making network interfaces such as ixgbe and cxgb3 stop to work when MSI-X is enabled. RX interrupts were not being generated. The problem was caused because MSI irq was not being effectively unmasked after device initialization. Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-11-05 17:06:27 +11:00
Brian King	8be8cf5b47	powerpc: Add kdump support to Collaborative Memory Manager When running Active Memory Sharing, the Collaborative Memory Manager (CMM) may mark some pages as "loaned" with the hypervisor. Periodically, the CMM will query the hypervisor for a loan request, which is a single signed value. When kexec'ing into a kdump kernel, the CMM driver in the kdump kernel is not aware of the pages the previous kernel had marked as "loaned", so the hypervisor and the CMM driver are out of sync. Fix the CMM driver to handle this scenario by ignoring requests to decrease the number of loaned pages if we don't think we have any pages loaned. Pages that are marked as "loaned" which are not in the balloon will automatically get switched to "active" the next time we touch the page. This also fixes the case where totalram_pages is smaller than min_mem_mb, which can occur during kdump. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-10-30 17:20:56 +11:00
Michael Ellerman	6cff46f4bc	powerpc: Remove get_irq_desc() get_irq_desc() is a powerpc-specific version of irq_to_desc(). That is reason enough to remove it, but it also doesn't know about sparse irq_desc support which irq_to_desc() does (when we enable it). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-10-30 17:20:55 +11:00
Michael Ellerman	59e3f83702	powerpc/pseries: Use irq_has_action() in eeh_disable_irq() Rather than open-coding our own check, use irq_has_action() to check if an irq has an action - ie. is "in use". irq_has_action() doesn't take the descriptor lock, but it shouldn't matter - we're just using it as an indicator that the irq is in use. disable_irq_nosync() will take the descriptor lock before doing anything also. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-10-30 17:20:54 +11:00
Benjamin Herrenschmidt	3d541c4b7f	powerpc/chrp: Use the same RTAS daemon as pSeries The CHRP code has some fishy timer based code to scan the RTAS event log, which uses a 1KB stack buffer and doesn't even use the results. The pSeries code as a nicer daemon that allows userspace to read the event log and basically uses the same RTAS interface This patch moves rtasd.c out of platform/pseries and makes it usable by CHRP, after removing the old crufty event log mechanism in there. The nvram logging part of the daemon is still only available on 64-bit since the underlying nvram management routines aren't currently shared. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-10-30 17:20:53 +11:00
Benjamin Herrenschmidt	188917e183	powerpc: Move /proc/ppc64 to /proc/powerpc and add symlink Some of the stuff in /proc/ppc64 such as the RTAS bits are actually useful to some 32-bit platforms. Rename the file, and create a symlink on 64-bit for backward compatibility Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-10-30 17:20:53 +11:00
Tejun Heo	6b7487fc65	percpu: make percpu symbols in powerpc unique This patch updates percpu related symbols in powerpc such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/powerpc/kernel/perf_callchain.c: s/callchain/cpu_perf_callchain/ * arch/powerpc/kernel/setup-common.c: s/pvr/cpu_pvr/ * arch/powerpc/platforms/pseries/dtl.c: s/dtl/cpu_dtl/ * arch/powerpc/platforms/cell/interrupt.c: s/iic/cpu_iic/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org	2009-10-29 22:34:14 +09:00
Anton Blanchard	6f26353ca2	powerpc: tracing: Give hypervisor call tracepoints access to arguments While most users of the hcall tracepoints will only want the opcode and return code, some will want all the arguments. To avoid the complexity of using varargs we pass a pointer to the register save area, which contains all the arguments. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2009-10-28 16:13:04 +11:00
Anton Blanchard	c8cd093a6e	powerpc: tracing: Add hypervisor call tracepoints Add hcall_entry and hcall_exit tracepoints. This replaces the inline assembly HCALL_STATS code and converts it to use the new tracepoints. To keep the disabled case as quick as possible, we embed a status word in the TOC so we can get at it with a single load. By doing so we keep the overhead at a minimum. Time taken for a null hcall: No tracepoint code: 135.79 cycles Disabled tracepoints: 137.95 cycles For reference, before this patch enabling HCALL_STATS resulted in a null hcall of 201.44 cycles! Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2009-10-28 16:13:04 +11:00
Anton Blanchard	b6dcde5c74	powerpc: Fix hypervisor TLB batching Profiling of a page fault scalability microbenchmark shows flush_hash_range is not calling the batch hpte invalidate hcall (H_BULK_REMOVE). It turns out we have a duplicate firmware feature for hcall-bulk and the current setup code stops after finding the first match. This meant we never batch and always do individual invalidates. The patch below removes the duplicate and shifts FW_FEATURE_CMO to close the gap. With the patch applied the single threaded page fault rate improves from 217169 to 238755 per second on a POWER5 test box, a 10% improvement. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-10-14 16:58:37 +11:00

1 2 3 4 5 ...

624 Commits