Commit Graph

48276 Commits

Author SHA1 Message Date
Linus Torvalds
03da309867 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (276 commits)
  [SCSI] zfcp: Trigger logging in the FCP channel on qdio error conditions
  [SCSI] zfcp: Introduce experimental support for DIF/DIX
  [SCSI] zfcp: Enable data division support for FCP devices
  [SCSI] zfcp: Prevent access on uninitialized memory.
  [SCSI] zfcp: Post events through FC transport class
  [SCSI] zfcp: Cleanup QDIO attachment and improve processing.
  [SCSI] zfcp: Cleanup function parameters for sbal value.
  [SCSI] zfcp: Use correct width for timer_interval field
  [SCSI] zfcp: Remove SCSI device when removing unit
  [SCSI] zfcp: Use memdup_user and kstrdup
  [SCSI] zfcp: Fix retry after failed "open port" erp action
  [SCSI] zfcp: Fail erp after timeout
  [SCSI] zfcp: Use forced_reopen in terminate_rport_io callback
  [SCSI] zfcp: Register SCSI devices after successful fc_remote_port_add
  [SCSI] zfcp: Do not try "forced close" when port is already closed
  [SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED
  [SCSI] sd: add support for runtime PM
  [SCSI] implement runtime Power Management
  [SCSI] convert to the new PM framework
  [SCSI] Unify SAM_ and SAM_STAT_ macros
  ...
2010-08-04 15:15:15 -07:00
Heiko Schocher
c4b6a77663 powerpc/8xx: Add support for the MPC8xx based boards from TQC
Supported SMC1 (serial console), SCC1 Ethernet (10Mbps HD).
FEC Ethernet, 8MB NOR CFI Flash.

Tested on STK8xx with TQM860L (with FEC) and with TQM855M (without FEC).

Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:25:22 -05:00
Timur Tabi
30be4c965c powerpc/85xx: Introduce support for the Freescale P1022DS reference board
Introduce basic support for the Freescale P1022DS reference board, based on the
Freescale BSP for this board.  This patch excludes the DIU, SSI, and MMC/SD
drivers.  Only a 36-bit DTS is provided.

Update mpc86xx_smp_defconfig and mpc85xx_defconfig to support the P1022DS.
This means enabling 64-bit physical address support, increasing the maximum
zone order to 12 (to allow the DIU driver to allocate large chunks), and
clean up the audio options to disable the deprecated OSS support.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:22:52 -05:00
Bradley Hughes
51974d3162 powerpc/85xx: Adding DTS for the STx GP3-SSA MPC8555 board
This version uses "fsl,mpc8555..." instead of "fsl,85..." notation.

There is also an 8541 version of this board so DTS for this board
is specific to the 8555 processor.

Another patch is coming to fix-up other DTS that use old notation.

Signed-off-by: Bradley Hughes <bhughes@silicontkx.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:22:19 -05:00
Bradley Hughes
8a4ab218ef powerpc/85xx: Change deprecated binding for 85xx-based boards
The "fsl,85..." style compatible binding was to be deprecated
some time ago.  This patch corrects existing occurrences of
the incorrect binding.  The memory-controller and
l2-cache-controller are the only affected nodes.

Signed-off-by: Bradley Hughes <bhughes@silicontkx.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:22:04 -05:00
Dmitry Eremin-Solenikov
e9502fbe2d powerpc/tqm85xx: add a quirk for ti1520 PCMCIA bridge
By default ti1520 bridge expects an input clock on CLOCK pin (to control
power chip). However on this boards CLOCK should be generated by PCI1520
itself. Add a quirk that enables internal 16 KHz clock generation on
this pin.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:21:41 -05:00
Dmitry Eremin-Solenikov
07c638398f powerpc/tqm85xx: update PCI interrupt-map attribute
Update PCI IRQ mapping on TQM85xx platforms: include INTC and INTD on PCI-X
slot and add INTA/INTB mapping for PCMCIA bridge.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:21:10 -05:00
Ilya Yanok
ba4d1275d1 powerpc/mpc8308rdb: support for MPC8308RDB board from Freescale
This patch adds support for MPC8308RDB development board from
Freescale.
Supported devices:
 DUART
 Dual Ethernet
 NOR and NAND flashes
 I2C
 USB in peripheral mode

PCIE support is broken by the commit 3da34aa ("powerpc/fsl: Support
unique MSI addresses per PCIe Root Complex"). Works after revert.

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:18:50 -05:00
Ilya Yanok
e3b5e0d552 powerpc/fsl_pci: add quirk for mpc8308 pcie bridge
This patch adds the quirk for PCIE controller found on Freescale MPC8308.
The quirk is the same as for other MPC83xx processors.

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:16:22 -05:00
Anton Vorontsov
99d8238f5f powerpc/85xx: Cleanup QE initialization for MPC85xxMDS boards
The mpc85xx_mds_setup_arch() function is incomprehensible
and unmaintainable. Factor out all QE specific stuff into
mpc85xx_mds_qe_init() and mpc85xx_mds_reset_ucc_phys().

Also move QE stuff out of mpc85xx_mds_pic_init().

The diff is unreadable, but only because the code was so. ;-)
It should be better now, and less indented.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:16:12 -05:00
Anton Vorontsov
dee9ad718b powerpc/85xx: Fix booting for P1021MDS boards
P1021 processors have no dedicated ROM to store the QE microcode,
so the fimrware is stored externally, and it is U-Boot responsibility
to load it. It might be that the board is booting without QE, e.g.
currently U-Boot doesn't support QE for P1021MDS boards, which means
that QE isn't initialized, and so the board hangs early at boot.

This patch fixes the issue by marking QE as disabled and checking the
state in the probing code. U-Boot should fixup the state if it
initialized the QE.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:16:01 -05:00
Anton Vorontsov
bb863e85a7 powerpc/85xx: Fix SWIOTLB initalization for MPC85xxMDS boards
The code inside '#ifdef CONFIG_QUICC_ENGINE' makes the
mpc85xx_mds_setup_arch() return early if no QE nodes present,
and so SWIOTLB is never initialized.

This patch fixes the issue by moving SWIOTLB code above
QE.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-04 14:15:22 -05:00
Linus Torvalds
6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
Linus Torvalds
d5fc1d5175 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  amd64_edac: Minor formatting fix
  amd64_edac: Fix operator precendence error
  edac, mc: Improve scrub rate handling
  amd64_edac: Correct scrub rate setting
  amd64_edac: Fix DCT base address selector
  amd64_edac: Remove polling mechanism
  x86, mce: Notify about corrected events too
  amd64_edac: Remove unneeded defines
  edac: Remove EDAC_DEBUG_VERBOSE
  amd64_edac: Sanitize syndrome extraction
2010-08-04 11:23:59 -07:00
Kyle McMartin
d9b68e5e88 parisc: pass through '\t' to early (iodc) console
The firmware handles '\t' internally, so stop trying to emulate it
(which, incidentally, had a bug in it.)

Fixes a really weird hang at bootup in rcu_bootup_announce, which,
as far as I can tell, is the first printk in the core kernel to use
a tab as the first character.

Cc: stable@kernel.org
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-04 11:17:10 -07:00
Linus Torvalds
8d91530c5f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Remove pointless printk from p4-clockmod.
  [CPUFREQ] Fix section mismatch for powernow_cpu_init in powernow-k7.c
  [CPUFREQ] Fix section mismatch for longhaul_cpu_init.
  [CPUFREQ] Fix section mismatch for longrun_cpu_init.
  [CPUFREQ] powernow-k8: Fix misleading variable naming
  [CPUFREQ] Convert pci_table entries to PCI_VDEVICE (if PCI_ANY_ID is used)
  [CPUFREQ] arch/x86/kernel/cpu/cpufreq: use for_each_pci_dev()
  [CPUFREQ] fix brace coding style issue.
  [CPUFREQ] x86 cpufreq: Make trace_power_frequency cpufreq driver independent
  [CPUFREQ] acpi-cpufreq: Fix CPU_ANY CPUFREQ_{PRE,POST}CHANGE notification
  [CPUFREQ] ondemand: don't synchronize sample rate unless multiple cpus present
  [CPUFREQ] unexport (un)lock_policy_rwsem* functions
  [CPUFREQ] ondemand: Refactor frequency increase code
  [CPUFREQ] powernow-k8: On load failure, remind the user to enable support in BIOS setup
  [CPUFREQ] powernow-k8: Limit Pstate transition latency check
  [CPUFREQ] Fix PCC driver error path
  [CPUFREQ] fix double freeing in error path of pcc-cpufreq
  [CPUFREQ] pcc driver should check for pcch method before calling _OSC
  [CPUFREQ] fix memory leak in cpufreq_add_dev
  [CPUFREQ] revert "[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)"

Manually fix up non-data merge conflict introduced by new calling
conventions for trace_power_start() in commit 6f4f2723d0 ("x86
cpufreq: Make trace_power_frequency cpufreq driver independent"), which
didn't update the intel_idle native hardware cpuidle driver.
2010-08-04 11:13:36 -07:00
Linus Torvalds
c145307a11 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (88 commits)
  ips driver: make it less chatty
  intel_scu_ipc: fix size field for intel_scu_ipc_command
  intel_scu_ipc: return -EIO for error condition in busy_loop
  intel_scu_ipc: fix data packing of PMIC command on Moorestown
  Clean up command packing on MRST.
  zero the stack buffer before giving random garbage to the SCU
  Fix stack buffer size for IPC writev messages
  intel_scu_ipc: Use the new cpu identification function
  intel_scu_ipc: tidy up unused bits
  Remove indirect read write api support.
  intel_scu_ipc: Support Medfield processors
  intel_scu_ipc: detect CPU type automatically
  x86 plat: limit x86 platform driver menu to X86
  acpi ec_sys: Be more cautious about ec write access
  acpi ec: Fix possible double io port registration
  hp-wmi: acpi_drivers.h is already included through acpi.h two lines below
  hp-wmi: Fix mixing up of and/or directive
  dell-laptop: make dell_laptop_i8042_filter() static
  asus-laptop: fix asus_input_init error path
  msi-wmi: make needlessly global symbols static
  ...
2010-08-04 10:44:06 -07:00
Linus Torvalds
5e83f6fbdb Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (198 commits)
  KVM: VMX: Fix host GDT.LIMIT corruption
  KVM: MMU: using __xchg_spte more smarter
  KVM: MMU: cleanup spte set and accssed/dirty tracking
  KVM: MMU: don't atomicly set spte if it's not present
  KVM: MMU: fix page dirty tracking lost while sync page
  KVM: MMU: fix broken page accessed tracking with ept enabled
  KVM: MMU: add missing reserved bits check in speculative path
  KVM: MMU: fix mmu notifier invalidate handler for huge spte
  KVM: x86 emulator: fix xchg instruction emulation
  KVM: x86: Call mask notifiers from pic
  KVM: x86: never re-execute instruction with enabled tdp
  KVM: Document KVM_GET_SUPPORTED_CPUID2 ioctl
  KVM: x86: emulator: inc/dec can have lock prefix
  KVM: MMU: Eliminate redundant temporaries in FNAME(fetch)
  KVM: MMU: Validate all gptes during fetch, not just those used for new pages
  KVM: MMU: Simplify spte fetch() function
  KVM: MMU: Add gpte_valid() helper
  KVM: MMU: Add validate_direct_spte() helper
  KVM: MMU: Add drop_large_spte() helper
  KVM: MMU: Use __set_spte to link shadow pages
  ...
2010-08-04 10:43:01 -07:00
Linus Torvalds
fe445c6e2c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (57 commits)
  Input: adp5588-keypad - fix NULL dereference in adp5588_gpio_add()
  Input: cy8ctmg110 - capacitive touchscreen support
  Input: keyboard - also match braille-only keyboards
  Input: adp5588-keys - export unused GPIO pins
  Input: xpad - add product ID for Hori Fighting Stick EX2
  Input: adxl34x - fix leak and use after free
  Input: samsung-keypad - Add samsung keypad driver
  Input: i8042 - reset keyboard controller wehen resuming from S2R
  Input: synaptics - set min/max for finger width
  Input: synaptics - only report width on hardware that supports it
  Input: evdev - signal that device is writable in evdev_poll()
  Input: mousedev - signal that device is writable in mousedev_poll()
  Input: change input handlers to use bool when possible
  Input: document the MT event slot protocol
  Input: introduce MT event slots
  Input: usbtouchscreen - implement reset_resume
  Input: usbtouchscreen - implement runtime power management
  Input: usbtouchscreen - implement basic suspend/resume
  Input: Add ATMEL QT602240 touchscreen driver
  Input: fix signedness warning in input_set_keycode()
  ...
2010-08-04 10:41:52 -07:00
Linus Torvalds
f63b759c44 Merge branch 'v4l_for_2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (243 commits)
  V4L/DVB: sms: Convert IR support to use the Remote Controller core
  V4L/DVB: sms: properly initialize IR phys and IR name
  V4L/DVB: standardize names at rc-dib0700 tables
  V4L/DVB: smsusb: enable IR port for Hauppauge WinTV MiniStick
  V4L/DVB: dib0700: Fix RC protocol logic to properly handle NEC/NECx and RC-5
  V4L/DVB: dib0700: properly implement IR change_protocol
  V4L/DVB: dib0700: break keytable into NEC and RC-5 variants
  V4L/DVB: dib0700: avoid bad repeat
  V4L/DVB: Port dib0700 to rc-core
  V4L/DVB: Add a keymap file with dib0700 table
  V4L/DVB: dvb-usb: add support for rc-core mode
  V4L/DVB: dvb-usb: prepare drivers for using rc-core
  V4L/DVB: dvb-usb: get rid of struct dvb_usb_rc_key
  V4L/DVB: rj54n1cb0c: fix a comment in the driver
  V4L/DVB: V4L2: sh_vou: VOU does support the full PAL resolution too
  V4L/DVB: V4L2: sh_mobile_camera_ceu: add support for CSI2
  V4L/DVB: V4L2: soc-camera: add a MIPI CSI-2 driver for SH-Mobile platforms
  V4L/DVB: V4L2: soc-camera: export soc-camera bus type for notifications
  V4L/DVB: V4L2: mediabus: add 12-bit Bayer and YUV420 pixel formats
  V4L/DVB: mediabus: fix ambiguous pixel code names
  ...
2010-08-04 10:38:08 -07:00
Tony Luck
9354a55f94 Pull misc2-6-36 into release branch 2010-08-04 08:53:02 -07:00
Tony Luck
38e14a7cf1 Pull percpu256K into release branch 2010-08-04 08:52:50 -07:00
Tony Luck
13a1e6e1a5 Pull reduce-config into release branch 2010-08-04 08:52:27 -07:00
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Michal Simek
2d5973cb5a microblaze: Add KGDB support
Kgdb uses brki r16, 0x18 instruction to call
low level _debug_exception function which save
current state to pt_regs and call microblaze_kgdb_break
function. _debug_exception should be called only from
the kernel space. User space calling is not supported
because user application debugging uses different handling.

pt_regs_to_gdb_regs loads additional special registers
which can't be changed

 * Enable KGDB in Kconfig
 * Remove ancient not-tested KGDB support
 * Remove ancient _debug_exception code from entry.S

Only MMU KGDB support is supported.

Signed-off-by: Michal Simek <monstr@monstr.eu>
CC: Jason Wessel <jason.wessel@windriver.com>
CC: John Williams <john.williams@petalogix.com>
CC: Edgar E. Iglesias <edgar.iglesias@petalogix.com>
CC: linux-kernel@vger.kernel.org
Acked-by: Jason Wessel <jason.wessel@windriver.com>
2010-08-04 10:45:17 +02:00
Michal Simek
751f1605e0 microblaze: Support brki rX, 0x18 for user application debugging
This is the first patch which add support for
user application debugging through brki rX, 0x18 vector.

This patch has side effect which also remove security issue
to use brki rX, 0x18 to freeze kernel.

Support for old gdb support via priviledged exception
(brk r0, r0) is still there. It will be remove in future.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:45:16 +02:00
Michal Simek
958063e67b microblaze: Remove nop after MSRCLR/SET, MTS, MFS instructions
We need to save instruction and the latest Microblaze shouldn't
have any problem with it.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:45:16 +02:00
Michal Simek
0e41c90908 microblaze: Simplify syscall rutine
Syscall can be called only from userspace that's why
we don't need to check which space kernel come from.

Kernel syscall calling is not check and shouldn't come
throught this part of code.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:45:15 +02:00
Michal Simek
0a6b08fda6 microblaze: Move PT_MODE saving to delay slot
We can save one more instruction if PT_MODE is saved in delay slot

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:45:14 +02:00
Michal Simek
80c5ff6b9b microblaze: Fix _interrupt function
Save instructions by using delay slot and
clear UMS only if kernel comes from user space.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:45:13 +02:00
Michal Simek
25f6e59657 microblaze: Fix _user_exception function
Saving some instructions. Clear VMS bit if kernel comes
from kernel space.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:45:12 +02:00
Michal Simek
287503fabd microblaze: Put together addik instructions
Saving instructions by adding 2/3 addik instructions to one.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:44:56 +02:00
Michal Simek
9814cc11e5 microblaze: Use delay slot in syscall macros
Saving instruction with delay slot usage.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:32:22 +02:00
Michal Simek
da23355280 microblaze: Save kernel mode in delay slot
This change save one instruction if kernel comes from kernel
space.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:32:21 +02:00
Michal Simek
e7741075b3 microblaze: Do not mix register saving and mode setting
Separate reg saving and mode setting.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:32:20 +02:00
Michal Simek
e5d2af2b96 microblaze: Move SAVE_STATE upward
SAVE_STATE macro could be used by other rutines too.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:31:09 +02:00
Michal Simek
66f7de8634 microblaze: entry.S: Macro optimization
We are not working with values from MSR that's why
we can discard it and use r11 for different purpose without
saving/restoring.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:30:07 +02:00
Michal Simek
c318d483b3 microblaze: Optimize hw exception rutine
Remove set_vms because UMS is cleared and VMS is already setup.
Optimize function calling which save one additional instruction.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:30:06 +02:00
Michal Simek
b318067e2c microblaze: Implement clear_ums macro and fix SAVE_STATE macro
VMS is always setup because VM mode was before
exception/syscall/interrupt. Kernel continues in kernel mode
that's why we have to clear UMS bit if kernel comes from
user space.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:29:44 +02:00
Michal Simek
77f6d22605 microblaze: Remove additional setup for kernel_mode
PT_MODE stores information if kernel comes from user
or kernel space. If come from user space, PT_MODE
contains 0. If come from kernel store, PT_MODE contains
non zero value. We don't need to save value 1. I am using
r1 register which contains non zero value.
This change save one additional instruction.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:53 +02:00
Michal Simek
06a54604a3 microblaze: Optimize SAVE_STATE macro
SAVE_STATE macro could be used for user_exception
or interrupt functions.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:52 +02:00
Michal Simek
40eb0dc456 microblaze: Remove additional loading
We don't need to save r0 to PT_R0. It could be additional
operation.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:51 +02:00
Michal Simek
653e447e11 microblaze: Completely remove working with R11 register
We don't need to save R11 register. There is easy way
to use only R1 which is saved and restore later.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:50 +02:00
Michal Simek
0388107dd5 microblaze: Do not setup BIP in _debug_exception
BIP is already setup.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:50 +02:00
Michal Simek
06b2864038 microblaze: Simplify _debug_exception function
Keep together all arguments for send_sig function.
Move returning address to delay slot which is executed.
Remove additional send_sig loading. I am using IMM part of
rtbd instruction with r0.

old solution:
addik r11, r0, send_sig
rtbd r11, 0
nop

new solution:
rtbd r0, send_sig
nop

There is one instruction saving.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:48 +02:00
Michal Simek
8b110d157c microblaze: Optimize SAVE_STATE macro
It is necessary to setup BIP and EE and clear EIP
only for unaligned exception handler. The rest of
hw exception handlers don't require it.
HW exception occured and we are not in virtual mode.
That's why we can do operations protected by EIP.
Interrupt, next hw exception or syscall can't occur.

EIP is cleared by rted.

This change speedup page_fault hw exception handler
which is critical path.

There is also necessary to save R11 content before
flag setup for unaligned exception.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:48 +02:00
Michal Simek
b9ea77e2d3 microblaze: trivial: Use la insted of addik
la is translated to addik by toolchain.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:47 +02:00
Michal Simek
be304350dd microblaze: remove enable_irq from SAVE_STATE macro
SAVE_STATE macro is used in hw exceptions high level handling
functions. Hw exception doesn't disable IRQ that's why we don't
need to reenable it.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:46 +02:00
Michal Simek
63708f635c microblaze: Move stack backup to SAVE_STATE macro
Remove code duplicity and move it to SAVE_STATE macro.
There is no impact on performance.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:45 +02:00
Michal Simek
96014cc39b microblaze: Move BIP setup to the end of ret_from_trap/ret_from_exc
We don't need to protect by BIP whole ret_from_trap/ret_from_exc code.
Only restoring from user/hw exception should be covered.
If BIP is setup, IRQ can't occur.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:45 +02:00
Michal Simek
5c0d72b1b3 microblaze: Remove PER_CPU(KM) variable
There is a way howto remove Kernel Mode variable. It is easier
to parse UMS bit in MSR to find out if I come from kernel or user
space. Loading MSR content should be in one cycle and loading
PER_CPU variable depends on memory state.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:44 +02:00
Michal Simek
3fbd93e58e microblaze: Optimize clear_vms_ums macro
We can save two instruction when MSR_VMS and MSR_UMS
are setup in one instruction.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:43 +02:00
Michal Simek
36f6095419 microblaze: Save and restore r3/r4 in SAVE/RESTORE_REGS macros
Save and restore R3/R4 registers in macros. This change
help to cleanup entry.S.

In ret_from_trap function we are saving returning value from
syscall to pt_regs on stack that's why we don't need to save and
restore these values before kernel functions (schedule, do_signal).

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:42 +02:00
Michal Simek
a4a94dbf20 microblaze: Fix VM_ON and VM_OFF macros
Jump behind macro. We don't want to execute nop instruction again.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:42 +02:00
Michal Simek
ca28b51016 microblaze: Do not use _start in vmlinux
_start symbol stores physical address where kernel is.
Gdb uses this symbol for their purpose that's why
we have to rename it.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:41 +02:00
Michal Simek
61b403af8b microblaze: Cleanup boot/Makefile
Remove spaces and use tabs instead.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:40 +02:00
Michal Simek
aee04d76d2 microblaze: Fix number of pvr regs
Microblaze has only 11 pvr regs according manual.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:39 +02:00
Michal Simek
c8f77436d1 microblaze: Decrease time shifting values
Lower shifting values ensure that shifted 32bit counter
value doesn't exceed 64bit cycle variable too fast.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:39 +02:00
Michal Simek
615748aefa microblaze: Enable early printk only for uartlite
Microblaze has support for early printk. The second serial
driver (uart16550/8250) has no microblaze support for early
printk.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:38 +02:00
FUJITA Tomonori
75842abfd8 microblaze: remove unused HAVE_ARCH_PCI_SET_DMA_MASK
HAVE_ARCH_PCI_SET_DMA_MASK was removed in 2.6.34 (no architecture has
the own implementation of pci_set_dma_mask).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:37 +02:00
Michal Simek
d0f140e03e microblaze: Do not trace cpu_relax function
IRQsoff tracer requires to protect cpu_idle function
to get correct timing report.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:36 +02:00
Michal Simek
6f34b08f58 microblaze: Improve ftrace time measuring
I had to comment sched_clock generic function because of broken toolchain.
It is fine grain timing.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:35 +02:00
Steven J. Magnani
ce3266c047 microblaze: Add stack unwinder
Implement intelligent backtracing by searching for stack frame creation,
and emitting only return addresses. Use print_hex_dump() to display the
entire binary kernel stack.

Limitation: MMU kernels are not currently able to trace beyond a system trap
(interrupt, syscall, etc.). It is the intent of this patch to provide
infrastructure that can be extended to add this capability later.

Changes from V1:
* Removed checks in find_frame_creation() that prevented location of the frame
  creation instruction in heavily optimized code
* Various formatting/commenting/file location tweaks per review comments
* Dropped Kconfig option to enable STACKTRACE as something logically separate

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
2010-08-04 10:22:35 +02:00
Steven J. Magnani
ba9c4f88d7 microblaze: Allow PAGE_SIZE configuration
Allow developer to configure memory page size at compile time.
Larger pages can improve performance on some workloads.

Based on PowerPC code.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:34 +02:00
Michal Simek
0d9ec762af microblaze: Trace hardirqs
Add trace_hardirqs_off and trace_hardirqs_on to do_IRQ function.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:33 +02:00
Michal Simek
570e3e236e microblaze: Fix sys_clone syscall
sys_clone syscall ignored args which this patch mapped to args
which are passing from glibc.

Here is the origin problem description.

"I ran the static libgcc tests (very few of them are there, they are
mostly dynamically linked) and some of  them fail with an assertion in
fork() system call (tid != pid), I looked at the microblaze/entry.S
file and it looks suspicious (ignores arguments 3-5)"

Arg mapping should be:
glibc ARCH_FORK(...) -> do_fork(...)
r5 -> r5   (clone_flags)
r6  -> r6 (stack_start, use parent->stack if NULL)
pt_regs -> r7 (pt_regs)
r7 -> r8 (stack_size)
r8 -> r9 (parent_tidptr)
r9 -> r10 (child_tidptr)

Signed-off-by: John Williams <john.williams@petalogix.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:32 +02:00
Michal Simek
6847ba91a1 microblaze: Fix copy_to_user_page macro
copy_to_user_page macro is used in mm/memory.c:access_process_vm
function. This function is called from ptrace code (POKETEXT, POKEDATA)
which write data to memory. Microblaze handle physical address for
caches that's why there is virt_to_phys conversion.

There is potential one location which can caused the problem on WB system.

The important is take a look at write PTRACEs requests
(POKE/TEXT, DATA, USR).

Note:
Majority of Microblaze PTRACE code is moved to generic location
in newer kernel version that's why this solution should work on
the newest kernel version too.

linux/io.h is in cacheflush because of mm/nommu.c

Tested on a WB system - hello world debugging.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:31 +02:00
Michal Simek
e05816679b microblaze: Sync noMMU and MMU setup_memory
Both versions can use the same node to register NODE_DATA(0)

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:30 +02:00
Michal Simek
ef78705034 microblaze: Remove unused label
The label should be remove by
21e1c93631

Warning message:
arch/microblaze/mm/fault.c: In function 'do_page_fault':
arch/microblaze/mm/fault.c:229: warning: label 'survive' defined but not used

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:29 +02:00
Michal Simek
79e87830fa microblaze: Implement flush_dcache_page macro
flush_dcache_page macro is necessary to implement for
JFFS2 rootfs support on WB system.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:09 +02:00
Benjamin Herrenschmidt
412a4ac5e9 Merge commit 'gcl/next' into next 2010-08-04 10:26:03 +10:00
Linus Torvalds
be82ae0238 Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (291 commits)
  ARM: AMBA: Add pclk support to AMBA bus infrastructure
  ARM: 6278/2: fix regression in RealView after the introduction of pclk
  ARM: 6277/1: mach-shmobile: Allow users to select HZ, default to 128
  ARM: 6276/1: mach-shmobile: remove duplicate NR_IRQS_LEGACY
  ARM: 6246/1: mmci: support larger MMCIDATALENGTH register
  ARM: 6245/1: mmci: enable hardware flow control on Ux500 variants
  ARM: 6244/1: mmci: add variant data and default MCICLOCK support
  ARM: 6243/1: mmci: pass power_mode to the translate_vdd callback
  ARM: 6274/1: add global control registers definition header file for nuc900
  mx2_camera: fix type of dma buffer virtual address pointer
  mx2_camera: Add soc_camera support for i.MX25/i.MX27
  arm/imx/gpio: add spinlock protection
  ARM: Add support for the LPC32XX arch
  ARM: LPC32XX: Arch config menu supoport and makefiles
  ARM: LPC32XX: Phytec 3250 platform support
  ARM: LPC32XX: Misc support functions
  ARM: LPC32XX: Serial support code
  ARM: LPC32XX: System suspend support
  ARM: LPC32XX: GPIO, timer, and IRQ drivers
  ARM: LPC32XX: Clock driver
  ...
2010-08-03 14:31:24 -07:00
Dave Jones
9d1f44ee20 [CPUFREQ] Remove pointless printk from p4-clockmod.
The only machines this is triggering on should be supported by
acpi-cpufreq or acpi's internal throttling.

Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:30 -04:00
Holger Freyther
307069cf6c [CPUFREQ] Fix section mismatch for powernow_cpu_init in powernow-k7.c
Use __cpuinit instead of __init for the cpufreq_driver
init function like it is done in powernow-k8.c.

This is removing the warning generated when compiling with
the CONFIG_DEBUG_SECTION_MISMATCH=y option.

Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:06 -04:00
Holger Freyther
2530573e45 [CPUFREQ] Fix section mismatch for longhaul_cpu_init.
Use __cpuinit instead of __init for the cpufreq_driver
init function like it is done in powernow-k8.c. Use the
__cpuinitdata for data used by the routines marked as __cpuinit.

This is removing the warning generated when compiling with
the CONFIG_DEBUG_SECTION_MISMATCH=y option.

Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:06 -04:00
Holger Freyther
7e2d811220 [CPUFREQ] Fix section mismatch for longrun_cpu_init.
Use __cpuinit instead of __init for the cpufreq_driver
init function like it is done in powernow-k8.c.

This is removing the warning generated when compiling with
the CONFIG_DEBUG_SECTION_MISMATCH=y option.

Signed-off-by: Holger Hans Peter Freyther <holger@moiji-mobile.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:06 -04:00
Borislav Petkov
b30d3304c9 [CPUFREQ] powernow-k8: Fix misleading variable naming
rdmsr() takes the lower 32 bits as a second argument and the high 32 as
a third. Fix the names accordingly since they were swapped.

There should be no functionality change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:06 -04:00
Peter Huewe
55c789bb2b [CPUFREQ] Convert pci_table entries to PCI_VDEVICE (if PCI_ANY_ID is used)
This patch converts pci_table entries, where .subvendor=PCI_ANY_ID and
.subdevice=PCI_ANY_ID, .class=0 and .class_mask=0, to use the
PCI_VDEVICE macro, and thus improves readability.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:05 -04:00
Kulikov Vasiliy
ccc5638a20 [CPUFREQ] arch/x86/kernel/cpu/cpufreq: use for_each_pci_dev()
Use for_each_pci_dev() to simplify the code.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:05 -04:00
Thomas Renninger
6f4f2723d0 [CPUFREQ] x86 cpufreq: Make trace_power_frequency cpufreq driver independent
and fix the broken case if a core's frequency depends on others.

trace_power_frequency was only implemented in a rather ungeneric way
in acpi-cpufreq driver's target() function only.
-> Move the call to trace_power_frequency to
   cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE
   notifier is triggered.
   This will support power frequency tracing by all cpufreq drivers

trace_power_frequency did not trace frequency changes correctly when
the userspace governor was used or when CPU cores' frequency depend
on each other.
-> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu
   which gets switched automatically fixes this.

Robert Schoene provided some important fixes on top of my initial
quick shot version which are integrated in this patch:
- Forgot some changes in power_end trace (TP_printk/variable names)
- Variable dummy in power_end must now be cpu_id
- Use static 64 bit variable instead of unsigned int for cpu_id

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: davej@redhat.com
CC: arjan@infradead.org
CC: linux-kernel@vger.kernel.org
CC: robert.schoene@tu-dresden.de
Tested-by: robert.schoene@tu-dresden.de
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:05 -04:00
Thomas Renninger
6b72e3934b [CPUFREQ] acpi-cpufreq: Fix CPU_ANY CPUFREQ_{PRE,POST}CHANGE notification
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: venki@google.com
CC: davej@redhat.com
CC: arjan@infradead.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:05 -04:00
Marti Raudsepp
298decfbc4 [CPUFREQ] powernow-k8: On load failure, remind the user to enable support in BIOS setup
On Wed, 2010-01-20 at 16:56 +0100, Thomas Renninger wrote:
> But most often this happens if people upgrade their CPU and do not
> update their BIOS.
> Or the vendor does not recognise the new CPU even if the BIOS got
> updated.

Maybe some of those people just didn't realize it was disabled in BIOS?
If you tell users that it's a firmware bug then they'll probably just
give up.

> The itself message might be an enhancment, IMO it's not worth a patch.

Why do you think so? I spent an hour on hunting down the BIOS upgrade,
only to find that it didn't improve anything. It was a day later that I
realized that it might be a BIOS option; and the option was literally
the _last_ option in the whole BIOS setup. :)

This message would have saved the day.

> But do not revert the FW_BUG part!

Sure, you have a point here.

How about this patch?
2010-08-03 13:47:04 -04:00
Borislav Petkov
c2f4a2c6e0 [CPUFREQ] powernow-k8: Limit Pstate transition latency check
The Pstate transition latency check was added for broken F10h BIOSen
which wrongly contain a value of 0 for transition and bus master
latency. Fam11h and later, however, (will) have similar transition
latency so extend that behavior for them too.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:03 -04:00
Matthew Garrett
6ebdf777ba [CPUFREQ] Fix PCC driver error path
The PCC cpufreq driver unmaps the mailbox address range if any CPUs fail to
initialise, but doesn't do anything to remove the registered CPUs from the
cpufreq core resulting in failures further down the line. We're better off
simply returning a failure - the cpufreq core will unregister us cleanly if
we end up with no successfully registered CPUs. Tidy up the failure path
and also add a sanity check to ensure that the firmware gives us a realistic
frequency - the core deals badly with that being set to 0.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:02 -04:00
Daniel J Blueman
0d9715d64f [CPUFREQ] fix double freeing in error path of pcc-cpufreq
Prevent double freeing on error path.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:02 -04:00
Matthew Garrett
5d77b85458 [CPUFREQ] pcc driver should check for pcch method before calling _OSC
The pcc specification documents an _OSC method that's incompatible with the
one defined as part of the ACPI spec. This shouldn't be a problem as both
are supposed to be guarded with a UUID. Unfortunately approximately nobody
(including HP, who wrote this spec) properly check the UUID on entry to the
_OSC call. Right now this could result in surprising behaviour if the pcc
driver performs an _OSC call on a machine that doesn't implement the pcc
specification. Check whether the PCCH method exists first in order to reduce
this probability.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-08-03 13:47:02 -04:00
Borislav Petkov
98a5ae2d99 x86, mce: Notify about corrected events too
Notify all parties registered on the mce decoder chain about logged
correctable MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
2010-08-03 16:14:02 +02:00
Sreedhara DS
804f8681a9 Remove indirect read write api support.
The firmware of production devices does not support this interface so this
is dead code.

Signed-off-by: Sreedhara DS <sreedhara.ds@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2010-08-03 09:50:30 -04:00
Feng Tang
35f2915c3b intel_scu_ipc: add definitions for vRTC related command
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2010-08-03 09:48:47 -04:00
Chris Zankel
66630f71e2 Merge remote branch 'origin/master' 2010-08-03 00:38:00 -07:00
Chris Zankel
ecd53497b7 xtensa: Disable PCI and nfsroot on simulation target
The ISS platform is a pure simulation target and doesn't support PCI,
so disable it in the default configuration.

Signed-off-by: Chris Zankel <chris@zankel.net>
2010-08-03 00:18:38 -07:00
Dmitry Torokhov
d01d0756f7 Merge branch 'next' into for-linus 2010-08-02 18:35:17 -07:00
Julia Lawall
71cd03b004 arch/sparc/mm: Use GFP_KERNEL
GFP_ATOMIC is not needed here, as evidenced by the other two uses of
GFP_KERNEL in the same function.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ identifier f; @@

*f(...,GFP_ATOMIC,...)
... when != spin_unlock(...)
    when != read_unlock(...)
    when != write_unlock(...)
    when != read_unlock_irq(...)
    when != write_unlock_irq(...)
    when != read_unlock_irqrestore(...)
    when != write_unlock_irqrestore(...)
    when != spin_unlock_irq(...)
    when != spin_unlock_irqrestore(...)
*f(...,GFP_KERNEL,...)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-02 16:04:21 -07:00
Guennadi Liakhovetski
ace6e9799f V4L/DVB: mediabus: fix ambiguous pixel code names
Endianness notation is meaningless for 8 bit YUYV codes. Switch pixel code
names to explicitly state the order of colour components in the data
stream.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-02 16:43:36 -03:00
Matthew McClintock
f933a41e41 powerpc/85xx: kexec for SMP 85xx BookE systems
Adds support for kexec on 85xx machines for the BookE platform.
Including support for SMP machines

Based off work from Maxim Uvarov <muvarov@mvista.com>
Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-08-02 14:36:28 -05:00
Michal Simek
af58ed854b microblaze: Fix comment for TLB
There is wrong comment for TLB. Early printk uartlite
console uses TLB 63.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-02 10:44:03 +02:00
Michal Simek
8d7ec6ee59 microblaze: Fix __copy_to/from_user_inatomic macros
__copy_to/from_user_inatomic should call __copy_to/from_user
because there is not necessary to check access because of kernel function.

might_sleep in copy_to/from_user macros is causing problems
in debug sessions too (CONFIG_DEBUG_SPINLOCK_SLEEP).

BUG: sleeping function called from invalid context at
.../arch/microblaze/include/asm/uaccess.h:388
in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper
1 lock held by swapper/1:
 #0:  (&p->cred_guard_mutex){......}, at: [<c00d4b90>] prepare_bprm_creds+0x2c/0x88
Kernel Stack:
...

Call Trace:
[<c0006bd4>] microblaze_unwind+0x7c/0x94
[<c0006684>] show_stack+0xf4/0x190
[<c0006730>] dump_stack+0x10/0x30
[<c00103a0>] __might_sleep+0x12c/0x160
[<c0090de4>] file_read_actor+0x1d8/0x2a8
[<c0091568>] generic_file_aio_read+0x6b4/0xa64
[<c00cd778>] do_sync_read+0xac/0x110
[<c00ce254>] vfs_read+0xc8/0x160
[<c00d585c>] kernel_read+0x38/0x64
[<c00d5984>] prepare_binprm+0xfc/0x130
[<c00d6430>] do_execve+0x228/0x370
[<c000614c>] microblaze_execve+0x58/0xa4

caused by file_read_actor (mm/filemap.c) which calls
__copy_to_user_inatomic.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-02 10:44:03 +02:00
Avi Kivity
3444d7da18 KVM: VMX: Fix host GDT.LIMIT corruption
vmx does not restore GDT.LIMIT to the host value, instead it sets it to 64KB.
This means host userspace can learn a few bits of host memory.

Fix by reloading GDTR when we load other host state.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 08:10:18 +03:00
Xiao Guangrong
9a3aad7057 KVM: MMU: using __xchg_spte more smarter
Sometimes, atomically set spte is not needed, this patch call __xchg_spte()
more smartly

Note: if the old mapping's access bit is already set, we no need atomic operation
since the access bit is not lost

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:41:01 +03:00
Xiao Guangrong
e4b502ead2 KVM: MMU: cleanup spte set and accssed/dirty tracking
Introduce set_spte_track_bits() to cleanup current code

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:41:00 +03:00
Xiao Guangrong
be233d49ea KVM: MMU: don't atomicly set spte if it's not present
If the old mapping is not present, the spte.a is not lost, so no need
atomic operation to set it

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:59 +03:00
Xiao Guangrong
9ed5520dd3 KVM: MMU: fix page dirty tracking lost while sync page
In sync-page path, if spte.writable is changed, it will lose page dirty
tracking, for example:

assume spte.writable = 0 in a unsync-page, when it's synced, it map spte
to writable(that is spte.writable = 1), later guest write spte.gfn, it means
spte.gfn is dirty, then guest changed this mapping to read-only, after it's
synced,  spte.writable = 0

So, when host release the spte, it detect spte.writable = 0 and not mark page
dirty

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:58 +03:00
Xiao Guangrong
daa3db693c KVM: MMU: fix broken page accessed tracking with ept enabled
In current code, if ept is enabled(shadow_accessed_mask = 0), the page
accessed tracking is lost.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:57 +03:00
Xiao Guangrong
fa1de2bfc0 KVM: MMU: add missing reserved bits check in speculative path
In the speculative path, we should check guest pte's reserved bits just as
the real processor does

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:56 +03:00
Andrea Arcangeli
6e3e243c3b KVM: MMU: fix mmu notifier invalidate handler for huge spte
The index wasn't calculated correctly (off by one) for huge spte so KVM guest
was unstable with transparent hugepages.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:54 +03:00
Wei Yongjun
c19b8bd60e KVM: x86 emulator: fix xchg instruction emulation
If the destination is a memory operand and the memory cannot
map to a valid page, the xchg instruction emulation and locked
instruction will not work on io regions and stuck in endless
loop. We should emulate exchange as write to fix it.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:53 +03:00
Gleb Natapov
9195c4da26 KVM: x86: Call mask notifiers from pic
If pit delivers interrupt while pic is masking it OS will never do EOI
and ack notifier will not be called so when pit will be unmasked no pit
interrupts will be delivered any more. Calling mask notifiers solves this
issue.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:52 +03:00
Gleb Natapov
68be080345 KVM: x86: never re-execute instruction with enabled tdp
With tdp enabled we should get into emulator only when emulating io, so
reexecution will always bring us back into emulator.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:51 +03:00
Gleb Natapov
c0e0608cb9 KVM: x86: emulator: inc/dec can have lock prefix
Mark inc (0xfe/0 0xff/0) and dec (0xfe/1 0xff/1) as lock prefix capable.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:49 +03:00
Avi Kivity
24157aaf83 KVM: MMU: Eliminate redundant temporaries in FNAME(fetch)
'level' and 'sptep' are aliases for 'interator.level' and 'iterator.sptep', no
need for them.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:48 +03:00
Avi Kivity
5991b33237 KVM: MMU: Validate all gptes during fetch, not just those used for new pages
Currently, when we fetch an spte, we only verify that gptes match those that
the walker saw if we build new shadow pages for them.

However, this misses the following race:

  vcpu1            vcpu2

  walk
                  change gpte
                  walk
                  instantiate sp

  fetch existing sp

Fix by validating every gpte, regardless of whether it is used for building
a new sp or not.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:47 +03:00
Avi Kivity
0b3c933302 KVM: MMU: Simplify spte fetch() function
Partition the function into three sections:

- fetching indirect shadow pages (host_level > guest_level)
- fetching direct shadow pages (page_level < host_level <= guest_level)
- the final spte (page_level == host_level)

Instead of the current spaghetti.

A slight change from the original code is that we call validate_direct_spte()
more often: previously we called it only for gw->level, now we also call it for
lower levels.  The change should have no effect.

[xiao: fix regression caused by validate_direct_spte() called too late]

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:45 +03:00
Avi Kivity
39c8c672a1 KVM: MMU: Add gpte_valid() helper
Move the code to check whether a gpte has changed since we fetched it into
a helper.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:44 +03:00
Avi Kivity
a357bd229c KVM: MMU: Add validate_direct_spte() helper
Add a helper to verify that a direct shadow page is valid wrt the required
access permissions; drop the page if it is not valid.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:43 +03:00
Avi Kivity
a3aa51cfaa KVM: MMU: Add drop_large_spte() helper
To clarify spte fetching code, move large spte handling into a helper.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:42 +03:00
Avi Kivity
121eee97a7 KVM: MMU: Use __set_spte to link shadow pages
To avoid split accesses to 64 bit sptes on i386, use __set_spte() to link
shadow pages together.

(not technically required since shadow pages are __GFP_KERNEL, so upper 32
bits are always clear)

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:41 +03:00
Avi Kivity
32ef26a359 KVM: MMU: Add link_shadow_page() helper
To simplify the process of fetching an spte, add a helper that links
a shadow page to an spte.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:40 +03:00
Avi Kivity
908e75f3e7 KVM: Expose MCE control MSRs to userspace
Userspace needs to reset and save/restore these MSRs.

The MCE banks are not exposed since their number varies from vcpu to vcpu.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:36 +03:00
Xiao Guangrong
aea924f606 KVM: PIT: stop vpit before freeing irq_routing
Fix:
general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
......
Call Trace:
 [<ffffffffa0159bd1>] ? kvm_set_irq+0xdd/0x24b [kvm]
 [<ffffffff8106ea8b>] ? trace_hardirqs_off_caller+0x1f/0x10e
 [<ffffffff813ad17f>] ? sub_preempt_count+0xe/0xb6
 [<ffffffff8106d273>] ? put_lock_stats+0xe/0x27
...
RIP  [<ffffffffa0159c72>] kvm_set_irq+0x17e/0x24b [kvm]

This bug is triggered when guest is shutdown, is because we freed
irq_routing before pit thread stopped

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:35 +03:00
Gleb Natapov
a6f177efaa KVM: Reenter guest after emulation failure if due to access to non-mmio address
When shadow pages are in use sometimes KVM try to emulate an instruction
when it accesses a shadowed page. If emulation fails KVM un-shadows the
page and reenter guest to allow vcpu to execute the instruction. If page
is not in shadow page hash KVM assumes that this was attempt to do MMIO
and reports emulation failure to userspace since there is no way to fix
the situation. This logic has a race though. If two vcpus tries to write
to the same shadowed page simultaneously both will enter emulator, but
only one of them will find the page in shadow page hash since the one who
founds it also removes it from there, so another cpu will report failure
to userspace and will abort the guest.

Fix this by checking (in addition to checking shadowed page hash) that
page that caused the emulation belongs to valid memory slot. If it is
then reenter the guest to allow vcpu to reexecute the instruction.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:34 +03:00
Gleb Natapov
edba23e515 KVM: Return EFAULT from kvm ioctl when guest accesses bad area
Currently if guest access address that belongs to memory slot but is not
backed up by page or page is read only KVM treats it like MMIO access.
Remove that capability. It was never part of the interface and should
not be relied upon.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:33 +03:00
Jiri Slaby
673813e81d KVM: fix lock imbalance in kvm_create_pit()
Stanse found that there is an omitted unlock in kvm_create_pit in one fail
path. Add proper unlock there.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Gleb Natapov <gleb@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:31 +03:00
Avi Kivity
f59c1d2ded KVM: MMU: Keep going on permission error
Real hardware disregards permission errors when computing page fault error
code bit 0 (page present).  Do the same.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:30 +03:00
Avi Kivity
b0eeec29fe KVM: MMU: Only indicate a fetch fault in page fault error code if nx is enabled
Bit 4 of the page fault error code is set only if EFER.NX is set.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:29 +03:00
Wei Yongjun
5d55f299f9 KVM: x86 emulator: re-implementing 'mov AL,moffs' instruction decoding
This patch change to use DstAcc for decoding 'mov AL, moffs'
and introduced SrcAcc for decoding 'mov moffs, AL'.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:27 +03:00
Wei Yongjun
07cbc6c185 KVM: x86 emulator: fix cli/sti instruction emulation
If IOPL check fail, the cli/sti emulate GP and then we should
skip writeback since the default write OP is OP_REG.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:26 +03:00
Wei Yongjun
b16b2b7bb5 KVM: x86 emulator: fix 'mov rm,sreg' instruction decoding
The source operand of 'mov rm,sreg' is segment register, not
general-purpose register, so remove SrcReg from decoding.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:25 +03:00
Wei Yongjun
e97e883f8b KVM: x86 emulator: fix 'and AL,imm8' instruction decoding
'and AL,imm8' should be mask as ByteOp, otherwise the dest operand
length will no correct and we may fill the full EAX when writeback.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:24 +03:00
Wei Yongjun
ce7a0ad3bd KVM: x86 emulator: fix the comment of out instruction
Fix the comment of out instruction, using the same style as the
other instructions.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:23 +03:00
Wei Yongjun
a5046e6c7d KVM: x86 emulator: fix 'mov sreg,rm16' instruction decoding
Memory reads for 'mov sreg,rm16' should be 16 bits only.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:22 +03:00
Avi Kivity
b79b93f92c KVM: MMU: Don't drop accessed bit while updating an spte
__set_spte() will happily replace an spte with the accessed bit set with
one that has the accessed bit clear.  Add a helper update_spte() which checks
for this condition and updates the page flag if needed.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:21 +03:00
Avi Kivity
a9221dd5ec KVM: MMU: Atomically check for accessed bit when dropping an spte
Currently, in the window between the check for the accessed bit, and actually
dropping the spte, a vcpu can access the page through the spte and set the bit,
which will be ignored by the mmu.

Fix by using an exchange operation to atmoically fetch the spte and drop it.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:20 +03:00
Avi Kivity
ce061867aa KVM: MMU: Move accessed/dirty bit checks from rmap_remove() to drop_spte()
Since we need to make the check atomic, move it to the place that will
set the new spte.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:18 +03:00
Avi Kivity
be38d276b0 KVM: MMU: Introduce drop_spte()
When we call rmap_remove(), we (almost) always immediately follow it by
an __set_spte() to a nonpresent pte.  Since we need to perform the two
operations atomically, to avoid losing the dirty and accessed bits, introduce
a helper drop_spte() and convert all call sites.

The operation is still nonatomic at this point.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-02 06:40:17 +03:00
Xiao Guangrong
dd180b3e90 KVM: VMX: fix tlb flush with invalid root
Commit 341d9b535b6c simplify reload logic while entry guest mode, it
can avoid unnecessary sync-root if KVM_REQ_MMU_RELOAD and
KVM_REQ_MMU_SYNC both set.

But, it cause a issue that when we handle 'KVM_REQ_TLB_FLUSH', the
root is invalid, it is triggered during my test:

Kernel BUG at ffffffffa00212b8 [verbose debug info unavailable]
......

Fixed by directly return if the root is not ready.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:40:16 +03:00
Joerg Roedel
5689cc53fa KVM: Use u64 for frame data types
For 32bit machines where the physical address width is
larger than the virtual address width the frame number types
in KVM may overflow. Fix this by changing them to u64.

[sfr: fix build on 32-bit ppc]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-02 06:39:44 +03:00
Anatolij Gustschin
4b5006ec7b powerpc/5121: shared DIU framebuffer support
MPC5121 DIU configuration/setup as initialized by the boot
loader currently will get lost while booting Linux. As a
result displaying the boot splash is not possible through
the boot process.

To prevent this we reserve configured DIU frame buffer
address range while booting and preserve AOI descriptor
and gamma table so that DIU continues displaying through
the whole boot process. On first open from user space
DIU frame buffer driver releases the reserved frame
buffer area and continues to operate as usual.

Signed-off-by: John Rigby <jcrigby@gmail.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 17:06:44 -06:00
Anatolij Gustschin
9e2089cbed powerpc/512x: add clock structure for Video-IN (VIU) unit
Allows using clk_get()/clk_enable()/clk_disable() for VIU
clock in the v4l2 video driver.

Signed-off-by: Hongjun Chen <hong-jun.chen@freescale.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 17:06:44 -06:00
Anatolij Gustschin
12fb0eb4c9 powerpc/5121: add initial support for PDM360NG board
Adds IFM PDM360NG device tree and platform code.

Currently following is supported:
 - Spansion S29GL512P 256 MB NOR flash
 - ST Micro NAND 1 GiB flash
 - DIU, please use "fbcon=map:5 video=fslfb:800x480-32@60"
   at the kernel command line to enable PrimeView PM070WL3
   Display support.
 - FEC
 - I2C
 - RTC, EEPROM
 - MSCAN
 - PSC UART, please pass "console=tty0 console=ttyPSC5,115200"
   on the kernel command line.
 - SPI, ADS7845 Touchscreen
 - USB0/1 Host
 - USB0 OTG Host/Device
 - VIU, Overlay/Capture support

Signed-off-by: Markus Fischer <markus.fischer.ec@ifm.com>
Signed-off-by: Wolfgang Grandegger <wg@denx.de>
Signed-off-by: Michael Weiss <michael.weiss@ifm.com>
Signed-off-by: Detlev Zundel <dzu@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 17:06:44 -06:00
Anatolij Gustschin
1044ea8cfd powerpc/512x: Group mpc512x board's selection menu
Allow board selection in a drop-down board sub-menu
like many other platforms do.

Before the patch:
...
[ ] Freescale MPC5121E ADS
[ ] Generic support for simple MPC5121 based boards
[ ] 52xx-based boards
...

Patched:
...
[*] 512x-based boards
[ ]   Freescale MPC5121E ADS
[ ]   Generic support for simple MPC5121 based boards
[ ] 52xx-based boards
...

This is a cleanup before adding new board selection entry.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 17:06:44 -06:00
Joerg Roedel
828554136b KVM: Remove unnecessary divide operations
This patch converts unnecessary divide and modulo operations
in the KVM large page related code into logical operations.
This allows to convert gfn_t to u64 while not breaking 32
bit builds.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:30 +03:00
Alexander Graf
fef093bec0 KVM: PPC: Make use of hash based Shadow MMU
We just introduced generic functions to handle shadow pages on PPC.
This patch makes the respective backends make use of them, getting
rid of a lot of duplicate code along the way.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:28 +03:00
Alexander Graf
7741909bf1 KVM: PPC: Add generic hpte management functions
Currently the shadow paging code keeps an array of entries it knows about.
Whenever the guest invalidates an entry, we loop through that entry,
trying to invalidate matching parts.

While this is a really simple implementation, it is probably the most
ineffective one possible. So instead, let's keep an array of lists around
that are indexed by a hash. This way each PTE can be added by 4 list_add,
removed by 4 list_del invocations and the search only needs to loop through
entries that share the same hash.

This patch implements said lookup and exports generic functions that both
the 32-bit and 64-bit backend can use.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:27 +03:00
Xiao Guangrong
84754cd8fc KVM: MMU: cleanup FNAME(fetch)() functions
Cleanup this function that we are already get the direct sp's access

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:26 +03:00
Xiao Guangrong
9e7b0e7fba KVM: MMU: fix direct sp's access corrupted
If the mapping is writable but the dirty flag is not set, we will find
the read-only direct sp and setup the mapping, then if the write #PF
occur, we will mark this mapping writable in the read-only direct sp,
now, other real read-only mapping will happily write it without #PF.

It may hurt guest's COW

Fixed by re-install the mapping when write #PF occur.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:25 +03:00
Xiao Guangrong
5fd5387c89 KVM: MMU: fix conflict access permissions in direct sp
In no-direct mapping, we mark sp is 'direct' when we mapping the
guest's larger page, but its access is encoded form upper page-struct
entire not include the last mapping, it will cause access conflict.

For example, have this mapping:
        [W]
      / PDE1 -> |---|
  P[W]          |   | LPA
      \ PDE2 -> |---|
        [R]

P have two children, PDE1 and PDE2, both PDE1 and PDE2 mapping the
same lage page(LPA). The P's access is WR, PDE1's access is WR,
PDE2's access is RO(just consider read-write permissions here)

When guest access PDE1, we will create a direct sp for LPA, the sp's
access is from P, is W, then we will mark the ptes is W in this sp.

Then, guest access PDE2, we will find LPA's shadow page, is the same as
PDE's, and mark the ptes is RO.

So, if guest access PDE1, the incorrect #PF is occured.

Fixed by encode the last mapping access into direct shadow page

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:23 +03:00
Xiao Guangrong
36a2e6774b KVM: MMU: fix writable sync sp mapping
While we sync many unsync sp at one time(in mmu_sync_children()),
we may mapping the spte writable, it's dangerous, if one unsync
sp's mapping gfn is another unsync page's gfn.

For example:

SP1.pte[0] = P
SP2.gfn's pfn = P
[SP1.pte[0] = SP2.gfn's pfn]

First, we write protected SP1 and SP2, but SP1 and SP2 are still the
unsync sp.

Then, sync SP1 first, it will detect SP1.pte[0].gfn only has one unsync-sp,
that is SP2, so it will mapping it writable, but we plan to sync SP2 soon,
at this point, the SP2->unsync is not reliable since later we sync SP2 but
SP2->gfn is already writable.

So the final result is: SP2 is the sync page but SP2.gfn is writable.

This bug will corrupt guest's page table, fixed by mark read-only mapping
if the mapped gfn has shadow pages.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:22 +03:00
Sheng Yang
f5f48ee15c KVM: VMX: Execute WBINVD to keep data consistency with assigned devices
Some guest device driver may leverage the "Non-Snoop" I/O, and explicitly
WBINVD or CLFLUSH to a RAM space. Since migration may occur before WBINVD or
CLFLUSH, we need to maintain data consistency either by:
1: flushing cache (wbinvd) when the guest is scheduled out if there is no
wbinvd exit, or
2: execute wbinvd on all dirty physical CPUs when guest wbinvd exits.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:21 +03:00
Avi Kivity
3e00750947 KVM: Simplify vcpu_enter_guest() mmu reload logic slightly
No need to reload the mmu in between two different vcpu->requests checks.

kvm_mmu_reload() may trigger KVM_REQ_TRIPLE_FAULT, but that will be caught
during atomic guest entry later.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:47:19 +03:00
Chris Lalancette
529df65e39 KVM: Search the LAPIC's for one that will accept a PIC interrupt
Older versions of 32-bit linux have a "Checking 'hlt' instruction"
test where they repeatedly call the 'hlt' instruction, and then
expect a timer interrupt to kick the CPU out of halt.  This happens
before any LAPIC or IOAPIC setup happens, which means that all of
the APIC's are in virtual wire mode at this point.  Unfortunately,
the current implementation of virtual wire mode is hardcoded to
only kick the BSP, so if a crash+kexec occurs on a different
vcpu, it will never get kicked.

This patch makes pic_unlock() do the equivalent of
kvm_irq_delivery_to_apic() for the IOAPIC code.  That is, it runs
through all of the vcpus looking for one that is in virtual wire
mode.  In the normal case where LAPICs and IOAPICs are configured,
this won't be used at all.  In the bootstrap phase of a modern
OS, before the LAPICs and IOAPICs are configured, this will have
exactly the same behavior as today; VCPU0 is always looked at
first, so it will always get out of the loop after the first
iteration.  This will only go through the loop more than once
during a kexec/kdump, in which case it will only do it a few times
until the kexec'ed kernel programs the LAPIC and IOAPIC.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:17 +03:00
Takuya Yoshikawa
979586e0b5 KVM: ia64: cleanup kvm_ia64_sync_dirty_log()
kvm_ia64_sync_dirty_log() is a helper function for kvm_vm_ioctl_get_dirty_log()
which copies ia64's arch specific dirty bitmap to general one in memslot.
So doing sanity checks in this function is unnatural. We move these checks
outside of this and change the prototype appropriately.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:16 +03:00
Takuya Yoshikawa
4482b06c04 KVM: ia64: fix dirty_log_lock spin_lock section not to include get_dirty_log()
kvm_get_dirty_log() calls copy_to_user(). So we need to narrow the
dirty_log_lock spin_lock section not to include this.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:15 +03:00
Alexander Graf
4d29bdbf12 KVM: PPC: Make BAT only guest segments work
When a guest sets its SR entry to invalid, we may still find a
corresponding entry in a BAT. So we need to make sure we're not
faulting on invalid SR entries, but instead just claim them to be
BAT resolved.

This resolves breakage experienced when using libogc based guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:14 +03:00
Alexander Graf
3b249157c0 KVM: PPC: Use kernel hash function
The linux kernel already provides a hash function. Let's reuse that
instead of reinventing the wheel!

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:13 +03:00
Alexander Graf
a576f7a294 KVM: PPC: Remove obsolete kvmppc_mmu_find_pte
Initially we had to search for pte entries to invalidate them. Since
the logic has improved since then, we can just get rid of the search
function.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:12 +03:00
Sheng Yang
6c3f604117 KVM: x86: Enable AVX for guest
Enable Intel(R) Advanced Vector Extension(AVX) for guest.

The detection of AVX feature includes OSXSAVE bit testing. When OSXSAVE bit is
not set, even if AVX is supported, the AVX instruction would result in UD as
well. So we're safe to expose AVX bits to guest directly.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:10 +03:00
Avi Kivity
7ac77099ce KVM: Prevent internal slots from being COWed
If a process with a memory slot is COWed, the page will change its address
(despite having an elevated reference count).  This breaks internal memory
slots which have their physical addresses loaded into vmcs registers (see
the APIC access memory slot).

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:08 +03:00
Avi Kivity
a8eeb04a44 KVM: Add mini-API for vcpu->requests
Makes it a little more readable and hackable.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:05 +03:00
Avi Kivity
36633f32ba KVM: i8259: simplify pic_irq_request() calling sequence
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:04 +03:00
Avi Kivity
073d46133a KVM: i8259: reduce excessive abstraction for pic_irq_request()
Part of the i8259 code pretends it isn't part of kvm, but we know better.
Reduce excessive abstraction, eliminating callbacks and void pointers.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:03 +03:00
Avi Kivity
b74a07beed KVM: Remove kernel-allocated memory regions
Equivalent (and better) functionality is provided by user-allocated memory
regions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:01 +03:00
Avi Kivity
a1f4d39500 KVM: Remove memory alias support
As advertised in feature-removal-schedule.txt.  Equivalent support is provided
by overlapping memory regions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:00 +03:00
Christian Borntraeger
fc34531db3 KVM: s390: Don't exit SIE on SIGP sense running
Newer (guest) kernels use sigp sense running in their spinlock
implementation to check if the other cpu is running before yielding
the processor. This revealed some wrong guest settings, causing
unnecessary exits for every sigp sense running.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:59 +03:00
Christian Borntraeger
971eb77f87 KVM: s390: Fix build failure due to centralized vcpu locking patches
This patch fixes
ERROR: "__kvm_s390_vcpu_store_status" [arch/s390/kvm/kvm.ko] undefined!

triggered by
commit 3268c56840dcee78c3e928336550f4e1861504c4 (kvm.git)
Author: Avi Kivity <avi@redhat.com>
Date:   Thu May 13 12:21:46 2010 +0300
    KVM: s390: Centrally lock arch specific vcpu ioctls

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:58 +03:00
Avi Kivity
d1ac91d8a2 KVM: Consolidate load/save temporary buffer allocation and freeing
Instead of three temporary variables and three free calls, have one temporary
variable (with four names) and one free call.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:57 +03:00
Avi Kivity
a1a005f36e KVM: Fix xsave and xcr save/restore memory leak
We allocate temporary kernel buffers for these structures, but never free them.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:56 +03:00
Wei Yongjun
7d5993d63f KVM: x86 emulator: fix group3 instruction decoding
Group 3 instruction with ModRM reg field as 001 is
defined as test instruction under AMD arch, and
emulate_grp3() is ready for emulate it, so fix the
decoding.

static inline int emulate_grp3(...)
{
	...
	switch (c->modrm_reg) {
	case 0 ... 1:   /* test */
		emulate_2op_SrcV("test", c->src, c->dst, ctxt->eflags);
	...
}

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:55 +03:00
Asias He
6045be5dea KVM: PPC: fix uninitialized variable warning in kvm_ppc_core_deliver_interrupts
Fixes:
arch/powerpc/kvm/booke.c: In function 'kvmppc_core_deliver_interrupts':
arch/powerpc/kvm/booke.c:147: warning: 'msr_mask' may be used uninitialized in this function

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:53 +03:00
Chris Lalancette
e7dca5c0eb KVM: x86: Allow any LAPIC to accept PIC interrupts
If the guest wants to accept timer interrupts on a CPU other
than the BSP, we need to remove this gate.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:50 +03:00
Chris Lalancette
33572ac0ad KVM: x86: Introduce a workqueue to deliver PIT timer interrupts
We really want to "kvm_set_irq" during the hrtimer callback,
but that is risky because that is during interrupt context.
Instead, offload the work to a workqueue, which is a bit safer
and should provide most of the same functionality.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:49 +03:00
Wei Yongjun
c37eda1384 KVM: x86 emulator: fix pusha instruction emulation
emulate pusha instruction only writeback the last
EDI register, but the other registers which need
to be writeback is ignored. This patch fixed it.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:48 +03:00
Zachary Amsden
bd371396b3 KVM: x86: fix -DDEBUG oops
Fix a slight error with assertion in local APIC code.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:46 +03:00
Xiao Guangrong
1047df1fb6 KVM: MMU: don't walk every parent pages while mark unsync
While we mark the parent's unsync_child_bitmap, if the parent is already
unsynced, it no need walk it's parent, it can reduce some unnecessary
workload

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:45 +03:00
Xiao Guangrong
7a8f1a74e4 KVM: MMU: clear unsync_child_bitmap completely
In current code, some page's unsync_child_bitmap is not cleared completely
in mmu_sync_children(), for example, if two PDPEs shard one PDT, one of
PDPE's unsync_child_bitmap is not cleared.

Currently, it not harm anything just little overload, but it's the prepare
work for the later patch

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:44 +03:00
Xiao Guangrong
ebdea638df KVM: MMU: cleanup for __mmu_unsync_walk()
Decrease sp->unsync_children after clear unsync_child_bitmap bit

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:43 +03:00
Xiao Guangrong
be71e061d1 KVM: MMU: don't mark pte notrap if it's just sync transient
If the sync-sp just sync transient, don't mark its pte notrap

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:42 +03:00
Xiao Guangrong
f918b44352 KVM: MMU: avoid double write protected in sync page path
The sync page is already write protected in mmu_sync_children(), don't
write protected it again

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:41 +03:00
Xiao Guangrong
cb83cad2e7 KVM: MMU: cleanup for dirty page judgment
Using wrap function to cleanup page dirty judgment

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:39 +03:00
Xiao Guangrong
ac3cd03cca KVM: MMU: rename 'page' and 'shadow_page' to 'sp'
Rename 'page' and 'shadow_page' to 'sp' to better fit the context

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:38 +03:00
Sheng Yang
2d5b5a6655 KVM: x86: XSAVE/XRSTOR live migration support
This patch enable save/restore of xsave state.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:37 +03:00
Denis Kirjanov
69b61833f7 KVM: PPC: fix build warning in kvm_arch_vcpu_ioctl_run
Fix compile warning:
  CC [M]  arch/powerpc/kvm/powerpc.o
  arch/powerpc/kvm/powerpc.c: In function 'kvm_arch_vcpu_ioctl_run':
  arch/powerpc/kvm/powerpc.c:290: warning: 'gpr' may be used uninitialized in this function
  arch/powerpc/kvm/powerpc.c:290: note: 'gpr' was declared here

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:36 +03:00
Avi Kivity
2390218b6a KVM: Fix mov cr3 #GP at wrong instruction
On Intel, we call skip_emulated_instruction() even if we injected a #GP,
resulting in the #GP pointing at the wrong address.

Fix by injecting the exception and skipping the instruction at the same place,
so we can do just one or the other.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:35 +03:00
Avi Kivity
a83b29c6ad KVM: Fix mov cr4 #GP at wrong instruction
On Intel, we call skip_emulated_instruction() even if we injected a #GP,
resulting in the #GP pointing at the wrong address.

Fix by injecting the exception and skipping the instruction at the same place,
so we can do just one or the other.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:34 +03:00
Avi Kivity
49a9b07edc KVM: Fix mov cr0 #GP at wrong instruction
On Intel, we call skip_emulated_instruction() even if we injected a #GP,
resulting in the #GP pointing at the wrong address.

Fix by injecting the exception and skipping the instruction at the same place,
so we can do just one or the other.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:32 +03:00
Dexuan Cui
2acf923e38 KVM: VMX: Enable XSAVE/XRSTOR for guest
This patch enable guest to use XSAVE/XRSTOR instructions.

We assume that host_xcr0 would use all possible bits that OS supported.

And we loaded xcr0 in the same way we handled fpu - do it as late as we can.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:31 +03:00
Avi Kivity
f495c6e5e8 KVM: VMX: Fix incorrect rcu deref in rmode_tss_base()
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:30 +03:00
Andi Kleen
a24e809902 KVM: Fix unused but set warnings
No real bugs in this one.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:29 +03:00
Xiao Guangrong
3b5d132186 KVM: MMU: delay local tlb flush
delay local tlb flush until enter guest moden, it can reduce vpid flush
frequency and reduce remote tlb flush IPI(if KVM_REQ_TLB_FLUSH bit is
already set, IPI is not sent)

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:26 +03:00
Xiao Guangrong
5304efde6a KVM: MMU: use wrapper function to flush local tlb
Use kvm_mmu_flush_tlb() function instead of calling
kvm_x86_ops->tlb_flush(vcpu) directly.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:25 +03:00
Xiao Guangrong
4f78fd08e9 KVM: MMU: remove unnecessary remote tlb flush
This remote tlb flush is no necessary since we have synced while
sp is zapped

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:24 +03:00
Xiao Guangrong
4b9d3a0451 KVM: VMX: fix rcu usage warning in init_rmode()
fix:

[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
include/linux/kvm_host.h:258 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by qemu-system-x86/3796:
 #0:  (&vcpu->mutex){+.+.+.}, at: [<ffffffffa0217fd8>] vcpu_load+0x1a/0x66 [kvm]

stack backtrace:
Pid: 3796, comm: qemu-system-x86 Not tainted 2.6.34 #25
Call Trace:
 [<ffffffff81070ed1>] lockdep_rcu_dereference+0x9d/0xa5
 [<ffffffffa0214fdf>] gfn_to_memslot_unaliased+0x65/0xa0 [kvm]
 [<ffffffffa0216139>] gfn_to_hva+0x22/0x4c [kvm]
 [<ffffffffa0216217>] kvm_write_guest_page+0x2a/0x7f [kvm]
 [<ffffffffa0216286>] kvm_clear_guest_page+0x1a/0x1c [kvm]
 [<ffffffffa0278239>] init_rmode+0x3b/0x180 [kvm_intel]
 [<ffffffffa02786ce>] vmx_set_cr0+0x350/0x4d3 [kvm_intel]
 [<ffffffffa02274ff>] kvm_arch_vcpu_ioctl_set_sregs+0x122/0x31a [kvm]
 [<ffffffffa021859c>] kvm_vcpu_ioctl+0x578/0xa3d [kvm]
 [<ffffffff8106624c>] ? cpu_clock+0x2d/0x40
 [<ffffffff810f7d86>] ? fget_light+0x244/0x28e
 [<ffffffff810709b9>] ? trace_hardirqs_off_caller+0x1f/0x10e
 [<ffffffff8110501b>] vfs_ioctl+0x32/0xa6
 [<ffffffff81105597>] do_vfs_ioctl+0x47f/0x4b8
 [<ffffffff813ae654>] ? sub_preempt_count+0xa3/0xb7
 [<ffffffff810f7da8>] ? fget_light+0x266/0x28e
 [<ffffffff810f7c53>] ? fget_light+0x111/0x28e
 [<ffffffff81105617>] sys_ioctl+0x47/0x6a
 [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:23 +03:00
Gui Jianfeng
1760dd4939 KVM: VMX: rename vpid_sync_vcpu_all() to vpid_sync_vcpu_single()
The name "pid_sync_vcpu_all" isn't appropriate since it just affect
a single vpid, so rename it to vpid_sync_vcpu_single().

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:16 +03:00
Gui Jianfeng
b9d762fa79 KVM: VMX: Add all-context INVVPID type support
Add all-context INVVPID type support.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:04 +03:00
Xiao Guangrong
0671a8e75d KVM: MMU: reduce remote tlb flush in kvm_mmu_pte_write()
collect remote tlb flush in kvm_mmu_pte_write() path

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:28 +03:00
Xiao Guangrong
f41d335a02 KVM: MMU: traverse sp hlish safely
Now, we can safely to traverse sp hlish

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:28 +03:00
Xiao Guangrong
d98ba05365 KVM: MMU: gather remote tlb flush which occurs during page zapped
Using kvm_mmu_prepare_zap_page() and kvm_mmu_zap_page() instead of
kvm_mmu_zap_page() that can reduce remote tlb flush IPI

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong
103ad25a86 KVM: MMU: don't get free page number in the loop
In the later patch, we will modify sp's zapping way like below:

	kvm_mmu_prepare_zap_page A
	kvm_mmu_prepare_zap_page B
	kvm_mmu_prepare_zap_page C
	....
	kvm_mmu_commit_zap_page

[ zaped multiple sps only need to call kvm_mmu_commit_zap_page once ]

In __kvm_mmu_free_some_pages() function, the free page number is
getted form 'vcpu->kvm->arch.n_free_mmu_pages' in loop, it will
hinders us to apply kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page()
since kvm_mmu_prepare_zap_page() not free sp.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong
7775834a23 KVM: MMU: split the operations of kvm_mmu_zap_page()
Using kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page() to
split kvm_mmu_zap_page() function, then we can:

- traverse hlist safely
- easily to gather remote tlb flush which occurs during page zapped

Those feature can be used in the later patches

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong
7ae680eb2d KVM: MMU: introduce some macros to cleanup hlist traverseing
Introduce for_each_gfn_sp() and for_each_gfn_indirect_valid_sp() to
cleanup hlist traverseing

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong
03116aa57e KVM: MMU: skip invalid sp when unprotect page
In kvm_mmu_unprotect_page(), the invalid sp can be skipped

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:26 +03:00