When growing halt-polling, there is no check that the poll time exceeds
the limit. It's possible for vcpu->halt_poll_ns grow once past
halt_poll_ns, and stay there until a halt which takes longer than
vcpu->halt_poll_ns. For example, booting a Linux guest with
halt_poll_ns=11000:
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 0 (shrink 10000)
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (grow 0)
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000)
Signed-off-by: David Matlack <dmatlack@google.com>
Fixes: aca6ff29c4
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Basic GICv3 ACPI support
- Alpine MSI widget on top of GICv3
- More RealView GIC support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW3++eAAoJECPQ0LrRPXpDBoMP/R8WOgjk1mbwiICjCdKs8n4f
QA0BszmPTYSnwnbjqxCEgoWmDkpPGurODjyt6FQeaZFj/bhWxKTsFVcOBslFkXcc
8EB9Avx4N/zEI9VOFuQoiKLIZwSldBNJjgGPSU7lQW2pVUUyC+BmJvcLwyP76R4H
21WBjf9egHYTm1I7kWKoLqgrx993uIfSmwILV1lHuyF5Ubhw1AuNoMsvEFLGY76v
ZWNfyAud7xOnbu3kNXnryKO4B8vS1q6o23yqFlxXA6HYWL9bTpWi0bseZs7G/MyF
LEBN9CGukaGwGokbE8G0AxtS/c6fz2Uy51xFQRhL+EN3VDrnHbWae5Ys5B7PWgIT
DObzVGHkm2V1Nl2jNkwXeeNIrr/APRMDBVJcA5tzXHjcIntHn2ez8Fg0Npo+Zv9G
q8m55qbpssSQLKKu7pXG3T3fK59EghTsuFpTRSwP4QtIdyI1yxcgqaXg73s2vOFv
DnCxzlXBQ9ia3QqJKY0lNd/QBUoqueGC93TijeBXawSdBhSHp0SXsUp5g99ws0sF
fZTzB8dYOB2ZCJc2UW2yNlpMCBPEwCkggQ6M6NISpZLwt3wOyYhTpeBGzSk4j757
AssL9BQf5v2rBkJR93bNlIzBVCa7V0DADgAR5w+uYP4zTyt8tFg1YAoz1lY0435W
xQM6Ubx4/xID4Go2sR/Q
=bzNQ
-----END PGP SIGNATURE-----
Merge tag 'gic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull GIC updates for 4.6 from Marc Zyngier:
- Basic GICv3 ACPI support
- Alpine MSI widget on top of GICv3
- More RealView GIC support
The following upcoming upstream commit:
92b0729c34 ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Adds _ASM_EXTABLE_FAULT(), which is not available in user-space
and breaks the build.
We don't really need _ASM_EXTABLE_FAULT() in user-space, so simply
wrap it to nothing.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the add-on file for the GIC dealing with the RealView family
we currently only handle the PB11MPCore, let's extend this to
manage the RealView EB ARM11MPCore as well. The Revision B of the
ARM11MPCore core tile is a bit special and needs special handling
as it moves a system control register around at random.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Following the addition of the Alpine MSIX driver, this patch adds the
corresponding bindings documentation.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Tsahee Zidenberg <tsahee@annapurnalabs.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This patch adds the Alpine MSIX interrupt controller driver.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Tsahee Zidenberg <tsahee@annapurnalabs.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Always return IRQ_SET_MASK_OK_DONE instead of IRQ_SET_MASK_OK when the
affinity has been updated. When using stacked irqchips, returning
IRQ_SET_MASK_OK_DONE means skipping all descendant irqchips.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
adding unmap of sources and destinations while doing dequeue.
Signed-off-by: Xuelin Shi <xuelin.shi@nxp.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
- Fix ipu probe if optional port nodes are not present in the device tree
- Reset the ipu before initializing interrupts, not thereafter
- Notify DRM core about the state of vblank interrupts
- Add missing RGB565 format to the list of plate formats
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWxs28AAoJEFDCiBxwnmDre0IP/3f3EaTmFP8bSzCW4rtEMn26
t5+frIRg/y7waM/BC5iK9jbi98oqKGdWB2kiaA8uS4jowS3AdD8wHN/9/W20onBF
BVrNbMELRSNqOTGxgjouO7FB/2FwvlKRZzFJlIQ4Xci/PFDkyVZ+Bhva+4L4kvNg
iHanmUM3nVsgCukefvkboVUJTfAkMsT1jd64Z8UDa+Bl6zsVsnzIUtHQKOzD6lwk
Pq+/d6l4B9dFbZLkhITI+A3N7GbjJ20SEjn+LOYehjTz8B3BWw9K5BvWndr2x6yX
/hTYSE2nVdsGOdZrwuYZY2pOocLvApme7raB8CSgsXzktZ+BBbDprsxae08ecGEg
qyekFuVGTrMzvpuBkMk16Phh6g7F2oTNzJ9mdR3c9J3Eoolc3d7ZVN9ZUtjqiCk5
XArrQMG9WbW9e6pFWIc3tpYRfybNN7oIEUAJnI7VKpeh56osIUXTCtXreyVxBDu6
ycKE4ZNsX2Fu+XhDk2enMtOdmNm32x1waR3vbKdIfJY0fjSd+N9FOt5SUXNlwzSX
NhLOLPYRorQ7cj72RvavEQCAtOywsoHWRLwdMDj4BAND1o4KcXzANb2ZwGH8awMk
beG8+0ZKM3HWzkhrpqti7StCrhWQNlmfkUgTu6/PX++m+SplII27cScKqFQZuFWm
M4YCzksuVhO6dgAT05zt
=9nEz
-----END PGP SIGNATURE-----
Merge tag 'imx-drm-fixes-2016-02-19' of git://git.pengutronix.de/git/pza/linux into drm-fixes
ipu-v3 probe and imx-drm crtc and plane fixes
- Fix ipu probe if optional port nodes are not present in the device tree
- Reset the ipu before initializing interrupts, not thereafter
- Notify DRM core about the state of vblank interrupts
- Add missing RGB565 format to the list of plate formats
* tag 'imx-drm-fixes-2016-02-19' of git://git.pengutronix.de/git/pza/linux:
drm/imx: Add missing DRM_FORMAT_RGB565 to ipu_plane_formats
drm/imx: notify DRM core about CRTC vblank state
gpu: ipu-v3: Reset IPU before activating IRQ
gpu: ipu-v3: Do not bail out on missing optional port nodes
radeon and amdgpu fixes for 4.5. Three regression fixes and
some fixups for the error handling in the vblank regression fixes
from earlier.
* 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux:
Revert "drm/radeon/pm: adjust display configuration after powerstate"
drm/amdgpu/dp: add back special handling for NUTMEG
drm/radeon/dp: add back special handling for NUTMEG
drm/radeon: Fix error handling in radeon_flip_work_func.
drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
gicv3_init_bases() is the only caller for its_init(),
also it is a __init function, so mark its_init() as __init too,
then recursively mark the functions called as __init.
This will help to introduce ITS initialization using ACPI tables as
we will use acpi_table_parse_entries family functions there which
belong to __init section as well.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The gic_root_node variable defined in ITS driver is not actually
used, so just remove it.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Following ACPI spec:
On systems supporting GICv3 and above, GICR Base Address in MADT GICC
structure holds the 64-bit physical address of the associated Redistributor.
If all of the GIC Redistributors are in the always-on power domain,
GICR structures should be used to describe the Redistributors instead,
and this field must be set to 0.
It means that we have two ways to initialize registirbutors map.
1. via GICD structure which can accommodate many redistributors as a region
2. via GICC which is able to describe single redistributor
This patch is going to add support for second option.
Considering redistributors, GICD and GICC subtables have be mutually
exclusive. While discovering and mapping redistributor, we need to know
its size in advance. For the GICC case, redistributor can be in
a power-domain that is off, thus we cannot relay on GICR TYPER register.
Therefore, we get GIC version from distributor register and map 2xSZ_64K
for GICv3 and 4xSZ_64K for GICv4.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
With the refator of gic_of_init(), GICv3/4 can be initialized
by gic_init_bases() with gic distributor base address and gic
redistributor region(s).
So get the redistributor region base addresses from MADT GIC
redistributor subtable, and the distributor base address from
GICD subtable to init GICv3 irqchip in ACPI way.
Note: GIC redistributor base address may also be provided in
GICC structures on systems supporting GICv3 and above if the GIC
Redistributors are not in the always-on power domain, this
patch didn't implement such feature yet.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Isolate hardware abstraction (FDT) code to gic_of_init().
Rest of the logic goes to gic_init_bases() and expects well
defined data to initialize GIC properly. The same solution
is used for GICv2 driver.
This is needed for ACPI initialization later.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
ACPICA commit eade8f78f2aa21e8eabc3380a5728db47273bcf1
Revert commit ae90fbf562 (ACPICA: Parser: Fix for SuperName method
invocation).
Support for method invocations as part of super_name will be
removed from the ACPI specification, since no AML interpreter
supports it.
Fixes: ae90fbf562 (ACPICA: Parser: Fix for SuperName method invocation)
Link: https://github.com/acpica/acpica/commit/eade8f78
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It's always an ambivalent feeling to send a large pull request at the
late stage like this, especially when most of patches came from me.
In anyway, this is a collection of lots of small fixes that slipped
from the previous pull request.
All fixes are about ASoC, and the majority of changes are corrections
of the wrong access types in ALSA ctl enum items. They are mostly
harmless on 32bit architectures, but actually buggy on 64bit. So we
addressed all these now in a shot. The rest are various small ASoC
driver fixes.
Among them, only two changes have been done to ASoC core, and both of
them are trivial. The rest are all device-specific. So overall, they
should be safe to apply.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJW3rMXAAoJEGwxgFQ9KSmkVycP/3413WwXgXsqvAdbHiqvJQGf
B+KQw/tN7qXhut7+mubuOk+eKYpDH3Jj2hokb/BFH5AHADzOMdBfCnLvoewZUrTJ
GURasd0JvMmaoZaIB9Nk1G6QEaiobZLJjsjKLdaAu3yeQOyo2FDghiWAVkDMzMV8
73v+eStoGeDQX41vWnV63747Zpfd80q5c5qudKR0FMGphzkV0j7JUEkVNWiedMfL
MjchPbLBlxK6CDJh+cbjKMZK2Y/h+j05b4oLdqwdt98z00RJP4vJTJ3bSWyDyqil
HZb2F4Ugsv1nI9sQ8nHLb7PH4/u45PKuBxILNv7mGfS1WPOCuV7j+PHyMOBIr97P
seS38DjukIDU1Q+zpar+p5v3/PrshQuxKknwrI/Z+tRMNWPurbgSIWNeNOIUNIoF
HTg/pETlwr4zLMBy78lTot+7NqDJwLit1Yl4tI1+7ac0wSycJ4yYwWXv5mr++23G
QZxXznJELhhpYhUKT/b804STu2bT3dpSJxUYe/EYApYgoDx3TxWRhWhbKaILzPH9
8EYLJ3Xgd7AZcMsEpC5R2SNKRMnQvPQnnDncwcNUSA/bju8H0eqThvRs2n30xc2q
9Ris9m0iLJAXwi/bVcxwMvJddfzLSL1DWPBkpQsJ2yWmux9lohUA/cFXWDrGiDWs
0L6G1f8mH8iQuLmQcHWi
=GmQ0
-----END PGP SIGNATURE-----
Merge tag 'sound-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"It's always an ambivalent feeling to send a large pull request at the
late stage like this, especially when most of patches came from me.
Anyway, this is a collection of lots of small fixes that slipped from
the previous pull request.
All fixes are about ASoC, and the majority of changes are corrections
of the wrong access types in ALSA ctl enum items. They are mostly
harmless on 32bit architectures, but actually buggy on 64bit. So we
addressed all these now in a shot. The rest are various small ASoC
driver fixes.
Among them, only two changes have been done to ASoC core, and both of
them are trivial. The rest are all device-specific. So overall, they
should be safe to apply"
* tag 'sound-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits)
ASoC: wm_adsp: Fix enum ctl accesses in a wrong type
ASoC: wm9081: Fix enum ctl accesses in a wrong type
ASoC: wm8996: Fix enum ctl accesses in a wrong type
ASoC: wm8994: Fix enum ctl accesses in a wrong type
ASoC: wm8985: Fix enum ctl accesses in a wrong type
ASoC: wm8983: Fix enum ctl accesses in a wrong type
ASoC: wm8958: Fix enum ctl accesses in a wrong type
ASoC: wm8904: Fix enum ctl accesses in a wrong type
ASoC: wm8753: Fix enum ctl accesses in a wrong type
ASoC: wl1273: Fix enum ctl accesses in a wrong type
ASoC: tlv320dac33: Fix enum ctl accesses in a wrong type
ASoC: max98095: Fix enum ctl accesses in a wrong type
ASoC: max98088: Fix enum ctl accesses in a wrong type
ASoC: ab8500: Fix enum ctl accesses in a wrong type
ASoC: da732x: Fix enum ctl accesses in a wrong type
ASoC: cs42l51: Fix enum ctl accesses in a wrong type
ASoC: intel: mfld: Fix enum ctl accesses in a wrong type
ASoC: omap: rx51: Fix enum ctl accesses in a wrong type
ASoC: omap: n810: Fix enum ctl accesses in a wrong type
ASoC: pxa: tosa: Fix enum ctl accesses in a wrong type
...
during the merge window.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW3cVJAAoJEBLB8Bhh3lVKPxMP/0BKTlS7mp7FRjEhnUQPorEb
sW20KtpC372uD5eVREbC9dUA1q9FGQacOKs4sZvSZUxj8Nx+BGEkpTLH+sHwKnk0
mfI0roq0oWcXGWb19A9oI506V94Z0ejs2p1aEBA/iidUjpu1RoR1ZhOPMmkE6+PS
gMWJj10xonodP5y8e3ZlTg+2dDohdy1KA6GKbLr1145cQLvH6p01Sf+cvNncZwMk
r22e89I8CVSn96kXOuyiu2mXZ+9sDCkDvwc/uYECs34XfhGAoVJ6Qy3xy2ZvaakH
t6Idc4Z5eVbvXKErfRzrUbj12qBQ2zIs2h8feVtmyNM57Awjk/XQkTncBp3Kb5XG
szB/pPFM2DEffNMOeXGE1ZJyZVuErhVbcsZYPmBdXKITeUuvzAbOF47qZpXdsJZr
d65BzKCEHyoV7+M3lRti13KAestPQupvS7b++qyLhSnqKFP5pacVJPyhNDkMPDic
wSXiu5z7IdjJgH7GwTmcvMHrI1vwL2nlAqanfQNNYVAzGzJ2GSUm9ClK6KGtaHY9
QUtk8VGw3ZlemU+PT0PghmrgcPnGRbVLyaH/ufEb+IE3F1+9i0f1ZpkcHF42faMq
GKFMXCdQPfpy67y6oGql5PmMrwbaDwXSn2atoPb6gbbeJWTlSf0TnfRCeJe/S3/7
eW9geMHYK4QgTIuWodYp
=le3x
-----END PGP SIGNATURE-----
Merge tag 'edac_fix_for_4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC fix from Borislav Petkov:
"Last minute fix for sb_edac which fixes DIMM detection on certain Xeon
Phi configurations:
A single fix to the Xeon Phi section of sb_edac. The issue was
introduced during this merge window"
* tag 'edac_fix_for_4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, sb_edac: Fix logic when computing DIMM sizes on Xeon Phi
Make use of the EXTABLE_FAULT exception table entries to write
a kernel copy routine that doesn't crash the system if it
encounters a machine check. Prime use case for this is to copy
from large arrays of non-volatile memory used as storage.
We have to use an unrolled copy loop for now because current
hardware implementations treat a machine check in "rep mov"
as fatal. When that is fixed we can simplify.
Return type is a "bool". True means that we copied OK, false means
that it didn't.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@gmail.com>
Link: http://lkml.kernel.org/r/a44e1055efc2d2a9473307b22c91caa437aa3f8b.1456439214.git.tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Drop the quirk() function pointer in favor of a simple boolean which
says whether the quirk should be applied or not. Update comment while at
it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160308164041.GF16568@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When I fixed the dp rate selection in:
3b73b168cffd9c392584d3f665021fa2190f8612
drm/amdgpu: fix dp link rate selection (v2)
I accidently dropped the special handling for NUTMEG
DP bridge chips. They require a fixed link rate.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
When I fixed the dp rate selection in:
092c96a8ab
drm/radeon: fix dp link rate selection (v2)
I accidently dropped the special handling for NUTMEG
DP bridge chips. They require a fixed link rate.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Ken Moffat <zarniwhoop@ntlworld.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Commit e91467ecd1 ("bug in futex unqueue_me") introduced a barrier() in
unqueue_me() to prevent the compiler from rereading the lock pointer which
might change after a check for NULL.
Replace the barrier() with a READ_ONCE() for the following reasons:
1) READ_ONCE() is a weaker form of barrier() that affects only the specific
load operation, while barrier() is a general compiler level memory barrier.
READ_ONCE() was not available at the time when the barrier was added.
2) Aside of that READ_ONCE() is descriptive and self explainatory while a
barrier without comment is not clear to the casual reader.
No functional change.
[ tglx: Massaged changelog ]
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Cc: dave@stgolabs.net
Cc: peterz@infradead.org
Cc: linux@rasmusvillemoes.dk
Cc: akpm@linux-foundation.org
Cc: fengguang.wu@intel.com
Cc: bigeasy@linutronix.de
Link: http://lkml.kernel.org/r/1457314344-5685-1-git-send-email-nasa4836@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It no longer has any users.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
x86_64 has very clean espfix handling on paravirt: espfix64 is set
up in native_iret, so paravirt systems that override iret bypass
espfix64 automatically. This is robust and straightforward.
x86_32 is messier. espfix is set up before the IRET paravirt patch
point, so it can't be directly conditionalized on whether we use
native_iret. We also can't easily move it into native_iret without
regressing performance due to a bizarre consideration. Specifically,
on 64-bit kernels, the logic is:
if (regs->ss & 0x4)
setup_espfix;
On 32-bit kernels, the logic is:
if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 &&
(regs->flags & X86_EFLAGS_VM) == 0)
setup_espfix;
The performance of setup_espfix itself is essentially irrelevant, but
the comparison happens on every IRET so its performance matters. On
x86_64, there's no need for any registers except flags to implement
the comparison, so we fold the whole thing into native_iret. On
x86_32, we don't do that because we need a free register to
implement the comparison efficiently. We therefore do espfix setup
before restoring registers on x86_32.
This patch gets rid of the explicit paravirt_enabled check by
introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE
to skip espfix on paravirt systems where iret != native_iret. This is
also messy, but it's at least in line with other things we do.
This improves espfix performance by removing a branch, but no one
cares. More importantly, it removes a paravirt_enabled user, which is
good because paravirt_enabled is ill-defined and is going away.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull nohz enhancements from Frederic Weisbecker:
"Currently in nohz full configs, the tick dependency is checked
asynchronously by nohz code from interrupt and context switch for each
concerned subsystem with a set of function provided by these. Such
functions are made of many conditions and details that can be heavyweight
as they are called on fastpath: sched_can_stop_tick(),
posix_cpu_timer_can_stop_tick(), perf_event_can_stop_tick()...
Thomas suggested a few months ago to make that tick dependency check
synchronous. Instead of checking subsystems details from each interrupt
to guess if the tick can be stopped, every subsystem that may have a tick
dependency should set itself a flag specifying the state of that
dependency. This way we can verify if we can stop the tick with a single
lightweight mask check on fast path.
This conversion from a pull to a push model to implement tick dependency
is the core feature of this patchset that is split into:
* Nohz wide kick simplification
* Improve nohz tracing
* Introduce tick dependency mask
* Migrate scheduler, posix timers, perf events and sched clock tick
dependencies to the tick dependency mask."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ignore_nmis is used in two distinct places:
1. modified through {stop,restart}_nmi by alternative_instructions
2. read by do_nmi to determine if default_do_nmi should be called or not
thus the access pattern conforms to __read_mostly and do_nmi() is a fastpath.
Signed-off-by: Kostenzer Felix <fkostenzer@live.at>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With MACHINE_HAS_VX, we convert the floating point registers from the
vector registeres when storing the status. For other VCPUs, these are
stored to vcpu->run->s.regs.vrs, but we are using current->thread.fpu.vxrs,
which resolves to the currently loaded VCPU.
So kvm_s390_store_status_unloaded() currently writes the wrong floating
point registers (converted from the vector registers) when called from
another VCPU on a z13.
This is only the case for old user space not handling SIGP STORE STATUS and
SIGP STOP AND STORE STATUS, but relying on the kernel implementation. All
other calls come from the loaded VCPU via kvm_s390_store_status().
Fixes: 9abc2a08a7 (KVM: s390: fix memory overwrites when vx is disabled)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
would crash if you decided to run a host command that uses PEBS, like
perf record -e 'cpu/mem-stores/pp' -a
This happens because KVM is using VMX MSR switching to disable PEBS, but
SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
isn't safe:
When software needs to reconfigure PEBS facilities, it should allow a
quiescent period between stopping the prior event counting and setting
up a new PEBS event. The quiescent period is to allow any latent
residual PEBS records to complete its capture at their previously
specified buffer address (provided by IA32_DS_AREA).
There might not be a quiescent period after the MSR switch, so a CPU
ends up using host's MSR_IA32_DS_AREA to access an area in guest's
memory. (Or MSR switching is just buggy on some models.)
The guest can learn something about the host this way:
If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
in #PF where we leak host's MSR_IA32_DS_AREA through CR2.
After that, a malicious guest can map and configure memory where
MSR_IA32_DS_AREA is pointing and can therefore get an output from
host's tracing.
This is not a critical leak as the host must initiate with PEBS tracing
and I have not been able to get a record from more than one instruction
before vmentry in vmx_vcpu_run() (that place has most registers already
overwritten with guest's).
We could disable PEBS just few instructions before vmentry, but
disabling it earlier shouldn't affect host tracing too much.
We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
optimization isn't worth its code, IMO.
(If you are implementing PEBS for guests, be sure to handle the case
where both host and guest enable PEBS, because this patch doesn't.)
Fixes: 26a4f3c08d ("perf/x86: disable PEBS on a guest entry.")
Cc: <stable@vger.kernel.org>
Reported-by: Jiří Olša <jolsa@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The callers of steal_account_process_tick() expect it to return
whether a jiffy should be considered stolen or not.
Currently the return value of steal_account_process_tick() is in
units of cputime, which vary between either jiffies or nsecs
depending on CONFIG_VIRT_CPU_ACCOUNTING_GEN.
If cputime has nsecs granularity and there is a tiny amount of
stolen time (a few nsecs, say) then we will consider the entire
tick stolen and will not account the tick on user/system/idle,
causing /proc/stats to show invalid data.
The fix is to change steal_account_process_tick() to accumulate
the stolen time and only account it once it's worth a jiffy.
(Thanks to Frederic Weisbecker for suggestions to fix a bug in my
first version of the patch.)
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/56DBBDB8.40305@mail.usask.ca
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The dl_new field of struct sched_dl_entity is currently used to
identify new deadline tasks, so that their deadline and runtime
can be properly initialised.
However, these tasks can be easily identified by checking if
their deadline is smaller than the current time when they switch
to SCHED_DEADLINE. So, dl_new can be removed by introducing this
check in switched_to_dl(); this allows to simplify the
SCHED_DEADLINE code.
Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457350024-7825-2-git-send-email-luca.abeni@unitn.it
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jiri reported some time ago that some entries in the PEBS data source table
in perf do not agree with the SDM. We investigated and the bits
changed for Sandy Bridge, but the SDM was not updated.
perf already implements the bits correctly for Sandy Bridge
and later. This patch patches it up for Nehalem and Westmere.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds a Broadwell specific PEBS event constraint table.
Broadwell has a fix for the HT corruption bug erratum HSD29 on
Haswell. Therefore, there is no need to mark events 0xd0, 0xd1, 0xd2,
0xd3 has requiring the exclusive mode across both sibling HT threads.
This holds true for regular counting and sampling (see core.c) and
PEBS (ds.c) which we fix in this patch.
In doing so, we relax evnt scheduling for these events, they can now
be programmed on any 4 counters without impacting what is measured on
the sibling thread.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@redhat.com
Cc: adrian.hunter@intel.com
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/1457034642-21837-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on
Haswell, Broadwell and Skylake processors when using PEBS.
The SDM stipulates that when the PEBS iterrupt threshold is crossed,
an interrupt is posted and the kernel is interrupted. The kernel will
find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to
drain. But the bits corresponding to the actual counters should NOT be
set. The kernel follows the SDM and assumes that all PEBS events are
processed in the drain_pebs() callback. The kernel then checks for
remaining overflows on any other (non-PEBS) events and processes these
in the for_each_bit_set(&status) loop.
As it turns out, under certain conditions on HSW and later processors,
on PEBS buffer interrupt, bit 62 is set but the counter bits may be
set as well. In that case, the kernel drains PEBS and generates
SAMPLES with the EXACT tag, then it processes the counter bits, and
generates normal (non-EXACT) SAMPLES.
I ran into this problem by trying to understand why on HSW sampling on
a PEBS event was sometimes returning SAMPLES without the EXACT tag.
This should not happen on user level code because HSW has the
eventing_ip which always point to the instruction that caused the
event.
The workaround in this patch simply ensures that the bits for the
counters used for PEBS events are cleared after the PEBS buffer has
been drained. With this fix 100% of the PEBS samples on my user code
report the EXACT tag.
Before:
$ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
$ perf report -D | fgrep SAMPLES
PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
\--- EXACT tag is missing
After:
$ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
$ perf report -D | fgrep SAMPLES
PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
\--- EXACT tag is set
The problem tends to appear more often when multiple PEBS events are used.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds a definition for GLOBAL_OVFL_STATUS bit 55
which is used with the Processor Trace (PT) feature.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/1457034642-21837-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch tries to fix a PEBS warning found in my stress test. The
following perf command can easily trigger the pebs warning or spurious
NMI error on Skylake/Broadwell/Haswell platforms:
sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a
Also the NMI watchdog must be enabled.
For this case, the events number is larger than counter number. So
perf has to do multiplexing.
In perf_mux_hrtimer_handler, it does perf_pmu_disable(), schedule out
old events, rotate_ctx, schedule in new events and finally
perf_pmu_enable().
If the old events include precise event, the MSR_IA32_PEBS_ENABLE
should be cleared when perf_pmu_disable(). The MSR_IA32_PEBS_ENABLE
should keep 0 until the perf_pmu_enable() is called and the new event is
precise event.
However, there is a corner case which could restore PEBS_ENABLE to
stale value during the above period. In perf_pmu_disable(), GLOBAL_CTRL
will be set to 0 to stop overflow and followed PMI. But there may be
pending PMI from an earlier overflow, which cannot be stopped. So even
GLOBAL_CTRL is cleared, the kernel still be possible to get PMI. At
the end of the PMI handler, __intel_pmu_enable_all() will be called,
which will restore the stale values if old events haven't scheduled
out.
Once the stale pebs value is set, it's impossible to be corrected if
the new events are non-precise. Because the pebs_enabled will be set
to 0. x86_pmu.enable_all() will ignore the MSR_IA32_PEBS_ENABLE
setting. As a result, the following NMI with stale PEBS_ENABLE
trigger pebs warning.
The pending PMI after enabled=0 will become harmless if the NMI handler
does not change the state. This patch checks cpuc->enabled in pmi and
only restore the state when PMU is active.
Here is the dump:
Call Trace:
<NMI> [<ffffffff813c3a2e>] dump_stack+0x63/0x85
[<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0
[<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20
[<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320
[<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
[<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330
[<ffffffff811f2f11>] ? unmap_kernel_range_noflush+0x11/0x20
[<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0
[<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170
[<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50
[<ffffffff810310b9>] nmi_handle+0x69/0x120
[<ffffffff810316f6>] default_do_nmi+0xe6/0x100
[<ffffffff810317f2>] do_nmi+0xe2/0x130
[<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
<<EOE>> <IRQ> [<ffffffff81006df8>] ? x86_perf_event_set_period+0xd8/0x180
[<ffffffff81006eec>] x86_pmu_start+0x4c/0x100
[<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300
[<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10
[<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280
[<ffffffff8119c970>] ? __perf_install_in_context+0xc0/0xc0
[<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280
[<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190
[<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
[<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60
[<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50
[<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0
<EOI> [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
[<ffffffff81123de5>] ? smp_call_function_single+0xd5/0x130
[<ffffffff81123ddb>] ? smp_call_function_single+0xcb/0x130
[<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
[<ffffffff8119765a>] event_function_call+0x10a/0x120
[<ffffffff8119c660>] ? ctx_resched+0x90/0x90
[<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30
[<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
[<ffffffff8119772b>] _perf_event_enable+0x5b/0x70
[<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0
[<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
[<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0
[<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0
[<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0
[<ffffffff81036d29>] ? sched_clock+0x9/0x10
[<ffffffff8124a919>] SyS_ioctl+0x79/0x90
[<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
---[ end trace aef202839fe9a71d ]---
Uhhuh. NMI received for unknown reason 2d on CPU 2.
Do you have a strange power saving mode enabled?
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com
[ Fixed various typos and other small details. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Using PAGE_SIZE buffers makes the WRMSR to PERF_GLOBAL_CTRL in
intel_pmu_enable_all() mysteriously hang on Core2. As a workaround, we
don't do this.
The hard lockup is easily triggered by running 'perf test attr'
repeatedly. Most of the time it gets stuck on sample session with
small periods.
# perf test attr -vv
14: struct perf_event_attr setup :
--- start ---
...
'PERF_TEST_ATTR=/tmp/tmpuEKz3B /usr/bin/perf record -o /tmp/tmpuEKz3B/perf.data -c 123 kill >/dev/null 2>&1' ret 1
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160301190352.GA8355@krava.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The error path in perf_event_open() is such that asking for a sampling
event on a PMU that doesn't generate interrupts will end up in dropping
the perf_sched_count even though it hasn't been incremented for this
event yet.
Given a sufficient amount of these calls, we'll end up disabling
scheduler's jump label even though we'd still have active events in the
system, thereby facilitating the arrival of the infernal regions upon us.
I'm fixing this by moving account_event() inside perf_event_alloc().
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In an attempt to aid in understanding of what the threshold_block
structure holds, provide comments to describe the members here. Also,
trim comments around threshold_restart_bank() and update copyright info.
No functional change is introduced.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Shorten comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-6-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Deferred errors indicate errors that hardware could not fix. But it
still does not cause any interruption to program flow. So it does not
generate any #MC and UC bit in MCx_STATUS is not set.
Correct comment.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-5-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In upcoming processors, the BLKPTR field is no longer used to indicate
the MSR number of the additional register. Insted, it simply indicates
the prescence of additional MSRs.
Fix the logic here to gather MSR address from MSR_AMD64_SMCA_MCx_MISC()
for newer processors and fall back to existing logic for older
processors.
[ Drop nextaddr_out label; style cleanups. ]
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-4-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For Scalable MCA enabled processors, errors are listed per IP block. And
since it is not required for an IP to map to a particular bank, we need
to use HWID and McaType values from the MCx_IPID register to figure out
which IP a given bank represents.
We also have a new bit (TCC) in the MCx_STATUS register to indicate Task
context is corrupt.
Add logic here to decode errors from all known IP blocks for Fam17h
Model 00-0fh and to print TCC errors.
[ Minor fixups. ]
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Those MSRs are used only by the MCE code so move them there.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1456785179-14378-2-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>