David Carrillo-Cisneros
f09509b939
perf/x86/intel: Print LBR support statement after validation
...
The following commit:
338b522ca4 ("perf/x86/intel: Protect LBR and extra_regs against KVM lying")
added an additional test to LBR support detection that is performed after
printing the LBR support statement to dmesg.
Move the LBR support output after the very last test, to make sure we
print the true status of LBR support.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Reviewed-by: Stephane Eranian <eranian@google.com >
Reviewed-by: Andi Kleen <ak@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Kan Liang <kan.liang@intel.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Link: http://lkml.kernel.org/r/1466533874-52003-2-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-27 11:34:18 +02:00
Bjorn Helgaas
281ee056e3
perf/x86/intel/uncore: Remove redundant pci_get_drvdata()
...
Remove redundant pci_get_drvdata() call. There's another call a few lines
down, just before we test "box" for NULL.
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Acked-by: Thomas Gleixner <tglx@linutronix.de >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@suse.de >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Kan Liang <kan.liang@intel.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Link: http://lkml.kernel.org/r/20160531212527.28718.92371.stgit@bhelgaas-glaptop2.roam.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-14 11:16:51 +02:00
Ingo Molnar
3559ff9650
Merge branch 'linus' into perf/core, to pick up fixes before merging new changes
...
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-14 11:14:34 +02:00
Jacob Pan
348c5ac6c7
perf/x86/rapl: Add Skylake server model detection
...
SKX uses similar RAPL interface as Broadwell server.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com >
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001953.38848836@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:06:01 +02:00
Dave Hansen
a07301ab3d
perf/x86/uncore: Use Intel family name macros for uncore
...
Another straightforward replacement of magic numbers
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001942.537570B6@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:06:00 +02:00
Dave Hansen
bf4ad54199
perf/x86/cstate: Use Intel Model name macros
...
This should be getting old by now. Use the new macros intead of
open-coded magic numbers.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Kan Liang <kan.liang@intel.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001940.FE69D646@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:06:00 +02:00
Dave Hansen
5134596cae
perf/x86/msr: Add missing Intel models
...
This patch presumes that Kabylake and Skylake Server will be the
same as the existing Skylake parts and adds them to the MSR
events code.
Also add handling for "WESTMERE2".
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001935.FE6B3847@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:05:59 +02:00
Dave Hansen
353bf605a7
perf/x86/msr: Use Intel family macros for MSR events code
...
Use the new INTEL_MODEL_* macros for arch/x86/events/msr.c.
This code appears to be missing handling for "WESTMERE2" and
"SKYLAKE_X".
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Andy Lutomirski <luto@kernel.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001933.99A402B0@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:05:59 +02:00
Dave Hansen
7f2236d0bf
perf/x86/rapl: Use Intel family macros for RAPL
...
Use the new INTEL_FAM6_* macros for rapl.c.
Note that this is missing at least one Westmere model and Skylake
Server which will we fixed later in this series.
The resulting binary structure 'rapl_cpu_match' is the same
before and after this patch.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001930.6AC50BE3@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:05:58 +02:00
Dave Hansen
ef5f9f47d4
perf/x86/intel: Use Intel family macros for core perf events
...
Use the new model number macros instead of spelling things out
in the comments.
Note that this is missing a Nehalem model that is mentioned in
intel_idle which is fixed up in a later patch.
The resulting binary (arch/x86/events/intel/core.o) is exactly
the same with and without this patch modulo some harmless changes
to restoring %esi in the return path of functions, even those
untouched by this patch.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave@sr71.net >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Kan Liang <kan.liang@intel.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001929.C5F1C079@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-08 12:05:58 +02:00
Andi Kleen
030ba6cd10
perf/x86/intel: Use new topology_max_smt_threads() in HT leak workaround
...
Now that we have topology_max_smt_threads() use it
to detect the HT workarounds for older CPUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:41:25 +02:00
Andi Kleen
eb12b8ece7
perf/x86/intel: Add topdown events to Intel Atom
...
Add topdown event declarations to Silvermont / Airmont.
These cores do not support the full Top Down metrics, but an useful
subset (FrontendBound, Retiring, Backend Bound/Bad Speculation).
The perf stat tool automatically handles the missing events
and combines the available metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:41:24 +02:00
Andi Kleen
a39fcae7a8
perf/x86/intel: Add topdown events to Intel Core
...
Add declarations for the events needed for topdown to the
Intel big core CPUs starting with Sandy Bridge. We need
to report different values if HyperThreading is on or off.
The only thing this patch does is to export some events
in sysfs.
topdown level 1 uses a set of abstracted metrics which
are generic to out of order CPU cores (although some
CPUs may not implement all of them):
topdown-total-slots Available slots in the pipeline
topdown-slots-issued Slots issued into the pipeline
topdown-slots-retired Slots successfully retired
topdown-fetch-bubbles Pipeline gaps in the frontend
topdown-recovery-bubbles Pipeline gaps during recovery
from misspeculation
A slot is a single operation in the CPU pipe line.
These metrics then allow to compute four useful metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
The formulas to compute the metrics are generic, they
only change based on the availability on the abstracted
input values.
The kernel declares the events supported by the current
CPU and their scaling factors (such as the pipeline width)
and perf stat then computes the formulas based on the
available metrics. This is similar how existing
perf metrics, such as TSC metrics or IPC, are implemented.
This abstracts all CPU pipe line specific knowledge in the
kernel driver, but still avoids the need for larger scale perf
interface changes.
For HyperThreading the any bit is needed to get accurate
values when both threads are executing. This implies that
the events can only be collected as root or with
perf_event_paranoid=-1 for now.
The basic scheme is based on the following paper:
Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google)
Signed-off-by: Andi Kleen <ak@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:41:23 +02:00
Andi Kleen
fc07e9f983
perf/x86: Support sysfs files depending on SMT status
...
Add a way to show different sysfs events attributes depending on
HyperThreading is on or off. This is difficult to determine
early at boot, so we just do it dynamically when the sysfs
attribute is read.
Signed-off-by: Andi Kleen <ak@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:41:22 +02:00
Kan Liang
a54fa07930
perf/x86/intel/uncore: Locate specific box by checking full device info
...
Some platforms, e.g. Knights Landing, use a common PCI device ID for
multiple instances of an uncore PMU device type. So it is impossible to
locate the specific instances only by PCI device ID.
The current code specially handles Knights Landing by arbitrarily pointing
an instance to an unused uncore box. However, we still have no idea
which uncore device is mapped to which box.
Furthermore, there could be more platforms which use a common PCI device ID
for uncore devices. We have to specially handle them one by one.
This patch records full device information (slot, func, and device ID)
in id_table[]. So the probe function can point the instance to a specific
uncore box by checking the full device information.
Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com >
Signed-off-by: Kan Liang <kan.liang@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Acked-by: tglx@linutronix.de
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: bp@suse.de
Cc: harish.chegondi@intel.com
Cc: hubert.chrzaniuk@intel.com
Cc: lawrence.f.meadows@intel.com
Link: http://lkml.kernel.org/r/1463379504-39003-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:40:18 +02:00
Lukasz Odzioba
9c489fce7a
perf/x86/intel: Change offcore response masks for Knights Landing
...
Due to change in register definition we need to update OCR mask:
MSR_OFFCORE_RESP0 reserved bits: 3,4,18,29,30,33,34, 8,11,14
MSR_OFFCORE_RESP1 reserved bits: 3,4,18,29,30,33,34, 38
Reported-by: Andi Kleen <ak@linux.intel.com >
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: kan.liang@intel.com
Cc: lukasz.anaczkowski@intel.com
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/1463433419-16893-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:40:17 +02:00
Lukasz Odzioba
20f3627858
perf/x86/intel: Add 'static' keyword to locally used arrays
...
Add the 'static' keyword to intel_bdw_event_constraints[], snb_events_attrs[],
nhm_events_attrs[] and intel_skl_event_constraints arrays[], because
they are only used locally.
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: kan.liang@intel.com
Cc: lukasz.anaczkowski@intel.com
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/1463433378-16816-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 09:40:17 +02:00
Kan Liang
3b94a89166
perf/x86/intel/uncore: Remove SBOX support for Broadwell server
...
There was a report that on certain Broadwell-EP systems writing any bit of
the SBOX PMU initialization MSR would #GP at boot. This did not happen
on all systems. My test systems booted fine.
Considering both DE and EP may have such issues, this patch removes SBOX
support for all Broadwell platforms for now.
Reported-and-tested-by: Mark van Dijk <mark@voidzero.net >
Signed-off-by: Kan Liang <kan.liang@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Link: http://lkml.kernel.org/r/1464347540-5763-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-06-03 08:39:38 +02:00
Vincent Stehlé
275ae411e5
perf/x86/intel/rapl: Fix pmus free during cleanup
...
On rapl cleanup path, kfree() is given by mistake the address of the
pointer of the structure to free (rapl_pmus->pmus + i). Pass the pointer
instead (rapl_pmus->pmus[i]).
Fixes: 9de8d68695 "perf/x86/intel/rapl: Convert it to a per package facility"
Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1464101629-14905-1-git-send-email-vincent.stehle@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de >
2016-05-25 10:56:43 +02:00
Colin Ian King
1ab94188e4
perf/x86/intel/p4: Trival indentation fix, remove space
...
Remove an extraneous space to fix up indentation. Trivial and no
functional change
Signed-off-by: Colin Ian King <colin.king@canonical.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Borislav Petkov <bp@suse.de >
Cc: Brian Gerst <brgerst@gmail.com >
Cc: Dave Hansen <dave.hansen@linux.intel.com >
Cc: Denys Vlasenko <dvlasenk@redhat.com >
Cc: Fenghua Yu <fenghua.yu@intel.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Oleg Nesterov <oleg@redhat.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/1463503215-18339-1-git-send-email-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-20 09:18:22 +02:00
Ingo Molnar
21f77d231f
Merge tag 'perf-core-for-mingo-20160516' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
...
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Honour the kernel.perf_event_max_stack knob more precisely by not counting
PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
- Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)
- Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
- Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
- Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
- Store vdso buildid unconditionally, as it appears in callchains and
we're not checking those when creating the build-id table, so we
end up not being able to resolve VDSO symbols when doing analysis
on a different machine than the one where recording was done, possibly
of a different arch even (arm -> x86_64) (He Kuang)
Infrastructure changes:
- Generalize max_stack sysctl handler, will be used for configuring
multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)
Cleanups:
- Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
open coded strings (Masami Hiramatsu)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-20 08:20:14 +02:00
Jiri Olsa
ef3f00a4d3
perf/x86/intel/uncore: Remove WARN_ON_ONCE in uncore_pci_probe
...
When booting with nr_cpus=1, uncore_pci_probe tries to init the PCI/uncore
also for the other packages and fails with warning when they are not found.
The warning is bogus because it's correct to fail here for packages which are
not initialized. Remove it and return silently.
Fixes: cf6d445f68 "perf/x86/uncore: Track packages, not per CPU data"
Signed-off-by: Jiri Olsa <jolsa@kernel.org >
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org >
Signed-off-by: Thomas Gleixner <tglx@linutronix.de >
2016-05-18 16:17:25 +02:00
Arnaldo Carvalho de Melo
3b1fff0803
perf core: Add a 'nr' field to perf_event_callchain_context
...
We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_contexts, or via the per event
perf_event_attr.sample_max_stack knob.
This way we keep the perf_sample->ip_callchain->nr meaning, that is the
number of entries, be it real addresses or PERF_CONTEXT_ entries, while
honouring the max_stack knobs, i.e. the end result will be max_stack
entries if we have at least that many entries in a given stack trace.
Cc: David Ahern <dsahern@gmail.com >
Cc: Frederic Weisbecker <fweisbec@gmail.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2016-05-16 23:11:51 -03:00
Arnaldo Carvalho de Melo
cfbcf46845
perf core: Pass max stack as a perf_callchain_entry context
...
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexei Starovoitov <ast@kernel.org >
Cc: Brendan Gregg <brendan.d.gregg@gmail.com >
Cc: David Ahern <dsahern@gmail.com >
Cc: Frederic Weisbecker <fweisbec@gmail.com >
Cc: He Kuang <hekuang@huawei.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Masami Hiramatsu <mhiramat@kernel.org >
Cc: Milian Wolff <milian.wolff@kdab.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: Wang Nan <wangnan0@huawei.com >
Cc: Zefan Li <lizefan@huawei.com >
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2016-05-16 23:11:50 -03:00
Linus Torvalds
168f1a7163
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
...
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were:
- MSR access API fixes and enhancements (Andy Lutomirski)
- early exception handling improvements (Andy Lutomirski)
- user-space FS/GS prctl usage fixes and improvements (Andy
Lutomirski)
- Remove the cpu_has_*() APIs and replace them with equivalents
(Borislav Petkov)
- task switch micro-optimization (Brian Gerst)
- 32-bit entry code simplification (Denys Vlasenko)
- enhance PAT handling in enumated CPUs (Toshi Kani)
... and lots of other cleanups/fixlets"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
x86/arch_prctl/64: Restore accidentally removed put_cpu() in ARCH_SET_GS
x86/entry/32: Remove asmlinkage_protect()
x86/entry/32: Remove GET_THREAD_INFO() from entry code
x86/entry, sched/x86: Don't save/restore EFLAGS on task switch
x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
selftests/x86/ldt_gdt: Test set_thread_area() deletion of an active segment
x86/tls: Synchronize segment registers in set_thread_area()
x86/asm/64: Rename thread_struct's fs and gs to fsbase and gsbase
x86/arch_prctl/64: Remove FSBASE/GSBASE < 4G optimization
x86/segments/64: When load_gs_index fails, clear the base
x86/segments/64: When loadsegment(fs, ...) fails, clear the base
x86/asm: Make asm/alternative.h safe from assembly
x86/asm: Stop depending on ptrace.h in alternative.h
x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
x86/asm: Make sure verify_cpu() has a good stack
x86/extable: Add a comment about early exception handlers
x86/msr: Set the return value to zero when native_rdmsr_safe() fails
x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y
x86/paravirt: Add paravirt_{read,write}_msr()
x86/msr: Carry on after a non-"safe" MSR access fails
...
2016-05-16 15:15:17 -07:00
Linus Torvalds
825a3b2605
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
...
Pull scheduler updates from Ingo Molnar:
- massive CPU hotplug rework (Thomas Gleixner)
- improve migration fairness (Peter Zijlstra)
- CPU load calculation updates/cleanups (Yuyang Du)
- cpufreq updates (Steve Muckle)
- nohz optimizations (Frederic Weisbecker)
- switch_mm() micro-optimization on x86 (Andy Lutomirski)
- ... lots of other enhancements, fixes and cleanups.
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
ARM: Hide finish_arch_post_lock_switch() from modules
sched/core: Provide a tsk_nr_cpus_allowed() helper
sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed
sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
sched/fair: Correct unit of load_above_capacity
sched/fair: Clean up scale confusion
sched/nohz: Fix affine unpinned timers mess
sched/fair: Fix fairness issue on migration
sched/core: Kill sched_class::task_waking to clean up the migration logic
sched/fair: Prepare to fix fairness problems on migration
sched/fair: Move record_wakee()
sched/core: Fix comment typo in wake_q_add()
sched/core: Remove unused variable
sched: Make hrtick_notifier an explicit call
sched/fair: Make ilb_notifier an explicit call
sched/hotplug: Make activate() the last hotplug step
sched/hotplug: Move migration CPU_DYING to sched_cpu_dying()
sched/migration: Move CPU_ONLINE into scheduler state
sched/migration: Move calc_load_migrate() into CPU_DYING
sched/migration: Move prepare transition to SCHED_STARTING state
...
2016-05-16 14:47:16 -07:00
Alexander Shishkin
5fbe4788b5
perf/x86/intel/pt: Generate PMI in the STOP region as well
...
Currently, the PT driver always sets the PMI bit one region (page) before
the STOP region so that we can wake up the consumer before we run out of
room in the buffer and have to disable the event. However, we also need
an interrupt in the last output region, so that we actually get to disable
the event (if no more room from new data is available at that point),
otherwise hardware just quietly refuses to start, but the event is
scheduled in and we end up losing trace data till the event gets removed.
For a cpu-wide event it is even worse since there may not be any
re-scheduling at all and no chance for the ring buffer code to notice
that its buffer is filled up and the event needs to be disabled (so that
the consumer can re-enable it when it finishes reading the data out). In
other words, all the trace data will be lost after the buffer gets filled
up.
This patch makes PT also generate a PMI when the last output region is
full.
Reported-by: Markus Metzger <markus.t.metzger@intel.com >
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1462886313-13660-2-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-12 10:14:55 +02:00
Ingo Molnar
5319e5ad0c
Merge branch 'perf/urgent' into perf/core, to pick up fixes
...
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-12 10:14:45 +02:00
Andrey Ryabinin
6d6f2833bf
perf/x86: Fix undefined shift on 32-bit kernels
...
Jim reported:
UBSAN: Undefined behaviour in arch/x86/events/intel/core.c:3708:12
shift exponent 35 is too large for 32-bit type 'long unsigned int'
The use of 'unsigned long' type obviously is not correct here, make it
'unsigned long long' instead.
Reported-by: Jim Cromie <jim.cromie@gmail.com >
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: <stable@vger.kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: H. Peter Anvin <hpa@zytor.com >
Cc: Imre Palik <imrep@amazon.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Fixes: 2c33645d36 ("perf/x86: Honor the architectural performance monitoring version")
Link: http://lkml.kernel.org/r/1462974711-10037-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-12 10:14:31 +02:00
Peter Zijlstra
3c3116b745
perf/x86/msr: Fix SMI overflow
...
We compute 'delta' and properly sign extend it and then ignore it and
recompute the raw value, loosing the sign extention.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andy Lutomirski <luto@amacapital.net >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: kan.liang@intel.com
Cc: linux-kernel@vger.kernel.org
Cc: luto@kernel.org
Cc: ray.huang@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-12 10:14:30 +02:00
hchrzani
ec336c879c
perf/x86/intel/uncore: Fix CHA registers configuration procedure for Knights Landing platform
...
CHA events in Knights Landing platform require programming filter registers properly.
Remote node, local node and NonNearMemCachable bits should be set to 1 at all times.
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com >
Signed-off-by: Lawrence F Meadows <lawrence.f.meadows@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: bp@suse.de
Cc: harish.chegondi@intel.com
Cc: hpa@zytor.com
Cc: izumi.taku@jp.fujitsu.com
Cc: kan.liang@intel.com
Cc: lukasz.anaczkowski@intel.com
Cc: vthakkar1994@gmail.com
Fixes: 77af0037de ('perf/x86/intel/uncore: Add Knights Landing uncore PMU support')
Link: http://lkml.kernel.org/r/1462779419-17115-2-git-send-email-hubert.chrzaniuk@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-12 10:14:30 +02:00
Ingo Molnar
eb60b3e5e8
Merge branch 'sched/urgent' into sched/core to pick up fixes
...
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-12 09:18:13 +02:00
Alexander Shishkin
1b6de59171
perf/x86/intel/pt: Convert ACCESS_ONCE()s
...
This patch converts remaining ACCESS_ONCE() instances into READ_ONCE()
and WRITE_ONCE() as appropriate.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461857746-31346-2-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:16:29 +02:00
Alexander Shishkin
65c7e6f1c4
perf/x86/intel/pt: Export CPU frequency ratios needed by PT decoders
...
Intel PT decoders need access to various bits of timing related
information to be able to correctly decode timing packets from a PT
stream (MTC and CBR packets). This patch exports all the necessary
bits as sysfs attributes for the sake of consistency:
* max_nonturbo_ratio: ratio between the invariant TSC and base clock;
* tsc_art_ratio: TSC to core crystal clock ratio (also available as CPUID.15H).
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/87zisdvibe.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:16:28 +02:00
Alexander Shishkin
ccbebba4c6
perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it
...
Not all cores prevent using Intel PT and LBRs simultaneously, although
most of them still do as of today. This patch adds an opt-in flag for
such cores to disable mutual exclusivity between PT and LBR; also flip
it on for Goldmont.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461857746-31346-4-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:16:28 +02:00
Alexander Shishkin
eadf48cab4
perf/x86/intel/pt: Add support for address range filtering in PT
...
Newer versions of Intel PT support address ranges, which can be used to
define IP address range-based filters or TraceSTOP regions. Number of
ranges in enumerated via cpuid.
This patch implements PMU callbacks and related low-level code to allow
filter validation, configuration and programming into the hardware.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-7-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:13:58 +02:00
Alexander Shishkin
f127fa098d
perf/x86/intel/pt: Add IP filtering register/CPUID bits
...
New versions of Intel PT support address range-based filtering. Add
the new registers, bit definitions and relevant CPUID bits.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-4-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:13:56 +02:00
Alexander Shishkin
0dd28e2cda
perf/x86/intel/pt: Move PT specific MSR bit definitions to a private header
...
Nothing outside of the Intel PT driver should ever care about its MSR
bits, so there is no reason to keep them in msr-index.h. This patch
moves them to a pt-local header.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-3-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:13:55 +02:00
Ingo Molnar
1a618c2cfe
Merge branch 'perf/urgent' into perf/core, to pick up fixes
...
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:12:37 +02:00
Peter Zijlstra
8482716b9d
perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
...
The new sanity check introduced by:
2665784850 ("perf/core: Verify we have a single perf_hw_context PMU")
... triggered on the AMD IOMMU driver.
IOMMUs are not per logical CPU, they cannot have per-task counters. Fix it.
Reported-by: Borislav Petkov <bp@alien8.de >
Tested-by: Borislav Petkov <bp@suse.de >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: jroedel@suse.de
Cc: suravee.suthikulpanit@amd.com
Link: http://lkml.kernel.org/r/20160423224255.GB3430@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 10:11:28 +02:00
Andi Kleen
cba1b3798e
perf/x86: Add model numbers for Kabylake CPUs
...
Everything the same as Skylake, just new model numbers.
Signed-off-by: Andi Kleen <ak@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Link: http://lkml.kernel.org/r/1461977748-17616-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 09:29:00 +02:00
Ingo Molnar
1fb48f8e54
Merge tag 'v4.6-rc6' into x86/asm, to refresh the tree
...
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-05-05 08:35:00 +02:00
Ingo Molnar
0b20e59cef
Merge branch 'perf/urgent' into perf/core, to resolve conflict
...
Conflicts:
arch/x86/events/intel/pt.c
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-28 10:35:17 +02:00
Kan Liang
cf3beb7c90
perf/x86/intel: Fix incorrect lbr_sel_mask value
...
This patch fixes a bug which was introduced by:
b16a5b52eb ("perf/x86: Add option to disable reading branch flags/cycles")
In this patch, lbr_sel_mask is used to mask the lbr_select. But LBR_SEL_MASK
doesn't include the bit for LBR_CALL_STACK. So LBR call stack will never be
set in lbr_select.
This patch corrects the LBR_SEL_MASK by including all valid bits in
LBR_SELECT. Also, the LBR_CALL_STACK bit is different as other bit in
LBR_SELECT. It does not operate in suppress mode, so it needs to be
specially handled in intel_pmu_setup_hw_lbr_filter.
Signed-off-by: Kan Liang <kan.liang@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Link: http://lkml.kernel.org/r/1461231010-4399-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-28 10:32:43 +02:00
Alexander Shishkin
1c5ac21a0e
perf/x86/intel/pt: Don't die on VMXON
...
Some versions of Intel PT do not support tracing across VMXON, more
specifically, VMXON will clear TraceEn control bit and any attempt to
set it before VMXOFF will throw a #GP, which in the current state of
things will crash the kernel. Namely:
$ perf record -e intel_pt// kvm -nographic
on such a machine will kill it.
To avoid this, notify the intel_pt driver before VMXON and after
VMXOFF so that it knows when not to enable itself.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@infradead.org >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@alien8.de >
Cc: Gleb Natapov <gleb@kernel.org >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Paolo Bonzini <pbonzini@redhat.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/87oa9dwrfk.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-28 10:32:42 +02:00
Adam Borowski
0a25556f84
perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAX
...
The entry for PERF_COUNT_HW_REF_CPU_CYCLES is not used on AMD, but is
referenced by filter_events() which expects undefined events to have a
value of 0.
Found via KASAN:
UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:30
index 9 is out of range for type 'u64 [9]'
UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:9
load of address ffffffff81c021c8 with insufficient space for an object of type 'const u64'
Signed-off-by: Adam Borowski <kilobyte@angband.pl >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Borislav Petkov <bp@suse.de >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Link: http://lkml.kernel.org/r/1461749731-30979-1-git-send-email-kilobyte@angband.pl
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-28 10:20:25 +02:00
Arnaldo Carvalho de Melo
c5dfd78eb7
perf core: Allow setting up max frame stack depth via sysctl
...
The default remains 127, which is good for most cases, and not even hit
most of the time, but then for some cases, as reported by Brendan, 1024+
deep frames are appearing on the radar for things like groovy, ruby.
And in some workloads putting a _lower_ cap on this may make sense. One
that is per event still needs to be put in place tho.
The new file is:
# cat /proc/sys/kernel/perf_event_max_stack
127
Chaging it:
# echo 256 > /proc/sys/kernel/perf_event_max_stack
# cat /proc/sys/kernel/perf_event_max_stack
256
But as soon as there is some event using callchains we get:
# echo 512 > /proc/sys/kernel/perf_event_max_stack
-bash: echo: write error: Device or resource busy
#
Because we only allocate the callchain percpu data structures when there
is a user, which allows for changing the max easily, its just a matter
of having no callchain users at that point.
Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com >
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com >
Acked-by: Alexei Starovoitov <ast@kernel.org >
Acked-by: David Ahern <dsahern@gmail.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: He Kuang <hekuang@huawei.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Masami Hiramatsu <mhiramat@kernel.org >
Cc: Milian Wolff <milian.wolff@kdab.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Cc: Wang Nan <wangnan0@huawei.com >
Cc: Zefan Li <lizefan@huawei.com >
Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2016-04-27 10:20:39 -03:00
Ingo Molnar
84eaae155a
Merge tag 'v4.6-rc4' into sched/core, to refresh the tree
...
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-23 14:16:36 +02:00
Peter Zijlstra
31b84310c7
x86/perf/rapl: Add missing Broadwell model
...
With the array aligned as per events/intel/core.c it was fairly
obvious we missed one, add it in.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-23 14:14:27 +02:00
Peter Zijlstra
c416e5aa40
x86/perf/rapl: Reorder model numbers
...
Re-order the model array to match the order in events/intel/core.c,
to easier spot gaps and such.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Vince Weaver <vincent.weaver@maine.edu >
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2016-04-23 14:14:26 +02:00