* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Fix PCI hotplug with passthrough mode
x86/amd-iommu: Fix passthrough mode
x86: mmio-mod.c: Use pr_fmt
x86: kmmio.c: Add and use pr_fmt(fmt)
x86: i8254.c: Add pr_fmt(fmt)
x86: setup_percpu.c: Use pr_<level> and add pr_fmt(fmt)
x86: es7000_32.c: Use pr_<level> and add pr_fmt(fmt)
x86: Print DMI_BOARD_NAME as well as DMI_PRODUCT_NAME from __show_regs()
x86: Factor duplicated code out of __show_regs() into show_regs_common()
arch/x86/kernel/microcode*: Use pr_fmt() and remove duplicated KERN_ERR prefix
x86, mce: fix confusion between bank attributes and mce attributes
x86/mce: Set up timer unconditionally
x86: Fix bogus warning in apic_noop.apic_write()
x86: Fix typo in arch/x86/mm/kmmio.c
x86: ASUS P4S800 reboot=bios quirk
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (57 commits)
x86, perf events: Check if we have APIC enabled
perf_event: Fix variable initialization in other codepaths
perf kmem: Fix unused argument build warning
perf symbols: perf_header__read_build_ids() offset'n'size should be u64
perf symbols: dsos__read_build_ids() should read both user and kernel buildids
perf tools: Align long options which have no short forms
perf kmem: Show usage if no option is specified
sched: Mark sched_clock() as notrace
perf sched: Add max delay time snapshot
perf tools: Correct size given to memset
perf_event: Fix perf_swevent_hrtimer() variable initialization
perf sched: Fix for getting task's execution time
tracing/kprobes: Fix field creation's bad error handling
perf_event: Cleanup for cpu_clock_perf_event_update()
perf_event: Allocate children's perf_event_ctxp at the right time
perf_event: Clean up __perf_event_init_context()
hw-breakpoints: Modify breakpoints without unregistering them
perf probe: Update perf-probe document
perf probe: Support --del option
trace-kprobe: Support delete probe syntax
...
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface
[CPUFREQ] make internal cpufreq_add_dev_* static
[CPUFREQ] use an enum for speedstep processor identification
[CPUFREQ] Document units for transition latency
[CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings
[CPUFREQ] Documentation: ABI: /sys/devices/system/cpu/cpu#/cpufreq/
[CPUFREQ] powernow-k6: set transition latency value so ondemand governor can be used
[CPUFREQ] cpumask: don't put a cpumask on the stack in x86...cpufreq/powernow-k8.c
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdb: Always process the whole breakpoint list on activate or deactivate
kgdb: continue and warn on signal passing from gdb
kgdb,x86: do not set kgdb_single_step on x86
kgdb: allow for cpu switch when single stepping
kgdb,i386: Fix corner case access to ss with NMI watch dog exception
kgdb: Replace strstr() by strchr() for single-character needles
kgdbts: Read buffer overflow
kgdb: Read buffer overflow
kgdb,x86: remove redundant test
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/mmap:
Add missing alignment check in arch/score sys_mmap()
fix broken aliasing checks for MAP_FIXED on sparc32, mips, arm and sh
Get rid of open-coding in ia64_brk()
sparc_brk() is not needed anymore
switch do_brk() to get_unmapped_area()
Take arch_mmap_check() into get_unmapped_area()
fix a struct file leak in do_mmap_pgoff()
Unify sys_mmap*
Cut hugetlb case early for 32bit on ia64
arch_mmap_check() on mn10300
Kill ancient crap in s390 compat mmap
arm: add arch_mmap_check(), get rid of sys_arm_mremap()
file ->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()
kill useless checks in sparc mremap variants
fix pgoff in "have to relocate" case of mremap()
fix the arch checks in MREMAP_FIXED case
fix checks for expand-in-place mremap
do_mremap() untangling, part 3
do_mremap() untangling, part 2
untangling do_mremap(), part 1
On an SMP system the kgdb_single_step flag has the possibility to
indefinitely hang the system in the case. Consider the case where,
CPU 1 has the schedule lock and CPU 0 is set to single step, there is
no way for CPU 0 to run another task.
The easy way to observe the problem is to make 2 cpus busy, and run
the kgdb test suite. You will see that it hangs the system very
quickly.
while [ 1 ] ; do find /proc > /dev/null 2>&1 ; done &
while [ 1 ] ; do find /proc > /dev/null 2>&1 ; done &
echo V1 > /sys/module/kgdbts/parameters/kgdbts
The side effect of this patch is that there is the possibility
to miss a breakpoint in the case that a single step operation
was executed to step over a breakpoint in common code.
The trade off of the missed breakpoint is preferred to
hanging the kernel. This can be fixed in the future by
using kprobes or another strategy to step over planted
breakpoints with out of line execution.
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
It is possible for the user_mode_vm(regs) check to return true on the
i368 arch for a non master kgdb cpu or when the master kgdb cpu
handles the NMI watch dog exception.
The solution is simply to select the correct gdb_ss location
based on the check to user_mode_vm(regs).
CC: Ingo Molnar <mingo@elte.hu>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
The for loop starts with a breakno of 0, and ends when it's 4. so
this test is always true.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
New helper - sys_mmap_pgoff(); switch syscalls to using it.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The device change notifier is initialized in the dma_ops
initialization path. But this path is never executed for
iommu=pt. Move the notifier initialization to IOMMU hardware
init code to fix this.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The data structure changes to use dev->archdata.iommu field
broke the iommu=pt mode because in this case the
dev->archdata.iommu was left uninitialized. This moves the
inititalization of the devices into the main init function
and fixes the problem.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPICA: Update version to 20091112.
ACPICA: Add additional module-level code support
ACPICA: Deploy new create integer interface where appropriate
ACPICA: New internal utility function to create Integer objects
ACPICA: Add repair for predefined methods that must return sorted lists
ACPICA: Fix possible fault if return Package objects contain NULL elements
ACPICA: Add post-order callback to acpi_walk_namespace
ACPICA: Change package length error message to an info message
ACPICA: Reduce severity of predefined repair messages, Warning to Info
ACPICA: Update version to 20091013
ACPICA: Fix possible memory leak for Scope ASL operator
ACPICA: Remove possibility of executing _REG methods twice
ACPICA: Add repair for bad _MAT buffers
ACPICA: Add repair for bad _BIF/_BIX packages
Robert Hancock observes that DMI_BOARD_NAME is often more useful
than DMI_PRODUCT_NAME, especially on standalone motherboards.
So, print both.
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Richard Zidlicky <rz@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20091208083021.GB27174@hexapodia.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Unify x86_32 and x86_64 implementations of __show_regs() header,
standardizing on the x86_64 format string in the process. Also,
32-bit will now call print_modules.
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Richard Zidlicky <rz@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20091208082942.GA27174@hexapodia.org>
[ v2: resolved conflict ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, when ptrace needs to modify a breakpoint, like disabling
it, changing its address, type or len, it calls
modify_user_hw_breakpoint(). This latter will perform the heavy and
racy task of unregistering the old breakpoint and registering a new
one.
This is racy as someone else might steal the reserved breakpoint
slot under us, which is undesired as the breakpoint is only
supposed to be modified, sometimes in the middle of a debugging
workflow. We don't want our slot to be stolen in the middle.
So instead of unregistering/registering the breakpoint, just
disable it while we modify its breakpoint fields and re-enable it
after if necessary.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1260347148-5519-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- Use #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
- Remove "microcode: " prefix from each pr_<level>
- Fix duplicated KERN_ERR prefix
- Coalesce pr_<level> format strings
- Add a space after an exclamation point
No other change in output.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
LKML-Reference: <1260340250.27677.191.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers, init: Limit the number of per cpu calibration bootup messages
posix-cpu-timers: optimize and document timer_create callback
clockevents: Add missing include to pacify sparse
x86: vmiclock: Fix printk format
x86: Fix printk format due to variable type change
sparc: fix printk for change of variable type
clocksource/events: Fix fallout of generic code changes
nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
nohz: Track last do_timer() cpu
nohz: Prevent clocksource wrapping during idle
nohz: Type cast printk argument
mips: Use generic mult/shift factor calculation for clocks
clocksource: Provide a generic mult/shift factor calculation
clockevents: Use u32 for mult and shift factors
nohz: Introduce arch_needs_cpu
nohz: Reuse ktime in sub-functions of tick_check_idle.
time: Remove xtime_cache
time: Implement logarithmic time accumulation
* 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: hpet: Make WARN_ON understandable
x86: arch specific support for remapping HPET MSIs
intr-remap: generic support for remapping HPET MSIs
x86, hpet: Simplify the HPET code
x86, hpet: Disable per-cpu hpet timer if ARAT is supported
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mce: don't restart timer if disabled
x86: Use -maccumulate-outgoing-args for sane mcount prologues
x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage
x86: AMD Northbridge: Verify NB's node is online
x86 VSDO: Fix Kconfig help
x86: Fix typo in Intel CPU cache size descriptor
x86: Add new Intel CPU cache size descriptors
* 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/reboot: Add pci_dev_put in reboot_fixup_32.c for consistency
* 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-64: merge the standard and compat start_thread() functions
x86-64: make compat_start_thread() match start_thread()
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits)
x86, mm: Correct the implementation of is_untracked_pat_range()
x86/pat: Trivial: don't create debugfs for memtype if pat is disabled
x86, mtrr: Fix sorting of mtrr after subtracting
x86: Move find_smp_config() earlier and avoid bootmem usage
x86, platform: Change is_untracked_pat_range() to bool; cleanup init
x86: Change is_ISA_range() into an inline function
x86, mm: is_untracked_pat_range() takes a normal semiclosed range
x86, mm: Call is_untracked_pat_range() rather than is_ISA_range()
x86: UV SGI: Don't track GRU space in PAT
x86: SGI UV: Fix BAU initialization
x86, numa: Use near(er) online node instead of roundrobin for NUMA
x86, numa, bootmem: Only free bootmem on NUMA failure path
x86: Change crash kernel to reserve via reserve_early()
x86: Eliminate redundant/contradicting cache line size config options
x86: When cleaning MTRRs, do not fold WP into UC
x86: remove "extern" from function prototypes in <asm/proto.h>
x86, mm: Report state of NX protections during boot
x86, mm: Clean up and simplify NX enablement
x86, pageattr: Make set_memory_(x|nx) aware of NX support
x86, sleep: Always save the value of EFER
...
Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range)
to 'struct x86_platform_ops') in
arch/x86/include/asm/x86_init.h
arch/x86/kernel/x86_init.c
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: ucode-amd: Move family check to microcde_amd.c's init function
x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle
x86: ucode-amd: Convert printk(KERN_*...) to pr_*(...)
x86: ucode-amd: Don't warn when no ucode is available for a CPU revision
x86: ucode-amd: Load ucode-patches once and not separately of each CPU
x86, amd-ucode: Remove needless log messages
Commit cebe182033 had an unnecessary,
wrong change: &mce_banks[i].attr is equivalent to the former
bank_attrs[i], not to mce_attrs[i].
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <4B1E05CC.4040703@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (84 commits)
KVM: VMX: Fix comparison of guest efer with stale host value
KVM: s390: Fix prefix register checking in arch/s390/kvm/sigp.c
KVM: Drop user return notifier when disabling virtualization on a cpu
KVM: VMX: Disable unrestricted guest when EPT disabled
KVM: x86 emulator: limit instructions to 15 bytes
KVM: s390: Make psw available on all exits, not just a subset
KVM: x86: Add KVM_GET/SET_VCPU_EVENTS
KVM: VMX: Report unexpected simultaneous exceptions as internal errors
KVM: Allow internal errors reported to userspace to carry extra data
KVM: Reorder IOCTLs in main kvm.h
KVM: x86: Polish exception injection via KVM_SET_GUEST_DEBUG
KVM: only clear irq_source_id if irqchip is present
KVM: x86: disallow KVM_{SET,GET}_LAPIC without allocated in-kernel lapic
KVM: x86: disallow multiple KVM_CREATE_IRQCHIP
KVM: VMX: Remove vmx->msr_offset_efer
KVM: MMU: update invlpg handler comment
KVM: VMX: move CR3/PDPTR update to vmx_set_cr3
KVM: remove duplicated task_switch check
KVM: powerpc: Fix BUILD_BUG_ON condition
KVM: VMX: Use shared msr infrastructure
...
Trivial conflicts due to new Kconfig options in arch/Kconfig and kernel/Makefile
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
mac80211: fix reorder buffer release
iwmc3200wifi: Enable wimax core through module parameter
iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
iwmc3200wifi: Coex table command does not expect a response
iwmc3200wifi: Update wiwi priority table
iwlwifi: driver version track kernel version
iwlwifi: indicate uCode type when fail dump error/event log
iwl3945: remove duplicated event logging code
b43: fix two warnings
ipw2100: fix rebooting hang with driver loaded
cfg80211: indent regulatory messages with spaces
iwmc3200wifi: fix NULL pointer dereference in pmkid update
mac80211: Fix TX status reporting for injected data frames
ath9k: enable 2GHz band only if the device supports it
airo: Fix integer overflow warning
rt2x00: Fix padding bug on L2PAD devices.
WE: Fix set events not propagated
b43legacy: avoid PPC fault during resume
b43: avoid PPC fault during resume
tcp: fix a timewait refcnt race
...
Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
kernel/sysctl_check.c
net/ipv4/sysctl_net_ipv4.c
net/ipv6/addrconf.c
net/sctp/sysctl.c
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits)
security/tomoyo: Remove now unnecessary handling of security_sysctl.
security/tomoyo: Add a special case to handle accesses through the internal proc mount.
sysctl: Drop & in front of every proc_handler.
sysctl: Remove CTL_NONE and CTL_UNNUMBERED
sysctl: kill dead ctl_handler definitions.
sysctl: Remove the last of the generic binary sysctl support
sysctl net: Remove unused binary sysctl code
sysctl security/tomoyo: Don't look at ctl_name
sysctl arm: Remove binary sysctl support
sysctl x86: Remove dead binary sysctl support
sysctl sh: Remove dead binary sysctl support
sysctl powerpc: Remove dead binary sysctl support
sysctl ia64: Remove dead binary sysctl support
sysctl s390: Remove dead sysctl binary support
sysctl frv: Remove dead binary sysctl support
sysctl mips/lasat: Remove dead binary sysctl support
sysctl drivers: Remove dead binary sysctl support
sysctl crypto: Remove dead binary sysctl support
sysctl security/keys: Remove dead binary sysctl support
sysctl kernel: Remove binary sysctl logic
...
mce_timer must be passed to setup_timer() in all cases, no
matter whether it is going to be actually used. Otherwise, when
the CPU gets brought down, its call to del_timer_sync() will
never return, as the timer won't have a base associated, and
hence lock_timer_base() will loop infinitely.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: <stable@kernel.org>
LKML-Reference: <4B1DB831.2030801@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
apic_noop is used to provide dummy apic functions. It's installed
when the CPU has no APIC or when the APIC is disabled on the kernel
command line.
The apic_noop implementation of apic_write() warns when the CPU has
an APIC or when the APIC is not disabled.
That's bogus. The warning should only happen when the CPU has an
APIC _AND_ the APIC is not disabled. apic_noop.apic_read() has the
correct check.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@kernel.org> # in <= .32 this typo resides in native_apic_write_dummy()
LKML-Reference: <alpine.LFD.2.00.0912071255420.3089@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When we enter in irq, two things can happen to preserve the link
to the previous frame pointer:
- If we were in an irq already, we don't switch to the irq stack
as we are inside. We just need to save the previous frame
pointer and to link the new one to the previous.
- Otherwise we need another level of indirection. We enter the irq with
the previous stack. We save the previous bp inside and make bp
pointing to its saved address. Then we switch to the irq stack and
push bp another time but to the new stack. This makes two levels to
dereference instead of one.
In the second case, the current stacktrace code omits the second level
and loses the frame pointer accuracy. The stack that follows will then
be considered as unreliable.
Handling that makes the perf callchain happier.
Before:
43.94% [k] _raw_read_lock
|
--- _read_lock
|
|--60.53%-- send_sigio
| __kill_fasync
| kill_fasync
| evdev_pass_event
| evdev_event
| input_pass_event
| input_handle_event
| input_event
| synaptics_process_byte
| psmouse_handle_byte
| psmouse_interrupt
| serio_interrupt
| i8042_interrupt
| handle_IRQ_event
| handle_edge_irq
| handle_irq
| __irqentry_text_start
| ret_from_intr
| |
| |--30.43%-- __select
| |
| |--17.39%-- 0x454f15
| |
| |--13.04%-- __read
| |
| |--13.04%-- vread_hpet
| |
| |--13.04%-- _xcb_lock_io
| |
| --13.04%-- 0x7f630878ce8
After:
50.00% [k] _raw_read_lock
|
--- _read_lock
|
|--98.97%-- send_sigio
| __kill_fasync
| kill_fasync
| evdev_pass_event
| evdev_event
| input_pass_event
| input_handle_event
| input_event
| |
| |--96.88%-- synaptics_process_byte
| | psmouse_handle_byte
| | psmouse_interrupt
| | serio_interrupt
| | i8042_interrupt
| | handle_IRQ_event
| | handle_edge_irq
| | handle_irq
| | __irqentry_text_start
| | ret_from_intr
| | |
| | |--39.78%-- __const_udelay
| | | |
| | | |--91.89%-- ath5k_hw_register_timeout
| | | | ath5k_hw_noise_floor_calibration
| | | | ath5k_hw_reset
| | | | ath5k_reset
| | | | ath5k_config
| | | | ieee80211_hw_config
| | | | |
| | | | |--88.24%-- ieee80211_scan_work
| | | | | worker_thread
| | | | | kthread
| | | | | child_rip
| | | | |
| | | | --11.76%-- ieee80211_scan_completed
| | | | ieee80211_scan_work
| | | | worker_thread
| | | | kthread
| | | | child_rip
| | | |
| | | --8.11%-- ath5k_hw_noise_floor_calibration
| | | ath5k_hw_reset
| | | ath5k_reset
| | | ath5k_config
Note: This does not only affect perf events but also x86-64
stacktraces. They were considered as unreliable once we quit
the irq stack frame.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
While dumping a stacktrace, the end of the exception stack won't link
the frame pointer to the previous stack.
The interrupted stack will then be considered as unreliable and ignored
by perf, as the frame pointer is unreliable itself.
This happens because we overwrite the frame pointer that links to the
interrupted frame with the address of the exception stack. This is
done in order to reserve space inside.
But rbp has been chosen here only because it is not a scratch register,
so that the address of the exception stack remains in rbp after calling
do_debug(), we can then release the exception stack space without the
need to retrieve its address again.
But we can pick another non-scratch register to do that, so that we
preserve the link to the interrupted stack frame in the stacktraces.
Just randomly choose r12. Every registers are saved just before and
restored just after calling do_debug(). And r12 is not used in the
middle, which makes it a perfect candidate.
Example: perf record -g -a -c 1 -f -e mem:$(tasklist_lock_addr):rw
Before:
44.18% [k] _raw_read_lock
|
|
--- |--6.31%-- waitid
|
|--4.26%-- writev
|
|--3.63%-- __select
|
|--3.15%-- __waitpid
| |
| |--28.57%-- 0x8b52e00000139f
| |
| |--28.57%-- 0x8b52e0000013c6
| |
| |--14.29%-- 0x7fde786dc000
| |
| |--14.29%-- 0x62696c2f7273752f
| |
| --14.29%-- 0x1ea9df800000000
|
|--3.00%-- __poll
After:
43.94% [k] _raw_read_lock
|
--- _read_lock
|
|--60.53%-- send_sigio
| __kill_fasync
| kill_fasync
| evdev_pass_event
| evdev_event
| input_pass_event
| input_handle_event
| input_event
| synaptics_process_byte
| psmouse_handle_byte
| psmouse_interrupt
| serio_interrupt
| i8042_interrupt
| handle_IRQ_event
| handle_edge_irq
| handle_irq
| __irqentry_text_start
| ret_from_intr
| |
| |--30.43%-- __select
| |
| |--17.39%-- 0x454f15
| |
| |--13.04%-- __read
| |
| |--13.04%-- vread_hpet
| |
| |--13.04%-- _xcb_lock_io
| |
| --13.04%-- 0x7f630878ce87
Note: it does not only affect perf events but also other stacktraces in
x86-64. They were considered as unreliable once we quit the debug
stack frame.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Dumping the callchains from breakpoint events with perf gives strange
results:
3.75% perf [kernel] [k] _raw_read_unlock
|
--- _raw_read_unlock
perf_callchain
perf_prepare_sample
__perf_event_overflow
perf_swevent_overflow
perf_swevent_add
perf_bp_event
hw_breakpoint_exceptions_notify
notifier_call_chain
__atomic_notifier_call_chain
atomic_notifier_call_chain
notify_die
do_debug
debug
munmap
We are infected with all the debug stack. Like the nmi stack, the debug
stack is undesired as it is part of the profiling path, not helpful for
the user.
Ignore it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
struct perf_event::event callback was called when a breakpoint
triggers. But this is a rather opaque callback, pretty
tied-only to the breakpoint API and not really integrated into perf
as it triggers even when we don't overflow.
We prefer to use overflow_handler() as it fits into the perf events
rules, being called only when we overflow.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
Drop the callback and task parameters from modify_user_hw_breakpoint().
For now we have no user that need to modify a breakpoint to the point
of changing its handler or its task context.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Limit number of per cpu TSC sync messages
x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks
x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details
x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes
x86: Suppress stack overrun message for init_task
x86: Fix cpu_devs[] initialization in early_cpu_init()
x86: Remove CPU cache size output for non-Intel too
x86: Minimise printk spew from per-vendor init code
x86: Remove the CPU cache size printk's
cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c
x86: Make sure we also print a Code: line for show_regs()
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t
x86, cpuid: Simplify the code in cpuid_open
x86, cpuid: Remove the bkl from cpuid_open()
x86, msr: Remove the bkl from msr_open()
x86: AMD Geode LX optimizations
x86, msr: Unify rdmsr_on_cpus/wrmsr_on_cpus
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix a section mismatch in arch/x86/kernel/setup.c
x86: Fixup last users of irq_chip->typename
x86: Remove BKL from apm_32
x86: Remove BKL from microcode
x86: use kernel_stack_pointer() in kprobes.c
x86: use kernel_stack_pointer() in kgdb.c
x86: use kernel_stack_pointer() in dumpstack.c
x86: use kernel_stack_pointer() in process_32.c