sg_dma_xxx should be used after a dma_map_sg call has been done to get bus
addresses of each of the SG entries and their lengths. But mmci_host_ops
validate_data can be called before dma_map_sg. This patch replaces theses
macros by sg->offset and sg->length which are always defined.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200128090636.13689-2-ludovic.barre@st.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case the host specify a max_busy_timeout, we need to validate that the
needed timeout for the HPI command conforms to that requirement. If that's
not the case, let's convert from a R1B response to a R1 response, as to
instruct the host to avoid HW busy detection.
Additionally, when R1B is used we must also inform the host about the busy
timeout for the command, so let's do that via updating cmd.busy_timeout.
Finally, when R1B is used and in case the host supports HW busy detection,
there should be no need for doing polling, so then skip that.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Baolin Wang <baolin.wang7@gmail.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Reviewed-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200204085449.32585-12-ulf.hansson@linaro.org
Rather than open coding the polling loop in mmc_interrupt_hpi(), let's
convert to use mmc_poll_for_busy().
Note that, moving to mmc_poll_for_busy() for HPI also improves the
behaviour according to below.
- Adds support for polling via the optional ->card_busy() host ops.
- Require R1_READY_FOR_DATA to be set in the CMD13 response before exiting
the polling loop.
- Adds a throttling mechanism to avoid CPU hogging when polling.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Baolin Wang <baolin.wang7@gmail.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Reviewed-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200204085449.32585-11-ulf.hansson@linaro.org
Rather than open coding the polling loop in mmc_do_erase(), let's convert
to use mmc_poll_for_busy().
To allow a slightly different error parsing during polling, compared to the
__mmc_switch() case, a new in-parameter to mmc_poll_for_busy() is needed,
but other than that the conversion is straight forward.
Besides addressing the open coding issue, moving to mmc_poll_for_busy() for
erase/trim/discard improves the behaviour according to below.
- Adds support for polling via the optional ->card_busy() host ops.
- Returns zero to indicate success when the final polling attempt finds the
card non-busy, even if the timeout expired.
- Exits the polling loop when state moves to R1_STATE_TRAN, rather than
when leaving R1_STATE_PRG.
- Decreases the starting range for throttling to 32-64us.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Baolin Wang <baolin.wang7@gmail.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Reviewed-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200204085449.32585-9-ulf.hansson@linaro.org
Through mmc_poll_for_busy() a CMD13 may be sent to get the status of the
(e)MMC card. If the state of the card is R1_STATE_PRG, the card is
considered as being busy, which means we continue to poll with CMD13. This
seems to be sufficient, but it's also unnecessary fragile, as it means a
new command/request could potentially be sent to the card when it's in an
unknown state.
To try to improve the situation, but also to move towards a more consistent
CMD13 polling behaviour in the mmc core, let's deploy the same policy we
use for regular I/O write requests. In other words, let's check that card
returns to the R1_STATE_TRAN and that the R1_READY_FOR_DATA bit is set in
the CMD13 response, before exiting the polling loop.
Note that, potentially this changed behaviour could lead to unnecessary
waiting for the timeout to expire, if the card for some reason, moves to an
unexpected error state. However, as we bail out from the polling loop when
R1_SWITCH_ERROR bit is set or when the CMD13 fails, this shouldn't be an
issue.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Baolin Wang <baolin.wang7@gmail.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Reviewed-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200204085449.32585-8-ulf.hansson@linaro.org
In mmc_poll_for_busy() we loop continuously, either by sending a CMD13 or
by invoking the ->card_busy() host ops, as to detect when the card stops
signaling busy. This behaviour is problematic as it may cause CPU hogging,
especially when the busy signal time reaches beyond a few ms.
Let's fix the issue by adding a throttling mechanism, that inserts a
usleep_range() in between the polling attempts. The sleep range starts at
32-64us, but increases for each loop by a factor of 2, up until the range
reaches ~32-64ms. In this way, we are able to keep the loop fine-grained
enough for short busy signaling times, while also not hogging the CPU for
longer times.
Note that, this change is inspired by the similar throttling mechanism that
we already use for mmc_do_erase().
Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Baolin Wang <baolin.wang7@gmail.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Reviewed-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200204085449.32585-2-ulf.hansson@linaro.org
When using the host software queue, it will trigger the next request in
irq handler without a context switch. But the sdhci_request() can not be
called in interrupt context when using host software queue for some host
drivers, due to the get_cd() ops can be sleepable.
But for some host drivers, such as Spreadtrum host driver, the card is
nonremovable, so the get_cd() ops is not sleepable, which means we can
complete the data request and trigger the next request in irq handler
to remove the context switch for the Spreadtrum host driver.
As suggested by Adrian, we should introduce a request_atomic() API to
indicate that a request can be called in interrupt context to remove
the context switch when using mmc host software queue. But this should
be done in another thread to convert the users of mmc host software queue.
Thus we can introduce a variable in struct sdhci_host to indicate that
we will always to defer to complete requests when using the host software
queue.
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Link: https://lore.kernel.org/r/e693e7a29beb3c1922b333f4603ea81f43d5c5b1.1581478568.git.baolin.wang7@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Enable the MMC host software queue for the SD card if the host controller
supports the MMC host software queue.
On my Spreadtrum platform, I did not see any obvious performance changes
in 4K block size when changing to use hsq for the SD cards, I think the
reason is the SD card works at a low speed on my platform, and most of
time is spent in the hardware. But we can see some obvious improvements
when enabling the packed request based on hsq, that's why we still add hsq
support for the SD cards.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Link: https://lore.kernel.org/r/0065b4631fef2d61c3b89d14a4ea4f2b7499ea56.1581478568.git.baolin.wang7@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Now the MMC read/write stack will always wait for previous request is
completed by mmc_blk_rw_wait(), before sending a new request to hardware,
or queue a work to complete request, that will bring context switching
overhead and spend some extra time to poll the card for busy completion
for I/O writes via sending CMD13, especially for high I/O per second
rates, to affect the IO performance.
Thus this patch introduces MMC software queue interface based on the
hardware command queue engine's interfaces, which is similar with the
hardware command queue engine's idea, that can remove the context
switching. Moreover we set the default queue depth as 64 for software
queue, which allows more requests to be prepared, merged and inserted
into IO scheduler to improve performance, but we only allow 2 requests
in flight, that is enough to let the irq handler always trigger the
next request without a context switch, as well as avoiding a long latency.
Moreover the host controller should support HW busy detection for I/O
operations when enabling the host software queue. That means, the host
controller must not complete a data transfer request, until after the
card stops signals busy.
From the fio testing data in cover letter, we can see the software
queue can improve some performance with 4K block size, increasing
about 16% for random read, increasing about 90% for random write,
though no obvious improvement for sequential read and write.
Moreover we can expand the software queue interface to support MMC
packed request or packed command in future.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Link: https://lore.kernel.org/r/4409c1586a9b3ed20d57ad2faf6c262fc3ccb6e2.1581478568.git.baolin.wang7@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
R-Car Gen3 cannot use correction error status with HS400.
HS200: CMD and DAT signal timing are based on CLK signal.
HS400: CMD signal is based on CLK. DAT signal is based on DS signal.
In HS400, CMD signal is 200MHz(SDR). DAT signal is 200MHz(DDR).
Center position of signal is different between CMD and DAT.
TAP position should be adjusted to the center position of CMD signal.
DAT sampling timing is adjusted by HS400 calibration circuit regardless
of TAP position. Refer to renesas_sdhi_adjust_hs400mode_enable().
However, correction error status contains CMD and DAT status in HS400
(DAT signal is not masked in HS400). Therefore, correction error status
cannot use in HS400. It means that auto correction cannot be uses in
HS400. Manual correction can change to the correct TAP position by
ignoring DAT correction error status and using only CMD correction
status.
Signed-off-by: Takeshi Saito <takeshi.saito.xv@renesas.com>
[wsa: refactored patch from BSP]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20191217114034.13290-4-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Set fw_devlink to "permissive" behavior by default so that device links
are automatically created (with DL_FLAG_SYNC_STATE_ONLY) by scanning the
firmware.
This ensures suppliers get their sync_state() calls only after all their
consumers have probed successfully. Without this, suppliers will get
their sync_state() calls at late_initcall_sync() even if their consuer
Ideally, we'd want to set fw_devlink to "on" or "rpm" by default. But
that needs more testing as it's known to break some corner case
drivers/platforms.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20200321210305.28937-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The module owner field can be used to prevent the removal of kernel
modules when there are any device files associated with it opened in
userspace. Hence, modify the API to pass module owner field. For
convenience, module_mhi_driver() macro is used which takes care of
passing the module owner through THIS_MODULE of the module of the
driver and also avoiding the use of specifying the default MHI client
driver register/unregister routines.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20200324061050.14845-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently on ACPI-enabled systems the USB interface device has no link to
the actual firmware node and thus drivers may not parse additional information
given in the table. The new feature, proposed here, allows to pass properties
or other information to the drivers.
The ACPI companion of the device has to be set for USB interface devices
to achieve above. Use ACPI_COMPANION_SET macro to set this.
Note, OF already does link of_node and this is the same for ACPI case.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20200324100923.8332-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f01642e491 ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric events names are not printed
properly incase we try to run multiple metric groups with overlapping
event.
With current upstream version, incase of overlapping metric events issue
is, we always start our comparision logic from start. So, the events
which already matched with some metric group also take part in
comparision logic. Because of that when we have overlapping events, we
end up matching current metric group event with already matched one.
For example, in skylake machine we have metric event CoreIPC and
Instructions. Both of them need 'inst_retired.any' event value. As
events in Instructions is subset of events in CoreIPC, they endup in
pointing to same 'inst_retired.any' value.
In skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
1,254,992,790 inst_retired.any # 1254992790.0
Instructions
# 1.3 CoreIPC
977,172,805 cycles
1,254,992,756 inst_retired.any
1.000802596 seconds time elapsed
command:# sudo ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
948,650 uops_retired.retire_slots
866,182 inst_retired.any # 0.7 IPC
866,182 inst_retired.any
1,175,671 cpu_clk_unhalted.thread
Patch fixes the issue by adding a new bool pointer 'evlist_used' to keep
track of events which already matched with some group by setting it
true. So, we skip all used events in list when we start comparision
logic. Patch also make some changes in comparision logic, incase we get
a match miss, we discard the whole match and start again with first
event id in metric event.
With this patch:
In skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
3,348,415 inst_retired.any # 0.3 CoreIPC
11,779,026 cycles
3,348,381 inst_retired.any # 3348381.0
Instructions
1.001649056 seconds time elapsed
command:# ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
1,023,148 uops_retired.retire_slots # 1.1 UPI
924,976 inst_retired.any
924,976 inst_retired.any # 0.6 IPC
1,489,414 cpu_clk_unhalted.thread
1.003064672 seconds time elapsed
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200221101121.28920-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a slight misalignment in -A -I output.
For example:
# perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
# time CPU counts unit events
1.000440863 CPU0 1,068,388 cpu/event=cpu-cycles/
1.000440863 CPU1 875,954 cpu/event=cpu-cycles/
1.000440863 CPU2 3,072,538 cpu/event=cpu-cycles/
1.000440863 CPU3 4,026,870 cpu/event=cpu-cycles/
1.000440863 CPU4 5,919,630 cpu/event=cpu-cycles/
1.000440863 CPU5 2,714,260 cpu/event=cpu-cycles/
1.000440863 CPU6 2,219,240 cpu/event=cpu-cycles/
1.000440863 CPU7 1,299,232 cpu/event=cpu-cycles/
The value of counts is not aligned with the column "counts" and
the event name is not aligned with the column "events".
With this patch, the output is,
# perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
# time CPU counts unit events
1.000423009 CPU0 997,421 cpu/event=cpu-cycles/
1.000423009 CPU1 1,422,042 cpu/event=cpu-cycles/
1.000423009 CPU2 484,651 cpu/event=cpu-cycles/
1.000423009 CPU3 525,791 cpu/event=cpu-cycles/
1.000423009 CPU4 1,370,100 cpu/event=cpu-cycles/
1.000423009 CPU5 442,072 cpu/event=cpu-cycles/
1.000423009 CPU6 205,643 cpu/event=cpu-cycles/
1.000423009 CPU7 1,302,250 cpu/event=cpu-cycles/
Now output is aligned.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200218071614.25736-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When performing "perf report --group", it shows the event group information
together. In previous patch, we have supported a new option "--group-sort-idx"
to sort the output by the event at the index n in event group.
It would be nice if we can use a hotkey in browser to select a event
to sort.
For example,
# perf report --group
Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
Overhead Command Shared Object Symbol
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
1.56% 0.01% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494ce
1.56% 0.00% 0.00% 0.00% mgen [kernel.kallsyms] [k] task_tick_fair
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
0.00% 0.03% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] g_main_context_check
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] check_preempt_curr
When user press hotkey '3' (event index, starting from 0), it indicates
to sort output by the forth event in group.
Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
Overhead Command Shared Object Symbol
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
0.00% 0.00% 0.00% 0.06% swapper [kernel.kallsyms] [k] hrtimer_start_range_ns
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] update_curr
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] native_apic_msr_eoi_write
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] __update_load_avg_se
v6:
---
Jiri provided a good improvement to eliminate unneeded refresh.
This improvement is added to v6.
v2:
---
1. Report warning at helpline when index is invalid.
2. Report warning at helpline when it's not group event.
3. Use "case '0' ... '9'" to refine the code
4. Split K_RELOAD implementation to another patch.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When performing "perf report --group", it shows the event group
information together. By default, the output is sorted by the first
event in group.
It would be nice for user to select any event for sorting. This patch
introduces a new option "--group-sort-idx" to sort the output by the
event at the index n in event group.
For example,
Before:
# perf report --group --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
1.56% 0.01% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494ce
1.56% 0.00% 0.00% 0.00% mgen [kernel.kallsyms] [k] task_tick_fair
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
0.00% 0.03% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] g_main_context_check
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] apic_timer_interrupt
...
After:
# perf report --group --stdio --group-sort-idx 3
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
0.00% 0.00% 0.00% 0.06% swapper [kernel.kallsyms] [k] hrtimer_start_range_ns
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] update_curr
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] native_apic_msr_eoi_write
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] __update_load_avg_se
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] scheduler_tick
Now the output is sorted by the fourth event in group.
v7:
---
Rebase to latest perf/core, no other change.
v4:
---
1. Update Documentation/perf-report.txt to mention
'--group-sort-idx' support multiple groups with different
amount of events and it should be used on grouped events.
2. Update __hpp__group_sort_idx(), just return when the
idx is out of limit.
3. Return failure on symbol_conf.group_sort_idx && !session->evlist->nr_groups.
So now we don't need to use together with --group.
v3:
---
Refine the code in __hpp__group_sort_idx().
Before:
for (i = 1; i < nr_members; i++) {
if (i == idx) {
ret = field_cmp(fields_a[i], fields_b[i]);
if (ret)
goto out;
}
}
After:
if (idx >= 1 && idx < nr_members) {
ret = field_cmp(fields_a[idx], fields_b[idx]);
if (ret)
goto out;
}
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-2-yao.jin@linux.intel.com
[ Renamed pair_fields_alloc() to hist_entry__new_pair() and combined decl + assignment of vars ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For perf report on stripped binaries it is currently impossible to do
annotation. The annotation state is all tied to symbols, but there are
either no symbols, or symbols are not covering all the code.
We should support the annotation functionality even without symbols.
This patch fakes a symbol and the symbol name is the string of address.
After that, we just follow current annotation working flow.
For example,
1. perf report
Overhead Command Shared Object Symbol
20.67% div libc-2.27.so [.] __random_r
17.29% div libc-2.27.so [.] __random
10.59% div div [.] 0x0000000000000628
9.25% div div [.] 0x0000000000000612
6.11% div div [.] 0x0000000000000645
2. Select the line of "10.59% div div [.] 0x0000000000000628" and ENTER.
Annotate 0x0000000000000628
Zoom into div thread
Zoom into div DSO (use the 'k' hotkey to zoom directly into the kernel)
Browse map details
Run scripts for samples of symbol [0x0000000000000628]
Run scripts for all samples
Switch to another data file in PWD
Exit
3. Select the "Annotate 0x0000000000000628" and ENTER.
Percent│
│
│
│ Disassembly of section .text:
│
│ 0000000000000628 <.text+0x68>:
│ divsd %xmm4,%xmm0
│ divsd %xmm3,%xmm1
│ movsd (%rsp),%xmm2
│ addsd %xmm1,%xmm0
│ addsd %xmm2,%xmm0
│ movsd %xmm0,(%rsp)
Now we can see the dump of object starting from 0x628.
v5:
---
Remove the hotkey 'a' implementation from this patch. It
will be moved to a separate patch.
v4:
---
1. Support the hotkey 'a'. When we press 'a' on address,
now it supports the annotation.
2. Change the patch title from
"Support interactive annotation of code without symbols" to
"perf report: Support interactive annotation of code without symbols"
v3:
---
Keep just the ANNOTATION_DUMMY_LEN, and remove the
opts->annotate_dummy_len since it's the "maybe in future
we will provide" feature.
v2:
---
Fix a crash issue when annotating an address in "unknown" object.
The steps to reproduce this issue:
perf record -e cycles:u ls
perf report
75.29% ls ld-2.27.so [.] do_lookup_x
23.64% ls ld-2.27.so [.] __GI___tunables_init
1.04% ls [unknown] [k] 0xffffffff85c01210
0.03% ls ld-2.27.so [.] _start
When annotating 0xffffffff85c01210, the crash happens.
v2 adds checking for ms->map in add_annotate_opt(). If the object is
"unknown", ms->map is NULL.
Committer notes:
Renamed new_annotate_sym() to symbol__new_unresolved().
Use PRIx64 to fix this issue in some 32-bit arches:
ui/browsers/hists.c: In function 'symbol__new_unresolved':
ui/browsers/hists.c:2474:38: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
snprintf(name, sizeof(name), "%-#.*lx", BITS_PER_LONG / 4, addr);
~~~~~~^ ~~~~
%-#.*llx
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200227043939.4403-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>