Commit Graph

4820 Commits

Author SHA1 Message Date
Florent Revest
4f19cab761 bpf: Add a bpf_sock_from_file helper
While eBPF programs can check whether a file is a socket by file->f_op
== &socket_file_ops, they cannot convert the void private_data pointer
to a struct socket BTF pointer. In order to do this a new helper
wrapping sock_from_file is added.

This is useful to tracing programs but also other program types
inheriting this set of helpers such as iterators or LSM programs.

Signed-off-by: Florent Revest <revest@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201204113609.1850150-2-revest@google.com
2020-12-04 22:32:40 +01:00
Christoph Hellwig
a54895fa05 block: remove the request_queue to argument request based tracepoints
The request_queue can trivially be derived from the request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-04 09:42:00 -07:00
Christoph Hellwig
1c02fca620 block: remove the request_queue argument to the block_bio_remap tracepoint
The request_queue can trivially be derived from the bio.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-04 09:42:00 -07:00
Christoph Hellwig
eb6f7f7cd3 block: remove the request_queue argument to the block_split tracepoint
The request_queue can trivially be derived from the bio.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-04 09:42:00 -07:00
Christoph Hellwig
e8a676d61c block: simplify and extend the block_bio_merge tracepoint class
The block_bio_merge tracepoint class can be reused for most bio-based
tracepoints.  For that it just needs to lose the superfluous q and rq
parameters.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-04 09:42:00 -07:00
Christoph Hellwig
b81b8f40c5 block: remove the unused block_sleeprq tracepoint
The block_sleeprq tracepoint was only used by the legacy request code.
Remove it now that the legacy request code is gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-04 09:42:00 -07:00
Jakub Kicinski
a1dd1d8697 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-12-03

The main changes are:

1) Support BTF in kernel modules, from Andrii.

2) Introduce preferred busy-polling, from Björn.

3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh.

4) Memcg-based memory accounting for bpf objects, from Roman.

5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits)
  selftests/bpf: Fix invalid use of strncat in test_sockmap
  libbpf: Use memcpy instead of strncpy to please GCC
  selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module
  selftests/bpf: Add tp_btf CO-RE reloc test for modules
  libbpf: Support attachment of BPF tracing programs to kernel modules
  libbpf: Factor out low-level BPF program loading helper
  bpf: Allow to specify kernel module BTFs when attaching BPF programs
  bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
  selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF
  selftests/bpf: Add support for marking sub-tests as skipped
  selftests/bpf: Add bpf_testmod kernel module for testing
  libbpf: Add kernel module BTF support for CO-RE relocations
  libbpf: Refactor CO-RE relocs to not assume a single BTF object
  libbpf: Add internal helper to load BTF data by FD
  bpf: Keep module's btf_data_size intact after load
  bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address()
  selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP
  bpf: Adds support for setting window clamp
  samples/bpf: Fix spelling mistake "recieving" -> "receiving"
  bpf: Fix cold build of test_progs-no_alu32
  ...
====================

Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 07:48:12 -08:00
Andrii Nakryiko
12cc126df8 bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address()
__module_address() needs to be called with preemption disabled or with
module_mutex taken. preempt_disable() is enough for read-only uses, which is
what this fix does. Also, module_put() does internal check for NULL, so drop
it as well.

Fixes: a38d1107f9 ("bpf: support raw tracepoints in modules")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-2-andrii@kernel.org
2020-12-03 17:38:20 -08:00
Jakub Kicinski
55fd59b003 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:
	drivers/net/ethernet/ibm/ibmvnic.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 15:44:09 -08:00
Steven Rostedt (VMware)
5b7be9c709 ring-buffer: Add test to validate the time stamp deltas
While debugging a situation where a delta for an event was calucalted wrong,
I realize there was nothing making sure that the delta of events are
correct. If a single event has an incorrect delta, then all events after it
will also have one. If the discrepency gets large enough, it could cause
the time stamps to go backwards when crossing sub buffers, that record a
full 64 bit time stamp, and the new deltas are added to that.

Add a way to validate the events at most events and when crossing a buffer
page. This will help make sure that the deltas are always correct. This test
will detect if they are ever corrupted.

The test adds a high overhead to the ring buffer recording, as it does the
audit for almost every event, and should only be used for testing the ring
buffer.

This will catch the bug that is fixed by commit 55ea4cf403 ("ring-buffer:
Update write stamp with the correct ts"), which is not applied when this
commit is applied.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-12-01 22:00:46 -05:00
Linus Torvalds
ef6900acc8 Tracing fixes for 5.10-rc6
- Use correct timestamp variable for ring buffer write stamp update
  - Fix up before stamp and write stamp when crossing ring buffer sub
    buffers
  - Keep a zero delta in ring buffer in slow path if cmpxchg fails
  - Fix trace_printk static buffer for archs that care
  - Fix ftrace record accounting for ftrace ops with trampolines
  - Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
  - Remove WARN_ON in hwlat tracer that triggers on something that is OK
  - Make "my_tramp" trampoline in ftrace direct sample code global
  - Fixes in the bootconfig tool for better alignment management
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX8ZzghQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qg0JAQCII1bDQyF3APLlNFRqfHf3bTo7Zl5z
 WaUd1Cd7JkY+WAD/eF1dWjN0JRtfU+oRlk6UZ4oNmp8WMJvQ7oV26ub2egE=
 =lts8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Use correct timestamp variable for ring buffer write stamp update

 - Fix up before stamp and write stamp when crossing ring buffer sub
   buffers

 - Keep a zero delta in ring buffer in slow path if cmpxchg fails

 - Fix trace_printk static buffer for archs that care

 - Fix ftrace record accounting for ftrace ops with trampolines

 - Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency

 - Remove WARN_ON in hwlat tracer that triggers on something that is OK

 - Make "my_tramp" trampoline in ftrace direct sample code global

 - Fixes in the bootconfig tool for better alignment management

* tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Always check to put back before stamp when crossing pages
  ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
  ftrace: Fix updating FTRACE_FL_TRAMP
  tracing: Fix alignment of static buffer
  tracing: Remove WARN_ON in start_thread()
  samples/ftrace: Mark my_tramp[12]? global
  ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next()
  ring-buffer: Update write stamp with the correct ts
  docs: bootconfig: Update file format on initrd image
  tools/bootconfig: Align the bootconfig applied initrd image size to 4
  tools/bootconfig: Fix to check the write failure correctly
  tools/bootconfig: Fix errno reference after printf()
2020-12-01 15:30:18 -08:00
Christoph Hellwig
0d02129e76 block: merge struct block_device and struct hd_struct
Instead of having two structures that represent each block device with
different life time rules, merge them into a single one.  This also
greatly simplifies the reference counting rules, as we can use the inode
reference count as the main reference count for the new struct
block_device, with the device model reference front ending it for device
model interaction.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01 14:53:40 -07:00
Christoph Hellwig
29ff57c610 block: move the start_sect field to struct block_device
Move the start_sect field to struct block_device in preparation
of killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01 14:53:40 -07:00
Christoph Hellwig
a782483cc1 block: remove the nr_sects field in struct hd_struct
Now that the hd_struct always has a block device attached to it, there is
no need for having two size field that just get out of sync.

Additionally the field in hd_struct did not use proper serialization,
possibly allowing for torn writes.  By only using the block_device field
this problem also gets fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Coly Li <colyli@suse.de>			[bcache]
Acked-by: Chao Yu <yuchao0@huawei.com>			[f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01 14:53:40 -07:00
Steven Rostedt (VMware)
68e10d5ff5 ring-buffer: Always check to put back before stamp when crossing pages
The current ring buffer logic checks to see if the updating of the event
buffer was interrupted, and if it is, it will try to fix up the before stamp
with the write stamp to make them equal again. This logic is flawed, because
if it is not interrupted, the two are guaranteed to be different, as the
current event just updated the before stamp before allocation. This
guarantees that the next event (this one or another interrupting one) will
think it interrupted the time updates of a previous event and inject an
absolute time stamp to compensate.

The correct logic is to always update the timestamps when traversing to a
new sub buffer.

Cc: stable@vger.kernel.org
Fixes: a389d86f7f ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 23:21:51 -05:00
Naveen N. Rao
49a962c075 ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
DYNAMIC_FTRACE_WITH_DIRECT_CALLS should depend on
DYNAMIC_FTRACE_WITH_REGS since we need ftrace_regs_caller().

Link: https://lkml.kernel.org/r/fc4b257ea8689a36f086d2389a9ed989496ca63a.1606412433.git.naveen.n.rao@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Fixes: 763e34e74b ("ftrace: Add register_ftrace_direct()")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 21:43:08 -05:00
Naveen N. Rao
4c75b0ff4e ftrace: Fix updating FTRACE_FL_TRAMP
On powerpc, kprobe-direct.tc triggered FTRACE_WARN_ON() in
ftrace_get_addr_new() followed by the below message:
  Bad trampoline accounting at: 000000004222522f (wake_up_process+0xc/0x20) (f0000001)

The set of steps leading to this involved:
- modprobe ftrace-direct-too
- enable_probe
- modprobe ftrace-direct
- rmmod ftrace-direct <-- trigger

The problem turned out to be that we were not updating flags in the
ftrace record properly. From the above message about the trampoline
accounting being bad, it can be seen that the ftrace record still has
FTRACE_FL_TRAMP set though ftrace-direct module is going away. This
happens because we are checking if any ftrace_ops has the
FTRACE_FL_TRAMP flag set _before_ updating the filter hash.

The fix for this is to look for any _other_ ftrace_ops that also needs
FTRACE_FL_TRAMP.

Link: https://lkml.kernel.org/r/56c113aa9c3e10c19144a36d9684c7882bf09af5.1606412433.git.naveen.n.rao@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Fixes: a124692b69 ("ftrace: Enable trampoline when rec count returns back to one")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 21:43:07 -05:00
Minchan Kim
8fa655a3a0 tracing: Fix alignment of static buffer
With 5.9 kernel on ARM64, I found ftrace_dump output was broken but
it had no problem with normal output "cat /sys/kernel/debug/tracing/trace".

With investigation, it seems coping the data into temporal buffer seems to
break the align binary printf expects if the static buffer is not aligned
with 4-byte. IIUC, get_arg in bstr_printf expects that args has already
right align to be decoded and seq_buf_bprintf says ``the arguments are saved
in a 32bit word array that is defined by the format string constraints``.
So if we don't keep the align under copy to temporal buffer, the output
will be broken by shifting some bytes.

This patch fixes it.

Link: https://lkml.kernel.org/r/20201125225654.1618966-1-minchan@kernel.org

Cc: <stable@vger.kernel.org>
Fixes: 8e99cf91b9 ("tracing: Do not allocate buffer in trace_find_next_entry() in atomic")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 21:43:07 -05:00
Vasily Averin
310e3a4b5a tracing: Remove WARN_ON in start_thread()
This patch reverts commit 978defee11 ("tracing: Do a WARN_ON()
 if start_thread() in hwlat is called when thread exists")

.start hook can be legally called several times if according
tracer is stopped

screen window 1
[root@localhost ~]# echo 1 > /sys/kernel/tracing/events/kmem/kfree/enable
[root@localhost ~]# echo 1 > /sys/kernel/tracing/options/pause-on-trace
[root@localhost ~]# less -F /sys/kernel/tracing/trace

screen window 2
[root@localhost ~]# cat /sys/kernel/debug/tracing/tracing_on
0
[root@localhost ~]# echo hwlat >  /sys/kernel/debug/tracing/current_tracer
[root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/tracing_on
[root@localhost ~]# cat /sys/kernel/debug/tracing/tracing_on
0
[root@localhost ~]# echo 2 > /sys/kernel/debug/tracing/tracing_on

triggers warning in dmesg:
WARNING: CPU: 3 PID: 1403 at kernel/trace/trace_hwlat.c:371 hwlat_tracer_start+0xc9/0xd0

Link: https://lkml.kernel.org/r/bd4d3e70-400d-9c82-7b73-a2d695e86b58@virtuozzo.com

Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 978defee11 ("tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread exists")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 21:43:07 -05:00
Andrea Righi
8785f51a17 ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next()
In the slow path of __rb_reserve_next() a nested event(s) can happen
between evaluating the timestamp delta of the current event and updating
write_stamp via local_cmpxchg(); in this case the delta is not valid
anymore and it should be set to 0 (same timestamp as the interrupting
event), since the event that we are currently processing is not the last
event in the buffer.

Link: https://lkml.kernel.org/r/X8IVJcp1gRE+FJCJ@xps-13-7390

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lwn.net/Articles/831207
Fixes: a389d86f7f ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 15:22:05 -05:00
Steven Rostedt (VMware)
55ea4cf403 ring-buffer: Update write stamp with the correct ts
The write stamp, used to calculate deltas between events, was updated with
the stale "ts" value in the "info" structure, and not with the updated "ts"
variable. This caused the deltas between events to be inaccurate, and when
crossing into a new sub buffer, had time go backwards.

Link: https://lkml.kernel.org/r/20201124223917.795844-1-elavila@google.com

Cc: stable@vger.kernel.org
Fixes: a389d86f7f ("ring-buffer: Have nested events still record running time stamp")
Reported-by: "J. Avila" <elavila@google.com>
Tested-by: Daniel Mentz <danielmentz@google.com>
Tested-by: Will McVicker <willmcvicker@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 15:21:31 -05:00
Ingo Molnar
a787bdaff8 Merge branch 'linus' into sched/core, to resolve semantic conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-11-27 11:10:50 +01:00
Peter Zijlstra
7a9f50a058 irq_work: Cleanup
Get rid of the __call_single_node union and clean up the API a little
to avoid external code relying on the structure layout as much.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2020-11-24 16:47:48 +01:00
Jakub Kicinski
56495a2442 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 19:08:46 -08:00
Daniel Xu
6fa6d28051 lib/strncpy_from_user.c: Mask out bytes after NUL terminator.
do_strncpy_from_user() may copy some extra bytes after the NUL
terminator into the destination buffer. This usually does not matter for
normal string operations. However, when BPF programs key BPF maps with
strings, this matters a lot.

A BPF program may read strings from user memory by calling the
bpf_probe_read_user_str() helper which eventually calls
do_strncpy_from_user(). The program can then key a map with the
destination buffer. BPF map keys are fixed-width and string-agnostic,
meaning that map keys are treated as a set of bytes.

The issue is when do_strncpy_from_user() overcopies bytes after the NUL
terminator, it can result in seemingly identical strings occupying
multiple slots in a BPF map. This behavior is subtle and totally
unexpected by the user.

This commit masks out the bytes following the NUL while preserving
long-sized stride in the fast path.

Fixes: 6ae08ae3de ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/21efc982b3e9f2f7b0379eed642294caaa0c27a7.1605642949.git.dxu@dxuuu.xyz
2020-11-19 11:56:16 -08:00
Dmitrii Banshchikov
d055126180 bpf: Add bpf_ktime_get_coarse_ns helper
The helper uses CLOCK_MONOTONIC_COARSE source of time that is less
accurate but more performant.

We have a BPF CGROUP_SKB firewall that supports event logging through
bpf_perf_event_output(). Each event has a timestamp and currently we use
bpf_ktime_get_ns() for it. Use of bpf_ktime_get_coarse_ns() saves ~15-20
ns in time required for event logging.

bpf_ktime_get_ns():
EgressLogByRemoteEndpoint                              113.82ns    8.79M

bpf_ktime_get_coarse_ns():
EgressLogByRemoteEndpoint                               95.40ns   10.48M

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201117184549.257280-1-me@ubique.spb.ru
2020-11-18 23:25:32 +01:00
Gabriel Krisman Bertazi
524666cb5d tracepoints: Migrate to use SYSCALL_WORK flag
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_TRACEPOINT, use it in the generic entry code
and convert the code which uses the TIF specific helper functions to use
the new *_syscall_work() helpers which either resolve to the new mode for
users of the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-6-krisman@collabora.com
2020-11-16 21:53:15 +01:00
Lukas Bulwahn
76980f5fa0 tracing: Clean up after filter logic rewriting
The functions event_{set,clear,}_no_set_filter_flag were only used in
replace_system_preds() [now, renamed to process_system_preds()].

Commit 80765597bc ("tracing: Rewrite filter logic to be simpler and
faster") removed the use of those functions in replace_system_preds().

Since then, the functions event_{set,clear,}_no_set_filter_flag were
unused. Fortunately, make CC=clang W=1 indicates this with
-Wunused-function warnings on those three functions.

So, clean up these obsolete unused functions.

Link: https://lkml.kernel.org/r/20201115155336.20248-1-lukas.bulwahn@gmail.com

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-16 12:32:34 -05:00
Jakub Kicinski
07cbce2e46 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-11-14

1) Add BTF generation for kernel modules and extend BTF infra in kernel
   e.g. support for split BTF loading and validation, from Andrii Nakryiko.

2) Support for pointers beyond pkt_end to recognize LLVM generated patterns
   on inlined branch conditions, from Alexei Starovoitov.

3) Implements bpf_local_storage for task_struct for BPF LSM, from KP Singh.

4) Enable FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage
   infra, from Martin KaFai Lau.

5) Add XDP bulk APIs that introduce a defer/flush mechanism to optimize the
   XDP_REDIRECT path, from Lorenzo Bianconi.

6) Fix a potential (although rather theoretical) deadlock of hashtab in NMI
   context, from Song Liu.

7) Fixes for cross and out-of-tree build of bpftool and runqslower allowing build
   for different target archs on same source tree, from Jean-Philippe Brucker.

8) Fix error path in htab_map_alloc() triggered from syzbot, from Eric Dumazet.

9) Move functionality from test_tcpbpf_user into the test_progs framework so it
   can run in BPF CI, from Alexander Duyck.

10) Lift hashtab key_size limit to be larger than MAX_BPF_STACK, from Florian Lehner.

Note that for the fix from Song we have seen a sparse report on context
imbalance which requires changes in sparse itself for proper annotation
detection where this is currently being discussed on linux-sparse among
developers [0]. Once we have more clarification/guidance after their fix,
Song will follow-up.

  [0] https://lore.kernel.org/linux-sparse/CAHk-=wh4bx8A8dHnX612MsDO13st6uzAz1mJ1PaHHVevJx_ZCw@mail.gmail.com/T/
      https://lore.kernel.org/linux-sparse/20201109221345.uklbp3lzgq6g42zb@ltop.local/T/

* git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (66 commits)
  net: mlx5: Add xdp tx return bulking support
  net: mvpp2: Add xdp tx return bulking support
  net: mvneta: Add xdp tx return bulking support
  net: page_pool: Add bulk support for ptr_ring
  net: xdp: Introduce bulking for xdp tx return path
  bpf: Expose bpf_d_path helper to sleepable LSM hooks
  bpf: Augment the set of sleepable LSM hooks
  bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Rename some functions in bpf_sk_storage
  bpf: Folding omem_charge() into sk_storage_charge()
  selftests/bpf: Add asm tests for pkt vs pkt_end comparison.
  selftests/bpf: Add skb_pkt_end test
  bpf: Support for pointers beyond pkt_end.
  tools/bpf: Always run the *-clean recipes
  tools/bpf: Add bootstrap/ to .gitignore
  bpf: Fix NULL dereference in bpf_task_storage
  tools/bpftool: Fix build slowdown
  tools/runqslower: Build bpftool using HOSTCC
  tools/runqslower: Enable out-of-tree build
  ...
====================

Link: https://lore.kernel.org/r/20201114020819.29584-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 09:13:41 -08:00
Kaixu Xia
b111545d26 tracing: Remove the useless value assignment in test_create_synth_event()
The value of variable ret is overwritten on the delete branch in the
test_create_synth_event() and we care more about the above error than
this delete portion. Remove it.

Link: https://lkml.kernel.org/r/1605283360-6804-1-git-send-email-kaixuxia@tencent.com

Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-13 12:19:52 -05:00
Steven Rostedt (VMware)
02a474ca26 ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default
Currently, the only way to get access to the registers of a function via a
ftrace callback is to set the "FL_SAVE_REGS" bit in the ftrace_ops. But as this
saves all regs as if a breakpoint were to trigger (for use with kprobes), it
is expensive.

The regs are already saved on the stack for the default ftrace callbacks, as
that is required otherwise a function being traced will get the wrong
arguments and possibly crash. And on x86, the arguments are already stored
where they would be on a pt_regs structure to use that code for both the
regs version of a callback, it makes sense to pass that information always
to all functions.

If an architecture does this (as x86_64 now does), it is to set
HAVE_DYNAMIC_FTRACE_WITH_ARGS, and this will let the generic code that it
could have access to arguments without having to set the flags.

This also includes having the stack pointer being saved, which could be used
for accessing arguments on the stack, as well as having the function graph
tracer not require its own trampoline!

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-13 12:14:59 -05:00
Steven Rostedt (VMware)
d19ad0775d ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs
In preparation to have arguments of a function passed to callbacks attached
to functions as default, change the default callback prototype to receive a
struct ftrace_regs as the forth parameter instead of a pt_regs.

For callbacks that set the FL_SAVE_REGS flag in their ftrace_ops flags, they
will now need to get the pt_regs via a ftrace_get_regs() helper call. If
this is called by a callback that their ftrace_ops did not have a
FL_SAVE_REGS flag set, it that helper function will return NULL.

This will allow the ftrace_regs to hold enough just to get the parameters
and stack pointer, but without the worry that callbacks may have a pt_regs
that is not completely filled.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-13 12:14:55 -05:00
KP Singh
6f100640ca bpf: Expose bpf_d_path helper to sleepable LSM hooks
Sleepable hooks are never called from an NMI/interrupt context, so it
is safe to use the bpf_d_path helper in LSM programs attaching to these
hooks.

The helper is not restricted to sleepable programs and merely uses the
list of sleepable hooks as the initial subset of LSM hooks where it can
be used.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201113005930.541956-3-kpsingh@chromium.org
2020-11-13 15:47:21 +01:00
Martin KaFai Lau
8e4597c627 bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
This patch enables the FENTRY/FEXIT/RAW_TP tracing program to use
the bpf_sk_storage_(get|delete) helper, so those tracing programs
can access the sk's bpf_local_storage and the later selftest
will show some examples.

The bpf_sk_storage is currently used in bpf-tcp-cc, tc,
cg sockops...etc which is running either in softirq or
task context.

This patch adds bpf_sk_storage_get_tracing_proto and
bpf_sk_storage_delete_tracing_proto.  They will check
in runtime that the helpers can only be called when serving
softirq or running in a task context.  That should enable
most common tracing use cases on sk.

During the load time, the new tracing_allowed() function
will ensure the tracing prog using the bpf_sk_storage_(get|delete)
helper is not tracing any bpf_sk_storage*() function itself.
The sk is passed as "void *" when calling into bpf_local_storage.

This patch only allows tracing a kernel function.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201112211313.2587383-1-kafai@fb.com
2020-11-12 18:39:28 -08:00
Qiujun Huang
2b5894cc33 tracing: Fix some typos in comments
s/detetector/detector/
s/enfoced/enforced/
s/writen/written/
s/actualy/actually/
s/bascially/basically/
s/Regarldess/Regardless/
s/zeroes/zeros/
s/followd/followed/
s/incrememented/incremented/
s/separatelly/separately/
s/accesible/accessible/
s/sythetic/synthetic/
s/enabed/enabled/
s/heurisitc/heuristic/
s/assocated/associated/
s/otherwides/otherwise/
s/specfied/specified/
s/seaching/searching/
s/hierachry/hierarchy/
s/internel/internal/
s/Thise/This/

Link: https://lkml.kernel.org/r/20201029150554.3354-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-10 20:39:40 -05:00
Alex Shi
045e269c1e ftrace: Remove unused varible 'ret'
'ret' in 2 functions are not used. and one of them is a void function.
So remove them to avoid gcc warning:
kernel/trace/ftrace.c:4166:6: warning: variable ‘ret’ set but not used
[-Wunused-but-set-variable]
kernel/trace/ftrace.c:5571:6: warning: variable ‘ret’ set but not used
[-Wunused-but-set-variable]

Link: https://lkml.kernel.org/r/1604674486-52350-1-git-send-email-alex.shi@linux.alibaba.com

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-10 20:39:40 -05:00
Steven Rostedt (VMware)
28575c61ea ring-buffer: Add recording of ring buffer recursion into recursed_functions
Add a new config RING_BUFFER_RECORD_RECURSION that will place functions that
recurse from the ring buffer into the ftrace recused_functions file.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-10 20:39:40 -05:00
Steven Rostedt (VMware)
60602cb549 fgraph: Make overruns 4 bytes in graph stack structure
Inspecting the data structures of the function graph tracer, I found that
the overrun value is unsigned long, which is 8 bytes on a 64 bit machine,
and not only that, the depth is an int (4 bytes). The overrun can be simply
an unsigned int (4 bytes) and pack the ftrace_graph_ret structure better.

The depth is moved up next to the func, as it is used more often with func,
and improves cache locality.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-10 20:39:36 -05:00
Wang Qing
abbaa433de bpf: Fix passing zero to PTR_ERR() in bpf_btf_printf_prepare
There is a bug when passing zero to PTR_ERR() and return.
Fix the smatch error.

Fixes: c4d0bfb450 ("bpf: Add bpf_snprintf_btf helper")
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1604735144-686-1-git-send-email-wangqing@vivo.com
2020-11-09 22:37:19 +01:00
Ingo Molnar
666fab4a3e Merge branch 'linus' into perf/kprobes
Conflicts:
	include/asm-generic/atomic-instrumented.h
	kernel/kprobes.c

Use the upstream atomic-instrumented.h checksum, and pick
the kprobes version of kernel/kprobes.c, which effectively
reverts this upstream workaround:

  645f224e7b: ("kprobes: Tell lockdep about kprobe nesting")

Since the new code *should* be fine without nesting.

Knock on wood ...

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-11-07 13:20:17 +01:00
Ingo Molnar
0a986ea81e Merge branch 'linus' into perf/kprobes
Merge recent kprobes updates into perf/kprobes that came from -mm.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-11-07 13:18:49 +01:00
KP Singh
3ca1032ab7 bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID
The currently available bpf_get_current_task returns an unsigned integer
which can be used along with BPF_CORE_READ to read data from
the task_struct but still cannot be used as an input argument to a
helper that accepts an ARG_PTR_TO_BTF_ID of type task_struct.

In order to implement this helper a new return type, RET_PTR_TO_BTF_ID,
is added. This is similar to RET_PTR_TO_BTF_ID_OR_NULL but does not
require checking the nullness of returned pointer.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-6-kpsingh@chromium.org
2020-11-06 08:08:37 -08:00
Steven Rostedt (VMware)
773c167050 ftrace: Add recording of functions that caused recursion
This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
"recursed_functions" all the functions that caused recursion while a
callback to the function tracer was running.

Link: https://lkml.kernel.org/r/20201106023548.102375687@goodmis.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Guo Ren <guoren@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-csky@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: live-patching@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-06 08:42:26 -05:00
Steven Rostedt (VMware)
a25d036d93 ftrace: Reverse what the RECURSION flag means in the ftrace_ops
Now that all callbacks are recursion safe, reverse the meaning of the
RECURSION flag and rename it from RECURSION_SAFE to simply RECURSION.
Now only callbacks that request to have recursion protecting it will
have the added trampoline to do so.

Also remove the outdated comment about "PER_CPU" when determining to
use the ftrace_ops_assist_func.

Link: https://lkml.kernel.org/r/20201028115613.742454631@goodmis.org
Link: https://lkml.kernel.org/r/20201106023547.904270143@goodmis.org

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh  Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-06 08:42:12 -05:00
Steven Rostedt (VMware)
5d029b035b perf/ftrace: Check for rcu_is_watching() in callback function
If a ftrace callback requires "rcu_is_watching", then it adds the
FTRACE_OPS_FL_RCU flag and it will not be called if RCU is not "watching".
But this means that it will use a trampoline when called, and this slows
down the function tracing a tad. By checking rcu_is_watching() from within
the callback, it no longer needs the RCU flag set in the ftrace_ops and it
can be safely called directly.

Link: https://lkml.kernel.org/r/20201028115613.591878956@goodmis.org
Link: https://lkml.kernel.org/r/20201106023547.711035826@goodmis.org

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh  Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-06 08:42:05 -05:00
Steven Rostedt (VMware)
5d15a624c3 perf/ftrace: Add recursion protection to the ftrace callback
If a ftrace callback does not supply its own recursion protection and
does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will
make a helper trampoline to do so before calling the callback instead of
just calling the callback directly.

The default for ftrace_ops is going to change. It will expect that handlers
provide their own recursion protection, unless its ftrace_ops states
otherwise.

Link: https://lkml.kernel.org/r/20201028115613.444477858@goodmis.org
Link: https://lkml.kernel.org/r/20201106023547.466892083@goodmis.org

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh  Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-06 08:41:56 -05:00
Steven Rostedt (VMware)
6e4eb9cb22 ftrace: Add ftrace_test_recursion_trylock() helper function
To make it easier for ftrace callbacks to have recursion protection, provide
a ftrace_test_recursion_trylock() and ftrace_test_recursion_unlock() helper
that tests for recursion.

Link: https://lkml.kernel.org/r/20201028115612.634927593@goodmis.org
Link: https://lkml.kernel.org/r/20201106023546.378584067@goodmis.org

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-06 08:33:23 -05:00
Steven Rostedt (VMware)
0264c8c9e1 ftrace: Move the recursion testing into global headers
Currently, if a callback is registered to a ftrace function and its
ftrace_ops does not have the RECURSION flag set, it is encapsulated in a
helper function that does the recursion for it.

Really, all the callbacks should have their own recursion protection for
performance reasons. But they should not all implement their own. Move the
recursion helpers to global headers, so that all callbacks can use them.

Link: https://lkml.kernel.org/r/20201028115612.460535535@goodmis.org
Link: https://lkml.kernel.org/r/20201106023546.166456258@goodmis.org

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-06 08:33:23 -05:00
Steven Rostedt (VMware)
561ca66910 tracing: Make -ENOMEM the default error for parse_synth_field()
parse_synth_field() returns a pointer and requires that errors get
surrounded by ERR_PTR(). The ret variable is initialized to zero, but should
never be used as zero, and if it is, it could cause a false return code and
produce a NULL pointer dereference. It makes no sense to set ret to zero.

Set ret to -ENOMEM (the most common error case), and have any other errors
set it to something else. This removes the need to initialize ret on *every*
error branch.

Fixes: 761a8c58db ("tracing, synthetic events: Replace buggy strcat() with seq_buf operations")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02 15:58:32 -05:00
Steven Rostedt (VMware)
b02414c8f0 ring-buffer: Fix recursion protection transitions between interrupt context
The recursion protection of the ring buffer depends on preempt_count() to be
correct. But it is possible that the ring buffer gets called after an
interrupt comes in but before it updates the preempt_count(). This will
trigger a false positive in the recursion code.

Use the same trick from the ftrace function callback recursion code which
uses a "transition" bit that gets set, to allow for a single recursion for
to handle transitions between contexts.

Cc: stable@vger.kernel.org
Fixes: 567cd4da54 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02 15:58:32 -05:00
Qiujun Huang
906695e593 tracing: Fix the checking of stackidx in __ftrace_trace_stack
The array size is FTRACE_KSTACK_NESTING, so the index FTRACE_KSTACK_NESTING
is illegal too. And fix two typos by the way.

Link: https://lkml.kernel.org/r/20201031085714.2147-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02 11:21:40 -05:00
Steven Rostedt (VMware)
726b3d3f14 ftrace: Handle tracing when switching between context
When an interrupt or NMI comes in and switches the context, there's a delay
from when the preempt_count() shows the update. As the preempt_count() is
used to detect recursion having each context have its own bit get set when
tracing starts, and if that bit is already set, it is considered a recursion
and the function exits. But if this happens in that section where context
has changed but preempt_count() has not been updated, this will be
incorrectly flagged as a recursion.

To handle this case, create another bit call TRANSITION and test it if the
current context bit is already set. Flag the call as a recursion if the
TRANSITION bit is already set, and if not, set it and continue. The
TRANSITION bit will be cleared normally on the return of the function that
set it, or if the current context bit is clear, set it and clear the
TRANSITION bit to allow for another transition between the current context
and an even higher one.

Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02 08:52:18 -05:00
Steven Rostedt (VMware)
ee11b93f95 ftrace: Fix recursion check for NMI test
The code that checks recursion will work to only do the recursion check once
if there's nested checks. The top one will do the check, the other nested
checks will see recursion was already checked and return zero for its "bit".
On the return side, nothing will be done if the "bit" is zero.

The problem is that zero is returned for the "good" bit when in NMI context.
This will set the bit for NMIs making it look like *all* NMI tracing is
recursing, and prevent tracing of anything in NMI context!

The simple fix is to return "bit + 1" and subtract that bit on the end to
get the real bit.

Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02 08:52:18 -05:00
Qiujun Huang
c1acb4ac1a tracing: Fix out of bounds write in get_trace_buf
The nesting count of trace_printk allows for 4 levels of nesting. The
nesting counter starts at zero and is incremented before being used to
retrieve the current context's buffer. But the index to the buffer uses the
nesting counter after it was incremented, and not its original number,
which in needs to do.

Link: https://lkml.kernel.org/r/20201029161905.4269-1-hqjagain@gmail.com

Cc: stable@vger.kernel.org
Fixes: 3d9622c12c ("tracing: Add barrier to trace_printk() buffer nesting modification")
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-02 08:52:18 -05:00
Arnd Bergmann
77f6c0b874 timekeeping: remove arch_gettimeoffset
With Arm EBSA110 gone, nothing uses it any more, so the corresponding
code and the Kconfig option can be removed.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-10-30 21:57:04 +01:00
Linus Torvalds
23859ae444 Fix synthetic event "strcat" overrun
New synthetic event code used strcat() and miscalculated the ending, causing
 the concatenation to write beyond the allocated memory.
 
 Instead of using strncat(), the code is switched over to seq_buf which has
 all the mechanisms in place to protect against writing more than what is
 allocated, and cleans up the code a bit.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX5lZkBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qu8+AQDfg1UM12HUIs1XRhbXBxf9g3kjwrJh
 nuoMilEZZstSCgD8DDQiPckOS9NfrdkyCPQ86tIKoOsGPowoA21sNOHPvQQ=
 =+V+S
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix synthetic event "strcat" overrun

  New synthetic event code used strcat() and miscalculated the ending,
  causing the concatenation to write beyond the allocated memory.

  Instead of using strncat(), the code is switched over to seq_buf which
  has all the mechanisms in place to protect against writing more than
  what is allocated, and cleans up the code a bit"

* tag 'trace-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing, synthetic events: Replace buggy strcat() with seq_buf operations
2020-10-28 12:05:14 -07:00
Steven Rostedt (VMware)
761a8c58db tracing, synthetic events: Replace buggy strcat() with seq_buf operations
There was a memory corruption bug happening while running the synthetic
event selftests:

 kmemleak: Cannot insert 0xffff8c196fa2afe5 into the object search tree (overlaps existing)
 CPU: 5 PID: 6866 Comm: ftracetest Tainted: G        W         5.9.0-rc5-test+ #577
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  dump_stack+0x8d/0xc0
  create_object.cold+0x3b/0x60
  slab_post_alloc_hook+0x57/0x510
  ? tracing_map_init+0x178/0x340
  __kmalloc+0x1b1/0x390
  tracing_map_init+0x178/0x340
  event_hist_trigger_func+0x523/0xa40
  trigger_process_regex+0xc5/0x110
  event_trigger_write+0x71/0xd0
  vfs_write+0xca/0x210
  ksys_write+0x70/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fef0a63a487
 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
 RSP: 002b:00007fff76f18398 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000039 RCX: 00007fef0a63a487
 RDX: 0000000000000039 RSI: 000055eb3b26d690 RDI: 0000000000000001
 RBP: 000055eb3b26d690 R08: 000000000000000a R09: 0000000000000038
 R10: 000055eb3b2cdb80 R11: 0000000000000246 R12: 0000000000000039
 R13: 00007fef0a70b500 R14: 0000000000000039 R15: 00007fef0a70b700
 kmemleak: Kernel memory leak detector disabled
 kmemleak: Object 0xffff8c196fa2afe0 (size 8):
 kmemleak:   comm "ftracetest", pid 6866, jiffies 4295082531
 kmemleak:   min_count = 1
 kmemleak:   count = 0
 kmemleak:   flags = 0x1
 kmemleak:   checksum = 0
 kmemleak:   backtrace:
      __kmalloc+0x1b1/0x390
      tracing_map_init+0x1be/0x340
      event_hist_trigger_func+0x523/0xa40
      trigger_process_regex+0xc5/0x110
      event_trigger_write+0x71/0xd0
      vfs_write+0xca/0x210
      ksys_write+0x70/0xf0
      do_syscall_64+0x33/0x40
      entry_SYSCALL_64_after_hwframe+0x44/0xa9

The cause came down to a use of strcat() that was adding an string that was
shorten, but the strcat() did not take that into account.

strcat() is extremely dangerous as it does not care how big the buffer is.
Replace it with seq_buf operations that prevent the buffer from being
overwritten if what is being written is bigger than the buffer.

Fixes: 10819e2579 ("tracing: Handle synthetic event array field type checking correctly")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-27 09:25:36 -04:00
Joe Perches
33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Linus Torvalds
a5e5c274c9 ring-buffer fix
The success return value of ring_buffer_resize() is stated to be zero,
 and checked that way. But it is incorrectly returning the size allocated.
 
 Also, a fix to a comment.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX5LnlRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtPxAP439tmV1wWK4uBF6TLBahaPdj0tGe5b
 NT/ASnYjokZKWgEA//vmUBMMmNBohcd8DkkTu8Pp3tkc2b4RLR5WJIpXGwk=
 =1Og+
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing ring-buffer fix from Steven Rostedt:
 "The success return value of ring_buffer_resize() is stated to be
  zero and checked that way.

  But it was incorrectly returning the size allocated.

  Also, a fix to a comment"

* tag 'trace-v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Update the description for ring_buffer_wait
  ring-buffer: Return 0 on success from ring_buffer_resize()
2020-10-23 17:09:38 -07:00
Qiujun Huang
e1981f75d3 ring-buffer: Update the description for ring_buffer_wait
The function changed at some point, but the description was not
updated.

Link: https://lkml.kernel.org/r/20201017095246.5170-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-22 11:26:26 -04:00
Qiujun Huang
0a1754b2a9 ring-buffer: Return 0 on success from ring_buffer_resize()
We don't need to check the new buffer size, and the return value
had confused resize_buffer_duplicate_size().
...
	ret = ring_buffer_resize(trace_buf->buffer,
		per_cpu_ptr(size_buf->data,cpu_id)->entries, cpu_id);
	if (ret == 0)
		per_cpu_ptr(trace_buf->data, cpu_id)->entries =
			per_cpu_ptr(size_buf->data, cpu_id)->entries;
...

Link: https://lkml.kernel.org/r/20201019142242.11560-1-hqjagain@gmail.com

Cc: stable@vger.kernel.org
Fixes: d60da506cb ("tracing: Add a resize function to make one buffer equivalent to another buffer")
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-22 11:24:10 -04:00
Linus Torvalds
93f3d8f54a Tracing: Fix mismatch section of adding early trace events
- Fixes the issue of a mismatch section that was missed due to gcc
   inlining the offending function, while clang did not (and reported
   the issue).
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX4oAlhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qkyBAQDnmsIHN1BPMdWbSFJeRYobSYFKoSzh
 qwcqy0prvNeFvgD9GeC7wZeJaUWCt4zRbpZhuclIq2BNqoiA/llE67zXQwQ=
 =W1ca
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix mismatch section of adding early trace events.

  Fixes the issue of a mismatch section that was missed due to gcc
  inlining the offending function, while clang did not (and reported the
  issue)"

* tag 'trace-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Remove __init from __trace_early_add_new_event()
2020-10-16 14:56:52 -07:00
Masami Hiramatsu
ce66f61364 tracing: Remove __init from __trace_early_add_new_event()
The commit 720dee53ad ("tracing/boot: Initialize per-instance event
list in early boot") removes __init from __trace_early_add_events()
but __trace_early_add_new_event() still has __init and will cause a
section mismatch.

Remove __init from __trace_early_add_new_event() as same as
__trace_early_add_events().

Link: https://lore.kernel.org/lkml/CAHk-=wjU86UhovK4XuwvCqTOfc+nvtpAuaN2PJBz15z=w=u0Xg@mail.gmail.com/

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-16 09:43:36 -04:00
Linus Torvalds
9ff9b0d392 networking changes for the 5.10 merge window
Add redirect_neigh() BPF packet redirect helper, allowing to limit stack
 traversal in common container configs and improving TCP back-pressure.
 Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
 
 Expand netlink policy support and improve policy export to user space.
 (Ge)netlink core performs request validation according to declared
 policies. Expand the expressiveness of those policies (min/max length
 and bitmasks). Allow dumping policies for particular commands.
 This is used for feature discovery by user space (instead of kernel
 version parsing or trial and error).
 
 Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge.
 
 Allow more than 255 IPv4 multicast interfaces.
 
 Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
 packets of TCPv6.
 
 In Multi-patch TCP (MPTCP) support concurrent transmission of data
 on multiple subflows in a load balancing scenario. Enhance advertising
 addresses via the RM_ADDR/ADD_ADDR options.
 
 Support SMC-Dv2 version of SMC, which enables multi-subnet deployments.
 
 Allow more calls to same peer in RxRPC.
 
 Support two new Controller Area Network (CAN) protocols -
 CAN-FD and ISO 15765-2:2016.
 
 Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
 kernel problem.
 
 Add TC actions for implementing MPLS L2 VPNs.
 
 Improve nexthop code - e.g. handle various corner cases when nexthop
 objects are removed from groups better, skip unnecessary notifications
 and make it easier to offload nexthops into HW by converting
 to a blocking notifier.
 
 Support adding and consuming TCP header options by BPF programs,
 opening the doors for easy experimental and deployment-specific
 TCP option use.
 
 Reorganize TCP congestion control (CC) initialization to simplify life
 of TCP CC implemented in BPF.
 
 Add support for shipping BPF programs with the kernel and loading them
 early on boot via the User Mode Driver mechanism, hence reusing all the
 user space infra we have.
 
 Support sleepable BPF programs, initially targeting LSM and tracing.
 
 Add bpf_d_path() helper for returning full path for given 'struct path'.
 
 Make bpf_tail_call compatible with bpf-to-bpf calls.
 
 Allow BPF programs to call map_update_elem on sockmaps.
 
 Add BPF Type Format (BTF) support for type and enum discovery, as
 well as support for using BTF within the kernel itself (current use
 is for pretty printing structures).
 
 Support listing and getting information about bpf_links via the bpf
 syscall.
 
 Enhance kernel interfaces around NIC firmware update. Allow specifying
 overwrite mask to control if settings etc. are reset during update;
 report expected max time operation may take to users; support firmware
 activation without machine reboot incl. limits of how much impact
 reset may have (e.g. dropping link or not).
 
 Extend ethtool configuration interface to report IEEE-standard
 counters, to limit the need for per-vendor logic in user space.
 
 Adopt or extend devlink use for debug, monitoring, fw update
 in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw,
 mv88e6xxx, dpaa2-eth).
 
 In mlxsw expose critical and emergency SFP module temperature alarms.
 Refactor port buffer handling to make the defaults more suitable and
 support setting these values explicitly via the DCBNL interface.
 
 Add XDP support for Intel's igb driver.
 
 Support offloading TC flower classification and filtering rules to
 mscc_ocelot switches.
 
 Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
 fixed interval period pulse generator and one-step timestamping in
 dpaa-eth.
 
 Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
 offload.
 
 Add Lynx PHY/PCS MDIO module, and convert various drivers which have
 this HW to use it. Convert mvpp2 to split PCS.
 
 Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
 7-port Mediatek MT7531 IP.
 
 Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
 and wcn3680 support in wcn36xx.
 
 Improve performance for packets which don't require much offloads
 on recent Mellanox NICs by 20% by making multiple packets share
 a descriptor entry.
 
 Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto
 subtree to drivers/net. Move MDIO drivers out of the phy directory.
 
 Clean up a lot of W=1 warnings, reportedly the actively developed
 subsections of networking drivers should now build W=1 warning free.
 
 Make sure drivers don't use in_interrupt() to dynamically adapt their
 code. Convert tasklets to use new tasklet_setup API (sadly this
 conversion is not yet complete).
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+ItRwACgkQMUZtbf5S
 IrtTMg//UxpdR/MirT1DatBU0K/UGAZY82hV7F/UC8tPgjfHZeHvWlDFxfi3YP81
 PtPKbhRZ7DhwBXefUp6nY3UdvjftrJK2lJm8prJUPSsZRye8Wlcb7y65q7/P2y2U
 Efucyopg6RUrmrM0DUsIGYGJgylQLHnMYUl/keCsD4t5Bp4ksyi9R2t5eitGoWzh
 r3QGdbSa0AuWx4iu0i+tqp6Tj0ekMBMXLVb35dtU1t0joj2KTNEnSgABN3prOa8E
 iWYf2erOau68Ogp3yU3miCy0ZU4p/7qGHTtzbcp677692P/ekak6+zmfHLT9/Pjy
 2Stq2z6GoKuVxdktr91D9pA3jxG4LxSJmr0TImcGnXbvkMP3Ez3g9RrpV5fn8j6F
 mZCH8TKZAoD5aJrAJAMkhZmLYE1pvDa7KolSk8WogXrbCnTEb5Nv8FHTS1Qnk3yl
 wSKXuvutFVNLMEHCnWQLtODbTST9DI/aOi6EctPpuOA/ZyL1v3pl+gfp37S+LUTe
 owMnT/7TdvKaTD0+gIyU53M6rAWTtr5YyRQorX9awIu/4Ha0F0gYD7BJZQUGtegp
 HzKt59NiSrFdbSH7UdyemdBF4LuCgIhS7rgfeoUXMXmuPHq7eHXyHZt5dzPPa/xP
 81P0MAvdpFVwg8ij2yp2sHS7sISIRKq17fd1tIewUabxQbjXqPc=
 =bc1U
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:

 - Add redirect_neigh() BPF packet redirect helper, allowing to limit
   stack traversal in common container configs and improving TCP
   back-pressure.

   Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.

 - Expand netlink policy support and improve policy export to user
   space. (Ge)netlink core performs request validation according to
   declared policies. Expand the expressiveness of those policies
   (min/max length and bitmasks). Allow dumping policies for particular
   commands. This is used for feature discovery by user space (instead
   of kernel version parsing or trial and error).

 - Support IGMPv3/MLDv2 multicast listener discovery protocols in
   bridge.

 - Allow more than 255 IPv4 multicast interfaces.

 - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
   packets of TCPv6.

 - In Multi-patch TCP (MPTCP) support concurrent transmission of data on
   multiple subflows in a load balancing scenario. Enhance advertising
   addresses via the RM_ADDR/ADD_ADDR options.

 - Support SMC-Dv2 version of SMC, which enables multi-subnet
   deployments.

 - Allow more calls to same peer in RxRPC.

 - Support two new Controller Area Network (CAN) protocols - CAN-FD and
   ISO 15765-2:2016.

 - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
   kernel problem.

 - Add TC actions for implementing MPLS L2 VPNs.

 - Improve nexthop code - e.g. handle various corner cases when nexthop
   objects are removed from groups better, skip unnecessary
   notifications and make it easier to offload nexthops into HW by
   converting to a blocking notifier.

 - Support adding and consuming TCP header options by BPF programs,
   opening the doors for easy experimental and deployment-specific TCP
   option use.

 - Reorganize TCP congestion control (CC) initialization to simplify
   life of TCP CC implemented in BPF.

 - Add support for shipping BPF programs with the kernel and loading
   them early on boot via the User Mode Driver mechanism, hence reusing
   all the user space infra we have.

 - Support sleepable BPF programs, initially targeting LSM and tracing.

 - Add bpf_d_path() helper for returning full path for given 'struct
   path'.

 - Make bpf_tail_call compatible with bpf-to-bpf calls.

 - Allow BPF programs to call map_update_elem on sockmaps.

 - Add BPF Type Format (BTF) support for type and enum discovery, as
   well as support for using BTF within the kernel itself (current use
   is for pretty printing structures).

 - Support listing and getting information about bpf_links via the bpf
   syscall.

 - Enhance kernel interfaces around NIC firmware update. Allow
   specifying overwrite mask to control if settings etc. are reset
   during update; report expected max time operation may take to users;
   support firmware activation without machine reboot incl. limits of
   how much impact reset may have (e.g. dropping link or not).

 - Extend ethtool configuration interface to report IEEE-standard
   counters, to limit the need for per-vendor logic in user space.

 - Adopt or extend devlink use for debug, monitoring, fw update in many
   drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
   dpaa2-eth).

 - In mlxsw expose critical and emergency SFP module temperature alarms.
   Refactor port buffer handling to make the defaults more suitable and
   support setting these values explicitly via the DCBNL interface.

 - Add XDP support for Intel's igb driver.

 - Support offloading TC flower classification and filtering rules to
   mscc_ocelot switches.

 - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
   fixed interval period pulse generator and one-step timestamping in
   dpaa-eth.

 - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
   offload.

 - Add Lynx PHY/PCS MDIO module, and convert various drivers which have
   this HW to use it. Convert mvpp2 to split PCS.

 - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
   7-port Mediatek MT7531 IP.

 - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
   and wcn3680 support in wcn36xx.

 - Improve performance for packets which don't require much offloads on
   recent Mellanox NICs by 20% by making multiple packets share a
   descriptor entry.

 - Move chelsio inline crypto drivers (for TLS and IPsec) from the
   crypto subtree to drivers/net. Move MDIO drivers out of the phy
   directory.

 - Clean up a lot of W=1 warnings, reportedly the actively developed
   subsections of networking drivers should now build W=1 warning free.

 - Make sure drivers don't use in_interrupt() to dynamically adapt their
   code. Convert tasklets to use new tasklet_setup API (sadly this
   conversion is not yet complete).

* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
  Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
  net, sockmap: Don't call bpf_prog_put() on NULL pointer
  bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
  bpf, sockmap: Add locking annotations to iterator
  netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
  net: fix pos incrementment in ipv6_route_seq_next
  net/smc: fix invalid return code in smcd_new_buf_create()
  net/smc: fix valid DMBE buffer sizes
  net/smc: fix use-after-free of delayed events
  bpfilter: Fix build error with CONFIG_BPFILTER_UMH
  cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
  net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
  bpf: Fix register equivalence tracking.
  rxrpc: Fix loss of final ack on shutdown
  rxrpc: Fix bundle counting for exclusive connections
  netfilter: restore NF_INET_NUMHOOKS
  ibmveth: Identify ingress large send packets.
  ibmveth: Switch order of ibmveth_helper calls.
  cxgb4: handle 4-tuple PEDIT to NAT mode translation
  selftests: Add VRF route leaking tests
  ...
2020-10-15 18:42:13 -07:00
Linus Torvalds
fefa636d81 Updates for tracing and bootconfig:
- Add support for "bool" type in synthetic events
 
 - Add per instance tracing for bootconfig
 
 - Support perf-style return probe ("SYMBOL%return") in kprobes and uprobes
 
 - Allow for kprobes to be enabled earlier in boot up
 
 - Added tracepoint helper function to allow testing if tracepoints are
   enabled in headers
 
 - Synthetic events can now have dynamic strings (variable length)
 
 - Various fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX4iMDRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qrMPAP0UAfOeQcYxBAw9y8oX7oJnBBylLFTR
 CICOVEhBYC/xIQD/edVPEUt77ozM/Bplwv8BiO4QxFjgZFqtpZI8mskIfAo=
 =sbny
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Updates for tracing and bootconfig:

   - Add support for "bool" type in synthetic events

   - Add per instance tracing for bootconfig

   - Support perf-style return probe ("SYMBOL%return") in kprobes and
     uprobes

   - Allow for kprobes to be enabled earlier in boot up

   - Added tracepoint helper function to allow testing if tracepoints
     are enabled in headers

   - Synthetic events can now have dynamic strings (variable length)

   - Various fixes and cleanups"

* tag 'trace-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (58 commits)
  tracing: support "bool" type in synthetic trace events
  selftests/ftrace: Add test case for synthetic event syntax errors
  tracing: Handle synthetic event array field type checking correctly
  selftests/ftrace: Change synthetic event name for inter-event-combined test
  tracing: Add synthetic event error logging
  tracing: Check that the synthetic event and field names are legal
  tracing: Move is_good_name() from trace_probe.h to trace.h
  tracing: Don't show dynamic string internals in synthetic event description
  tracing: Fix some typos in comments
  tracing/boot: Add ftrace.instance.*.alloc_snapshot option
  tracing: Fix race in trace_open and buffer resize call
  tracing: Check return value of __create_val_fields() before using its result
  tracing: Fix synthetic print fmt check for use of __get_str()
  tracing: Remove a pointless assignment
  ftrace: ftrace_global_list is renamed to ftrace_ops_list
  ftrace: Format variable declarations of ftrace_allocate_records
  ftrace: Simplify the calculation of page number for ftrace_page->records
  ftrace: Simplify the dyn_ftrace->flags macro
  ftrace: Simplify the hash calculation
  ftrace: Use fls() to get the bits for dup_hash()
  ...
2020-10-15 15:51:28 -07:00
Linus Torvalds
726eb70e0d Char/Misc driver patches for 5.10-rc1
Here is the big set of char, misc, and other assorted driver subsystem
 patches for 5.10-rc1.
 
 There's a lot of different things in here, all over the drivers/
 directory.  Some summaries:
 	- soundwire driver updates
 	- habanalabs driver updates
 	- extcon driver updates
 	- nitro_enclaves new driver
 	- fsl-mc driver and core updates
 	- mhi core and bus updates
 	- nvmem driver updates
 	- eeprom driver updates
 	- binder driver updates and fixes
 	- vbox minor bugfixes
 	- fsi driver updates
 	- w1 driver updates
 	- coresight driver updates
 	- interconnect driver updates
 	- misc driver updates
 	- other minor driver updates
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX4g8YQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yngKgCeNpArCP/9vQJRK9upnDm8ZLunSCUAn1wUT/2A
 /bTQ42c/WRQ+LU828GSM
 =6sO2
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char, misc, and other assorted driver subsystem
  patches for 5.10-rc1.

  There's a lot of different things in here, all over the drivers/
  directory. Some summaries:

   - soundwire driver updates

   - habanalabs driver updates

   - extcon driver updates

   - nitro_enclaves new driver

   - fsl-mc driver and core updates

   - mhi core and bus updates

   - nvmem driver updates

   - eeprom driver updates

   - binder driver updates and fixes

   - vbox minor bugfixes

   - fsi driver updates

   - w1 driver updates

   - coresight driver updates

   - interconnect driver updates

   - misc driver updates

   - other minor driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (396 commits)
  binder: fix UAF when releasing todo list
  docs: w1: w1_therm: Fix broken xref, mistakes, clarify text
  misc: Kconfig: fix a HISI_HIKEY_USB dependency
  LSM: Fix type of id parameter in kernel_post_load_data prototype
  misc: Kconfig: add a new dependency for HISI_HIKEY_USB
  firmware_loader: fix a kernel-doc markup
  w1: w1_therm: make w1_poll_completion static
  binder: simplify the return expression of binder_mmap
  test_firmware: Test partial read support
  firmware: Add request_partial_firmware_into_buf()
  firmware: Store opt_flags in fw_priv
  fs/kernel_file_read: Add "offset" arg for partial reads
  IMA: Add support for file reads without contents
  LSM: Add "contents" flag to kernel_read_file hook
  module: Call security_kernel_post_load_data()
  firmware_loader: Use security_post_load_data()
  LSM: Introduce kernel_post_load_data() hook
  fs/kernel_read_file: Add file_size output argument
  fs/kernel_read_file: Switch buffer size arg to size_t
  fs/kernel_read_file: Remove redundant size argument
  ...
2020-10-15 10:01:51 -07:00
Axel Rasmussen
6107742d15 tracing: support "bool" type in synthetic trace events
It's common [1] to define tracepoint fields as "bool" when they contain
a true / false value. Currently, defining a synthetic event with a
"bool" field yields EINVAL. It's possible to work around this by using
e.g. u8 (assuming sizeof(bool) is 1, and bool is unsigned; if either of
these properties don't match, you get EINVAL [2]).

Supporting "bool" explicitly makes hooking this up easier and more
portable for userspace.

[1]: grep -r "bool" include/trace/events/
[2]: check_synth_field() in kernel/trace/trace_events_hist.c

Link: https://lkml.kernel.org/r/20201009220524.485102-2-axelrasmussen@google.com

Acked-by: Michel Lespinasse <walken@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:14 -04:00
Tom Zanussi
10819e2579 tracing: Handle synthetic event array field type checking correctly
Since synthetic event array types are derived from the field name,
there may be a semicolon at the end of the type which should be
stripped off.

If there are more characters following that, normal type string
checking will result in an invalid type.

Without this patch, you can end up with an invalid field type string
that gets displayed in both the synthetic event description and the
event format:

Before:

  # echo 'myevent char str[16]; int v' >> synthetic_events
  # cat synthetic_events
    myevent	char[16]; str; int v

  name: myevent
  ID: 1936
  format:
  	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
  	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
  	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
  	field:int common_pid;	offset:4;	size:4;	signed:1;

  	field:char str[16];;	offset:8;	size:16;	signed:1;
  	field:int v;	offset:40;	size:4;	signed:1;

  print fmt: "str=%s, v=%d", REC->str, REC->v

After:

  # echo 'myevent char str[16]; int v' >> synthetic_events
  # cat synthetic_events
    myevent	char[16] str; int v

  # cat events/synthetic/myevent/format
  name: myevent
  ID: 1936
  format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:char str[16];	offset:8;	size:16;	signed:1;
	field:int v;	offset:40;	size:4;	signed:1;

  print fmt: "str=%s, v=%d", REC->str, REC->v

Link: https://lkml.kernel.org/r/6587663b56c2d45ab9d8c8472a2110713cdec97d.1602598160.git.zanussi@kernel.org

[ <rostedt@goodmis.org>: wrote parse_synth_field() snippet. ]
Fixes: 4b147936fa (tracing: Add support for 'synthetic' events)
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Tom Zanussi
d4d704637d tracing: Add synthetic event error logging
Add support for synthetic event error logging, which entails adding a
logging function for it, a way to save the synthetic event command,
and a set of specific synthetic event parse error strings and
handling.

Link: https://lkml.kernel.org/r/ed099c66df13b40cfc633aaeb17f66c37a923066.1602598160.git.zanussi@kernel.org

[ <rostedt@goodmis.org>: wrote save_cmdstr() seq_buf implementation. ]
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Tom Zanussi
9bbb33291f tracing: Check that the synthetic event and field names are legal
Call the is_good_name() function used by probe events to make sure
synthetic event and field names don't contain illegal characters and
cause unexpected parsing of synthetic event commands.

Link: https://lkml.kernel.org/r/c4d4bb59d3ac39bcbd70fba0cf837d6b1cedb015.1602598160.git.zanussi@kernel.org

Fixes: 4b147936fa (tracing: Add support for 'synthetic' events)
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Tom Zanussi
42d120e2dd tracing: Move is_good_name() from trace_probe.h to trace.h
is_good_name() is useful for other trace infrastructure, such as
synthetic events, so make it available via trace.h.

Link: https://lkml.kernel.org/r/cc6d6a2d7da6957fcbe1e2922e76d18d2bb459b4.1602598160.git.zanussi@kernel.org

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Tom Zanussi
7d27adf575 tracing: Don't show dynamic string internals in synthetic event description
For synthetic event dynamic fields, the type contains "__data_loc",
which is basically an internal part of the type which is only meant to
be displayed in the format, not in the event description itself, which
is confusing to users since they can't use __data_loc on the
command-line to define an event field, which printing it would lead
them to believe.

So filter it out from the description, while leaving it in the type.

Link: https://lkml.kernel.org/r/b3b7baf7813298a5ede4ff02e2e837b91c05a724.1602598160.git.zanussi@kernel.org

Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Qiujun Huang
499f7bb085 tracing: Fix some typos in comments
s/wihin/within/
s/retrieven/retrieved/
s/suppport/support/
s/wil/will/
s/accidently/accidentally/
s/if the if the/if the/

Link: https://lkml.kernel.org/r/20201010140924.3809-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Masami Hiramatsu
c163409719 tracing/boot: Add ftrace.instance.*.alloc_snapshot option
Add ftrace.instance.*.alloc_snapshot option.

This option has been described in Documentation/trace/boottime-trace.rst
but not implemented yet.

ftrace.[instance.INSTANCE.]alloc_snapshot
   Allocate snapshot buffer.

The difference from kernel.alloc_snapshot is that the kernel.alloc_snapshot
will allocate the buffer only for the main instance, but this can allocate
buffer for any new instances.

Link: https://lkml.kernel.org/r/160234368948.400560.15313384470765915015.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Gaurav Kohli
bbeb97464e tracing: Fix race in trace_open and buffer resize call
Below race can come, if trace_open and resize of
cpu buffer is running parallely on different cpus
CPUX                                CPUY
				    ring_buffer_resize
				    atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
				    rb_update_pages
				    remove/insert pages
resetting pointer

This race can cause data abort or some times infinte loop in
rb_remove_pages and rb_insert_pages while checking pages
for sanity.

Take buffer lock to fix this.

Link: https://lkml.kernel.org/r/1601976833-24377-1-git-send-email-gkohli@codeaurora.org

Cc: stable@vger.kernel.org
Fixes: b23d7a5f4a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:01:13 -04:00
Steven Rostedt (VMware)
6d9bd13945 tracing: Check return value of __create_val_fields() before using its result
After having a typo for writing a histogram trigger.

Wrote:
  echo 'hist:key=pid:ts=common_timestamp.usec' > events/sched/sched_waking/trigger

Instead of:
  echo 'hist:key=pid:ts=common_timestamp.usecs' > events/sched/sched_waking/trigger

and the following crash happened:

 BUG: kernel NULL pointer dereference, address: 0000000000000008
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 4 PID: 1641 Comm: sh Not tainted 5.9.0-rc5-test+ #549
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 RIP: 0010:event_hist_trigger_func+0x70b/0x1ee0
 Code: 24 08 89 d5 49 89 cc e9 8c 00 00 00 4c 89 f2 41 b9 00 10 00 00 4c 89 e1 44 89 ee 4c 89 ff e8 dc d3 ff ff 45 89 ea 4b 8b 14 d7 <f6> 42 08 04 74 17 41 8b 8f c0 00 00 00 8d 71 01 41 89 b7 c0 00 00
 RSP: 0018:ffff959213d53db0 EFLAGS: 00010202
 RAX: ffffffffffffffea RBX: 0000000000000000 RCX: 0000000000084c04
 RDX: 0000000000000000 RSI: df7326aefebd174c RDI: 0000000000031080
 RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000000001 R11: 0000000000000046 R12: ffff959211dcf690
 R13: 0000000000000001 R14: ffff95925a36e370 R15: ffff959251c89800
 FS:  00007fb9ea934740(0000) GS:ffff95925ab00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000008 CR3: 00000000c976c005 CR4: 00000000001706e0
 Call Trace:
  ? trigger_process_regex+0x78/0x110
  trigger_process_regex+0xc5/0x110
  event_trigger_write+0x71/0xd0
  vfs_write+0xca/0x210
  ksys_write+0x70/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fb9eaa29487
 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24

This was caused by accessing the hlist_data fields after the call to
__create_val_fields() without checking if the creation succeed.

Link: https://lkml.kernel.org/r/20201013154852.3abd8702@gandalf.local.home

Fixes: 63a1e5de30 ("tracing: Save normal string variables")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-15 12:00:59 -04:00
Linus Torvalds
3ad11d7ac8 block-5.10-2020-10-12
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+EWUgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnoxEADCVSNBRkpV0OVkOEC3wf8EGhXhk01Jnjtl
 u5Mg2V55hcgJ0thQxBV/V28XyqmsEBrmAVi0Yf8Vr9Qbq4Ze08Wae4ChS4rEOyh1
 jTcGYWx5aJB3ChLvV/HI0nWQ3bkj03mMrL3SW8rhhf5DTyKHsVeTenpx42Qu/FKf
 fRzi09FSr3Pjd0B+EX6gunwJnlyXQC5Fa4AA0GhnXJzAznANXxHkkcXu8a6Yw75x
 e28CfhIBliORsK8sRHLoUnPpeTe1vtxCBhBMsE+gJAj9ZUOWMzvNFIPP4FvfawDy
 6cCQo2m1azJ/IdZZCDjFUWyjh+wxdKMp+NNryEcoV+VlqIoc3n98rFwrSL+GIq5Z
 WVwEwq+AcwoMCsD29Lu1ytL2PQ/RVqcJP5UheMrbL4vzefNfJFumQVZLIcX0k943
 8dFL2QHL+H/hM9Dx5y5rjeiWkAlq75v4xPKVjh/DHb4nehddCqn/+DD5HDhNANHf
 c1kmmEuYhvLpIaC4DHjE6DwLh8TPKahJjwsGuBOTr7D93NUQD+OOWsIhX6mNISIl
 FFhP8cd0/ZZVV//9j+q+5B4BaJsT+ZtwmrelKFnPdwPSnh+3iu8zPRRWO+8P8fRC
 YvddxuJAmE6BLmsAYrdz6Xb/wqfyV44cEiyivF0oBQfnhbtnXwDnkDWSfJD1bvCm
 ZwfpDh2+Tg==
 =LzyE
 -----END PGP SIGNATURE-----

Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - Series of merge handling cleanups (Baolin, Christoph)

 - Series of blk-throttle fixes and cleanups (Baolin)

 - Series cleaning up BDI, seperating the block device from the
   backing_dev_info (Christoph)

 - Removal of bdget() as a generic API (Christoph)

 - Removal of blkdev_get() as a generic API (Christoph)

 - Cleanup of is-partition checks (Christoph)

 - Series reworking disk revalidation (Christoph)

 - Series cleaning up bio flags (Christoph)

 - bio crypt fixes (Eric)

 - IO stats inflight tweak (Gabriel)

 - blk-mq tags fixes (Hannes)

 - Buffer invalidation fixes (Jan)

 - Allow soft limits for zone append (Johannes)

 - Shared tag set improvements (John, Kashyap)

 - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

 - DM no-wait support (Mike, Konstantin)

 - Request allocation improvements (Ming)

 - Allow md/dm/bcache to use IO stat helpers (Song)

 - Series improving blk-iocost (Tejun)

 - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
   Xianting, Yang, Yufen, yangerkun)

* tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
  block: fix uapi blkzoned.h comments
  blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
  blk-mq: get rid of the dead flush handle code path
  block: get rid of unnecessary local variable
  block: fix comment and add lockdep assert
  blk-mq: use helper function to test hw stopped
  block: use helper function to test queue register
  block: remove redundant mq check
  block: invoke blk_mq_exit_sched no matter whether have .exit_sched
  percpu_ref: don't refer to ref->data if it isn't allocated
  block: ratelimit handle_bad_sector() message
  blk-throttle: Re-use the throtl_set_slice_end()
  blk-throttle: Open code __throtl_de/enqueue_tg()
  blk-throttle: Move service tree validation out of the throtl_rb_first()
  blk-throttle: Move the list operation after list validation
  blk-throttle: Fix IO hang for a corner case
  blk-throttle: Avoid tracking latency if low limit is invalid
  blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
  blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
  block: Remove redundant 'return' statement
  ...
2020-10-13 12:12:44 -07:00
Jakub Kicinski
ccdf7fae3a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-10-12

The main changes are:

1) The BPF verifier improvements to track register allocation pattern, from Alexei and Yonghong.

2) libbpf relocation support for different size load/store, from Andrii.

3) bpf_redirect_peer() helper and support for inner map array with different max_entries, from Daniel.

4) BPF support for per-cpu variables, form Hao.

5) sockmap improvements, from John.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 16:16:50 -07:00
Linus Torvalds
3bff6112c8 These are the performance events changes for v5.10:
x86 Intel updates:
 
  - Add Jasper Lake support
 
  - Add support for TopDown metrics on Ice Lake
 
  - Fix Ice Lake & Tiger Lake uncore support, add Snow Ridge support
 
  - Add a PCI sub driver to support uncore PMUs where the PCI resources
    have been claimed already - extending the range of supported systems.
 
 x86 AMD updates:
 
  - Restore 'perf stat -a' behaviour to program the uncore PMU
    to count all CPU threads.
 
  - Fix setting the proper count when sampling Large Increment
    per Cycle events / 'paired' events.
 
  - Fix IBS Fetch sampling on F17h and some other IBS fine tuning,
    greatly reducing the number of interrupts when large sample
    periods are specified.
 
  - Extends Family 17h RAPL support to also work on compatible
    F19h machines.
 
 Core code updates:
 
  - Fix race in perf_mmap_close()
 
  - Add PERF_EV_CAP_SIBLING, to denote that sibling events should be
    closed if the leader is removed.
 
  - Smaller fixes and updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+Ef40RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1h7NQ//ZdQ26Yg79ZaxBX1QSINJ9AgXDi6rXs75
 qU9qNwr/6EF+633RZoPQGAE0Iy5v6h7iLFokcJzM9+kK/rE3ax44tSnPlcMa0+6N
 SHXKCa5iL+hH7o2Spo2MZwCYseH79rloX3TSH7ajnN3X8PvwgWshF0lUE3WEWtCs
 eHSojdCk43IuL9TpusuNOBM2FvgnheFYWiMbFHd0MTBUMxul30sLVCG8IIWCPA+q
 TwG4RJS3X42VbL3SuAGFmOv4OmqNsfkvHvjpDs4NF07tRB9zjXzGrxmGhgSw0NAN
 2KK25qbmrpKATIb4Eqsgk/yikX/SCrDEXrjhg3r8FnyPvRfctq1crZjjf672PI2E
 bDda76dH6Lq9jv5fsyJjas5OsYdMKBCnA+tGQxXPGbmTXeEcYMRbDnwhYnevI/Q/
 8pP+xstF0pmBA3tvpDPrQnYH72Qt7CLJSdcTB15NqZftU2tJxaAyJGx4gJy33jxQ
 wu6BIEGHQ7onQYiIyTwsBHyz6xNsF/CRHwAPcGdYrRRbXB5K5nxHiXNb4awciTMx
 2HF31/S4OqURNpfcpxOQo+1fb/cLqj3loGqE4jCTwkbS3lrHcAcfxyv9QNn77l1f
 hdQ0jworbUNVLUYEUQz1bkZ06GD3LSSas2ZlY1NNdHo62mjyXMQmgirNcZmrFgWl
 tl2gNFAU9x4=
 =2fuY
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull performance events updates from Ingo Molnar:
 "x86 Intel updates:

   - Add Jasper Lake support

   - Add support for TopDown metrics on Ice Lake

   - Fix Ice Lake & Tiger Lake uncore support, add Snow Ridge support

   - Add a PCI sub driver to support uncore PMUs where the PCI resources
     have been claimed already - extending the range of supported
     systems.

  x86 AMD updates:

   - Restore 'perf stat -a' behaviour to program the uncore PMU to count
     all CPU threads.

   - Fix setting the proper count when sampling Large Increment per
     Cycle events / 'paired' events.

   - Fix IBS Fetch sampling on F17h and some other IBS fine tuning,
     greatly reducing the number of interrupts when large sample periods
     are specified.

   - Extends Family 17h RAPL support to also work on compatible F19h
     machines.

  Core code updates:

   - Fix race in perf_mmap_close()

   - Add PERF_EV_CAP_SIBLING, to denote that sibling events should be
     closed if the leader is removed.

   - Smaller fixes and updates"

* tag 'perf-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  perf/core: Fix race in the perf_mmap_close() function
  perf/x86: Fix n_metric for cancelled txn
  perf/x86: Fix n_pair for cancelled txn
  x86/events/amd/iommu: Fix sizeof mismatch
  perf/x86/intel: Check perf metrics feature for each CPU
  perf/x86/intel: Fix Ice Lake event constraint table
  perf/x86/intel/uncore: Fix the scale of the IMC free-running events
  perf/x86/intel/uncore: Fix for iio mapping on Skylake Server
  perf/x86/msr: Add Jasper Lake support
  perf/x86/intel: Add Jasper Lake support
  perf/x86/intel/uncore: Reduce the number of CBOX counters
  perf/x86/intel/uncore: Update Ice Lake uncore units
  perf/x86/intel/uncore: Split the Ice Lake and Tiger Lake MSR uncore support
  perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge
  perf/x86/intel/uncore: Generic support for the PCI sub driver
  perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister()
  perf/x86/intel/uncore: Factor out uncore_pci_pmu_register()
  perf/x86/intel/uncore: Factor out uncore_pci_find_dev_pmu()
  perf/x86/intel/uncore: Factor out uncore_pci_get_dev_die_info()
  perf/amd/uncore: Inform the user how many counters each uncore PMU has
  ...
2020-10-12 14:14:35 -07:00
Linus Torvalds
dd502a8107 This tree introduces static_call(), which is the idea of static_branch()
applied to indirect function calls. Remove a data load (indirection) by
 modifying the text.
 
 They give the flexibility of function pointers, but with better
 performance. (This is especially important for cases where
 retpolines would otherwise be used, as retpolines can be pretty
 slow.)
 
 API overview:
 
   DECLARE_STATIC_CALL(name, func);
   DEFINE_STATIC_CALL(name, func);
   DEFINE_STATIC_CALL_NULL(name, typename);
 
   static_call(name)(args...);
   static_call_cond(name)(args...);
   static_call_update(name, func);
 
 x86 is supported via text patching, otherwise basic indirect calls are used,
 with function pointers.
 
 There's a second variant using inline code patching, inspired by jump-labels,
 implemented on x86 as well.
 
 The new APIs are utilized in the x86 perf code, a heavy user of function pointers,
 where static calls speed up the PMU handler by 4.2% (!).
 
 The generic implementation is not really excercised on other architectures,
 outside of the trivial test_static_call_init() self-test.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EfAQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iEAw//divHeVCJnHhV+YBbuI9ROUsERkzu8VhK
 O1DEmW68Fvj7pszT8NZsMjtkt97ZtxDRK7aCJiiup0eItG9qCJ8lpCLb84ZbizHV
 HhCbhBLrpxSvTrWlQnkgP1OkPAbtoryIjVlZzWhjye2MY8UEbVnZWyviBolbAAxH
 Fk1Yi56fIMu19GO+9Ohzy9E2VDnVEH1iMx5YWoLD2H88Qbq/yEMP+U2tIj8hIVKT
 Y/jdogihNXRIau6QB+YPfDPisdty+RHxfU7zct4Rv8cFF5ylglZB5fD34C3sUQF2
 WqsaYz7zjUj9f02F8pw8hIaAT7InzArPhlNVITxf2oMfmdrNqBptnSCddZqCJLvv
 oDGew21k50Zcbqkv9amclpxXH5tTpRvJeqit2pz/85GMeqBRuhzHUAkCpht5YA73
 qJsHWS3z+qIxKi0tDbhDJswuwa51q5sgdUUwo1uCr3wT3DGDlqNhCAZBzX14dcty
 0shDSbv13TCwqAcb7asPzEoPwE15cwa+x+viGEIL901pyZKyQYjs/abDU26It3BW
 roWRkuVJZ9/QMdZJs1v7kaXw1L8YiKIDkBgke+xbfrDwEvvjudQkl2LUL66DB11j
 RJU3GyxKClvdY06SSRh/H13fqZLNKh1JZ0nPEWSTJECDFN9zcDjrDrod/7PFOcpY
 NAlawLoGG+s=
 =JvpF
 -----END PGP SIGNATURE-----

Merge tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull static call support from Ingo Molnar:
 "This introduces static_call(), which is the idea of static_branch()
  applied to indirect function calls. Remove a data load (indirection)
  by modifying the text.

  They give the flexibility of function pointers, but with better
  performance. (This is especially important for cases where retpolines
  would otherwise be used, as retpolines can be pretty slow.)

  API overview:

      DECLARE_STATIC_CALL(name, func);
      DEFINE_STATIC_CALL(name, func);
      DEFINE_STATIC_CALL_NULL(name, typename);

      static_call(name)(args...);
      static_call_cond(name)(args...);
      static_call_update(name, func);

  x86 is supported via text patching, otherwise basic indirect calls are
  used, with function pointers.

  There's a second variant using inline code patching, inspired by
  jump-labels, implemented on x86 as well.

  The new APIs are utilized in the x86 perf code, a heavy user of
  function pointers, where static calls speed up the PMU handler by
  4.2% (!).

  The generic implementation is not really excercised on other
  architectures, outside of the trivial test_static_call_init()
  self-test"

* tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  static_call: Fix return type of static_call_init
  tracepoint: Fix out of sync data passing by static caller
  tracepoint: Fix overly long tracepoint names
  x86/perf, static_call: Optimize x86_pmu methods
  tracepoint: Optimize using static_call()
  static_call: Allow early init
  static_call: Add some validation
  static_call: Handle tail-calls
  static_call: Add static_call_cond()
  x86/alternatives: Teach text_poke_bp() to emulate RET
  static_call: Add simple self-test for static calls
  x86/static_call: Add inline static call implementation for x86-64
  x86/static_call: Add out-of-line static call implementation
  static_call: Avoid kprobes on inline static_call()s
  static_call: Add inline static call infrastructure
  static_call: Add basic static call infrastructure
  compiler.h: Make __ADDRESSABLE() symbol truly unique
  jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved()
  module: Properly propagate MODULE_STATE_COMING failure
  module: Fix up module_notifier return values
  ...
2020-10-12 13:58:15 -07:00
Peter Zijlstra
d741bf41d7 kprobes: Remove kretprobe hash
The kretprobe hash is mostly superfluous, replace it with a per-task
variable.

This gets rid of the task hash and it's related locking.

Note that this may change the kprobes module-exported API for kretprobe
handlers. If any out-of-tree kretprobe user uses ri->rp, use
get_kretprobe(ri) instead.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870620431.1229682.16325792502413731312.stgit@devnote2
2020-10-12 18:27:27 +02:00
Steven Rostedt (VMware)
848183553e tracing: Fix synthetic print fmt check for use of __get_str()
A cut and paste error had the check to use __get_str() test "is_dynamic"
twice, instead of checking "is_string && is_dynamic".

Link: https://lore.kernel.org/r/d34dccd5-96ba-a2d9-46ea-de8807525deb@canonical.com

Reported-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:07 -04:00
Sudip Mukherjee
43aa422c0c tracing: Remove a pointless assignment
The variable 'len' has been assigned a value but is not used after that.
So, remove the assignement.

Link: https://lkml.kernel.org/r/20200930184303.22896-1-sudipm.mukherjee@gmail.com

Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:06 -04:00
Steven Rostedt (VMware)
7ba031e8b7 ftrace: Format variable declarations of ftrace_allocate_records
I hate when unrelated variables are declared on the same line.
Split them.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:06 -04:00
Wei Yang
b40c6eabfc ftrace: Simplify the calculation of page number for ftrace_page->records
Based on the following two reasones, we could simplify the calculation:

  - If the number after roundup count is not power of 2, we would
    definitely have more than 1 empty page with a higher order.
  - get_count_order() just return current order, so one lower order
    could meet the requirement.

The calculation could be simplified by lower one order level when pages
are not power of 2.

Link: https://lkml.kernel.org/r/20200831031104.23322-5-richard.weiyang@linux.alibaba.com

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:06 -04:00
Steven Rostedt (VMware)
be49313273 ftrace: Simplify the hash calculation
No need to add a check to subtract the number of bits if bits is zero after
fls(). Just divide the size by two before calling it. This does give the
same answer for size of 0 and 1, but that's fine.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:06 -04:00
Wei Yang
59e65b3358 ftrace: Use fls() to get the bits for dup_hash()
The effect here is to get the number of bits, lets use fls() to do
this job.

Link: https://lkml.kernel.org/r/20200831031104.23322-3-richard.weiyang@linux.alibaba.com

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:06 -04:00
Steven Rostedt (VMware)
8db4d6bfbb tracing: Change synthetic event string format to limit printed length
Change the format for printing synthetic field strings to limit the
length of the string printed even if it's not correctly terminated.

Link: https://lore.kernel.org/r/20201002210036.0200371b@oasis.local.home
Link: https://lkml.kernel.org/r/b6bdb34e70d970e8026daa3503db6b8e5cdad524.1601848695.git.zanussi@kernel.org

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:29:06 -04:00
Tom Zanussi
1bc36bd4a8 tracing: Add README information for synthetic_events file
Add an entry with a basic description of events/synthetic_events along
with a simple example.

Link: https://lkml.kernel.org/r/3c7f178cf95aaeebc01eda7d95600dd937233eb7.1601848695.git.zanussi@kernel.org

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08 15:28:14 -04:00
Yonghong Song
ebfb4d40ed bpf: Fix build failure for kernel/trace/bpf_trace.c with CONFIG_NET=n
When CONFIG_NET is not defined, I hit the following build error:
    kernel/trace/bpf_trace.o:(.rodata+0x110): undefined reference to `bpf_prog_test_run_raw_tp'

Commit 1b4d60ec16 ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint")
added test_run support for raw_tracepoint in /kernel/trace/bpf_trace.c.
But the test_run function bpf_prog_test_run_raw_tp is defined in
net/bpf/test_run.c, only available with CONFIG_NET=y.

Adding a CONFIG_NET guard for
    .test_run = bpf_prog_test_run_raw_tp;
fixed the above build issue.

Fixes: 1b4d60ec16 ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201007062933.3425899-1-yhs@fb.com
2020-10-07 10:56:46 -07:00
David S. Miller
8b0308fe31 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Rejecting non-native endian BTF overlapped with the addition
of support for it.

The rest were more simple overlapping changes, except the
renesas ravb binding update, which had to follow a file
move as well as a YAML conversion.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-05 18:40:01 -07:00
Tom Zanussi
bd82631d7c tracing: Add support for dynamic strings to synthetic events
Currently, sythetic events only support static string fields such as:

  # echo 'test_latency u64 lat; char somename[32]' > /sys/kernel/debug/tracing/synthetic_events

Which is fine, but wastes a lot of space in the event.

It also prevents the most commonly-defined strings in the existing
trace events e.g. those defined using __string(), from being passed to
synthetic events via the trace() action.

With this change, synthetic events with dynamic fields can be defined:

  # echo 'test_latency u64 lat; char somename[]' > /sys/kernel/debug/tracing/synthetic_events

And the trace() action can be used to generate events using either
dynamic or static strings:

  # echo 'hist:keys=name:lat=common_timestamp.usecs-$ts0:onmatch(sys.event).test_latency($lat,name)' > /sys/kernel/debug/tracing/events

The synthetic event dynamic strings are implemented in the same way as
the existing __data_loc strings and appear as such in the format file.

[ <rostedt@goodmis.org>: added __set_synth_event_print_fmt() changes:

  I added the following to make it work with trace-cmd. Dynamic strings
  must have __get_str() for events in the print_fmt otherwise it can't be
  parsed correctly. ]

Link: https://lore.kernel.org/r/cover.1601588066.git.zanussi@kernel.org
Link: https://lkml.kernel.org/r/3ed35b6d0e390f5b94cb4a9ba1cc18f5982ab277.1601848695.git.zanussi@kernel.org

Tested-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-05 19:32:18 -04:00
Tom Zanussi
63a1e5de30 tracing: Save normal string variables
String variables created as field variables and save variables are
already handled properly by having their values copied when set.  The
same isn't done for normal variables, but needs to be - simply saving
a pointer to a string contained in an old event isn't sufficient,
since that event's data may quickly become overwritten and therefore a
string pointer to it could yield garbage.

This change uses the same mechanism as field variables and simply
appends the new strings to the existing per-element field_var_str[]
array allocated for that purpose.

Link: https://lkml.kernel.org/r/1c1a03798b02e67307412a0c719d1bfb69b13007.1601848695.git.zanussi@kernel.org

Fixes: 02205a6752 (tracing: Add support for 'field variables')
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-05 18:13:53 -04:00
Tom Zanussi
8fbeb52a59 tracing: Fix parse_synth_field() error handling
synth_field_size() returns either a positive size or an error (zero or
a negative value). However, the existing code assumes the only error
value is 0. It doesn't handle negative error codes, as it assigns
directly to field->size (a size_t; unsigned), thereby interpreting the
error code as a valid size instead.

Do the test before assignment to field->size.

[ axelrasmussen@google.com: changelog addition, first paragraph above ]

Link: https://lkml.kernel.org/r/9b6946d9776b2eeb43227678158196de1c3c6e1d.1601848695.git.zanussi@kernel.org

Fixes: 4b147936fa (tracing: Add support for 'synthetic' events)
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-05 18:13:53 -04:00
Tom Zanussi
4a4a56b4e7 tracing: Change STR_VAR_MAX_LEN
32 is too small for this value, and anyway it makes more sense to use
MAX_FILTER_STR_VAL, as this is also the value used for variable-length
__strings.

Link: https://lkml.kernel.org/r/6adfd1668ac1fd8670bd58206944a762061a5559.1601848695.git.zanussi@kernel.org

Tested-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-05 18:13:53 -04:00
Christoph Hellwig
10ed16662d block: add a bdget_part helper
All remaining callers of bdget() outside of fs/block_dev.c want to get a
reference to the struct block_device for a given struct hd_struct.  Add
a helper just for that and then mark bdget static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-05 10:38:33 -06:00
Tingwei Zhang
458999c6f6 tracing: Add trace_export support for trace_marker
Add the support to route trace_marker buffer to other destination
via trace_export.

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lore.kernel.org/r/20201005071319.78508-5-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 12:43:53 +02:00
Tingwei Zhang
8ab7a2b705 tracing: Add trace_export support for event trace
Only function traces can be exported to other destinations currently.
This patch exports event trace as well. Move trace export related
function to the beginning of file so other trace can call
trace_process_export() to export.

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lore.kernel.org/r/20201005071319.78508-4-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 12:43:53 +02:00
Tingwei Zhang
8438f52114 tracing: Add flag to control different traces
More traces like event trace or trace marker will be supported.
Add flag for difference traces, so that they can be controlled
separately. Move current function trace to it's own flag
instead of global ftrace enable flag.

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lore.kernel.org/r/20201005071319.78508-3-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 12:43:53 +02:00
Hao Luo
63d9b80dcf bpf: Introducte bpf_this_cpu_ptr()
Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This
helper always returns a valid pointer, therefore no need to check
returned value for NULL. Also note that all programs run with
preemption disabled, which means that the returned pointer is stable
during all the execution of the program.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-6-haoluo@google.com
2020-10-02 15:00:49 -07:00
Hao Luo
eaa6bcb71e bpf: Introduce bpf_per_cpu_ptr()
Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
except that it may return NULL. This happens when the cpu parameter is
out of range. So the caller must check the returned value.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-5-haoluo@google.com
2020-10-02 15:00:49 -07:00
Qiujun Huang
fdda88d31a ftrace: Fix some typos in comment
s/coorditate/coordinate/
s/emty/empty/
s/preeptive/preemptive/
s/succes/success/
s/carefule/careful/

Link: https://lkml.kernel.org/r/20201002143126.2890-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-02 14:05:48 -04:00
Linus Torvalds
aa5ff93523 Two tracing fixes:
- Fix temp buffer accounting that caused a WARNING for
   ftrace_dump_on_opps()
 
 - Move the recursion check in one of the function callback helpers to the
   beginning of the function, as if the rcu_is_watching() gets traced, it
   will cause a recursive loop that will crash the kernel.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX3UZ7hQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qlelAP4nzEIyE7s8HkKHo+IEvzEiYL523Xq7
 zrTm27XnLjZ+EQD9ECbtea9me8kL+zBcG8H3Wu/ykN15LIT2ZsvVTrWYmwI=
 =eTh4
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Two tracing fixes:

   - Fix temp buffer accounting that caused a WARNING for
     ftrace_dump_on_opps()

   - Move the recursion check in one of the function callback helpers to
     the beginning of the function, as if the rcu_is_watching() gets
     traced, it will cause a recursive loop that will crash the kernel"

* tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Move RCU is watching check after recursion check
  tracing: Fix trace_find_next_entry() accounting of temp buffer size
2020-10-01 09:41:02 -07:00
Steven Rostedt (VMware)
b40341fad6 ftrace: Move RCU is watching check after recursion check
The first thing that the ftrace function callback helper functions should do
is to check for recursion. Peter Zijlstra found that when
"rcu_is_watching()" had its notrace removed, it caused perf function tracing
to crash. This is because the call of rcu_is_watching() is tested before
function recursion is checked and and if it is traced, it will cause an
infinite recursion loop.

rcu_is_watching() should still stay notrace, but to prevent this should
never had crashed in the first place. The recursion prevention must be the
first thing done in callback functions.

Link: https://lore.kernel.org/r/20200929112541.GM2628@hirez.programming.kicks-ass.net

Cc: stable@vger.kernel.org
Cc: Paul McKenney <paulmck@kernel.org>
Fixes: c68c0fa293 ("ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-29 13:05:10 -04:00
Steven Rostedt (VMware)
851e6f61cd tracing: Fix trace_find_next_entry() accounting of temp buffer size
The temp buffer size variable for trace_find_next_entry() was incorrectly
being updated when the size did not change. The temp buffer size should only
be updated when it is reallocated.

This is mostly an issue when used with ftrace_dump(). That's because
ftrace_dump() can not allocate a new buffer, and instead uses a temporary
buffer with a fix size. But the variable that keeps track of that size is
incorrectly updated with each call, and it could fall into the path that
would try to reallocate the buffer and produce a warning.

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 1601 at kernel/trace/trace.c:3548
trace_find_next_entry+0xd0/0xe0
 Modules linked in [..]
 CPU: 1 PID: 1601 Comm: bash Not tainted 5.9.0-rc5-test+ #521
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03
07/14/2016
 RIP: 0010:trace_find_next_entry+0xd0/0xe0
 Code: 40 21 00 00 4c 89 e1 31 d2 4c 89 ee 48 89 df e8 c6 9e ff ff 89 ab 54
21 00 00 5b 5d 41 5c 41 5d c3 48 63 d5 eb bf 31 c0 eb f0 <0f> 0b 48 63 d5 eb
b4 66 0f 1f 84 00 00 00 00 00 53 48 8d 8f 60 21
 RSP: 0018:ffff95a4f2e8bd70 EFLAGS: 00010046
 RAX: ffffffff96679fc0 RBX: ffffffff97910de0 RCX: ffffffff96679fc0
 RDX: ffff95a4f2e8bd98 RSI: ffff95a4ee321098 RDI: ffffffff97913000
 RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000046 R12: ffff95a4f2e8bd98
 R13: 0000000000000000 R14: ffff95a4ee321098 R15: 00000000009aa301
 FS:  00007f8565484740(0000) GS:ffff95a55aa40000(0000)
knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055876bd43d90 CR3: 00000000b76e6003 CR4: 00000000001706e0
 Call Trace:
  trace_print_lat_context+0x58/0x2d0
  ? cpumask_next+0x16/0x20
  print_trace_line+0x1a4/0x4f0
  ftrace_dump.cold+0xad/0x12c
  __handle_sysrq.cold+0x51/0x126
  write_sysrq_trigger+0x3f/0x4a
  proc_reg_write+0x53/0x80
  vfs_write+0xca/0x210
  ksys_write+0x70/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f8565579487
 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa
64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff
77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
 RSP: 002b:00007ffd40707948 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f8565579487
 RDX: 0000000000000002 RSI: 000055876bd74de0 RDI: 0000000000000001
 RBP: 000055876bd74de0 R08: 000000000000000a R09: 0000000000000001
 R10: 000055876bdec280 R11: 0000000000000246 R12: 0000000000000002
 R13: 00007f856564a500 R14: 0000000000000002 R15: 00007f856564a700
 irq event stamp: 109958
 ---[ end trace 7aab5b7e51484b00 ]---

Not only fix the updating of the temp buffer, but also do not free the temp
buffer before a new buffer is allocated (there's no reason to not continue
to use the current temp buffer if an allocation fails).

Cc: stable@vger.kernel.org
Fixes: 8e99cf91b9 ("tracing: Do not allocate buffer in trace_find_next_entry() in atomic")
Reported-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-29 12:46:22 -04:00
Alan Maguire
eb411377ae bpf: Add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data
structures using vmlinux BTF.  Its signature is

long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr,
                        u32 btf_ptr_size, u64 flags);

Flags and struct btf_ptr definitions/use are identical to the
bpf_snprintf_btf helper, and the helper returns 0 on success
or a negative error value.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
c4d0bfb450 bpf: Add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF
using the BPF Type Format (BTF).  Its signature is

long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
		      u32 btf_ptr_size, u64 flags);

struct btf_ptr * specifies

- a pointer to the data to be traced
- the BTF id of the type of data pointed to
- a flags field is provided for future use; these flags
  are not to be confused with the BTF_F_* flags
  below that control how the btf_ptr is displayed; the
  flags member of the struct btf_ptr may be used to
  disambiguate types in kernel versus module BTF, etc;
  the main distinction is the flags relate to the type
  and information needed in identifying it; not how it
  is displayed.

For example a BPF program with a struct sk_buff *skb
could do the following:

	static struct btf_ptr b = { };

	b.ptr = skb;
	b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
	bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0);

Default output looks like this:

(struct sk_buff){
 .transport_header = (__u16)65535,
 .mac_header = (__u16)65535,
 .end = (sk_buff_data_t)192,
 .head = (unsigned char *)0x000000007524fd8b,
 .data = (unsigned char *)0x000000007524fd8b,
 .truesize = (unsigned int)768,
 .users = (refcount_t){
  .refs = (atomic_t){
   .counter = (int)1,
  },
 },
}

Flags modifying display are as follows:

- BTF_F_COMPACT:	no formatting around type information
- BTF_F_NONAME:		no struct/union member names/types
- BTF_F_PTR_RAW:	show raw (unobfuscated) pointer values;
			equivalent to %px.
- BTF_F_ZERO:		show zero-valued struct/union members;
			they are not displayed by default

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Song Liu
1b4d60ec16 bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint
Add .test_run for raw_tracepoint. Also, introduce a new feature that runs
the target program on a specific CPU. This is achieved by a new flag in
bpf_attr.test, BPF_F_TEST_RUN_ON_CPU. When this flag is set, the program
is triggered on cpu with id bpf_attr.test.cpu. This feature is needed for
BPF programs that handle perf_event and other percpu resources, as the
program can access these resource locally.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200925205432.1777-2-songliubraving@fb.com
2020-09-28 21:52:36 +02:00
Masami Hiramatsu
720dee53ad tracing/boot: Initialize per-instance event list in early boot
Initialize per-instance event list in early boot time (before
initializing instance directory on tracefs). This fixes boot-time
tracing to correctly handle the boot-time per-instance settings.

Link: https://lkml.kernel.org/r/160096560826.182763.17110991546046128881.stgit@devnote2

Fixes: 4114fbfd02 ("tracing: Enable creating new instance early boot")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-25 15:36:03 -04:00
Christoph Hellwig
fa01b1e973 block: add a bdev_is_partition helper
Add a littler helper to make the somewhat arcane bd_contains checks a
little more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 08:18:57 -06:00
Jarkko Sakkinen
e9ffc8c1b8 kprobes: Use module_name() macro
It is advised to use module_name() macro instead of dereferencing mod->name
directly. This makes sense for consistencys sake and also it prevents a
hard dependency to CONFIG_MODULES.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20200818050857.117998-1-jarkko.sakkinen@linux.intel.com
2020-09-24 15:55:49 +02:00
David S. Miller
6d772f328d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-09-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 95 non-merge commits during the last 22 day(s) which contain
a total of 124 files changed, 4211 insertions(+), 2040 deletions(-).

The main changes are:

1) Full multi function support in libbpf, from Andrii.

2) Refactoring of function argument checks, from Lorenz.

3) Make bpf_tail_call compatible with functions (subprograms), from Maciej.

4) Program metadata support, from YiFei.

5) bpf iterator optimizations, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 13:11:11 -07:00
David S. Miller
3ab0a7a0c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing it's
   initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22 16:45:34 -07:00
Linus Torvalds
eff48ddeab Tracing fixes:
- Check kprobe is enabled before unregistering from ftrace as it isn't
   registered when disabled.
 
 - Remove kprobes enabled via command-line that is on init text when freed.
 
 - Add missing RCU synchronization for ftrace trampoline symbols removed
   from kallsyms.
 
 - Free trampoline on error path if ftrace_startup() fails.
 
 - Give more space for the longer PID numbers in trace output.
 
 - Fix a possible double free in the histogram code.
 
 - A couple of fixes that were discovered by sparse.
 -----BEGIN PGP SIGNATURE-----
 
 iIkEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX2lEchQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qokAAQDphHFTOlgjKi7lF7bc5V1bl/MT1bVo
 bJRHV8w2agtXMgD49ElFOl6znXqid3X++0dYZ5/AQgOZXf1rsYS05Pj0Dw==
 =+cVX
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Check kprobe is enabled before unregistering from ftrace as it isn't
   registered when disabled.

 - Remove kprobes enabled via command-line that is on init text when
   freed.

 - Add missing RCU synchronization for ftrace trampoline symbols removed
   from kallsyms.

 - Free trampoline on error path if ftrace_startup() fails.

 - Give more space for the longer PID numbers in trace output.

 - Fix a possible double free in the histogram code.

 - A couple of fixes that were discovered by sparse.

* tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  bootconfig: init: make xbc_namebuf static
  kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
  tracing: fix double free
  ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
  tracing: Make the space reserved for the pid wider
  ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
  ftrace: Free the trampoline when ftrace_startup() fails
  kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
2020-09-22 09:08:33 -07:00
Masami Hiramatsu
ba0fbfbb21 tracing/boot, kprobe, synth: Initialize boot-time tracing earlier
Initialize boot-time tracing in core_initcall_sync instead of
fs_initcall, and initialize required tracers (kprobes and synth)
in core_initcall. This will allow the boot-time tracing to trace
__init code from the beginning of postcore_initcall stage.

Link: https://lkml.kernel.org/r/159974155727.478751.7486926132902849578.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:04 -04:00
Masami Hiramatsu
4114fbfd02 tracing: Enable creating new instance early boot
Enable creating new trace_array instance in early boot stage.
If the instances directory is not created, postpone it until
the tracefs is initialized.

Link: https://lkml.kernel.org/r/159974154763.478751.6289753509587233103.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:04 -04:00
Masami Hiramatsu
a838deab4e tracing: Enable adding dynamic events early stage
Split the event fields initialization from creating new event directory.
This allows the boot-time tracing to define dynamic events before
initializing events directory on tracefs.

Link: https://lkml.kernel.org/r/159974153790.478751.3475515065034825374.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:04 -04:00
Masami Hiramatsu
ac343da7bc tracing: Define event fields early stage
Define event fields at early stage so that boot-time tracing can
access the event fields (like per-event filter setting).

Link: https://lkml.kernel.org/r/159974152862.478751.2023768466808361350.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:04 -04:00
Masami Hiramatsu
3dd3aae32d tracing/uprobes: Support perf-style return probe
Support perf-style return probe ("SYMBOL%return") for uprobe events
as same as kprobe events does.

Link: https://lkml.kernel.org/r/159972814601.428528.7641183316212425445.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:03 -04:00
Masami Hiramatsu
4725cd8997 tracing/kprobes: Support perf-style return probe
Support perf-style return probe ("SYMBOL%return") for kprobe events.
This will allow boot-time tracing user to define a return probe event.

Link: https://lkml.kernel.org/r/159972813535.428528.4437029657208468954.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:03 -04:00
Masami Hiramatsu
8490db06f9 tracing/boot: Add per-instance tracing_on option support
Add per-instance tracing_on option, which will be useful with
traceon/traceoff event trigger actions. For example, if we
disable tracing_on by default and set traceon and traceoff on
a pair of events, we can trace functions between the pair of
events.

Link: https://lkml.kernel.org/r/159972811538.428528.2561315102284268611.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:03 -04:00
Davidlohr Bueso
40d14da383 fgraph: Convert ret_stack tasklist scanning to rcu
It seems that alloc_retstack_tasklist() can also take a lockless
approach for scanning the tasklist, instead of using the big global
tasklist_lock. For this we also kill another deprecated and rcu-unsafe
tsk->thread_group user replacing it with for_each_process_thread(),
maintaining semantics.

Here tasklist_lock does not protect anything other than the list
against concurrent fork/exit. And considering that the whole thing
is capped by FTRACE_RETSTACK_ALLOC_SIZE (32), it should not be a
problem to have a pontentially stale, yet stable, list. The task cannot
go away either, so we don't risk racing with ftrace_graph_exit_task()
which clears the retstack.

The tsk->ret_stack management is not protected by tasklist_lock, being
serialized with the corresponding publish/subscribe barriers against
concurrent ftrace_push_return_trace(). In addition this plays nicer
with cachelines by avoiding two atomic ops in the uncontended case.

Link: https://lkml.kernel.org/r/20200907013326.9870-1-dave@stgolabs.net

Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:02 -04:00
Dan Carpenter
eb8d8b4c98 tracing: remove a pointless assignment
The "tr" is a stack variable so setting it to NULL before a return is
a no-op.  Delete the assignment.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:02 -04:00
Jarkko Sakkinen
f3d3642661 kprobes: Use module_name() macro
It is advised to use module_name() macro instead of dereferencing mod->name
directly. This makes sense for consistencys sake and also it prevents a
hard dependency to CONFIG_MODULES.

Link: https://lkml.kernel.org/r/20200818050857.117998-1-jarkko.sakkinen@linux.intel.com

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:02 -04:00
Xianting Tian
b427e765bd tracing: Use __this_cpu_read() in trace_buffered_event_enable()
The code is executed with preemption disabled, so it's
safe to use __this_cpu_read().

Link: https://lkml.kernel.org/r/20200813112803.12256-1-tian.xianting@h3c.com

Signed-off-by: Xianting Tian <tian.xianting@h3c.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:02 -04:00
Randy Dunlap
5c8c206e43 tracing: Delete repeated words in comments
Drop repeated words in kernel/trace/.
{and, the, not}

Link: https://lkml.kernel.org/r/20200807033259.13778-1-rdunlap@infradead.org

Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-21 21:06:02 -04:00
Jiri Olsa
a8a717963f selftests/bpf: Fix stat probe in d_path test
Some kernels builds might inline vfs_getattr call within fstat
syscall code path, so fentry/vfs_getattr trampoline is not called.

Add security_inode_getattr to allowlist and switch the d_path test stat
trampoline to security_inode_getattr.

Keeping dentry_open and filp_close, because they are in their own
files, so unlikely to be inlined, but in case they are, adding
security_file_open.

Adding flags that indicate trampolines were called and failing
the test if any of them got missed, so it's easier to identify
the issue next time.

Fixes: e4d1af4b16 ("selftests/bpf: Add test for d_path helper")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200918112338.2618444-1-jolsa@kernel.org
2020-09-21 16:18:00 -07:00
Lorenz Bauer
9436ef6e86 bpf: Allow specifying a BTF ID per argument in function protos
Function prototypes using ARG_PTR_TO_BTF_ID currently use two ways to signal
which BTF IDs are acceptable. First, bpf_func_proto.btf_id is an array of
IDs, one for each argument. This array is only accessed up to the highest
numbered argument that uses ARG_PTR_TO_BTF_ID and may therefore be less than
five arguments long. It usually points at a BTF_ID_LIST. Second, check_btf_id
is a function pointer that is called by the verifier if present. It gets the
actual BTF ID of the register, and the argument number we're currently checking.
It turns out that the only user check_arg_btf_id ignores the argument, and is
simply used to check whether the BTF ID has a struct sock_common at it's start.

Replace both of these mechanisms with an explicit BTF ID for each argument
in a function proto. Thanks to btf_struct_ids_match this is very flexible:
check_arg_btf_id can be replaced by requiring struct sock_common.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200921121227.255763-5-lmb@cloudflare.com
2020-09-21 15:00:40 -07:00
Linus Torvalds
325d0eab4f Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "15 patches.

  Subsystems affected by this patch series: mailmap, mm/hotfixes,
  mm/thp, mm/memory-hotplug, misc, kcsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  kcsan: kconfig: move to menu 'Generic Kernel Debugging Instruments'
  fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
  stackleak: let stack_erasing_sysctl take a kernel pointer buffer
  ftrace: let ftrace_enable_sysctl take a kernel pointer buffer
  mm/memory_hotplug: drain per-cpu pages again during memory offline
  selftests/vm: fix display of page size in map_hugetlb
  mm/thp: fix __split_huge_pmd_locked() for migration PMD
  kprobes: fix kill kprobe which has been marked as gone
  tmpfs: restore functionality of nr_inodes=0
  mlock: fix unevictable_pgs event counts on THP
  mm: fix check_move_unevictable_pages() on THP
  mm: migration of hugetlbfs page skip memcg
  ksm: reinstate memcg charge on copied pages
  mailmap: add older email addresses for Kees Cook
2020-09-19 18:18:37 -07:00
Tobias Klauser
7bb82ac30c ftrace: let ftrace_enable_sysctl take a kernel pointer buffer
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
signature of ftrace_enable_sysctl to match ctl_table.proc_handler which
fixes the following sparse warning:

kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces)
kernel/trace/ftrace.c:7544:43:    expected void *
kernel/trace/ftrace.c:7544:43:    got void [noderef] __user *buffer

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.ch
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-19 13:13:39 -07:00
Wei Yang
22c36b1826 tracing: make tracing_init_dentry() returns an integer instead of a d_entry pointer
Current tracing_init_dentry() return a d_entry pointer, while is not
necessary. This function returns NULL on success or error on failure,
which means there is no valid d_entry pointer return.

Let's return 0 on success and negative value for error.

Link: https://lkml.kernel.org/r/20200712011036.70948-5-richard.weiyang@linux.alibaba.com

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 22:17:14 -04:00
Wei Yang
dc300d77b8 tracing: toplevel d_entry already initialized
Currently we have following call flow:

    tracer_init_tracefs()
        tracing_init_dentry()
        event_trace_init()
            tracing_init_dentry()

This shows tracing_init_dentry() is called twice in this flow and this
is not necessary.

Let's remove the second one when it is for sure be properly initialized.

Link: https://lkml.kernel.org/r/20200712011036.70948-4-richard.weiyang@linux.alibaba.com

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 22:17:14 -04:00
Tom Rix
46bbe5c671 tracing: fix double free
clang static analyzer reports this problem

trace_events_hist.c:3824:3: warning: Attempt to free
  released memory
    kfree(hist_data->attrs->var_defs.name[i]);

In parse_var_defs() if there is a problem allocating
var_defs.expr, the earlier var_defs.name is freed.
This free is duplicated by free_var_defs() which frees
the rest of the list.

Because free_var_defs() has to run anyway, remove the
second free fom parse_var_defs().

Link: https://lkml.kernel.org/r/20200907135845.15804-1-trix@redhat.com

Cc: stable@vger.kernel.org
Fixes: 30350d65ac ("tracing: Add variable support to hist triggers")
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 13:16:40 -04:00
Tobias Klauser
54fa9ba564 ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
signature of ftrace_enable_sysctl to match ctl_table.proc_handler which
fixes the following sparse warning:

kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces)
kernel/trace/ftrace.c:7544:43:    expected void *
kernel/trace/ftrace.c:7544:43:    got void [noderef] __user *buffer

Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.ch

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 13:15:56 -04:00
Sebastian Andrzej Siewior
795d6379a4 tracing: Make the space reserved for the pid wider
For 64bit CONFIG_BASE_SMALL=0 systems PID_MAX_LIMIT is set by default to
4194304. During boot the kernel sets a new value based on number of CPUs
but no lower than 32768. It is 1024 per CPU so with 128 CPUs the default
becomes 131072 which needs six digits.
This value can be increased during run time but must not exceed the
initial upper limit.

Systemd sometime after v241 sets it to the upper limit during boot. The
result is that when the pid exceeds five digits, the trace output is a
little hard to read because it is no longer properly padded (same like
on big iron with 98+ CPUs).

Increase the pid padding to seven digits.

Link: https://lkml.kernel.org/r/20200904082331.dcdkrr3bkn3e4qlg@linutronix.de

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 12:42:11 -04:00
Adrian Hunter
478ece9573 ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
Add synchronize_rcu() after list_del_rcu() in
ftrace_remove_trampoline_from_kallsyms() to protect readers of
ftrace_ops_trampoline_list (in ftrace_get_trampoline_kallsym)
which is used when kallsyms is read.

Link: https://lkml.kernel.org/r/20200901091617.31837-1-adrian.hunter@intel.com

Fixes: fc0ea795f5 ("ftrace: Add symbols for ftrace trampolines")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 12:22:42 -04:00
Miroslav Benes
d5e47505e0 ftrace: Free the trampoline when ftrace_startup() fails
Commit fc0ea795f5 ("ftrace: Add symbols for ftrace trampolines")
missed to remove ops from new ftrace_ops_trampoline_list in
ftrace_startup() if ftrace_hash_ipmodify_enable() fails there. It may
lead to BUG if such ops come from a module which may be removed.

Moreover, the trampoline itself is not freed in this case.

Fix it by calling ftrace_trampoline_free() during the rollback.

Link: https://lkml.kernel.org/r/20200831122631.28057-1-mbenes@suse.cz

Fixes: fc0ea795f5 ("ftrace: Add symbols for ftrace trampolines")
Fixes: f8b8be8a31 ("ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict")
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-18 12:19:08 -04:00
Sven Schnelle
73ac74c7d4 lockdep: fix order in trace_hardirqs_off_caller()
Switch order so that locking state is consistent even
if the IRQ tracer calls into lockdep again.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Wang Hai
e75ad2cc41 blktrace: make function blk_trace_bio_get_cgid() static
The sparse tool complains as follows:

kernel/trace/blktrace.c:796:5: warning:
 symbol 'blk_trace_bio_get_cgid' was not declared. Should it be static?

This function is not used outside of blktrace.c, so this commit
marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-07 20:11:15 -06:00
Jakub Kicinski
44a8c4f33c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.

Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-04 21:28:59 -07:00
Peter Zijlstra
0340a6b7fb module: Fix up module_notifier return values
While auditing all module notifiers I noticed a whole bunch of fail
wrt the return value. Notifiers have a 'special' return semantics.

As is; NOTIFY_DONE vs NOTIFY_OK is a bit vague; but
notifier_from_errno(0) results in NOTIFY_OK and NOTIFY_DONE has a
comment that says "Don't care".

From this I've used NOTIFY_DONE when the function completely ignores
the callback and notifier_to_error() isn't used.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Robert Richter <rric@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20200818135804.385360407@infradead.org
2020-09-01 09:58:03 +02:00
Alexei Starovoitov
07be4c4a3e bpf: Add bpf_copy_from_user() helper.
Sleepable BPF programs can now use copy_from_user() to access user memory.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-4-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Jiri Olsa
6e22ab9da7 bpf: Add d_path helper
Adding d_path helper function that returns full path for
given 'struct path' object, which needs to be the kernel
BTF 'path' object. The path is returned in buffer provided
'buf' of size 'sz' and is zero terminated.

  bpf_d_path(&file->f_path, buf, size);

The helper calls directly d_path function, so there's only
limited set of function it can be called from. Adding just
very modest set for the start.

Updating also bpf.h tools uapi header and adding 'path' to
bpf_helpers_doc.py script.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-11-jolsa@kernel.org
2020-08-25 15:41:15 -07:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds
a1d21081a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Some merge window fallout, some longer term fixes:

   1) Handle headroom properly in lapbether and x25_asy drivers, from
      Xie He.

   2) Fetch MAC address from correct r8152 device node, from Thierry
      Reding.

   3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg,
      from Rouven Czerwinski.

   4) Correct fdputs in socket layer, from Miaohe Lin.

   5) Revert troublesome sockptr_t optimization, from Christoph Hellwig.

   6) Fix TCP TFO key reading on big endian, from Jason Baron.

   7) Missing CAP_NET_RAW check in nfc, from Qingyu Li.

   8) Fix inet fastreuse optimization with tproxy sockets, from Tim
      Froidcoeur.

   9) Fix 64-bit divide in new SFC driver, from Edward Cree.

  10) Add a tracepoint for prandom_u32 so that we can more easily
      perform usage analysis. From Eric Dumazet.

  11) Fix rwlock imbalance in AF_PACKET, from John Ogness"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  net: openvswitch: introduce common code for flushing flows
  af_packet: TPACKET_V3: fix fill status rwlock imbalance
  random32: add a tracepoint for prandom_u32()
  Revert "ipv4: tunnel: fix compilation on ARCH=um"
  net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
  net: ethernet: stmmac: Disable hardware multicast filter
  net: stmmac: dwmac1000: provide multicast filter fallback
  ipv4: tunnel: fix compilation on ARCH=um
  vsock: fix potential null pointer dereference in vsock_poll()
  sfc: fix ef100 design-param checking
  net: initialize fastreuse on inet_inherit_port
  net: refactor bind_bucket fastreuse into helper
  net: phy: marvell10g: fix null pointer dereference
  net: Fix potential memory leak in proto_register()
  net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
  ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
  net/nfc/rawsock.c: add CAP_NET_RAW check.
  hinic: fix strncpy output truncated compile warnings
  drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check
  net/tls: Fix kmap usage
  ...
2020-08-13 20:03:11 -07:00
Linus Torvalds
fc80c51fd4 Kbuild updates for v5.9
- run the checker (e.g. sparse) after the compiler
 
  - remove unneeded cc-option tests for old compiler flags
 
  - fix tar-pkg to install dtbs
 
  - introduce ccflags-remove-y and asflags-remove-y syntax
 
  - allow to trace functions in sub-directories of lib/
 
  - introduce hostprogs-always-y and userprogs-always-y syntax
 
  - various Makefile cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl8wJXEVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGMGEP/0jDq/WafbfPN0aU83EqEWLt/sKg
 bluzmf/6HGx3XVRnuAzsHNNqysUx77WJiDsU/jbC/zdH8Iox3Sc1diE2sELLNAfY
 iJmQ8NBPggyU74aYG3OJdpDjz8T9EX/nVaYrjyFlbuXElM+Qvo8Z4Fz6NpWqKWlA
 gU+yGxEPPdX6MLHcSPSIu1hGWx7UT4fgfx3zDFTI2qvbQgQjKtzyTjAH5Cm3o87h
 rfomvHSSoAUg+Fh1LediRh1tJlkdVO+w7c+LNwCswmdBtkZuxecj1bQGUTS8GaLl
 CCWOKYfWp0KsVf1veXNNNaX/ecbp+Y34WErFq3V9Fdq5RmVlp+FPSGMyjDMRiQ/p
 LGvzbJLPpG586MnK8of0dOj6Es6tVPuq6WH2HuvsyTGcZJDpFTTxRcK3HDkE8ig6
 ZtuM3owB/Mep8IzwY2yWQiDrc7TX5Fz8S4hzGPU1zG9cfj4VT6TBqHGAy1Eql/0l
 txj6vJpnbQSdXiIX8MIU3yH35Y7eW3JYWgspTZH5Woj1S/wAWwuG93Fuuxq6mQIJ
 q6LSkMavtOfuCjOA9vJBZewpKXRU6yo0CzWNL/5EZ6z/r/I+DGtfb/qka8oYUDjX
 9H0cecL37AQxDHRPTxCZDQF0TpYiFJ6bmnMftK9NKNuIdvsk9DF7UBa3EdUNIj38
 yKS3rI7Lw55xWuY3
 =bkNQ
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - run the checker (e.g. sparse) after the compiler

 - remove unneeded cc-option tests for old compiler flags

 - fix tar-pkg to install dtbs

 - introduce ccflags-remove-y and asflags-remove-y syntax

 - allow to trace functions in sub-directories of lib/

 - introduce hostprogs-always-y and userprogs-always-y syntax

 - various Makefile cleanups

* tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: stop filtering out $(GCC_PLUGINS_CFLAGS) from cc-option base
  kbuild: include scripts/Makefile.* only when relevant CONFIG is enabled
  kbuild: introduce hostprogs-always-y and userprogs-always-y
  kbuild: sort hostprogs before passing it to ifneq
  kbuild: move host .so build rules to scripts/gcc-plugins/Makefile
  kbuild: Replace HTTP links with HTTPS ones
  kbuild: trace functions in subdirectories of lib/
  kbuild: introduce ccflags-remove-y and asflags-remove-y
  kbuild: do not export LDFLAGS_vmlinux
  kbuild: always create directories of targets
  powerpc/boot: add DTB to 'targets'
  kbuild: buildtar: add dtbs support
  kbuild: remove cc-option test of -ffreestanding
  kbuild: remove cc-option test of -fno-stack-protector
  Revert "kbuild: Create directory for target DTB"
  kbuild: run the checker after the compiler
2020-08-09 14:10:26 -07:00
Masahiro Yamada
15d5761ad3 kbuild: introduce ccflags-remove-y and asflags-remove-y
CFLAGS_REMOVE_<file>.o filters out flags when compiling a particular
object, but there is no convenient way to do that for every object in
a directory.

Add ccflags-remove-y and asflags-remove-y to make it easily.

Use ccflags-remove-y to clean up some Makefiles.

The add/remove order works as follows:

 [1] KBUILD_CFLAGS specifies compiler flags used globally

 [2] ccflags-y adds compiler flags for all objects in the
     current Makefile

 [3] ccflags-remove-y removes compiler flags for all objects in the
     current Makefile (New feature)

 [4] CFLAGS_<file> adds compiler flags per file.

 [5] CFLAGS_REMOVE_<file> removes compiler flags per file.

Having [3] before [4] allows us to remove flags from most (but not all)
objects in the current Makefile.

For example, kernel/trace/Makefile removes $(CC_FLAGS_FTRACE)
from all objects in the directory, then adds it back to
trace_selftest_dynamic.o and CFLAGS_trace_kprobe_selftest.o

The same applies to lib/livepatch/Makefile.

Please note ccflags-remove-y has no effect to the sub-directories.
In contrast, the previous notation got rid of compiler flags also from
all the sub-directories.

The following are not affected because they have no sub-directories:

  arch/arm/boot/compressed/
  arch/powerpc/xmon/
  arch/sh/
  kernel/trace/

However, lib/ has several sub-directories.

To keep the behavior, I added ccflags-remove-y to all Makefiles
in subdirectories of lib/, except the following:

  lib/vdso/Makefile        - Kbuild does not descend into this Makefile
  lib/raid/test/Makefile   - This is not used for the kernel build

I think commit 2464a609de ("ftrace: do not trace library functions")
excluded too much. In the next commit, I will remove ccflags-remove-y
from the sub-directories of lib/.

Suggested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Brendan Higgins <brendanhiggins@google.com> (KUnit)
Tested-by: Anders Roxell <anders.roxell@linaro.org>
2020-08-10 01:32:59 +09:00
Linus Torvalds
32663c78c1 Tracing updates for 5.9
- The biggest news in that the tracing ring buffer can now time events that
    interrupted other ring buffer events. Before this change, if an interrupt
    came in while recording another event, and that interrupt also had an
    event, those events would all have the same time stamp as the event it
    interrupted. Now, with the new design, those events will have a unique time
    stamp and rightfully display the time for those events that were recorded
    while interrupting another event.
 
  - Bootconfig how has an "override" operator that lets the users have a
    default config, but then add options to override the default.
 
  - A fix was made to properly filter function graph tracing to the ftrace
    PIDs. This came in at the end of the -rc cycle, and needs to be backported.
 
  - Several clean ups, performance updates, and minor fixes as well.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXy3GOBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qphsAP9ci1jtrC2+cMBMCNKb/AFpA/nDaKsD
 hpsDzvD0YPOmCAEA9QbZset8wUNG49R4FexP7egQ8Ad2S6Oa5f60jWleDQY=
 =lH+q
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - The biggest news in that the tracing ring buffer can now time events
   that interrupted other ring buffer events.

   Before this change, if an interrupt came in while recording another
   event, and that interrupt also had an event, those events would all
   have the same time stamp as the event it interrupted.

   Now, with the new design, those events will have a unique time stamp
   and rightfully display the time for those events that were recorded
   while interrupting another event.

 - Bootconfig how has an "override" operator that lets the users have a
   default config, but then add options to override the default.

 - A fix was made to properly filter function graph tracing to the
   ftrace PIDs. This came in at the end of the -rc cycle, and needs to
   be backported.

 - Several clean ups, performance updates, and minor fixes as well.

* tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (39 commits)
  tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers
  kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE
  tracing: Use trace_sched_process_free() instead of exit() for pid tracing
  bootconfig: Fix to find the initargs correctly
  Documentation: bootconfig: Add bootconfig override operator
  tools/bootconfig: Add testcases for value override operator
  lib/bootconfig: Add override operator support
  kprobes: Remove show_registers() function prototype
  tracing/uprobe: Remove dead code in trace_uprobe_register()
  kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
  ftrace: Fix ftrace_trace_task return value
  tracepoint: Use __used attribute definitions from compiler_attributes.h
  tracepoint: Mark __tracepoint_string's __used
  trace : Have tracing buffer info use kvzalloc instead of kzalloc
  tracing: Remove outdated comment in stack handling
  ftrace: Do not let direct or IPMODIFY ftrace_ops be added to module and set trampolines
  ftrace: Setup correct FTRACE_FL_REGS flags for module
  tracing/hwlat: Honor the tracing_cpumask
  tracing/hwlat: Drop the duplicate assignment in start_kthread()
  tracing: Save one trace_event->type by using __TRACE_LAST_TYPE
  ...
2020-08-07 18:29:15 -07:00
Steven Rostedt (VMware)
38ce2a9e33 tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers
As trace_array_printk() used with not global instances will not add noise to
the main buffer, they are OK to have in the kernel (unlike trace_printk()).
This require the subsystem to create their own tracing instance, and the
trace_array_printk() only writes into those instances.

Add trace_array_init_printk() to initialize the trace_printk() buffers
without printing out the WARNING message.

Reported-by: Sean Paul <sean@poorly.run>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-08-07 17:05:01 -04:00
Linus Torvalds
eb65405eb6 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAl8qeCkACgkQnJ2qBz9k
 QNlAGQf/YVruyVLZ7kCv6EMCHauXm3K1lEGpbXsTW04HpStxGx7mtLGN/Au+EYJR
 VnRkCMt6TSMQGMBkNF83dUCwXHkeL1rd6frJBLVOErkg50nUuD4kjTVw9Lzw9itx
 CPhKnPPlsRkDkZPxkg3WEdqPgzJREWBZUaB38QUPjYN46q7HfPYDANTh5wI1GiGs
 27+PvzlttjhkQpQ14pYU/nu4xf/nmgmmHhgfsJArQP2EzYOrKxsWKhXS5uPdtNlf
 mXiZMaqW2AlyDGlw3myOEySrrSuaR77M2bzDo7mjqffI9wSVTytKEhtg0i8OMWmv
 pZ38OQobznnFoqzc1GL70IE0DEU48g==
 =d81d
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:

 - fanotify fix for softlockups when there are many queued events

 - performance improvement to reduce fsnotify overhead when not used

 - Amir's implementation of fanotify events with names. With these you
   can now efficiently monitor whole filesystem, eg to mirror changes to
   another machine.

* tag 'fsnotify_for_v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (37 commits)
  fanotify: compare fsid when merging name event
  fsnotify: create method handle_inode_event() in fsnotify_operations
  fanotify: report parent fid + child fid
  fanotify: report parent fid + name + child fid
  fanotify: add support for FAN_REPORT_NAME
  fanotify: report events with parent dir fid to sb/mount/non-dir marks
  fanotify: add basic support for FAN_REPORT_DIR_FID
  fsnotify: remove check that source dentry is positive
  fsnotify: send event with parent/name info to sb/mount/non-dir marks
  audit: do not set FS_EVENT_ON_CHILD in audit marks mask
  inotify: do not set FS_EVENT_ON_CHILD in non-dir mark mask
  fsnotify: pass dir and inode arguments to fsnotify()
  fsnotify: create helper fsnotify_inode()
  fsnotify: send event to parent and child with single callback
  inotify: report both events on parent and child with single callback
  dnotify: report both events on parent and child with single callback
  fanotify: no external fh buffer in fanotify_name_event
  fanotify: use struct fanotify_info to parcel the variable size buffer
  fsnotify: add object type "child" to object type iterator
  fanotify: use FAN_EVENT_ON_CHILD as implicit flag on sb/mount/non-dir marks
  ...
2020-08-06 19:29:51 -07:00