Allow a buffer to be marked such that read() must return the entire buffer
in one go or return ENOBUFS. Multiple buffers can be amalgamated into a
single read, but a short read will occur if the next "whole" buffer won't
fit.
This is useful for watch queue notifications to make sure we don't split a
notification across multiple reads, especially given that we need to
fabricate an overrun record under some circumstances - and that isn't in
the buffers.
Signed-off-by: David Howells <dhowells@redhat.com>
The sample program is run like:
./samples/watch_queue/watch_test
and watches "/" for mount changes and the current session keyring for key
changes:
# keyctl add user a a @s
1035096409
# keyctl unlink 1035096409 @s
producing:
# ./watch_test
read() = 16
NOTIFY[000]: ty=000001 sy=02 i=00000110
KEY 2ffc2e5d change=2[linked] aux=1035096409
read() = 16
NOTIFY[000]: ty=000001 sy=02 i=00000110
KEY 2ffc2e5d change=3[unlinked] aux=1035096409
Other events may be produced, such as with a failing disk:
read() = 22
NOTIFY[000]: ty=000003 sy=02 i=00000416
USB 3-7.7 dev-reset e=0 r=0
read() = 24
NOTIFY[000]: ty=000002 sy=06 i=00000418
BLOCK 00800050 e=6[critical medium] s=64000ef8
This corresponds to:
blk_update_request: critical medium error, dev sdf, sector 1677725432 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
in dmesg.
Signed-off-by: David Howells <dhowells@redhat.com>
Differing default states set on driver init / perf init and as a result
of a sysfs reset.
The ETMv4 can be programmed to trace the entire instruction address range
without the need to use address comparator filter resources.
(Described in the ETMv4.x technical reference manual)
sysfs reset was using this method, perf and default driver init were setup
with an address range comparator for the entire address range.
The perf / driver init has been altered to use the method without needing
any comparator address hardware.
Minor adjustment to the vinst_ctrl register initialisation to ensure
correct zero initialisation.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-15-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On some QCOM platforms like SC7180, SDM845 and SM8150,
reading TMC mode register without proper coresight power
management can lead to async exceptions like the one in
the call trace below in tmc_read_prepare_etb(). This can
happen if the user tries to read the TMC etf data via
device node without setting up source and the sink first.
Fix this by having a check for coresight sysfs mode
before reading TMC mode management register.
Kernel panic - not syncing: Asynchronous SError Interrupt
CPU: 7 PID: 2605 Comm: hexdump Tainted: G S 5.4.30 #122
Call trace:
dump_backtrace+0x0/0x188
show_stack+0x20/0x2c
dump_stack+0xdc/0x144
panic+0x168/0x36c
panic+0x0/0x36c
arm64_serror_panic+0x78/0x84
do_serror+0x130/0x138
el1_error+0x84/0xf8
tmc_read_prepare_etb+0x88/0xb8
tmc_open+0x40/0xd8
misc_open+0x120/0x158
chrdev_open+0xb8/0x1a4
do_dentry_open+0x268/0x3a0
vfs_open+0x34/0x40
path_openat+0x39c/0xdf4
do_filp_open+0x90/0x10c
do_sys_open+0x150/0x3e8
__arm64_compat_sys_openat+0x28/0x34
el0_svc_common+0xa8/0x160
el0_svc_compat_handler+0x2c/0x38
el0_svc_compat+0x8/0x10
Fixes: 4525412a50 ("coresight: tmc: making prepare/unprepare functions generic")
Reported-by: Stephen Boyd <swboyd@chromium.org>
Suggested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-14-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On some systems the firmware may not describe all the ports
connected to a component (e.g, for security reasons). This
could be especially problematic for "funnels" where we could
end up in modifying memory beyond the allocated space for
refcounts.
e.g, for a funnel with input ports listed 0, 3, 5, nr_inport = 3.
However the we could access refcnts[5] while checking for
references, like :
[ 526.110401] ==================================================================
[ 526.117988] BUG: KASAN: slab-out-of-bounds in funnel_enable+0x54/0x1b0
[ 526.124706] Read of size 4 at addr ffffff8135f9549c by task bash/1114
[ 526.131324]
[ 526.132886] CPU: 3 PID: 1114 Comm: bash Tainted: G S 5.4.25 #232
[ 526.140397] Hardware name: Qualcomm Technologies, Inc. SC7180 IDP (DT)
[ 526.147113] Call trace:
[ 526.149653] dump_backtrace+0x0/0x188
[ 526.153431] show_stack+0x20/0x2c
[ 526.156852] dump_stack+0xdc/0x144
[ 526.160370] print_address_description+0x3c/0x494
[ 526.165211] __kasan_report+0x144/0x168
[ 526.169170] kasan_report+0x10/0x18
[ 526.172769] check_memory_region+0x1a4/0x1b4
[ 526.177164] __kasan_check_read+0x18/0x24
[ 526.181292] funnel_enable+0x54/0x1b0
[ 526.185072] coresight_enable_path+0x104/0x198
[ 526.189649] coresight_enable+0x118/0x26c
...
[ 526.237782] Allocated by task 280:
[ 526.241298] __kasan_kmalloc+0xf0/0x1ac
[ 526.245249] kasan_kmalloc+0xc/0x14
[ 526.248849] __kmalloc+0x28c/0x3b4
[ 526.252361] coresight_register+0x88/0x250
[ 526.256587] funnel_probe+0x15c/0x228
[ 526.260365] dynamic_funnel_probe+0x20/0x2c
[ 526.264679] amba_probe+0xbc/0x158
[ 526.268193] really_probe+0x144/0x408
[ 526.271970] driver_probe_device+0x70/0x140
...
[ 526.316810]
[ 526.318364] Freed by task 0:
[ 526.321344] (stack is not available)
[ 526.325024]
[ 526.326580] The buggy address belongs to the object at ffffff8135f95480
[ 526.326580] which belongs to the cache kmalloc-128 of size 128
[ 526.339439] The buggy address is located 28 bytes inside of
[ 526.339439] 128-byte region [ffffff8135f95480, ffffff8135f95500)
[ 526.351399] The buggy address belongs to the page:
[ 526.356342] page:ffffffff04b7e500 refcount:1 mapcount:0 mapping:ffffff814b00c380 index:0x0 compound_mapcount: 0
[ 526.366711] flags: 0x4000000000010200(slab|head)
[ 526.371475] raw: 4000000000010200 ffffffff05034008 ffffffff0501eb08 ffffff814b00c380
[ 526.379435] raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
[ 526.387393] page dumped because: kasan: bad access detected
[ 526.393128]
[ 526.394681] Memory state around the buggy address:
[ 526.399619] ffffff8135f95380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 526.407046] ffffff8135f95400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 526.414473] >ffffff8135f95480: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 526.421900] ^
[ 526.426029] ffffff8135f95500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 526.433456] ffffff8135f95580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 526.440883] ==================================================================
To keep the code simple, we now track the maximum number of
possible input/output connections to/from this component
@ nr_inport and nr_outport in platform_data, respectively.
Thus the output connections could be sparse and code is
adjusted to skip the unspecified connections.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Reported-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-13-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix the following sparse warning:
drivers/hwtracing/coresight/coresight-cti.c:22:1: warning: symbol
'ect_net' was not declared. Should it be static?
drivers/hwtracing/coresight/coresight-cti.c:625:32: warning: symbol
'cti_ops_ect' was not declared. Should it be static?
drivers/hwtracing/coresight/coresight-cti.c:630:28: warning: symbol
'cti_ops' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-11-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Coresight device connections are a bit complicated and is not
exposed currently to the user. One has to look at the platform
descriptions (DT bindings or ACPI bindings) to make an understanding.
Given the new naming scheme, it will be helpful to have this information
to choose the appropriate devices for tracing. This patch exposes
the device connections via links in the sysfs directories.
e.g, for a connection devA[OutputPort_X] -> devB[InputPort_Y]
is represented as two symlinks:
/sys/bus/coresight/.../devA/out:X -> /sys/bus/coresight/.../devB
/sys/bus/coresight/.../devB/in:Y -> /sys/bus/coresight/.../devA
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[Revised to use the generic sysfs links functions & link structures.
Provides a connections sysfs group in each device to hold the links.]
Co-developed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-5-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oded writes:
This tag contains the following changes for kernel 5.8:
- GAUDI ASIC support. The tag contains code and header files needed to
initialize the GAUDI ASIC and run workloads on it. There are changes to
the common code that are needed for GAUDI and there is the addition of
new ASIC-dependent code of GAUDI.
- Add new feature of signal/wait command submissions. This is relevant to
GAUDI only and allows the user to sync between streams (queues) inside
the device.
- Allow user to retrieve the device time alongside the host time, to allow
a user application to synchronize device time together with host
time during profiling.
- Change ASIC's CPU initialization by loading its boot loader code from the
Host memory (instead of it being programmed on the on-board FLASH).
- Expose more attributes through HWMON.
- Move the ASIC event handling code to be "common code". This will be
shared between GAUDI and future ASICs. Goya will still use its own code.
- Fix bug in command submission parsing in Goya.
- Small fixes to security configuration (open up some registers for user
access).
- Improvements to ASIC reset code.
* tag 'misc-habanalabs-next-2020-05-19' of git://people.freedesktop.org/~gabbayo/linux: (38 commits)
habanalabs: update patched_cb_size for Wreg32
habanalabs: move event handling to common firmware file
habanalabs: enable gaudi code in driver
habanalabs: add gaudi profiler module
habanalabs: add gaudi security module
habanalabs: add hwmgr module for gaudi
habanalabs: add gaudi asic-dependent code
uapi: habanalabs: add gaudi defines
habanalabs: add gaudi asic registers header files
habanalabs: get card type, location from F/W
habanalabs: support clock gating enable/disable
habanalabs: set PM profile to auto only for goya
habanalabs: add dedicated define for hard reset
habanalabs: check if CoreSight is supported
habanalabs: add signal/wait to CS IOCTL operations
habanalabs: handle the h/w sync object
habanalabs: define ASIC-dependent interface for signal/wait
uapi: habanalabs: add signal/wait operations
habanalabs: add missing MODULE_DEVICE_TABLE
habanalabs: print all CB handles as hex numbers
...
Add a key/keyring change notification facility whereby notifications about
changes in key and keyring content and attributes can be received.
Firstly, an event queue needs to be created:
pipe2(fds, O_NOTIFICATION_PIPE);
ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256);
then a notification can be set up to report notifications via that queue:
struct watch_notification_filter filter = {
.nr_filters = 1,
.filters = {
[0] = {
.type = WATCH_TYPE_KEY_NOTIFY,
.subtype_filter[0] = UINT_MAX,
},
},
};
ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);
keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
After that, records will be placed into the queue when events occur in
which keys are changed in some way. Records are of the following format:
struct key_notification {
struct watch_notification watch;
__u32 key_id;
__u32 aux;
} *n;
Where:
n->watch.type will be WATCH_TYPE_KEY_NOTIFY.
n->watch.subtype will indicate the type of event, such as
NOTIFY_KEY_REVOKED.
n->watch.info & WATCH_INFO_LENGTH will indicate the length of the
record.
n->watch.info & WATCH_INFO_ID will be the second argument to
keyctl_watch_key(), shifted.
n->key will be the ID of the affected key.
n->aux will hold subtype-dependent information, such as the key
being linked into the keyring specified by n->key in the case of
NOTIFY_KEY_LINKED.
Note that it is permissible for event records to be of variable length -
or, at least, the length may be dependent on the subtype. Note also that
the queue can be shared between multiple notifications of various types.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
mac80211 can provide space for the driver to put a tx header on
the skb buffer instead coping the entire frame on to a local
buffer with the header.
To use this extra_tx_headroom must be set in mac80211 with the largest
possible header which is struct vnt_tx_buffer.
The driver has 8 possible combinations of tx header size which
are found in vnt_get_hdr_size replacing vnt_mac_hdr_pos.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/7b967bfc-1d4b-4b45-efab-d54f16cca226@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make it possible to have a general notification queue built on top of a
standard pipe. Notifications are 'spliced' into the pipe and then read
out. splice(), vmsplice() and sendfile() are forbidden on pipes used for
notifications as post_one_notification() cannot take pipe->mutex. This
means that notifications could be posted in between individual pipe
buffers, making iov_iter_revert() difficult to effect.
The way the notification queue is used is:
(1) An application opens a pipe with a special flag and indicates the
number of messages it wishes to be able to queue at once (this can
only be set once):
pipe2(fds, O_NOTIFICATION_PIPE);
ioctl(fds[0], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
(2) The application then uses poll() and read() as normal to extract data
from the pipe. read() will return multiple notifications if the
buffer is big enough, but it will not split a notification across
buffers - rather it will return a short read or EMSGSIZE.
Notification messages include a length in the header so that the
caller can split them up.
Each message has a header that describes it:
struct watch_notification {
__u32 type:24;
__u32 subtype:8;
__u32 info;
};
The type indicates the source (eg. mount tree changes, superblock events,
keyring changes, block layer events) and the subtype indicates the event
type (eg. mount, unmount; EIO, EDQUOT; link, unlink). The info field
indicates a number of things, including the entry length, an ID assigned to
a watchpoint contributing to this buffer and type-specific flags.
Supplementary data, such as the key ID that generated an event, can be
attached in additional slots. The maximum message size is 127 bytes.
Messages may not be padded or aligned, so there is no guarantee, for
example, that the notification type will be on a 4-byte bounary.
Signed-off-by: David Howells <dhowells@redhat.com>
Add an O_NOTIFICATION_PIPE flag that can be passed to pipe2() to indicate
that the pipe being created is going to be used for notifications. This
suppresses the use of splice(), vmsplice(), tee() and sendfile() on the
pipe as calling iov_iter_revert() on a pipe when a kernel notification
message has been inserted into the middle of a multi-buffer splice will be
messy.
The flag is given the same value as O_EXCL as it seems unlikely that
this flag will ever be applicable to pipes and I don't want to use up
another O_* bit unnecessarily. An alternative could be to add a pipe3()
system call.
Signed-off-by: David Howells <dhowells@redhat.com>
Add a security hook that allows an LSM to rule on whether a notification
message is allowed to be inserted into a particular watch queue.
The hook is given the following information:
(1) The credentials of the triggerer (which may be init_cred for a system
notification, eg. a hardware error).
(2) The credentials of the whoever set the watch.
(3) The notification message.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
cc: Casey Schaufler <casey@schaufler-ca.com>
cc: Stephen Smalley <sds@tycho.nsa.gov>
cc: linux-security-module@vger.kernel.org
Add UAPI definitions for the general notification queue, including the
following pieces:
(*) struct watch_notification.
This is the metadata header for notification messages. It includes a
type and subtype that indicate the source of the message
(eg. WATCH_TYPE_MOUNT_NOTIFY) and the kind of the message
(eg. NOTIFY_MOUNT_NEW_MOUNT).
The header also contains an information field that conveys the
following information:
- WATCH_INFO_LENGTH. The size of the entry (entries are variable
length).
- WATCH_INFO_ID. The watch ID specified when the watchpoint was
set.
- WATCH_INFO_TYPE_INFO. (Sub)type-specific information.
- WATCH_INFO_FLAG_*. Flag bits overlain on the type-specific
information. For use by the type.
All the information in the header can be used in filtering messages at
the point of writing into the buffer.
(*) struct watch_notification_removal
This is an extended watch-removal notification record that includes an
'id' field that can indicate the identifier of the object being
removed if available (for instance, a keyring serial number).
Signed-off-by: David Howells <dhowells@redhat.com>