Inline Encryption hardware allows software to specify an encryption context
(an encryption key, crypto algorithm, data unit num, data unit size) along
with a data transfer request to a storage device, and the inline encryption
hardware will use that context to en/decrypt the data. The inline
encryption hardware is part of the storage device, and it conceptually sits
on the data path between system memory and the storage device.
Inline Encryption hardware implementations often function around the
concept of "keyslots". These implementations often have a limited number
of "keyslots", each of which can hold a key (we say that a key can be
"programmed" into a keyslot). Requests made to the storage device may have
a keyslot and a data unit number associated with them, and the inline
encryption hardware will en/decrypt the data in the requests using the key
programmed into that associated keyslot and the data unit number specified
with the request.
As keyslots are limited, and programming keys may be expensive in many
implementations, and multiple requests may use exactly the same encryption
contexts, we introduce a Keyslot Manager to efficiently manage keyslots.
We also introduce a blk_crypto_key, which will represent the key that's
programmed into keyslots managed by keyslot managers. The keyslot manager
also functions as the interface that upper layers will use to program keys
into inline encryption hardware. For more information on the Keyslot
Manager, refer to documentation found in block/keyslot-manager.c and
linux/keyslot-manager.h.
Co-developed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The blk-crypto framework adds support for inline encryption. There are
numerous changes throughout the storage stack. This patch documents the
main design choices in the block layer, the API presented to users of
the block layer (like fscrypt or layered devices) and the API presented
to drivers for adding support for inline encryption.
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When the QoS targets are met and nothing is being throttled, there's
no way to tell how saturated the underlying device is - it could be
almost entirely idle, at the cusp of saturation or anywhere inbetween.
Given that there's no information, it's best to keep vrate as-is in
this state. Before 7cd806a9a9 ("iocost: improve nr_lagging
handling"), this was the case - if the device isn't missing QoS
targets and nothing is being throttled, busy_level was reset to zero.
While fixing nr_lagging handling, 7cd806a9a9 ("iocost: improve
nr_lagging handling") broke this. Now, while the device is hitting
QoS targets and nothing is being throttled, vrate keeps getting
adjusted according to the existing busy_level.
This led to vrate keeping climing till it hits max when there's an IO
issuer with limited request concurrency if the vrate started low.
vrate starts getting adjusted upwards until the issuer can issue IOs
w/o being throttled. From then on, QoS targets keeps getting met and
nothing on the system needs throttling and vrate keeps getting
increased due to the existing busy_level.
This patch makes the following changes to the busy_level logic.
* Reset busy_level if nr_shortages is zero to avoid the above
scenario.
* Make non-zero nr_lagging block lowering nr_level but still clear
positive busy_level if there's clear non-saturation signal - QoS
targets are met and nr_shortages is non-zero. nr_lagging's role is
preventing adjusting vrate upwards while there are long-running
commands and it shouldn't keep busy_level positive while there's
clear non-saturation signal.
* Restructure code for clarity and add comments.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andy Newell <newella@fb.com>
Fixes: 7cd806a9a9 ("iocost: improve nr_lagging handling")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The function copy_kernel_to_xregs_err() uses XRSTOR which can work with
standard or compacted format without supervisor xstates. However, when
supervisor xstates are present, XRSTORS must be used. Fix it by using
XRSTORS when supervisor state handling is enabled.
I also considered if there were additional cases where XRSTOR might be
mistakenly called instead of XRSTORS. There are only three XRSTOR sites
in the kernel:
1. copy_kernel_to_xregs_booting(), already switches between XRSTOR and
XRSTORS based on X86_FEATURE_XSAVES.
2. copy_user_to_xregs(), which *needs* XRSTOR because it is copying from
userspace and must never copy supervisor state with XRSTORS.
3. copy_kernel_to_xregs_err() mistakenly used XRSTOR only. Fix it.
[ bp: Massage commit message. ]
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20200512145444.15483-8-yu-cheng.yu@intel.com
POSIX defines faccessat() as having a fourth "flags" argument, while the
linux syscall doesn't have it. Glibc tries to emulate AT_EACCESS and
AT_SYMLINK_NOFOLLOW, but AT_EACCESS emulation is broken.
Add a new faccessat(2) syscall with the added flags argument and implement
both flags.
The value of AT_EACCESS is defined in glibc headers to be the same as
AT_REMOVEDIR. Use this value for the kernel interface as well, together
with the explanatory comment.
Also add AT_EMPTY_PATH support, which is not documented by POSIX, but can
be useful and is trivial to implement.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Parsing "silent" and clearing SB_SILENT makes zero sense.
Parsing "silent" and setting SB_SILENT would make a bit more sense, but
apparently nobody cares.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Unlike the others, this is _not_ a standard option accepted by mount(8).
In fact SB_POSIXACL is an internal flag, and accepting MS_POSIXACL on the
mount(2) interface is possibly a bug.
The only filesystem that apparently wants to handle the "posixacl" option
is 9p, but it has special handling of that option besides setting
SB_POSIXACL.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Makes little sense to keep this blacklist synced with what mount(8) parses
and what it doesn't. E.g. it has various forms of "*atime" options, but
not "atime"...
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Determining whether a path or file descriptor refers to a mountpoint (or
more precisely a mount root) is not trivial using current tools.
Add a flag to statx that indicates whether the path or fd refers to the
root of a mount or not.
Cc: linux-api@vger.kernel.org
Cc: linux-man@vger.kernel.org
Reported-by: Lennart Poettering <mzxreary@0pointer.de>
Reported-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
IS_NOATIME(inode) is defined as __IS_FLG(inode, SB_RDONLY|SB_NOATIME), so
generic_fillattr() will clear STATX_ATIME from the result_mask if the super
block is marked read only.
This was probably not the intention, so fix to only clear STATX_ATIME if
the fs doesn't support atime at all.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Constants of the *_ALL type can be actively harmful due to the fact that
developers will usually fail to consider the possible effects of future
changes to the definition.
Deprecate STATX_ALL in the uapi, while no damage has been done yet.
We could keep something like this around in the kernel, but there's
actually no point, since all filesystems should be explicitly checking
flags that they support and not rely on the VFS masking unknown ones out: a
flag could be known to the VFS, yet not known to the filesystem.
Cc: David Howells <dhowells@redhat.com>
Cc: linux-api@vger.kernel.org
Cc: linux-man@vger.kernel.org
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Split out a helper that overrides the credentials in preparation for
actually doing the access check.
This prepares for the next patch that optionally disables the creds
override.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If mounts are deleted after a read(2) call on /proc/self/mounts (or its
kin), the subsequent read(2) could miss a mount that comes after the
deleted one in the list. This is because the file position is interpreted
as the number mount entries from the start of the list.
E.g. first read gets entries #0 to #9; the seq file index will be 10. Then
entry #5 is deleted, resulting in #10 becoming #9 and #11 becoming #10,
etc... The next read will continue from entry #10, and #9 is missed.
Solve this by adding a cursor entry for each open instance. Taking the
global namespace_sem for write seems excessive, since we are only dealing
with a per-namespace list. Instead add a per-namespace spinlock and use
that together with namespace_sem taken for read to protect against
concurrent modification of the mount list. This may reduce parallelism of
is_local_mountpoint(), but it's hardly a big contention point. We could
also use RCU freeing of cursors to make traversal not need additional
locks, if that turns out to be neceesary.
Only move the cursor once for each read (cursor is not added on open) to
minimize cacheline invalidation. When EOF is reached, the cursor is taken
off the list, in order to prevent an excessive number of cursors due to
inactive open file descriptors.
Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Avi Kivity reports that on fuse filesystems running in a user namespace
asyncronous fsync fails with EOVERFLOW.
The reason is that f_ops->fsync() is called with the creds of the kthread
performing aio work instead of the creds of the process originally
submitting IOCB_CMD_FSYNC.
Fuse sends the creds of the caller in the request header and it needs to
translate the uid and gid into the server's user namespace. Since the
kthread is running in init_user_ns, the translation will fail and the
operation returns an error.
It can be argued that fsync doesn't actually need any creds, but just
zeroing out those fields in the header (as with requests that currently
don't take creds) is a backward compatibility risk.
Instead of working around this issue in fuse, solve the core of the problem
by calling the filesystem with the proper creds.
Reported-by: Avi Kivity <avi@scylladb.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: c9582eb0ff ("fuse: Fail all requests with invalid uids or gids")
Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Whiteouts, unlike real device node should not require privileges to create.
The general concern with device nodes is that opening them can have side
effects. The kernel already avoids zero major (see
Documentation/admin-guide/devices.txt). To be on the safe side the patch
explicitly forbids registering a char device with 0/0 number (see
cdev_add()).
This guarantees that a non-O_PATH open on a whiteout will fail with ENODEV;
i.e. it won't have any side effect.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
blk_io_schedule() isn't called from performance sensitive code path, and
it is easier to maintain by exporting it as symbol.
Also blk_io_schedule() is only called by CONFIG_BLOCK code, so it is safe
to do this way. Meantime fixes build failure when CONFIG_BLOCK is off.
Cc: Christoph Hellwig <hch@infradead.org>
Fixes: e6249cdd46 ("block: add blk_io_schedule() for avoiding task hung in sync dio")
Reported-by: Satya Tangirala <satyat@google.com>
Tested-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There's a lot of checks to make sure the ring buffer is working, and if an
anomaly is detected, it safely shuts itself down. But there's a few cases
that it will call BUG(), which defeats the point of being safe (it crashes
the kernel when an anomaly is found!). There's no reason for them. Switch
them all to either WARN_ON_ONCE() (when no ring buffer descriptor is present),
or to RB_WARN_ON() (when a ring buffer descriptor is present).
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If the function tracer is running and the trace file is read (which uses the
ring buffer iterator), the iterator can get in sync with the writes, and
caues it to fail to find a page with content it can read three times. This
causes a warning and deactivation of the ring buffer code.
Looking at the other cases of failure to get an event, it appears that
there's a chance that the writer could cause them too. Since the iterator is
a "best effort" to read the ring buffer if there's an active writer (the
consumer reader is made for this case "see trace_pipe"), if it fails to get
an event after three tries, simply give up and return NULL. Don't warn, nor
disable the ring buffer on this failure.
Link: https://lore.kernel.org/r/20200429090508.GG5770@shao2-debian
Reported-by: kernel test robot <lkp@intel.com>
Fixes: ff84c50cfb ("ring-buffer: Do not die if rb_iter_peek() fails more than thrice")
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The R-Car DU driver calls drm_vblank_init via some helper functions in
probe(). From what I checked, most drivers do this as well. I have a
config now where DU always stays in deferred_probe state because of a
missing dependency. This means that every time I rebind another driver
like MMC, the vblank init message is displayed again when the DU driver
is retried. Because the message doesn't really carry a useful
information, I suggest to simply drop it.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513201016.23047-1-wsa+renesas@sang-engineering.com
Since its inclusion in v3.9, no users of the SuperH VEU mem2mem video
processing driver have appeared upstream. All VEU devices in SuperH
board code still bind to the "uio_pdrv_genirq" driver instead.
The original author marked the driver orphaned in v3.15.
Remove the driver; it can always be resurrected from git history when
needed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rob Landley <rob@landley.net>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
IoT Box is an IoT gateway device based on Stinger96 board powered by
STM32MP1 SoC, designed and manufactured by Shiratech Solutions. This
device makes use of Stinger96 board by having it as a base board with
one additional mezzanine on top.
Following are the features exposed by this device in addition to the
Stinger96 board:
* WiFi/BT
* CCS811 VOC sensor
* 2x Digital microphones IM69D130
* 12x WS2812B LEDs
Following peripherals are tested and known to work:
* WiFi/BT
* CCS811
More information about this device can be found in Shiratech website:
https://www.shiratech-solutions.com/products/iot-box/
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
Stinger96 is a 96Boards IoT Extended edition board designed and
manufactured by Shiratech solutions based on STM32MP1 SoC. Following
are the features of this board:
* 256MB DDR
* 125MB NAND Flash
* Onboard BG96 modem
* 1x uSD
* 2x USB (1 available as external connector and another connected to BG96)
* 1x SPI
* 1x PCM
* 2x UART (apart from serial console)
* 2x I2C (apart from one connected to PMIC)
Following peripherals are tested and known to work:
* BG96 modem
* 1x I2C (LS-I2C0)
* 1x SPI
* 1x UART (LS-UART0)
* USB (Only Gadget mode)
* uSD
More information about this board can be found in Shiratech website:
https://www.shiratech-solutions.com/products/stinger96/
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
These pinctrl definitions will be used by Stinger96/IoTBox boards
from Shiratech.
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
Those were used to create files in /proc/acpi long ago
and were missed when that code was deleted.
Signed-off-by: Pascal Terjan <pterjan@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On platforms with IOMMU enabled, multiple SGs can be coalesced into one
by the IOMMU driver. In that case the SG list processing as part of the
completion of a urb on a bulk endpoint can result into a NULL pointer
dereference with the below stack dump.
<6> Unable to handle kernel NULL pointer dereference at virtual address 0000000c
<6> pgd = c0004000
<6> [0000000c] *pgd=00000000
<6> Internal error: Oops: 5 [#1] PREEMPT SMP ARM
<2> PC is at xhci_queue_bulk_tx+0x454/0x80c
<2> LR is at xhci_queue_bulk_tx+0x44c/0x80c
<2> pc : [<c08907c4>] lr : [<c08907bc>] psr: 000000d3
<2> sp : ca337c80 ip : 00000000 fp : ffffffff
<2> r10: 00000000 r9 : 50037000 r8 : 00004000
<2> r7 : 00000000 r6 : 00004000 r5 : 00000000 r4 : 00000000
<2> r3 : 00000000 r2 : 00000082 r1 : c2c1a200 r0 : 00000000
<2> Flags: nzcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment none
<2> Control: 10c0383d Table: b412c06a DAC: 00000051
<6> Process usb-storage (pid: 5961, stack limit = 0xca336210)
<snip>
<2> [<c08907c4>] (xhci_queue_bulk_tx)
<2> [<c0881b3c>] (xhci_urb_enqueue)
<2> [<c0831068>] (usb_hcd_submit_urb)
<2> [<c08350b4>] (usb_sg_wait)
<2> [<c089f384>] (usb_stor_bulk_transfer_sglist)
<2> [<c089f2c0>] (usb_stor_bulk_srb)
<2> [<c089fe38>] (usb_stor_Bulk_transport)
<2> [<c089f468>] (usb_stor_invoke_transport)
<2> [<c08a11b4>] (usb_stor_control_thread)
<2> [<c014a534>] (kthread)
The above NULL pointer dereference is the result of block_len and the
sent_len set to zero after the first SG of the list when IOMMU driver
is enabled. Because of this the loop of processing the SGs has run
more than num_sgs which resulted in a sg_next on the last SG of the
list which has SG_END set.
Fix this by check for the sg before any attributes of the sg are
accessed.
[modified reason for null pointer dereference in commit message subject -Mathias]
Fixes: f9c589e142 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer")
Cc: stable@vger.kernel.org
Signed-off-by: Sriharsha Allenki <sallenki@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20200514110432.25564-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order to use a common VSC SDP Colorimetry calculating code on PSR,
it uses a new psr vsc sdp compute routine.
Because PSR routine has its own scenario and timings of writing a VSC SDP,
the current PSR routine needs to have its own drm_dp_vsc_sdp structure
member variable on struct i915_psr.
In order to calculate colorimetry information, intel_psr_update()
function and intel_psr_enable() function extend a drm_connector_state
argument.
There are no changes to PSR mechanism.
v3: Replace a structure name to drm_dp_vsc_sdp from intel_dp_vsc_sdp
v4: Rebased
v8: Rebased
v10: When a PSR is enabled, it needs to add DP_SDP_VSC to
infoframes.enable.
It is needed for comparing between HW and pipe_state of VSC_SDP.
v11: If PSR is disabled by flag, it don't enable psr on pipe compute.
v12: Fix an inconsistent indenting
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200514060732.3378396-15-gwan-gyeong.mun@intel.com
In order to use a common VSC SDP Colorimetry calculating code on PSR,
it adds a compute routine for PSR VSC SDP.
As PSR routine can not use infoframes.vsc of crtc state, it also adds new
writing of DP SDPs (Secondary Data Packet) for PSR.
PSR routine has its own scenario and timings of writing a VSC SDP.
v3: Replace a structure name to drm_dp_vsc_sdp from intel_dp_vsc_sdp
v4: Use struct drm_device logging macros
v10: 1) Fix packing of VSC SDP where Pixel Encoding/Colorimetry Format is
not supported.
2) Change a checking of PSR state.
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200514060732.3378396-14-gwan-gyeong.mun@intel.com
Compared to implementation of DP and HDMI's encoder->infoframes_enabled,
the lspcon's implementation returns its active state. (we expect enabled
infoframe states of HW.) It leads to pipe state mismatch error
when ddi_get_config is called.
Because the current implementation of lspcon is not ready to support
readout infoframes, we need to return 0 here.
In order to support readout to lspcon, we need to implement read_infoframe
and infoframes_enabled. And set_infoframes also have to set an appropriate
bit on crtc_state->infoframes.enable
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200514060732.3378396-11-gwan-gyeong.mun@intel.com
In order to use computed config for DP SDPs (DP VSC SDP and DP HDR Metadata
Infoframe SDP), it replaces intel_dp_vsc_enable() function and
intel_dp_hdr_metadata_enable() function to intel_dp_set_infoframes()
function.
And it removes unused functions.
Before:
intel_dp_vsc_enable() and intel_dp_hdr_metadata_enable() compute sdp
configs and program sdp registers on enable callback of encoder.
After:
It separates computing of sdp configs and programming of sdp register.
The compute config callback of encoder calls computing sdp configs.
The enable callback of encoder calls programming sdp register.
v3: Rebased
v5: Polish commit message [Uma]
v10: Rebased
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200514060732.3378396-8-gwan-gyeong.mun@intel.com