Commit Graph

982974 Commits

Author SHA1 Message Date
Linus Torvalds
4fd84bc969 block-5.10-2020-11-20
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+4C/oQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnlQD/40VMzxToudShz6bHyYYaJnA9TDEklQ/aUM
 rW5dyUthc7jap7bv5Yy6hRcoqXCjy77+3hBZ536q9XyX+jFZ/q34XqwHBOJ4tBwZ
 teBACT+sLxB3n41B/AecSvcG40GfINHyIONIeWc4JQmB5IqX4eX3PMIp+rsN+6yx
 IILnkZg9XULZ/IS48GBo8lgxaJHTU0qcOa20YZ3F9yY+/EsROwcy4ClyM+VUcoXe
 3jmXpYm/MucH4YIFPv0BhQI2TVQYzo2VK11YF6JgcDAJk15NNRjglCGesUwSfreZ
 45KdeXmReS5zKeJ9kSna1ifAwT7C7ag2EjHXUaDIhnqSKoClCuevv8cXb+0sk88X
 T3fifR0JKCrVcn2z8QTAVUJU9mq7+fTsj7HdwV/yuhonpiu61p1NlgThLQmamA9N
 Cm2yam/VdND6w7i5ZbR4pxDo7YAWg48WRZ5YOYPcHsXhqv6sxdrgxyX0ufK8CJaz
 ZDxcFKpG8LTjpKFhz/A5fDp06CdYWngCjE4+3ATFhNTo04pSNJ/3warPR+4n0Lb8
 KEMe+lgrwX9nQY65XPFFO5S/K1upzOmFpJQ6R6+NYJUS94IE/5pNUwSCa/kX07x4
 nniUUq6WkOR276K427iZVChUp2yc4bGTMJrZHd4Rs9OqD6IF7qhJM2G1YGrgPup6
 8pb9tB/5Tw==
 =v5pS
 -----END PGP SIGNATURE-----

Merge tag 'block-5.10-2020-11-20' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request from Christoph:
      - Doorbell Buffer freeing fix (Minwoo Im)
      - CSE log leak fix (Keith Busch)

 - blk-cgroup hd_struct leak fix (Christoph)

 - Flush request state fix (Ming)

 - dasd NULL deref fix (Stefan)

* tag 'block-5.10-2020-11-20' of git://git.kernel.dk/linux-block:
  s390/dasd: fix null pointer dereference for ERP requests
  blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats
  nvme: fix memory leak freeing command effects
  nvme: directly cache command effects log
  nvme: free sq/cq dbbuf pointers when dbbuf set fails
  block: mark flush request as IDLE when it is really finished
2020-11-20 12:03:40 -08:00
Linus Torvalds
fa5fca78bb io_uring-5.10-2020-11-20
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+4DAwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphdOD/9xOEnYPuekvVH9G9nyNd//Q9fPArG2+j6V
 /MCnze07GNtDt7z15oR+T07hKXmf+Ejh4nu3JJ6MUNfe/47hhJqHSxRHU6+PJCjk
 hPrsaTsDedxxLEDiLmvhXnUPzfVzJtefxVAAaKikWOb3SBqLdh7xTFSlor1HbRBl
 Zk4d343cjBDYfvSSt/zMWDzwwvramdz7rJnnPMKXITu64ITL5314vuK2YVZmBOet
 YujSah7J8FL1jKhiG1Iw5rayd2Q3smnHWIEQ+lvW6WiTvMJMLOxif2xNF4/VEZs1
 CBGJUQt42LI6QGEzRBHohcefZFuPGoxnduSzHCOIhh7d6+k+y9mZfsPGohr3g9Ov
 NotXpVonnA7GbRqzo1+IfBRve7iRONdZ3/LBwyRmqav4I4jX68wXBNH5IDpVR0Sn
 c31avxa/ZL7iLIBx32enp0/r3mqNTQotEleSLUdyJQXAZTyG2INRhjLLXTqSQ5BX
 oVp0fZzKCwsr6HCPZpXZ/f2G7dhzuF0ghoceC02GsOVooni22gdVnQj+AWNus398
 e+wcimT4MX6AHNFxO2aUtJow0KWWZRzC1p5Mxu/9W3YiMtJiC0YOGePfSqiTqX0g
 Uk0H5dOAgBUQrAsusf7bKr0K6W25yEk/JipxhWqi0rC71x42mLTsCT1wxSCvLwqs
 WxhdtVKroQ==
 =7PAe
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.10-2020-11-20' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "Mostly regression or stable fodder:

   - Disallow async path resolution of /proc/self

   - Tighten constraints for segmented async buffered reads

   - Fix double completion for a retry error case

   - Fix for fixed file life times (Pavel)"

* tag 'io_uring-5.10-2020-11-20' of git://git.kernel.dk/linux-block:
  io_uring: order refnode recycling
  io_uring: get an active ref_node from files_data
  io_uring: don't double complete failed reissue request
  mm: never attempt async page lock if we've transferred data already
  io_uring: handle -EOPNOTSUPP on path resolution
  proc: don't allow async path resolution of /proc/self components
2020-11-20 11:47:22 -08:00
Kees Cook
7ef95e3dbc Merge branch 'for-linus/seccomp' into for-next/seccomp 2020-11-20 11:39:39 -08:00
Jann Horn
fab686eb03 seccomp: Remove bogus __user annotations
Buffers that are passed to read_actions_logged() and write_actions_logged()
are in kernel memory; the sysctl core takes care of copying from/to
userspace.

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Reviewed-by: Tyler Hicks <code@tyhicks.com>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201120170545.1419332-1-jannh@google.com
2020-11-20 11:39:21 -08:00
Song Liu
91b2db27d3 bpf: Simplify task_file_seq_get_next()
Simplify task_file_seq_get_next() by removing two in/out arguments: task
and fstruct. Use info->task and info->files instead.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201120002833.2481110-1-songliubraving@fb.com
2020-11-20 20:36:34 +01:00
Kevin Hilman
5bc0d7561a Amlogic fixes for v5.10-rc1
- misc DT only fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEe4dGDhaSf6n1v/EMWTcYmtP7xmUFAl+XNX4ACgkQWTcYmtP7
 xmVoHw/9FkmZXDbAfr8vZzD+4OGAmMABwzTDzbn2/1eBBXi4yKJLYnX7IwDYpdiG
 qL7sHBjauCCJXAuDvnmPyTaXBVmHlyrwnsjs3zDGCJ93GzlzewCmJCsam3KCw9E7
 CFKJEYZmijcBLHbm98Ia98dUj2NeYnmO7kZP402C/gKHZusC2A6q9PPLVsTJ3asS
 hRZqN2DfSGytkWHBIlwaEg8j67nWKwY2dqXVF3nx5+PXakS4KbNWeVXPy2/2xE1Q
 W18AsHvDp/3S1/PcxCPiIZeDkOgOdS7eBQruFFm4oZtAMGRrIhg6YH6qIuHkuRqY
 qXc7ztoZaLmulsCgdSUqPX0RDztQLrXUuvP8xnbY/NmMtAhxpmAOpgEvqEEiBOq8
 Z1t1RQetrIO3ktohqq+UrSRgyzcoGQgy+ghFH6g+Wd6gqixc9/tdFSvKtGNf9CsB
 YajmrsBWHmZKeuEBASgNUgBVIrMM2uAtcOwGbzfuGLDqkA077MJW/iLfNTt7t6Lx
 zE5hmp06NV/P5Uiv9BU/sbSfEwUVPou+XFZ/0IZ4G6pLgB8wkL53gZm3QoghQBcZ
 9tT0I/RA1s2hOVYevXbSzAKOquD9FWTSHXGiy58sL7zP9murxbUkeA/2nOmR0ZM5
 iD+G9+rXJKdNl+k4Il51Xbm1UliuCln/VT9BZj6A3cv6Wk71dUo=
 =Z+jV
 -----END PGP SIGNATURE-----

Merge tag 'amlogic-fixes' into v5.11/dt64

Amlogic fixes for v5.10-rc1
- misc DT only fixes
2020-11-20 11:34:10 -08:00
Gustavo A. R. Silva
3371c6f9f4
ASoC: codecs: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break statements instead of just
letting the code fall through, and also add fallthrough pseudo-keywords
in places where the code is intended to fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/d17b4d8300dbb6aff0d055b06b487c96ca264757.1605896059.git.gustavoars@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-11-20 19:28:34 +00:00
YiFei Zhu
0d8315dddd seccomp/cache: Report cache data through /proc/pid/seccomp_cache
Currently the kernel does not provide an infrastructure to translate
architecture numbers to a human-readable name. Translating syscall
numbers to syscall names is possible through FTRACE_SYSCALL
infrastructure but it does not provide support for compat syscalls.

This will create a file for each PID as /proc/pid/seccomp_cache.
The file will be empty when no seccomp filters are loaded, or be
in the format of:
<arch name> <decimal syscall number> <ALLOW | FILTER>
where ALLOW means the cache is guaranteed to allow the syscall,
and filter means the cache will pass the syscall to the BPF filter.

For the docker default profile on x86_64 it looks like:
x86_64 0 ALLOW
x86_64 1 ALLOW
x86_64 2 ALLOW
x86_64 3 ALLOW
[...]
x86_64 132 ALLOW
x86_64 133 ALLOW
x86_64 134 FILTER
x86_64 135 FILTER
x86_64 136 FILTER
x86_64 137 ALLOW
x86_64 138 ALLOW
x86_64 139 FILTER
x86_64 140 ALLOW
x86_64 141 ALLOW
[...]

This file is guarded by CONFIG_SECCOMP_CACHE_DEBUG with a default
of N because I think certain users of seccomp might not want the
application to know which syscalls are definitely usable. For
the same reason, it is also guarded by CAP_SYS_ADMIN.

Suggested-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/lkml/CAG48ez3Ofqp4crXGksLmZY6=fGrF_tWyUCg7PBkAetvbbOPeOA@mail.gmail.com/
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/94e663fa53136f5a11f432c661794d1ee7060779.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:35 -08:00
YiFei Zhu
445247b023 xtensa: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for xtensa.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/79669648ba167d668ea6ffb4884250abcd5ed254.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:35 -08:00
YiFei Zhu
4c18bc054b sh: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for sh.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/61ae084cd4783b9b50860d9dedb4a348cf1b7b6f.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:35 -08:00
YiFei Zhu
c09058eda2 s390: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for s390.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/a381b10aa2c5b1e583642f3cd46ced842d9d4ce5.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:35 -08:00
YiFei Zhu
673a11a7e4 riscv: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for riscv.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/58ef925d00505cbb77478fa6bd2b48ab2d902460.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:35 -08:00
YiFei Zhu
e7bcb4622d powerpc: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for powerpc.

__LITTLE_ENDIAN__ is used here instead of CONFIG_CPU_LITTLE_ENDIAN
to keep it consistent with asm/syscall.h.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/0b64925362671cdaa26d01bfe50b3ba5e164adfd.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:35 -08:00
YiFei Zhu
6aa7923c87 parisc: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for parisc.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/9bb86c546eda753adf5270425e7353202dbce87c.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
YiFei Zhu
6e9ae6f988 csky: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for csky.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/f9219026d4803b22f3e57e3768b4e42e004ef236.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
Kees Cook
424c9102fa arm: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for arm.

Signed-off-by: Kees Cook <keescook@chromium.org>
2020-11-20 11:16:34 -08:00
Kees Cook
ffde703470 arm64: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for arm64.

Signed-off-by: Kees Cook <keescook@chromium.org>
2020-11-20 11:16:34 -08:00
Kees Cook
192cf32243 selftests/seccomp: Compare bitmap vs filter overhead
As part of the seccomp benchmarking, include the expectations with
regard to the timing behavior of the constant action bitmaps, and report
inconsistencies better.

Example output with constant action bitmaps on x86:

$ sudo ./seccomp_benchmark 100000000
Current BPF sysctl settings:
net.core.bpf_jit_enable = 1
net.core.bpf_jit_harden = 0
Benchmarking 200000000 syscalls...
129.359381409 - 0.008724424 = 129350656985 (129.4s)
getpid native: 646 ns
264.385890006 - 129.360453229 = 135025436777 (135.0s)
getpid RET_ALLOW 1 filter (bitmap): 675 ns
399.400511893 - 264.387045901 = 135013465992 (135.0s)
getpid RET_ALLOW 2 filters (bitmap): 675 ns
545.872866260 - 399.401718327 = 146471147933 (146.5s)
getpid RET_ALLOW 3 filters (full): 732 ns
696.337101319 - 545.874097681 = 150463003638 (150.5s)
getpid RET_ALLOW 4 filters (full): 752 ns
Estimated total seccomp overhead for 1 bitmapped filter: 29 ns
Estimated total seccomp overhead for 2 bitmapped filters: 29 ns
Estimated total seccomp overhead for 3 full filters: 86 ns
Estimated total seccomp overhead for 4 full filters: 106 ns
Estimated seccomp entry overhead: 29 ns
Estimated seccomp per-filter overhead (last 2 diff): 20 ns
Estimated seccomp per-filter overhead (filters / 4): 19 ns
Expectations:
	native ≤ 1 bitmap (646 ≤ 675): ✔️
	native ≤ 1 filter (646 ≤ 732): ✔️
	per-filter (last 2 diff) ≈ per-filter (filters / 4) (20 ≈ 19): ✔️
	1 bitmapped ≈ 2 bitmapped (29 ≈ 29): ✔️
	entry ≈ 1 bitmapped (29 ≈ 29): ✔️
	entry ≈ 2 bitmapped (29 ≈ 29): ✔️
	native + entry + (per filter * 4) ≈ 4 filters total (755 ≈ 752): ✔️

[YiFei: Changed commit message to show stats for this patch series]
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/1b61df3db85c5f7f1b9202722c45e7b39df73ef2.1602431034.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
Kees Cook
25db91209a x86: Enable seccomp architecture tracking
Provide seccomp internals with the details to calculate which syscall
table the running kernel is expecting to deal with. This allows for
efficient architecture pinning and paves the way for constant-action
bitmaps.

Co-developed-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/da58c3733d95c4f2115dd94225dfbe2573ba4d87.1602431034.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
YiFei Zhu
8e01b51a31 seccomp/cache: Add "emulator" to check if filter is constant allow
SECCOMP_CACHE will only operate on syscalls that do not access
any syscall arguments or instruction pointer. To facilitate
this we need a static analyser to know whether a filter will
return allow regardless of syscall arguments for a given
architecture number / syscall number pair. This is implemented
here with a pseudo-emulator, and stored in a per-filter bitmap.

In order to build this bitmap at filter attach time, each filter is
emulated for every syscall (under each possible architecture), and
checked for any accesses of struct seccomp_data that are not the "arch"
nor "nr" (syscall) members. If only "arch" and "nr" are examined, and
the program returns allow, then we can be sure that the filter must
return allow independent from syscall arguments.

Nearly all seccomp filters are built from these cBPF instructions:

BPF_LD  | BPF_W    | BPF_ABS
BPF_JMP | BPF_JEQ  | BPF_K
BPF_JMP | BPF_JGE  | BPF_K
BPF_JMP | BPF_JGT  | BPF_K
BPF_JMP | BPF_JSET | BPF_K
BPF_JMP | BPF_JA
BPF_RET | BPF_K
BPF_ALU | BPF_AND  | BPF_K

Each of these instructions are emulated. Any weirdness or loading
from a syscall argument will cause the emulator to bail.

The emulation is also halted if it reaches a return. In that case,
if it returns an SECCOMP_RET_ALLOW, the syscall is marked as good.

Emulator structure and comments are from Kees [1] and Jann [2].

Emulation is done at attach time. If a filter depends on more
filters, and if the dependee does not guarantee to allow the
syscall, then we skip the emulation of this syscall.

[1] https://lore.kernel.org/lkml/20200923232923.3142503-5-keescook@chromium.org/
[2] https://lore.kernel.org/lkml/CAG48ez1p=dR_2ikKq=xVxkoGg0fYpTBpkhJSv1w-6BG=76PAvw@mail.gmail.com/

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Reviewed-by: Jann Horn <jannh@google.com>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/71c7be2db5ee08905f41c3be5c1ad6e2601ce88f.1602431034.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
YiFei Zhu
f9d480b6ff seccomp/cache: Lookup syscall allowlist bitmap for fast path
The overhead of running Seccomp filters has been part of some past
discussions [1][2][3]. Oftentimes, the filters have a large number
of instructions that check syscall numbers one by one and jump based
on that. Some users chain BPF filters which further enlarge the
overhead. A recent work [6] comprehensively measures the Seccomp
overhead and shows that the overhead is non-negligible and has a
non-trivial impact on application performance.

We observed some common filters, such as docker's [4] or
systemd's [5], will make most decisions based only on the syscall
numbers, and as past discussions considered, a bitmap where each bit
represents a syscall makes most sense for these filters.

The fast (common) path for seccomp should be that the filter permits
the syscall to pass through, and failing seccomp is expected to be
an exceptional case; it is not expected for userspace to call a
denylisted syscall over and over.

When it can be concluded that an allow must occur for the given
architecture and syscall pair (this determination is introduced in
the next commit), seccomp will immediately allow the syscall,
bypassing further BPF execution.

Each architecture number has its own bitmap. The architecture
number in seccomp_data is checked against the defined architecture
number constant before proceeding to test the bit against the
bitmap with the syscall number as the index of the bit in the
bitmap, and if the bit is set, seccomp returns allow. The bitmaps
are all clear in this patch and will be initialized in the next
commit.

When only one architecture exists, the check against architecture
number is skipped, suggested by Kees Cook [7].

[1] https://lore.kernel.org/linux-security-module/c22a6c3cefc2412cad00ae14c1371711@huawei.com/T/
[2] https://lore.kernel.org/lkml/202005181120.971232B7B@keescook/T/
[3] https://github.com/seccomp/libseccomp/issues/116
[4] ae0ef82b90/profiles/seccomp/default.json
[5] 6743a1caf4/src/shared/seccomp-util.c (L270)
[6] Draco: Architectural and Operating System Support for System Call Security
    https://tianyin.github.io/pub/draco.pdf, MICRO-53, Oct. 2020
[7] https://lore.kernel.org/bpf/202010091614.8BB0EB64@keescook/

Co-developed-by: Dimitrios Skarlatos <dskarlat@cs.cmu.edu>
Signed-off-by: Dimitrios Skarlatos <dskarlat@cs.cmu.edu>
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/10f91a367ec4fcdea7fc3f086de3f5f13a4a7436.1602431034.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
Colin Ian King
7648398017 octeontx2-af: Fix access of iter->entry after iter object has been kfree'd
The call to pc_delete_flow can kfree the iter object, so the following
dev_err message that accesses iter->entry can accessmemory that has
just been kfree'd.  Fix this by adding a temporary variable 'entry'
that has a copy of iter->entry and also use this when indexing into
the array mcam->entry2target_pffunc[]. Also print the unsigned value
using the %u format specifier rather than %d.

Addresses-Coverity: ("Read from pointer after free")
Fixes: 55307fcb92 ("octeontx2-af: Add mbox messages to install and delete MCAM rules")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201118143803.463297-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:07:57 -08:00
Colin Ian King
dd6028a3cb octeontx2-af: Fix return of uninitialized variable err
Currently the variable err may be uninitialized if several of the if
statements are not executed in function nix_tx_vtag_decfg and a garbage
value in err is returned.  Fix this by initialized ret at the start of
the function.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 9a946def26 ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201118132502.461098-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:06:10 -08:00
Colin Ian King
583b273dea octeontx2-pf: Fix unintentional sign extension issue
The shifting of the u16 result from ntohs(proto) by 16 bits to the
left will be promoted to a 32 bit signed int and then sign-extended
to a u64.  In the event that the top bit of the return from ntohs(proto)
is set then all then all the upper 32 bits of a 64 bit long end up as
also being set because of the sign-extension. Fix this by casting to
a u64 long before the shift.

Addresses-Coverity: ("Unintended sign extension")
Fixes: f0c2982aaf ("octeontx2-pf: Add support for SR-IOV management function")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201118130520.460365-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:05:16 -08:00
Raju Rangoju
bff453921a cxgb4: fix the panic caused by non smac rewrite
SMT entry is allocated only when loopback Source MAC
rewriting is requested. Accessing SMT entry for non
smac rewrite cases results in kernel panic.

Fix the panic caused by non smac rewrite

Fixes: 937d842058 ("cxgb4: set up filter action after rewrites")
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Link: https://lore.kernel.org/r/20201118143213.13319-1-rajur@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:03:37 -08:00
Kees Cook
4c222f31fb selftests/seccomp: sh: Fix register names
It looks like the seccomp selftests was never actually built for sh.
This fixes it, though I don't have an environment to do a runtime test
of it yet.

Fixes: 0bb605c2c7 ("sh: Add SECCOMP_FILTER")
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/lkml/a36d7b48-6598-1642-e403-0c77a86f416d@physik.fu-berlin.de
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-11-20 11:03:08 -08:00
Kees Cook
f5098e34dd selftests/seccomp: powerpc: Fix typo in macro variable name
A typo sneaked into the powerpc selftest. Fix the name so it builds again.

Fixes: 46138329fa ("selftests/seccomp: powerpc: Fix seccomp return value testing")
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/87y2ix2895.fsf@mpe.ellerman.id.au
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-11-20 11:02:28 -08:00
Jakub Kicinski
b5fb0b1bbb Merge branch 'add-support-for-marvell-octeontx2-cryptographic'
Srujana Challa says:

====================
Add Support for Marvell OcteonTX2 Cryptographic

This patchset adds support for CPT in OcteonTX2 admin function(AF).
CPT is a cryptographic accelerator unit and it includes microcoded
Giga Cipher engines.

OcteonTX2 SOC's resource virtualization unit (RVU) supports multiple
physical and virtual functions. Each of the PF/VF's functionality is
determined by what kind of resources are attached to it. When the CPT
block is attached to a VF, it can function as a security device.
The following document provides an overview of the hardware and
different drivers for the OcteonTX2 SOC:
https://www.kernel.org/doc/Documentation/networking/device_drivers/marvell/octeontx2.rst

This patch series includes:
- Patch to update existing Marvell sources to support CPT.
- Patch that adds mailbox messages to the admin function (AF) driver,
to configure CPT HW registers.
- Patch to provide debug information about CPT.
====================

Link: https://lore.kernel.org/r/20201118114416.28307-1-schalla@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:56 -08:00
Srujana Challa
76638a2e58 octeontx2-af: add debugfs entries for CPT block
Add entries to debugfs at /sys/kernel/debug/octeontx2/cpt.

cpt_pc: dump cpt performance HW registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_pc

cpt_ae_sts: show cpt asymmetric engines current state
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_ae_sts

cpt_se_sts: show cpt symmetric engines current state
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_se_sts

cpt_engines_info: dump cpt engine control registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_engines_info

cpt_lfs_info: dump cpt lfs control registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_lfs_info

cpt_err_info: dump cpt error registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_err_info

Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:13 -08:00
Srujana Challa
ae454086e3 octeontx2-af: add mailbox interface for CPT
On OcteonTX2 SoC, the admin function (AF) is the only one with all
priviliges to configure HW and alloc resources, PFs and it's VFs
have to request AF via mailbox for all their needs. This patch adds
a mailbox interface for CPT PFs and VFs to allocate resources
for cryptography. It also adds hardware CPT AF register defines.

Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:13 -08:00
Srujana Challa
956fb85218 octeontx2-pf: move lmt flush to include/linux/soc
On OcteonTX2 platform CPT instruction enqueue and NIX
packet send are only possible via LMTST operations which
uses LDEOR instruction. This patch moves lmt flush
function from OcteonTX2 nic driver to include/linux/soc
since it will be used by OcteonTX2 CPT and NIC driver for
LMTST.

Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:13 -08:00
Tariq Toukan
1a0058cf0c net/mlx4_en: Remove unused performance counters
Performance analysis counters are maintained under the MLX4_EN_PERF_STAT
definition, which is never set.
Clean them up, with all related structures and logic.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Link: https://lore.kernel.org/r/20201118103427.4314-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 10:59:39 -08:00
Eric Biggers
47a846536e block/keyslot-manager: prevent crash when num_slots=1
If there is only one keyslot, then blk_ksm_init() computes
slot_hashtable_size=1 and log_slot_ht_size=0.  This causes
blk_ksm_find_keyslot() to crash later because it uses
hash_ptr(key, log_slot_ht_size) to find the hash bucket containing the
key, and hash_ptr() doesn't support the bits == 0 case.

Fix this by making the hash table always have at least 2 buckets.

Tested by running:

    kvm-xfstests -c ext4 -g encrypt -m inlinecrypt \
                 -o blk-crypto-fallback.num_keyslots=1

Fixes: 1b26283970 ("block: Keyslot Manager for Inline Encryption")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-20 11:52:52 -07:00
Lakshmi Ramasubramanian
dea87d0889 ima: select ima-buf template for buffer measurement
The default IMA template used for all policy rules is the value set
for CONFIG_IMA_DEFAULT_TEMPLATE if the policy rule does not specify
a template. The default IMA template for buffer measurements should be
'ima-buf' - so that the measured buffer is correctly included in the IMA
measurement log entry.

With the default template format, buffer measurements are added to
the measurement list, but do not include the buffer data, making it
difficult, if not impossible, to validate. Including 'ima-buf'
template records in the measurement list by default, should not impact
existing attestation servers without 'ima-buf' template support.

Initialize a global 'ima-buf' template and select that template,
by default, for buffer measurements.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-11-20 13:52:43 -05:00
Gustavo Pimentel
341917490d PCI: Decode PCIe 64 GT/s link speed
PCIe r6.0, sec 7.5.3.18, defines a new 64.0 GT/s bit in the Supported Link
Speeds Vector of Link Capabilities 2.

This patch does not affect the speed of the link, which should be
negotiated automatically by the hardware; it only adds decoding when
showing the speed to the user.

Decode this new speed.  Previously, reading the speed of a link operating
at this speed showed "Unknown speed" instead of "64.0 GT/s".

Link: https://lore.kernel.org/r/aaaab33fe18975e123a84aebce2adb85f44e2bbe.1605739760.git.gustavo.pimentel@synopsys.com
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
2020-11-20 12:35:27 -06:00
Linus Torvalds
4ccf7a01e8 xen: branch for v5.10-rc5
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX7dYpAAKCRCAXGG7T9hj
 vqRpAQDiOASJZyPTwbB3kW7EPIyMz5kZgeJUyX4BEBUOR0g3CAEA/Qz9ZE9h9h3l
 YIXs9LDGQJ0pOl04wOzjte57AjzGoAM=
 =B9Bu
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.10b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A single fix for avoiding WARN splats when booting a Xen guest with
  nosmt"

* tag 'for-linus-5.10b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: don't unbind uninitialized lock_kicker_irq
2020-11-20 10:30:48 -08:00
Vadim Fedorenko
20ffc7adf5 net/tls: missing received data after fast remote close
In case when tcp socket received FIN after some data and the
parser haven't started before reading data caller will receive
an empty buffer. This behavior differs from plain TCP socket and
leads to special treating in user-space.
The flow that triggers the race is simple. Server sends small
amount of data right after the connection is configured to use TLS
and closes the connection. In this case receiver sees TLS Handshake
data, configures TLS socket right after Change Cipher Spec record.
While the configuration is in process, TCP socket receives small
Application Data record, Encrypted Alert record and FIN packet. So
the TCP socket changes sk_shutdown to RCV_SHUTDOWN and sk_flag with
SK_DONE bit set. The received data is not parsed upon arrival and is
never sent to user-space.

Patch unpauses parser directly if we have unparsed data in tcp
receive queue.

Fixes: fcf4793e27 ("tls: check RCV_SHUTDOWN in tls_wait_data")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/1605801588-12236-1-git-send-email-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 10:25:26 -08:00
Linus Torvalds
bd4d74e8f8 dmaengine fixes for v5.10-rc5
Core:
 *) channel_register error handling
 
 Driver fixes for:
 *) idxd: wq config registers programming and mapping of portal size
 *) ioatdma: unused fn removal
 *) pl330: fix burst size
 *) ti: pm fix on busy and -Wenum-conversion warns
 *) xilinx: SG capability check, usage of xilinx_aximcdma_tx_segment,
 readl_poll_timeout_atomic variant
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAl+32mIACgkQfBQHDyUj
 g0ddOw/9GWhL1vkW+0JTeVSNDVT/GajwXONLbi3brSStlKViG0NrusmIEadNOgAR
 QCj//oww7rkT5jK9KVy577y2Pbns+jZYJYsgUkSQxjaXLztvdMn9ujlP2cpcdo6A
 JZ0+YvMastIAUO2XWVxON7nhhVDcOIUdRlEXR/j0Dllk5NIQInNDyEHi76JQgUfb
 5UV8wZSBO1QbgXogqE8KDnkOsiZEhUfl2Ah0wdYkPtr90GyrBQg8qocR7sfH8idk
 1cJ5bA0UeQX5fpHhIM2dEwtc/115QXJIDOP8u6xMN3StHp9ce+/ghYsBONdPYUN9
 NaRVxs2fyHxp8kx5qz76xZoyIHIZq1Tfyx2oYhTEKUmMWGJYXA4tUSJ071+Wq/eg
 fFwd6u557b1TpUJoU6HmOEQlAYIUXbDMO62pgZj1T/hn6BOuj7SC4v7CzDcucykv
 96Cgj/B6ArFaQEmni4R6XhBCD+vD1Vv/CrayOeBZ0VoZAZLTyH/TZXDKSWQzu+Hl
 KVMHeqR6O/DIUOFFWrz6cYaVynSQecbk5mQwlkWD2G6HzRzBJT2FcBAvNUS/4z6E
 8ie2EQBatjHzsJXWZNUVN2XPUzrJFq+gfY/TCh+2ZNVNvcI5Z6pBfpr0DQGMScbQ
 sf4RiEo7mQ608D04nvD6Sco6QBHdEJYSc86v5qfeTykN5dfilwU=
 =V+mh
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
 "A solitary core fix and a few driver fixes:

  Core:

   - channel_register error handling

  Driver fixes:

   - idxd: wq config registers programming and mapping of portal size

   - ioatdma: unused fn removal

   - pl330: fix burst size

   - ti: pm fix on busy and -Wenum-conversion warns

   - xilinx: SG capability check, usage of xilinx_aximcdma_tx_segment,
     readl_poll_timeout_atomic variant"

* tag 'dmaengine-fix-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: fix error codes in channel_register()
  dmaengine: pl330: _prep_dma_memcpy: Fix wrong burst size
  dmaengine: ioatdma: remove unused function missed during dma_v2 removal
  dmaengine: idxd: fix mapping of portal size
  dmaengine: ti: omap-dma: Block PM if SDMA is busy to fix audio
  dmaengine: xilinx_dma: Fix SG capability check for MCDMA
  dmaengine: xilinx_dma: Fix usage of xilinx_aximcdma_tx_segment
  dmaengine: xilinx_dma: use readl_poll_timeout_atomic variant
  dmaengine: ti: k3-udma: fix -Wenum-conversion warning
  dmaengine: idxd: fix wq config registers offset programming
2020-11-20 10:23:49 -08:00
Linus Torvalds
fc8299f9f3 iommu fixes for -rc5
- Fix boot when intel iommu initialisation fails under TXT (tboot)
 
 - Fix intel iommu compilation error when DMAR is enabled without ATS
 
 - Temporarily update IOMMU MAINTAINERs entry
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+3p6gQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNPowCADBKq5PBRwqIM1tmntRu5rePf/WuB1BotlJ
 cUUR8dIxFDvKMeGvN4mDbn/5jjL1+TlD7sQqFh+gcWJ9tImpAMzp1ENbg71isl5o
 Ej21X9YhjfiIsKpTrNGxcetCkYaR2tp8Z4/WzSQaO+IGB578dQ9VwiwSnDyFdWUb
 1Qcj712ainYvKjbq4NyL80lPm9v43OV8QZ73WeelzBjE2UcDFAV0Xl4BIX8uuGtv
 I1x03JfgzmzhVBmpj5HUGPsSopMDdo5K4vlcqscqSqvBY90iHD5fgRF4ZatTjR1t
 PVM/1+c9vdJIWSxSt1hsB8DPtqXVOvGjzrLW/h63rxyWsISwkAq0
 =DNX/
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull iommu fixes from Will Deacon:
 "Two straightforward vt-d fixes:

   - Fix boot when intel iommu initialisation fails under TXT (tboot)

   - Fix intel iommu compilation error when DMAR is enabled without ATS

  and temporarily update IOMMU MAINTAINERs entry"

* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  MAINTAINERS: Temporarily add myself to the IOMMU entry
  iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set
  iommu/vt-d: Avoid panic if iommu init fails in tboot system
2020-11-20 10:20:16 -08:00
Heiko Carstens
334ef6ed06 init/Kconfig: make COMPILE_TEST depend on !S390
While allmodconfig and allyesconfig build for s390 there are also
various bots running compile tests with randconfig, where PCI is
disabled. This reveals that a lot of drivers should actually depend on
HAS_IOMEM.
Adding this to each device driver would be a never ending story,
therefore just disable COMPILE_TEST for s390.

The reasoning is more or less the same as described in
commit bc083a64b6 ("init/Kconfig: make COMPILE_TEST depend on !UML").

Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:12 +01:00
Alexander Gordeev
12bb4c6823 s390/vmem: make variable and function names consistent
Rename some variable and functions to better clarify
what they are and what they do.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:12 +01:00
Alexander Gordeev
af71657c15 s390/vmem: remove redundant check
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:12 +01:00
Julian Wiedmann
074ff04e27 s390/stp: let subsys_system_register() sysfs attributes
Instead of creating the sysfs attributes for the stp root_dev by hand,
pass them to subsys_system_register() as parameter.

This also ensures that the attributes are available when the KOBJ_ADD
event is raised.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
ba1a6be994 s390/decompressor: print cmdline and BEAR on pgm_check
Add kernel command line and last breaking event.
The kernel command line is taken from early_command_line and printed
only if kernel is not running as protected virtualization guest and
if it has been already initialized from the "COMMAND_LINE".

Linux version 5.10.0-rc3-22794-gecaa72788df0-dirty (gor@tuxmaker) #28 SMP PREEMPT Mon Nov 9 17:41:20 CET 2020
Kernel command line: audit_enable=0 audit=0 selinux=0 crashkernel=296M root=/dev/dasda1 dasd=ec5b
memblock=debug die
Kernel fault: interruption code 0005 ilc:2
PSW : 0000000180000000 0000000000012f92 (parse_boot_command_line+0x27a/0x46c)
      R:0 T:0 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 RI:0 EA:3
GPRS: 0000000000000000 00ffffffffffffff 0000000000000000 000000000001a65c
      000000000000bf60 0000000000000000 00000000000003c0 0000000000000000
      0000000000000080 000000000002322d 000000007f29ef20 0000000000efd018
      000000000311c000 0000000000010070 0000000000012f82 000000000000bea8
Call Trace:
(sp:000000000000bea8 [<000000000002016e>] 000000000002016e)
 sp:000000000000bf18 [<0000000000012408>] startup_kernel+0x88/0x2fc
 sp:000000000000bf60 [<00000000000100c4>] startup_normal+0xb0/0xb0
Last Breaking-Event-Address:
 [<00000000000135ba>] strcmp+0x22/0x24

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
8977ab65b8 s390/decompressor: add stacktrace support
Decompressor works on a single statically allocated stack. Stacktrace
implementation with -mbackchain just takes few lines of code.

Linux version 5.10.0-rc3-22793-g0f84a355b776-dirty (gor@tuxmaker) #27 SMP PREEMPT Mon Nov 9 17:30:18 CET 2020
Kernel fault: interruption code 0005 ilc:2
PSW : 0000000180000000 0000000000012f92 (parse_boot_command_line+0x27a/0x46c)
      R:0 T:0 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 RI:0 EA:3
GPRS: 0000000000000000 00ffffffffffffff 0000000000000000 000000000001a62c
      000000000000bf60 0000000000000000 00000000000003c0 0000000000000000
      0000000000000080 000000000002322d 000000007f29ef20 0000000000efd018
      000000000311c000 0000000000010070 0000000000012f82 000000000000bea8
Call Trace:
(sp:000000000000bea8 [<000000000002016e>] 000000000002016e)
 sp:000000000000bf18 [<0000000000012408>] startup_kernel+0x88/0x2fc
 sp:000000000000bf60 [<00000000000100c4>] startup_normal+0xb0/0xb0

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
246218962e s390/decompressor: add symbols support
Information printed by print_pgm_check_info() is crucial for
debugging decompressor problems. Printing instruction addresses is
better than nothing, but turns further debugging into tedious job of
figuring out which function those addresses correspond to.

This change adds simplistic symbols resolution support. And adds %pS
format specifier support to decompressor_printk().

Decompressor symbols list is extracted and sorted with
nm -n -S:
...
0000000000010000 0000000000000014 T startup
0000000000010014 00000000000000b0 t startup_normal
0000000000010180 00000000000000b2 t startup_kdump
...

Then functions are filtered and contracted to a form:
"10000 14 startup\0""10014 b0 startup_normal\0""10180 b2 startup_kdump\0"
...
Which makes it trivial to find beginning of an entry and names are 0
terminated, so could be used as is. Symbols are binary-searched.

To get symbols list with final addresses and then get it into the
decompressor's image the same trick as for kallsyms is used.
Decompressor's vmlinux is linked twice.

Symbols are stored in .decompressor.syms section, current size is about
2kb.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
ec55d1e1db s390/decompressor: correct some asm symbols annotations
Use SYM_CODE_* annotations for asm functions, so that function lengths
are recognized correctly.

Also currently the most part of startup is marked as startup_kdump. Move
misplaced startup_kdump where it belongs.

$ nm -n -S arch/s390/boot/compressed/vmlinux
Before:
0000000000010000 T startup
0000000000010010 T startup_kdump
After:
0000000000010000 0000000000000014 T startup
0000000000010014 00000000000000b0 t startup_normal
0000000000010180 00000000000000b2 t startup_kdump

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
9a78c70a1b s390/decompressor: add decompressor_printk
The decompressor does not have any special debug means. Running the
kernel under qemu with gdb is helpful but tedious exercise if done
repeatedly. It is also not applicable to debugging under LPAR and z/VM.

One special thing which stands out is a working sclp_early_printk,
which could be used once the kernel switches to 64-bit addressing mode.

But sclp_early_printk does not provide any string formating capabilities.
Formatting and printing string without printk-alike function is a
not fun. The lack of printk-alike function means people would save up on
testing and introduce more bugs.

So, finally, provide decompressor_printk function, which fits on one
screen and trades features for simplicity.

It only supports "%s", "%x" and "%lx" specifiers and zero padding for
hex values.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
c9343637d6 s390/ftrace: assume -mhotpatch or -mrecord-mcount always available
Currently the kernel minimal compiler requirement is gcc 4.9 or
clang 10.0.1.
* gcc -mhotpatch option is supported since 4.8.
* A combination of -pg -mrecord-mcount -mnop-mcount -mfentry flags is
supported since gcc 9 and since clang 10.

Drop support for old -pg function prologues. Which leaves binary
compatible -mhotpatch / -mnop-mcount -mfentry prologues in a form:
	brcl	0,0
Which are also do not require initial nop optimization / conversion and
presence of _mcount symbol.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
73045a08cf s390: unify identity mapping limits handling
Currently we have to consider too many different values which
in the end only affect identity mapping size. These are:
1. max_physmem_end - end of physical memory online or standby.
   Always <= end of the last online memory block (get_mem_detect_end()).
2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the
   kernel is able to support.
3. "mem=" kernel command line option which limits physical memory usage.
4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as
   crash kernel.
5. "hsa" size which is a memory limit when the kernel is executed during
   zfcp/nvme dump.

Through out kernel startup and run we juggle all those values at once
but that does not bring any amusement, only confusion and complexity.

Unify all those values to a single one we should really care, that is
our identity mapping size.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:10 +01:00