Commit Graph

994003 Commits

Author SHA1 Message Date
Stanislav Fomichev
1e0aa3fb05 libbpf: Use AF_LOCAL instead of AF_INET in xsk.c
We have the environments where usage of AF_INET is prohibited
(cgroup/sock_create returns EPERM for AF_INET). Let's use
AF_LOCAL instead of AF_INET, it should perfectly work with SIOCETHTOOL.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20210209221826.922940-1-sdf@google.com
2021-02-12 11:36:48 -08:00
Linus Torvalds
a81bfdf8bf drm fixes for 5.11-rc8
ttm:
 - page pool regression fix.
 
 dp_mst:
 - Don't report un-attached ports as connected
 
 amdgpu:
 - Blank screen fix
 
 i915:
 - Ensure Type-C FIA is powered when initializing
 - Fix overlay frontbuffer tracking
 
 sun4i:
 - tcon1 sync polarity fix
 - Always set HDMI clock rate
 - Fix H6 HDMI PHY config
 - Fix H6 max frequency
 
 vc4:
 - Fix buffer overflow
 
 xlnx:
 - Fix memory leak
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJgJf1KAAoJEAx081l5xIa+CwYQAKj4UzfoZnjLH7174FiLsbmJ
 DzOXcLIjQkMWSmMUnTmUmUclBSTexXaIm7sUzbHNsjpIet+cjxVQo8UhatOokM48
 YrNnYRnkReiy2rLJYMB5tSPDXAh8BNuZDGXiN4AEPdxErFK0DsisDzSSYZJfg01M
 Qpt/diCJRu4T3Hr+YletfxWV+Dqq5m79zXmMfr1NlCW4D1V1WLEVW1zyb5rTFiTn
 goBis52fL2FWTe9DVFa+VVDqCeQxMxs4/gFsZoY5IMTy/w+4YI9gy4BElnyDcOoW
 kWz7rjUNl5Hy1BG3YrH9egzKktlGzqHuD6x8bp/PbQdgL8w9m/hvKghYssByHDs0
 WBhEg8gd9XviWRfIdNuFeJ4/jq6kMlmLEzZKbcBUpP7Du5/c3rUMqhHFTilz4y7y
 8kNDAG5rtL/9b8um0H2MR179qYspRiL0XhG0HL4YZuopfruhmky8gkRhYcsOHkPR
 WYFR321FItpXFTw8/gUdudeEr2KssWZ9Ob2lBO960Or2JEtWPYSk1F3G7SCt6N+L
 IalFXYRMG2OfAiNjjiVEiZNFe1dgpfy89tka5bVnVuuBEYx+DjU7ZTl3i4xAxtYG
 X8WA1de0FzuYiiEUBrrYe8s6YFBEo4Z20dlA1ghYU/rMEXNfte0RVW8DAdsKiTwd
 Bqt8tCb6CblCNPxbkeQ3
 =HSyG
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2021-02-12' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular fixes for final, there is a ttm regression fix, dp-mst fix,
  one amdgpu revert, two i915 fixes, and some misc fixes for sun4i,
  xlnx, and vc4.

  All pretty quiet and don't think we have any known outstanding
  regressions.

  ttm:
   - page pool regression fix.

  dp_mst:
   - don't report un-attached ports as connected

  amdgpu:
   - blank screen fix

  i915:
   - ensure Type-C FIA is powered when initializing
   - fix overlay frontbuffer tracking

  sun4i:
   - tcon1 sync polarity fix
   - always set HDMI clock rate
   - fix H6 HDMI PHY config
   - fix H6 max frequency

  vc4:
   - fix buffer overflow

  xlnx:
   - fix memory leak"

* tag 'drm-fixes-2021-02-12' of git://anongit.freedesktop.org/drm/drm:
  drm/ttm: make sure pool pages are cleared
  drm/sun4i: dw-hdmi: Fix max. frequency for H6
  drm/sun4i: Fix H6 HDMI PHY configuration
  drm/sun4i: dw-hdmi: always set clock rate
  drm/sun4i: tcon: set sync polarity for tcon1 channel
  drm/i915: Fix overlay frontbuffer tracking
  Revert "drm/amd/display: Update NV1x SR latency values"
  drm/i915/tgl+: Make sure TypeC FIA is powered up when initializing it
  drm/dp_mst: Don't report ports connected if nothing is attached to them
  drm/xlnx: fix kmemleak by sending vblank_event in atomic_disable
  drm/vc4: hvs: Fix buffer overflow with the dlist handling
2021-02-12 11:29:06 -08:00
Linus Torvalds
e77a6817d4 tracing: Fix buffer overflow in trace event filter
It was reported that if an trace event was larger than a page
 and was filtered, that it caused memory corruption. The reason
 is that filtered events first go into a buffer to test the filter
 before being written into the ring buffer. Unfortunately,
 this write did not check the size.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYCaSHBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqOpAQCUSlZdBxLzs87zeHgXbkMudWvCYSbA
 mndzddqtxPXlXwEAsRnO8BERyZnasEdXnJ98JJwQaFFYH0dBCA2pTU2onQc=
 =NokV
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix buffer overflow in trace event filter.

  It was reported that if an trace event was larger than a page and was
  filtered, that it caused memory corruption. The reason is that
  filtered events first go into a buffer to test the filter before being
  written into the ring buffer. Unfortunately, this write did not check
  the size"

* tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Check length before giving out the filter buffer
2021-02-12 11:16:17 -08:00
Linus Torvalds
2dbbaae5f7 xen: branch for v5.11-rc8
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYCUrYAAKCRCAXGG7T9hj
 voF2AP9x3wKo+kYAqlRQrrDBb/AoMezoJ3pFiqJ+1Lg4yK4N6AD+IQxHuIODBfgj
 bK12hr5oE6S3e7tTaVWweWzys5UmWA4=
 =zleO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.11-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A single fix for an issue introduced this development cycle: when
  running as a Xen guest on Arm systems the kernel will hang during
  boot"

* tag 'for-linus-5.11-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  arm/xen: Don't probe xenbus as part of an early initcall
2021-02-12 11:12:58 -08:00
Linus Torvalds
f951625980 A Single RISC-V Fix for 5.11-rc8 (or 5.11)
I have a single fix this week:
 
 * The removal of the GPIO reset method for the Ethernet phy on the
   HiFive Unleashed.  This returns to relying on the bootloader's phy
   reset sequence, which we'll have to cintune doing until we can sort
   out how to get the Linux phy driver to perform the special reset
   dance required for this phy.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmAmyJUTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYie6sD/9pM+HN+6xsf/oVhwLoU/99Yz6tGfcb
 uyKcd2LtmG05cuJeRgpmthyERd+uBxU0Hiy/ZhNmedD49WDKYygoP9O8zujTSGjj
 pyR0U2ZzbSeVFRWhWfs2UA38wnr0w2fy2f4Pd2OGAE6mXzBMd0dPIrStGSDlLTD2
 usFLOkK0TY8nQMNvL/UoGtHFJd+NcBxJGl2YPE5PLd050VEGZX/r+aON2KaSx/n0
 fr+i9lGTpXz3dbJuuS/RTgOBhWPWWSW6QTOOsWDBdhcoZN6R0Ednu3ECpfaFWtXY
 U0IzYQo7FYDn20dEXRb+9B5EL4r90yV1YFXJ4auMEn9KVUbIPIxe/Y+arLJPTDmi
 yDJJ6WhZiftRHY6T0ZBqpy4UPTNgyob4JG0pWo3f4HJEwoVnWYEmxHOdmaQOV778
 pESDFgmUl5Ty9s9ETnzsFw4Vdc+GVOP2lbLuYuRI0UENUsU1OYJyBbzi2LW0qMJM
 siBETN9zeo/uxgqktW+6EKWDhjhTYHbo2hUBkJqHbBPsHhhvB4OAD+S0+w/Hjo6/
 YeZx1y1DCbHCae8IklTBk4nuYHTW1KXJSaELLTita50NxvfqjPS7XgNpc7Tz10Ag
 +Zebb9F+eD0JE71jYqyuUiAzzhoFpPgZIVcUc8DSwZZBaJyTtpxOfnYblMf8PNeb
 tbRHmG4pDt+PnA==
 =Oqf2
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fix from Palmer Dabbelt:
 "A single fix this week: the removal of the GPIO reset method for the
  Ethernet phy on the HiFive Unleashed.

  This returns to relying on the bootloader's phy reset sequence, which
  we'll have to continue doing until we can sort out how to get the
  Linux phy driver to perform the special reset dance required for this
  phy"

* tag 'riscv-for-linus-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Revert "dts: phy: add GPIO number and active state used for phy reset"
2021-02-12 11:07:29 -08:00
Linus Torvalds
93908500b8 Fix PTRACE_PEEKMTETAGS access to an mmapped region before the first
write.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmAmwyAACgkQa9axLQDI
 XvGGoQ/+O0K1SmqwSCEcq1l5imBCKVAj//kkBi761uZa5JFIueNxg+alfVSLqqah
 ZKBihhtOG+5VSa/BC5qP3qjqHz/5aLB3bFe5qHY3By8Iz+RfTROQJ8Otw/n8gwy4
 FkzBGg1gTqPwDpGOa14Y8YSmU7lkW2M6FPwECK/6Ek8zle8q/8NdFbath9b7tdPx
 JWST5MRWrqK6QO7MGvCJ5H1qd1isNSiFtFCdZS6r2wKpJl0nI47X/ncQsVFrTdHr
 BKvy8Hudc+sOOt6TljMBTUg8vwo6l3Fk6W6i9f3GgMMgpSUy8zH5anmWVyqSwguF
 2Uh2CA8bJqWxdOYQq21shTApz2tz9uwMpavlvLR3mnuvXgC4SVB0B0W6WxsW0fvX
 7p+ipZAbtcuifkVjn3XM9QVi3XNtot7Fg532YugtPDvh6uCNglw/26Ix3CdQBV5Z
 C2N4hATXDcmgank1P6dZ+i+y8dpWpAa/RXFqTWXBZDtKZmqT62xtRyhoZvZWlebs
 o6n1Ni5p+cYJRvyoNILhChZE6SNd4uAKrvGeSQmyf3zLo9pxU6fWyQUF3ZTtVhrF
 sdEP4KTKDkEl/dCfPkUIvZ/F0tIAGwPczzuDIny7+xzVlDMbKTMlGCky+s5CAKvh
 khxKeY+xtinYpfVNsTvVLKi8yFf5Um/DroEz6ItlXnRwhoXtDrg=
 =HYVs
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix PTRACE_PEEKMTETAGS access to an mmapped region before the first
  write"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page
2021-02-12 11:03:30 -08:00
Pavel Begunkov
5be9ad1e42 io_uring: optimise io_init_req() flags setting
Invalid req->flags are tolerated by free/put well, avoid this dancing
needlessly presetting it to zero, and then not even resetting but
modifying it, i.e. "|=".

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12 11:49:50 -07:00
Pavel Begunkov
cdbff98223 io_uring: clean io_req_find_next() fast check
Indirectly io_req_find_next() is called for every request, optimise the
check by testing flags as it was long before -- __io_req_find_next()
tolerates false-positives well (i.e. link==NULL), and those should be
really rare.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12 11:49:49 -07:00
Pavel Begunkov
dc0eced5d9 io_uring: don't check PF_EXITING from syscall
io_sq_thread_acquire_mm_files() can find a PF_EXITING task only when
it's called from task_work context. Don't check it in all other cases,
that are when we're in io_uring_enter().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12 11:49:48 -07:00
Maciej Fijalkowski
c0d4e9d223 ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring
Output of ixgbe_rx_offset() is based on ethtool's priv flag setting, which
when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to ixgbe_ring that is meant to hold the
ixgbe_rx_offset() result and use it within ixgbe_clean_rx_irq().
Furthermore, use it within ixgbe_alloc_mapped_page().

Last but not least, un-inline the function of interest as it lives in .c
file so let compiler do the decision about the inlining.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:40:03 -08:00
Maciej Fijalkowski
f1b1f409bf ice: store the result of ice_rx_offset() onto ice_ring
Output of ice_rx_offset() is based on ethtool's priv flag setting, which
when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to ice_ring that is meant to hold the
ice_rx_offset() result and use it within ice_clean_rx_irq().
Furthermore, use it within ice_alloc_mapped_page().

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:36:57 -08:00
Maciej Fijalkowski
f7bb0d71d6 i40e: store the result of i40e_rx_offset() onto i40e_ring
Output of i40e_rx_offset() is based on ethtool's priv flag setting,
which when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to i40e_ring that is meant to hold the
i40e_rx_offset() result and use it within i40e_clean_rx_irq().
Furthermore, use it within i40e_alloc_mapped_page().

Last but not least, un-inline the function of interest so that compiler
makes the decision about inlining as it lives in .c file.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:35:13 -08:00
Björn Töpel
f892a9af0c i40e: Simplify the do-while allocation loop
Fold the count decrement into the while-statement.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:33:42 -08:00
Maciej Fijalkowski
5c57e507f2 ice: skip NULL check against XDP prog in ZC path
Whole zero-copy variant of clean Rx IRQ is executed when xsk_pool is
attached to rx_ring and it can happen only when XDP program is present
on interface. Therefore it is safe to assume that program is always
!NULL and there is no need for checking it in ice_run_xdp_zc.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:28:40 -08:00
Maciej Fijalkowski
43a925e49d ice: remove redundant checks in ice_change_mtu
dev_validate_mtu checks that mtu value specified by user is not less
than min mtu and not greater than max allowed mtu. It is being done
before calling the ndo_change_mtu exposed by driver, so remove these
redundant checks in ice_change_mtu.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:20:13 -08:00
Maciej Fijalkowski
29b82f2a09 ice: move skb pointer from rx_buf to rx_ring
Similar thing has been done in i40e, as there is no real need for having
the sk_buff pointer in each rx_buf. Non-eop frames can be simply handled
on that pointer moved upwards to rx_ring.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:18:40 -08:00
Maciej Fijalkowski
59c97d1b51 ice: simplify ice_run_xdp
There's no need for 'result' variable, we can directly return the
internal status based on action returned by xdp prog.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:14:30 -08:00
Maciej Fijalkowski
d06e2f05b4 i40e: adjust i40e_is_non_eop
i40e_is_non_eop had a leftover comment and unused skb argument which was
used for placing the skb onto rx_buf in case when current buffer was
non-eop one. This is not relevant anymore as commit e72e56597b
("i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring") pulled the
non-complete skb handling out of rx_bufs up to rx_ring.  Therefore,
let's adjust the function arguments that i40e_is_non_eop takes.

Furthermore, since there is already a function responsible for bumping
the ntc, make use of that and drop that logic from i40e_is_non_eop so
that the scope of this function is limited to what the name actually
states.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:13:15 -08:00
Maciej Fijalkowski
4a14994a92 i40e: drop misleading function comments
i40e_cleanup_headers has a statement about check against skb being
linear or not which is not relevant anymore, so let's remove it.

Same case for i40e_can_reuse_rx_page, it references things that are not
present there anymore.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:10:59 -08:00
Maciej Fijalkowski
99f097270a i40e: drop redundant check when setting xdp prog
Net core handles the case where netdev has no xdp prog attached and
current prog is NULL. Therefore, remove such check within
i40e_xdp_setup.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 09:49:32 -08:00
Krzysztof Kozlowski
e1d3209f95 MAINTAINERS: cpuidle: exynos: include header in file pattern
Include the platform data header in Exynos cpuidle maintainer entry.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12 18:05:59 +01:00
John Ogness
13791c80b0 printk: avoid prb_first_valid_seq() where possible
If message sizes average larger than expected (more than 32
characters), the data_ring will wrap before the desc_ring. Once the
data_ring wraps, it will start invalidating descriptors. These
invalid descriptors hang around until they are eventually recycled
when the desc_ring wraps. Readers do not care about invalid
descriptors, but they still need to iterate past them. If the
average message size is much larger than 32 characters, then there
will be many invalid descriptors preceding the valid descriptors.

The function prb_first_valid_seq() always begins at the oldest
descriptor and searches for the first valid descriptor. This can
be rather expensive for the above scenario. And, in fact, because
of its heavy usage in /dev/kmsg, there have been reports of long
delays and even RCU stalls.

For code that does not need to search from the oldest record,
replace prb_first_valid_seq() usage with prb_read_valid_*()
functions, which provide a start sequence number to search from.

Fixes: 896fbe20b4 ("printk: use the lockless ringbuffer")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: J. Avila <elavila@google.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210211173152.1629-1-john.ogness@linutronix.de
2021-02-12 17:54:59 +01:00
Viktor Rosendahl
e23db805da tracing/tools: Add the latency-collector to tools directory
This is a tool that is intended to work around the fact that the
preemptoff, irqsoff, and preemptirqsoff tracers only work in
overwrite mode. The idea is to act randomly in such a way that we
do not systematically lose any latencies, so that if enough testing
is done, all latencies will be captured. If the same burst of
latencies is repeated, then sooner or later we will have captured all
the latencies.

It also works with the wakeup_dl, wakeup_rt, and wakeup tracers.
However, in that case it is probably not useful to use the random
sleep functionality.

The reason why it may be desirable to catch all latencies with a long
test campaign is that for some organizations, it's necessary to test
the kernel in the field and not practical for developers to work
iteratively with field testers. Because of cost and project schedules
it is not possible to start a new test campaign every time a latency
problem has been fixed.

It uses inotify to detect changes to /sys/kernel/tracing/trace.
When a latency is detected, it will either sleep or print
immediately, depending on a function that act as an unfair coin
toss.

If immediate print is chosen, it means that we open
/sys/kernel/tracing/trace and thereby cause a blackout period
that will hide any subsequent latencies.

If sleep is chosen, it means that we wait before opening
/sys/kernel/tracing/trace, by default for 1000 ms, to see if
there is another latency during this period. If there is, then we will
lose the previous latency. The coin will be tossed again with a
different probability, and we will either print the new latency, or
possibly a subsequent one.

The probability for the unfair coin toss is chosen so that there
is equal probability to obtain any of the latencies in a burst.
However, this assumes that we make an assumption of how many
latencies there can be. By default  the program assumes that there
are no more than 2 latencies in a burst, the probability of immediate
printout will be:

1/2 and 1

Thus, the probability of getting each of the two latencies will be 1/2.

If we ever find that there is more than one latency in a series,
meaning that we reach the probability of 1, then the table will be
expanded to:

1/3, 1/2, and 1

Thus, we assume that there are no more than three latencies and each
with a probability of 1/3 of being captured. If the probability of 1
is reached in the new table, that is we see more than two closely
occurring latencies, then the table will again be extended, and so
on.

On my systems, it seems like this scheme works fairly well, as
long as the latencies we trace are long enough, 300 us seems to be
enough. This userspace program receive the inotify event at the end
of a latency, and it has time until the end of the next latency
to react, that is to open /sys/kernel/tracing/trace. Thus,
if we trace latencies that are >300 us, then we have at least 300 us
to react.

The minimum latency will of course not be 300 us on all systems, it
will depend on the hardware, kernel version, workload and
configuration.

Example usage:

In one shell, give the following command:
sudo latency-collector -rvv -t preemptirqsoff -s 2000 -a 3

This will trace latencies > 2000us with the preemptirqsoff tracer,
using random sleep with maximum verbosity, with a probability
table initialized to a size of 3.

In another shell, generate a few bursts of latencies:

root@host:~# modprobe preemptirq_delay_test delay=3000 test_mode=alternate
burst_size=3
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger

If all goes well, you should be getting stack traces that shows
all the different latencies, i.e. you should see all the three
functions preemptirqtest_0, preemptirqtest_1, preemptirqtest_2 in the
stack traces.

Link: https://lkml.kernel.org/r/20210212134421.172750-2-Viktor.Rosendahl@bmw.de

Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-12 11:52:59 -05:00
Steven Rostedt (VMware)
99e22ce73c tracing: Make hash-ptr option default
Since the original behavior of the trace events is to hash the %p pointers,
make that the default, and have developers have to enable the option in
order to have them unhashed.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-12 11:52:59 -05:00
Paolo Bonzini
8c6e67bec3 KVM/arm64 updates for Linux 5.12
- Make the nVHE EL2 object relocatable, resulting in much more
   maintainable code
 - Handle concurrent translation faults hitting the same page
   in a more elegant way
 - Support for the standard TRNG hypervisor call
 - A bunch of small PMU/Debug fixes
 - Allow the disabling of symbol export from assembly code
 - Simplification of the early init hypercall handling
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmAmjqEPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDoUEQAIrJ7YF4v4gz06a0HG9+b6fbmykHyxlG7jfm
 trvctfaiKzOybKoY5odPpNFzhbYOOdXXqYipyTHGwBYtGSy9G/9SjMKSUrfln2Ni
 lr1wBqapr9TE+SVKoR8pWWuZxGGbHVa7brNuMbMsMi1wwAsM2/n70H9PXrdq3QiK
 Ge1DWLso2oEfhtTwqNKa4dwB2MHjBhBFhhq+Nq5pslm6mmxJaYqz7pyBmw/C+2cc
 oU/6kpAa1yPAauptWXtYXJYOMHihxgEa1IdK3Gl0hUyFyu96xVkwH/KFsj+bRs23
 QGGCSdy4313hzaoGaSOTK22R98Aeg0wI9a6tcCBvVVjTAztnlu1FPtUZr8e/F7uc
 +r8xVJUJFiywt3Zktf/D7YDK9LuMMqFnj0BkI4U9nIBY59XZRNhENsBCmjru5lnL
 iXa5cuta03H4emfssIChLpgn0XHFas6t5dFXBPGbXyw0qsQchTw98iQX9LVxefUK
 rOUGPIN4nE9ESRIZe0SPlAVeCtNP8cLH7+0YG9MJ1QeDVYaUsnvy9Ln/ox+514mR
 5y2KJ6y7xnLB136SKCzPDDloYtz7BDiJq6a/RPiXKGheKoxy+N+BSe58yWCqFZYE
 Fx/cGUr7oSg39U7gCboog6BDp5e2CXBfbRllg6P47bZFfdPNwzNEzHvk49VltMxx
 Rl2W05bk
 =6EwV
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.12

- Make the nVHE EL2 object relocatable, resulting in much more
  maintainable code
- Handle concurrent translation faults hitting the same page
  in a more elegant way
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Allow the disabling of symbol export from assembly code
- Simplification of the early init hypercall handling
2021-02-12 11:23:44 -05:00
Wei Yongjun
f6692213b5 integrity: Make function integrity_add_key() static
The sparse tool complains as follows:

security/integrity/digsig.c:146:12: warning:
 symbol 'integrity_add_key' was not declared. Should it be static?

This function is not used outside of digsig.c, so this
commit marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 60740accf7 ("integrity: Load certs to the platform keyring")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-02-12 11:11:59 -05:00
Catalin Marinas
68d54ceeec arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page
The ptrace(PTRACE_PEEKMTETAGS) implementation checks whether the user
page has valid tags (mapped with PROT_MTE) by testing the PG_mte_tagged
page flag. If this bit is cleared, ptrace(PTRACE_PEEKMTETAGS) returns
-EIO.

A newly created (PROT_MTE) mapping points to the zero page which had its
tags zeroed during cpu_enable_mte(). If there were no prior writes to
this mapping, ptrace(PTRACE_PEEKMTETAGS) fails with -EIO since the zero
page does not have the PG_mte_tagged flag set.

Set PG_mte_tagged on the zero page when its tags are cleared during
boot. In addition, to avoid ptrace(PTRACE_PEEKMTETAGS) succeeding on
!PROT_MTE mappings pointing to the zero page, change the
__access_remote_tags() check to (vm_flags & VM_MTE) instead of
PG_mte_tagged.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 34bfeea4a9 ("arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE")
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Will Deacon <will@kernel.org>
Reported-by: Luis Machado <luis.machado@linaro.org>
Tested-by: Luis Machado <luis.machado@linaro.org>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210210180316.23654-1-catalin.marinas@arm.com
2021-02-12 16:08:31 +00:00
Yunfeng Ye
65348ba259 powercap: intel_rapl: Use topology interface in rapl_init_domains()
It's not a good idea to access the phys_proc_id of cpuinfo directly.

Use topology_physical_package_id(cpu) instead.

Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12 16:53:01 +01:00
Yunfeng Ye
88ffce9576 powercap: intel_rapl: Use topology interface in rapl_add_package()
It's not a good idea to access phys_proc_id and cpu_die_id directly.

Use topology_physical_package_id(cpu) and topology_die_id(cpu)
instead.

Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12 16:53:01 +01:00
Rikard Falkeborn
1556057413 PM: sleep: Constify static struct attribute_group
The only usage of suspend_attr_group is to put its address in an
array of pointers to const attribute_group structs.

Make it const to allow the compiler to put it into read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12 16:46:59 +01:00
Lukasz Luba
c4cc3141b6 PM: Kconfig: remove unneeded "default n" options
Remove "default n" options. If the "default" line is removed, it
defaults to 'n'.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12 16:43:23 +01:00
Lukasz Luba
3af2f0aa2e PM: EM: update Kconfig description and drop "default n" option
Energy Model supports now other devices like GPUs, DSPs, not only CPUs.
Thus, update the description in the config option. Remove also unneeded
"default n". If the "default" line is removed, it defaults to 'n'.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12 16:43:23 +01:00
Chunfeng Yun
b5a12546e7 dt-bindings: usb: mediatek: musb: add mt8516 compatbile
Add support mt8516 compatbile

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/20210201070016.41721-8-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-12 16:42:03 +01:00
Chunfeng Yun
fcad8dd5b9 dt-bindings: usb: mtk-xhci: add compatible for mt2701 and mt7623
Add two compatible for mt2701 and mt7623;

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/20210201070016.41721-7-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-12 16:42:03 +01:00
Chunfeng Yun
2b9f3ed937 dt-bindings: usb: mtk-xhci: add optional assigned clock properties
Add optional property "assigned-clock" and "assigned-clock-parents"
used by mt7629.

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/20210201070016.41721-6-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-12 16:42:02 +01:00
Helge Deller
b7795074a0 parisc: Optimize per-pagetable spinlocks
On parisc a spinlock is stored in the next page behind the pgd which
protects against parallel accesses to the pgd. That's why one additional
page (PGD_ALLOC_ORDER) is allocated for the pgd.

Matthew Wilcox suggested that we instead should use a pointer in the
struct page table for this spinlock and noted, that the comments for the
PGD_ORDER and PMD_ORDER defines were wrong.

Both suggestions are addressed with this patch. Instead of having an own
spinlock to protect the pgd, we now switch to use the existing
page_table_lock.  Additionally, beside loading the pgd into cr25 in
switch_mm_irqs_off(), the physical address of this lock is loaded into
cr28 (tr4), so that we can avoid implementing a complicated lookup in
assembly for this lock in the TLB fault handlers.

The existing Hybrid L2/L3 page table scheme (where the pmd is adjacent
to the pgd) has been dropped with this patch.

Remove the locking in set_pte() and the huge-page pte functions too.
They trigger a spinlock recursion on 32bit machines and seem unnecessary.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Fixes: b37d1c1898 ("parisc: Use per-pagetable spinlock")
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:39:42 +01:00
Kai Vehmanen
0d3070f5e6 ALSA: hda: Add another CometLake-H PCI ID
Add one more HD Audio PCI ID for CometLake-H PCH.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210212151022.2568567-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-02-12 16:39:18 +01:00
Kyle Tso
4b59b60d89 Documentation: connector: Update the description of sink-vdos
Remove the acronym "VDM" and replace it with the full name "Vendor
Defined Message".

Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210212073743.665038-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-12 16:36:56 +01:00
Tiezhu Yang
ae3c4761c1 parisc: Replace test_ti_thread_flag() with test_tsk_thread_flag()
Use test_tsk_thread_flag() directly instead of test_ti_thread_flag() to
improve readability when the argument type is struct task_struct, it is
similar with commit 5afc78551b ("arm64: Use test_tsk_thread_flag() for
checking TIF_SINGLESTEP").

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:33:05 +01:00
John David Anglin
31680c1d15 parisc: Bump 64-bit IRQ stack size to 64 KB
Bump 64-bit IRQ stack size to 64 KB.

I had a kernel IRQ stack overflow on the mx3210 debian buildd machine.  This patch increases the
64-bit IRQ stack size to 64 KB.  The 64-bit stack size needs to be larger than the 32-bit stack
size since registers are twice as big.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:31:51 +01:00
Sven Schnelle
c70919bd9d parisc: Fix IVT checksum calculation wrt HPMC
On my C8000 a HPMC was triggered, but the HPMC handler wasn't called.
I got the following chassis codes:

<Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu3> e800009803e00000  00000000001b28a3  CC_ERR_CHECK_HPMC
<Cpu2> 37000f7302e00000  8400000000800000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu3> 37000f7303e00000  8400000000800000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> f600105e02e00000  fffffff0f0c00000  CC_MC_HPMC_MONARCH_SELECTED
<Cpu3> 5600100b03e00000  00000000000001a0  CC_MC_OS_HPMC_LEN_ERR
<Cpu2> 140003b202e00000  000000000000000b  CC_ERR_HPMC_STATE_ENTRY
<Cpu3> 5600106403e00000  fffffff0f043ad20  CC_MC_BR_TO_OS_HPMC_FAILED
<Cpu3> 160012cf03e00000  030001001e000007  CC_MPS_CPU_WAITING
<Cpu2> 5600100b02e00000  00000000000001a0  CC_MC_OS_HPMC_LEN_ERR
<Cpu2> 5600106402e00000  fffffff0f0438e70  CC_MC_BR_TO_OS_HPMC_FAILED
<Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu2> 37000f7302e00000  8400000000800000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> 4000109f02e00000  0000000000000000  CC_MC_HPMC_INITIATED
<Cpu2> 4000101902e00000  0000000000000000  CC_MC_MULTIPLE_HPMCS
<Cpu2> 030010d502e00000  0000000000000000  CC_CPU_STOP

C8000 PDC is complaining about our HPMC handler length, which is 1a0 (second
part of the chassis code). Changing that to 0 makes the error go away:

<Cpu0> e800009800e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu3> e800009803e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu1> e800009801e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu0> 37000f7300e00000  8060004000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu3> 37000f7303e00000  8060004000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu1> 37000f7301e00000  8060004000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> 37000f7302e00000  8060004000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu0> f600105e00e00000  fffffff0f0c00000  CC_MC_HPMC_MONARCH_SELECTED
<Cpu3> 5600109b03e00000  00000000001eb024  CC_MC_BR_TO_OS_HPMC
<Cpu1> 5600109b01e00000  00000000001eb024  CC_MC_BR_TO_OS_HPMC
<Cpu2> 5600109b02e00000  00000000001eb024  CC_MC_BR_TO_OS_HPMC
<Cpu0> 140003b200e00000  000000000000000b  CC_ERR_HPMC_STATE_ENTRY
<Cpu3> 0000000003000000  0000000000000000
<Cpu1> 0000000001000000  0000000000000000
<Cpu2> 0000000002000000  0000000000000000
<Cpu0> 5600109b00e00000  00000000001eb024  CC_MC_BR_TO_OS_HPMC
<Cpu0> 0000000000000000  0000000000000000

So at least the HPMC handler is now called, but it hangs. Which isn't really
suprising, as the code has at least one comment saying it can't handle multiple
CPUs, and here the handler is called on all CPUs. And i'm not sure whether it
can handle 64 Bit.

So despite what the PDC spec says, C8000 and RP34xx/RP44xx don't want the
OS_HPMC length in the vector set, which is odd. I disassembled the firmware and
it actually looks like a Bug in PDC.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:31:42 +01:00
Helge Deller
61c439439c parisc: Use the generic devmem_is_allowed()
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:30:14 +01:00
Helge Deller
f286303286 parisc: Drop out of get_whan() if task is running again
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:30:07 +01:00
Sebastian Andrzej Siewior
f9ab49184a blk-mq: Use llist_head for blk_cpu_done
With llist_head it is possible to avoid the locking (the irq-off region)
when items are added. This makes it possible to add items on a remote
CPU without additional locking.
llist_add() returns true if the list was previously empty. This can be
used to invoke the SMP function call / raise sofirq only if the first
item was added (otherwise it is already pending).
This simplifies the code a little and reduces the IRQ-off regions.

blk_mq_raise_softirq() needs a preempt-disable section to ensure the
request is enqueued on the same CPU as the softirq is raised.
Some callers (USB-storage) invoke this path in preemptible context.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12 08:28:02 -07:00
Sebastian Andrzej Siewior
0a2efafbb1 blk-mq: Always complete remote completions requests in softirq
Controllers with multiple queues have their IRQ-handelers pinned to a
CPU. The core shouldn't need to complete the request on a remote CPU.

Remove this case and always raise the softirq to complete the request.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12 08:28:02 -07:00
Jens Axboe
93e4f73a93 Merge branch 'sched/smp' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-5.12/block-ipi
* 'sched/smp' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Process pending softirqs in flush_smp_call_function_from_idle()
2021-02-12 08:27:51 -07:00
Will Deacon
1ffa976382 Merge branch 'for-next/vdso' into for-next/core
vDSO build improvements.

* for-next/vdso:
  arm64: Support running gen_vdso_offsets.sh with BSD userland.
  arm64: do not descend to vdso directories twice
2021-02-12 15:17:42 +00:00
Will Deacon
dcabe10d97 Merge branch 'for-next/topology' into for-next/core
Cleanup to the AMU support code and initialisation rework to support
cpufreq drivers built as modules.

* for-next/topology:
  arm64: topology: Make AMUs work with modular cpufreq drivers
  arm64: topology: Reorder init_amu_fie() a bit
  arm64: topology: Avoid the have_policy check
2021-02-12 15:15:53 +00:00
Will Deacon
d23fa87cde Merge branch 'for-next/stacktrace' into for-next/core
Remove synthetic frame record from exception stack when entering from
userspace.

* for-next/stacktrace:
  arm64: remove EL0 exception frame record
2021-02-12 15:14:22 +00:00
Will Deacon
82a1c2b94a Merge branch 'for-next/selftests' into for-next/core
Trivial cleanup to one of the MTE selftests.

* for-next/selftests:
  arm64: mte: style: Simplify bool comparison
2021-02-12 15:13:57 +00:00