mainlining shenanigans
Go to file
Jean-Philippe Brucker 1effe8ca4e skbuff: fix coalescing for page_pool fragment recycling
Fix a use-after-free when using page_pool with page fragments. We
encountered this problem during normal RX in the hns3 driver:

(1) Initially we have three descriptors in the RX queue. The first one
    allocates PAGE1 through page_pool, and the other two allocate one
    half of PAGE2 each. Page references look like this:

                RX_BD1 _______ PAGE1
                RX_BD2 _______ PAGE2
                RX_BD3 _________/

(2) Handle RX on the first descriptor. Allocate SKB1, eventually added
    to the receive queue by tcp_queue_rcv().

(3) Handle RX on the second descriptor. Allocate SKB2 and pass it to
    netif_receive_skb():

    netif_receive_skb(SKB2)
      ip_rcv(SKB2)
        SKB3 = skb_clone(SKB2)

    SKB2 and SKB3 share a reference to PAGE2 through
    skb_shinfo()->dataref. The other ref to PAGE2 is still held by
    RX_BD3:

                      SKB2 ---+- PAGE2
                      SKB3 __/   /
                RX_BD3 _________/

 (3b) Now while handling TCP, coalesce SKB3 with SKB1:

      tcp_v4_rcv(SKB3)
        tcp_try_coalesce(to=SKB1, from=SKB3)    // succeeds
        kfree_skb_partial(SKB3)
          skb_release_data(SKB3)                // drops one dataref

                      SKB1 _____ PAGE1
                           \____
                      SKB2 _____ PAGE2
                                 /
                RX_BD3 _________/

    In skb_try_coalesce(), __skb_frag_ref() takes a page reference to
    PAGE2, where it should instead have increased the page_pool frag
    reference, pp_frag_count. Without coalescing, when releasing both
    SKB2 and SKB3, a single reference to PAGE2 would be dropped. Now
    when releasing SKB1 and SKB2, two references to PAGE2 will be
    dropped, resulting in underflow.

 (3c) Drop SKB2:

      af_packet_rcv(SKB2)
        consume_skb(SKB2)
          skb_release_data(SKB2)                // drops second dataref
            page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count

                      SKB1 _____ PAGE1
                           \____
                                 PAGE2
                                 /
                RX_BD3 _________/

(4) Userspace calls recvmsg()
    Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
    release the SKB3 page as well:

    tcp_eat_recv_skb(SKB1)
      skb_release_data(SKB1)
        page_pool_return_skb_page(PAGE1)
        page_pool_return_skb_page(PAGE2)        // drops second pp_frag_count

(5) PAGE2 is freed, but the third RX descriptor was still using it!
    In our case this causes IOMMU faults, but it would silently corrupt
    memory if the IOMMU was disabled.

Change the logic that checks whether pp_recycle SKBs can be coalesced.
We still reject differing pp_recycle between 'from' and 'to' SKBs, but
in order to avoid the situation described above, we also reject
coalescing when both 'from' and 'to' are pp_recycled and 'from' is
cloned.

The new logic allows coalescing a cloned pp_recycle SKB into a page
refcounted one, because in this case the release (4) will drop the right
reference, the one taken by skb_try_coalesce().

Fixes: 53e0961da1 ("page_pool: add frag page recycling support in page pool")
Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01 11:57:58 +01:00
arch Networking fixes for 5.18-rc1 and rethook patches. 2022-03-31 11:23:31 -07:00
block ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
certs KEYS: Introduce link restriction for machine keys 2022-03-08 13:55:52 +02:00
crypto for-5.18/64bit-pi-2022-03-25 2022-03-26 12:01:35 -07:00
Documentation Networking fixes for 5.18-rc1 and rethook patches. 2022-03-31 11:23:31 -07:00
drivers vrf: fix packet sniffing for traffic originating from ip tunnels 2022-04-01 11:56:55 +01:00
fs fs: fix fd table size alignment properly 2022-03-29 23:29:18 -07:00
include Networking fixes for 5.18-rc1 and rethook patches. 2022-03-31 11:23:31 -07:00
init Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
ipc fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
kernel Networking fixes for 5.18-rc1 and rethook patches. 2022-03-31 11:23:31 -07:00
lib lib/test: use after free in register_test_dev_kmod() 2022-03-29 15:13:36 -07:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: page_alloc: validate buddy before check its migratetype. 2022-03-30 15:45:43 -07:00
net skbuff: fix coalescing for page_pool fragment recycling 2022-04-01 11:57:58 +01:00
samples dma-mapping updates for Linux 5.18 2022-03-29 08:50:14 -07:00
scripts Driver core changes for 5.18-rc1 2022-03-28 12:41:28 -07:00
security ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
sound xen: branch for v5.18-rc1 2022-03-28 14:32:39 -07:00
tools Networking fixes for 5.18-rc1 and rethook patches. 2022-03-31 11:23:31 -07:00
usr reiserfs_xattr.h: add linux/reiserfs_xattr.h to UAPI compile-test coverage 2022-02-17 09:09:38 +01:00
virt KVM: compat: riscv: Prevent KVM_COMPAT from being selected 2022-03-11 19:02:15 +05:30
.clang-format genirq/msi: Make interrupt allocation less convoluted 2021-12-16 22:22:20 +01:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap Char/Misc and other driver updates for 5.18-rc1 2022-03-28 12:27:35 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: replace a Microchip AT91 maintainer 2022-02-09 11:30:01 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Networking fixes for 5.18-rc1 and rethook patches. 2022-03-31 11:23:31 -07:00
Makefile array-bounds updates for v5.18-rc1 2022-03-26 12:30:44 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.