linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-02 00:51:44 +00:00

Author	SHA1	Message	Date
David S. Miller	d8513df259	net: Correct type of tcp_syncookies sysctl. It can take on the values of '0', '1', and '2' and thus is not a boolean. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 16:07:38 -08:00
Steven Rostedt (VMware)	02f4e01ce7	tracing: Initialize val to zero in parse_entry of inject code gcc produces a variable may be uninitialized warning for "val" in parse_entry(). This is really a false positive, but the code is subtle enough to just initialize val to zero and it's not a fast path to worry about it. Marked for stable to remove the warning in the stable trees as well. Cc: stable@vger.kernel.org Fixes: `6c3edaf9fd` ("tracing: Introduce trace event injection") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-01-02 19:04:57 -05:00
Pengcheng Yang	c9655008e7	tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK When we receive a D-SACK, where the sequence number satisfies: undo_marker <= start_seq < end_seq <= prior_snd_una we consider this is a valid D-SACK and tcp_is_sackblock_valid() returns true, then this D-SACK is discarded as "old stuff", but the variable first_sack_index is not marked as negative in tcp_sacktag_write_queue(). If this D-SACK also carries a SACK that needs to be processed (for example, the previous SACK segment was lost), this SACK will be treated as a D-SACK in the following processing of tcp_sacktag_write_queue(), which will eventually lead to incorrect updates of undo_retrans and reordering. Fixes: `fd6dad616d` ("[TCP]: Earlier SACK block verification & simplify access to them") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 15:42:28 -08:00
Baruch Siach	f7a48b68ab	net: dsa: mv88e6xxx: force cmode write on 6141/6341 mv88e6xxx_port_set_cmode() relies on cmode stored in struct mv88e6xxx_port to skip cmode update when the requested value matches the cached value. It turns out that mv88e6xxx_port_hidden_write() might change the port cmode setting as a side effect, so we can't rely on the cached value to determine that cmode update in not necessary. Force cmode update in mv88e6341_port_set_cmode(), to make serdes configuration work again. Other mv88e6xxx_port_set_cmode() callers keep the current behaviour. This fixes serdes configuration of the 6141 switch on SolidRun Clearfog GT-8K. Fixes: `7a3007d22e` ("net: dsa: mv88e6xxx: fully support SERDES on Topaz family") Reported-by: Denis Odintsov <d.odintsov@traviangames.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 15:30:48 -08:00
Arnd Bergmann	a5b0dc5a46	gcc-plugins: make it possible to disable CONFIG_GCC_PLUGINS again I noticed that randconfig builds with gcc no longer produce a lot of ccache hits, unlike with clang, and traced this back to plugins now being enabled unconditionally if they are supported. I am now working around this by adding export CCACHE_COMPILERCHECK=/usr/bin/size -A %compiler% to my top-level Makefile. This changes the heuristic that ccache uses to determine whether the plugins are the same after a 'make clean'. However, it also seems that being able to just turn off the plugins is generally useful, at least for build testing it adds noticeable overhead but does not find a lot of bugs additional bugs, and may be easier for ccache users than my workaround. Fixes: `9f671e5815` ("security: Create "kernel hardening" config area") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20191211133951.401933-1-arnd@arndb.de Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 13:30:14 -08:00
Sargun Dhillon	e4ab5ccc35	selftests/seccomp: Catch garbage on SECCOMP_IOCTL_NOTIF_RECV This adds logic to the user_notification_basic test to set a member of struct seccomp_notif to an invalid value to ensure that the kernel returns EINVAL if any of the struct seccomp_notif members are set to invalid values. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Suggested-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191230203811.4996-1-sargun@sargun.me Fixes: `6a21cc50f0` ("seccomp: add a return code to trap to userspace") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 13:15:45 -08:00
Sargun Dhillon	2882d53c9c	seccomp: Check that seccomp_notif is zeroed out by the user This patch is a small change in enforcement of the uapi for SECCOMP_IOCTL_NOTIF_RECV ioctl. Specifically, the datastructure which is passed (seccomp_notif) must be zeroed out. Previously any of its members could be set to nonsense values, and we would ignore it. This ensures all fields are set to their zero value. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Acked-by: Tycho Andersen <tycho@tycho.ws> Link: https://lore.kernel.org/r/20191229062451.9467-2-sargun@sargun.me Fixes: `6a21cc50f0` ("seccomp: add a return code to trap to userspace") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 13:03:45 -08:00
Sargun Dhillon	88c13f8bd7	selftests/seccomp: Zero out seccomp_notif The seccomp_notif structure should be zeroed out prior to calling the SECCOMP_IOCTL_NOTIF_RECV ioctl. Previously, the kernel did not check whether these structures were zeroed out or not, so these worked. This patch zeroes out the seccomp_notif data structure prior to calling the ioctl. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Reviewed-by: Tycho Andersen <tycho@tycho.ws> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191229062451.9467-1-sargun@sargun.me Fixes: `6a21cc50f0` ("seccomp: add a return code to trap to userspace") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 13:03:42 -08:00
Sargun Dhillon	771b894f2f	samples/seccomp: Zero out members based on seccomp_notif_sizes The sizes by which seccomp_notif and seccomp_notif_resp are allocated are based on the SECCOMP_GET_NOTIF_SIZES ioctl. This allows for graceful extension of these datastructures. If userspace zeroes out the datastructure based on its version, and it is lagging behind the kernel's version, it will end up sending trailing garbage. On the other hand, if it is ahead of the kernel version, it will write extra zero space, and potentially cause corruption. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Suggested-by: Tycho Andersen <tycho@tycho.ws> Link: https://lore.kernel.org/r/20191230203503.4925-1-sargun@sargun.me Fixes: `fec7b66905` ("samples: add an example of seccomp user trap") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 13:03:39 -08:00
Aleksandr Yashkin	9e5f1c1980	pstore/ram: Write new dumps to start of recycled zones The ram_core.c routines treat przs as circular buffers. When writing a new crash dump, the old buffer needs to be cleared so that the new dump doesn't end up in the wrong place (i.e. at the end). The solution to this problem is to reset the circular buffer state before writing a new Oops dump. Signed-off-by: Aleksandr Yashkin <a.yashkin@inango-systems.com> Signed-off-by: Nikolay Merinov <n.merinov@inango-systems.com> Signed-off-by: Ariel Gilman <a.gilman@inango-systems.com> Link: https://lore.kernel.org/r/20191223133816.28155-1-n.merinov@inango-systems.com Fixes: `896fc1f0c4` ("pstore/ram: Switch to persistent_ram routines") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 12:30:50 -08:00
Kees Cook	8df955a32a	pstore/ram: Fix error-path memory leak in persistent_ram_new() callers For callers that allocated a label for persistent_ram_new(), if the call fails, they must clean up the allocation. Suggested-by: Navid Emamdoost <navid.emamdoost@gmail.com> Fixes: `1227daa43b` ("pstore/ram: Clarify resource reservation labels") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/20191211191353.14385-1-navid.emamdoost@gmail.com Signed-off-by: Kees Cook <keescook@chromium.org>	2020-01-02 12:30:39 -08:00
Florian Faber	2d77bd61a2	can: mscan: mscan_rx_poll(): fix rx path lockup when returning from polling to irq mode Under load, the RX side of the mscan driver can get stuck while TX still works. Restarting the interface locks up the system. This behaviour could be reproduced reliably on a MPC5121e based system. The patch fixes the return value of the NAPI polling function (should be the number of processed packets, not constant 1) and the condition under which IRQs are enabled again after polling is finished. With this patch, no more lockups were observed over a test period of ten days. Fixes: `afa17a500a` ("net/can: add driver for mscan family & mpc52xx_mscan") Signed-off-by: Florian Faber <faber@faberman.de> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Johan Hovold	2f361cd947	can: gs_usb: gs_usb_probe(): use descriptors of current altsetting Make sure to always use the descriptors of the current alternate setting to avoid future issues when accessing fields that may differ between settings. Signed-off-by: Johan Hovold <johan@kernel.org> Fixes: `d08e973a77` ("can: gs_usb: Added support for the GS_USB CAN devices") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Johan Hovold	5660493c63	can: kvaser_usb: fix interface sanity check Make sure to use the current alternate setting when verifying the interface descriptors to avoid binding to an invalid interface. Failing to do so could cause the driver to misbehave or trigger a WARN() in usb_submit_urb() that kernels with panic_on_warn set would choke on. Fixes: `aec5fb2268` ("can: kvaser_usb: Add support for Kvaser USB hydra family") Cc: stable <stable@vger.kernel.org> # 4.19 Cc: Jimmy Assarsson <extja@kvaser.com> Cc: Christer Beskow <chbe@kvaser.com> Cc: Nicklas Johansson <extnj@kvaser.com> Cc: Martin Henriksson <mh@kvaser.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Oliver Hartkopp	e7153bf70c	can: can_dropped_invalid_skb(): ensure an initialized headroom in outgoing CAN sk_buffs KMSAN sysbot detected a read access to an untinitialized value in the headroom of an outgoing CAN related sk_buff. When using CAN sockets this area is filled appropriately - but when using a packet socket this initialization is missing. The problematic read access occurs in the CAN receive path which can only be triggered when the sk_buff is sent through a (virtual) CAN interface. So we check in the sending path whether we need to perform the missing initializations. Fixes: `d3b58c47d3` ("can: replace timestamp as unique skb attribute") Reported-by: syzbot+b02ff0707a97e4e79ebb@syzkaller.appspotmail.com Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> # >= v4.1 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Gustavo A. R. Silva	93bdc0eb0b	can: tcan4x5x: tcan4x5x_parse_config(): fix inconsistent IS_ERR and PTR_ERR Fix inconsistent IS_ERR and PTR_ERR in tcan4x5x_parse_config(). The proper pointer to be passed as argument is tcan4x5x->device_wake_gpio. This bug was detected with the help of Coccinelle. Fixes: `2de4973569` ("can: tcan45x: Make wake-up GPIO an optional GPIO") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:26 +01:00
Dan Murphy	5a1f8f5e5e	can: tcan4x5x: tcan4x5x_parse_config(): Disable the INH pin device-state GPIO is unavailable If the device state GPIO is not connected to the host then disable the INH output from the TCAN device per section 8.3.5 of the data sheet. Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:26 +01:00
Sean Nyekjaer	c3083124e6	can: tcan4x5x: tcan4x5x_parse_config(): reset device before register access It's a good idea to reset a ip-block/spi device before using it, this patch will reset the device. And a generic reset function if needed elsewhere. Signed-off-by: Sean Nyekjaer <sean@geanix.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:26 +01:00
Dan Murphy	3814ca3a10	can: tcan4x5x: tcan4x5x_can_probe(): turn on the power before parsing the config The tcan4x5x_parse_config() function now performs action on the device either reading or writing and a reset. If the devive has a switchable power supppy (i.e. regulator is managed) it needs to be turned on. So turn on the regulator if available. If the parsing fails, turn off the regulator. Fixes: `2de4973569` ("can: tcan45x: Make wake-up GPIO an optional GPIO") Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:26 +01:00
Sean Nyekjaer	3069ce620d	can: tcan4x5x: tcan4x5x_can_probe(): get the device out of standby before register access The m_can tries to detect if Non ISO Operation is available while in standby mode, this function results in the following error: \| tcan4x5x spi2.0 (unnamed net_device) (uninitialized): Failed to init module \| tcan4x5x spi2.0: m_can device registered (irq=84, version=32) \| tcan4x5x spi2.0 can2: TCAN4X5X successfully initialized. When the tcan device comes out of reset it goes in standby mode. The m_can driver tries to access the control register but fails due to the device being in standby mode. So this patch will put the tcan device in normal mode before the m_can driver does the initialization. Fixes: `5443c226ba` ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Cc: stable@vger.kernel.org Signed-off-by: Sean Nyekjaer <sean@geanix.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:26 +01:00
John Johansen	20d4e80d25	apparmor: only get a label reference if the fast path check fails The common fast path check can be done under rcu_read_lock() and doesn't need a reference count on the label. Only take a reference count if entering the slow path. Fixes reported hackbench regression - sha1 `79e178a57d` ("Merge tag 'apparmor-pr-2019-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor") hackbench -l (256000/#grp) -g #grp 128 groups 19.679 ±0.90% - previous sha1 `01d1dff646` ("Merge tag 's390-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux") hackbench -l (256000/#grp) -g #grp 128 groups 3.1689 ±3.04% Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Fixes: `bce4e7e9c4` ("apparmor: reduce rcu_read_lock scope for aa_file_perm mediation") Signed-off-by: John Johansen <john.johansen@canonical.com>	2020-01-02 05:31:40 -08:00
Patrick Steinhardt	9c95a278ba	apparmor: fix bind mounts aborting with -ENOMEM With commit `df323337e5` ("apparmor: Use a memory pool instead per-CPU caches, 2019-05-03"), AppArmor code was converted to use memory pools. In that conversion, a bug snuck into the code that polices bind mounts that causes all bind mounts to fail with -ENOMEM, as we erroneously error out if `aa_get_buffer` returns a pointer instead of erroring out when it does _not_ return a valid pointer. Fix the issue by correctly checking for valid pointers returned by `aa_get_buffer` to fix bind mounts with AppArmor. Fixes: `df323337e5` ("apparmor: Use a memory pool instead per-CPU caches") Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: John Johansen <john.johansen@canonical.com>	2020-01-02 05:31:40 -08:00
Dave Airlie	866bd5eeaf	Merge tag 'amd-drm-fixes-5.5-2020-01-01' of git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.5-2020-01-01: amdgpu: - ATPX regression fix - SMU metrics table locking fixes - gfxoff fix for raven - RLC firmware loading stability fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200101151307.5242-1-alexander.deucher@amd.com	2020-01-02 10:16:04 +10:00
Dave Airlie	e7cbcb16fa	-sun4i: Fix double-free in connector/encoder cleanup (Stefan) -malidp: Make vtable static (Ben) Cc: Ben Dooks <ben.dooks@codethink.co.uk> Cc: Stefan Mavrodiev <stefan@olimex.com> -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl4LaBgACgkQcywAJXLc r3l58ggA2PKL0955KFeWAV1T8JZEaGSA6i5TNhtQFHbLQO2Ks5tE/YXsAROcA+ry Eblqk36Yg0NyhsJ8e6FwQYiRdkL3Vh+kpxV+zbexfeKqSuyX12Xqy1Ukbxij0qYt faDhwh4TYHQnA1QvsPfXSkf0gQt0nWAq2EahHaT9qn0tjjmA1YmAyIh851kBtT4B UeyDJgLh4l0ALs2dptaVrpB+uKT7GKeQvFoAMddYOBFAQ21/ihJwCQdGBqfYAaO0 PTsfNxjfDnm8s1N/qNV+21Pk1pMm6g9xKTmP/cuD6iwiaWlB5GYxeo6xFLjQ3NVU EOHv4SJdsm3CFGEggJvqCiBYIJfG9Q== =+u2K -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2019-12-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes -sun4i: Fix double-free in connector/encoder cleanup (Stefan) -malidp: Make vtable static (Ben) Cc: Ben Dooks <ben.dooks@codethink.co.uk> Cc: Stefan Mavrodiev <stefan@olimex.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20191231152503.GA46740@art_vandelay	2020-01-02 09:41:00 +10:00
Dave Airlie	886a0dc04d	Mediatek DRM fixes for Linux 5.5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJeCrsAAAoJEOHKc6PJWU4k7rsP/jh81H+i4vO+derBdItdCa81 b9eQjyVZNOdVvCpYAugIz2Xa71PkjcnB53Ck885bEhSJ9sHPewMFq/K+1dHpzzqB X+68pD83B07KU6x8Y9g+DcCsksR2zOANtE0I8Y0ElraCsCN2KkUzWKUa5wefFw5M J3xWqDboad8MTUDIDs0eo4JoVFApj7PmXPKRKQfFy7In/888sCFWtC3zDKxk9OHF zj5PQ43E8cy1bCQIKnMRJjB5PuPdI0SUZlFLoxu7YbSOPkA8QGqkkOKqi4qBZ2n5 e0AcCXrwrQSqkiSZybHcUsvihJLSW64LlgorfRCDzl4TOexHVAZDgc/lg8UgEiQ2 JC/piFB1vTeI/3DMELWx6U+ULbX5hx+yAFc5zQIrio+WJYETZyVHLBLl1GE2SA8F hbEq5xjuJHvWDCsr2nsR0qHO3ozodSGgfg03vhKtQp5J0wUMIvr9kNI/4HUSJGe4 7GYBLcsPFPS6huscakkGn2hF5FL8v1Sxvqk+m/3chUU2PfsNxyVLlUztFJkaAGkH 6WjLxxpFgWVSnSqig+5wC67fF3FjUa+HtAclIi9zrnG64joH7+M/1wsTvuldo20R GPdr2iOgGkxjg8NsB9IYXTJjSFi+CBGsCxmM53K8+e3lpwN8km4pSKOHCsi1TsO6 Y5pZw4+yirymE8kKiCMo =6y+R -----END PGP SIGNATURE----- Merge tag 'mediatek-drm-fixes-5.5' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes Mediatek DRM fixes for Linux 5.5 Signed-off-by: Dave Airlie <airlied@redhat.com> From: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.freedesktop.org/patch/msgid/1577762298.23194.2.camel@mtksdaap41	2020-01-02 09:40:30 +10:00
Evan Quan	969e115292	drm/amdgpu: correct RLC firmwares loading sequence Per confirmation with RLC firmware team, the RLC should be unhalted after all RLC related firmwares uploaded. However, in fact the RLC is unhalted immediately after RLCG firmware uploaded. And that may causes unexpected PSP hang on loading the succeeding RLC save restore list related firmwares. So, we correct the firmware loading sequence to load RLC save restore list related firmwares before RLCG ucode. That will help to get around this issue. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-01-01 09:26:09 -05:00
Linus Torvalds	738d290277	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Fix big endian overflow in nf_flow_table, from Arnd Bergmann. 2) Fix port selection on big endian in nft_tproxy, from Phil Sutter. 3) Fix precision tracking for unbound scalars in bpf verifier, from Daniel Borkmann. 4) Fix integer overflow in socket rcvbuf check in UDP, from Antonio Messina. 5) Do not perform a neigh confirmation during a pmtu update over a tunnel, from Hangbin Liu. 6) Fix DMA mapping leak in dpaa_eth driver, from Madalin Bucur. 7) Various PTP fixes for sja1105 dsa driver, from Vladimir Oltean. 8) Add missing to dummy definition of of_mdiobus_child_is_phy(), from Geert Uytterhoeven * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename() net/sched: add delete_empty() to filters and use it in cls_flower tcp: Fix highest_sack and highest_sack_seq ptp: fix the race between the release of ptp_clock and cdev net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation net: dsa: sja1105: Remove restriction of zero base-time for taprio offload net: dsa: sja1105: Really make the PTP command read-write net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot cxgb4/cxgb4vf: fix flow control display for auto negotiation mlxsw: spectrum: Use dedicated policer for VRRP packets mlxsw: spectrum_router: Skip loopback RIFs during MAC validation net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs net/sched: act_mirred: Pull mac prior redir to non mac_header_xmit device net_sched: sch_fq: properly set sk->sk_pacing_status bnx2x: Fix accounting of vlan resources among the PFs bnx2x: Use appropriate define for vlan credit of: mdio: Add missing inline to of_mdiobus_child_is_phy() dummy net: phy: aquantia: add suspend / resume ops for AQR105 dpaa_eth: fix DMA mapping leak ...	2019-12-31 11:14:58 -08:00
Linus Torvalds	c5c928c667	Two bugfix patches for 5.5. tomoyo: Suppress RCU warning at list_for_each_entry_rcu(). tomoyo: Don't use nifty names on sockets. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJeCqRMAAoJEEJfEo0MZPUqmRoQAJdWfsAObzNRq8VsiZJTQ7NV Y4Nz1QYAUMjsSnWCTY1iI/9a4SY7zOmlGoCO9pC4leGeV/H9FCp+flBVBs6nEiQF A/bRT9P9ek3urr+kOj1A9udxRRXQdQ292TVd9Ll7fJuamLHsLAQSTht27SdJyStQ FY5COJCTC3Qvoz/jdkqQ1qEyYUavH6FvNjN9eIsjLow6BahHzDj/sw6WC/iUQhOC 55mK10bN7jHvewRsrW5HLQ0aUazz/6FTZIuVckFpk/R67aljEIsMccAeYw7XeBWS a4AqI5a+8Go8ryXeM6y76JF3SnWpX9PZYLMYZz2SYkohBbl+ivVKJZOWEOQPhYTO wFAFSwZVA4uEsYEQF7qWGsQGMS/QR1tre2na4dWjpSZ0Ly2xX81tcjEhVYq6jsuk 1MGHTDCc93dMJW8OKx31CRRr9mIkJ4C1pJqQzlApjkqxUMq3Bxdidc4WOshB20y6 cLf1nfIor4/8VBb8LdBICPcedfHWk3KY6nL5yTqUjtETln7Ba/UeXpPKL9kH15Pg N29AzQfNuRyTL/s51ZRrvmh/WJWIrl2xLue3s5u8yJA6DivUb8U46BK+BQn+POHG XJDydAxynKYbPlqbNE6yoFl36BwyNwy5vWQp/0ONiwXHPSl3lsi5LYWy0XGODS8a 9SSbLovjKTvMeLnoN+AI =11MC -----END PGP SIGNATURE----- Merge tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1 Pull tomoyo fixes from Tetsuo Handa: "Two bug fixes: - Suppress RCU warning at list_for_each_entry_rcu() - Don't use fancy names on sockets" * tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1: tomoyo: Suppress RCU warning at list_for_each_entry_rcu(). tomoyo: Don't use nifty names on sockets.	2019-12-31 10:51:27 -08:00
Taehee Yoo	04b69426d8	hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename() hsr slave interfaces don't have debugfs directory. So, hsr_debugfs_rename() shouldn't be called when hsr slave interface name is changed. Test commands: ip link add dummy0 type dummy ip link add dummy1 type dummy ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1 ip link set dummy0 name ap Splat looks like: [21071.899367][T22666] ap: renamed from dummy0 [21071.914005][T22666] ================================================================== [21071.919008][T22666] BUG: KASAN: slab-out-of-bounds in hsr_debugfs_rename+0xaa/0xb0 [hsr] [21071.923640][T22666] Read of size 8 at addr ffff88805febcd98 by task ip/22666 [21071.926941][T22666] [21071.927750][T22666] CPU: 0 PID: 22666 Comm: ip Not tainted 5.5.0-rc2+ #240 [21071.929919][T22666] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [21071.935094][T22666] Call Trace: [21071.935867][T22666] dump_stack+0x96/0xdb [21071.936687][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr] [21071.937774][T22666] print_address_description.constprop.5+0x1be/0x360 [21071.939019][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr] [21071.940081][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr] [21071.940949][T22666] __kasan_report+0x12a/0x16f [21071.941758][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr] [21071.942674][T22666] kasan_report+0xe/0x20 [21071.943325][T22666] hsr_debugfs_rename+0xaa/0xb0 [hsr] [21071.944187][T22666] hsr_netdev_notify+0x1fe/0x9b0 [hsr] [21071.945052][T22666] ? __module_text_address+0x13/0x140 [21071.945897][T22666] notifier_call_chain+0x90/0x160 [21071.946743][T22666] dev_change_name+0x419/0x840 [21071.947496][T22666] ? __read_once_size_nocheck.constprop.6+0x10/0x10 [21071.948600][T22666] ? netdev_adjacent_rename_links+0x280/0x280 [21071.949577][T22666] ? __read_once_size_nocheck.constprop.6+0x10/0x10 [21071.950672][T22666] ? lock_downgrade+0x6e0/0x6e0 [21071.951345][T22666] ? do_setlink+0x811/0x2ef0 [21071.951991][T22666] do_setlink+0x811/0x2ef0 [21071.952613][T22666] ? is_bpf_text_address+0x81/0xe0 [ ... ] Reported-by: syzbot+9328206518f08318a5fd@syzkaller.appspotmail.com Fixes: `4c2d5e33dc` ("hsr: rename debugfs file when interface name is changed") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:36:27 -08:00
Davide Caratti	a5b72a083d	net/sched: add delete_empty() to filters and use it in cls_flower Revert "net/sched: cls_u32: fix refcount leak in the error path of u32_change()", and fix the u32 refcount leak in a more generic way that preserves the semantic of rule dumping. On tc filters that don't support lockless insertion/removal, there is no need to guard against concurrent insertion when a removal is in progress. Therefore, for most of them we can avoid a full walk() when deleting, and just decrease the refcount, like it was done on older Linux kernels. This fixes situations where walk() was wrongly detecting a non-empty filter, like it happened with cls_u32 in the error path of change(), thus leading to failures in the following tdc selftests: 6aa7: (filter, u32) Add/Replace u32 with source match and invalid indev 6658: (filter, u32) Add/Replace u32 with custom hash table and invalid handle 74c2: (filter, u32) Add/Replace u32 filter with invalid hash table id On cls_flower, and on (future) lockless filters, this check is necessary: move all the check_empty() logic in a callback so that each filter can have its own implementation. For cls_flower, it's sufficient to check if no IDRs have been allocated. This reverts commit `275c44aa19`. Changes since v1: - document the need for delete_empty() when TCF_PROTO_OPS_DOIT_UNLOCKED is used, thanks to Vlad Buslov - implement delete_empty() without doing fl_walk(), thanks to Vlad Buslov - squash revert and new fix in a single patch, to be nice with bisect tests that run tdc on u32 filter, thanks to Dave Miller Fixes: `275c44aa19` ("net/sched: cls_u32: fix refcount leak in the error path of u32_change()") Fixes: `6676d5e416` ("net: sched: set dedicated tcf_walker flag when tp is empty") Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Suggested-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:35:19 -08:00
Cambda Zhu	853697504d	tcp: Fix highest_sack and highest_sack_seq >From commit `50895b9de1` ("tcp: highest_sack fix"), the logic about setting tp->highest_sack to the head of the send queue was removed. Of course the logic is error prone, but it is logical. Before we remove the pointer to the highest sack skb and use the seq instead, we need to set tp->highest_sack to NULL when there is no skb after the last sack, and then replace NULL with the real skb when new skb inserted into the rtx queue, because the NULL means the highest sack seq is tp->snd_nxt. If tp->highest_sack is NULL and new data sent, the next ACK with sack option will increase tp->reordering unexpectedly. This patch sets tp->highest_sack to the tail of the rtx queue if it's NULL and new data is sent. The patch keeps the rule that the highest_sack can only be maintained by sack processing, except for this only case. Fixes: `50895b9de1` ("tcp: highest_sack fix") Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:28:39 -08:00
Vladis Dronov	a33121e548	ptp: fix the race between the release of ptp_clock and cdev In a case when a ptp chardev (like /dev/ptp0) is open but an underlying device is removed, closing this file leads to a race. This reproduces easily in a kvm virtual machine: ts# cat openptp0.c int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); } ts# uname -r 5.5.0-rc3-46cf053e ts# cat /proc/cmdline ... slub_debug=FZP ts# modprobe ptp_kvm ts# ./openptp0 & [1] 670 opened /dev/ptp0, sleeping 10s... ts# rmmod ptp_kvm ts# ls /dev/ptp* ls: cannot access '/dev/ptp': No such file or directory ts# ...woken up [ 48.010809] general protection fault: 0000 [#1] SMP [ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25 [ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... [ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80 [ 48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202 [ 48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0 [ 48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b [ 48.019470] ... ^^^ a slub poison [ 48.023854] Call Trace: [ 48.024050] __fput+0x21f/0x240 [ 48.024288] task_work_run+0x79/0x90 [ 48.024555] do_exit+0x2af/0xab0 [ 48.024799] ? vfs_write+0x16a/0x190 [ 48.025082] do_group_exit+0x35/0x90 [ 48.025387] __x64_sys_exit_group+0xf/0x10 [ 48.025737] do_syscall_64+0x3d/0x130 [ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 48.026479] RIP: 0033:0x7f53b12082f6 [ 48.026792] ... [ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm] [ 48.045001] Fixing recursive fault but reboot is needed! This happens in: static void __fput(struct file file) { ... if (file->f_op->release) file->f_op->release(inode, file); <<< cdev is kfree'd here if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL && !(mode & FMODE_PATH))) { cdev_put(inode->i_cdev); <<< cdev fields are accessed here Namely: __fput() posix_clock_release() kref_put(&clk->kref, delete_clock) <<< the last reference delete_clock() delete_ptp_clock() kfree(ptp) <<< cdev is embedded in ptp cdev_put module_put(p->owner) <<< *p is kfree'd, bang! Here cdev is embedded in posix_clock which is embedded in ptp_clock. The race happens because ptp_clock's lifetime is controlled by two refcounts: kref and cdev.kobj in posix_clock. This is wrong. Make ptp_clock's sysfs device a parent of cdev with cdev_device_add() created especially for such cases. This way the parent device with its ptp_clock is not released until all references to the cdev are released. This adds a requirement that an initialized but not exposed struct device should be provided to posix_clock_register() by a caller instead of a simple dev_t. This approach was adopted from the commit `72139dfa24` ("watchdog: Fix the race between the release of watchdog_core_data and cdev"). See details of the implementation in the commit `233ed09d7f` ("chardev: add helper function to register char devs with a struct device"). Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u Analyzed-by: Stephen Johnston <sjohnsto@redhat.com> Analyzed-by: Vern Lovejoy <vlovejoy@redhat.com> Signed-off-by: Vladis Dronov <vdronov@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:19:27 -08:00
Vladimir Oltean	54fa49ee88	net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S For first-generation switches (SJA1105E and SJA1105T): - TPID means C-Tag (typically 0x8100) - TPID2 means S-Tag (typically 0x88A8) While for the second generation switches (SJA1105P, SJA1105Q, SJA1105R, SJA1105S) it is the other way around: - TPID means S-Tag (typically 0x88A8) - TPID2 means C-Tag (typically 0x8100) In other words, E/T tags untagged traffic with TPID, and P/Q/R/S with TPID2. So the patch mentioned below fixed VLAN filtering for P/Q/R/S, but broke it for E/T. We strive for a common code path for all switches in the family, so just lie in the static config packing functions that TPID and TPID2 are at swapped bit offsets than they actually are, for P/Q/R/S. This will make both switches understand TPID to be ETH_P_8021Q and TPID2 to be ETH_P_8021AD. The meaning from the original E/T was chosen over P/Q/R/S because E/T is actually the one with public documentation available (UM10944.pdf). Fixes: `f9a1a7646c` ("net: dsa: sja1105: Reverse TPID and TPID2") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:15:02 -08:00
Vladimir Oltean	3a323ed7c9	Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation Since commit `86db36a347` ("net: dsa: sja1105: Implement state machine for TAS with PTP clock source"), this paragraph is no longer true. So remove it. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:14:27 -08:00
Vladimir Oltean	d00bdc0a88	net: dsa: sja1105: Remove restriction of zero base-time for taprio offload The check originates from the initial implementation which was not based on PTP time but on a standalone clock source. In the meantime we can now program the PTPSCHTM register at runtime with the dynamic base time (actually with a value that is 200 ns smaller, to avoid writing DELTA=0 in the Schedule Entry Points Parameters Table). And we also have logic for moving the actual base time in the future of the PHC's current time base, so the check for zero serves no purpose, since even if the user will specify zero, that's not what will end up in the static config table where the limitation is. Fixes: `86db36a347` ("net: dsa: sja1105: Implement state machine for TAS with PTP clock source") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:13:11 -08:00
Vladimir Oltean	5a47f588ee	net: dsa: sja1105: Really make the PTP command read-write When activating tc-taprio offload on the switch ports, the TAS state machine will try to check whether it is running or not, but will find both the STARTED and STOPPED bits as false in the sja1105_tas_check_running function. So the function will return -EINVAL (an abnormal situation) and the kernel will keep printing this from the TAS FSM workqueue: [ 37.691971] sja1105 spi0.1: An operation returned -22 The reason is that the underlying function that gets called, sja1105_ptp_commit, does not actually do a SPI_READ, but a SPI_WRITE. So the command buffer remains initialized with zeroes instead of retrieving the hardware state. Fix that. Fixes: `41603d78b3` ("net: dsa: sja1105: Make the PTP command read-write") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:11:28 -08:00
Vladimir Oltean	9fcf024dd6	net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot The PTP egress timestamp N must be captured from register PTPEGR_TS[n], where n = 2 * PORT + TSREG. There are 10 PTPEGR_TS registers, 2 per port. We are only using TSREG=0. As opposed to the management slots, which are 4 in number (SJA1105_NUM_PORTS, minus the CPU port). Any management frame (which includes PTP frames) can be sent to any non-CPU port through any management slot. When the CPU port is not the last port (#4), there will be a mismatch between the slot and the port number. Luckily, the only mainline occurrence with this switch (arch/arm/boot/dts/ls1021a-tsn.dts) does have the CPU port as #4, so the issue did not manifest itself thus far. Fixes: `47ed985e97` ("net: dsa: sja1105: Add logic for TX timestamping") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 20:10:20 -08:00
Vladimir Oltean	ca59d5a516	spi: spi-fsl-dspi: Fix 16-bit word order in 32-bit XSPI mode When used in Extended SPI mode on LS1021A, the DSPI controller wants to have the least significant 16-bit word written first to the TX FIFO. In fact, the LS1021A reference manual says: 33.5.2.4.2 Draining the TX FIFO When Extended SPI Mode (DSPIx_MCR[XSPI]) is enabled, if the frame size of SPI Data to be transmitted is more than 16 bits, then it causes two Data entries to be popped from TX FIFO simultaneously which are transferred to the shift register. The first of the two popped entries forms the 16 least significant bits of the SPI frame to be transmitted. So given the following TX buffer: +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ \| 0x0 \| 0x1 \| 0x2 \| 0x3 \| 0x4 \| 0x5 \| 0x6 \| 0x7 \| 0x8 \| 0x9 \| 0xa \| 0xb \| +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ \| 32-bit word 1 \| 32-bit word 2 \| 32-bit word 3 \| +-----------------------+-----------------------+-----------------------+ The correct way that a little-endian system should transmit it on the wire when bits_per_word is 32 is: 0x03020100 0x07060504 0x0b0a0908 But it is actually transmitted as following, as seen with a scope: 0x01000302 0x05040706 0x09080b0a It appears that this patch has been submitted at least once before: https://lkml.org/lkml/2018/9/21/286 but in that case Chuanhua Han did not manage to explain the problem clearly enough and the patch did not get merged, leaving XSPI mode broken. Fixes: `8fcd151d26` ("spi: spi-fsl-dspi: XSPI FIFO handling (in TCFQ mode)") Cc: Esben Haabendal <eha@deif.com> Cc: Chuanhua Han <chuanhua.han@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20191228135536.14284-1-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org	2019-12-31 00:29:36 +00:00
Rahul Lakkireddy	0caeaf6ad5	cxgb4/cxgb4vf: fix flow control display for auto negotiation As per 802.3-2005, Section Two, Annex 28B, Table 28B-2 [1], when _only_ Rx pause is enabled, both symmetric and asymmetric pause towards local device must be enabled. Also, firmware returns the local device's flow control pause params as part of advertised capabilities and negotiated params as part of current link attributes. So, fix up ethtool's flow control pause params fetch logic to read from acaps, instead of linkattr. [1] https://standards.ieee.org/standard/802_3-2005.html Fixes: `c3168cabe1` ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities") Signed-off-by: Surendra Mobiya <surendra@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-30 14:40:42 -08:00
Kees Cook	1f07dcc459	kernel.h: Remove unused FIELD_SIZEOF() Now that all callers of FIELD_SIZEOF() have been converted to sizeof_field(), remove the unused prior macro. Signed-off-by: Kees Cook <keescook@chromium.org>	2019-12-30 12:01:56 -08:00
Damien Le Moal	c7d776f85d	null_blk: Fix REQ_OP_ZONE_CLOSE handling In order to match ZBC defined behavior, closing an empty zone must result in the "empty" zone condition instead of the "closed" condition. Fixes: `da644b2cc1` ("null_blk: add zone open, close, and finish support") Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-12-30 08:51:46 -07:00
Ming Lei	429120f3df	block: fix splitting segments on boundary masks We ran into a problem with a mpt3sas based controller, where we would see random (and hard to reproduce) file corruption). The issue seemed specific to this controller, but wasn't specific to the file system. After a lot of debugging, we find out that it's caused by segments spanning a 4G memory boundary. This shouldn't happen, as the default setting for segment boundary masks is 4G. Turns out there are two issues in get_max_segment_size(): 1) The default segment boundary mask is bypassed 2) The segment start address isn't taken into account when checking segment boundary limit Fix these two issues by removing the bypass of the segment boundary check even if the mask is set to the default value, and taking into account the actual start address of the request when checking if a segment needs splitting. Cc: stable@vger.kernel.org # v5.1+ Reviewed-by: Chris Mason <clm@fb.com> Tested-by: Chris Mason <clm@fb.com> Fixes: `dcebd75592` ("block: use bio_for_each_bvec() to compute multi-page bvec count") Signed-off-by: Ming Lei <ming.lei@redhat.com> Dropped const on the page pointer, ppc page_to_phys() doesn't mark the page as const... Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-12-30 08:51:18 -07:00
Filipe Manana	de7999afed	Btrfs: fix infinite loop during nocow writeback due to race When starting writeback for a range that covers part of a preallocated extent, due to a race with writeback for another range that also covers another part of the same preallocated extent, we can end up in an infinite loop. Consider the following example where for inode 280 we have two dirty ranges: range A, from 294912 to 303103, 8192 bytes range B, from 348160 to 438271, 90112 bytes and we have the following file extent item layout for our inode: leaf 38895616 gen 24544 total ptrs 29 free space 13820 owner 5 (...) item 27 key (280 108 200704) itemoff 14598 itemsize 53 extent data disk bytenr 0 nr 0 type 1 (regular) extent data offset 0 nr 94208 ram 94208 item 28 key (280 108 294912) itemoff 14545 itemsize 53 extent data disk bytenr 10433052672 nr 81920 type 2 (prealloc) extent data offset 0 nr 81920 ram 81920 Then the following happens: 1) Writeback starts for range B (from 348160 to 438271), execution of run_delalloc_nocow() starts; 2) The first iteration of run_delalloc_nocow()'s whil loop leaves us at the extent item at slot 28, pointing to the prealloc extent item covering the range from 294912 to 376831. This extent covers part of our range; 3) An ordered extent is created against that extent, covering the file range from 348160 to 376831 (28672 bytes); 4) We adjust 'cur_offset' to 376832 and move on to the next iteration of the while loop; 5) The call to btrfs_lookup_file_extent() leaves us at the same leaf, pointing to slot 29, 1 slot after the last item (the extent item we processed in the previous iteration); 6) Because we are a slot beyond the last item, we call btrfs_next_leaf(), which releases the search path before doing a another search for the last key of the leaf (280 108 294912); 7) Right after btrfs_next_leaf() released the path, and before it did another search for the last key of the leaf, writeback for the range A (from 294912 to 303103) completes (it was previously started at some point); 8) Upon completion of the ordered extent for range A, the prealloc extent we previously found got split into two extent items, one covering the range from 294912 to 303103 (8192 bytes), with a type of regular extent (and no longer prealloc) and another covering the range from 303104 to 376831 (73728 bytes), with a type of prealloc and an offset of 8192 bytes. So our leaf now has the following layout: leaf 38895616 gen 24544 total ptrs 31 free space 13664 owner 5 (...) item 27 key (280 108 200704) itemoff 14598 itemsize 53 extent data disk bytenr 0 nr 0 type 1 extent data offset 0 nr 8192 ram 94208 item 28 key (280 108 208896) itemoff 14545 itemsize 53 extent data disk bytenr 10433142784 nr 86016 type 1 extent data offset 0 nr 86016 ram 86016 item 29 key (280 108 294912) itemoff 14492 itemsize 53 extent data disk bytenr 10433052672 nr 81920 type 1 extent data offset 0 nr 8192 ram 81920 item 30 key (280 108 303104) itemoff 14439 itemsize 53 extent data disk bytenr 10433052672 nr 81920 type 2 extent data offset 8192 nr 73728 ram 81920 9) After btrfs_next_leaf() returns, we have our path pointing to that same leaf and at slot 30, since it has a key we didn't have before and it's the first key greater then the key that was previously the last key of the leaf (key (280 108 294912)); 10) The extent item at slot 30 covers the range from 303104 to 376831 which is in our target range, so we process it, despite having already created an ordered extent against this extent for the file range from 348160 to 376831. This is because we skip to the next extent item only if its end is less than or equals to the start of our delalloc range, and not less than or equals to the current offset ('cur_offset'); 11) As a result we compute 'num_bytes' as: num_bytes = min(end + 1, extent_end) - cur_offset; = min(438271 + 1, 376832) - 376832 = 0 12) We then call create_io_em() for a 0 bytes range starting at offset 376832; 13) Then create_io_em() enters an infinite loop because its calls to btrfs_drop_extent_cache() do nothing due to the 0 length range passed to it. So no existing extent maps that cover the offset 376832 get removed, and therefore calls to add_extent_mapping() return -EEXIST, resulting in an infinite loop. This loop from create_io_em() is the following: do { btrfs_drop_extent_cache(BTRFS_I(inode), em->start, em->start + em->len - 1, 0); write_lock(&em_tree->lock); ret = add_extent_mapping(em_tree, em, 1); write_unlock(&em_tree->lock); /* * The caller has taken lock_extent(), who could race with us * to add em? */ } while (ret == -EEXIST); Also, each call to btrfs_drop_extent_cache() triggers a warning because the start offset passed to it (376832) is smaller then the end offset (376832 - 1) passed to it by -1, due to the 0 length: [258532.052621] ------------[ cut here ]------------ [258532.052643] WARNING: CPU: 0 PID: 9987 at fs/btrfs/file.c:602 btrfs_drop_extent_cache+0x3f4/0x590 [btrfs] (...) [258532.052672] CPU: 0 PID: 9987 Comm: fsx Tainted: G W 5.4.0-rc7-btrfs-next-64 #1 [258532.052673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [258532.052691] RIP: 0010:btrfs_drop_extent_cache+0x3f4/0x590 [btrfs] (...) [258532.052695] RSP: 0018:ffffb4be0153f860 EFLAGS: 00010287 [258532.052700] RAX: ffff975b445ee360 RBX: ffff975b44eb3e08 RCX: 0000000000000000 [258532.052700] RDX: 0000000000038fff RSI: 0000000000039000 RDI: ffff975b445ee308 [258532.052700] RBP: 0000000000038fff R08: 0000000000000000 R09: 0000000000000001 [258532.052701] R10: ffff975b513c5c10 R11: 00000000e3c0cfa9 R12: 0000000000039000 [258532.052703] R13: ffff975b445ee360 R14: 00000000ffffffef R15: ffff975b445ee308 [258532.052705] FS: 00007f86a821de80(0000) GS:ffff975b76a00000(0000) knlGS:0000000000000000 [258532.052707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [258532.052708] CR2: 00007fdacf0f3ab4 CR3: 00000001f9d26002 CR4: 00000000003606f0 [258532.052712] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [258532.052717] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [258532.052717] Call Trace: [258532.052718] ? preempt_schedule_common+0x32/0x70 [258532.052722] ? ___preempt_schedule+0x16/0x20 [258532.052741] create_io_em+0xff/0x180 [btrfs] [258532.052767] run_delalloc_nocow+0x942/0xb10 [btrfs] [258532.052791] btrfs_run_delalloc_range+0x30b/0x520 [btrfs] [258532.052812] ? find_lock_delalloc_range+0x221/0x250 [btrfs] [258532.052834] writepage_delalloc+0xe4/0x140 [btrfs] [258532.052855] __extent_writepage+0x110/0x4e0 [btrfs] [258532.052876] extent_write_cache_pages+0x21c/0x480 [btrfs] [258532.052906] extent_writepages+0x52/0xb0 [btrfs] [258532.052911] do_writepages+0x23/0x80 [258532.052915] __filemap_fdatawrite_range+0xd2/0x110 [258532.052938] btrfs_fdatawrite_range+0x1b/0x50 [btrfs] [258532.052954] start_ordered_ops+0x57/0xa0 [btrfs] [258532.052973] ? btrfs_sync_file+0x225/0x490 [btrfs] [258532.052988] btrfs_sync_file+0x225/0x490 [btrfs] [258532.052997] __x64_sys_msync+0x199/0x200 [258532.053004] do_syscall_64+0x5c/0x250 [258532.053007] entry_SYSCALL_64_after_hwframe+0x49/0xbe [258532.053010] RIP: 0033:0x7f86a7dfd760 (...) [258532.053014] RSP: 002b:00007ffd99af0368 EFLAGS: 00000246 ORIG_RAX: 000000000000001a [258532.053016] RAX: ffffffffffffffda RBX: 0000000000000ec9 RCX: 00007f86a7dfd760 [258532.053017] RDX: 0000000000000004 RSI: 000000000000836c RDI: 00007f86a8221000 [258532.053019] RBP: 0000000000021ec9 R08: 0000000000000003 R09: 00007f86a812037c [258532.053020] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000000074a3 [258532.053021] R13: 00007f86a8221000 R14: 000000000000836c R15: 0000000000000001 [258532.053032] irq event stamp: 1653450494 [258532.053035] hardirqs last enabled at (1653450493): [<ffffffff9dec69f9>] _raw_spin_unlock_irq+0x29/0x50 [258532.053037] hardirqs last disabled at (1653450494): [<ffffffff9d4048ea>] trace_hardirqs_off_thunk+0x1a/0x20 [258532.053039] softirqs last enabled at (1653449852): [<ffffffff9e200466>] __do_softirq+0x466/0x6bd [258532.053042] softirqs last disabled at (1653449845): [<ffffffff9d4c8a0c>] irq_exit+0xec/0x120 [258532.053043] ---[ end trace 8476fce13d9ce20a ]--- Which results in flooding dmesg/syslog since btrfs_drop_extent_cache() uses WARN_ON() and not WARN_ON_ONCE(). So fix this issue by changing run_delalloc_nocow()'s loop to move to the next extent item when the current extent item ends at at offset less than or equals to the current offset instead of the start offset. Fixes: `80ff385665` ("Btrfs: update nodatacow code v2") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-12-30 16:13:20 +01:00
Dennis Zhou	46bcff2bfc	btrfs: fix compressed write bio blkcg attribution Bio attribution is handled at bio_set_dev() as once we have a device, we have a corresponding request_queue and then can derive the current css. In special cases, we want to attribute to bio to someone else. This can be done by calling bio_associate_blkg_from_css() or kthread_associate_blkcg() depending on the scenario. Btrfs does this for compressed writeback as they are handled by kworkers, so the latter can be done here. Commit `1a41802701` ("btrfs: drop bio_set_dev where not needed") removes early bio_set_dev() calls prior to submit_stripe_bio(). This breaks the above assumption that we'll have a request_queue when we are doing association. To fix this, switch to using kthread_associate_blkcg(). Without this, we crash in btrfs/024: [ 3052.093088] BUG: kernel NULL pointer dereference, address: 0000000000000510 [ 3052.107013] #PF: supervisor read access in kernel mode [ 3052.107014] #PF: error_code(0x0000) - not-present page [ 3052.107015] PGD 0 P4D 0 [ 3052.107021] Oops: 0000 [#1] SMP [ 3052.138904] CPU: 42 PID: 201270 Comm: kworker/u161:0 Kdump: loaded Not tainted 5.5.0-rc1-00062-g4852d8ac90a9 #712 [ 3052.138905] Hardware name: Quanta Tioga Pass Single Side 01-0032211004/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018 [ 3052.138912] Workqueue: btrfs-delalloc btrfs_work_helper [ 3052.191375] RIP: 0010:bio_associate_blkg_from_css+0x1e/0x3c0 [ 3052.191379] RSP: 0018:ffffc900210cfc90 EFLAGS: 00010282 [ 3052.191380] RAX: 0000000000000000 RBX: ffff88bfe5573c00 RCX: 0000000000000000 [ 3052.191382] RDX: ffff889db48ec2f0 RSI: ffff88bfe5573c00 RDI: ffff889db48ec2f0 [ 3052.191386] RBP: 0000000000000800 R08: 0000000000203bb0 R09: ffff889db16b2400 [ 3052.293364] R10: 0000000000000000 R11: ffff88a07fffde80 R12: ffff889db48ec2f0 [ 3052.293365] R13: 0000000000001000 R14: ffff889de82bc000 R15: ffff889e2b7bdcc8 [ 3052.293367] FS: 0000000000000000(0000) GS:ffff889ffba00000(0000) knlGS:0000000000000000 [ 3052.293368] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3052.293369] CR2: 0000000000000510 CR3: 0000000002611001 CR4: 00000000007606e0 [ 3052.293370] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3052.293371] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3052.293372] PKRU: 55555554 [ 3052.293376] Call Trace: [ 3052.402552] btrfs_submit_compressed_write+0x137/0x390 [ 3052.402558] submit_compressed_extents+0x40f/0x4c0 [ 3052.422401] btrfs_work_helper+0x246/0x5a0 [ 3052.422408] process_one_work+0x200/0x570 [ 3052.438601] ? process_one_work+0x180/0x570 [ 3052.438605] worker_thread+0x4c/0x3e0 [ 3052.438614] kthread+0x103/0x140 [ 3052.460735] ? process_one_work+0x570/0x570 [ 3052.460737] ? kthread_mod_delayed_work+0xc0/0xc0 [ 3052.460744] ret_from_fork+0x24/0x30 Fixes: `1a41802701` ("btrfs: drop bio_set_dev where not needed") Reported-by: Chris Murphy <chris@colorremedies.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-12-30 16:07:19 +01:00
Dennis Zhou	7b62e66cbb	btrfs: punt all bios created in btrfs_submit_compressed_write() Compressed writes happen in the background via kworkers. However, this causes bios to be attributed to root bypassing any cgroup limits from the actual writer. We tag the first bio with REQ_CGROUP_PUNT, which will punt the bio to an appropriate cgroup specific workqueue and attribute the IO properly. However, if btrfs_submit_compressed_write() creates a new bio, we don't tag it the same way. Add the appropriate tagging for subsequent bios. Fixes: `ec39f7696c` ("Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios") Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-12-30 16:07:16 +01:00
Russell King	dcbce5fbcc	watchdog: orion: fix platform_get_irq() complaints Fix: orion_wdt f1020300.watchdog: IRQ index 1 not found which is caused by platform_get_irq() now complaining when optional IRQs are not found. Neither interrupt for orion is required, so make them both optional. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/E1iahcN-0000AT-Co@rmk-PC.armlinux.org.uk Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>	2019-12-30 15:58:29 +01:00
Andreas Kemnade	a76dfb859c	watchdog: rn5t618_wdt: fix module aliases Platform device aliases were missing so module autoloading did not work. Signed-off-by: Andreas Kemnade <andreas@kemnade.info> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191213214802.22268-1-andreas@kemnade.info Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>	2019-12-30 15:58:29 +01:00
YueHaibing	9a6c274ac1	watchdog: tqmx86_wdt: Fix build error If TQMX86_WDT is y and WATCHDOG_CORE is m, building fails: drivers/watchdog/tqmx86_wdt.o: In function `tqmx86_wdt_probe': tqmx86_wdt.c:(.text+0x46e): undefined reference to `watchdog_init_timeout' tqmx86_wdt.c:(.text+0x4e0): undefined reference to `devm_watchdog_register_device' Select WATCHDOG_CORE to fix this. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: `e3c21e088f` ("watchdog: tqmx86: Add watchdog driver for the IO controller") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191206124259.25880-1-yuehaibing@huawei.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>	2019-12-30 15:58:24 +01:00
David Engraf	da9e3f4e30	watchdog: max77620_wdt: fix potential build errors max77620_wdt uses watchdog core functions. Enable CONFIG_WATCHDOG_CORE to fix potential build errors. Signed-off-by: David Engraf <david.engraf@sysgo.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191127084617.16937-1-david.engraf@sysgo.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>	2019-12-30 15:58:24 +01:00
Fabio Estevam	91ced83c6e	watchdog: imx7ulp: Fix missing conversion of imx7ulp_wdt_enable() Since commit `747d88a1a8` ("watchdog: imx7ulp: Pass the wdog instance in imx7ulp_wdt_enable()") imx7ulp_wdt_enable() accepts a watchdog_device structure, so fix one instance that missed such conversion. This also fixes the following sparse warning: drivers/watchdog/imx7ulp_wdt.c:115:31: warning: incorrect type in argument 1 (different address spaces) drivers/watchdog/imx7ulp_wdt.c:115:31: expected struct watchdog_device wdog drivers/watchdog/imx7ulp_wdt.c:115:31: got void [noderef] <asn:2>base Fixes: `747d88a1a8` ("watchdog: imx7ulp: Pass the wdog instance inimx7ulp_wdt_enable()") Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191120140916.25001-1-festevam@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>	2019-12-30 15:58:23 +01:00

... 3 4 5 6 7 ...

888245 Commits