- A host of fixes for the Intel baytrail and cherryview:
properly serialize all register accesses and add the irqchip
with the gpiochip as we need to, fix some pin lists and
initialize the hardware in the right order.
- Fix the Aspeed G6 LPC configuration.
- Handle a possible NULL pointer exception in the core.
- Fix the Kconfig dependencies for the Equilibrium driver.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl3764cACgkQQRCzN7AZ
XXOxTA//XxgTeisguIQMzkkUW0F+w+uz9DENdA95O57X3OuCxpaacAQC8nFuV1m4
fjRzpPd6sVbsoOOz0DvjIim962OPU5Ai54cUnhTLhCNPqXzdt4CnSKqwy+9eWZat
VAl+GtUHMXntucccUeKzpXQ+NguaV9gKHTmc6QnDcX/viYWBVyyguFegyI3jzoAw
C89k3hH/EnbNdv+xWmcgFcSPts7g7u7kI0A/GUlgoE/Eb8yVz03PB4cZoZwz2wPD
v7p5AKEcfRtU2XZGFmE9DSBAhcqG6eXXXsNZ878AJc6vIfE+W8RGW5pw/vtdF8H0
YRi3dt9Dq2CU3MXEFj2/901/C1Mcip4fn37xrWUp3mz2oabxoJR+L7vxg2YhcTvC
blmnqiWyewY9PLJc8uOb5rYSjJVb1E7zI92eowhkU5QW87+1LDCIDnBu3JBfSc7+
3ul6sZn9xMrBWx6cKxgehXLcXwMs0BxLHrmyefsNX5CLM/as62nadFIHCIDUzZQU
Bg2HMFm9z7cgUIahrHgJH968lr7q9yq7+49QYyKNxSGcovSSsucjMZ3Aop0vC556
+YOmgQU8Q7mfs5O94CO1AWwPWrglCyvOyy+7fD4WyUWbYHVyQXUEMZqAj46oHAyY
oZE/e27ec3l4XgQblTNqoo7xX2W6yibo49Oh0r04vGgv+R+UpF0=
=b9GK
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Sorry that this fixes pull request took a while. Too much christmas
business going on.
This contains a few really important Intel fixes and some odd fixes:
- A host of fixes for the Intel baytrail and cherryview: properly
serialize all register accesses and add the irqchip with the
gpiochip as we need to, fix some pin lists and initialize the
hardware in the right order.
- Fix the Aspeed G6 LPC configuration.
- Handle a possible NULL pointer exception in the core.
- Fix the Kconfig dependencies for the Equilibrium driver"
* tag 'pinctrl-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: ingenic: Fixup PIN_CONFIG_OUTPUT config
pinctrl: Modify Kconfig to fix linker error
pinctrl: pinmux: fix a possible null pointer in pinmux_can_be_used_for_gpio
pinctrl: aspeed-g6: Fix LPC/eSPI mux configuration
pinctrl: cherryview: Pass irqchip when adding gpiochip
pinctrl: cherryview: Add GPIO <-> pin mapping ranges via callback
pinctrl: cherryview: Split out irq hw-init into a separate helper function
pinctrl: baytrail: Pass irqchip when adding gpiochip
pinctrl: baytrail: Add GPIO <-> pin mapping ranges via callback
pinctrl: baytrail: Update North Community pin list
pinctrl: baytrail: Really serialize all register accesses
This moves the prep handlers outside of the opcode handlers, and allows
us to pass in the sqe directly. If the sqe is non-NULL, it means that
the request should be prepared for the first time.
With the opcode handlers not having access to the sqe at all, we are
guaranteed that the prep handler has setup the request fully by the
time we get there. As before, for opcodes that need to copy in more
data then the io_kiocb allows for, the io_async_ctx holds that info. If
a prep handler is invoked with req->io set, it must use that to retain
information for later.
Finally, we can remove io_kiocb->sqe as well.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently have a mix of use cases. Most of the newer ones are pretty
uniform, but we have some older ones that use different calling
calling conventions. This is confusing.
For the opcodes that currently rely on the req->io->sqe copy saving
them from reuse, add a request type struct in the io_kiocb command
union to store the data they need.
Prepare for all opcodes having a standard prep method, so we can call
it in a uniform fashion and outside of the opcode handler. This is in
preparation for passing in the 'sqe' pointer, rather than storing it
in the io_kiocb. Once we have uniform prep handlers, we can leave all
the prep work to that part, and not even pass in the sqe to the opcode
handler. This ensures that we don't reuse sqe data inadvertently.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mainly does:
- capitalize gpio and bios to GPIO and BIOS
- capitalize beginning of comments
- add periods in multi-line comments
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
GPIO stuff on APUv4 seems to be the same as on APUv2, so we just
need to match on DMI data.
Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The mapping entry has to hold the GPIO line index instead of
controller's register number.
Fixes: 5037d4ddda ("platform/x86: pcengines-apuv2: wire up simswitch gpio as led")
Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The CONNECT X300 uses the PMC clock for on-board components and gets
stuck during boot if the clock is disabled. Therefore, add this
device to the critical systems list.
Tested on CONNECT X300.
Fixes: 648e921888 ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL")
Signed-off-by: Michael Haener <michael.haener@siemens.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
At least on the HP Envy x360 15-cp0xxx model the WMI interface
for HPWMI_FEATURE2_QUERY requires an outsize of at least 128 bytes,
otherwise it fails with an error code 5 (HPWMI_RET_INVALID_PARAMETERS):
Dec 06 00:59:38 kernel: hp_wmi: query 0xd returned error 0x5
We do not care about the contents of the buffer, we just want to know
if the HPWMI_FEATURE2_QUERY command is supported.
This commits bumps the buffer size, fixing the error.
Fixes: 8a1513b493 ("hp-wmi: limit hotkey enable")
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1520703
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This is a follow-up commit for the sysfs attributes to change
from DRIVER_ATTR to DEVICE_ATTR according to some initial comments.
In such case, it's better to point the sysfs path to the device
itself instead of the driver. The ABI document is also updated.
Fixes: 79e29cb8fb ("platform/mellanox: Add bootctl driver for Mellanox BlueField Soc")
Signed-off-by: Liming Sun <lsun@mellanox.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Add the count field to struct io_timeout, and ensure the prep handler
has read it. Timeout also needs an async context always, set it up
in the prep handler if we don't have one.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add struct io_sr_msg in our io_kiocb per-command union, and ensure that
the send/recvmsg prep handlers have grabbed what they need from the SQE
by the time prep is done.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add struct io_connect in our io_kiocb per-command union, and ensure
that io_connect_prep() has grabbed what it needs from the SQE.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Put the kiocb in struct io_rw, and add the addr/len for the request as
well. Use the kiocb->private field for the buffer index for fixed reads
and writes.
Any use of kiocb->ki_filp is flipped to req->file. It's the same thing,
and less confusing.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix rxrpc_new_incoming_call() to check that we have a suitable service key
available for the combination of service ID and security class of a new
incoming call - and to reject calls for which we don't.
This causes an assertion like the following to appear:
rxrpc: Assertion failed - 6(0x6) == 12(0xc) is false
kernel BUG at net/rxrpc/call_object.c:456!
Where call->state is RXRPC_CALL_SERVER_SECURING (6) rather than
RXRPC_CALL_COMPLETE (12).
Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Standard kernel mutexes cannot be used in any way from interrupt or softirq
context, so the user_mutex which manages access to a call cannot be a mutex
since on a new call the mutex must start off locked and be unlocked within
the softirq handler to prevent userspace interfering with a call we're
setting up.
Commit a0855d24fc ("locking/mutex: Complain
upon mutex API misuse in IRQ contexts") causes big warnings to be splashed
in dmesg for each a new call that comes in from the server. Whilst it
*seems* like it should be okay, since the accept path uses trylock, there
are issues with PI boosting and marking the wrong task as the owner.
Fix this by not taking the mutex in the softirq path at all. It's not
obvious that there should be any need for it as the state is set before the
first notification is generated for the new call.
There's also no particular reason why the link-assessing ping should be
triggered inside the mutex. It's not actually transmitted there anyway,
but rather it has to be deferred to a workqueue.
Further, I don't think that there's any particular reason that the socket
notification needs to be done from within rx->incoming_lock, so the amount
of time that lock is held can be shortened too and the ping prepared before
the new call notification is sent.
Fixes: 540b1c48c3 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Peter Zijlstra (Intel) <peterz@infradead.org>
cc: Ingo Molnar <mingo@redhat.com>
cc: Will Deacon <will@kernel.org>
cc: Davidlohr Bueso <dave@stgolabs.net>
Move the unlock and the ping transmission for a new incoming call into
rxrpc_new_incoming_call() rather than doing it in the caller. This makes
it clearer to see what's going on.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
cc: Ingo Molnar <mingo@redhat.com>
cc: Will Deacon <will@kernel.org>
cc: Davidlohr Bueso <dave@stgolabs.net>
Fix the following sparse warning:
fs/xfs/libxfs/xfs_trans_resv.c:206:1: warning: symbol 'xfs_rtalloc_log_count' was not declared. Should it be static?
Fixes: b1de6fc752 ("xfs: fix log reservation overflows when allocating large rt extents")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
We use it in some spots, but not consistently. Convert the rest over,
makes it easier to read as well.
No functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The malidp_mw_connector_helper_funcs is not referenced by name
outside of the file it is in, so make it static to avoid the
following warning:
drivers/gpu/drm/arm/malidp_mw.c:59:41: warning: symbol 'malidp_mw_connector_helper_funcs' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217115309.2133503-1-ben.dooks@codethink.co.uk
gnttab_request_version() always sets the gnttab_interface variable
and the assertions to check for empty gnttab_interface is unnecessary.
The patch eliminates multiple such assertions.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
By simply re-attaching to shared rings during connect_ring() rather than
assuming they are freshly allocated (i.e assuming the counters are zero)
it is possible for vbd instances to be unbound and re-bound from and to
(respectively) a running guest.
This has been tested by running:
while true;
do fio --name=randwrite --ioengine=libaio --iodepth=16 \
--rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
done
in a PV guest whilst running:
while true;
do echo vbd-$DOMID-$VBD >unbind;
echo unbound;
sleep 5;
echo vbd-$DOMID-$VBD >bind;
echo bound;
sleep 3;
done
in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
re-bind its system disk image.
This is a highly useful feature for a backend module as it allows it to be
unloaded and re-loaded (i.e. updated) without requiring domUs to be halted.
This was also tested by running:
while true;
do echo vbd-$DOMID-$VBD >unbind;
echo unbound;
sleep 5;
rmmod xen-blkback;
echo unloaded;
sleep 1;
modprobe xen-blkback;
echo bound;
cd $(pwd);
sleep 3;
done
in dom0 whilst running the same loop as above in the (single) PV guest.
Some (less stressful) testing has also been done using a Windows HVM guest
with the latest 9.0 PV drivers installed.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Currently these macros are defined to re-initialize a front/back ring
(respectively) to values read from the shared ring in such a way that any
requests/responses that are added to the shared ring whilst the front/back
is detached will be skipped over. This, in general, is not a desirable
semantic since most frontend implementations will eventually block waiting
for a response which would either never appear or never be processed.
Since the macros are currently unused, take this opportunity to re-define
them to re-initialize a front/back ring using specified values. This also
allows FRONT/BACK_RING_INIT() to be re-defined in terms of
FRONT/BACK_RING_ATTACH() using a specified value of 0.
NOTE: BACK_RING_ATTACH() will be used directly in a subsequent patch.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
If a driver probe() fails then leave the xenstore state alone. There is no
reason to modify it as the failure may be due to transient resource
allocation issues and hence a subsequent probe() may succeed.
If the driver supports re-binding then only force state to closed during
remove() only in the case when the toolstack may need to clean up. This can
be detected by checking whether the state in xenstore has been set to
closing prior to device removal.
NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
which defaults to false. Subsequent patches will add support to
some backend drivers.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
...and make it static
xenbus_dev_shutdown() is seemingly intended to cause clean shutdown of PV
frontends when a guest is rebooted. Indeed the function waits for a
conpletion which is only set by a call to xenbus_frontend_closed().
This patch removes the shutdown() method from backends and moves
xenbus_dev_shutdown() from xenbus_probe.c into xenbus_probe_frontend.c,
renaming it appropriately and making it static.
NOTE: In the case where the backend is running in a driver domain, the
toolstack should have already terminated any frontends that may be
using it (since Xen does not support re-startable PV driver domains)
so xenbus_dev_shutdown() should never be called.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Clang warns:
../drivers/block/xen-blkfront.c:1117:4: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
nr_parts = PARTS_PER_DISK;
^
../drivers/block/xen-blkfront.c:1115:3: note: previous statement is here
if (err)
^
This is because there is a space at the beginning of this line; remove
it so that the indentation is consistent according to the Linux kernel
coding style and clang no longer warns.
While we are here, the previous line has some trailing whitespace; clean
that up as well.
Fixes: c80a420995 ("xen-blkfront: handle Xen major numbers other than XENVBD")
Link: https://github.com/ClangBuiltLinux/linux/issues/791
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
The sifive_l2_cache.c is in no way related to RISC-V architecture
memory management. It is a little stub driver working around the fact
that the EDAC maintainers prefer their drivers to be structured in a
certain way that doesn't fit the SiFive SOCs.
Move the file to drivers/soc and add a Kconfig option for it, as well
as the whole drivers/soc boilerplate for CONFIG_SOC_SIFIVE.
Fixes: a967a289f1 ("RISC-V: sifive_l2_cache: Add L2 cache controller driver for SiFive SoCs")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
[paul.walmsley@sifive.com: keep the MAINTAINERS change specific to the L2$ controller code]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
pfn_to_page & page_to_pfn depend on vmemmap being available before the calls
if kernel is configured with CONFIG_SPARSEMEM_VMEMMAP=y. This was caused
by NOMMU changes which moved vmemmap definition bellow functions definitions
calling pfn_to_page & page_to_pfn.
Noticed while compiled 5.5-rc2 kernel for Fedora/RISCV.
v2:
- Add a comment for vmemmap in source
Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Fixes: 6bd33e1ece ("riscv: add nommu support")
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
This patch fixes that the sscratch register clearing in M-mode. It cleared
sscratch register in M-mode, but it should clear mscratch register. That will
cause kernel trap if the CPU core doesn't support S-mode when trying to access
sscratch.
Fixes: 9e80635619 ("riscv: clear the instruction cache and all registers when booting")
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
In Kconfig files, config options are written without the CONFIG_ prefix.
Fixes: 6bd33e1ece ("riscv: add nommu support")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Make sure to check the return value of usb_altnum_to_altsetting() to
avoid dereferencing a NULL pointer when the requested alternate settings
is missing.
The format altsetting number may come from a quirk table and there does
not seem to be any other validation of it (the corresponding index is
checked however).
Fixes: b099b9693d ("ALSA: usb-audio: Avoid superfluous usb_set_interface() calls")
Cc: stable <stable@vger.kernel.org> # 4.18
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20191220093134.1248-1-johan@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Davide Caratti says:
====================
net/sched: cls_u32: fix refcount leak
a refcount leak in the error path of u32_change() has been recently
introduced. It can be observed with the following commands:
[root@f31 ~]# tc filter replace dev eth0 ingress protocol ip prio 97 \
> u32 match ip src 127.0.0.1/32 indev notexist20 flowid 1:1 action drop
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
[root@f31 ~]# tc filter replace dev eth0 ingress protocol ip prio 98 \
> handle 42:42 u32 divisor 256
Error: cls_u32: Divisor can only be used on a hash table.
We have an error talking to the kernel
[root@f31 ~]# tc filter replace dev eth0 ingress protocol ip prio 99 \
> u32 ht 47:47
Error: cls_u32: Specified hash table not found.
We have an error talking to the kernel
they all legitimately return -EINVAL; however, they leave semi-configured
filters at eth0 tc ingress:
[root@f31 ~]# tc filter show dev eth0 ingress
filter protocol ip pref 97 u32 chain 0
filter protocol ip pref 97 u32 chain 0 fh 800: ht divisor 1
filter protocol ip pref 98 u32 chain 0
filter protocol ip pref 98 u32 chain 0 fh 801: ht divisor 1
filter protocol ip pref 99 u32 chain 0
filter protocol ip pref 99 u32 chain 0 fh 802: ht divisor 1
With older kernels, filters were unconditionally considered empty (and
thus de-refcounted) on the error path of ->change().
After commit 8b64678e0a ("net: sched: refactor tp insert/delete for
concurrent execution"), filters were considered empty when the walk()
function didn't set 'walker.stop' to 1.
Finally, with commit 6676d5e416 ("net: sched: set dedicated tcf_walker
flag when tp is empty"), tc filters are considered empty unless the walker
function is called with a non-NULL handle. This last change doesn't fit
cls_u32 design, because at least the "root hnode" is (almost) always
non-NULL, as it's allocated in u32_init().
- patch 1/2 is a proposal to restore the original kernel behavior, where
no filter was installed in the error path of u32_change().
- patch 2/2 adds tdc selftests that can be ued to verify the correct
behavior of u32 in the error path of ->change().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- move test "e9a3 - Add u32 with source match" to u32.json, and change the
match pattern to catch all hnodes
- add testcases for relevant error paths of cls_u32 module
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
when users replace cls_u32 filters with new ones having wrong parameters,
so that u32_change() fails to validate them, the kernel doesn't roll-back
correctly, and leaves semi-configured rules.
Fix this in u32_walk(), avoiding a call to the walker function on filters
that don't have a match rule connected. The side effect is, these "empty"
filters are not even dumped when present; but that shouldn't be a problem
as long as we are restoring the original behaviour, where semi-configured
filters were not even added in the error path of u32_change().
Fixes: 6676d5e416 ("net: sched: set dedicated tcf_walker flag when tp is empty")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In s3fwrn5_fw_recv_frame, if fw_info->rsp is not empty, the
current code causes a crash via BUG_ON. However, s3fwrn5_fw_send_msg
does not crash in such a scenario. The patch replaces the BUG_ON
by returning the error to the callers and frees up skb.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart says:
====================
net: macb: fix probing of PHY not described in the dt
The macb Ethernet driver supports various ways of referencing its
network PHY. When a device tree is used the PHY can be referenced with
a phy-handle or, if connected to its internal MDIO bus, described in
a child node. Some platforms omitted the PHY description while
connecting the PHY to the internal MDIO bus and in such cases the MDIO
bus has to be scanned "manually" by the macb driver.
Prior to the phylink conversion the driver registered the MDIO bus with
of_mdiobus_register and then in case the PHY couldn't be retrieved
using dt or using phy_find_first (because registering an MDIO bus with
of_mdiobus_register masks all PHYs) the macb driver was "manually"
scanning the MDIO bus (like mdiobus_register does). The phylink
conversion did break this particular case but reimplementing the manual
scan of the bus in the macb driver wouldn't be very clean. The solution
seems to be registering the MDIO bus based on if the PHYs are described
in the device tree or not.
There are multiple ways to do this, none is perfect. I chose to check if
any of the child nodes of the macb node was a network PHY and based on
this to register the MDIO bus with the of_ helper or not. The drawback
is boards referencing the PHY through phy-handle, would scan the entire
MDIO bus of the macb at boot time (as the MDIO bus would be registered
with mdiobus_register). For this solution to work properly
of_mdiobus_child_is_phy has to be exported, which means the patch doing
so has to be backported to -stable as well.
Another possible solution could have been to simply check if the macb
node has a child node by counting its sub-nodes. This isn't techically
perfect, as there could be other sub-nodes (in practice this should be
fine, fixed-link being taken care of in the driver). We could also
simply s/of_mdiobus_register/mdiobus_register/ but that could break
boards using the PHY description in child node as a selector (which
really would be not a proper way to do this...).
The real issue here being having PHYs not described in the dt but we
have dt backward compatibility, so we have to live with that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the case where the PHY isn't described in the device
tree. This is due to the way the MDIO bus is registered in the driver:
whether the PHY is described in the device tree or not, the bus is
registered through of_mdiobus_register. The function masks all the PHYs
and only allow probing the ones described in the device tree. Prior to
the Phylink conversion this was also done but later on in the driver
the MDIO bus was manually scanned to circumvent the fact that the PHY
wasn't described.
This patch fixes it in a proper way, by registering the MDIO bus based
on if the PHY attached to a given interface is described in the device
tree or not.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch exports of_mdiobus_child_is_phy, allowing to check if a child
node is a network PHY.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On Big Endian architectures, u16 port value was extracted from the wrong
parts of u32 sreg_port, just like commit 10596608c4 ("netfilter:
nf_tables: fix mismatch in big-endian system") describes.
Fixes: 4ed8eb6570 ("netfilter: nf_tables: Add native tproxy support")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
syzbot reported following splat:
BUG: KASAN: vmalloc-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
BUG: KASAN: vmalloc-out-of-bounds in compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
Read of size 4 at addr ffffc900004461f4 by task syz-executor267/7937
CPU: 1 PID: 7937 Comm: syz-executor267 Not tainted 5.5.0-rc1-syzkaller #0
size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
compat_do_replace+0x344/0x720 net/bridge/netfilter/ebtables.c:2249
compat_do_ebt_set_ctl+0x22f/0x27e net/bridge/netfilter/ebtables.c:2333
[..]
Because padding isn't considered during computation of ->buf_user_offset,
"total" is decremented by fewer bytes than it should.
Therefore, the first part of
if (*total < sizeof(*entry) || entry->next_offset < sizeof(*entry))
will pass, -- it should not have. This causes oob access:
entry->next_offset is past the vmalloced size.
Reject padding and check that computed user offset (sum of ebt_entry
structure plus all individual matches/watchers/targets) is same
value that userspace gave us as the offset of the next entry.
Reported-by: syzbot+f68108fed972453a0ad4@syzkaller.appspotmail.com
Fixes: 81e675c227 ("netfilter: ebtables: add CONFIG_COMPAT support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
NAT test currently covers snat (masquerade) only.
Also add a dnat rule and then check that a connecting to the
to-be-dnated address will work.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In some configurations, gcc reports an integer overflow:
net/netfilter/nf_flow_table_offload.c: In function 'nf_flow_rule_match':
net/netfilter/nf_flow_table_offload.c:80:21: error: unsigned conversion from 'int' to '__be16' {aka 'short unsigned int'} changes value from '327680' to '0' [-Werror=overflow]
mask->tcp.flags = TCP_FLAG_RST | TCP_FLAG_FIN;
^~~~~~~~~~~~
From what I can tell, we want the upper 16 bits of these constants,
so they need to be shifted in cpu-endian mode.
Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The sector size of the block layer is 512 bytes, but integrity interval
size might be different (in case of 4K block size of the media). At the
initiator side the virtual start sector is the one that was originally
submitted by the block layer (512 bytes) for the Reftag usage. The
initiator converts the Reftag to integrity interval units and sends it to
the target. So the target virtual start sector should be calculated at
integrity interval units. prepare_fn() and complete_fn() don't remap
correctly the Reftag when using incorrect units of the virtual start
sector, which leads to the following protection error at the device:
"blk_update_request: protection error, dev sdb, sector 2048 op 0x0:(READ)
flags 0x10000 phys_seg 1 prio class 0"
To fix that, set the seed in integrity interval units.
Link: https://lore.kernel.org/r/1576078562-15240-1-git-send-email-israelr@mellanox.com
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If cxgb4i_ddp_init() fails then cdev->cdev2ppm will be NULL, so add a check
for NULL pointer before dereferencing it.
Link: https://lore.kernel.org/r/1576676731-3068-1-git-send-email-varun@chelsio.com
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are spelling mistakes of asynchronous in a lpfc_printf_log message
and comments. Fix these.
Link: https://lore.kernel.org/r/20191218084301.627555-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The compare functions of the histogram code would be specific for the size
of the value being compared (byte, short, int, long long). It would
reference the value from the array via the type of the compare, but the
value was stored in a 64 bit number. This is fine for little endian
machines, but for big endian machines, it would end up comparing zeros or
all ones (depending on the sign) for anything but 64 bit numbers.
To fix this, first derference the value as a u64 then convert it to the type
being compared.
Link: http://lkml.kernel.org/r/20191211103557.7bed6928@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 08d43a5fa0 ("tracing: Add lock-free tracing_map")
Acked-by: Tom Zanussi <zanussi@kernel.org>
Reported-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Daniel Borkmann says:
====================
pull-request: bpf 2019-12-19
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 8 day(s) which contain
a total of 21 files changed, 269 insertions(+), 108 deletions(-).
The main changes are:
1) Fix lack of synchronization between xsk wakeup and destroying resources
used by xsk wakeup, from Maxim Mikityanskiy.
2) Fix pruning with tail call patching, untrack programs in case of verifier
error and fix a cgroup local storage tracking bug, from Daniel Borkmann.
3) Fix clearing skb->tstamp in bpf_redirect() when going from ingress to
egress which otherwise cause issues e.g. on fq qdisc, from Lorenz Bauer.
4) Fix compile warning of unused proc_dointvec_minmax_bpf_restricted() when
only cBPF is present, from Alexander Lobakin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Expand dummy prog generation such that we can easily check on return
codes and add few more test cases to make sure we keep on tracking
pruning behavior.
# ./test_verifier
[...]
#1066/p XDP pkt read, pkt_data <= pkt_meta', bad access 1 OK
#1067/p XDP pkt read, pkt_data <= pkt_meta', bad access 2 OK
Summary: 1580 PASSED, 0 SKIPPED, 0 FAILED
Also verified that JIT dump of added test cases looks good.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/df7200b6021444fd369376d227de917357285b65.1576789878.git.daniel@iogearbox.net
While testing Cilium with /unreleased/ Linus' tree under BPF-based NodePort
implementation, I noticed a strange BPF SNAT engine behavior from time to
time. In some cases it would do the correct SNAT/DNAT service translation,
but at a random point in time it would just stop and perform an unexpected
translation after SYN, SYN/ACK and stack would send a RST back. While initially
assuming that there is some sort of a race condition in BPF code, adding
trace_printk()s for debugging purposes at some point seemed to have resolved
the issue auto-magically.
Digging deeper on this Heisenbug and reducing the trace_printk() calls to
an absolute minimum, it turns out that a single call would suffice to
trigger / not trigger the seen RST issue, even though the logic of the
program itself remains unchanged. Turns out the single call changed verifier
pruning behavior to get everything to work. Reconstructing a minimal test
case, the incorrect JIT dump looked as follows:
# bpftool p d j i 11346
0xffffffffc0cba96c:
[...]
21: movzbq 0x30(%rdi),%rax
26: cmp $0xd,%rax
2a: je 0x000000000000003a
2c: xor %edx,%edx
2e: movabs $0xffff89cc74e85800,%rsi
38: jmp 0x0000000000000049
3a: mov $0x2,%edx
3f: movabs $0xffff89cc74e85800,%rsi
49: mov -0x224(%rbp),%eax
4f: cmp $0x20,%eax
52: ja 0x0000000000000062
54: add $0x1,%eax
57: mov %eax,-0x224(%rbp)
5d: jmpq 0xffffffffffff6911
62: mov $0x1,%eax
[...]
Hence, unexpectedly, JIT emitted a direct jump even though retpoline based
one would have been needed since in line 2c and 3a we have different slot
keys in BPF reg r3. Verifier log of the test case reveals what happened:
0: (b7) r0 = 14
1: (73) *(u8 *)(r1 +48) = r0
2: (71) r0 = *(u8 *)(r1 +48)
3: (15) if r0 == 0xd goto pc+4
R0_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R1=ctx(id=0,off=0,imm=0) R10=fp0
4: (b7) r3 = 0
5: (18) r2 = 0xffff89cc74d54a00
7: (05) goto pc+3
11: (85) call bpf_tail_call#12
12: (b7) r0 = 1
13: (95) exit
from 3 to 8: R0_w=inv13 R1=ctx(id=0,off=0,imm=0) R10=fp0
8: (b7) r3 = 2
9: (18) r2 = 0xffff89cc74d54a00
11: safe
processed 13 insns (limit 1000000) [...]
Second branch is pruned by verifier since considered safe, but issue is that
record_func_key() couldn't have seen the index in line 3a and therefore
decided that emitting a direct jump at this location was okay.
Fix this by reusing our backtracking logic for precise scalar verification
in order to prevent pruning on the slot key. This means verifier will track
content of r3 all the way backwards and only prune if both scalars were
unknown in state equivalence check and therefore poisoned in the first place
in record_func_key(). The range is [x,x] in record_func_key() case since
the slot always would have to be constant immediate. Correct verification
after fix:
0: (b7) r0 = 14
1: (73) *(u8 *)(r1 +48) = r0
2: (71) r0 = *(u8 *)(r1 +48)
3: (15) if r0 == 0xd goto pc+4
R0_w=invP(id=0,umax_value=255,var_off=(0x0; 0xff)) R1=ctx(id=0,off=0,imm=0) R10=fp0
4: (b7) r3 = 0
5: (18) r2 = 0x0
7: (05) goto pc+3
11: (85) call bpf_tail_call#12
12: (b7) r0 = 1
13: (95) exit
from 3 to 8: R0_w=invP13 R1=ctx(id=0,off=0,imm=0) R10=fp0
8: (b7) r3 = 2
9: (18) r2 = 0x0
11: (85) call bpf_tail_call#12
12: (b7) r0 = 1
13: (95) exit
processed 15 insns (limit 1000000) [...]
And correct corresponding JIT dump:
# bpftool p d j i 11
0xffffffffc0dc34c4:
[...]
21: movzbq 0x30(%rdi),%rax
26: cmp $0xd,%rax
2a: je 0x000000000000003a
2c: xor %edx,%edx
2e: movabs $0xffff9928b4c02200,%rsi
38: jmp 0x0000000000000049
3a: mov $0x2,%edx
3f: movabs $0xffff9928b4c02200,%rsi
49: cmp $0x4,%rdx
4d: jae 0x0000000000000093
4f: and $0x3,%edx
52: mov %edx,%edx
54: cmp %edx,0x24(%rsi)
57: jbe 0x0000000000000093
59: mov -0x224(%rbp),%eax
5f: cmp $0x20,%eax
62: ja 0x0000000000000093
64: add $0x1,%eax
67: mov %eax,-0x224(%rbp)
6d: mov 0x110(%rsi,%rdx,8),%rax
75: test %rax,%rax
78: je 0x0000000000000093
7a: mov 0x30(%rax),%rax
7e: add $0x19,%rax
82: callq 0x000000000000008e
87: pause
89: lfence
8c: jmp 0x0000000000000087
8e: mov %rax,(%rsp)
92: retq
93: mov $0x1,%eax
[...]
Also explicitly adding explicit env->allow_ptr_leaks to fixup_bpf_calls() since
backtracking is enabled under former (direct jumps as well, but use different
test). In case of only tracking different map pointers as in c93552c443 ("bpf:
properly enforce index mask to prevent out-of-bounds speculation"), pruning
cannot make such short-cuts, neither if there are paths with scalar and non-scalar
types as r3. mark_chain_precision() is only needed after we know that
register_is_const(). If it was not the case, we already poison the key on first
path and non-const key in later paths are not matching the scalar range in regsafe()
either. Cilium NodePort testing passes fine as well now. Note, released kernels
not affected.
Fixes: d2e4c1e6c2 ("bpf: Constant map key tracking for prog array pokes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/ac43ffdeb7386c5bd688761ed266f3722bb39823.1576789878.git.daniel@iogearbox.net
proc_dointvec_minmax_bpf_restricted() has been firstly introduced
in commit 2e4a30983b ("bpf: restrict access to core bpf sysctls")
under CONFIG_HAVE_EBPF_JIT. Then, this ifdef has been removed in
ede95a63b5 ("bpf: add bpf_jit_limit knob to restrict unpriv
allocations"), because a new sysctl, bpf_jit_limit, made use of it.
Finally, this parameter has become long instead of integer with
fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
and thus, a new proc_dolongvec_minmax_bpf_restricted() has been
added.
With this last change, we got back to that
proc_dointvec_minmax_bpf_restricted() is used only under
CONFIG_HAVE_EBPF_JIT, but the corresponding ifdef has not been
brought back.
So, in configurations like CONFIG_BPF_JIT=y && CONFIG_HAVE_EBPF_JIT=n
since v4.20 we have:
CC net/core/sysctl_net_core.o
net/core/sysctl_net_core.c:292:1: warning: ‘proc_dointvec_minmax_bpf_restricted’ defined but not used [-Wunused-function]
292 | proc_dointvec_minmax_bpf_restricted(struct ctl_table *table, int write,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppress this by guarding it with CONFIG_HAVE_EBPF_JIT again.
Fixes: fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191218091821.7080-1-alobakin@dlink.ru