The VCPUOP_register_runstate_memory_area hypercall takes a virtual
address of a buffer as a parameter. The semantics of the hypercall are
such that the virtual address should always be valid.
When KPTI is enabled and we are running userspace code, the virtual
address is not valid, thus, Linux is violating the semantics of
VCPUOP_register_runstate_memory_area.
Do not call VCPUOP_register_runstate_memory_area when KPTI is enabled.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: Bertrand Marquis <Bertrand.Marquis@arm.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Link: https://lore.kernel.org/r/20200924234955.15455-1-sstabellini@kernel.org
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
After commit 9f51c05dc4 ("pvcalls-front: Avoid
get_free_pages(GFP_KERNEL) under spinlock"), the variable ret is being
initialized with '-ENOMEM' that is meaningless. So remove it.
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Link: https://lore.kernel.org/r/20200919031702.32192-1-jingxiangfeng@huawei.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
In 2019, we introduced pin_user_pages*() and now we are converting
get_user_pages*() to the new API as appropriate. [1] & [2] could
be referred for more information. This is case 5 as per document [1].
[1] Documentation/core-api/pin_user_pages.rst
[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Link: https://lore.kernel.org/r/1599375114-32360-2-git-send-email-jrdr.linux@gmail.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
There seems to be a bug in the original code when gntdev_get_page()
is called with writeable=true then the page needs to be marked dirty
before being put.
To address this, a bool writeable is added in gnt_dev_copy_batch, set
it in gntdev_grant_copy_seg() (and drop `writeable` argument to
gntdev_get_page()) and then, based on batch->writeable, use
set_page_dirty_lock().
Fixes: a4cdb556ca (xen/gntdev: add ioctl for grant copy)
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1599375114-32360-1-git-send-email-jrdr.linux@gmail.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Currently, the memory containing DT is not reserved. Thus, that region
of memory can be reallocated or reused for other purposes. This may result
in corrupted DT for nommu virt board in Qemu. We may not face any issue
in kendryte as DT is embedded in the kernel image for that.
Fixes: 6bd33e1ece ("riscv: add nommu support")
Cc: stable@vger.kernel.org
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Add support for the bt541 touchscreen IC from zinitix, loosely based on
downstream driver. The driver currently supports multitouch (5 touch points).
The bt541 seems to support touch keys, but the support was not added because
that functionality is not being utilized by the touchscreen used for testing.
Based on the similartities between downstream drivers, it seems likely that
other similar touchscreen ICs can be supported with this driver in the future.
Signed-off-by: Michael Srba <Michael.Srba@seznam.cz>
Link: https://lore.kernel.org/r/20201001122949.16846-1-michael.srba@seznam.cz
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
This patch adds dts bindings for the zinitix bt541 touchscreen.
Signed-off-by: Michael Srba <Michael.Srba@seznam.cz>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20201001122949.16846-2-michael.srba@seznam.cz
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
1. Keep the code for the normal (non-error) flow at the lowest
indentation level. And use "goto drop" for all error handling.
2. Replace code that pads short Ethernet frames with a "__skb_pad" call.
3. Change "dev_kfree_skb" to "kfree_skb" in error handling code.
"kfree_skb" is the correct function to call when dropping an skb due to
an error. "dev_kfree_skb", which is an alias of "consume_skb", is for
dropping skbs normally (not due to an error).
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Openvswitch allows to drop a packet's Ethernet header, therefore
skb_mpls_push() and skb_mpls_pop() might be called with ethernet=true
and mac_len=0. In that case the pointer passed to skb_mod_eth_type()
doesn't point to an Ethernet header and the new Ethertype is written at
unexpected locations.
Fix this by verifying that mac_len is big enough to contain an Ethernet
header.
Fixes: fa4e0f8855 ("net/sched: fix corrupted L2 header with MPLS 'push' and 'pop' actions")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clang static analysis reports this problem:
drivers/net/ethernet/marvell/mvneta.c:3465:2: warning:
Attempt to free released memory
kfree(txq->buf);
^~~~~~~~~~~~~~~
When mvneta_txq_sw_init() fails to alloc txq->tso_hdrs,
it frees without poisoning txq->buf. The error is caught
in the mvneta_setup_txqs() caller which handles the error
by cleaning up all of the txqs with a call to
mvneta_txq_sw_deinit which also frees txq->buf.
Since mvneta_txq_sw_deinit is a general cleaner, all of the
partial cleaning in mvneta_txq_sw_deinit()'s error handling
is not needed.
Fixes: 2adb719d74 ("net: mvneta: Implement software TSO")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Although we take RTNL on dump path, it is possible to
skip RTNL on insertion path. So the following race condition
is possible:
rtnl_lock() // no rtnl lock
mutex_lock(&idrinfo->lock);
// insert ERR_PTR(-EBUSY)
mutex_unlock(&idrinfo->lock);
tc_dump_action()
rtnl_unlock()
So we have to skip those temporary -EBUSY entries on dump path
too.
Reported-and-tested-by: syzbot+b47bc4f247856fb4d9e1@syzkaller.appspotmail.com
Fixes: 0fedc63fad ("net_sched: commit action insertions together")
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable "i" isn't initialized back correctly after the first loop
under the label inst_rollback gets executed.
The value of "i" is assigned to be option_count - 1, and the ensuing
loop (under alloc_rollback) begins by initializing i--.
Thus, the value of i when the loop begins execution will now become
i = option_count - 2.
Thus, when kfree(dst_opts[i]) is called in the second loop in this
order, (i.e., inst_rollback followed by alloc_rollback),
dst_optsp[option_count - 2] is the first element freed, and
dst_opts[option_count - 1] does not get freed, and thus, a memory
leak is caused.
This memory leak can be fixed, by assigning i = option_count (instead of
option_count - 1).
Fixes: 80f7c6683f ("team: add support for per-port options")
Reported-by: syzbot+69b804437cfec30deac3@syzkaller.appspotmail.com
Tested-by: syzbot+69b804437cfec30deac3@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: net-next updates.
This series starts off with the usual update of the firmware interface
spec. A new firmware status bit in the interface will be used in patch
add the infrastructure to read the firmware status very early during
driver probe and this will allow patch #4 to do the recovery if needed.
The rest of the patches add improvements to the current RX reset
logic by localizing the reset to the affected RX ring only and to
reset only if firmware has determined that the RX ring is in permanent
error state.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the driver will schedule RX ring reset when we get a buffer
error in the RX completion record. These RX buffer errors can be due
to normal out-of-buffer conditions or a permanent error in the RX
ring. Because the driver cannot distinguish between these 2
conditions, we assume all these buffer errors require reset.
This is very disruptive when it is just a normal out-of-buffer
condition. Newer firmware will now monitor the rings for the permanent
failure and will send a notification to the driver when it happens.
This allows the driver to reset only when such a notification is
received. In environments where we have predominently out-of-buffer
conditions, we now can avoid these unnecessary resets.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is logic in the RX path to detect unexpected handles in the
RX completion. We'll print a warning and schedule a reset. The
next expected handle is then set to 0xffff which is guaranteed to
not match any valid handle. This will force all remaining packets in
the ring to be discarded before the reset. There can be hundreds of
these packets remaining in the ring and there is no need to print the
warnings for these forced errors.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a per ring rx_resets counter to count these RX resets.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some older chips, it is necessary to do a reset when we get buffer
errors associated with an RX ring. These buffer errors may become
frequent if the RX ring underruns under heavy traffic. The current
code does a global reset of all reasources when this happens. This
works but creates a big disruption of all rings when one RX ring is
having problem. This patch implements a localized RX ring reset of
just the RX ring having the issue. All other rings including all
TX rings will not be affected by this single RX ring reset.
Only the older chips prior to the P5 class supports this reset.
Because it is not a global reset, packets may still be arriving
while we are calling firmware to reset that ring. We need to be
sure that we don't post any buffers during this time while the
ring is undergoing reset. After firmware completes successfully,
the ring will be in the reset state with no buffers and we can start
filling it with new buffers and posting them.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_init_one_rx_ring() includes logic to initialize the BDs for one RX
ring and to allocate the buffers. Separate the allocation logic into a
new bnxt_alloc_one_rx_ring() function. The allocation function will be
used later to allocate new buffers for one specified RX ring when we
reset that RX ring.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_free_rx_skbs() frees all the allocated buffers and SKBs for
every RX ring. Refactor this function by calling a new function
bnxt_free_one_rx_ring_skbs() to free these buffers on one specified
RX ring at a time. This is preparation work for resetting one RX
ring during run-time.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If firmware does not come out of reset, log FW health status info
to provide more information on firmware status.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NS3 SoC platforms require assistance from the OP-TEE to recover
firmware if a crash occurs while no driver is bound. The
CRASHED_NO_MASTER condition is recorded in the firmware status register
during the crash to indicate when driver intervension is needed to
coordinate a firmware reload. This condition is detected during early
driver initialization in order to effect a firmware fastboot on
supported platforms when necessary.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware now supports device independent discovery of the status
register location. This status register can provide more detailed
information about firmware errors, especially if problems occur
before the HWRM interface is functioning. Attempt to map this
register if it is present and report the firmware status on firmware
init failures.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allocator for the firmware health structure conflates allocation
and capability checks, limiting the reusability of the code. This patch
separates out the capability check and disablement and improves the
warning message to better describe the consequences of an allocation
failure.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Main changes is to extend hwrm_nvm_get_dev_info_output() for stored
firmware versions and a new flag is added to fw_status_reg.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn says:
====================
mv88e6xxx: Add per port devlink regions
This patchset extends devlink regions to support per port regions, and
them makes use of them to support the ports of the mv88e6xxx switches.
root@rap:~# devlink region show
mdio_bus/gpio-0:00/global1: size 64 snapshot []
mdio_bus/gpio-0:00/global2: size 64 snapshot []
mdio_bus/gpio-0:00/atu: size 49152 snapshot []
mdio_bus/gpio-0:00/0/port: size 64 snapshot []
mdio_bus/gpio-0:00/1/port: size 64 snapshot []
mdio_bus/gpio-0:00/2/port: size 64 snapshot []
mdio_bus/gpio-0:00/3/port: size 64 snapshot []
mdio_bus/gpio-0:00/4/port: size 64 snapshot []
mdio_bus/gpio-0:00/5/port: size 64 snapshot []
mdio_bus/gpio-0:00/6/port: size 64 snapshot []
mdio_bus/gpio-0:00/7/port: size 64 snapshot []
mdio_bus/gpio-0:00/8/port: size 64 snapshot []
mdio_bus/gpio-0:00/9/port: size 64 snapshot []
mdio_bus/gpio-0:00/10/port: size 64 snapshot []
root@rap:~# devlink region new mdio_bus/gpio-0:00/1/port snapshot 42
root@rap:~# devlink region dump mdio_bus/gpio-0:00/1/port snapshot 42
0000000000000000 4f 1e 3e 20 00 01 01 39 3f 05 00 00 fd 07 00 00
0000000000000010 80 00 01 00 00 00 00 00 00 00 00 00 00 00 00 91
0000000000000020 00 00 00 00 00 00 00 00 00 00 00 00 22 00 00 00
0000000000000030 07 3e 00 00 00 00 00 80 00 00 00 00 00 00 5b 00
In order to support all ports of the switch, a new devlink flavour has
been added for unused ports:
mdio_bus/gpio-0:00/0: type notset flavour unused splittable false
mdio_bus/gpio-0:00/1: type notset flavour cpu port 1 splittable false
mdio_bus/gpio-0:00/2: type eth netdev red flavour physical port 2 splittable fae
mdio_bus/gpio-0:00/3: type eth netdev blue flavour physical port 3 splittable fe
mdio_bus/gpio-0:00/4: type eth netdev green flavour physical port 4 splittable e
mdio_bus/gpio-0:00/5: type notset flavour unused splittable false
mdio_bus/gpio-0:00/6: type notset flavour unused splittable false
mdio_bus/gpio-0:00/7: type notset flavour unused splittable false
mdio_bus/gpio-0:00/8: type eth netdev waic0 flavour physical port 8 splittable e
mdio_bus/gpio-0:00/9: type notset flavour unused splittable false
mdio_bus/gpio-0:00/10: type notset flavour unused splittable false
The DSA core now creates the devlink port instances earlier, so that
the driver setup function can make use of them.
v3:
Whitespace cleanup
Added justification for devlink unused flavour
Added Tested-by, Reviewed-by:
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a devlink region to return the per port registers.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hide away from DSA drivers how devlink works.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow DSA drivers to make use of devlink port regions, via simple
wrappers.
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow regions to be registered to a devlink port. The same netlink API
is used, but the port index is provided to indicate when a region is a
port region as opposed to a device region.
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
DSA drivers want to create regions on devlink ports as well as the
devlink device instance, in order to export registers and other tables
per port. To keep all this code together in the drivers, have the
devlink ports registered early, so the setup() method can setup both
device and port devlink regions.
v3:
Remove dp->setup
Move common code out of switch statement.
Fix wrong goto
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a port is unused, still create a devlink port for it, but set the
flavour to unused. This allows us to attach devlink regions to the
port, etc.
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not all ports of a switch need to be used, particularly in embedded
systems. Add a port flavour for ports which physically exist in the
switch, but are not connected to the front panel etc, and so are
unused. By having unused ports present in devlink, it gives a more
accurate representation of the hardware. It also allows regions to be
associated to such ports, so allowing, for example, to determine
unused ports are correctly powered off, or to compare probable reset
defaults of unused ports to used ports experiences issues.
Actually registering unused ports and setting the flavour to unused is
optional. The DSA core will register all such switch ports, but such
ports are expected to be limited in number. Bigger ASICs may decide
not to list unused ports.
v2:
Expand the description about why it is useful
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Rename 'searched' column to 'clashres' in conntrack /proc/ stats
to amend a recent patch, from Florian Westphal.
2) Remove unused nft_data_debug(), from YueHaibing.
3) Remove unused definitions in IPVS, also from YueHaibing.
4) Fix user data memleak in tables and objects, this is also amending
a recent patch, from Jose M. Guisado.
5) Use nla_memdup() to allocate user data in table and objects, also
from Jose M. Guisado
6) User data support for chains, from Jose M. Guisado
7) Remove unused definition in nf_tables_offload, from YueHaibing.
8) Use kvzalloc() in ip_set_alloc(), from Vasily Averin.
9) Fix false positive reported by lockdep in nfnetlink mutexes,
from Florian Westphal.
10) Extend fast variant of cmp for neq operation, from Phil Sutter.
11) Implement fast bitwise variant, also from Phil Sutter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Update Git URL to
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
A typical use of bitwise expression is to mask out parts of an IP
address when matching on the network part only. Optimize for this common
use with a fast variant for NFT_BITWISE_BOOL-type expressions operating
on 32bit-sized values.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add a boolean indicating NFT_CMP_NEQ. To include it into the match
decision, it is sufficient to XOR it with the data comparison's result.
While being at it, store the mask that is calculated during expression
init and free the eval routine from having to recalculate it each time.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
From time to time there are lockdep reports similar to this one:
WARNING: possible circular locking dependency detected
------------------------------------------------------
000000004f61aa56 (&table[i].mutex){+.+.}, at: nfnl_lock [nfnetlink]
but task is already holding lock:
[..] (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid [nf_tables]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&net->nft.commit_mutex){+.+.}:
[..]
nf_tables_valid_genid+0x18/0x60 [nf_tables]
nfnetlink_rcv_batch+0x24c/0x620 [nfnetlink]
nfnetlink_rcv+0x110/0x140 [nfnetlink]
netlink_unicast+0x12c/0x1e0
[..]
sys_sendmsg+0x18/0x40
linux_sparc_syscall+0x34/0x44
-> #0 (&table[i].mutex){+.+.}:
[..]
nfnl_lock+0x24/0x40 [nfnetlink]
ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set]
set_match_v1_checkentry+0x14/0xc0 [xt_set]
xt_check_match+0x238/0x260 [x_tables]
__nft_match_init+0x160/0x180 [nft_compat]
[..]
sys_sendmsg+0x18/0x40
linux_sparc_syscall+0x34/0x44
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&net->nft.commit_mutex);
lock(&table[i].mutex);
lock(&net->nft.commit_mutex);
lock(&table[i].mutex);
Lockdep considers this an ABBA deadlock because the different nfnl subsys
mutexes reside in the same lockdep class, but this is a false positive.
CPU1 table[i] refers to the nftables subsys mutex, whereas CPU1 locks
the ipset subsys mutex.
Yi Che reported a similar lockdep splat, this time between ipset and
ctnetlink subsys mutexes.
Time to place them in distinct classes to avoid these warnings.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently netadmin inside non-trusted container can quickly allocate
whole node's memory via request of huge ipset hashtable.
Other ipset-related memory allocations should be restricted too.
v2: fixed typo ALLOC -> ACCOUNT
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stop providing the possibility to override the address space using
set_fs() now that there is no need for that any more.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Implement the non-faulting kernel access helpers directly instead of
abusing the uaccess routines under set_fs(KERNEL_DS).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Add new __get_user_nocheck and __put_user_nocheck that switch on the size
and call the actual inline assembly helpers, and move the uaccess enable
/ disable into the actual __get_user and __put_user. This prepares for
natively implementing __get_kernel_nofault and __put_kernel_nofault.
Also don't bother with the deprecated register keyword for the error
return.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This reverts commit adccfb1a80.
Now that the generic uaccess by mempcy code handles unaligned addresses
the generic code can be used for all RISC-V CPUs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Put all the set_fs related code under CONFIG_SET_FS so that
asm-generic/uaccess.h can be used for set_fs-less builds.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Add native implementations of __{get,put}_kernel_nofault using
{get,put}_unaligned, just like the {get,put}_user implementations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Instead of reusing raw_{copy,to}_from_user implement separate handlers
using {get,put}_unaligned. This ensures unaligned access is handled
correctly, and avoid the need for the small constant size optimization
in raw_{copy,to}_from_user.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Define TASK_SIZE_MAX as TASK_SIZE if not otherwise defined.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This is a dependency for Christoph's removal of set_fs.
* 'base.set_fs' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
powerpc: remove address space overrides using set_fs()
powerpc: use non-set_fs based maccess routines
x86: remove address space overrides using set_fs()
x86: make TASK_SIZE_MAX usable from assembly code
x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h
lkdtm: remove set_fs-based tests
test_bitmap: remove user bitmap tests
uaccess: add infrastructure for kernel builds with set_fs()
fs: don't allow splice read/write without explicit ops
fs: don't allow kernel reads and writes without iter ops
sysctl: Convert to iter interfaces
proc: add a read_iter method to proc proc_ops
proc: cleanup the compat vs no compat file ops
proc: remove a level of indentation in proc_get_inode
I was too quick moving zoran.rst... it ends that the original
patch didn't do the right thing and forgot to update the files
that references it.
Fix it.
Fixes: 6b90346919 ("media: zoran: move documentation file to the right place")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>