Commit Graph

1014064 Commits

Author SHA1 Message Date
Jason Gunthorpe
adc9d1f6f5 vfio/mdpy: Use mdev_get_type_group_id()
The mdpy_types array is parallel to the supported_type_groups array, so
the type_group_id indexes both. Instead of doing string searching just
directly index with type_group_id in all places.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <13-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:19 -06:00
Jason Gunthorpe
c594b26ff7 vfio/mtty: Use mdev_get_type_group_id()
The type_group_id directly gives the single or dual port index, no
need for string searching.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <12-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:19 -06:00
Jason Gunthorpe
15fcc44be0 vfio/mdev: Add mdev/mtype_get_type_group_id()
This returns the index in the supported_type_groups array that is
associated with the mdev_type attached to the struct mdev_device or its
containing struct kobject.

Each mdev_device can be spawned from exactly one mdev_type, which in turn
originates from exactly one supported_type_group.

Drivers are using weird string calculations to try and get back to this
index, providing a direct access to the index removes a bunch of wonky
driver code.

mdev_type->group can be deleted as the group is obtained using the
type_group_id.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <11-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:18 -06:00
Jason Gunthorpe
fbea432390 vfio/mdev: Remove duplicate storage of parent in mdev_device
mdev_device->type->parent is the same thing.

The struct mdev_device was relying on the kref on the mdev_parent to also
indirectly hold a kref on the mdev_type pointer. Now that the type holds a
kref on the parent we can directly kref the mdev_type and remove this
implicit relationship.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <10-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:18 -06:00
Jason Gunthorpe
18d731242d vfio/mdev: Add missing error handling to dev_set_name()
This can fail, and seems to be a popular target for syzkaller error
injection. Check the error return and unwind with put_device().

Fixes: 7b96953bc6 ("vfio: Mediated device Core driver")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <9-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:18 -06:00
Jason Gunthorpe
fbd0e2b0c3 vfio/mdev: Reorganize mdev_device_create()
Once the memory for the struct mdev_device is allocated it should
immediately be device_initialize()'d and filled in so that put_device()
can always be used to undo the allocation.

Place the mdev_get/put_parent() so that they are clearly protecting the
mdev->parent pointer. Move the final put to the release function so that
the lifetime rules are trivial to understand. Update the goto labels to
follow the normal convention.

Remove mdev_device_free() as the release function via device_put() is now
usable in all cases.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <8-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:18 -06:00
Jason Gunthorpe
9a302449a5 vfio/mdev: Add missing reference counting to mdev_type
struct mdev_type holds a pointer to the kref'd object struct mdev_parent,
but doesn't hold the kref. The lifetime of the parent becomes implicit
because parent_remove_sysfs_files() is supposed to remove all the access
before the parent can be freed, but this is very hard to reason about.

Make it obviously correct by adding the missing get.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <7-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:17 -06:00
Jason Gunthorpe
a9f8111d0b vfio/mdev: Expose mdev_get/put_parent to mdev_private.h
The next patch will use these in mdev_sysfs.c

While here remove the now dead code checks for NULL, a mdev_type can never
have a NULL parent.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <6-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:17 -06:00
Jason Gunthorpe
417fd5bf24 vfio/mdev: Use struct mdev_type in struct mdev_device
The kobj pointer in mdev_device is actually pointing at a struct
mdev_type. Use the proper type so things are understandable.

There are a number of places that are confused and passing both the mdev
and the mtype as function arguments, fix these to derive the mtype
directly from the mdev to remove the redundancy.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <5-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:17 -06:00
Jason Gunthorpe
91b9969d9c vfio/mdev: Simplify driver registration
This is only done once, we don't need to generate code to initialize a
structure stored in the ELF .data segment. Fill in the three required
.driver members directly instead of copying data into them during
mdev_register_driver().

Further the to_mdev_driver() function doesn't belong in a public header,
just inline it into the two places that need it. Finally, we can now
clearly see that 'drv' derived from dev->driver cannot be NULL, firstly
because the driver core forbids it, and secondly because NULL won't pass
through the container_of(). Remove the dead code.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <4-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:16 -06:00
Jason Gunthorpe
2a3d15f270 vfio/mdev: Add missing typesafety around mdev_device
The mdev API should accept and pass a 'struct mdev_device *' in all
places, not pass a 'struct device *' and cast it internally with
to_mdev_device(). Particularly in its struct mdev_driver functions, the
whole point of a bus's struct device_driver wrapper is to provide type
safety compared to the default struct device_driver.

Further, the driver core standard is for bus drivers to expose their
device structure in their public headers that can be used with
container_of() inlines and '&foo->dev' to go between the class levels, and
'&foo->dev' to be used with dev_err/etc driver core helper functions. Move
'struct mdev_device' to mdev.h

Once done this allows moving some one instruction exported functions to
static inlines, which in turns allows removing one of the two grotesque
symbol_get()'s related to mdev in the core code.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Message-Id: <3-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:16 -06:00
Jason Gunthorpe
b5a1f8921d vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
There is a small race where the parent is NULL even though the kobj has
already been made visible in sysfs.

For instance the attribute_group is made visible in sysfs_create_files()
and the mdev_type_attr_show() does:

    ret = attr->show(kobj, type->parent->dev, buf);

Which will crash on NULL parent. Move the parent setup to before the type
pointer leaves the stack frame.

Fixes: 7b96953bc6 ("vfio: Mediated device Core driver")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <2-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:16 -06:00
Jason Gunthorpe
6cbf507fd0 vfio/mdev: Fix missing static's on MDEV_TYPE_ATTR's
These should always be prefixed with static, otherwise compilation
will fail on non-modular builds with

ld: samples/vfio-mdev/mbochs.o:(.data+0x2e0): multiple definition of `mdev_type_attr_name'; samples/vfio-mdev/mdpy.o:(.data+0x240): first defined here

Fixes: a5e6e6505f ("sample: vfio bochs vbe display (host device for bochs-drm)")
Fixes: d61fc96f47 ("sample: vfio mdev display - host device")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Message-Id: <1-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-07 15:39:15 -06:00
David S. Miller
f86c70ed04 mlx5-updates-2021-04-06
Introduce TC sample offload
 
 Background
 ----------
 
 The tc sample action allows user to sample traffic matched by tc
 classifier. The sampling consists of choosing packets randomly and
 sampling them using psample module.
 
 The tc sample parameters include group id, sampling rate and packet's
 truncation (to save kernel-user traffic).
 
 Sample in TC SW
 ---------------
 
 User must specify rate and group id for sample action, truncate is
 optional.
 
 tc filter add dev enp4s0f0_0 ingress protocol ip prio 1 flower	\
 	src_mac 02:25:d0:14:01:02 dst_mac 02:25:d0:14:01:03	\
 	action sample rate 10 group 5 trunc 60			\
 	action mirred egress redirect dev enp4s0f0_1
 
 The tc sample action kernel module 'act_sample' will call another
 kernel module 'psample' to send sampled packets to userspace.
 
 MLX5 sample HW offload - MLX5 driver patches
 --------------------------------------------
 
 The sample action is translated to a goto flow table object
 destination which samples packets according to the provided
 sample ratio. Sampled packets are duplicated. One copy is
 processed by a termination table, named the sample table,
 which sends the packet to the eswitch manager port (that will
 be processed by software).
 
 The second copy is processed by the default table which executes
 the subsequent actions. The default table is created per <vport,
 chain, prio> tuple as rules with different prios and chains may
 overlap.
 
 For example, for the following typical flow table:
 
 +-------------------------------+
 +       original flow table     +
 +-------------------------------+
 +         original match        +
 +-------------------------------+
 + sample action + other actions +
 +-------------------------------+
 
 We translate the tc filter with sample action to the following HW model:
 
         +---------------------+
         + original flow table +
         +---------------------+
         +   original match    +
         +---------------------+
                    |
                    v
 +------------------------------------------------+
 +                Flow Sampler Object             +
 +------------------------------------------------+
 +                    sample ratio                +
 +------------------------------------------------+
 +    sample table id    |    default table id    +
 +------------------------------------------------+
            |                            |
            v                            v
 +-----------------------------+  +----------------------------------------+
 +        sample table         +  + default table per <vport, chain, prio> +
 +-----------------------------+  +----------------------------------------+
 + forward to management vport +  +            original match              +
 +-----------------------------+  +----------------------------------------+
                                  +            other actions               +
                                  +----------------------------------------+
 
 Flow sampler object
 -------------------
 
 Hardware introduces flow sampler object to do sample. It is a new
 destination type. Driver needs to specify two flow table ids in it.
 One is sample table id. The other one is the default table id.
 Sample table samples the packets according to the sample rate and
 forward the sampled packets to eswitch manager port. Default table
 finishes the subsequent actions.
 
 Group id and reg_c0
 -------------------
 
 Userspace program will take different actions for sampled packets
 according to tc sample action group id. So hardware must pass group
 id to software for each sampled packets. In Paul Blakey's "Introduce
 connection tracking offload" patch set, reg_c0 lower 16 bits are used
 for miss packet chain id restore. We convert reg_c0 lower 16 bits to
 a common object pool, so other features can also use it.
 
 Since sample group id is 32 bits, create a 16 bits object id to map
 the group id and write the object id to reg_c0 lower 16 bits. reg_c0
 can only be used for matching. Write reg_c0 to flow_tag, so software
 can get the object id via flow_tag and find group id via the common
 object pool.
 
 Sampler restore handle
 ----------------------
 
 Use common object pool to create an object id to map sample parameters.
 Allocate a modify header action to write the object id to reg_c0 lower
 16 bits. Create a restore rule to pass the object id to software. So
 software can identify sampled packets via the object id and send it to
 userspace.
 
 Aggregate the modify header action, restore rule and object id to a
 sample restore handle. Re-use identical sample restore handle for
 the same object id.
 
 Send sampled packets to userspace
 ---------------------------------
 
 The destination for sampled packets is eswitch manager port, so
 representors can receive sampled packets together with the group id.
 Driver will send sampled packets and group id to userspace via psample.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmBtNrUACgkQSD+KveBX
 +j6cRQf/ZARhVPDEgCvFd+wD+n2VCM11FJCpIumGecfqpA9DB/7i0iQrBWG2cGy6
 Go3XZ7HCPy0bAeDnVMBulF5RshfQkB/CNJfCTrw0QkNvenO/eYPZrl0XAGwL7w8W
 9vkeK51VG70bj7VEMeWVovL0X2VoGea0MD0ASLgOG3qZmCjFX0Aw3yY4WNZAA1fn
 i9rSP0AgTXqbR+nUezqP9xDHCyEf4etqpdPO/gosFvasZxTa9Xm6tXxT8YrcjAEH
 MjIYJVS5SERem/gxqrRi5p0u1RNrbZ3vPMmZQIr6x2eBXLwMhvjvcxKqZ2l9PvD5
 +O+Hf43GAmhAoqZukvU8H8oMWArciA==
 =MkzD
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-06

Introduce TC sample offload

Background
----------

The tc sample action allows user to sample traffic matched by tc
classifier. The sampling consists of choosing packets randomly and
sampling them using psample module.

The tc sample parameters include group id, sampling rate and packet's
truncation (to save kernel-user traffic).

Sample in TC SW
---------------

User must specify rate and group id for sample action, truncate is
optional.

tc filter add dev enp4s0f0_0 ingress protocol ip prio 1 flower	\
	src_mac 02:25:d0:14:01:02 dst_mac 02:25:d0:14:01:03	\
	action sample rate 10 group 5 trunc 60			\
	action mirred egress redirect dev enp4s0f0_1

The tc sample action kernel module 'act_sample' will call another
kernel module 'psample' to send sampled packets to userspace.

MLX5 sample HW offload - MLX5 driver patches
--------------------------------------------

The sample action is translated to a goto flow table object
destination which samples packets according to the provided
sample ratio. Sampled packets are duplicated. One copy is
processed by a termination table, named the sample table,
which sends the packet to the eswitch manager port (that will
be processed by software).

The second copy is processed by the default table which executes
the subsequent actions. The default table is created per <vport,
chain, prio> tuple as rules with different prios and chains may
overlap.

For example, for the following typical flow table:

+-------------------------------+
+       original flow table     +
+-------------------------------+
+         original match        +
+-------------------------------+
+ sample action + other actions +
+-------------------------------+

We translate the tc filter with sample action to the following HW model:

        +---------------------+
        + original flow table +
        +---------------------+
        +   original match    +
        +---------------------+
                   |
                   v
+------------------------------------------------+
+                Flow Sampler Object             +
+------------------------------------------------+
+                    sample ratio                +
+------------------------------------------------+
+    sample table id    |    default table id    +
+------------------------------------------------+
           |                            |
           v                            v
+-----------------------------+  +----------------------------------------+
+        sample table         +  + default table per <vport, chain, prio> +
+-----------------------------+  +----------------------------------------+
+ forward to management vport +  +            original match              +
+-----------------------------+  +----------------------------------------+
                                 +            other actions               +
                                 +----------------------------------------+

Flow sampler object
-------------------

Hardware introduces flow sampler object to do sample. It is a new
destination type. Driver needs to specify two flow table ids in it.
One is sample table id. The other one is the default table id.
Sample table samples the packets according to the sample rate and
forward the sampled packets to eswitch manager port. Default table
finishes the subsequent actions.

Group id and reg_c0
-------------------

Userspace program will take different actions for sampled packets
according to tc sample action group id. So hardware must pass group
id to software for each sampled packets. In Paul Blakey's "Introduce
connection tracking offload" patch set, reg_c0 lower 16 bits are used
for miss packet chain id restore. We convert reg_c0 lower 16 bits to
a common object pool, so other features can also use it.

Since sample group id is 32 bits, create a 16 bits object id to map
the group id and write the object id to reg_c0 lower 16 bits. reg_c0
can only be used for matching. Write reg_c0 to flow_tag, so software
can get the object id via flow_tag and find group id via the common
object pool.

Sampler restore handle
----------------------

Use common object pool to create an object id to map sample parameters.
Allocate a modify header action to write the object id to reg_c0 lower
16 bits. Create a restore rule to pass the object id to software. So
software can identify sampled packets via the object id and send it to
userspace.

Aggregate the modify header action, restore rule and object id to a
sample restore handle. Re-use identical sample restore handle for
the same object id.

Send sampled packets to userspace
---------------------------------

The destination for sampled packets is eswitch manager port, so
representors can receive sampled packets together with the group id.
Driver will send sampled packets and group id to userspace via psample.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07 14:38:24 -07:00
Darrick J. Wong
7d88329e5b xfs: move the check for post-EOF mappings into xfs_can_free_eofblocks
Fix the weird split of responsibilities between xfs_can_free_eofblocks
and xfs_free_eofblocks by moving the chunk of code that looks for any
actual post-EOF space mappings from the second function into the first.

This clears the way for deferred inode inactivation to be able to decide
if an inode needs inactivation work before committing the released inode
to the inactivation code paths (vs. marking it for reclaim).

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-07 14:38:21 -07:00
Darrick J. Wong
2b156ff8c8 xfs: move the xfs_can_free_eofblocks call under the IOLOCK
In xfs_inode_free_eofblocks, move the xfs_can_free_eofblocks call
further down in the function to the point where we have taken the
IOLOCK.  This is preparation for the next patch, where we will need that
lock (or equivalent) so that we can check if there are any post-eof
blocks to clean out.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-07 14:38:16 -07:00
Dave Chinner
b2941046ea xfs: precalculate default inode attribute offset
Default attr fork offset is based on inode size, so is a fixed
geometry parameter of the inode. Move it to the xfs_ino_geometry
structure and stop calculating it on every call to
xfs_default_attroffset().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2021-04-07 14:37:07 -07:00
Dave Chinner
683ec9ba88 xfs: default attr fork size does not handle device inodes
Device inodes have a non-default data fork size of 8 bytes
as checked/enforced by xfs_repair. xfs_default_attroffset() doesn't
handle this, so lets do a minor refactor so it does.

Fixes: e6a688c332 ("xfs: initialise attr fork on inode create")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2021-04-07 14:37:07 -07:00
Dave Chinner
8de1cb0038 xfs: inode fork allocation depends on XFS_IFEXTENT flag
Due to confusion on when the XFS_IFEXTENT needs to be set, the
changes in e6a688c332 ("xfs: initialise attr fork on inode
create") failed to set the flag when initialising the empty
attribute fork at inode creation. Set this flag the same way
xfs_bmap_add_attrfork() does after attry fork allocation.

Fixes: e6a688c332 ("xfs: initialise attr fork on inode create")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2021-04-07 14:37:06 -07:00
Dave Chinner
2442ee15bb xfs: eager inode attr fork init needs attr feature awareness
The pitfalls of regression testing on a machine without realising
that selinux was disabled. Only set the attr fork during inode
allocation if the attr feature bits are already set on the
superblock.

Fixes: e6a688c332 ("xfs: initialise attr fork on inode create")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2021-04-07 14:37:06 -07:00
Chandan Babu R
ae7bae68ea xfs: scrub: Disable check for unoptimized data fork bmbt node
xchk_btree_check_minrecs() checks if the contents of the immediate child of a
bmbt root block can fit within the root block. This check could fail on inodes
with an attr fork since xfs_bmap_add_attrfork_btree() used to demote the
current root node of the data fork as the child of a newly allocated root node
if it found that the size of "struct xfs_btree_block" along with the space
required for records exceeded that of space available in the data fork.

xfs_bmap_add_attrfork_btree() should have used "struct xfs_bmdr_block" instead
of "struct xfs_btree_block" for the above mentioned space requirement
calculation. This commit disables the check for unoptimized (in terms of
disk space usage) data fork bmbt trees since there could be filesystems
in use that already have such a layout.

Suggested-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-07 14:37:06 -07:00
Chandan Babu R
b6785e279d xfs: Use struct xfs_bmdr_block instead of struct xfs_btree_block to calculate root node size
The incore data fork of an inode stores the bmap btree root node as 'struct
xfs_btree_block'. However, the ondisk version of the inode stores the bmap
btree root node as a 'struct xfs_bmdr_block'.

xfs_bmap_add_attrfork_btree() checks if the btree root node fits inside the
data fork of the inode. However, it incorrectly uses 'struct xfs_btree_block'
to compute the size of the bmap btree root node. Since size of 'struct
xfs_btree_block' is larger than that of 'struct xfs_bmdr_block',
xfs_bmap_add_attrfork_btree() could end up unnecessarily demoting the current
root node as the child of newly allocated root node.

This commit optimizes space usage by modifying xfs_bmap_add_attrfork_btree()
to use 'struct xfs_bmdr_block' to check if the bmap btree root node fits
inside the data fork of the inode.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:06 -07:00
Anthony Iliopoulos
fcb62c2803 xfs: deprecate BMV_IF_NO_DMAPI_READ flag
Use of the flag has had no effect since kernel commit 288699feca
("xfs: drop dmapi hooks"), which removed all dmapi related code, so
deprecate it.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:06 -07:00
Christoph Hellwig
4422501da6 xfs: merge _xfs_dic2xflags into xfs_ip2xflags
Merge _xfs_dic2xflags into its only caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:06 -07:00
Christoph Hellwig
e98d5e882b xfs: move the di_crtime field to struct xfs_inode
Move the crtime field from struct xfs_icdinode into stuct xfs_inode and
remove the now entirely unused struct xfs_icdinode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Christoph Hellwig
3e09ab8fdc xfs: move the di_flags2 field to struct xfs_inode
In preparation of removing the historic icinode struct, move the flags2
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Christoph Hellwig
db07349da2 xfs: move the di_flags field to struct xfs_inode
In preparation of removing the historic icinode struct, move the flags
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Christoph Hellwig
7821ea302d xfs: move the di_forkoff field to struct xfs_inode
In preparation of removing the historic icinode struct, move the
forkoff field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Christoph Hellwig
ee7b83fd36 xfs: use a union for i_cowextsize and i_flushiter
The i_cowextsize field is only used for v3 inodes, and the i_flushiter
field is only used for v1/v2 inodes.  Use a union to pack the inode a
littler better after adding a few missing guards around their usage.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Christoph Hellwig
b231b1221b xfs: use XFS_B_TO_FSB in xfs_ioctl_setattr
Clean up xfs_ioctl_setattr a bit by using XFS_B_TO_FSB.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:04 -07:00
Christoph Hellwig
4800887b45 xfs: cleanup xfs_fill_fsxattr
Add a local xfs_mount variable, and use the XFS_FSB_TO_B helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:04 -07:00
Christoph Hellwig
965e0a1ad2 xfs: move the di_flushiter field to struct xfs_inode
In preparation of removing the historic icinode struct, move the
flushiter field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:04 -07:00
Christoph Hellwig
b33ce57d3e xfs: move the di_cowextsize field to struct xfs_inode
In preparation of removing the historic icinode struct, move the
cowextsize field into the containing xfs_inode structure.  Also
switch to use the xfs_extlen_t instead of a uint32_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:04 -07:00
Christoph Hellwig
031474c28a xfs: move the di_extsize field to struct xfs_inode
In preparation of removing the historic icinode struct, move the extsize
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:04 -07:00
Christoph Hellwig
6e73a545f9 xfs: move the di_nblocks field to struct xfs_inode
In preparation of removing the historic icinode struct, move the nblocks
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:03 -07:00
Christoph Hellwig
13d2c10b05 xfs: move the di_size field to struct xfs_inode
In preparation of removing the historic icinode struct, move the on-disk
size field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:03 -07:00
Christoph Hellwig
ceaf603c70 xfs: move the di_projid field to struct xfs_inode
In preparation of removing the historic icinode struct, move the projid
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:03 -07:00
Christoph Hellwig
7e2a8af528 xfs: don't clear the "dinode core" in xfs_inode_alloc
The xfs_icdinode structure just contains a random mix of inode field,
which are all read from the on-disk inode and mostly not looked at
before reading the inode or initializing a new inode cluster.  The
only exceptions are the forkoff and blocks field, which are used
in sanity checks for freshly allocated inodes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:03 -07:00
Christoph Hellwig
9b3beb028f xfs: remove the di_dmevmask and di_dmstate fields from struct xfs_icdinode
The legacy DMAPI fields were never set by upstream Linux XFS, and have no
way to be read using the kernel APIs.  So instead of bloating the in-core
inode for them just copy them from the on-disk inode into the log when
logging the inode.  The only caveat is that we need to make sure to zero
the fields for newly read or deleted inodes, which is solved using a new
flag in the inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:03 -07:00
Christoph Hellwig
55f773380e xfs: remove the unused xfs_icdinode_has_bigtime helper
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:02 -07:00
Christoph Hellwig
582a73440b xfs: handle crtime more carefully in xfs_bulkstat_one_int
The crtime only exists for v5 inodes, so only copy it for those.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:02 -07:00
Christoph Hellwig
4cb6f2e8c2 xfs: consistently initialize di_flags2
Make sure di_flags2 is always initialized.  We currently get this implicitly
by clearing the dinode core on allocating the in-core inode, but that is
about to go away.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:02 -07:00
Christoph Hellwig
af9dcddef6 xfs: split xfs_imap_to_bp
Split looking up the dinode from xfs_imap_to_bp, which can be
significantly simplified as a result.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:02 -07:00
Chandan Babu R
e773f88029 xfs: scrub: Remove incorrect check executed on block format directories
A directory with one directory block which in turns consists of two or more fs
blocks is incorrectly flagged as corrupt by scrub since it assumes that
"Block" format directories have a data fork single extent spanning the file
offset range of [0, Dir block size - 1].

This commit fixes the bug by removing the incorrect check.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:36:34 -07:00
Chandan Babu R
6e8bd39d72 xfs: Initialize xfs_alloc_arg->total correctly when allocating minlen extents
xfs/538 can cause the following call trace to be printed when executing on a
multi-block directory configuration,

 WARNING: CPU: 1 PID: 2578 at fs/xfs/libxfs/xfs_bmap.c:717 xfs_bmap_extents_to_btree+0x520/0x5d0
 Call Trace:
  ? xfs_buf_rele+0x4f/0x450
  xfs_bmap_add_extent_hole_real+0x747/0x960
  xfs_bmapi_allocate+0x39a/0x440
  xfs_bmapi_write+0x507/0x9e0
  xfs_da_grow_inode_int+0x1cd/0x330
  ? up+0x12/0x60
  xfs_dir2_grow_inode+0x62/0x110
  ? xfs_trans_log_inode+0x234/0x2d0
  xfs_dir2_sf_to_block+0x103/0x940
  ? xfs_dir2_sf_check+0x8c/0x210
  ? xfs_da_compname+0x19/0x30
  ? xfs_dir2_sf_lookup+0xd0/0x3d0
  xfs_dir2_sf_addname+0x10d/0x910
  xfs_dir_createname+0x1ad/0x210
  xfs_create+0x404/0x620
  xfs_generic_create+0x24c/0x320
  path_openat+0xda6/0x1030
  do_filp_open+0x88/0x130
  ? kmem_cache_alloc+0x50/0x210
  ? __cond_resched+0x16/0x40
  ? kmem_cache_alloc+0x50/0x210
  do_sys_openat2+0x97/0x150
  __x64_sys_creat+0x49/0x70
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

This occurs because xfs_bmap_exact_minlen_extent_alloc() initializes
xfs_alloc_arg->total to xfs_bmalloca->minlen. In the context of
xfs_bmap_exact_minlen_extent_alloc(), xfs_bmalloca->minlen has a value of 1
and hence the space allocator could choose an AG which has less than
xfs_bmalloca->total number of free blocks available. As the transaction
proceeds, one of the future space allocation requests could fail due to
non-availability of free blocks in the AG that was originally chosen.

This commit fixes the bug by assigning xfs_alloc_arg->total to the value of
xfs_bmalloca->total.

Fixes: 3015196746 ("xfs: Introduce error injection to allocate only minlen size extents for files")
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-07 14:36:34 -07:00
Chandan Babu R
5147ef30f2 xfs: Fix dax inode extent calculation when direct write is performed on an unwritten extent
With dax enabled filesystems, a direct write operation into an existing
unwritten extent results in xfs_iomap_write_direct() zero-ing and converting
the extent into a normal extent before the actual data is copied from the
userspace buffer.

The inode extent count can increase by 2 if the extent range being written to
maps to the middle of the existing unwritten extent range. Hence this commit
uses XFS_IEXT_WRITE_UNWRITTEN_CNT as the extent count delta when such a write
operation is being performed.

Fixes: 727e1acd29 ("xfs: Check for extent overflow when trivally adding a new extent")
Reported-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-07 14:36:33 -07:00
David S. Miller
bb58023bee mlx5-fixes-2021-04-06
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmBtL1QACgkQSD+KveBX
 +j4Htgf+Ke0N3Nc47SJAZehoiKmTPQ+yJH/eyaMzLf7aikw5glzWkhT/5hu14/Pj
 lWS5dAiegyka2cupo4PLliuYu60xMs4/a39Plsbe8XHBbmPBKxjn1V2zK5iGyyAB
 GRe3MbBcUsELW8fiWzYEG4J4/RGJbJstc2CfXld9CM7IX3gC/Q8LsQno1LN2GuNF
 gOyOoaJbTU19F6tW5JGc5fwOERifrIq9FY+AsUyRGUq21z6OEV9TwsyaJHoXMyXP
 RwDz7st801yLZ8jA7CgI6HMREfyWGCUGIDpxdtG4L3jJhdnaZdesevEculIgrdAQ
 7z3WcVYno2AA3UT2rBpX1/NsdnSN/A==
 =jniG
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2021-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-04-06

This series provides some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07 14:34:03 -07:00
wengjianfeng
872fff333f nfc/fdp: remove unnecessary assignment and label
In function fdp_nci_patch_otp and fdp_nci_patch_ram,many goto
out statements are used, and out label just return variable r.
in some places,just jump to the out label, and in other places,
assign a value to the variable r,then jump to the out label.
It is unnecessary, we just use return sentences to replace goto
sentences and delete out label.

Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07 14:32:31 -07:00
Qingqing Zhuo
df7232c4c6 drm/amd/display: Add missing mask for DCN3
[Why]
DCN3 is not reusing DCN1 mask_sh_list, causing
SURFACE_FLIP_INT_MASK missing in the mapping.

[How]
Add the corresponding entry to DCN3 list.

Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-07 17:30:39 -04:00
Zheng Yongjun
a79ace4b31 net: tipc: Fix spelling errors in net/tipc module
These patches fix a series of spelling errors in net/tipc module.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07 14:29:29 -07:00