Currently if a special PTE is encountered hmm_range_fault() immediately
returns EFAULT and sets the HMM_PFN_SPECIAL error output (which nothing
uses).
EFAULT should only be returned after testing with hmm_pte_need_fault().
Also pte_devmap() and pte_special() are exclusive, and there is no need to
check IS_ENABLED because pte_special() is stubbed out to return false on
unsupported architectures.
Fixes: 992de9a8b7 ("mm/hmm: allow to mirror vma of a file on a DAX backed filesystem")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
hmm_range_fault() should never return 0 if the caller requested a valid
page, but the pfns output for that page would be HMM_PFN_ERROR.
hmm_pte_need_fault() must always be called before setting HMM_PFN_ERROR to
detect if the page is in faulting mode or not.
Fix two cases in hmm_vma_walk_pmd() and reorganize some of the duplicated
code.
Fixes: d08faca018 ("mm/hmm: properly handle migration pmd")
Fixes: da4c3c735e ("mm/hmm/mirror: helper to snapshot CPU page table")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The intention with this code is to determine if the caller required the
pages to be valid, and if so, then take some action to make them valid.
The action varies depending on the page type.
In all cases, if the caller doesn't ask for the page, then
hmm_range_fault() should not return an error.
Revise the implementation to be clearer, and fix some bugs:
- hmm_pte_need_fault() must always be called before testing fault or
write_fault, otherwise the defaults of false apply and the if()'s don't
work. This was missed on the is_migration_entry() branch.
- -EFAULT should not be returned unless hmm_pte_need_fault() indicates
a fault is required, i.e. snapshotting should not fail.
- For !pte_present() the cpu_flags are always 0, except in the special
case of is_device_private_entry(), so calling pte_to_hmm_pfn_flags() is
confusing.
Reorganize the flow so that it always follows the pattern of calling
hmm_pte_need_fault() and then checking fault || write_fault.
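The pattern can be modelled in a few lines of userspace C. This is only a
sketch under stated assumptions: the REQ_*, CPU_* and PFN_* constants,
need_fault() and handle_unfaultable_entry() are hypothetical stand-ins and do
not reproduce the real hmm_pte_need_fault() signature or the kernel's flag
encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request flags a caller may set for one page. */
#define REQ_FAULT 0x1u /* caller needs the page to be valid */
#define REQ_WRITE 0x2u /* caller needs the page to be writable */

/* Hypothetical CPU-side state of the entry being examined. */
#define CPU_VALID 0x1u
#define CPU_WRITE 0x2u

/* Hypothetical output values for the pfns array. */
#define PFN_NONE  0u
#define PFN_ERROR 0x80000000u

/* Model of the "need fault" test: both outputs start out false. */
static void need_fault(unsigned int req, unsigned int cpu,
		       bool *fault, bool *write_fault)
{
	*fault = *write_fault = false;
	if (!(req & REQ_FAULT))
		return; /* snapshot only: never ask for a fault */
	if (!(cpu & CPU_VALID))
		*fault = true;
	if ((req & REQ_WRITE) && !(cpu & CPU_WRITE))
		*write_fault = true;
}

/*
 * Handle an entry that cannot be made valid (for example a special PTE).
 * -EFAULT is returned only when the caller required the page; otherwise
 * the output slot is marked as an error and the walk succeeds.
 */
static int handle_unfaultable_entry(unsigned int req, unsigned int *pfn_out)
{
	bool fault, write_fault;

	assert(pfn_out != NULL);
	need_fault(req, 0, &fault, &write_fault);
	if (fault || write_fault)
		return -EFAULT;
	*pfn_out = PFN_ERROR;
	return 0;
}
```

With no request flags the function returns 0 and stores PFN_ERROR, which is
the snapshot case; with REQ_FAULT set it returns -EFAULT.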
Fixes: 2aee09d8c1 ("mm/hmm: change hmm_vma_fault() to allow write fault on page basis")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
All return paths that do EFAULT must call hmm_range_need_fault() to
determine if the user requires this page to be valid.
If the page could not be made valid should the user later require it, due
to vma flags in this case, then the return should be HMM_PFN_ERROR.
Fixes: a3e0d41c2b ("mm/hmm: improve driver API to work and wait over a range")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
All success exit paths from the walker functions must set the pfns array.
A migration entry with no required fault is a HMM_PFN_NONE return, just
like the pte case.
Fixes: d08faca018 ("mm/hmm: properly handle migration pmd")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This eventually calls into handle_mm_fault() which is a sleeping function.
Release the lock first.
hmm_vma_walk_hole() does not touch the contents of the PUD, so it does not
need the lock.
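The ordering described above (drop the lock, then call the function that may
sleep) can be illustrated with a toy model. Everything here is hypothetical:
struct toy_lock is a plain flag rather than a real spinlock, and
toy_may_sleep() merely asserts that the lock is not held, standing in for a
sleeping function.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: a flag that records whether the lock is currently held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Stand-in for a function that may sleep: must run with the lock dropped. */
static int toy_may_sleep(const struct toy_lock *l)
{
	assert(!l->held);
	return 0;
}

/* Walk one entry; a hole is handed to the sleeping path after unlocking. */
static int toy_walk_entry(struct toy_lock *l, bool entry_is_hole)
{
	toy_lock_acquire(l);
	if (entry_is_hole) {
		/* The hole handler does not read the entry, so unlock first. */
		toy_lock_release(l);
		return toy_may_sleep(l);
	}
	/* ... the entry would be examined here, under the lock ... */
	toy_lock_release(l);
	return 1;
}
```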
Fixes: 3afc423632 ("mm: pagewalk: add p4d_entry() and pgd_entry()")
Cc: Steven Price <steven.price@arm.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Many of the direct returns of error skipped doing the pte_unmap(). All
non-zero exit paths must unmap the pte.
The pte_unmap() is split unnaturally like this because some of the error
exit paths trigger a sleep and must release the lock before sleeping.
Fixes: 992de9a8b7 ("mm/hmm: allow to mirror vma of a file on a DAX backed filesystem")
Fixes: 53f5c3f489 ("mm/hmm: factor out pte and pmd handling to simplify hmm_vma_walk_pmd()")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Like on the N900, we cannot access the RNG directly on the N950/N9. Mark it
disabled in the DTS to allow the kernel to boot.
Fixes: 308607e554 ("ARM: dts: Configure omap3 rng")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tony Lindgren <tony@atomide.com>
KERN_VIRT_START defines the start virtual address of kernel space.
Use this macro instead of a magic number.
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
In a similar manner to arm64, x86, powerpc, etc., it can traverse all
page tables, and dump the page table layout with the memory types and
permissions.
Add a debugfs file at /sys/kernel/debug/kernel_page_tables to export
the page table layout to userspace.
Signed-off-by: Zong Li <zong.li@sifive.com>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
* move ata_eh_analyze_ncq_error() and ata_eh_read_log_10h() to
libata-sata.c
* add static inline for ata_eh_analyze_ncq_error() for
CONFIG_SATA_HOST=n case (link->sactive is non-zero only if
NCQ commands are actually queued so empty function body is
sufficient)
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  16164      18       0   16182    3f36 drivers/ata/libata-eh.o
after:
  15446      18       0   15464    3c68 drivers/ata/libata-eh.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Start separating SATA specific code from libata-scsi.c:
* un-static ata_scsi_find_dev()
* move following code to libata-sata.c:
- SATA only sysfs device attributes handling
- __ata_change_queue_depth()
- ata_scsi_change_queue_depth()
* cover with CONFIG_SATA_HOST ifdef SATA only sysfs device
attributes handling code and ATA_NCQ_SHT() macro in
<linux/libata.h>
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  20702     105     576   21383    5387 drivers/ata/libata-scsi.o
after:
  19137      23     576   19736    4d18 drivers/ata/libata-scsi.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Start separating SATA specific code from libata-core.c:
* move following functions to libata-sata.c:
- ata_tf_to_fis()
- ata_tf_from_fis()
- sata_link_scr_lpm()
- ata_slave_link_init()
- sata_lpm_ignore_phy_events()
* group above functions together in <linux/libata.h>
* include libata-sata.c in the build when CONFIG_SATA_HOST=y
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  37582     572      40   38194    9532 drivers/ata/libata-core.o
after:
  36762     572      40   37374    91fe drivers/ata/libata-core.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add !IS_ENABLED(CONFIG_SATA_HOST) to ata_eh_set_lpm() to allow
compiler to optimize out the function for non-SATA configs (for
PATA hosts "ap && !ap->ops->set_lpm" condition is always true so
it's sufficient for the function to return zero).
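As a rough illustration of why a compile-time constant check lets the
compiler discard the rest of the function, consider the sketch below. It is
not the libata code: TOY_SATA_HOST is a hypothetical macro standing in for
IS_ENABLED(CONFIG_SATA_HOST), and the toy_* types and functions are invented
for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch; define as 1 to model a SATA-enabled build. */
#ifndef TOY_SATA_HOST
#define TOY_SATA_HOST 0
#endif

struct toy_ops {
	int (*set_lpm)(int policy);
};

struct toy_port {
	const struct toy_ops *ops;
};

/* Example hardware hook so that the non-trivial path has something to call. */
static int toy_hw_set_lpm(int policy)
{
	return policy + 1;
}

static int toy_set_lpm(const struct toy_port *ap, int policy)
{
	assert(ap != NULL && ap->ops != NULL);

	/*
	 * When TOY_SATA_HOST is 0 the condition is a constant "true", so the
	 * function always returns here and the code below is dead.
	 */
	if (!TOY_SATA_HOST || !ap->ops->set_lpm)
		return 0;

	return ap->ops->set_lpm(policy);
}
```

Because the condition is known at compile time, an optimizing compiler is
free to drop everything after the early return when the switch is off.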
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  17353      18       0   17371    43db drivers/ata/libata-eh.o
after:
  16607      18       0   16625    40f1 drivers/ata/libata-eh.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add !IS_ENABLED(CONFIG_SATA_HOST) to ata_dev_config_ncq() to allow
compiler to optimize out the function for non-SATA configs.
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  37582     572      40   38194    9532 drivers/ata/libata-core.o
after:
  36462     572      40   37074    90d2 drivers/ata/libata-core.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Separate PATA timings code from libata-core.c:
* add PATA_TIMINGS config option and make corresponding PATA
host drivers (and ATA ACPI code) select it
* move following PATA timings code to libata-pata-timings.c:
- ata_timing_quantize()
- ata_timing_merge()
- ata_timing_find_mode()
- ata_timing_compute()
* group above functions together in <linux/libata.h>
* include libata-pata-timings.c in the build when PATA_TIMINGS
config option is enabled
* cover ata_timing_cycle2mode() with CONFIG_ATA_ACPI ifdef (it
depends on code from libata-core.c and libata-pata-timings.c
while its only user is ATA ACPI)
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  39688     573      40   40301    9d6d drivers/ata/libata-core.o
after:
  37820     572      40   38432    9620 drivers/ata/libata-core.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* fix the overly long line in ata_timing_quantize()
* use standard kernel CodingStyle in ata_timing_merge()
* do not use assignment in if condition in ata_timing_compute()
* fix non-standard comment style in ata_timing_compute()
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move EXPORT_SYMBOL_GPL()s close to exported code like it is
done in other kernel subsystems. As a nice side effect this
results in the removal of a few ifdefs.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently the maximum required size of the ata_scsi_rbuf[] is
576 bytes in ata_scsiop_inq_89() so modify ATA_SCSI_RBUF_SIZE
define accordingly.
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  20782     105    4096   24983    6197 drivers/ata/libata-scsi.o
after:
  20782     105     576   21463    53d7 drivers/ata/libata-scsi.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Optimize struct ata_force_param size by:
- using u8 for cbl and spd_limit fields
- using u16 for lflags field
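The effect of narrowing field types can be checked with sizeof on a pair of
toy structures. This is a sketch only: the field order and the two struct
names below are hypothetical and are not the real struct ata_force_param
layout, and the exact saving depends on the ABI's alignment rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: every small field uses a full int. */
struct toy_param_wide {
	const char	*name;
	unsigned long	xfer_mask;
	unsigned int	horkage_on;
	unsigned int	horkage_off;
	unsigned int	lflags;
	unsigned int	cbl;
	int		spd_limit;
};

/* Hypothetical "after" layout: small fields narrowed and grouped together. */
struct toy_param_narrow {
	const char	*name;
	unsigned long	xfer_mask;
	unsigned int	horkage_on;
	unsigned int	horkage_off;
	uint16_t	lflags;
	uint8_t		cbl;
	uint8_t		spd_limit;
};

/* Bytes saved per entry by the narrower layout. */
static size_t toy_bytes_saved(void)
{
	assert(sizeof(struct toy_param_wide) >=
	       sizeof(struct toy_param_narrow));
	return sizeof(struct toy_param_wide) -
	       sizeof(struct toy_param_narrow);
}
```

The saving is multiplied by the number of entries in a table of such
structures, which is how a small per-entry change becomes visible in the
object size.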
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  41064     573      40   41677    a2cd drivers/ata/libata-core.o
after:
  40654     573      40   41267    a133 drivers/ata/libata-core.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use COMMAND_LINE_SIZE instead of PAGE_SIZE for ata_force_param_buf[]
size as libata parameters buffer doesn't need to be bigger than
the command line buffer.
For many architectures this results in decreased libata-core.o
size (COMMAND_LINE_SIZE varies from 256 to 4096 while the minimum
PAGE_SIZE is 4096).
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  41064    4413      40   45517    b1cd drivers/ata/libata-core.o
after:
  41064     573      40   41677    a2cd drivers/ata/libata-core.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Initialize rbuf[] directly instead of using ata_tf_to_fis(). This
results in simpler and smaller code. It also allows separating
ata_tf_to_fis() into SATA specific libata part in the future.
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  20824     105    4096   25025    61c1 drivers/ata/libata-scsi.o
after:
  20782     105    4096   24983    6197 drivers/ata/libata-scsi.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use core helper instead of open-coding it.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no reason to expose SATA_PMP config option when no SATA
host drivers are enabled. To fix it add SATA_HOST config option,
make all SATA host drivers select it and finally make SATA_PMP
config options depend on it.
This also serves as preparation for the future changes which
optimize libata core code size on PATA only setups.
CC: "James E.J. Bottomley" <jejb@linux.ibm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # for SCSI bits
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no point in exposing the ncq_prio_enable sysfs attribute for
devices on PATA and non-NCQ capable SATA hosts so:
* remove dev_attr_ncq_prio_enable from ata_common_sdev_attrs[]
* add ata_ncq_sdev_attrs[]
* update ATA_NCQ_SHT() macro to use ata_ncq_sdev_attrs[]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In commit 7634ccd2da ("libata: maintainership update") from 2018,
Jens officially took over libata maintainership from Tejun, so
remove the stale information from the core libata code.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With strict kernel memory permissions, ftrace has to change the
permissions of the text section to dynamically patch instructions. Use
riscv_patch_text_nosync() to patch code instead of probe_kernel_write().
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
With strict kernel memory permissions, we cannot patch code without
write permission. Reserve two holes in the fixmap area, so we can map
the kernel code temporarily into the fixmap area and then patch the
instructions.
We need two pages here because we support compressed instructions, so
an instruction might be aligned to only 2 bytes. A 32-bit instruction
with 2-byte alignment can straddle two pages.
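The two-page requirement follows from simple address arithmetic, sketched
below. TOY_PAGE_SIZE and toy_crosses_page() are hypothetical names, and the
4 KiB page size is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size, for illustration only */

/* True if the byte range [addr, addr + len) touches more than one page. */
static bool toy_crosses_page(uintptr_t addr, unsigned int len)
{
	assert(len > 0 && len <= TOY_PAGE_SIZE);
	return (addr / TOY_PAGE_SIZE) != ((addr + len - 1) / TOY_PAGE_SIZE);
}
```

A 4-byte instruction at offset 0xffe of a page occupies the last two bytes of
that page and the first two of the next one, so both pages have to be mapped.
A 4-byte-aligned instruction can never be split this way.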
Introduce two interfaces to patch kernel code:
riscv_patch_text_nosync:
- patch code without synchronization; it is the caller's responsibility to
synchronize all CPUs if needed.
riscv_patch_text:
- patch code and always synchronize with stop_machine()
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This commit makes the text section non-writable, the rodata section
read-only, and the data section non-executable.
The init section should be changed to non-executable.
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The kernel mapping tries to optimize itself by using larger mapping
sizes. On rv64 it tries to use PMD_SIZE, and on rv32 it tries to use
PGDIR_SIZE. To ensure that the start addresses of these sections fit the
mapping entry size, align them to the largest alignment.
Define a macro SECTION_ALIGN because HPAGE_SIZE, PMD_SIZE, etc. are not
visible in the linker script.
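The rounding that such an alignment performs can be sketched in C as follows.
The names are hypothetical, and the 2 MiB value is only an example chosen to
resemble a PMD-sized mapping; it is not taken from the linker script.

```c
#include <assert.h>
#include <stdint.h>

/* Example alignment only: 2 MiB, a typical PMD-sized mapping. */
#define TOY_SECTION_ALIGN (UINT64_C(1) << 21)

/* Round addr up to the next multiple of align (align must be a power of 2). */
static uint64_t toy_align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

A section whose start has been rounded this way begins on a boundary that a
single large mapping entry can cover.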
This patch prepares for STRICT_KERNEL_RWX support.
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>