There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.
The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.
During that time, a CPU may continue writing to the page.
This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.
When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.
This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).
The basic race looks like this:
CPU A CPU B CPU C
load TLB entry
make entry PTE/PMD_NUMA
fault on entry
read/write old page
start migrating page
change PTE/PMD to new page
read/write old page [*]
flush TLB
reload TLB from new entry
read/write new page
lose data
[*] the old page may belong to a new user at this point!
The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.
This should fix both NUMA migration and compaction.
[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
do_huge_pmd_numa_page() handles the case where there is parallel THP
migration. However, by the time it is checked the NUMA hinting
information has already been disrupted. This patch adds an earlier
check with some helpers.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a protection change it is no longer clear if the page should be still
accessible. This patch clears the NUMA hinting fault bits on a
protection change.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Inaccessible VMA should not be trapping NUMA hint faults. Skip them.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a PMD changes during a THP migration then migration aborts but the
failure path is doing more work than is necessary.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The anon_vma lock prevents parallel THP splits and any associated
complexity that arises when handling splits during THP migration. This
patch checks if the lock was successfully acquired and bails from THP
migration if it failed for any reason.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The TLB must be flushed if the PTE is updated but change_pte_range is
clearing the PTE while marking PTEs pte_numa without necessarily
flushing the TLB if it reinserts the same entry. Without the flush,
it's conceivable that two processors have different TLBs for the same
virtual address and at the very least it would generate spurious faults.
This patch only unmaps the pages in change_pte_range for a full
protection change.
[riel@redhat.com: write pte_numa pte back to the page tables]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Chegu Vinod <chegu_vinod@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the PMD is flushed then a parallel fault in handle_mm_fault() will
enter the pmd_none and do_huge_pmd_anonymous_page() path where it'll
attempt to insert a huge zero page. This is wasteful so the patch
avoids clearing the PMD when setting pmd_numa.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
handled as NUMA hinting faults. The following two page table protection
bits are what defines them
_PAGE_NUMA:set _PAGE_PRESENT:clear
A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
_PAGE_PSE or _PAGE_NUMA bits are set. If pmdp_invalidate encounters a
pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
considered not present by the CPU but present by pmd_present. The
existing caller of pmdp_invalidate should handle it but it's an
inconsistent state for a PMD. This patch keeps the state consistent
when calling pmdp_invalidate.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MMU notifiers must be called on THP page migration or secondary MMUs
will get very confused.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Base pages are unmapped and flushed from cache and TLB during normal
page migration and replaced with a migration entry that causes any
parallel NUMA hinting fault or gup to block until migration completes.
THP does not unmap pages due to a lack of support for migration entries
at a PMD level. This allows races with get_user_pages and
get_user_pages_fast which commit 3f926ab945 ("mm: Close races between
THP migration and PMD numa clearing") made worse by introducing a
pmd_clear_flush().
This patch forces get_user_page (fast and normal) on a pmd_numa page to
go through the slow get_user_page path where it will serialise against
THP migration and properly account for the NUMA hinting fault. On the
migration side the page table lock is taken for each PTE update.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 1b3a5d02ee ("reboot: move arch/x86 reboot= handling to generic
kernel") moved reboot= handling to generic code. In the process it also
removed the code in native_machine_shutdown() which are moving reboot
process to reboot_cpu/cpu0.
I guess that thought must have been that all reboot paths are calling
migrate_to_reboot_cpu(), so we don't need this special handling. But
kexec reboot path (kernel_kexec()) is not calling
migrate_to_reboot_cpu() so above change broke kexec. Now reboot can
happen on non-boot cpu and when INIT is sent in second kerneo to bring
up BP, it brings down the machine.
So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid
this problem.
Bisected by WANG Chao.
Reported-by: Matthew Whitehead <mwhitehe@redhat.com>
Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Tested-by: Baoquan He <bhe@redhat.com>
Tested-by: WANG Chao <chaowang@redhat.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes a build failure that appeared in v3.13-rc4 due to an
RTC/MFD update merged via -mm.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSr0jGAAoJELSic+t+oim9bWsP/26SiYoI/ulG76GyJQuPVuWj
21fRM0TEZaSVxZf4lzZahdak9kDg5wDFX15ii9PnIPJ8m5Nz5lHKLaKMXUEY8saA
9vUZBsZNVbyQ9i7uH1/3mPC1fA+3XqwXKPrp4roIv5kFYCRFD+NDyBoCwl/FNvTE
u72A9nmojJv6C0UuExFtyXRiJnAn2goXYQ0JvlFKrezv7yStNGNXakIH1za1gEOq
hdMIkJ9r5/jZFIGKr7SHZjsGYqwngKfVfaSdBeWZqp6Z0FHf7MRI6aJ/TcwpkLFH
HNezPznw1/CzAHmHAyszbcIMIh/GHxJzDldUr4g+kwForjJGaYIcrimxFPtcmW9P
MhJ5DdO765aqp24cImo1aNJXMnJYTFDguMWSCWLwxNSmlpQYg2iEiKF2astcCb+p
3v4cj9EYnEiPdKwSmA/kCb7fiEnefHm8vLMgRZwXIXiDfDSub4JPya+L4SxHFMAZ
OxZyk/ZGPNkDAsyKUx3KQMw4Gmvbro/TJMLn73CgdrW++39/CbSzl5hM8oRMpeMn
LTvBQOViAODlODQDI9B1gOqd4c/9zE1wGgY64rE7MtzlhlcPNhXYjL/Cb7+JNZ49
Mp6nxt/yH7rlFMdpIC0GgsgBP4ilh3GeLDZDi+AhnbD1773Yn/mCA4cohfpUug6j
Yme3Jd3/zWHrUkUPjkfO
=t/v5
-----END PGP SIGNATURE-----
Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator/clk fix from Mark Brown:
"Fix s2mps11 build
This patch fixes a build failure that appeared in v3.13-rc4 due to an
RTC/MFD update merged via -mm"
* tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
mfd: s2mps11: Fix build after regmap field rename in sec-core.c
Pull scheduler fixes from Ingo Molnar:
"Three fixes for scheduler crashes, each triggers in relatively rare,
hardware environment dependent situations"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Rework sched_fair time accounting
math64: Add mul_u64_u32_shr()
sched: Remove PREEMPT_NEED_RESCHED from generic code
sched: Initialize power_orig for overlapping groups
- Driver bug fixes for SH PFC, TWL4030, MSM and RCAR.
- Update the MAINTAINERS
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSrsP1AAoJEEEQszewGV1zyn4P/j/tepOI0iPnEeUqc/Ah9lxd
lhZJytVH1mHpaf/4HJgmRayZrwria4fX6NEUwoIcZyuETLstvk2RnI6VdmuCRqBL
Oys/xxS3B55CIo36RHV1qADojPoNdK4lAbKKfklyJBnsDK7ozh+vPiZPANs0tzRS
ELaceaa1jX3O+ht3LvCbl8lguxu+6FpzzpQuFt/FCJTB4HuyTreg+wB7VWyoqsC7
Ry6BCKBjN8lewT8fvVkAdAeHaERkkCNxjBPDj4xU14hynWtgWikR0SOaFF7f5n2T
8dJKAqPIiVQP1Lb1WSJ6zEY1mCA7WjsOgudaDnF3vqPIMBaZKBM96cQHpI/JMdPU
VzrxC+k5BWxJC/XxE/AsES+Z949ctXFjiyj7b6SmcIU1rV46KpUuDLXwkbXSWs3G
FHWKIunHnHEmwqIDi+nDSCb7kdgxnboUZyiZJFr7s5XIJTqi7wv+wzSoIRP8N9w3
jNjWmrezJvP6YLdNBKh5jVDhY8XLcXr/U1QxP59tveO1isNwRvFJJG3zlvLwNhpG
ZGbqkygwoo3AvXc87ZLXFHIEf/vpOWr7ObWpuhF9OXoeWqmTaqD84XxUJSb7myPF
/XNIhjFLQ6kzHe5xivizLFatEXA5Oy9Rz2zTiWFZgbTciBRXU8fkoI56jcfoRwWG
yS/PxsVjr0FqrYsBvMJH
=V2Cs
-----END PGP SIGNATURE-----
Merge tag 'gpio-v3.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"All but one are long-standing bug fixes that are also tagged for
stable
- Driver bug fixes for SH PFC, TWL4030, MSM and RCAR.
- Update the MAINTAINERS"
* tag 'gpio-v3.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rcar: Fix level interrupt handling
gpio: msm: Fix irq mask/unmask by writing bits instead of numbers
gpio: twl4030: Fix regression for twl gpio LED output
sh-pfc: Fix PINMUX_GPIO macro
MAINTAINERS: update GPIO maintainers entry
Pull two Ceph fixes from Sage Weil:
"One of these is fixing a regression from the d_flags file type patch
that went into -rc1 that broke instantiation of inodes and dentries
(we were doing dentries first). The other is just an off-by-one
corner case"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: Avoid data inconsistency due to d-cache aliasing in readpage()
ceph: initialize inode before instantiating dentry
Pull powerpc fixes from Ben Herrenschmidt:
"Uli's patch fixes a regression in ptrace caused by a mis-merge of a
previous LE patch. The rest are all more endian fixes, all fairly
trivial, found during testing of 3.13-rc's"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/powernv: Fix OPAL LPC access in Little Endian
powerpc/powernv: Fix endian issue in opal_xscom_read
powerpc: Fix endian issues in crash dump code
powerpc/pseries: Fix endian issues in MSI code
powerpc/pseries: Fix PCIE link speed endian issue
powerpc/pseries: Fix endian issues in nvram code
powerpc/pseries: Fix endian issues in /proc/ppc64/lparcfg
powerpc: Fix topology core_id endian issue on LE builds
powerpc: Fix endian issue in setup-common.c
powerpc: PTRACE_PEEKUSR always returns FPR0
If a user calls 'cpupower set --perf-bias 15', the process will end with
a SIGSEGV in libc because cpupower-set passes a NULL optarg to the atoi
call. This is because the getopt_long structure currently has all of
the options as having an optional_argument when they really have a
required argument. We change the structure to use required_argument to
match the short options and it resolves the issue.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1000439
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Thomas Renninger <trenn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix building of s2mps11 regulator and clock drivers after renaming
regmap field in struct sec_pmic_dev in commit:
- "mfd/rtc: s5m: Fix register updating by adding regmap for RTC"
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
For NUMA systems, initializing the blk-mq layer and using per node hctx.
We initialize submit queues to 1, while blk-mq nr_hw_queues is
initialized to the number of NUMA nodes.
This makes the null_init_hctx function overwrite memory outside of what
it allocated. In my case it lead to writing garbage into struct
request_queue's mq_map.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
1) Revert CHECKSUM_COMPLETE optimization in pskb_trim_rcsum(), I can't
figure out why it breaks things.
2) Fix comparison in netfilter ipset's hash_netnet4_data_equal(), it
was basically doing "x == x", from Dave Jones.
3) Freescale FEC driver was DMA mapping the wrong number of bytes, from
Sebastian Siewior.
4) Blackhole and prohibit routes in ipv6 were not doing the right thing
because their ->input and ->output methods were not being assigned
correctly. Now they behave properly like their ipv4 counterparts.
From Kamala R.
5) Several drivers advertise the NETIF_F_FRAGLIST capability, but
really do not support this feature and will send garbage packets if
fed fraglist SKBs. From Eric Dumazet.
6) Fix long standing user triggerable BUG_ON over loopback in RDS
protocol stack, from Venkat Venkatsubra.
7) Several not so common code paths can potentially try to invoke
packet scheduler actions that might be NULL without checking. Shore
things up by either 1) defining a method as mandatory and erroring
on registration if that method is NULL 2) defininig a method as
optional and the registration function hooks up a default
implementation when NULL is seen. From Jamal Hadi Salim.
8) Fix fragment detection in xen-natback driver, from Paul Durrant.
9) Kill dangling enter_memory_pressure method in cg_proto ops, from
Eric W Biederman.
10) SKBs that traverse namespaces should have their local_df cleared,
from Hannes Frederic Sowa.
11) IOCB file position is not being updated by macvtap_aio_read() and
tun_chr_aio_read(). From Zhi Yong Wu.
12) Don't free virtio_net netdev before releasing all of the NAPI
instances. From Andrey Vagin.
13) Procfs entry leak in xt_hashlimit, from Sergey Popovich.
14) IPv6 routes that are no cached routes should not count against the
garbage collection limits. We had this almost right, but were
missing handling addrconf generated routes properly. From Hannes
Frederic Sowa.
15) fib{4,6}_rule_suppress() have to consider potentially seeing NULL
route info when they are called, from Stefan Tomanek.
16) TUN and MACVTAP have had truncated packet signalling for some time,
fix from Jason Wang.
17) Fix use after frrr in __udp4_lib_rcv(), from Eric Dumazet.
18) xen-netback does not interpret the NAPI budget properly for TX work,
fix from Paul Durrant.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits)
igb: Fix for issue where values could be too high for udelay function.
i40e: fix null dereference
xen-netback: fix gso_prefix check
net: make neigh_priv_len in struct net_device 16bit instead of 8bit
drivers: net: cpsw: fix for cpsw crash when build as modules
xen-netback: napi: don't prematurely request a tx event
xen-netback: napi: fix abuse of budget
sch_tbf: use do_div() for 64-bit divide
udp: ipv4: must add synchronization in udp_sk_rx_dst_set()
net:fec: remove duplicate lines in comment about errata ERR006358
Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature"
8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature
xen-netback: make sure skb linear area covers checksum field
net: smc91x: Fix device tree based configuration so it's usable
udp: ipv4: fix potential use after free in udp_v4_early_demux()
macvtap: signal truncated packets
tun: unbreak truncated packet signalling
net: sched: htb: fix the calculation of quantum
net: sched: tbf: fix the calculation of max_size
micrel: add support for KSZ8041RNLI
...
Pull x86 fixes from Peter Anvin:
"This is a pretty small batch:
The biggest single change is to stop using EFI time services on 32-bit
platforms. This matches our current behavior on 64-bit platforms as
we already had ruled them out there as being too unreliable. Turns
out that affects 32-bit platforms, too.
One NULL pointer fix for SGI UV.
Two minor build fixes, one of which only affects icc and the other
which affects icc and future versions or nonstandard default settings
of gcc"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, efi: Don't use (U)EFI time services on 32 bit
x86, build, icc: Remove uninitialized_var() from compiler-intel.h
x86/UV: Fix NULL pointer dereference in uv_flush_tlb_others() if the 'nobau' boot option is used
x86, build: Pass in additional -mno-mmx, -mno-sse options
Pull SELinux fixes from James Morris.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
selinux: look for IPsec labels on both inbound and outbound packets
selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
selinux: fix possible memory leak
This reverts commit 102aefdda4.
Tom London reports that it causes sync() to hang on Fedora rawhide:
https://bugzilla.redhat.com/show_bug.cgi?id=1033965
and Josh Boyer bisected it down to this commit. Reverting the commit in
the rawhide kernel fixes the problem.
Eric Paris root-caused it to incorrect subtype matching in that commit
breaking fuse, and has a tentative patch, but by now we're better off
retrying this in 3.14 rather than playing with it any more.
Reported-by: Tom London <selinux@gmail.com>
Bisected-by: Josh Boyer <jwboyer@fedoraproject.org>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Anand Avati <avati@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes the igb_phy_has_link function to check the value of the
parameter before deciding to use udelay or mdelay in order to be sure that
the value is not too high for udelay function.
CC: stable kernel <stable@vger.kernel.org> # 3.9+
Signed-off-by: Sunil K Pandey <sunil.k.pandey@intel.com>
Signed-off-by: Kevin B Smith <kevin.b.smith@intel.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the vsi->tx_rings structure is NULL we don't want to panic.
Change-Id: Ic694f043701738c434e8ebe0caf0673f4410dc10
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM fixes from Russell King:
"This resolves some further issues with the dma mask changes on ARM
which have been found by TI and others, and also some corner cases
with the updates to the virtual to physical address translations.
Konstantin also found some problems with the unwinder, which now
performs tighter verification that the stack is valid while unwinding"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: fix asm/memory.h build error
ARM: 7917/1: cacheflush: correctly limit range of memory region being flushed
ARM: 7913/1: fix framepointer check in unwind_frame
ARM: 7912/1: check stack pointer in get_wchan
ARM: 7909/1: mm: Call setup_dma_zone() post early_paging_init()
ARM: 7908/1: mm: Fix the arm_dma_limit calculation
ARM: another fix for the DMA mapping checks
- Couple of fixes for recently added perf code
- Build time extable sort
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSqqyMAAoJEGnX8d3iisJeTG4QAMUxMnqDHJL919gukLAoivom
BdLdyPkHXECnwvu9G4kd4kOHvF37QUDSaIJYlgHNA7+vkZg2O9qPBWBAl5DQQ8BU
nOQeurxnmNKvhBNcLJzRt+MF6J3TATV28sHB3TF5XSC/JV6yCdwhztBNUjeynRHT
fDBjVyK5tdRCsdh1lRID/4cQW6SnNG4VPuyQHCRt+PZ84nE7AHKu5eYMkSnIpof5
x4/y/kEYLtzuOfbjgze+ZZk9QlR+ymEVq+YSQsbGH8dM5curazGMh4lh2isa0nkJ
G4ptA/E3XSvqNkwgNSYeDss8ugxvwHnjAufgSYOlBSZjfCWxwiA8UC1nA0eks4OW
MBIwCZe6Qo8HyRfZWQgvNOjP81Q9LWfRNa7UObB2HcNvXDghuTmcmOjZJSheZWip
KA7fuISnUz24mwdRlSMwfLjG5zh13GKphpb/PL79m+uzrVB8yHfJWg8nBU7y8Tfn
j8BmyxS9cQQPN6lC2w0ESx4Fp891yR63KNKZq+MLZCj/4iP0h9s2ifL8o/xx03a0
WgqNZJaXYnssqsZAd1BhnV7Oma/OJmrwm7LgWVxAr01FvjONAh/bd3LJR0G2Nksy
PMJI0NnVWrHrso9BeWQ4f0L//tamtmkBqrTjXxgrgQisuxntxdhe16xa0FhsrmIi
B/wfllRLceFyqT78GvNr
=JA1e
-----END PGP SIGNATURE-----
Merge tag 'arc-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"These are couple of weeks old already, but I just couldn't get them to
you earlier.
- couple of fixes for recently added perf code
- build time extable sort"
* tag 'arc-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [perf] Fix a few thinkos
ARC: Add guard macro to uapi/asm/unistd.h
ARC: extable: Enable sorting at build time
A fix for possible memory corruption during DM table load, fix a
possible leak of snapshot space in case of a crash, fix a possible
deadlock due to a shared workqueue in the delay target, fix to
initialize read-only module parameters that are used to export metrics
for dm stats and dm bufio.
Quite a few stable fixes were identified for both the thin-provisioning
and caching targets as a result of increased regression testing using
the device-mapper-test-suite (dmts). The most notable of these are the
reference counting fixes for the space map btree that is used by the
dm-array interface -- without these the dm-cache metadata will leak,
resulting in dm-cache devices running out of metadata blocks. Also,
some important fixes related to the thin-provisioning target's
transition to read-only mode on error.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSq2pVAAoJEMUj8QotnQNaAzEH/0qpz06SVd3+oTbsWeWxXwax
B0zIZ4HPnaD9FDfSWN+6IdQ6Rkn51cspMBHPxWX0I8pmqOTg7MUM7wbaZZaKT3UW
aDlibzG1O3zBGbkr+Qhbh89fJK5/3xnZXlK0hOyaNI2vz1rA6RThZ2hIeak9xQYr
7USz2bjSMXchwebb0Z01CrRnOkhUFg6yUJURIEu4XFCJmDM/PM9zKogLGYKuZedK
pZePnZhwgkLMn7f9l4lPHk3EcWeD4Nf9WR0lS4eKdbYsvMxm1QE7BpDlVJD0Asjg
/SVOVkgygW18TINMHuw1K0DPdSCo5UZF2HRj1QUD/X4N2BSkio+E1qwF6kT4ltk=
=0Ix+
-----END PGP SIGNATURE-----
Merge tag 'dm-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"A set of device-mapper fixes for 3.13.
A fix for possible memory corruption during DM table load, fix a
possible leak of snapshot space in case of a crash, fix a possible
deadlock due to a shared workqueue in the delay target, fix to
initialize read-only module parameters that are used to export metrics
for dm stats and dm bufio.
Quite a few stable fixes were identified for both the thin-
provisioning and caching targets as a result of increased regression
testing using the device-mapper-test-suite (dmts). The most notable
of these are the reference counting fixes for the space map btree that
is used by the dm-array interface -- without these the dm-cache
metadata will leak, resulting in dm-cache devices running out of
metadata blocks. Also, some important fixes related to the
thin-provisioning target's transition to read-only mode on error"
* tag 'dm-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm array: fix a reference counting bug in shadow_ablock
dm space map: disallow decrementing a reference count below zero
dm stats: initialize read-only module parameter
dm bufio: initialize read-only module parameters
dm cache: actually resize cache
dm cache: update Documentation for invalidate_cblocks's range syntax
dm cache policy mq: fix promotions to occur as expected
dm thin: allow pool in read-only mode to transition to read-write mode
dm thin: re-establish read-only state when switching to fail mode
dm thin: always fallback the pool mode if commit fails
dm thin: switch to read-only mode if metadata space is exhausted
dm thin: switch to read only mode if a mapping insert fails
dm space map metadata: return on failure in sm_metadata_new_block
dm table: fail dm_table_create on dm_round_up overflow
dm snapshot: avoid snapshot space leak on crash
dm delay: fix a possible deadlock due to shared workqueue
Jason Gunthorpe reports a build failure when ARM_PATCH_PHYS_VIRT is
not defined:
In file included from arch/arm/include/asm/page.h:163:0,
from include/linux/mm_types.h:16,
from include/linux/sched.h:24,
from arch/arm/kernel/asm-offsets.c:13:
arch/arm/include/asm/memory.h: In function '__virt_to_phys':
arch/arm/include/asm/memory.h:244:40: error: 'PHYS_OFFSET' undeclared (first use in this function)
arch/arm/include/asm/memory.h:244:40: note: each undeclared identifier is reported only once for each function it appears in
arch/arm/include/asm/memory.h: In function '__phys_to_virt':
arch/arm/include/asm/memory.h:249:13: error: 'PHYS_OFFSET' undeclared (first use in this function)
Fixes: ca5a45c06c ("ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions")
Tested-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A small set of driver fixes plus one larger core change which changes
the way we check to see if we're using DT so that there aren't any races
between deciding we're using DT and the regulator subsystem noticing.
This makes the new support for substituting a dummy regulator and
optional regulators work a lot better on DT systems since it ensures
that we don't trigger probe deferral when we shouldn't which was causing
bugs in clients.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSqxI9AAoJELSic+t+oim9nuMQAJd2MqTivDav+1dAJgWT7chT
GUb61Ks3HjmsRz/76YV+CQBO7puFxR9100U/keb3Xmg8oHJSVP6xCroA5JILqhwZ
6rDsRXOKgbqlVruundkceWZJHQEbszUpdbnU8GXiNyNI8EiVoVZCgXSnPM2wyD6o
EdxjqaXi/GUodaGFBfMyMpj387QwWgCi+DocUf622fTUHLEOKjjjndsKssTW2jyf
NrRQiTnQ6Yecf8lI2rHN5C6p8MyJ8IF3i2d4pi1eBAfWF0OfeYRrm694IrbZ8Idl
vAH4BxMf111JC7apuOTHNUSpL1DV4mjYQEeXUvd3wfnWEMRkFaEgwmTRmZZAfl/i
KM+5Yob1IdStfNwayKAVsPbIqYeyV0zDkN4CteY5XtWYLUqKJon6wuSGzYRABID2
uRa82dlSWMaX89+nHPCf22F7op8qRPLgr11yg7Nvo5qB+0Snij341libjrJGY09y
wFx6fdxL4OMkyRpwyB6tkWyAjUPbMJDAvrOnA2x7nU+AS1ytGAJeJMUpzYhUEly/
31kVJBi+mPRRmBsG+Fe9ALp+4k/UpMajCYWXa4/q+Bs7r3FCzWU98NeRxMurUKfO
cco6diDSLTVaQKHcUqPW0g0BWGrggro4H5CHe5MBBi2mHK3IMuqnSYjTDiJpEh7I
Tlad7Or4kd41FCk3Wpfi
=WqDN
-----END PGP SIGNATURE-----
Merge tag 'regulator-v3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A small set of driver fixes plus one larger core change which changes
the way we check to see if we're using DT so that there aren't any
races between deciding we're using DT and the regulator subsystem
noticing.
This makes the new support for substituting a dummy regulator and
optional regulators work a lot better on DT systems since it ensures
that we don't trigger probe deferral when we shouldn't which was
causing bugs in clients"
* tag 'regulator-v3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: pfuze100: allow misprogrammed ID
regulator: pfuze100: Fix address of FABID
regulator: as3722: set the correct current limit
regulator: core: Check for DT every time we check full constraints
regulator: core: Replace checks of have_full_constraints with a function
Two small changes to fix some error handling and checking (both of which
would be quite serious if the errors trigger) plus a trivial
documentation fix.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSqw5tAAoJELSic+t+oim9s5sP/3djfdZfsi0qm4hi6paMc6pF
KyP92oHwiMu5RyCV+4UVwtaZnbtBip9uI2z3AB3sozJFNVmSNZ7OoTP2lS1vkLQ+
1XcfJH65m3SPWXliixEjnfUT0UcgXBCDVREcuGX/Iu7RfQzplDfhco2LtfaDbzAa
pTD/Q6vDFxGAm0lGBOPtZ0CB14jb9i0qD4HHBY1TG7YchgLQBe6VC6JB23fGN/Y9
1+rBvS03xrBw6h9RaFLVX7Bk4/i0jrj2UpUy8C3cqK8VdJYLKx06M13oP31mQcwo
jDusm8Pz0SSiT6b8pt334QpU2YvtTNqEAZ1Kp6hZRhGuRqJr0Qb/k+l85koeiT++
+bYht3yTL9/uqQ1felViH4hvvuRhjyHuaaEPinaWyQJhYsP87dG24h9iWG8fE3k7
ft2C+7ho+q8ecW8O48eC9hWFLEBLR9tMNfQM6soHT1h4YsQM3AMih5AhjHdJlavH
yyviE6Nog6279XzJHZb6pcRyIKUKR51fsjaF6ofTWJt5zG4npF3wG96JnHtuPMgh
iIqSSbjBdBRwZ39S1RW2Q+pCai3FpXyVQAlCl7XXWKLTjfLVnMDXQ6VMy4a51UL4
D4fFVlR0nIjeWUTY0vBnFEc+4lYn83aMkkHh/xW+wmzuXstEl7KB/rkD3+EW5Hpu
KZWJmo3xow1oUtziJYDv
=YKD0
-----END PGP SIGNATURE-----
Merge tag 'regmap-v3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two small changes to fix some error handling and checking (both of
which would be quite serious if the errors trigger) plus a trivial
documentation fix"
* tag 'regmap-v3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: use IS_ERR() to check clk_get() results
regmap: make sure we unlock on failure in regmap_bulk_write
regmap: trivial comment fix (copy'n'paste error)
Pull i2c fixes from Wolfram Sang:
"Here are two simple but wanted fixes for the i2c subsystem"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: Check the return value from clk_prepare_enable()
i2c: mux: Inherit retry count and timeout from parent for muxed bus
- This driver was not ready to fully Armada 370 NAND, with particularly
notable problems seen on flash with 2KB page sizes. This "compatible" entry
really should have been held back until 3.14 or later.
- Fix a bug seen in rare cases on the error path of a failed probe attempt,
where we free unallocated DMA resources
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSqrVaAAoJEFySrpd9RFgtaUoP/jklkxrltohyejpHPhsHNWK9
d8OtfIGPNKTFUtuwOAMDBuLt+/lX4f16sWC8PoE1biGhEk/tA/YuisLS1IWNhYYW
Jms1ZTr4juvahJaLoahmJcEd21ZWV36tlvbUPQcQQCFwjuuQUPn0M755XKL7sC8G
09+1VYVjOzX1aAodhYu9JyntGtYqXrVj8hDhhRY4rP4KLbmE/XELmsgp+JPzHs6T
O1X3B3mY4SRWwHQAGtNdzzoGMSIgoeXvydHkgRDYh92iVoL2se8mVfpTGYW9a27T
alBbOm1rCuuS+3WmphB0u5oSRgcykMByP2ucTWlnP1C60DvYctZ42sOdFKPPpwzE
76DJ9HoZt9gO+ch2pn2hXtD60U4b8oZccZJP4WLkio2/nWP/Piae1nyuLtM2AlwO
4Kj/boIymU3VWkPbIj/Dq/9MF7h3eF8M1kr6JKc3MlXU9ZnQQXpBGIBOdiZhcTbY
fyxEBEQoolRE9FPPZWOOiGczTpafoot9o3kCM15G4cHCTVTzK3iuPDGIIDX97nKw
9vAdEP6/yyKnMSdh+OXZ+g134vAXYKezf9QzNastXz5QtYY+0pDOc/5shF9aPsN8
94x7Ub4WzSm+r+dT0m1CtesLMKYIoHtfBjEZA8CVaTfoK+X4oH7eXxNsne5LCRAS
LCbXjnk/WlbydNSbVKGA
=K24+
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20131212' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
"Two MTD fixes, for the pxa3xx-nand driver:
- This driver was not ready to fully Armada 370 NAND, with
particularly notable problems seen on flash with 2KB page sizes.
This "compatible" entry really should have been held back until
3.14 or later.
- Fix a bug seen in rare cases on the error path of a failed probe
attempt, where we free unallocated DMA resources"
* tag 'for-linus-20131212' of git://git.infradead.org/linux-mtd:
mtd: nand: pxa3xx: Use info->use_dma to release DMA resources
Partially revert "mtd: nand: pxa3xx: Introduce 'marvell,armada370-nand' compatible string"
Pull slave-dmaengine fixes from Vinod Koul:
"Here is the common fixes PULL for dmaengine.
Dan has been working on fixing the build issues in bunch of drivers.
Here we have one fixing s3c24xx-dma, along with fix from Russell on
pl08x. Also we have Kuninori rcar dma fixes. The s3c24xx-dma which
was added in last merge window missed updates to usage of DMA_COMPLETE
so converting the last driver"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dma: fix build breakage in s3c24xx-dma
Fix pl08x warnings
rcar-hpbdma: initialise plane information when halted
rcar-hpbdma: fixup channel busy check for double plane
rcar-hpbdma: add max transfer size
dma: mmp_pdma: add missing platform_set_drvdata() in mmp_pdma_probe()
dmaengine: s3c24xx-dma: use DMA_COMPLETE for dma completion status
An old array block could have its reference count decremented below
zero when it is being replaced in the btree by a new array block.
The fix is to increment the old ablock's reference count just before
inserting a new ablock into the btree.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.9+
The old behaviour, returning -EINVAL if a ref_count of 0 would be
decremented, was removed in commit f722063 ("dm space map: optimise
sm_ll_dec and sm_ll_inc"). To fix this regression we return an error
code from the mutator function pointer passed to sm_ll_mutate() and have
dec_ref_count() return -EINVAL if the old ref_count is 0.
Add a DMERR to reflect the potential seriousness of this error.
Also, add missing dm_tm_unlock() to sm_ll_mutate()'s error path.
With this fix the following dmts regression test now passes:
dmtest run --suite cache -n /metadata_use_kernel/
The next patch fixes the higher-level dm-array code that exposed this
regression.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.12+
If the length of data to be read in readpage() is exactly
PAGE_CACHE_SIZE, the original code does not flush d-cache
for data consistency after finishing reading. This patches fixes
this.
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Signed-off-by: Sage Weil <sage@inktank.com>
commit b18825a7c8 (Put a small type field into struct dentry::d_flags)
put a type field into struct dentry::d_flags. __d_instantiate() set the
field by checking inode->i_mode. So we should initialize inode before
instantiating dentry when handling mds reply.
Fixes: http://tracker.ceph.com/issues/6930
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We are passing pointers to the firmware for reads, we need to properly
convert the result as OPAL is always BE.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
opal_xscom_read uses a pointer to return the data so we need
to byteswap it on LE builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
A couple more device tree properties that need byte swapping.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>