1. Limit the max number of WQEs per QP reported when querying the
device, so that ib_create_qp() will not fail for a QP size that the
device claimed to support due to additional headroom WQEs being
allocated.
2. Limit qp resources accepted for ib_create_qp() to the limits
reported in ib_query_device(). In kernel space, make sure that the
limits returned to the caller following qp creation also lie within
the reported device limits. For userspace, report as before, and do
adjustment in libmlx4 (so as not to break ABI).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Sagi Grimberg <sagig@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 096335b3f9 ("mlx4_core: Allow dynamic MTU configuration for
IB ports") modifies the port VL setting. This exposes a bug in
mlx4_common_set_port(), where the VL cap value passed in (inside the
command mailbox) is incorrectly zeroed-out:
mlx4_SET_PORT modifies the VL_cap field (byte 3 of the mailbox).
Since the SET_PORT command is paravirtualized on the master as well as
on the slaves, mlx4_SET_PORT_wrapper() is invoked on the master. This
calls mlx4_common_set_port() where mailbox byte 3 gets overwritten by
code which should only set a single bit in that byte (for the reset
qkey counter flag) -- but instead overwrites the entire byte.
The result is that when running in SR-IOV mode, the VL_cap will be set
to zero -- fix this.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
One sparse-warning fix, one bigfix for 3.4-stable
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIVAwUAT86Oqjnsnt1WYoG5AQJ9rhAAuLVImkvxtHHMM7j2E8ZTQ1pWT6JRf6qC
Rsz/s41olPwSEVRuLXpZrle/dSN2l1Ys49FR2u6m+96lM0At2JlkML/Sc4Gszr0g
Oeo8FN+Rv/Sv6Chv7MuWp0z0WOs3ruIR3AYQIo+jnaVzZLLQ2HRN8wupjvpCIZyk
WdPu6t/9G+OtnkFWCC3FDEIyqpghg1TcoK93b1eRFD/ZoPV8yDJ9bba//fDesVVI
OhvUJPqeJ/ow+sA1MzyLhKB6CLPmEob0qxi8++CdnTfx8fwnkYNKlgsxf0WQ8JQ5
GSClKNUpki0yiYWJR6pJrv6+e6WbesX1DriRSRODJLKls/bQKnskaxGx4DUa73BM
DkOUsALfaTfGD5XiXgEjTU2HR+codiqvDavQjWOlHWgwIKB2MYQWIwFLK/T2RSdC
5f30IiM6kHJMZS3lVP2kjfXAfQ10kiTBg7E6btzCO3aso84yxr6Er65skdnlIi5r
q1z7FCnQimfZYjlbuR8EUtdxHdGZkSQbtZ5E7X9dvmUpFstvgGGPr/SuAP3r87kM
LbyRSoDpGk8dXZ5/epY+IKCQGsFZIeTlg+eonjSuVNN8Anr3WAE1VeRmLBQilnXk
hGDLKAZ4v9YwRJWqoY3hewtpcYhCMqNGGk4hPKmJuh37OTOWFQl8sXVk2Pqzy1ap
uIP66qrvvI0=
=VrYL
-----END PGP SIGNATURE-----
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md
Pull two md fixes from NeilBrown:
"One sparse-warning fix, one bugfix for 3.4-stable"
* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md: raid1/raid10: fix problem with merge_bvec_fn
lib/raid6: fix sparse warnings in recovery functions
Two patches are in here which fix AMD IOMMU specific issues. One patch
fixes a long-standing warning on resume because the amd_iommu_resume
function enabled interrupts. The other patch fixes a deadlock in an
error-path of the page-fault request handling code of the IOMMU driver.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPz1wQAAoJECvwRC2XARrj6FkQAJsF4LuzKXQcAybUnzMZq3E8
g2ii0uknErrNgsAjnooH1aTzq0vo5Ps/ZLj84XJER+V0vYkD574vhKs9dzu8Y8at
xv20AH+Uei+0tUVy2WCPglVwZaSrQB+eVm0oqWIqNKnp28q3pVsns8iccp+Anqxz
VV8IqJchfAUXdB8EGJCbZkT+YGfNcFGXx2bfYyiNBUvXOfE6fzzXZly0AXWcm9GB
XLYxLdtZ5RYG874IMKnQvAUCXAfN21snK61i1hcx1aoZynzVAFvbf1SnzcINuX4M
SqXdies6Ui8S2Ndvl4wZXfEkFYRL48v2+tpjp9bxEvvfX40WCbUHJ3M7Z+LoayvP
Xgpshp5JaoXcgp3Rgyhyd22HzKMP291NyjObrKBgf6QxFaQSz16OIsFN9aKxEiyc
QCMJyLgfKeBDWM64Zw2u526RIIMENPIim+YH5R2jN08kZUYd+5r611E5BcxE0I6Z
uuuDi+eY+UCiBVyofckuJz4yVnNIZroRSFoldfsbv7BvHCE4QNOCyG5JrdVzL0e3
81dgmMDaPCr2ga2NjJ2hebYJLNYNc4IIf/zwK8OoT/S5sD53zU42WIQ2Ug91hXmw
ep04oobprN764te1VOaSbm9FNUCD3ykf6xLtqX9gKp254mZ/w46YaMR0GSphrUbS
YBNnrSbIq+1z5MtruD/V
=qLIK
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two patches are in here which fix AMD IOMMU specific issues. One
patch fixes a long-standing warning on resume because the
amd_iommu_resume function enabled interrupts. The other patch fixes a
deadlock in an error-path of the page-fault request handling code of
the IOMMU driver.
* tag 'iommu-fixes-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix deadlock in ppr-handling error path
iommu/amd: Cache pdev pointer to root-bridge
Pull arm CMA fix from Marek Szyprowski:
"This removes the ARMv6+ CMA dependency and lets one use old, well-
tested dma-mapping implementation also on ARMv6+ systems without the
need to use EXPERIMENTAL stuff."
Russell King complained (rightly) about the experimental feature being
forced on by the ARM config.
Here CMA is "continuous memory allocator", not "cross-memory attach".
We really neet to stop using insane TLA's for things that aren't big
industry standards.
* 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: remove unconditional dependency on CMA
Pull two scsi target fixes from Nicholas Bellinger:
"The first is a small name-space collision fix from Stefan for the new
sbp-target / ieee-1394 code, and second is the FILEIO backend
conversion patch to always use O_DSYNC by default instead of O_SYNC as
recommended by hch. Note the latter is CC'ed stable."
* '3.5-merge-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target/file: Use O_DSYNC by default for FILEIO backends
sbp-target: rename a variable to avoid name clash
Pull cgroup fix from Tejun Heo:
"This fixes the possible premature superblock release on umount bug
mentioned during v3.5-rc1 pull request.
Originally, cgroup dentry destruction path assumed that cgroup dentry
didn't have any reference left after cgroup removal thus put super
during dentry removal. Now that there can be lingering dentry
references, this led to super being put with live dentries. This
patch fixes the problem by putting super ref on dentry release instead
of removal."
* 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: superblock can't be released with active dentries
Pull embedded i2c update from Wolfram Sang:
"This only contains one new driver which had multiple dependencies
(pinctrl, i2c-mux-rework, new devm_* functions), so I decided to wait
for rc1. Plus, it had to wait a little for the ack of a devicetree
maintainer since the bindings were not trivial enough for me to pass
through.
So, given that, I hope there is still something like the "new driver
rule", so we could have the driver in 3.5 and people can start using
it. That would make merging support for some boards easier for 3.6
since the dependency on this driver is gone then."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: Add generic I2C multiplexer using pinctrl API
Pull kvm fix from Avi Kivity:
"A one-liner fix for a buffer overflow"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Fix buffer overflow in kvm_set_irq()
Pull drm radeon fixes from Dave Airlie:
"This is just radeon fixes and a bunch of new PCI ids. The fixes are
for a deadlock, an audio regression, and a couple of audio fixes."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms: add new SI PCI ids
drm/radeon/kms: add new BTC PCI ids
drm/radeon/kms: add new Palm, Sumo PCI ids
drm/radeon/kms: add new Trinity PCI ids
drm/radeon: fix vm deadlocks on cayman
drm/radeon: fix gpu_init on si
drm/radeon/hdmi: don't set SEND_MAX_PACKETS bit
drm/radeon/audio: don't hardcode CRTC id
drm/radeon: make audio_init consistent across asics
This patch fixes bug in macro radix_tree_for_each_contig().
If radix_tree_next_slot() sees NULL in next slot it returns NULL, but following
radix_tree_next_chunk() switches iterating into next chunk. As result iterating
becomes non-contiguous and breaks vfs "splice" and all its users.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-and-bisected-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Reported-and-bisected-by: Ondrej Zary <linux@rainbow-software.org>
Reported-bisected-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Link: https://lkml.org/lkml/2012/6/5/64
Cc: stable <stable@vger.kernel.org> # 3.4.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit 82f7af09 (x86/mce: Cleanup timer mess), Thomas just forgot
the "/ 2" there while cleaning up.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pull scheduler fixes from Ingo Molnar.
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Remove NULL assignment of dattr_cur
sched: Remove the last NULL entry from sched_feat_names
sched: Make sched_feat_names const
sched/rt: Fix SCHED_RR across cgroups
sched: Move nr_cpus_allowed out of 'struct sched_rt_entity'
sched: Make sure to not re-read variables after validation
sched: Fix SD_OVERLAP
sched: Don't try allocating memory from offline nodes
sched/nohz: Fix rq->cpu_load calculations some more
sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask
kvm_set_irq() has an internal buffer of three irq routing entries, allowing
connecting a GSI to three IRQ chips or on MSI. However setup_routing_entry()
does not properly enforce this, allowing three irqchip routes followed by
an MSI route to overflow the buffer.
Fix by ensuring that an MSI entry is added to an empty list.
Signed-off-by: Avi Kivity <avi@redhat.com>
Locking mutex in different orders just screams for
deadlocks, and some testing showed that it is actually
quite easy to trigger them.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Properly set up the RBs
- Properly set up the SPI
- Properly set up gb_addr_config
This should fix rendering issues on certain cards.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Many TVs and A/V receivers don't work with this bit set. Problem was
confirmed using: Onkyo TX-SR605, Sony BRAVIA KDL-52X3500, Sony BRAVIA
KDL-40S40xx. In theory this bit shouldn't affect audio engine when
feeding it with data, however it seems it does. Driver fglrx doesn't set
that bit in any of the above cases.
This fixes a regression introduced by 3.5-rc1.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is based on info released by AMD, should allow using audio in much
more cases.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Call it in the asic startup callback on all asics.
Previously r600 and rv770 called it in the startup
and resume callbacks while all the other asics called
it in the startup callback.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Pull signal and vfs compile breakage fixes from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
fixups for signal breakage
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
nommu: fix compilation of nommu.c
Pull cifs fixes from Steve French.
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Move get_next_mid to ops struct
CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe
CIFS: Improve identation in cifs_unlock_range
CIFS: Fix possible wrong memory allocation
Compiling 3.5-rc1 for nommu targets gives:
CC mm/nommu.o
mm/nommu.c: In function ‘sys_mmap_pgoff’:
mm/nommu.c:1489:2: error: ‘ret’ undeclared (first use in this function)
mm/nommu.c:1489:2: note: each undeclared identifier is reported only once for each function it appears in
It is trivially fixed by replacing 'ret' with the local variable that is
already defined for the return value 'retval'.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In some environments, dramatic performance savings may be obtained because
swapped pages are saved in RAM (or a RAM-like device) instead of a swap disk.
This tag provides the basic infrastructure along with some changes to the
existing backends.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJPsorBAAoJEFjIrFwIi8fJcz8H/RBXCtFo0kiJmRked3nMAIDO
/2zN/q/Qawsg9aeoGlP7G8hQi9PMipbhQj3ixHyCTMv0zMbH988GXbBce+gIcg6e
TOQi7xXAuPEwLizmSpiTv84XzN5bMgu1oJXEqIXw0EIpuZAmp+9m/o3WBwEAtyxi
B+hvjE7eZM8f75K3lxs6sOtmIcERj9zqmT933Y8+i9iiuRyGMey2SyKtvVLbYZ+j
HroFMUi0so5TzxT/cpkRiHu0U75c651o+LV00zh7InMqbwyRsWlKTf53k8Q/q2WP
I7dVmfItwN/TpOrYTfxglYFlbYuUP35ziFvZ2trd6hcs9RK8OuKw+OmBLReHTtc=
=x9Vp
-----END PGP SIGNATURE-----
Merge tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm
Pull frontswap feature from Konrad Rzeszutek Wilk:
"Frontswap provides a "transcendent memory" interface for swap pages.
In some environments, dramatic performance savings may be obtained
because swapped pages are saved in RAM (or a RAM-like device) instead
of a swap disk. This tag provides the basic infrastructure along with
some changes to the existing backends."
Fix up trivial conflict in mm/Makefile due to removal of swap token code
changing a line next to the new frontswap entry.
This pull request came in before the merge window even opened, it got
delayed to after the merge window by me just wanting to make sure it had
actual users. Apparently IBM is using this on their embedded side, and
Jan Beulich says that it's already made available for SLES and OpenSUSE
users.
Also acked by Rik van Riel, and Konrad points to other people liking it
too. So in it goes.
By Dan Magenheimer (4) and Konrad Rzeszutek Wilk (2)
via Konrad Rzeszutek Wilk
* tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm:
frontswap: s/put_page/store/g s/get_page/load
MAINTAINER: Add myself for the frontswap API
mm: frontswap: config and doc files
mm: frontswap: core frontswap functionality
mm: frontswap: core swap subsystem hooks and headers
mm: frontswap: add frontswap header file
Pull irq and smpboot updates from Thomas Gleixner:
"Just cleanup patches with no functional change and a fix for suspend
issues."
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Introduce irq_do_set_affinity() to reduce duplicated code
genirq: Add IRQS_PENDING for nested and simple irq
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smpboot, idle: Fix comment mismatch over idle_threads_init()
smpboot, idle: Optimize calls to smp_processor_id() in idle_threads_init()
Pull timer updates from Thomas Gleixner:
"The clocksource driver is pure hardware enablement and the skew option
is default off, well tested and non dangerous."
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick: Move skew_tick option into the HIGH_RES_TIMER section
clocksource: em_sti: Add DT support
clocksource: em_sti: Emma Mobile STI driver
clockevents: Make clockevents_config() a global symbol
tick: Add tick skew boot option
Cyrill Gorcunov reports that I broke the fdinfo files with commit
30a08bf2d3 ("proc: move fd symlink i_mode calculations into
tid_fd_revalidate()"), and he's quite right.
The tid_fd_revalidate() function is not just used for the <tid>/fd
symlinks, it's also used for the <tid>/fdinfo/<fd> files, and the
permission model for those are different.
So do the dynamic symlink permission handling just for symlinks, making
the fdinfo files once more appear as the proper regular files they are.
Of course, Al Viro argued (probably correctly) that we shouldn't do the
symlink permission games at all, and make the symlinks always just be
the normal 'lrwxrwxrwx'. That would have avoided this issue too, but
since somebody noticed that the permissions had changed (which was the
reason for that original commit 30a08bf2d3 in the first place), people
do apparently use this feature.
[ Basically, you can use the symlink permission data as a cheap "fdinfo"
replacement, since you see whether the file is open for reading and/or
writing by just looking at st_mode of the symlink. So the feature
does make sense, even if the pain it has caused means we probably
shouldn't have done it to begin with. ]
Reported-and-tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is useful for SoCs whose I2C module's signals can be routed to
different sets of pins at run-time, using the pinctrl API.
+-----+ +-----+
| dev | | dev |
+------------------------+ +-----+ +-----+
| SoC | | |
| /----|------+--------+
| +---+ +------+ | child bus A, on first set of pins
| |I2C|---|Pinmux| |
| +---+ +------+ | child bus B, on second set of pins
| \----|------+--------+--------+
| | | | |
+------------------------+ +-----+ +-----+ +-----+
| dev | | dev | | dev |
+-----+ +-----+ +-----+
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
In the error path of the ppr_notifer it can happen that the
iommu->lock is taken recursivly. This patch fixes the
problem by releasing the iommu->lock before any notifier is
invoked. This also requires to move the erratum workaround
for the ppr-log (interrupt may be faster than data in the log)
one function up.
Cc: stable@vger.kernel.org # v3.3, v3.4
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
At some point pci_get_bus_and_slot started to enable
interrupts. Since this function is used in the
amd_iommu_resume path it will enable interrupts on resume
which causes a warning. The fix will use a cached pointer
to the root-bridge to re-enable the IOMMU in case the BIOS
is broken.
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Commit e605b743f3 ("IB/mlx4: Increase the number of vectors (EQs)
available for ULPs") didn't handle correctly the case where there
aren't enough MSI-X vectors to increase the number of EQs, so only the
legacy EQs are allocated. This results in an attempt to memset() to
zero the EQ table which was never allocated and a kernel crash.
Fix this by checking in the teardown flow if the table of EQs was ever
allocated. Also remove some unneeded setting to zero of the EQ
related fields in struct mlx4_ib_dev.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
CMA has been enabled unconditionally on all ARMv6+ systems to solve the
long standing issue of double kernel mappings for all dma coherent
buffers. This however created a dependency on CONFIG_EXPERIMENTAL for
the whole ARM architecture what should be really avoided. This patch
removes this dependency and lets one use old, well-tested dma-mapping
implementation also on ARMv6+ systems without the need to use
EXPERIMENTAL stuff.
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
When using rping -c -a 0.0.0.0 with iw_cxgb4, the system crashes when
rdma_connect() is called. ip_dev_find() will return NULL, but pdev is
accessed anyway.
Checking that pdev is NULL and returning -ENODEV prevents the system
from crashing.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Should be 'exynos5_xxx' instead of 'exonys5_xxx'.
It happened at the commit 30b842889e ("Merge tag 'soc2' of
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc")
during v3.5 merge window.
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
[ My bad - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull some left-over PM patches from Rafael J. Wysocki.
* 'pm-acpi' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification
ACPI / PM: Make __acpi_bus_get_power() cover D3cold correctly
ACPI / PM: Fix error messages in drivers/acpi/bus.c
rtc-cmos / PM: report wakeup event on ACPI RTC alarm
ACPI / PM: Generate wakeup events on fixed power button
This reverts commit 5ceb9ce6fe.
That commit seems to be the cause of the mm compation list corruption
issues that Dave Jones reported. The locking (or rather, absense
there-of) is dubious, as is the use of the 'page' variable once it has
been found to be outside the pageblock range.
So revert it for now, we can re-visit this for 3.6. If we even need to:
as Minchan Kim says, "The patch wasn't a bug fix and even test workload
was very theoretical".
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New tmpfs use of !PageUptodate pages for fallocate() is triggering the
WARNING: at mm/page-writeback.c:1990 when __set_page_dirty_nobuffers()
is called from migrate_page_copy() for compaction.
It is anomalous that migration should use __set_page_dirty_nobuffers()
on an address_space that does not participate in dirty and writeback
accounting; and this has also been observed to insert surprising dirty
tags into a tmpfs radix_tree, despite tmpfs not using tags at all.
We should probably give migrate_page_copy() a better way to preserve the
tag and migrate accounting info, when mapping_cap_account_dirty(). But
that needs some more work: so in the interim, avoid the warning by using
a simple SetPageDirty on PageSwapBacked pages.
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The comment above it says "Stat data, not accessed from path walking",
but in fact some of inode fields we use for the common stat data was way
down at the end of the inode, causing unnecessary cache misses for the
common stat operations.
The inode structure is pretty big, and this can change padding depending
on field width, but at least on the common 64-bit configurations this
doesn't change the size. Some of our inode layout has historically been
to tro to avoid unnecessary padding fields, but cache locality is at
least as important for layout, if not more.
Noticed by looking at kernel profiles, and noticing that the "i_blkbits"
access stood out like a sore thumb.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert to use O_DSYNC for all cases at FILEIO backend creation time to
avoid the extra syncing of pure timestamp updates with legacy O_SYNC during
default operation as recommended by hch. Continue to do this independently of
Write Cache Enable (WCE) bit, as WCE=0 is currently the default for all backend
devices and enabled by user on per device basis via attrib/emulate_write_cache.
This patch drops the now unnecessary fd_buffered_io= token usage that was
originally signalling when to explictly disable O_SYNC at backend creation
time for buffered I/O operation. This can end up being dangerous for a number
of reasons during physical node failure, so go ahead and drop this option
for now when O_DSYNC is used as the default.
Also allow explict FUA WRITEs -> vfs_fsync_range() call to function in
fd_execute_cmd() independently of WCE bit setting.
Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>