pm_idle appears in no generic Linux code,
it appears only in architecture-specific code.
Thus, pm_idle should not be declared in pm.h.
Architectures that use an idle function pointer
should delcare one local to their architecture,
and/or use cpuidle.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Cc: linux-pm@vger.kernel.org
as pm_idle() has already been deleted from this code,
the comment was a stray.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
All paths on m32r lead to cpu_relax().
So delete the dead code and simply call cpu_relax() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: linux-m32r@ml.linux-m32r.org
pm_idle() on ia64 was a synonym for default_idle().
So simply invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: linux-ia64@vger.kernel.org
pm_idle() and idle() served no purpose on cris --
invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
pm_idle() on arm64 was a synonym for default_idle(),
so remove it and invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
pm_idle() on ARM was a synonym for default_idle(),
so simply invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
(pm_idle)() is being removed from linux/pm.h
because Linux does not have such a cross-architecture concept.
sparc uses an idle function pointer in its architecture
specific code. So we re-name sparc use of pm_idle to sparc_idle.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
SH idle code could use some simplification.
This patch enables that by guaranteeing
that "sh_idle" is local, and thus architecture specific.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: linux-sh@vger.kernel.org
(pm_idle)() is being removed from linux/pm.h
because Linux does not have such a cross-architecture concept.
x86 uses an idle function pointer in its architecture
specific code as a backup to cpuidle. So we re-name
x86 use of pm_idle to x86_idle, and make it static to x86.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Update APM to register its local idle routine with cpuidle.
This allows us to stop exporting pm_idle to modules on x86.
The Kconfig sub-option, APM_CPU_IDLE, now depends on on CPU_IDLE.
Compile-tested only.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Jiri Kosina <jkosina@suse.cz>
The SMI counter is popular -- so display it by default
rather than requiring an option. What the heck,
we've blown the 80 column budget on many systems already...
Note that the value displayed is the delta
during the measurement interval.
The absolute value of the counter can still be seen with
the generic 32-bit MSR option, ie. -m 0x34
Signed-off-by: Len Brown <len.brown@intel.com>
Here we disable HW promotion of C1 to C1E
and export both C1 and C1E and distinct C-states.
This allows a cpuidle governor to choose a lower latency
C-state than C1E when necessary to satisfy performance
and QOS constraints -- and still save power versus polling.
This also corrects the erroneous latency previously reported
for C1E -- it is 10usec, not 1usec.
Note that if you use "intel_idle.max_cstate=N",
then you must increment N by 1 to get the same behavior
after this change.
Signed-off-by: Len Brown <len.brown@intel.com>
Remove 32-bit x86 a cmdline param "no-hlt",
and the cpuinfo_x86.hlt_works_ok that it sets.
If a user wants to avoid HLT, then "idle=poll"
is much more useful, as it avoids invocation of HLT
in idle, while "no-hlt" failed to do so.
Indeed, hlt_works_ok was consulted in only 3 places.
First, in /proc/cpuinfo where "hlt_bug yes"
would be printed if and only if the user booted
the system with "no-hlt" -- as there was no other code
to set that flag.
Second, check_hlt() would not invoke halt() if "no-hlt"
were on the cmdline.
Third, it was consulted in stop_this_cpu(), which is invoked
by native_machine_halt()/reboot_interrupt()/smp_stop_nmi_callback() --
all cases where the machine is being shutdown/reset.
The flag was not consulted in the more frequently invoked
play_dead()/hlt_play_dead() used in processor offline and suspend.
Since Linux-3.0 there has been a run-time notice upon "no-hlt" invocations
indicating that it would be removed in 2012.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
mwait_idle() is a C1-only idle loop intended to be more efficient
than HLT, starting on Pentium-4 HT-enabled processors.
But mwait_idle() has been replaced by the more general
mwait_idle_with_hints(), which handles both C1 and deeper C-states.
ACPI processor_idle and intel_idle use only mwait_idle_with_hints(),
and no longer use mwait_idle().
Here we simplify the x86 native idle code by removing mwait_idle(),
and the "idle=mwait" bootparam used to invoke it.
Since Linux 3.0 there has been a boot-time warning when "idle=mwait"
was invoked saying it would be removed in 2012. This removal
was also noted in the (now removed:-) feature-removal-schedule.txt.
After this change, kernels configured with
(CONFIG_ACPI=n && CONFIG_INTEL_IDLE=n) when run on hardware
that supports MWAIT will simply use HLT. If MWAIT is desired
on those systems, cpuidle and the cpuidle drivers above
can be enabled.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
This macro is only invoked by Xen,
so make its definition specific to Xen.
> set_pm_idle_to_default()
< xen_set_default_idle()
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: xen-devel@lists.xensource.com
Remove the assumption that cstate_tables are
indexed by MWAIT flag values. Each entry
identifies itself via its own flags value.
This change is needed to support multiple states
that share the same MWAIT flags.
Note that this can have an effect on what state is described
by 'N' on cmdline intel_idle.max_cstate=N on some systems.
intel_idle.max_cstate=0 still disables the driver
intel_idle.max_cstate=1 still results in just C1(E)
However, "place holders" in the sparse C-state name-space
(eg. Atom) have been removed.
Signed-off-by: Len Brown <len.brown@intel.com>
Cosmetic only.
Replace use of MWAIT_MAX_NUM_CSTATES with CPUIDLE_STATE_MAX.
They are both 8, so this patch has no functional change.
The reason to change is that intel_idle will soon be able
to export more than the 8 "major" states supported by MWAIT.
When we hit that limit, it is important to know
where the limit comes from.
Signed-off-by: Len Brown <len.brown@intel.com>
When verbose is enabled, print the C1E-Enable
bit in MSR_IA32_POWER_CTL.
also delete some redundant tests on the verbose variable.
Signed-off-by: Len Brown <len.brown@intel.com>
This patch enables turbostat to run properly on the
next-generation Intel(R) Microarchitecture, code named "Haswell" (HSW).
HSW supports the BCLK and counters found in SNB.
Signed-off-by: Len Brown <len.brown@intel.com>
This patch enables intel_idle to run on the
next-generation Intel(R) Microarchitecture code named "Haswell".
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown sent a patch to remove this field in the intel_idle driver.
The other user of this field is the davinci cpuidle driver and a
patch has been sent to remove the usage of it.
This patch removes the last user of this field.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
The cpuidle_device is retrieved in the function by using directly
the global variable. But the caller of this function already have
this device and it can be passed as a parameter. That is one small
step to encapsulate the code more.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
The different definitions are not used anywhere in the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
The device->state_count is initialized in the cpuidle_register_device
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
With one function handling the idle state and a single variable,
the usage of the davinci_ops is overkill.
This patch removes these ops and simplify the code.
Furthermore, the 'driver_data' field is no longer used, we have
1 of the 3 remaining user of this field removed.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The patch is mindless, it just moves the idle function below in the file
in order to prevent forward declaration in the next patch.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
That will allow to cleanup the rest of the code right after,
because the ops won't make sense.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The commit, 4202735e8a
(cpuidle: Split cpuidle_state structure and move per-cpu statistics fields)
observed that the MWAIT flags for Cn on every processor to date were the
same, and created get_driver_data() to supply them.
Unfortunately, that assumption is false, going forward.
So here we restore the MWAIT flags to the cpuidle_state table.
However, instead restoring the old "driver_data" field,
we put the flags into the existing "flags" field,
where they probalby should have lived all along.
This patch does not change any operation.
This patch removes 1 of the 3 users of cpuidle_state_usage.driver_data.
Perhaps some day we'll get rid of the other 2.
Signed-off-by: Len Brown <len.brown@intel.com>
support.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRCn+IAAoJEK2W1qbAHj1naQEP/2eXMOslRyws7M6CcsEgpEK9
N2L2hf6bD3xF/04ZSLbHFI6hPe9wDXSL9Vxd+DRLYTSnc0E9WYBXHmE6Eb0L0xK8
m0Iubk/hi7mk6mnMJtpTFT5pazBTPhVz0nXOijguh5U6PW0xL+4ypXe9nrH2jtW0
DvEHFDIPbKcqwplm8nvo/QJ5O3YNQaMifKUtpXF/JWGlCYP4vPk0dJVg9ATbscEV
Fg3kefoJzZM09q3Uvo01wigbj+wRkpBK9+CiyW6XcE0lkOAnFmpvyYerIoAHAK37
Rsw5J4aMPA9U8mggBEtlHBWa0q5utZafHM11lT2ZeFGCXkdn+TSWni6O7ov54xPP
Cd7jx+uNpe/OuLT5YjbCg2IMXgJs+zIZMSeqSj3SrywE0a0EQHECWiXaFmMmrCCJ
TgZtmp/HS1UsdoiHA3v3ZX3AaX4W+mggYp/5md9P1vHyYS9uTlgSVplhwtVgsL23
EsDxNNxODSIFMAMnrXxAV+NBPiQRY42K22hK/RrnWew9roAQHxroIvQDmzzm5ZRL
BqCFW3w/x2loJOZZ6NH/J8IUEoF9RhCK1tOGVjFuAVn30srt3zXb4pOzYeydykT7
m04HaGO7rCBJI75XdVDhm6ozOvV/GhXF2fJOt4qyoX/X6M8YN5i0jfwC3rhKeuOe
U9fyyYoQV37EWIRI9K5Q
=qtaF
-----END PGP SIGNATURE-----
Merge tag 'dm-3.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Pull more device-mapper fixes from Alasdair G Kergon:
"A fix for stacked dm thin devices and a fix for the new dm WRITE SAME
support."
* tag 'dm-3.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm: fix write same requests counting
dm thin: fix queue limits stacking
PullHID fixes from Jiri Kosina:
- fix i2c-hid and hidraw interaction, by Benjamin Tissoires
- a quirk to make a particular device (Formosa IR receiver) work
properly, by Nicholas Santos
* 'for-3.8/upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: i2c-hid: fix i2c_hid_output_raw_report
HID: usbhid: quirk for Formosa IR receiver
HID: remove x bit from sensor doc
- Error reporting in nfs_xdev_mount incorrectly maps all errors to ENOMEM
- Fix an NFSv4 refcounting issue
- Fix a mount failure when the server reboots during NFSv4 trunking discovery
- NFSv4.1 mounts may need to run the lease recovery thread.
- Don't silently fail setattr() requests on mountpoints
- Fix a SUNRPC socket/transport livelock and priority queue issue
- We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJRCpS4AAoJEGcL54qWCgDyqucP/2CTv5leu+X5/0PXBOykAIHg
8oEsTEz7/4IvIxTXzuHDYirMnm/mulfGF6NrdPqfvxpAHqRfVBfLFocfNLMVhQci
97RmBfEEGM22AToYUubML5bIxr0QllV4s9Vmyh/zGDan52y7zNNlZX+v6aLjZbJB
Fbolihpcch6lhQEUNAzK0B0ddimDl9lazx/WTmMOD/JrwOqzA4FJC+YxBe88nfzQ
c6sYyEptBaSirbCOlueqGpv8skB1CLpFJXguXToPXFxpWed6uoGrIwLO7MLUdFpJ
Xw+j8cuv/wjyYJGVKjhW7kXtwK8T7+u4bT2L883R01XYXr8XfkkLON0dgG1X/unk
80mLzCO1+qRdoDSQ4b/V4B0nScPRCJuoZpftjCi2uhKewNcxQPMZ2V5/D7pO3uyE
NdhxByB8D86JfNrIcBcRaxfuiQsurQBDvsDNmWPBZSOmH/dmHqTQGLcIe6N94A0B
c7KFtXrN2MzOl8S68dUpbhftObq9X0oK2oxFFLWRQoqjrFtiDLU5JilV0bEH2CMo
gJX7CRrPoJ2wmNToKdPJRWBYDmtMMIThq3vIpCj1FFzX17r5grJpqCmp/TViu/ER
r8rmJzni+nmHaO1NLJMTCHzLJ8soiKyBq8PZKAbipTJxy+TXI/69jnWzpIQ/pbM/
JN+0tiCcqmUmXsO+hrCP
=lWFV
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Error reporting in nfs_xdev_mount incorrectly maps all errors to
ENOMEM
- Fix an NFSv4 refcounting issue
- Fix a mount failure when the server reboots during NFSv4 trunking
discovery
- NFSv4.1 mounts may need to run the lease recovery thread.
- Don't silently fail setattr() requests on mountpoints
- Fix a SUNRPC socket/transport livelock and priority queue issue
- We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.
* tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session
SUNRPC: When changing the queue priority, ensure that we change the owner
NFS: Don't silently fail setattr() requests on mountpoints
NFSv4.1: Ensure that nfs41_walk_client_list() does start lease recovery
NFSv4: Fix NFSv4 trunking discovery
NFSv4: Fix NFSv4 reference counting for trunked sessions
NFS: Fix error reporting in nfs_xdev_mount
Pull MIPS updates from Ralf Baechle:
"A number of fixes all across the MIPS tree. No area is particularly
standing out and things have cooled down quite nicely for a release."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Function tracer: Fix broken function tracing
mips: Move __virt_addr_valid() to a place for MIPS 64
MIPS: Netlogic: Fix UP compilation on XLR
MIPS: AR71xx: Fix AR71XX_PCI_MEM_SIZE
MIPS: AR724x: Fix AR724X_PCI_MEM_SIZE
MIPS: Lantiq: Fix cp0_perfcount_irq mapping
MIPS: DSP: Fix DSP mask for registers.
MIPS: Fix build failure by adding definition of pfn_pmd().
MIPS: Octeon: Fix warning.
MIPS: delay.c: Check BITS_PER_LONG instead of __SIZEOF_LONG__
MIPS: PNX833x: Fix comment.
MIPS: Add struct p_format to union mips_instruction.
MIPS: Export <asm/break.h>.
MIPS: BCM47xx: Enable SSB prerequisite SSB_DRIVER_PCICORE.
MIPS: BCM47xx: Select GPIOLIB for BCMA on bcm47xx platform
MIPS: vpe.c: Fix null pointer dereference in print arguments.
i2c_hid_output_raw_report is used by hidraw to forward set_report requests.
The current implementation of i2c_hid_set_report needs to take the
report_id as an argument. The report_id is stored in the first byte
of the buffer in argument of i2c_hid_output_raw_report.
Not removing the report_id from the given buffer adds this byte 2 times
in the command, leading to a non working command.
Reported-by: Andrew Duggan <aduggan@synaptics.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Function tracing is currently broken for all 32 bit MIPS platforms.
When tracing is enabled, the kernel immediately hangs on boot.
This is a result of commit b732d439cb
that changes the kernel/trace/Kconfig file so that is no longer
forces FRAME_POINTER when FUNCTION_TRACING is enabled.
MIPS frame pointers are generally considered to be useless because
they cannot be used to unwind the stack. Unfortunately the MIPS
function tracing code has bugs that are masked by the use of frame
pointers. This commit fixes the bugs so that MIPS frame pointers
don't need to be enabled.
The bugs are a result of the odd calling sequence used to call the trace
routine. This calling sequence is inserted into every traceable function
when the tracing CONFIG option is enabled. This sequence is generated
for 32bit MIPS platforms by the compiler via the "-pg" flag.
Part of the sequence is "addiu sp,sp,-8" in the delay slot after every
call to the trace routine "_mcount" (some legacy thing where 2 arguments
used to be pushed on the stack). The _mcount routine is expected to
adjust the sp by +8 before returning. So when not disabled, the original
jalr and addiu will be there, so _mcount has to adjust sp.
The problem is that when tracing is disabled for a function, the
"jalr _mcount" instruction is replaced with a nop, but the
"addiu sp,sp,-8" is still executed and the stack pointer is left
trashed. When frame pointers are enabled the problem is masked
because any access to the stack is done through the frame
pointer and the stack pointer is restored from the frame pointer when
the function returns.
This patch writes two nops starting at the address of the "jalr _mcount"
instruction whenever tracing is disabled. This means that the
"addiu sp,sp.-8" will be converted to a nop along with the "jalr". When
disabled, there will be two nops.
This is SMP safe because the first time this happens is during
ftrace_init() which is before any other processor has been started.
Subsequent calls to enable/disable tracing when other CPUs ARE running
will still be safe because the enable will only change the first nop
to a "jalr" and the disable, while writing 2 nops, will only be changing
the "jalr". This patch also stops using stop_machine() to call the
tracer enable/disable routines and calls them directly because the
routines are SMP safe.
When the kernel first boots we have to be able to handle the gcc
generated jalr, addui sequence until ftrace_init gets a chance to run
and change the sequence. At this point mcount just adjusts the stack
and returns. When ftrace_init runs, we convert the jalr/addui to nops.
Then whenever tracing is enabled we convert the first nop to a "jalr
mcount+8". The mcount+8 entry point skips the stack adjust.
[ralf@linux-mips.org: Folded in Steven Rostedt's build fix.]
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Cc: rostedt@goodmis.org
Cc: ddaney.cavm@gmail.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/4806/
Patchwork: https://patchwork.linux-mips.org/patch/4841/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When processing write same requests, fix dm to send the configured
number of WRITE SAME requests to the target rather than the number of
discards, which is not always the same.
Device-mapper WRITE SAME support was introduced by commit
23508a96cd ("dm: add WRITE SAME support").
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Commit d3ce884318 "MIPS: Fix modpost error in modules attepting to use
virt_addr_valid()" moved __virt_addr_valid() from a macro in a header
file to a function in ioremap.c. But ioremap.c is only compiled for MIPS
32, and not for MIPS 64.
When compiling for my yeeloong2, which supposedly supports hibernation,
which compiles kernel/power/snapshot.c which calls virt_addr_valid(), I
got this error:
LD init/built-in.o
kernel/built-in.o: In function `memory_bm_free':
snapshot.c:(.text+0x4c9c4): undefined reference to `__virt_addr_valid'
snapshot.c:(.text+0x4ca58): undefined reference to `__virt_addr_valid'
kernel/built-in.o: In function `snapshot_write_next':
(.text+0x4e44c): undefined reference to `__virt_addr_valid'
kernel/built-in.o: In function `snapshot_write_next':
(.text+0x4e890): undefined reference to `__virt_addr_valid'
make[1]: *** [vmlinux] Error 1
make: *** [sub-make] Error 2
I suspect that __virt_addr_valid() is fine for mips 64. I moved it to
mmap.c such that it gets compiled for mips 64 and 32.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4842/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
thin_io_hints() is blindly copying the queue limits from the thin-pool
which can lead to incorrect limits being set. The fix here simply
deletes the thin_io_hints() hook which leaves the existing stacking
infrastructure to set the limits correctly.
When a thin-pool uses an MD device for the data device a thin device
from the thin-pool must respect MD's constraints about disallowing a bio
from spanning multiple chunks. Otherwise we can see problems. If the raid0
chunksize is 1152K and thin-pool chunksize is 256K I see the following
md/raid0 error (with extra debug tracing added to thin_endio) when
mkfs.xfs is executed against the thin device:
md/raid0:md99: make_request bug: can't convert block across chunks or bigger than 1152k 6688 127
device-mapper: thin: bio sector=2080 err=-5 bi_size=130560 bi_rw=17 bi_vcnt=32 bi_idx=0
This extra DM debugging shows that the failing bio is spanning across
the first and second logical 1152K chunk (sector 2080 + 255 takes the
bio beyond the first chunk's boundary of sector 2304). So the bio
splitting that DM is doing clearly isn't respecting the MD limits.
max_hw_sectors_kb is 127 for both the thin-pool and thin device
(queue_max_hw_sectors returns 255 so we'll excuse sysfs's lack of
precision). So this explains why bi_size is 130560.
But the thin device's max_hw_sectors_kb should be 4 (PAGE_SIZE) given
that it doesn't have a .merge function (for bio_add_page to consult
indirectly via dm_merge_bvec) yet the thin-pool does sit above an MD
device that has a compulsory merge_bvec_fn. This scenario is exactly
why DM must resort to sending single PAGE_SIZE bios to the underlying
layer. Some additional context for this is available in the header for
commit 8cbeb67a ("dm: avoid unsupported spanning of md stripe boundaries").
Long story short, the reason a thin device doesn't properly get
configured to have a max_hw_sectors_kb of 4 (PAGE_SIZE) is that
thin_io_hints() is blindly copying the queue limits from the thin-pool
device directly to the thin device's queue limits.
Fix this by eliminating thin_io_hints. Doing so is safe because the
block layer's queue limits stacking already enables the upper level thin
device to inherit the thin-pool device's discard and minimum_io_size and
optimal_io_size limits that get set in pool_io_hints. But avoiding the
queue limits copy allows the thin and thin-pool limits to be different
where it is important, namely max_hw_sectors_kb.
Reported-by: Daniel Browning <db@kavod.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Pull x86 EFI fixes from Peter Anvin:
"This is a collection of fixes for the EFI support. The controversial
bit here is a set of patches which bumps the boot protocol version as
part of fixing some serious problems with the EFI handover protocol,
used when booting under EFI using a bootloader as opposed to directly
from EFI. These changes should also make it a lot saner to support
cross-mode 32/64-bit EFI booting in the future. Getting these changes
into 3.8 means we avoid presenting an inconsistent ABI to bootloaders.
Other changes are display detection and fixing efivarfs."
* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, efi: remove attribute check from setup_efi_pci
x86, build: Dynamically find entry points in compressed startup code
x86, efi: Fix PCI ROM handing in EFI boot stub, in 32-bit mode
x86, efi: Fix 32-bit EFI handover protocol entry point
x86, efi: Fix display detection in EFI boot stub
x86, boot: Define the 2.12 bzImage boot protocol
x86/boot: Fix minor fd leakage in tools/relocs.c
x86, efi: Set runtime_version to the EFI spec revision
x86, efi: fix 32-bit warnings in setup_efi_pci()
efivarfs: Delete dentry from dcache in efivarfs_file_write()
efivarfs: Never return ENOENT from firmware
efi, x86: Pass a proper identity mapping in efi_call_phys_prelog
efivarfs: Drop link count of the right inode
Pull x86 fixes from Peter Anvin:
"This is a collection of miscellaneous fixes, the most important one is
the fix for the Samsung laptop bricking issue (auto-blacklisting the
samsung-laptop driver); the efi_enabled() changes you see below are
prerequisites for that fix.
The other issues fixed are booting on OLPC XO-1.5, an UV fix, NMI
debugging, and requiring CAP_SYS_RAWIO for MSR references, just as
with I/O port references."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
samsung-laptop: Disable on EFI hardware
efi: Make 'efi_enabled' a function to query EFI facilities
smp: Fix SMP function call empty cpu mask race
x86/msr: Add capabilities check
x86/dma-debug: Bump PREALLOC_DMA_DEBUG_ENTRIES
x86/olpc: Fix olpc-xo1-sci.c build errors
arch/x86/platform/uv: Fix incorrect tlb flush all issue
x86-64: Fix unwind annotations in recent NMI changes
x86-32: Start out cr0 clean, disable paging before modifying cr3/4