Commit Graph

509940 Commits

Author SHA1 Message Date
Geert Uytterhoeven
13a6a2ed1f mmc: tmio: Remove bogus un-initialization in tmio_mmc_host_free()
If CONFIG_DEBUG_SLAB=y:

    sh_mobile_sdhi ee100000.sd: Got CD GPIO
    sh_mobile_sdhi ee100000.sd: Got WP GPIO
    platform ee100000.sd: Driver sh_mobile_sdhi requests probe deferral
    ...
    Slab corruption (Not tainted): kmalloc-1024 start=ed8b3c00, len=1024
    2d0: 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  ....kkkkkkkkkkkk
    Prev obj: start=ed8b3800, len=1024
    000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
    010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Struct tmio_mmc_host is embedded inside struct mmc_host, and thus is
freed by the call to mmc_free_host(). Hence it must not be written to
afterwards, as that will corrupt freed (and perhaps already reused)
memory.

Fixes: 94b110aff8 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:38 +01:00
Wu Fengguang
c22f5e1b1c mmc: dw_mmc: exynos: dw_mci_exynos_prepare_hs400_tuning() can be static
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:38 +01:00
Russ Dill
3d3bbfbdfd mmc: omap_hsmmc: add hibernation support
Setting a dev_pm_ops suspend/resume pair but not a set of
hibernation functions means those pm functions will not be
called upon hibernation.
Fix this by using SET_SYSTEM_SLEEP_PM_OPS, which appropriately
assigns the suspend and hibernation handlers and move
omap_hsmmc_x callbacks under CONFIG_PM_SLEEP to avoid build warnings.

Signed-off-by: Russ Dill <Russ.Dill@ti.com>
[Grygorii.Strashko@linaro.org: rebased on top of K4.0]
Signed-off-by: Grygorii Strashko <grygorii.strashko@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:37 +01:00
Andreas Fenkart
cde592cbe5 mmc: omap_hsmmc: use distinctive code paths for cover / card detect logic
Mobile phones (some) have no card detect pin, but can detect if the
cover is removed. The purpose is the same; detect if card is being
added/removed, but the details differ.
When the cover is removed, it does not mean the card is gone. But it
might, since it is accessible now. It's like a warning. All the driver
does is to limit write access to the card, see protect_card flag.
In contrast, card detect notifies us after the fact, e.g.
card is gone, card is inserted. We can't take precautions, but we can
rely on those events, -- the card is really gone, or do scan the card.
To summarize there is not much code sharing between cover and card
detect, it only increases confusion. By splitting, both will be
simplified in a followup patch.

Signed-off-by: Andreas Fenkart <afenkart@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:37 +01:00
Andreas Fenkart
a49d83537b mmc: omap_hsmmc: use slot-gpio functions to manage read-only pin directly
The indirection via omap_hsmmc_get_ro and omap_hsmmc_get_wp is
redundant. Also dropped setting gpio_wp to EINVAL since platform date
is read-only
Untested: no device with ro pin was available, but change is fairly
simple

Signed-off-by: Andreas Fenkart <afenkart@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:36 +01:00
Andreas Fenkart
f048968f8e mmc: omap_hsmmc: remove unused fields from struct omap_hsmmc_host
addon to: 09108968b7b72b6083a3bfc8f8259a74ed57255e
    mmc: omap_hsmmc: remove prepare/complete system suspend support

Signed-off-by: Andreas Fenkart <afenkart@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:36 +01:00
Chen-Yu Tsai
9e71c589e4 mmc: sunxi: Use devm_reset_control_get_optional() for reset control
The reset control for the sunxi mmc controller is optional. Some
newer platforms (sun6i, sun8i, sun9i) have it, while older ones
(sun4i, sun5i, sun7i) don't.

Use the properly stubbed _optional version so the driver does not
fail to compile when RESET_CONTROLLER=n.

This patch also adds a check for deferred probing on the reset
control.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Cc: <stable@vger.kernel.org> # 3.16+
Acked-by: David Lanzendörfer <david.lanzendoerfer@o2s.ch>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:35 +01:00
Kevin Hao
caebcae94f mmc: sdhci: set the .remove to sdhci_pltfm_unregister()
In these drivers, the driver specific .remove function just a simple
wrapper of function sdhci_pltfm_unregister(). So remove these wrappers
and just set .remove to sdhci_pltfm_unregister().

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:35 +01:00
Kevin Hao
83eacdfa25 mmc: sdhci: disable the clock in sdhci_pltfm_unregister()
So we can avoid to sprinkle the clk_disable_unprepare() in many
drivers.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:34 +01:00
Kevin Hao
2290fcb341 mmc: sdhci-bcm-kona: kill the "external_clk" member in driver private struct
Actually we can use the "clk" in the struct sdhci_pltfm_host. Also
change the "external clock" to "core clock" and kill two redundant
private functions in this driver as suggested by Ray Jui.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:34 +01:00
Kevin Hao
e46af2987b mmc: sdhci-sirf: kill the "clk" member in driver private struct
Actually we can use the "clk" in the struct sdhci_pltfm_host.
With this change we can also kill the private function for get
max clock in this driver.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:34 +01:00
Kevin Hao
e4f79d9ca2 mmc: tegra: use devm help functions to get the clk and gpio
Simplify the error and remove path.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:33 +01:00
Kevin Hao
35daeede22 mmc: sdhci-dove: kill the driver specific private struct
There is only one "clk" member in this driver specific private struct.
Actually we can use the "clk" member in the struct sdhci_pltfm_host,
and then kill this struct completely.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:33 +01:00
Kevin Hao
6a4a34a413 mmc: sdhci-dove: remove the unneeded error check
The function clk_disable_unprepare() already take care of either error
or null cases.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:32 +01:00
weijun yang
b36ac1b43e mmc: sirf: update sdhci_sirf_execute_tuning procedure
For the original tuning code, delay value is set to SD Bus Clock Delay
Register (SD_CLK_DELAY_SETTING) as (val | (Val << 7) | (val << 16)),
which means CLK_DELAY_IN1, CLK_DELAY_IN2 and CLK_DELAY_OUT are the
same and with 128 steps. This is doubtful. In CSR design specification
documents CS-304575-DR-3H, this issue is clarified, the delay[13:0] in
SD_CLK_DELAY_SETTING is simplied to the concatenation of {CLK_DELAY_IN2,
CLK_DELAY_IN1}.
Besides, for CMD19 tuning, no need to set CLK_DELAY_OUT([22,16]
of SD_CLK_DELAY_SETTING).

Signed-off-by: weijun yang <york.yang@csr.com>
Signed-off-by: Barry Song <baohua.song@csr.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:32 +01:00
Doug Anderson
ed2540effa mmc: dw_mmc: Don't crash if we get an interrupt before slot has initted
It's unlikely that this is really needed on any single-slot systems
where we disable card detects until the end of probe, but it still
seems safer to check to make sure that a slot has been initted before
we try to dereference it to find the SDIO interrupt mask.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:31 +01:00
Doug Anderson
fa0c328343 mmc: dw_mmc: Only enable CD after setup and only if needed
We really don't want to get a card detect interrupt during probe time
since it can confuse things.  Let's disable the card detect interrupt
until we're in a really good place: the end of probe.  Let's also
simply avoid enabling the card detect interrupt if it's not used.

It appears that (at least on rk3288) when vqmmc is turned on it can
cause a bogus "card detect" interrupt.  That meant that we were
getting a predictable card detect interrupt while we were in
mmc_add_host().  On the version of the kernel I'm working with at
least (3.14), this is not a great time to get a card detect interrupt
since I think that we don't grab all the needed locks in
mmc_add_host() and children.  I put stack dumps in dw_mci_setup_bus()
and found that I could see two distinct stack crawls that looked like:

Caller one:
* dw_mci_setup_bus
* dw_mci_set_ios
* mmc_power_up
* mmc_start_host
* mmc_add_host

Caller two:
* dw_mci_setup_bus
* dw_mci_set_ios
* mmc_set_chip_select
* mmc_go_idle
* mmc_rescan
* process_one_work
* worker_thread
* kthread

Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:31 +01:00
Doug Anderson
0bdbd0e88c mmc: dw_mmc: Don't start commands while busy
We've seen problems on some WiFi modules where we seem to send a CMD53
(which requires the data lines) while the module is asserting busy.
We shouldn't do that.

The Designware Databook says that before issuing a new data transfer
command we should check for busy, so that's what we'll do.

We'll leverage the existing dw_mmc knowledge about whether it should
wait for the previous command to finish to know whether we should
check for busy before sending the command.  This means we won't end up
incorrectly waiting for things like CMD52 (SDIO) or CMD13 (SD) which
don't use the data line.

Note that this also has the advantage of making sure that we don't
change the clock while the card is busy, too.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:30 +01:00
Doug Anderson
d1f1dd8600 mmc: dw_mmc: Give a good reset after we give power
We should give dw_mmc a good reset after we apply power.  On some
boards vqmmc may actually be connected to the IP block in the SoC so
it's good to reset after power comes in.

Without this we sometimes see failures enumerating cards on rk3288.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:30 +01:00
Doug Anderson
655babbde6 mmc: dw_mmc: Make sure we only adjust the clock when power is on
It appears that we can confuse things if we try to turn on the MMC
clock when the power is off.  Adjust is so that we turn the clock on
(using dw_mci_setup_bus) after power is all the way on and we turn the
clock off before the power goes off.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:29 +01:00
addy ke
bdb9a90b3d mmc: dw_mmc: fix mmc_test by not sending abort for DRTO/EBE errors
The STOP command can terminate a data transfer between a memory card and
mmc controller.

As show in Synopsys DesignWare Cores Mobile Storage Host Databook:
Data timeout and Data end-bit error will terminate further data transfer
by mmc controller. So we should not send abort command to terminate a
data transfer again if we got DRTO and EBE interrupt.

After this patch, all mmc_test cases can pass on RK3288-Pink2 board.

Signed-off-by: Addy Ke <addy.ke@rock-chips.com>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Tested-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:29 +01:00
addy ke
6d53200b51 mmc: dw_mmc: rockchip: add support MMC_CAP_RUNTIME_RESUME capability
To support HS200 and UHS mode, mmc core will call init_card() to
execute tuning:
- sdio: init_card can be executed at runtime resume.
- sd and mmc: init_card can be executed at resume or runtime resume,
  which depends on MMC_CAP_RUNTIME_RESUME capability.

On rk3288 SoC, host will get DRTO interrupt when host send command
to read tuning data. This will spend more than 111ms:
drto_ms = drto_clks * 1000 / bus_hz = 111ms.

And the total tuning time will be more than 400ms.

So we should add MMC_CAP_RUNTIME_RESUME capability to execute tuning
at runtime resume. Only if we do so, can we pass resume test.

Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Addy Ke <addy.ke@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:28 +01:00
Seungwon Jeon
801131321a mmc: dw_mmc: exynos: Support eMMC's HS400 mode
Implements HS400 mode support for exynos host driver.
This also include some updates as new mode is added.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
[Alim: addressed review comments]
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-03-23 14:13:28 +01:00
James Hogan
0164a711c9 metag: Fix ioremap_wc/ioremap_cached build errors
When ioremap_wc() or ioremap_cached() are used without first including
asm/pgtable.h, the _PAGE_CACHEABLE or _PAGE_WR_COMBINE definitions
aren't found, resulting in build errors like the following (in
next-20150323 due to "lib: devres: add a helper function for
ioremap_wc"):

lib/devres.c: In function ‘devm_ioremap_wc’:
lib/devres.c:91: error: ‘_PAGE_WR_COMBINE’ undeclared

We can't easily include asm/pgtable.h in asm/io.h due to dependency
problems, so split out the _PAGE_* definitions from asm/pgtable.h into a
separate asm/pgtable-bits.h header (as a couple of other architectures
already do), and include that in io.h instead.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: Abhilash Kesavan <a.kesavan@samsung.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-23 12:32:37 +00:00
Helge Deller
2e3f0ab2bb parisc: Fix pmd code to depend on PT_NLEVELS value, not on CONFIG_64BIT
Make the code which sets up the pmd depend on PT_NLEVELS == 3, not on
CONFIG_64BIT. The reason is, that a 64bit kernel with a page size
greater than 4k doesn't need the pmd and thus has PT_NLEVELS = 2.

Signed-off-by: Helge Deller <deller@gmx.de>
2015-03-23 12:28:16 +01:00
Mikulas Patocka
0e0da48dee parisc: mm: don't count preallocated pmds
The patch dc6c9a35b6 that counts pmds
allocated for a process introduced a bug on 64-bit PA-RISC kernels.

The PA-RISC architecture preallocates one pmd with each pgd. This
preallocated pmd can never be freed - pmd_free does nothing when it is
called with this pmd. When the kernel attempts to free this preallocated
pmd, it decreases the count of allocated pmds. The result is that the
counter underflows and this error is reported.

This patch fixes the bug by artifically incrementing the counter in
pmd_free when the kernel tries to free the preallocated pmd.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2015-03-23 12:28:15 +01:00
Andy Lutomirski
d74ef1118a x86/asm/entry: Replace some open-coded VM86 checks with v8086_mode() checks
This allows us to remove some unnecessary ifdefs.  There should
be no change to the generated code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f7e00f0d668e253abf0bd8bf36491ac47bd761ff.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:14:40 +01:00
Andy Lutomirski
7a2806741e x86/asm/entry: Remove user_mode_vm()
It has no callers anymore.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a594afd6a0bddb1311bd7c92a15201c87fbb8681.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:14:33 +01:00
Andy Lutomirski
f39b6f0ef8 x86/asm/entry: Change all 'user_mode_vm()' calls to 'user_mode()'
user_mode_vm() and user_mode() are now the same.  Change all callers
of user_mode_vm() to user_mode().

The next patch will remove the definition of user_mode_vm.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/43b1f57f3df70df5a08b0925897c660725015554.1426728647.git.luto@kernel.org
[ Merged to a more recent kernel. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:14:17 +01:00
Andy Lutomirski
efa7045103 x86/asm/entry: Make user_mode() work correctly if regs came from VM86 mode
user_mode() is now identical to user_mode_vm().  Subsequent patches
will change all callers of user_mode_vm() to user_mode() and then
delete user_mode_vm().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0dd03eacb5f0a2b5ba0240de25347a31b493c289.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:51 +01:00
Andy Lutomirski
ae60f0710a x86/asm/entry: Use user_mode_ignore_vm86() where appropriate
A few of the user_mode() checks in traps.c are immediately after
explicit checks for vm86 mode.  Change them to user_mode_ignore_vm86().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0b324d5b75c3402be07f8d3c6245ed7f4995029e.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:46 +01:00
Andy Lutomirski
383f3af3f8 x86/asm/entry, perf: Explicitly optimize vm86 handling in code_segment_base()
There's no point in checking the VM bit on 64-bit, and, since
we're explicitly checking it, we can use user_mode_ignore_vm86()
after the check.

While we're at it, rearrange the #ifdef slightly to make the code
flow a bit clearer.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/dc1457a734feccd03a19bb3538a7648582f57cdd.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:41 +01:00
Andy Lutomirski
a67e7277d0 x86/asm/entry: Add user_mode_ignore_vm86()
user_mode() is dangerous and user_mode_vm() has a confusing name.

Add user_mode_ignore_vm86() (equivalent to current user_mode()).
We'll change the small number of legitimate users of user_mode()
to user_mode_ignore_vm86().

Inspired by grsec, although this works rather differently.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/202c56ca63823c338af8e2e54948dbe222da6343.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:36 +01:00
Ingo Molnar
e4518ab90f Linux 4.0-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVD1VGAAoJEHm+PkMAQRiG7yoH/juKOQ1zbxi5M+mleDEEJtA0
 RxQSojqEMWIKrWi8PNZxjENn1OZB6XOLIXOhlyAZBrmgsjO34p1DyXlZMznr/R8W
 kQ2Xxs061hRtB3OuruMIqOApUrjuqsaCwgbgUS1qWmqZcoyZN4oELyZMP6OOlqv5
 UUBZm8MfyXGyxrCcg39mjct3VEOhiuEcvL6SUxOC380CdSVAnyqHFPcz0JVqMUn9
 9RUBs0T9cMdhb0mZ2bfXzt6AKArj63G2nXOum+VzFcvspSm2U+MPIDCuoE+ZbTPS
 jqIAgG0rj1ezRyb5oeJrvlU0Yy3u/cXoMPs9+kORvpladooYNLti8ovh6qllm0I=
 =d/ye
 -----END PGP SIGNATURE-----

Merge tag 'v4.0-rc5' into x86/asm, to resolve conflicts

Conflicts:
	arch/x86/kernel/entry_64.S

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:15 +01:00
Baruch Siach
00e6743b5b irqchip: digicolor: Move digicolor_set_gc to init section
The digicolor_set_gc() routine is only called from __init annotated
digicolor_of_init(). Annotate digicolor_set_gc() with __init as well to save a
few bytes at run time.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lkml.kernel.org/r/a3b57ecdbe0b07f55c20c07ff98f1f694275722d.1427009985.git.baruch@tkos.co.il
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-23 10:00:52 +00:00
Helge Deller
47514da3ac parisc: Add compile-time check when adding new syscalls
Signed-off-by: Helge Deller <deller@gmx.de>
2015-03-23 10:57:25 +01:00
Steven Rostedt
b6366f048e sched/rt: Use IPI to trigger RT task push migration instead of pulling
When debugging the latencies on a 40 core box, where we hit 300 to
500 microsecond latencies, I found there was a huge contention on the
runqueue locks.

Investigating it further, running ftrace, I found that it was due to
the pulling of RT tasks.

The test that was run was the following:

 cyclictest --numa -p95 -m -d0 -i100

This created a thread on each CPU, that would set its wakeup in iterations
of 100 microseconds. The -d0 means that all the threads had the same
interval (100us). Each thread sleeps for 100us and wakes up and measures
its latencies.

cyclictest is maintained at:
 git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git

What happened was another RT task would be scheduled on one of the CPUs
that was running our test, when the other CPU tests went to sleep and
scheduled idle. This caused the "pull" operation to execute on all
these CPUs. Each one of these saw the RT task that was overloaded on
the CPU of the test that was still running, and each one tried
to grab that task in a thundering herd way.

To grab the task, each thread would do a double rq lock grab, grabbing
its own lock as well as the rq of the overloaded CPU. As the sched
domains on this box was rather flat for its size, I saw up to 12 CPUs
block on this lock at once. This caused a ripple affect with the
rq locks especially since the taking was done via a double rq lock, which
means that several of the CPUs had their own rq locks held while trying
to take this rq lock. As these locks were blocked, any wakeups or load
balanceing on these CPUs would also block on these locks, and the wait
time escalated.

I've tried various methods to lessen the load, but things like an
atomic counter to only let one CPU grab the task wont work, because
the task may have a limited affinity, and we may pick the wrong
CPU to take that lock and do the pull, to only find out that the
CPU we picked isn't in the task's affinity.

Instead of doing the PULL, I now have the CPUs that want the pull to
send over an IPI to the overloaded CPU, and let that CPU pick what
CPU to push the task to. No more need to grab the rq lock, and the
push/pull algorithm still works fine.

With this patch, the latency dropped to just 150us over a 20 hour run.
Without the patch, the huge latencies would trigger in seconds.

I've created a new sched feature called RT_PUSH_IPI, which is enabled
by default.

When RT_PUSH_IPI is not enabled, the old method of grabbing the rq locks
and having the pulling CPU do the work is implemented. When RT_PUSH_IPI
is enabled, the IPI is sent to the overloaded CPU to do a push.

To enabled or disable this at run time:

 # mount -t debugfs nodev /sys/kernel/debug
 # echo RT_PUSH_IPI > /sys/kernel/debug/sched_features
or
 # echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched_features

Update: This original patch would send an IPI to all CPUs in the RT overload
list. But that could theoretically cause the reverse issue. That is, there
could be lots of overloaded RT queues and one CPU lowers its priority. It would
then send an IPI to all the overloaded RT queues and they could then all try
to grab the rq lock of the CPU lowering its priority, and then we have the
same problem.

The latest design sends out only one IPI to the first overloaded CPU. It tries to
push any tasks that it can, and then looks for the next overloaded CPU that can
push to the source CPU. The IPIs stop when all overloaded CPUs that have pushable
tasks that have priorities greater than the source CPU are covered. In case the
source CPU lowers its priority again, a flag is set to tell the IPI traversal to
restart with the first RT overloaded CPU after the source CPU.

Parts-suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Joern Engel <joern@purestorage.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150318144946.2f3cc982@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:55:22 +01:00
Steven Rostedt
71ad00d61e irq_work: Fix build failure when CONFIG_IRQ_WORK is not defined
When CONFIG_IRQ_WORK is not defined (difficult to do, as it also
requires CONFIG_PRINTK not to be defined), we get a build failure:

	kernel/built-in.o: In function `flush_smp_call_function_queue':
	kernel/smp.c:263: undefined reference to `irq_work_run'
	kernel/smp.c:263: undefined reference to `irq_work_run'
	Makefile:933: recipe for target 'vmlinux' failed

Simplest thing to do is to make irq_work_run() a nop when not set.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150319101851.4d224d9b@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:55:21 +01:00
Ingo Molnar
e1b63dec2d Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying new patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:50:29 +01:00
Preeti U Murthy
a127d2bcf1 timers/tick/broadcast-hrtimer: Fix suspicious RCU usage in idle loop
The hrtimer mode of broadcast queues hrtimers in the idle entry
path so as to wakeup cpus in deep idle states. The associated
call graph is :

	cpuidle_idle_call()
	|____ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, ....))
	     |_____tick_broadcast_set_event()
		   |____clockevents_program_event()
			|____bc_set_next()

The hrtimer_{start/cancel} functions call into tracing which uses RCU.
But it is not legal to call into RCU in cpuidle because it is one of the
quiescent states. Hence protect this region with RCU_NONIDLE which informs
RCU that the cpu is momentarily non-idle.

As an aside it is helpful to point out that the clock event device that is
programmed here is not a per-cpu clock device; it is a
pseudo clock device, used by the broadcast framework alone.
The per-cpu clock device programming never goes through bc_set_next().

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Cc: mpe@ellerman.id.au
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/r/20150318104705.17763.56668.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:50:05 +01:00
Peter Zijlstra
35a9393c95 lockdep: Fix the module unload key range freeing logic
Module unload calls lockdep_free_key_range(), which removes entries
from the data structures. Most of the lockdep code OTOH assumes the
data structures are append only; in specific see the comments in
add_lock_to_list() and look_up_lock_class().

Clearly this has only worked by accident; make it work proper. The
actual scenario to make it go boom would involve the memory freed by
the module unlock being re-allocated and re-used for a lock inside of
a rcu-sched grace period. This is a very unlikely scenario, still
better plug the hole.

Use RCU list iteration in all places and ammend the comments.

Change lockdep_free_key_range() to issue a sync_sched() between
removal from the lists and returning -- which results in the memory
being freed. Further ensure the callers are placed correctly and
comment the requirements.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Tsyvarev <tsyvarev@ispras.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:49:07 +01:00
Brian Silverman
746db9443e sched: Fix RLIMIT_RTTIME when PI-boosting to RT
When non-realtime tasks get priority-inheritance boosted to a realtime
scheduling class, RLIMIT_RTTIME starts to apply to them. However, the
counter used for checking this (the same one used for SCHED_RR
timeslices) was not getting reset. This meant that tasks running with a
non-realtime scheduling class which are repeatedly boosted to a realtime
one, but never block while they are running realtime, eventually hit the
timeout without ever running for a time over the limit. This patch
resets the realtime timeslice counter when un-PI-boosting from an RT to
a non-RT scheduling class.

I have some test code with two threads and a shared PTHREAD_PRIO_INHERIT
mutex which induces priority boosting and spins while boosted that gets
killed by a SIGXCPU on non-fixed kernels but doesn't with this patch
applied. It happens much faster with a CONFIG_PREEMPT_RT kernel, and
does happen eventually with PREEMPT_VOLUNTARY kernels.

Signed-off-by: Brian Silverman <brian@peloton-tech.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: austin@peloton-tech.com
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1424305436-6716-1-git-send-email-brian@peloton-tech.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:47:55 +01:00
Peter Zijlstra
d525211f9d perf: Fix irq_work 'tail' recursion
Vince reported a watchdog lockup like:

	[<ffffffff8115e114>] perf_tp_event+0xc4/0x210
	[<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
	[<ffffffff810b7f10>] lock_release+0x130/0x260
	[<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40
	[<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80
	[<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0
	[<ffffffff811f71ce>] send_sigio+0xae/0x100
	[<ffffffff811f72b7>] kill_fasync+0x97/0xf0
	[<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
	[<ffffffff8115d103>] perf_pending_event+0x33/0x60
	[<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
	[<ffffffff8114e448>] irq_work_run+0x18/0x40
	[<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
	[<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80

Which is caused by an irq_work generating new irq_work and therefore
not allowing forward progress.

This happens because processing the perf irq_work triggers another
perf event (tracepoint stuff) which in turn generates an irq_work ad
infinitum.

Avoid this by raising the recursion counter in the irq_work -- which
effectively disables all software events (including tracepoints) from
actually triggering again.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20150219170311.GH21418@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:46:32 +01:00
Geert Uytterhoeven
5593ce64d8 irqchip: renesas-irqc: Add functional clock to bindings
The external IRQ controller has a functional clock, which is used for
power management. Document it.

Fix a typo in the r8a73a4 SoC name while we're at it.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Link: https://lkml.kernel.org/r/1426704961-27322-4-git-send-email-geert+renesas@glider.be
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-23 09:35:01 +00:00
Geert Uytterhoeven
51b05f6b8a irqchip: renesas-irqc: Add minimal runtime PM support
This is just enough to let pm_clk_*() enable the functional clock, and
manage it for suspend/resume, if present.
Before, it was assumed enabled by the bootloader or reset state.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Link: https://lkml.kernel.org/r/1426704961-27322-3-git-send-email-geert+renesas@glider.be
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-23 09:34:43 +00:00
Geert Uytterhoeven
1cd5ec7330 irqchip: renesas-irqc: Add more register documentation
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Link: https://lkml.kernel.org/r/1426704961-27322-2-git-send-email-geert+renesas@glider.be
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-23 09:34:32 +00:00
Arjun Sreedharan
1c1d046be6 x86/boot: Standardize strcmp()
strcmp() is always expected to return 0 when arguments are equal,
negative when its first argument @str1 is less than its second argument
@str2 and a positive value otherwise. Previously strcmp("a", "b")
returned 1. Now it gives -1, as it is supposed to.

Until now this bug never triggered, because all uses for strcmp() in the
boot code tested for nonzero:

  triton:~/tip> git grep strcmp arch/x86/boot/
  arch/x86/boot/boot.h:int strcmp(const char *str1, const char *str2);
  arch/x86/boot/edd.c:            if (!strcmp(eddarg, "skipmbr") || !strcmp(eddarg, "skip")) {
  arch/x86/boot/edd.c:            else if (!strcmp(eddarg, "off"))
  arch/x86/boot/edd.c:            else if (!strcmp(eddarg, "on"))

should in the future strcmp() be used in a comparative way in the boot
code, it might have led to (not so subtle) bugs.

Signed-off-by: Arjun Sreedharan <arjun024@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426520267-1803-1-git-send-email-arjun024@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:24:12 +01:00
Sudeep Holla
37dea8c52c x86/cpu/cacheinfo: Fix cache_get_priv_group() for Intel processors
The private pointer provided by the cacheinfo code is used to implement
the AMD L3 cache-specific attributes using a pointer to the northbridge
descriptor. It is needed for performing L3-specific operations and for
that we need a couple of PCI devices and other service information, all
contained in the northbridge descriptor.

This results in failure of cacheinfo setup as shown below as
cache_get_priv_group() returns the uninitialised private attributes which
are not valid for Intel processors.

  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 1 at fs/sysfs/group.c:102
  internal_create_group+0x151/0x280()
  sysfs: (bin_)attrs not set by subsystem for group: index3/
  Modules linked in:
  CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc3+ #1
  Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 05/11/2014
  ...
  Call Trace:
    dump_stack
    warn_slowpath_common
    warn_slowpath_fmt
    internal_create_group
    sysfs_create_groups
    device_add
    cpu_device_create
    ? __kmalloc
    cache_add_dev
    cacheinfo_sysfs_init
    ? container_dev_init
    do_one_initcall
    kernel_init_freeable
    ? rest_init
    kernel_init
    ret_from_fork
    ? rest_init

This patch fixes the issue by checking if the L3 cache indices are
populated correctly (AMD-specific) before initializing the private
attributes.

Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:22:38 +01:00
Borislav Petkov
c9ce871283 x86/mce: Reindent __mcheck_cpu_apply_quirks() properly
Had some strange 3 tabs + 2 chars indentation, probably from me. Fix it.

No code changed:

  # arch/x86/kernel/cpu/mcheck/mce.o:

   text    data     bss     dec     hex filename
  21371    5923     264   27558    6ba6 mce.o.before
  21371    5923     264   27558    6ba6 mce.o.after

md5:
   eb3996c84d15e08ed836f043df2cbb01  mce.o.before.asm
   eb3996c84d15e08ed836f043df2cbb01  mce.o.after.asm

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:16:44 +01:00
Jesse Larrew
f77ac507f8 x86/mce: Use safe MSR accesses for AMD quirk
Certain MSRs are only relevant to a kernel in host mode, and kvm had
chosen not to implement these MSRs at all for guests. If a guest kernel
ever tried to access these MSRs, the result was a general protection
fault.

KVM will be separately patched to return 0 when these MSRs are read,
and this patch ensures that MSR accesses are tolerant of exceptions.

Signed-off-by: Jesse Larrew <jesse.larrew@amd.com>
[ Drop {} braces around loop ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Joel Schopp <joel.schopp@amd.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/1426262619-5016-1-git-send-email-jesse.larrew@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:16:43 +01:00