linux/arch/mips/loongson64/Platform
Huacai Chen e02e07e312
MIPS: Loongson: Introduce and use loongson_llsc_mb()
On the Loongson-2G/2H/3A/3B there is a hardware flaw that ll/sc and
lld/scd is very weak ordering. We should add sync instructions "before
each ll/lld" and "at the branch-target between ll/sc" to workaround.
Otherwise, this flaw will cause deadlock occasionally (e.g. when doing
heavy load test with LTP).

Below is the explaination of CPU designer:

"For Loongson 3 family, when a memory access instruction (load, store,
or prefetch)'s executing occurs between the execution of LL and SC, the
success or failure of SC is not predictable. Although programmer would
not insert memory access instructions between LL and SC, the memory
instructions before LL in program-order, may dynamically executed
between the execution of LL/SC, so a memory fence (SYNC) is needed
before LL/LLD to avoid this situation.

Since Loongson-3A R2 (3A2000), we have improved our hardware design to
handle this case. But we later deduce a rarely circumstance that some
speculatively executed memory instructions due to branch misprediction
between LL/SC still fall into the above case, so a memory fence (SYNC)
at branch-target (if its target is not between LL/SC) is needed for
Loongson 3A1000, 3B1500, 3A2000 and 3A3000.

Our processor is continually evolving and we aim to to remove all these
workaround-SYNCs around LL/SC for new-come processor."

Here is an example:

Both cpu1 and cpu2 simutaneously run atomic_add by 1 on same atomic var,
this bug cause both 'sc' run by two cpus (in atomic_add) succeed at same
time('sc' return 1), and the variable is only *added by 1*, sometimes,
which is wrong and unacceptable(it should be added by 2).

Why disable fix-loongson3-llsc in compiler?
Because compiler fix will cause problems in kernel's __ex_table section.

This patch fix all the cases in kernel, but:

+. the fix at the end of futex_atomic_cmpxchg_inatomic is for branch-target
of 'bne', there other cases which smp_mb__before_llsc() and smp_llsc_mb() fix
the ll and branch-target coincidently such as atomic_sub_if_positive/
cmpxchg/xchg, just like this one.

+. Loongson 3 does support CONFIG_EDAC_ATOMIC_SCRUB, so no need to touch
edac.h

+. local_ops and cmpxchg_local should not be affected by this bug since
only the owner can write.

+. mips_atomic_set for syscall.c is deprecated and rarely used, just let
it go

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Huang Pei <huangpei@loongson.cn>
[paul.burton@mips.com:
  - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add
    a comment describing why it's there.
  - Make loongson_llsc_mb() a no-op when
    CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory
    barrier.
  - Add a comment describing the bug & how loongson_llsc_mb() helps
    in asm/barrier.h.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: ambrosehua@gmail.com
Cc: Steven J . Hill <Steven.Hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Li Xuefeng <lixuefeng@loongson.cn>
Cc: Xu Chenghua <xuchenghua@loongson.cn>
2019-02-04 10:53:34 -08:00

78 lines
3.0 KiB
Plaintext

#
# Loongson Processors' Support
#
# Only gcc >= 4.4 have Loongson specific support
cflags-$(CONFIG_CPU_LOONGSON2) += -Wa,--trap
cflags-$(CONFIG_CPU_LOONGSON2E) += \
$(call cc-option,-march=loongson2e,-march=r4600)
cflags-$(CONFIG_CPU_LOONGSON2F) += \
$(call cc-option,-march=loongson2f,-march=r4600)
# Enable the workarounds for Loongson2f
ifdef CONFIG_CPU_LOONGSON2F_WORKAROUNDS
ifeq ($(call as-option,-Wa$(comma)-mfix-loongson2f-nop,),)
$(error only binutils >= 2.20.2 have needed option -mfix-loongson2f-nop)
else
cflags-$(CONFIG_CPU_NOP_WORKAROUNDS) += -Wa$(comma)-mfix-loongson2f-nop
endif
ifeq ($(call as-option,-Wa$(comma)-mfix-loongson2f-jump,),)
$(error only binutils >= 2.20.2 have needed option -mfix-loongson2f-jump)
else
cflags-$(CONFIG_CPU_JUMP_WORKAROUNDS) += -Wa$(comma)-mfix-loongson2f-jump
endif
endif
cflags-$(CONFIG_CPU_LOONGSON3) += -Wa,--trap
#
# Some versions of binutils, not currently mainline as of 2019/02/04, support
# an -mfix-loongson3-llsc flag which emits a sync prior to each ll instruction
# to work around a CPU bug (see loongson_llsc_mb() in asm/barrier.h for a
# description).
#
# We disable this in order to prevent the assembler meddling with the
# instruction that labels refer to, ie. if we label an ll instruction:
#
# 1: ll v0, 0(a0)
#
# ...then with the assembler fix applied the label may actually point at a sync
# instruction inserted by the assembler, and if we were using the label in an
# exception table the table would no longer contain the address of the ll
# instruction.
#
# Avoid this by explicitly disabling that assembler behaviour. If upstream
# binutils does not merge support for the flag then we can revisit & remove
# this later - for now it ensures vendor toolchains don't cause problems.
#
cflags-$(CONFIG_CPU_LOONGSON3) += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
#
# binutils from v2.25 on and gcc starting from v4.9.0 treat -march=loongson3a
# as MIPS64 R2; older versions as just R1. This leaves the possibility open
# that GCC might generate R2 code for -march=loongson3a which then is rejected
# by GAS. The cc-option can't probe for this behaviour so -march=loongson3a
# can't easily be used safely within the kbuild framework.
#
ifeq ($(call cc-ifversion, -ge, 0409, y), y)
ifeq ($(call ld-ifversion, -ge, 225000000, y), y)
cflags-$(CONFIG_CPU_LOONGSON3) += \
$(call cc-option,-march=loongson3a -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS64)
else
cflags-$(CONFIG_CPU_LOONGSON3) += \
$(call cc-option,-march=mips64r2,-mips64r2 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS64)
endif
else
cflags-$(CONFIG_CPU_LOONGSON3) += \
$(call cc-option,-march=mips64r2,-mips64r2 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS64)
endif
#
# Loongson Machines' Support
#
platform-$(CONFIG_MACH_LOONGSON64) += loongson64/
cflags-$(CONFIG_MACH_LOONGSON64) += -I$(srctree)/arch/mips/include/asm/mach-loongson64 -mno-branch-likely
load-$(CONFIG_LEMOTE_FULOONG2E) += 0xffffffff80100000
load-$(CONFIG_LEMOTE_MACH2F) += 0xffffffff80200000
load-$(CONFIG_LOONGSON_MACH3X) += 0xffffffff80200000