Commit Graph

1059871 Commits

Author SHA1 Message Date
Peter Zijlstra
87c87ecd00 bpf,x86: Respect X86_FEATURE_RETPOLINE*
Current BPF codegen doesn't respect X86_FEATURE_RETPOLINE* flags and
unconditionally emits a thunk call, this is sub-optimal and doesn't
match the regular, compiler generated, code.

Update the i386 JIT to emit code equal to what the compiler emits for
the regular kernel text (IOW. a plain THUNK call).

Update the x86_64 JIT to emit code similar to the result of compiler
and kernel rewrites as according to X86_FEATURE_RETPOLINE* flags.
Inlining RETPOLINE_AMD (lfence; jmp *%reg) and !RETPOLINE (jmp *%reg),
while doing a THUNK call for RETPOLINE.

This removes the hard-coded retpoline thunks and shrinks the generated
code. Leaving a single retpoline thunk definition in the kernel.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.614772675@infradead.org
2021-10-28 23:25:29 +02:00
Peter Zijlstra
dceba0817c bpf,x86: Simplify computing label offsets
Take an idea from the 32bit JIT, which uses the multi-pass nature of
the JIT to compute the instruction offsets on a prior pass in order to
compute the relative jump offsets on a later pass.

Application to the x86_64 JIT is slightly more involved because the
offsets depend on program variables (such as callee_regs_used and
stack_depth) and hence the computed offsets need to be kept in the
context of the JIT.

This removes, IMO quite fragile, code that hard-codes the offsets and
tries to compute the length of variable parts of it.

Convert both emit_bpf_tail_call_*() functions which have an out: label
at the end. Additionally emit_bpt_tail_call_direct() also has a poke
table entry, for which it computes the offset from the end (and thus
already relies on the previous pass to have computed addrs[i]), also
convert this to be a forward based offset.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.552304864@infradead.org
2021-10-28 23:25:29 +02:00
Peter Zijlstra
f8a66d608a x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Currently Linux prevents usage of retpoline,amd on !AMD hardware, this
is unfriendly and gets in the way of testing. Remove this restriction.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.487348118@infradead.org
2021-10-28 23:25:29 +02:00
Peter Zijlstra
d4b5a5c993 x86/alternative: Add debug prints to apply_retpolines()
Make sure we can see the text changes when booting with
'debug-alternative'.

Example output:

 [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20
 [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00
 [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01
 [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.org
2021-10-28 23:25:28 +02:00
Peter Zijlstra
bbe2df3f6b x86/alternative: Try inline spectre_v2=retpoline,amd
Try and replace retpoline thunk calls with:

  LFENCE
  CALL    *%\reg

for spectre_v2=retpoline,amd.

Specifically, the sequence above is 5 bytes for the low 8 registers,
but 6 bytes for the high 8 registers. This means that unless the
compilers prefix stuff the call with higher registers this replacement
will fail.

Luckily GCC strongly favours RAX for the indirect calls and most (95%+
for defconfig-x86_64) will be converted. OTOH clang strongly favours
R11 and almost nothing gets converted.

Note: it will also generate a correct replacement for the Jcc.d32
case, except unless the compilers start to prefix stuff that, it'll
never fit. Specifically:

  Jncc.d8 1f
  LFENCE
  JMP     *%\reg
1:

is 7-8 bytes long, where the original instruction in unpadded form is
only 6 bytes.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org
2021-10-28 23:25:28 +02:00
Peter Zijlstra
2f0cbb2a8e x86/alternative: Handle Jcc __x86_indirect_thunk_\reg
Handle the rare cases where the compiler (clang) does an indirect
conditional tail-call using:

  Jcc __x86_indirect_thunk_\reg

For the !RETPOLINE case this can be rewritten to fit the original (6
byte) instruction like:

  Jncc.d8	1f
  JMP		*%\reg
  NOP
1:

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.296470217@infradead.org
2021-10-28 23:25:28 +02:00
Peter Zijlstra
7508500900 x86/alternative: Implement .retpoline_sites support
Rewrite retpoline thunk call sites to be indirect calls for
spectre_v2=off. This ensures spectre_v2=off is as near to a
RETPOLINE=n build as possible.

This is the replacement for objtool writing alternative entries to
ensure the same and achieves feature-parity with the previous
approach.

One noteworthy feature is that it relies on the thunks to be in
machine order to compute the register index.

Specifically, this does not yet address the Jcc __x86_indirect_thunk_*
calls generated by clang, a future patch will add this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.232495794@infradead.org
2021-10-28 23:25:27 +02:00
Peter Zijlstra
1a6f74429c x86/retpoline: Create a retpoline thunk array
Stick all the retpolines in a single symbol and have the individual
thunks as inner labels, this should guarantee thunk order and layout.

Previously there were 16 (or rather 15 without rsp) separate symbols and
a toolchain might reasonably expect it could displace them however it
liked, with disregard for their relative position.

However, now they're part of a larger symbol. Any change to their
relative position would disrupt this larger _array symbol and thus not
be sound.

This is the same reasoning used for data symbols. On their own there
is no guarantee about their relative position wrt to one aonther, but
we're still able to do arrays because an array as a whole is a single
larger symbol.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.169659320@infradead.org
2021-10-28 23:25:27 +02:00
Peter Zijlstra
6fda8a3886 x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h
Because it makes no sense to split the retpoline gunk over multiple
headers.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.106290934@infradead.org
2021-10-28 23:25:27 +02:00
Peter Zijlstra
b6d3d9944b x86/asm: Fixup odd GEN-for-each-reg.h usage
Currently GEN-for-each-reg.h usage leaves GEN defined, relying on any
subsequent usage to start with #undef, which is rude.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.041792350@infradead.org
2021-10-28 23:25:26 +02:00
Peter Zijlstra
a92ede2d58 x86/asm: Fix register order
Ensure the register order is correct; this allows for easy translation
between register number and trampoline and vice-versa.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.978573921@infradead.org
2021-10-28 23:25:26 +02:00
Peter Zijlstra
4fe79e710d x86/retpoline: Remove unused replacement symbols
Now that objtool no longer creates alternatives, these replacement
symbols are no longer needed, remove them.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.915051744@infradead.org
2021-10-28 23:25:26 +02:00
Peter Zijlstra
134ab5bd18 objtool,x86: Replace alternatives with .retpoline_sites
Instead of writing complete alternatives, simply provide a list of all
the retpoline thunk calls. Then the kernel is free to do with them as
it pleases. Simpler code all-round.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.850007165@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
c509331b41 objtool: Shrink struct instruction
Any one instruction can only ever call a single function, therefore
insn->mcount_loc_node is superfluous and can use insn->call_node.

This shrinks struct instruction, which is by far the most numerous
structure objtool creates.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.785456706@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
dd003edeff objtool: Explicitly avoid self modifying code in .altinstr_replacement
Assume ALTERNATIVE()s know what they're doing and do not change, or
cause to change, instructions in .altinstr_replacement sections.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.722511775@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
1739c66eb7 objtool: Classify symbols
In order to avoid calling str*cmp() on symbol names, over and over, do
them all once upfront and store the result.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.658539311@infradead.org
2021-10-28 23:25:24 +02:00
Ulf Hansson
348ecd6177 Merge branch 'fixes' into next 2021-10-28 23:20:27 +02:00
Wolfram Sang
90935eb303 mmc: tmio: reenable card irqs after the reset callback
The reset callback may clear the internal card detect interrupts, so
make sure to reenable them if needed.

Fixes: b4d86f37ea ("mmc: renesas_sdhi: do hard reset if possible")
Reported-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211028195149.8003-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-28 23:19:32 +02:00
Joanne Koong
f44bc543a0 bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out bloom filter
This patch adds benchmark tests for comparing the performance of hashmap
lookups without the bloom filter vs. hashmap lookups with the bloom filter.

Checking the bloom filter first for whether the element exists should
overall enable a higher throughput for hashmap lookups, since if the
element does not exist in the bloom filter, we can avoid a costly lookup in
the hashmap.

On average, using 5 hash functions in the bloom filter tended to perform
the best across the widest range of different entry sizes. The benchmark
results using 5 hash functions (running on 8 threads on a machine with one
numa node, and taking the average of 3 runs) were roughly as follows:

value_size = 4 bytes -
	10k entries: 30% faster
	50k entries: 40% faster
	100k entries: 40% faster
	500k entres: 70% faster
	1 million entries: 90% faster
	5 million entries: 140% faster

value_size = 8 bytes -
	10k entries: 30% faster
	50k entries: 40% faster
	100k entries: 50% faster
	500k entres: 80% faster
	1 million entries: 100% faster
	5 million entries: 150% faster

value_size = 16 bytes -
	10k entries: 20% faster
	50k entries: 30% faster
	100k entries: 35% faster
	500k entres: 65% faster
	1 million entries: 85% faster
	5 million entries: 110% faster

value_size = 40 bytes -
	10k entries: 5% faster
	50k entries: 15% faster
	100k entries: 20% faster
	500k entres: 65% faster
	1 million entries: 75% faster
	5 million entries: 120% faster

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-6-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
57fd1c63c9 bpf/benchs: Add benchmark tests for bloom filter throughput + false positive
This patch adds benchmark tests for the throughput (for lookups + updates)
and the false positive rate of bloom filter lookups, as well as some
minor refactoring of the bash script for running the benchmarks.

These benchmarks show that as the number of hash functions increases,
the throughput and the false positive rate of the bloom filter decreases.
>From the benchmark data, the approximate average false-positive rates
are roughly as follows:

1 hash function = ~30%
2 hash functions = ~15%
3 hash functions = ~5%
4 hash functions = ~2.5%
5 hash functions = ~1%
6 hash functions = ~0.5%
7 hash functions  = ~0.35%
8 hash functions = ~0.15%
9 hash functions = ~0.1%
10 hash functions = ~0%

For reference data, the benchmarks run on one thread on a machine
with one numa node for 1 to 5 hash functions for 8-byte and 64-byte
values are as follows:

1 hash function:
  50k entries
	8-byte value
	    Lookups - 51.1 M/s operations
	    Updates - 33.6 M/s operations
	    False positive rate: 24.15%
	64-byte value
	    Lookups - 15.7 M/s operations
	    Updates - 15.1 M/s operations
	    False positive rate: 24.2%
  100k entries
	8-byte value
	    Lookups - 51.0 M/s operations
	    Updates - 33.4 M/s operations
	    False positive rate: 24.04%
	64-byte value
	    Lookups - 15.6 M/s operations
	    Updates - 14.6 M/s operations
	    False positive rate: 24.06%
  500k entries
	8-byte value
	    Lookups - 50.5 M/s operations
	    Updates - 33.1 M/s operations
	    False positive rate: 27.45%
	64-byte value
	    Lookups - 15.6 M/s operations
	    Updates - 14.2 M/s operations
	    False positive rate: 27.42%
  1 mil entries
	8-byte value
	    Lookups - 49.7 M/s operations
	    Updates - 32.9 M/s operations
	    False positive rate: 27.45%
	64-byte value
	    Lookups - 15.4 M/s operations
	    Updates - 13.7 M/s operations
	    False positive rate: 27.58%
  2.5 mil entries
	8-byte value
	    Lookups - 47.2 M/s operations
	    Updates - 31.8 M/s operations
	    False positive rate: 30.94%
	64-byte value
	    Lookups - 15.3 M/s operations
	    Updates - 13.2 M/s operations
	    False positive rate: 30.95%
  5 mil entries
	8-byte value
	    Lookups - 41.1 M/s operations
	    Updates - 28.1 M/s operations
	    False positive rate: 31.01%
	64-byte value
	    Lookups - 13.3 M/s operations
	    Updates - 11.4 M/s operations
	    False positive rate: 30.98%

2 hash functions:
  50k entries
	8-byte value
	    Lookups - 34.1 M/s operations
	    Updates - 20.1 M/s operations
	    False positive rate: 9.13%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.9 M/s operations
	    False positive rate: 9.21%
  100k entries
	8-byte value
	    Lookups - 33.7 M/s operations
	    Updates - 18.9 M/s operations
	    False positive rate: 9.13%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.7 M/s operations
	    False positive rate: 9.19%
  500k entries
	8-byte value
	    Lookups - 32.7 M/s operations
	    Updates - 18.1 M/s operations
	    False positive rate: 12.61%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.5 M/s operations
	    False positive rate: 12.61%
  1 mil entries
	8-byte value
	    Lookups - 30.6 M/s operations
	    Updates - 18.9 M/s operations
	    False positive rate: 12.54%
	64-byte value
	    Lookups - 8.0 M/s operations
	    Updates - 7.0 M/s operations
	    False positive rate: 12.52%
  2.5 mil entries
	8-byte value
	    Lookups - 25.3 M/s operations
	    Updates - 16.7 M/s operations
	    False positive rate: 16.77%
	64-byte value
	    Lookups - 7.9 M/s operations
	    Updates - 6.5 M/s operations
	    False positive rate: 16.88%
  5 mil entries
	8-byte value
	    Lookups - 20.8 M/s operations
	    Updates - 14.7 M/s operations
	    False positive rate: 16.78%
	64-byte value
	    Lookups - 7.0 M/s operations
	    Updates - 6.0 M/s operations
	    False positive rate: 16.78%

3 hash functions:
  50k entries
	8-byte value
	    Lookups - 25.1 M/s operations
	    Updates - 14.6 M/s operations
	    False positive rate: 7.65%
	64-byte value
	    Lookups - 5.8 M/s operations
	    Updates - 5.5 M/s operations
	    False positive rate: 7.58%
  100k entries
	8-byte value
	    Lookups - 24.7 M/s operations
	    Updates - 14.1 M/s operations
	    False positive rate: 7.71%
	64-byte value
	    Lookups - 5.8 M/s operations
	    Updates - 5.3 M/s operations
	    False positive rate: 7.62%
  500k entries
	8-byte value
	    Lookups - 22.9 M/s operations
	    Updates - 13.9 M/s operations
	    False positive rate: 2.62%
	64-byte value
	    Lookups - 5.6 M/s operations
	    Updates - 4.8 M/s operations
	    False positive rate: 2.7%
  1 mil entries
	8-byte value
	    Lookups - 19.8 M/s operations
	    Updates - 12.6 M/s operations
	    False positive rate: 2.60%
	64-byte value
	    Lookups - 5.3 M/s operations
	    Updates - 4.4 M/s operations
	    False positive rate: 2.69%
  2.5 mil entries
	8-byte value
	    Lookups - 16.2 M/s operations
	    Updates - 10.7 M/s operations
	    False positive rate: 4.49%
	64-byte value
	    Lookups - 4.9 M/s operations
	    Updates - 4.1 M/s operations
	    False positive rate: 4.41%
  5 mil entries
	8-byte value
	    Lookups - 18.8 M/s operations
	    Updates - 9.2 M/s operations
	    False positive rate: 4.45%
	64-byte value
	    Lookups - 5.2 M/s operations
	    Updates - 3.9 M/s operations
	    False positive rate: 4.54%

4 hash functions:
  50k entries
	8-byte value
	    Lookups - 19.7 M/s operations
	    Updates - 11.1 M/s operations
	    False positive rate: 1.01%
	64-byte value
	    Lookups - 4.4 M/s operations
	    Updates - 4.0 M/s operations
	    False positive rate: 1.00%
  100k entries
	8-byte value
	    Lookups - 19.5 M/s operations
	    Updates - 10.9 M/s operations
	    False positive rate: 1.00%
	64-byte value
	    Lookups - 4.3 M/s operations
	    Updates - 3.9 M/s operations
	    False positive rate: 0.97%
  500k entries
	8-byte value
	    Lookups - 18.2 M/s operations
	    Updates - 10.6 M/s operations
	    False positive rate: 2.05%
	64-byte value
	    Lookups - 4.3 M/s operations
	    Updates - 3.7 M/s operations
	    False positive rate: 2.05%
  1 mil entries
	8-byte value
	    Lookups - 15.5 M/s operations
	    Updates - 9.6 M/s operations
	    False positive rate: 1.99%
	64-byte value
	    Lookups - 4.0 M/s operations
	    Updates - 3.4 M/s operations
	    False positive rate: 1.99%
  2.5 mil entries
	8-byte value
	    Lookups - 13.8 M/s operations
	    Updates - 7.7 M/s operations
	    False positive rate: 3.91%
	64-byte value
	    Lookups - 3.7 M/s operations
	    Updates - 3.6 M/s operations
	    False positive rate: 3.78%
  5 mil entries
	8-byte value
	    Lookups - 13.0 M/s operations
	    Updates - 6.9 M/s operations
	    False positive rate: 3.93%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.7 M/s operations
	    False positive rate: 3.39%

5 hash functions:
  50k entries
	8-byte value
	    Lookups - 16.4 M/s operations
	    Updates - 9.1 M/s operations
	    False positive rate: 0.78%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.2 M/s operations
	    False positive rate: 0.77%
  100k entries
	8-byte value
	    Lookups - 16.3 M/s operations
	    Updates - 9.0 M/s operations
	    False positive rate: 0.79%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.2 M/s operations
	    False positive rate: 0.78%
  500k entries
	8-byte value
	    Lookups - 15.1 M/s operations
	    Updates - 8.8 M/s operations
	    False positive rate: 1.82%
	64-byte value
	    Lookups - 3.4 M/s operations
	    Updates - 3.0 M/s operations
	    False positive rate: 1.78%
  1 mil entries
	8-byte value
	    Lookups - 13.2 M/s operations
	    Updates - 7.8 M/s operations
	    False positive rate: 1.81%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.8 M/s operations
	    False positive rate: 1.80%
  2.5 mil entries
	8-byte value
	    Lookups - 10.5 M/s operations
	    Updates - 5.9 M/s operations
	    False positive rate: 0.29%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.4 M/s operations
	    False positive rate: 0.28%
  5 mil entries
	8-byte value
	    Lookups - 9.6 M/s operations
	    Updates - 5.7 M/s operations
	    False positive rate: 0.30%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.7 M/s operations
	    False positive rate: 0.30%

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-5-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
ed9109ad64 selftests/bpf: Add bloom filter map test cases
This patch adds test cases for bpf bloom filter maps. They include tests
checking against invalid operations by userspace, tests for using the
bloom filter map as an inner map, and a bpf program that queries the
bloom filter map for values added by a userspace program.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-4-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
47512102cd libbpf: Add "map_extra" as a per-map-type extra flag
This patch adds the libbpf infrastructure for supporting a
per-map-type "map_extra" field, whose definition will be
idiosyncratic depending on map type.

For example, for the bloom filter map, the lower 4 bits of
map_extra is used to denote the number of hash functions.

Please note that until libbpf 1.0 is here, the
"bpf_create_map_params" struct is used as a temporary
means for propagating the map_extra field to the kernel.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
9330986c03 bpf: Add bloom filter map implementation
This patch adds the kernel-side changes for the implementation of
a bpf bloom filter map.

The bloom filter map supports peek (determining whether an element
is present in the map) and push (adding an element to the map)
operations.These operations are exposed to userspace applications
through the already existing syscalls in the following way:

BPF_MAP_LOOKUP_ELEM -> peek
BPF_MAP_UPDATE_ELEM -> push

The bloom filter map does not have keys, only values. In light of
this, the bloom filter map's API matches that of queue stack maps:
user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
APIs to query or add an element to the bloom filter map. When the
bloom filter map is created, it must be created with a key_size of 0.

For updates, the user will pass in the element to add to the map
as the value, with a NULL key. For lookups, the user will pass in the
element to query in the map as the value, with a NULL key. In the
verifier layer, this requires us to modify the argument type of
a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE;
as well, in the syscall layer, we need to copy over the user value
so that in bpf_map_peek_elem, we know which specific value to query.

A few things to please take note of:
 * If there are any concurrent lookups + updates, the user is
responsible for synchronizing this to ensure no false negative lookups
occur.
 * The number of hashes to use for the bloom filter is configurable from
userspace. If no number is specified, the default used will be 5 hash
functions. The benchmarks later in this patchset can help compare the
performance of using different number of hashes on different entry
sizes. In general, using more hashes decreases both the false positive
rate and the speed of a lookup.
 * Deleting an element in the bloom filter map is not supported.
 * The bloom filter map may be used as an inner map.
 * The "max_entries" size that is specified at map creation time is used
to approximate a reasonable bitmap size for the bloom filter, and is not
otherwise strictly enforced. If the user wishes to insert more entries
into the bloom filter than "max_entries", they may do so but they should
be aware that this may lead to a higher false positive rate.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Marc Zyngier
11e45471ab Merge branch irq/misc-5.16 into irq/irqchip-next
* irq/misc-5.16:
  : .
  : Misc irqchip fixes for 5.16:
  : - MAINTAINERS update for the ARM VIC DT binding
  : - Allow drivers using the IRQCHIP_PLATFORM_DRIVER_BEGIN/END
  :   infrastructure to use COMPILE_TEST without CONFIG_OF
  : - DT updates
  : - Detangle h8300 linux/irqchip.h inclusion
  : .
  h8300: Fix linux/irqchip.h include mess
  dt-bindings: irqchip: renesas-irqc: Document r8a774e1 bindings
  irqchip: Fix compile-testing without CONFIG_OF
  MAINTAINERS: update arm,vic.yaml reference

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-28 21:11:34 +01:00
Marc Zyngier
837d7a8fe8 h8300: Fix linux/irqchip.h include mess
h8300 drags linux/irqchip.h from asm/irq.h, which is in general a bad
idea (asm/*.h should avoid dragging linux/*.h, as it is usually supposed
to work the other way around).

Move the inclusion of linux/irqchip.h to the single location where it
actually matters in the arch code.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211028172849.GA701812@roeck-us.net
2021-10-28 21:02:48 +01:00
Thorsten Leemhuis
1f57bd42b7 docs: submitting-patches: make section about the Link: tag more explicit
Mention the 'Link' tag in the section about adding URLs to the commit
msg, to make it clearer they "_primarily_ [...] should be about
background", as Linus recently stated (see the link below). That makes
the explanation also easier to find with a text search. For the same
reason and to improve comprehensibility provide an example, too.

Slightly improve the text at the same time to make it more obvious
developers are meant to add links to issue reports in mailing list
archives, as those allow regression tracking efforts to automatically
check which bugs got resolved.

Move the section also downwards slightly, to reduce jumping back and
forth between aspects relevant for the top and the bottom part of the
commit msg.

Link: https://lore.kernel.org/lkml/CAHk-=wgBhyLhQLPem1vybKNt7BKP+=qF=veBgc7VirZaXn4FUw@mail.gmail.com/
CC: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://lore.kernel.org/r/27105768dc19b395e7c8e7a80d056d1ff9c570d0.1635152553.git.linux@leemhuis.info
[jc: tweaked wording following Konstantin's recommendation]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-10-28 13:53:47 -06:00
Linus Torvalds
f31531e554 drm fixes for 5.15 final
MAINTAINERS:
 - change the paths
 
 ttm:
 - Fix fence leak in ttm_transfered_destroy.
 
 core:
 - add GPD Win3 rotation quirk
 
 i915:
 - Remove unconditional clflushes
 - Fix oops on boot due to sync state on disabled DP encoders
 - Revert backend specific data added to tracepoints
 - Remove useless and incorrect memory frequence calculation
 
 panel:
 - Add quirk for Aya Neo 2021
 
 seltest:
 - Reset property count for each drm damage selftest so full run will work correctly.
 
 amdgpu:
 - Fix two potential out of bounds writes in debugfs
 - Fix revision handling for Yellow Carp
 - Display fixes for Yellow Carp
 - Display fixes for DCN 3.1
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmF69VYACgkQDHTzWXnE
 hr73Ug//egxTE5WamGA8o85UEV8MzUKClu9hP6fGrkJRy3xl4wLAV0ooPSgm9RNy
 pkzx8FrRjmTNbDy1kGUdG2mhwTZ0hlyC9PRrczC+HsHKvgwOdswDD9HibWv39V6T
 Ew7oEHk8XOZ6QnSTovRGw9O8pZhFt39Ob+6qR/0cYizG8GpVexdvKzVvIOLRucXY
 DDo151JoUQiKu7KD/Pgc7IVWuw6SllGNHFf1T9wbQILxjUVfNzDGGCgH9csesUno
 kpege371I9nfNJDWZIQPckWrOTZ103Dc7jtezvbtaNJgXUrUmiWqe25rLwvcasuq
 4wBPPwTklxyxAGW8SAf2Hzy4IkuOmC7hm0UpMyMC25tLMfSZjBGnjFNkBSdjooMj
 WSyNd+dBtYdy/Uvg+Jxydsev8/O7my3OGUq6FKWFg2dichUInw6pCdUTvEtMhPdD
 i7F5CjkCYo4+6LWKgCoA+hAvacDjpl1d/ZNzP4MZ7YVPydJdIO2jR4j4AQ/Kcmia
 P3suB6wqSHFgl9LKLNwracU2a6q44aUH6mfLfqbIhdexFk5Bjk6Oswmt4SPqZJH3
 F+JSP0tKmaY35V6HRprnoZSqwxoCdC0zgwAbPICCj7dqIPy2eLiIlLphC2zro+C/
 f4+1rBSlmZ7jPB0MHHhtMgmzWHsMcj//xHY/574cB0RnZ8fxCjU=
 =sAyL
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2021-10-29' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Quiet but not too quiet, I blame Halloween.

  The first set of amdgpu fixes missed last week, hence why this has a
  few more of them, it's mostly display fixes for new GPUs and some
  debugfs OOB stuff.

  The i915 patches have one to remove a tracepoint possible issue before
  it's a real problem, the others around cflush and display are cc'ed to
  stable as well.

  Otherwise it's just a few misc fixes.

  Summary:

  MAINTAINERS:
   - Fix the path pattern

  ttm:
   - Fix fence leak in ttm_transfered_destroy.

  core:
   - Add GPD Win3 rotation quirk

  i915:
   - Remove unconditional clflushes
   - Fix oops on boot due to sync state on disabled DP encoders
   - Revert backend specific data added to tracepoints
   - Remove useless and incorrect memory frequence calculation

  panel:
   - Add quirk for Aya Neo 2021

  seltest:
   - Reset property count for each drm damage selftest so full run will
     work correctly.

  amdgpu:
   - Fix two potential out of bounds writes in debugfs
   - Fix revision handling for Yellow Carp
   - Display fixes for Yellow Carp
   - Display fixes for DCN 3.1"

* tag 'drm-fixes-2021-10-29' of git://anongit.freedesktop.org/drm/drm: (21 commits)
  MAINTAINERS: dri-devel is for all of drivers/gpu
  drm/i915: Revert 'guc_id' from i915_request tracepoint
  drm/amd/display: Fix deadlock when falling back to v2 from v3
  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
  drm/amdgpu: Fix even more out of bound writes from debugfs
  drm: panel-orientation-quirks: Add quirk for GPD Win3
  drm/i915/dp: Skip the HW readout of DPCD on disabled encoders
  drm/i915: Catch yet another unconditioal clflush
  drm/i915: Convert unconditional clflush to drm_clflush_virt_range()
  drm/i915/selftests: Properly reset mock object propers for each test
  drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
  drm/ttm: fix memleak in ttm_transfered_destroy
  drm/amdgpu: support B0&B1 external revision id for yellow carp
  drm/amd/display: Moved dccg init to after bios golden init
  drm/amd/display: Increase watermark latencies for DCN3.1
  drm/amd/display: increase Z9 latency to workaround underflow in Z9
  drm/amd/display: Require immediate flip support for DCN3.1 planes
  drm/amd/display: Fix prefetch bandwidth calculation for DCN3.1
  drm/amd/display: Limit display scaling to up to true 4k for DCN 3.1
  drm/amdgpu: fix out of bounds write
  ...
2021-10-28 12:17:01 -07:00
Daniel Vetter
b112166a89 MAINTAINERS: dri-devel is for all of drivers/gpu
Somehow we only have a list of subdirectories, which apparently made
it harder for folks to find the gpu maintainers. Fix that.

References: https://lore.kernel.org/dri-devel/YXrAAZlxxStNFG%2FK@phenom.ffwll.local/
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028170857.4029606-1-daniel.vetter@ffwll.ch
2021-10-29 04:49:11 +10:00
Dave Airlie
946ca97e2e Merge tag 'drm-intel-fixes-2021-10-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.15 final:
- Remove unconditional clflushes
- Fix oops on boot due to sync state on disabled DP encoders
- Revert backend specific data added to tracepoints
- Remove useless and incorrect memory frequence calculation

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8735olh27y.fsf@intel.com
2021-10-29 04:46:14 +10:00
Alex Deucher
403475be6d drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
The DMA mask on SI parts is 40 bits not 44.  Copy
paste typo.

Fixes: 244511f386 ("drm/amdgpu: simplify and cleanup setting the dma mask")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:27:00 -04:00
Meenakshikumar Somasundaram
139a33112f drm/amd/display: MST support for DPIA
[Why]
- DPIA MST slot registers are not programmed during payload
allocation and hence MST does not work with DPIA.
- HPD RX interrupts are not handled for DPIA.

[How]
- Added inbox command to program the MST slots whenever
  payload allocation happens for DPIA links.
- Added support for handling HPD RX interrupts

Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:27:00 -04:00
Patrik Jakobsson
839e59a343 drm/amdgpu: Fix even more out of bound writes from debugfs
CVE-2021-42327 was fixed by:

commit f23750b5b3
Author: Thelford Williams <tdwilliamsiv@gmail.com>
Date:   Wed Oct 13 16:04:13 2021 -0400

    drm/amdgpu: fix out of bounds write

but amdgpu_dm_debugfs.c contains more of the same issue so fix the
remaining ones.

v2:
	* Add missing fix in dp_max_bpc_write (Harry Wentland)

Fixes: 918698d5c2 ("drm/amd/display: Return the number of bytes parsed than allocated")
Signed-off-by: Patrik Jakobsson <pjakobsson@suse.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Alex Deucher
58f8c7fa88 drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts
Add secondary instance version info for soc15 parts.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Alex Deucher
074b2092d9 drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts
Add secondary instance version info for vega20, arcturure, and
aldebaran.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Alex Deucher
72f4c9d570 drm/amdgpu/UAPI: rearrange header to better align related items
Move the RAS query parameters to align with the INFO query where
they are used.  No functional change.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Jude Shih
5b10939750 drm/amd/display: Enable dpia in dmub only for DCN31 B0
[Why]
DMUB binary is common for both A0 and B0. Hence, driver should
notify FW about the support for DPIA in B0.

[How]
Added dpia_supported bit in dmub_fw_boot_options and will be set
only for B0.

Assign dpia_supported to true before dm_dmub_hw_init
in B0 case.

v2: fix build without CONFIG_DRM_AMD_DC_DCN (Alex)

Signed-off-by: Jude Shih <shenshih@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Jude Shih
094b21c1a3 drm/amd/display: Fix USB4 hot plug crash issue
[Why]
Notify data from outbox corrupt, the notify type should be 2 (HPD) instead of 0
(No data). We copied the address instead of the value. The memory might be
freed in the end of outbox IRQ

[How]
We should allocate the memory of notify and copy the whole content from outbox to
hpd handle function

Fixes: 88f52b1fff ("drm/amd/display: Support for SET_CONFIG processing with DMUB")
Signed-off-by: Jude Shih <shenshih@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:51 -04:00
Nicholas Kazlauskas
f638d7505f drm/amd/display: Fix deadlock when falling back to v2 from v3
[Why]
A deadlock in the kernel occurs when we fallback from the V3 to V2
add_topology_to_display or remove_topology_to_display because they
both try to acquire the dtm_mutex but recursive locking isn't
supported on mutex_lock().

[How]
Make the mutex_lock/unlock more fine grained and move them up such that
they're only required for the psp invocation itself.

Fixes: bf62221e9d ("drm/amd/display: Add DCN3.1 HDCP support")

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:51 -04:00
Michael Strauss
1e5588d140 drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
[WHY]
On certain configs, SMU clock table voltages don't match which cause parser
to behave incorrectly by leaving dcfclk and socclk table entries unpopulated.

[HOW]
Currently the function that finds the corresponding clock for a given voltage
only checks for exact voltage level matches. In the case that no match gets
found, parser now falls back to searching for the max clock which meets the
requested voltage (i.e. its corresponding voltage is below requested).

Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:50 -04:00
Qingqing Zhuo
31484207fe drm/amd/display: move FPU associated DCN301 code to DML folder
[Why & How]
As part of the FPU isolation work documented in
https://patchwork.freedesktop.org/series/93042/, isolate
code that uses FPU in DCN301 to DML, where all FPU code
should locate.

Cc: Christian König <christian.koenig@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Zhan Liu <Zhan.Liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:50 -04:00
Wenjing Liu
e72aa36ef8 drm/amd/display: fix link training regression for 1 or 2 lane
[why]
We have a regression that cause maximize lane settings to use
uninitialized data from unused lanes.
This will cause link training to fail for 1 or 2 lanes because the lane
adjust is populated incorrectly sometimes.

v2: fix build without CONFIG_DRM_AMD_DC_DCN (Alex)

Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:50 -04:00
Wenjing Liu
9c92c79b05 drm/amd/display: add two lane settings training options
[why]
option 1: disallow different lanes to have different lane settings
option 2: dpcd lane settings will always use the same hw lane settings
even if it doesn't match requested lane adjust

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Wenjing Liu
75c2830c91 drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings
[why]
As DP features expands, we have encountered many situations where we
must configure a different DPCD lane setting from hw lane settings we
output.  The change is to decouple hw lane settings from dpcd lane
settings to provide flexibility to configure dpcd and hw individually.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Wenjing Liu
c224aac870 drm/amd/display: implement decide lane settings
[why]
Decouple lane settings decision logic all to its own function. The
function takes in lane adjust array and link training settings and
decide what hw lane setting and dpcd lane setting should be used.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Wenjing Liu
5354b2bd28 drm/amd/display: adopt DP2.0 LT SCR revision 8
[how]
revision 8 SCR requires DP Source to write TPS2 and FFE lane adjustment
in one 5 byte write aux transaction.
It specifies to read aux rd interval value as soon as we turn on TPS1
pattern.

Cc: Wayne Lin <wayne.lin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Meenakshikumar Somasundaram
ed0ffb5dcd drm/amd/display: FEC configuration for dpia links in MST mode
[Why]
To fix the check condition for fec enable for dpia links in MST mode.

[How]
dc_link_should_enable_fec() to be used to check whether fec should be
enabled in MST mode.

Cc: Wayne Lin <wayne.lin@amd.com>
Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Meenakshikumar Somasundaram
7fb52632ca drm/amd/display: FEC configuration for dpia links
[Why]
To fix the check condition for fec enable for dpia links.

[How]
dc_link_should_enable_fec() to be used to check whether fec should be
enabled.

Cc: Wayne Lin <wayne.lin@amd.com>
Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Jimmy Kizito
4b169ca367 drm/amd/display: Add workaround flag for EDID read on certain docks
[Why]
Certain docks appear to NAK I2C writes to the segment pointer with the
MOT (middle of transaction) bit clear. This behaviour can cause EDID
reads from higher segments to fail.

[How]
Add workaround flag for links which connect to docks exhibiting this
issue.

Cc: Wayne Lin <wayne.lin@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:17 -04:00
Hansen
3137f792c5 drm/amd/display: Set phy_mux_sel bit in dmub scratch register
[Why]
B0 has pipe mux for DIGC and DIGD which can be connected to PHYF/PHYG or
PHYC/PHY D.

[How]
Based on chip internal hardware revision id determine it is B0 and set
DMUB scratch register so DMUBFW can connect the display pipe is
connected correctly to the dig.

Cc: Wayne Lin <wayne.lin@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Hansen <Hansen.Dsouza@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:16 -04:00
Martin Leung
a9a1ac4407 drm/amd/display: Manually adjust strobe for DCN303
why:
DCN303's 4 channel SOC BB causes problems at strobe

how:
workaround to manually adjust strobe calculation using FCLK
restrict.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:16 -04:00