linux

Author	SHA1	Message	Date
Sergei Shtylyov	f8e022db50	sh_eth: RX checksum offload support Add support for the RX checksum offload. This is enabled by default and may be disabled and re-enabled using 'ethtool': # ethtool -K eth0 rx off # ethtool -K eth0 rx on Some Ether MACs provide a simple checksumming scheme which appears to be completely compatible with CHECKSUM_COMPLETE: sum of all packet data after the L2 header is appended to packet data; this may be trivially read by the driver and used to update the skb accordingly. The same checksumming scheme is implemented in the EtherAVB MACs and now supported by the 'ravb' driver. In terms of performance, throughput is close to gigabit line rate with the RX checksum offload both enabled and disabled. The 'perf' output, however, appears to indicate that significantly less time is spent in do_csum() -- this is as expected. Test results with RX checksum offload enabled: ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 TCP MAERTS TEST to 192.168.2.4 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 10.01 933.93 [ perf record: Woken up 8 times to write data ] [ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ] ~/netperf-2.2pl4# perf report Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763 Overhead Command Shared Object Symbol 9.44% netperf [kernel.kallsyms] [k] __arch_copy_to_user 7.75% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 6.31% swapper [kernel.kallsyms] [k] default_idle_call 5.89% swapper [kernel.kallsyms] [k] arch_cpu_idle 4.37% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 4.02% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq 2.52% netperf [kernel.kallsyms] [k] preempt_count_sub 1.81% netperf [kernel.kallsyms] [k] tcp_recvmsg 1.80% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irqres 1.78% netperf [kernel.kallsyms] [k] preempt_count_add 1.36% netperf [kernel.kallsyms] [k] __tcp_transmit_skb 1.20% netperf [kernel.kallsyms] [k] __local_bh_enable_ip 1.10% netperf [kernel.kallsyms] [k] sh_eth_start_xmit Test results with RX checksum offload disabled: ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 TCP MAERTS TEST to 192.168.2.4 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 10.01 932.04 [ perf record: Woken up 14 times to write data ] [ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ] ~/netperf-2.2pl4# perf report Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796 Overhead Command Shared Object Symbol 7.00% swapper [kernel.kallsyms] [k] do_csum 3.94% swapper [kernel.kallsyms] [k] sh_eth_poll 3.83% ksoftirqd/0 [kernel.kallsyms] [k] do_csum 3.23% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 2.87% netperf [kernel.kallsyms] [k] __arch_copy_to_user 2.86% swapper [kernel.kallsyms] [k] arch_cpu_idle 2.13% swapper [kernel.kallsyms] [k] default_idle_call 2.12% ksoftirqd/0 [kernel.kallsyms] [k] sh_eth_poll 2.02% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.84% swapper [kernel.kallsyms] [k] __softirqentry_text_start 1.64% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 1.53% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq 1.32% netperf [kernel.kallsyms] [k] preempt_count_sub 1.27% swapper [kernel.kallsyms] [k] __pi___inval_dcache_area 1.22% swapper [kernel.kallsyms] [k] check_preemption_disabled 1.01% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore The above results collected on the R-Car V3H Starter Kit board. Based on the commit `4d86d38186` ("ravb: RX checksum offload")... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	2c2ab5af7d	sh_eth: rename sh_eth_cpu_data::hw_checksum Commit `62e04b7e0e` ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed the field to 'hw_checksum' for the Ether DMAC "intelligent checksum", however some Ether MACs implement a simpler checksumming scheme, so that name now seems misleading. Rename that field to 'csmr' as the "intelligent checksum" is always controlled by the CSMR register. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Alexei Starovoitov	1728b11110	Merge branch 'libbpf-btf_ext' Yonghong Song says: ==================== This patch set exposed a few functions in libbpf. All these newly added API functions are helpful for JIT based bpf compilation where .BTF and .BTF.ext are available as in-memory data blobs. Patch #1 exposed several btf_ext__* API functions which are used to handle .BTF.ext ELF sections. Patch #2 refactored the function bpf_map_find_btf_info() and exposed API function btf__get_map_kv_tids() to retrieve the map key/value type id's generated by bpf program through BPF_ANNOTATE_KV_PAIR macro. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 12:48:37 -08:00
Yonghong Song	96408c4344	tools/bpf: implement libbpf btf__get_map_kv_tids() API function Currently, to get map key/value type id's, the macro BPF_ANNOTATE_KV_PAIR(<map_name>, <key_type>, <value_type>) needs to be defined in the bpf program for the corresponding map. During program/map loading time, the local static function bpf_map_find_btf_info() in libbpf.c is implemented to retrieve the key/value type ids given the map name. The patch refactored function bpf_map_find_btf_info() to create an API btf__get_map_kv_tids() which includes the bulk of implementation for the original function. The API btf__get_map_kv_tids() can be used by bcc, a JIT based bpf compilation system, which uses the same BPF_ANNOTATE_KV_PAIR to record map key/value types. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 12:48:36 -08:00
Yonghong Song	b8dcf8d149	tools/bpf: expose functions btf_ext__* as API functions The following set of functions, which manipulates .BTF.ext section, are exposed as API functions: . btf_ext__new . btf_ext__free . btf_ext__reloc_func_info . btf_ext__reloc_line_info . btf_ext__func_info_rec_size . btf_ext__line_info_rec_size These functions are useful for JIT based bpf codegen, e.g., bcc, to manipulate in-memory .BTF.ext sections. The signature of function btf_ext__reloc_func_info() is also changed to be the same as its definition in btf.c. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 12:48:36 -08:00
Stanislav Fomichev	7e8a590377	selftests/bpf: use localhost in tcp_{server,client}.py Bind and connect to localhost. There is no reason for this test to use non-localhost interface. This lets us run this test in a network namespace. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-04 21:29:27 +01:00
Paul Burton	047f2d941b	MIPS: Use lower case for addresses in nexys4ddr.dts DTC introduced an i2c_bus_reg check in v1.4.7, used since Linux v4.20, which complains about upper case addresses used in the unit name. nexys4ddr.dts names an I2C device node "ad7420@4B", leading to: arch/mips/boot/dts/xilfpga/nexys4ddr.dts:109.16-112.8: Warning (i2c_bus_reg): /i2c@10A00000/ad7420@4B: I2C bus unit address format error, expected "4b" Fix this by switching to lower case addresses throughout the file, as is mostly the case in the file already & fairly standard throughout the tree. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: stable@vger.kernel.org # v4.20+ Cc: linux-mips@vger.kernel.org	2019-02-04 11:55:49 -08:00
Paul Burton	afd375dc23	MIPS: Enable hugepage support for MIPS64r6 Our hugepage support already exists for MIPS64 CPUs, and is already enabled for older architecture revisions. There's nothing MIPSr6 specific involved, and our hugepage support already works fine for MIPS64r6 CPUs such as the I6500, so allow it to be selected in Kconfig. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:55 -08:00
Paul Burton	82f4f66ddf	MIPS: Remove open-coded cmpxchg() in set_pte() set_pte() contains an open coded version of cmpxchg() - it atomically replaces the buddy pte's value if it is currently zero. Simplify the code considerably by just using cmpxchg() instead of reinventing it. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:48 -08:00
Paul Burton	c8790d657b	MIPS: MemoryMapID (MMID) Support Introduce support for using MemoryMapIDs (MMIDs) as an alternative to Address Space IDs (ASIDs). The major difference between the two is that MMIDs are global - ie. an MMID uniquely identifies an address space across all coherent CPUs. In contrast ASIDs are non-global per-CPU IDs, wherein each address space is allocated a separate ASID for each CPU upon which it is used. This global namespace allows a new GINVT instruction be used to globally invalidate TLB entries associated with a particular MMID across all coherent CPUs in the system, removing the need for IPIs to invalidate entries with separate ASIDs on each CPU. The allocation scheme used here is largely borrowed from arm64 (see arch/arm64/mm/context.c). In essence we maintain a bitmap to track available MMIDs, and MMIDs in active use at the time of a rollover to a new MMID version are preserved in the new version. The allocation scheme requires efficient 64 bit atomics in order to perform reasonably, so this support depends upon CONFIG_GENERIC_ATOMIC64=n (ie. currently it will only be included in MIPS64 kernels). The first, and currently only, available CPU with support for MMIDs is the MIPS I6500. This CPU supports 16 bit MMIDs, and so for now we cap our MMIDs to 16 bits wide in order to prevent the bitmap growing to absurd sizes if any future CPU does implement 32 bit MMIDs as the architecture manuals suggest is recommended. When MMIDs are in use we also make use of GINVT instruction which is available due to the global nature of MMIDs. By executing a sequence of GINVT & SYNC 0x14 instructions we can avoid the overhead of an IPI to each remote CPU in many cases. One complication is that GINVT will invalidate wired entries (in all cases apart from type 0, which targets the entire TLB). In order to avoid GINVT invalidating any wired TLB entries we set up, we make sure to create those entries using a reserved MMID (0) that we never associate with any address space. Also of note is that KVM will require further work in order to support MMIDs & GINVT, since KVM is involved in allocating IDs for guests & in configuring the MMU. That work is not part of this patch, so for now when MMIDs are in use KVM is disabled. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:41 -08:00
Paul Burton	535113896e	MIPS: Add GINVT instruction helpers Add a family of ginvt_* functions making it easy to emit a GINVT instruction to globally invalidate TLB entries. We make use of the _ASM_MACRO infrastructure to support emitting the instructions even if the assembler isn't new enough to support them natively. An associated STYPE_GINV definition & sync_ginv() function are added to emit a sync instruction of type 0x14, which operates as a completion barrier for these new GINVT (and GINVI) instructions. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:35 -08:00
Paul Burton	0b317c389c	MIPS: mm: Add set_cpu_context() for ASID assignments When we gain MMID support we'll be storing MMIDs as atomic64_t values and accessing them via atomic64_* functions. This necessitates that we don't use cpu_context() as the left hand side of an assignment, ie. as a modifiable lvalue. In preparation for this introduce a new set_cpu_context() function & replace all assignments with cpu_context() on their left hand side with an equivalent call to set_cpu_context(). To enforce that cpu_context() should not be used for assignments, we rewrite it as a static inline function. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:33 -08:00
Paul Burton	42d5b84657	MIPS: mm: Unify ASID version checks Introduce a new check_mmu_context() function to check an mm's ASID version & get a new one if it's outdated, and a check_switch_mmu_context() function which additionally sets up the new ASID & page directory. Simplify switch_mm() & various get_new_mmu_context() callsites in MIPS KVM by making use of the new functions, which will help reduce the amount of code that requires modification to gain MMID support. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:30 -08:00
Paul Burton	4ebea49ce2	MIPS: mm: Un-inline get_new_mmu_context In preparation for adding MMID support to get_new_mmu_context() which will increase the size of the function somewhat, move it from asm/mmu_context.h into a C file. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:28 -08:00
Paul Burton	7e8556d06a	MIPS: mm: Split obj-y to a file per line Split always-included objects to one per line in order to make it easier to modify the list of included objects. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:26 -08:00
Paul Burton	558ec8ad71	MIPS: mm: Remove local_flush_tlb_mm() All 3 variants of local_flush_tlb_mm() are now effectively simple calls to drop_mmu_context(). Remove them and use drop_mmu_context() directly. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:24 -08:00
Paul Burton	f7908a007e	MIPS: mm: Remove redundant preempt_disable in local_flush_tlb_mm() The r4k variant of local_flush_tlb_mm() wraps its call to drop_mmu_context() with a preempt_disable() & preempt_enable() pair, but this is redundant since drop_mmu_context() disables interrupts and from Documentation/preempt-locking.txt: Note that you do not need to explicitly prevent preemption if you are holding any locks or interrupts are disabled, since preemption is implicitly disabled in those cases. Remove the redundant preempt_disable() & preempt_enable() calls. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:22 -08:00
Paul Burton	6067d47e36	MIPS: mm: Move drop_mmu_context() comment into appropriate block drop_mmu_context() is preceded by a comment indicating what happens if the mm provided is currently active on the local CPU. Move that comment into the block that executes in this case, adjusting slightly to reflect its new location. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:20 -08:00
Paul Burton	c9b2a3dc24	MIPS: mm: Consolidate drop_mmu_context() has-ASID checks If an mm does not have an ASID on the local CPU then drop_mmu_context() is always redundant, since there's no context to "drop". Various callers of drop_mmu_context() check whether the mm has been allocated an ASID before making the call. Move that check into drop_mmu_context() and remove it from callers to simplify them. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:18 -08:00
Paul Burton	67741ba3ba	MIPS: mm: Avoid HTW stop/start when dropping an inactive mm If drop_mmu_context() is called with an mm that is not currently active on the local CPU then there's no need for us to stop & start a hardware page table walker because it can't be fetching entries for the ASID corresponding to the mm we're operating on. Move the htw_stop() & htw_start() calls into the block which we run only if the mm is currently active, in order to avoid the redundant work. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:16 -08:00
Paul Burton	4739f7dd99	MIPS: mm: Remove redundant get_new_mmu_context() cpu argument get_new_mmu_context() accepts a cpu argument, but implicitly assumes that this is always equal to smp_processor_id() by operating on the local CPU's TLB & icache. Remove the cpu argument and have get_new_mmu_context() call smp_processor_id() instead. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:14 -08:00
Paul Burton	9a27324fde	MIPS: mm: Remove redundant drop_mmu_context() cpu argument The drop_mmu_context() function accepts a cpu argument, but it implicitly expects that this is always equal to smp_processor_id() by allocating & configuring an ASID on the local CPU when the mm is active on the CPU indicated by the cpu argument. All callers do provide the value of smp_processor_id() to the cpu argument. Remove the redundant argument and have drop_mmu_context() call smp_processor_id() itself, making it clearer that the cpu variable always represents the local CPU. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:12 -08:00
Paul Burton	c653bd04f7	MIPS: mm: Define activate_mm() using switch_mm() MIPS has separate definitions of activate_mm() & switch_mm() which are identical apart from switch_mm() checking that the ASID is valid before acquiring a new one. We know that when activate_mm() is called cpu_context(X, mm) will be zero, and this will never be considered a valid ASID because we never allow the ASID version number to be zero, instead beginning with version 1 using asid_first_version(). Therefore switch_mm() will always allocate a new ASID when called for a new task, meaning that it will behave identically to activate_mm(). Take advantage of this to remove the duplication & define activate_mm() using switch_mm() just like many other architectures do. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org	2019-02-04 10:56:09 -08:00
Huacai Chen	e02e07e312	MIPS: Loongson: Introduce and use loongson_llsc_mb() On the Loongson-2G/2H/3A/3B there is a hardware flaw that ll/sc and lld/scd is very weak ordering. We should add sync instructions "before each ll/lld" and "at the branch-target between ll/sc" to workaround. Otherwise, this flaw will cause deadlock occasionally (e.g. when doing heavy load test with LTP). Below is the explaination of CPU designer: "For Loongson 3 family, when a memory access instruction (load, store, or prefetch)'s executing occurs between the execution of LL and SC, the success or failure of SC is not predictable. Although programmer would not insert memory access instructions between LL and SC, the memory instructions before LL in program-order, may dynamically executed between the execution of LL/SC, so a memory fence (SYNC) is needed before LL/LLD to avoid this situation. Since Loongson-3A R2 (3A2000), we have improved our hardware design to handle this case. But we later deduce a rarely circumstance that some speculatively executed memory instructions due to branch misprediction between LL/SC still fall into the above case, so a memory fence (SYNC) at branch-target (if its target is not between LL/SC) is needed for Loongson 3A1000, 3B1500, 3A2000 and 3A3000. Our processor is continually evolving and we aim to to remove all these workaround-SYNCs around LL/SC for new-come processor." Here is an example: Both cpu1 and cpu2 simutaneously run atomic_add by 1 on same atomic var, this bug cause both 'sc' run by two cpus (in atomic_add) succeed at same time('sc' return 1), and the variable is only added by 1, sometimes, which is wrong and unacceptable(it should be added by 2). Why disable fix-loongson3-llsc in compiler? Because compiler fix will cause problems in kernel's __ex_table section. This patch fix all the cases in kernel, but: +. the fix at the end of futex_atomic_cmpxchg_inatomic is for branch-target of 'bne', there other cases which smp_mb__before_llsc() and smp_llsc_mb() fix the ll and branch-target coincidently such as atomic_sub_if_positive/ cmpxchg/xchg, just like this one. +. Loongson 3 does support CONFIG_EDAC_ATOMIC_SCRUB, so no need to touch edac.h +. local_ops and cmpxchg_local should not be affected by this bug since only the owner can write. +. mips_atomic_set for syscall.c is deprecated and rarely used, just let it go Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Huang Pei <huangpei@loongson.cn> [paul.burton@mips.com: - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add a comment describing why it's there. - Make loongson_llsc_mb() a no-op when CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory barrier. - Add a comment describing the bug & how loongson_llsc_mb() helps in asm/barrier.h.] Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: ambrosehua@gmail.com Cc: Steven J . Hill <Steven.Hill@cavium.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Li Xuefeng <lixuefeng@loongson.cn> Cc: Xu Chenghua <xuchenghua@loongson.cn>	2019-02-04 10:53:34 -08:00
Arnaldo Carvalho de Melo	6ab3bc240a	perf trace: Support multiple "vfs_getname" probes With a suitably defined "probe:vfs_getname" probe, 'perf trace' can "beautify" its output, so syscalls like open() or openat() can print the "filename" argument instead of just its hex address, like: $ perf trace -e open -- touch /dev/null [...] 0.590 ( 0.014 ms): touch/18063 open(filename: /dev/null, flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, mode: IRUGO\|IWUGO) = 3 [...] The output without such beautifier looks like: 0.529 ( 0.011 ms): touch/18075 open(filename: 0xc78cf288, flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, mode: IRUGO\|IWUGO) = 3 However, when the vfs_getname probe expands to multiple probes and it is not the first one that is hit, the beautifier fails, as following: 0.326 ( 0.010 ms): touch/18072 open(filename: , flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, mode: IRUGO\|IWUGO) = 3 Fix it by hooking into all the expanded probes (inlines), now, for instance: [root@quaco ~]# perf probe -l probe:vfs_getname (on getname_flags:73@fs/namei.c with pathname) probe:vfs_getname_1 (on getname_flags:73@fs/namei.c with pathname) [root@quaco ~]# perf trace -e open* sleep 1 0.010 ( 0.005 ms): sleep/5588 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: RDONLY\|CLOEXEC) = 3 0.029 ( 0.006 ms): sleep/5588 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: RDONLY\|CLOEXEC) = 3 0.194 ( 0.008 ms): sleep/5588 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: RDONLY\|CLOEXEC) = 3 [root@quaco ~]# Works, further verified with: [root@quaco ~]# perf test vfs 65: Use vfs_getname probe to get syscall args filenames : Ok 66: Add vfs_getname probe to get syscall args filenames : Ok 67: Check open filename arg using perf trace + vfs_getname: Ok [root@quaco ~]# Reported-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Michael Petlan <mpetlan@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-mv8kolk17xla1smvmp3qabv1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-02-04 15:50:38 -03:00
Jiri Olsa	59a1770691	perf symbols: Filter out hidden symbols from labels When perf is built with the annobin plugin (RHEL8 build) extra symbols are added to its binary: # nm perf \| grep annobin \| head -10 0000000000241100 t .annobin_annotate.c 0000000000326490 t .annobin_annotate.c 0000000000249255 t .annobin_annotate.c_end 00000000003283a8 t .annobin_annotate.c_end 00000000001bce18 t .annobin_annotate.c_end.hot 00000000001bce18 t .annobin_annotate.c_end.hot 00000000001bc3e2 t .annobin_annotate.c_end.unlikely 00000000001bc400 t .annobin_annotate.c_end.unlikely 00000000001bce18 t .annobin_annotate.c.hot 00000000001bce18 t .annobin_annotate.c.hot ... Those symbols have no use for report or annotation and should be skipped. Moreover they interfere with the DWARF unwind test on the PPC arch, where they are mixed with checked symbols and then the test fails: # perf test dwarf -v 59: Test dwarf unwind : --- start --- test child forked, pid 8515 unwind: .annobin_dwarf_unwind.c:ip = 0x10dba40dc (0x2740dc) ... got: .annobin_dwarf_unwind.c 0x10dba40dc, expecting test__arch_unwind_sample unwind: failed with 'no error' The annobin symbols are defined as NOTYPE/LOCAL/HIDDEN: # readelf -s ./perf \| grep annobin \| head -1 40: 00000000001bce4f 0 NOTYPE LOCAL HIDDEN 13 .annobin_init.c They can still pass the check for the label symbol. Adding check for HIDDEN and INTERNAL (as suggested by Nick below) visibility and filter out such symbols. > Just to be awkward, if you are going to ignore STV_HIDDEN > symbols then you should probably also ignore STV_INTERNAL ones > as well... Annobin does not generate them, but you never know, > one day some other tool might create some. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Clifton <nickc@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190128133526.GD15461@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-02-04 15:50:38 -03:00
Arnaldo Carvalho de Melo	843cf70ed2	perf symbols: Add fallback definitions for GELF_ST_VISIBILITY() Those aren't present in Alpine Linux 3.4 to edge, so provide fallback defines to get the next patch building there keeping the build bisectable. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Clifton <nickc@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-03cg3gya2ju4ba2x6ibb9fuz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-02-04 15:50:37 -03:00
Heiko Carstens	ecc15f113c	s390: bpf: fix JMP32 code-gen Commit `626a5f66da` ("s390: bpf: implement jitting of JMP32") added JMP32 code-gen support for s390. However it triggers the warning below due to some unusual gotos in the original s390 bpf jit code. Add a couple of additional "is_jmp32" initializations to fix this. Also fix the wrong opcode for the "llilf" instruction that was introduced with the same commit. arch/s390/net/bpf_jit_comp.c: In function 'bpf_jit_insn': arch/s390/net/bpf_jit_comp.c:248:55: warning: 'is_jmp32' may be used uninitialized in this function [-Wmaybe-uninitialized] _EMIT6(op1 \| reg(b1, b2) << 16 \| (rel & 0xffff), op2 \| mask); \ ^ arch/s390/net/bpf_jit_comp.c:1211:8: note: 'is_jmp32' was declared here bool is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32; Fixes: `626a5f66da` ("s390: bpf: implement jitting of JMP32") Cc: Jiong Wang <jiong.wang@netronome.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:45:09 -08:00
David S. Miller	0429f237ce	Merge branch 's390-qeth-fixes' Julian Wiedmann says: ==================== s390/qeth: fixes 2019-02-04 please apply the following four fixes to -net. Patch 1 takes care of a common resource leak in various error paths, while the second patch fixes a misordered kfree when cleaning up after an error. The other two patches ensure that there's no stale work dangling on workqueues when the qeth device has already been offlined and/or removed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:43:48 -08:00
Julian Wiedmann	c0a2e4d10d	s390/qeth: conclude all event processing before offlining a card Work for Bridgeport events is currently placed on a driver-wide workqueue. If the card is removed and freed while any such work is still active, this causes a use-after-free. So put the events on a per-card queue, where we can control their lifetime. As we also don't want stale events to last beyond an offline & online cycle, flush this queue when setting the card offline. Fixes: `b4d72c08b3` ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:43:48 -08:00
Julian Wiedmann	c2780c1a3f	s390/qeth: cancel close_dev work before removing a card A card's close_dev work is scheduled on a driver-wide workqueue. If the card is removed and freed while the work is still active, this causes a use-after-free. So make sure that the work is completed before freeing the card. Fixes: `0f54761d16` ("qeth: Support VEPA mode") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:43:48 -08:00
Julian Wiedmann	afa0c5904b	s390/qeth: fix use-after-free in error path The error path in qeth_alloc_qdio_buffers() that takes care of cleaning up the Output Queues is buggy. It first frees the queue, but then calls qeth_clear_outq_buffers() with that very queue struct. Make the call to qeth_clear_outq_buffers() part of the free action (in the correct order), and while at it fix the naming of the helper. Fixes: `0da9581ddb` ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:43:48 -08:00
Julian Wiedmann	5065b2dd3e	s390/qeth: release cmd buffer in error paths Whenever we fail before/while starting an IO, make sure to release the IO buffer. Usually qeth_irq() would do this for us, but if the IO doesn't even start we obviously won't get an interrupt for it either. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:43:47 -08:00
Alexei Starovoitov	9fa3b47304	Merge branch 'change-libbpf-print-api' Yonghong Song says: ==================== These are patches responding to my comments for Magnus's patch (https://patchwork.ozlabs.org/patch/1032848/). The goal is to make pr_* macros available to other C files than libbpf.c, and to simplify API function libbpf_set_print(). Specifically, Patch #1 used global functions to facilitate pr_* macros in the header files so they are available in different C files. Patch #2 removes the global function libbpf_print_level_available() which is added in Patch 1. Patch #3 simplified libbpf_set_print() which takes only one print function with a debug level argument among others. Changelogs: v3 -> v4: . rename libbpf internal header util.h to libbpf_util.h . rename libbpf internal function libbpf_debug_print() to libbpf_print() v2 -> v3: . bailed out earlier in libbpf_debug_print() if __libbpf_pr is NULL . added missing LIBBPF_DEBUG level check in libbpf.c __base_pr(). v1 -> v2: . Renamed global function libbpf_dprint() to libbpf_debug_print() to be more expressive. . Removed libbpf_dprint_level_available() as it is used only once in btf.c and we can remove it by optimizing for common cases. ==================== Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:59 -08:00
Yonghong Song	6f1ae8b662	tools/bpf: simplify libbpf API function libbpf_set_print() Currently, the libbpf API function libbpf_set_print() takes three function pointer parameters for warning, info and debug printout respectively. This patch changes the API to have just one function pointer parameter and the function pointer has one additional parameter "debugging level". So if in the future, if the debug level is increased, the function signature won't change. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:59 -08:00
Yonghong Song	9d100a19ff	tools/bpf: print out btf log at LIBBPF_WARN level Currently, the btf log is allocated and printed out in case of error at LIBBPF_DEBUG level. Such logs from kernel are very important for debugging. For example, bpf syscall BPF_PROG_LOAD command can get verifier logs back to user space. In function load_program() of libbpf.c, the log buffer is allocated unconditionally and printed out at pr_warning() level. Let us do the similar thing here for btf. Allocate buffer unconditionally and print out error logs at pr_warning() level. This can reduce one global function and optimize for common situations where pr_warning() is activated either by default or by user supplied debug output function. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:58 -08:00
Yonghong Song	8461ef8b7e	tools/bpf: move libbpf pr_* debug print functions to headers A global function libbpf_print, which is invisible outside the shared library, is defined to print based on levels. The pr_warning, pr_info and pr_debug macros are moved into the newly created header common.h. So any .c file including common.h can use these macros directly. Currently btf__new and btf_ext__new API has an argument getting __pr_debug function pointer into btf.c so the debugging information can be printed there. This patch removed this parameter from btf__new and btf_ext__new and directly using pr_debug in btf.c. Another global function libbpf_print_level_available, also invisible outside the shared library, can test whether a particular level debug printing is available or not. It is used in btf.c to test whether DEBUG level debug printing is availabl or not, based on which the log buffer will be allocated when loading btf to the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:58 -08:00
Florian Westphal	ac02bcf9cc	netfilter: ipv6: avoid indirect calls for IPV6=y case indirect calls are only needed if ipv6 is a module. Add helpers to abstract the v6ops indirections and use them instead. fragment, reroute and route_input are kept as indirect calls. The first two are not not used in hot path and route_input is only used by bridge netfilter. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-04 18:21:12 +01:00
Florian Westphal	960587285a	netfilter: nat: remove module dependency on ipv6 core nf_nat_ipv6 calls two ipv6 core functions, so add those to v6ops to avoid the module dependency. This is a prerequisite for merging ipv4 and ipv6 nat implementations. Add wrappers to avoid the indirection if ipv6 is builtin. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-04 18:20:19 +01:00
Petr Machata	c1f7e02979	net: cls_flower: Remove filter from mask before freeing it In fl_change(), when adding a new rule (i.e. fold == NULL), a driver may reject the new rule, for example due to resource exhaustion. By that point, the new rule was already assigned a mask, and it was added to that mask's hash table. The clean-up path that's invoked as a result of the rejection however neglects to undo the hash table addition, and proceeds to free the new rule, thus leaving a dangling pointer in the hash table. Fix by removing fnew from the mask's hash table before it is freed. Fixes: `35cc3cefc4` ("net/sched: cls_flower: Reject duplicated rules also under skip_sw") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:19:14 -08:00
David S. Miller	3e5a7c9814	wireless-drivers fixes for 5.0 First set of small, but importnat, fixes for 5.0. iwlwifi * fix a build regression introduced in 5.0-rc1 wlcore * fix a firmware regression from v4.18-rc1 mt76x0 * fix for configuring tx power from user space ath10k * fix wcn3990 regression from v4.20-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJcWDsEAAoJEG4XJFUm622bSGsIAKevgvjWQkE7ITGfVuVNxRr9 B1pLgpOrQpVJtInbIS/sGpRKlcGn+3rE47KWzsl/1osGroQGcj5GV+hfGsvGkLwD rzbYnAWA4VKP9yD/RDCho9v8kjAQYdfIArE7zpm14De0oXJHEn3C94Hxd90hn6CX BhX4iZONcpIaAnggLj5aSIOo8+UFoE9BlGLFN0uNGQT2V0X/GjvZSfTAGsaof09Q vSij06GNAoPF4g4ekJA0sJk0T3hJ8NNklrArKii7SE841j8Y1UaDHtHd/X5eUyLa tB2pDvQgoCO6O5stZWtq1fr/qPBFthyE1TiMla5WgW+8l4QYaF14smNxkW5Kamg= =5tr0 -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-for-davem-2019-02-04' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 5.0 First set of small, but importnat, fixes for 5.0. iwlwifi * fix a build regression introduced in 5.0-rc1 wlcore * fix a firmware regression from v4.18-rc1 mt76x0 * fix for configuring tx power from user space ath10k * fix wcn3990 regression from v4.20-rc1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:12:06 -08:00
David S. Miller	277aa590c3	Merge branch 'smc-fixes' Ursula Braun says: ==================== net/smc: fixes 2019-02-04 here are more fixes in the smc code for the net tree: Patch 1 fixes an IB-related problem with SMCR. Patch 2 fixes a cursor problem for one-way traffic. Patch 3 fixes a problem with RMB-reusage. Patch 4 fixes a closing issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:11:19 -08:00
Ursula Braun	84b799a292	net/smc: correct state change for peer closing If some kind of closing is received from the peer while still in state SMC_INIT, it means the peer has had an active connection and closed the socket quickly before listen_work finished. This should not result in a shortcut from state SMC_INIT to state SMC_CLOSED. This patch adds the socket to the accept queue in state SMC_APPCLOSEWAIT1. The socket reaches state SMC_CLOSED once being accepted and closed with smc_release(). Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:11:19 -08:00
Ursula Braun	a5e04318c8	net/smc: delete rkey first before switching to unused Once RMBs are flagged as unused they are candidates for reuse. Thus the LLC DELETE RKEY operaton should be made before flagging the RMB as unused. Fixes: `c7674c001b` ("net/smc: unregister rkeys of unused buffer") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:11:19 -08:00
Ursula Braun	b8649efad8	net/smc: fix sender_free computation In some scenarios a separate consumer cursor update is necessary. The decision is made in smc_tx_consumer_cursor_update(). The sender_free computation could be wrong: The rx confirmed cursor is always smaller than or equal to the rx producer cursor. The parameters in the smc_curs_diff() call have to be exchanged, otherwise sender_free might even be negative. And if more data arrives local_rx_ctrl.prod might be updated, enabling a cursor difference between local_rx_ctrl.prod and rx confirmed cursor larger than the RMB size. This case is not covered by smc_curs_diff(). Thus function smc_curs_diff_large() is introduced here. If a recvmsg() is processed in parallel, local_tx_ctrl.cons might change during smc_cdc_msg_send. Make sure rx_curs_confirmed is updated with the actually sent local_tx_ctrl.cons value. Fixes: `e82f2e31f5` ("net/smc: optimize consumer cursor updates") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:11:19 -08:00
Ursula Braun	ad6f317f72	net/smc: preallocated memory for rdma work requests The work requests for rdma writes are built in local variables within function smc_tx_rdma_write(). This violates the rule that the work request storage has to stay till the work request is confirmed by a completion queue response. This patch introduces preallocated memory for these work requests. The storage is allocated, once a link (and thus a queue pair) is established. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 09:11:19 -08:00
Sebastian Andrzej Siewior	53bc8d2af0	net: dp83640: expire old TX-skb During sendmsg() a cloned skb is saved via dp83640_txtstamp() in ->tx_queue. After the NIC sends this packet, the PHY will reply with a timestamp for that TX packet. If the cable is pulled at the right time I don't see that packet. It might gets flushed as part of queue shutdown on NIC's side. Once the link is up again then after the next sendmsg() we enqueue another skb in dp83640_txtstamp() and have two on the list. Then the PHY will send a reply and decode_txts() attaches it to the first skb on the list. No crash occurs since refcounting works but we are one packet behind. linuxptp/ptp4l usually closes the socket and opens a new one (in such a timeout case) so those "stale" replies never get there. However it does not resume normal operation anymore. Purge old skbs in decode_txts(). Fixes: `cb646e2b02` ("ptp: Added a clock driver for the National Semiconductor PHYTER.") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 08:54:52 -08:00
Pablo Neira Ayuso	f6ac858589	netfilter: nf_tables: unbind set in rule from commit path Anonymous sets that are bound to rules from the same transaction trigger a kernel splat from the abort path due to double set list removal and double free. This patch updates the logic to search for the transaction that is responsible for creating the set and disable the set list removal and release, given the rule is now responsible for this. Lookup is reverse since the transaction that adds the set is likely to be at the tail of the list. Moreover, this patch adds the unbind step to deliver the event from the commit path. This should not be done from the worker thread, since we have no guarantees of in-order delivery to the listener. This patch removes the assumption that both activate and deactivate callbacks need to be provided. Fixes: `cd5125d8f5` ("netfilter: nf_tables: split set destruction in deactivate and destroy phase") Reported-by: Mikhail Morfikov <mmorfikov@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-04 17:29:17 +01:00
Kees Cook	4b6e9f3fe1	ath9k: eeprom: Use scnprintf instead of snprintf Change snprintf to scnprintf. There are generally two cases where using snprintf causes problems. 1) Uses of size += snprintf(buf, SIZE - size, fmt, ...) In this case, if snprintf would have written more characters than what the buffer size (SIZE) is, then size will end up larger than SIZE. In later uses of snprintf, SIZE - size will result in a negative number, leading to problems. Note that size might already be too large by using size = snprintf before the code reaches a case of size += snprintf. 2) If size is ultimately used as a length parameter for a copy back to user space, then it will potentially allow for a buffer overflow and information disclosure when size is greater than SIZE. When the size is used to index the buffer directly, we can have memory corruption. This also means when size = snprintf... is used, it may also cause problems since size may become large. Copying to userspace is mitigated by the HARDENED_USERCOPY kernel configuration. The solution to these issues is to use scnprintf which returns the number of characters actually written to the buffer, so the size variable will never exceed SIZE. Cc: Willy Tarreau <w@1wt.eu> Cc: Silvio Cesare <silvio.cesare@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2019-02-04 17:52:49 +02:00
Govind Singh	5cbb117477	ath10k: Add support for extended HTT aggr msg support HTT aggr message parameter in HL2.0 fw are different in comparison to legacy fw version. Fill correct HTT aggr msg parameter for targets using HL2.0 firmware. Signed-off-by: Govind Singh <govinds@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2019-02-04 17:51:39 +02:00

... 54 55 56 57 58 ...

815968 Commits