When trying to lock read-only pages, sev_pin_memory() fails because
FOLL_WRITE is used as the flag for get_user_pages_fast().
Commit 73b0140bf0 ("mm/gup: change GUP fast to use flags rather than a
write 'bool'") updated the get_user_pages_fast() call sites to use
flags, but incorrectly updated the call in sev_pin_memory(). As the
original coding of this call was correct, revert the change made by that
commit.
Fixes: 73b0140bf0 ("mm/gup: change GUP fast to use flags rather than a write 'bool'")
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Link: http://lkml.kernel.org/r/20200423152419.87202-1-Janakarajan.Natarajan@amd.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the trapping instruction contains a ':', for a memory access through
segment registers for example, the sed substitution will insert the '*'
marker in the middle of the instruction instead of the line address:
2b: 65 48 0f c7 0f cmpxchg16b %gs:*(%rdi) <-- trapping instruction
I started to think I had forgotten some quirk of the assembly syntax
before noticing that it was actually coming from the script. Fix it to
add the address marker at the right place for these instructions:
28: 49 8b 06 mov (%r14),%rax
2b:* 65 48 0f c7 0f cmpxchg16b %gs:(%rdi) <-- trapping instruction
30: 0f 94 c0 sete %al
Fixes: 18ff44b189 ("scripts/decodecode: make faulting insn ptr more robust")
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected,
e.g., while booting up.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: __pageblock_pfn_to_page+0x134/0x1c0
Call Trace:
set_zone_contiguous+0x56/0x70
page_alloc_init_late+0x166/0x176
kernel_init_freeable+0xfa/0x255
kernel_init+0xa/0x106
ret_from_fork+0x35/0x40
The issue becomes visible when having a lot of memory (e.g., 4TB)
assigned to a single NUMA node - a system that can easily be created
using QEMU. Inside VMs on a hypervisor with quite some memory
overcommit, this is fairly easy to trigger.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When I run my memcg testcase which creates lots of memcgs, I found
there're unexpected out of memory logs while there're still enough
available free memory. The error log is
mkdir: cannot create directory 'foo.65533': Cannot allocate memory
The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs,
an -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right
errno should be -ENOSPC "No space left on device", which is an
appropriate errno for userspace's failed mkdir.
As the errno really misled me, we should make it right. After this
patch, the error log will be
mkdir: cannot create directory 'foo.65533': No space left on device
[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
Fixes: 73f576c04b ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
syzbot managed to trigger a recursive NETDEV_FEAT_CHANGE event
between bonding master and slave. I managed to find a reproducer
for this:
ip li set bond0 up
ifenslave bond0 eth0
brctl addbr br0
ethtool -K eth0 lro off
brctl addif br0 bond0
ip li set br0 up
When a NETDEV_FEAT_CHANGE event is triggered on a bonding slave,
it captures this and calls bond_compute_features() to fixup its
master's and other slaves' features. However, when syncing with
its lower devices by netdev_sync_lower_features() this event is
triggered again on slaves when the LRO feature fails to change,
so it goes back and forth recursively until the kernel stack is
exhausted.
Commit 17b85d29e8 intentionally lets __netdev_update_features()
return -1 for such a failure case, so we have to just rely on
the existing check inside netdev_sync_lower_features() and skip
NETDEV_FEAT_CHANGE event only for this specific failure case.
Fixes: fd867d51f8 ("net/core: generic support for disabling netdev features down stack")
Reported-by: syzbot+e73ceacfd8560cc8a3ca@syzkaller.appspotmail.com
Reported-by: syzbot+c2fb6f9ddcea95ba49b5@syzkaller.appspotmail.com
Cc: Jarod Wilson <jarod@redhat.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jann Horn <jannh@google.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now sch_fq has horizon feature, we want to allow QUIC/UDP applications
to use EDT model so that pacing can be offloaded to the kernel (sch_fq)
or the NIC.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a subflow is created via mptcp_subflow_create_socket(),
a new 'struct socket' is allocated, with a new i_ino value.
When inspecting TCP sockets via the procfs and or the diag
interface, the above ones are not related to the process owning
the MPTCP master socket, even if they are a logical part of it
('ss -p' shows an empty process field)
Additionally, subflows created by the path manager get
the uid/gid from the running workqueue.
Subflows are part of the owning MPTCP master socket, let's
adjust the vfs info to reflect this.
After this patch, 'ss' correctly displays subflows as belonging
to the msk socket creator.
Fixes: 2303f994b3 ("mptcp: Associate MPTCP context with TCP socket")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
bonding: report transmit status to callers
First patches cleanup netpoll, and make sure it provides tx status to its users.
Last patch changes bonding to not pretend packets were sent without error.
By providing more accurate status, TCP stack can avoid adding more
packets if the slave qdisc is already full.
This came while testing latest horizon feature in sch_fq, with
very low pacing rate flows, but should benefit hosts under stress.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, bonding always returns NETDEV_TX_OK to its caller.
It is worth trying to be more accurate : TCP for instance
can have different recovery strategies if it can have more
precise status, if packet was dropped by slave qdisc.
This is especially important when host is under stress.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
netpoll_send_skb() callers seem to leak skb if
the np pointer is NULL. While this should not happen, we
can make the code more robust.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some callers want to know if the packet has been sent or
dropped, to inform upper stacks.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to inline this helper, as we intend to add more
code in this function.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netpoll_send_skb_on_dev() can get the device pointer directly from np->dev
Rename it to __netpoll_send_skb()
Following patch will move netpoll_send_skb() out-of-line.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver calls kthread_run() in probe, but forgets to call
kthread_stop() in probe failure and remove.
Add the missed kthread_stop() to fix it.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The unsigned variable val is being checked for an error by checking
if it is less than zero. This can never occur because val is unsigned.
Fix this by making val a plain int.
Addresses-Coverity: ("Unsigned compared against zero")
Fixes: bdbdac7649 ("ethtool: provide UAPI for PHY master/slave configuration.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
net/smc/smc_llc.c: In function 'smc_llc_cli_conf_link':
net/smc/smc_llc.c:753:31: warning:
variable 'del_llc' set but not used [-Wunused-but-set-variable]
struct smc_llc_msg_del_link *del_llc;
^
net/smc/smc_llc.c: In function 'smc_llc_process_srv_delete_link':
net/smc/smc_llc.c:1311:33: warning:
variable 'del_llc_resp' set but not used [-Wunused-but-set-variable]
struct smc_llc_msg_del_link *del_llc_resp;
^
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
so tcp_is_sack/reno checks are removed from tcp_mark_head_lost.
Signed-off-by: zhang kai <zhangkaiheb@126.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NL_SET_ERR_MSG_MOD macro is used to report a string describing an
error message to userspace via the netlink extended ACK structure. It
should not have a trailing newline.
Add a cocci script which catches cases where the newline marker is
present. Using this script, fix the handful of cases which accidentally
included a trailing new line.
I couldn't figure out a way to get a patch mode working, so this script
only implements context, report, and org.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko says:
====================
net: ethernet: ti: am65x-cpts: follow up dt bindings update
This series is follow update for TI A65x/J721E Common platform time sync (CPTS)
driver [1] to implement DT bindings review comments from
Rob Herring <robh@kernel.org> [2].
- "reg" and "compatible" properties are made required for CPTS DT nodes which
also required to change K3 CPSW driver to use of_platform_device_create()
instead of of_platform_populate() for proper CPTS and MDIO initialization
- minor DT bindings format changes
- K3 CPTS example added to K3 MCU CPSW bindings
[1] https://lwn.net/Articles/819313/
[2] https://lwn.net/ml/linux-kernel/20200505040419.GA8509@bogus/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Update CPTS node following DT binding update:
- add reg and compatible properties
- fix node name
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch follows K3 CPTS review comments from Rob Herring
<robh@kernel.org>.
- "reg" and "compatible" properties are required now
- minor format changes
- K3 CPTS example added to K3 MCU CPSW bindings
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MCU CPSW expected to populate only MDIO device, but follow up patches
will add "compatible" property to the MCU CPSW CPTS node which will cause
creation of CPTS device and MCU CPSW init failure. Hence, switch to use
of_platform_device_create() instead of of_platform_populate() for MDIO
device population.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo says:
====================
hsr: hsr code refactoring
There are some unnecessary routine in the hsr module.
This patch removes these routines.
The first patch removes incorrect comment.
The second patch removes unnecessary WARN_ONCE() macro.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Create an independent function that takes a particular frame queue and
an array of frame descriptors and tries to enqueue them until it hits
the maximum number fo retries. The same function will be used in the
next patch also on the XDP_TX path.
Also, create the dpaa2_eth_xdp_fds structure to incorporate the array of
FDs as well as the number of FDs already populated.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When VLAN frame is being sent, hsr calls WARN_ONCE() because hsr doesn't
support VLAN. But using WARN_ONCE() is overdoing.
Using netdev_warn_once() is enough.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mask the consumer index before using it. Without this, we would be
writing frame descriptors beyond the ring size supported by the QBMAN
block.
Fixes: 3b2abda7d2 ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some verbiage describing how the hardware features of the switch are
exposed to users through tc-flower.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restrict the TTEthernet hardware support on this switch to operate as
closely as possible to IEEE 802.1Qci as possible. This means that it can
perform PTP-time-based ingress admission control on streams identified
by {DMAC, VID, PCP}, which is useful when trying to ensure the
determinism of traffic scheduled via IEEE 802.1Qbv.
The oddity comes from the fact that in hardware (and in TTEthernet at
large), virtual links always need a full-blown action, including not
only the type of policing, but also the list of destination ports. So in
practice, a single tc-gate action will result in all packets getting
dropped. Additional actions (either "trap" or "redirect") need to be
specified in the same filter rule such that the conforming packets are
actually forwarded somewhere.
Apart from the VL Lookup, Policing and Forwarding tables which need to
be programmed for each flow (virtual link), the Schedule engine also
needs to be told to open/close the admission gates for each individual
virtual link. A fairly accurate (and detailed) description of how that
works is already present in sja1105_tas.c, since it is already used to
trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key
point here, we remember that the schedule engine supports 8
"subschedules" (execution threads that iterate through the global
schedule in parallel, and that no 2 hardware threads must execute a
schedule entry at the same time). For tc-taprio, each egress port used
one of these 8 subschedules, leaving a total of 4 subschedules unused.
In principle we could have allocated 1 subschedule for the tc-gate
offload of each ingress port, but actually the schedules of all virtual
links installed on each ingress port would have needed to be merged
together, before they could have been programmed to hardware. So
simplify our life and just merge the entire tc-gate configuration, for
all virtual links on all ingress ports, into a single subschedule. Be
sure to check that against the usual hardware scheduling conflicts, and
program it to hardware alongside any tc-taprio subschedule that may be
present.
The following scenarios were tested:
1. Quantitative testing:
tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate index 1 base-time 0 \
sched-entry OPEN 1200 -1 -1 \
sched-entry CLOSE 1200 -1 -1 \
action trap
ping 192.168.1.2 -f
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
.............................
--- 192.168.1.2 ping statistics ---
948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
2. Qualitative testing (with a phase-aligned schedule - the clocks are
synchronized by ptp4l, not shown here):
Receiver (sja1105):
tc qdisc add dev swp2 clsact
now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
sec=$(echo $now | awk -F. '{print $1}') && \
base_time="$(((sec + 2) * 1000000000))" && \
echo "base time ${base_time}"
tc filter add dev swp2 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate base-time ${base_time} \
sched-entry OPEN 60000 -1 -1 \
sched-entry CLOSE 40000 -1 -1 \
action trap
Sender (enetc):
now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
sec=$(echo $now | awk -F. '{print $1}') && \
base_time="$(((sec + 2) * 1000000000))" && \
echo "base time ${base_time}"
tc qdisc add dev eno0 parent root taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time ${base_time} \
sched-entry S 01 50000 \
sched-entry S 00 50000 \
flags 2
ping -A 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
...
^C
--- 192.168.1.1 ping statistics ---
1425 packets transmitted, 1424 packets received, 0% packet loss
round-trip min/avg/max = 0.322/0.361/0.990 ms
And just for comparison, with the tc-taprio schedule deleted:
ping -A 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
...
^C
--- 192.168.1.1 ping statistics ---
33 packets transmitted, 19 packets received, 42% packet loss
round-trip min/avg/max = 0.336/0.464/0.597 ms
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement tc-flower offloads for redirect, trap and drop using
non-critical virtual links.
Commands which were tested to work are:
# Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
# CPU and to swp3. This type of key (DA only) when the port's VLAN
# awareness state is off.
tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
action mirred egress redirect dev swp3 \
action trap
# Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
# of 100 and a PCP of 0.
tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop
Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
filtering is disabled, those are set internally by the driver to the
port-based defaults. Because we would be put in an awkward situation if
the user were to change the VLAN filtering state while there are active
rules (packets would no longer match on the specified keys), we simply
deny changing vlan_filtering unless the list of flows offloaded via
virtual links is empty. Then the user can re-add new rules.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual links are a sja1105 hardware concept of executing various flow
actions based on a key extracted from the frame's DMAC, VID and PCP.
Currently the tc-flower offload code supports only parsing the DMAC if
that is the broadcast MAC address, and the VLAN PCP. Extract the key
parsing logic from the L2 policers functionality and move it into its
own function, after adding extra logic for matching on any DMAC and VID.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the register definitions for the:
- VL Lookup Table
- VL Policing Table
- VL Forwarding Table
- VL Forwarding Parameters Table
These are needed in order to perform TTEthernet operations: QoS
classification, flow-based policing and/or frame redirecting with the
switch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As its implementation shows, this is synonimous with calling
dsa_slave_dev_check followed by dsa_slave_to_port, so it is quite simple
already and provides functionality which is already there.
However there is now a need for these functions outside dsa_priv.h, for
example in drivers that perform mirroring and redirection through
tc-flower offloads (they are given raw access to the flow_cls_offload
structure), where they need to call this function on act->dev.
But simply exporting dsa_slave_to_port would make it non-inline and
would result in an extra function call in the hotpath, as can be seen
for example in sja1105:
Before:
000006dc <sja1105_xmit>:
{
6dc: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
6e0: e1a04000 mov r4, r0
6e4: e591958c ldr r9, [r1, #1420] ; 0x58c <- Inline dsa_slave_to_port
6e8: e1a05001 mov r5, r1
6ec: e24dd004 sub sp, sp, #4
u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
6f0: e1c901d8 ldrd r0, [r9, #24]
6f4: ebfffffe bl 0 <dsa_8021q_tx_vid>
6f4: R_ARM_CALL dsa_8021q_tx_vid
u8 pcp = netdev_txq_to_tc(netdev, queue_mapping);
6f8: e1d416b0 ldrh r1, [r4, #96] ; 0x60
u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
6fc: e1a08000 mov r8, r0
After:
000006e4 <sja1105_xmit>:
{
6e4: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
6e8: e1a04000 mov r4, r0
6ec: e24dd004 sub sp, sp, #4
struct dsa_port *dp = dsa_slave_to_port(netdev);
6f0: e1a00001 mov r0, r1
{
6f4: e1a05001 mov r5, r1
struct dsa_port *dp = dsa_slave_to_port(netdev);
6f8: ebfffffe bl 0 <dsa_slave_to_port>
6f8: R_ARM_CALL dsa_slave_to_port
6fc: e1a09000 mov r9, r0
u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
700: e1c001d8 ldrd r0, [r0, #24]
704: ebfffffe bl 0 <dsa_8021q_tx_vid>
704: R_ARM_CALL dsa_8021q_tx_vid
Because we want to avoid possible performance regressions, introduce
this new function which is designed to be public.
Suggested-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 19bda36c42:
| ipv6: add mtu lock check in __ip6_rt_update_pmtu
|
| Prior to this patch, ipv6 didn't do mtu lock check in ip6_update_pmtu.
| It leaded to that mtu lock doesn't really work when receiving the pkt
| of ICMPV6_PKT_TOOBIG.
|
| This patch is to add mtu lock check in __ip6_rt_update_pmtu just as ipv4
| did in __ip_rt_update_pmtu.
The above reasoning is incorrect. IPv6 *requires* icmp based pmtu to work.
There's already a comment to this effect elsewhere in the kernel:
$ git grep -p -B1 -A3 'RTAX_MTU lock'
net/ipv6/route.c=4813=
static int rt6_mtu_change_route(struct fib6_info *f6i, void *p_arg)
...
/* In IPv6 pmtu discovery is not optional,
so that RTAX_MTU lock cannot disable it.
We still use this lock to block changes
caused by addrconf/ndisc.
*/
This reverts to the pre-4.9 behaviour.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Fixes: 19bda36c42 ("ipv6: add mtu lock check in __ip6_rt_update_pmtu")
Signed-off-by: David S. Miller <davem@davemloft.net>
clang points out that building without IPv6 would lead to returning
an uninitialized variable if a packet with family!=AF_INET is
passed into bareudp_udp_encap_recv():
drivers/net/bareudp.c:139:6: error: variable 'err' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (family == AF_INET)
^~~~~~~~~~~~~~~~~
drivers/net/bareudp.c:146:15: note: uninitialized use occurs here
if (unlikely(err)) {
^~~
include/linux/compiler.h:78:42: note: expanded from macro 'unlikely'
# define unlikely(x) __builtin_expect(!!(x), 0)
^
drivers/net/bareudp.c:139:2: note: remove the 'if' if its condition is always true
if (family == AF_INET)
^~~~~~~~~~~~~~~~~~~~~~
This cannot happen in practice, so change the condition in a way that
gcc sees the IPv4 case as unconditionally true here.
For consistency, change all the similar constructs in this file the
same way, using "if(IS_ENABLED())" instead of #if IS_ENABLED()".
Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM enabled
- Fix allocation leaks in bootconfig tool
- Fix a double initialization of a variable
- Fix API bootconfig usage from kprobe boot time events
- Reject NULL location for kprobes
- Fix crash caused by preempt delay module not cleaning up kthread
correctly
- Add vmalloc_sync_mappings() to prevent x86_64 page faults from
recursively faulting from tracing page faults
- Fix comment in gpu/trace kerneldoc header
- Fix documentation of how to create a trace event class
- Make the local tracing_snapshot_instance_cond() function static
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXrRUBhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qveTAP4iNCnMeS/Isb+MXQx2Pnu7OP+0BeRP
2ahlKG2sBgEdnwEAoUzxQoYWtfC6xoM38YwLuZPRlcScRea/5CRHyW8BFQc=
=o3pV
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM
enabled
- Fix allocation leaks in bootconfig tool
- Fix a double initialization of a variable
- Fix API bootconfig usage from kprobe boot time events
- Reject NULL location for kprobes
- Fix crash caused by preempt delay module not cleaning up kthread
correctly
- Add vmalloc_sync_mappings() to prevent x86_64 page faults from
recursively faulting from tracing page faults
- Fix comment in gpu/trace kerneldoc header
- Fix documentation of how to create a trace event class
- Make the local tracing_snapshot_instance_cond() function static
* tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tools/bootconfig: Fix resource leak in apply_xbc()
tracing: Make tracing_snapshot_instance_cond() static
tracing: Fix doc mistakes in trace sample
gpu/trace: Minor comment updates for gpu_mem_total tracepoint
tracing: Add a vmalloc_sync_mappings() for safe measure
tracing: Wait for preempt irq delay thread to finish
tracing/kprobes: Reject new event if loc is NULL
tracing/boottime: Fix kprobe event API usage
tracing/kprobes: Fix a double initialization typo
bootconfig: Fix to remove bootconfig data from initrd while boot
This Kselftest update for Linux 5.7-rc5 consists of ftrace test fixes
and fix to kvm Makefile for relocatable native/cross builds and installs.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl60SCYACgkQCwJExA0N
QxzlTQ/+LWXBGbHJPPOT2UPjoa+f5JQgYePxTVbnnG+nG4SQcbHMBGMORwbz7S8e
o/dc2H+jhfYNnTc2L/JoruJVT7E2CwehBKTZzqCu6AbBWdPNgQCmxFIh5Gos8JMb
cj9J/7CEtnOAQF7IArGYlyd9gblRc8ACwvgVLRJbKcNFHV0EBRV2Kg5JOE9RaKPK
xIA2GNb30cA3P4IFykc199zrOsfwf2Dr0Bf1wf3HkO1UnhrFIDsT7PRNBLGWWB+n
c9Mi9fFmLEJr3yDL6B87Mt7+sUGRb7Jvvz6h8gIhJaE8IpbkD9nq0PhuuQNawFy4
KS/FYFtRShKU/m+g+Nudg9PJR+vzpL/wR6kp/5Z1W5Us9ba200GZBtFT0x+NUP3n
fNczO61iwAi4qesfCpKgDfGco754mw1+hUZAdRJkYY6tyZmjrPVpVKgOR8kN4qjP
anUl1trQsDM5kaScG/6RmBhoAtaU91Mdhcw1/8YLBpoQUQEMxWtwlXdvKkJ3zjfE
Gs0jBoThuCHAj3IMZmL3dPaa88IA904aelTW6OLnq8BmHEW2938Y/T1L47bHO5C+
WZnFVzNgEqFTLHqX/MYzmoJ1y+A09fA3Owi6OmsvJ5ZY+LjEiKGXuBHj+VhbwJ4G
COphD9CcIVx0gQ3Zct8feajloQcGu5v4/El3/3kPQkuZjzhxRq0=
=jUcW
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"ftrace test fixes and a fix to kvm Makefile for relocatable
native/cross builds and installs"
* tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: fix kvm relocatable native/cross builds and installs
selftests/ftrace: Make XFAIL green color
ftrace/selftest: make unresolved cases cause failure if --fail-unresolved set
ftrace/selftests: workaround cgroup RT scheduling issues
We currently make some guesses as when to open this fd, but in reality
we have no business (or need) to do so at all. In fact, it makes certain
things fail, like O_PATH.
Remove the fd lookup from these opcodes, we're just passing the 'fd' to
generic helpers anyway. With that, we can also remove the special casing
of fd values in io_req_needs_file(), and the 'fd_non_neg' check that
we have. And we can ensure that we only read sqe->fd once.
This fixes O_PATH usage with openat/openat2, and ditto statx path side
oddities.
Cc: stable@vger.kernel.org: # v5.6
Reported-by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The rawmidi core allows user to resize the runtime buffer via ioctl,
and this may lead to UAF when performed during concurrent reads or
writes: the read/write functions unlock the runtime lock temporarily
during copying form/to user-space, and that's the race window.
This patch fixes the hole by introducing a reference counter for the
runtime buffer read/write access and returns -EBUSY error when the
resize is performed concurrently against read/write.
Note that the ref count field is a simple integer instead of
refcount_t here, since the all contexts accessing the buffer is
basically protected with a spinlock, hence we need no expensive atomic
ops. Also, note that this busy check is needed only against read /
write functions, and not in receive/transmit callbacks; the race can
happen only at the spinlock hole mentioned in the above, while the
whole function is protected for receive / transmit callbacks.
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CAFcO6XMWpUVK_yzzCpp8_XP7+=oUpQvuBeCbMffEDkpe8jWrfg@mail.gmail.com
Link: https://lore.kernel.org/r/s5heerw3r5z.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Remove duplicate headers which are included twice.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First set of patches for v5.8. Changes all over, ath10k apparently
seeing most new features this time. rtw88 also had lots of changes due
to preparation for new hardware support.
In this pull request there's also a new macro to include/linux/iopoll:
read_poll_timeout_atomic(). This is needed by rtw88 for atomic
polling.
Major changes:
ath11k
* add debugfs file for testing ADDBA and DELBA
* add 802.11 encapsulation offload on hardware support
* add htt_peer_stats_reset debugfs file
ath10k
* enable VHT160 and VHT80+80 modes
* enable radar detection in secondary segment
* sdio: disable TX complete indication to improve throughput
* sdio: decrease power consumption
* sdio: add HTT TX bundle support to increase throughput
* sdio: add rx bitrate reporting
ath9k
* improvements to AR9002 calibration logic
carl9170
* remove buggy P2P_GO support
p54usb
* add support for AirVasT USB stick
rtw88
* add support for antenna configuration
ti wlcore
* add support for AES_CMAC cipher
iwlwifi
* support for a few new FW API versions
* new hw configs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJetAhAAAoJEG4XJFUm622bADEH/A1OjAD3H1iZyTmXHP4T7yZe
TKJ+9I6B3BDR1czUTm+kUhrgBDNpdLLtu+b+5QXfpPLrtZ0FF/zjuazgueyqQpZ1
zudj+rG72njHpU0RKtO7wIBrCtckLPV0be+3026hztatJmJ7XQ9FvsanFPPsrrNv
0lh8E8kDUSynOW2me8FW1GBgDkGaBaicAs4FSjwNJC31Wo/VN5m9gEFkGpT1VJWP
l0xeEQ/N2mknQVuTR4CuMT9VJ0SNlJrLZpBVAqkmc170c3pKChl3LTNCnP925ye9
Nfqw2sQKgUPJKRbZR5wZTphGuu4krFv0ldWCvb0oFtZlCLIkiOz6+AA7b33oV2A=
=7ewK
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-2020-05-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.8
First set of patches for v5.8. Changes all over, ath10k apparently
seeing most new features this time. rtw88 also had lots of changes due
to preparation for new hardware support.
In this pull request there's also a new macro to include/linux/iopoll:
read_poll_timeout_atomic(). This is needed by rtw88 for atomic
polling.
Major changes:
ath11k
* add debugfs file for testing ADDBA and DELBA
* add 802.11 encapsulation offload on hardware support
* add htt_peer_stats_reset debugfs file
ath10k
* enable VHT160 and VHT80+80 modes
* enable radar detection in secondary segment
* sdio: disable TX complete indication to improve throughput
* sdio: decrease power consumption
* sdio: add HTT TX bundle support to increase throughput
* sdio: add rx bitrate reporting
ath9k
* improvements to AR9002 calibration logic
carl9170
* remove buggy P2P_GO support
p54usb
* add support for AirVasT USB stick
rtw88
* add support for antenna configuration
ti wlcore
* add support for AES_CMAC cipher
iwlwifi
* support for a few new FW API versions
* new hw configs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manivannan Sadhasivam says:
====================
Add QRTR MHI client driver
Here is the series adding MHI client driver support to Qualcomm IPC router
protocol. MHI is a newly added bus to kernel which is used to communicate to
external modems over a physical interface like PCI-E. This driver is used to
transfer the QMI messages between the host processor and external modems over
the "IPCR" channel.
For QRTR, this driver is just another driver acting as a transport layer like
SMD.
Currently this driver is needed to control the QCA6390 WLAN device from ath11k.
The ath11k MHI controller driver will take care of booting up QCA6390 and
bringing it to operating state. Later, this driver will be used to transfer QMI
messages over the MHI-IPCR channel.
The second patch of this series removes the ARCH_QCOM dependency for QRTR. This
is needed because the QRTR driver will be used with x86 machines as well to talk
to devices like QCA6390.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
IPC Router protocol is also used by external modems for exchanging the QMI
messages. Hence, it doesn't always depend on Qualcomm platforms. One such
instance is the QCA6390 WLAN device connected to x86 machine.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
MHI is the transport layer used for communicating to the external modems.
Hence, this commit adds MHI transport layer support to QRTR for
transferring the QMI messages over IPC Router.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The HNS config symbol enables the framework support for the Hisilicon
Network Subsystem. It is already selected by all of its users, so there
is no reason to make it visible.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VIA Rhine Ethernet interface is only present on PCI devices or
VIA/WonderMedia VT8500/WM85xx SoCs. Add platform dependencies to the
VIA_RHINE config symbol, to avoid asking the user about it when
configuring a kernel without PCI or VT8500/WM85xx support.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
'Dan Carpenter' reported:
This code frees "sfi" and then dereferences it on the next line:
> kfree(sfi);
> clear_bit(sfi->index, epsfp.psfp_sfi_bitmap);
This "sfi->index" should be "index".
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function always return 0 now, we can make it return void to
simplify the code. This fixes the following coccicheck warning:
drivers/net/ethernet/microchip/encx24j600.c:609:5-8: Unneeded variable:
"ret". Return "0" on line 653
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>