linux


Vladimir Oltean db46e3a88a net/sched: taprio: avoid disabling offload when it was never enabled

In an incredibly strange API design decision, qdisc->destroy() gets called even if qdisc->init() never succeeded, not exclusively since commit `87b60cfacf` ("net_sched: fix error recovery at qdisc creation"), but apparently also earlier (in the case of qdisc_create_dflt()).

The taprio qdisc does not fully acknowledge this when it attempts full offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS parsed from netlink (in taprio_change(), tail called from taprio_init()).

But in taprio_destroy(), we call taprio_disable_offload(), and this determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).

But looking at the implementation of FULL_OFFLOAD_IS_ENABLED() (a bitwise check of bit 1 in q->flags), it is invalid to call this macro on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on an invalid set of flags.

As a result, it is possible to crash the kernel if user space forces an error between setting q->flags = TAPRIO_FLAGS_INVALID and the call to taprio_enable_offload(). This is because drivers do not expect the offload to be disabled when it was never enabled.

The error that we force here is to attach taprio not as the root qdisc, but as a child of an mqprio root qdisc:

    $ tc qdisc add dev swp0 root handle 1: \
        mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
        queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
    $ tc qdisc replace dev swp0 parent 1:1 \
        taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
        queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
        sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
        flags 0x0 clockid CLOCK_TAI
    Unable to handle kernel paging request at virtual address fffffffffffffff8
    [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    Call trace:
     taprio_dump+0x27c/0x310
     vsc9959_port_setup_tc+0x1f4/0x460
     felix_port_setup_tc+0x24/0x3c
     dsa_slave_setup_tc+0x54/0x27c
     taprio_disable_offload.isra.0+0x58/0xe0
     taprio_destroy+0x80/0x104
     qdisc_create+0x240/0x470
     tc_modify_qdisc+0x1fc/0x6b0
     rtnetlink_rcv_msg+0x12c/0x390
     netlink_rcv_skb+0x5c/0x130
     rtnetlink_rcv+0x1c/0x2c

Fix this by keeping track of the operations we made, and undo the offload only if we actually did it.

I've added "bool offloaded" inside a 4 byte hole between "int clockid" and "atomic64_t picos_per_byte". Now the first cache line looks like below:

    $ pahole -C taprio_sched net/sched/sch_taprio.o
    struct taprio_sched {
            struct Qdisc * *           qdiscs;               /*     0     8 */
            struct Qdisc *             root;                 /*     8     8 */
            u32                        flags;                /*    16     4 */
            enum tk_offsets            tk_offset;            /*    20     4 */
            int                        clockid;              /*    24     4 */
            bool                       offloaded;            /*    28     1 */

            /* XXX 3 bytes hole, try to pack */

            atomic64_t                 picos_per_byte;       /*    32     0 */

            /* XXX 8 bytes hole, try to pack */

            spinlock_t                 current_entry_lock;   /*    40     0 */

            /* XXX 8 bytes hole, try to pack */

            struct sched_entry *       current_entry;        /*    48     8 */
            struct sched_gate_list *   oper_sched;           /*    56     8 */
            /* --- cacheline 1 boundary (64 bytes) --- */

Fixes: `9c66d15646` ("taprio: Add support for hardware offloading")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2022-09-20 11:41:14 -07:00
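
The failure mode described in the commit message can be modelled in user space. The following is a minimal sketch, not the actual patch: `struct sched_model`, the `model_*` functions and `failed_init_disable_calls()` are hypothetical names introduced here, and the macros are stand-ins whose values follow the commit text (a check of bit 1, and an invalid marker equal to U32_MAX). It shows why a flag check on the invalid marker reports offload as enabled, and how a separate "offloaded" boolean avoids undoing an operation that never happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel definitions; values taken from the commit text. */
#define FLAG_FULL_OFFLOAD              (UINT32_C(1) << 1)
#define FLAGS_INVALID                  UINT32_MAX
#define FULL_OFFLOAD_IS_ENABLED(flags) (((flags) & FLAG_FULL_OFFLOAD) != 0)

/* Hypothetical, reduced model of the qdisc private data. */
struct sched_model {
	uint32_t flags;
	bool offloaded;            /* set only once offload was really enabled */
	int driver_disable_calls;  /* counts "disable offload" calls to the driver */
};

static void model_init(struct sched_model *q)
{
	assert(q != NULL);
	q->flags = FLAGS_INVALID;  /* same starting point as in the commit text */
	q->offloaded = false;
	q->driver_disable_calls = 0;
}

/* Old behaviour: decide from the flags, which may still be the invalid marker. */
static void model_destroy_buggy(struct sched_model *q)
{
	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
		q->driver_disable_calls++;
}

/* Sketch of the fix: undo the offload only if it was actually done. */
static void model_destroy_fixed(struct sched_model *q)
{
	if (!q->offloaded)
		return;
	q->driver_disable_calls++;
	q->offloaded = false;
}

/*
 * Simulate an init that fails before the flags are parsed and before
 * offload is enabled, followed by the unconditional destroy call.
 * Returns how many times the driver was asked to disable offload.
 */
int failed_init_disable_calls(bool use_fix)
{
	struct sched_model q;

	model_init(&q);
	/* ... init fails here; flags are still FLAGS_INVALID ... */
	if (use_fix)
		model_destroy_fixed(&q);
	else
		model_destroy_buggy(&q);
	return q.driver_disable_calls;
}
```

Calling `failed_init_disable_calls(false)` reaches the driver once even though offload was never enabled, while `failed_init_disable_calls(true)` never reaches it. Tracking the performed operation rather than re-deriving it from the flags keeps the destroy path correct regardless of how far init progressed.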
..
act_api.c	net/sched: act_api: Notify user space if any actions were flushed before error	2022-06-27 21:51:23 -07:00
act_bpf.c	bpf: Keep the (rcv) timestamp behavior for the existing tc-bpf@ingress	2022-03-03 14:38:48 +00:00
act_connmark.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_csum.c	net/sched: act_api: Add extack to offload_act_setup() callback	2022-04-08 13:45:43 +01:00
act_ct.c	net/sched: act_ct: set 'net' pointer when creating new nf_flow_table	2022-07-11 16:25:14 +02:00
act_ctinfo.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_gact.c	net/sched: act_gact: Add extack messages for offload failure	2022-04-08 13:45:43 +01:00
act_gate.c	net/sched: act_api: Add extack to offload_act_setup() callback	2022-04-08 13:45:43 +01:00
act_ife.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_ipt.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_meta_mark.c
act_meta_skbprio.c
act_meta_skbtcindex.c
act_mirred.c	net: rename reference+tracking helpers	2022-06-09 21:52:55 -07:00
act_mpls.c	net/sched: act_mpls: Add extack messages for offload failure	2022-04-08 13:45:43 +01:00
act_nat.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_pedit.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2022-05-19 11:23:59 -07:00
act_police.c	net/sched: act_police: allow 'continue' action offload	2022-07-06 12:44:39 +01:00
act_sample.c	net/sched: act_api: Add extack to offload_act_setup() callback	2022-04-08 13:45:43 +01:00
act_simple.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_skbedit.c	net: sched: support hash selecting tx queue	2022-04-19 12:20:45 +02:00
act_skbmod.c	flow_offload: fill flags to action structure	2021-12-19 14:08:47 +00:00
act_tunnel_key.c	net/sched: act_tunnel_key: Add extack message for offload failure	2022-04-08 13:45:43 +01:00
act_vlan.c	net/sched: act_vlan: Add extack message for offload failure	2022-04-08 13:45:43 +01:00
cls_api.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2022-07-21 13:03:39 -07:00
cls_basic.c	net_sched: refactor TC action init API	2021-08-02 10:24:38 +01:00
cls_bpf.c	bpf: Keep the (rcv) timestamp behavior for the existing tc-bpf@ingress	2022-03-03 14:38:48 +00:00
cls_cgroup.c	net_sched: refactor TC action init API	2021-08-02 10:24:38 +01:00
cls_flow.c	net_sched: refactor TC action init API	2021-08-02 10:24:38 +01:00
cls_flower.c	net/sched: flower: Add PPPoE filter	2022-07-26 10:20:29 -07:00
cls_fw.c	net_sched: refactor TC action init API	2021-08-02 10:24:38 +01:00
cls_matchall.c	net/sched: matchall: Avoid overwriting error messages	2022-04-08 13:45:43 +01:00
cls_route.c	net_sched: cls_route: disallow handle of 0	2022-08-15 11:46:30 +01:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h	net_sched: refactor TC action init API	2021-08-02 10:24:38 +01:00
cls_tcindex.c	net_sched: refactor TC action init API	2021-08-02 10:24:38 +01:00
cls_u32.c	net/sched: cls_u32: fix possible leak in u32_init_knode()	2022-04-15 14:26:11 -07:00
em_canid.c	net: sched: kerneldoc fixes	2020-07-13 17:20:40 -07:00
em_cmp.c	net: sched: fix misspellings using misspell-fixer tool	2020-11-10 17:00:28 -08:00
em_ipset.c	sched: consistently handle layer3 header accesses in the presence of VLANs	2020-07-03 14:34:53 -07:00
em_ipt.c	sched: consistently handle layer3 header accesses in the presence of VLANs	2020-07-03 14:34:53 -07:00
em_meta.c	net_sched: em_meta: add READ_ONCE() in var_sk_bound_if()	2022-05-16 10:31:06 +01:00
em_nbyte.c	net: sched: Return the correct errno code	2021-02-06 11:15:28 -08:00
em_text.c
em_u32.c
ematch.c	net: sched: Fix spelling mistakes	2021-05-31 22:44:56 -07:00
Kconfig	net: sched: incorrect Kconfig dependencies on Netfilter modules	2020-12-09 15:49:29 -08:00
Makefile	net/sched: sch_frag: add generic packet fragment support.	2020-11-27 14:36:02 -08:00
sch_api.c	net: rename reference+tracking helpers	2022-06-09 21:52:55 -07:00
sch_atm.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_blackhole.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_cake.c	Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"	2022-08-31 20:02:28 -07:00
sch_cbq.c	net/sched: sch_cbq: change the type of cbq_set_lss to void	2022-07-27 18:30:18 -07:00
sch_cbs.c	net: don't include ethtool.h from netdevice.h	2020-11-23 17:27:04 -08:00
sch_choke.c	net: sched: validate stab values	2021-03-10 15:47:52 -08:00
sch_codel.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_drr.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_dsmark.c	net/sched: store the last executed chain also for clsact egress	2021-07-29 22:17:37 +01:00
sch_etf.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_ets.c	net/sched: sch_ets: don't remove idle classes from the round-robin list	2021-12-13 12:30:23 +00:00
sch_fifo.c	net_sched: fix NULL deref in fifo_set_limit()	2021-10-01 14:59:10 -07:00
sch_fq_codel.c	fq_codel: generalise ce_threshold marking for subset of traffic	2021-10-20 15:24:36 -07:00
sch_fq_pie.c	net/sched: fq_pie: prevent dismantle issue	2021-12-09 08:01:00 -08:00
sch_fq.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_frag.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2021-12-31 14:35:40 +00:00
sch_generic.c	net/sched: fix netdevice reference leaks in attach_default_qdiscs()	2022-08-30 15:10:08 +02:00
sch_gred.c	net: sched: gred: dynamically allocate tc_gred_qopt_offload	2021-10-27 12:06:52 -07:00
sch_hfsc.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_hhf.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_htb.c	sch_htb: Fail on unsupported parameters when offload is requested	2022-01-25 20:00:02 -08:00
sch_ingress.c
sch_mq.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_mqprio.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_multiq.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_netem.c	net/sched: sch_netem: Fix arithmetic in netem_dump() for 32-bit platforms	2022-06-17 20:29:38 -07:00
sch_pie.c	net: sched: fix misspellings using misspell-fixer tool	2020-11-10 17:00:28 -08:00
sch_plug.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_prio.c	net: sched: Remove Qdisc::running sequence counter	2021-10-18 12:54:41 +01:00
sch_qfq.c	sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc	2022-01-04 12:36:51 +00:00
sch_red.c	net: sched: validate stab values	2021-03-10 15:47:52 -08:00
sch_sfb.c	sch_sfb: Also store skb len before calling child enqueue	2022-09-08 11:12:58 +02:00
sch_sfq.c	net/sched: store the last executed chain also for clsact egress	2021-07-29 22:17:37 +01:00
sch_skbprio.c	Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"	2020-07-16 16:48:34 -07:00
sch_taprio.c	net/sched: taprio: avoid disabling offload when it was never enabled	2022-09-20 11:41:14 -07:00
sch_tbf.c	net: sched: tbf: don't call qdisc_put() while holding tree lock	2022-08-30 11:41:24 +02:00
sch_teql.c	net: sched: sch_teql: fix null-pointer dereference	2021-04-08 14:14:42 -07:00