Its the same as nf_conntrack_put(), but without the
need for an indirect call. The downside is a module dependency on
nf_conntrack, but all of these already depend on conntrack anyway.
Cc: Paul Blakey <paulb@mellanox.com>
Cc: dev@openvswitch.org
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_ct_put() results in a usesless indirection:
nf_ct_put -> nf_conntrack_put -> nf_conntrack_destroy -> rcu readlock +
indirect call of ct_hooks->destroy().
There are two _put helpers:
nf_ct_put and nf_conntrack_put. The latter is what should be used in
code that MUST NOT cause a linker dependency on the conntrack module
(e.g. calls from core network stack).
Everyone else should call nf_ct_put() instead.
A followup patch will convert a few nf_conntrack_put() calls to
nf_ct_put(), in particular from modules that already have a conntrack
dependency such as act_ct or even nf_conntrack itself.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
No functional changes, these structures should be const.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ip_ct_attach predates struct nf_ct_hook, we can place it there and
remove the exported symbol.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Convert nf_conn reference counting from atomic_t to refcount_t based api.
refcount_t api provides more runtime sanity checks and will warn on
certain constructs, e.g. refcount_inc() on a zero reference count, which
usually indicates use-after-free.
For this reason template allocation is changed to init the refcount to
1, the subsequenct add operations are removed.
Likewise, init_conntrack() is changed to set the initial refcount to 1
instead refcount_inc().
This is safe because the new entry is not (yet) visible to other cpus.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The pointer lt is being assigned a value and then later
updated but that value is never read. The pointer is
redundant and can be removed.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If destination port is above 32k and source port below 16k
assume this might cause 'port shadowing' where a 'new' inbound
connection matches an existing one, e.g.
inbound X:41234 -> Y:53 matches existing conntrack entry
Z:53 -> X:4123, where Z got natted to X.
In this case, new packet is natted to Z:53 which is likely
unwanted.
We avoid the rewrite for connections that originate from local host:
port-shadowing is only possible with forwarded connections.
Also adjust test case.
v3: no need to call tuple_force_port_remap if already in random mode (Phil)
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Eric Garver <eric@garver.life>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This allows to identify flows that originate from local machine
in a followup patch.
It would be possible to make this a ->status bit instead.
For now I did not do that yet because I don't have a use-case for
exposing this info to userspace.
If one comes up the toggle can be replaced with a status bit.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Chain stats are updated from the Netfilter hook path which already run
under rcu read-size lock section.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If compiled with CONFIG_NET_NS_REFCNT_TRACKER=y,
using put_net_track() in iterate_cleanup_work()
and netns_tracker_alloc() in nf_nat_masq_schedule()
might help us finding netns refcount imbalances.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If compiled with CONFIG_NET_NS_REFCNT_TRACKER=y,
using put_net_track() in nfulnl_instance_free_rcu()
and get_net_track() in instance_create()
might help us finding netns refcount imbalances.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next, mostly
rather small housekeeping patches:
1) Remove unused variable in IPVS, from GuoYong Zheng.
2) Use memset_after in conntrack, from Kees Cook.
3) Remove leftover function in nfnetlink_queue, from Florian Westphal.
4) Remove redundant test on bool in conntrack, from Bernard Zhao.
5) egress support for nft_fwd, from Lukas Wunner.
6) Make pppoe work for br_netfilter, from Florian Westphal.
7) Remove unused variable in conntrack resize routine, from luo penghao.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next:
netfilter: conntrack: Remove useless assignment statements
netfilter: bridge: add support for pppoe filtering
netfilter: nft_fwd_netdev: Support egress hook
netfilter: ctnetlink: remove useless type conversion to bool
netfilter: nf_queue: remove leftover synchronize_rcu
netfilter: conntrack: Use memset_startat() to zero struct nf_conn
ipvs: remove unused variable for ip_vs_new_dest
====================
Link: https://lore.kernel.org/r/20211215234911.170741-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The old_size assignment here will not be used anymore
The clang_analyzer complains as follows:
Value stored to 'old_size' is never read
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The sixth byte of packet data has to be looked up in the sixth group,
not in the seventh one, even if we load the bucket data into ymm6
(and not ymm5, for convenience of tracking stalls).
Without this fix, matching on a MAC address as first field of a set,
if 8-bit groups are selected (due to a small set size) would fail,
that is, the given MAC address would never match.
Reported-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
Cc: <stable@vger.kernel.org> # 5.6.x
Fixes: 7400b06396 ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-By: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Allow packet redirection to another interface upon egress.
[lukas: set skb_iif, add commit message, original patch from Pablo. ]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
net/netfilter/nfnetlink_queue.c:601:36: warning: variable 'ctinfo' is
uninitialized when used here [-Wuninitialized]
if (ct && nfnl_ct->build(skb, ct, ctinfo, NFQA_CT, NFQA_CT_INFO) < 0)
ctinfo is only uninitialized if ct == NULL. Init it to 0 to silence this.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
dying is bool, the type conversion to true/false value is not
needed.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Its no longer needed after commit 8702997074
("netfilter: nf_queue: move hookfn registration out of struct net").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Use memset_startat() to avoid confusing memset() about writing beyond
the target struct member.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The dest variable is not used after ip_vs_new_dest anymore in
ip_vs_add_dest, do not need pass it to ip_vs_new_dest, remove it.
Signed-off-by: GuoYong Zheng <zhenggy@chinatelecom.cn>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Add selftest for vrf+conntrack, from Florian Westphal.
2) Extend nfqueue selftest to cover nfqueue, also from Florian.
3) Remove duplicated include in nft_payload, from Wan Jiabing.
4) Several improvements to the nat port shadowing selftest,
from Phil Sutter.
5) Fix filtering of reply tuple in ctnetlink, from Florent Fourcot.
6) Do not override error with -EINVAL in filter setup path, also
from Florent.
7) Honor sysctl_expire_nodest_conn regardless conn_reuse_mode for
reused connections, from yangxingwu.
8) Replace snprintf() by sysfs_emit() in xt_IDLETIMER as reported
by Coccinelle, from Jing Yao.
9) Incorrect IPv6 tunnel match in flowtable offload, from Will
Mortensen.
10) Switch port shadow selftest to use socat, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
siphash keys use 16 bytes.
Define siphash_aligned_key_t macro so that we can make sure they
are not crossing a cache line boundary.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, can and netfilter.
Current release - regressions:
- bpf: do not reject when the stack read size is different from the
tracked scalar size
- net: fix premature exit from NAPI state polling in napi_disable()
- riscv, bpf: fix RV32 broken build, and silence RV64 warning
Current release - new code bugs:
- net: fix possible NULL deref in sock_reserve_memory
- amt: fix error return code in amt_init(); fix stopping the
workqueue
- ax88796c: use the correct ioctl callback
Previous releases - always broken:
- bpf: stop caching subprog index in the bpf_pseudo_func insn
- security: fixups for the security hooks in sctp
- nfc: add necessary privilege flags in netlink layer, limit
operations to admin only
- vsock: prevent unnecessary refcnt inc for non-blocking connect
- net/smc: fix sk_refcnt underflow on link down and fallback
- nfnetlink_queue: fix OOB when mac header was cleared
- can: j1939: ignore invalid messages per standard
- bpf, sockmap:
- fix race in ingress receive verdict with redirect to self
- fix incorrect sk_skb data_end access when src_reg = dst_reg
- strparser, and tls are reusing qdisc_skb_cb and colliding
- ethtool: fix ethtool msg len calculation for pause stats
- vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to
access an unregistering real_dev
- udp6: make encap_rcv() bump the v6 not v4 stats
- drv: prestera: add explicit padding to fix m68k build
- drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
- drv: mvpp2: fix wrong SerDes reconfiguration order
Misc & small latecomers:
- ipvs: auto-load ipvs on genl access
- mctp: sanity check the struct sockaddr_mctp padding fields
- libfs: support RENAME_EXCHANGE in simple_rename()
- avoid double accounting for pure zerocopy skbs"
* tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits)
selftests/net: udpgso_bench_rx: fix port argument
net: wwan: iosm: fix compilation warning
cxgb4: fix eeprom len when diagnostics not implemented
net: fix premature exit from NAPI state polling in napi_disable()
net/smc: fix sk_refcnt underflow on linkdown and fallback
net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer()
gve: fix unmatched u64_stats_update_end()
net: ethernet: lantiq_etop: Fix compilation error
selftests: forwarding: Fix packet matching in mirroring selftests
vsock: prevent unnecessary refcnt inc for nonblocking connect
net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
net: stmmac: allow a tc-taprio base-time of zero
selftests: net: test_vxlan_under_vrf: fix HV connectivity test
net: hns3: allow configure ETS bandwidth of all TCs
net: hns3: remove check VF uc mac exist when set by PF
net: hns3: fix some mac statistics is always 0 in device version V2
net: hns3: fix kernel crash when unload VF while it is being reset
net: hns3: sync rx ring head in echo common pull
net: hns3: fix pfc packet number incorrect after querying pfc parameters
...
Previously the IPv6 addresses in the key were clobbered and the mask was
left unset.
I haven't tested this; I noticed it while skimming the code to
understand an unrelated issue.
Fixes: cfab6dbd0e ("netfilter: flowtable: add tunnel match offload support")
Cc: wenxu <wenxu@ucloud.cn>
Signed-off-by: Will Mortensen <willmo@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
coccicheck complains about the use of snprintf() in sysfs show
functions:
WARNING use scnprintf or sprintf
Use sysfs_emit instead of scnprintf, snprintf or sprintf makes more
sense.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jing Yao <yao.jing2@zte.com.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We are changing expire_nodest_conn to work even for reused connections when
conn_reuse_mode=0, just as what was done with commit dc7b3eb900 ("ipvs:
Fix reuse connection if real server is dead").
For controlled and persistent connections, the new connection will get the
needed real server depending on the rules in ip_vs_check_template().
Fixes: d752c36457 ("ipvs: allow rescheduling of new connections when port reuse is detected")
Co-developed-by: Chuanqi Liu <legend050709@qq.com>
Signed-off-by: Chuanqi Liu <legend050709@qq.com>
Signed-off-by: yangxingwu <xingwu.yang@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
And be consistent in error management for both orig/reply filtering
Fixes: cb8aa9a3af ("netfilter: ctnetlink: add kernel side filtering for dump")
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
filter->orig_flags was used for a reply context.
Fixes: cb8aa9a3af ("netfilter: ctnetlink: add kernel side filtering for dump")
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fix following checkincludes.pl warning:
./net/netfilter/nft_payload.c: linux/ip.h is included more than once.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Merge misc updates from Andrew Morton:
"257 patches.
Subsystems affected by this patch series: scripts, ocfs2, vfs, and
mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
cleanups, kfence, and damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
mm/damon: remove return value from before_terminate callback
mm/damon: fix a few spelling mistakes in comments and a pr_debug message
mm/damon: simplify stop mechanism
Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
Docs/admin-guide/mm/damon/start: simplify the content
Docs/admin-guide/mm/damon/start: fix a wrong link
Docs/admin-guide/mm/damon/start: fix wrong example commands
mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
mm/damon: remove unnecessary variable initialization
Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
selftests/damon: support watermarks
mm/damon/dbgfs: support watermarks
mm/damon/schemes: activate schemes based on a watermarks mechanism
tools/selftests/damon: update for regions prioritization of schemes
mm/damon/dbgfs: support prioritization weights
mm/damon/vaddr,paddr: support pageout prioritization
mm/damon/schemes: prioritize regions within the quotas
mm/damon/selftests: support schemes quotas
mm/damon/dbgfs: support quotas of schemes
...
nr_free_buffer_pages could be exposed through mm.h instead of swap.h.
The advantage of this change is that it can reduce the obsolete
includes. For example, net/ipv4/tcp.c wouldn't need swap.h any more
since it has already included mm.h. Similarly, after checking all the
other files, it comes that tcp.c, udp.c meter.c ,... follow the same
rule, so these files can have swap.h removed too.
Moreover, after preprocessing all the files that use
nr_free_buffer_pages, it turns out that those files have already
included mm.h.Thus, we can move nr_free_buffer_pages from swap.h to mm.h
safely. This change will not affect the compilation of other files.
Link: https://lkml.kernel.org/r/20210912133640.1624-1-liumh1@shanghaitech.edu.cn
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
CC: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "David S . Miller" <davem@davemloft.net>
Cc: Simon Horman <horms@verge.net.au>
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
1) Fix mac address UAF reported by KASAN in nfnetlink_queue,
from Florian Westphal.
2) Autoload genetlink IPVS on demand, from Thomas Weissschuh.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
ipvs: autoload ipvs on genl access
netfilter: nfnetlink_queue: fix OOB when mac header was cleared
====================
Link: https://lore.kernel.org/r/20211101221528.236114-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Use array_size() in ebtables, from Gustavo A. R. Silva.
2) Attach IPS_ASSURED to internal UDP stream state, reported by
Maciej Zenczykowski.
3) Add NFT_META_IFTYPE to match on the interface type either
from ingress or egress.
4) Generalize pktinfo->tprot_set to flags field.
5) Allow to match on inner headers / payload data.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow to match and mangle on inner headers / payload data after the
transport header. There is a new field in the pktinfo structure that
stores the inner header offset which is calculated only when requested.
Only TCP and UDP supported at this stage.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Generalize NFT_META_IIFTYPE to NFT_META_IFTYPE which allows you to match
on the interface type of the skb->dev field. This field is used by the
netdev family to add an implicit dependency to skip non-ethernet packets
when matching on layer 3 and 4 TCP/IP header fields.
For backward compatibility, add the NFT_META_IIFTYPE alias to
NFT_META_IFTYPE.
Add __NFT_META_IIFTYPE, to be used by userspace in the future to match
specifically on the iiftype.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The internal stream state sets the timeout to 120 seconds 2 seconds
after the creation of the flow, attach this internal stream state to the
IPS_ASSURED flag for consistent event reporting.
Before this patch:
[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
Note IPS_ASSURED for the flow not yet in the internal stream state.
after this update:
[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 120 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
Before this patch, short-lived UDP flows never entered IPS_ASSURED, so
they were already candidate flow to be deleted by early_drop under
stress.
Before this patch, IPS_ASSURED is set on regardless the internal stream
state, attach this internal stream state to IPS_ASSURED.
packet #1 (original direction) enters NEW state
packet #2 (reply direction) enters ESTABLISHED state, sets on IPS_SEEN_REPLY
paclet #3 (any direction) sets on IPS_ASSURED (if 2 seconds since the
creation has passed by).
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The VRF driver invokes netfilter for output+postrouting hooks so that users
can create rules that check for 'oif $vrf' rather than lower device name.
Afterwards, ip stack calls those hooks again.
This is a problem when conntrack is used with IP masquerading.
masquerading has an internal check that re-validates the output
interface to account for route changes.
This check will trigger in the vrf case.
If the -j MASQUERADE rule matched on the first iteration, then round 2
finds state->out->ifindex != nat->masq_index: the latter is the vrf
index, but out->ifindex is the lower device.
The packet gets dropped and the conntrack entry is invalidated.
This change makes conntrack postrouting skip the nat hooks.
Also skip confirmation. This allows the second round
(postrouting invocation from ipv4/ipv6) to create nat bindings.
This also prevents the second round from seeing packets that had their
source address changed by the nat hook.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel provides the functionality to automatically load modules
providing genl families. Use this to remove the need for users to
manually load the module.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
On 64bit platforms the MAC header is set to 0xffff on allocation and
also when a helper like skb_unset_mac_header() is called.
dev_parse_header may call skb_mac_header() which assumes valid mac offset:
BUG: KASAN: use-after-free in eth_header_parse+0x75/0x90
Read of size 6 at addr ffff8881075a5c05 by task nf-queue/1364
Call Trace:
memcpy+0x20/0x60
eth_header_parse+0x75/0x90
__nfqnl_enqueue_packet+0x1a61/0x3380
__nf_queue+0x597/0x1300
nf_queue+0xf/0x40
nf_hook_slow+0xed/0x190
nf_hook+0x184/0x440
ip_output+0x1c0/0x2a0
nf_reinject+0x26f/0x700
nfqnl_recv_verdict+0xa16/0x18b0
nfnetlink_rcv_msg+0x506/0xe70
The existing code only works if the skb has a mac header.
Fixes: 2c38de4c1f ("netfilter: fix looped (broad|multi)cast's MAC handling")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>