Commit Graph

1094 Commits

Author SHA1 Message Date
David S. Miller
8bbed40f10 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for you net-next
tree:

1) Missing NFTA_RULE_POSITION_ID netlink attribute validation,
   from Phil Sutter.

2) Restrict matching on tunnel metadata to rx/tx path, from wenxu.

3) Avoid indirect calls for IPV6=y, from Florian Westphal.

4) Add two indirections to prepare merger of IPV4 and IPV6 nat
   modules, from Florian Westphal.

5) Broken indentation in ctnetlink, from Colin Ian King.

6) Patches to use struct_size() from netfilter and IPVS,
   from Gustavo A. R. Silva.

7) Display kernel splat only once in case of racing to confirm
   conntrack from bridge plus nfqueue setups, from Chieh-Min Wang.

8) Skip checksum validation for layer 4 protocols that don't need it,
   patch from Alin Nastac.

9) Sparse warning due to symbol that should be static in CLUSTERIP,
   from Wei Yongjun.

10) Add new toggle to disable SDP payload translation when media
    endpoint is reachable though the same interface as the signalling
    peer, from Alin Nastac.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-18 11:38:30 -08:00
David S. Miller
3313da8188 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The netfilter conflicts were rather simple overlapping
changes.

However, the cls_tcindex.c stuff was a bit more complex.

On the 'net' side, Cong is fixing several races and memory
leaks.  Whilst on the 'net-next' side we have Vlad adding
the rtnl-ness support.

What I've decided to do, in order to resolve this, is revert the
conversion over to using a workqueue that Cong did, bringing us back
to pure RCU.  I did it this way because I believe that either Cong's
races don't apply with have Vlad did things, or Cong will have to
implement the race fix slightly differently.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15 12:38:38 -08:00
Alin Nastac
7fc3822536 netfilter: reject: skip csum verification for protocols that don't support it
Some protocols have other means to verify the payload integrity
(AH, ESP, SCTP) while others are incompatible with nf_ip(6)_checksum
implementation because checksum is either optional or might be
partial (UDPLITE, DCCP, GRE). Because nf_ip(6)_checksum was used
to validate the packets, ip(6)tables REJECT rules were not capable
to generate ICMP(v6) errors for the protocols mentioned above.

This commit also fixes the incorrect pseudo-header protocol used
for IPv4 packets that carry other transport protocols than TCP or
UDP (pseudo-header used protocol 0 iso the proper value).

Signed-off-by: Alin Nastac <alin.nastac@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-13 10:03:53 +01:00
Florian Westphal
8303b7e8f0 netfilter: nat: fix spurious connection timeouts
Sander Eikelenboom bisected a NAT related regression down
to the l4proto->manip_pkt indirection removal.

I forgot that ICMP(v6) errors (e.g. PKTTOOBIG) can be set as related
to the existing conntrack entry.

Therefore, when passing the skb to nf_nat_ipv4/6_manip_pkt(), that
ended up calling the wrong l4 manip function, as tuple->dst.protonum
is the original flows l4 protocol (TCP, UDP, etc).

Set the dst protocol field to ICMP(v6), we already have a private copy
of the tuple due to the inversion of src/dst.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Fixes: faec18dbb0 ("netfilter: nat: remove l4proto->manip_pkt")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-11 17:43:17 +01:00
Florian Westphal
ac02bcf9cc netfilter: ipv6: avoid indirect calls for IPV6=y case
indirect calls are only needed if ipv6 is a module.
Add helpers to abstract the v6ops indirections and use them instead.

fragment, reroute and route_input are kept as indirect calls.
The first two are not not used in hot path and route_input is only
used by bridge netfilter.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-04 18:21:12 +01:00
Florian Westphal
960587285a netfilter: nat: remove module dependency on ipv6 core
nf_nat_ipv6 calls two ipv6 core functions, so add those to v6ops to avoid
the module dependency.

This is a prerequisite for merging ipv4 and ipv6 nat implementations.

Add wrappers to avoid the indirection if ipv6 is builtin.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-04 18:20:19 +01:00
David S. Miller
343917b410 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next tree:

1) Introduce a hashtable to speed up object lookups, from Florian Westphal.

2) Make direct calls to built-in extension, also from Florian.

3) Call helper before confirming the conntrack as it used to be originally,
   from Florian.

4) Call request_module() to autoload br_netfilter when physdev is used
   to relax the dependency, also from Florian.

5) Allow to insert rules at a given position ID that is internal to the
   batch, from Phil Sutter.

6) Several patches to replace conntrack indirections by direct calls,
   and to reduce modularization, from Florian. This also includes
   several follow up patches to deal with minor fallout from this
   rework.

7) Use RCU from conntrack gre helper, from Florian.

8) GRE conntrack module becomes built-in into nf_conntrack, from Florian.

9) Replace nf_ct_invert_tuplepr() by calls to nf_ct_invert_tuple(),
   from Florian.

10) Unify sysctl handling at the core of nf_conntrack, from Florian.

11) Provide modparam to register conntrack hooks.

12) Allow to match on the interface kind string, from wenxu.

13) Remove several exported symbols, not required anymore now after
    a bit of de-modulatization work has been done, from Florian.

14) Remove built-in map support in the hash extension, this can be
    done with the existing userspace infrastructure, from laura.

15) Remove indirection to calculate checksums in IPVS, from Matteo Croce.

16) Use call wrappers for indirection in IPVS, also from Matteo.

17) Remove superfluous __percpu parameter in nft_counter, patch from
    Luc Van Oostenryck.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 17:34:38 -08:00
Peter Oskolkov
997dd96471 net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c
Currently, IPv6 defragmentation code drops non-last fragments that
are smaller than 1280 bytes: see
commit 0ed4229b08 ("ipv6: defrag: drop non-last frags smaller than min mtu")

This behavior is not specified in IPv6 RFCs and appears to break
compatibility with some IPv6 implemenations, as reported here:
https://www.spinics.net/lists/netdev/msg543846.html

This patch re-uses common IP defragmentation queueing and reassembly
code in IP6 defragmentation in nf_conntrack, removing the 1280 byte
restriction.

Signed-off-by: Peter Oskolkov <posk@google.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25 21:37:11 -08:00
Florian Westphal
303e0c5589 netfilter: conntrack: avoid unneeded nf_conntrack_l4proto lookups
after removal of the packet and invert function pointers, several
places do not need to lookup the l4proto structure anymore.

Remove those lookups.
The function nf_ct_invert_tuplepr becomes redundant, replace
it with nf_ct_invert_tuple everywhere.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-01-18 15:02:34 +01:00
David S. Miller
c3e5336925 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Support for destination MAC in ipset, from Stefano Brivio.

2) Disallow all-zeroes MAC address in ipset, also from Stefano.

3) Add IPSET_CMD_GET_BYNAME and IPSET_CMD_GET_BYINDEX commands,
   introduce protocol version number 7, from Jozsef Kadlecsik.
   A follow up patch to fix ip_set_byindex() is also included
   in this batch.

4) Honor CTA_MARK_MASK from ctnetlink, from Andreas Jaggi.

5) Statify nf_flow_table_iterate(), from Taehee Yoo.

6) Use nf_flow_table_iterate() to simplify garbage collection in
   nf_flow_table logic, also from Taehee Yoo.

7) Don't use _bh variants of call_rcu(), rcu_barrier() and
   synchronize_rcu_bh() in Netfilter, from Paul E. McKenney.

8) Remove NFC_* cache definition from the old caching
   infrastructure.

9) Remove layer 4 port rover in NAT helpers, use random port
   instead, from Florian Westphal.

10) Use strscpy() in ipset, from Qian Cai.

11) Remove NF_NAT_RANGE_PROTO_RANDOM_FULLY branch now that
    random port is allocated by default, from Xiaozhou Liu.

12) Ignore NF_NAT_RANGE_PROTO_RANDOM too, from Florian Westphal.

13) Limit port allocation selection routine in NAT to avoid
    softlockup splats when most ports are in use, from Florian.

14) Remove unused parameters in nf_ct_l4proto_unregister_sysctl()
    from Yafang Shao.

15) Direct call to nf_nat_l4proto_unique_tuple() instead of
    indirection, from Florian Westphal.

16) Several patches to remove all layer 4 NAT indirections,
    remove nf_nat_l4proto struct, from Florian Westphal.

17) Fix RTP/RTCP source port translation when SNAT is in place,
    from Alin Nastac.

18) Selective rule dump per chain, from Phil Sutter.

19) Revisit CLUSTERIP target, this includes a deadlock fix from
    netns path, sleep in atomic, remove bogus WARN_ON_ONCE()
    and disallow mismatching IP address and MAC address.
    Patchset from Taehee Yoo.

20) Update UDP timeout to stream after 2 seconds, from Florian.

21) Shrink UDP established timeout to 120 seconds like TCP timewait.

22) Sysctl knobs to set GRE timeouts, from Yafang Shao.

23) Move seq_print_acct() to conntrack core file, from Florian.

24) Add enum for conntrack sysctl knobs, also from Florian.

25) Place nf_conntrack_acct, nf_conntrack_helper, nf_conntrack_events
    and nf_conntrack_timestamp knobs in the core, from Florian Westphal.
    As a side effect, shrink netns_ct structure by removing obsolete
    sysctl anchors, also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 18:20:26 -08:00
Florian Westphal
c4b0e771f9 netfilter: avoid using skb->nf_bridge directly
This pointer is going to be removed soon, so use the existing helpers in
more places to avoid noise when the removal happens.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal
5cbabeec1e netfilter: nat: remove nf_nat_l4proto struct
This removes the (now empty) nf_nat_l4proto struct, all its instances
and all the no longer needed runtime (un)register functionality.

nf_nat_need_gre() can be axed as well: the module that calls it (to
load the no-longer-existing nat_gre module) also calls other nat core
functions. GRE nat is now always available if kernel is built with it.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:33:31 +01:00
Florian Westphal
faec18dbb0 netfilter: nat: remove l4proto->manip_pkt
This removes the last l4proto indirection, the two callers, the l3proto
packet mangling helpers for ipv4 and ipv6, now call the
nf_nat_l4proto_manip_pkt() helper.

nf_nat_proto_{dccp,tcp,sctp,gre,icmp,icmpv6} are left behind, even though
they contain no functionality anymore to not clutter this patch.

Next patch will remove the empty files and the nf_nat_l4proto
struct.

nf_nat_proto_udp.c is renamed to nf_nat_proto.c, as it now contains the
other nat manip functionality as well, not just udp and udplite.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:33:29 +01:00
Florian Westphal
76b90019e0 netfilter: nat: remove l4proto->nlattr_to_range
all protocols did set this to nf_nat_l4proto_nlattr_to_range, so
just call it directly.

The important difference is that we'll now also call it for
protocols that we don't support (i.e., nf_nat_proto_unknown did
not provide .nlattr_to_range).

However, there should be no harm, even icmp provided this callback.
If we don't implement a specific l4nat for this, nothing would make
use of this information, so adding a big switch/case construct listing
all supported l4protocols seems a bit pointless.

This change leaves a single function pointer in the l4proto struct.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:33:23 +01:00
Florian Westphal
fe2d002099 netfilter: nat: remove l4proto->in_range
With exception of icmp, all of the l4 nat protocols set this to
nf_nat_l4proto_in_range.

Get rid of this and just check the l4proto in the caller.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:33:14 +01:00
Florian Westphal
40e786bd29 netfilter: nat: fold in_range indirection into caller
No need for indirections here, we only support ipv4 and ipv6
and the called functions are very small.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:33:09 +01:00
Florian Westphal
203f2e7820 netfilter: nat: remove l4proto->unique_tuple
fold remaining users (icmp, icmpv6, gre) into nf_nat_l4proto_unique_tuple.
The static-save of old incarnation of resolved key in gre and icmp is
removed as well, just use the prandom based offset like the others.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:33:04 +01:00
Florian Westphal
912da924a2 netfilter: remove NF_NAT_RANGE_PROTO_RANDOM support
Historically this was net_random() based, and was then converted to
a hash based algorithm (private boot seed + hash of endpoint addresses)
due to concerns of leaking net_random() bits.

RANDOM_FULLY mode was added later to avoid problems with hash
based mode (see commit 34ce324019,
"netfilter: nf_nat: add full port randomization support" for details).

Just make prandom_u32() the default search starting point and get rid of
->secure_port() altogether.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-17 23:32:36 +01:00
Jiri Wiesner
ebaf39e603 ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
The *_frag_reasm() functions are susceptible to miscalculating the byte
count of packet fragments in case the truesize of a head buffer changes.
The truesize member may be changed by the call to skb_unclone(), leaving
the fragment memory limit counter unbalanced even if all fragments are
processed. This miscalculation goes unnoticed as long as the network
namespace which holds the counter is not destroyed.

Should an attempt be made to destroy a network namespace that holds an
unbalanced fragment memory limit counter the cleanup of the namespace
never finishes. The thread handling the cleanup gets stuck in
inet_frags_exit_net() waiting for the percpu counter to reach zero. The
thread is usually in running state with a stacktrace similar to:

 PID: 1073   TASK: ffff880626711440  CPU: 1   COMMAND: "kworker/u48:4"
  #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480
  #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b
  #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c
  #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856
  #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0
 #10 [ffff880621563e38] process_one_work at ffffffff81096f14

It is not possible to create new network namespaces, and processes
that call unshare() end up being stuck in uninterruptible sleep state
waiting to acquire the net_mutex.

The bug was observed in the IPv6 netfilter code by Per Sundstrom.
I thank him for his analysis of the problem. The parts of this patch
that apply to IPv4 and IPv6 fragment reassembly are preemptive measures.

Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
Reported-by: Per Sundstrom <per.sundstrom@redqube.se>
Acked-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05 20:44:46 -08:00
Taehee Yoo
095faf45e6 netfilter: nat: fix double register in masquerade modules
There is a reference counter to ensure that masquerade modules register
notifiers only once. However, the existing reference counter approach is
not safe, test commands are:

   while :
   do
   	   modprobe ip6t_MASQUERADE &
	   modprobe nft_masq_ipv6 &
	   modprobe -rv ip6t_MASQUERADE &
	   modprobe -rv nft_masq_ipv6 &
   done

numbers below represent the reference counter.
--------------------------------------------------------
CPU0        CPU1        CPU2        CPU3        CPU4
[insmod]    [insmod]    [rmmod]     [rmmod]     [insmod]
--------------------------------------------------------
0->1
register    1->2
            returns     2->1
			returns     1->0
                                                0->1
                                                register <--
                                    unregister
--------------------------------------------------------

The unregistation of CPU3 should be processed before the
registration of CPU4.

In order to fix this, use a mutex instead of reference counter.

splat looks like:
[  323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381]
[  323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n]
[  323.869574] irq event stamp: 194074
[  323.898930] hardirqs last  enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c
[  323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c
[  323.898930] softirqs last  enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b
[  323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0
[  323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27
[  323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240
[  323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488
[  323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[  323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000
[  323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0
[  323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001
[  323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000
[  323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0
[  323.898930] FS:  00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[  323.898930] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0
[  323.898930] Call Trace:
[  323.898930]  ? atomic_notifier_chain_register+0x2d0/0x2d0
[  323.898930]  ? down_read+0x150/0x150
[  323.898930]  ? sched_clock_cpu+0x126/0x170
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  323.898930]  register_netdevice_notifier+0xbb/0x790
[  323.898930]  ? __dev_close_many+0x2d0/0x2d0
[  323.898930]  ? __mutex_unlock_slowpath+0x17f/0x740
[  323.898930]  ? wait_for_completion+0x710/0x710
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  323.898930]  ? up_write+0x6c/0x210
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  324.127073]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  324.127073]  nft_chain_filter_init+0x1e/0xe8a [nf_tables]
[  324.127073]  nf_tables_module_init+0x37/0x92 [nf_tables]
[ ... ]

Fixes: 8dd33cc93e ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables")
Fixes: be6b635cd6 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-27 00:36:46 +01:00
Taehee Yoo
584eab291c netfilter: add missing error handling code for register functions
register_{netdevice/inetaddr/inet6addr}_notifier may return an error
value, this patch adds the code to handle these error paths.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-27 00:35:19 +01:00
Florian Westphal
61792b6774 netfilter: ipv6: fix oops when defragmenting locally generated fragments
Unlike ipv4 and normal ipv6 defrag, netfilter ipv6 defragmentation did
not save/restore skb->dst.

This causes oops when handling locally generated ipv6 fragments, as
output path needs a valid dst.

Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Fixes: 84379c9afe ("netfilter: ipv6: nf_defrag: drop skb dst before queueing")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-25 10:18:26 +02:00
Tan Hu
097f95d319 netfilter: masquerade: don't flush all conntracks if only one address deleted on device
We configured iptables as below, which only allowed incoming data on
established connections:

iptables -t mangle -A PREROUTING -m state --state ESTABLISHED -j ACCEPT
iptables -t mangle -P PREROUTING DROP

When deleting a secondary address, current masquerade implements would
flush all conntracks on this device. All the established connections on
primary address also be deleted, then subsequent incoming data on the
connections would be dropped wrongly because it was identified as NEW
connection.

So when an address was delete, it should only flush connections related
with the address.

Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-09-28 14:28:26 +02:00
Florian Westphal
70c0eb1ca0 netfilter: xtables: avoid BUG_ON
I see no reason for them, label or timer cannot be NULL, and if they
were, we'll crash with null deref anyway.

For skb_header_pointer failure, just set hotdrop to true and toss
such packet.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-09-17 16:11:12 +02:00
David S. Miller
aaf9253025 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-12 22:22:42 -07:00
David S. Miller
a8305bff68 net: Add and use skb_mark_not_on_list().
An SKB is not on a list if skb->next is NULL.

Codify this convention into a helper function and use it
where we are dequeueing an SKB and need to mark it as such.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10 10:06:54 -07:00
Taehee Yoo
5d407b071d ip: frags: fix crash in ip_do_fragment()
A kernel crash occurrs when defragmented packet is fragmented
in ip_do_fragment().
In defragment routine, skb_orphan() is called and
skb->ip_defrag_offset is set. but skb->sk and
skb->ip_defrag_offset are same union member. so that
frag->sk is not NULL.
Hence crash occurrs in skb->sk check routine in ip_do_fragment() when
defragmented packet is fragmented.

test commands:
   %iptables -t nat -I POSTROUTING -j MASQUERADE
   %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000

splat looks like:
[  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
[  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
[  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
[  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
[  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
[  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
[  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
[  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
[  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
[  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
[  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
[  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
[  261.198158] Call Trace:
[  261.199018]  ? dst_output+0x180/0x180
[  261.205011]  ? save_trace+0x300/0x300
[  261.209018]  ? ip_copy_metadata+0xb00/0xb00
[  261.213034]  ? sched_clock_local+0xd4/0x140
[  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
[  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
[  261.227014]  ? find_held_lock+0x39/0x1c0
[  261.233008]  ip_finish_output+0x51d/0xb50
[  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
[  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
[  261.250152]  ? rcu_is_watching+0x77/0x120
[  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
[  261.261033]  ? nf_hook_slow+0xb1/0x160
[  261.265007]  ip_output+0x1c7/0x710
[  261.269005]  ? ip_mc_output+0x13f0/0x13f0
[  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
[  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
[  261.282996]  ? nf_hook_slow+0xb1/0x160
[  261.287007]  raw_sendmsg+0x21f9/0x4420
[  261.291008]  ? dst_output+0x180/0x180
[  261.297003]  ? sched_clock_cpu+0x126/0x170
[  261.301003]  ? find_held_lock+0x39/0x1c0
[  261.306155]  ? stop_critical_timings+0x420/0x420
[  261.311004]  ? check_flags.part.36+0x450/0x450
[  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
[  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
[  261.326142]  ? cyc2ns_read_end+0x10/0x10
[  261.330139]  ? raw_bind+0x280/0x280
[  261.334138]  ? sched_clock_cpu+0x126/0x170
[  261.338995]  ? check_flags.part.36+0x450/0x450
[  261.342991]  ? __lock_acquire+0x4500/0x4500
[  261.348994]  ? inet_sendmsg+0x11c/0x500
[  261.352989]  ? dst_output+0x180/0x180
[  261.357012]  inet_sendmsg+0x11c/0x500
[ ... ]

v2:
 - clear skb->sk at reassembly routine.(Eric Dumarzet)

Fixes: fa0f527358 ("ip: use rb trees for IP frag queue.")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 14:50:56 -07:00
Florian Westphal
da786717e0 netfilter: ip6t_rpfilter: set F_IFACE for linklocal addresses
Roman reports that DHCPv6 client no longer sees replies from server
due to

ip6tables -t raw -A PREROUTING -m rpfilter --invert -j DROP

rule.  We need to set the F_IFACE flag for linklocal addresses, they
are scoped per-device.

Fixes: 47b7e7f828 ("netfilter: don't set F_IFACE on ipv6 fib lookups")
Reported-by: Roman Mamedov <rm@romanrm.net>
Tested-by: Roman Mamedov <rm@romanrm.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16 19:36:58 +02:00
Florian Westphal
0ed4229b08 ipv6: defrag: drop non-last frags smaller than min mtu
don't bother with pathological cases, they only waste cycles.
IPv6 requires a minimum MTU of 1280 so we should never see fragments
smaller than this (except last frag).

v3: don't use awkward "-offset + len"
v2: drop IPv4 part, which added same check w. IPV4_MIN_MTU (68).
    There were concerns that there could be even smaller frags
    generated by intermediate nodes, e.g. on radio networks.

Cc: Peter Oskolkov <posk@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05 17:21:14 -07:00
Peter Oskolkov
fa0f527358 ip: use rb trees for IP frag queue.
Similar to TCP OOO RX queue, it makes sense to use rb trees to store
IP fragments, so that OOO fragments are inserted faster.

Tested:

- a follow-up patch contains a rather comprehensive ip defrag
  self-test (functional)
- ran neper `udp_stream -c -H <host> -F 100 -l 300 -T 20`:
    netstat --statistics
    Ip:
        282078937 total packets received
        0 forwarded
        0 incoming packets discarded
        946760 incoming packets delivered
        18743456 requests sent out
        101 fragments dropped after timeout
        282077129 reassemblies required
        944952 packets reassembled ok
        262734239 packet reassembles failed
   (The numbers/stats above are somewhat better re:
    reassemblies vs a kernel without this patchset. More
    comprehensive performance testing TBD).

Reported-by: Jann Horn <jannh@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05 17:16:46 -07:00
David S. Miller
99d20a461c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next
tree:

1) No need to set ttl from reject action for the bridge family, from
   Taehee Yoo.

2) Use a fixed timeout for flow that are passed up from the flowtable
   to conntrack, from Florian Westphal.

3) More preparation patches for tproxy support for nf_tables, from Mate
   Eckl.

4) Remove unnecessary indirection in core IPv6 checksum function, from
   Florian Westphal.

5) Use nf_ct_get_tuplepr() from openvswitch, instead of opencoding it.
   From Florian Westphal.

6) socket match now selects socket infrastructure, instead of depending
   on it. From Mate Eckl.

7) Patch series to simplify conntrack tuple building/parsing from packet
   path and ctnetlink, from Florian Westphal.

8) Fetch timeout policy from protocol helpers, instead of doing it from
   core, from Florian Westphal.

9) Merge IPv4 and IPv6 protocol trackers into conntrack core, from
   Florian Westphal.

10) Depend on CONFIG_NF_TABLES_IPV6 and CONFIG_IP6_NF_IPTABLES
    respectively, instead of IPV6. Patch from Mate Eckl.

11) Add specific function for garbage collection in conncount,
    from Yi-Hung Wei.

12) Catch number of elements in the connlimit list, from Yi-Hung Wei.

13) Move locking to nf_conncount, from Yi-Hung Wei.

14) Series of patches to add lockless tree traversal in nf_conncount,
    from Yi-Hung Wei.

15) Resolve clash in matching conntracks when race happens, from
    Martynas Pumputis.

16) If connection entry times out, remove template entry from the
    ip_vs_conn_tab table to improve behaviour under flood, from
    Julian Anastasov.

17) Remove useless parameter from nf_ct_helper_ext_add(), from Gao feng.

18) Call abort from 2-phase commit protocol before requesting modules,
    make sure this is done under the mutex, from Florian Westphal.

19) Grab module reference when starting transaction, also from Florian.

20) Dynamically allocate expression info array for pre-parsing, from
    Florian.

21) Add per netns mutex for nf_tables, from Florian Westphal.

22) A couple of patches to simplify and refactor nf_osf code to prepare
    for nft_osf support.

23) Break evaluation on missing socket, from Mate Eckl.

24) Allow to match socket mark from nft_socket, from Mate Eckl.

25) Remove dependency on nf_defrag_ipv6, now that IPv6 tracker is
    built-in into nf_conntrack. From Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 22:28:28 -07:00
David S. Miller
c4c5551df1 Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 21:17:12 -07:00
Florian Westphal
70b095c843 ipv6: remove dependency of nf_defrag_ipv6 on ipv6 module
IPV6=m
DEFRAG_IPV6=m
CONNTRACK=y yields:

net/netfilter/nf_conntrack_proto.o: In function `nf_ct_netns_do_get':
net/netfilter/nf_conntrack_proto.c:802: undefined reference to `nf_defrag_ipv6_enable'
net/netfilter/nf_conntrack_proto.o:(.rodata+0x640): undefined reference to `nf_conntrack_l4proto_icmpv6'

Setting DEFRAG_IPV6=y causes undefined references to ip6_rhash_params
ip6_frag_init and ip6_expire_frag_queue so it would be needed to force
IPV6=y too.

This patch gets rid of the 'followup linker error' by removing
the dependency of ipv6.ko symbols from netfilter ipv6 defrag.

Shared code is placed into a header, then used from both.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-18 11:26:53 +02:00
Florian Westphal
a0ae2562c6 netfilter: conntrack: remove l3proto abstraction
This unifies ipv4 and ipv6 protocol trackers and removes the l3proto
abstraction.

This gets rid of all l3proto indirect calls and the need to do
a lookup on the function to call for l3 demux.

It increases module size by only a small amount (12kbyte), so this reduces
size because nf_conntrack.ko is useless without either nf_conntrack_ipv4
or nf_conntrack_ipv6 module.

before:
   text    data     bss     dec     hex filename
   7357    1088       0    8445    20fd nf_conntrack_ipv4.ko
   7405    1084       4    8493    212d nf_conntrack_ipv6.ko
  72614   13689     236   86539   1520b nf_conntrack.ko
 19K nf_conntrack_ipv4.ko
 19K nf_conntrack_ipv6.ko
179K nf_conntrack.ko

after:
   text    data     bss     dec     hex filename
  79277   13937     236   93450   16d0a nf_conntrack.ko
  191K nf_conntrack.ko

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-17 15:27:49 +02:00
Florian Westphal
c779e84960 netfilter: conntrack: remove get_timeout() indirection
Not needed, we can have the l4trackers fetch it themselvs.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-16 17:55:01 +02:00
Florian Westphal
6816d931ca netfilter: conntrack: remove get_l4proto indirection from l3 protocol trackers
Handle it in the core instead.

ipv6_skip_exthdr() is built-in even if ipv6 is a module, i.e. this
doesn't create an ipv6 dependency.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-16 17:54:59 +02:00
Florian Westphal
d1b6fe9494 netfilter: conntrack: remove invert_tuple indirection from l3 protocol trackers
Its simpler to just handle it directly in nf_ct_invert_tuple().
Also gets rid of need to pass l3proto pointer to resolve_conntrack().

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-16 17:54:59 +02:00
Florian Westphal
47a91b14de netfilter: conntrack: remove pkt_to_tuple indirection from l3 protocol trackers
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-16 17:54:58 +02:00
Florian Westphal
f957be9d34 netfilter: conntrack: remove ctnetlink callbacks from l3 protocol trackers
handle everything from ctnetlink directly.

After all these years we still only support ipv4 and ipv6, so it
seems reasonable to remove l3 protocol tracker support and instead
handle ipv4/ipv6 from a common, always builtin inet tracker.

Step 1: Get rid of all the l3proto->func() calls.

Start with ctnetlink, then move on to packet-path ones.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-16 17:54:58 +02:00
Florian Westphal
84379c9afe netfilter: ipv6: nf_defrag: drop skb dst before queueing
Eric Dumazet reports:
 Here is a reproducer of an annoying bug detected by syzkaller on our production kernel
 [..]
 ./b78305423 enable_conntrack
 Then :
 sleep 60
 dmesg | tail -10
 [  171.599093] unregister_netdevice: waiting for lo to become free. Usage count = 2
 [  181.631024] unregister_netdevice: waiting for lo to become free. Usage count = 2
 [  191.687076] unregister_netdevice: waiting for lo to become free. Usage count = 2
 [  201.703037] unregister_netdevice: waiting for lo to become free. Usage count = 2
 [  211.711072] unregister_netdevice: waiting for lo to become free. Usage count = 2
 [  221.959070] unregister_netdevice: waiting for lo to become free. Usage count = 2

Reproducer sends ipv6 fragment that hits nfct defrag via LOCAL_OUT hook.
skb gets queued until frag timer expiry -- 1 minute.

Normally nf_conntrack_reasm gets called during prerouting, so skb has
no dst yet which might explain why this wasn't spotted earlier.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-09 18:04:12 +02:00
Máté Eckl
5711b4e893 netfilter: nf_tproxy: fix possible non-linear access to transport header
This patch fixes a silent out-of-bound read possibility that was present
because of the misuse of this function.

Mostly it was called with a struct udphdr *hp which had only the udphdr
part linearized by the skb_header_pointer, however
nf_tproxy_get_sock_v{4,6} uses it as a tcphdr pointer, so some reads for
tcp specific attributes may be invalid.

Fixes: a583636a83 ("inet: refactor inet[6]_lookup functions to take skb")
Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-06 14:32:44 +02:00
Florian Westphal
d376bef9c2 netfilter: x_tables: set module owner for icmp(6) matches
nft_compat relies on xt_request_find_match to increment
refcount of the module that provides the match/target.

The (builtin) icmp matches did't set the module owner so it
was possible to rmmod ip(6)tables while icmp extensions were still in use.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-05 11:45:11 +02:00
David S. Miller
5cd3da4ba2 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Simple overlapping changes in stmmac driver.

Adjust skb_gro_flush_final_remcsum function signature to make GRO list
changes in net-next, as per Stephen Rothwell's example merge
resolution.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-03 10:29:26 +09:00
Flavio Leitner
f564650106 netfilter: check if the socket netns is correct.
Netfilter assumes that if the socket is present in the skb, then
it can be used because that reference is cleaned up while the skb
is crossing netns.

We want to change that to preserve the socket reference in a future
patch, so this is a preparation updating netfilter to check if the
socket netns matches before use it.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28 22:21:32 +09:00
Eric Dumazet
9ce7bc036a netfilter: ipv6: nf_defrag: reduce struct net memory waste
It is a waste of memory to use a full "struct netns_sysctl_ipv6"
while only one pointer is really used, considering netns_sysctl_ipv6
keeps growing.

Also, since "struct netns_frags" has cache line alignment,
it is better to move the frags_hdr pointer outside, otherwise
we spend a full cache line for this pointer.

This saves 192 bytes of memory per netns.

Fixes: c038a767cd ("ipv6: add a new namespace for nf_conntrack_reasm")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-18 14:13:25 +02:00
David S. Miller
a08ce73ba0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree:

1) Reject non-null terminated helper names from xt_CT, from Gao Feng.

2) Fix KASAN splat due to out-of-bound access from commit phase, from
   Alexey Kodanev.

3) Missing conntrack hook registration on IPVS FTP helper, from Julian
   Anastasov.

4) Incorrect skbuff allocation size in bridge nft_reject, from Taehee Yoo.

5) Fix inverted check on packet xmit to non-local addresses, also from
   Julian.

6) Fix ebtables alignment compat problems, from Alin Nastac.

7) Hook mask checks are not correct in xt_set, from Serhey Popovych.

8) Fix timeout listing of element in ipsets, from Jozsef.

9) Cap maximum timeout value in ipset, also from Jozsef.

10) Don't allow family option for hash:mac sets, from Florent Fourcot.

11) Restrict ebtables to work with NFPROTO_BRIDGE targets only, this
    Florian.

12) Another bug reported by KASAN in the rbtree set backend, from
    Taehee Yoo.

13) Missing __IPS_MAX_BIT update doesn't include IPS_OFFLOAD_BIT.
    From Gao Feng.

14) Missing initialization of match/target in ebtables, from Florian
    Westphal.

15) Remove useless nft_dup.h file in include path, from C. Labbe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-11 14:24:32 -07:00
Florian Westphal
c568503ef0 netfilter: x_tables: initialise match/target check parameter struct
syzbot reports following splat:

BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450
 net/bridge/netfilter/ebt_stp.c:162
 ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
 xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
 ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
 ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]

The uninitialised access is
   xt_mtchk_param->nft_compat

... which should be set to 0.
Fix it by zeroing the struct beforehand, same for tgchk.

ip(6)tables targetinfo uses c99-style initialiser, so no change
needed there.

Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com
Fixes: 55917a21d0 ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-08 12:40:56 +02:00
Máté Eckl
45ca4e0cf2 netfilter: Libify xt_TPROXY
The extracted functions will likely be usefull to implement tproxy
support in nf_tables.

Extrancted functions:
	- nf_tproxy_sk_is_transparent
	- nf_tproxy_laddr4
	- nf_tproxy_handle_time_wait4
	- nf_tproxy_get_sock_v4
	- nf_tproxy_laddr6
	- nf_tproxy_handle_time_wait6
	- nf_tproxy_get_sock_v6

(nf_)tproxy_handle_time_wait6 also needed some refactor as its current
implementation was xtables-specific.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:05 +02:00
Florian Westphal
0168e8b361 netfilter: nat: merge ipv4/ipv6 masquerade code into main nat module
Instead of using extra modules for these, turn the config options into
an implicit dependency that adds masq feature to the protocol specific nf_nat module.

before:
   text    data     bss     dec     hex filename
   2001     860       4    2865     b31 net/ipv4/netfilter/nf_nat_masquerade_ipv4.ko
   5579     780       2    6361    18d9 net/ipv4/netfilter/nf_nat_ipv4.ko
   2860     836       8    3704     e78 net/ipv6/netfilter/nf_nat_masquerade_ipv6.ko
   6648     780       2    7430    1d06 net/ipv6/netfilter/nf_nat_ipv6.ko

after:
   text    data     bss     dec     hex filename
   7245     872       8    8125    1fbd net/ipv4/netfilter/nf_nat_ipv4.ko
   9165     848      12   10025    2729 net/ipv6/netfilter/nf_nat_ipv6.ko

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29 00:25:36 +02:00
David S. Miller
fb83eb93c6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree, they are:

1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal.

2) Add support for map lookups to numgen, random and hash expressions,
   from Laura Garcia.

3) Allow to register nat hooks for iptables and nftables at the same
   time. Patchset from Florian Westpha.

4) Timeout support for rbtree sets.

5) ip6_rpfilter works needs interface for link-local addresses, from
   Vincent Bernat.

6) Add nf_ct_hook and nf_nat_hook structures and use them.

7) Do not drop packets on packets raceing to insert conntrack entries
   into hashes, this is particularly a problem in nfqueue setups.

8) Address fallout from xt_osf separation to nf_osf, patches
   from Florian Westphal and Fernando Mancera.

9) Remove reference to struct nft_af_info, which doesn't exist anymore.
   From Taehee Yoo.

This batch comes with is a conflict between 25fd386e0b ("netfilter:
core: add missing __rcu annotation") in your tree and 2c205dd398
("netfilter: add struct nf_nat_hook and use it") coming in this batch.
This conflict can be solved by leaving the __rcu tag on
__netfilter_net_init() - added by 25fd386e0b - and remove all code
related to nf_nat_decode_session_hook - which is gone after
2c205dd398, as described by:

diff --cc net/netfilter/core.c
index e0ae4aae96f5,206fb2c4c319..168af54db975
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo
  EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
  #endif /* CONFIG_NF_CONNTRACK */

- static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
 -#ifdef CONFIG_NF_NAT_NEEDED
 -void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
 -EXPORT_SYMBOL(nf_nat_decode_session_hook);
 -#endif
 -
+ static void __net_init
+ __netfilter_net_init(struct nf_hook_entries __rcu **e, int max)
  {
  	int h;

I can also merge your net-next tree into nf-next, solve the conflict and
resend the pull request if you prefer so.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:37:11 -04:00