linux

Author	SHA1	Message	Date
Taehee Yoo	e523452ac0	netfilter: nf_tables: remove unused variables The comment and trace_loginfo are not used anymore. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:50:29 +02:00
Florian Westphal	d9adf22a29	netfilter: nf_tables: use call_rcu in netlink dumps We can make all dumps and lookups lockless. Dumps currently only hold the nfnl mutex on the dump request itself. Dumps can span multiple syscalls, dump continuation doesn't acquire the nfnl mutex anywhere, i.e. the dump callbacks in nf_tables already use rcu and never rely on nfnl mutex being held. So, just switch all dumpers to rcu. This requires taking a module reference before dropping the rcu lock so rmmod is blocked, we also need to hold module reference over the entire dump operation sequence. netlink already supports this via the .module member in the netlink_dump_control struct. For the non-dump case (i.e. lookup of a specific tables, chains, etc), we need to swtich to _rcu list iteration primitive and make sure we use GFP_ATOMIC. This patch also adds the new nft_netlink_dump_start_rcu() helper that takes care of the get_ref, drop-rcu-lock,start dump, get-rcu-lock,put-ref sequence. The helper will be reused for all dumps. Rationale in all dump requests is: - use the nft_netlink_dump_start_rcu helper added in first patch - use GFP_ATOMIC and rcu list iteration - switch to .call_rcu ... thus making all dumps in nf_tables not depend on the nfnl mutex anymore. In the nf_tables_getgen: This callback just fetches the current base sequence, there is no need to serialize this with nfnl nft mutex. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:50:28 +02:00
Florian Westphal	8a3d4c3612	netfilter: nf_tables: fail batch if fatal signal is pending abort batch processing and return so task can exit faster. Otherwise even SIGKILL has no immediate effect. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:50:28 +02:00
Florian Westphal	d6501de872	netfilter: nf_tables: fix endian mismatch in return type harmless, but it avoids sparse warnings: nf_tables_api.c:2813:16: warning: incorrect type in return expression (different base types) nf_tables_api.c:2863:47: warning: incorrect type in argument 3 (different base types) nf_tables_api.c:3524:47: warning: incorrect type in argument 3 (different base types) nf_tables_api.c:3538:55: warning: incorrect type in argument 3 (different base types) Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:50:27 +02:00
Florian Westphal	eb1fb1479b	netfilter: nft_compat: use call_rcu for nfnl_compat_get Just use .call_rcu instead. We can drop the rcu read lock after obtaining a reference and re-acquire on return. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:50:26 +02:00
Wei Yongjun	88491c11b0	netfilter: nat: make symbol nat_hook static Fixes the following sparse warning: net/netfilter/nf_nat_core.c:1039:20: warning: symbol 'nat_hook' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:50:26 +02:00
Florian Westphal	0cbc06b3fa	netfilter: nf_tables: remove synchronize_rcu in commit phase synchronize_rcu() is expensive. The commit phase currently enforces an unconditional synchronize_rcu() after incrementing the generation counter. This is to make sure that a packet always sees a consistent chain, either nft_do_chain is still using old generation (it will skip the newly added rules), or the new one (it will skip old ones that might still be linked into the list). We could just remove the synchronize_rcu(), it would not cause a crash but it could cause us to evaluate a rule that was removed and new rule for the same packet, instead of either-or. To resolve this, add rule pointer array holding two generations, the current one and the future generation. In commit phase, allocate the rule blob and populate it with the rules that will be active in the new generation. Then, make this rule blob public, replacing the old generation pointer. Then the generation counter can be incremented. nft_do_chain() will either continue to use the current generation (in case loop was invoked right before increment), or the new one. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 14:49:59 +02:00
Florian Westphal	003087911a	netfilter: nfnetlink: allow commit to fail ->commit() cannot fail at the moment. Followup-patch adds kmalloc calls in the commit phase, so we'll need to be able to handle errors. Make it so that -EGAIN causes a full replay, and make other errors cause the transaction to fail. Failing is ok from a consistency point of view as long as we perform all actions that could return an error before we increment the generation counter and the base seq. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 00:27:26 +02:00
Florian Westphal	1ac89d2015	netfilter: nat: merge nf_nat_redirect into nf_nat Similar to previous patch, this time, merge redirect+nat. The redirect module is just 2k in size, get rid of it and make redirect part available from the nat core. before: text data bss dec hex filename 19461 1484 4138 25083 61fb net/netfilter/nf_nat.ko 1236 792 0 2028 7ec net/netfilter/nf_nat_redirect.ko after: 20340 1508 4138 25986 6582 net/netfilter/nf_nat.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-29 00:25:40 +02:00
David S. Miller	fb83eb93c6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal. 2) Add support for map lookups to numgen, random and hash expressions, from Laura Garcia. 3) Allow to register nat hooks for iptables and nftables at the same time. Patchset from Florian Westpha. 4) Timeout support for rbtree sets. 5) ip6_rpfilter works needs interface for link-local addresses, from Vincent Bernat. 6) Add nf_ct_hook and nf_nat_hook structures and use them. 7) Do not drop packets on packets raceing to insert conntrack entries into hashes, this is particularly a problem in nfqueue setups. 8) Address fallout from xt_osf separation to nf_osf, patches from Florian Westphal and Fernando Mancera. 9) Remove reference to struct nft_af_info, which doesn't exist anymore. From Taehee Yoo. This batch comes with is a conflict between `25fd386e0b` ("netfilter: core: add missing __rcu annotation") in your tree and `2c205dd398` ("netfilter: add struct nf_nat_hook and use it") coming in this batch. This conflict can be solved by leaving the __rcu tag on __netfilter_net_init() - added by `25fd386e0b` - and remove all code related to nf_nat_decode_session_hook - which is gone after `2c205dd398`, as described by: diff --cc net/netfilter/core.c index e0ae4aae96f5,206fb2c4c319..168af54db975 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo EXPORT_SYMBOL_GPL(nf_ct_zone_dflt); #endif /* CONFIG_NF_CONNTRACK / - static void __net_init __netfilter_net_init(struct nf_hook_entries e, int max) -#ifdef CONFIG_NF_NAT_NEEDED -void (nf_nat_decode_session_hook)(struct sk_buff , struct flowi ); -EXPORT_SYMBOL(nf_nat_decode_session_hook); -#endif - + static void __net_init + __netfilter_net_init(struct nf_hook_entries __rcu **e, int max) { int h; I can also merge your net-next tree into nf-next, solve the conflict and resend the pull request if you prefer so. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 16:37:11 -04:00
Pablo Neira Ayuso	368982cd7d	netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks In nfqueue, two consecutive skbuffs may race to create the conntrack entry. Hence, the one that loses the race gets dropped due to clash in the insertion into the hashes from the nf_conntrack_confirm() path. This patch adds a new nf_conntrack_update() function which searches for possible clashes and resolve them. NAT mangling for the packet losing race is corrected by using the conntrack information that won race. In order to avoid direct module dependencies with conntrack and NAT, the nf_ct_hook and nf_nat_hook structures are used for this purpose. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:26:08 +02:00
Pablo Neira Ayuso	2c205dd398	netfilter: add struct nf_nat_hook and use it Move decode_session() and parse_nat_setup_hook() indirections to struct nf_nat_hook structure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:26:07 +02:00
Pablo Neira Ayuso	1f4b24397d	netfilter: add struct nf_ct_hook and use it Move the nf_ct_destroy indirection to the struct nf_ct_hook. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:26:06 +02:00
Pablo Neira Ayuso	8d8540c4f5	netfilter: nft_set_rbtree: add timeout support Add garbage collection logic to expire elements stored in the rb-tree representation. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:06 +02:00
Fernando Fernandez Mancera	57a4c9c5c7	netfilter: make NF_OSF non-visible symbol Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:06 +02:00
Florian Westphal	a37061a678	netfilter: lift one-nat-hook-only restriction This reverts commit `f92b40a8b2` ("netfilter: core: only allow one nat hook per hook point"), this limitation is no longer needed. The nat core now invokes these functions and makes sure that hook evaluation stops after a mapping is created and a null binding is created otherwise. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:06 +02:00
Florian Westphal	9971a514ed	netfilter: nf_nat: add nat type hooks to nat core Currently the packet rewrite and instantiation of nat NULL bindings happens from the protocol specific nat backend. Invocation occurs either via ip(6)table_nat or the nf_tables nat chain type. Invocation looks like this (simplified): NF_HOOK() \| `---iptable_nat \| `---> nf_nat_l3proto_ipv4 -> nf_nat_packet \| new packet? pass skb though iptables nat chain \| `---> iptable_nat: ipt_do_table In nft case, this looks the same (nft_chain_nat_ipv4 instead of iptable_nat). This is a problem for two reasons: 1. Can't use iptables nat and nf_tables nat at the same time, as the first user adds a nat binding (nf_nat_l3proto_ipv4 adds a NULL binding if do_table() did not find a matching nat rule so we can detect post-nat tuple collisions). 2. If you use e.g. nft_masq, snat, redir, etc. uses must also register an empty base chain so that the nat core gets called fro NF_HOOK() to do the reverse translation, which is neither obvious nor user friendly. After this change, the base hook gets registered not from iptable_nat or nftables nat hooks, but from the l3 nat core. iptables/nft nat base hooks get registered with the nat core instead: NF_HOOK() \| `---> nf_nat_l3proto_ipv4 -> nf_nat_packet \| new packet? pass skb through iptables/nftables nat chains \| +-> iptables_nat: ipt_do_table +-> nft nat chain x `-> nft nat chain y The nat core deals with null bindings and reverse translation. When no mapping exists, it calls the registered nat lookup hooks until one creates a new mapping. If both iptables and nftables nat hooks exist, the first matching one is used (i.e., higher priority wins). Also, nft users do not need to create empty nat hooks anymore, nat core always registers the base hooks that take care of reverse/reply translation. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:06 +02:00
Florian Westphal	1cd472bf03	netfilter: nf_nat: add nat hook register functions to nf_nat This adds the infrastructure to register nat hooks with the nat core instead of the netfilter core. nat hooks are used to configure nat bindings. Such hooks are registered from ip(6)table_nat or by the nftables core when a nat chain is added. After next patch, nat hooks will be registered with nf_nat instead of netfilter core. This allows to use many nat lookup functions at the same time while doing the real packet rewrite (nat transformation) in one place. This change doesn't convert the intended users yet to ease review. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	06cad3acf0	netfilter: core: export raw versions of add/delete hook functions This will allow the nat core to reuse the nf_hook infrastructure to maintain nat lookup functions. The raw versions don't assume a particular hook location, the functions get added/deleted from the hook blob that is passed to the functions. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	4e25ceb80b	netfilter: nf_tables: allow chain type to override hook register Will be used in followup patch when nat types no longer use nf_register_net_hook() but will instead register with the nat core. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	1f55236bd8	netfilter: nf_nat: move common nat code to nat core Copy-pasted, both l3 helpers almost use same code here. Split out the common part into an 'inet' helper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
David S. Miller	6f6e434aa2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-21 16:01:54 -04:00
Laura Garcia Liebana	b9ccc07e3f	netfilter: nft_hash: add map lookups for hashing operations This patch creates new attributes to accept a map as argument and then perform the lookup with the generated hash accordingly. Both current hash functions are supported: Jenkins and Symmetric Hash. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-17 14:00:52 +02:00
Laura Garcia Liebana	978d8f9055	netfilter: nft_numgen: add map lookups for numgen random operations This patch uses the map lookup already included to be applied for random number generation. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-17 14:00:41 +02:00
Florian Westphal	e65eebec9c	netfilter: nf_tables: remove old nf_log based tracing nfnetlink tracing is available since nft 0.6 (June 2016). Remove old nf_log based tracing to avoid rule counter in main loop. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-17 13:53:07 +02:00
Colin Ian King	f0dfd7a2b3	netfilter: nf_tables: fix memory leak on error exit return Currently the -EBUSY error return path is not free'ing resources allocated earlier, leaving a memory leak. Fix this by exiting via the error exit label err5 that performs the necessary resource clean up. Detected by CoverityScan, CID#1432975 ("Resource leak") Fixes: `9744a6fcef` ("netfilter: nf_tables: check if same extensions are set when adding elements") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-14 00:21:59 +02:00
Pablo Neira Ayuso	bb7b40aecb	netfilter: nf_tables: bogus EBUSY in chain deletions When removing a rule that jumps to chain and such chain in the same batch, this bogusly hits EBUSY. Add activate and deactivate operations to expression that can be called from the preparation and the commit/abort phases. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-09 10:09:30 +02:00
Florian Westphal	732a8049f3	netfilter: nft_compat: fix handling of large matchinfo size currently matchinfo gets stored in the expression, but some xt matches are very large. To handle those we either need to switch nft core to kvmalloc and increase size limit, or allocate the info blob of large matches separately. This does the latter, this limits the scope of the changes to nft_compat. I picked a threshold of 192, this allows most matches to work as before and handle only few ones via separate alloation (cgroup, u32, sctp, rt). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-09 10:09:27 +02:00
Florian Westphal	8bdf164744	netfilter: nft_compat: prepare for indirect info storage Next patch will make it possible for *info to be stored in a separate allocation instead of the expr private area. This removes the 'expr priv area is info blob' assumption from the match init/destroy/eval functions. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-09 10:07:02 +02:00
Florian Westphal	009240940e	netfilter: nf_tables: don't assume chain stats are set when jumplabel is set nft_chain_stats_replace() and all other spots assume ->stats can be NULL, but nft_update_chain_stats does not. It must do this check, just because the jump label is set doesn't mean all basechains have stats assigned. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-08 14:15:33 +02:00
Florian Westphal	4e09fc873d	netfilter: prefer nla_strlcpy for dealing with NLA_STRING attributes fixes these warnings: 'nfnl_cthelper_create' at net/netfilter/nfnetlink_cthelper.c:237:2, 'nfnl_cthelper_new' at net/netfilter/nfnetlink_cthelper.c:450:9: ./include/linux/string.h:246:9: warning: '__builtin_strncpy' specified bound 16 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Moreover, strncpy assumes null-terminated source buffers, but thats not the case here. Unlike strlcpy, nla_strlcpy does pad the destination buffer while also considering nla attribute size. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-08 14:15:31 +02:00
Florian Westphal	25fd386e0b	netfilter: core: add missing __rcu annotation removes following sparse error: net/netfilter/core.c:598:30: warning: incorrect type in argument 1 (different address spaces) net/netfilter/core.c:598:30: expected struct nf_hook_entries e net/netfilter/core.c:598:30: got struct nf_hook_entries [noderef] <asn:4><noident> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-08 14:15:30 +02:00
Julian Anastasov	d5e032fc56	ipvs: fix stats update from local clients Local clients are not properly synchronized on 32-bit CPUs when updating stats (3.10+). Now it is possible estimation_timer (timer), a stats reader, to interrupt the local client in the middle of write_seqcount_{begin,end} sequence leading to loop (DEADLOCK). The same interrupt can happen from received packet (SoftIRQ) which updates the same per-CPU stats. Fix it by disabling BH while updating stats. Found with debug: WARNING: inconsistent lock state 4.17.0-rc2-00105-g35cb6d7-dirty #2 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage. ftp/2545 [HC0[0]:SC0[0]:HE1:SE1] takes: 86845479 (&syncp->seq#6){+.+-}, at: ip_vs_schedule+0x1c5/0x59e [ip_vs] {IN-SOFTIRQ-R} state was registered at: lock_acquire+0x44/0x5b estimation_timer+0x1b3/0x341 [ip_vs] call_timer_fn+0x54/0xcd run_timer_softirq+0x10c/0x12b __do_softirq+0xc1/0x1a9 do_softirq_own_stack+0x1d/0x23 irq_exit+0x4a/0x64 smp_apic_timer_interrupt+0x63/0x71 apic_timer_interrupt+0x3a/0x40 default_idle+0xa/0xc arch_cpu_idle+0x9/0xb default_idle_call+0x21/0x23 do_idle+0xa0/0x167 cpu_startup_entry+0x19/0x1b start_secondary+0x133/0x182 startup_32_smp+0x164/0x168 irq event stamp: 42213 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&syncp->seq#6); <Interrupt> lock(&syncp->seq#6); * DEADLOCK * Fixes: `ac69269a45` ("ipvs: do not disable bh for long time") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-08 14:15:21 +02:00
Julian Anastasov	a050d345ce	ipvs: fix refcount usage for conns in ops mode Connections in One-packet scheduling mode (-o, --ops) are removed with refcnt=0 because they are not hashed in conn table. To avoid refcount_dec reporting this as error, change them to be removed with refcount_dec_if_one as all other connections. refcount_t hit zero at ip_vs_conn_put+0x31/0x40 [ip_vs] in sh[15519], uid/euid: 497/497 WARNING: CPU: 0 PID: 15519 at ../kernel/panic.c:657 refcount_error_report+0x94/0x9e Modules linked in: ip_vs_rr cirrus ttm sb_edac edac_core drm_kms_helper crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc mousedev drm aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse evdev input_leds led_class intel_agp fb_sys_fops syscopyarea sysfillrect intel_rapl_perf mac_hid intel_gtt serio_raw sysimgblt agpgart i2c_piix4 i2c_core ata_generic pata_acpi floppy cfg80211 rfkill button loop macvlan ip_vs nf_conntrack libcrc32c crc32c_generic ip_tables x_tables ipv6 crc_ccitt autofs4 ext4 crc16 mbcache jbd2 fscrypto ata_piix libata atkbd libps2 scsi_mod crc32c_intel i8042 rtc_cmos serio af_packet dm_mod dax fuse xen_netfront xen_blkfront CPU: 0 PID: 15519 Comm: sh Tainted: G W 4.15.17 #1-NixOS Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006 RIP: 0010:refcount_error_report+0x94/0x9e RSP: 0000:ffffa344dde039c8 EFLAGS: 00010296 RAX: 0000000000000057 RBX: ffffffff92f20e06 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffffa344dde165c0 RBP: ffffa344dde03b08 R08: 0000000000000218 R09: 0000000000000004 R10: ffffffff93006a80 R11: 0000000000000001 R12: ffffa344d68cd100 R13: 00000000000001f1 R14: ffffffff92f12fb0 R15: 0000000000000004 FS: 00007fc9d2040fc0(0000) GS:ffffa344dde00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000262a000 CR3: 0000000016a0c004 CR4: 00000000001606f0 Call Trace: <IRQ> ex_handler_refcount+0x4e/0x80 fixup_exception+0x33/0x40 do_trap+0x83/0x140 do_error_trap+0x83/0xf0 ? ip_vs_conn_drop_conntrack+0x120/0x1a5 [ip_vs] ? ip_finish_output2+0x29c/0x390 ? ip_finish_output2+0x1a2/0x390 invalid_op+0x1b/0x40 RIP: 0010:ip_vs_conn_put+0x31/0x40 [ip_vs] RSP: 0000:ffffa344dde03bb8 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffffa344df31cf00 RCX: ffffa344d7450198 RDX: 0000000000000003 RSI: 00000000fffffe01 RDI: ffffa344d7450140 RBP: 0000000000000002 R08: 0000000000000476 R09: 0000000000000000 R10: ffffa344dde03b28 R11: ffffa344df200000 R12: ffffa344d7d09000 R13: ffffa344def3a980 R14: ffffffffc04f6e20 R15: 0000000000000008 ip_vs_in.part.29.constprop.36+0x34f/0x640 [ip_vs] ? ip_vs_conn_out_get+0xe0/0xe0 [ip_vs] ip_vs_remote_request4+0x47/0xa0 [ip_vs] ? ip_vs_in.part.29.constprop.36+0x640/0x640 [ip_vs] nf_hook_slow+0x43/0xc0 ip_local_deliver+0xac/0xc0 ? ip_rcv_finish+0x400/0x400 ip_rcv+0x26c/0x380 __netif_receive_skb_core+0x3a0/0xb10 ? inet_gro_receive+0x23c/0x2b0 ? netif_receive_skb_internal+0x24/0xb0 netif_receive_skb_internal+0x24/0xb0 napi_gro_receive+0xb8/0xe0 xennet_poll+0x676/0xb40 [xen_netfront] net_rx_action+0x139/0x3a0 __do_softirq+0xde/0x2b4 irq_exit+0xae/0xb0 xen_evtchn_do_upcall+0x2c/0x40 xen_hvm_callback_vector+0x7d/0x90 </IRQ> RIP: 0033:0x7fc9d11c91f9 RSP: 002b:00007ffebe8a2ea0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff0c RAX: 00000000ffffffff RBX: 0000000002609808 RCX: 0000000000000054 RDX: 0000000000000001 RSI: 0000000002605440 RDI: 00000000025f940e RBP: 00000000025f940e R08: 000000000260213d R09: 1999999999999999 R10: 000000000262a808 R11: 00000000025f942d R12: 00000000025f940e R13: 00007fc9d1301e20 R14: 00000000025f9408 R15: 00007fc9d1302720 Code: 48 8b 95 80 00 00 00 41 55 49 8d 8c 24 e0 05 00 00 45 8b 84 24 38 04 00 00 41 89 c1 48 89 de 48 c7 c7 a8 2f f2 92 e8 7c fa ff ff <0f> 0b 58 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 55 48 89 e5 41 56 Reported-by: Net Filter <netfilternetfilter@gmail.com> Fixes: `b54ab92b84` ("netfilter: refcounter conversions") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-08 14:15:12 +02:00
Florian Westphal	b8e9dc1c75	netfilter: nf_tables: nft_compat: fix refcount leak on xt module Taehee Yoo reported following bug: iptables-compat -I OUTPUT -m cpu --cpu 0 iptables-compat -F lsmod \|grep xt_cpu xt_cpu 16384 1 Quote: "When above command is given, a netlink message has two expressions that are the cpu compat and the nft_counter. The nft_expr_type_get() in the nf_tables_expr_parse() successes first expression then, calls select_ops callback. (allocates memory and holds module) But, second nft_expr_type_get() in the nf_tables_expr_parse() returns -EAGAIN because of request_module(). In that point, by the 'goto err1', the 'module_put(info[i].ops->type->owner)' is called. There is no release routine." The core problem is that unlike all other expression, nft_compat select_ops has side effects. 1. it allocates dynamic memory which holds an nft ops struct. In all other expressions, ops has static storage duration. 2. It grabs references to the xt module that it is supposed to invoke. Depending on where things go wrong, error unwinding doesn't always do the right thing. In the above scenario, a new nft_compat_expr is created and xt_cpu module gets loaded with a refcount of 1. Due to to -EAGAIN, the netlink messages get re-parsed. When that happens, nft_compat finds that xt_cpu is already present and increments module refcount again. This fixes the problem by making select_ops to have no visible side effects and removes all extra module_get/put. When select_ops creates a new nft_compat expression, the new expression has a refcount of 0, and the xt module gets its refcount incremented. When error happens, the next call finds existing entry, but will no longer increase the reference count -- the presence of existing nft_xt means we already hold a module reference. Because nft_xt_put is only called from nft_compat destroy hook, it will never see the initial zero reference count. ->destroy can only be called after ->init(), and that will increase the refcount. Lastly, we now free nft_xt struct with kfree_rcu. Else, we get use-after free in nf_tables_rule_destroy: while (expr != nft_expr_last(rule) && expr->ops) { nf_tables_expr_destroy(ctx, expr); expr = nft_expr_next(expr); // here nft_expr_next() dereferences expr->ops. This is safe for all users, as ops have static storage duration. In nft_compat case however, its ->destroy callback can free the memory that hold the ops structure. Tested-by: Taehee Yoo <ap420073@gmail.com> Reported-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-08 14:08:21 +02:00
David S. Miller	90278871d4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for your net-next tree, more relevant updates in this batch are: 1) Add Maglev support to IPVS. Moreover, store lastest server weight in IPVS since this is needed by maglev, patches from from Inju Song. 2) Preparation works to add iptables flowtable support, patches from Felix Fietkau. 3) Hand over flows back to conntrack slow path in case of TCP RST/FIN packet is seen via new teardown state, also from Felix. 4) Add support for extended netlink error reporting for nf_tables. 5) Support for larger timeouts that 23 days in nf_tables, patch from Florian Westphal. 6) Always set an upper limit to dynamic sets, also from Florian. 7) Allow number generator to make map lookups, from Laura Garcia. 8) Use hash_32() instead of opencode hashing in IPVS, from Vicent Bernat. 9) Extend ip6tables SRH match to support previous, next and last SID, from Ahmed Abdelsalam. 10) Move Passive OS fingerprint nf_osf.c, from Fernando Fernandez. 11) Expose nf_conntrack_max through ctnetlink, from Florent Fourcot. 12) Several housekeeping patches for xt_NFLOG, x_tables and ebtables, from Taehee Yoo. 13) Unify meta bridge with core nft_meta, then make nft_meta built-in. Make rt and exthdr built-in too, again from Florian. 14) Missing initialization of tbl->entries in IPVS, from Cong Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-06 21:51:37 -04:00
Florian Westphal	b13468dc57	netfilter: nft_dynset: fix timeout updates on 32bit This must now use a 64bit jiffies value, else we set a bogus timeout on 32bit. Fixes: `8e1102d5a1` ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-07 00:05:22 +02:00
Florent Fourcot	538c5672be	netfilter: ctnetlink: export nf_conntrack_max IPCTNL_MSG_CT_GET_STATS netlink command allow to monitor current number of conntrack entries. However, if one wants to compare it with the maximum (and detect exhaustion), the only solution is currently to read sysctl value. This patch add nf_conntrack_max value in netlink message, and simplify monitoring for application built on netlink API. Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-07 00:04:02 +02:00
Fernando Fernandez Mancera	bfb15f2a95	netfilter: extract Passive OS fingerprint infrastructure from xt_osf Add nf_osf_ttl() and nf_osf_match() into nf_osf.c to prepare for nf_tables support. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-07 00:02:11 +02:00
Laura Garcia Liebana	75e72f0541	netfilter: nft_numgen: enable hashing of one element The modulus in the hash function was limited to > 1 as initially there was no sense to create a hashing of just one element. Nevertheless, there are certain cases specially for load balancing where this case needs to be addressed. This patch fixes the following error. Error: Could not process rule: Numerical result out of range add rule ip nftlb lb01 dnat to jhash ip saddr mod 1 map { 0: 192.168.0.10 } ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The solution comes to force the hash to 0 when the modulus is 1. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>	2018-05-06 23:31:54 +02:00
Laura Garcia Liebana	d734a28889	netfilter: nft_numgen: add map lookups for numgen statements This patch includes a new attribute in the numgen structure to allow the lookup of an element based on the number generator as a key. For this purpose, different ops have been included to extend the current numgen inc functions. Currently, only supported for numgen incremental operations, but it will be supported for random in a follow-up patch. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-06 23:18:44 +02:00
Florian Westphal	2f99aa31cd	netfilter: nf_tables: skip synchronize_rcu if transaction log is empty After processing the transaction log, the remaining entries of the log need to be released. However, in some cases no entries remain, e.g. because the transaction did not remove anything. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:40:12 +02:00
Florian Westphal	dceb48d86b	netfilter: x_tables: check name length in find_match/target, too ebtables uses find_match() rather than find_request_match in one case (see `bcf4934288`, "netfilter: ebtables: Fix extension lookup with identical name"), so extend the check on name length to those functions too. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:40:11 +02:00
Jozsef Kadlecsik	72d4d3e398	netfilter: Fix handling simultaneous open in TCP conntrack Dominique Martinet reported a TCP hang problem when simultaneous open was used. The problem is that the tcp_conntracks state table is not smart enough to handle the case. The state table could be fixed by introducing a new state, but that would require more lines of code compared to this patch, due to the required backward compatibility with ctnetlink. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Reported-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:39:29 +02:00
Cong Wang	8b2ebb6cf0	ipvs: initialize tbl->entries in ip_vs_lblc_init_svc() Similarly, tbl->entries is not initialized after kmalloc(), therefore causes an uninit-value warning in ip_vs_lblc_check_expire(), as reported by syzbot. Reported-by: <syzbot+3e9695f147fb529aa9bc@syzkaller.appspotmail.com> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:20:33 +02:00
Cong Wang	3aa1409a7b	ipvs: initialize tbl->entries after allocation tbl->entries is not initialized after kmalloc(), therefore causes an uninit-value warning in ip_vs_lblc_check_expire() as reported by syzbot. Reported-by: <syzbot+3dfdea57819073a04f21@syzkaller.appspotmail.com> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:20:33 +02:00
Pablo Neira Ayuso	146cd6b5d5	Merge tag 'ipvs-for-v4.18' of http://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next Simon Horman says: ==================== IPVS Updates for v4.18 please consider these IPVS enhancements for v4.18. * Whitepace cleanup * Add Maglev hashing algorithm as a IPVS scheduler Inju Song says "Implements the Google's Maglev hashing algorithm as a IPVS scheduler. Basically it provides consistent hashing but offers some special features about disruption and load balancing. 1) minimal disruption: when the set of destinations changes, a connection will likely be sent to the same destination as it was before. 2) load balancing: each destination will receive an almost equal number of connections. Seel also: [3.4 Consistent Hasing] in https://www.usenix.org/system/files/conference/nsdi16/nsdi16-paper-eisenbud.pdf " * Fix to correct implementation of Knuth's multiplicative hashing which is used in sh/dh/lblc/lblcr algorithms. Instead the implementation provided by the hash_32() macro is used. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:16:14 +02:00
Florian Westphal	d0103158cf	netfilter: nf_tables: merge exthdr expression into nft core before: text data bss dec hex filename 5056 844 0 5900 170c net/netfilter/nft_exthdr.ko 102456 2316 401 105173 19ad5 net/netfilter/nf_tables.ko after: 106410 2392 401 109203 1aa93 net/netfilter/nf_tables.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:00:56 +02:00
Florian Westphal	ae1bc6a9f3	netfilter: nf_tables: merge rt expression into nft core before: text data bss dec hex filename 2657 844 0 3501 dad net/netfilter/nft_rt.ko 100826 2240 401 103467 1942b net/netfilter/nf_tables.ko after: 2657 844 0 3501 dad net/netfilter/nft_rt.ko 102456 2316 401 105173 19ad5 net/netfilter/nf_tables.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:00:55 +02:00
Florian Westphal	8a22543c8e	netfilter: nf_tables: make meta expression builtin size net/netfilter/nft_meta.ko text data bss dec hex filename 5826 936 1 6763 1a6b net/netfilter/nft_meta.ko 96407 2064 400 98871 18237 net/netfilter/nf_tables.ko after: 100826 2240 401 103467 1942b net/netfilter/nf_tables.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-04-27 00:00:46 +02:00

1 2 3 4 5 ...

4387 Commits