linux/net
Pablo Neira Ayuso 0628b123c9 netfilter: nfnetlink: add batch support and use it from nf_tables
This patch adds a batch support to nfnetlink. Basically, it adds
two new control messages:

* NFNL_MSG_BATCH_BEGIN, that indicates the beginning of a batch,
  the nfgenmsg->res_id indicates the nfnetlink subsystem ID.

* NFNL_MSG_BATCH_END, that results in the invocation of the
  ss->commit callback function. If not specified or an error
  ocurred in the batch, the ss->abort function is invoked
  instead.

The end message represents the commit operation in nftables, the
lack of end message results in an abort. This patch also adds the
.call_batch function that is only called from the batch receival
path.

This patch adds atomic rule updates and dumps based on
bitmask generations. This allows to atomically commit a set of
rule-set updates incrementally without altering the internal
state of existing nf_tables expressions/matches/targets.

The idea consists of using a generation cursor of 1 bit and
a bitmask of 2 bits per rule. Assuming the gencursor is 0,
then the genmask (expressed as a bitmask) can be interpreted
as:

00 active in the present, will be active in the next generation.
01 inactive in the present, will be active in the next generation.
10 active in the present, will be deleted in the next generation.
 ^
 gencursor

Once you invoke the transition to the next generation, the global
gencursor is updated:

00 active in the present, will be active in the next generation.
01 active in the present, needs to zero its future, it becomes 00.
10 inactive in the present, delete now.
^
gencursor

If a dump is in progress and nf_tables enters a new generation,
the dump will stop and return -EBUSY to let userspace know that
it has to retry again. In order to invalidate dumps, a global
genctr counter is increased everytime nf_tables enters a new
generation.

This new operation can be used from the user-space utility
that controls the firewall, eg.

nft -f restore

The rule updates contained in `file' will be applied atomically.

cat file
-----
add filter INPUT ip saddr 1.1.1.1 counter accept #1
del filter INPUT ip daddr 2.2.2.2 counter drop   #2
-EOF-

Note that the rule 1 will be inactive until the transition to the
next generation, the rule 2 will be evicted in the next generation.

There is a penalty during the rule update due to the branch
misprediction in the packet matching framework. But that should be
quickly resolved once the iteration over the commit list that
contain rules that require updates is finished.

Event notification happens once the rule-set update has been
committed. So we skip notifications is case the rule-set update
is aborted, which can happen in case that the rule-set is tested
to apply correctly.

This patch squashed the following patches from Pablo:

* nf_tables: atomic rule updates and dumps
* nf_tables: get rid of per rule list_head for commits
* nf_tables: use per netns commit list
* nfnetlink: add batch support and use it from nf_tables
* nf_tables: all rule updates are transactional
* nf_tables: attach replacement rule after stale one
* nf_tables: do not allow deletion/replacement of stale rules
* nf_tables: remove unused NFTA_RULE_FLAGS

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:01:01 +02:00
..
9p for-linus-3.12-merge minor 9p fixes and tweaks for 3.12 merge window 2013-09-11 12:34:13 -07:00
802 mrp: add periodictimer to allow retries when packets get lost 2013-09-23 16:53:52 -04:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-10-08 23:07:53 -04:00
appletalk net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
atm net: always pass struct netdev_notifier_info to netdevice notifiers 2013-05-28 21:58:54 -07:00
ax25 net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
batman-adv batman-adv: reorder batadv_iv_flags 2013-10-09 21:22:35 +02:00
bluetooth Bluetooth: Only one command per L2CAP LE signalling is supported 2013-10-03 16:09:59 +03:00
bridge netfilter: nf_tables: complete net namespace support 2013-10-14 18:00:59 +02:00
caif caif: Add missing braces to multiline if in cfctrl_linkup_request 2013-09-05 14:31:02 -04:00
can can: gw: add a per rule limitation of frame hops 2013-08-29 22:58:24 +02:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2013-09-19 12:50:37 -05:00
core net: gro: allow to build full sized skb 2013-10-10 00:08:07 -04:00
dcb
dccp inet: rename ir_loc_port to ir_num 2013-10-10 14:37:35 -04:00
decnet netfilter: pass hook ops to hookfn 2013-10-14 11:29:31 +02:00
dns_resolver net: strict_strtoul is obsolete, use kstrtoul instead 2013-07-12 16:09:14 -07:00
dsa net: dsa: inherit addr_assign_type along with dev_addr 2013-09-03 20:57:49 -04:00
ethernet ethernet: use likely() for common Ethernet encap 2013-09-30 21:52:53 -07:00
ieee802154 6lowpan: Sync default hardware address of lowpan links to their wpan 2013-10-08 15:28:37 -04:00
ipv4 netfilter: nf_tables: complete net namespace support 2013-10-14 18:00:59 +02:00
ipv6 netfilter: nf_tables: complete net namespace support 2013-10-14 18:00:59 +02:00
ipx net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
irda net/irda: fixed style issues in irttp 2013-07-19 17:34:40 -07:00
iucv net: delete __cpuinit usage from all net files 2013-07-14 19:36:58 -04:00
key xfrm: Remove rebundant address family checking 2013-08-07 10:12:58 +02:00
l2tp ipv6: make lookups simpler and faster 2013-10-09 00:01:25 -04:00
lapb net/lapb: re-send packets on timeout 2013-09-23 16:52:45 -04:00
llc llc: Use normal etherdevice.h tests 2013-09-03 22:34:47 -04:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-10-08 23:07:53 -04:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-04-30 03:55:20 -04:00
mpls MPLS: Add limited GSO support 2013-05-27 22:50:59 -07:00
netfilter netfilter: nfnetlink: add batch support and use it from nf_tables 2013-10-14 18:01:01 +02:00
netlabel inet: includes a sock_common in request_sock 2013-10-10 00:08:07 -04:00
netlink net: netlink: filter particular protocols from analyzers 2013-09-06 14:43:48 -04:00
netrom net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
nfc NFC: Update secure element state 2013-08-14 01:13:40 +02:00
openvswitch net ipv4: Convert ipv4.ip_local_port_range to be per netns v3 2013-09-30 21:59:38 -07:00
packet net: packet: use reciprocal_divide in fanout_demux_hash 2013-08-29 16:43:29 -04:00
phonet net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
rds net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
rfkill Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-09-05 14:54:29 -07:00
rose net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
rxrpc
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-10-08 23:07:53 -04:00
sctp ipv6: make lookups simpler and faster 2013-10-09 00:01:25 -04:00
sunrpc net: fix build errors if ipv6 is disabled 2013-10-09 13:04:03 -04:00
tipc tipc: set sk_err correctly when connection fails 2013-08-30 16:06:57 -04:00
unix unix_diag: fix info leak 2013-10-02 16:08:24 -04:00
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
wimax
wireless cfg80211: fix sysfs registration race 2013-09-26 20:03:45 +02:00
x25 x25: add a sanity check parsing X.25 facilities 2013-09-04 00:27:27 -04:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2013-09-30 15:24:57 -04:00
compat.c net: heap overflow in __audit_sockaddr() 2013-10-03 16:05:14 -04:00
Kconfig Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
Makefile MPLS: Add limited GSO support 2013-05-27 22:50:59 -07:00
nonet.c
socket.c net: heap overflow in __audit_sockaddr() 2013-10-03 16:05:14 -04:00
sysctl_net.c net: Update the sysctl permissions handler to test effective uid/gid 2013-10-07 15:57:56 -04:00