linux/net
Jamal Hadi Salim ef6980b6be introduce IFE action
This action allows for a sending side to encapsulate arbitrary metadata
which is decapsulated by the receiving end.
The sender runs in encoding mode and the receiver in decode mode.
Both sender and receiver must specify the same ethertype.
At some point we hope to have a registered ethertype and we'll
then provide a default so the user doesnt have to specify it.
For now we enforce the user specify it.

Lets show example usage where we encode icmp from a sender towards
a receiver with an skbmark of 17; both sender and receiver use
ethertype of 0xdead to interop.

YYYY: Lets start with Receiver-side policy config:
xxx: add an ingress qdisc
sudo tc qdisc add dev $ETH ingress

xxx: any packets with ethertype 0xdead will be subjected to ife decoding
xxx: we then restart the classification so we can match on icmp at prio 3
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

xxx: on restarting the classification from above if it was an icmp
xxx: packet, then match it here and continue to the next rule at prio 4
xxx: which will match based on skb mark of 17
sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

xxx: match on skbmark of 0x11 (decimal 17) and accept
sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \
handle 0x11 fw flowid 1:1 \
action ok

xxx: Lets show the decoding policy
sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead
xxx:
filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 0 success 0)
  match 00000000/00000000 at 0 (success 0 )
        action order 1: ife decode action reclassify
         index 1 ref 1 bind 1 installed 14 sec used 14 sec
         type: 0x0
         Metadata: allow mark allow hash allow prio allow qmap
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
xxx:
Observe that above lists all metadatum it can decode. Typically these
submodules will already be compiled into a monolithic kernel or
loaded as modules

YYYY: Lets show the sender side now ..

xxx: Add an egress qdisc on the sender netdev
sudo tc qdisc add dev $ETH root handle 1: prio
xxx:
xxx: Match all icmp packets to 192.168.122.237/24, then
xxx: tag the packet with skb mark of decimal 17, then
xxx: Encode it with:
xxx:	ethertype 0xdead
xxx:	add skb->mark to whitelist of metadatum to send
xxx:	rewrite target dst MAC address to 02:15:15:15:15:15
xxx:
sudo $TC filter add dev $ETH parent 1: protocol ip prio 10  u32 \
match ip dst 192.168.122.237/24 \
match ip protocol 1 0xff \
flowid 1:2 \
action skbedit mark 17 \
action ife encode \
type 0xDEAD \
allow mark \
dst 02:15:15:15:15:15

xxx: Lets show the encoding policy
sudo tc -s filter ls dev $ETH parent 1: protocol ip
xxx:
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2  (rule hit 0 success 0)
  match c0a87aed/ffffffff at 16 (success 0 )
  match 00010000/00ff0000 at 8 (success 0 )

	action order 1:  skbedit mark 17
	 index 6 ref 1 bind 1
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0

	action order 2: ife encode action pipe
	 index 3 ref 1 bind 1
	 dst MAC: 02:15:15:15:15:15 type: 0xDEAD
 	 Metadata: allow mark
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0
xxx:

test by sending ping from sender to destination

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:15:22 -05:00
..
6lowpan 6lowpan: fix debugfs interface entry name 2015-12-20 08:21:00 +01:00
9p Rework and error handling fixes, primarily in the fscatch and fd transports. 2016-01-24 12:39:09 -08:00
802
8021q net: 8021q: use __ethtool_get_ksettings 2016-02-25 22:06:46 -05:00
appletalk appletalk: fix erroneous return value 2016-02-18 14:59:34 -05:00
atm net: Generalise wq_has_sleeper helper 2015-11-30 14:47:33 -05:00
ax25 net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
batman-adv batman-adv: Rename batadv_tt_orig_list_entry *_free_ref function to *_put 2016-02-23 13:51:01 +08:00
bluetooth Bluetooth: hci_core: Avoid mixing up req_complete and req_complete_skb 2016-02-20 08:52:28 +01:00
bridge bridge: mcast: add support for more router port information dumping 2016-03-01 16:55:07 -05:00
caif net: caif: fix erroneous return value 2016-02-18 14:59:35 -05:00
can can: avoid using timeval for uapi 2015-10-13 17:42:34 +02:00
ceph libceph: MOSDOpReply v7 encoding 2016-02-04 18:26:08 +01:00
core Introduce devlink infrastructure 2016-03-01 16:07:29 -05:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
decnet net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa net: dsa: support VLAN filtering switchdev attr 2016-03-01 16:24:51 -05:00
ethernet eth: Pull header from first fragment via eth_get_headlen 2016-02-24 13:58:05 -05:00
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154 sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
ipv4 GSO: Provide software checksum of tunneled UDP fragmentation offload 2016-02-26 14:23:35 -05:00
ipv6 GSO: Provide software checksum of tunneled UDP fragmentation offload 2016-02-26 14:23:35 -05:00
ipx
irda irda: fix a potential use-after-free in ircomm_param_request 2016-01-29 22:56:46 -08:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-01-19 14:21:08 -05:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
l3mdev net: l3mdev: address selection should only consider devices in L3 domain 2016-02-26 14:22:26 -05:00
lapb
llc af_llc: fix types on llc_ui_wait_for_conn 2016-02-17 16:12:13 -05:00
mac80211 Here's another round of updates for -next: 2016-03-01 17:03:27 -05:00
mac802154 mac802154: constify ieee802154_llsec_ops structure 2016-01-04 20:40:41 +01:00
mpls mpls: autoload lwt module 2016-02-21 22:00:28 -05:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
netlabel
netlink nfnetlink: Revert "nfnetlink: add support for memory mapped netlink" 2016-02-18 11:42:22 -05:00
netrom
nfc NFC 4.5 pull request 2016-01-04 21:48:15 -05:00
openvswitch ovs: propagate per dp max headroom to all vports 2016-03-01 15:54:30 -05:00
packet net: core: use __ethtool_get_ksettings 2016-02-25 22:06:47 -05:00
phonet sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
rds rds: duplicate include net/tcp.h 2016-02-11 09:45:24 -05:00
rfkill Here's another round of updates for -next: 2016-03-01 17:03:27 -05:00
rose
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-01-12 18:57:02 -08:00
sched introduce IFE action 2016-03-01 17:15:22 -05:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
sunrpc Initial roundup of 4.5 merge window patches 2016-01-23 18:45:06 -08:00
switchdev switchdev: Require RTNL mutex to be held when sending FDB notifications 2016-01-28 16:21:31 -08:00
tipc tipc: fix null deref crash in compat config path 2016-02-25 17:04:48 -05:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
vmw_vsock vsock: Fix blocking ops call in prepare_to_wait 2016-02-13 05:57:39 -05:00
wimax
wireless Here's another round of updates for -next: 2016-03-01 17:03:27 -05:00
x25
xfrm net: preserve IP control block during GSO segmentation 2016-01-15 14:35:24 -05:00
compat.c
Kconfig Introduce devlink infrastructure 2016-03-01 16:07:29 -05:00
Makefile net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
socket.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
sysctl_net.c net: sysctl: fix a kmemleak warning 2015-10-23 06:22:08 -07:00