linux/net
Daniel Borkmann 82a37132f3 netfilter: x_tables: lightweight process control group matching
It would be useful e.g. in a server or desktop environment to have
a facility in the notion of fine-grained "per application" or "per
application group" firewall policies. Probably, users in the mobile,
embedded area (e.g. Android based) with different security policy
requirements for application groups could have great benefit from
that as well. For example, with a little bit of configuration effort,
an admin could whitelist well-known applications, and thus block
otherwise unwanted "hard-to-track" applications like [1] from a
user's machine. Blocking is just one example, but it is not limited
to that, meaning we can have much different scenarios/policies that
netfilter allows us than just blocking, e.g. fine grained settings
where applications are allowed to connect/send traffic to, application
traffic marking/conntracking, application-specific packet mangling,
and so on.

Implementation of PID-based matching would not be appropriate
as they frequently change, and child tracking would make that
even more complex and ugly. Cgroups would be a perfect candidate
for accomplishing that as they associate a set of tasks with a
set of parameters for one or more subsystems, in our case the
netfilter subsystem, which, of course, can be combined with other
cgroup subsystems into something more complex if needed.

As mentioned, to overcome this constraint, such processes could
be placed into one or multiple cgroups where different fine-grained
rules can be defined depending on the application scenario, while
e.g. everything else that is not part of that could be dropped (or
vice versa), thus making life harder for unwanted processes to
communicate to the outside world. So, we make use of cgroups here
to track jobs and limit their resources in terms of iptables
policies; in other words, limiting, tracking, etc what they are
allowed to communicate.

In our case we're working on outgoing traffic based on which local
socket that originated from. Also, one doesn't even need to have
an a-prio knowledge of the application internals regarding their
particular use of ports or protocols. Matching is *extremly*
lightweight as we just test for the sk_classid marker of sockets,
originating from net_cls. net_cls and netfilter do not contradict
each other; in fact, each construct can live as standalone or they
can be used in combination with each other, which is perfectly fine,
plus it serves Tejun's requirement to not introduce a new cgroups
subsystem. Through this, we result in a very minimal and efficient
module, and don't add anything except netfilter code.

One possible, minimal usage example (many other iptables options
can be applied obviously):

 1) Configuring cgroups if not already done, e.g.:

  mkdir /sys/fs/cgroup/net_cls
  mount -t cgroup -o net_cls net_cls /sys/fs/cgroup/net_cls
  mkdir /sys/fs/cgroup/net_cls/0
  echo 1 > /sys/fs/cgroup/net_cls/0/net_cls.classid
  (resp. a real flow handle id for tc)

 2) Configuring netfilter (iptables-nftables), e.g.:

  iptables -A OUTPUT -m cgroup ! --cgroup 1 -j DROP

 3) Running applications, e.g.:

  ping 208.67.222.222  <pid:1799>
  echo 1799 > /sys/fs/cgroup/net_cls/0/tasks
  64 bytes from 208.67.222.222: icmp_seq=44 ttl=49 time=11.9 ms
  [...]
  ping 208.67.220.220  <pid:1804>
  ping: sendmsg: Operation not permitted
  [...]
  echo 1804 > /sys/fs/cgroup/net_cls/0/tasks
  64 bytes from 208.67.220.220: icmp_seq=89 ttl=56 time=19.0 ms
  [...]

Of course, real-world deployments would make use of cgroups user
space toolsuite, or own custom policy daemons dynamically moving
applications from/to various cgroups.

  [1] http://www.blackhat.com/presentations/bh-europe-06/bh-eu-06-biondi/bh-eu-06-biondi-up.pdf

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-03 23:41:44 +01:00
..
9p Nothing really exciting: some groundwork for changing virtio endian, and 2013-11-15 13:28:47 +09:00
802 neigh: convert parms to an array 2013-12-09 20:56:12 -05:00
8021q Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-11-14 16:30:30 +09:00
appletalk net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
atm net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
ax25 net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
batman-adv batadv: Slight optimization of batadv_compare_eth 2013-12-09 20:58:11 -05:00
bluetooth net/*: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
bridge net: more spelling fixes 2013-12-10 21:57:11 -05:00
caif net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
can Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-11-15 16:47:22 -08:00
ceph net: 8021q/bluetooth/bridge/can/ceph: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
core net: netprio: rename config to be more consistent with cgroup configs 2014-01-03 23:41:42 +01:00
dcb net/*: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
dccp ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE 2013-11-05 21:52:27 -05:00
decnet dn_dev: add support for IFA_FLAGS nl attribute 2013-12-10 21:50:00 -05:00
dns_resolver net/*: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
dsa
ethernet
hsr net/hsr: Support iproute print_opt ('ip -details ...') 2013-11-30 12:48:14 -05:00
ieee802154 genetlink: make multicast groups const, prevent abuse 2013-11-19 16:39:06 -05:00
ipv4 netfilter: nf_conntrack: remove dead code 2014-01-03 23:41:37 +01:00
ipv6 ipv6: fix incorrect type in declaration 2013-12-12 16:14:09 -05:00
ipx net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
irda net/irda: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
iucv net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
key net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
l2tp inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions 2013-11-23 14:46:23 -08:00
lapb
llc net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
mac80211 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2013-12-06 09:50:45 -05:00
mac802154 6lowpan: set and use mac_len for mac header length 2013-10-30 17:18:46 -04:00
mpls ipip: add GSO/TSO support 2013-10-19 19:36:19 -04:00
netfilter netfilter: x_tables: lightweight process control group matching 2014-01-03 23:41:44 +01:00
netlabel netlabel: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
netlink genetlink/pmcraid: use proper genetlink multicast API 2013-11-28 18:26:30 -05:00
netrom net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
nfc net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-11-19 15:50:47 -08:00
packet packet: introduce PACKET_QDISC_BYPASS socket option 2013-12-09 20:23:33 -05:00
phonet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-11-19 15:50:47 -08:00
rds rds: prevent BUG_ON triggered on congestion update to loopback 2013-12-03 11:54:18 -05:00
rfkill net: rfkill: gpio: add ACPI support 2013-10-28 15:05:25 +01:00
rose net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
rxrpc net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
sched net: net_cls: move cgroupfs classid handling into core 2014-01-03 23:41:41 +01:00
sctp sctp: remove redundant null check on asoc 2013-12-11 15:39:34 -05:00
sunrpc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-20 14:25:39 -08:00
tipc tipc: remove unused 'blocked' flag from tipc_link struct 2013-12-11 00:17:43 -05:00
unix unix: convert printks to pr_<level> 2013-12-06 16:35:58 -05:00
vmw_vsock net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
wimax wimax: remove dead code 2013-11-21 13:09:42 -05:00
wireless Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2013-12-06 09:50:45 -05:00
x25 x25: convert printks to pr_<level> 2013-12-09 20:24:18 -05:00
xfrm net: move pskb_put() to core code 2013-11-07 19:28:58 -05:00
compat.c net: clamp ->msg_namelen instead of returning an error 2013-11-29 16:12:52 -05:00
Kconfig net: netprio: rename config to be more consistent with cgroup configs 2014-01-03 23:41:42 +01:00
Makefile net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0) 2013-11-03 23:20:14 -05:00
nonet.c
socket.c net: handle error more gracefully in socketpair() 2013-12-10 22:24:13 -05:00
sysctl_net.c net: Update the sysctl permissions handler to test effective uid/gid 2013-10-07 15:57:56 -04:00