linux/include/net
David Howells cdfbabfb2f net: Work around lockdep limitation in sockets that use sockets
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

The theory lockdep comes up with is as follows:

 (1) If the pagefault handler decides it needs to read pages from AFS, it
     calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
     creating a call requires the socket lock:

	mmap_sem must be taken before sk_lock-AF_RXRPC

 (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
     binds the underlying UDP socket whilst holding its socket lock.
     inet_bind() takes its own socket lock:

	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

 (3) Reading from a TCP socket into a userspace buffer might cause a fault
     and thus cause the kernel to take the mmap_sem, but the TCP socket is
     locked whilst doing this:

	sk_lock-AF_INET must be taken before mmap_sem

However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks.  The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace.  This is
a limitation in the design of lockdep.

Fix the general case by:

 (1) Double up all the locking keys used in sockets so that one set are
     used if the socket is created by userspace and the other set is used
     if the socket is created by the kernel.

 (2) Store the kern parameter passed to sk_alloc() in a variable in the
     sock struct (sk_kern_sock).  This informs sock_lock_init(),
     sock_init_data() and sk_clone_lock() as to the lock keys to be used.

     Note that the child created by sk_clone_lock() inherits the parent's
     kern setting.

 (3) Add a 'kern' parameter to ->accept() that is analogous to the one
     passed in to ->create() that distinguishes whether kernel_accept() or
     sys_accept4() was the caller and can be passed to sk_alloc().

     Note that a lot of accept functions merely dequeue an already
     allocated socket.  I haven't touched these as the new socket already
     exists before we get the parameter.

     Note also that there are a couple of places where I've made the accepted
     socket unconditionally kernel-based:

	irda_accept()
	rds_rcp_accept_one()
	tcp_accept_from_sock()

     because they follow a sock_create_kern() and accept off of that.

Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal.  I wonder if these should do that so
that they use the new set of lock keys.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09 18:23:27 -08:00
..
9p 9p: constify ->d_name handling 2017-01-12 04:01:17 -05:00
bluetooth sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h> 2017-03-02 08:42:38 +01:00
caif
irda
iucv
netfilter netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails 2017-03-03 13:48:34 +01:00
netns Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-02-03 16:58:20 -05:00
nfc NFC: digital: Add support for NFC DEP Response Waiting Time 2016-07-11 02:01:14 +02:00
phonet sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
sctp net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
tc_act net/act_pedit: Introduce 'add' operation 2017-02-10 13:18:33 -05:00
6lowpan.h 6lowpan: add 802.15.4 short addr slaac 2016-06-15 20:41:22 -07:00
act_api.h net sched actions: Add support for user cookies 2017-01-25 12:37:04 -05:00
addrconf.h inet: collapse ipv4/v6 rcv_saddr_equal functions into one 2017-01-18 13:04:28 -05:00
af_ieee802154.h
af_rxrpc.h rxrpc: Rewrite the data and ack handling code 2016-09-08 11:10:12 +01:00
af_unix.h af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' 2016-09-04 13:29:29 -07:00
af_vsock.h VSOCK: Introduce virtio_vsock_common.ko 2016-08-02 02:57:29 +03:00
ah.h
arp.h net: add confirm_neigh method to dst_ops 2017-02-07 13:07:46 -05:00
atmclip.h
ax25.h
ax88796.h
bond_3ad.h bonding: 3ad: apply ad_actor settings changes immediately 2016-02-09 04:45:49 -05:00
bond_alb.h
bond_options.h
bonding.h netns: make struct pernet_operations::id unsigned int 2016-11-18 10:59:15 -05:00
busy_poll.h sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
calipso.h calipso: Add a label cache. 2016-06-27 15:06:17 -04:00
cfg80211-wext.h
cfg80211.h scripts/spelling.txt: add "disassocation" pattern and fix typo instances 2017-02-27 18:43:47 -08:00
cfg802154.h ieee802154: add netns support 2016-07-08 12:20:57 +02:00
checksum.h csum: eliminate sparse warning in remcsum_unadjust() 2017-01-20 12:12:13 -05:00
cipso_ipv4.h netlabel: out of bound access in cipso_v4_validate() 2017-02-04 19:44:22 -05:00
cls_cgroup.h cls_cgroup: get sk_classid only from full sockets 2016-04-19 20:09:25 -04:00
codel_impl.h codel: split into multiple files 2016-04-25 16:44:27 -04:00
codel_qdisc.h net_sched: fq_codel: cache skb->truesize into skb->cb 2016-06-25 12:19:35 -04:00
codel.h codel: split into multiple files 2016-04-25 16:44:27 -04:00
compat.h packet: compat support for sock_fprog 2016-06-09 23:41:03 -07:00
datalink.h
dcbevent.h
dcbnl.h
devlink.h devlink: Add E-Switch inline mode control 2016-11-24 16:01:14 -05:00
dn_dev.h
dn_fib.h
dn_neigh.h
dn_nsp.h
dn_route.h
dn.h
dsa.h net: dsa: remove unnecessary phy*.h includes 2017-02-10 13:51:04 -05:00
dsfield.h
dst_cache.h net: add dst_cache support 2016-02-16 20:21:48 -05:00
dst_metadata.h net/dst: Add dst port to dst_metadata utility functions 2016-11-09 13:41:54 -05:00
dst_ops.h net: add confirm_neigh method to dst_ops 2017-02-07 13:07:46 -05:00
dst.h net: rename dst_neigh_output back to neigh_output 2017-02-11 21:25:18 -05:00
esp.h
ethoc.h
fib_rules.h net: core: add UID to flows, rules, and routes 2016-11-04 14:45:23 -04:00
firewire.h
flow_dissector.h flow disector: ARP support 2017-01-11 11:02:47 -05:00
flow.h Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-12 19:25:04 -08:00
flowcache.h net/flowcache: Convert to hotplug state machine 2016-11-09 23:45:28 +01:00
fou.h fou: Add encap ops for IPv6 tunnels 2016-05-20 18:03:16 -04:00
fq_impl.h fq.h: Port memory limit mechanism from fq_codel 2016-09-30 13:29:21 +02:00
fq.h fq.h: Port memory limit mechanism from fq_codel 2016-09-30 13:29:21 +02:00
garp.h
gen_stats.h net_sched: gen_estimator: complete rewrite of rate estimators 2016-12-05 15:21:59 -05:00
genetlink.h genetlink: Make family a signed integer. 2016-11-13 12:14:59 -05:00
geneve.h net: Remove deprecated tunnel specific UDP offload functions 2016-06-17 20:23:32 -07:00
gre.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-08-18 01:17:32 -04:00
gro_cells.h gro_cells: move to net/core/gro_cells.c 2017-02-08 14:38:18 -05:00
gtp.h gtp: #define #define _GTP_H_ and not #define _GTP_H 2016-07-25 17:55:43 -07:00
gue.h
hwbm.h net: add a hardware buffer management helper API 2016-03-14 12:19:46 -04:00
icmp.h net: snmp: kill STATS_BH macros 2016-04-27 22:48:25 -04:00
ieee80211_radiotap.h wireless: radiotap: rewrite the radiotap header file 2017-01-25 16:00:33 +01:00
ieee802154_netdev.h
if_inet6.h net/ipv6: allow sysctl to change link-local address generation mode 2017-01-27 10:25:34 -05:00
ife.h net: Introduce ife encapsulation module 2017-02-03 15:16:45 -05:00
ila.h
inet6_connection_sock.h inet: drop ->bind_conflict 2017-01-18 13:04:28 -05:00
inet6_hashtables.h tcp/dccp: do not touch listener sk_refcnt under synflood 2016-04-04 22:11:20 -04:00
inet_common.h net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
inet_connection_sock.h net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
inet_ecn.h ipv6: suppress sparse warnings in IP6_ECN_set_ce() 2016-08-13 15:08:00 -07:00
inet_frag.h net: remove bh disabling around percpu_counter accesses 2017-01-20 11:27:22 -05:00
inet_hashtables.h inet: reset tb->fastreuseport when adding a reuseport sk 2017-01-18 13:04:29 -05:00
inet_sock.h net/tcp-fastopen: Add new API support 2017-01-25 14:04:38 -05:00
inet_timewait_sock.h ipv4: Namespaceify tcp_tw_recycle and tcp_max_tw_buckets knob 2016-12-29 11:38:31 -05:00
inetpeer.h
ip6_checksum.h ipv6: Pass proto to csum_ipv6_magic as __u8 instead of unsigned short 2016-03-13 23:55:13 -04:00
ip6_fib.h net: ipv6: Allow shorthand delete of all nexthops in multipath route 2017-02-04 19:58:14 -05:00
ip6_route.h net: inet: Support UID-based routing in IP protocols. 2016-11-04 14:45:23 -04:00
ip6_tunnel.h ip6_tunnel: Clear IP6CB in ip6tunnel_xmit() 2016-11-02 15:18:36 -04:00
ip_fib.h ipv4: fib: Add events for FIB replace and append 2017-02-10 11:32:13 -05:00
ip_tunnels.h ip_tunnels: new IP_TUNNEL_INFO_BRIDGE flag for ip_tunnel_info mode 2017-02-03 15:21:21 -05:00
ip_vs.h ipvs: free ip_vs_dest structs when refcnt=0 2017-02-02 14:31:57 +01:00
ip.h Introduce a sysctl that modifies the value of PROT_SOCK. 2017-01-24 12:10:51 -05:00
ipcomp.h
ipconfig.h
ipv6.h ipv6: fix flow labels when the traffic class is non-0 2017-01-31 13:16:59 -05:00
ipx.h
iw_handler.h wext: uninline stream addition functions 2017-01-13 09:38:42 +01:00
kcm.h kcm: Use stream parser 2016-08-17 19:36:23 -04:00
l3mdev.h net: ipv4: Do not drop to make_route if oif is l3mdev 2016-10-13 12:05:26 -04:00
lapb.h
lib80211.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h
lwtunnel.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-02-11 02:31:11 -05:00
mac80211.h scripts/spelling.txt: add "an user" pattern and fix typo instances 2017-02-27 18:43:46 -08:00
mac802154.h ieee802154: cleanup WARN_ON for fc fetch 2016-07-08 13:23:12 +02:00
mip6.h
mld.h
mpls_iptunnel.h
mpls.h openvswitch: use mpls_hdr 2016-10-03 02:00:22 -04:00
mrp.h
ncsi.h net/ncsi: Introduce ncsi_stop_dev() 2016-10-04 02:11:51 -04:00
ndisc.h net: add confirm_neigh method to dst_ops 2017-02-07 13:07:46 -05:00
neighbour.h net: rename dst_neigh_output back to neigh_output 2017-02-11 21:25:18 -05:00
net_namespace.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-22 13:27:16 -05:00
net_ratelimit.h
netevent.h neigh: Send a notification when DELAY_PROBE_TIME changes 2016-07-05 09:06:29 -07:00
netlabel.h netlabel: Implement CALIPSO config functions for SMACK. 2016-06-27 15:06:18 -04:00
netlink.h net: ipv6: Change notifications for multipath add to RTA_MULTIPATH 2017-02-04 19:58:14 -05:00
netprio_cgroup.h
netrom.h
nexthop.h
nl802154.h ieee802154: add netns support 2016-07-08 12:20:57 +02:00
p8022.h
ping.h net: ping: make ping_v6_sendmsg static 2016-03-23 22:09:58 -04:00
pkt_cls.h net/sched: Reflect HW offload status 2017-02-17 12:08:05 -05:00
pkt_sched.h net: make default TX queue length a defined constant 2016-11-07 20:15:55 -05:00
pptp.h pptp: Refactor the struct and macros of PPTP codes 2016-08-15 10:55:53 -07:00
protocol.h udp: Remove udp_offloads 2016-04-07 16:53:30 -04:00
psample.h net: Introduce psample, a new genetlink channel for packet sampling 2017-01-24 13:44:28 -05:00
psnap.h
raw.h net: ip, diag -- Add diag interface for raw sockets 2016-10-23 19:35:24 -04:00
rawv6.h net: ip, diag -- Add diag interface for raw sockets 2016-10-23 19:35:24 -04:00
red.h ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
regulatory.h
request_sock.h ipv4: Namespaceify tcp_max_syn_backlog knob 2016-12-29 11:38:31 -05:00
rose.h
route.h net: inet: Support UID-based routing in IP protocols. 2016-11-04 14:45:23 -04:00
rtnetlink.h net: AF-specific RTM_GETSTATS attributes 2017-01-17 14:38:43 -05:00
sch_generic.h sched: move tcf_proto_destroy and tcf_destroy_chain helpers into cls_api 2017-02-10 11:38:08 -05:00
scm.h sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
secure_seq.h tcp: randomize tcp timestamp offsets for each connection 2016-12-02 12:49:59 -05:00
seg6_hmac.h ipv6: sr: add core files for SR HMAC support 2016-11-09 20:40:06 -05:00
seg6.h ipv6: sr: add core files for SR HMAC support 2016-11-09 20:40:06 -05:00
slhc_vj.h
smc.h smc: netlink interface for SMC sockets 2017-01-09 16:07:41 -05:00
snmp.h net: snmp: fix 64bit stats on 32bit arches 2016-04-28 11:49:45 -04:00
sock_reuseport.h soreuseport: fix NULL ptr dereference SO_REUSEPORT after bind 2016-01-19 14:44:23 -05:00
sock.h net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
Space.h
stp.h
strparser.h kcm: Remove TCP specific references from kcm and strparser 2016-08-28 23:32:41 -04:00
switchdev.h switchdev: bridge: Offload mc router ports 2017-02-10 11:46:39 -05:00
tcp_states.h
tcp.h net/tcp-fastopen: Add new API support 2017-01-25 14:04:38 -05:00
timewait_sock.h
transp_v6.h ipv6: add new struct ipcm6_cookie 2016-05-03 16:08:14 -04:00
tso.h
udp_tunnel.h vxlan: Add new UDP encapsulation offload type for VXLAN-GPE 2016-06-17 20:23:32 -07:00
udp.h inet: collapse ipv4/v6 rcv_saddr_equal functions into one 2017-01-18 13:04:28 -05:00
udplite.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-16 10:24:44 -08:00
vsock_addr.h
vxlan.h vxlan: remove unsed vxlan_dev_dst_port() 2016-11-15 12:16:13 -05:00
wext.h
wimax.h
x25.h
x25device.h
xfrm.h esp: Add a software GRO codepath 2017-02-15 11:04:11 +01:00