linux/kernel/bpf
Jakub Sitnicki 2576f87066 bpf, netns: Fix use-after-free in pernet pre_exit callback
Iterating over BPF links attached to network namespace in pre_exit hook is
not safe, even if there is just one. Once link gets auto-detached, that is
its back-pointer to net object is set to NULL, the link can be released and
freed without waiting on netns_bpf_mutex, effectively causing the list
element we are operating on to be freed.

This leads to use-after-free when trying to access the next element on the
list, as reported by KASAN. Bug can be triggered by destroying a network
namespace, while also releasing a link attached to this network namespace.

| ==================================================================
| BUG: KASAN: use-after-free in netns_bpf_pernet_pre_exit+0xd9/0x130
| Read of size 8 at addr ffff888119e0d778 by task kworker/u8:2/177
|
| CPU: 3 PID: 177 Comm: kworker/u8:2 Not tainted 5.8.0-rc1-00197-ga0c04c9d1008-dirty #776
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: netns cleanup_net
| Call Trace:
|  dump_stack+0x9e/0xe0
|  print_address_description.constprop.0+0x3a/0x60
|  ? netns_bpf_pernet_pre_exit+0xd9/0x130
|  kasan_report.cold+0x1f/0x40
|  ? netns_bpf_pernet_pre_exit+0xd9/0x130
|  netns_bpf_pernet_pre_exit+0xd9/0x130
|  cleanup_net+0x30b/0x5b0
|  ? unregister_pernet_device+0x50/0x50
|  ? rcu_read_lock_bh_held+0xb0/0xb0
|  ? _raw_spin_unlock_irq+0x24/0x50
|  process_one_work+0x4d1/0xa10
|  ? lock_release+0x3e0/0x3e0
|  ? pwq_dec_nr_in_flight+0x110/0x110
|  ? rwlock_bug.part.0+0x60/0x60
|  worker_thread+0x7a/0x5c0
|  ? process_one_work+0xa10/0xa10
|  kthread+0x1e3/0x240
|  ? kthread_create_on_node+0xd0/0xd0
|  ret_from_fork+0x1f/0x30
|
| Allocated by task 280:
|  save_stack+0x1b/0x40
|  __kasan_kmalloc.constprop.0+0xc2/0xd0
|  netns_bpf_link_create+0xfe/0x650
|  __do_sys_bpf+0x153a/0x2a50
|  do_syscall_64+0x59/0x300
|  entry_SYSCALL_64_after_hwframe+0x44/0xa9
|
| Freed by task 198:
|  save_stack+0x1b/0x40
|  __kasan_slab_free+0x12f/0x180
|  kfree+0xed/0x350
|  process_one_work+0x4d1/0xa10
|  worker_thread+0x7a/0x5c0
|  kthread+0x1e3/0x240
|  ret_from_fork+0x1f/0x30
|
| The buggy address belongs to the object at ffff888119e0d700
|  which belongs to the cache kmalloc-192 of size 192
| The buggy address is located 120 bytes inside of
|  192-byte region [ffff888119e0d700, ffff888119e0d7c0)
| The buggy address belongs to the page:
| page:ffffea0004678340 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
| flags: 0x2fffe0000000200(slab)
| raw: 02fffe0000000200 ffffea00045ba8c0 0000000600000006 ffff88811a80ea80
| raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
| page dumped because: kasan: bad access detected
|
| Memory state around the buggy address:
|  ffff888119e0d600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
|  ffff888119e0d680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
| >ffff888119e0d700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
|                                                                 ^
|  ffff888119e0d780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
|  ffff888119e0d800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ==================================================================

Remove the "fast-path" for releasing a link that got auto-detached by a
dying network namespace to fix it. This way as long as link is on the list
and netns_bpf mutex is held, we have a guarantee that link memory can be
accessed.

An alternative way to fix this issue would be to safely iterate over the
list of links and ensure there is no access to link object after detaching
it. But, at the moment, optimizing synchronization overhead on link release
without a workload in mind seems like an overkill.

Fixes: ab53cad90e ("bpf, netns: Keep a list of attached bpf_link's")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630164541.1329993-1-jakub@cloudflare.com
2020-06-30 10:53:42 -07:00
..
arraymap.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-05-15 13:48:59 -07:00
bpf_iter.c
bpf_lru_list.c
bpf_lru_list.h
bpf_lsm.c bpf: Use tracing helpers for lsm programs 2020-06-01 15:08:04 -07:00
bpf_struct_ops_types.h
bpf_struct_ops.c
btf.c bpf: Do not allow btf_ctx_access with __int128 types 2020-06-25 16:17:05 +02:00
cgroup.c bpf: Don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE 2020-06-17 10:54:05 -07:00
core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
cpumap.c xdp: Rename convert_to_xdp_frame in xdp_convert_buff_to_frame 2020-06-01 15:02:53 -07:00
devmap.c devmap: Use bpf_map_area_alloc() for allocating hash buckets 2020-06-17 10:01:19 -07:00
disasm.c
disasm.h
dispatcher.c
hashtab.c
helpers.c bpf: Implement BPF ring buffer and verifier support for it 2020-06-01 14:38:22 -07:00
inode.c
local_storage.c
lpm_trie.c
Makefile flow_dissector: Move out netns_bpf prog callbacks 2020-06-01 15:21:02 -07:00
map_in_map.c
map_in_map.h
map_iter.c
net_namespace.c bpf, netns: Fix use-after-free in pernet pre_exit callback 2020-06-30 10:53:42 -07:00
offload.c
percpu_freelist.c
percpu_freelist.h
queue_stack_maps.c
reuseport_array.c
ringbuf.c bpf: Enforce BPF ringbuf size to be the power of 2 2020-06-30 16:31:55 +02:00
stackmap.c mmap locking API: add mmap_read_trylock_non_owner() 2020-06-09 09:39:14 -07:00
syscall.c bpf: sockmap: Require attach_bpf_fd when detaching a program 2020-06-30 10:46:39 -07:00
sysfs_btf.c
task_iter.c
tnum.c
trampoline.c
verifier.c bpf: Set the number of exception entries properly for subprograms 2020-06-23 17:27:37 -07:00