Commit Graph

665443 Commits

Author SHA1 Message Date
Eric Dumazet
48cac18ecf ipv6: orphan skbs in reassembly unit
Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449 ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer <joe@ovn.org>

==================================================================
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr ffff880062da0060 by task a.out/4140

page:ffffea00018b6800 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 describe_address mm/kasan/report.c:262
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370
 kasan_report mm/kasan/report.c:392
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
 sock_flag ./arch/x86/include/asm/bitops.h:324
 sock_wfree+0x118/0x120 net/core/sock.c:1631
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put ./include/net/inet_frag.h:133
 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook ./include/linux/netfilter.h:212
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x620 net/socket.c:848
 new_sync_write fs/read_write.c:499
 __vfs_write+0x483/0x760 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003

The buggy address belongs to the object at ffff880062da0000
 which belongs to the cache RAWv6 of size 1504
The buggy address ffff880062da0060 is located 96 bytes inside
 of 1504-byte region [ffff880062da0000, ffff880062da05e0)

Freed by task 4113:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1352
 slab_free_freelist_hook mm/slub.c:1374
 slab_free mm/slub.c:2951
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
 sk_prot_free net/core/sock.c:1377
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sk_free+0x23/0x30 net/core/sock.c:1479
 sock_put ./include/net/sock.h:1638
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
 sock_release+0x8d/0x1e0 net/socket.c:599
 sock_close+0x16/0x20 net/socket.c:1063
 __fput+0x332/0x7f0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work ./include/linux/task_work.h:21
 do_exit+0x186b/0x2800 kernel/exit.c:839
 do_group_exit+0x149/0x420 kernel/exit.c:943
 SYSC_exit_group kernel/exit.c:954
 SyS_exit_group+0x1d/0x20 kernel/exit.c:952
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
 slab_post_alloc_hook mm/slab.h:432
 slab_alloc_node mm/slub.c:2708
 slab_alloc mm/slub.c:2716
 kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
 sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
 sk_alloc+0x105/0x1010 net/core/sock.c:1396
 inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
 __sock_create+0x4f6/0x880 net/socket.c:1199
 sock_create net/socket.c:1239
 SYSC_socket net/socket.c:1269
 SyS_socket+0xf9/0x230 net/socket.c:1249
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Memory state around the buggy address:
 ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                       ^
 ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 20:55:57 -08:00
Eric Dumazet
13baa00ad0 net: net_enable_timestamp() can be called from irq contexts
It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.

Meaning net_enable_timestamp() can be called from BH context
while current state of the static key is not enabled.

Lets play safe and allow all contexts.

The work queue is scheduled only under the problematic cases,
which are the static key enable/disable transition, to not slow down
critical paths.

This extends and improves what we did in commit 5fa8bbda38 ("net: use
a work queue to defer net_disable_timestamp() work")

Fixes: b90e5794c5 ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 20:55:57 -08:00
Alexander Potapenko
540e2894f7 net: don't call strlen() on the user buffer in packet_bind_spkt()
KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():
Acked-by: Eric Dumazet <edumazet@google.com>

==================================================================
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
 0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
 ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
 0000000000000000 0000000000000092 00000000ec400911 0000000000000002
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
 [<ffffffff818a783b>] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
 [<     inline     >] strlen lib/string.c:484
 [<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
 [<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
 [<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
 [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: 00000000eba00911
 [<ffffffff810bb787>] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
 [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:334
 [<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
 [<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
 [<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000eb400911)
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

, when I run the following program as root:

=====================================
 #include <string.h>
 #include <sys/socket.h>
 #include <netpacket/packet.h>
 #include <net/ethernet.h>

 int main() {
   struct sockaddr addr;
   memset(&addr, 0xff, sizeof(addr));
   addr.sa_family = AF_PACKET;
   int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
   bind(fd, &addr, sizeof(addr));
   return 0;
 }
=====================================

This happens because addr.sa_data copied from the userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 20:55:57 -08:00
Mike Manning
8953de2f02 net: bridge: allow IPv6 when multicast flood is disabled
Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.

Fixes: b6cb5ac833 ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Mike Manning <mmanning@brocade.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 20:55:57 -08:00
Aurelien Aptel
b9043cc5b9 CIFS: set signing flag in SMB2+ TreeConnect if needed
cifs_enable_signing() already sets server->sign according to what the
server requires/offers and what mount options allows/forbids, so use
that.

this is required for IPC tcon that connects to signing-required servers.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 22:26:11 -06:00
Aurelien Aptel
7f9f6d2ec5 CIFS: let ses->ipc_tid hold smb2 TreeIds
the TreeId field went from 2 bytes in CIFS to 4 bytes in SMB2+. this
commit updates the size of the ipc_tid field of a cifs_ses, which was
still using 2 bytes.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 22:26:11 -06:00
Aurelien Aptel
51146625f3 CIFS: add use_ipc flag to SMB2_ioctl()
when set, use the session IPC tree id instead of the tid in the provided
tcon.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 22:26:11 -06:00
Aurelien Aptel
268a635d41 CIFS: add build_path_from_dentry_optional_prefix()
this function does the same thing as add build_path_from_dentry() but
takes a boolean parameter to decide whether or not to prefix the path
with the tree name.

we cannot rely on tcon->Flags & SMB_SHARE_IS_IN_DFS for SMB2 as smb2
code never sets tcon->Flags but it sets tcon->share_flags and it seems
the SMB_SHARE_IS_IN_DFS has different semantics in SMB2: the prefix
shouldn't be added everytime it was in SMB1.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 22:26:10 -06:00
Aurelien Aptel
4ecce920e1 CIFS: move DFS response parsing out of SMB1 code
since the DFS payload is not tied to the SMB version we can:
* isolate the DFS payload in its own struct, and include that struct in
  packet structs
* move the function that parses the response to misc.c and make it work
  on the new DFS payload struct (add payload size and utf16 flag as a
  result).

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 22:26:10 -06:00
Bart Van Assche
8893cf6cb1 scsi: mpt3sas: Avoid sleeping in interrupt context
Commit 669f044170 ("scsi: srp_transport: Move queuecommand() wait code
to SCSI core") can make scsi_internal_device_block() sleep.  However,
the mpt3sas driver can call this function from an interrupt
handler. Hence add a second argument to scsi_internal_device_block()
that restores the old behavior of this function for the mpt3sas handler.

The call chain that triggered an "IRQ handler enabled interrupts"
complaint is as follows:

_base_interrupt()
-> _base_async_event()
   -> mpt3sas_scsih_event_callback()
      -> _scsih_check_topo_delete_events()
         -> _scsih_block_io_to_children_attached_directly()
            -> _scsih_block_io_device()
               -> _scsih_internal_device_block()
                  -> scsi_internal_device_block()

Reported-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@osandov.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: <stable@vger.kernel.org> # v4.10+
Tested-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-01 21:52:13 -05:00
Damien Le Moal
c46f09175d scsi: sd: Check for unaligned partial completion
Commit <f2e767bb5d6e> ("mpt3sas: Force request partial completion
alignment") was not considering the case of commands not operating on
logical block size units (e.g. REQ_OP_ZONE_REPORT and its 64B aligned
partial replies). In this case, forcing alignment of resid to the device
logical block size can break the command result, e.g. in the case of
REQ_OP_ZONE_REPORT, the exact number of zone reported by the device.

Move the partial completion alignement check of mpt3sas to a generic
implementation in sd_done(). The check is added within the default
section of the initial req_op() switch case so that the report and reset
zone commands are ignored. In addition, as sd_done() is not called for
passthrough requests, resid corrections are not done as intended by the
initial mpt3sas patch.

Fixes: f2e767bb5d ("mpt3sas: Force request partial completion alignment")
Cc: <stable@vger.kernel.org> # v4.10
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-01 21:46:11 -05:00
Potomski, MichalX
a4b0e8a4e9 scsi: ufs: Factor out ufshcd_read_desc_param
Since in UFS 2.1 specification some of the descriptor lengths differs
from 2.0 specification and some devices, which are reporting spec
version 2.0 have different descriptor lengths we can not rely on
hardcoded values taken from 2.0 specification. This patch introduces
reading these lengths per each device from descriptor headers at probe
time to ensure their correctness.

Signed-off-by: Michal' Potomski <michalx.potomski@intel.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-01 21:43:22 -05:00
Linus Torvalds
4977ab6e92 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool relocation fixes from Ingo Molnar:
 "Two fixes related to the module loading regression introduced by the
  recent objtool changes"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool, modules: Discard objtool annotation sections for modules
  objtool, compiler.h: Fix __unreachable section relocation size
2017-03-01 17:04:50 -08:00
Linus Torvalds
8f03cf50bc NFS client updates for Linux 4.11
Stable bugfixes:
 - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
 - xprtrdma: Fix Read chunk padding
 - xprtrdma: Per-connection pad optimization
 - xprtrdma: Disable pad optimization by default
 - xprtrdma: Reduce required number of send SGEs
 - nlm: Ensure callback code also checks that the files match
 - pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
 - NFSv4: Fix reboot recovery in copy offload
 - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
 - NFSv4: fix getacl head length estimation
 - NFSv4: fix getacl ERANGE for sum ACL buffer sizes
 
 Features:
 - Add and use dprintk_cont macros
 - Various cleanups to NFS v4.x to reduce code duplication and complexity
 - Remove unused cr_magic related code
 - Improvements to sunrpc "read from buffer" code
 - Clean up sunrpc timeout code and allow changing TCP timeout parameters
 - Remove duplicate mw_list management code in xprtrdma
 - Add generic functions for encoding and decoding xdr streams
 
 Bugfixes:
 - Clean up nfs_show_mountd_netid
 - Make layoutreturn_ops static and use NULL instead of 0 to fix sparse warnings
 - Properly handle -ERESTARTSYS in nfs_rename()
 - Check if register_shrinker() failed during rpcauth_init()
 - Properly clean up procfs/pipefs entries
 - Various NFS over RDMA related fixes
 - Silence unititialized variable warning in sunrpc
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAli3F7YACgkQ18tUv7Cl
 QOvzrQ//dL+nnBaqsm9bA2wwuVJSQ2R1zdkwHOCWghEWROZrQHzpi0VHu0ZKBLzr
 YsYFhHvIPax9Q8USY4B/QFQ3eUuZILEVn+xDruRxZaJPnsA4Zmr16VJwGF2F68Lh
 CGekA5qybqy8lAG6v96Gyjbi+JqjHNCmelYWRv7SX9IZcDjNJpsEbrSI4LkabTWh
 70WtCl3LBzVMRYRxe8+f0mcx4g4XCQ8pDaQRgRnfKtNeQk/+PgWz66xSNinDakVb
 A8AkaiUadPRgUTpap6HfBSicpRvtLQeLhARC0E4YE5pXp2H/kUt2MFe5szblfSCv
 zf2nrPUbNEHjBypFhERzCZZk6EonY6FeOojyW0g2C+rmPdK7WLlKbwTQFxdRGvsx
 78fIiPRdlDHDp9CXzD8V4xxRBJX/KkicA1Vp8CoyQtmpzpu2fjwT0kr9HeD+aEe6
 293+72QUfk05re2HYWF9MCGGVVLdnLLjrKCgwwRQ0HX5WF6GNQxX/yVgBVlqFeV3
 xc8m7ltKco5N9JxIqwlIpySq2e114EQOqsmHYz3gxd7ID9J1NJz+9H2z2EvgAKZ7
 wIPSLoZrdBdnoXG8ZDDTAvPKeB8l6egi6wjrvGKxewVlMbjzogdARsMKWoifnCfG
 HMkH+IEvLGvFc1pPeLbscJGEdVWXVn0thO+8fkS9F9sE/zMX9PA=
 =01DU
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "Highlights include:

  Stable bugfixes:
   - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
   - xprtrdma: Fix Read chunk padding
   - xprtrdma: Per-connection pad optimization
   - xprtrdma: Disable pad optimization by default
   - xprtrdma: Reduce required number of send SGEs
   - nlm: Ensure callback code also checks that the files match
   - pNFS/flexfiles: If the layout is invalid, it must be updated before
     retrying
   - NFSv4: Fix reboot recovery in copy offload
   - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION
     replies to OP_SEQUENCE"
   - NFSv4: fix getacl head length estimation
   - NFSv4: fix getacl ERANGE for sum ACL buffer sizes

  Features:
   - Add and use dprintk_cont macros
   - Various cleanups to NFS v4.x to reduce code duplication and
     complexity
   - Remove unused cr_magic related code
   - Improvements to sunrpc "read from buffer" code
   - Clean up sunrpc timeout code and allow changing TCP timeout
     parameters
   - Remove duplicate mw_list management code in xprtrdma
   - Add generic functions for encoding and decoding xdr streams

  Bugfixes:
   - Clean up nfs_show_mountd_netid
   - Make layoutreturn_ops static and use NULL instead of 0 to fix
     sparse warnings
   - Properly handle -ERESTARTSYS in nfs_rename()
   - Check if register_shrinker() failed during rpcauth_init()
   - Properly clean up procfs/pipefs entries
   - Various NFS over RDMA related fixes
   - Silence unititialized variable warning in sunrpc"

* tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (64 commits)
  NFSv4: fix getacl ERANGE for some ACL buffer sizes
  NFSv4: fix getacl head length estimation
  Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
  NFSv4: Fix reboot recovery in copy offload
  pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
  NFSv4: Clean up owner/group attribute decode
  SUNRPC: Add a helper function xdr_stream_decode_string_dup()
  NFSv4: Remove bogus "struct nfs_client" argument from decode_ace()
  NFSv4: Fix the underestimation of delegation XDR space reservation
  NFSv4: Replace callback string decode function with a generic
  NFSv4: Replace the open coded decode_opaque_inline() with the new generic
  NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics
  SUNRPC: Add generic helpers for xdr_stream encode/decode
  sunrpc: silence uninitialized variable warning
  nlm: Ensure callback code also checks that the files match
  sunrpc: Allow xprt->ops->timer method to sleep
  xprtrdma: Refactor management of mw_list field
  xprtrdma: Handle stale connection rejection
  xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs
  xprtrdma: Shrink send SGEs array
  ...
2017-03-01 16:10:30 -08:00
Linus Torvalds
25c4e6c3f0 for-f2fs-4.11
This round introduces several interesting features such as on-disk NAT bitmaps,
 IO alignment, and a discard thread. And it includes a couple of major bug fixes
 as below.
 
 == Enhancement ==
 - introduce on-disk bitmaps to avoid scanning NAT blocks when getting free nids
 - support IO alignment to prepare open-channel SSD integration in future
 - introduce a discard thread to avoid long latency during checkpoint and fstrim
 - use SSR for warm node and enable inline_xattr by default
 - introduce in-memory bitmaps to check FS consistency for debugging
 - improve write_begin by avoiding needless read IO
 
 == Bug fix ==
 - fix broken zone_reset behavior for SMR drive
 - fix wrong victim selection policy during GC
 - fix missing behavior when preparing discard commands
 - fix bugs in atomic write support and fiemap
 - workaround to handle multiple f2fs_add_link calls having same name
 
 And it includes a bunch of clean-up patches as well.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYtmdOAAoJEEAUqH6CSFDSs0UP/AzngT37xVIhVBD13J9oHIuv
 rFA/eHVGRJmU1xc4SG1bghKm45xq8rwUX7irarfvLLc5aL+6VPGSdaRBykUr4A5N
 MN/bgK//EPp7If8EF+8PpY+9x7g67i0mtz5iD8dDrK+bUKV/IDKV1LWw5pR3g/g6
 RwMH0dUVOiD/HJ5iFp1ykTdVPe4vFY013uVmyPxUq+nCBlqlQm1nOvrGjF/HeYyX
 kqcD2LEc79GPfS5ebQIKfCfLE0rsWVnnS6YaqlDNCD5/oRim71CUtA4MPTYv29vp
 R/SebWlayEm+u68+uQUu6AyIk/1IdP0+AtRuQd/VxuteoyXmkTMHER662DqN4F8J
 npPdNrbNdlzwuAP77avy+hplqbD19yUa7o7Fl1No5rfheT3CiNTSj2uoriyEAffH
 1AM6tES7S7n5ttrXOr9iOxrK0u/vuaf7fbKVtK+RI09hwzdvyGB5HUdQB0iP/XR+
 obw8dru79ISMVZ9YuDhSfjI5ohAcfthfuqgjUt2RAfDv19IRsg5eayAp3T6nUfEX
 AGQbV/52dkO9svZztMbcBW95zmqkE0cMeX66KIMCPXNuDiE474t8k115K6kHpFwP
 e4Kx+mTSNhR1LEAaVdmCjbLb0gVrumVHTdjaZopnxTFmE70u/M6h1vY90m1LkReF
 ZDK5mhfMmGzU4wkvbgP8
 =tw8c
 -----END PGP SIGNATURE-----

Merge tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This round introduces several interesting features such as on-disk NAT
  bitmaps, IO alignment, and a discard thread. And it includes a couple
  of major bug fixes as below.

  Enhancements:

   - introduce on-disk bitmaps to avoid scanning NAT blocks when getting
     free nids

   - support IO alignment to prepare open-channel SSD integration in
     future

   - introduce a discard thread to avoid long latency during checkpoint
     and fstrim

   - use SSR for warm node and enable inline_xattr by default

   - introduce in-memory bitmaps to check FS consistency for debugging

   - improve write_begin by avoiding needless read IO

  Bug fixes:

   - fix broken zone_reset behavior for SMR drive

   - fix wrong victim selection policy during GC

   - fix missing behavior when preparing discard commands

   - fix bugs in atomic write support and fiemap

   - workaround to handle multiple f2fs_add_link calls having same name

  ... and it includes a bunch of clean-up patches as well"

* tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (97 commits)
  f2fs: avoid to flush nat journal entries
  f2fs: avoid to issue redundant discard commands
  f2fs: fix a plint compile warning
  f2fs: add f2fs_drop_inode tracepoint
  f2fs: Fix zoned block device support
  f2fs: remove redundant set_page_dirty()
  f2fs: fix to enlarge size of write_io_dummy mempool
  f2fs: fix memory leak of write_io_dummy mempool during umount
  f2fs: fix to update F2FS_{CP_}WB_DATA count correctly
  f2fs: use MAX_FREE_NIDS for the free nids target
  f2fs: introduce free nid bitmap
  f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint
  f2fs: update the comment of default nr_pages to skipping
  f2fs: drop the duplicate pval in f2fs_getxattr
  f2fs: Don't update the xattr data that same as the exist
  f2fs: kill __is_extent_same
  f2fs: avoid bggc->fggc when enough free segments are avaliable after cp
  f2fs: select target segment with closer temperature in SSR mode
  f2fs: show simple call stack in fault injection message
  f2fs: no need lock_op in f2fs_write_inline_data
  ...
2017-03-01 15:55:04 -08:00
Omar Sandoval
c4baad5029 virtio-console: avoid DMA from stack
put_chars() stuffs the buffer it gets into an sg, but that buffer may be
on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it
manifested as printks getting turned into NUL bytes).

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
2017-03-02 01:35:06 +02:00
Jason Wang
f889491380 vhost: introduce O(1) vq metadata cache
When device IOTLB is enabled, all address translations were stored in
interval tree. O(lgN) searching time could be slow for virtqueue
metadata (avail, used and descriptors) since they were accessed much
often than other addresses. So this patch introduces an O(1) array
which points to the interval tree nodes that store the translations of
vq metadata. Those array were update during vq IOTLB prefetching and
were reset during each invalidation and tlb update. Each time we want
to access vq metadata, this small array were queried before interval
tree. This would be sufficient for static mappings but not dynamic
mappings, we could do optimizations on top.

Test were done with l2fwd in guest (2M hugepage):

   noiommu  | before        | after
tx 1.32Mpps | 1.06Mpps(82%) | 1.30Mpps(98%)
rx 2.33Mpps | 1.46Mpps(63%) | 2.29Mpps(98%)

We can almost reach the same performance as noiommu mode.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 01:35:06 +02:00
Stephen Smalley
2651225b5e selinux: wrap cgroup seclabel support with its own policy capability
commit 1ea0ce4069 ("selinux: allow
changing labels for cgroupfs") broke the Android init program,
which looks up security contexts whenever creating directories
and attempts to assign them via setfscreatecon().
When creating subdirectories in cgroup mounts, this would previously
be ignored since cgroup did not support userspace setting of security
contexts.  However, after the commit, SELinux would attempt to honor
the requested context on cgroup directories and fail due to permission
denial.  Avoid breaking existing userspace/policy by wrapping this change
with a conditional on a new cgroup_seclabel policy capability.  This
preserves existing behavior until/unless a new policy explicitly enables
this capability.

Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02 10:27:40 +11:00
David Howells
0837e49ab3 KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()
rcu_dereference_key() and user_key_payload() are currently being used in
two different, incompatible ways:

 (1) As a wrapper to rcu_dereference() - when only the RCU read lock used
     to protect the key.

 (2) As a wrapper to rcu_dereference_protected() - when the key semaphor is
     used to protect the key and the may be being modified.

Fix this by splitting both of the key wrappers to produce:

 (1) RCU accessors for keys when caller has the key semaphore locked:

	dereference_key_locked()
	user_key_payload_locked()

 (2) RCU accessors for keys when caller holds the RCU read lock:

	dereference_key_rcu()
	user_key_payload_rcu()

This should fix following warning in the NFS idmapper

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.10.0 #1 Tainted: G        W
  -------------------------------
  ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage!
  other info that might help us debug this:
  rcu_scheduler_active = 2, debug_locks = 0
  1 lock held by mount.nfs/5987:
    #0:  (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4]
  stack backtrace:
  CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G        W       4.10.0 #1
  Call Trace:
    dump_stack+0xe8/0x154 (unreliable)
    lockdep_rcu_suspicious+0x140/0x190
    nfs_idmap_get_key+0x380/0x420 [nfsv4]
    nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4]
    decode_getfattr_attrs+0xfac/0x16b0 [nfsv4]
    decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4]
    nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4]
    rpcauth_unwrap_resp+0xe8/0x140 [sunrpc]
    call_decode+0x29c/0x910 [sunrpc]
    __rpc_execute+0x140/0x8f0 [sunrpc]
    rpc_run_task+0x170/0x200 [sunrpc]
    nfs4_call_sync_sequence+0x68/0xa0 [nfsv4]
    _nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4]
    nfs4_lookup_root+0xe0/0x350 [nfsv4]
    nfs4_lookup_root_sec+0x70/0xa0 [nfsv4]
    nfs4_find_root_sec+0xc4/0x100 [nfsv4]
    nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4]
    nfs4_get_rootfh+0x6c/0x190 [nfsv4]
    nfs4_server_common_setup+0xc4/0x260 [nfsv4]
    nfs4_create_server+0x278/0x3c0 [nfsv4]
    nfs4_remote_mount+0x50/0xb0 [nfsv4]
    mount_fs+0x74/0x210
    vfs_kern_mount+0x78/0x220
    nfs_do_root_mount+0xb0/0x140 [nfsv4]
    nfs4_try_mount+0x60/0x100 [nfsv4]
    nfs_fs_mount+0x5ec/0xda0 [nfs]
    mount_fs+0x74/0x210
    vfs_kern_mount+0x78/0x220
    do_mount+0x254/0xf70
    SyS_mount+0x94/0x100
    system_call+0x38/0xe0

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02 10:09:00 +11:00
David S. Miller
16c54ac903 First round of fixes - details in the commits:
* use a valid hrtimer clock ID in mac80211_hwsim
  * don't reorder frames prior to BA session
  * flush a delayed work at suspend so the state is all valid before
    suspend/resume
  * fix packet statistics in fast-RX, the RX packets
    counter increment was simply missing
  * don't try to re-transmit filtered frames in an aggregation session
  * shorten (for tracing) a debug message
  * typo fix in another debug message
  * fix nul-termination with HWSIM_ATTR_RADIO_NAME in hwsim
  * fix mgmt RX processing when station is looked up by driver/device
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAli1HCYACgkQa3t4Rpy0
 AB1Sfw/+MeYZtAqeem6JzNW7OPPHgJIXHAnDyby79jzDiYZ1GF2c2asWMwBMshuH
 dTYFs15gdkfItrUF9SsvGFLRaRlPJO899lgL54EwcALxqP82Byylm36X29HTUyzB
 MS3lzrr84Ad5F/BkHa949ogcoJwmkmhSYpG5L5EclJs98Vu8pyd/AlBJ+fbNFmHz
 Pe2lDiIjt5H5MvGCZfl5PK1l9GwDLIMhLekn03HEfCN5lNz2c8aru1dhcCXxWSPG
 Oxj2J4n0WknwntKLx5F/Wceee8YDRyUBw0peKYxp8VQPPLVzpoc9GKlaCDBemwVQ
 sfVfgkkZchKtAt1SPJS5yKMfUlEQdqE3w3XIzzRS9bN3AsVK5EKL63zDO7HJ+5nZ
 2hB8kY4fjGUx0RrZKKL0jruSl+AUP5bWJUIt2VM55FM195rXUkDSjsn3M1ZIwiHp
 8fvwLWibKoQg7Y0UIdVQSl03Gfd2dEs0XrvD+fNMI9aNy/EriJZ15pxbNAPSi10Z
 pN02zFCmqxju6w1tppxW9eBoxdbt45lr1cyXY+GN6RTgm2ILM9i+tQHoNf4KuWYl
 5dydXhDkRGpUghevR4K29vBYtoaNK0CteY2xECdRGUVFnMl4QUgHInOeKcSneHCi
 Cvp31WYTAP+U/8ni0VYENqEC0iRbUNrMu2mkECxriHt5go+n5jA=
 =wths
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2017-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
First round of fixes - details in the commits:
 * use a valid hrtimer clock ID in mac80211_hwsim
 * don't reorder frames prior to BA session
 * flush a delayed work at suspend so the state is all valid before
   suspend/resume
 * fix packet statistics in fast-RX, the RX packets
   counter increment was simply missing
 * don't try to re-transmit filtered frames in an aggregation session
 * shorten (for tracing) a debug message
 * typo fix in another debug message
 * fix nul-termination with HWSIM_ATTR_RADIO_NAME in hwsim
 * fix mgmt RX processing when station is looked up by driver/device
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 15:08:34 -08:00
Eric Dumazet
449809a66c tcp/dccp: block BH for SYN processing
SYN processing really was meant to be handled from BH.

When I got rid of BH blocking while processing socket backlog
in commit 5413d1babe ("net: do not block BH while processing socket
backlog"), I forgot that a malicious user could transition to TCP_LISTEN
from a state that allowed (SYN) packets to be parked in the socket
backlog while socket is owned by the thread doing the listen() call.

Sure enough syzkaller found this and reported the bug ;)

=================================
[ INFO: inconsistent lock state ]
4.10.0+ #60 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
 (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
net/ipv4/inet_hashtables.c:407
{IN-SOFTIRQ-W} state was registered at:
  mark_irqflags kernel/locking/lockdep.c:2923 [inline]
  __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
  lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:299 [inline]
  inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
  reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
  inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
  tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
  tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
  tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
  tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
  tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
  ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
  dst_input include/net/dst.h:492 [inline]
  ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
  __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
  __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
  netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
  napi_skb_finish net/core/dev.c:4602 [inline]
  napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
  e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
  e1000_clean_rx_irq+0x5e0/0x1490
drivers/net/ethernet/intel/e1000/e1000_main.c:4489
  e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
  napi_poll net/core/dev.c:5171 [inline]
  net_rx_action+0xe70/0x1900 net/core/dev.c:5236
  __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
  invoke_softirq kernel/softirq.c:364 [inline]
  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
  do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
  ret_from_intr+0x0/0x20
  native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
  default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
  default_idle_call+0x36/0x60 kernel/sched/idle.c:96
  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
  do_idle+0x348/0x440 kernel/sched/idle.c:243
  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
  start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
  verify_cpu+0x0/0xfc
irq event stamp: 1741
hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
[inline]
hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
_raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
_raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
softirqs last  enabled at (1738): [<ffffffff84d4deff>]
__do_softirq+0x7cf/0xb7d kernel/softirq.c:310
softirqs last disabled at (1571): [<ffffffff84d4b92c>]
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&hashinfo->ehash_locks[i])->rlock);
  <Interrupt>
    lock(&(&hashinfo->ehash_locks[i])->rlock);

 *** DEADLOCK ***

1 lock held by syz-executor0/5090:
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
include/net/sock.h:1460 [inline]
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
sock_setsockopt+0x233/0x1e40 net/core/sock.c:683

stack backtrace:
CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
 valid_state kernel/locking/lockdep.c:2400 [inline]
 mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
 mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
 mark_irqflags kernel/locking/lockdep.c:2941 [inline]
 __lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
 lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:299 [inline]
 inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
 reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
 inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
 dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
 dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
 dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
 sk_backlog_rcv include/net/sock.h:896 [inline]
 __release_sock+0x127/0x3a0 net/core/sock.c:2052
 release_sock+0xa5/0x2b0 net/core/sock.c:2539
 sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
 SYSC_setsockopt net/socket.c:1782 [inline]
 SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000

Fixes: 5413d1babe ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 15:03:31 -08:00
Gary Lin
eba38a9682 bpf: update the comment about the length of analysis
Commit 07016151a4 ("bpf, verifier: further improve search
pruning") increased the limit of processed instructions from
32k to 64k, but the comment still mentioned the 32k limit.
This commit updates the comment to reflect the change.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Gary Lin <glin@suse.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 14:56:50 -08:00
Yotam Gigi
df2c43343b bridge: Fix error path in nbp_vlan_init
Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
call failes, the vlan_hash wouldn't be destroyed before inited.

Fixes: efa5356b0d ("bridge: per vlan dst_metadata netlink support")
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 14:55:28 -08:00
Gabriel Krisman Bertazi
35f5022fbe drm: Update drm_fbdev_cma_init documentation
Commit be7f735cd5ea ("drm: Rely on mode_config data for fb_helper
initialization") dropped the num_crtc argument.  Update the
documentation to reflect that and prevent the kernel-doc warnings below:

./drivers/gpu/drm/drm_fb_cma_helper.c:557: warning: Excess function parameter 'num_crtc' description in 'drm_fbdev_cma_init'
./drivers/gpu/drm/drm_fb_cma_helper.c:558: warning: Excess function parameter 'num_crtc' description in 'drm_fbdev_cma_init'

Fixes: be7f735cd5ea ("drm: Rely on mode_config data for fb_helper initialization")
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/87o9xkvn2m.fsf@dilma.collabora.co.uk
2017-03-01 23:52:35 +01:00
Pavel Shilovsky
61cfac6f26 CIFS: Fix possible use after free in demultiplex thread
The recent changes that added SMB3 encryption support introduced
a possible use after free in the demultiplex thread. When we
process an encrypted packed we obtain a pointer to SMB session
but do not obtain a reference. This can possibly lead to a situation
when this session was freed before we copy a decryption key from
there. Fix this by obtaining a copy of the key rather than a pointer
to the session under a spinlock.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 16:42:40 -06:00
Rafael J. Wysocki
6bff9c609f Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull changes related to turbostat for v4.11 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (44 commits)
  tools/power turbostat: version 17.02.24
  tools/power turbostat: bugfix: --add u32 was printed as u64
  tools/power turbostat: show error on exec
  tools/power turbostat: dump p-state software config
  tools/power turbostat: show package number, even without --debug
  tools/power turbostat: support "--hide C1" etc.
  tools/power turbostat: move --Package and --processor into the --cpu option
  tools/power turbostat: turbostat.8 update
  tools/power turbostat: update --list feature
  tools/power turbostat: use wide columns to display large numbers
  tools/power turbostat: Add --list option to show available header names
  tools/power turbostat: fix zero IRQ count shown in one-shot command mode
  tools/power turbostat: add --cpu parameter
  tools/power turbostat: print sysfs C-state stats
  tools/power turbostat: extend --add option to accept /sys path
  tools/power turbostat: skip unused counters on BDX
  tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
  tools/power turbostat: skip unused counters on SKX
  tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
  tools/power turbostat: initial Gemini Lake SOC support
  ...
2017-03-01 23:34:38 +01:00
Viresh Kumar
28db0c7b1c PM / OPP: Documentation: Fix opp-microvolt in examples
The triplet present in "opp-microvolt" property should be in the order
<target min max>, while all the examples have it in the order
<min target max>.

Fix it.

Luckily all of the users of "opp-microvolt" property have applied brain
instead of copying the examples from documentation and none of the
actual dts files have it wrong.

Reported-by: Rob Herring <robh@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-01 22:47:30 +01:00
Max Filippov
b46dcfa378 xtensa: allow merging vectors into .text section
Currently code for exception/IRQ vectors is stored in kernel image as
initialization data and is copied to its working addresses during
startup. It doesn't always make sense. In many cases vectors location
can be automatically decided at kernel link time and code can be placed
right there. This is especially useful for XIP kernel.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-01 12:32:50 -08:00
Max Filippov
9a736fcb09 xtensa: clean up bootable image build targets
Currently xtensa uses 'zImage' as a synonym of 'all', but in fact xtensa
supports three targets: 'Image' (ELF image with reset vector), 'zImage'
(compressed redboot image) and 'uImage' (U-Boot image).
Provide separate 'Image', 'zImage' and 'uImage' make targets that only
build corresponding image type. Make 'all' build all images appropriate
for a platform.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-01 12:32:50 -08:00
Max Filippov
4ab18701c6 xtensa: move parse_tag_fdt out of #ifdef CONFIG_BLK_DEV_INITRD
FDT tag parsing is not related to whether BLK_DEV_INITRD is configured
or not, move it out of the corresponding #ifdef/#endif block.
This fixes passing external FDT to the kernel configured w/o
BLK_DEV_INITRD support.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-01 12:31:52 -08:00
Chris Wilson
02e012f172 drm/i915: Move w/a LRI debug message from context-init to driver load
The spam of every context initialisation saying the same thing is annoying
me! Move the information to the setup of the engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170301121131.11588-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-03-01 20:29:24 +00:00
Chris Zhong
80a9a059d4 drm/rockchip/dsi: add dw-mipi power domain support
Reference the power domain incase dw-mipi power down when
in use.

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1487577744-2855-8-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:03 -05:00
Chris Zhong
ad1c974bf1 drm/rockchip/dsi: fix insufficient bandwidth of some panel
Set the lanes bps to 1 / 0.9 times of pclk, the margin is not enough
for some panel, it will cause the screen display is not normal, so
increases the badnwidth to 1 / 0.8.

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1487577744-2855-7-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:03 -05:00
Chris Zhong
7df1207f3b dt-bindings: add power domain node for dw-mipi-rockchip
Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1487577744-2855-6-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:02 -05:00
Chris Zhong
975f4aa24f drm/rockchip/dsi: remove mode_valid function
The MIPI DSI do not need check the validity of resolution, the max
resolution should depend VOP. Hence, remove rk3288_mipi_dsi_mode_valid
here.

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1487577744-2855-5-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:02 -05:00
Chris Zhong
a432e05405 drm/rockchip/dsi: dw-mipi: correct the coding style
correct the coding style, according the checkpatch scripts

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1487577744-2855-4-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:01 -05:00
Chris Zhong
ef6eba1992 drm/rockchip/dsi: dw-mipi: support RK3399 mipi dsi
The vopb/vopl switch register of RK3399 mipi is different from RK3288,
the default setting for mipi dsi mode is different too, so add a
of_device_id structure to distinguish them, and make sure set the
correct mode before mipi phy init.

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1487577744-2855-3-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:01 -05:00
Chris Zhong
7fea5243d1 dt-bindings: add rk3399 support for dw-mipi-rockchip
The dw-mipi-dsi of rk3399 is almost the same as rk3288, the rk3399 has
additional phy config clock.

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1487577744-2855-2-git-send-email-zyw@rock-chips.com
2017-03-01 14:49:00 -05:00
John Keeping
f3b7a5b838 drm/rockchip: dw-mipi-dsi: add reset control
In order to fully reset the state of the MIPI controller we must assert
this reset.

This is slightly more complicated than it could be in order to maintain
compatibility with device trees that do not specify the reset property.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Chris Zhong <zyw@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-24-john@metanate.com
2017-03-01 14:48:59 -05:00
John Keeping
03a5832c0e drm/rockchip: dw-mipi-dsi: support non-burst modes
Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-23-john@metanate.com
2017-03-01 14:48:59 -05:00
John Keeping
2f8f2d2991 drm/rockchip: dw-mipi-dsi: defer probe if panel is not loaded
This ensures that the output resolution is known before fbcon loads.
mipi_dsi_host_register() is moved above dw_mipi_dsi_register() to
simplify error cleanup since the order of these operations does not
matter.

Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-22-john@metanate.com
2017-03-01 14:48:58 -05:00
John Keeping
d790ad03ed drm/rockchip: vop: test for P{H,V}SYNC
When connected to the MIPI DSI output, we need to use N{H,V}SYNC for the
internal connection but these flags are meaningless for DSI panels.
Switch the test so that we do not set the P{H,V}SYNC bits unless the
mode requires it.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Mark Yao <mark.yao@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
[seanpaul resolved conflict using macros instead of hardcoded values]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-21-john@metanate.com
2017-03-01 14:48:58 -05:00
John Keeping
2b0c4b70b1 drm/rockchip: dw-mipi-dsi: use positive check for N{H, V}SYNC
This matches other drivers.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-20-john@metanate.com
2017-03-01 14:48:57 -05:00
John Keeping
4413697141 drm/rockchip: dw-mipi-dsi: use specific poll helper
As the documentation for readx_poll_timeout says, we want to use the
specialized macro for readl rather than using the generic version
directly.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-19-john@metanate.com
2017-03-01 14:48:57 -05:00
John Keeping
b0a45fec59 drm/rockchip: dw-mipi-dsi: improve PLL configuration
The multiplication ratio for the PLL is required to be even due to the
use of a "by 2 pre-scaler".  Currently we are likely to end up with an
odd multiplier even though there is an equivalent set of parameters with
an even multiplier.

For example, using the 324MHz bit rate with a reference clock of 24MHz
we end up with M = 27, N = 2 whereas the example in the PHY databook
gives M = 54, N = 4 for this bit rate and reference clock.

By walking down through the available multiplier instead of up we are
more likely to hit an even multiplier.  With the above example we do now
get M = 54, N = 4 as given by the databook.

While doing this, change the loop limits to encode the actual limits on
the divisor, which are:

	40MHz >= (pllref / N) >= 5MHz

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-18-john@metanate.com
2017-03-01 14:48:56 -05:00
John Keeping
3fdfb4f170 drm/rockchip: dw-mipi-dsi: properly configure PHY timing
These values are specified as constant time periods but the PHY
configuration is in terms of the current lane byte clock so using
constant values guarantees that the timings will be outside the
specification with some display configurations.

Derive the necessary configuration from the byte clock in order to
ensure that the PHY configuration is correct.

Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-17-john@metanate.com
2017-03-01 14:48:56 -05:00
John Keeping
d969c1553c drm/rockchip: dw-mipi-dsi: configure PHY before enabling
The bias, bandgap and PLL should all be configured before we enable
them.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-16-john@metanate.com
2017-03-01 14:48:55 -05:00
John Keeping
efe83cee34 drm/rockchip: dw-mipi-dsi: ensure PHY is reset
Also don't power up the DSI host at this point since this is not
necessary in order to configure the PHY and we do so later when
selecting video or command mode.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-15-john@metanate.com
2017-03-01 14:48:55 -05:00
John Keeping
1bef24bae2 drm/rockchip: dw-mipi-dsi: fix escape clock rate
This clock rate is derived from the PHY PLL, so it should be calculated
dynamically.  This calculation is the same as that used by the vendor
kernel and ensures that the escape clock runs at <20MHz as required by
the MIPI specification.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-14-john@metanate.com
2017-03-01 14:48:54 -05:00
John Keeping
96ad6f0b8d drm/rockchip: dw-mipi-dsi: allow commands in panel_disable
Panel drivers may want to sent commands during the disable function, for
example MIPI_DCS_SET_DISPLAY_OFF before the video signal ends.  In order
to send commands we need to write to registers, so pclk must be enabled.

While changing this, remove the unnecessary code after the panel
unprepare call which seems to be a workaround for a specific panel and
thus belongs in the panel driver.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Chris Zhong <zyw@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170224125506.21533-13-john@metanate.com
2017-03-01 14:48:54 -05:00