Commit Graph

14 Commits

Author SHA1 Message Date
Jeff Layton
849dc3244c nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
I hit the following oops out of the blue while testing with flexfiles:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
IP: [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
PGD 44031067 PUD 5062d067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: nfsv3 nfs_layout_flexfiles tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bonding ipmi_devintf ipmi_msghandler snd_hda_codec_generic virtio_balloon ppdev snd_hda_intel snd_hda_controller snd_hda_codec iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core parport_pc snd_hwdep parport snd_seq snd_seq_device snd_pcm snd_timer acpi_cpufreq
 snd soundcore i2c_piix4 xfs libcrc32c joydev virtio_net virtio_console qxl drm_kms_helper ttm crc32c_intel drm virtio_pci serio_raw ata_generic virtio_ring virtio pata_acpi
CPU: 0 PID: 19138 Comm: test5 Not tainted 4.1.9-100.pd.90.el7.x86_64 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
task: ffff88007b70cf00 ti: ffff88004cc44000 task.ti: ffff88004cc44000
RIP: 0010:[<ffffffffa048f6b8>]  [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
RSP: 0018:ffff88004cc47890  EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffff880050932300 RCX: ffff88006978f488
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003e0e8540
RBP: ffff88004cc47908 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88007ff8c758 R11: 0000000000000005 R12: ffff88003e0e8540
R13: 0000000000000000 R14: ffff88006978f488 R15: ffff88004431cc80
FS:  00007fea40c7c740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e8 CR3: 0000000044318000 CR4: 00000000000406f0
Stack:
 ffffffffa048c934 ffff880050932310 0000000100000001 ffff88006978f510
 ffff88006978f3c8 ffff88003e56cd90 ffff88004cc479d0 00000020a052aff0
 000000000004b000 ffff88004cc47908 ffff880050932300 ffff88004cc479d0
Call Trace:
 [<ffffffffa048c934>] ? ff_layout_write_pagelist+0x64/0x220 [nfs_layout_flexfiles]
 [<ffffffffa057a3bf>] pnfs_generic_pg_writepages+0xaf/0x1b0 [nfsv4]
 [<ffffffffa051ab57>] nfs_pageio_doio+0x27/0x60 [nfs]
 [<ffffffffa051bfe4>] nfs_pageio_complete_mirror+0x54/0xa0 [nfs]
 [<ffffffffa051c7ad>] nfs_pageio_complete+0x2d/0x90 [nfs]
 [<ffffffffa052032d>] nfs_writepage_locked+0x8d/0xe0 [nfs]
 [<ffffffff811e4630>] ? page_referenced_one+0x1a0/0x1a0
 [<ffffffffa05210e7>] nfs_wb_single_page+0xf7/0x190 [nfs]
 [<ffffffffa05108d1>] nfs_launder_page+0x41/0x90 [nfs]
 [<ffffffff811b8930>] invalidate_inode_pages2_range+0x340/0x3a0
 [<ffffffff811b89a7>] invalidate_inode_pages2+0x17/0x20
 [<ffffffffa0513e1e>] nfs_release+0x9e/0xb0 [nfs]
 [<ffffffffa050fa1d>] nfs_file_release+0x3d/0x60 [nfs]
 [<ffffffff8122481c>] __fput+0xdc/0x1e0
 [<ffffffff8122496e>] ____fput+0xe/0x10
 [<ffffffff810bde67>] task_work_run+0xa7/0xe0
 [<ffffffff810af735>] get_signal+0x565/0x600
 [<ffffffff811a9815>] ? __filemap_fdatawrite_range+0x65/0x90
 [<ffffffff810144a7>] do_signal+0x37/0x730
 [<ffffffffa0569921>] ? nfs4_file_fsync+0x81/0x150 [nfsv4]
 [<ffffffff81254dbb>] ? vfs_fsync_range+0x3b/0xb0
 [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280
 [<ffffffff81014bff>] do_notify_resume+0x5f/0xa0
 [<ffffffff8178ec3c>] int_signal+0x12/0x17
Code: 48 8b 40 70 8b 00 83 f8 03 74 20 83 f8 04 75 13 55 48 89 ce 48 89 d7 48 89 e5 e8 14 0f 0e 00 5d c3 66 90 0f 0b 66 0f 1f 44 00 00 <48> 8b 82 e8 00 00 00 c3 66 66 66 66 90 55 48 89 e5 41 57 41 56
RIP  [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
 RSP <ffff88004cc47890>
CR2: 00000000000000e8

When the DS connection attempt fails, nfs4_ff_layout_prepare_ds marks it
for the error but then just returns the ds as if it were usable. The
comments though say:

  /* Upon return, either ds is connected, or ds is NULL */

Ensure that we set the return pointer to NULL in the event that the
connection attempt fails.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-03-16 15:46:48 -04:00
Trond Myklebust
2370abdab5 NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE
NFS_LAYOUT_RETURN_BEFORE_CLOSE is being used to signal that a
layoutreturn is needed, either due to a layout recall or to a
layout error. Rename it to NFS_LAYOUT_RETURN_REQUESTED in order
to clarify its purpose.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-27 20:40:05 -05:00
Trond Myklebust
6d45c042f3 Merge branch 'bugfixes'
* bugfixes:
  pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
  pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
2016-01-22 11:02:36 -05:00
Trond Myklebust
b819ed4b2a pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
When we hit 22 errors, we start to overflow the memory buffers allocated
to the LAYOUTRETURN errors. The issue is that currently, RPC call reply
ordering determines how successful we are in merging errors that refer
to contiguous READ or WRITE requests.

Fix is to use an insertion sort to help detect contiguity.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-21 15:49:40 -05:00
Trond Myklebust
2e5b29f044 pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGET
Fix a bug in which flexfiles clients are falling back to I/O through the
MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set.

The flexfiles client will always report errors through the LAYOUTRETURN
and/or LAYOUTERROR mechanisms, so it should normally be safe for it
to retry the LAYOUTGET until it fails or succeeds.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:40 -05:00
Trond Myklebust
889d94d49a NFSv4.1/flexfiles: Mark layout for return if the mirrors are invalid
If a read-write layout has an invalid mirror, then we should
mark it as invalid, and return it.
If a read-only layout has an invalid mirror, then mark it as invalid
and check if there is still at least one valid mirror before we return
it.

Note: Also fix incorrect use of pnfs_generic_mark_devid_invalid().
We really want nfs4_mark_deviceid_unavailable().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-01 15:12:11 -07:00
Trond Myklebust
81d6dc8b34 NFSv4.1/flexfiles: RW layouts are valid only if all mirrors are valid
Unlike read layouts, the writeable layout cannot fall back to using only
one of the mirrors. It need to write to all of them.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-01 15:12:11 -07:00
Trond Myklebust
388ef16640 NFSv4.1/flexfiles: Fix incorrect usage of pnfs_generic_mark_devid_invalid()
Unlike the files layout, flexfiles does not test for the NFS_DEVICEID_INVALID
flag. Instead it relies on NFS_DEVICEID_UNAVAILABLE.
Fix is to replace with nfs4_mark_deviceid_unavailable().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-01 15:12:11 -07:00
Trond Myklebust
d13549074c NFSv4.1/flexfiles: Fix a protocol error in layoutreturn
According to the flexfiles protocol, the layoutreturn should specify an
array of errors in the following format:

struct ff_ioerr4 {
	offset4        ffie_offset;
	length4        ffie_length;
	stateid4       ffie_stateid;
	device_error4  ffie_errors<>;
};

This patch fixes up the code to ensure that our ffie_errors is indeed
encoded as an array (albeit with only a single entry).

Reported-by: Tom Haynes <thomas.haynes@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27 20:42:20 -04:00
Jeff Layton
0c8315dd56 nfs: always update creds in mirror, even when we have an already connected ds
A ds can be associated with more than one mirror, but we currently skip
setting a mirror's credentials if we find that it's already set up with
a connected client.

The upshot is that we can end up sending DS writes with MDS credentials
instead of properly setting them up. Fix nfs4_ff_layout_prepare_ds to
always verify that the mirror's credentials are set up, even when we
have a DS that's already connected.

Reported-by: Tom Haynes <thomas.haynes@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-25 19:35:21 -04:00
Jeff Layton
a24221dca1 nfs: fix potential credential leak in ff_layout_update_mirror_cred
If we have two tasks racing to update a mirror's credentials, then they
can end up leaking one (or more) sets of credentials. The first task
will set mirror->cred and then the second task will just overwrite it.

Use a cmpxchg to ensure that the creds are only set once. If we get to
the point where we would set mirror->cred and find that they're already
set, then we just release the creds that were just found.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-25 19:34:40 -04:00
Trond Myklebust
84a80f62f7 NFSv4.1: Convert pNFS deviceid to use kfree_rcu()
Use of synchronize_rcu() when unmounting and potentially freeing a lot
of deviceids is problematic. There really is no reason why we can't just
use kfree_rcu() here.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-27 12:32:24 -04:00
Tom Haynes
480486b473 pnfs/flexfiles: Do not dprintk after the free
Found by 0-DAY kernel test infrastructure:

fs/nfs/flexfilelayout/flexfilelayoutdev.c:520:13-16: ERROR: reference preceded by free on line 518
fs/nfs/flexfilelayout/flexfilelayoutdev.c:520:26-29: ERROR: reference preceded by free on line 518
fs/nfs/flexfilelayout/flexfilelayoutdev.c:520:39-42: ERROR: reference preceded by free on line 518
fs/nfs/flexfilelayout/flexfilelayoutdev.c:521:3-6: ERROR: reference preceded by free on line 518

Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-09 22:34:29 -05:00
Tom Haynes
d67ae825a5 pnfs/flexfiles: Add the FlexFile Layout Driver
The flexfile layout is a new layout that extends the
file layout. It is currently being drafted as a specification at
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-layout-types/

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Tao Peng <bergwolf@primarydata.com>
2015-02-03 11:06:52 -08:00