8ed62e91c4
The client_obd::cl_default_mds_easize field should track the largest
observed EA size advertised by the MDT, subject to a reasonable upper
bound. The MDC uses cl_default_mds_easize to calculate the initial size
of request buffers. The default value should be small enough to avoid
wasted memory and excessive use of vmalloc(), yet large enough to
accommodate the common use case.

In the current code, the default value is only updated if
client_obd::cl_max_mds_easize is strictly less than
mdt_body::mbo_max_mdsize. This condition is almost never met, because
client_obd::cl_max_mds_easize is computed at client mount-time based on
the number of OSTs in the filesystem, so the MDT won't ever observe and
advertise an EA size larger than that. As a result,
client_obd::cl_default_mds_easize indefinitely retains its initial
value, which is computed at client mount-time based on the filesystem's
default stripe width. Any getattr() requests for widely striped files
will consequently allocate a request buffer that is too small, forcing
reallocations on both the client and server side.

To avoid this, update client_obd::cl_default_mds_easize independently of
the value of client_obd::cl_max_mds_easize.

In addition, this patch includes these changes:

- Add comments to the client_obd structure to clarify what the
  cl_{default,max}_mds_{cookie,ea}size values mean.

- Prevent mdc_get_info() from storing uninitialized data in
  client_obd::cl_max_mds_cookiesize.

- Use 4096 as an upper bound for the default values. The former bound
  of PAGE_CACHE_SIZE is too large on 64k-page platforms (i.e. PPC), so
  it fails to prevent the vmalloc() spinlock contention described in
  LU-3338. The new value was chosen to be large enough to accommodate
  common use cases while staying well below the 16k threshold at which
  allocations start using vmalloc().

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Kyle Blatter <kyleblatter@llnl.gov>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5549
Reviewed-on: http://review.whamcloud.com/11614
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
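
The behavioural change described in the commit message can be illustrated with a small standalone sketch. This is not the patch itself: the struct, the function names, and the EASIZE_DEFAULT_CAP macro are hypothetical stand-ins for the client_obd fields and the 4096-byte bound mentioned above. The "old" variant also applies the new cap for simplicity, whereas the message says the former bound was PAGE_CACHE_SIZE. The sketch further assumes that the value advertised by the MDT is already the largest EA size it has observed.

```c
#include <stdint.h>

/* Hypothetical name for the 4096-byte upper bound from the commit message. */
#define EASIZE_DEFAULT_CAP 4096u

/* Simplified stand-in for the two client_obd fields under discussion. */
struct easize_state {
	uint32_t max_easize;     /* computed at mount time from the OST count */
	uint32_t default_easize; /* initial request-buffer size hint */
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Old behaviour as described: the default is touched only when the
 * advertised size exceeds the mount-time maximum, which rarely happens.
 */
static void update_default_old(struct easize_state *s, uint32_t advertised)
{
	if (s->max_easize < advertised) {
		s->max_easize = advertised;
		s->default_easize = min_u32(advertised, EASIZE_DEFAULT_CAP);
	}
}

/*
 * New behaviour as described: the default follows the advertised size
 * regardless of the maximum, but never exceeds the cap.
 */
static void update_default_new(struct easize_state *s, uint32_t advertised)
{
	if (s->max_easize < advertised)
		s->max_easize = advertised;
	s->default_easize = min_u32(advertised, EASIZE_DEFAULT_CAP);
}
```

With a mount-time maximum of 8192 and an initial default of 128, an advertised size of 2048 leaves the default at 128 under the old logic but raises it to 2048 under the new logic; an advertised size of 6000 is clamped to 4096.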

include/linux
lnet
lustre
Kconfig
Makefile
README.txt
sysfs-fs-lustre
TODO

Lustre Parallel Filesystem Client
=================================

The Lustre file system is an open-source, parallel file system
that supports many requirements of leadership class HPC simulation
environments.
Born from a research project at Carnegie Mellon University,
the Lustre file system is a widely-used option in HPC.
The Lustre file system provides a POSIX compliant file system interface,
can scale to thousands of clients, petabytes of storage and
hundreds of gigabytes per second of I/O bandwidth.

Unlike shared disk storage cluster filesystems (e.g. OCFS2, GFS, GPFS),
Lustre has independent Metadata and Data servers that clients can access
in parallel to maximize performance.

In order to use the Lustre client you will need to download the
"lustre-client" package that contains the userspace tools from
http://lustre.org/download/

You will need to install and configure your Lustre servers separately.

Mount Syntax
============

After you have installed the lustre-client tools, including the
mount.lustre binary, you can mount your Lustre filesystem with:

mount -t lustre mgs:/fsname mnt

where mgs is the host name or IP address of your Lustre MGS (management
service) and fsname is the name of the filesystem you would like to mount.

Mount Options
=============

  noflock
        Disable POSIX file locking (applications trying to use
        the functionality will get ENOSYS).

  localflock
        Enable local flock support, using only client-local flock
        (faster, for applications that require flock but do not run
        on multiple nodes).

  flock
        Enable cluster-global POSIX file locking coherent across all
        client nodes.

  user_xattr, nouser_xattr
        Support "user." extended attributes (or not).

  user_fid2path, nouser_fid2path
        Enable FID to path translation by regular users (or not).

  checksum, nochecksum
        Verify data consistency on the wire and in memory as it passes
        between the layers (or not).

  lruresize, nolruresize
        Allow the lock LRU to be controlled by memory pressure on the
        server (or limit it to a fixed number of locks per CPU per
        server on this client: 100 by default, controlled by the
        lru_size proc parameter).

  lazystatfs, nolazystatfs
        Do not block in statfs() if some of the servers are down.

  32bitapi
        Shrink inode numbers to fit into 32 bits. This is necessary
        if you plan to re-export the Lustre filesystem from this client
        via NFSv4.

  verbose, noverbose
        Enable mount/umount console messages (or not).

More Information
================

You can get more information at the Lustre website: http://wiki.lustre.org/

Source for the userspace tools and out-of-tree client and server code
is available at: http://git.hpdd.intel.com/fs/lustre-release.git

Latest binary packages:
http://lustre.org/download/