mainlining shenanigans
Go to file
Kirill Smelkov 1fb027d759 fuse: require /dev/fuse reads to have enough buffer capacity (take 2)
[ This retries commit d4b13963f2 ("fuse: require /dev/fuse reads to have
enough buffer capacity"), which was reverted.  In this version we require
only `sizeof(fuse_in_header) + sizeof(fuse_write_in)` instead of 4K for
FUSE request header room, because, contrary to libfuse and kernel client
behaviour, GlusterFS actually provides only so much room for request
header. ]

A FUSE filesystem server queues /dev/fuse sys_read calls to get filesystem
requests to handle. It does not know in advance what would be that request
as it can be anything that client issues - LOOKUP, READ, WRITE, ... Many
requests are short and retrieve data from the filesystem. However WRITE and
NOTIFY_REPLY write data into filesystem.

Before getting into operation phase, FUSE filesystem server and kernel
client negotiate what should be the maximum write size the client will ever
issue. After negotiation the contract in between server/client is that the
filesystem server then should queue /dev/fuse sys_read calls with enough
buffer capacity to receive any client request - WRITE in particular, while
FUSE client should not, in particular, send WRITE requests with >
negotiated max_write payload. FUSE client in kernel and libfuse
historically reserve 4K for request header. However an existing filesystem
server - GlusterFS - was found which reserves only 80 bytes for header room
(= `sizeof(fuse_in_header) + sizeof(fuse_write_in)`).

Since

	`sizeof(fuse_in_header) + sizeof(fuse_write_in)` ==
	`sizeof(fuse_in_header) + sizeof(fuse_read_in)`  ==
	`sizeof(fuse_in_header) + sizeof(fuse_notify_retrieve_in)`

is the absolute minimum any sane filesystem should be using for header
room, the contract is that filesystem server should queue sys_reads with
`sizeof(fuse_in_header) + sizeof(fuse_write_in)` + max_write buffer.

If the filesystem server does not follow this contract, what can happen
is that fuse_dev_do_read will see that request size is > buffer size,
and then it will return EIO to client who issued the request but won't
indicate in any way that there is a problem to filesystem server.
This can be hard to diagnose because for some requests, e.g. for
NOTIFY_REPLY which mimics WRITE, there is no client thread that is
waiting for request completion and that EIO goes nowhere, while on
filesystem server side things look like the kernel is not replying back
after successful NOTIFY_RETRIEVE request made by the server.

We can make the problem easy to diagnose if we indicate via error return to
filesystem server when it is violating the contract.  This should not
practically cause problems because if a filesystem server is using shorter
buffer, writes to it were already very likely to cause EIO, and if the
filesystem is read-only it should be too following FUSE_MIN_READ_BUFFER
minimum buffer size.

Please see [1] for context where the problem of stuck filesystem was hit
for real (because kernel client was incorrectly sending more than
max_write data with NOTIFY_REPLY; see also previous patch), how the
situation was traced and for more involving patch that did not make it
into the tree.

[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-02 11:07:30 +02:00
arch This pull request contains a single bug fix for UML: 2019-08-25 11:40:24 -07:00
block block: remove REQ_NOWAIT_INLINE 2019-08-15 11:09:16 -06:00
certs Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
crypto USB / PHY patches for 5.3-rc1 2019-07-11 15:40:06 -07:00
Documentation Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-08-25 10:10:15 -07:00
drivers A minor auxdisplay improvement: 2019-08-25 11:43:17 -07:00
fs fuse: require /dev/fuse reads to have enough buffer capacity (take 2) 2019-09-02 11:07:30 +02:00
include This pull request contains the following fixes for UBIFS and JFFS2: 2019-08-25 11:29:27 -07:00
init Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
ipc Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
kernel Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-08-25 10:08:01 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-08-06 17:11:59 -07:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y 2019-08-24 19:48:42 -07:00
net Three important fixes tagged for stable (an indefinite hang, a crash on 2019-08-23 09:19:38 -07:00
samples auxdisplay: Fix a typo in cfag12864b-example.c 2019-08-08 20:00:18 +02:00
scripts SPDX fixes for 5.3-rc5 2019-08-18 09:26:16 -07:00
security KEYS: trusted: allow module init if TPM is inactive or deactivated 2019-08-13 19:59:23 +03:00
sound sound fixes for 5.3-rc5 2019-08-16 08:49:45 -07:00
tools - Fix for panics and network failures on PAE guests by Dexuan Cui. 2019-08-24 11:42:06 -07:00
usr kbuild: enable arch/s390/include/uapi/asm/zcrypt.h for uapi header test 2019-07-23 10:45:46 +02:00
virt KVM/arm fixes for 5.3, take #3 2019-08-24 12:46:30 +01:00
.clang-format Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-17 11:26:25 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore .gitignore: Add compilation database file 2019-07-27 12:18:19 +09:00
.mailmap MAINTAINERS: Update my email address 2019-07-22 14:57:50 +01:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS Remove references to dead website. 2019-07-19 12:22:04 -07:00
Kbuild Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS for-linus-20190823 2019-08-23 14:45:45 -07:00
Makefile Linux 5.3-rc6 2019-08-25 12:01:23 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.