linux

Author	SHA1	Message	Date
Roman Gushchin	f1782c9bc5	dcache: account external names as indirectly reclaimable memory I received a report about suspicious growth of unreclaimable slabs on some machines. I've found that it happens on machines with low memory pressure, and these unreclaimable slabs are external names attached to dentries. External names are allocated using generic kmalloc() function, so they are accounted as unreclaimable. But they are held by dentries, which are reclaimable, and they will be reclaimed under the memory pressure. In particular, this breaks MemAvailable calculation, as it doesn't take unreclaimable slabs into account. This leads to a silly situation, when a machine is almost idle, has no memory pressure and therefore has a big dentry cache. And the resulting MemAvailable is too low to start a new workload. To address the issue, the NR_INDIRECTLY_RECLAIMABLE_BYTES counter is used to track the amount of memory, consumed by external names. The counter is increased in the dentry allocation path, if an external name structure is allocated; and it's decreased in the dentry freeing path. To reproduce the problem I've used the following Python script: import os for iter in range (0, 10000000): try: name = ("/some_long_name_%d" % iter) + "_" * 220 os.stat(name) except Exception: pass Without this patch: $ cat /proc/meminfo \| grep MemAvailable MemAvailable: `7811688` kB $ python indirect.py $ cat /proc/meminfo \| grep MemAvailable MemAvailable: 2753052 kB With the patch: $ cat /proc/meminfo \| grep MemAvailable MemAvailable: 7809516 kB $ python indirect.py $ cat /proc/meminfo \| grep MemAvailable MemAvailable: 7749144 kB [guro@fb.com: fix indirectly reclaimable memory accounting for CONFIG_SLOB] Link: http://lkml.kernel.org/r/20180312194140.19517-1-guro@fb.com [guro@fb.com: fix indirectly reclaimable memory accounting] Link: http://lkml.kernel.org/r/20180313125701.7955-1-guro@fb.com Link: http://lkml.kernel.org/r/20180305133743.12746-5-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-04-11 10:28:29 -07:00
Roman Gushchin	034ebf65c3	mm: treat indirectly reclaimable memory as available in MemAvailable Adjust /proc/meminfo MemAvailable calculation by adding the amount of indirectly reclaimable memory (rounded to the PAGE_SIZE). Link: http://lkml.kernel.org/r/20180305133743.12746-4-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-04-11 10:28:29 -07:00
Roman Gushchin	eb59254608	mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES Patch series "indirectly reclaimable memory", v2. This patchset introduces the concept of indirectly reclaimable memory and applies it to fix the issue of when a big number of dentries with external names can significantly affect the MemAvailable value. This patch (of 3): Introduce a concept of indirectly reclaimable memory and adds the corresponding memory counter and /proc/vmstat item. Indirectly reclaimable memory is any sort of memory, used by the kernel (except of reclaimable slabs), which is actually reclaimable, i.e. will be released under memory pressure. The counter is in bytes, as it's not always possible to count such objects in pages. The name contains BYTES by analogy to NR_KERNEL_STACK_KB. Link: http://lkml.kernel.org/r/20180305133743.12746-2-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-04-11 10:28:29 -07:00
Martin Schwidefsky	6a3d1e81a4	s390: correct nospec auto detection init order With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via early_initcall is done after the early_param functions. This overwrites any settings done with the nobp/no_spectre_v2/spectre_v2 parameters. The code patching for the kernel is done after the evaluation of the early parameters but before the early_initcall is done. The end result is a kernel image that is patched correctly but the kernel modules are not. Make sure that the nospec auto detection function is called before the early parameters are evaluated and before the code patching is done. Fixes: `6e179d6412` ("s390: add automatic detection of the spectre defense") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-04-11 17:46:00 +02:00
Steven Rostedt (VMware)	0b3dec05db	tracing: Enforce passing in filter=NULL to create_filter() There's some inconsistency with what to set the output parameter filterp when passing to create_filter(..., struct event_filter **filterp). Whatever filterp points to, should be NULL when calling this function. The create_filter() calls create_filter_start() with a pointer to a local "filter" variable that is set to NULL. The create_filter_start() has a WARN_ON() if the passed in pointer isn't pointing to a value set to NULL. Ideally, create_filter() should pass the filterp variable it received to create_filter_start() and not hide it as with a local variable, this allowed create_filter() to fail, and not update the passed in filter, and the caller of create_filter() then tried to free filter, which was never initialized to anything, causing memory corruption. Link: http://lkml.kernel.org/r/00000000000032a0c30569916870@google.com Fixes: `80765597bc` ("tracing: Rewrite filter logic to be simpler and faster") Reported-by: syzbot+dadcc936587643d7f568@syzkaller.appspotmail.com Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-04-11 11:31:08 -04:00
Ravi Bangoria	a64b2c01e6	trace_uprobe: Simplify probes_seq_show() Simplify probes_seq_show() function. No change in output before and after patch. Link: http://lkml.kernel.org/r/20180315082756.9050-2-ravi.bangoria@linux.vnet.ibm.com Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-04-11 11:31:07 -04:00
Ravi Bangoria	18d45b11d9	trace_uprobe: Use %lx to display offset tu->offset is unsigned long, not a pointer, thus %lx should be used to print it, not the %px. Link: http://lkml.kernel.org/r/20180315082756.9050-1-ravi.bangoria@linux.vnet.ibm.com Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `0e4d819d08` ("trace_uprobe: Display correct offset in uprobe_events") Suggested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-04-11 11:31:07 -04:00
Howard McLauchlan	f0a2aa5a2a	tracing/uprobe: Add support for overlayfs uprobes cannot successfully attach to binaries located in a directory mounted with overlayfs. To verify, create directories for mounting overlayfs (upper,lower,work,merge), move some binary into merge/ and use readelf to obtain some known instruction of the binary. I used /bin/true and the entry instruction(0x13b0): $ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merge $ cd /sys/kernel/debug/tracing $ echo 'p:true_entry PATH_TO_MERGE/merge/true:0x13b0' > uprobe_events $ echo 1 > events/uprobes/true_entry/enable This returns 'bash: echo: write error: Input/output error' and dmesg tells us 'event trace: Could not enable event true_entry' This change makes create_trace_uprobe() look for the real inode of a dentry. In the case of normal filesystems, this simplifies to just returning the inode. In the case of overlayfs(and similar fs) we will obtain the underlying dentry and corresponding inode, upon which uprobes can successfully register. Running the example above with the patch applied, we can see that the uprobe is enabled and will output to trace as expected. Link: http://lkml.kernel.org/r/20180410231030.2720-1-hmclauchlan@fb.com Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-04-11 11:31:06 -04:00
Jérémy Lefaure	0a4d0564f0	tracing: Use ARRAY_SIZE() macro instead of open coding it It is useless to re-invent the ARRAY_SIZE macro so let's use it instead of DATA_CNT. Found with Coccinelle with the following semantic patch: @r depends on (org \|\| report)@ type T; T[] E; position p; @@ ( (sizeof(E)@p /sizeof(*E)) \| (sizeof(E)@p /sizeof(E[...])) \| (sizeof(E)@p /sizeof(T)) ) Link: http://lkml.kernel.org/r/20171016012250.26453-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> [ Removed useless include of kernel.h ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-04-11 11:30:33 -04:00
David S. Miller	949989b36c	Merge branch 'vhost-fix-vhost_vq_access_ok-log-check' Stefan Hajnoczi says: ==================== vhost: fix vhost_vq_access_ok() log check v3: * Rebased onto net/master and resolved conflict [DaveM] v2: * Rewrote the conditional to make the vq access check clearer [Linus] * Added Patch 2 to make the return type consistent and harder to misuse [Linus] The first patch fixes the vhost virtqueue access check which was recently broken. The second patch replaces the int return type with bool to prevent future bugs. ==================== Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:54:06 -04:00
Stefan Hajnoczi	ddd3d4081f	vhost: return bool from _access_ok() functions Currently vhost _access_ok() functions return int. This is error-prone because there are two popular conventions: 1. 0 means failure, 1 means success 2. -errno means failure, 0 means success Although vhost mostly uses #1, it does not do so consistently. umem_access_ok() uses #2. This patch changes the return type from int to bool so that false means failure and true means success. This eliminates a potential source of errors. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:54:06 -04:00
Stefan Hajnoczi	d14d2b7809	vhost: fix vhost_vq_access_ok() log check Commit `d65026c6c6` ("vhost: validate log when IOTLB is enabled") introduced a regression. The logic was originally: if (vq->iotlb) return 1; return A && B; After the patch the short-circuit logic for A was inverted: if (A \|\| vq->iotlb) return A; return B; This patch fixes the regression by rewriting the checks in the obvious way, no longer returning A when vq->iotlb is non-NULL (which is hard to understand). Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:54:06 -04:00
Eric Auger	7ced6c98c7	vhost: Fix vhost_copy_to_user() vhost_copy_to_user is used to copy vring used elements to userspace. We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC. Fixes: `f889491380` ("vhost: introduce O(1) vq metadata cache") Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:52:34 -04:00
David S. Miller	98239c9096	Merge branch 'Aquantia-atlantic-critical-fixes-04-2018' Igor Russkikh says: ==================== Aquantia atlantic critical fixes 04/2018 Two regressions on latest 4.16 driver reported by users Some of old FW (1.5.44) had a link management logic which prevents driver to make clean reset. Driver of 4.16 has a full hardware reset implemented and that broke the link and traffic on such a cards. Second is oops on shutdown callback in case interface is already closed or was never opened. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:41:36 -04:00
Igor Russkikh	9a11aff25f	net: aquantia: oops when shutdown on already stopped device In case netdev is closed at the moment of pci shutdown, aq_nic_stop gets called second time. napi_disable in that case hangs indefinitely. In other case, if device was never opened at all, we get oops because of null pointer access. We should invoke aq_nic_stop conditionally, only if device is running at the moment of shutdown. Reported-by: David Arcari <darcari@redhat.com> Fixes: `90869ddfef` ("net: aquantia: Implement pci shutdown callback") Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:41:36 -04:00
Igor Russkikh	cce96d1883	net: aquantia: Regression on reset with 1.x firmware On ASUS XG-C100C with 1.5.44 firmware a special mode called "dirty wake" is active. With this mode when motherboard gets powered (but no poweron happens yet), NIC automatically enables powersave link and watches for WOL packet. This normally allows to powerup the PC after AC power failures. Not all motherboards or bios settings gives power to PCI slots, so this mode is not enabled on all the hardware. 4.16 linux driver introduced full hardware reset sequence This is required since before that we had no NIC hardware reset implemented and there were side effects of "not clean start". But this full reset is incompatible with "dirty wake" WOL feature it keeps the PHY link in a special mode forever. As a consequence, driver sees no link and no traffic. To fix this we forcibly change FW state to idle state before doing the full reset. This makes FW to restore link state. Fixes: `c8c82eb` net: aquantia: Introduce global AQC hardware reset sequence Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:41:36 -04:00
Bassem Boubaker	53765341ee	cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN The Cinterion AHS8 is a 3G device with one embedded WWAN interface using cdc_ether as a driver. The modem is controlled via AT commands through the exposed TTYs. AT+CGDCONT write command can be used to activate or deactivate a WWAN connection for a PDP context defined with the same command. UE supports one WWAN adapter. Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:36:38 -04:00
Tejaswi Tanikella	3f01ddb962	slip: Check if rstate is initialized before uncompressing On receiving a packet the state index points to the rstate which must be used to fill up IP and TCP headers. But if the state index points to a rstate which is unitialized, i.e. filled with zeros, it gets stuck in an infinite loop inside ip_fast_csum trying to compute the ip checsum of a header with zero length. 89.666953: <2> [<ffffff9dd3e94d38>] slhc_uncompress+0x464/0x468 89.666965: <2> [<ffffff9dd3e87d88>] ppp_receive_nonmp_frame+0x3b4/0x65c 89.666978: <2> [<ffffff9dd3e89dd4>] ppp_receive_frame+0x64/0x7e0 89.666991: <2> [<ffffff9dd3e8a708>] ppp_input+0x104/0x198 89.667005: <2> [<ffffff9dd3e93868>] pppopns_recv_core+0x238/0x370 89.667027: <2> [<ffffff9dd4428fc8>] __sk_receive_skb+0xdc/0x250 89.667040: <2> [<ffffff9dd3e939e4>] pppopns_recv+0x44/0x60 89.667053: <2> [<ffffff9dd4426848>] __sock_queue_rcv_skb+0x16c/0x24c 89.667065: <2> [<ffffff9dd4426954>] sock_queue_rcv_skb+0x2c/0x38 89.667085: <2> [<ffffff9dd44f7358>] raw_rcv+0x124/0x154 89.667098: <2> [<ffffff9dd44f7568>] raw_local_deliver+0x1e0/0x22c 89.667117: <2> [<ffffff9dd44c8ba0>] ip_local_deliver_finish+0x70/0x24c 89.667131: <2> [<ffffff9dd44c92f4>] ip_local_deliver+0x100/0x10c ./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output: ip_fast_csum at arch/arm64/include/asm/checksum.h:40 (inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615 Adding a variable to indicate if the current rstate is initialized. If such a packet arrives, move to toss state. Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:33:46 -04:00
Phil Elwell	fed56079e7	lan78xx: Avoid spurious kevent 4 "error" lan78xx_defer_event generates an error message whenever the work item is already scheduled. lan78xx_open defers three events - EVENT_STAT_UPDATE, EVENT_DEV_OPEN and EVENT_LINK_RESET. Being aware of the likelihood (or certainty) of an error message, the DEV_OPEN event is added to the set of pending events directly, relying on the subsequent deferral of the EVENT_LINK_RESET call to schedule the work. Take the same precaution with EVENT_STAT_UPDATE to avoid a totally unnecessary error message. Signed-off-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:32:43 -04:00
Phil Elwell	4bfc33807a	lan78xx: Correctly indicate invalid OTP lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP content, but the value gets overwritten before it is returned and the read goes ahead anyway. Make the read conditional as it should be and preserve the error code. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:30:42 -04:00
Ka-Cheong Poon	a43cced9a3	rds: MP-RDS may use an invalid c_path rds_sendmsg() calls rds_send_mprds_hash() to find a c_path to use to send a message. Suppose the RDS connection is not yet up. In rds_send_mprds_hash(), it does if (conn->c_npaths == 0) wait_event_interruptible(conn->c_hs_waitq, (conn->c_npaths != 0)); If it is interrupted before the connection is set up, rds_send_mprds_hash() will return a non-zero hash value. Hence rds_sendmsg() will use a non-zero c_path to send the message. But if the RDS connection ends up to be non-MP capable, the message will be lost as only the zero c_path can be used. Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-11 10:24:01 -04:00
Desnes A. Nunes do Rosario	adf58458bc	PCI: Remove messages about reassigning resources When reassigning device resources to increase their alignment, e.g., because of a "pci=resource_alignment=" kernel parameter or because the platform aligns resources to its page size, we previously emitted messages like this: pci 0000:00:00.0: Disabling memory decoding and releasing memory resources pci 0000:00:00.0: disabling bridge mem windows These messages don't convey any useful information, so remove them. Fixes: `3827463769` ("powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned") Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.vnet.ibm.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>	2018-04-11 08:46:50 -05:00
Rafael J. Wysocki	51798deaff	Merge branches 'pm-cpuidle' and 'pm-qos' * pm-cpuidle: tick-sched: avoid a maybe-uninitialized warning cpuidle: Add definition of residency to sysfs documentation time: hrtimer: Use timerqueue_iterate_next() to get to the next timer nohz: Avoid duplication of code related to got_idle_tick nohz: Gather tick_sched booleans under a common flag field cpuidle: menu: Avoid selecting shallow states with stopped tick cpuidle: menu: Refine idle state selection for running tick sched: idle: Select idle state before stopping the tick time: hrtimer: Introduce hrtimer_next_event_without() time: tick-sched: Split tick_nohz_stop_sched_tick() cpuidle: Return nohz hint from cpuidle_select() jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC sched: idle: Do not stop the tick before cpuidle_idle_call() sched: idle: Do not stop the tick upfront in the idle loop time: tick-sched: Reorganize idle tick management code * pm-qos: PM / QoS: mark expected switch fall-throughs	2018-04-11 13:22:46 +02:00
Helge Deller	71d577db01	parisc: Switch to generic COMPAT_BINFMT_ELF Drop our own compat binfmt implementation in arch/parisc/kernel/binfmt_elf32.c in favour of the generic implementation with CONFIG_COMPAT_BINFMT_ELF. While cleaning up the dependencies, I noticed that ELF_PLATFORM was strangely defined: On a 32-bit kernel, it was defined to "PARISC", while when running in compat mode on a 64-bit kernel it was defined to "PARISC32". Since it doesn't seem to be used in glibc yet, it's now defined in both cases to "PARISC". In any case, it can be distinguished because it's either a 32-bit or a 64-bit ELF file. Signed-off-by: Helge Deller <deller@gmx.de>	2018-04-11 11:40:35 +02:00
Helge Deller	2a03bb9e7a	parisc: Move cache flush functions into .text.hot section and move the disable_sr_hashing() C and assembly functions into the .init section. Signed-off-by: Helge Deller <deller@gmx.de>	2018-04-11 11:40:35 +02:00
Helge Deller	75abf64287	parisc/signal: Add FPE_CONDTRAP for conditional trap handling Posix and common sense requires that SI_USER not be a signal specific si_code. Thus add a new FPE_CONDTRAP si_code for conditional traps. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Stephen Rothwell <sfr@canb.auug.org.au>	2018-04-11 11:40:35 +02:00
Joel Stanley	cb799267bb	MAINTAINERS: Update ASPEED entry with details I am interested in all ASPEED drivers, and the previous match wasn't grabbing files in nested directories. Use N instead. Add the arm kernel mailing list so that patches get reviewed there, and the linux-aspeed list which exists only so I can use patchwork to track patches. Add Andrew as a reviewer, because he is involved in reviewing ASPEED stuff. Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2018-04-11 10:47:31 +02:00
Harald Freudenberger	af4a72276d	s390/zcrypt: Support up to 256 crypto adapters. There was an artificial restriction on the card/adapter id to only 6 bits but all the AP commands do support adapter ids with 8 bit. This patch removes this restriction to 64 adapters and now up to 256 adapter can get addressed. Some of the ioctl calls work on the max number of cards possible (which was 64). These ioctls are now deprecated but still supported. All the defines, structs and ioctl interface declarations have been kept for compabibility. There are now new ioctls (and defines for these) with an additional '2' appended which provide the extended versions with 256 cards supported. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-04-11 10:36:27 +02:00
Carlos Maiolino	8c81dd46ef	Force log to disk before reading the AGF during a fstrim Forcing the log to disk after reading the agf is wrong, we might be calling xfs_log_force with XFS_LOG_SYNC with a metadata lock held. This can cause a deadlock when racing a fstrim with a filesystem shutdown. The deadlock has been identified due a miscalculation bug in device-mapper dm-thin, which returns lack of space to its users earlier than the device itself really runs out of space, changing the device-mapper volume into an error state. The problem happened while filling the filesystem with a single file, triggering the bug in device-mapper, consequently causing an IO error and shutting down the filesystem. If such file is removed, and fstrim executed before the XFS finishes the shut down process, the fstrim process will end up holding the buffer lock, and going to sleep on the cil wait queue. At this point, the shut down process will try to wake up all the threads waiting on the cil wait queue, but for this, it will try to hold the same buffer log already held my the fstrim, locking up the filesystem. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-04-10 22:39:04 -07:00
Matthew Wilcox	fbbb450904	Export __set_page_dirty XFS currently contains a copy-and-paste of __set_page_dirty(). Export it from buffer.c instead. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-04-10 22:39:01 -07:00
Dave Airlie	871e899db1	Merge branch 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux into drm-next A few fixes for 4.17: - Fix a potential use after free in a error case - Fix pcie lane handling in amdgpu SI dpm - sdma pipeline sync fix - A few vega12 cleanups and fixes - Misc other fixes * 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: Fix memory leaks at amdgpu_init() error path drm/amdgpu: Fix PCIe lane width calculation drm/radeon: Fix PCIe lane width calculation drm/amdgpu/si: implement get/set pcie_lanes asic callback drm/amdgpu: Add support for SRBM selection v3 Revert "drm/amdgpu: Don't change preferred domian when fallback GTT v5" drm/amd/powerply: fix power reading on Fiji drm/amd/powerplay: Enable ACG SS feature drm/amdgpu/sdma: fix mask in emit_pipeline_sync drm/amdgpu: Fix KIQ hang on bare metal for device unbind/bind back v2. drm/amd/pp: Clean header file in vega12_smumgr.c drm/amd/pp: Remove Dead functions on Vega12 drm/amd/pp: silence a static checker warning drm/amdgpu: drop compute ring timeout setting for non-sriov only (v2) drm/amdgpu: fix typo of domain fallback	2018-04-11 08:35:41 +10:00
Dave Airlie	c975f17d70	hda_intel: Don't declare azx PM ops if VGA_SWITCHEROO configured (Lukas) Cc: Lukas Wunner <lukas@wunner.de> Cc: Takashi Iwai <tiwai@suse.de> -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEfxcpfMSgdnQMs+QqlvcN/ahKBwoFAlrFI58ACgkQlvcN/ahK BwrWzAf+J/tz5aj+e69HF2J7nypt4GY9owspk2SCLuvTfdzQIbzfl477DXCL7IFp Oc1qcCEbS6OjRXb8yN/xGClvWI8fDlRapnOYBsKvjZ1WNq89mwgl2FQgPKAfjNnT Lg1Z9AzzTyVJcWgk9jGlswNV+Uyi2QR9Zlelg+yrTidEi4ybwLlFMV9CkmT9q9ph yP+uApTqe8NA5SZrpwhtvlPj8kRcJgF9Z7FCARcVq3ZXh2qZ+Uujzay+QcnP4AZK MSfU/TXh4xeVDc4ttYenZNus0koJugTSpZ3hELZga4mGfj4W1nkSjLh3qRTxudD0 WpxxSjFKkKUqcsB1YXKgRehmO+AGIA== =acWn -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-fixes-2018-04-04' of git://anongit.freedesktop.org/drm/drm-misc into drm-next hda_intel: Don't declare azx PM ops if VGA_SWITCHEROO configured (Lukas) Cc: Lukas Wunner <lukas@wunner.de> Cc: Takashi Iwai <tiwai@suse.de> * tag 'drm-misc-next-fixes-2018-04-04' of git://anongit.freedesktop.org/drm/drm-misc: ALSA: hda - Silence PM ops build warning	2018-04-11 08:35:18 +10:00
Takashi Iwai	9e7f06c8be	swiotlb: fix unexpected swiotlb_alloc_coherent failures The code refactoring by commit `0176adb004` ("swiotlb: refactor coherent buffer allocation") made swiotlb_alloc_buffer almost always failing due to a thinko: namely, the function evaluates the dma_coherent_ok call incorrectly and dealing as if it's invalid. This ends up with weird errors like iwlwifi probe failure or amdgpu screen flickering. This patch corrects the logic error. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088658 Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088902 Fixes: `0176adb004` ("swiotlb: refactor coherent buffer allocation") Cc: <stable@vger.kernel.org> # v4.16+ Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-04-10 22:30:43 +02:00
Frank Sorenson	98de9ce6f6	NFS: advance nfs_entry cookie only after decoding completes successfully In nfs[34]_decode_dirent, the cookie is advanced as soon as it is read, but decoding may still fail later in the function, returning an error. Because the cookie has been advanced, the failing entry is not re-requested from the server, resulting in a missing directory entry. In addition, nfs v3 and v4 read the cookie at different locations in the xdr_stream, so the behavior of the two can be inconsistent. Fix these by reading the cookie into a temporary variable, and only advancing the cookie once the entire entry has been decoded from the xdr_stream successfully. Signed-off-by: Frank Sorenson <sorenson@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
chendt	dbc898ae10	NFSv3/acl: forget acl cache after setattr Sync of ACL with std permissions fail，We need to forget the ACL cache after setattr. Reproduction: #!/bin/bash touch testfile cat <<EOF >testfile #!/bin/bash echo "Test was executed" EOF chmod u=rwx testfile chmod g=rw- testfile chmod o=r-- testfile chacl u::r--,g::rwx,o:rw- testfile chmod u+w testfile ls -l testfile chacl -l testfile Output： -rw-rwxrw- 1 root root 0 Mar 28 05:29 testfile testfile [u::r--,g::rwx,o::rw-] Signed-off-by: chendt.fnst <chendt.fnst@cn.fujitsu.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Kinglong Mee <Kinglong Mee> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	609339c123	NFSv4.1: Fix exclusive create When we use EXCLUSIVE4_1 mode, the server returns an attribute mask where all the bits indicate which attributes were set, and where the verifier was stored. In order to figure out which attribute we have to resend, we need to clear out the attributes that are set in exclcreat_bitmask. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [Anna: Fixed typo NFS4_CREATE_EXCLUSIVE4 -> NFS4_CREATE_EXCLUSIVE] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	f6cdfa6dd6	NFSv4: Declare the size up to date after it was set. When we've changed the file size, then ensure we declare it to be up to date in the inode attributes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Matthew Wilcox	aae5730e2d	nfs: Use ida_simple API Allocate the owner_id when we allocate the state and free it when we free the state. That lets us get rid of a gnarly ida_pre_get() / ida_get_new() loop. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	35156bfff3	NFSv4: Fix the nfs_inode_set_delegation() arguments Neither nfs_inode_set_delegation() nor nfs_inode_reclaim_delegation() are generic code. They have no business delving into NFSv4 OPEN xdr structures, so let's replace the "struct nfs_openres" parameter. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	8b06494624	NFSv4: Clean up CB_GETATTR encoding Replace the open coded bitmap implementation with a generic one. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	8bcbe7d98c	NFSv4: Don't ask for attributes when ACCESS is protected by a delegation If we hold a delegation, then the results of the ACCESS call are protected anyway. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	36b3743fef	NFSv4: Add a helper to encode/decode struct timespec Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	40a3426c75	NFSv4: Clean up encode_attrs Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	37c88763de	NFSv4; Clean up XDR encoding of type bitmap4 Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	e8d8aa46be	NFSv4: Allow GFP_NOIO sleeps in decode_attr_owner/decode_attr_group Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	85e3dd44c5	SUNRPC: Add a helper for encoding opaque data inline Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	0e779aa703	SUNRPC: Add helpers for decoding opaque and string types Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	d943f2dd8d	NFSv4: Ignore change attribute invalidations if we hold a delegation Don't bother even recording an invalid change attribute if we hold a delegation since we already know the state of our attribute cache. We can rely on the fact that we will pick up a copy from the server when we return the delegation. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	16e1437517	NFS: More fine grained attribute tracking Currently, if the NFS_INO_INVALID_ATTR flag is set, for instance by a call to nfs_post_op_update_inode_locked(), then it will not be cleared until all the attributes have been revalidated. This means, for instance, that NFSv4 writes will always force a full attribute revalidation. Track the ctime, mtime, size and change attribute separately from the other attributes so that we can have nfs_post_op_update_inode_locked() set them correctly, and later have the cache consistency bitmask be able to clear them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	cac88f942d	NFS: Don't force unnecessary cache invalidation in nfs_update_inode() If we managed to revalidate all the attributes, then there is no reason to mark them as invalid again. We do, however want to ensure that we set nfsi->attrtimeo correctly. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00

... 3 4 5 6 7 ...

751331 Commits