Commit Graph

707601 Commits

Author SHA1 Message Date
Colin Ian King
d099b8af46 sunrpc: remove redundant initialization of sock
sock is being initialized and then being almost immediately updated
hence the initialized value is not being used and is redundant. Remove
the initialization. Cleans up clang warning:

warning: Value stored to 'sock' during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-10-01 18:51:30 -04:00
Benjamin Coddington
68ebf8fe3b NFS: Fix uninitialized rpc_wait_queue
Michael Sterrett reports a NULL pointer dereference on NFSv3 mounts when
CONFIG_NFS_V4 is not set because the NFS UOC rpc_wait_queue has not been
initialized.  Move the initialization of the queue out of the CONFIG_NFS_V4
conditional setion.

Fixes: 7d6ddf88c4 ("NFS: Add an iocounter wait function for async RPC tasks")
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-10-01 18:51:30 -04:00
Dan Carpenter
cdb2e53fd6 NFS: Cleanup error handling in nfs_idmap_request_key()
nfs_idmap_get_desc() can't actually return zero.  But if it did then
we would return ERR_PTR(0) which is NULL and the caller,
nfs_idmap_get_key(), doesn't expect that so it leads to a NULL pointer
dereference.

I've cleaned this up by changing the "<=" to "<" so it's more clear that
we don't return ERR_PTR(0).

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-10-01 18:51:30 -04:00
J. Bruce Fields
35c036ef4a nfs: RPC_MAX_AUTH_SIZE is in bytes
The units of RPC_MAX_AUTH_SIZE is bytes, not 4-byte words.  This causes
the client to request a larger-than-necessary session replay slot size.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-10-01 18:51:30 -04:00
Linus Torvalds
9e66317d3c Linux 4.14-rc3 2017-10-01 14:54:54 -07:00
Linus Torvalds
368f89984b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "This contains the following fixes and improvements:

   - Avoid dereferencing an unprotected VMA pointer in the fault signal
     generation code

   - Fix inline asm call constraints for GCC 4.4

   - Use existing register variable to retrieve the stack pointer
     instead of forcing the compiler to create another indirect access
     which results in excessive extra 'mov %rsp, %<dst>' instructions

   - Disable branch profiling for the memory encryption code to prevent
     an early boot crash

   - Fix a sparse warning caused by casting the __user annotation in
     __get_user_asm_u64() away

   - Fix an off by one error in the loop termination of the error patch
     in the x86 sysfs init code

   - Add missing CPU IDs to various Intel specific drivers to enable the
     functionality on recent hardware

   - More (init) constification in the numachip code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Use register variable to get stack pointer value
  x86/mm: Disable branch profiling in mem_encrypt.c
  x86/asm: Fix inline asm call constraints for GCC 4.4
  perf/x86/intel/uncore: Correct num_boxes for IIO and IRP
  perf/x86/intel/rapl: Add missing CPU IDs
  perf/x86/msr: Add missing CPU IDs
  perf/x86/intel/cstate: Add missing CPU IDs
  x86: Don't cast away the __user in __get_user_asm_u64()
  x86/sysfs: Fix off-by-one error in loop termination
  x86/mm: Fix fault error path using unsafe vma pointer
  x86/numachip: Add const and __initconst to numachip2_clockevent
2017-10-01 13:55:32 -07:00
Linus Torvalds
c42ed9f91f Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "This adds a new timer wheel function which is required for the
  conversion of the timer callback function from the 'unsigned long
  data' argument to 'struct timer_list *timer'. This conversion has two
  benefits:

   1) It makes struct timer_list smaller

   2) Many callers hand in a pointer to the timer or to the structure
      containing the timer, which happens via type casting both at setup
      and in the callback. This change gets rid of the typecasts.

  Once the conversion is complete, which is planned for 4.15, the old
  setup function and the intermediate typecast in the new setup function
  go away along with the data field in struct timer_list.

  Merging this now into mainline allows a smooth queueing of the actual
  conversion in the affected maintainer trees without creating
  dependencies"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  um/time: Fixup namespace collision
  timer: Prepare to change timer callback argument type
2017-10-01 13:03:16 -07:00
Linus Torvalds
8251354513 Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp/hotplug fixes from Thomas Gleixner:
 "This addresses the fallout of the new lockdep mechanism which covers
  completions in the CPU hotplug code.

  The lockdep splats are false positives, but there is no way to
  annotate that reliably. The solution is to split the completions for
  CPU up and down, which requires some reshuffling of the failure
  rollback handling as well"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp/hotplug: Hotplug state fail injection
  smp/hotplug: Differentiate the AP completion between up and down
  smp/hotplug: Differentiate the AP-work lockdep class between up and down
  smp/hotplug: Callback vs state-machine consistency
  smp/hotplug: Rewrite AP state machine core
  smp/hotplug: Allow external multi-instance rollback
  smp/hotplug: Add state diagram
2017-10-01 12:34:42 -07:00
Linus Torvalds
7e103ace9c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
 "The scheduler pull request comes with the following updates:

   - Prevent a divide by zero issue by validating the input value of
     sysctl_sched_time_avg

   - Make task state printing consistent all over the place and have
     explicit state characters for IDLE and PARKED so they wont be
     displayed as 'D' state which confuses tools"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/sysctl: Check user input value of sysctl_sched_time_avg
  sched/debug: Add explicit TASK_PARKED printing
  sched/debug: Ignore TASK_IDLE for SysRq-W
  sched/debug: Add explicit TASK_IDLE printing
  sched/tracing: Use common task-state helpers
  sched/tracing: Fix trace_sched_switch task-state printing
  sched/debug: Remove unused variable
  sched/debug: Convert TASK_state to hex
  sched/debug: Implement consistent task-state printing
2017-10-01 12:10:02 -07:00
Linus Torvalds
1c6f705ba2 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:

 - Prevent a division by zero in the perf aux buffer handling

 - Sync kernel headers with perf tool headers

 - Fix a build failure in the syscalltbl code

 - Make the debug messages of perf report --call-graph work correctly

 - Make sure that all required perf files are in the MANIFEST for
   container builds

 - Fix the atrr.exclude kernel handling so it respects the
   perf_event_paranoid and the user permissions

 - Make perf test on s390x work correctly

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/aux: Only update ->aux_wakeup in non-overwrite mode
  perf test: Fix vmlinux failure on s390x part 2
  perf test: Fix vmlinux failure on s390x
  perf tools: Fix syscalltbl build failure
  perf report: Fix debug messages with --call-graph option
  perf evsel: Fix attr.exclude_kernel setting for default cycles:p
  tools include: Sync kernel ABI headers with tooling headers
  perf tools: Get all of tools/{arch,include}/ in the MANIFEST
2017-10-01 12:06:31 -07:00
Linus Torvalds
1de47f3cb7 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull  locking fixes from Thomas Gleixner:
 "Two fixes for locking:

   - Plug a hole the pi_stat->owner serialization which was changed
     recently and failed to fixup two usage sites.

   - Prevent reordering of the rwsem_has_spinner() check vs the
     decrement of rwsem count in up_write() which causes a missed
     wakeup"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem-xadd: Fix missed wakeup due to reordering of load
  futex: Fix pi_state->owner serialization
2017-10-01 12:02:47 -07:00
Linus Torvalds
3d9d62b99b Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:

 - Add a missing NULL pointer check in free_irq()

 - Fix a memory leak/memory corruption in the generic irq chip

 - Add missing rcu annotations for radix tree access

 - Use ffs instead of fls when extracting data from a chip register in
   the MIPS GIC irq driver

 - Fix the unmasking of IPI interrupts in the MIPS GIC driver so they
   end up at the target CPU and not at CPU0

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irq/generic-chip: Don't replace domain's name
  irqdomain: Add __rcu annotations to radix tree accessors
  irqchip/mips-gic: Use effective affinity to unmask
  irqchip/mips-gic: Fix shifts to extract register fields
  genirq: Check __free_irq() return value for NULL
2017-10-01 12:00:56 -07:00
Linus Torvalds
156069f8f0 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Thomas Gleixner:
 "Two small fixes for objtool:

   - Support frame pointer setup via 'lea (%rsp), %rbp' which was not
     yet supported and caused build warnings

   - Disable unreacahble warnings for GCC4.4 and older to avoid false
     positives caused by the compiler itself"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Support unoptimized frame pointer setup
  objtool: Skip unreachable warnings for GCC 4.4 and older
2017-10-01 11:12:29 -07:00
Christophe Jaillet
74007ae631 hwmon: (xgene) Fix up error handling path mixup in 'xgene_hwmon_probe()'
Commit 2ca492e22c has moved the call to 'kfifo_alloc()' from after the
main 'if' statement to before it.
But it has not updated the error handling paths accordingly.

Fix all that:
   - if 'kfifo_alloc()' fails we can return directly
   - direct returns after 'kfifo_alloc()' must now go to 'out_mbox_free'
   - 'goto out_mbox_free' must be replaced by 'goto out', otherwise the
     '[pcc_]mbox_free_channel()' call will be missed.

Fixes: 2ca492e22c ("hwmon: (xgene) Fix crash when alarm occurs before driver probe")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-10-01 08:46:54 -07:00
Martin Wilck
007a61ae2f nvme: fix visibility of "uuid" ns attribute
"uuid" must be invisible if both ns->uuid and ns->nguid are unset,
not if either one is.

Fixes: d934f9848a "nvme: provide UUID value to userspace"
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-10-01 09:58:04 +02:00
Parthasarathy Bhuvaragan
aad06212d3 tipc: use only positive error codes in messages
In commit e3a77561e7 ("tipc: split up function tipc_msg_eval()"),
we have updated the function tipc_msg_lookup_dest() to set the error
codes to negative values at destination lookup failures. Thus when
the function sets the error code to -TIPC_ERR_NO_NAME, its inserted
into the 4 bit error field of the message header as 0xf instead of
TIPC_ERR_NO_NAME (1). The value 0xf is an unknown error code.

In this commit, we set only positive error code.

Fixes: e3a77561e7 ("tipc: split up function tipc_msg_eval()")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 04:03:35 +01:00
Guillaume Nault
5a59a3a0ef ppp: fix __percpu annotation
Move sparse annotation right after pointer type.

Fixes sparse warning:
    drivers/net/ppp/ppp_generic.c:1422:13: warning: incorrect type in initializer (different address spaces)
    drivers/net/ppp/ppp_generic.c:1422:13:    expected void const [noderef] <asn:3>*__vpp_verify
    drivers/net/ppp/ppp_generic.c:1422:13:    got int *<noident>
    ...

Fixes: e5dadc65f9 ("ppp: Fix false xmit recursion detect with two ppp devices")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:58:10 +01:00
David S. Miller
230583c195 Merge branch 'udp-fix-early-demux-for-mcast-packets'
Paolo Abeni says:

====================
udp: fix early demux for mcast packets

Currently the early demux callbacks do not perform source address validation.
This is not an issue for TCP or UDP unicast, where the early demux
is only allowed for connected sockets and the source address is validated
for the first packet and never change.

The UDP protocol currently allows early demux also for unconnected multicast
sockets, and we are not currently doing any validation for them, after that
the first packet lands on the socket: beyond ignoring the rp_filter - if
enabled - any kind of martian sources are also allowed.

This series addresses the issue allowing the early demux callback to return an
error code, and performing the proper checks for unconnected UDP multicast
sockets before leveraging the rx dst cache.

Alternatively we could disable the early demux for unconnected mcast sockets,
but that would cause relevant performance regression - around 50% - while with
this series, with full rp_filter in place, we keep the regression to a more
moderate level.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:55:47 +01:00
Paolo Abeni
bc044e8db7 udp: perform source validation for mcast early demux
The UDP early demux can leverate the rx dst cache even for
multicast unconnected sockets.

In such scenario the ipv4 source address is validated only on
the first packet in the given flow. After that, when we fetch
the dst entry  from the socket rx cache, we stop enforcing
the rp_filter and we even start accepting any kind of martian
addresses.

Disabling the dst cache for unconnected multicast socket will
cause large performace regression, nearly reducing by half the
max ingress tput.

Instead we factor out a route helper to completely validate an
skb source address for multicast packets and we call it from
the UDP early demux for mcast packets landing on unconnected
sockets, after successful fetching the related cached dst entry.

This still gives a measurable, but limited performance
regression:

		rp_filter = 0		rp_filter = 1
edmux disabled:	1182 Kpps		1127 Kpps
edmux before:	2238 Kpps		2238 Kpps
edmux after:	2037 Kpps		2019 Kpps

The above figures are on top of current net tree.
Applying the net-next commit 6e617de84e ("net: avoid a full
fib lookup when rp_filter is disabled.") the delta with
rp_filter == 0 will decrease even more.

Fixes: 421b3885bf ("udp: ipv4: Add udp early demux")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:55:47 +01:00
Paolo Abeni
7487449c86 IPv4: early demux can return an error code
Currently no error is emitted, but this infrastructure will
used by the next patch to allow source address validation
for mcast sockets.
Since early demux can do a route lookup and an ipv4 route
lookup can return an error code this is consistent with the
current ipv4 route infrastructure.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:55:47 +01:00
Xin Long
d41bb33ba3 ip6_tunnel: update mtu properly for ARPHRD_ETHER tunnel device in tx path
Now when updating mtu in tx path, it doesn't consider ARPHRD_ETHER tunnel
device, like ip6gre_tap tunnel, for which it should also subtract ether
header to get the correct mtu.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:46:42 +01:00
Xin Long
2d40557cc7 ip6_gre: ip6gre_tap device should keep dst
The patch 'ip_gre: ipgre_tap device should keep dst' fixed
a issue that ipgre_tap mtu couldn't be updated in tx path.

The same fix is needed for ip6gre_tap as well.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:46:42 +01:00
Xin Long
d51711c055 ip_gre: ipgre_tap device should keep dst
Without keeping dst, the tunnel will not update any mtu/pmtu info,
since it does not have a dst on the skb.

Reproducer:
  client(ipgre_tap1 - eth1) <-----> (eth1 - ipgre_tap1)server

After reducing eth1's mtu on client, then perforamnce became 0.

This patch is to netif_keep_dst in gre_tap_init, as ipgre does.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 03:46:42 +01:00
Linus Torvalds
a8c964eacb Merge tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd
Pull mtd fixes from Boris Brezillon:

 - Fix partition alignment check in mtdcore.c

 - Fix a buffer overflow in the Atmel NAND driver

* tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd:
  mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user
  mtd: Fix partition alignment check on multi-erasesize devices
2017-09-30 12:52:32 -07:00
Linus Torvalds
0b33ce72ea Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Eight mostly minor fixes for recently discovered issues in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ILLEGAL REQUEST + ASC==27 => target failure
  scsi: aacraid: Add a small delay after IOP reset
  scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()
  scsi: scsi_transport_fc: set scsi_target_id upon rescan
  scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
  scsi: aacraid: error: testing array offset 'bus' after use
  scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
  scsi: aacraid: Fix 2T+ drives on SmartIOC-2000
2017-09-30 12:50:56 -07:00
Jason A. Donenfeld
fef0035c0f netlink: do not proceed if dump's start() errs
Drivers that use the start method for netlink dumping rely on dumpit not
being called if start fails. For example, ila_xlat.c allocates memory
and assigns it to cb->args[0] in its start() function. It might fail to
do that and return -ENOMEM instead. However, even when returning an
error, dumpit will be called, which, in the example above, quickly
dereferences the memory in cb->args[0], which will OOPS the kernel. This
is but one example of how this goes wrong.

Since start() has always been a function with an int return type, it
therefore makes sense to use it properly, rather than ignoring it. This
patch thus returns early and does not call dumpit() when start() fails.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-30 16:13:31 +01:00
Linus Torvalds
74d83ec2b7 Merge tag 'platform-drivers-x86-v4.14-2' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform drivers fix from Darren Hart:
 "Newly discovered species of fujitsu laptops break some assumptions
  about ACPI device pairings.

  fujitsu-laptop: Don't oops when FUJ02E3 is not present"

* tag 'platform-drivers-x86-v4.14-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: fujitsu-laptop: Don't oops when FUJ02E3 is not presnt
2017-09-29 19:35:41 -07:00
Linus Torvalds
95dcc4dc38 Merge tag 'led_fixes-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED fixes from Jacek Anaszewski:
 "Four fixes for the as3645a LED flash controller and one update to
  MAINTAINERS"

* tag 'led_fixes-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  MAINTAINERS: Add entry for MediaTek PMIC LED driver
  as3645a: Unregister indicator LED on device unbind
  as3645a: Use integer numbers for parsing LEDs
  dt: bindings: as3645a: Use LED number to refer to LEDs
  as3645a: Use ams,input-max-microamp as documented in DT bindings
2017-09-29 19:33:32 -07:00
Stephen Boyd
79765e9a3d Merge tag 'v4.14-rockchip-clkfixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-fixes
Pull Rockchip clk driver fixes from Heiko Stuebner:

Some smallish fixes for the rk3128 clock support including
some register errors and some clocks that should be critical
for safe usage.

* tag 'v4.14-rockchip-clkfixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  clk: rockchip: add sclk_timer5 as critical clock on rk3128
  clk: rockchip: fix up rk3128 pvtm and mipi_24m gate regs error
  clk: rockchip: add pclk_pmu as critical clock on rk3128
2017-09-29 14:17:51 -07:00
Bjorn Andersson
9792bf5ad5 clk: Export clk_bulk_prepare()
Allow clk_bulk_prepare() to be referenced by kernel modules by adding
the missing EXPORT_SYMBOL_GPL().

Fixes: 266e4e9d91 ("clk: add clk_bulk_get accessories")
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2017-09-29 14:17:17 -07:00
Linus Torvalds
99637e4268 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull waitid fix from Al Viro:
 "Fix infoleak in waitid()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix infoleak in waitid(2)
2017-09-29 12:59:59 -07:00
Linus Torvalds
5ba88cd6e9 Merge branch 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "We've collected a bunch of isolated fixes, for crashes, user-visible
  behaviour or missing bits from other subsystem cleanups from the past.

  The overall number is not small but I was not able to make it
  significantly smaller. Most of the patches are supposed to go to
  stable"

* 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: log csums for all modified extents
  Btrfs: fix unexpected result when dio reading corrupted blocks
  btrfs: Report error on removing qgroup if del_qgroup_item fails
  Btrfs: skip checksum when reading compressed data if some IO have failed
  Btrfs: fix kernel oops while reading compressed data
  Btrfs: use btrfs_op instead of bio_op in __btrfs_map_block
  Btrfs: do not backup tree roots when fsync
  btrfs: remove BTRFS_FS_QUOTA_DISABLING flag
  btrfs: propagate error to btrfs_cmp_data_prepare caller
  btrfs: prevent to set invalid default subvolid
  Btrfs: send: fix error number for unknown inode types
  btrfs: fix NULL pointer dereference from free_reloc_roots()
  btrfs: finish ordered extent cleaning if no progress is found
  btrfs: clear ordered flag on cleaning up ordered extents
  Btrfs: fix incorrect {node,sector}size endianness from BTRFS_IOC_FS_INFO
  Btrfs: do not reset bio->bi_ops while writing bio
  Btrfs: use the new helper wbc_to_write_flags
2017-09-29 12:57:35 -07:00
Linus Torvalds
7b5ef82336 Merge tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD fixes from Shaohua Li:
 "A few fixes for MD. Mainly fix a problem introduced in 4.13, which we
  retry bio for some code paths but not all in some situations"

* tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid5: cap worker count
  dm-raid: fix a race condition in request handling
  md: fix a race condition for flush request handling
  md: separate request handling
2017-09-29 12:55:33 -07:00
Linus Torvalds
93b5533ab5 Merge tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:

 - fix CONFIG_PCI=n build error (introduced in v4.14-rc1) (Geert
   Uytterhoeven)

 - fix a race in sysfs driver_override store/show (Nicolai Stange)

* tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Fix race condition with driver_override
  PCI: Add dummy pci_acs_enabled() for CONFIG_PCI=n build
2017-09-29 12:46:13 -07:00
Linus Torvalds
a3583202e8 Merge tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Regular fixes pull, some amdkfd, amdgpu, etnaviv, sun4i, qxl, tegra
  fixes.

  I've got an outstanding pull for i915 but it wasn't on an rc2 base so
  I wanted to ship these out first, I might get to it before rc3 or I
  might not"

* tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux:
  drm/tegra: trace: Fix path to include
  qxl: fix framebuffer unpinning
  drm/sun4i: cec: Enable back CEC-pin framework
  drm/amdkfd: Print event limit messages only once per process
  drm/amdkfd: Fix kernel-queue wrapping bugs
  drm/amdkfd: Fix incorrect destroy_mqd parameter
  drm/radeon: disable hard reset in hibernate for APUs
  drm/amdgpu: revert tile table update for oland
  etnaviv: fix gem object list corruption
  etnaviv: fix submit error path
  qxl: fix primary surface handling
  drm/amdkfd: check for null dev to avoid a null pointer dereference
2017-09-29 12:43:36 -07:00
Linus Torvalds
35dbba31be Merge tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:

 - A comment fix for 'struct iommu_ops'

 - Format string fixes for AMD IOMMU, unfortunatly I missed that during
   review.

 - Limit mediatek physical addresses to 32 bit for v7s to fix a warning
   triggered in io-page-table code.

 - Fix dma-sync in io-pgtable-arm-v7s code

* tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Fix comment for iommu_ops.map_sg
  iommu/amd: pr_err() strings should end with newlines
  iommu/mediatek: Limit the physical address in 32bit for v7s
  iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA
2017-09-29 12:37:07 -07:00
Linus Torvalds
0648260041 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - SPsel register initialisation on reset as the architecture defines
   its state as unknown

 - Use READ_ONCE when dereferencing pmd_t pointers to avoid race
   conditions in page_vma_mapped_walk() (or fast GUP) with concurrent
   modifications of the page table

 - Avoid invoking the mm fault handling code for kernel addresses (check
   against TASK_SIZE) which would otherwise result in calling
   might_sleep() in atomic context

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: fault: Route pte translation faults via do_translation_fault
  arm64: mm: Use READ_ONCE when dereferencing pointer to pte table
  arm64: Make sure SPsel is always set
2017-09-29 12:31:35 -07:00
Linus Torvalds
9f2a5128b9 Merge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - avoid a warning when compiling with clang

 - consider read-only bits in xen-pciback when writing to a BAR

 - fix a boot crash of pv-domains

* tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
  xen-pciback: relax BAR sizing write value check
  x86/xen: clean up clang build warning
2017-09-29 12:24:28 -07:00
Linus Torvalds
42057e1825 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Mixed bugfixes. Perhaps the most interesting one is a latent bug that
  was finally triggered by PCID support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm/x86: Handle async PF in RCU read-side critical sections
  KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
  KVM: VMX: use cmpxchg64
  KVM: VMX: simplify and fix vmx_vcpu_pi_load
  KVM: VMX: avoid double list add with VT-d posted interrupts
  KVM: VMX: extract __pi_post_block
  KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception
  KVM: nVMX: fix HOST_CR3/HOST_CR4 cache
2017-09-29 12:18:55 -07:00
Al Viro
6c85501f2f fix infoleak in waitid(2)
kernel_waitid() can return a PID, an error or 0.  rusage is filled in the first
case and waitid(2) rusage should've been copied out exactly in that case, *not*
whenever kernel_waitid() has not returned an error.  Compat variant shares that
braino; none of kernel_wait4() callers do, so the below ought to fix it.

Reported-and-tested-by: Alexander Potapenko <glider@google.com>
Fixes: ce72a16fa7 ("wait4(2)/waitid(2): separate copying rusage to userland")
Cc: stable@vger.kernel.org # v4.13
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-29 13:43:15 -04:00
Andrey Ryabinin
196bd485ee x86/asm: Use register variable to get stack pointer value
Currently we use current_stack_pointer() function to get the value
of the stack pointer register. Since commit:

  f5caf621ee ("x86/asm: Fix inline asm call constraints for Clang")

... we have a stack register variable declared. It can be used instead of
current_stack_pointer() function which allows to optimize away some
excessive "mov %rsp, %<dst>" instructions:

 -mov    %rsp,%rdx
 -sub    %rdx,%rax
 -cmp    $0x3fff,%rax
 -ja     ffffffff810722fd <ist_begin_non_atomic+0x2d>

 +sub    %rsp,%rax
 +cmp    $0x3fff,%rax
 +ja     ffffffff810722fa <ist_begin_non_atomic+0x2a>

Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
and use it instead of the removed function.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29 19:39:44 +02:00
Tom Lendacky
bc829ee36e x86/mm: Disable branch profiling in mem_encrypt.c
Some routines in mem_encrypt.c are called very early in the boot process,
e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined
the resulting branch profiling associated with the check to see if SME is
active results in a kernel crash. Disable branch profiling for
mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any
header files.

Reported-by: kernel test robot <lkp@01.org>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29 19:37:51 +02:00
Ingo Molnar
1addcd55bc Merge tag 'perf-urgent-for-mingo-4.14-20170928' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix syscalltbl build failure (Akemi Yagi)

- Fix attr.exclude_kernel setting for default cycles:p, this time for
  !root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo)

- Sync kernel ABI headers with tooling headers (Ingo Molnar)

- Remove misleading debug messages with --call-graph option (Mengting Zhang)

- Revert vmlinux symbol resolution patches for s390x (Thomas Richter)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29 19:31:46 +02:00
Linus Torvalds
95d3652eec Merge branch 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull keys fixes from James Morris:
 "Notable here is a rewrite of big_key crypto by Jason Donenfeld to
  address some issues in the original code.

  From Jason's commit log:
   "This started out as just replacing the use of crypto/rng with
    get_random_bytes_wait, so that we wouldn't use bad randomness at
    boot time. But, upon looking further, it appears that there were
    even deeper underlying cryptographic problems, and that this seems
    to have been committed with very little crypto review. So, I rewrote
    the whole thing, trying to keep to the conventions introduced by the
    previous author, to fix these cryptographic flaws."

  There has been positive review of the new code by Eric Biggers and
  Herbert Xu, and it passes basic testing via the keyutils test suite.
  Eric also manually tested it.

  Generally speaking, we likely need to improve the amount of crypto
  review for kernel crypto users including keys (I'll post a note
  separately to ksummit-discuss)"

* 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security/keys: rewrite all of big_key crypto
  security/keys: properly zero out sensitive key material in big_key
  KEYS: use kmemdup() in request_key_auth_new()
  KEYS: restrict /proc/keys by credentials at open time
  KEYS: reset parent each time before searching key_user_tree
  KEYS: prevent KEYCTL_READ on negative key
  KEYS: prevent creating a different user's keyrings
  KEYS: fix writing past end of user-supplied buffer in keyring_read()
  KEYS: fix key refcount leak in keyctl_read_key()
  KEYS: fix key refcount leak in keyctl_assume_authority()
  KEYS: don't revoke uninstantiated key in request_key_auth_new()
  KEYS: fix cred refcount leak in request_key_auth_new()
2017-09-29 10:26:35 -07:00
Will Deacon
760bfb47c3 arm64: fault: Route pte translation faults via do_translation_fault
We currently route pte translation faults via do_page_fault, which elides
the address check against TASK_SIZE before invoking the mm fault handling
code. However, this can cause issues with the path walking code in
conjunction with our word-at-a-time implementation because
load_unaligned_zeropad can end up faulting in kernel space if it reads
across a page boundary and runs into a page fault (e.g. by attempting to
read from a guard region).

In the case of such a fault, load_unaligned_zeropad has registered a
fixup to shift the valid data and pad with zeroes, however the abort is
reported as a level 3 translation fault and we dispatch it straight to
do_page_fault, despite it being a kernel address. This results in calling
a sleeping function from atomic context:

  BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
  in_atomic(): 0, irqs_disabled(): 0, pid: 10290
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  [...]
  [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
  [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
  [<ffffff8e016977f0>] do_page_fault+0x140/0x330
  [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
  Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
  [...]
  [<ffffff8e016844fc>] el1_da+0x18/0x78
  [<ffffff8e017f399c>] path_parentat+0x44/0x88
  [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
  [<ffffff8e017f5044>] filename_create+0x4c/0x128
  [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
  [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
  Code: 36380080 d5384100 f9400800 9402566d (d4210000)
  ---[ end trace 2d01889f2bca9b9f ]---

Fix this by dispatching all translation faults to do_translation_faults,
which avoids invoking the page fault logic for faults on kernel addresses.

Cc: <stable@vger.kernel.org>
Reported-by: Ankit Jain <ankijain@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29 16:47:40 +01:00
Will Deacon
f069faba68 arm64: mm: Use READ_ONCE when dereferencing pointer to pte table
On kernels built with support for transparent huge pages, different CPUs
can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
and they must take care to use READ_ONCE to avoid value tearing or caching
of stale values by the compiler. Unfortunately, these functions call into
our pgtable macros, which don't use READ_ONCE, and compiler caching has
been observed to cause the following crash during ext4 writeback:

PC is at check_pte+0x20/0x170
LR is at page_vma_mapped_walk+0x2e0/0x540
[...]
Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
Call trace:
[<ffff000008233328>] check_pte+0x20/0x170
[<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
[<ffff000008234adc>] page_mkclean_one+0xac/0x278
[<ffff000008234d98>] rmap_walk_file+0xf0/0x238
[<ffff000008236e74>] rmap_walk+0x64/0xa0
[<ffff0000082370c8>] page_mkclean+0x90/0xa8
[<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
[<ffff00000832f984>] mpage_submit_page+0x34/0x98
[<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
[<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
[<ffff00000833530c>] ext4_writepages+0x484/0xe30
[<ffff0000081f6ab4>] do_writepages+0x44/0xe8
[<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
[<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
[<ffff000008324310>] ext4_sync_file+0x80/0x4b8
[<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
[<ffff0000082332b4>] SyS_msync+0x194/0x1e8

This is because page_vma_mapped_walk loads the PMD twice before calling
pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
(where it sees a valid table pointer due to a concurrent pmd_populate).
However, the compiler inlines everything and caches the first value in
a register, which is subsequently used in pte_offset_phys which returns
a junk pointer that is later dereferenced when attempting to access the
relevant pte.

This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
that a stale value is not used. Whilst this is a point fix for a known
failure (and simple to backport), a full fix moving all of our page table
accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
page_vma_mapped_walk is in the works for a future kernel release.

Cc: Jon Masters <jcm@redhat.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: f27176cfc3 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29 16:46:43 +01:00
Shiraz Saleem
04eae42740 RDMA/iwpm: Properly mark end of NL messages
Commit 1a1c116f3d removes nlmsg_len calculation in
ibnl_put_attr causing netlink messages to be rejected due
to incorrect length.

Add nlmsg_end after all attributes are appended to calculate
the nlmsg_len.

Fixes: 1a1c116f3d ("RDMA/netlink: Simplify the put_msg and put_attr")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-29 11:32:42 -04:00
Boqun Feng
b862789aa5 kvm/x86: Handle async PF in RCU read-side critical sections
Sasha Levin reported a WARNING:

| WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
| rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
| WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
| rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
...
| CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
| 1.10.1-1ubuntu1 04/01/2014
| Call Trace:
...
| RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
| RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
| RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002
| RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000
| RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410
| RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001
| R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08
| R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040
| __schedule+0x201/0x2240 kernel/sched/core.c:3292
| schedule+0x113/0x460 kernel/sched/core.c:3421
| kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158
| do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271
| async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069
| RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996
| RSP: 0018:ffff88003b2df520 EFLAGS: 00010283
| RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670
| RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140
| RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000
| R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8
| R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000
| vsnprintf+0x173/0x1700 lib/vsprintf.c:2136
| sprintf+0xbe/0xf0 lib/vsprintf.c:2386
| proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23
| get_link fs/namei.c:1047 [inline]
| link_path_walk+0x1041/0x1490 fs/namei.c:2127
...

This happened when the host hit a page fault, and delivered it as in an
async page fault, while the guest was in an RCU read-side critical
section.  The guest then tries to reschedule in kvm_async_pf_task_wait(),
but rcu_preempt_note_context_switch() would treat the reschedule as a
sleep in RCU read-side critical section, which is not allowed (even in
preemptible RCU).  Thus the WARN.

To cure this, make kvm_async_pf_task_wait() go to the halt path if the
PF happens in a RCU read-side critical section.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29 17:05:17 +02:00
Wanpeng Li
305d0ab476 KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
------------[ cut here ]------------
 WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0+ #17
 RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 Call Trace:
  ? emulator_read_emulated+0x15/0x20 [kvm]
  ? segmented_read+0xae/0xf0 [kvm]
  vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  x86_emulate_instruction+0x733/0x810 [kvm]
  vmx_handle_exit+0x2f4/0xda0 [kvm_intel]
  ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm]
  kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  entry_SYSCALL_64_fastpath+0x23/0xc2

A nested #PF is triggered during L0 emulating instruction for L2. However, it
doesn't consider we should not break L1's vmlauch/vmresme. This patch fixes
it by queuing the #PF exception instead ,requesting an immediate VM exit from
L2 and keeping the exception for L1 pending for a subsequent nested VM exit.

This should actually work all the time, making vmx_inject_page_fault_nested
totally unnecessary.  However, that's not working yet, so this patch can work
around the issue in the meanwhile.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29 16:30:37 +02:00
Artem Savkov
e6b72ee88a netfilter: ebtables: fix race condition in frame_filter_net_init()
It is possible for ebt_in_hook to be triggered before ebt_table is assigned
resulting in a NULL-pointer dereference. Make sure hooks are
registered as the last step.

Fixes: aee12a0a37 ("ebtables: remove nf_hook_register usage")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-29 13:36:06 +02:00