Commit Graph

693573 Commits

Author SHA1 Message Date
Colin Ian King
bf98bd0be1 rt2x00: make const array glrt_table static
Don't populate array glrt_table on the stack but make it static.
Makes the object code a smaller by over 670 bytes:

Before:
   text	   data	    bss	    dec	    hex	filename
 131772	   4733	      0	 136505	  21539	rt2800lib.o

After:
   text	   data	    bss	    dec	    hex	filename
 131043	   4789	      0	 135832	  21298	rt2800lib.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-13 09:23:56 -07:00
Colin Ian King
f56ff77487 net: stmmac: make const array route_possibilities static
Don't populate array route_possibilities on the stack but make it
static const.  Makes the object code a little smaller by 85 bytes:

Before:
   text	   data	    bss	    dec	    hex	filename
   9901	   2448	      0	  12349	   303d	dwmac4_core.o

After:
   text	   data	    bss	    dec	    hex	filename
   9760	   2504	      0	  12264	   2fe8	dwmac4_core.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-13 09:23:56 -07:00
Colin Ian King
22c608919b net: broadcom: bnx2x: make a couple of const arrays static
Don't populate various tables on the stack but make them static const.
Makes the object code smaller by nearly 200 bytes:

Before:
   text	   data	    bss	    dec	    hex	filename
 113468	  11200	      0	 124668	  1e6fc	bnx2x_ethtool.o

After:
   text	   data	    bss	    dec	    hex	filename
 113129	  11344	      0	 124473	  1e639	bnx2x_ethtool.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-13 09:23:56 -07:00
Thomas Bogendoerfer
0db01097ca xgene: Don't fail probe, if there is no clk resource for SGMII interfaces
This change fixes following problem

[    1.827940] xgene-enet: probe of 1f210030.ethernet failed with error -2

which leads to a missing ethernet interface (reproducable at least on
Gigabyte MP30-AR0 and APM Mustang systems).

The check for a valid clk resource fails, because DT doesn't provide a
clock for sgenet1. But the driver doesn't use this clk, if the ethernet
port is connected via SGMII. Therefore this patch avoids probing for clk
on SGMII interfaces.

Fixes: 9aea7779b7 ("drivers: net: xgene: Fix crash on DT systems")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-13 09:21:15 -07:00
Kefeng Wang
e4a6a3424b bpf: fix return in bpf_skb_adjust_net
The bpf_skb_adjust_net() ignores the return value of bpf_skb_net_shrink/grow,
and always return 0, fix it by return 'ret'.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-13 09:18:46 -07:00
Roman Kagan
efc479e690 kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2
There is a flaw in the Hyper-V SynIC implementation in KVM: when message
page or event flags page is enabled by setting the corresponding msr,
KVM zeroes it out.  This is problematic because on migration the
corresponding MSRs are loaded on the destination, so the content of
those pages is lost.

This went unnoticed so far because the only user of those pages was
in-KVM hyperv synic timers, which could continue working despite that
zeroing.

Newer QEMU uses those pages for Hyper-V VMBus implementation, and
zeroing them breaks the migration.

Besides, in newer QEMU the content of those pages is fully managed by
QEMU, so zeroing them is undesirable even when writing the MSRs from the
guest side.

To support this new scheme, introduce a new capability,
KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
pages aren't zeroed out in KVM.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-13 17:41:04 +02:00
Ladi Prosek
a826faf108 KVM: x86: make backwards_tsc_observed a per-VM variable
The backwards_tsc_observed global introduced in commit 16a9602 is never
reset to false. If a VM happens to be running while the host is suspended
(a common source of the TSC jumping backwards), master clock will never
be enabled again for any VM. In contrast, if no VM is running while the
host is suspended, master clock is unaffected. This is inconsistent and
unnecessarily strict. Let's track the backwards_tsc_observed variable
separately and let each VM start with a clean slate.

Real world impact: My Windows VMs get slower after my laptop undergoes a
suspend/resume cycle. The only way to get the perf back is unloading and
reloading the kvm module.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-13 17:25:39 +02:00
Dmitry Torokhov
4cf56a89c6 HID: multitouch: do not blindly set EV_KEY or EV_ABS bits
Now that input core insists on having dev->absinfo when device claims to
generate EV_ABS in its dev->evbit, we should not be blindly setting that
bit.

The code in question might have been needed before input_set_abs_params()
started setting EV_ABS in device's evbit, but not anymore, and is now
breaking devices such as SMART SPNL-6075 Touchscreen.

Fixes: 6ecfe51b40 ("Input: refuse to register absolute devices ...")
Reported-by: Matthias Fend <Matthias.Fend@wolfvision.net>
Tested-by: Matthias Fend <Matthias.Fend@wolfvision.net>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-07-13 16:48:29 +02:00
Kefeng Wang
768516894f nbd: kill unused ret in recv_work
No need to return value in queue work, kill ret variable.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-13 08:03:30 -06:00
Ernesto A. Fernández
4d9bcaddac ext2: Fix memory leak when truncate races ext2_get_blocks
Buffer heads referencing indirect blocks may not be released if the file
is truncated at the right time. This happens because ext2_get_branch()
returns NULL when it finds the whole chain of indirect blocks already
set, and when truncate alters the chain this value of NULL is
treated as the address of the last head to be released. Handle this in the
same way as it's done after the got_it label.

Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-07-13 13:45:08 +02:00
Chris Brandt
9c284c41c0 mmc: tmio-mmc: fix bad pointer math
The existing code gives an incorrect pointer value.
The buffer pointer 'buf' was of type unsigned short *, and 'count' was a
number in bytes. A cast of buf should have been used.

However, instead of casting, just change the code to use u32 pointers.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 8185e51f35: ("mmc: tmio-mmc: add support for 32bit data port")
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-07-13 11:56:35 +02:00
Alex Shi
69f0d429c4 locking/rtmutex: Remove unnecessary priority adjustment
We don't need to adjust priority before adding a new pi_waiter, the
priority only needs to be updated after pi_waiter change or task
priority change.

Steven Rostedt pointed out:

  "Interesting, I did some git mining and this was added with the original
   entry of the rtmutex.c (23f78d4a03). Looking at even that version, I
   don't see the purpose of adjusting the task prio here. It is done
   before anything changes in the task."

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1499926704-28841-1-git-send-email-alex.shi@linaro.org
[ Enhance the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-13 11:44:06 +02:00
Grzegorz Sluja
bbdc74dc19 mmc: block: Prevent new req entering queue after its cleanup
The commit 304419d8a7 ("mmc: core: Allocate per-request data using the
block layer core"), refactored the mechanism of queue handling, but also
made mmc_init_request() to be called after mmc_cleanup_queue(). This
triggers a null pointer dereference:

[  683.123791] BUG: unable to handle kernel NULL pointer dereference at (null)
[  683.123801] IP: mmc_init_request+0x2c/0xf0 [mmc_block]
...
[  683.123905] Call Trace:
[  683.123913]  alloc_request_size+0x4f/0x70
[  683.123919]  mempool_alloc+0x5f/0x150
[  683.123925]  ? __enqueue_entity+0x6c/0x70
[  683.123928]  get_request+0x3ad/0x720
[  683.123933]  ? prepare_to_wait_event+0x110/0x110
[  683.123937]  blk_queue_bio+0xc1/0x3a0
[  683.123940]  generic_make_request+0xf8/0x2a0
[  683.123942]  submit_bio+0x75/0x150
[  683.123947]  submit_bio_wait+0x51/0x70
[  683.123951]  blkdev_issue_flush+0x5c/0x90
[  683.123956]  ext4_sync_fs+0x171/0x1b0
[  683.123961]  sync_filesystem+0x73/0x90
[  683.123965]  fsync_bdev+0x24/0x50
[  683.123971]  invalidate_partition+0x24/0x50
[  683.123973]  del_gendisk+0xb2/0x2a0
[  683.123977]  mmc_blk_remove_req.part.38+0x71/0xa0 [mmc_block]
[  683.123980]  mmc_blk_remove+0xba/0x190 [mmc_block]
[  683.123990]  mmc_bus_remove+0x1a/0x20 [mmc_core]
[  683.123995]  device_release_driver_internal+0x141/0x200
[  683.123999]  device_release_driver+0x12/0x20
[  683.124001]  bus_remove_device+0xfd/0x170
[  683.124004]  device_del+0x1e8/0x330
[  683.124012]  mmc_remove_card+0x60/0xc0 [mmc_core]
[  683.124019]  mmc_remove+0x19/0x30 [mmc_core]
[  683.124025]  mmc_stop_host+0xfb/0x1a0 [mmc_core]
[  683.124032]  mmc_remove_host+0x1a/0x40 [mmc_core]
[  683.124037]  sdhci_remove_host+0x2e/0x1c0 [mmc_sdhci]
[  683.124042]  sdhci_pci_remove_slot+0x3f/0x80 [sdhci_pci]
[  683.124045]  sdhci_pci_remove+0x39/0x70 [sdhci_pci]
[  683.124049]  pci_device_remove+0x39/0xc0
[  683.124052]  device_release_driver_internal+0x141/0x200
[  683.124056]  driver_detach+0x3f/0x80
[  683.124059]  bus_remove_driver+0x55/0xd0
[  683.124062]  driver_unregister+0x2c/0x50
[  683.124065]  pci_unregister_driver+0x29/0x90
[  683.124069]  sdhci_driver_exit+0x10/0x4f3 [sdhci_pci]
[  683.124073]  SyS_delete_module+0x171/0x250
[  683.124078]  entry_SYSCALL_64_fastpath+0x1e/0xa9

Fix this by setting the queue DYING flag before cleanup the queue, as it
prevents new reqs from entering the queue.

Signed-off-by: Grzegorz Sluja <grzegorzx.sluja@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Fixes: 304419d8a7 ("mmc: core: Allocate per-request data using the...")
[Ulf: Updated the changelog]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-07-13 11:44:01 +02:00
Ingo Molnar
fc5d7e5bf4 perf/urgent fix:
- Accept zero as the kernel base address, to resolve symbols on
   architectures that don't partition the virtual address space in
   kernel/user, S/390, is one of such architectures and was where
   this problem was noticed. (Arnaldo Carvalho de Melo)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZZnW+AAoJENZQFvNTUqpABfkP/iG6xdjtafYV2dMFSGDjQ8I5
 eoV9JBFnROUaPrTYfkd5McBoiO42PkojY1Oj6DxuF9vuWA4mVsInoNh9c86Il2B0
 OM7h9MOJyrYG5rOo1HQniRdPWRhUTIPdB/5dGNpF9SKMV1cHOr+akeEfmo/QBflv
 iZ1JnKVsnrlLO/dNHhvbPWeKYcW2EU0kQ+WeCpMW9NGgwkaVulrbFDetxMRIdX5y
 iyaCPmkSxiGZS8UcpfdNdNFv1JaObaTRB9uD5YLx8HkfX3HdkqM/iXiomSd47XQ1
 KBXa7+IVK63ycolyyqbtvXyK8lyD3DZ9ytth7AMN0msbrZ/HZu5MGF4JYUyvBhlS
 2mtsEYf9nvn58h8Lwljf9v25z3vZRSYIZTaRMCmfsGVsFRoooGLuiILJhCH2g5WP
 DRhX/Su1QB9xV6G27WGSb3N2/oLW3rhyWTj5KXkfln+zi1hBxkEPDKm4W3FXwi9y
 bW+S8pGzP4XdvdNm56XZqTW/eiLUZg6FyWYWxCgBdd76SuSLk8SvZVfg9e3MUmy4
 oDItf0eCUtchNRhW3zcipVkMPHLxMf3MVWfHGP6u0UUyjhBJueNPNyiHibokgJdj
 SqHm2/az6vOtq5VhhrJwpmOSzObtTEitX0Z6VYZn2pVamM6YsdLxaTzRPwTMYlFY
 cFGcq4YJjCn0UMP87Oku
 =pz12
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-4.13-20170712' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

 - Accept zero as the kernel base address, to resolve symbols on
   architectures that don't partition the virtual address space in
   kernel/user, S/390, is one of such architectures and was where
   this problem was noticed. (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-13 11:35:48 +02:00
Christian Borntraeger
97ca7bfc19 s390/mm: set change and reference bit on lazy key enablement
When we enable storage keys for a guest lazily, we reset the ACC and F
values. That is correct assuming that these are 0 on a clear reset and
the guest obviously has not used any key setting instruction.

We also zero out the change and reference bit. This is not correct as
the architecture prefers over-indication instead of under-indication
for the keyless->keyed transition.

This patch fixes the behaviour and always sets guest change and guest
reference for all guest storage keys on the keyless -> keyed switch.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-13 11:28:29 +02:00
Dong Jia Shi
2daace78a8 s390: chp: handle CRW_ERC_INIT for channel-path status change
When channel path is identified as the report source code (RSC)
of a CRW, and initialized (CRW_ERC_INIT) is recognized as the
error recovery code (ERC) by the channel subsystem, it indicates
a "path has come" event.

Let's handle this case in chp_process_crw().

Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-13 11:28:29 +02:00
Christian Borntraeger
b855629b9a s390/perf: fix problem state detection
The P sample bit indicates problem state and not PER.

Fixes: commit a752598254 ("s390: rename struct psw_bits members")
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-13 11:28:29 +02:00
Bjorn Andersson
3c48d86cc9 clk: Provide bulk prepare_enable disable_unprepare variants
This extends the existing set of bulk helpers with prepare_enable and
disable_unprepare variants.

Cc: Russell King <linux@armlinux.org.uk>,
Cc: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2017-07-12 23:50:53 -07:00
Dave Airlie
6419ec78c6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
single r700 fix.
* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: Fix eDP for single-display iMac10,1 (v2)
2017-07-13 13:38:22 +10:00
Linus Torvalds
4ca6df1348 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull sysctl fix from Eric Biederman:
 "A rather embarassing and hard to hit bug was merged into 4.11-rc1.

  Andrei Vagin tracked this bug now and after some staring at the code
  I came up with a fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc: Fix proc_sys_prune_dcache to hold a sb reference
2017-07-12 19:43:20 -07:00
Linus Torvalds
edaf382518 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

1) Fix 64-bit division in mlx5 IPSEC offload support, from Ilan Tayari
   and Arnd Bergmann.

2) Fix race in statistics gathering in bnxt_en driver, from Michael
   Chan.

3) Can't use a mutex in RCU reader protected section on tap driver, from
   Cong WANG.

4) Fix mdb leak in bridging code, from Eduardo Valentin.

5) Fix free of wrong pointer variable in nfp driver, from Dan Carpenter.

6) Buffer overflow in brcmfmac driver, from Arend van SPriel.

7) ioremap_nocache() return value needs to be checked in smsc911x
   driver, from Alexey Khoroshilov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
  net: stmmac: revert "support future possible different internal phy mode"
  sfc: don't read beyond unicast address list
  datagram: fix kernel-doc comments
  socket: add documentation for missing elements
  smsc911x: Add check for ioremap_nocache() return code
  brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
  net: hns: Bugfix for Tx timeout handling in hns driver
  net: ipmr: ipmr_get_table() returns NULL
  nfp: freeing the wrong variable
  mlxsw: spectrum_switchdev: Check status of memory allocation
  mlxsw: spectrum_switchdev: Remove unused variable
  mlxsw: spectrum_router: Fix use-after-free in route replace
  mlxsw: spectrum_router: Add missing rollback
  samples/bpf: fix a build issue
  bridge: mdb: fix leak on complete_info ptr on fail path
  tap: convert a mutex to a spinlock
  cxgb4: fix BUG() on interrupt deallocating path of ULD
  qed: Fix printk option passed when printing ipv6 addresses
  net: Fix minor code bug in timestamping.txt
  net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer
  ...
2017-07-12 19:30:57 -07:00
Linus Torvalds
bd664f6b3e disable new gcc-7.1.1 warnings for now
I made the mistake of upgrading my desktop to the new Fedora 26 that
comes with gcc-7.1.1.

There's nothing wrong per se that I've noticed, but I now have 1500
lines of warnings, mostly from the new format-truncation warning
triggering all over the tree.

We use 'snprintf()' and friends in a lot of places, and often know that
the numbers are fairly small (ie a controller index or similar), but gcc
doesn't know that, and sees an 'int', and thinks that it could be some
huge number.  And then complains when our buffers are not able to fit
the name for the ten millionth controller.

These warnings aren't necessarily bad per se, and we probably want to
look through them subsystem by subsystem, but at least during the merge
window they just mean that I can't even see if somebody is introducing
any *real* problems when I pull.

So warnings disabled for now.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 19:25:47 -07:00
Dave Airlie
0355e22a4b Merge tag 'drm-misc-next-fixes-2017-07-10' of git://anongit.freedesktop.org/git/drm-misc into drm-next
Core Changes:
- Fix empty timestamps on hw without vlbank counter (Laurent)
- Clear atomic state before retrying ww/mutex acquisition in remove_fb (Maarten)

Driver Changes:
- rockchip: Fix incorrect NULL pointer check after allocation (Gustavo)

Cc: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

* tag 'drm-misc-next-fixes-2017-07-10' of git://anongit.freedesktop.org/git/drm-misc:
  drm/rockchip: fix NULL check on devm_kzalloc() return value
  drm/atomic: Add missing drm_atomic_state_clear to atomic_remove_fb
  drm: vblank: Fix vblank timestamp update
  DRM: Fix an incorrectly formatted table
  bridge: Fix panel-bridge error return on !panel.
  drm/rockchip: gem: add the lacks lock and trivial changes
2017-07-13 11:22:34 +10:00
Dave Airlie
caa164e373 Merge tag 'drm-intel-next-fixes-2017-07-11' of git://anongit.freedesktop.org/git/drm-intel into drm-next
drm/i915 fixes for v4.13-rc1

* tag 'drm-intel-next-fixes-2017-07-11' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Make DP-MST connector info work
  drm/i915/gvt: Use fence error from GVT request for workload status
  drm/i915/gvt: remove scheduler_mutex in per-engine workload_thread
  drm/i915/gvt: Revert "drm/i915/gvt: Fix possible recursive locking issue"
  drm/i915/gvt: Audit the command buffer address
  drm/i915/gvt: Fix a memory leak in intel_gvt_init_gtt()
  drm/i915/fbdev: Check for existence of ifbdev->vma before operations
  drm/i915: Hold RPM wakelock while initializing OA buffer
  drm/i915/cnl: Fix the CURSOR_COEFF_MASK used in DDI Vswing Programming
  drm/i915/cfl: Fix Workarounds.
  drm/i915: Avoid undefined behaviour of "u32 >> 32"
  drm/i915: reintroduce VLV/CHV PFI programming power domain workaround
  drm/i915: Fix an error checking test
  drm/i915: Disable MSI for all pre-gen5
  drm/i915/gvt: Make function dpy_reg_mmio_readx safe
  drm/i915/gvt: Don't read ADPA_CRT_HOTPLUG_MONITOR from host
  drm/i915/gvt: Set initial PORT_CLK_SEL vreg for BDW
  drm/i915/gvt: Fix inconsistent locks holding sequence
  drm/i915/gvt: Fix possible recursive locking issue
2017-07-13 11:21:16 +10:00
Dave Airlie
39bf0bff4f Merge branch 'mediatek-drm-next-4.13' of https://github.com/ckhu-mediatek/linux.git-tags into drm-next
This include new color format support and some fixups.

* 'mediatek-drm-next-4.13' of https://github.com/ckhu-mediatek/linux.git-tags:
  drm/mediatek: separate color module to fixup error memory reallocation
  drm/mediatek: check for memory allocation failure
  drm/mediatek: re-phrase DRM_INFO error message
  drm/mediatek: use platform_register_drivers
  drm/mediatek: Support UYVY and YUYV format for overlay
2017-07-13 11:00:20 +10:00
Linus Torvalds
3a75ad1457 Modules updates for v4.13
Summary of modules changes for the 4.13 merge window:
 
 - Minor code cleanups
 
 - Avoid accessing mod struct prior to checking module struct version, from Kees
 
 - Fix racy atomic inc/dec logic of kmod_concurrent_max in kmod, from Luis
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJZZp4WAAoJEMBFfjjOO8Fy5JkQAIYujpi6ZS7pGpNCXnGa8pnQ
 E62oLWAM3UndSgzkL6KJ8HXUzc26Wvm56hoF+k/bvQ7fq0qUmMF71yQ7mArzTZEW
 QW4t7Fu6zTUh4l5hGenoz1ShJbi+rB/pQT8l6AgdCSEZjpcCoWv+sdb93qoT3YO8
 /5pugAR2Uid1yb6EVDzItB/tz5w9Vyojp/fePkcz7M0sAI3NCa/0zeWtYgJbXpTW
 atieqPM8icfP8LNBYaXmA1SowMkW9cIh8AGhBIbvUYP35wTZVP2jJA0GxK6vB/+c
 pnDRw/zZO+BUYSpv/NMpJsQ2SKX+t2h5uvBqveq3Q5PljcZAvb6L0wt3PSUp4kvz
 iRPAIb90FtQqBCLfFnDyIMvzVyCXfHq+eVsFYcvlVOWfdkLaeNEhLyn25whkFXr7
 ricd/yXKdS8T1WHatR1HqzIk7pog7PsPewVrjl78TBx3nyIMxEhtCpV9MrnditfP
 IE1/8hQ2rSriSkFeAi5SYxQ5iNwzQKtKOqMiv7lefIuJiCde+0no4XzMrPz/MaU6
 UGyTRRNiQXSlfZQaMI4Ru1itVdAugRRVScATz69ggFqRyfCVuByM78RaygfcrPEC
 H6tHbeJxyEBytlS2qB2cmVXPvIKOdJ3mU9bGdBy9IuXCj8reJMbzQMfIt4lSow+h
 axggDNhbL2urY9Ymn1wX
 =tYuD
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull modules updates from Jessica Yu:
 "Summary of modules changes for the 4.13 merge window:

   - Minor code cleanups

   - Avoid accessing mod struct prior to checking module struct version,
     from Kees

   - Fix racy atomic inc/dec logic of kmod_concurrent_max in kmod, from
     Luis"

* tag 'modules-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: make the modinfo name const
  kmod: reduce atomic operations on kmod_concurrent and simplify
  module: use list_for_each_entry_rcu() on find_module_all()
  kernel/module.c: suppress warning about unused nowarn variable
  module: Add module name to modinfo
  module: Pass struct load_info into symbol checks
2017-07-12 17:22:01 -07:00
Rafael J. Wysocki
c7b5a4e6e8 PCI / PM: Fix native PME handling during system suspend/resume
Commit 76cde7e495 (PCI / PM: Make PCIe PME interrupts wake up from
suspend-to-idle) went too far with preventing pcie_pme_work_fn() from
clearing the root port's PME Status and re-enabling the PME interrupt
which should be done for PMEs to work correctly after system resume.

The failing scenario is as follows:

 1. pcie_pme_suspend() finds that the PME IRQ should be designated
    for system wakeup, so it calls enable_irq_wake() and then sets
    data->suspend_level to PME_SUSPEND_WAKEUP.

 2. PME interrupt happens at this point.

 3. pcie_pme_irq() runs, disables the PME interrupt and queues up
    the execution of pcie_pme_work_fn().

 4. pcie_pme_work_fn() runs before pcie_pme_resume() and breaks out
    of the loop right away, because data->suspend_level is not
    PME_SUSPEND_NONE, and it doesn't re-enable the PME interrupt
    for the same reason.

 5. pcie_pme_resume() runs and simply calls disable_irq_wake()
    without re-enabling the PME interrupt (because data->suspend_level
    is not PME_SUSPEND_NONE), so the PME interrupt remains disabled
    and the PME Status remains set.

To fix this notice that there is no reason why pcie_pme_work_fn()
should behave in a special way during system resume if the PME
interrupt is not disabled by pcie_pme_suspend() and partially revert
commit 76cde7e495 and restore the previous (and correct) behavior
of pcie_pme_work_fn().

Fixes: 76cde7e495 (PCI / PM: Make PCIe PME interrupts wake up from suspend-to-idle)
Reported-and-tested-by: Naresh Solanki <naresh.solanki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-13 01:50:07 +02:00
Nikolay Borisov
3e8f399da4 writeback: rework wb_[dec|inc]_stat family of functions
Currently the writeback statistics code uses a percpu counters to hold
various statistics.  Furthermore we have 2 families of functions - those
which disable local irq and those which doesn't and whose names begin
with double underscore.  However, they both end up calling
__add_wb_stats which in turn calls percpu_counter_add_batch which is
already irq-safe.

Exploiting this fact allows to eliminated the __wb_* functions since
they don't add any further protection than we already have.
Furthermore, refactor the wb_* function to call __add_wb_stat directly
without the irq-disabling dance.  This will likely result in better
runtime of code which deals with modifying the stat counters.

While at it also document why percpu_counter_add_batch is in fact
preempt and irq-safe since at least 3 people got confused.

Link: http://lkml.kernel.org/r/1498029937-27293-1-git-send-email-nborisov@suse.com
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:05 -07:00
Joe Perches
c945dccc80 ARM: samsung: usb-ohci: move inline before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/667a515b8d0f10f2465d519f8595edd91552fc5e.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:05 -07:00
Joe Perches
ba168a46b0 video: fbdev: omap: move inline before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/bc5927726abc70d7c066df7ab4cb7cfce4a7b577.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
dce3944717 video: fbdev: intelfb: move inline before return type
Make the code like the rest of the kernel.

But there is an oddity here because the inline should probably be removed.

It's an extern function in intelfb.h and it is used in intelfbdrv.c and
intelfbhw.c.

The inline is kept here as I suppose it's possible for some compiler to
make the uses inline in intelfbdrv and and also create an external
function for intelfbhw.

Link: http://lkml.kernel.org/r/8ba151a1fdc84e42cbf4aafc798513c0158edee1.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Maik Broemme <mbroemme@libmpq.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
4abf87f41a USB: serial: safe_serial: move __inline__ before return type
Make the code like the rest of the kernel.
Also use inline instead of __inline__.

Link: http://lkml.kernel.org/r/a5072b74b6c293e6ec93c4900482e9d3267f15b2.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Johan Hovold <johan@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
a9e5bfdb9d drivers: tty: serial: move inline before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/55d3e89d50bb03d603bfb28019fab07f48bdc714.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Pat Gefre <pfg@sgi.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
e0710e510c drivers: s390: move static and inline before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/3f980cd89084ae09716353aba3171e4b3815e690.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
0825f49f22 x86/efi: move asmlinkage before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/1cd3d401626e51ea0e2333a860e76e80bc560a4c.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
0cef25c1d8 sh: move inline before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/f81bb2a67a97b1fd8b6ea99bd350d8a0f6864fb1.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
b745fcb949 MIPS: SMP: move asmlinkage before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/756d3fb543e981b9284e756fa27616725a354b28.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
9d8a9ae281 m68k: coldfire: move inline before return type
Make the code like the rest of the kernel.

Link: http://lkml.kernel.org/r/14db9c166d5b68efa77e337cfe49bb9b29bca3f7.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
c02f2a911f ia64: sn: pci: move inline before type
Make the use of inline like the rest of the kernel.

Link: http://lkml.kernel.org/r/f42b2202bd0d4e7ccf79ce5348bb255a035e67bb.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
d778931d7b ia64: move inline before return type
Make the use of inline like the rest of the kernel.

Link: http://lkml.kernel.org/r/d47074493af80ce12590340294bc49618165c30d.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
1d731bb772 FRV: tlbflush: move asmlinkage before return type
Make the use of asmlinkage like the rest of the kernel.

Link: http://lkml.kernel.org/r/efb2dfed4d9315bf68ec0334c81b65af176a0174.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
8d95a3dca0 CRIS: gpio: move inline before return type
Move inline to be like the rest of the kernel.

Link: http://lkml.kernel.org/r/6bf1bec049897c4158f698b866810f47c728f233.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
1e90d0ed32 ARM: HP Jornada 7XX: move inline before return type
Convert 'u8 inline' to 'inline u8' to be the same style used by the rest
of the kernel.

Miscellanea:

jornada_ssp_reverse is an odd function.
It is declared inline but is also EXPORT_SYMBOL.
It is also apparently only used by jornada720_ssp.c
Likely the EXPORT_SYMBOL could be removed and the function
converted to static.

The addition of static and removal of EXPORT_SYMBOL was not done.

Link: http://lkml.kernel.org/r/5bd3b2bf39c6c9caf773949f18158f8f5ec08582.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
ead9fba6b8 ARM: KVM: move asmlinkage before type
asmlinkage is either 'extern "C"' or blank.

Move the uses of asmlinkage before the return types to be similar
to the rest of the kernel.

Link: http://lkml.kernel.org/r/005b8e120650c6a13b541e420f4e3605603fe9e6.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Joe Perches
596ed45b5b checkpatch: improve the STORAGE_CLASS test
Make sure static, extern, and asmlinkage appear before a specific type.

e.g.:
	int asmlinkage foo(void)
is better written
       asmlinkage int foo(void)

Link: http://lkml.kernel.org/r/31704c96df2d5fd9df0b41165940a7a4feb16a63.1499284835.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Michal Hocko
0f55685627 mm, migration: do not trigger OOM killer when migrating memory
Page migration (for memory hotplug, soft_offline_page or mbind) needs to
allocate a new memory.  This can trigger an oom killer if the target
memory is depleated.  Although quite unlikely, still possible,
especially for the memory hotplug (offlining of memoery).

Up to now we didn't really have reasonable means to back off.
__GFP_NORETRY can fail just too easily and __GFP_THISNODE sticks to a
single node and that is not suitable for all callers.

But now that we have __GFP_RETRY_MAYFAIL we should use it.  It is
preferable to fail the migration than disrupt the system by killing some
processes.

Link: http://lkml.kernel.org/r/20170623085345.11304-7-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Michal Hocko
dbb329561a drm/i915: use __GFP_RETRY_MAYFAIL
Commit 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than
incur the oom for gfx allocations") has tried to remove disruptive OOM
killer because the userspace should be able to cope with allocation
failures.

At the time only __GFP_NORETRY could achieve that and it turned out that
this would fail the allocations just too easily.  So "drm/i915: Remove
__GFP_NORETRY from our buffer allocator" removed it and hoped for a
better solution.  __GFP_RETRY_MAYFAIL is that solution.  It will keep
retrying the allocation until there is no more progress and we would go
OOM.  Instead we fail the allocation and let the caller to deal with it.

Link: http://lkml.kernel.org/r/20170623085345.11304-6-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:04 -07:00
Michal Hocko
cc965a29db mm: kvmalloc support __GFP_RETRY_MAYFAIL for all sizes
Now that __GFP_RETRY_MAYFAIL has a reasonable semantic regardless of the
request size we can drop the hackish implementation for !costly orders.
__GFP_RETRY_MAYFAIL retries as long as the reclaim makes a forward
progress and backs of when we are out of memory for the requested size.
Therefore we do not need to enforce__GFP_NORETRY for !costly orders just
to silent the oom killer anymore.

Link: http://lkml.kernel.org/r/20170623085345.11304-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:03 -07:00
Michal Hocko
91c63ecda7 xfs: map KM_MAYFAIL to __GFP_RETRY_MAYFAIL
KM_MAYFAIL didn't have any suitable GFP_FOO counterpart until recently
so it relied on the default page allocator behavior for the given set of
flags.  This means that small allocations actually never failed.

Now that we have __GFP_RETRY_MAYFAIL flag which works independently on
the allocation request size we can map KM_MAYFAIL to it.  The allocator
will try as hard as it can to fulfill the request but fails eventually
if the progress cannot be made.  It does so without triggering the OOM
killer which can be seen as an improvement because KM_MAYFAIL users
should be able to deal with allocation failures.

Link: http://lkml.kernel.org/r/20170623085345.11304-4-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:03 -07:00
Michal Hocko
dcda9b0471 mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic
__GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to
the page allocator.  This has been true but only for allocations
requests larger than PAGE_ALLOC_COSTLY_ORDER.  It has been always
ignored for smaller sizes.  This is a bit unfortunate because there is
no way to express the same semantic for those requests and they are
considered too important to fail so they might end up looping in the
page allocator for ever, similarly to GFP_NOFAIL requests.

Now that the whole tree has been cleaned up and accidental or misled
usage of __GFP_REPEAT flag has been removed for !costly requests we can
give the original flag a better name and more importantly a more useful
semantic.  Let's rename it to __GFP_RETRY_MAYFAIL which tells the user
that the allocator would try really hard but there is no promise of a
success.  This will work independent of the order and overrides the
default allocator behavior.  Page allocator users have several levels of
guarantee vs.  cost options (take GFP_KERNEL as an example)

 - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_
   attempt to free memory at all. The most light weight mode which even
   doesn't kick the background reclaim. Should be used carefully because
   it might deplete the memory and the next user might hit the more
   aggressive reclaim

 - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic
   allocation without any attempt to free memory from the current
   context but can wake kswapd to reclaim memory if the zone is below
   the low watermark. Can be used from either atomic contexts or when
   the request is a performance optimization and there is another
   fallback for a slow path.

 - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) -
   non sleeping allocation with an expensive fallback so it can access
   some portion of memory reserves. Usually used from interrupt/bh
   context with an expensive slow path fallback.

 - GFP_KERNEL - both background and direct reclaim are allowed and the
   _default_ page allocator behavior is used. That means that !costly
   allocation requests are basically nofail but there is no guarantee of
   that behavior so failures have to be checked properly by callers
   (e.g. OOM killer victim is allowed to fail currently).

 - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior
   and all allocation requests fail early rather than cause disruptive
   reclaim (one round of reclaim in this implementation). The OOM killer
   is not invoked.

 - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator
   behavior and all allocation requests try really hard. The request
   will fail if the reclaim cannot make any progress. The OOM killer
   won't be triggered.

 - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
   and all allocation requests will loop endlessly until they succeed.
   This might be really dangerous especially for larger orders.

Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL
because they already had their semantic.  No new users are added.
__alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if
there is no progress and we have already passed the OOM point.

This means that all the reclaim opportunities have been exhausted except
the most disruptive one (the OOM killer) and a user defined fallback
behavior is more sensible than keep retrying in the page allocator.

[akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c]
[mhocko@suse.com: semantic fix]
  Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz
[mhocko@kernel.org: address other thing spotted by Vlastimil]
  Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 16:26:03 -07:00