Commit Graph

119 Commits

Author SHA1 Message Date
Ingo Molnar
8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Mark Rutland
6aa7de0591 locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 11:01:08 +02:00
Alex Deucher
c94501279b Revert "drm/amdgpu: discard commands of killed processes"
This causes instability in piglit.  It's fixed in drm-next with:
515c6faf85
1650c14b45
214a91e6bf
29d2535535
7986746263

This reverts commit 6af0883ed9.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-18 12:57:47 -04:00
Christian König
6af0883ed9 drm/amdgpu: discard commands of killed processes
When a process is killed we shouldn't submit all waiting jobs, but instead
clean up as fast as possible.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-24 11:48:42 -04:00
Nicolai Hähnle
eabd76cef9 drm/amd/sched: print sched job id in amd_sched_job trace
This makes it easier to correlate amd_sched_job with with other trace
points that don't log the job pointer.

v2: don't print the sched_job pointer (Andres)

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-07-14 11:06:00 -04:00
Monk Liu
65781c78ad drm/amdgpu/SRIOV:implement guilty job TDR for(V2)
1,TDR will kickout guilty job if it hang exceed the threshold
of the given one from kernel paramter "job_hang_limit", that
way a bad command stream will not infinitly cause GPU hang.

by default this threshold is 1 so a job will be kicked out
after it hang.

2,if a job timeout TDR routine will not reset all sched/ring,
instead if will only reset on the givn one which is indicated
by @job of amdgpu_sriov_gpu_reset, that way we don't need to
reset and recover each sched/ring if we already know which job
cause GPU hang.

3,unblock sriov_gpu_reset for AI family.

V2:
1:put kickout guilty job after sched parked.
2:since parking scheduler prior to kickout already occupies a
while, we can do last check on the in question job before
doing hw_reset.

TODO:
1:when a job is considered as guilty, we should mark some flag
in its fence status flag, and let UMD side aware that this
fence signaling is not due to job complete but job hang.

2:if gpu reset cause all video memory lost, we need introduce
a new policy to implement TDR, like drop all jobs not yet
signaled, and all IOCTL on this device will return ERROR
DEVICE_LOST.
this will be implemented later.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24 17:40:40 -04:00
Chunming Zhou
30514decb2 drm/amdgpu: fix dependency issue
The problem is that executing the jobs in the right order doesn't give you the right result
because consecutive jobs executed on the same engine are pipelined.
In other words job B does it buffer read before job A has written it's result.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-10 13:23:53 -04:00
Chunming Zhou
cb3696fdec drm/amd: fix init order of sched job
Need to increment after the fence check.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-10 13:16:50 -04:00
Chunming Zhou
a6bef67e2a drm/amdgpu: fix NULL pointer error
[  141.420491] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  141.420532] IP: [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[  141.420563] PGD 20a030067
[  141.420575] PUD 2088ca067
[  141.420587] PMD 0

[  141.420599] Oops: 0000 [#1] SMP
[  141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[  141.420948]  nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E)
[  141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G           OE   4.9.0-custom #4
[  141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[  141.421146] Workqueue: events amd_sched_job_timedout [amdgpu]
[  141.421169] task: ffff88020b03ba80 task.stack: ffffc900016f4000
[  141.421193] RIP: 0010:[<ffffffff81579ee1>]  [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[  141.421229] RSP: 0018:ffffc900016f7d30  EFLAGS: 00010202
[  141.421250] RAX: ffff8801c049fc00 RBX: ffff8801d4d8dc00 RCX: 0000000000000000
[  141.421278] RDX: 0000000000000001 RSI: ffff8801c049fcc0 RDI: 0000000000000000
[  141.421307] RBP: ffffc900016f7d48 R08: 0000000000000000 R09: 0000000000000000
[  141.421334] R10: 00000020ed512a30 R11: 0000000000000001 R12: 0000000000000000
[  141.421362] R13: ffff880209ba4ba0 R14: ffff880209ba4c58 R15: ffff8801c055cc60
[  141.421390] FS:  0000000000000000(0000) GS:ffff88021ef80000(0000) knlGS:0000000000000000
[  141.421421] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  141.421443] CR2: 0000000000000030 CR3: 000000020b554000 CR4: 00000000003406e0
[  141.421471] Stack:
[  141.421480]  ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78
[  141.421513]  ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770
[  141.421549]  ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7
[  141.421583] Call Trace:
[  141.421628]  [<ffffffffa0697920>] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu]
[  141.421676]  [<ffffffffa05ce2ae>] amdgpu_gpu_reset+0x8e/0x690 [amdgpu]
[  141.421712]  [<ffffffffa0509eb7>] ? drm_printk+0x97/0xa0 [drm]
[  141.421770]  [<ffffffffa0698156>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[  141.421829]  [<ffffffffa0696a07>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[  141.421859]  [<ffffffff81095493>] process_one_work+0x153/0x3f0
[  141.421884]  [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0
[  141.421907]  [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350
[  141.421931]  [<ffffffff8109b423>] kthread+0xd3/0xf0
[  141.421951]  [<ffffffff8109b350>] ? kthread_park+0x60/0x60
[  141.421975]  [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[  141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 <48> 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95
[  141.422156] RIP  [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[  141.422183]  RSP <ffffc900016f7d30>
[  141.422197] CR2: 0000000000000030
[  141.433483] ---[ end trace bc0949bf7ddd6d4b ]---

if the job is reset twice, then the parent could be NULL.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-28 17:33:09 -04:00
Chunming Zhou
153de9dff9 drm/amd/sched: revise priority number
big number is to high priority.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:54:03 -04:00
Andres Rodriguez
93f8b36738 drm/amd/sched: add a unique job id to amd_sched_job
A unique id is useful for debugging and tracing. Intended to replace
pointers in ftrace output.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:53:53 -04:00
Ingo Molnar
ae7e81c077 sched/headers: Prepare for new header dependencies before moving code to <uapi/linux/sched/types.h>
We are going to move scheduler ABI details to <uapi/linux/sched/types.h>,
which will be used from a number of .c files.

Create empty placeholder header that maps to <linux/types.h>.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:27 +01:00
Dave Airlie
6320745596 drm/virtio: fix busid in a different way, allocate more vbufs.
drm/qxl: various bugfixes and cleanups,
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJYMzLfAAoJEEy22O7T6HE41rIQANAEl/o8cYUoyYTJlhmmnl2U
 K+QBdr7PACdbr8RZrGpwA5ad9ZJGijpZRd2gThrzNS0JBdZI48gPEzU7V206xlyD
 AriBeAu6IkoBTEl+GGx2DfvOdLR6+7KlIrDYIpl2vILgkqlHhneXdHR3R03byRHG
 2Jrxv2YQxCs8swtAb8FRkVNaUgrfkKOKFFlx1LoLFApYeP02oSxZp0Ve4nuRNj7x
 9DCivIw4NyQ9tY1fORapmrEPTerqZnzYdb9RFSv4xilx4Stq1UWdXfTSpwXZHZaG
 VroXZb1I0fZEk1aapIxuzLZFGNSM7wLET/nK02sSvzxJJv2PiyVAabIo70nUqsQK
 H/iGT2g4MZC1Yvz6evENtckbiA1p3F9jnd+Po9ivDY/RrTpND3hVC2WbcOXWxZkb
 m69muvXfrnZwoF9xWPG8aTrCATim++1Ty8/8LoKdVq1d0Dp/Gzk8KnklBPY2vRFt
 dpxqH3jLgED/QcO5W/yQdf0kPRsrNwKFNLqP9bCF2hMIw1VHHddZtnBBXDGATXYq
 hdFA8EEg3gh/kY7V8b+GyxjRKRbveG208hu+H4EirxHmRn5xJN1VoTLk9va+AJL1
 I30l4USLDkTgf1AjYmk7yFIUTemCtwjfa0lsuu4l3rRJ3k1eBrtZe2cpWv2BoQDU
 by0sNnDelzJTQ9/v1i3J
 =OYiT
 -----END PGP SIGNATURE-----

Merge tag 'drm-qemu-20161121' of git://git.kraxel.org/linux into drm-next

drm/virtio: fix busid in a different way, allocate more vbufs.
drm/qxl: various bugfixes and cleanups,

* tag 'drm-qemu-20161121' of git://git.kraxel.org/linux: (224 commits)
  drm/virtio: allocate some extra bufs
  qxl: Allow resolution which are not multiple of 8
  qxl: Don't notify userspace when monitors config is unchanged
  qxl: Remove qxl_bo_init() return value
  qxl: Call qxl_gem_{init, fini}
  qxl: Add missing '\n' to qxl_io_log() call
  qxl: Remove unused prototype
  qxl: Mark some internal functions as static
  Revert "drm: virtio: reinstate drm_virtio_set_busid()"
  drm/virtio: fix busid regression
  drm: re-export drm_dev_set_unique
  Linux 4.9-rc5
  gp8psk: Fix DVB frontend attach
  gp8psk: fix gp8psk_usb_in_op() logic
  dvb-usb: move data_mutex to struct dvb_usb_device
  iio: maxim_thermocouple: detect invalid storage size in read()
  aoe: fix crash in page count manipulation
  lightnvm: invalid offset calculation for lba_shift
  Kbuild: enable -Wmaybe-uninitialized warnings by default
  pcmcia: fix return value of soc_pcmcia_regulator_set
  ...
2016-11-30 14:18:51 +10:00
Dave Airlie
7b624ad8fe Linux 4.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJYHmoCAAoJEHm+PkMAQRiG7RMIAI2i7Y5hpL5yCxK5AFaL4u/G
 KxXfp1B1UanUTgjOmd7zGqtDYcFX9t7GTTUFixQ7/9Opr4PD9qbnatoDGSc3xjbT
 msDgA1B78F1/Q3kHWfeGq32MihQ4mj5NwUCo+igUcUvvWG7mHgzErj/Nh5RoobQX
 p/izdpTbrw3GX6xXB8olbG7XWHaVye/+TT3q6+gmgm8I/QEujcLeGoycE0zlhPN8
 FG/JX76At/+ZM2Py7Oxo3k+oKL9CHrtOQYDp/wN0uslV5eYvvkZz0/M1HMOGZt+c
 gZU5jzM17K7C4Nzo06WAuBU9wUBGc25m+cPicLlOmljnzfU+f50SKaDjZq3p7QI=
 =2KUF
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.9-rc4' into drm-next

Linux 4.9-rc4

This is needed for nouveau development.
2016-11-07 09:37:09 +10:00
Christian König
c24784f015 drm/amd: fix scheduler fence teardown order v2
Some fences might be alive even after we have stopped the scheduler leading
to warnings about leaked objects from the SLUB allocator.

Fix this by allocating/freeing the SLUB allocator from the module
init/fini functions just like we do it for hw fences.

v2: make variable static, add link to bug

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=97500

Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2016-10-31 12:43:20 -04:00
Chris Wilson
f54d186700 dma-buf: Rename struct fence to dma_fence
I plan to usurp the short name of struct fence for a core kernel struct,
and so I need to rename the specialised fence/timeline for DMA
operations to make room.

A consensus was reached in
https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html
that making clear this fence applies to DMA operations was a good thing.
Since then the patch has grown a bit as usage increases, so hopefully it
remains a good thing!

(v2...: rebase, rerun spatch)
v3: Compile on msm, spotted a manual fixup that I broke.
v4: Try again for msm, sorry Daniel

coccinelle script:
@@

@@
- struct fence
+ struct dma_fence
@@

@@
- struct fence_ops
+ struct dma_fence_ops
@@

@@
- struct fence_cb
+ struct dma_fence_cb
@@

@@
- struct fence_array
+ struct dma_fence_array
@@

@@
- enum fence_flag_bits
+ enum dma_fence_flag_bits
@@

@@
(
- fence_init
+ dma_fence_init
|
- fence_release
+ dma_fence_release
|
- fence_free
+ dma_fence_free
|
- fence_get
+ dma_fence_get
|
- fence_get_rcu
+ dma_fence_get_rcu
|
- fence_put
+ dma_fence_put
|
- fence_signal
+ dma_fence_signal
|
- fence_signal_locked
+ dma_fence_signal_locked
|
- fence_default_wait
+ dma_fence_default_wait
|
- fence_add_callback
+ dma_fence_add_callback
|
- fence_remove_callback
+ dma_fence_remove_callback
|
- fence_enable_sw_signaling
+ dma_fence_enable_sw_signaling
|
- fence_is_signaled_locked
+ dma_fence_is_signaled_locked
|
- fence_is_signaled
+ dma_fence_is_signaled
|
- fence_is_later
+ dma_fence_is_later
|
- fence_later
+ dma_fence_later
|
- fence_wait_timeout
+ dma_fence_wait_timeout
|
- fence_wait_any_timeout
+ dma_fence_wait_any_timeout
|
- fence_wait
+ dma_fence_wait
|
- fence_context_alloc
+ dma_fence_context_alloc
|
- fence_array_create
+ dma_fence_array_create
|
- to_fence_array
+ to_dma_fence_array
|
- fence_is_array
+ dma_fence_is_array
|
- trace_fence_emit
+ trace_dma_fence_emit
|
- FENCE_TRACE
+ DMA_FENCE_TRACE
|
- FENCE_WARN
+ DMA_FENCE_WARN
|
- FENCE_ERR
+ DMA_FENCE_ERR
)
 (
 ...
 )

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
2016-10-25 14:40:39 +02:00
Grazvydas Ignotas
9566213359 drm/amdgpu: update kernel-doc for some functions
The names were wrong.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-24 17:01:07 -04:00
Grazvydas Ignotas
a053fb7e51 drm/amdgpu: fix sched fence slab teardown
To free fences, call_rcu() is used, which calls amd_sched_fence_free()
after a grace period. During teardown, there is no guarantee all
callbacks have finished, so sched_fence_slab may be destroyed before
all fences have been freed. If we are lucky, this results in some slab
warnings, if not, we get a crash in one of rcu threads because callback
is called after amdgpu has already been unloaded.

Fix it with a rcu_barrier().

Fixes: 189e0fb763 ("drm/amdgpu: RCU protected amd_sched_fence_release")
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-24 17:00:38 -04:00
Christian König
bdf001374b drm/amdgpu: fix timeout value check in amd_sched_job_recovery
Could be that we don't actually have a timeout set.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-08-19 12:29:20 -04:00
Chunming Zhou
1c62cf91e6 drm/amd: fix deadlock of job_list_lock V2
run_job involves mutex, which could sleep.

V2: use list_for_each_entry_safe, since the job might complete
while we dropped the lock.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-29 14:37:06 -04:00
Chunming Zhou
bdc2eea472 drm/amd: reset hw count when reset job
Means the hw ring is empty after gpu reset.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-29 14:37:05 -04:00
Chunming Zhou
ec75f573c3 drm/amdgpu: add amd_sched_job_recovery
Which is to recover hw jobs when gpu reset.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 15:06:15 -04:00
Chunming Zhou
e686e75dac drm/amd: add amd_sched_hw_job_reset
amd_sched_hw_job_reset will remove callback from hw fence.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 15:06:13 -04:00
Chunming Zhou
754ce0fa55 drm/amd: add parent for sched fence
Parent of sched fence is hw fence which is to signal sched fence.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 15:06:13 -04:00
Christian König
595a9cd68c drm/amdgpu: remove fence parameter from amd_sched_job_init
We return the fence as part of the job structur anyway,
no need to do this twice.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 15:06:11 -04:00
Christian König
1059e117cc drm/amdgpu: stop disabling irqs when it isn't neccessary
A regular spin_lock/unlock should do here as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 15:06:04 -04:00
Chunming Zhou
0875dc9e80 drm/amdgpu: block scheduler when gpu reset
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:54:43 -04:00
Christian König
a8bd3e1c71 drm/amdgpu: stop trying to schedule() with a spin held
Drop the lock before calling cancel_delayed_work_sync().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96445

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:51:31 -04:00
Christian König
6fc1367582 drm/amdgpu: generalize the scheduler fence
Make it two events, one for the job being scheduled and one when it is finished.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:51:20 -04:00
Christian König
c5f74f7802 drm/amdgpu: fix and cleanup job destruction
Remove the job reference counting and just properly destroy it from a
work item which blocks on any potential running timeout handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:54 -04:00
Christian König
f42d20a942 drm/amdgpu: move locking into the functions who need it
Otherwise the locking becomes rather confusing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:53 -04:00
Christian König
0e51a772e2 drm/amdgpu: properly abstract scheduler timeout handling
The driver shouldn't mess with the scheduler internals.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:53 -04:00
Christian König
1e24e31f22 drm/amdgpu: remove use_shed hack in job cleanup
Remembering the code path in a variable to cleanup
differently is usually not a good idea at all.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:52 -04:00
Christian König
20df080da2 drm/amdgpu: remove duplicated timeout callback
No need for double housekeeping here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:51 -04:00
Christian König
7392c329ee drm/amdgpu: remove begin_job/finish_job
Completely pointless and confusing to use a callback
to call into the same code file.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:50 -04:00
Christian König
16a7133f35 drm/amdgpu: fix coding style in the scheduler v2
v2: fix even more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07 14:50:50 -04:00
Christian König
edf600dac6 drm/amd: cleanup remaining spaces and tabs v2
This is the result of running the following commands:
find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/[ \t]\+$//' {} \;
find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/[ \t]\+$//' {} \;
find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/ \+\t/\t/' {} \;
find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/ \+\t/\t/' {} \;

v2: drop changes to DAL and internal headers

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-11 12:31:20 -04:00
Nils Wallménius
62250a910a drm/amd/scheduler: Mark amdgpu_sched_ops const
This marks the struct amdgpu_sched_ops const and
adjusts amd_sched_init to take a const pointer
for the ops param. The ops member of
struct amd_gpu_scheduler is also changed to const.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-04 20:20:05 -04:00
Monk Liu
b6723c8da5 drm/amdgpu: use ref to keep job alive
this is to fix fatal page fault error that occured if:
job is signaled/released after its timeout work is already
put to the global queue (in this case the cancel_delayed_work
will return false), which will lead to NX-protection error
page fault during job_timeout_func.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:20:07 -04:00
Monk Liu
0de2479c95 drm/amdgpu: rework TDR in scheduler (v2)
Add two callbacks to scheduler to maintain jobs, and invoked for
job timeout calculations. Now TDR measures time gap from
job is processed by hw.

v2:
fix typo

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:19:57 -04:00
Monk Liu
cccd9bce97 drm/amdgpu: get rid of incorrect TDR
original time out detect routine is incorrect, cuz it measures
the gap from job scheduled, but we should only measure the
gap from processed by hw.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:19:42 -04:00
Monk Liu
4835096b07 drm/amdgpu: put job to list before done
the mirror_list will be used for later time out detect
feature.  This is needed to properly detect a GPU
timeout with the scheduler.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:17:53 -04:00
Monk Liu
e472d2588e drm/amdgpu: delay job free to when it's finished (v2)
for those jobs submitted through scheduler, do not
free it immediately after scheduled, instead free it
in global workqueue by its sched fence signaling
callback function.

v2:
call uf's bo_undef after job_run()
call job's sync free after job_run()
no static inline __amdgpu_job_free() anymore, just use
kfree(job) to replace it.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:17:41 -04:00
Monk Liu
e686941a32 drm/amdgpu: use sched_job_init to initialize sched_job
Consolidate job initialization in one place rather than
duplicating it in multiple places.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:12:59 -04:00
Christian König
189e0fb763 drm/amdgpu: RCU protected amd_sched_fence_release
Fences must be freed RCU protected, otherwise the reservation_object_*_rcu()
functions can run into problems.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-16 17:59:01 -04:00
Monk Liu
777dbd458c drm/amdgpu: drop a dummy wakeup scheduler
since the dependency job is also scheduled by the same
scheduler with the job depended on it, no need to
call wake up scheduler when the dep is scheduled.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-02-10 14:17:02 -05:00
Chunming Zhou
e8deea2d4b drm/amdgpu: add entity only when first job come
umd somtimes will create a context for every ring,
that means some entities wouldn't be used at all.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-14 19:41:19 -05:00
Nicolai Hähnle
786b521908 drm/amdgpu: fix race condition in amd_sched_entity_push_job
As soon as we leave the spinlock after the job has been added to the job
queue, we can no longer rely on the job's data to be available.

I have seen a null-pointer dereference due to sched == NULL in
amd_sched_wakeup via amd_sched_entity_push_job and
amd_sched_ib_submit_kernel_helper. Since the latter initializes
sched_job->sched with the address of the ring scheduler, which is
guaranteed to be non-NULL, this race appears to be a likely culprit.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-04 11:26:52 -05:00
Chunming Zhou
d033a6de80 drm/amd: abstract kernel rq and normal rq to priority of run queue
Allows us to set priorities in the scheduler.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
2015-12-02 15:54:33 -05:00