find_vma_intersection(mm, start, end) only guarantees that end is greater
than or equal to vma->vm_start but doesn't guarantee that start is
greater than or equal to vma->vm_start. The calculation for the
intersecting range in nouveau_svmm_bind() isn't accounting for this and
can call migrate_vma_setup() with a starting address less than
vma->vm_start. This results in migrate_vma_setup() returning -EINVAL for
the range instead of nouveau skipping that part of the range and migrating
the rest.
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
As there is no need to check for the return value of debugfs_create_file
and drm_debugfs_create_files, remove unnecessary checks and error
handling in nouveau_drm_debugfs_init.
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Huge page-table entries for TTM
In order to reduce CPU usage [1] and in theory TLB misses this patchset enables
huge- and giant page-table entries for TTM and TTM-enabled graphics drivers.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325073102.6129-1-thomas_os@shipmail.org
A bit smaller this time around.. there are still a couple uabi
additions for vulkan waiting in the wings, but I punted on them this
cycle due to running low on time. (They should be easy enough to
rebase, and if it is a problem for anyone I can push a next+uabi
branch so that tu work can proceed.)
The bigger change is refactoring dpu resource manager and moving dpu
to use atomic global state. Other than that, it is mostly cleanups
and fixes.
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGuf1R4Xz-t9Z7_cwx9jD=b4wUvvwfqA5cHR8fCSXSd5XQ@mail.gmail.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl6BIG4eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGlHUH/RCFve2sfHRPjRW+
xR5SaLVAw6XKvtKBq7yvKmHEwqNJnL79IHyqqtSrtfFr2FfaH/KvYiCbbAezvSrM
np0udGu7STKGd21CWuyEZJudyhXkOwMRNiFiCXWp7rs35oh8T0TpJxMzo2Nc1nLk
JFQPqAP6OSvq4IkWEywKQI+Au3Z1IBf83xVjZ1s+MKPQHYD49x2hc4cQntL5/cnm
a3DoR2iBkYiGZCZ9dDqAqJTnMQIiCbACdZXgGjNRUpdyA/dtAjsMl11NPYHm8TA2
3AHBupAK50WBZGad6xv2qKQyScsmoJG2mv92QjlOFz0Tpiu6rLnDlLYREDVB6YH6
qbLDsc8=
=XEIU
-----END PGP SIGNATURE-----
Merge v5.6 into drm-next
msm needed rc6, so I just went and merged release
(msm has been in drm-next outside of this tree)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Ice Lake RPS; Sandy Bridge RC6; and few others around
GT hangchec/reset; livelock; and a null dereference.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAl59tisACgkQ+mJfZA7r
E8rKBgf/Vo+ZCQ33KLmthESO1QCENOd3QNknwZ454/rTM611x1x/xQ+j9Cv9z742
/oshYQyPzbqjDE8TYl77MYKPdn93EIfVlJOj1jmwlYms1WUlLAkIR7HgBydB0j3L
W0WMBw5BQ5b15leS/MlAJ1CKj/M6hX5pWYNGoxa8A/q3kErQFAw22J/XuvrSvKLc
lHcwLsLCq6O5M3xe1ztZZ/T5Kh1lKgTWZhG/jchxMe0Lr7R+CPUM7unU9kcBX3ml
gEFIm2a4aH6ARnq8QHnu0ziyzOP13O2DZ27+xjtKP/NlFABC8XbFlj/N13kumfuk
uKMwX2O8PcZykStYyINc1DIL1FJmog==
=2wdN
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-fixes-2020-03-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Fixes for instability on Baytrail and Haswell;
Ice Lake RPS; Sandy Bridge RC6; and few others around
GT hangchec/reset; livelock; and a null dereference.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200327081607.GA3082710@intel.com
amd-drm-next-5.7-2020-03-26:
amdgpu:
- Remove a dpm quirk that is not necessary
- Fix handling of AC/DC mode in newer SMU firmwares on navi
- SR-IOV fixes
- RAS fixes
scheduler:
- Fix a race condition
radeon:
- Remove a dpm quirk that is not necessary
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326155310.5486-1-alexander.deucher@amd.com
This patch fixes the private_flags of mode to be checked and
compared against uapi.mode and not from hw.mode. This helps
properly trigger modeset at boot if desired by driver.
It helps resolve audio_codec initialization issues if display
is connected at boot. Initial discussion on this issue has happened
on below thread:
https://patchwork.freedesktop.org/series/74828/
v2: No functional change. Fixed the Closes tag and added
Maarten's RB.
v3: Added Fixes tag.
Cc: Ville Syrjä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Cc: Souza, Jose <jose.souza@intel.com>
Fixes: 58d124ea27 ("drm/i915: Complete crtc hw/uapi split, v6.")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1363
Suggested-by: Ville Syrjä <ville.syrjala@linux.intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: SweeAun Khor <swee.aun.khor@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326125111.11081-1-uma.shankar@intel.com
(cherry picked from commit d5e5670592)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We move the virtual breadcrumb from one physical engine to the next, if
the next virtual request is scheduled on a new physical engine. Since
the virtual context can only be in one signal queue, we need it to track
the current physical engine for the new breadcrumbs. However, to move
the list we need both breadcrumb locks -- and since we cannot take both
at the same time (unless we are careful and always ensure consistent
ordering) stage the movement of the signaler via the current virtual
request.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1510
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325130059.30600-1-chris@chris-wilson.co.uk
(cherry picked from commit 6c81e21a47)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On Ivybridge, we can go lower than rc6 to rc6p. And this is required for
Ivybridge to hit the same minimum power consumption as rc6 on other
platforms, so make it so.
v2: Update selftest to include all rc6 residency counters
Note that Andi did mention that we should be converting the magic
numbers into opaque magic macros, so if they ever get reused (unlikely
given only Ivybridge used the extra modes) we'll need to pay back the
technical debt.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1518
Fixes: 730eaeb524 ("drm/i915/gt: Manual rc6 entry upon parking")
Testcase: igt/i915_pm_rc6_residency/rc6-idle
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200324134232.8773-1-chris@chris-wilson.co.uk
(cherry picked from commit 13c5a577b3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Abuse^W Take advantage that we know we are inside the GT wakeref and
that prevents any client execbuf from reopening the i915_vma in order to
claim all the vma to close without having to drop the spinlock to free
each one individually. By keeping the spinlock, we do not have to
restart if we run concurrently with i915_gem_free_objects -- which
causes them both to restart continually and make very very slow
progress.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1361
Fixes: 77853186e5 ("drm/i915: Claim vma while under closed_lock in i915_vma_parked()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-2-chris@chris-wilson.co.uk
(cherry picked from commit 3447c4c55d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If we park/unpark faster than we can respond to RPS events, we never
will process a downclock event after expiring a waitboost, and thus we
will forever restart the GPU at max clocks even if the workload switches
and doesn't justify full power.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1500
Fixes: 3e7abf8141 ("drm/i915: Extract GT render power state management")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200322163225.28791-1-chris@chris-wilson.co.uk
Cc: <stable@vger.kernel.org> # v5.5+
(cherry picked from commit 21abf0bf16)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Use the restored ability to check if a context is closed to decide
whether or not to immediately ban the context from further execution
after a hang.
Fixes: be90e34483 ("drm/i915/gt: Cancel banned contexts after GT reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-2-chris@chris-wilson.co.uk
(cherry picked from commit 8e37d69913)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
I need to keep the GEM context around a bit longer so adding an explicit
flag for syncing execbuf with closed/abandonded contexts.
v2:
* Use already available context flags. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk
(cherry picked from commit 207e4a71fb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
->dma_fence_signal
->dma_fence_signal_locked
->test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.
->dma_fence_put
->dma_fence_release
->drm_sched_fence_release_scheduled
->call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list
Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal
[ 732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 732.914815] #PF: supervisor write access in kernel mode
[ 732.915731] #PF: error_code(0x0002) - not-present page
[ 732.916621] PGD 0 P4D 0
[ 732.917072] Oops: 0002 [#1] SMP PTI
[ 732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 5.4.0-rc7 #1
[ 732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[ 732.938569] Call Trace:
[ 732.939003] <IRQ>
[ 732.939364] dma_fence_signal+0x29/0x50
[ 732.940036] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[ 732.940996] drm_sched_process_job+0x34/0xa0 [gpu_sched]
[ 732.941910] dma_fence_signal_locked+0x85/0x100
[ 732.942692] dma_fence_signal+0x29/0x50
[ 732.943457] amdgpu_fence_process+0x99/0x120 [amdgpu]
[ 732.944393] sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
v2: hold the finished fence at drm_sched_process_job instead of
amdgpu_fence_process
v3: resume the blank line
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1) SRIOV guest KMD doesn't care training buffer
2) if we resered training buffer that will overlap with IP discovery
reservation because training buffer is at vram_size - 0x8000 and
IP discovery is at ()vram_size - 0x10000 => vram_size -1)
v2: squash in warning fix from Nirmoy
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver needs to send the ack message when it receives the
AC/DC interrupt from the SMU.
TODO: verify the client and src ids.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For boards that do not support automatic AC/DC transitions
in firmware, manually tell the firmware when the status
changes.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check the platform caps in the vbios pptable to decide
whether to enable automatic AC/DC transitions.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check of the pointer exists and we are actually on AC power.
v2: fix error message to reflect AC/DC mode.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PMFW may boots those ASICs with DC mode. Need to set it back
to AC mode.
v2: split from Evan's original patch (Alex)
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a common smu11 helper to set the AC/DC power source.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is needed to tell the SMU firmware what state is in
in certain cases. DC mode does not allow overclocking
for example.
v2: split Evan's original patch (Alex)
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
->dma_fence_signal
->dma_fence_signal_locked
->test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.
->dma_fence_put
->dma_fence_release
->drm_sched_fence_release_scheduled
->call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list
Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal
[ 732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 732.914815] #PF: supervisor write access in kernel mode
[ 732.915731] #PF: error_code(0x0002) - not-present page
[ 732.916621] PGD 0 P4D 0
[ 732.917072] Oops: 0002 [#1] SMP PTI
[ 732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 5.4.0-rc7 #1
[ 732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[ 732.938569] Call Trace:
[ 732.939003] <IRQ>
[ 732.939364] dma_fence_signal+0x29/0x50
[ 732.940036] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[ 732.940996] drm_sched_process_job+0x34/0xa0 [gpu_sched]
[ 732.941910] dma_fence_signal_locked+0x85/0x100
[ 732.942692] dma_fence_signal+0x29/0x50
[ 732.943457] amdgpu_fence_process+0x99/0x120 [amdgpu]
[ 732.944393] sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
v2: hold the finished fence at drm_sched_process_job instead of
amdgpu_fence_process
v3: resume the blank line
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set ComputePGMRSRC1.VGPRS as 0x3f to clear all ArcVGPRs.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
events at vsartup for DCN")' introduces a new way of pageflip
completion handling for DCN, and some trouble.
The current implementation introduces a race condition, which
can cause pageflip completion events to be sent out one vblank
too early, thereby confusing userspace and causing flicker:
prepare_flip_isr():
1. Pageflip programming takes the ddev->event_lock.
2. Sets acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED
3. Releases ddev->event_lock.
--> Deadline for surface address regs double-buffering passes on
target pipe.
4. dc_commit_updates_for_stream() MMIO programs the new pageflip
into hw, but too late for current vblank.
=> pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
in current vblank due to missing the double-buffering deadline
by a tiny bit.
5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
dm_dcn_crtc_high_irq() gets called.
6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
pageflip has been completed/will complete in this vblank and
sends out pageflip completion event to userspace and resets
pflip_status = AMDGPU_FLIP_NONE.
=> Flip completion event sent out one vblank too early.
This behaviour has been observed during my testing with measurement
hardware a couple of time.
The commit message says that the extra flip event code was added to
dm_dcn_crtc_high_irq() to prevent missing to send out pageflip events
in case the pflip irq doesn't fire, because the "DCH HUBP" component
is clock gated and doesn't fire pflip irqs in that state. Also that
this clock gating may happen if no planes are active. This suggests
that the problem addressed by that commit can't happen if planes
are active.
The proposed solution is therefore to only execute the extra pflip
completion code iff the count of active planes is zero and otherwise
leave pflip completion handling to the pflip irq handler, for a
more race-free experience.
Note that i don't know if this fixes the problem the original commit
tried to address, as i don't know what the test scenario was. It
does fix the observed too early pageflip events though and points
out the problem introduced.
Fixes: 16f17eda8b ("drm/amd/display: Send vblank and user events at vsartup for DCN")
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- fix for potential out-of-bounds reads in the perfmon ioctl
implementation from Christian
- override to expose proper feature flags for the GC400 found on the
STM32MP1 SoC, also from Christian
- Guido fixed an issue where we would spuriously fail to enter
runtime suspend due to a new GPU engine status bit on GC7000
- tree-wide change from Gustavo to get rid of zero-length arrays
- fix for missed TS cache flush on GC7000, leading to spurious
MMU faults from me
- request pages from DMA32 zone on systems where we can't address
all present memory from me
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/74d9c6d19099fdba6c6795204a6aa445b7930c79.camel@pengutronix.de
Start using the helpers that align buffer object user-space addresses and
buffer object vram addresses to huge page boundaries.
This is to improve the chances of allowing huge page-table entries.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Using huge page-table entries requires that the physical address of the
start of a buffer object is huge page size aligned.
Make a special version of the TTM range manager that accomplishes this,
but falls back to a smaller page size alignment (PUD->PMD, PMD->NORMAL)
to avoid eviction.
If other drivers want to use it in the future, it can be made a
TTM generic helper. Note that drivers can force eviction for a certain
alignment by assigning the TTM GPU alignment correspondingly.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Unaligned virtual addresses makes it unlikely that huge page-table entries
can be used.
So align virtual buffer object address huge page boundaries to the
underlying physical address huge page boundaries taking buffer object
sizes into account to determine when it might be possible to use huge
page-table entries.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
With vmwgfx dirty-tracking we need a specialized huge_fault
callback. Implement and hook it up.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Support huge (PMD-size and PUD-size) page-table entries by providing a
huge_fault() callback.
We still support private mappings and write-notify by splitting the huge
page-table entries on write-access.
Note that for huge page-faults to occur, either the kernel needs to be
compiled with trans-huge-pages always enabled, or the kernel needs to be
compiled with trans-huge-pages enabled using madvise, and the user-space
app needs to call madvise() to enable trans-huge pages on a per-mapping
basis.
Furthermore huge page-faults will not succeed unless buffer objects and
user-space addresses are aligned on huge page size boundaries.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
___
v2: Use 2.18 instead of 2.17
Add a new param for user-space to determine if kernel module is SM5
capable.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Surface define v4 added new member buffer_byte_stride. With this patch
add buffer_byte_stride in surface metadata and create surface using new
command if support is available.
Also with this patch replace device specific data types with kernel
types.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Makes surface_define cleaner by sending vmw_surface_metadata instead of
all the arguments individually.
v2: fix uninitialized return value, error message
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Create a new structure vmw_surface_metadata representing the metadata
used for creating surface. With this can make the surface_define_priv
a bit cleaner.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
With SM5 capability a new version of streamoutput is supported by device
which need backing mob and a new field. With this change the new command
is supported in command buffer.
v2: Also track streamoutput context binding in binding manager.
v3: Track only one streamoutput as only one can be set to context.
v4: Fix comment typos
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Previous name vmw_ctx_bindinfo_so is misleading because it actually
represent so target and stream output is a new resource type that needs
tracking for SM5 capable device. Also rename binding type enum and
internal functions to reflect these belongs to so targets.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Validate indirect and dispatch commands in command buffer.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>