linux

Author	SHA1	Message	Date
Kumar, Mahesh	eb2fdcdf82	drm/i915/skl+: Watermark calculation cleanup This patch cleanup/reorganises the watermark calculation functions. This patch make use of already available macro "drm_atomic_crtc_state_for_each_plane_state" to walk through plane_state list instead of calculating plane_state in function itself. This restructuring will help later patch for new DDB allocation algorithm to do only algo related changes. Changes from V1: - split the patch in two parts as per Matt's comment Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-9-mahesh1.kumar@intel.com	2017-05-17 14:33:00 -07:00
Kumar, Mahesh	5ba6faafbe	drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation DDB minimum requirement of crtc configuration (cumulative of all the enabled planes in crtc) may exceed the allocated DDB for crtc/pipe. This patch make changes to fail the flip/ioctl if minimum requirement for pipe exceeds the total ddb allocated to the pipe. Previously it succeeded but making alloc_size a negative value. Which will make subsequent calculations for plane ddb allocation bogus & may lead to screen corruption or system hang. Changes from V1: - Improve commit message as per Ander's comment - Remove extra parentheses (Ander) Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-8-mahesh1.kumar@intel.com	2017-05-17 14:32:57 -07:00
Kumar, Mahesh	336031ea1c	drm/i915/skl+: no need to memset again We are already doing memset of ddb structure at the begining of skl_allocate_pipe_ddb function, No need to again do a memset. Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-7-mahesh1.kumar@intel.com	2017-05-17 14:32:53 -07:00
Kumar, Mahesh	7b75119c8b	drm/i915/skl: Fail the flip if no FB for WM calculation Fail the flip if no FB is present but plane_state is set as visible. Above is not a valid combination so instead of continue fail the flip. Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-6-mahesh1.kumar@intel.com	2017-05-17 14:32:44 -07:00
Kumar, Mahesh	7084b50bdd	drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point This patch make changes to calculate adjusted plane pixel rate & plane downscale amount using fixed_point functions available. This patch will give uniformity in code, & will help to avoid mixing of 32bit uint32_t variable for fixed-16.16 with fixed_16_16_t variables in later patch in the series. Changes from V1: - Rebase based on wrapper name change - Remove unnecessary comment Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-5-mahesh1.kumar@intel.com	2017-05-17 14:32:32 -07:00
Kumar, Mahesh	d273ecce3d	drm/i915: Use fixed_16_16 wrapper for division operation Don't use fixed_16_16 structure members directly, instead use wrapper to perform fixed_16_16 division operation. Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-4-mahesh1.kumar@intel.com	2017-05-17 14:32:28 -07:00
Kumar, Mahesh	a9d055de65	drm/i915: Add more wrapper for fixed_point_16_16 operations This patch adds few wrapper to perform fixed_point_16_16 operations mul_round_up_u32_fixed16 : Multiplies u32 and fixed_16_16_t variables & returns u32 result with rounding-up. mul_fixed16 : Multiplies two fixed_16_16_t variable & returns fixed_16_16 div_round_up_fixed16 : Perform division operation on fixed_16_16_t variables & return u32 result with round-off div_round_up_u32_fixed16 : devide uint32_t variable by fixed_16_16 variable and round_up the result to uint32_t. These wrappers will be used by later patches in the series. Changes from V1: - Rename wrapper as per Matt's comment Changes from V2: - Fix indentation Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-3-mahesh1.kumar@intel.com	2017-05-17 14:28:37 -07:00
Kumar, Mahesh	afbc95cd0c	drm/i915: fix naming of fixed_16_16 wrapper. fixed_16_16_div_round_up(_u64), wrapper for fixed_16_16 division operation don't really round_up the result. Wrapper round_up only the fraction part of the result to make it 16-bit. This patch eliminates round_up keyword from the wrapper. Later patch will introduce the new wrapper to do rounding-off the result and give unt32_t output to cleanup mix use of fixed_16_16_t & uint32_t variables. Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-2-mahesh1.kumar@intel.com	2017-05-17 14:28:10 -07:00
Chris Wilson	955a4b8979	drm/i915: Don't force serialisation on marking up execlists irq posted Since we coordinate with the execlists tasklet using a locked schedule operation that ensures that after we set the engine->irq_posted we always have an invocation of the tasklet, we do not need to use a locked operation to set the engine->irq_posted itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-12-chris@chris-wilson.co.uk	2017-05-17 13:38:15 +01:00
Chris Wilson	5d3d69d5c1	drm/i915: Stop inlining the execlists IRQ handler As the handler is now quite complex, involving a few atomics, the cost of the function preamble is negligible in comparison and so we should leave the function out-of-line for better I$. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-11-chris@chris-wilson.co.uk	2017-05-17 13:38:14 +01:00
Chris Wilson	349bdb68bd	drm/i915/execlists: Reduce lock contention between schedule/submit_request If we do not require to perform priority bumping, and we haven't yet submitted the request, we can update its priority in situ and skip acquiring the engine locks -- thus avoiding any contention between us and submit/execute. v2: Remove the stack element from the list if we can do the early assignment. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-10-chris@chris-wilson.co.uk	2017-05-17 13:38:13 +01:00
Chris Wilson	c5cf9a9147	drm/i915: Create a kmem_cache to allocate struct i915_priolist from The i915_priolist are allocated within an atomic context on a path where we wish to minimise latency. If we use a dedicated kmem_cache, we have the advantage of a local freelist from which to service new requests that should keep the latency impact of an allocation small. Though currently we expect the majority of requests to be at default priority (and so hit the preallocate priolist), once userspace starts using priorities they are likely to use many fine grained policies improving the utilisation of a private slab. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-9-chris@chris-wilson.co.uk	2017-05-17 13:38:12 +01:00
Chris Wilson	6c067579e6	drm/i915: Split execlist priority queue into rbtree + linked list All the requests at the same priority are executed in FIFO order. They do not need to be stored in the rbtree themselves, as they are a simple list within a level. If we move the requests at one priority into a list, we can then reduce the rbtree to the set of priorities. This should keep the height of the rbtree small, as the number of active priorities can not exceed the number of active requests and should be typically only a few. Currently, we have ~2k possible different priority levels, that may increase to allow even more fine grained selection. Allocating those in advance seems a waste (and may be impossible), so we opt for allocating upon first use, and freeing after its requests are depleted. To avoid the possibility of an allocation failure causing us to lose a request, we preallocate the default priority (0) and bump any request to that priority if we fail to allocate it the appropriate plist. Having a request (that is ready to run, so not leading to corruption) execute out-of-order is better than leaking the request (and its dependency tree) entirely. There should be a benefit to reducing execlists_dequeue() to principally using a simple list (and reducing the frequency of both rbtree iteration and balancing on erase) but for typical workloads, request coalescing should be small enough that we don't notice any change. The main gain is from improving PI calls to schedule, and the explicit list within a level should make request unwinding simpler (we just need to insert at the head of the list rather than the tail and not have to make the rbtree search more complicated). v2: Avoid use-after-free when deleting a depleted priolist v3: Michał found the solution to handling the allocation failure gracefully. If we disable all priority scheduling following the allocation failure, those requests will be executed in fifo and we will ensure that this request and its dependencies are in strict fifo (even when it doesn't realise it is only a single list). Normal scheduling is restored once we know the device is idle, until the next failure! Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk	2017-05-17 13:38:09 +01:00
Chris Wilson	e4f815f6bf	drm/i915: Use a define for the default priority [0] Explicitly assign the default priority, and give it a name. After much discussion, we have chosen to call it I915_PRIORITY_NORMAL! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-7-chris@chris-wilson.co.uk	2017-05-17 13:38:08 +01:00
Chris Wilson	a4b2b01523	drm/i915: Don't mark an execlists context-switch when idle If we know that the engine is idle, i.e. we have not more contexts in flight, we can skip any spurious CSB idle interrupts. These spurious interrupts seem to arrive long after we assert that the engines are completely idle, triggering later assertions: [ 178.896646] intel_engine_is_idle(bcs): interrupt not handled, irq_posted=2 [ 178.896655] ------------[ cut here ]------------ [ 178.896658] kernel BUG at drivers/gpu/drm/i915/intel_engine_cs.c:226! [ 178.896661] invalid opcode: 0000 [#1] SMP [ 178.896663] Modules linked in: i915(E) x86_pkg_temp_thermal(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) intel_gtt(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) aesni_intel(E) prime_numbers(E) evdev(E) aes_x86_64(E) drm(E) crypto_simd(E) cryptd(E) glue_helper(E) mei_me(E) mei(E) lpc_ich(E) efivars(E) mfd_core(E) battery(E) video(E) acpi_pad(E) button(E) tpm_tis(E) tpm_tis_core(E) tpm(E) autofs4(E) i2c_i801(E) fan(E) thermal(E) i2c_designware_platform(E) i2c_designware_core(E) [ 178.896694] CPU: 1 PID: 522 Comm: gem_exec_whispe Tainted: G E 4.11.0-rc5+ #14 [ 178.896702] task: ffff88040aba8d40 task.stack: ffffc900003f0000 [ 178.896722] RIP: 0010:intel_engine_init_global_seqno+0x1db/0x1f0 [i915] [ 178.896725] RSP: 0018:ffffc900003f3ab0 EFLAGS: 00010246 [ 178.896728] RAX: 0000000000000000 RBX: ffff88040af54000 RCX: 0000000000000000 [ 178.896731] RDX: ffff88041ec933e0 RSI: ffff88041ec8cc48 RDI: ffff88041ec8cc48 [ 178.896734] RBP: ffffc900003f3ac8 R08: 0000000000000000 R09: 000000000000047d [ 178.896736] R10: 0000000000000040 R11: ffff88040b344f80 R12: 0000000000000000 [ 178.896739] R13: ffff88040bce0000 R14: ffff88040bce52d8 R15: ffff88040bce0000 [ 178.896742] FS: 00007f2cccc2d8c0(0000) GS:ffff88041ec80000(0000) knlGS:0000000000000000 [ 178.896746] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 178.896749] CR2: 00007f41ddd8f000 CR3: 000000040bb03000 CR4: 00000000001406e0 [ 178.896752] Call Trace: [ 178.896768] reset_all_global_seqno.part.33+0x4e/0xd0 [i915] [ 178.896782] i915_gem_request_alloc+0x304/0x330 [i915] [ 178.896795] i915_gem_do_execbuffer+0x8a1/0x17d0 [i915] [ 178.896799] ? remove_wait_queue+0x48/0x50 [ 178.896812] ? i915_wait_request+0x300/0x590 [i915] [ 178.896816] ? wake_up_q+0x70/0x70 [ 178.896819] ? refcount_dec_and_test+0x11/0x20 [ 178.896823] ? reservation_object_add_excl_fence+0xa5/0x100 [ 178.896835] i915_gem_execbuffer2+0xab/0x1f0 [i915] [ 178.896844] drm_ioctl+0x1e6/0x460 [drm] [ 178.896858] ? i915_gem_execbuffer+0x260/0x260 [i915] [ 178.896862] ? dput+0xcf/0x250 [ 178.896866] ? full_proxy_release+0x66/0x80 [ 178.896869] ? mntput+0x1f/0x30 [ 178.896872] do_vfs_ioctl+0x8f/0x5b0 [ 178.896875] ? ____fput+0x9/0x10 [ 178.896878] ? task_work_run+0x80/0xa0 [ 178.896881] SyS_ioctl+0x3c/0x70 [ 178.896885] entry_SYSCALL_64_fastpath+0x17/0x98 [ 178.896888] RIP: 0033:0x7f2ccb455ca7 [ 178.896890] RSP: 002b:00007ffcabec72d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 178.896894] RAX: ffffffffffffffda RBX: 000055f897a44b90 RCX: 00007f2ccb455ca7 [ 178.896897] RDX: 00007ffcabec74a0 RSI: 0000000040406469 RDI: 0000000000000003 [ 178.896900] RBP: 00007f2ccb70a440 R08: 00007f2ccb70d0a4 R09: 0000000000000000 [ 178.896903] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 178.896905] R13: 000055f89782d71a R14: 00007ffcabecf838 R15: 0000000000000003 [ 178.896908] Code: 00 31 d2 4c 89 ef 8d 70 48 41 ff 95 f8 06 00 00 e9 68 fe ff ff be 0f 00 00 00 48 c7 c7 48 dc 37 a0 e8 fa 33 d6 e0 e9 0b ff ff ff <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 On the other hand, by ignoring the interrupt do we risk running out of space in CSB ring? Testing for a few hours suggests not, i.e. that we only seem to get the odd delayed CSB idle notification. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-6-chris@chris-wilson.co.uk	2017-05-17 13:38:08 +01:00
Chris Wilson	77f0d0e925	drm/i915/execlists: Pack the count into the low bits of the port.request add/remove: 1/1 grow/shrink: 5/4 up/down: 391/-578 (-187) function old new delta execlists_submit_ports 262 471 +209 port_assign.isra - 136 +136 capture 6344 6359 +15 reset_common_ring 438 452 +14 execlists_submit_request 228 238 +10 gen8_init_common_ring 334 341 +7 intel_engine_is_idle 106 105 -1 i915_engine_info 2314 2290 -24 __i915_gem_set_wedged_BKL 485 411 -74 intel_lrc_irq_handler 1789 1604 -185 execlists_update_context 294 - -294 The most important change there is the improve to the intel_lrc_irq_handler and excclist_submit_ports (net improvement since execlists_update_context is now inlined). v2: Use the port_api() for guc as well (even though currently we do not pack any counters in there, yet) and hide all port->request_count inside the helpers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-5-chris@chris-wilson.co.uk	2017-05-17 13:38:06 +01:00
Chris Wilson	0ce8178808	drm/i915: Redefine ptr_pack_bits() and friends Rebrand the current (pointer \| bits) pack/unpack utility macros as explicit bit twiddling for PAGE_SIZE so that we can use the more flexible underlying macros for different bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk	2017-05-17 13:38:04 +01:00
Chris Wilson	991bfc64db	drm/i915: Make ptr_unpack_bits() more function-like ptr_unpack_bits() is a function-like macro, as such it is meant to be replaceable by a function. In this case, we should be passing in the out-param as a pointer. Bizarrely this does affect code generation: function old new delta i915_gem_object_pin_map 409 389 -20 An improvement(?) in this case, but one can't help wonder what strict-aliasing optimisations we are preventing. The generated code looks identical in using ptr_unpack_bits (no extra motions to stack, the pointer and bits appear to be kept in registers), the difference appears to be code ordering and with a reorder it is able to use smaller forward jumps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-3-chris@chris-wilson.co.uk	2017-05-17 13:38:03 +01:00
Chris Wilson	47624cc330	drm/i915: Import the kfence selftests for i915_sw_fence A long time ago, I wrote some selftests for the struct kfence idea. Now that we have infrastructure in i915/igt for running kselftests, include some for i915_sw_fence. v2: INIT_WORK_ONSTACK/destroy_work_on_stack (Mika) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-2-chris@chris-wilson.co.uk	2017-05-17 13:38:02 +01:00
Chris Wilson	9310cb7f8d	drm/i915: Remove kref from i915_sw_fence My original intention was for i915_sw_fence to be the base class and provide the reference count for the container. This was from starting with a design to handle async_work. In practice, for i915 we embed fences into structs which have their own independent reference counting, making the i915_sw_fence.kref duplicitous. If we remove the kref, we remove the i915_sw_fence's ability to free itself and its independence, it can only exist within a container and must be supplied with a callback to handle its release. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-1-chris@chris-wilson.co.uk	2017-05-17 13:38:01 +01:00
Arkadiusz Hiler	0b71cea29f	drm/i915/gen9: Reintroduce WaEnableYV12BugFixInHalfSliceChicken7 This basically reverts commit `465418c606` ("drm/i915/gen9: Remove WaEnableYV12BugFixInHalfSliceChicken7") with small addition - marking it as affecting GLK as well. It was incorrectly considered fixed in production steppings. References: HSD#2126385, HSD#2131381, HSDES#1504433555, BSID#0764 Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [Mika: s/KBL/GLK on commit message] Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170512112015.19082-1-arkadiusz.hiler@intel.com	2017-05-17 14:30:27 +03:00
Matthew Auld	d567232cbd	drm/i915: use vma->size for appgtt allocate_va_range For the aliasing ppgtt we clear the va range up to vma->size, but seem to allocate up to vma->node.size, which is a little inconsistent given that vma->node.size >= vma->size. Not that is really matters all that much since we preallocate anyway, but for consistency just use vma->size. Fixes: `ff685975d9` ("drm/i915: Move allocate_va_range to GTT") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170516085514.5853-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-16 11:26:36 +01:00
Gustavo A. R. Silva	efd38b68c7	gpu: drm: i915: compress logic into one line Simplify logic to avoid unnecessary variable declaration and assignment. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170515220028.GA15149@embeddedgus	2017-05-16 11:28:01 +02:00
Gustavo A. R. Silva	cbaa331504	gpu: drm: i915: remove dead code Local variable has_reduced_clock is assigned to a constant value and it is never updated again. Remove this variable and the dead code it guards. Addresses-Coverity-ID: 1362230 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170515215605.GA14963@embeddedgus	2017-05-16 11:27:54 +02:00
Colin Ian King	9a09485d41	drm/i915/guc:fix spelling mistake: "adddress" -> "address" Trivial fix to spelling mistake in seq_printf message. Fixes: `a8b9370fc7` ("drm/i915/guc: Dump the GuC stage descriptor pool in debugfs") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170516092235.28640-1-colin.king@canonical.com	2017-05-16 11:26:41 +02:00
Madhav Chauhan	1fdd783e9c	drm/i915/glk: Calculate high/low switch count for GLK As per BSPEC, high/low switch count to be programmed in terms of byteclock using exit_zero_count and prep_count. For Geminilake exit/prep counts are already calculated in terms of byteclock. This patch calculates high/low switch count using counts value in byteclock, old calculation leads to screen flicker/shift issue while resuming from S3/S4. Signed-off-by: Madhav Chauhan <madhav.chauhan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1494336565-19185-1-git-send-email-madhav.chauhan@intel.com	2017-05-15 18:29:46 +03:00
Chris Wilson	9081d08056	drm/i915: Fixup 64bit divides in timelines selftest Some 64b divides snuck in when doing the prng timing compensation. Fixes: `4797948071` ("drm/i915: Squash repeated awaits on the same fence") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170513094154.3581-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-05-15 14:40:44 +01:00
Daniel Vetter	2388cd9c50	drm/i915: Update DRIVER_DATE to 20170515 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-05-15 09:11:48 +02:00
Robert Bragg	712122eaa1	drm/i915/perf: rate limit spurious oa report notice This change is pre-emptively aiming to avoid a potential cause of kernel logging noise in case some condition were to result in us seeing invalid OA reports. The workaround for the OA unit's tail pointer race condition is what avoids the primary known cause of invalid reports being seen and with that in place we aren't expecting to see this notice but it can't be entirely ruled out. Just in case some condition does lead to the notice then it's likely that it will be triggered repeatedly while attempting to append a sequence of reports and depending on the configured OA sampling frequency that might be a large number of repeat notices. v2: (Chris) avoid inconsistent warning on throttle with printk_ratelimit() v3: (Matt) init and summarise with stream init/close not driver init/fini Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-9-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 11:03:43 +01:00
Robert Bragg	4117ebc74c	drm/i915/perf: better pipeline aged/aging tail updates This updates the tail pointer race workaround handling to updating the 'aged' pointer before looking to start aging a new one. There's the possibility that there is already new data available and so we can immediately start aging a new pointer without having to first wait for a later hrtimer callback (and then another to age). Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-8-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 11:03:17 +01:00
Robert Bragg	52c57c263f	drm/i915/perf: improve invalid OA format debug message A minor improvement to debugging output Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-7-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 11:02:40 +01:00
Robert Bragg	0dd860cf73	drm/i915/perf: improve tail race workaround There's a HW race condition between OA unit tail pointer register updates and writes to memory whereby the tail pointer can sometimes get ahead of what's been written out to the OA buffer so far (in terms of what's visible to the CPU). Although this can be observed explicitly while copying reports to userspace by checking for a zeroed report-id field in tail reports, we want to account for this earlier, as part of the _oa_buffer_check to avoid lots of redundant read() attempts. Previously the driver used to define an effective tail pointer that lagged the real pointer by a 'tail margin' measured in bytes derived from OA_TAIL_MARGIN_NSEC and the configured sampling frequency. Unfortunately this was flawed considering that the OA unit may also automatically generate non-periodic reports (such as on context switch) or the OA unit may be enabled without any periodic sampling. This improves how we define a tail pointer for reading that lags the real tail pointer by at least %OA_TAIL_MARGIN_NSEC nanoseconds, which gives enough time for the corresponding reports to become visible to the CPU. The driver now maintains two tail pointers: 1) An 'aging' tail with an associated timestamp that is tracked until we can trust the corresponding data is visible to the CPU; at which point it is considered 'aged'. 2) An 'aged' tail that can be used for read()ing. The two separate pointers let us decouple read()s from tail pointer aging. The tail pointers are checked and updated at a limited rate within a hrtimer callback (the same callback that is used for delivering POLLIN events) and since we're now measuring the wall clock time elapsed since a given tail pointer was read the mechanism no longer cares about the OA unit's periodic sampling frequency. The natural place to handle the tail pointer updates was in gen7_oa_buffer_is_empty() which is called as part of blocking reads and the hrtimer callback used for polling, and so this was renamed to oa_buffer_check() considering the added side effect while checking whether the buffer contains data. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-6-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 11:01:28 +01:00
Robert Bragg	3bb335c1e7	drm/i915/perf: no head/tail ref in gen7_oa_read This avoids redundantly passing an (inout) head and tail pointer to gen7_append_oa_reports() from gen7_oa_read which doesn't need to reference either itself. Moving the head/tail reads and writes into gen7_append_oa_reports should have no functional effect except to avoid some redundant head pointer writes in cases where nothing was copied to userspace. This is a stepping stone towards updating how the head and tail pointer state is managed to improve the workaround for the OA unit's tail pointer race. It reduces the number of places we need to read/write the head and tail pointers. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-5-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 11:00:43 +01:00
Robert Bragg	f279020a02	drm/i915/perf: avoid read back of head register There's no need for the driver to keep reading back the head pointer from hardware since the hardware doesn't update it automatically. This way we can treat any invalid head pointer value as a software/driver bug instead of spurious hardware behaviour. This change is also a small stepping stone towards re-working how the head and tail state is managed as part of an improved workaround for the tail register race condition. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-4-lionel.g.landwerlin@intel.com	2017-05-13 10:59:39 +01:00
Robert Bragg	26ebd9c734	drm/i915/perf: avoid poll, read, EAGAIN busy loops If the function for checking whether there is OA buffer data available (during a poll or blocking read) has false positives then we want to avoid a situation where the subsequent read() returns EAGAIN (after a more accurate check) followed by a poll() immediately reporting the same false positive POLLIN event and effectively maintaining a busy loop until there really is data. This makes sure that we clear the .pollin event status whenever we return EAGAIN to userspace which will throttle subsequent POLLIN events and repeated attempts to read to the 5ms intervals of the hrtimer callback we have. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-3-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 10:59:07 +01:00
Robert Bragg	e81b3a555f	drm/i915/perf: fix gen7_append_oa_reports comment If I'm going to complain about a back-to-front convention then the least I can do is not muddle the comment up too. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-2-lionel.g.landwerlin@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-13 10:58:30 +01:00
Puthikorn Voravootivat	2630addf6b	drm/i915: Restore brightness level in aux backlight driver Some panel will default to zero brightness when turning the panel off and on again. This patch restores last brightness level back when panel is turning back on. Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-8-puthik@chromium.org	2017-05-12 15:51:20 +03:00
Puthikorn Voravootivat	e9c9e5ae8b	drm/i915: Set backlight mode before enable backlight We should set backlight mode register before set register to enable the backlight. Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-6-puthik@chromium.org	2017-05-12 15:50:35 +03:00
Puthikorn Voravootivat	73ab484c90	drm/i915: Correctly enable backlight brightness adjustment via DPCD intel_dp_aux_enable_backlight() assumed that the register BACKLIGHT_BRIGHTNESS_CONTROL_MODE can only has value 01 (DP_EDP_BACKLIGHT_CONTROL_MODE_PRESET) when initialize. This patch fixed that by handling all cases of that register. Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-3-puthik@chromium.org	2017-05-12 15:47:09 +03:00
Puthikorn Voravootivat	a644ca9bcd	drm/i915: Fix cap check for intel_dp_aux_backlight driver intel_dp_aux_backlight driver should check for the DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAP before enable the driver. Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-2-puthik@chromium.org	2017-05-12 15:47:08 +03:00
Matthew Auld	1f23475c89	drm/i915: don't do allocate_va_range again on PIN_UPDATE If a vma is already bound to a ppgtt, we incorrectly call allocate_va_range again when doing a PIN_UPDATE, which will result in over accounting within our paging structures, such that when we do unbind something we don't actually destroy the structures and end up inadvertently recycling them. In reality this probably isn't too bad, but once we start touching PDEs and PDPEs for 64K/2M/1G pages this apparent recycling will manifest into lots of really, really subtle bugs. v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma Fixes: `ff685975d9` ("drm/i915: Move allocate_va_range to GTT") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk	2017-05-12 13:37:25 +01:00
Chuanxiao Dong	0d402a24df	drm/i915: set initialised only when init_context callback is NULL During execlist_context_deferred_alloc() we presumed that the context is uninitialised (we only just allocated the state object for it!) and chose to optimise away the later call to engine->init_context() if engine->init_context were NULL. This breaks with GVT's contexts that are marked as pre-initialised to avoid us annoyingly calling engine->init_context(). The fix is to not override ce->initialised if it is already true. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1494497262-24855-1-git-send-email-chuanxiao.dong@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-05-12 11:11:11 +01:00
Joonas Lahtinen	73cc0b9aa9	drm/i915: Do not sync RCU during shrinking Due to the complex dependencies between workqueues and RCU, which are not easily detected by lockdep, do not synchronize RCU during shrinking. On low-on-memory systems (mem=1G for example), the RCU sync leads to all system workqueus freezing and unrelated lockdep splats are displayed according to reports. GIT bisecting done by J. R. Okajima points to the commit where RCU syncing was extended. RCU sync gains us very little benefit in real life scenarios where the amount of memory used by object backing storage is dominant over the metadata under RCU, so drop it altogether. " Yeeeaah, if core could just, go ahead and reclaim RCU queues, that'd be great. " - Chris Wilson, 2016 (`0eafec6d32`) v2: More information to commit message. v3: Remove "grep _rcu_" escapee from i915_gem_shrink_all (Andrea) Fixes: `c053b5a506` ("drm/i915: Don't call synchronize_rcu_expedited under struct_mutex") Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Hugh Dickins <hughd@google.com> Tested-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: <stable@vger.kernel.org> # v4.11+ Link: http://patchwork.freedesktop.org/patch/msgid/1494414040-11160-1-git-send-email-joonas.lahtinen@linux.intel.com	2017-05-11 15:58:23 +03:00
Michal Wajdeczko	a0c1fe2190	drm/i915/guc: Make scratch register base and count flexible We are using some scratch registers in MMIO based send function. Make their base and count flexible in preparation of upcoming GuC firmware/hardware changes. While around, change cmd len parameter verification from WARN_ON to GEM_BUG_ON as we don't need this all the time. v2: call out WARN/GEM_BUG change in the commit msg (Daniele) v3: don't overqualify the ints (Chris) v4: rebase and use proper enum Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-05-11 12:49:26 +03:00
Michal Wajdeczko	a03aac442d	drm/i915/guc: Move notification code into virtual function Prepare for alternate GuC notification mechanism. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> [Joonas: Added newlines] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-05-11 12:49:26 +03:00
Maarten Lankhorst	eb8bc8dcc5	drm/i915: Remove vma unpin in intel_plane_destroy commit `a667fb402c` Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Thu Dec 15 15:29:44 2016 +0100 drm/i915: Disable all crtcs during driver unload, v2. made sure that all crtc's are disabled on driver unload, but only the following commit made sure all fb's are cleaned up correctly: commit `9b2104f423` Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Tue Feb 21 14:51:40 2017 +0100 drm/atomic: Make disable_all helper fully disable the crtc. Finally remove this and add a WARN_ON when vma is set. It should have been removed by intel_cleanup_plane_fb(). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511082844.13965-2-maarten.lankhorst@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-05-11 11:30:27 +02:00
Maarten Lankhorst	749d98b80b	drm/i915: Fix hw state verifier access to crtc->state. We shouldn't inspect crtc->state, instead grab the crtc state. At this point the hw state verifier should be able to run even if crtc->state has been updated (which cannot currently happen). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511082844.13965-1-maarten.lankhorst@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-05-11 11:29:58 +02:00
Oscar Mateo	a8b9370fc7	drm/i915/guc: Dump the GuC stage descriptor pool in debugfs We are missing pieces of information that could be useful for GuC debugging. v2: Reuse some code (Joonas) Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> [Joonas: Removed extra newline and s/uint32_t/u32/ for checkpatch.pl] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1494428691-20672-1-git-send-email-oscar.mateo@intel.com	2017-05-11 12:26:50 +03:00
Daniel Vetter	ff26ffa8ee	drm/i915: Fix __intel_wait_for_register_fw to not sleep in atomic The unconditionally fallback to the blocking wait_for resulted in impressive fireworks at boot-up on my snb here. Make sure if we set the slow timeout to 0 that we never ever sleep. The tail of the callchain was intel_wait_for_register -> __intel_wait_for_register_fw -> usleep_range -> BOOM It blew up in intel_crt_detect load detection code on the ADPA_CRT_HOTPLUG_FORCE_TRIGGER in the ADPA register. v2: Shut up gcc. v3: Use uninitialized_var() (Chris). Fixes: `0564654340` ("drm/i915: Acquire uncore.lock over intel_uncore_wait_for_register()") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1494429572-15118-1-git-send-email-daniel.vetter@ffwll.ch	2017-05-10 20:06:41 +02:00
Ville Syrjälä	e11ffddba1	drm/i915: Simplify cursor register write sequence It looks like simply writing all the cursor register every single time might be slightly faster than checking to see of each of them need to be written. So if any other register apart from CURPOS needs to be written let's just write all the registers. CURPOS is left as a special case mainly for 845/865 where we have to disable the cursor to change many of the cursor parameters. This introduces a slight chance of the cursor flickering when things get updated (since we're not currently doing the vblank evade for cursor updates). If we write CURPOS alone then that obviously can't happen. And let's follow the same pattern in the i9xx code just for symmetry. I wasn't able to see a singificant performance difference between this and just writing all the registers unconditionally. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-16-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>	2017-05-10 19:28:35 +03:00

1 2 3 4 5 ...

14599 Commits