linux

Author	SHA1	Message	Date
Alex Deucher	45a3e06be4	drm/amdgpu: Use IP versions in convert_tiling_flags_to_modifier() Rather than checking the asic_type. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Dillon Varone	fe5e8f07fc	drm/amd/display: Modify plane removal sequence to avoid hangs. Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Danijel Slivka	7952fa0d3e	drm/amd/pm: new v3 SmuMetrics data structure for Sienna Cichlid structure changed in smc_fw_version >= 0x3A4900, "uint16_t VcnActivityPercentage" replaced with "uint16_t VcnUsagePercentage0" and "uint16_t VcnUsagePercentage1" Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Prike Liang	d7709eb6a1	drm/amdgpu: enable gfxoff routine for GC 10.3.7 Enable gfxoff routine for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Prike Liang	fabe175385	drm/amdgpu: enable gfx power gating for GC 10.3.7 Enable gfx power gating for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Prike Liang	9a1358bb2c	drm/amdgpu/nv: enable clock gating for GC 10.3.7 subblock This will enable the following block clock gating. - MC - SDMA - HDP - ATHUB - IH - VCN/JPEG Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Prike Liang	00bfab4457	drm/amdgpu: enable gfx clock gating control for GC 10.3.7 Enable gfx cg gate/ungate control for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Qiang Yu	b6901d93cc	drm/amdgpu: fix suspend/resume hang regression Regression has been reported that suspend/resume may hang with the previous vm ready check commit. So bring back the evicted list check as a temp fix. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922 Fixes: `c1a66c3bc4` ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zha	e6fac6a9c9	drm/amdgpu: Move CAP firmware loading to the beginning of PSP firmware list [Why] As PSP needs to verify the signature, CAP firmware must be loaded first when PSP loads firmwares. Otherwise, when DFC feature is enabled, CP firmwares would be loaded failed. [ 1149.160480] [drm] MM table gpu addr = 0x800022f000, cpu addr = 00000000a62afcea. [ 1149.209874] [drm] failed to load ucode CP_CE(0x8) [ 1149.209878] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.215914] [drm] failed to load ucode CP_PFP(0x9) [ 1149.215917] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.221941] [drm] failed to load ucode CP_ME(0xA) [ 1149.221944] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.228082] [drm] failed to load ucode CP_MEC1(0xB) [ 1149.228085] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.234209] [drm] failed to load ucode CP_MEC2(0xD) [ 1149.234212] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.242379] [drm] failed to load ucode VCN(0x1C) [ 1149.242382] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [How] Move CAP UCODE ID to the beginning of AMDGPU_UCODE_ID enum list. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Andrey Grodzovsky	5aa061474b	drm/amdgpu: Bump minor version for hot plug tests enabling. This will allow to enable the tests only after latest fix after which the tests passed on my system. I tested on NV21 standalone and Vega 10 and Polaris as pair with DRI_PRIME. It's possible there might be still issues on ASICs i don't have at my posession but that that the point of enbling the tests finally - if other people during testing will encounter errors they will report and I will be able to fix. The releated merge request for enabling libdrm tests suite is in https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Andrey Grodzovsky	57230f0ce6	drm/amdgpu: Fix sigsev when accessing MMIO on hot unplug. Protect with drm_dev_enter/exit Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zhang	7d4108e4ce	drm/amdgpu: convert code name to ip version for noretry set Use IP version rather than codename for noretry set. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zhang	957b0787ee	drm/amdgpu: move amdgpu_gmc_noretry_set after ip_versions populated otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set is called. Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	80e0c2cb37	drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks 1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as .ras_fini common function, which is called when .ras_fini of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_fini to initialize .ras_fini in ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	30e58102d5	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	149d7ba1f8	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	aa8e65dfc7	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	f148c143ef	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	0dca257d6d	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	f578a37d19	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	9dad47c50f	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	35366481d0	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	1f211a827c	drm/amdgpu: centrally calls the .ras_fini function of all ras blocks centrally calls the .ras_fini function of all ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	667c7091a3	drm/amdgpu: Optimize xxx_ras_fini function of each ras block 1. Move the variables of ras block instance members from specific xxx_ras_fini to general ras_fini call. 2. Function calls inside the modules only use parameters passed from xxx_ras_fini instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	01d468d9a4	drm/amdgpu: Modify .ras_fini function pointer parameter Modify .ras_fini function pointer parameter so that we can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Shah Dharati	b51759661e	drm/amd/display: Adding a dc_debug option and dmub setting to use PHY FSM for PSR [Why] PSR Power on/off is done in PSR. Add a dc_debug option and dmub setting to use PHY implementation of this instead. [How] Add a dc_debug option and dmub setting to use PHY FSM Power up/down for PSR. Co-authored-by: Shah Dharati <dharati.shah@amd.com> Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Shah Dharati <dharati.shah@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Dillon Varone	91dcfe5fd9	drm/amd/display: Add frame alternate 3D & restrict HW packed on dongles [WHY?] Some projectors support frame alternate 3D modes at 120Hz, but DAL3 does not create timings. Most active DP to HDMI dongles do not translate infoframes properly to use HW packing stereo mode. [HOW?] Create frame alternate 3D timings for displays that support it. Disable HW packing 3D mode on DP active dongles. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Tom Rix	ca6fcfa8d4	drm/amdgpu: Fix realloc of ptr Clang static analysis reports this error amdgpu_debugfs.c:1690:9: warning: 1st function call argument is an uninitialized value tmp = krealloc_array(tmp, i + 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~ realloc uses tmp, so tmp can not be garbage. And the return needs to be checked. Fixes: `5ce5a584cb` ("drm/amdgpu: add debugfs for reset registers list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Chris Park	4affb12303	drm/amd/display: Reset VIC if HDMI_VIC is present [Why] HDMI Compliance requires VIC to be set to 0 on 2D mode if HDMI_VIC is present. [How] When VIC and HDMI_VIC is both present, reset VIC to 0. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Chris Park <Chris.Park@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Nicholas Kazlauskas	6dc0fded62	drm/amd/display: Make functional resource functions non-static [Why & How] To align coding style for how we use this across DCN. The resource creation ones can remain static, however. Reviewed-by: Eric Yang <Eric.Yang2@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Hansen Dsouza	1e242bf8bc	drm/amd/display: Remove invalid RDPCS Programming in DAL RDPCS programming is done in DMUB remove legacy invalid code Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Philip Yang	d58b8a99cb	drm/amdkfd: Add SMI add event helper To remove duplicate code, unify event message format and simplify new event add in the following patches. Use KFD_SMI_EVENT_MSG_SIZE to define msg size, the same size will be used in user space to alloc the msg receive buffer. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Philip Yang	38abd56bed	drm/amdkfd: Correct SMI event read size sizeof(buf) is 8 bytes because it is defined as unsigned char *buf, each SMI event read only copy max 8 bytes to user buffer. Correct this by using the buf allocate size. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Philip Yang	e433d68433	Revert "drm/amdkfd: process_info lock not needed for svm" This reverts commit `3abfe30d80`. To fix deadlock in kFDSVMEvictTest when xnack off. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Meng Tang	b8cb6ab686	gpu/amd: vega10_hwmgr: fix inappropriate private variable name In file vega10_hwmgr.c, the names of struct vega10_power_state * and struct pp_power_state * are confusingly used, which may lead to some confusion. Status quo is that variables of type struct vega10_power_state * are named "vega10_ps", "ps", "vega10_power_state". A more appropriate usage is that struct are named "ps" is used for variabled of type struct pp_power_state . So rename struct vega10_power_state which are named "ps" and "vega10_power_state" to "vega10_ps", I also renamed "psa" to "vega10_psa" and "psb" to "vega10_psb" to make it more clearly. The rows longer than 100 columns are involved. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Meng Tang <tangmeng@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Luben Tuikov	52e8da704d	drm/amd/display: Don't fill up the logs Don't fill up the logs with: [253557.859575] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed. [253557.892966] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed. [253557.926070] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed. [253557.959344] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed. which prints many times a second, when the kernel is run with drm.debug=2. Instead of DRM_DEBUG_DRIVER(), make it DRM_INFO_ONCE(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Roman Li <Roman.Li@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Hersen Wu <hersenwu@amd.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Fixes: `17ce8a6907` ("drm/amd/display: Add dsc pre-validation in atomic check") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:38:51 -05:00
Tommy Haung	e41d27eaf5	drm/aspeed: Add AST2600 chip support Add AST2600 chip support and setting. Signed-off-by: Tommy Haung <tommy_huang@aspeedtech.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://patchwork.freedesktop.org/patch/msgid/20220302024930.18758-5-tommy_huang@aspeedtech.com	2022-03-03 09:08:35 +10:30
Tommy Haung	5e2421ce79	drm/aspeed: Update INTR_STS handling Add interrupt clear register define for further chip support. Signed-off-by: Tommy Haung <tommy_huang@aspeedtech.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://patchwork.freedesktop.org/patch/msgid/20220302024930.18758-4-tommy_huang@aspeedtech.com	2022-03-03 09:08:35 +10:30
Thomas Zimmermann	9ae2ac4d31	drm: Add TODO item for optimizing format helpers Add a TODO item for optimizing blitting and format-conversion helpers in DRM and fbdev. There's always demand for faster graphics output. v4: * fix typos (Sam) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-6-tzimmermann@suse.de	2022-03-02 20:26:02 +01:00
Thomas Zimmermann	0d03011894	fbdev: Improve performance of cfb_imageblit() Improve the performance of cfb_imageblit() by manually unrolling the inner blitting loop and moving some invariants out. The compiler failed to do this automatically. This change keeps cfb_imageblit() in sync with sys_imagebit(). A microbenchmark measures the average number of CPU cycles for cfb_imageblit() after a stabilizing period of a few minutes (i7-4790, FullHD, simpledrm, kernel with debugging). cfb_imageblit(), new: 15724 cycles cfb_imageblit(): old: 30566 cycles In the optimized case, cfb_imageblit() is now ~2x faster than before. v3: * fix commit description (Pekka) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-5-tzimmermann@suse.de	2022-03-02 20:22:33 +01:00
Thomas Zimmermann	3c54c95bd9	fbdev: Remove trailing whitespaces from cfbimgblt.c Fix coding style. No functional changes. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-4-tzimmermann@suse.de	2022-03-02 20:22:06 +01:00
Thomas Zimmermann	6f29e04938	fbdev: Improve performance of sys_imageblit() Improve the performance of sys_imageblit() by manually unrolling the inner blitting loop and moving some invariants out. The compiler failed to do this automatically. The resulting binary code was even slower than the cfb_imageblit() helper, which uses the same algorithm, but operates on I/O memory. A microbenchmark measures the average number of CPU cycles for sys_imageblit() after a stabilizing period of a few minutes (i7-4790, FullHD, simpledrm, kernel with debugging). The value for CFB is given as a reference. sys_imageblit(), new: 25934 cycles sys_imageblit(), old: 35944 cycles cfb_imageblit(): 30566 cycles In the optimized case, sys_imageblit() is now ~30% faster than before and ~20% faster than cfb_imageblit(). v2: * move switch out of inner loop (Gerd) * remove test for alignment of dst1 (Sam) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-3-tzimmermann@suse.de	2022-03-02 20:20:46 +01:00
Thomas Zimmermann	7dbc515f5c	fbdev: Improve performance of sys_fillrect() Improve the performance of sys_fillrect() by using word-aligned 32/64-bit mov instructions. While the code tried to implement this, the compiler failed to create fast instructions. The resulting binary instructions were even slower than cfb_fillrect(), which uses the same algorithm, but operates on I/O memory. A microbenchmark measures the average number of CPU cycles for sys_fillrect() after a stabilizing period of a few minutes (i7-4790, FullHD, simpledrm, kernel with debugging). The value for CFB is given as a reference. sys_fillrect(), new: 26586 cycles sys_fillrect(), old: 166603 cycles cfb_fillrect(): 41012 cycles In the optimized case, sys_fillrect() is now ~6x faster than before and ~1.5x faster than the CFB implementation. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-2-tzimmermann@suse.de	2022-03-02 20:20:34 +01:00
Srinivasan Shanmugam	b2006061ae	drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds Registers that exist in the shared render/compute reset domain need to be placed on an engine workaround list to ensure that they are properly re-applied whenever an RCS or CCS engine is reset. We have a number of workarounds (updating registers MLTICTXCTL, L3SQCREG1_CCS0, GEN12_MERT_MOD_CTRL, and GEN12_GAMCNTRL_CTRL) that are incorrectly implemented on the 'gt' workaround list and need to be moved accordingly. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.s@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-14-matthew.d.roper@intel.com	2022-03-02 06:52:42 -08:00
Matt Roper	ff6b19d3a0	drm/i915/xehp: Add compute workarounds Additional workarounds are required once we start exposing CCS engines. Note that we have a number of workarounds that update registers in the shared render/compute reset domain. Historically we've just added such registers to the RCS engine's workaround list. But going forward we should be more careful to place such workarounds on a wa_list for an engine that definitely exists and is not fused off (e.g., a platform with no RCS would never apply the RCS wa_list). We'll keep rcs_engine_wa_init() focused on RCS-specific workarounds that only need to be applied if the RCS engine is present. A separate general_render_compute_wa_init() function will be used to define workarounds that touch registers in the shared render/compute reset domain and that we need to apply regardless of what render and/or compute engines actually exist. Any workarounds defined in this new function will internally be added to the first present RCS or CCS engine's workaround list to ensure they get applied (and only get applied once rather than being needlessly re-applied several times). Co-author: Srinivasan Shanmugam Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-13-matthew.d.roper@intel.com	2022-03-02 06:52:42 -08:00
Daniele Ceraolo Spurio	88ed07cb27	drm/i915/xehp: handle fused off CCS engines HW resources are divided across the active CCS engines at the compute slice level, with each CCS having priority on one of the cslices. If a compute slice has no enabled DSS, its paired compute engine is not usable in full parallel execution because the other ones already fully saturate the HW, so consider it fused off. v2 (José): - moved it to its own function - fixed definition of ccs_mask v3 (Matt): - Replace fls() condition with a simple IP version test v4 (Matt): - Don't try to calculate a ccs_mask using intel_slicemask_from_dssmask() until we've determined that we're running on an Xe_HP platform where the logic makes sense (and won't overflow). Cc: Stuart Summers <stuart.summers@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220302052008.1884985-1-matthew.d.roper@intel.com	2022-03-02 06:52:42 -08:00
Matthew Brost	e393e2aa0a	drm/i915/xehp: Don't support parallel submission on compute / render A different emit breadcrumbs ring programming is required for compute / render and we don't have UMD user so just reject parallel submission for these engine classes. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-11-matthew.d.roper@intel.com	2022-03-02 06:52:42 -08:00
Daniele Ceraolo Spurio	ea4ca894a1	drm/i915/xehp/guc: enable compute engine inside GuC Tell GuC that CCS is enabled by setting the CCS mask in its ADS. Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Original-author: Michel Thierry Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-10-matthew.d.roper@intel.com	2022-03-02 06:52:09 -08:00
Matt Roper	87cb6d80f2	drm/i915/xehp: Enable ccs/dual-ctx in RCU_MODE We have to specify in the Render Control Unit Mode register when CCS is enabled. v2: - Move RCU_MODE programming to a helper function. (Tvrtko) - Clean up and clarify comments. (Tvrtko) - Add RCU_MODE to the GuC save/restore list. (Daniele) v3: - Move this patch before the GuC ADS update to enable compute engines; the definition of RCU_MODE and its insertion into the save/restore list moves to this patch. (Daniele) v4: - Call xehp_enable_ccs_engines() directly in guc_resume() and execlists_resume() rather than adding an extra layer of wrapping to the engine->resume() vfunc. (Umesh) Bspec: 46034 Original-author: Michel Thierry Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220302001554.1836066-1-matthew.d.roper@intel.com	2022-03-02 06:45:21 -08:00
Matt Roper	adfadb5638	drm/i915/xehp: Define context scheduling attributes in lrc descriptor In Dual Context mode the EUs are shared between render and compute command streamers. The hardware provides a field in the lrc descriptor to indicate the prioritization of the thread dispatch associated to the corresponding context. The context priority is set to 'low' at creation time and relies on the existing context priority to set it to low/normal/high. Bspec: 46145, 46260 Original-author: Michel Thierry Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Prasad Nallani <prasad.nallani@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-8-matthew.d.roper@intel.com	2022-03-02 06:45:21 -08:00

1 2 3 4 5 ...

1075420 Commits