fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-23 06:10:23 +01:00

Author	SHA1	Message	Date
Agate, Jesse	282ad9d864	amd/vpelib: Refactor frontend and backend config callback Refactor and rename frontend and backend config callback. Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Jesse Agate <Jesse.Agate@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Alan Liu	4886ee5caf	amd/vpelib: Amend log for tone map support check [Why & How] Amend the log when failing to support tone mapping. Reviewed-by: Tomson Chang <tomson.chang@amd.com> Reviewed-by: Jude Shih <Jude.Shih@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Alan Liu <haoping.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Kovac, Krunoslav	c5e2c4feaf	amd/vpelib: Refactor MPC registers Refactor MPC registers. 3DLUT programming is largely the same but registers are renamed to be in VPMPC_RMCM (as opposed to VPMPCC_MCM). Note that they are still inside MCM so governed by MCM control location. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Agate, Jesse	63d8fa3f28	amd/vpelib: Refactor structs for API change Refactor structs for API change. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Jesse Agate <Jesse.Agate@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Hsieh, Mike	5e3b3ed8f7	amd/vpelib: Refactor OPP registers Refactor OPP registers. --------- Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Kovac, Krunoslav	914eb0a212	amd/vpelib: MPC refactoring HW registers In order to be able to share HW registers, some refactoring is needed. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Reviewed-by: Tomson Chang <tomson.chang@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Assadian, Navid	a76d1aa565	amd/vpelib: Fix whitepoint for geometric downscaling Fix whitepoint for geometric down scaling. --------- Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Navid Assadian <navid.assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Visan, Tiberiu	30a28b76c8	amd/vpelib: set the same range for clr adj Change the range for color adjustments and also modify bright cap. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Tiberiu Visan <Tiberiu.Visan@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Assadian, Navid	e1ef91ac2a	amd/vpelib: Fix CS translation for geometric downscaling Geometric downscaling uses RGB10 as the intermediate format. The support for P601 and JFIF with RGB formats is added. Co-authored-by: Roy Chan <roy.chan@amd.com> Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Navid Assadian <navid.assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Assadian, Navid	699f88f844	amd/vpelib: Add API function to get taps A module to calculate the number of taps is added to the API. Additionally, the get_optimal_taps module is moved from dpp to resource. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Navid Assadian <navid.assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Assadian, Navid	4fc221524c	amd/vpelib: Change Max DS support to 4:1 Since VPE can use up to 8 taps, for quality purposes vpelib cannot support downscaling ratios greater than 4:1. The caps value needed to be modified to reject this case earlier. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Navid Assadian <navid.assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Kovac, Krunoslav	e6dd0de4d9	amd/vpelib: DPP starting changes Refactor DPP registers to split into common and version specific. Gamut remap for DPP will likely move to MPC. For this, we need MPC changes and refactor program_front_end/back_end so the correct block does it. Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Lin, Ricky	54d1d41e10	amd/vpelib: Added JFIF format to RGB output side Added JFIF format to RGB output side, because geometric scaling will change the cs parameter to JFIF. --------- Reviewed-by: Tomson Chang <tomson.chang@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: rickylin <ricky.lin@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Hsieh, Mike	746556d585	amd/vpelib: Remove deprecated update_3dlut flag [WHY & HOW] update_3dlut flag has been replaced by UID mechanism. Remove update_3dlut flag and update related functions. Reviewed-by: Jesse Agate <jesse.agate@amd.com> Reviewed-by: Tomson Chang <tomson.chang@amd.com> Acked-by: Jack Chih <chiachih@amd.com> Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>	2024-08-07 16:46:25 +00:00
Paulo Zanoni	0e38b794e2	intel: fix compute SLM sizes on Xe2 and newer Before the patch, intel_device_info_get_max_preferred_slm_size() returns values in kilobytes, but then intel_device_info_get_max_slm_size() is multiplying it by 1024. As a result, LNL is reporting maxComputeSharedMemorySize to be 134217728, which is 128 MB. Fix this by making intel_device_info_get_max_slm_size() not multiply it by 1024. This should fix at least the following dEQP tests: dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.1 dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.128 dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.16 dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.2 dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.4 dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.64 Some tests were failing with: deqp-vk: ../../src/intel/common/intel_compute_slm.c:24: slm_encode_lookup: Assertion `kbytes <= table[table_len - 1].size_in_kb' failed. while other tests were triggering the OOM. v2: - Make everybody return sizes in bytes (José). v3: - Rename variable to bytes (José, Jordan). Fixes: `fd368f5521` ("anv: Set maxComputeSharedMemorySize value for Xe2 platforms") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30541>	2024-08-07 16:14:02 +00:00
Sil Vilerino	a0f1a708c4	Revert "d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported" This reverts commit `d6bb4ddc63`. Fixes: `d6bb4ddc63` ("d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported") PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE is necessary for some scenarios like the example below described in https://github.com/microsoft/WSL/issues/11838 gst-launch-1.0 -v videotestsrc num-buffers=250 ! video/x-raw,width=1920,height=1200 ! vaapipostproc ! vaapih264enc ! filesink location=~/wsl_test.h264 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30548>	2024-08-07 15:51:19 +00:00
Nanley Chery	54631ebc68	anv: Batch MCS and CCS aux-op flushes The PRMs suggest that certain classes of auxiliary surface operations will automatically synchronize when performed back-to-back: Any transition from any value in {Clear, Render, Resolve} to a different value in {Clear, Render, Resolve} requires end of pipe synchronization. Make use of this functionality by batching CCS and MCS flushes when compatible auxiliary surface operations are performed within a command buffer. Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11325 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29922>	2024-08-07 15:25:37 +00:00
Nanley Chery	f854161928	anv,iris: Use WriteImmediate instead of Z flush for WA According to the HSD, this is an alternative option for Wa_14016712196. Taking this option allows us to combine this workaround with a couple other depth workarounds. Make sure to execute these workarounds before the workaround for the depth register mode, so that the stalling flush is not impacted. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29922>	2024-08-07 15:25:37 +00:00
Nanley Chery	db6ae41c65	intel/blorp: Use WA helpers for depth pipecontrol Instead of unconditionally emitting a pipe control on gfx11+, use the workaround helpers for workarounds 1408224581 and 14014097488. Also, add a check for workaround 14016712196, which is also impacted. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29922>	2024-08-07 15:25:37 +00:00
Nanley Chery	77e4f9690d	anv: Drop flush from unused depth workaround This flush was introduced with the following commits: `8949d27bb8` ("anv: implement gen9 post sync pipe control workaround") `bcb611361b` ("anv: implement gen12 post sync pipe control workaround") The flush was unused with the following commit: `e79e1ca304` ("intel: Drop Tigerlake revision 0 workarounds") This prevents some extra pipecontrols caused by a following patch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29922>	2024-08-07 15:25:37 +00:00
Zan Dobersek	f58e1ef7ec	tu: enable shaderInt8 support Enable the shaderInt8 Vulkan feature for Turnip. As final necessary changes, an assert for nir_op_imul is tweaked to also allow 8-bit multiplication, and nir_op_bcsel's conversion of the conditional value from 8 to 32 bits is applied through masking, like in the general conversion case. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10675 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Zan Dobersek	e30c329026	ir3: improve validation, display for ldp instructions During validation, an ldp instruction should have all its three source registers validated. For display, the half-type register name should be displayed when applicable. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Zan Dobersek	55ac28954e	ir3: indicate possible dword straddle for any multi-component pvtmem access When filling out ir3_info, any multi-component stp or ldp instruction should indicate possible straddle across dword boundaries. This indirectly prevents setting the PERWAVEMEMLAYOUT flag on the SP_VS_PVT_MEM_SIZE register, enabling proper functioning of three-component 8-bit accesses with natural alignment. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Zan Dobersek	9e0b77d5c3	ir3: use fully-functional dp4acc when available a750 improves dp4acc to have support for all dot product variants. The main difference with dp4acc of previous generations is that the signedness and packed instruction fields have to be instead interpreted as signedness of either side of the dot product. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Zan Dobersek	8aa2cad5df	ir3: lower relevant 8-bit ALU ops in nir_lower_bit_size The nir_lower_bit_size pass is used to properly adapt specific 8-bit ALU operations for correct behavior. In those cases inputs are converted to 16 bits and the result is converted back down to 8 bits. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Zan Dobersek	7fd5f76393	nir/lower_vars_to_scratch: calculate threshold-limited variable size separately ir3's lowering of variables to scratch memory has to treat 8-bit values as 16-bit ones when comparing such value's size against the given threshold since those values are handled through 16-bit half-registers. But those values can still use natural 8-bit size and alignment for storing inside scratch memory. nir_lower_vars_to_scratch now accepts two size-and-alignment functions, one used for calculating the variable size and the other for calculating the size and alignment needed for storing inside scratch memory. Non-ir3 uses of this pass can just duplicate the currently-used function. ir3 provides a separate variable-size function that special-cases 8-bit types. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Zan Dobersek	f8602612ed	ir3: some 8-bit subgroup intrinsics must execute as 16-bit instructions ir3 8-bit quad-broadcast, quad-swap, scan and reduce instructions only work correctly when done in 16-bit space. A nir_lower_bit_size pass is used to upcast the source value and then downcast the result back to 8 bits. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>	2024-08-07 14:32:28 +00:00
Danylo Piliaiev	8b7beca572	tu: Enable UBWC for D24S8 with USAGE_SAMPLED and formatless border color DXVK and VKD3D-Proton use customBorderColorWithoutFormat and have most of D24S8 images with USAGE_SAMPLED, in such case we disable UBWC for correctness. However, games don't use border color for depth-stencil images. So we elect to ignore this edge case and force UBWC to be enabled. See also https://github.com/doitsujin/dxvk/issues/4191 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30545>	2024-08-07 13:51:20 +00:00
Karol Herbst	012323a1d1	rusticl/image: properly sync mappings content for 1Dbuffer images This fixes clFillImage 1Dbuffer use_pitches CL CTS tests. Fixes: `7b22bc617b` ("rusticl/memory: complete rework on how mapping is implemented") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30528>	2024-08-07 13:38:06 +00:00
Karol Herbst	2484331e82	rusticl/image: take pitches into account when allocating memory for maps This is more correct than the previous code and the CL CTS relies on edge case behavior here, e.g. for 1Dbuffer images. I think part of that is not actually required by the spec, but whatever. Fixes: `7b22bc617b` ("rusticl/memory: complete rework on how mapping is implemented") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30528>	2024-08-07 13:38:06 +00:00
Karol Herbst	1fa288b224	rusticl/memory: Fix memory unmaps after rework An application could map and unmap a host ptr allocation multiple times, but because of how the refcounting works, we might never end up syncing the written data to the mapped region. This moves the refcounting out of the event processing. Fixes: `7b22bc617b` ("rusticl/memory: complete rework on how mapping is implemented") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30528>	2024-08-07 13:38:05 +00:00
Eric Engestrom	b6d8459e3a	ci: pass MESA_SPIRV_LOG_LEVEL from job to the test Fixes: `4b8735cd4e` ("ci: raise the log level threshold of spirv logs") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30546>	2024-08-07 11:43:25 +02:00
Mike Blumenkrantz	ef88af8467	dril: always take the egl init path using EGL_DEFAULT_DISPLAY will cover the swrast case, which fixes generating all the correct configs Fixes: `ec7afd2c24` ("dril: rework config creation") Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30426>	2024-08-07 08:54:40 +00:00
Iago Toral Quiroga	086ed1e54b	broadcom/compiler: emit instructions producing flags earlier We usually emit flags right before consuming them but this is suboptimal from the point of view of register pressure: if an instruction is only used to generate flags then waiting to emit it right before reading the flags extends the liveness of the sources used to generate the flags for no gain. This pass will check for such instructions and try to move them as early as possible. Shader-db results below show this is effective to reduce register pressure, allowing a few shaders to increase thread counts and/or reduce spilling: total instructions in shared programs: 11057173 -> 11057076 (<.01%) instructions in affected programs: 1955543 -> 1955446 (<.01%) helped: 4214 HURT: 3905 Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 425096 -> 425170 (0.02%) threads in affected programs: 74 -> 148 (100.00%) helped: 37 HURT: 0 Threads are helped. total uniforms in shared programs: 3846275 -> 3845674 (-0.02%) uniforms in affected programs: 23574 -> 22973 (-2.55%) helped: 217 HURT: 30 Uniforms are helped. total max-temps in shared programs: 2222910 -> 2220488 (-0.11%) max-temps in affected programs: 61904 -> 59482 (-3.91%) helped: 2145 HURT: 113 Max-temps are helped. total spills in shared programs: 4294 -> 4280 (-0.33%) spills in affected programs: 148 -> 134 (-9.46%) helped: 8 HURT: 0 total fills in shared programs: 6497 -> 6468 (-0.45%) fills in affected programs: 291 -> 262 (-9.97%) helped: 8 HURT: 0 total sfu-stalls in shared programs: 14344 -> 14611 (1.86%) sfu-stalls in affected programs: 1308 -> 1575 (20.41%) helped: 217 HURT: 335 Inconclusive result (%-change mean confidence interval includes 0). total inst-and-stalls in shared programs: 11071517 -> 11071687 (<.01%) inst-and-stalls in affected programs: 1946767 -> 1946937 (<.01%) helped: 4191 HURT: 3909 Inconclusive result (value mean confidence interval includes 0). total nops in shared programs: 270628 -> 269829 (-0.30%) nops in affected programs: 22032 -> 21233 (-3.63%) helped: 1213 HURT: 571 Inconclusive result (%-change mean confidence interval includes 0). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30511>	2024-08-07 09:28:39 +02:00
Georg Lehmann	d9849ac466	aco: test xor swap16 path Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30515>	2024-08-06 20:40:12 +00:00
Georg Lehmann	e0818cb87b	aco/gfx11+: don't use VOP3 v_swap_b16 v_swap_b16 is not officially supported as VOP3, so it can't be used with v128-255. Tests show that VOP3 appears to work correctly, but according to AMD that should not be relied on. https://github.com/llvm/llvm-project/pull/100442#discussion_r1703929676 Foz-DB Navi31: Totals from 6 (0.01% of 79395) affected shaders: Instrs: 64799 -> 65932 (+1.75%) CodeSize: 360180 -> 368440 (+2.29%) Latency: 1364648 -> 1365922 (+0.09%) InvThroughput: 635843 -> 636475 (+0.10%) Copies: 14766 -> 15698 (+6.31%) VALU: 38743 -> 39675 (+2.41%) Fixes: `80b8bbf0c5` ("aco/gfx11: use v_swap_b16") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30515>	2024-08-06 20:40:12 +00:00
Alyssa Rosenzweig	796b3ab23d	nir/opt_peephole_select: allow speculatable load constant this is useful on AGX when soft fault is enabled. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30501>	2024-08-06 20:01:37 +00:00
Aditya Swarup	ae85f59645	anv: Disable fast clear when surface height is 16k As suggested in WA_16021232440: Disable fast clear when surface height equals 16k. Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29182>	2024-08-06 19:14:04 +00:00
Aditya Swarup	0f821c1e2f	iris: Disable fast clear when surface height is 16k If surface height during fast clear is 16k, as per bspec the height programmed should be "value - 1" i.e. 0x3FFF. However, HW adds "1" to it but ignores overflow bit[14]. HW performs OOB check based on bit[13:0] which is 0 and drops failed transactions. This patch passes the following failing test on LNL: "PIGLIT_PLATFORM=gbm PIGLIT_DEFAULT_SIZE=16384x16384 shader_runner fast-slow-clear-interaction.shader_test -auto -fbo" Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29182>	2024-08-06 19:14:04 +00:00
Lionel Landwerlin	6145798022	intel/mi_builder: enable control flow API on Gfx9+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:19 +00:00
Lionel Landwerlin	8cc492cb26	genxml: unify some bits between Gfx8/Gfx11/Gfx12.5 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	343e569ab7	anv: ensure max_plane_count is at least 1 This simplifies a bunch of checks throughout the driver. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	4f093b2e2b	anv: add missing MEDIA_STATE_FLUSH for internal shaders Replicating what we do in genX_cmd_compute.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `7ca5c84804` ("anv: add support for simple internal compute shaders") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	0bd96e868c	intel-clc: missing printf lowering Useful for printf() debugging in our opencl shader snippets. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	398e6cf38b	anv: reuse cs_prog_data pointer Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	f4a812a229	anv: remove some unused includes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	cde72181b7	anv: prevent asserts with debug printf in internal shaders Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Kenneth Graunke	32cce2f397	intel/brw: Set appropriate types for 16-bit sampler trailing components 16-bit SIMD8 sampler writeback messages come with a bit of padding in them, requiring us to emit a LOAD_PAYLOAD to reorganize the data into the padding-free format expected by NIR. Additionally, we may reduce the response length on the sampler messages based on which components of the (always vec4) NIR destination are actually in use. When we do that, dest_size > read_size, and the trailing components are all empty BAD_FILE registers, indicating the contents are undefined. Unfortunately, we can't ignore those trailing components entirely. In the past, we left them default-initialized, giving us a BAD_FILE register with UD type (which didn't matter, since all sampler returns were 32-bit). But with 16-bit, this was confusing the LOAD_PAYLOAD. For example, writing RGB and skipping A (without sparse) would produce read_size = 3 and dest_size = 4 and nir_dest[5] containing: nir_dest[] = <R:hf, G:hf, B:hf, blank-A:ud, blank-sparse:ud> We'd then call LOAD_PAYLOAD on the first 4 sources, causing it to see 3 HF's and a UD, and try to copy the full 32-bit value at the end, instead of 16-bits of pad like we intended. This meant it would overflow the destination register's size, triggering validation errors. Thanks to Ian Romanick for noticing this, writing a test, and also coming up with a nearly identical fix. Fixes: `0116430d39` ("intel/brw: Handle 16-bit sampler return payloads") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11617 References: https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/152 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30529>	2024-08-06 17:26:05 +00:00
Tatsuyuki Ishi	947a333ec3	util/u_queue: Replace relative time wait hack with u_cnd_monotonic Remove the gross hack. The hack was broken too, because it incorrectly added abs_time (a timestamp) to the now (another timestamp). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30491>	2024-08-06 16:37:59 +00:00
Alyssa Rosenzweig	c40c723336	agx: use opt_uniform_atomics Apple does something similar. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30488>	2024-08-06 11:48:18 -04:00
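
Note on commit 0e38b794e2 (SLM sizes): the reported value 134217728 is what results when a 128 KiB size, already expressed in bytes, is multiplied by 1024 a second time. The sketch below only reproduces that arithmetic. The helper names are hypothetical and are not the Mesa functions named in the commit message; which function returned which unit in the real code is not assumed here.

```c
#include <assert.h>
#include <stdint.h>

#define KIB 1024u

/* Hypothetical helper: convert a size given in KiB to bytes. */
static uint32_t kib_to_bytes(uint32_t kib)
{
    return kib * KIB;
}

/* Hypothetical illustration of a unit mix-up: the value is converted
 * to bytes and then scaled by 1024 again by a caller that still
 * believes it holds KiB. */
static uint32_t scale_twice(uint32_t kib)
{
    uint32_t bytes = kib_to_bytes(kib);
    return bytes * KIB;
}

/* Checks the two figures discussed above. */
static void check_slm_units(void)
{
    assert(kib_to_bytes(128) == 131072u);    /* 128 KiB in bytes */
    assert(scale_twice(128) == 134217728u);  /* 128 MiB, the bogus limit */
}
```

Calling check_slm_units() from a test harness passes silently; the second assertion matches the maxComputeSharedMemorySize figure quoted in the commit message.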


193177 commits
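
Note on commit 0f821c1e2f (16k fast clear): the message describes a height field programmed as "value - 1", to which the hardware adds 1 while ignoring the overflow into bit 14, so the bounds check sees bits [13:0] equal to 0. A minimal sketch of that arithmetic follows. The function names are hypothetical and the masking is an assumed model of the behavior described in the commit message, not code from iris or anv.

```c
#include <assert.h>
#include <stdint.h>

/* Field value the driver programs: height minus one. */
static uint32_t programmed_height(uint32_t height)
{
    return height - 1u;
}

/* Assumed model of the hardware check: add one back, but keep only
 * bits [13:0], dropping the carry into bit 14. */
static uint32_t hw_effective_height(uint32_t programmed)
{
    return (programmed + 1u) & 0x3FFFu;
}

/* Checks the 16k case against a smaller height that is unaffected. */
static void check_height_overflow(void)
{
    assert(programmed_height(16384u) == 0x3FFFu);
    assert(hw_effective_height(programmed_height(16384u)) == 0u);
    assert(hw_effective_height(programmed_height(8192u)) == 8192u);
}
```

Under this model a 16384-row surface yields an effective height of 0, which is consistent with the dropped transactions the commit describes, while smaller heights round-trip unchanged.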