fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 11:18:11 +02:00

Author	SHA1	Message	Date
Ian Romanick	73f365e208	intel/brw: load_offset cannot be constant on this path Literally inside an if-statement (about 26 lines before this hunk) that checks for !nir_src_is_const(instr->src[1]). No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	fef175de09	intel/brw: Enable constant propagation for a couple more logical sends This prevents some regressions later in the MR. Once load_const operations are marked as is_scalar, they will cesase to get the automatic constant propagation that occurs in try_rebuild_source. No shader-db or fossil-db changes on any Intel platform. v2: Slightly relax source restrictions on SHADER_OPCODE_UNALIGNED_OWORD_BLOCK_READ_LOGICAL. Add a comment explaining the restriction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	c6a8b382fd	intel/brw: Relax is_partial_write check in cmod propagation The is_partial_write check is too strict because it tests two separate things. It tests whether or not the instruction always writes a value (i.e., is it predicated), and it tests whether or not the instruction writes a complete register. This latter check is problematic as it perevents cmod propagation in SIMD1, and it prevents cmod propagation in SIMD8 when the destination size is 16 bits. This check is unnecessary. Cmod propagation already checks that the region written and region read overlap. It also already checks that the execution sizes of the instructions match. Further restriction based on the specific parts of the register written only generates false negatives. v2: Relax all of the calls to is_partial_write. Suggested by Caio. No shader-db changes on any Intel platform. fossil-db: Meteor Lake Totals: Instrs: 151505520 -> 151502923 (-0.00%); split: -0.00%, +0.00% Cycle count: 17201385104 -> 17194901423 (-0.04%); split: -0.06%, +0.02% Spill count: 80827 -> 80837 (+0.01%) Fill count: 152693 -> 152692 (-0.00%); split: -0.01%, +0.01% Totals from 346 (0.05% of 630198) affected shaders: Instrs: 1257205 -> 1254608 (-0.21%); split: -0.21%, +0.00% Cycle count: 5532845647 -> 5526361966 (-0.12%); split: -0.18%, +0.06% Spill count: 32903 -> 32913 (+0.03%) Fill count: 64338 -> 64337 (-0.00%); split: -0.03%, +0.03% DG2 Totals: Instrs: 151531440 -> 151528055 (-0.00%); split: -0.00%, +0.00% Cycle count: 17200238927 -> 17197996676 (-0.01%); split: -0.03%, +0.02% Spill count: 81003 -> 80971 (-0.04%); split: -0.04%, +0.00% Fill count: 152975 -> 152912 (-0.04%); split: -0.05%, +0.01% Totals from 346 (0.05% of 630198) affected shaders: Instrs: 1260363 -> 1256978 (-0.27%); split: -0.27%, +0.00% Cycle count: 5532019670 -> 5529777419 (-0.04%); split: -0.09%, +0.05% Spill count: 33046 -> 33014 (-0.10%); split: -0.11%, +0.01% Fill count: 64581 -> 64518 (-0.10%); split: -0.13%, +0.03% Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 149972324 -> 149972289 (-0.00%) Cycle count: 15566495293 -> 15565151171 (-0.01%); split: -0.01%, +0.00% Totals from 16 (0.00% of 629912) affected shaders: Instrs: 351194 -> 351159 (-0.01%) Cycle count: 3922227030 -> 3920882908 (-0.03%); split: -0.04%, +0.00% Skylake Totals: Instrs: 140787999 -> 140787983 (-0.00%); split: -0.00%, +0.00% Cycle count: 14665614947 -> 14665515855 (-0.00%); split: -0.00%, +0.00% Spill count: 58500 -> 58501 (+0.00%) Fill count: 102097 -> 102100 (+0.00%) Totals from 16 (0.00% of 625685) affected shaders: Instrs: 343560 -> 343544 (-0.00%); split: -0.01%, +0.01% Cycle count: 3354997898 -> 3354898806 (-0.00%); split: -0.01%, +0.01% Spill count: 16864 -> 16865 (+0.01%) Fill count: 27479 -> 27482 (+0.01%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	13332c236b	intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup I observed some ray tracing shaders where a resource_intel inside a loop was non-uniform, and some code was lowered to account for that. Eventually the loop containing the resource_intel was unrolled, and the resource_intel became uniform. For example, nir_opt_uniform_subgroup can transform something like con loop { con block b5: // preds: b4 b8 con 32 %330 = @read_first_invocation (%329) con 1 %331 = ieq %330, %329 // succs: b6 b7 if %331 { con block b6: // preds: b5 con 32 %332 = iadd %120.b, %330 con 32 %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless\|non-uniform, resource_block_intel=-1) div 32x4 %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler) break // succs: b9 } else { con block b7: // preds: b5, succs: b8 } con block b8: // preds: b7, succs: b5 } into con loop { con block b5: // preds: b4 b8 con 1 %331 = ieq %329, %329 // succs: b6 b7 if %331 { con block b6: // preds: b5 con 32 %332 = iadd %120.b, %329 con 32 %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless\|non-uniform, resource_block_intel=-1) div 32x4 %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler) break // succs: b9 } else { con block b7: // preds: b5, succs: b8 } con block b8: // preds: b7, succs: b5 } Notice that %331 is now a tautology. Running brw_nir_optimize again eliminates the loop. v2: Add a comment in the code explaining the rationale. Suggested by Ken. Update the commit message. Suggested by Caio. shader-db: Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown) total instructions in shared programs: 19733448 -> 19733330 (<.01%) instructions in affected programs: 14120 -> 14002 (-0.84%) helped: 32 / HURT: 3 total cycles in shared programs: 916254496 -> 916226288 (<.01%) cycles in affected programs: 2035116 -> 2006908 (-1.39%) helped: 19 / HURT: 13 total spills in shared programs: 5807 -> 5807 (0.00%) spills in affected programs: 26 -> 26 (0.00%) helped: 1 / HURT: 1 total fills in shared programs: 6794 -> 6792 (-0.03%) fills in affected programs: 84 -> 82 (-2.38%) helped: 1 / HURT: 1 LOST: 1 GAINED: 1 Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 20393084 -> 20392971 (<.01%) instructions in affected programs: 21750 -> 21637 (-0.52%) helped: 31 / HURT: 4 total cycles in shared programs: 880273065 -> 880247818 (<.01%) cycles in affected programs: 2546748 -> 2521501 (-0.99%) helped: 18 / HURT: 9 total spills in shared programs: 4628 -> 4630 (0.04%) spills in affected programs: 287 -> 289 (0.70%) helped: 1 / HURT: 2 total fills in shared programs: 5381 -> 5376 (-0.09%) fills in affected programs: 711 -> 706 (-0.70%) helped: 2 / HURT: 2 LOST: 1 GAINED: 1 fossil-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 151513669 -> 151505520 (-0.01%); split: -0.01%, +0.00% Send messages: 7459339 -> 7459396 (+0.00%) Loop count: 49111 -> 47588 (-3.10%) Cycle count: 17208178205 -> 17201385104 (-0.04%); split: -0.05%, +0.01% Spill count: 80830 -> 80827 (-0.00%); split: -0.02%, +0.01% Fill count: 152754 -> 152693 (-0.04%); split: -0.04%, +0.00% Scratch Memory Size: 4136960 -> 4130816 (-0.15%) Max live registers: 32016493 -> 32015955 (-0.00%); split: -0.00%, +0.00% Totals from 672 (0.11% of 630198) affected shaders: Instrs: 1352428 -> 1344279 (-0.60%); split: -0.78%, +0.17% Send messages: 54302 -> 54359 (+0.10%) Loop count: 6124 -> 4601 (-24.87%) Cycle count: 1260266379 -> 1253473278 (-0.54%); split: -0.69%, +0.16% Spill count: 15967 -> 15964 (-0.02%); split: -0.09%, +0.08% Fill count: 36245 -> 36184 (-0.17%); split: -0.18%, +0.01% Scratch Memory Size: 740352 -> 734208 (-0.83%) Max live registers: 50699 -> 50161 (-1.06%); split: -1.45%, +0.39% Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown) Totals: Instrs: 149976046 -> 149971100 (-0.00%); split: -0.00%, +0.00% Subgroup size: 7685264 -> 7685256 (-0.00%) Cycle count: 15566401168 -> 15566405478 (+0.00%); split: -0.00%, +0.00% Spill count: 61238 -> 61240 (+0.00%) Fill count: 107301 -> 107289 (-0.01%) Max live registers: 31992969 -> 31993857 (+0.00%); split: -0.00%, +0.00% Totals from 553 (0.09% of 629912) affected shaders: Instrs: 557027 -> 552081 (-0.89%); split: -0.90%, +0.01% Subgroup size: 8648 -> 8640 (-0.09%) Cycle count: 150154496 -> 150158806 (+0.00%); split: -0.23%, +0.24% Spill count: 181 -> 183 (+1.10%) Fill count: 440 -> 428 (-2.73%) Max live registers: 33698 -> 34586 (+2.64%); split: -0.02%, +2.65% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	65eb7ed5fc	intel/brw: Run intel_nir_lower_conversions only after brw_nir_optimize Without this, the next commit tiggers assertions. v2: Unconditionally do the lowering after brw_nir_optimize. Suggested by Caio. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	572e00dd66	intel/brw: Copy prop from raw integer moves with mismatched types The specific pattern from the unit test was observed in ray tracing trampoline shaders. v2: Refactor the is_raw_move tests out to a utility function. Suggested by Ken. v3: Fix a regression caused by being too picky about source modifiers. This was introduced somewhere between when I did initial shader-db runs an v2. v4: Fix typo in comment. Noticed by Caio. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19734086 -> 19733997 (<.01%) instructions in affected programs: 135388 -> 135299 (-0.07%) helped: 76 / HURT: 2 total cycles in shared programs: 916290451 -> 916264968 (<.01%) cycles in affected programs: 41046002 -> 41020519 (-0.06%) helped: 32 / HURT: 29 fossil-db: Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown) Totals: Instrs: 151531355 -> 151513669 (-0.01%); split: -0.01%, +0.00% Cycle count: 17209372399 -> 17208178205 (-0.01%); split: -0.01%, +0.00% Max live registers: 32016490 -> 32016493 (+0.00%) Totals from 17361 (2.75% of 630198) affected shaders: Instrs: 2642048 -> 2624362 (-0.67%); split: -0.67%, +0.00% Cycle count: 79803066 -> 78608872 (-1.50%); split: -1.75%, +0.25% Max live registers: 421668 -> 421671 (+0.00%) Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 149995644 -> 149977326 (-0.01%); split: -0.01%, +0.00% Cycle count: 15567293770 -> 15566524840 (-0.00%); split: -0.02%, +0.01% Spill count: 61241 -> 61238 (-0.00%) Fill count: 107304 -> 107301 (-0.00%) Max live registers: 31993109 -> 31993112 (+0.00%) Totals from 17813 (2.83% of 629912) affected shaders: Instrs: 3738236 -> 3719918 (-0.49%); split: -0.49%, +0.00% Cycle count: 4251157049 -> 4250388119 (-0.02%); split: -0.06%, +0.04% Spill count: 28268 -> 28265 (-0.01%) Fill count: 50377 -> 50374 (-0.01%) Max live registers: 470648 -> 470651 (+0.00%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Lionel Landwerlin	14d772d678	anv: fix utrace compute timestamp reads on Gfx20 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30923>	2024-08-29 20:10:11 +00:00
Tapani Pälli	096acf8c0c	anv: change existing ICL workaround to depend on BLEND_STATE Commit `f900b763b1` we started to dirty MS as WM changes. However later on things changed with `eebb6cd236`, we need to dirty with BLEND_STATE now. Fixes: `eebb6cd236` ("anv: stop using 3DSTATE_WM::ForceThreadDispatchEnable") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30920>	2024-08-29 13:58:08 +00:00
Rohan Garg	51e05c2844	iris,anv: simplify and inline sampler count calculations Use the CLAMP macro to clamp the value and simplify the sampler count encoding. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30922>	2024-08-29 11:49:56 +00:00
Rohan Garg	32f606486f	anv: prefetch samplers when dispatching compute shaders Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30922>	2024-08-29 11:49:56 +00:00
Tapani Pälli	44e1cf2748	anv: set correct miplevel for anv_image_hiz_op Fixes: `5efecc9782` ("anv: Enable HiZ on multi-LOD depth buffers.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11787 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30892>	2024-08-29 04:50:44 +00:00
Faith Ekstrand	42114aa723	vulkan: Handle VIEW_INDEX_FROM_DEVICE_INDEX_BIT in the runtime Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>	2024-08-29 03:30:31 +00:00
Faith Ekstrand	8c60f1461b	vulkan: Take a VkPipelineCreateFlags2KHR in vk_pipeline_shader_stage() Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>	2024-08-29 03:30:31 +00:00
Jesse Natalie	03655dfda1	compiler, vk: Support subgroup size of 4 Relax the assert and assign it an enum value Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>	2024-08-29 03:30:31 +00:00
Kenneth Graunke	da395e6985	intel/brw: Fix extract_imm for subregion reads of 64-bit immediates We could be trying to extract a D/UD from a Q/UQ, for example. We were ignoring the top 32-bits, which is incorrect. Fixes: `580e1c592d` ("intel/brw: Introduce a new SSA-based copy propagation pass") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>	2024-08-28 12:33:26 -07:00
Kenneth Graunke	51c85e0363	intel/brw: Drop misguided sign extension attempts in extract_imm() This function never expands a type - it only narrows it. As such, we don't need to ever sign extend to fill additional new bits. I think this code was left over from earlier versions of my optimization pass that was buggy and trying to handle cases it should not have. Fixes: `580e1c592d` ("intel/brw: Introduce a new SSA-based copy propagation pass") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>	2024-08-28 12:33:26 -07:00
Deborah Brouwer	18f15da94d	ci/intel: add i915/MTL firmware to rootfs Add Meteor Lake firmware directly to rootfs since it is not available from debian package. Signed-off-by: Deborah Brouwer <deborah.brouwer@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30770>	2024-08-28 04:31:10 +00:00
Caio Oliveira	695f5314d6	intel/brw: Simplify fs_inst annotation When INTEL_DEBUG=ann is also set, the disassembler would annotate the output with either a string or the string verison of a NIR instruction. This was done by keeping two pointers (but only using one at a time). Change the code to print the instruction into a string instead of keeping it pointer around (peg the string to the shader). That way, only one pointer is needed for annotations. Because that serialization is not free, only do that when the environment variable is set. Since we are here, move the annotation string field to the end, moving it to the least commonly used cacheline. Further packing might allow the entire fs_inst to fit in two cachelines. For release builds, don't even add the debug annotation to the struct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30822>	2024-08-28 03:59:50 +00:00
Caio Oliveira	ec15cdfa2a	intel/brw: Pack brw_reg struct The alignment required for the second union (has 64-bit size) causes a hole between the first and second union. Move the remaining data there. In 64-bit build, shrinks brw_reg from 24 bytes to 16 bytes. And by consequence, shirnks fs_inst from 200 bytes to 160 bytes, making it use one less cacheline. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30822>	2024-08-28 03:59:50 +00:00
Iván Briano	2261b298d1	anv: fix adding to wa_addr Fixes: `6336e0fe7f` ("anv: order data in wa_bo to leave wa_addr last") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30881>	2024-08-27 18:10:58 -07:00
Sagar Ghuge	063715ed45	anv: Reduce clear color state alignment to 64B Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26793>	2024-08-27 21:13:30 +00:00
Lionel Landwerlin	e97b968aeb	brw: add a comment what Gfx12.5 URB fences Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30849>	2024-08-27 13:38:14 +00:00
Lionel Landwerlin	93fba40389	brw: switch mesh/task URB fence prior to EOT to GPU Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30849>	2024-08-27 13:38:14 +00:00
Sergi Blanch Torne	1b51e24b0a	New testing jobs intel-adl-skqp Introduce skqp testing on ADL generation. Only one job on the pre-merge, and no fraction needed, so not required to set up a job for nightly runs. Introduced the initial expectation files with fails, flakes, and skips. Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>	2024-08-27 12:49:28 +02:00
Sergi Blanch Torne	c653e98748	New testing jobs anv-adl-angle{,-full} Introduce testing coverage for Angle in ANV driver on ADL generation. One job in the pre-merge fraction, and another for the full coverage on the nightly runs. Introduced the initial expectation files with fails, flakes, and skips. Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>	2024-08-27 12:49:24 +02:00
Sergi Blanch Torne	6c9138f86a	New testing jobs anv-adl{,-full} Introduce testing coverage for ANV driver on ADL generation. Sharded in 4 jobs the pre-merge fraction, and with 5 jobs the full coverage on the nightly runs. Introduced the initial expectation files with fails, flakes, and skips. Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>	2024-08-27 12:49:21 +02:00
Sergi Blanch Torne	fce5e77604	New DUT for Alder Lake Introduce a new runner tag from a hidden job for ADL (Alder Lake Intel generation), known as brya. Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>	2024-08-27 12:49:16 +02:00
Kenneth Graunke	437bda3013	intel/brw: Get rid of the lsc_msg_desc_wcmask helper The LOAD/STORE opcodes take a vector size, while the LOAD/STORE_CMASK opcodes take a channel mask. The two are mutually exclusive. So we can just have the lsc_msg_desc() helper take one or the other in the same parameter. This more closely matches the actual descriptor. We couldn't do this until the previous commit, since we were previously relying on the lsc_msg_desc() function to calculate a cmask out of the number of vector components. But now we don't need it to do that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>	2024-08-27 09:25:59 +00:00
Kenneth Graunke	55f193a105	intel/brw: Switch from LSC CMASK opcodes to regular LOAD/STORE The LOAD/STORE opcodes take a vector size (number of components), while the LOAD/STORE_CMASK opcodes take a channel mask. For some reason, we were passing a number of channels to lsc_msg_desc(), then using it to construct a channel mask with all channels enabled, and always using the CMASK message variants. Considering we don't actually want to mask off any channels, we should probably just use the regular LOAD/STORE opcodes, as they're more flexible anyway. One exception is that typed messages on Xe2 apparently only support LOAD_CMASK/STORE_CMASK and not regular LOAD/STORE. So we keep using those there. (Thanks to Sagar Ghuge for catching this!) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>	2024-08-27 09:25:58 +00:00
Sviatoslav Peleshko	7e52b67801	anv: Add full subgroups WA for the shaders with barriers in Breaking Limit When barriers are used in invalid shaders with non-uniform control flow we might get a hang. Forcing 32-wide group can help by making it more probable that barrier instruction is executed by at least one channel in each thread, and thus hang will be avoided. This shouldn't affect Xe2+, where active-thread-only barriers are used anyway. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11497 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>	2024-08-27 08:26:08 +00:00
Sviatoslav Peleshko	1904fe1186	anv: Release correct BO in anv_cmd_buffer_set_ray_query_buffer If p_atomic_cmpxchg doesn't set the ray_query_shadow_bos[bucket] to new_bo allocated by this thread, it returns the bucket BO allocated by the other thread and we use it. But due to a mistake, we also release that BO, not the candidate just allocated by this thread and never used again. Fixes: `5d3e4193` ("anv: enable ray queries") Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>	2024-08-27 08:26:08 +00:00
Sviatoslav Peleshko	09122e2be0	brw,elk: Fix opening flags on dumping shader binaries Truncation is needed for overwriting correctly in cases when old file is bigger than the one we want to dump (e.g. when the old one was edited inplace). Also, creation permissions are way too broad. Fixes: `4f41c44d` ("intel/compiler: Add variable to dump binaries of all compiled shaders") Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>	2024-08-27 08:26:08 +00:00
Sviatoslav Peleshko	442cc7996e	anv: Assert ray query BO actually exists The crash will happen if the client tries to use ray queries without enabling the KHR_ray_query extension. Add an assert to be able to catch this sooner. Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>	2024-08-27 08:26:08 +00:00
Nanley Chery	4a8f3181ba	intel: Support any depth fast-clear value on Xe2 Remove the restriction that a depth fast-clear must have a clear value which matches an image-dependent heuristic. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>	2024-08-27 06:15:36 +00:00
Nanley Chery	4a9e45061a	anv: Add and use anv_image_hiz_clear_value() The benchmarks we're tracking tend to prefer clearing depth buffers to 0.0f when the depth buffers are part of images with multiple aspects. Otherwise, they tend to prefer clearing depth buffers to 1.0f. Replace the ANV_HZ_FC_VAL constant with a function which implements this heuristic. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>	2024-08-27 06:15:36 +00:00
Nanley Chery	9fd79dc49e	anv: Pass the VkClearDepthStencilValue for clears Xe2 can easily support fast-clearing depth buffers to multiple clear values. Instead of assuming a hard-coded value in various parts of the driver, pass the clear value down the expected paths. For consistency, also adjust the slow depth clear function to have a matching parameter. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>	2024-08-27 06:15:36 +00:00
Paulo Zanoni	f3c7e14f09	isl: don't assert(num_elements > (1ull << 27)) Some games such as Marvel's Spider-Man Remastered and Assassin's Creed: Valhalla don't work in debug mode because they hit this assertion. In Release mode, they appear to work (although in some platforms there may be visual corruption or GPU hangs). There's nothing we can do about this error (see below), so in this patch we replace the assertion with an error message, because it allows us to (i) test the rest of the game in debug mode so we may catch other issues; and (ii) warn users of release mode that the issue is happening. The unsupported num_elements comes from vkGetDescriptorEXT() and appears to be violating VUID-VkDescriptorGetInfoEXT-type-09427. This function cannot return errors, but we can disable VK_EXT_descriptor_buffer. If we do disable the extension, then vkCreateBufferView() will start triggering the assertion, and we can see that VkBufferViewCreateInfo-range-00930 is being violated. If we change Anv to return errors on these vkCreateBufferView() cases, then the games won't work at all. I reported this to vkd3d-proton, but according to the vkd3d-proton developer Philip Rebohle: "There's also the problematic case of games using typed descriptors but passing non-typed buffer descriptors, which is an extremely common app bug that works on all D3D12 drivers that we need to work around by creating typed views. If that's what's happening here then the best we can do is to just not create the typed view and have the game be broken entirely, or create a smaller view and most likely still completely break the game, but at least that way it wouldn't trigger Vulkan validation. Emulating larger views via multiple smaller views is not possible for us." "Confirmed that it's the app itself creating these views." "D3D12 does not have runtime validation for this or any sort of query for the app, so we really can't do much here." Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9963 Link: https://github.com/HansKristian-Work/vkd3d-proton/issues/2071 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30775>	2024-08-27 05:47:50 +00:00
Lionel Landwerlin	6336e0fe7f	anv: order data in wa_bo to leave wa_addr last We want to make sure the workaround_address is the last item in the BO so that we don't have to care about the size of the writes going there, we'll be sure they won't overwrite other items in that BO. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `7b9400b7f7` ("intel/blorp: Don't use clear color conversion on gfx12") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11775 Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30844>	2024-08-27 00:51:03 +00:00
Lionel Landwerlin	d8ec8acede	anv: always use workaround_address, not workaround_bo The workaround BO has some debug information at the beginning. The workaround address is placed after that. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30844>	2024-08-27 00:51:03 +00:00
Nanley Chery	9b98cebe9a	intel: Drop BLORP_BATCH_NO_UPDATE_CLEAR_COLOR All drivers update the clear color themselves. So, drop the functionality from BLORP as well as the flag controlling it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>	2024-08-26 23:57:12 +00:00
Nanley Chery	721d0c3e77	anv,hasvk: Always use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR Store the clear color from within the drivers, rather than from BLORP. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>	2024-08-26 23:57:11 +00:00
Nanley Chery	5fd42500cf	anv,hasvk: Add and use set_image_clear_color() We're going to be storing clear colors from the drivers rather than BLORP. Add a function for this purpose. For now, the first use replaces init_fast_clear_color(). One change in behavior is that the clear color initialization is now done without write-checking on gfx12. This actually matches what anv does to other writes to the image's fast-clear tracking state. We can fix this later if and when we address the larger issue. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>	2024-08-26 23:57:11 +00:00
Lionel Landwerlin	1f9c40a8d1	anv: explicitly disable BT pool allocations at device init The default state doesn't seem well defined (or kernel driver bug maybe?). Let's just set it to disabled on platforms where we're not using it. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Found-by: Chuansheng Liu <chuansheng.liu@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30841>	2024-08-26 10:34:31 +00:00
Caio Oliveira	31dfb04fd3	intel/brw: Remove long register file names The long names were originally meant to map to the HW encoding but nowadays the actual encoding values depend on gfx version, whether instruction is 3src, etc. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	6bdf2de4d2	intel/brw: Remove unused ARF values and helpers These were used by old Gfx versions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	72b687abb4	intel/brw: Make BAD_FILE the zero value for brw_reg_file Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	e8f921678a	intel/brw: Explicitly map brw_reg_file into hardware values For now this is a no-op, but will be useful when changing the enum to values that don't match the hardware. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	e7179232c9	intel/brw: Move encoding of Gfx11 3-src inside the inst helpers Create specific helper for register file encoding and handle it there. Use ad-hoc structs to let the macro take optional named arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	d31c8bfb6f	intel/brw: Remove more uses of variable length arrays In these cases there's a clear bound we can use. In C++ this is a compiler extension and not compatible with zero initializing a regular struct -- which will happen in a later change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	86c20e2910	intel/brw: Use a helper for common VEC pattern In the helper, instead of using the Variable Length Array, use a fixed size array to NIR_MAX_VEC_COMPONENTS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00

1 2 3 4 5 ...

12637 commits