fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 09:00:10 +01:00

Author	SHA1	Message	Date
Caio Oliveira	8e2a7cb42d	brw: Embed at_end() inside brw_builder(brw_shader *) constructor All remaining uses of that constructor would also use at_end(), and vice-versa. So just implement that behavior in the constructor itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	6f37e6f104	brw: Add explicit way to get an empty brw_builder And use brw_builder(brw_shader *) and brw_builder() constructors where possible. The way tests are written, it is necessary to initialize an "empty" builder -- which is later replaced by a proper one. Default parameter NULL makes that initialization implicit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	32e562ae01	brw: Simplify brw_builder "insert before inst" constructor Since brw_inst now has the block it belongs to and the block can reach the shader, the only necessary information to create a builder is the brw_inst itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	66307811c3	brw: Remove block parameter from brw_inst::remove() Use brw_inst::block instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	7924d48bcd	brw: Use brw_inst::block in CSE Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	b0b0fa8624	brw: Use brw_inst::block in Combine Constants Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	07d0af763d	brw: Use brw_inst::block in Def analysis Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	705d448bc3	brw: Add block pointer in brw_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	b71ec53048	brw: Remove unused function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	54912281a0	brw: Always verify EU compaction in debug mode There's already code to verify that any compacted instruction that we produce is equivalent to the original uncompacted instruction -- including detailed output if it fails. This patch enables this verification in debug build and will abort in case it fails. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33821>	2025-03-06 00:14:14 +00:00
Lionel Landwerlin	93a327c4e6	anv/brw: move INTEL_MSAA_* flag computation to the compiler Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33751>	2025-03-05 17:20:12 +00:00
Lionel Landwerlin	beaba53010	brw: make intel_shader_enums.h opencl importable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33751>	2025-03-05 17:20:12 +00:00
Caio Oliveira	dd1ca1588d	brw: Fix size in assembler when compacting Calculation was wrongly walking uncompacted instructions, even if we had some compacted in the middle, generating invalid size. Since we are here just drop the instruction count, since in practice the caller will have to walk the instruction stream anyway. Fixes: `6267585778` ("intel/brw: Also return the size of the assembled shader") Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33532>	2025-03-03 20:43:56 +00:00
Kenneth Graunke	88309a9818	brw: Rename shared function enums for clarity Our name for this enum was brw_message_target, but it's better known as shared function ID or SFID. Call it brw_sfid to make it easier to find. Now that brw only supports Gfx9+, we don't particularly care whether SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5. Also, the LSC SFIDs were confusingly tagged "GFX12" but aren't available on Gfx12.0; they were introduced with Alchemist/Meteorlake. GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing. It sounds like the SFID to use for the sampler on Gfx6+, however it has nothing to do with the sampler at all. BRW_SFID_SAMPLER remains the sampler SFID. On Haswell, we ran out of messages on the main data cache data port, and so they introduced two additional ones, for more messages. The modern Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2. I think the "sampler" name came from some idea about reorganizing messages that never materialized (instead, the LSC came as a much larger cleanup). Recently we've adopted the term "HDC" for the legacy data cluster, as opposed to "LSC" for the modern Load/Store Cache. To make clear which SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2. We were also citing the G45, Sandybridge, and Ivybridge PRMs for a compiler that supports none of those platforms. Cite modern docs. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>	2025-02-27 08:49:24 +00:00
Tapani Pälli	78e5157a9c	intel/compiler: add a spec note about L1WT types being uncached Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33755>	2025-02-27 05:38:35 +00:00
Paulo Zanoni	fd10764cff	brw: extend the NOP+WHILE workaround It turns out that we need to add a NOP not only in between two consecutive WHILE instructions, but also after every control flow instruction that immediately precedes a WHILE. v2: Rebase after the renames. Fixes: `5ca883505e` ("brw: add a NOP in between WHILE instructions on LNL") Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>	2025-02-26 22:23:16 +00:00
Paulo Zanoni	3596b4e325	brw: add instructions missing from is_control_flow() I'm not aware of any workloads that will be impacted by this change, but let's keep our list of control flow instructions complete. A shader-db run on MTL tells me nothing changes. v2: "The scheduler relies on HALT not being considered control flow to be able to move code past HALT instructions. Doing this would prevent such optimization from happening and would reduce performance dramatically in some cases." - Francisco. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>	2025-02-26 22:23:16 +00:00
Karol Herbst	dad5ee1039	intel/brw, lp: enable lower_pack_64_4x16 The compiler won't be able to emit pack_64_4x16, so we should prevent nir_opt_algebraic to optimize to it. This fixes an infinite optimization loop inside brw_nir_optimize: nir_copy_prop 16x4 %77 = @load_global (%80) 32 %61995 = pack_32_2x16_split %77.x, %77.y 32 %61998 = pack_32_2x16_split %77.z, %77.w 64 %61999 = pack_64_2x32_split %61995, %61998 64 %76 = iadd %100, %79 @store_global (%61999, %76) nir_opt_algebraic 16x4 %77 = @load_global (%80) 32 %61995 = pack_32_2x16_split %77.x, %77.y 32 %61998 = pack_32_2x16_split %77.z, %77.w 16x4 %62000 = vec4 %77.x, %77.y, %77.z, %77.w 64 %62001 = pack_64_4x16 %62000 64 %76 = iadd %100, %79 @store_global (%62001, %76) nir_lower_pack 16x4 %77 = @load_global (%80) 16x4 %62000 = vec4 %77.x, %77.y, %77.z, %77.w 16 %62002 = mov %62000.y 16 %62003 = mov %62000.x 32 %62004 = pack_32_2x16_split %62003, %62002 16 %62005 = mov %62000.w 16 %62006 = mov %62000.z 32 %62007 = pack_32_2x16_split %62006, %62005 64 %62008 = pack_64_2x32_split %62004, %62007 64 %76 = iadd %100, %79 @store_global (%62008, %76) // brw_nir_optimize loops here nir_copy_prop 16x4 %77 = @load_global (%80) 32 %62004 = pack_32_2x16_split %77.x, %77.y 32 %62007 = pack_32_2x16_split %77.z, %77.w 64 %62008 = pack_64_2x32_split %62004, %62007 64 %76 = iadd %100, %79 @store_global (%62008, %76) llvmpipe has a similar issue inside lp_build_opt_nir Fixes: `b1bc691b0f` ("nir/algebraic: add and improve pack/unpack patterns") Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33347>	2025-02-26 20:43:39 +00:00
Ian Romanick	495812d8e0	brw/print: Don't let SHADER_OPCODE_FLOW affect indentation In `fossilize-replay --pipeline-hash 375a63e14afa96c4 fossils/fossil-db/steam-dxvk/f1_22_abu_dhabi.dx12vk-ultra.foz`, `cf_count` would get decremented below zero. This would lead to trying to print `UINT_MAX` levels of indentation just a few lines below. I ran out of disk space and patience before that finished. 🤣 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33748>	2025-02-26 19:50:30 +00:00
Lionel Landwerlin	d0c980caa7	brw: avoid setting up the sampler header bits when unused Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33704>	2025-02-26 17:19:04 +00:00
Lionel Landwerlin	8b4f997168	brw: optimize load payload with immediate headers Currently the condition to use a single MOV is failing on immediate values, so we emit 2 MOVs in SIMD8 instead of a single SIMD16. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33704>	2025-02-26 17:19:04 +00:00
Alyssa Rosenzweig	ff94b155ab	treewide: port remaining nir_metadata_preserve users apply our semantic patch manually to the remaining users. Coccinelle bailed on these files for whatever reason, I guess. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722>	2025-02-26 15:19:53 +00:00
Alyssa Rosenzweig	9a58a8257e	treewide: Switch to nir_progress Via the Coccinelle patch at the end of the commit message, followed by sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog') ninja -C ~/mesa/build clang-format cd ~/mesa/src/compiler/nir && clang-format -i *.c agxfmt @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} -return prog; +return nir_progress(prog, impl, metadata); @@ expression prog_expr, impl, metadata; @@ -if (prog_expr) { -nir_metadata_preserve(impl, metadata); -return true; -} else { -nir_metadata_preserve(impl, nir_metadata_all); -return false; -} +bool progress = prog_expr; +return nir_progress(progress, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all); -return prog; +return nir_progress(prog, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all); +nir_progress(prog, impl, metadata); @@ expression impl, metadata; @@ -nir_metadata_preserve(impl, metadata); -return true; +return nir_progress(true, impl, metadata); @@ expression impl; @@ -nir_metadata_preserve(impl, nir_metadata_all); -return false; +return nir_no_progress(impl); @@ identifier other_prog, prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} -other_prog |= prog; +other_prog = other_prog | nir_progress(prog, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +nir_progress(prog, impl, metadata); @@ identifier other_prog, prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -other_prog = true; -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +other_prog = other_prog | nir_progress(prog, impl, metadata); @@ expression prog_expr, impl, metadata; identifier prog; @@ -if (prog_expr) { -nir_metadata_preserve(impl, metadata); -prog = true; -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +bool impl_progress = prog_expr; +prog = prog | nir_progress(impl_progress, impl, metadata); @@ identifier other_prog, prog; expression impl, metadata; @@ -if (prog) { -other_prog = true; -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +other_prog = other_prog | nir_progress(prog, impl, metadata); @@ expression prog_expr, impl, metadata; identifier prog; @@ -if (prog_expr) { -prog = true; -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +bool impl_progress = prog_expr; +prog = prog | nir_progress(impl_progress, impl, metadata); @@ expression prog_expr, impl, metadata; @@ -if (prog_expr) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +bool impl_progress = prog_expr; +nir_progress(impl_progress, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -nir_metadata_preserve(impl, metadata); -prog = true; +prog = nir_progress(true, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} -return prog; +return nir_progress(prog, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} +nir_progress(prog, impl, metadata); @@ expression impl; @@ -nir_metadata_preserve(impl, nir_metadata_all); +nir_no_progress(impl); @@ expression impl, metadata; @@ -nir_metadata_preserve(impl, metadata); +nir_progress(true, impl, metadata); squashme! sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog') Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722>	2025-02-26 15:19:53 +00:00
Sagar Ghuge	6f7a76e9d9	intel/compiler: Zero out the header for texel fetch It looks like even if we pass the header not present in the sampler descriptor, it's not helping with the correct behavior of texelFetch. Experiment on real HW shows that if we just zero out the header and include it in the message, it helps with the correct behavior. I'm not sure if there is a valid HW workaround for this one. We can skip masking the sampler message header bits 4:0 but masking them out doesn't hurt in this case. Increasing the number of parameters impacts sampler performance. For example, a sample message using 5 parameters will not be able to sustain the same throughput as a sample message with only 4 valid parameters. We should look out for any perf impact with respect to texel fetch. This patch fixes ~3k tests involving the texelFetch instruction on Xe3+ Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33562>	2025-02-26 00:23:49 +00:00
Caio Oliveira	a030acd7c3	brw: Reformat brw_gram.y and brw_lex.l Change to use Mesa space indentation. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33739>	2025-02-25 22:57:51 +00:00
Caio Oliveira	7311bcfd6a	intel/brw: Don't need to repair CFG in brw_opt_combine_constants Since a previous change ensured that a DO-block is guaranteed to not be followed by a DO-block, it is sufficient to pick the next block without requiring to repair the CFG. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>	2025-02-24 23:25:06 +00:00
Caio Oliveira	d2c39b1779	intel/brw: Always have a (non-DO) block after a DO in the CFG Make the "block after DO" more stable so that adding instructions after a DO doesn't require repairing the CFG. Use a new SHADER_OPCODE_FLOW instruction that is a placeholder representing "go to the next block" and disappears at code generation. For some context, there are a few facts about how CFG currently works - Blocks are assumed to not be empty; - DO is always by itself in a block, i.e. starts and ends a block; - There are no empty blocks; - Predicated WHILE and CONTINUE will link to the "block after DO"; - When nesting loops, it is possible that the "block after DO" is another "DO". Reasons and further explanations for those are in the brw_cfg.c comments. What makes this new change useful is that a pass might want to add instructions between two DO instructions. When that happens, a new block must be created and any predicated WHILE and CONTINUE must be repaired. So, instead of requiring a repair (which has proven to be tricky in the past), this change adds a block that can be "virtually" empty but allows instructions to be added without further changes. One alternative design would be allowing empty blocks, but that would be a deeper change since the blocks are currently assumed to be non-empty in various places. We'll save that for when other changes are made to the CFG. The problem described happens in brw_opt_combine_constants, and a different patch will clean that up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>	2025-02-24 23:25:06 +00:00
Caio Oliveira	d32a5ab0e4	intel/brw: Use the builder DO() function in all places Shorter and a preparation to add some functionality to DO(). Had to make it const since that's the convention for builder, so just made all the sibling helpers const too. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>	2025-02-24 23:25:06 +00:00
Lionel Landwerlin	ce7208c3ee	brw: add support for texel address lowering The expectations are: - no MSAA images - a single tiling mode is used when not linear Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32676>	2025-02-23 15:16:50 +00:00
Lionel Landwerlin	b25e050ec7	brw: add support for 64bit storage images load/store Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32676>	2025-02-23 15:16:50 +00:00
Lionel Landwerlin	3bd4c5a166	brw: include UGM fence when TGM + lowered image->global Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32676>	2025-02-23 15:16:50 +00:00
Lionel Landwerlin	da098b76a4	brw: store source_hash in prog_data This is a debug feature that we kind of manage in the driver atm. It's better that we move this completely to the compiler and can load it from the cache. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Michael Cheng <michael.cheng@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33643>	2025-02-22 08:30:22 +00:00
Lionel Landwerlin	2f156ddb50	brw: factor out base prog_data setting Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Michael Cheng <michael.cheng@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33643>	2025-02-22 08:30:22 +00:00
Paulo Zanoni	1d23cf192b	brw: don't mark instructions read from text assembly as compacted I dumped assembly generated by our driver with INTEL_DEBUG=shaders, copied and pasted it into a lua file, tried to run it with src/intel/executor, but the disassembler started telling me some instructions were invalid. This happened because we print the "compacted" flag in our assembly text, so when brw_gram.y parses our assembly flag, it sees the "compacted" flag and sets it to the instruction by calling add_instruction_option(). But the executor tool never sets the BRW_ASSEMBLE_COMPACT flag when it calls brw_assemble(), so when brw_assemble() calls dump_assembly(), which calls brw_disassemble(), the disassembler gets confused and prints misinterpreted instructions and calls them invalid. It is not the job of brw_gram.y (our text assembly parser) to mark instructions as compacted. Whatever is later assembling the instruction is the entity that should decide if the instructions are compacted or not. So in this patch we just ignore this flag. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33614>	2025-02-22 00:38:53 +00:00
Georg Lehmann	f26069fdd9	nir: replace nir_opt_conditional_discard with nir_opt_peephole_select Foz-DB Navi21: Totals from 118 (0.15% of 79377) affected shaders: Instrs: 208001 -> 207355 (-0.31%); split: -0.33%, +0.01% CodeSize: 1080428 -> 1078432 (-0.18%); split: -0.20%, +0.02% SpillSGPRs: 202 -> 211 (+4.46%) Latency: 1923508 -> 1919093 (-0.23%); split: -0.62%, +0.39% InvThroughput: 407475 -> 407081 (-0.10%); split: -0.12%, +0.02% SClause: 7050 -> 7033 (-0.24%); split: -0.31%, +0.07% Copies: 12156 -> 11821 (-2.76%); split: -3.04%, +0.28% PreSGPRs: 8198 -> 8331 (+1.62%); split: -0.02%, +1.65% PreVGPRs: 7628 -> 7528 (-1.31%) VALU: 155747 -> 155657 (-0.06%); split: -0.06%, +0.00% SALU: 18295 -> 17782 (-2.80%); split: -2.98%, +0.18% SMEM: 10521 -> 10519 (-0.02%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>	2025-02-20 21:59:17 +00:00
Georg Lehmann	ca8147edbe	nir/peephole_select: add options struct Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>	2025-02-20 21:59:16 +00:00
Paulo Zanoni	55bdae03cc	brw: don't always set cond_modifier on parsed assembly instructions For the instructions we parse with brw_gram.y, don't unconditionally call brw_eu_inst_set_cond_modifier(). Do it like we do in brw_generator::generate_code() and only call it if we have a cond_modifier to set. Why? Because for ONE_SRC instructions, CondCtrl (bits 95:92) only exists if Src.IsImm is false. If Src.IsImm is true, then bits 95:64 are actually Src0.ImmValue[63:32]. If we unconditionally call brw_eu_inst_set_cond_modifier(), we'll end up zeroing bits 95:92 for ONE_SRC instructions with 64bit immediates. See BSpec page Structure_EU_INSTRUCTION_BASIC_ONE_SRC (56880). This issue can be reproduced with src/intel/executor if you try to have the following instruction: mov(16) g10<1>Q 0xfedcba9876543210:Q { align1 WE_all 1H }; our parser will end up zeroing the top bits, so the value of the immediate will be 0x0edcba9876543210. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33559>	2025-02-18 23:44:32 +00:00
Paulo Zanoni	927d7b322b	brw: increase brw_reg::subnr size to 6 bits Since Xe2, the registers are bigger and even the instruction structures got updated to have 6 bits. The way I detected this issue was when I tried to use src/intel/executor to add the following instruction: add(8) g6.8<1>UD g4<8,8,1>UD 0x00000008UD { align1 WE_all 1Q I@1 }; Executor would read this and end up emitting an add with dst being g6<1>UD instead of what we wanted. It turns out that inside brw_gram.y, at dstoperand and dstoperandex we do: $$.subnr = $$.subnr * brw_type_size_bytes($4); which would overflow subnr back to 0. The overflow doesn't seem to be a problem with code we emit directly (unlike the code we parse, like above) due to the fact that we seem to treat Xe2 registers as smaller all the way until we call phys_nr() and phys_subnr() during code generation. The phys_subnr() function can generate a value that would overflow reg.subnr, but this value is never written back to reg.subnr, it's just returned as an unsigned int. Fixes: `e9f63df2f2` ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33539>	2025-02-18 19:38:46 +00:00
Lionel Landwerlin	a9b6a54a8c	brw: fix component packing starting index Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6845dede59` ("brw: add support for no VF input slot compaction") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33553>	2025-02-14 20:17:54 +00:00
Lionel Landwerlin	db53e53bf6	brw: add documentation about slot compaction & component packing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	6845dede59	brw: add support for no VF input slot compaction Normally the driver & compiler work together to use as few 3DSTATE_VERTEX_ELEMENTS/VERTEX_BUFFER_ELEMENT data as possible. The compiler ignores unused bits and driver avoids emitting the corresponding elements in 3DSTATE_VERTEX_ELEMENTS. For device generated commands, we want a 3DSTATE_VERTEX_ELEMENTS programming that is independent from the shader so that we can implement indirect pipeline binding without complicating the generation shader as well as emitting fewer generated commands. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	f19c5f4fcc	brw: use meaningful io locations for system values Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	4f892ae4f7	brw: enable vertex fetching component packing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	9b8d75c95c	brw: add a max HW vertices attribute limit Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	fae8d325a7	brw: update vulkan max attribute limit Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	bae9344baf	brw: port vs input to lower_64bit_to_32_new Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	e9e4aa0f29	brw: remove nr_attribute_slots from vs_prog_data It's not used outside of the compiler. We add a new nr_attribute_regs which now seems useless but will be useful in a later change. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	c00830083e	brw: fix indentation Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Daniel Schürmann	175c06e5cd	intel: switch to nir_metadata_divergence Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30814>	2025-02-13 10:08:43 +00:00
Sagar Ghuge	2e0d5ccd91	intel/compiler: Drop primitive leaf desc load code Looks like we are not using the primitive leaf desc loading code part at all. Let's just drop it. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33497>	2025-02-12 05:23:05 +00:00
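
Two of the assembly-parser fixes listed above (55bdae03cc and 927d7b322b) come down to small bit-field arithmetic. The C sketch below reproduces only that arithmetic with the values quoted in the commit messages; it is not Mesa code. The function names are hypothetical, the instruction is modeled as just the 32-bit slice covering bits 95:64, and the previous 5-bit width of subnr is inferred from the described overflow rather than stated in the log.

```c
#include <stdint.h>

/* Hypothetical helper, not Mesa's brw_eu_inst_set_cond_modifier().
 * Models only instruction bits 95:64 as one 32-bit word, so CondCtrl
 * (instruction bits 95:92) maps to bits 31:28 of that word. */
static uint32_t set_cond_ctrl(uint32_t bits_95_64, uint32_t cond)
{
   return (bits_95_64 & ~(0xfu << 28)) | ((cond & 0xfu) << 28);
}

/* For a ONE_SRC instruction with a 64-bit immediate, bits 95:64 hold
 * Src0.ImmValue[63:32]. Returns the immediate as it would be read back
 * after encoding. */
static uint64_t imm64_after_encoding(uint64_t imm, int always_set_cond)
{
   uint32_t hi = (uint32_t)(imm >> 32);
   uint32_t lo = (uint32_t)imm;

   if (always_set_cond)
      hi = set_cond_ctrl(hi, 0); /* overwrites immediate bits 63:60 */

   return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical model of storing a byte offset in a bit-field that is
 * field_bits wide: the value is truncated to the field width. */
static unsigned stored_subnr(unsigned subnr, unsigned type_size_bytes,
                             unsigned field_bits)
{
   unsigned byte_offset = subnr * type_size_bytes;
   return byte_offset & ((1u << field_bits) - 1u);
}
```

With the values from the commit messages, imm64_after_encoding(0xfedcba9876543210, 1) gives 0x0edcba9876543210, while passing 0 for always_set_cond leaves the immediate intact. For g6.8 with a 4-byte UD type, stored_subnr(8, 4, 5) wraps to 0, whereas stored_subnr(8, 4, 6) keeps the byte offset 32.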

4403 commits