fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-15 07:48:04 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	335cf5f22f	nir: Use a tagged pointer for nir_src parents This allows us to pack the is_if boolean into the bottom bit of the parent pointer, eliminating the boolean and hence shrinking the nir_src by 8 bytes (due to the extra 63 bits of padding incurred in the old layout). Because all access is forced through helpers now, this is a local change. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig	316af8c965	nir: Assert the nir_src union is used safely It is undefined behaviour in C to read a different member of a union than was written. Nothing in-tree should be using this behaviour with the nir_src union: nir_if should never be read as nir_instr and vice versa. Assert this. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig	c39896b17b	nir: Use getters for nir_src::parent_* First, we need to give the parent_instr field a unique name to be able to replace with a helper. We have parent_instr fields for both nir_src and nir_def, so let's rename nir_src::parent_instr in preparation for rework. This was done with a combination of sed and manual fix-ups. Then we use semantic patches plus manual fixups: @@ expression s; @@ -s->renamed_parent_instr +nir_src_parent_instr(s) @@ expression s; @@ -s.renamed_parent_instr +nir_src_parent_instr(&s) @@ expression s; @@ -s->parent_if +nir_src_parent_if(s) @@ expression s; @@ -s.renamed_parent_if +nir_src_parent_if(&s) @@ expression s; @@ -s->is_if +nir_src_is_if(s) @@ expression s; @@ -s.is_if +nir_src_is_if(&s) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig	ad619da3bc	nir: Use set_parent_instr internally This properly clears is_if. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:04 -04:00
Alyssa Rosenzweig	19f8e0e3aa	nir: Add trivial nir_src_* getters These will become nontrivial later in the series. For now these have no smarts in them, in order to make the conversion completely mechanical. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:04 -04:00
Iván Briano	987749430d	nir: round f2f16{_rtne/_rtz} correctly for constant expressions As noted in the previous commit, the intermediate cast to float from double can produce wrong results. Fixes upcoming Vulkan CTS tests: dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_frag dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage_frag Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>	2023-10-09 23:37:52 +00:00
Iván Briano	c8a8b09c15	nir/lower_int64: respect rounding mode when casting to float Appendix A: Vulkan environment for SPIR-V says: Operations described as “correctly rounded” will return the infinitely precise result, x, rounded so as to be representable in floating-point. The rounding mode is not specified, unless the entry point is declared with the RoundingModeRTE or the RoundingModeRTZ Execution Mode. Conversion between types are classified as correctly rounded, so let's do rounding correctly. v2: check rounding mode for destination bit size (Georg) Fixes upcoming Vulkan CTS tests: dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up_frag dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up_frag Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>	2023-10-09 23:37:52 +00:00
antonino	a2e96a86e1	nir: fix several crashes in `nir_lower_tex` This patch fixes the following issues that lead to crashes in some cases: * an instruction is inserted to get texture lod that depends on a texture instruction that hasn't been inserted yet. * this code tries to read channel 1 of the lod, but lod is scalar * the code assumed there would only be 2 srcs, this isn't the case when bindless is used. Fixes: `b154a4154b` ("nir/lower_tex: rewrite tex/txb -> txd/txl before saturating srcs") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25621>	2023-10-09 17:31:34 +00:00
Marek Olšák	348eee9c97	nir: handle nir_var_mem_ubo in nir_clone_uniform_variable for UBOs Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Marek Olšák	b47b8d16d9	nir: expose reusable linking helpers for cloning uniform loads for the new varying optimizer Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Marek Olšák	b1bbe4e190	nir: gather dual slot input information Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Marek Olšák	cb66fddd81	nir: take dual slot input info into account when computing IO driver locations Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Marek Olšák	0f2491cbdd	nir: add dual-slot input information into load_input intrinsics This is necessary to allow optimizing VS inputs after nir_lower_io, which is currently impossible because the loss of dual-slot information in NIR would break VS inputs. With this, driver locations can be recomputed by calling nir_recompute_io_bases. It's a prerequisite for optimizing varyings with lowered IO. When this is used, we will be able to eliminate unused dual-slot VS inputs as well as unused low and high halves of dual-slot VS inputs for the first time, which can happen due to optimizations of varyings. Without this, st/mesa binds vertex buffers for dual-slot inputs that are fully or partially unused in the shader. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Marek Olšák	97f3fdadca	nir: recompute IO bases after DCE in nir_lower_io_passes otherwise the IO bases can be incorrect due to non-DCE'd input loads Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Marek Olšák	f37e32b78b	nir: sort variables by location in nir_lower_io_passes to work around a bug I don't know why this is necessary, but it unblocks the work on varying optimizations. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>	2023-10-07 11:18:40 +00:00
Konstantin Seurer	4625e18619	nir/passthrough_gs: Support edge flags with points Fixes: `24535ff` ("nir: handle edge flags in nir_create_passthrough_gs") Reviewed-by: Antonino Maniscalco <antonino.maniscalco@collabora.com> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25335>	2023-10-04 23:20:52 +00:00
Rhys Perry	ad5be40303	nir: add fetch inactive index to quad_swizzle_amd/masked_swizzle_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>	2023-10-04 18:53:43 +00:00
Danylo Piliaiev	a5f0f7d4b1	turnip,ir3: Implement A7XX push consts load via preamble New push consts loading consist of: - Push consts are set for the entire pipeline via HLSQ_SHARED_CONSTS_IMM array which could fit up to 256b of push consts. - For each shader stage that uses push consts READ_IMM_SHARED_CONSTS should be set in HLSQ_*_CNTL, otherwise push consts may get overwritten by new push consts that are set after the draw. - Push consts are loaded into consts reg file in a shader preamble via stsc at the very start of the preamble. OPC_PUSH_CONSTS_LOAD_MACRO is used instead of directly translating NIR intrinsic into stsc because: we don't want to teach legalize pass how to set (ss) between stores and loads of consts reg file, don't want for stsc to be reordered, etc. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25086>	2023-10-04 15:51:54 +00:00
Georg Lehmann	bd16d3cdaf	nir/lower_subgroups: use intrinsic builder more Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25501>	2023-10-03 12:49:28 +00:00
Georg Lehmann	289b369597	nir: make quad intrinsic dst bit size match src0 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25501>	2023-10-03 12:49:28 +00:00
Rhys Perry	7139a78959	nir/constant_folding: remove zero texel offset fossil-db (navi31): Totals from 7 (0.01% of 79330) affected shaders: Instrs: 7001 -> 6993 (-0.11%) CodeSize: 35736 -> 35692 (-0.12%) InvThroughput: 3232 -> 3229 (-0.09%) Copies: 552 -> 549 (-0.54%) PreSGPRs: 277 -> 273 (-1.44%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25477>	2023-10-02 10:11:37 +00:00
Georg Lehmann	305db1af11	nir: scalarize masked_swizzle_amd created from shuffle_xor Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9901 Fixes: `0ef87f148d` ("nir/lower_subgroups: Don't do multiple lowerings at once") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25468>	2023-10-02 09:01:18 +00:00
Alyssa Rosenzweig	10b9c2fa36	nir: Support arrays in block_image_store_agx For layered rendering, runs once per layer. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	f4042afd57	nir: Add layer_id_written_agx sysval We'll implement layer ID reads in the frag shader with a varying read, but if the VS doesn't write the varying we need to return 0 per the spec. Add a sysval to detect that case so we can handle it at runtime without keys. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Timothy Arceri	f2e87c5c28	nir: add used field to nir variables Will be used in a following patch by the glsl nir based linker. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25371>	2023-09-28 13:55:16 +00:00
Timothy Arceri	337c32cb3a	nir: copy explicit_invariant flag to nir vars This will be used in the following patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25371>	2023-09-28 13:55:16 +00:00
Caio Oliveira	af3eb80afa	nir: Handle cooperative matrix in various passes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23825>	2023-09-28 07:35:02 +00:00
Caio Oliveira	3105d516d0	nir: Add new intrinsics for Cooperative Matrix Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23825>	2023-09-28 07:35:02 +00:00
Caio Oliveira	2d0f4f2c17	compiler/types: Add support for Cooperative Matrix types Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23825>	2023-09-28 07:35:02 +00:00
Timothy Arceri	1780102923	nir: fix typo in comment The variable is unused or dead, not used. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25414>	2023-09-28 01:54:43 +00:00
Rhys Perry	65afc8bebf	nir/algebraic: optimize u2u32(a >> 32) fossil-db (navi21): Totals from 352 (0.44% of 79330) affected shaders: Instrs: 271816 -> 271240 (-0.21%); split: -0.28%, +0.07% CodeSize: 1546520 -> 1544448 (-0.13%); split: -0.23%, +0.09% SpillVGPRs: 832 -> 827 (-0.60%); split: -1.08%, +0.48% Latency: 4037120 -> 4021748 (-0.38%); split: -0.41%, +0.03% InvThroughput: 1369540 -> 1362066 (-0.55%); split: -0.59%, +0.04% VClause: 6476 -> 6471 (-0.08%); split: -0.12%, +0.05% SClause: 6798 -> 6794 (-0.06%) Copies: 44828 -> 44630 (-0.44%); split: -0.89%, +0.45% Branches: 8845 -> 8844 (-0.01%); split: -0.05%, +0.03% PreSGPRs: 14684 -> 14659 (-0.17%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25409>	2023-09-27 22:13:01 +00:00
Rhys Perry	bcdac65ca3	nir/lower_int64: fix find_lsb(0) If the high 32 bits were zero, this would be umin(find_lsb(lo), 31). This evaluates to 31 if lo is also zero, instead of -1. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `9293d8e64b` ("nir: Add find_lsb lowering to nir_lower_int64.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25409>	2023-09-27 22:13:01 +00:00
Karol Herbst	807ff7ed01	nir: add nir_lower_alu_vec8_16_srcs pass This pass is useful for vector based backends as we might end up with alu instructions referencing vec8/vec16 values even though being vec4 or smaller themselves. This new pass intends to clean up any use of vec8/vec16 sources other passes won't. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25330>	2023-09-27 11:54:13 +00:00
Samuel Pitoiset	1ce80653b2	nir: rename atomic_add_gs_invocation_count_amd to make it more generic It will be re-used to implement mesh/task shader invocations queries. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25331>	2023-09-26 07:50:15 +00:00
Caio Oliveira	63ab985511	util: Use an opaque type for linear context In the linear allocation only the parent (context) can be used to allocate new children, so let's use an opaque type to identify the linear context. This is similar to what's done in GC allocator. Update the documentation and a couple of function names to refer to linear context instead of linear parent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>	2023-09-25 17:26:17 +00:00
Caio Oliveira	aec516ead6	util: Remove size from linear_parent creation None of the callsites took advantage of this, so remove the feature. This will help to a next change that will add an opaque type to represent a linear parent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>	2023-09-25 17:26:17 +00:00
Caio Oliveira	3988d901ac	meson: Remove unnecessary inc_compiler mentions The inc_compiler should come as part of idep_compiler, idep_nir or idep_nir_headers dependency. Acked-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v3dv) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25314>	2023-09-22 14:52:50 +00:00
Caio Oliveira	ec835595f0	compiler: Use a meson dependency for libcompiler That will make sure the include directories are passed on and also make sure the generated headers are properly built before whoever code depends on it. NIR dependency propagates that dependency too. Since the right include directory is always propagated, we can remove the extra "compiler/" prefix from the `#include`s in glsl_types.h. Note: NIR has a special "header only" dependency, so include the generated headers for compiler there too. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9843 Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25314>	2023-09-22 14:52:50 +00:00
Konstantin Seurer	be8a73f40d	nir/deref: Layer rematerialization helpers nir_rematerialize_derefs_in_use_blocks_impl can be implemented on top of nir_rematerialize_deref_in_use_blocks. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>	2023-09-22 10:05:58 +00:00
Konstantin Seurer	439e8c42cc	nir/lcssa: Fix rematerializing derefs This would pull derefs out of loops by emitting the pattern `deref(phi(deref))` which is not allowed by nir_validate. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>	2023-09-22 10:05:58 +00:00
Konstantin Seurer	29dc1b193a	nir: Add nir_rematerialize_deref_in_use_blocks nir_rematerialize_deref_in_use_blocks can be used in passes that don't run on the whole function. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>	2023-09-22 10:05:58 +00:00
Rhys Perry	ba809dccb8	nir/deref: remove rematerialize_deref_in_block cache Nothing was ever inserted into this. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>	2023-09-22 10:05:58 +00:00
Konstantin Seurer	ab1310e84d	nir: Add nir_foreach_block_in_cf_node_reverse Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>	2023-09-22 10:05:58 +00:00
Konstantin Seurer	70e497a2ac	nir: Add nir_cf_node_cf_tree_prev Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>	2023-09-22 10:05:58 +00:00
Ian Romanick	2157f136d7	nir/rematerialize: Rematerialize ALUs used only by compares with zero This was 4th on the list of things to try in `3ee2e84c60` ("nir: Rematerialize compare instructions"). This is implemented as a separate subpass that tries to find ALU instructions (with restrictions) that are only used by comparisons with zero that are in turn only used as conditions for bcsel or if-statements. There are two restrictions implemented. One of the sources must be a constant. This is done in an attempt to prevent increasing register pressure. Additionally, the opcode of the instruction must be one that has a high probability of getting a conditional modifier on Intel GPUs. Not all instructions can have conditional modifiers (e.g., min and max), so I don't think there is any benefit to moving these instructions. v2: Rebase on many, many recent NIR infrastructure changes. v3: Make data in commit message more clear. Suggested by Matt. Rebase on `b5d6b7c402` ("nir: Drop most uses if nir_instr_rewrite_src()"). All of the affected shaders on ILK and G45 are in CS:GO. There is some brief analysis of the changes in the MR. Reviewed-by: Matt Turner <mattst88@gmail.com> Shader-db results: DG2 total instructions in shared programs: 22824637 -> 22824258 (<.01%) instructions in affected programs: 365742 -> 365363 (-0.10%) helped: 190 / HURT: 97 total cycles in shared programs: 832186193 -> 832157290 (<.01%) cycles in affected programs: 41245259 -> 41216356 (-0.07%) helped: 208 / HURT: 117 total spills in shared programs: 4072 -> 4060 (-0.29%) spills in affected programs: 366 -> 354 (-3.28%) helped: 4 / HURT: 2 total fills in shared programs: 3601 -> 3607 (0.17%) fills in affected programs: 708 -> 714 (0.85%) helped: 4 / HURT: 2 LOST: 0 GAINED: 1 Tiger Lake and Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 20320934 -> 20320689 (<.01%) instructions in affected programs: 236592 -> 236347 (-0.10%) helped: 176 / HURT: 29 total cycles in shared programs: 849846341 -> 849843856 (<.01%) cycles in affected programs: 41277336 -> 41274851 (<.01%) helped: 195 / HURT: 110 LOST: 0 GAINED: 1 Skylake total instructions in shared programs: 18550811 -> 18550470 (<.01%) instructions in affected programs: 233908 -> 233567 (-0.15%) helped: 182 / HURT: 25 total cycles in shared programs: 835910983 -> 835889167 (<.01%) cycles in affected programs: 38764359 -> 38742543 (-0.06%) helped: 207 / HURT: 94 total spills in shared programs: 4522 -> 4506 (-0.35%) spills in affected programs: 324 -> 308 (-4.94%) helped: 4 / HURT: 0 total fills in shared programs: 5296 -> 5280 (-0.30%) fills in affected programs: 324 -> 308 (-4.94%) helped: 4 / HURT: 0 LOST: 0 GAINED: 1 Broadwell total instructions in shared programs: 18199130 -> 18197920 (<.01%) instructions in affected programs: 214664 -> 213454 (-0.56%) helped: 191 / HURT: 0 total cycles in shared programs: 935131908 -> 934870248 (-0.03%) cycles in affected programs: 75770568 -> 75508908 (-0.35%) helped: 203 / HURT: 84 total spills in shared programs: 13896 -> 13734 (-1.17%) spills in affected programs: 162 -> 0 helped: 3 / HURT: 0 total fills in shared programs: 16989 -> 16761 (-1.34%) fills in affected programs: 228 -> 0 helped: 3 / HURT: 0 Haswell total instructions in shared programs: 16969502 -> 16969085 (<.01%) instructions in affected programs: 185498 -> 185081 (-0.22%) helped: 121 / HURT: 1 total cycles in shared programs: 925290863 -> 924806827 (-0.05%) cycles in affected programs: 30200863 -> 29716827 (-1.60%) helped: 100 / HURT: 85 total spills in shared programs: 13565 -> 13533 (-0.24%) spills in affected programs: 736 -> 704 (-4.35%) helped: 8 / HURT: 0 total fills in shared programs: 15468 -> 15436 (-0.21%) fills in affected programs: 740 -> 708 (-4.32%) helped: 8 / HURT: 0 LOST: 0 GAINED: 1 Ivy Bridge total instructions in shared programs: 15839127 -> 15838947 (<.01%) instructions in affected programs: 77776 -> 77596 (-0.23%) helped: 58 / HURT: 0 total cycles in shared programs: 459852774 -> 459739770 (-0.02%) cycles in affected programs: 11970210 -> 11857206 (-0.94%) helped: 79 / HURT: 53 Sandy Bridge total instructions in shared programs: 14106847 -> 14106831 (<.01%) instructions in affected programs: 1611 -> 1595 (-0.99%) helped: 10 / HURT: 0 total cycles in shared programs: 775004024 -> 775007516 (<.01%) cycles in affected programs: 2530686 -> 2534178 (0.14%) helped: 55 / HURT: 48 Iron Lake total cycles in shared programs: 257753356 -> 257754900 (<.01%) cycles in affected programs: 2977374 -> 2978918 (0.05%) helped: 12 / HURT: 106 GM45 total cycles in shared programs: 169711382 -> 169712816 (<.01%) cycles in affected programs: 2402070 -> 2403504 (0.06%) helped: 12 / HURT: 57 Fossil-db results: All Intel platforms had similar results. (DG2 shown) Totals: Instrs: 193884596 -> 193465896 (-0.22%); split: -0.25%, +0.03% Cycles: 14050193354 -> 14048194826 (-0.01%); split: -0.34%, +0.33% Spill count: 114944 -> 100449 (-12.61%); split: -13.59%, +0.98% Fill count: 201525 -> 179534 (-10.91%); split: -11.22%, +0.31% Scratch Memory Size: 10028032 -> 8468480 (-15.55%) Totals from 16912 (2.59% of 653124) affected shaders: Instrs: 34173709 -> 33755009 (-1.23%); split: -1.41%, +0.19% Cycles: 2945969110 -> 2943970582 (-0.07%); split: -1.62%, +1.55% Spill count: 97753 -> 83258 (-14.83%); split: -15.98%, +1.15% Fill count: 176355 -> 154364 (-12.47%); split: -12.82%, +0.35% Scratch Memory Size: 8619008 -> 7059456 (-18.09%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20176>	2023-09-21 16:58:29 +00:00
Connor Abbott	4282386311	nir/spirv: Add inverse_ballot intrinsic This is actually a no-op on AMD, so we really don't want to lower it to something more complicated. There may be a more efficient way to do this on Intel too. In addition, in the future we'll want to use this for lowering boolean reduce operations, where the inverse ballot will operate on the backend's "natural" ballot type as indicated by options->ballot_bit_size, instead of uvec4 as produced by SPIR-V. In total, there are now three possible lowerings we may have to perform: - inverse_ballot with source type of uvec4 from SPIR-V to inverse_ballot with natural source type, when the backend supports inverse_ballot natively. - inverse_ballot with source type of uvec4 from SPIR-V to arithmetic, when the backend doesn't support inverse_ballot. - inverse_ballot with natural source type from reduce operation, when the backend doesn't support inverse_ballot. Previously we just did the second lowering unconditionally in vtn, but it's just a combination of the first and third. We add support here for the first and third lowerings in nir_lower_subgroups, instead of simply moving the second lowering, to avoid unnecessary churn. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>	2023-09-20 14:41:18 +00:00
Connor Abbott	0ef87f148d	nir/lower_subgroups: Don't do multiple lowerings at once Since using nir_shader_lower_instructions(), instructions get revisited before proceeding with the next one. This already guarantees that any subsequent lowerings of those instructions happen during the same pass of nir_lower_subgroups(). v2: use nir_shader_lower_instructions() instead of setting the cursor. Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>	2023-09-20 14:41:18 +00:00
Pavel Ondračka	1c72c71bdf	nir/move_vec_src_uses_to_dest: allow to skip reuse of constant sources And enable this for r300 and intel-vec4 crocus HSW (mostly helps a few dolphin ubershaders): total instructions in shared programs: 1576736 -> 1576589 (<.01%) instructions in affected programs: 38235 -> 38088 (-0.38%) helped: 12 HURT: 0 total cycles in shared programs: 111025838 -> 110944796 (-0.07%) cycles in affected programs: 5646582 -> 5565540 (-1.44%) helped: 15 HURT: 6 total spills in shared programs: 447 -> 432 (-3.36%) spills in affected programs: 186 -> 171 (-8.06%) helped: 12 HURT: 0 total fills in shared programs: 792 -> 774 (-2.27%) fills in affected programs: 291 -> 273 (-6.19%) helped: 12 HURT: 0 r300 RV530: total instructions in shared programs: 96655 -> 96304 (-0.36%) instructions in affected programs: 15020 -> 14669 (-2.34%) helped: 79 HURT: 18 total temps in shared programs: 13027 -> 12952 (-0.58%) temps in affected programs: 677 -> 602 (-11.08%) helped: 41 HURT: 9 total cycles in shared programs: 147745 -> 147314 (-0.29%) cycles in affected programs: 21831 -> 21400 (-1.97%) helped: 84 HURT: 19 r300 RV370: total instructions in shared programs: 63678 -> 63669 (-0.01%) instructions in affected programs: 931 -> 922 (-0.97%) helped: 12 HURT: 6 total temps in shared programs: 10028 -> 10013 (-0.15%) temps in affected programs: 339 -> 324 (-4.42%) helped: 33 HURT: 10 total cycles in shared programs: 101118 -> 101087 (-0.03%) cycles in affected programs: 2659 -> 2628 (-1.17%) helped: 22 HURT: 6 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>	2023-09-19 18:05:37 +02:00
Pavel Ondračka	dc60194599	nir/move_vec_src_uses_to_dest: skip reuse if vec is used only once in store_output lima and etnaviv show no change in shader-db. crocus HSW: total instructions in shared programs: 1576762 -> 1576736 (<.01%) instructions in affected programs: 485 -> 459 (-5.36%) helped: 28 HURT: 1 total cycles in shared programs: 111025898 -> 111025838 (<.01%) cycles in affected programs: 1248 -> 1188 (-4.81%) helped: 29 HURT: 0 RV370: total instructions in shared programs: 63889 -> 63558 (-0.52%) instructions in affected programs: 9116 -> 8785 (-3.63%) helped: 129 HURT: 0 total temps in shared programs: 10071 -> 10016 (-0.55%) temps in affected programs: 285 -> 230 (-19.30%) helped: 51 HURT: 0 total cycles in shared programs: 101344 -> 100997 (-0.34%) cycles in affected programs: 9326 -> 8979 (-3.72%) helped: 129 HURT: 0 RV530: total instructions in shared programs: 93597 -> 93267 (-0.35%) instructions in affected programs: 10309 -> 9979 (-3.20%) helped: 166 HURT: 0 total temps in shared programs: 13019 -> 12955 (-0.49%) temps in affected programs: 337 -> 273 (-18.99%) helped: 61 HURT: 1 total cycles in shared programs: 144506 -> 144159 (-0.24%) cycles in affected programs: 10662 -> 10315 (-3.25%) helped: 165 HURT: 0 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>	2023-09-19 18:05:30 +02:00
Dave Airlie	51840bbdce	nir: add a deref slot counter that handles compact Connor suggested this, so we can mark slots properly in the io marking. This fixes a problem seen when rewriting llvmpipe to use nir info instead of tgsi info. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24803>	2023-09-18 16:47:30 +00:00
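
The find_lsb(0) fix in `bcdac65ca3` above can be illustrated with a small standalone C sketch. This is not the NIR lowering code: the helper names (`find_lsb32`, `umin32`, `find_lsb64_buggy`, `find_lsb64_fixed`) are hypothetical, the `find_lsb(hi) + 32` term is an assumption chosen so that the expression reduces to `umin(find_lsb(lo), 31)` when the high half is zero, as the commit message describes, and the zero check is only one possible way to restore the -1 result.

```c
#include <stdint.h>

/* Hypothetical 32-bit find_lsb: index of the lowest set bit, -1 for zero. */
static int32_t find_lsb32(uint32_t v)
{
   if (v == 0)
      return -1;

   int32_t index = 0;
   while ((v & 1u) == 0) {
      v >>= 1;
      index++;
   }
   return index;
}

/* Unsigned minimum, so that -1 (0xffffffff) compares as the largest value. */
static uint32_t umin32(uint32_t a, uint32_t b)
{
   return a < b ? a : b;
}

/* Assumed shape of the problematic lowering: with hi == 0 the second operand
 * wraps from 0xffffffff + 32 to 31, giving umin(find_lsb(lo), 31). */
static int32_t find_lsb64_buggy(uint64_t x)
{
   uint32_t lo = (uint32_t)x;
   uint32_t hi = (uint32_t)(x >> 32);
   uint32_t lo_lsb = (uint32_t)find_lsb32(lo);
   uint32_t hi_lsb = (uint32_t)find_lsb32(hi) + 32u;

   return (int32_t)umin32(lo_lsb, hi_lsb);
}

/* One possible fix (an assumption, not necessarily what the commit does):
 * handle the all-zero input explicitly before combining the halves. */
static int32_t find_lsb64_fixed(uint64_t x)
{
   if (x == 0)
      return -1;
   return find_lsb64_buggy(x);
}
```

Under these assumptions, `find_lsb64_buggy(0)` yields 31 while `find_lsb64_fixed(0)` yields -1, and both agree for every nonzero input.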

4898 commits