fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 16:08:06 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	09d31922de	nir: Drop "SSA" from NIR language Everything is SSA now. sed -e 's/nir_ssa_def/nir_def/g' \ -e 's/nir_ssa_undef/nir_undef/g' \ -e 's/nir_ssa_scalar/nir_scalar/g' \ -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \ -e 's/nir_gather_ssa_types/nir_gather_types/g' \ -i $(git grep -l nir \| grep -v relnotes) git mv src/compiler/nir/nir_gather_ssa_types.c \ src/compiler/nir/nir_gather_types.c ninja -C build/ clang-format cd src/compiler/nir && find .c .h -type f -exec clang-format -i \{} \; Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>	2023-08-12 16:44:41 -04:00
Mike Blumenkrantz	e9a5da2f4b	nir: add a filter cb to lower_io_to_scalar this is useful for drivers that want to do selective scalarization of io Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24565>	2023-08-11 09:02:53 +00:00
Iago Toral Quiroga	da625903c7	squash! v3dv,broadcom/compiler: don't abuse sampler index For tex instructions that don't have sampler state use backend_flags instead of sampler index to bind default sampler state. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24537>	2023-08-10 07:10:01 +00:00
Alyssa Rosenzweig	95e3df39c0	treewide: sed out more is_ssa Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	a8013644a1	nir: Drop nir_alu_src::{negate,abs} Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	ab0d878932	treewide: Remove more is_ssa asserts Stuff Coccinelle missed. sed -i -e '/assert(.\.is_ssa)/d' $(git grep -l is_ssa) sed -i -e '/ASSERT.\.is_ssa)/d' $(git grep -l is_ssa) + a manual fixup to restore the assert for parallel copy lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	5fead24365	treewide: Drop is_ssa asserts We only see SSA now. Via Coccinelle patch: @@ expression x; @@ -assert(x.is_ssa); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	d559764e7c	nir: Remove nir_alu_dest::saturate Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	91f676819c	vc4,broadcom/compiler: Drop write_mask handling There's no legacy register support so we necessarily write a contiguous vector. In other words, the write_mask is of the form `(1 << x) - 1`. Meanwhile this code asserts the write mask is of the form `(1 << x)`. Putting it together the write mask is necessarily always 0x1, writing out a single scalar. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24471>	2023-08-03 13:06:38 +00:00
Iago Toral Quiroga	f0e603583e	broadcom/compiler: drop execution environment from the shader key We are no longer using this for anything. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:41 +00:00
Iago Toral Quiroga	b95bb44c61	broadcom/compiler: always clamp results from logic ops We have also been clamping our integer RTs in GL for a while now. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:41 +00:00
Iago Toral Quiroga	87e167baa1	broadcom/compiler: move vulkan's point coord lowering to the driver Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:40 +00:00
Iago Toral Quiroga	59018b0228	broadcom/compiler: move uniform offset lowering from compiler to GL driver We only need this in GL so move it there. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:40 +00:00
Iago Toral Quiroga	f5931ba6d8	broadcom/compiler: use NIR's lowering for dispatch base Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:40 +00:00
Iago Toral Quiroga	9211b9afdf	broadcom/compiler: stop asserting on Vulkan environment The idea is to eventually get rid of key->environment. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:40 +00:00
Iago Toral Quiroga	e941732ab1	v3dv: stop incrementing UBO indices by one This matches what we do for OpenGL, allowing us to have the same compiler behavior for both worlds. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24396>	2023-08-03 06:32:40 +00:00
Alyssa Rosenzweig	17d66055ae	nir: Remove reg_intrinsics parameter to convert_from_ssa All users must set it. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24450>	2023-08-02 10:26:45 -04:00
Alyssa Rosenzweig	51db19f7a2	nir: Rename scoped_barrier -> barrier sed + ninja clang-format + fix up spacing for common code. If you are unhappy that I did not manually change the whitespace of your driver, you need to enable clang-format for it so the formatting would happen automatically. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>	2023-08-01 23:18:29 +00:00
Faith Ekstrand	d89ca14e71	broadcom/compiler: Convert to new-style NIR registers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>	2023-07-25 15:36:52 +00:00
Alyssa Rosenzweig	03b2c34793	nir: Remove register arrays Nothing produces them any more, so remove them from NIR. This massively reduces the size of nir_src, which should improve performance all over. nir_src size reduced from 56 bytes -> 40 bytes (pahole results on arm64, x86_64 should be similar.) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>	2023-07-21 11:25:49 +00:00
Faith Ekstrand	73e191924c	nir: Add a reg_intrinsics flag to nir_convert_from_ssa It doesn't do anything yet. We leave that to the subsequent patches so we can keep the tree-wide refactor as simple as possible. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>	2023-07-12 01:34:27 +00:00
Iago Toral Quiroga	be91133f87	broadcom/compiler: don't leak v3d_compile when finding a new best strategy If we had selected a best strategy and find an even better one we need to make sure we free the previous one. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24001>	2023-07-05 21:52:01 +00:00
Iago Toral Quiroga	dcc6288a13	broadcom/compiler: free defin and defout arrays if they already exist Just like we do for everything else here, since we are going to realloc them again right below. Notice this is not exactly a memory leak, since all these arrays are allocated with ralloc using v3d_compile as context, so all allocations will be eventually freed when the context is destroyed. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24001>	2023-07-05 21:52:01 +00:00
Yonggang Luo	8f8ea2dd68	broadcom: Switch to use nir_foreach_function_impl Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23988>	2023-07-04 10:47:26 +00:00
Yonggang Luo	edb607ed9f	v3d: Switch to use nir_foreach_function_impl Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23988>	2023-07-04 10:47:26 +00:00
Iago Toral Quiroga	1f8ecd3ae0	broadcom: use nir info to keep track of implicit sample shading It seems NIR is tracking this for us now so we can stop doing this in the backend. Also, new CTS tests seem to add the requirement where in the presence of some builtins like gl_SampleID in a shader, even if unused, sample shading is expected to be enabled, which is something we can't track in the backend since the variable may have been dropped by then. Fixes 2 failures in: dEQP-VK.draw.renderpass.implicit_sample_shading.sample* Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23984>	2023-07-04 08:54:43 +00:00
Konstantin Seurer	5c8c2ec85c	v3d: Use nir_builder_at Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23883>	2023-07-03 15:21:37 +00:00
Alyssa Rosenzweig	a64f860acb	broadcom/compiler: Use nir_steal_tex_src It's great for passes like these. Noticed while in the area. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23895>	2023-06-29 22:36:50 +00:00
Alyssa Rosenzweig	5623f6571b	broadcom/compiler: Remove unused #define Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23895>	2023-06-29 22:36:50 +00:00
Alyssa Rosenzweig	4601517f54	broadcom/compiler: Remove v3d_nir_lower_robust_access Now unused. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23895>	2023-06-29 22:36:50 +00:00
Alyssa Rosenzweig	596176a720	broadcom/compiler: Use nir_lower_robust_access The common code version, instead of the vendor version. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23895>	2023-06-29 22:36:50 +00:00
Yonggang Luo	38935d9789	broadcom: replace redefined ALIGN() macro with common util functions `cl_aligned_packet_length()` expands literals, so use ALIGN_POT to compute it at compile time. `v3dv_AllocateMemory()` uses a 64-bit `allocationSize`, so use `align64()`. `v3d_lower_nir()` uses a 32-bit `shared_size`, so use `align()`. Extracted out of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23932 for easier review. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23938>	2023-06-29 21:12:07 +00:00
Yonggang Luo	62ce223245	treewide: Switch to use nir_foreach_function_with_impl when possible Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23903>	2023-06-29 08:36:03 +00:00
Erik Faye-Lund	b3b3be55c4	broadcom/compiler: use imm-helpers Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23855>	2023-06-29 07:08:18 +00:00
Alyssa Rosenzweig	815efcdf7e	nir: Use nir_builder_create perl -p0e 's/nir_builder ([^;]);\snir_builder_init\(&\1, /nir_builder \1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>	2023-06-27 18:13:02 +00:00
Caio Oliveira	59cc77f0fa	compiler: Move from nir_scope to mesa_scope Just moving the enum and performing renames, no behavior change. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>	2023-06-19 23:29:26 +00:00
Iago Toral Quiroga	e31aff59d8	broadcom/compiler: handle textureGatherOffsets There is a lowering in NIR for this so we just need to enable it. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23616>	2023-06-19 08:13:06 +00:00
Erik Faye-Lund	2a71e332aa	nir: use new immediate comparison helpers There's plenty of places we can use these new and shiny helpers, so let's clean up the code a bit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23460>	2023-06-15 13:33:58 +02:00
Iago Toral Quiroga	6114e66124	broadcom/compiler: only use last thread switch flag to detect final section Since commit 'c98ddc778a3 broadcom/compiler: force a last thrsw for spilling' we always ensure we signal the last thread section explicitly with a last thread switch. Relying on VPM stores to detect the last thread section is particularly bad, because we can have VPM stores occurring quite early in a shader program, which would disable TMU spilling almost entirely. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22461>	2023-06-14 09:27:50 +00:00
Alejandro Piñeiro	dfdbf5bf94	broadcom/compiler: clarify use of QFILE_VPM This was only used for version < 40 (See commit `22a02f3e3`). Adding some extra explanations and asserts of places where it is used. As we are here also move the definition of a register with QFILE_VPM, to avoid defining it if not needed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22984>	2023-06-14 09:03:35 +00:00
Emma Anholt	0cffef54e5	v3d: Respect nir_intrinsic_store_output's write_mask. Usually lower_io_to_temps sorts this out for us so you only get full writes, but we should be able to handle it without that. Avoids a regression with the mesa/st PBO VS with layer output. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>	2023-06-12 17:37:54 +00:00
Juan A. Suarez Romero	7a21b59df9	v3d: handle samplerExternalOES Add handling for GLSL_SAMPLER_DIM_EXTERNAL. Fixes `spec@oes_egl_image_external_essl3@oes_egl_image_external_essl3`. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23469>	2023-06-07 09:20:32 +00:00
Alyssa Rosenzweig	99a00e2247	treewide: Use nir_trim_vector more Via Coccinelle patches @@ expression a, b, c; @@ -nir_channels(b, a, (1 << c) - 1) +nir_trim_vector(b, a, c) @@ expression a, b, c; @@ -nir_channels(b, a, BITFIELD_MASK(c)) +nir_trim_vector(b, a, c) @@ expression a, b; @@ -nir_channels(b, a, 3) +nir_trim_vector(b, a, 2) @@ expression a, b; @@ -nir_channels(b, a, 7) +nir_trim_vector(b, a, 3) Plus a fixup for pointless trimming an immediate in RADV and radeonsi. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>	2023-06-06 18:52:25 +00:00
Iago Toral Quiroga	3530e3ffb2	broadcom/compiler: use scoped barriers Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23228>	2023-05-25 14:28:30 +02:00
Iago Toral Quiroga	e99ab86f77	broadcom/compiler: flag use of control barriers We have been relying on NIR's gather info pass for this but it is not safe unless we are certain we are always calling it after any other pass that may emit a control barrier. As it stands, nir_zero_initialize_shared_memory can emit a control barrier and we don't call the gather info pass after it, which is problematic. The only reason this is not really a problem right now is because for non-scoped barriers (which is what we currently use) it doesn't emit a scoped barrier, just a regular memory barrier (which is probably a bug in the pass!), but as soon as we move to scoped barriers, this is going to be a problem, since we need to know when we emit a control barrier to ensure supergroup calculations prevent deadlocks at the barrier op. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23228>	2023-05-25 14:28:30 +02:00
Erik Faye-Lund	c87e491107	nir: use nir_fsub_imm Now that we have nir_fsub_imm, let's use it to save some typing! Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>	2023-05-25 06:59:25 +00:00
Erik Faye-Lund	20d619cd84	nir: use more nir_fmul_imm This simplifies things a bit. Note that in some cases, the arguments are swapped, because multiplications are commutative, and nir_fmul_imm only allows the second operand to be an immediate. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>	2023-05-25 06:59:24 +00:00
Alejandro Piñeiro	88ca89bea9	broadcom/compiler: disable tmu pipelining when needed disable_tmu_pipelining has been recently set to false on two strategies that should set it to true. Fixes the following CTS test: dEQP-VK.graphicsfuzz.spv-stable-maze-flatten-copy-composite Fixes: `c950098ab` - broadcom/compiler: move buffer loads to lower register pressure Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23207>	2023-05-24 15:17:03 +00:00
Alejandro Piñeiro	470b8567a5	broadcom/compiler: return NULL if we fail to register allocate Right now if we fail to register allocate, we return the qpu_insts that we had at that point, even if the driver can't really use it. Also v3dv_pipeline was already assuming that it would return NULL on failure, returning VK_ERROR_UNKNOWN in that case. This allows CTS tests with a lot of pressure, which now and then regress to failing allocation, to finish with an error instead of blocking forever. For example: dEQP-VK.graphicsfuzz.spv-stable-maze-flatten-copy-composite Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23203>	2023-05-24 14:19:12 +00:00
Iago Toral Quiroga	e401add741	broadcom/compiler: skip jumps in non-uniform if/then when block cost is small We have an optimization for non-uniform if/else where if all channels meet the jump condition we emit a branch to jump straight to the ELSE block. Similarly, if at the end of the THEN block we don't have any channels that would execute the ELSE block, we emit a branch to jump straight to the AFTER block. This optimization has a cost though: we need to emit the condition for the branch and a branch instruction (which also comes with 3 delay slots), so for very small blocks (just a couple of ALU for example) emitting the branch instruction is typically worse. Further, if the condition for the branch is not met, we still pay the cost for no benefit at all. Here is an example: nop ; fmul.ifa rf26, 0x3e800000, rf54 xor.pushz -, rf52, 2 ; nop bu.alla 32, r:unif (0x00000000 / 0.000000) nop ; nop nop ; nop nop ; nop xor.pushz -, rf52, 3 ; nop nop ; mov.ifa rf52, 0 nop ; mov.pushz -, rf52 nop ; mov.ifa rf26, 0x3f800000 The bu instruction here is set up to jump over the following 4 instructions (the last 4 instructions in there). To do this, we pay the price of the xor to generate the condition, the bu instruction, and the 3 delay slots right after it, so we end up paying 6 instructions to skip over 4 which we pay always, even if the branch is not taken and we still have to execute those 4 instructions. With this change, we produce: nop ; fmul.ifa rf56, 0x3e800000, rf28 xor.pushz -, rf9, 3 ; nop nop ; mov.ifa rf9, 0 nop ; mov.pushz -, rf9 nop ; mov.ifa rf56, 0x3f800000 Now we don't try to skip the small block, ever. At worst, if all channels would have met the branch condition, we only pay the cost of the 4 instructions instead of 6; at best, if any channel wouldn't take the branch, we save ourselves 5 cycles for the branch condition, the branch instruction and its 3 delay slots. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23161>	2023-05-22 09:23:41 +00:00
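
The write-mask argument in commit 91f676819c (a mask that is both of the form `(1 << x) - 1` and of the form `(1 << x)` can only be 0x1) can be checked by brute force. The following is a minimal sketch in C and is not code from Mesa: all function names are hypothetical, and it assumes vectors of at most 4 components, so only masks 0x1 through 0xF are considered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if mask == (1 << x) - 1 for some x >= 1,
 * i.e. a contiguous run of set bits starting at bit 0. */
static bool is_contiguous_from_bit0(uint32_t mask)
{
    return mask != 0 && (mask & (mask + 1)) == 0;
}

/* Hypothetical helper: true if mask == (1 << x) for some x >= 0,
 * i.e. exactly one bit is set. */
static bool is_single_bit(uint32_t mask)
{
    return mask != 0 && (mask & (mask - 1)) == 0;
}

/* Counts the 4-component masks that satisfy both conditions and
 * stores the last such mask in *last_match. */
static unsigned count_masks_satisfying_both(uint32_t *last_match)
{
    unsigned count = 0;
    for (uint32_t mask = 0x1; mask <= 0xF; mask++) {
        if (is_contiguous_from_bit0(mask) && is_single_bit(mask)) {
            count++;
            *last_match = mask;
        }
    }
    return count;
}

/* Asserts that 0x1 is the only mask meeting both conditions. */
static void check_write_mask_argument(void)
{
    uint32_t match = 0;
    assert(count_masks_satisfying_both(&match) == 1);
    assert(match == 0x1);
}
```

Calling `check_write_mask_argument()` from any test harness exercises the check. The contiguous masks in range are 0x1, 0x3, 0x7 and 0xF, the single-bit masks are 0x1, 0x2, 0x4 and 0x8, and their only common element is 0x1.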

745 commits