fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 09:18:10 +02:00

Author	SHA1	Message	Date
Caio Oliveira	2ed79f80ba	nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset Otherwise this would require combining two values to produce a single (new bit-size) channel, which vectorize_stores() doesn't handle. The pass can still keep trying smaller bit-sizes. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12946 Fixes: `ce9205c03b` ("nir: add a load/store vectorization pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34414>	2025-04-11 19:17:17 +00:00
Georg Lehmann	d046ecf95a	nir/opt_algebraic: optimize open coded ffract Foz-DB Navi21: Totals from 274 (0.34% of 79789) affected shaders: Instrs: 522630 -> 522181 (-0.09%); split: -0.09%, +0.01% CodeSize: 2880668 -> 2878940 (-0.06%); split: -0.07%, +0.01% VGPRs: 14488 -> 14464 (-0.17%) Latency: 4092358 -> 4091243 (-0.03%); split: -0.04%, +0.01% InvThroughput: 1014148 -> 1013471 (-0.07%); split: -0.07%, +0.00% VClause: 11646 -> 11639 (-0.06%) SClause: 18614 -> 18611 (-0.02%) Copies: 56248 -> 56309 (+0.11%); split: -0.05%, +0.16% PreVGPRs: 13649 -> 13647 (-0.01%) VALU: 359733 -> 359285 (-0.12%); split: -0.13%, +0.01% SALU: 59719 -> 59720 (+0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33369>	2025-04-11 12:36:02 +00:00
Konstantin Seurer	ba001626ac	nir: Turn the format string index into a const index It is already expected to be constant. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34208>	2025-04-10 19:31:37 +00:00
Boris Brezillon	4f4ac56145	pan/va: Support relaxed waits on read-only render targets On Valhall we can optimize lower waits, which waits for both readers and writers, into resource_waits which only wait for writers, allowing threads accessing read-only resources to execute concurrently. Let's use that on LD_TILE instructions so we can optimize the read-only case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	20275d6521	pan/bi: Introduce two intrinsics to support input attachment remapping In order to dynamically load the content of the tile buffer, we need to know the target (color, depth or stencil) and the conversion to apply. Let's define the load_input_attachment_{target,conv}_pan intrinsics so we can dissociate the logic lowering input attachment loads into load_converted_output_pan, and the part optimizing the shader when input attachment map is passed at compile time. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	f3be0836b7	pan/bi: Pass an explicit sampleid to load_converted_output_pan Needed if we want to lower multisample input attachment loads to tile buffer loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	cdeda45282	pan/bi: Pass load_converted_output_pan target through a source This allows us to pass a dynamic render target which will be needed to support VK_KHR_dynamic_rendering_local_read. While at it, we also enable support for depth/stencil tile loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Alyssa Rosenzweig	c2a3c70086	nir/lower_tex: use vector_insert_imm was in the area. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>	2025-04-08 19:04:47 +00:00
Alyssa Rosenzweig	c23201ad8a	nir/lower_blend: disable logic ops for unsupported formats Fixes new Vulkan CTS cases on Honeykrisp (and probably panvk and whatever) dEQP-VK.pipeline.shader_object_unlinked_binary.logic_op_na_formats.* Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>	2025-04-08 19:04:47 +00:00
Alyssa Rosenzweig	54ccc8ed0b	nir/lower_blend: refactor logicop variables This pulls out the logicop_func variable from the options struct, so we can modify it in the next commit in a central place. It then refactors out the format variable from the options struct since we end up duplicating options->format[rt] a zillion times and passing in both an options struct and a logicop func override is confusing so this will just make everything neater and self-contained next commit. no functional change. Cc'd to make the next commit cherrypickable. Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>	2025-04-08 19:04:46 +00:00
Faith Ekstrand	6aa2c152b8	nak,nir: Add an image_load_raw_nv intrinsic Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34336>	2025-04-08 04:06:45 +00:00
Marek Olšák	1d5c42528b	nir/opt_algebraic: lower 16-bit imul_high & umul_high Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>	2025-04-07 19:44:22 +00:00
Timothy Arceri	d8782db3a4	glsl: fix regression in ubo cloning Fixes KHR-GL46.layout_binding.block_layout_binding_block_VertexShader with radeonsi. Fixes: `2b2132d2ac` ("nir: fix uniform cloning helper") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34337>	2025-04-06 19:43:47 +10:00
Konstantin	e7a44de184	nir/tests: Do not rely on __LINE__ __LINE__ can be inconsistent when using different compilers. This patch changes the test runner to do a simple string find/replace of the test source file instead of looking for the line where the reference string starts. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33980>	2025-04-04 19:01:01 +00:00
Timur Kristóf	a530890e75	nir/print: Fix variable mode for arrayed output load intrinsics. This helps print the names of varyings correctly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	96d11d0f56	nir/opt_varyings: Fix assertion when deduplicating TCS outputs. When deduplicating TCS outputs, we may find outputs that aren't loaded by the shader itself. This previously hit a bad assertion. Fixes: `c66967b5cb` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12410 Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	a29b5857f7	nir/xfb: Preserve some xfb information when gathering from intrinsics. We need to remember which streamout buffers and streams were enabled, even if the shader doesn't actually write any outputs to them, because the API requires that we count vertices created by this shader towards queries against those streams. That information can be gathered by nir_gather_xfb_info_with_varyings from the original NIR I/O variables that we get from the frontend, but it isn't included in any intrinsics so would be otherwise lost here. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Faith Ekstrand	a3935c7aa2	nak,nir: Generalize nak_nir_split_64bit_conversions and move it to NIR This pass was originally based on a similar pass from Intel but it's grown support for some fancy stuff like fp64 -> fp16 conversion splitting with proper rounding. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34126>	2025-03-29 03:02:17 +00:00
Lionel Landwerlin	772beb0ebf	nir: add support for lowering non uniform texture offsets Intel HW only has support for non-uniform offsets for TG4 operations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Georg Lehmann	2b1fc1a7fe	nir: add option to keep mul24_relaxed Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33871>	2025-03-27 06:24:15 +00:00
Timothy Arceri	2b2132d2ac	nir: fix uniform cloning helper glsl allows for ubos to have the same name but different bindings. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Fixes: `b47b8d16d9` ("nir: expose reusable linking helpers for cloning uniform loads") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12852 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34138>	2025-03-25 06:54:53 +00:00
Connor Abbott	1621080df7	compiler,nir: Gather needs_full_quad_helper_invocations info This is needed on Qualcomm, where there are separate fields to enable just 3 fragments and all 4 fragments. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fixes: `264d8a6766` ("ir3: Set need_full_quad depending on info.fs.require_full_quads") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>	2025-03-14 21:55:58 +00:00
Connor Abbott	7a55e13939	nir, compiler: Rename needs_quad_helper_invocations This currently treats coarse and fine derivatives the same, but Qualcomm needs to know whether just coarse derivatives are used or fine derivatives/quad ops are also used. Rename this to needs_coarse_quad_helper_invocations to make clear the difference from the new field, needs_full_quad_helper_invocations. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `264d8a6766` ("ir3: Set need_full_quad depending on info.fs.require_full_quads") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>	2025-03-14 21:55:57 +00:00
Karol Herbst	3a9954c117	nir/serialize: fix decoding of is_return and is_uniform Fixes: `3321a56d1d` ("nir: Serialize all parameter attributes") Fixes: `26cbb6b933` ("nir: Add parameter divergence info") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34052>	2025-03-14 15:01:32 +00:00
Georg Lehmann	b386659588	nir/opt_algebraic: create ubfe from (a & mask) >> c Foz-DB Navi21: Totals from 917 (1.16% of 79188) affected shaders: Instrs: 2549482 -> 2544997 (-0.18%); split: -0.18%, +0.00% CodeSize: 13781648 -> 13763616 (-0.13%); split: -0.13%, +0.00% Latency: 24832087 -> 24825199 (-0.03%); split: -0.04%, +0.01% InvThroughput: 5921339 -> 5914799 (-0.11%); split: -0.12%, +0.01% VClause: 59910 -> 59898 (-0.02%); split: -0.02%, +0.00% SClause: 62294 -> 62293 (-0.00%) Copies: 221015 -> 220988 (-0.01%); split: -0.02%, +0.01% VALU: 1717280 -> 1713332 (-0.23%); split: -0.23%, +0.00% SALU: 359390 -> 358910 (-0.13%) VMEM: 101966 -> 101924 (-0.04%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33455>	2025-03-14 11:15:04 +00:00
Matt Turner	7534559f2f	nir: Return NULL, not false, from functions returning pointers Reported by clang's `-Wbool-conversion`. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:09 +00:00
Mary Guillemard	e0be93d881	nir: Add Panfrost specific shader_output intrinsic On Avalon, this is a bitfield that holds information on what values a vertex shader should output. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33910>	2025-03-10 07:38:16 +01:00
Alyssa Rosenzweig	bc6b527b52	nir/lower_helper_writes: fix stores after discard We need to use nir_is_helper_invocation instead of nir_load_helper_invocation, to correctly predicate stores after demote. Identified in a Piglit on AGX a year ago but I forgot to upstream this. Fixes: `586da7b329` ("nir: Add nir_lower_helper_writes pass") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33939>	2025-03-08 07:47:40 +00:00
Daniel Schürmann	dbd41e3ddd	nir: set SYSTEM_VALUE_HELPER_INVOCATION read for nir_intrinsic_is_helper_invocation is_helper_invocation is the volatile access of load_helper_invocation. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33492>	2025-03-07 15:44:49 +00:00
Daniel Schürmann	a4cffa91b8	nir: remove nir_lower_discard_if_to_cf option Since removing nir_intrinsic_discard{_if} it has no purpose anymore. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33492>	2025-03-07 15:44:49 +00:00
Corentin Noël	eb1274ef08	nir: Add bool return value to nir_legacy_trivialize(..) Signed-off-by: Corentin Noël <corentin.noel@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33686>	2025-03-06 03:29:20 +00:00
Caterina Shablia	ca9ff8c8c7	nir: teach nir_lower_bit_size to handle ballot and ballot_relaxed Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33365>	2025-03-05 22:58:15 +00:00
Karol Herbst	5c1f61d900	nir: Do not eliminate dead writes to shared memory in called functions. Fixes regressions in rusticl and c11_atomic OpenCL CTS test. Fixes: `e65c1473de` ("nir: Eliminate dead writes to shared memory at the end of the program") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33807>	2025-03-04 19:41:13 +00:00
Konstantin Seurer	3aeab4ce40	nir/print: Do not print debug information when gathering it Referencing a shader string with different debug information is confusing. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28613>	2025-03-04 18:42:48 +00:00
Konstantin Seurer	a04b5ebd3c	nir/sweep: Fix handling instructions with debug info When debug information is present, the nir_instr pointer is not the start of the allocation. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28613>	2025-03-04 18:42:48 +00:00
Konstantin Seurer	3a69b52d37	nir: Test nir_minimize_call_live_states Adds a couple of tests for various instructions and controlflow constructs. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33289>	2025-03-03 23:30:57 +00:00
Faith Ekstrand	a65009e808	nir: Add a nir_opt_tex_skip_helpers optimization Arm and NVIDIA hardware both have this as a bit you can set on the texture instruction so we may as well have a shared pass for it. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>	2025-03-01 08:44:15 +00:00
Faith Ekstrand	7ac6ec2ceb	nir: Add a get_io_index_src() helper Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>	2025-03-01 08:44:15 +00:00
Georg Lehmann	d272a6e261	nir/opt_algebraic: optimize d3d a ? b : 0 Foz-DB Navi21: Totals from 3466 (4.34% of 79789) affected shaders: MaxWaves: 73163 -> 73161 (-0.00%); split: +0.02%, -0.02% Instrs: 3993862 -> 3987633 (-0.16%); split: -0.19%, +0.04% CodeSize: 21747420 -> 21725620 (-0.10%); split: -0.15%, +0.05% VGPRs: 190736 -> 190728 (-0.00%); split: -0.04%, +0.03% SpillSGPRs: 489 -> 478 (-2.25%); split: -2.86%, +0.61% Latency: 48169718 -> 48159068 (-0.02%); split: -0.05%, +0.02% InvThroughput: 12132999 -> 12128721 (-0.04%); split: -0.05%, +0.01% VClause: 78063 -> 78052 (-0.01%); split: -0.09%, +0.08% SClause: 109095 -> 108996 (-0.09%); split: -0.13%, +0.04% Copies: 265784 -> 264530 (-0.47%); split: -0.72%, +0.25% Branches: 84533 -> 84553 (+0.02%) PreSGPRs: 172577 -> 172531 (-0.03%); split: -0.19%, +0.16% PreVGPRs: 165776 -> 165825 (+0.03%); split: -0.06%, +0.09% VALU: 2851544 -> 2850426 (-0.04%); split: -0.08%, +0.04% SALU: 413543 -> 408408 (-1.24%); split: -1.45%, +0.21% VMEM: 139890 -> 139887 (-0.00%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>	2025-03-01 07:49:28 +00:00
Georg Lehmann	2e7f34af6b	nir/opt_algebraic: optimize more ine/ieq(umin(b2i, ), 0) Foz-DB Navi21: Totals from 76 (0.10% of 79789) affected shaders: MaxWaves: 1050 -> 1062 (+1.14%) Instrs: 113754 -> 113691 (-0.06%); split: -0.11%, +0.06% CodeSize: 605096 -> 605216 (+0.02%); split: -0.03%, +0.05% VGPRs: 6024 -> 5976 (-0.80%) Latency: 1776501 -> 1777519 (+0.06%); split: -0.06%, +0.12% InvThroughput: 379644 -> 376751 (-0.76%) SClause: 2132 -> 2134 (+0.09%) Copies: 4131 -> 4128 (-0.07%); split: -1.77%, +1.69% PreSGPRs: 4275 -> 4270 (-0.12%) PreVGPRs: 5568 -> 5526 (-0.75%) VALU: 86732 -> 86581 (-0.17%); split: -0.24%, +0.07% SALU: 7112 -> 7198 (+1.21%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>	2025-03-01 07:49:28 +00:00
Georg Lehmann	7bc3062a3b	nir/opt_algebraic: push comparisons with constants into bcsel with constant Foz-DB Navi21: Totals from 1657 (2.08% of 79789) affected shaders: MaxWaves: 30275 -> 30261 (-0.05%); split: +0.01%, -0.05% Instrs: 3316251 -> 3315701 (-0.02%); split: -0.04%, +0.02% CodeSize: 17831924 -> 17832020 (+0.00%); split: -0.06%, +0.06% SpillSGPRs: 815 -> 859 (+5.40%) SpillVGPRs: 3335 -> 3293 (-1.26%) Scratch: 231424 -> 230400 (-0.44%) Latency: 33413310 -> 33402751 (-0.03%); split: -0.04%, +0.01% InvThroughput: 9116062 -> 9112904 (-0.03%); split: -0.04%, +0.00% VClause: 65587 -> 65560 (-0.04%); split: -0.05%, +0.01% SClause: 86208 -> 86261 (+0.06%); split: -0.02%, +0.08% Copies: 356158 -> 356439 (+0.08%); split: -0.07%, +0.15% PreSGPRs: 101710 -> 101806 (+0.09%); split: -0.01%, +0.11% PreVGPRs: 89293 -> 89286 (-0.01%); split: -0.04%, +0.04% VALU: 2220900 -> 2218839 (-0.09%); split: -0.11%, +0.01% SALU: 472988 -> 474567 (+0.33%); split: -0.08%, +0.42% VMEM: 118401 -> 118347 (-0.05%) SMEM: 123597 -> 123592 (-0.00%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>	2025-03-01 07:49:27 +00:00
Georg Lehmann	3837bc6d16	nir/opt_algebraic: optimize ~a == ~b and ~a == #b Foz-DB Navi21: Totals from 2 (0.00% of 79789) affected shaders: Instrs: 8343 -> 8323 (-0.24%) CodeSize: 43884 -> 43764 (-0.27%) Latency: 19390 -> 19363 (-0.14%) InvThroughput: 3380 -> 3356 (-0.71%) VALU: 5413 -> 5393 (-0.37%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>	2025-03-01 07:49:27 +00:00
Georg Lehmann	8759223498	nir/opt_algebraic: optimize b2i/b2f comparison with non 0/1 constants Foz-DB Navi21: Totals from 28 (0.04% of 79789) affected shaders: MaxWaves: 732 -> 728 (-0.55%) Instrs: 23425 -> 22559 (-3.70%) CodeSize: 137740 -> 132292 (-3.96%) VGPRs: 1128 -> 1144 (+1.42%) Latency: 94604 -> 92423 (-2.31%) InvThroughput: 19166 -> 18814 (-1.84%); split: -2.38%, +0.54% VClause: 429 -> 423 (-1.40%) SClause: 937 -> 926 (-1.17%) Copies: 1199 -> 914 (-23.77%); split: -24.52%, +0.75% Branches: 451 -> 421 (-6.65%) PreSGPRs: 1043 -> 996 (-4.51%) PreVGPRs: 992 -> 973 (-1.92%); split: -3.53%, +1.61% VALU: 17566 -> 16865 (-3.99%) SALU: 1254 -> 1157 (-7.74%) VMEM: 619 -> 609 (-1.62%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>	2025-03-01 07:49:27 +00:00
Georg Lehmann	2bfcfef5da	nir/opt_algebraic: optimize bcsel of b2f and constants Foz-DB Navi21: Totals from 212 (0.27% of 79789) affected shaders: MaxWaves: 4024 -> 4030 (+0.15%) Instrs: 1314134 -> 1313894 (-0.02%); split: -0.03%, +0.02% CodeSize: 7033216 -> 7026888 (-0.09%); split: -0.10%, +0.01% VGPRs: 14224 -> 14176 (-0.34%) Latency: 7402062 -> 7399180 (-0.04%); split: -0.06%, +0.02% InvThroughput: 1724879 -> 1723773 (-0.06%); split: -0.07%, +0.00% VClause: 37741 -> 37711 (-0.08%); split: -0.11%, +0.03% SClause: 29266 -> 29268 (+0.01%); split: -0.01%, +0.01% Copies: 123810 -> 123786 (-0.02%); split: -0.19%, +0.17% Branches: 42370 -> 42407 (+0.09%); split: -0.03%, +0.11% PreSGPRs: 13149 -> 13196 (+0.36%); split: -0.05%, +0.40% PreVGPRs: 12407 -> 12395 (-0.10%) VALU: 884471 -> 883475 (-0.11%); split: -0.12%, +0.01% SALU: 177671 -> 178408 (+0.41%); split: -0.03%, +0.45% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761>	2025-03-01 07:49:27 +00:00
Georg Lehmann	b90826736d	nir/opt_algebraic: optimize bit_count(a) != 0 vkd3d-proton will emit b = ballot(!gl_HelperInvocation); (subgroupBallotBitCount(b) != 0u) ? subgroupShuffle(a, subgroupBallotFindLSB(b)) : 0u; for WaveReadFirstLane(a) in fragment shaders Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33808>	2025-02-28 18:03:04 +00:00
Georg Lehmann	f595bcfe78	nir/opt_varyings: clean up nir_progress usage Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33770>	2025-02-28 14:38:14 +00:00
Job Noorman	739ca77e66	nir/lower_subgroups: use build_cluster_mask for quad mask build_subgroup_quad_mask can now be written in terms of build_cluster_mask. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31732>	2025-02-27 18:53:19 +00:00
Benjamin Lee	252c59602e	panfrost: implement 16-bit ldexp Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy lowering. The codegen for this is quite bad currently, but would be improved by implementing unpack_32_2x16_split_*, and by fusing comparisons with CSEL. The main alternative is converting to F32, then LDEXP.f32, then converting back to F16. This has better codegen for dynamic exponents currently, but worse in the common case with a constant exponent where all the saturating cast logic can be folded. Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2 when shaderFloat16 is enabled in panvk. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>	2025-02-27 16:49:11 +00:00
Job Noorman	2619d576e7	nir/lower_phis_to_scalar: don't create moves for undef sources Creating moves out of undefs makes it more difficult for other passes to detect undefs without having to chase moves. Instead, just create a new 1-component undef. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29889>	2025-02-27 13:18:14 +00:00
Job Noorman	5ae12b6a5a	nir/lower_phis_to_scalar: use nir_builder API where possible Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29889>	2025-02-27 13:18:14 +00:00


6118 commits