fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 09:18:10 +02:00

Author	SHA1	Message	Date
Marek Olšák	f52ae35d73	nir/opt_varyings: propagate indirect uniform/UBO loads into the next shader Uniform and UBO loads with non-constant indices are now propagated. The majority of this code implements cloning deref chains. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	c0de78f120	nir/opt_varyings: change try_move_postdominator param to nir_instr type We want more instructions to be movable, like load_deref(var, index = load_input). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	8e39e8ed4d	nir/opt_varyings: make top-level compaction code for TES, TCS, GS separate Add a separate "if" block for each and use a helper for repeated code. So much more code will be added here that keeping TES, TCS, and GS compaction code unified would be a mess. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	d20e07dbad	nir/opt_varyings: fix max_slot for color varying compaction It should be in units of slots. This was unlikely to break anything. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	69b1853ecf	nir/opt_varyings: count the number of unused components for compaction correctly Holes due to indirectly-indexed inputs were ignored, making the compaction worse when such inputs were present alongside convergent inputs. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	1aa9fec542	nir/opt_varyings: fix compaction with sparse indirect FS inputs Without this, compaction can put inputs into vec4 slots already occupied by indirectly-accessed inputs while ignoring their interpolation qualifier, which is incorrect. All input components sharing the same vec4 slot must use interpolation qualifiers that are compatible with each other. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	b01f3cea7a	nir/opt_varyings: remove redundant conditions from a while loop Most of these conditions are repeated below with a continue statement. This just puts break at the end where all of them are false. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Marek Olšák	a618a2aa8b	nir/linking_helpers: don't promote interpolated varyings to flat Even the most flexible interpolation that we have in NIR options (nir_io_has_flexible_input_interpolation_except_flat) doesn't allow mixing flat and non-flat in the same vec4. This (legacy) optimization can't promote interpolated inputs to flat if it doesn't consider the interpolation mode of the whole vec4 slot. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Georg Lehmann	34a47e4b14	nir/opt_algebraic: mark a - ffract(a) as nan incorrect. Inf + fract(Inf) -> Inf + NaN -> NaN floor(Inf) -> Inf Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Georg Lehmann	2ee96cf514	nir/opt_algebraic: optimize d3d9 ceil No Foz-DB changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Georg Lehmann	34caed8adb	nir/opt_algebraic: optimize d3d9 ftrunc Foz-DB Navi21: Totals from 85 (0.11% of 79395) affected shaders: MaxWaves: 1972 -> 1968 (-0.20%) Instrs: 48682 -> 47067 (-3.32%) CodeSize: 255664 -> 247172 (-3.32%) VGPRs: 3752 -> 3768 (+0.43%) Latency: 154414 -> 150360 (-2.63%) InvThroughput: 37186 -> 35081 (-5.66%) VClause: 847 -> 865 (+2.13%); split: -0.24%, +2.36% SClause: 768 -> 796 (+3.65%) Copies: 2763 -> 2869 (+3.84%); split: -0.14%, +3.98% VALU: 28133 -> 26781 (-4.81%) SALU: 7182 -> 6939 (-3.38%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Georg Lehmann	ea4aa8e5a6	nir/opt_algebraic: optimize ffma(b2f, b2f, c) Foz-DB Navi21: Totals from 134 (0.17% of 79395) affected shaders: Instrs: 153297 -> 153326 (+0.02%); split: -0.03%, +0.05% CodeSize: 829520 -> 828444 (-0.13%); split: -0.13%, +0.00% Latency: 900489 -> 899964 (-0.06%); split: -0.07%, +0.01% InvThroughput: 267838 -> 267478 (-0.13%); split: -0.14%, +0.00% VClause: 2452 -> 2454 (+0.08%) Copies: 8331 -> 8353 (+0.26%); split: -0.25%, +0.52% PreSGPRs: 4974 -> 4964 (-0.20%) PreVGPRs: 6209 -> 6218 (+0.14%) VALU: 112317 -> 112092 (-0.20%); split: -0.21%, +0.01% SALU: 12451 -> 12694 (+1.95%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Kenneth Graunke	5712fc48a9	nir: Allow large overfetching holes in the load store vectorizer The load_*_uniform_block_intel intrinsics always load either 8x or 16x 32-bit components worth of data (so 32 byte increments). This leads to cases where we load a few components from one vec8, followed by a few components of an adjacent vec8. We want to combine those into a vec16 load, as that loads a whole cacheline at a time, and requires less hoops to calculate addresses and request memory loads. So, we allow 7 * 4 = 28 bytes of holes, which handles vec8+vec8 where only the .x component is read. Most drivers and intrinsics will not want such large holes. I thought about adding a per-intrinsic max_hole to the core code, but decided that since we already have driver callbacks, we can just rely on them to reject what makes sense to them. No driver callbacks currently allow holes, so this should not currently affect any drivers. But any work in progress branches may need to be updated to reject larger holes. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Marek Olšák	8752401e03	nir/algebraic: optimize (a & b) | (a | c) => a | c, (a & b) & (a | c) => a & b No change in shader-db with ACO, but it doesn't seem to be optimized by any other patterns. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32449>	2024-12-03 01:24:27 +00:00
Marek Olšák	3670d42c74	nir/algebraic: optimize (a | b) | (a | c) ==> (a | b) | c shader-db with ACO: 3 shaders have -0.11% average decrease in the code size Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32449>	2024-12-03 01:24:27 +00:00
Marek Olšák	978ad93375	nir/algebraic: optimize (a & b) & (a & c) ==> (a & b) & c shader-db with ACO: 3 shaders have -0.57% average decrease in the code size Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32449>	2024-12-03 01:24:27 +00:00
Marek Olšák	83b093f95e	nir/algebraic: use is_used_once in a few iand/ior patterns shader-db with ACO: 1 shader has -4 decrease in the code size Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32449>	2024-12-03 01:24:27 +00:00
Antonino Maniscalco	2b9738ce6d	nir,zink,asahi: support passing through gl_PrimitiveID When this pass is used with Zink, gl_PrimitiveID needs to be passed through, however this is unnecessary for other drivers. Analogous to previous commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fixes: `d0342e28b3` ("nir: Add helper to create passthrough GS shader") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32397>	2024-12-03 00:24:04 +00:00
Kenneth Graunke	92797c6878	nir/algebraic: Reassociate fadd into fmul in DP4-like pattern This extends the optimization from commit `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern") to a chain of 4 ffmas for a DP4-style pattern. Moving the add to the other end of the sequence allows it to be fused into an FMA. fossil-db results from Alchemist: Totals: Instrs: 158544142 -> 158490516 (-0.03%); split: -0.04%, +0.00% Subgroup size: 7808912 -> 7808920 (+0.00%); split: +0.00%, -0.00% Cycle count: 17859550672 -> 17859491966 (-0.00%); split: -0.01%, +0.01% Spill count: 84652 -> 84494 (-0.19%); split: -0.37%, +0.18% Fill count: 160728 -> 160623 (-0.07%); split: -0.29%, +0.23% Scratch Memory Size: 4278272 -> 4272128 (-0.14%); split: -0.29%, +0.14% Max live registers: 32411695 -> 32409789 (-0.01%); split: -0.01%, +0.00% Max dispatch width: 5627856 -> 5627920 (+0.00%); split: +0.00%, -0.00% Non SSA regs after NIR: 185359099 -> 185307703 (-0.03%); split: -0.03%, +0.00% Totals from 16378 (2.56% of 640872) affected shaders: Instrs: 9818723 -> 9765097 (-0.55%); split: -0.58%, +0.04% Subgroup size: 194056 -> 194064 (+0.00%); split: +0.01%, -0.01% Cycle count: 294967108 -> 294908402 (-0.02%); split: -0.58%, +0.56% Spill count: 10088 -> 9930 (-1.57%); split: -3.09%, +1.53% Fill count: 24738 -> 24633 (-0.42%); split: -1.90%, +1.48% Scratch Memory Size: 439296 -> 433152 (-1.40%); split: -2.80%, +1.40% Max live registers: 1297204 -> 1295298 (-0.15%); split: -0.22%, +0.07% Max dispatch width: 133232 -> 133296 (+0.05%); split: +0.14%, -0.10% Non SSA regs after NIR: 11999084 -> 11947688 (-0.43%); split: -0.43%, +0.00% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32197>	2024-12-02 13:15:16 +00:00
Rhys Perry	9f3607de76	nir/tests: fix SSA dominance in opt_if_merge tests It isn't necessary for these ALU instructions to be used in the next IF. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `c437f2e79c` ("nir/tests: Add tests for opt_if_merge") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12211 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32391>	2024-12-02 09:38:22 +00:00
Timothy Arceri	6ca81adffc	nir: allow loops with unknown induction var initialiser to unroll If the condition of the loop terminator is based on an unsigned value we can in some cases find the max number of possible loop trips. With the max loop trips known, a complex unroll can unroll the loop. For example: uniform uint x; uint i = x; while (true) { if (i >= 4) break; i += 6; } The above loop can be unrolled even though we don't know the initial value of the induction variable because it can have at most 1 iteration. There were no changes with my shader-db collection. Change was inspired by MR #31312 where builtin shader code failed to unroll. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31701>	2024-12-02 11:44:33 +11:00
Job Noorman	d5d0628728	nir/lower_subgroups: add option to only lower clustered rotates On ir3, we have native support for full rotates but not for clustered ones. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>	2024-11-29 16:22:48 +00:00
Job Noorman	5dbd2b08f4	nir/lower_subgroups: disable boolean reduce when not supported lower_boolean_reduce only supports ballot_components == 1. Fall back to lower_scan_reduce when this is not the case. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>	2024-11-29 16:22:48 +00:00
Job Noorman	493f7b8084	nir/lower_subgroups: add extra filter data to options It might be convenient for filter implementations to have access to extra information. This will be used, for example, by ir3 to access compiler features. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>	2024-11-29 16:22:48 +00:00
Job Noorman	e6c63a88fb	nir: add read_getlast_ir3 intrinsic Like read_first_invocation but using getlast. Note that I intentionally used the name of the ir3 instruction in the name as its semantics are tricky to exactly describe otherwise. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>	2024-11-29 16:22:47 +00:00
Job Noorman	60e1615ced	nir/lower_subgroups: support unknown subgroup size Some targets (e.g., ir3) don't always know the exact subgroup size. Calculate the maximum subgroup size in that case by multiplying ballot_components and ballot_bit_size. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>	2024-11-29 16:22:47 +00:00
Alyssa Rosenzweig	e3001352ad	nir: add helpers for precompiled shaders v2: generalize function signatures. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> [v1] Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> [v1] Acked-by: Mary Guillemard <mary.guillemard@collabora.com> [v2] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32339>	2024-11-28 17:34:12 +00:00
Marek Olšák	c26da94b4c	nir/opt_varyings: replace options::lower_varying_from_uniform with a cost number This is a simple way for drivers to enable uniform expression propagation without having to set any callbacks for it. It replaces the old option. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>	2024-11-28 15:39:46 +00:00
Marek Olšák	428613b690	nir/opt_varyings: add a default callback for varying_estimate_instr_cost used when the driver doesn't set it. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>	2024-11-28 15:39:46 +00:00
Marek Olšák	1f238f0a2e	nir/opt_varyings: always call remove_dead_varyings in init_linkage so that we don't have to do it after every init_linkage call. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>	2024-11-28 15:39:46 +00:00
Marek Olšák	c50c9e9bf9	nir/lower_clip: implement ClipVertex lowering for GS + lowered IO correctly This is currently needed to fix d3d12 for st_unlower_io_to_vars. The idea is to track the current value of ClipVertex in a temporary variable, and for every emit_vertex, we load the ClipVertex value from the temporary (which matches the stored value) and insert new CLIP_DIST stores before emit_vertex. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>	2024-11-28 14:14:47 +00:00
Marek Olšák	a648acc287	nir/lower_clip: convert nir_lower_clip_gs to nir_shader_intrinsics_pass and add struct lower_clip_state to hold the state for both nir_lower_clip_gs and nir_lower_clip_vs. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>	2024-11-28 14:14:47 +00:00
Marek Olšák	3b8e4a71fe	nir/lower_clip: set clip_distance_array_size outside of create_clipdist_vars Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>	2024-11-28 14:14:47 +00:00
Marek Olšák	b4ef50bca8	nir/lower_clip: separate code for IO variables and intrinsics The code for IO variables was interleaved with code for IO intrinsics, which was difficult to follow. lower_clip_outputs is split and replaced by more accurate names: lower_clip_vertex_var and lower_clip_vertex_intrin Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>	2024-11-28 14:14:47 +00:00
Marek Olšák	3e40c2010e	nir/lower_clip: don't set cursor to fix crashes due to removed instructions The original builder already points at the end of the function impl. Just use that. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>	2024-11-28 14:14:47 +00:00
Caterina Shablia	7ca8c19246	Revert "nir: introduce instance_index system value" This reverts commit `b9be1f1f20`. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32332>	2024-11-28 07:53:01 +00:00
Caterina Shablia	9d5ba87ca1	Revert "nir: lower INSTANCE_{ID,INDEX} to an offset load_instance_{index,id} respectively" This reverts commit `a5bcf566a9`. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32332>	2024-11-28 07:53:01 +00:00
Job Noorman	1333af5d77	nir/search: add is_only_used_by_{iand,ior} helpers Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>	2024-11-28 06:19:59 +00:00
Job Noorman	a8c947df9a	nir/search: make is_only_used_by_iadd reusable The algorithm is exactly the same for other opcodes so we don't have to copy paste it. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>	2024-11-28 06:19:59 +00:00
Job Noorman	22fc90a116	nir: add ir3-specific bitwise triop opcodes ir3 has a number of bitwise triops (e.g., shrm == (src0 >> src1) & src2) that don't have NIR-equivalents. Doing instruction selection for them is a lot more convenient using algebraic patterns than to have to manually match for them. This patch adds NIR opcodes for these instructions. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>	2024-11-28 06:19:59 +00:00
Alyssa Rosenzweig	c2973765e2	nir: add nir_lower_constant_to_temp helper this comes up with clc. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32382>	2024-11-27 20:02:05 +00:00
Alyssa Rosenzweig	12cc22af4c	nir: add nir_remove_entrypoints helper opposite of nir_remove_non_entrypoint. this operation comes up with precompiling. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32382>	2024-11-27 20:02:05 +00:00
Alyssa Rosenzweig	c076900360	nir: add nir_function::pass_flags convenience, asahi will stash stuff here. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32382>	2024-11-27 20:02:05 +00:00
Alyssa Rosenzweig	5555769102	nir: add workgroup size to functions for cl kernel libraries with many entrypoints. spirv can represent, nir should be able to as well. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32382>	2024-11-27 20:02:05 +00:00
Alyssa Rosenzweig	ba30eb9f40	nir: add nir_foreach_entrypoint macros for compiling libraries full of kernels. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32382>	2024-11-27 20:02:05 +00:00
Alyssa Rosenzweig	d8ece9bf3a	nir: add nir_lower_calls_to_builtins pass nir_builder for the GPU Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32382>	2024-11-27 20:02:04 +00:00
Georg Lehmann	3f26e9ca19	nir/opt_intrinsic: fix sample mask opt with demote Reviewed-by: Marek Olšák <marek.olsak@amd.com> Fixes: `d3ce8a7f6b` ("nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32327>	2024-11-26 18:44:39 +00:00
Georg Lehmann	22557497ec	nir/opt_intrinsic: rework sample mask opt with vector alu Purely theoretical issue, for example gl_SampleMaskIn.xx == 0.xx. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Fixes: `d3ce8a7f6b` ("nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32327>	2024-11-26 18:44:38 +00:00
Marek Olšák	8fc640b256	nir/lower_io_to_temporaries: fix interp_deref_at_* lowering The pass converts: ... %.. = load_deref(input) to: temp = copy_deref(input) // beginning of the shader ... %.. = load_deref(temp) If interp_deref_at_* occurs between copy_deref and load_deref, the interp_deref_at_* lowering overwrites temp, so all future load_deref(temp) return the result of interp_deref_at_* instead of copy_deref, which is incorrect. The issue manifests when the same input is used by both load_deref and interp_deref_at_* in the same shader and when interp_deref_at_* happens to be before load_deref. This fixes it by using a completely new temporary for each instance of interp_deref_at_*. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32344>	2024-11-26 06:50:40 +00:00
Marek Olšák	c23abb12e8	nir: allow cloning indirect array derefs in nir_clone_deref_instr but only if cloning within the same shader. This will be used to fix nir_lower_io_to_temporaries. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32344>	2024-11-26 06:50:40 +00:00
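
The iand/ior rewrites in commits 8752401e03, 3670d42c74 and 978ad93375 can be sanity-checked by brute force. The sketch below is not Mesa code; the function name `check_bitwise_rewrites` is made up for illustration. It relies on the fact that bitwise operators treat every bit position independently, so checking all 8-bit operand values covers every per-bit combination.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Mesa: exhaustively verifies the four
 * rewrites from the commit messages over all 8-bit values of a, b, c. */
bool check_bitwise_rewrites(void)
{
   for (unsigned a = 0; a < 256; a++) {
      for (unsigned b = 0; b < 256; b++) {
         for (unsigned c = 0; c < 256; c++) {
            /* (a & b) | (a | c) => a | c */
            if (((a & b) | (a | c)) != (a | c))
               return false;
            /* (a & b) & (a | c) => a & b */
            if (((a & b) & (a | c)) != (a & b))
               return false;
            /* (a | b) | (a | c) => (a | b) | c */
            if (((a | b) | (a | c)) != ((a | b) | c))
               return false;
            /* (a & b) & (a & c) => (a & b) & c */
            if (((a & b) & (a & c)) != ((a & b) & c))
               return false;
         }
      }
   }
   return true;
}
```

Calling `check_bitwise_rewrites()` should return `true`; each rewrite removes one redundant use of `a`.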
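
The "7 * 4 = 28 bytes of holes" figure in commit 5712fc48a9 follows from simple byte-range arithmetic: a vec8 of 32-bit components spans 32 bytes, so reading only `.x` of two adjacent vec8s leaves 28 unread bytes between the two 4-byte reads. The sketch below models that arithmetic only. The struct and function names are hypothetical and do not reflect the real vectorizer callback signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a load as a byte range; not a Mesa type. */
struct byte_range {
   uint32_t offset; /* start of the load, in bytes */
   uint32_t size;   /* number of bytes actually read */
};

struct byte_range make_range(uint32_t offset, uint32_t size)
{
   struct byte_range r = { offset, size };
   return r;
}

/* Unread bytes between the end of the lower load and the start of the
 * higher one; 0 if the ranges touch or overlap. */
uint32_t hole_size(struct byte_range low, struct byte_range high)
{
   uint32_t low_end = low.offset + low.size;
   return high.offset > low_end ? high.offset - low_end : 0;
}

/* A driver-style policy check (assumed shape): accept the combined load
 * only if the hole does not exceed what the driver tolerates. */
bool accept_combined_load(struct byte_range low, struct byte_range high,
                          uint32_t max_hole)
{
   return hole_size(low, high) <= max_hole;
}
```

With `make_range(0, 4)` and `make_range(32, 4)` the hole is 28 bytes, so a policy with `max_hole` of 28 accepts the pair while a policy with `max_hole` of 0 (no holes allowed) rejects it.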
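
The loop example in commit 6ca81adffc can be reproduced outside a shader. The sketch below is an illustration of the reasoning, not the analysis Mesa performs; `count_trips` and `max_trips_bruteforce` are hypothetical names. It assumes `step > 0` and that `limit + step` does not wrap around in 32 bits. Because the counter is unsigned, the body only runs while `i < limit`, so the worst case over all unknown start values is a start of 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Runs the loop from the commit message with a given start value and
 * returns how many times the body (the increment) executes.
 * Assumes step > 0 and no 32-bit wraparound. */
unsigned count_trips(uint32_t start, uint32_t limit, uint32_t step)
{
   unsigned trips = 0;
   uint32_t i = start;
   while (true) {
      if (i >= limit)
         break;
      i += step;
      trips++;
   }
   return trips;
}

/* Upper bound on trips when the start value is unknown: try every start
 * below the limit (any start >= limit gives zero trips). */
unsigned max_trips_bruteforce(uint32_t limit, uint32_t step)
{
   unsigned max = 0;
   for (uint32_t start = 0; start < limit; start++) {
      unsigned trips = count_trips(start, limit, step);
      if (trips > max)
         max = trips;
   }
   return max;
}
```

For the values in the commit message, `max_trips_bruteforce(4, 6)` gives 1, matching the "at most 1 iteration" claim. Under the stated assumptions the bound equals `(limit + step - 1) / step`.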

5814 commits