fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 22:28:06 +02:00

Author	SHA1	Message	Date
Mary Guillemard	588fd6dfd6	agx: Implement scratch load/store Signed-off-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Mary Guillemard	c15115de6b	agx: Add stack load and store opcodes Signed-off-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Mary Guillemard	514d432e50	agx: Handle doorbell and stack mapping intrinsics Signed-off-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Mary Guillemard	ee0e7b8347	agx: Add doorbell and stack mapping opcodes Signed-off-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	c6a118b654	asahi: Wire up geometry shaders - Compile GS with linked VS and auxiliary programs - Dispatch GS as compute programs + an indirect draw with the GS copy program - Use passthrough GS to implement XFB, replacing old XFB impl. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	df2c145c91	agx: Handle bindless samplers Unified encoding. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	ca42562c7f	agx: Lower LOD bias earlier To make the extra descriptor accesses explicit for drivers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	111e526f19	agx: Allow drivers to lower texture handles Rather than hardcoding u0_u1, this lets drivers map texture handles in whatever way is convenient. In particular, this makes textures work properly with merged shader stages (provided one of the stages is forced to use bindless access), by giving each stage an independent texture heap. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	a74fbb3840	agx: Translate simple subgroup ops We'll use these for optimizing parallel prefix sums. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	77bb446e90	agx: Add scaffolding for subgroup ops Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	5b754410da	agx: Require 32-bit alignment for EOT offset Fixes piles of brokenness. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	7d7f5013f8	agx: Cleanup 8-bit math before lowering Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	b18181d924	agx: Check for spilling in release builds Don't smash stack -- explain to the user what happened. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Alyssa Rosenzweig	7b92c63105	agx: Fix fragment side effects scheduling We can't move discards across side effects, or the side effect might not happen. Fixes KHR-GLES31.core.shader_image_load_store.basic-allFormats-load-fs regression. Sigh. CI is up next. Fixes: `119e5b9719` ("agx: Schedule for register pressure") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:55 +00:00
Christian Gmeiner	e928f45735	agx: Re-index nir defs to reduce memory usage nir_index_ssa_defs(..) will re-index the function impl and will update ssa_alloc. In almost all cases this will result in a lower ctx->alloc number which reduces memory usage in compiler passes that are using ctx->alloc to allocate memory. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:54 +00:00
Alyssa Rosenzweig	b6b01aa1f2	agx: Legalize image MS index Fix 2D MSAA Array tests in arb_shader_image_load_store-max-size Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>	2023-11-07 00:05:54 +00:00
Alyssa Rosenzweig	2c54372760	agx: Use CL for texture lowerings To demonstrate everything working, and the value of this approach. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>	2023-11-02 11:37:47 +00:00
Alyssa Rosenzweig	eecd8390d0	asahi,agx: Plumb libagx Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>	2023-11-02 11:37:47 +00:00
Alyssa Rosenzweig	7193849f30	agx: Fuse ubitfield_extract Similarly, let's get the win everywhere. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>	2023-11-02 11:37:46 +00:00
Alyssa Rosenzweig	5500e02a61	agx: Fuse (unmasked) extr_agx This will clean up genxml unpack code and is needed for parity with the assembly we write by hand. This way we get the win for all shaders. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>	2023-11-02 11:37:46 +00:00
Alyssa Rosenzweig	0cde7b794c	agx: Vectorize load/stores This helps CL shaders. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>	2023-11-02 11:37:46 +00:00
Alyssa Rosenzweig	7f27f2e314	agx: Fix lower regular texture metadata for buffer textures, we insert new blocks which invalidates dominance and block index info... leads to end-to-end fails when shuffling pass order. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>	2023-11-02 11:37:46 +00:00
Karol Herbst	74ef0d4f93	asahi: flush denorms on exact fmin/fmax Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25788>	2023-10-18 17:10:02 +00:00
Alyssa Rosenzweig	c39896b17b	nir: Use getters for nir_src::parent_* First, we need to give the parent_instr field a unique name to be able to replace with a helper. We have parent_instr fields for both nir_src and nir_def, so let's rename nir_src::parent_instr in preparation for rework. This was done with a combination of sed and manual fix-ups. Then we use semantic patches plus manual fixups: @@ expression s; @@ -s->renamed_parent_instr +nir_src_parent_instr(s) @@ expression s; @@ -s.renamed_parent_instr +nir_src_parent_instr(&s) @@ expression s; @@ -s->parent_if +nir_src_parent_if(s) @@ expression s; @@ -s.renamed_parent_if +nir_src_parent_if(&s) @@ expression s; @@ -s->is_if +nir_src_is_if(s) @@ expression s; @@ -s.is_if +nir_src_is_if(&s) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig	e518c92d26	asahi: Assume LAYER is flat-shaded It can't be anything else, this makes sure the varyings are sorted properly. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	b252630604	agx: Support packed layered rendering writes With the new pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	4a954dff07	asahi,agx: Select layered rendering outputs These 2 are together Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	7d94f2ee49	agx: Add pass to lower layer ID writes The hardware needs the layer ID and the viewport index packed together. That consumes an entire varying slot, if we want those available in the frag shader we need a separate slot. Add a pass to insert the extra packed write. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	175819eec6	agx: Handle layered block image stores Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	c3a208d6d9	agx: Pack block image store dim correctly Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	da0da5d6f8	agx/nir_lower_texture: Allow disabling layer clamping For background program with layered. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	d83d24e96a	agx: Insert jmp_exec_none instructions With the exception of the backwards branch for loops, all the control flow we insert during instruction selection just predicates instructions rather than actually jumping around. That means, for example, we execute both sides of the if even for a uniform condition! That's inefficient. The solution is to insert jmp_exec_none instructions after control flow in order to skip unexecuted regions, which is much faster than predicating them out. However, jmp_exec_none is costly in itself, so we need to use a heuristic to determine when it's actually beneficial. This uses a very simple heuristic for this purpose. However, it is a massive performance speed-up for Dolphin uber shaders: 39fps -> 67fps at 2x resolution. Nearly a doubling of performance! Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	79c4d4213c	agx: Add agx_prev_block helper Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	dd6106c8bd	agx: Add jumps to block ends jmp_exec_none variant that jumps to the last instruction of the target block, rather than the beginning. This is convenient for skipping over elses, while still executing the block-final pop_exec instruction. Similarly for skipping over loop bodies while still executing the block-final pop_exec, after break instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	22ab505a3d	agx: Augment if/else/while_cmp with a target Add an optional pointer to a target block for these instructions. This does NOT act like a logical branch, and does NOT get added to the logical control flow. It is ignored wholesale until after RA, when physical edges may be inserted by a pass we add later in this series. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	d05657e0d6	agx: Hoist sample_mask/zs_emit Although this is well-motivated, perf effect seems to be negligible for Dolphin. It does prevent the scheduler from making things worse by sinking these instructions though, so as a way to prevent future problems this seems sensible. The kind of problem this affects (late discard) isn't modelled in shader-db. Nevertheless, nothing concerning there: total instructions in shared programs: 1756699 -> 1756722 (<.01%) instructions in affected programs: 10106 -> 10129 (0.23%) helped: 21 HURT: 41 Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 11525404 -> 11525452 (<.01%) bytes in affected programs: 72900 -> 72948 (0.07%) helped: 27 HURT: 41 Inconclusive result (value mean confidence interval includes 0). total halfregs in shared programs: 483394 -> 483286 (-0.02%) halfregs in affected programs: 4945 -> 4837 (-2.18%) helped: 88 HURT: 78 Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	0d8362b842	agx: Align the reg file for 256-bit vectors This fixes live range splitting with 3D textureGrad(), which involves vectors larger than the natural 128-bit maximum and hence requires special handling. Fixes this assert with a combination of debug flags and new patches: unsigned int find_best_region_to_evict(struct ra_ctx *, unsigned int, unsigned int *, unsigned int *): Assertion `(rctx->bound % size) == 0 && "register file size must be aligned to the maximum vector size"' failed Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	d1eb17e92e	treewide: Drop nir_ssa_for_src users Via Coccinelle patch: @@ expression b, s, n; @@ -nir_ssa_for_src(b, *s, n) +s->ssa @@ expression b, s, n; @@ -nir_ssa_for_src(b, s, n) +s.ssa Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>	2023-09-18 10:25:17 -04:00
Alyssa Rosenzweig	0df0980fc4	agx: Enable sinking ALU Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>	2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig	fb60626260	agx: Run opt_idiv_const after lowering texture Shaves 10 instructions off the cube map array lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	49951ef3cc	agx: Lower coordinates for cube map array images Annoyingly different from texture coordinates. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	fb76f6cc6e	agx: Handle cube arrays when clamping arrays Need to adjust the component. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	7895d5b79c	agx: Add unit test for cmp+sel fusing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	bdad7992bc	agx: Add unit test for if_cmp fusing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	08e0c5a9cf	agx: Fuse compares into selects This lets us save a LOT of instructions at the cost of increased register pressure. However, on my shader-db, this is still coming out ahead since no shaders are hurt for thread count/spills, and only 1/10 of the shaders helped for instruction count are hurt for register pressure. The shaders most hurt for pressure have very low pressure (7 -> 15 is the worst case) and you need a certain number of registers to use a 4 source instruction at all. Analyzing the hurt shaders, nothing concerns me too much ... this isn't as bad as I feared. So I think at this point it's worth ripping off the bandage, given the massive potential for instruction count win. This is a big improvement for some of the shaders I'm working on for my $SECRET_PROJECT. total instructions in shared programs: 1784943 -> 1775169 (-0.55%) instructions in affected programs: 644211 -> 634437 (-1.52%) helped: 3498 HURT: 38 Instructions are helped. total bytes in shared programs: 11720734 -> 11643224 (-0.66%) bytes in affected programs: 4370986 -> 4293476 (-1.77%) helped: 3572 HURT: 36 Bytes are helped. total halfregs in shared programs: 474094 -> 475165 (0.23%) halfregs in affected programs: 12821 -> 13892 (8.35%) helped: 65 HURT: 247 Halfregs are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	e7ffc799d1	agx: Fuse conditions into if's Simple greedy thing that has the potential to inflate register pressure but reduces instructions. Thanks to the recent loop work that turns if { break } into while_icmp, this also implicitly handles fusing conditions into loops, which is what actually prompted this. Surprisingly, this helps register pressure on my shader-db (no change to thread count), I guess by eliminating the boolean temps in case where the sources are used multiple times. total instructions in shared programs: 1786561 -> 1784943 (-0.09%) instructions in affected programs: 128557 -> 126939 (-1.26%) helped: 474 HURT: 13 Instructions are helped. total bytes in shared programs: 11733236 -> 11720734 (-0.11%) bytes in affected programs: 976034 -> 963532 (-1.28%) helped: 521 HURT: 13 Bytes are helped. total halfregs in shared programs: 474245 -> 474094 (-0.03%) halfregs in affected programs: 1869 -> 1718 (-8.08%) helped: 28 HURT: 7 Halfregs are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	f17ad0c516	agx: Generate unfused comparison pseudo ops So we can optimize them easier. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	ed6e391349	agx: Add pseudo-instructions for icmp/fcmp Easier to optimize with. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	139e56c0db	agx: Only use nest by 1 for loops w/o continue Apple doesn't do this, but it should be equivalent and it makes it easier to see that we can use while_icmp for break_if_icmp in loops that don't use continue (which Apple does do). So, the effect of this commit is to use while_icmp for most breaks, which saves an instruction. total instructions in shared programs: 1764199 -> 1764076 (<.01%) instructions in affected programs: 24149 -> 24026 (-0.51%) helped: 78 HURT: 0 Instructions are helped. total bytes in shared programs: 11609306 -> 11608322 (<.01%) bytes in affected programs: 164604 -> 163620 (-0.60%) helped: 78 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	8f06252e9b	agx: Add helper to determine if a NIR loop uses continue We need to emit extra instructions to handle continues, but if we don't have any, we can omit those. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00


751 commits