fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-06 09:28:07 +02:00

Author	SHA1	Message	Date
Dave Airlie	d45f598ece	llvmpipe: move to nir lowering for fquantize2f16 Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24988>	2023-09-05 23:33:20 +00:00
Tapani Pälli	b6bd7107e6	driconf: use lower_depth_range_rate for The Spirit and The Mouse Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9738 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25029>	2023-09-05 22:40:36 +00:00
David Heidelberg	6223e88757	Revert "ci: disable Google Freedreno farm, currently timeouting on all jobs" This reverts commit `fc46062ee5`. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25058>	2023-09-05 22:15:19 +00:00
David Rosca	ad6557b101	frontends/va: Support chroma sample location in postproc Rename vlVaSetCscMatrix to vlVaSetProcParameters because it now does more than just setting csc matrix. Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Thong Thai <thong.thai@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24869>	2023-09-05 21:31:43 +00:00
David Rosca	a50a46acf5	gallium/auxiliary/vl: Support chroma sample location in compute shaders Used only in YUV to RGB video_buffer shader for now. Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Thong Thai <thong.thai@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24869>	2023-09-05 21:31:43 +00:00
David Rosca	a6a43963ed	gallium/auxiliary/vl: Clamp coordinates in compute shaders Video textures include padding, so this is needed to avoid sampling outside of src rect due to scaling or additional offset. Fixes wrong colors on right/bottom edge. Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Thong Thai <thong.thai@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24869>	2023-09-05 21:31:43 +00:00
David Rosca	a90b9f1d1e	gallium/auxiliary/vl: Map range when updating constants Use WRITE | DISCARD_RANGE to avoid having to read back the csc matrix and luma min/max values. Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Thong Thai <thong.thai@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24869>	2023-09-05 21:31:43 +00:00
David Rosca	7c8e1596d6	gallium/auxiliary: Fix util_compute_blit half texel offset with scaling Video textures include padding, so make sure to not sample outside src rect. Also remove the parameter and always use the offset. When not scaling, this fixes blurry output. When scaling, this fixes incorrect color at right/bottom edge. Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Thong Thai <thong.thai@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24869>	2023-09-05 21:31:43 +00:00
Mike Blumenkrantz	959801d9d9	zink: polaris ci updates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25056>	2023-09-05 19:43:46 +00:00
Alyssa Rosenzweig	07cb81f0fc	asahi: Skip LOD bias lowering for GLES This reduces silliness in Dolphin ubershaders by eliminating the double lowering. It also makes the GLES shader assembly nicer to read. Dolphin ubershader performance at 4K on MMG improved by about 0.5%. Not massive, but definitely noticeable and reduces the delta to macOS. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:35 +00:00
Alyssa Rosenzweig	2adb0f31fc	gallium,mesa/st: Add PIPE_CONTEXT_NO_LOD_BIAS flag While desktop GL supports sampler LOD bias, GLES does not. To support the GL use case, all Gallium drivers are expected to handle sampler LOD bias. However, this may require shader code to implement (lowering tex to txb, txl to fadd+txl) and cost resources to push the LOD bias constants into the shader. The issue is compounded with something like Dolphin's GLES renderer, which does this LOD bias emulation itself -- meaning that LOD bias is lowered twice when using Dolphin with GLES! As such, this commit adds a context flag for frontends to communicate that they will never use sampler LOD bias, allowing the driver to omit the lowering as a GLES fast path (or, for Dolphin, for performance parity between GLES and GL). This will be used on Asahi. It could also be used to optimize a path on Mali-T720 supported in Panfrost, though I don't intend to write that patch. Originally https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25034 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	6269b60a1c	asahi: Conditionally expose cube arrays With =deqp. I don't want this exposed before geometry shaders since we run dEQP (GLES) far more than Piglit (GL), and we need geometry shaders to get adequate regression testing via dEQP-GLES. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	dd3dd6e127	asahi: Handle linear 1D Arrays Lowered to linear 2D Arrays, handle them like that. Fixes 1D Array case of arb_shader_image_size-builtin. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	56267ec14d	asahi: Forbid linear 1D Array images Probably a theoretical case, but these fall down the 2D path so better not allow it at any rate. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	fb60626260	agx: Run opt_idiv_const after lowering texture Shaves 10 instructions off the cube map array lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	49951ef3cc	agx: Lower coordinates for cube map array images Annoyingly different from texture coordinates. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	fb76f6cc6e	agx: Handle cube arrays when clamping arrays Need to adjust the component. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	54ebddaa0f	ail: Force page-alignment for layered attachments When rendering to a layered depth/stencil attachment, we specify the layer stride in pages. That means that depth/stencil targets must be page-aligned to be rendered to correctly. If we're merely sampling, not rendering, we do not need the extra alignment. So we add a flag to handle this case so we keep passing the generated ail tests. Fixes KHR-GLES31.core.texture_cube_map_array.color_depth_attachments Similarly, we page-align colour attachments. I don't have a good theoretical justification for this part, but it seems to be necessary and layered rendering fails otherwise. Possibly the PBE requires page-aligned layers unconditionally? Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	f9b08cf3a6	asahi: Translate cube array dimension Yet another enum. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	7895d5b79c	agx: Add unit test for cmp+sel fusing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	bdad7992bc	agx: Add unit test for if_cmp fusing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	08e0c5a9cf	agx: Fuse compares into selects This lets us save a LOT of instructions at the cost of increased register pressure. However, on my shader-db, this is still coming out ahead since no shaders are hurt for thread count/spills, and only 1/10 of the shaders helped for instruction count are hurt for register pressure. The shaders most hurt for pressure have very low pressure (7 -> 15 is the worst case) and you need a certain number of registers to use a 4 source instruction at all. Analyzing the hurt shaders, nothing concerns me too much ... this isn't as bad as I feared. So I think at this point it's worth ripping off the bandage, given the massive potential for instruction count win. This is a big improvement for some of the shaders I'm working on for my $SECRET_PROJECT. total instructions in shared programs: 1784943 -> 1775169 (-0.55%) instructions in affected programs: 644211 -> 634437 (-1.52%) helped: 3498 HURT: 38 Instructions are helped. total bytes in shared programs: 11720734 -> 11643224 (-0.66%) bytes in affected programs: 4370986 -> 4293476 (-1.77%) helped: 3572 HURT: 36 Bytes are helped. total halfregs in shared programs: 474094 -> 475165 (0.23%) halfregs in affected programs: 12821 -> 13892 (8.35%) helped: 65 HURT: 247 Halfregs are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	e7ffc799d1	agx: Fuse conditions into if's Simple greedy thing that has the potential to inflate register pressure but reduces instructions. Thanks to the recent loop work that turns if { break } into while_icmp, this also implicitly handles fusing conditions into loops, which is what actually prompted this. Surprisingly, this helps register pressure on my shader-db (no change to thread count), I guess by eliminating the boolean temps in case where the sources are used multiple times. total instructions in shared programs: 1786561 -> 1784943 (-0.09%) instructions in affected programs: 128557 -> 126939 (-1.26%) helped: 474 HURT: 13 Instructions are helped. total bytes in shared programs: 11733236 -> 11720734 (-0.11%) bytes in affected programs: 976034 -> 963532 (-1.28%) helped: 521 HURT: 13 Bytes are helped. total halfregs in shared programs: 474245 -> 474094 (-0.03%) halfregs in affected programs: 1869 -> 1718 (-8.08%) helped: 28 HURT: 7 Halfregs are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	f17ad0c516	agx: Generate unfused comparison pseudo ops So we can optimize them easier. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	ed6e391349	agx: Add pseudo-instructions for icmp/fcmp Easier to optimize with. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	139e56c0db	agx: Only use nest by 1 for loops w/o continue Apple doesn't do this, but it should be equivalent and it makes it easier to see that we can use while_icmp for break_if_icmp in loops that don't use continue (which Apple does do). So, the effect of this commit is to use while_icmp for most breaks, which saves an instruction. total instructions in shared programs: 1764199 -> 1764076 (<.01%) instructions in affected programs: 24149 -> 24026 (-0.51%) helped: 78 HURT: 0 Instructions are helped. total bytes in shared programs: 11609306 -> 11608322 (<.01%) bytes in affected programs: 164604 -> 163620 (-0.60%) helped: 78 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	8f06252e9b	agx: Add helper to determine if a NIR loop uses continue We need to emit extra instructions to handle continues, but if we don't have any, we can omit those. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	5c9495cf37	agx: Omit while_icmp without continue The only role of the while_icmp at the end of a NIR loop is to make continue jumps work. If, after emitting the loop, we learn that there are no continues, there is no need to insert a while_icmp since it would be a no-op anyway. total instructions in shared programs: 1764311 -> 1764199 (<.01%) instructions in affected programs: 26321 -> 26209 (-0.43%) helped: 82 HURT: 0 Instructions are helped. total bytes in shared programs: 11609978 -> 11609306 (<.01%) bytes in affected programs: 178842 -> 178170 (-0.38%) helped: 82 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	e71a1469a8	agx: Omit push_exec at top level In general, loops need a push_exec at the start for correctness. However, a push_exec at the top level (non-nested) is a no-op, so we can omit and save a few cycles. total instructions in shared programs: 1764350 -> 1764311 (<.01%) instructions in affected programs: 7339 -> 7300 (-0.53%) helped: 36 HURT: 0 Instructions are helped. total bytes in shared programs: 11610212 -> 11609978 (<.01%) bytes in affected programs: 48638 -> 48404 (-0.48%) helped: 36 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	6e0ae2c316	agx: Detect conditional breaks Search for code like if ... { break } and replace with a break_if pseudo-instruction for optimized handling, since the break_if lowering is better than the original code. total instructions in shared programs: 1764596 -> 1764350 (-0.01%) instructions in affected programs: 24540 -> 24294 (-1.00%) helped: 78 HURT: 0 Instructions are helped. total bytes in shared programs: 11611196 -> 11610212 (<.01%) bytes in affected programs: 166458 -> 165474 (-0.59%) helped: 78 HURT: 0 Bytes are helped. shader-db probably understates the benefit here, since this optimizes the body of loops. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	a009f39fca	agx: Use agx_first_instr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	aad7d5288a	agx: Add agx_first/last_instr helpers Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	ffb64283ee	agx: Add break_if_*cmp instructions To facilitate break optimizations. We use a more efficient lowering than the literal transition of the NIR. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	ff816f224b	agx: Split nest instruction into begin_cf + break We use it for two different things. Pseudo-instructions are cheap, split it up for easier optimization passes. This also fixes the schedule classes.. we can move the cf_begin around if we want, it's inert. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	b89c048c9b	agx: Lower nest later As part of pseudo op lowering. Simpler and will simplify control flow opts. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	b25b36a9e3	agx: Expand nest For breaking out of deeper control flow. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	8405444143	agx: Lower pseudo-ops later Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	f9343fe5ca	agx: Remove logical_end instructions They're more trouble than they're worth for us. They were originally lifted unthinkingly from ACO, where I assume they're necessary for software CF lowering, but they're just an inconvenient convenience for us. Remove em. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	a2e5d1ddd1	asahi: Force translucency for ignored render targets If we bound 4 render targets but we only write to 1 of them, the other 3 need their contents preserved. This requires either properly configuring HSR to implement colour masking (TODO) or using the big hammer of setting TRANSLUCENT. This patch picks the latter for now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	62a2bdde7f	agx: Lower pack_32_4x8_split Fixes test_integer_ops integer_dot_product. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	23c5ff814e	asahi: Allow no16 flag for disk cache The debug flags are already plumbed into driver_flags for the disk cache, so we just need to actually allow some flags instead of bailing out of the disk cache init. We only care about no16 for production right now, and it's probably a good idea to disable disk caching during most debug sessions, so allowlist only that one. Signed-off-by: Asahi Lina <lina@asahilina.net> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	8781c448a4	driconf: Disable fp16 for browsers There are way too many broken WebGL apps using the wrong precision qualifiers, which causes anything from jittery geometry to complete breakage (e.g. QuakeJS and other games). In addition, a Firefox bug is breaking basic canvas rendering for the same reason (mozilla bug #1845309). Let's just disable fp16 for browsers. There is no hope of getting all this broken stuff fixed. Signed-off-by: Asahi Lina <lina@asahilina.net> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	025da70013	asahi: Add and support the no_fp16 driconf flag This is the driconf equivalent of our debug no16 flag, which disables fp16 support to work around apps using bad GLSL precision qualifiers. Signed-off-by: Asahi Lina <lina@asahilina.net> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	45be01374f	asahi: Add scaffolding for supporting driconf options It's time to start using some of these, so add the required scaffolding to be able to have driver-specific driconf handling for us. Signed-off-by: Asahi Lina <lina@asahilina.net> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	c83672a0b3	asahi: Fix VDM pipeline field width The lower bits have a special meaning, like on the other pipelines. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	0424017e72	asahi: decode: Do not assert on buffer overruns This kills the hypervisor, let's just print and return. Also flush after decoding, so that if something else goes wrong at least we get the logs up to that point. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	acd5ed0451	asahi: decode: Implement VDM call/ret Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	c43dbadaa0	asahi: cmdbuf: Identify call/ret bits Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Asahi Lina	4f793d878a	asahi: Allocate staging resources as staging We were never setting the flag, which made these resources write-combine... Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
Alyssa Rosenzweig	119e5b9719	agx: Schedule for register pressure Since we register allocate in SSA, the number of registers required (register demand) equals the maximum number of simultaneous live values (register pressure). So if we can reduce register pressure, we are guaranteed to reduce register demand. Even an ineffective heuristic like randomly swapping instructions can only reduce pressure as long as it's conservative. This implements one such heuristic: in each block, schedule backwards, selecting the free instruction that looks like it will reduce liveness the most. In other words, the greedy algorithm to reduce register pressure. At the end of the block, if we haven't actually reduced pressure, we bail. This isn't optimal, but it's well-motivated and optimally handles special cases (like 0-source instructions). This is based on the scheduler I originally wrote for Mali. In my Dolphin ubershader branch, this improved performance at native 4K by 10fps (105fps->115fps) when I measured together with some other optimizations. On top of my current next (which notably includes nir_opt_sink improvements), this commit alone goes (53fps->54fps) which is considerably less impressive :-p shader-db results are a win, but not as large as we might hope. Instruction count win seems to be from the smaller live ranges being easier on RA (fewer swaps / moves). The two shaders affected for thread count are from fifa mobile, which go from 640 threads -> 1024 (full occupancy). In other words... this heuristic does an excellent job in a small subset of shaders. The Dolphin ubershader win was real, though :~) Note these shader-db wins are on top of a branch with the nir_opt_sink improvements. Without that, the stats are much better... The schedulers have some overlap, but they're better together. total instructions in shared programs: 1766635 -> 1763496 (-0.18%) instructions in affected programs: 445855 -> 442716 (-0.70%) helped: 1963 HURT: 350 Instructions are helped. total bytes in shared programs: 11597648 -> 11586924 (-0.09%) bytes in affected programs: 3106230 -> 3095506 (-0.35%) helped: 2003 HURT: 374 Bytes are helped. total halfregs in shared programs: 504609 -> 481980 (-4.48%) halfregs in affected programs: 138322 -> 115693 (-16.36%) helped: 3405 HURT: 311 Halfregs are helped. total threads in shared programs: 18839936 -> 18840704 (<.01%) threads in affected programs: 1280 -> 2048 (60.00%) helped: 2 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>	2023-09-05 18:50:34 +00:00
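
The greedy heuristic described in commit 119e5b9719 (schedule each block backwards, pick the free instruction that reduces liveness the most, bail if pressure did not drop) can be illustrated with a small sketch. This is not the Mesa implementation: the `Instr` type, `max_pressure`, and `schedule_block` are hypothetical names invented here, the pressure model simply counts live SSA values between instructions, and only data dependencies are tracked (no side effects or memory ordering).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical SSA instruction: one optional destination, any number of sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


def max_pressure(order, live_out=()):
    """Walk the block backwards and return the peak number of live values."""
    live = set(live_out)
    peak = len(live)
    for ins in reversed(order):
        live.discard(ins.dest)   # the value is defined here, so it is dead above
        live.update(ins.srcs)    # the sources must be live above this point
        peak = max(peak, len(live))
    return peak


def schedule_block(block, live_out=()):
    """Greedy backwards scheduling; returns the original order if nothing is gained."""
    defs = {ins.dest: i for i, ins in enumerate(block) if ins.dest is not None}

    # Number of not-yet-scheduled instructions in this block reading each value.
    pending_users = [0] * len(block)
    for ins in block:
        for s in set(ins.srcs):
            if s in defs:
                pending_users[defs[s]] += 1

    live = set(live_out)
    remaining = set(range(len(block)))
    reversed_order = []

    def delta(i):
        # Change in live-value count if instruction i is placed next (going backwards).
        ins = block[i]
        newly_live = sum(1 for s in set(ins.srcs) if s not in live)
        killed = 1 if ins.dest in live else 0
        return newly_live - killed

    while remaining:
        # "Free" instructions: every user in the block has already been placed.
        ready = [i for i in remaining if pending_users[i] == 0]
        # Ties go to the later instruction, which keeps the original order when neutral.
        pick = min(ready, key=lambda i: (delta(i), -i))
        ins = block[pick]
        remaining.remove(pick)
        reversed_order.append(ins)
        live.discard(ins.dest)
        live.update(ins.srcs)
        for s in set(ins.srcs):
            if s in defs:
                pending_users[defs[s]] -= 1

    candidate = reversed_order[::-1]
    # Bail unless the peak pressure actually dropped.
    if max_pressure(candidate, live_out) < max_pressure(block, live_out):
        return candidate
    return list(block)


block = [
    Instr("a"), Instr("b"), Instr("c"), Instr("d"),
    Instr("x", ("a", "b")), Instr("y", ("c", "d")), Instr("z", ("x", "y")),
]
scheduled = schedule_block(block, live_out=("z",))
print([ins.dest for ins in scheduled])
```

In this toy block the four zero-source instructions are hoisted together in the original order, so all four results are live at once; the sketch sinks each pair next to its consumer. Zero-source instructions always have a non-positive delta, which is one way to read the commit's remark that such special cases are handled optimally.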


177418 commits