fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-21 12:00:41 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	87e57eae09	asahi: Rename no colour output to tag write disable Comparison with PowerVR's XML shows that this is the actual name... And it needs to be set a bit more carefully than "no colour output" in order to get correct behaviour for depth-only passes that use sample mask / discard. Fix the name first, the extra conditions will come when they're needed for multisampling. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig	e13f9caa25	agx: Fix packing for iadd with shift Wrong bit pattern was packed, oops. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig	cd7e016961	asahi: Use device_load shift for VBO loads When possible. Only occasionally possible because the loads are pretty limited in the addressing arithmetic. This probably doesn't matter for performance but it saves some noise in dEQP tests which makes for nicer debugging, plenty of optimizations end up worth it for that alone. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig	bd9c33e16a	agx: Defeature fsub All has_fsub does is fuse fsubs (they're unfused otherwise), no point doing that if we're just going to lower. shader-db is mostly noise. total instructions in shared programs: 1487217 -> 1487035 (-0.01%) instructions in affected programs: 22658 -> 22476 (-0.80%) helped: 85 HURT: 2 helped stats (abs) min: 1.0 max: 12.0 x̄: 2.19 x̃: 1 helped stats (rel) min: 0.38% max: 2.46% x̄: 0.87% x̃: 0.65% HURT stats (abs) min: 1.0 max: 3.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.58% max: 1.08% x̄: 0.83% x̃: 0.83% 95% mean confidence interval for instructions value: -2.51 -1.67 95% mean confidence interval for instructions %-change: -0.97% -0.70% Instructions are helped. total bytes in shared programs: 10189996 -> 10189288 (<.01%) bytes in affected programs: 158132 -> 157424 (-0.45%) helped: 85 HURT: 2 helped stats (abs) min: 4.0 max: 48.0 x̄: 8.75 x̃: 4 helped stats (rel) min: 0.22% max: 1.44% x̄: 0.51% x̃: 0.38% HURT stats (abs) min: 6.0 max: 30.0 x̄: 18.00 x̃: 18 HURT stats (rel) min: 0.90% max: 0.91% x̄: 0.91% x̃: 0.91% 95% mean confidence interval for bytes value: -9.98 -6.30 95% mean confidence interval for bytes %-change: -0.56% -0.39% Bytes are helped. total halfregs in shared programs: 462536 -> 462556 (<.01%) halfregs in affected programs: 131 -> 151 (15.27%) helped: 1 HURT: 4 helped stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 helped stats (rel) min: 28.57% max: 28.57% x̄: 28.57% x̃: 28.57% HURT stats (abs) min: 4.0 max: 8.0 x̄: 5.50 x̃: 5 HURT stats (rel) min: 12.77% max: 36.36% x̄: 25.01% x̃: 25.45% 95% mean confidence interval for halfregs value: -0.65 8.65 95% mean confidence interval for halfregs %-change: -18.64% 47.23% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig	1185ac931f	agx: Remove bogus assert I->mask isn't even valid for iter instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:59 -04:00
Alyssa Rosenzweig	fc88876329	agx: Handle linear 2D array textureSize() We handle linear 2D arrays internally for blit shaders, so we need textureSize to work for these. That requires some special casing, because there's a line stride where the layer count would otherwise be. But it's not too bad. Fixes dEQP-GLES3.functional.shaders.texture_functions.texturesize.sampler2darray_* when forcing linear textures. Since we clamp array access to the maximum layer, we need textureSize() to work for even the most basic array texturing. So this should fix blits from linear 2D arrays as well, which finally unlocks support for compressed arrays/cubes/3D textures. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:59 -04:00
Alyssa Rosenzweig	21d7049925	agx/lower_zs_emit: Fix progress returning Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:56 -04:00
Alyssa Rosenzweig	c8e331bf72	agx: Fix abs/neg propagation into fcmpsel The first two sources are floats, the latter two sources and destination (and hence the opcode) are not. Reflect that when packing and optimizing. Noticed while debugging a silly dEQP test. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:56 -04:00
Alyssa Rosenzweig	632014ece0	agx: Handle splits of uniforms This is straightforward, and can happen with certain u2u16 patterns. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:56 -04:00
Alyssa Rosenzweig	2f907dd827	asahi: Identify XML for barycentric coordinates Reading them from a fragment shader, not interpolating at custom ones. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:53 -04:00
Alyssa Rosenzweig	e9b471d1b3	asahi: Fix disk cache disable with AGX_MESA_DEBUG We go to initialize the disk cache before we've compiled any shaders so agx_compiler_debug is 0 at this point. Don't try to read it, instead go through a safe getter that will do the right thing. Fixes: `5e9538c12e` ("agx: isolate compiler debug flags") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 09:00:40 -04:00
Asahi Lina	ae2b312ecb	asahi: Add batch state debugging I've had to reimplement this more than once, let's just make a flag for it. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 08:59:41 -04:00
Janne Grunau	be3a1e2e88	asahi: Free low VA BOs correctly These need the shader_base added to them. Fixes GEM_BIND errors after usc_head provides VA without the VM_SHADER_START offset from returned low VA. Signed-off-by: Janne Grunau <j@jannau.net> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 08:59:41 -04:00
Asahi Lina	6bbf10f3f2	asahi: Identify ZS resolve bits (tentative) Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>	2023-05-07 08:58:23 -04:00
Asahi Lina	888d443f29	asahi: Add resource debugging I keep re-implementing this every time I look at resource-related issues. Let's just make it official so we can turn it on with a flag instead of having to add printfs every time ^^ Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Asahi Lina	883ba4b161	asahi: Make BO import path failures more robust These operations can fail for complex reasons through no fault of mesa, so we should have proper runtime checks for them even in release builds. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Asahi Lina	00064ba4e3	asahi: Fix style nits Found with a grep abomination which is probably too broken/silly to actually implement in CI... but hey, at least it found some. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Asahi Lina	a88b9c5540	asahi: Locate low VA BOs correctly These need the shader_base added to them. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Asahi Lina	75e3212809	Revert "asahi: Advertise dual-source blending" This reverts commit `f4e2b22646`. This is broken until GL3 is enabled, possibly due to a core Mesa bug, but it's a corner case not worth fixing. Fixes Chromium. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Alyssa Rosenzweig	8a6d74d15b	agx: Make signal_pix instructions explicit Rather than implicitly packing them with the sample_mask. Again, this is just changing where they're emitted, no functional changes yet. Bug for bug compatibility with the old behaviour. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Alyssa Rosenzweig	bb530760a2	agx: Rename writeout to wait_pix This is the name applegpu is currently using, to capture the semantics of a pixel fence. I'm not sure what Apple calls this but wait_pix is closer than writeout for sure. This commit just does the rename. It doesn't fix the broken semantics we've had, this is to ease review and bisection. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Alyssa Rosenzweig	2028e7b88b	agx: Tease apart some sample_mask packing magic There's a second instruction here, and a second source in the first instruction. applegpu has known about the encodings for a while but I never updated the packing code. We will need to stop hardcoding this for multisampling support, as preparation tease apart the magic pieces. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:04 +00:00
Alyssa Rosenzweig	23880daa8d	asahi: Lower 1D to 2D Khronos APIs require that we support mipmapping even for 1D textures. However, it isn't clear if this is supported in the hardware, and how it would work even if it is. But 1D textures are pretty useless, so we just lower 1D textures to 2D textures instead of worrying about that. Fixes piles of Piglits relating to 1D textures. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	098295f1a0	asahi: Implement null textures Use the same silly workaround that Metal does, to fill in texture descriptors when there's nothing bound in the interest of robust behaviour. Fixes null pointer dereference in arb_shading_language_420pack-active-sampler-conflict. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	203c9c12e2	agx: Don't overallocate registers We need to account for the full vector lengths. Especially important once we start restricting the reg file. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	42c5d6140b	agx: Coalesce more collects Try harder to coalesce collects, by trying to allocate collects only to regions of the register file where we actually have a full vector worth of registers free. If we already know that the vector will be blocked later, it's not a good base register to pick since we'd be forced to shuffle later. So, this tweak to the collect coalescing heuristic lets us eliminate a pile of pointless copying. shader-db results are excellent. Note that, although we use more registers, none of the shaders tested had their thread count affected, likely because the max HURT isn't too high and most of the scary % here is from using a few more registers when the register pressure is already low. In the near future, that property will become guaranteed thanks to live range splitting, too. total instructions in shared programs: 1507337 -> 1500562 (-0.45%) instructions in affected programs: 428137 -> 421362 (-1.58%) helped: 2658 HURT: 167 helped stats (abs) min: 1.0 max: 34.0 x̄: 2.63 x̃: 2 helped stats (rel) min: 0.10% max: 25.00% x̄: 3.04% x̃: 2.14% HURT stats (abs) min: 1.0 max: 10.0 x̄: 1.24 x̃: 1 HURT stats (rel) min: 0.20% max: 23.81% x̄: 3.90% x̃: 3.57% 95% mean confidence interval for instructions value: -2.49 -2.31 95% mean confidence interval for instructions %-change: -2.76% -2.51% Instructions are helped. total bytes in shared programs: 10333670 -> 10293172 (-0.39%) bytes in affected programs: 2996682 -> 2956184 (-1.35%) helped: 2660 HURT: 175 helped stats (abs) min: 2.0 max: 204.0 x̄: 15.70 x̃: 12 helped stats (rel) min: 0.08% max: 23.08% x̄: 2.64% x̃: 1.83% HURT stats (abs) min: 2.0 max: 60.0 x̄: 7.26 x̃: 6 HURT stats (rel) min: 0.12% max: 22.39% x̄: 3.19% x̃: 2.78% 95% mean confidence interval for bytes value: -14.81 -13.76 95% mean confidence interval for bytes %-change: -2.39% -2.18% Bytes are helped. 
total halfregs in shared programs: 417284 -> 427363 (2.42%) halfregs in affected programs: 49814 -> 59893 (20.23%) helped: 95 HURT: 3018 helped stats (abs) min: 1.0 max: 8.0 x̄: 2.29 x̃: 2 helped stats (rel) min: 2.44% max: 28.57% x̄: 9.20% x̃: 6.06% HURT stats (abs) min: 1.0 max: 14.0 x̄: 3.41 x̃: 4 HURT stats (rel) min: 2.08% max: 150.00% x̄: 36.54% x̃: 27.27% 95% mean confidence interval for halfregs value: 3.17 3.31 95% mean confidence interval for halfregs %-change: 34.05% 36.23% Halfregs are HURT. total threads in shared programs: 16465280 -> 16465280 (0.00%) threads in affected programs: 0 -> 0 helped: 0 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	8e501b758a	asahi/decode: Print VDM barriers Instead of just decoding silently. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	0bbd8b502a	asahi/decode: Remove agxdecode_dump_bo Now that we have proper parsing this is more of a nuisance than not. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	e713983875	agx: Add helper for calculating occupancy Add information about the relationship between program register usage and program occupancy (the maximum number of threads that may execute concurrently on a single shader core). This table is derived from studying the maxTotalThreadsPerThreadgroup property in Metal while varying the register usage, something I blogged about a few years back. It's probably not 100% accurate and it hasn't been tested against hardware, but it matters "only" for performance (not correctness) so I'm not super stressed about the details. In the (near) future, RA will be able to make use of this information to know exactly when it can use more registers without hurting performance. In the present, it's just used for better shader-db statistics. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	3a87d2cfbd	agx: Don't destroy usub_sat with constant Fixes KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	8ec91ee16f	agx: Don't allow uniform source to local_atomic Fixes KHR-GLES31.core.compute_shader.atomic-case3 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	c643f42dc6	agx: Constify agx_{read,write}_registers Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	da9c8a4627	agx: Assert that we don't overflow registers This will become particularly important when we bound to smaller register files. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	7c7b95ba2a	agx: DCE even with noopt To simplify live range splitting, RA will soon assume that DCE has run (removing extraneous vectors). So run DCE even when otherwise disabling backend optimizations. AGX_MESA_DEBUG=noopt is still useful for disabling instruction combining, which is the more-likely-to-be-buggy pass anyway. This also fixes IR not being printed with noopt. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Alyssa Rosenzweig	75b858e904	asahi: Support more renderable formats Fixes KHR-GLES3.copy_tex_image_conversions.forbidden.* Arguably working around a mesa/st issue but more format support is good for compatibility and performance anyway. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Janne Grunau <j@jannau.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>	2023-04-07 03:23:03 +00:00
Emma Anholt	094b5a71d7	agx: Enable nir_lower_frexp. Needed for Vulkan, and for dropping GLSL frontend lowering for the deqp coverage override case. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>	2023-04-06 02:32:01 +00:00
Emma Anholt	2a33ea95d6	glsl: Retire ldexp lowering in favor of the nir lowering flag. Compilers need to set the nir flag anyway for vulkan, so just pass ldexp through to NIR and let that handle it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>	2023-04-06 02:32:00 +00:00
Alyssa Rosenzweig	0f974d1f90	asahi: Convert to SPDX headers Also drop my email address in the copyright lines and fix some "Copyright 208 Alyssa Rosenzweig" lines, I'm not that old. Together this drops a lot of boilerplate without losing any meaningful licensing information. SPDX is already in use for the MIT-licensed code in turnip, venus, and a few other scattered parts of the tree, so this should be ok from a Mesa licensing standpoint. This reduces friction to create new files, by parsing the copy/paste boilerplate and being short enough you can easily type it out if you want. It makes new files seem less daunting: 20 lines of header for 30 lines of code is discouraging, but 2 lines of header for 30 lines of code is reasonable for a simple compiler pass. This has technical effects, as lowering the barrier to making new files should encourage people to split code into more modular files with (hopefully positive) effects on project compile time. This helps with consistency between files. Across the tree we have at least a half dozen variants of the MIT license text (probably more), plus code that uses SPDX headers instead. I've already been using SPDX headers in Asahi manually, so you can tell old vs new code based on the headers. Finally, it means less for reviewers to scroll through adding files. Minimal actual cognitive burden for reviewers thanks to banner blindness, but the big headers still bloat diffs that add/delete files. I originally proposed this in December (for much more of the tree) but someone requested I wait until January to discuss. I've been trying to get in touch with them since then. It is now almost April and, with still no response, I'd like to press forward with this. So with a joint sign-off from the major authors of the code in question, let's do this. 
Signed-off-by: Asahi Lina <lina@asahilina.net> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Rose Hudson <rose@krx.sh> Acked-by: Lyude Paul [over IRC: "yes I'm fine with that"] Meh'd-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22062>	2023-03-28 05:14:00 +00:00
Eric Engestrom	8e6ac35658	asahi: fix a few typos Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21947>	2023-03-17 22:11:33 +00:00
Asahi Lina	04387269dd	asahi: Extend batch tracking for explicit sync Now that we have stub sync support in the submission API, we can implement the batch tracking changes required to support an explicit sync world. This excludes the UAPI-specific bits (command decoding and status parsing). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21620>	2023-03-16 20:42:01 +00:00
Asahi Lina	e6b565699f	asahi: Support importing sync objects on BO export When a BO is exported, implicit sync convention requires that writers signal a fence on the object when complete. We already do this for BOs that are already exported, but it is possible for a BO to be written to, then exported for the first time. Add a field to agx_bo to keep track of the current writer syncobj handle. On first export, we use this to import it into the DMA-BUF. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21620>	2023-03-16 20:42:01 +00:00
Alyssa Rosenzweig	45554a957a	agx: Lower discard late Fixes regression with Dolphin's ubershaders. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21855>	2023-03-11 23:34:56 +00:00
Alyssa Rosenzweig	7e908878c1	ail: Restructure generated tests Currently, the generated tests consist of some boilerplate, generated test cases, and at the very end the actual test. This is bad for readability, because the actual code is all the way at the bottom. It's also bad for clang-format linting: even though the test cases are /* clang-format off */, they still take an exceptionally long time to parse when linting. I suspect this is a clang-format bug, but it's easy enough to workaround. To solve these issues, restructure so that the test cases are in separate files (containing the actual data), but the manually written test functions are consolidated into a new family of generated layout tests. This is probably cleaner. Parallel clang-format linting is now 10x faster on the M1, which means it's now practical to lint in my "publish branch" hook. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21854>	2023-03-11 20:45:42 +00:00
Alyssa Rosenzweig	933b5c76f6	agx: Switch to scoped_barrier Rather than ingesting separate control and memory barriers, ingest only the combined and optimized scoped_barrier intrinsic. For barriers originating from GLSL, this makes it easier to ensure correctness. For barriers originating from SPIR-V, this is required for translation at all, as spirv_to_nir knows only scoped barriers. So this gets us closer to Vulkan and OpenCL. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21752>	2023-03-11 16:20:06 +00:00
Alyssa Rosenzweig	b768a254f7	agx: Use nir_lower_mem_access_bit_sizes Lowers away 64-bit loads, which we'll create in the sysval lowering for dynamically indexed UBOs/VBOs. The lowering generates pack_64_2x32 instructions, so lower those too. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21674>	2023-03-11 14:15:50 +00:00
Alyssa Rosenzweig	8a53050d7d	agx: Implement extract_[ui]16 Instead of lowering to bitwise ops. Yet another way of subdividing in NIR. Probably insignificant but makes it easy to check that the pass ordering from the previous pass is right. It does let us get much better codegen for unpacksnorm2x16, whatever that's worth. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21674>	2023-03-11 14:15:50 +00:00
Alyssa Rosenzweig	706815488e	agx: Fix subdivision coalescing As intended. We can't CSE with partial null destinations in the way, so we shouldn't eliminate dead destinations until after CSE has run. But we should still eliminate dead instructions to ensure CSE doesn't move things around needlessly, hurting register pressure. Noticed while debugging live range splitting. No GLES3.0 shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21674>	2023-03-11 14:15:50 +00:00
Alyssa Rosenzweig	5ea9c2e634	agx: Make partial DCE optional Our dead code elimination pass does two things: 1. delete instructions that are entirely unnecessary 2. delete unnecessary destinations of necessary instructions To deal with pass ordering issues, we sometimes want to do #1 without #2. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21674>	2023-03-11 14:15:50 +00:00
Alyssa Rosenzweig	16f8bfb042	agx: Don't set lower_pack_split We should handle nir_op_unpack_32_2x16_split_* natively, since we can generate better code with agx_subdivide (coalescing the ops away) than the bitshift lowering. That said, we do need some extra instructions for the floating point conversions. No shader-db changes (which makes sense because we're targeting the GLES3.0 shader-db, which doesn't have the packing GLSL functions). The real motivation of this change isn't optimizing some GLSL pack functions, though, it's avoiding a code regression from using NIR's memory bit size lowering in a future MR. That lowering will turn things like "load i16vec4" into "load i32vec2 + unpack_32_2x16", so we need to be able to coalesce that unpack. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21674>	2023-03-11 14:15:50 +00:00
Alyssa Rosenzweig	6b22a02f90	asahi,agx: Implement buffer textures with gnarly NIR Implement buffer textures in full generality. There are a few issues here: * OpenGL requires buffer textures support a minimum size of 65536 elements, however 1D textures in AGX are (at most) 8192 elements. * OpenGL 4.0 (and OpenGL ES) require buffer textures to support the "RGB32" texture formats. These are 3 packed channels of 32-bits each. In general, non-power-of-two texel sizes are problematic. AGX does not support any such formats and we rely on the GL frontend to lower to a padded format (RGBX) if necessary. Such a lowering cannot work for buffer textures, however, so we need to find a way to implement RGB32 buffer textures. We solve these issues in the following way: * Use 2D texture descriptors for buffer textures, with a large fixed power-of-two size along one axis. Then large texel indices may be accessed at a small vec2 texel coordinate, and since the fixed dimension is a power-of-two, that vector may be recovered by simply shifting and masking. This effectively avoids the size restriction. We do need to clamp texel indices to the buffer size to avoid faulting on OOB reads, since we may read past the end of the buffer (if the app binds a non-page-aligned offset into the buffer). * Use a general purpose memory load for RGB32 buffer textures. Lower the texture load instruction to a memory load from the buffer and some address arithmetic. There's no format conversion needed for RGB32, other than maybe filling in a format-appropriate alpha, so this is straightforward. Again, we need to clamp the texel index for robustness with OOB reads. Each of these solutions brings its own problem. * Using 2D textures instead of 1D requires physically rounding up the buffer size when packing the descriptor, so we can no longer implement textureSize() by reading off the texture descriptor like normal. 
* We don't know at compile-time whether a given texture load will read from an RGB32 buffer texture or not, so we need to emit code for both. In Vulkan, we can't key the shader to this property, either, since it's descriptor set state and not pipeline state. And each of these problems in turn brings its own solution: * The texture descriptor is linear, so the "compression buffer address" field is ignored by the hardware. We stash the real buffer size there so that textureSize becomes a load from the texture descriptor like usual, without requiring a sideband (which would complicate bindless textures). * If we determine a texture descriptor contains RGB32 data, then it will never be interpreted by the hardware and hence does not need to be a valid texture descriptor. So, we extend the hardware's format enum to contain a software-defined RGB32 format enum. Then, when lowering texture buffer loads, we either read it as a typed RGB32 memory load or as a texture load depending on the value of the format field in the texture descriptor. All of this is accomplished with a big NIR pass generating a pile of strange looking code. But it should be good enough in practice for this silly feature. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21672>	2023-03-11 02:26:31 +00:00
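
The index arithmetic described in commit 6b22a02f90 (a linear buffer-texture index recovered as a 2D coordinate by shifting and masking, after clamping to the buffer size) can be sketched in plain C. This is only an illustration under stated assumptions: the row width `BUF_TEX_WIDTH_LOG2`, the `texel_coord` struct, and both function names are hypothetical, and the actual driver does this in a NIR lowering pass rather than in host code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed row width (as a log2). The commit message only says
 * "a large fixed power-of-two size"; the real value is not stated there. */
#define BUF_TEX_WIDTH_LOG2 10u
#define BUF_TEX_WIDTH      (1u << BUF_TEX_WIDTH_LOG2)

/* Hypothetical helper type: a 2D texel coordinate. */
struct texel_coord {
   uint32_t x, y;
};

/* Clamp an out-of-bounds index to the last valid element, so a read past
 * the end of the buffer cannot fault. Requires a non-empty buffer. */
static uint32_t
clamp_texel_index(uint32_t index, uint32_t size_el)
{
   assert(size_el > 0);
   return index < size_el ? index : size_el - 1;
}

/* Recover the 2D coordinate from a linear index. Because the width is a
 * power of two, the divide/modulo reduce to a shift and a mask. */
static struct texel_coord
buffer_index_to_coord(uint32_t index, uint32_t size_el)
{
   uint32_t i = clamp_texel_index(index, size_el);
   struct texel_coord c = {
      .x = i & (BUF_TEX_WIDTH - 1),
      .y = i >> BUF_TEX_WIDTH_LOG2,
   };
   return c;
}
```

With the assumed width of 1024, index 1025 in a 65536-element buffer maps to (1, 1), and an out-of-bounds index such as 70000 is clamped to 65535 and maps to (1023, 63).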


795 commits