fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 22:28:06 +02:00

Author	SHA1	Message	Date
Yonggang Luo	99dce8407e	asahi: Use nir_foreach_function_impl instead nir_foreach_function in function agx_nir_lower_zs_emit Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23920>	2023-06-29 11:29:54 +00:00
Yonggang Luo	62ce223245	treewide: Switch to use nir_foreach_function_with_impl when possible Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23903>	2023-06-29 08:36:03 +00:00
Alyssa Rosenzweig	173b9ee69a	treewide: Use nir_builder_create more perl -p0e 's/nir_builder_init\(&([^,]*), /\1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>	2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig	815efcdf7e	nir: Use nir_builder_create perl -p0e 's/nir_builder ([^;]);\snir_builder_init\(&\1, /nir_builder \1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>	2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig	d4424950ac	asahi: Use txf for background program More straightforward (txf instead of tex, with integer coords). No discernible performance difference. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>	2023-06-27 14:38:21 +00:00
Alyssa Rosenzweig	05adeb850b	agx: Use nir_lower_frag_coord_to_pixel_coord Instead of open-coding the logic. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>	2023-06-27 14:38:21 +00:00
Alyssa Rosenzweig	766535c867	agx: Implement vector live range splitting The SSA killer feature is that, under an "optimal" allocator, the number of registers used (register demand) is equal to the number of registers required (register pressure, the maximum number of variables simultaneously live at any point in the program). I put "optimal" in scare quotes, because we don't need to use the exact minimum number of registers as long as we don't sacrifice thread count or introduce spilling, and using a few extra registers when possible can help coalesce moves. Details-shmetails. The problem is that, prior to this commit, our register allocator was not well-behaved in certain circumstances, and would require an arbitrarily large number of registers. In particular, since different variables have different sizes and require contiguous allocation, in large programs the register file may become fragmented, causing the RA to use arbitrarily many registers despite having lots of registers free. The solution is vector live range splitting. First, we calculate the register pressure (the minimum number of registers that it is theoretically possible to allocate successfully), and round up to the maximum number of registers we will actually use (to give some wiggle room to coalesce moves). Then, we will treat this maximum as a bound, requiring that we don't use more registers than chosen. In the event that register file fragmentation prevents us from finding a contiguous sequence of registers to allocate a variable, rather than giving up or using registers we don't have, we shuffle the register file around (defragmenting it) to make room for the new variable. That lets us use a few moves to avoid sacrificing thread count or introducing spilling, which is usually a great choice. Android GLES3.1 shader-db results are as expected: some noise / small regressions for instruction count, but a bunch of shaders with improved thread count. 
The massive increase in register demand may seem weird, but this is the RA doing exactly what it's supposed to: using more registers if and only if they would not hurt thread count. Notice that no programs whatsoever are hurt for thread count, which is the salient part. total instructions in shared programs: 1781473 -> 1781574 (<.01%) instructions in affected programs: 276268 -> 276369 (0.04%) helped: 1074 HURT: 463 Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 12196640 -> 12201670 (0.04%) bytes in affected programs: 1987322 -> 1992352 (0.25%) helped: 1060 HURT: 513 Bytes are HURT. total halfregs in shared programs: 488755 -> 529651 (8.37%) halfregs in affected programs: 295651 -> 336547 (13.83%) helped: 358 HURT: 9737 Halfregs are HURT. total threads in shared programs: 18875008 -> 18885440 (0.06%) threads in affected programs: 64576 -> 75008 (16.15%) helped: 82 HURT: 0 Threads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	72e6b683f3	agx/lower_parallel_copy: Lower 64-bit copies To 32-bit. This way we don't get into bad situations where we need to eg swap unaligned 64-bit values or something funny like that. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	bfdaab6512	agx: Validate predecessor information Including the new loop header? flag. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	923b966775	agx: Add loop header? flag This is useful for deciding whether we need to fix up phis in RA. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	a2dbe6b688	agx: Recollect stored vectors at their use This is Timur's cheesy solution to split-hell.shader_test. Seems to work ok here. Before: 94 inst, 588 bytes, 165 halfregs, 1 threads, 0 loops, 0:0 spills:fills After: 63 inst, 454 bytes, 129 halfregs, 1 threads, 0 loops, 0:0 spills:fills On Android GLES3.1 shader-db, a few shaders are helped a lot: total instructions in shared programs: 1781706 -> 1781473 (-0.01%) instructions in affected programs: 4284 -> 4051 (-5.44%) helped: 16 HURT: 2 Instructions are helped. total bytes in shared programs: 12197854 -> 12196640 (<.01%) bytes in affected programs: 29526 -> 28312 (-4.11%) helped: 20 HURT: 2 Bytes are helped. total halfregs in shared programs: 489007 -> 488755 (-0.05%) halfregs in affected programs: 945 -> 693 (-26.67%) helped: 7 HURT: 0 Halfregs are helped. total threads in shared programs: 18873216 -> 18875008 (<.01%) threads in affected programs: 5376 -> 7168 (33.33%) helped: 7 HURT: 0 Threads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	91d98975a6	agx: Extract coordinate register size calculation It will be used for image writes too, not just reads. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	b5fccfa197	agx: Fix discards Switch our frontends from generating sample_mask_agx to discard_agx, and switching from legalizing sample_mask_agx to lowering discard_agx to sample_mask_agx. This is a much easier problem and is done here in a way that is simple (and inefficient) but obviously correct. This should fix corruption in Darwinia. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	baf67144bd	agx: Update explanation of sample_mask behaviour We discovered today that these (probably) trigger depth/stencil testing, which has significant implications for the correct/performant use. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Caio Oliveira	59cc77f0fa	compiler: Move from nir_scope to mesa_scope Just moving the enum and performing renames, no behavior change. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>	2023-06-19 23:29:26 +00:00
Alyssa Rosenzweig	0a6d919c53	asahi: Use bitfield_extract for texture lowering This makes descriptor crawls a lot easier to read, which is good because more are coming. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23351>	2023-06-15 13:08:44 -04:00
Alyssa Rosenzweig	1636037b66	agx: Implement bitfieldExtract natively We have a bfeil instruction which mostly maps to the GLSL thing, so use it with the appropriate lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23351>	2023-06-15 13:08:44 -04:00
Alyssa Rosenzweig	1d4a59448c	treewide: Remove use_scoped_barrier It is now set by all relevant drivers and not checked anywhere. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>	2023-06-13 16:36:10 +00:00
Jesse Natalie	082eba6165	nir_lower_mem_access_bit_sizes: Move options into a struct Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Jesse Natalie	4217353e2d	nir_lower_mem_access_bit_sizes: Add a bit_size input to the callback We'd like to use this callback to adjust loads and stores from things that are unsupported to things that are supported, but if the input is already supported, we'd prefer not to change it. Rather than making up a bit size that'd work and doing a bunch of pack/unpack bit math, only return a different bit size if the input one doesn't work for us (i.e. can't load enough memory or just an unsupported size entirely). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Alyssa Rosenzweig	176c3a2ab7	agx: Use common nir_steal_tex_src Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>	2023-06-12 20:09:53 +00:00
Alyssa Rosenzweig	ba27071c8b	agx: Fold addressing math into atomics Like our loads and stores, our global atomics support indexing with a 64-bit base plus a 32-bit element index, zero- or sign-extended and multiplied by the word size. Unlike the loads and stores, they do not support additional shifting (it's not too useful), so that needs an explicit lowering. Switch to using AGX variants of the atomics, running our address pattern matching on global atomics in order to delete some ALU. This cleans up the image atomic lowering nicely, since we get to take full advantage of the shift + zero-extend + add on the atomic... The shift comes from multiplying by the bytes per pixel. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23529>	2023-06-09 12:06:00 +00:00
Alyssa Rosenzweig	13535d3f9d	agx: Refactor expressions in agx_nir_lower_address So we can add more instructions without duplication. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23529>	2023-06-09 12:06:00 +00:00
Alyssa Rosenzweig	537994bb32	asahi: Remove stale comments Trivial. It is now later and I have confirmed with Piglit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Asahi Lina	d6ff4733a6	asahi: Do not leak meta shader NIR Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Asahi Lina	bb27e3f69c	asahi: Use os_dupfd_cloexec() instead of dup() This fixes file descriptor leaks in konsole/etc. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	389c0fdc7c	asahi: Add ASAHI_MESA_DEBUG=nowc flag Add a debug flag to disable write-combining as a performance hack. This may help diagnose slowness with glReadPixels() heavy workloads like screen capture. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	3a0d1f83d5	agx: Stop bit-inexact conversion propagation Despite being mathematically equivalent, the following code sequences are not bit-identical under IEEE 754 rules due to differing internal precision: fadd16 r0l, r2, 0.0 z = f2f16 x fadd16 r1h, r0l, r0h w = fadd z, y versus fadd32 r1h, r2, r0h f2f16(w) = fadd x, f2f32(y) This is probably fine under GL's relaxed floating point precision rules, but it's definitely not ok with the more strict OpenCL or Vulkan. It also is a potential problem with GL invariance rules, if we get different results for the same shader depending whether we did a monolithic compile or a fast link. The place for doing inexact transformations is NIR, when we have the information available to do so correctly. By the time we get to the backend, everything we do needs to be bit-exact to preserve sanity. Fixes dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_vertex. We believe that this is a CTS bug, but it's a useful one since it uncovered a serious driver bug that would bite us in the much less friendly Vulkan (or god forbid OpenCL) CTS later. It also seems like a magnet for GL app bugs, the fp16 support we do now is uncovering bad enough bugs as it is. shader-db results are pretty abysmal, though :| total instructions in shared programs: 1537964 -> 1571328 (2.17%) instructions in affected programs: 670231 -> 703595 (4.98%) total bytes in shared programs: 10533984 -> 10732316 (1.88%) bytes in affected programs: 4662414 -> 4860746 (4.25%) total halfregs in shared programs: 483448 -> 474541 (-1.84%) halfregs in affected programs: 58867 -> 49960 (-15.13%) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	be5004691c	asahi: Advertise GL 3.1 We now have support for baseline MSAA, except for support for eMRT. But hey, this gets us 99% of the way there, so it's worth flipping on at least in agx/next. We can also advertise dual-source blending again. It was reverted since Chromium freaks out with dual-source blending on a GL 2.1 driver, but since we're advertising GL 3.1 now, it's ok. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	8d019125a0	agx: Emit shader info late So we can take into account program transformations for the final info. This reports more accurate metadata. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	7b1d6204c8	asahi: Use nonempty tib for MSAA Affects dEQP-GLES31.functional.texture.multisample.samples_4.use_texture_depth_2d. This needs tests, but whatever, 70% of the YouTube chat said to land the hack. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> HackHackHacked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: YouTube Viewers Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	f3919bead6	asahi: Lower MSAA Use the shiny new passes to lower fragment shaders. Monolithic only right now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	1dd513727d	agx: Handle centroid and sample interpolation Works great now that all the infrastructure is wired up. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	b7f130fbbc	agx: Model interpolation for iter instructions Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	2548293e8b	agx: Split iter and iterproj instructions These are different (though related) instructions. I've split them in applegpu, let's mirror that here. This simplifies the IR a bit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	b9b71bcae6	asahi,agx: Call lower_discard_zs_emit in the driver The driver needs to lower MSAA (because only it knows the sample count). MSAA lowering depends on discards getting lowered (in order to get sample masks on the discards for sample shading to work properly). Discard lowering depends on all discards emitted. But the driver needs to lower clip planes which generates discards. To break the circular dependency, we have the driver call the discard lowering pass itself (in between lowering clip planes and lowering MSAA). Technically, this is probably a layering violation but it's the least gross solution I see. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	398851ca53	agx: Lower discard in NIR We already lower discard in NIR when depth/stencil writes are used in the shader. In this patch, we extend that lowering for when depth/stencil writes are not used, in which case the discard is lowered to a sample_mask instruction. This is a step towards multisampling, since the old lowering assumed single-sample and there's no way to express a sample mask with a standard NIR discard instructions so we need to lower in NIR anyway for sample shading (i.e. if a discard_if diverges between samples in a pixel). This changes the lowering for discard_if to be free of control flow (instead executing a sample mask instruction unconditionally). This seems to be slightly faster in SuperTuxKart and slightly slower in Dolphin, but I'm not too worried right now. To make this work, we do need some extra lowering to ensure we always execute a sample_mask instruction, in case a discard_if is buried in other control flow (as occurs with Dolphin's ubershaders). So that's added too. We need that for MSAA anyway, so pardon the line count. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	989d6fd378	agx: Enable tag writes when sample mask written Including indirectly via discard/demote. Fixes graphical artefacts in Chromium when API sample masks are hooked up, which will result in fragment programs that do not write colour/depth but do a lone sample mask write. These need tag writes enabled (according to a trace from Metal for a case constructed to test this scenario). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	f514d49ae2	agx: Handle sample_mask_agx 1:1 translation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	73bbf43bc0	agx: Plumb in nir_intrinsic_load_sample_mask_in We have a special register for this, although this will need some lowering for glSampleMask. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	6fd16dd7c9	agx: Model both sources of sample_mask We need to control both sources to implement multisampling properly. The semantic is something like: foreach sample in the first mask { if corresponding bit in second mask is set { make sample live } else { make sample dead } } But I'm reticent to document more formally until the details are really understood and properly tested. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	bffbe099df	asahi: Set uses_sample_shading for background program If we read gl_SampleID we need the lowering, even though we don't call into gather_info to set the bit for us. So set the bit manually. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig	0b95d81150	agx: Assert that sample shading is lowered Lest someone mess this up later and then try to "implement" these intrinsics in the backend. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	46a5a99d24	asahi: Add alpha-to-coverage (and alpha-to-one) lowering This should probably be shared code but meh. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	51e868f3a2	asahi: Add passes to lower sample intrinsics Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	f28962e29a	asahi: Add passes to lower MSAA Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	70b8babe3c	agx: Use textures_used, not num_textures The latter doesn't account for holes. Fixes regression in Neverball on Asahi. Fixes: `e607a89f` ("mesa/main: ff-fragshader to nir") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	f1c2ea99e2	agx: Constant fold when optimizing int64 Otherwise we can get bcsel(false, ...) in the final optimized code, which isn't great. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	9641fba9ba	agx: Set support_16bit_alu Allows some more optimizations. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>	2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig	99a00e2247	treewide: Use nir_trim_vector more Via Coccinelle patches @@ expression a, b, c; @@ -nir_channels(b, a, (1 << c) - 1) +nir_trim_vector(b, a, c) @@ expression a, b, c; @@ -nir_channels(b, a, BITFIELD_MASK(c)) +nir_trim_vector(b, a, c) @@ expression a, b; @@ -nir_channels(b, a, 3) +nir_trim_vector(b, a, 2) @@ expression a, b; @@ -nir_channels(b, a, 7) +nir_trim_vector(b, a, 3) Plus a fixup for pointless trimming an immediate in RADV and radeonsi. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>	2023-06-06 18:52:25 +00:00
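
The Coccinelle rules in commit 99a00e2247 rest on one observation: a channel mask of the form `(1 << c) - 1` has exactly its low `c` bits set, so selecting channels with it is the same as keeping the first `c` components. The sketch below illustrates that equivalence on plain Python lists. It is only a model: `channels`, `trim_vector` and `bitfield_mask` are hypothetical stand-ins written for this example, not the NIR builder functions or the `BITFIELD_MASK` macro, and they ignore the builder argument that the real helpers take.

```python
def bitfield_mask(count):
    """Hypothetical stand-in for BITFIELD_MASK(c): the low `count` bits set."""
    return (1 << count) - 1


def channels(vec, mask):
    """Model of mask-based selection: keep component i when bit i of mask is set."""
    return [v for i, v in enumerate(vec) if (mask >> i) & 1]


def trim_vector(vec, count):
    """Model of trimming: keep only the first `count` components."""
    return vec[:count]


vec4 = ["x", "y", "z", "w"]

# A contiguous low mask is equivalent to trimming, for every component count.
for count in range(1, len(vec4) + 1):
    assert channels(vec4, bitfield_mask(count)) == trim_vector(vec4, count)

# The literal masks rewritten by the last two rules: 3 -> 2 and 7 -> 3.
assert channels(vec4, 3) == trim_vector(vec4, 2)
assert channels(vec4, 7) == trim_vector(vec4, 3)
```

A mask with a hole, such as 5, selects `x` and `z` and has no trimming equivalent, which is consistent with the rules only matching contiguous low masks.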


874 commits