fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 11:00:11 +01:00

Author	SHA1	Message	Date
Ian Romanick	4246c2869c	nir/algebraic: Invert comparisons less often This fixes the piglit test range_analysis_fsat_of_nan.shader_test. That test contains some code like o = saturate(X) > 0 ? vec4(1.0, 0.0, 0.0, 1.0) : vec4(0.0, 1.0, 0.0, 1.0); A clever optimizer will convert this to o = vec4(float(saturate(X) > 0), float(!(saturate(X) > 0)), 0, 1); Due to the ordering of optimizations in the compiler, the `saturate` operations are removed. This is safe even in the presence of NaN. o = vec4(float(X > 0), float(!(X > 0)), 0, 1); Since the calculations are not marked precise, an overzealous optimizer may reduce this to o = vec4(float(X > 0), float(X <= 0), 0, 1); This will result in black being output. The GLSL spec gives quite a bit of leeway with respect to NaN, but that seems too far. The shader author asked for a result of red or green. A result of black is still "undefined behavior," but it's also a little mean. This also enables CSE to do its job better. v2: Update A530 expected image checksum for minetest.trace. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4531 Fixes: `0dbda153aa` ("nir/algebraic: Flag inexact optimizations") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21041563 -> 21041789 (<.01%) instructions in affected programs: 992066 -> 992292 (0.02%) helped: 526 HURT: 548 helped stats (abs) min: 1 max: 16 x̄: 2.48 x̃: 2 helped stats (rel) min: 0.04% max: 5.56% x̄: 0.74% x̃: 0.49% HURT stats (abs) min: 1 max: 27 x̄: 2.80 x̃: 2 HURT stats (rel) min: 0.04% max: 4.55% x̄: 0.59% x̃: 0.38% 95% mean confidence interval for instructions value: -0.00 0.42 95% mean confidence interval for instructions %-change: -0.12% <.01% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 855885569 -> 856118189 (0.03%) cycles in affected programs: 343637248 -> 343869868 (0.07%) helped: 907 HURT: 541 helped stats (abs) min: 1 max: 7724 x̄: 206.45 x̃: 36 helped stats (rel) min: <.01% max: 29.97% x̄: 1.01% x̃: 0.37% HURT stats (abs) min: 1 max: 14177 x̄: 776.09 x̃: 31 HURT stats (rel) min: <.01% max: 29.94% x̄: 1.24% x̃: 0.35% 95% mean confidence interval for cycles value: 84.30 237.00 95% mean confidence interval for cycles %-change: -0.32% -0.01% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). LOST: 3 GAINED: 5 Ice Lake total instructions in shared programs: 20027107 -> 20025352 (<.01%) instructions in affected programs: 1068856 -> 1067101 (-0.16%) helped: 1153 HURT: 273 helped stats (abs) min: 1 max: 14 x̄: 1.83 x̃: 1 helped stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.35% HURT stats (abs) min: 1 max: 15 x̄: 1.29 x̃: 1 HURT stats (rel) min: 0.16% max: 1.30% x̄: 0.58% x̃: 0.60% 95% mean confidence interval for instructions value: -1.33 -1.13 95% mean confidence interval for instructions %-change: -0.43% -0.34% Instructions are helped. total cycles in shared programs: 979499227 -> 979448725 (<.01%) cycles in affected programs: 344261539 -> 344211037 (-0.01%) helped: 1079 HURT: 441 helped stats (abs) min: 1 max: 9384 x̄: 147.78 x̃: 48 helped stats (rel) min: <.01% max: 31.83% x̄: 0.90% x̃: 0.33% HURT stats (abs) min: 1 max: 7220 x̄: 247.07 x̃: 32 HURT stats (rel) min: <.01% max: 31.30% x̄: 1.52% x̃: 0.53% 95% mean confidence interval for cycles value: -70.01 3.56 95% mean confidence interval for cycles %-change: -0.35% -0.05% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 10564 -> 10568 (0.04%) spills in affected programs: 143 -> 147 (2.80%) helped: 0 HURT: 1 total fills in shared programs: 11343 -> 11347 (0.04%) fills in affected programs: 287 -> 291 (1.39%) helped: 0 HURT: 1 LOST: 3 GAINED: 2 Skylake total instructions in shared programs: 18192274 -> 18190128 (-0.01%) instructions in affected programs: 1000188 -> 998042 (-0.21%) helped: 1149 HURT: 55 helped stats (abs) min: 1 max: 14 x̄: 1.92 x̃: 1 helped stats (rel) min: 0.04% max: 6.67% x̄: 0.67% x̃: 0.42% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.16% max: 0.55% x̄: 0.27% x̃: 0.26% 95% mean confidence interval for instructions value: -1.87 -1.69 95% mean confidence interval for instructions %-change: -0.67% -0.58% Instructions are helped. total cycles in shared programs: 960856054 -> 960728040 (-0.01%) cycles in affected programs: 340840968 -> 340712954 (-0.04%) helped: 1079 HURT: 233 helped stats (abs) min: 1 max: 7640 x̄: 170.95 x̃: 46 helped stats (rel) min: <.01% max: 30.20% x̄: 0.96% x̃: 0.28% HURT stats (abs) min: 1 max: 6864 x̄: 242.23 x̃: 26 HURT stats (rel) min: <.01% max: 34.64% x̄: 2.10% x̃: 0.22% 95% mean confidence interval for cycles value: -135.62 -59.53 95% mean confidence interval for cycles %-change: -0.59% -0.25% Cycles are helped. LOST: 15 GAINED: 1 Broadwell total instructions in shared programs: 17855624 -> 17853580 (-0.01%) instructions in affected programs: 1012209 -> 1010165 (-0.20%) helped: 1105 HURT: 52 helped stats (abs) min: 1 max: 13 x̄: 1.90 x̃: 1 helped stats (rel) min: 0.03% max: 6.67% x̄: 0.67% x̃: 0.36% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.13% max: 0.52% x̄: 0.26% x̃: 0.25% 95% mean confidence interval for instructions value: -1.86 -1.67 95% mean confidence interval for instructions %-change: -0.68% -0.58% Instructions are helped. 
total cycles in shared programs: 1029905447 -> 1029840699 (<.01%) cycles in affected programs: 347102680 -> 347037932 (-0.02%) helped: 1007 HURT: 211 helped stats (abs) min: 1 max: 1360 x̄: 89.76 x̃: 48 helped stats (rel) min: <.01% max: 16.26% x̄: 0.69% x̃: 0.25% HURT stats (abs) min: 1 max: 1297 x̄: 121.51 x̃: 20 HURT stats (rel) min: <.01% max: 31.31% x̄: 1.21% x̃: 0.20% 95% mean confidence interval for cycles value: -62.39 -43.92 95% mean confidence interval for cycles %-change: -0.47% -0.25% Cycles are helped. total spills in shared programs: 20335 -> 20333 (<.01%) spills in affected programs: 19 -> 17 (-10.53%) helped: 2 HURT: 0 total fills in shared programs: 25905 -> 25899 (-0.02%) fills in affected programs: 23 -> 17 (-26.09%) helped: 2 HURT: 0 LOST: 9 GAINED: 0 Haswell total instructions in shared programs: 16418516 -> 16417293 (<.01%) instructions in affected programs: 223785 -> 222562 (-0.55%) helped: 590 HURT: 67 helped stats (abs) min: 1 max: 15 x̄: 2.19 x̃: 1 helped stats (rel) min: 0.03% max: 6.52% x̄: 0.87% x̃: 0.60% HURT stats (abs) min: 1 max: 2 x̄: 1.04 x̃: 1 HURT stats (rel) min: 0.04% max: 1.85% x̄: 0.44% x̃: 0.25% 95% mean confidence interval for instructions value: -2.01 -1.71 95% mean confidence interval for instructions %-change: -0.80% -0.67% Instructions are helped. total cycles in shared programs: 1037179754 -> 1037084874 (<.01%) cycles in affected programs: 352541071 -> 352446191 (-0.03%) helped: 1093 HURT: 182 helped stats (abs) min: 1 max: 888 x̄: 111.03 x̃: 64 helped stats (rel) min: <.01% max: 27.30% x̄: 0.84% x̃: 0.20% HURT stats (abs) min: 1 max: 6777 x̄: 145.49 x̃: 21 HURT stats (rel) min: <.01% max: 24.10% x̄: 1.99% x̃: 0.29% 95% mean confidence interval for cycles value: -88.10 -60.73 95% mean confidence interval for cycles %-change: -0.58% -0.29% Cycles are helped. 
total spills in shared programs: 17457 -> 17456 (<.01%) spills in affected programs: 12 -> 11 (-8.33%) helped: 1 HURT: 0 total fills in shared programs: 20387 -> 20385 (<.01%) fills in affected programs: 15 -> 13 (-13.33%) helped: 1 HURT: 0 LOST: 6 GAINED: 1 Ivy Bridge and earlier platforms had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515482 -> 15513998 (<.01%) instructions in affected programs: 239739 -> 238255 (-0.62%) helped: 573 HURT: 57 helped stats (abs) min: 1 max: 20 x̄: 2.73 x̃: 2 helped stats (rel) min: 0.03% max: 9.84% x̄: 0.94% x̃: 0.55% HURT stats (abs) min: 1 max: 2 x̄: 1.39 x̃: 1 HURT stats (rel) min: 0.09% max: 1.85% x̄: 0.52% x̃: 0.35% 95% mean confidence interval for instructions value: -2.57 -2.14 95% mean confidence interval for instructions %-change: -0.89% -0.73% Instructions are helped. total cycles in shared programs: 584509880 -> 584463152 (<.01%) cycles in affected programs: 11765280 -> 11718552 (-0.40%) helped: 661 HURT: 152 helped stats (abs) min: 1 max: 3073 x̄: 101.99 x̃: 32 helped stats (rel) min: <.01% max: 34.38% x̄: 1.46% x̃: 0.50% HURT stats (abs) min: 1 max: 6637 x̄: 136.10 x̃: 15 HURT stats (rel) min: <.01% max: 24.19% x̄: 1.75% x̃: 0.25% 95% mean confidence interval for cycles value: -82.79 -32.16 95% mean confidence interval for cycles %-change: -1.11% -0.61% Cycles are helped. 
LOST: 9 GAINED: 0 Tiger Lake Instructions in all programs: 160905127 -> 160900949 (-0.0%) SENDs in all programs: 6812418 -> 6812085 (-0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7431911114 -> 7433914697 (+0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304537 (-0.0%) Ice Lake Instructions in all programs: 145296733 -> 145292370 (-0.0%) SENDs in all programs: 6863818 -> 6863485 (-0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798257570 -> 8800204360 (+0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334248 (-0.0%) Skylake Instructions in all programs: 135891485 -> 135887357 (-0.0%) SENDs in all programs: 6803031 -> 6802698 (-0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442221881 -> 8444201959 (+0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301114 (-0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
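The NaN hazard described in 4246c2869c can be reproduced outside the compiler. The following is a minimal sketch in Python, not Mesa code: `as_float`, `exact`, and `inverted` are hypothetical names that mirror the two GLSL expressions from the commit message, relying only on IEEE 754 comparison rules (every ordered comparison with NaN is false).

```python
import math

def as_float(b):
    """Mimic GLSL float(bool): True -> 1.0, False -> 0.0."""
    return 1.0 if b else 0.0

def exact(x):
    # o = vec4(float(X > 0), float(!(X > 0)), 0, 1)
    return (as_float(x > 0), as_float(not (x > 0)), 0.0, 1.0)

def inverted(x):
    # o = vec4(float(X > 0), float(X <= 0), 0, 1)
    # Rewriting !(X > 0) as X <= 0 is only valid when X is not NaN.
    return (as_float(x > 0), as_float(x <= 0), 0.0, 1.0)

for x in (1.0, -1.0, math.nan):
    print(x, exact(x), inverted(x))
```

For ordinary numbers the two forms agree. For NaN, `exact` yields green `(0, 1, 0, 1)` while `inverted` yields black `(0, 0, 0, 1)`, which is the outcome the commit avoids.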
Ian Romanick	49177b9e2f	nir/algebraic: Tautology replacements require sources be numbers It seems worth the small amount of damage to give an extra cushion of not having to debug problems later. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21043197 -> 21043359 (<.01%) instructions in affected programs: 4409 -> 4571 (3.67%) helped: 0 HURT: 25 HURT stats (abs) min: 1 max: 16 x̄: 6.48 x̃: 5 HURT stats (rel) min: 0.39% max: 15.38% x̄: 4.59% x̃: 4.40% 95% mean confidence interval for instructions value: 4.37 8.59 95% mean confidence interval for instructions %-change: 2.93% 6.26% Instructions are HURT. total cycles in shared programs: 856175986 -> 856176921 (<.01%) cycles in affected programs: 58908 -> 59843 (1.59%) helped: 0 HURT: 25 HURT stats (abs) min: 7 max: 70 x̄: 37.40 x̃: 38 HURT stats (rel) min: 0.27% max: 5.63% x̄: 1.87% x̃: 1.39% 95% mean confidence interval for cycles value: 31.11 43.69 95% mean confidence interval for cycles %-change: 1.35% 2.39% Cycles are HURT. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	d69ba58644	nir/algebraic: Remove some optimizations of comparisons with fsat When most of these patterns were created, we believed, incorrectly, that fsat(NaN) was NaN. We have since realized that fsat(NaN) is zero. Originally, this changed the patterns to use is_a_number. This didn't help any shaders, so it's easier to just drop the optimizations. This commit crossed paths with `4c3ad4d065` ("nir/algebraic: mark more optimization with fsat(NaN) as inexact") and `bc123c396a` ("nir/algebraic: mark some optimizations with fsat(NaN) as inexact"). Given that these don't impact very many shaders, it seems safer to just remove them. As discussed in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716, I tried modifying these patterns to use !(b cmp a). Unfortunately, on Intel GPUs, the results were much worse than just removing the patterns altogether. Some other related patterns will be addressed in later commits. There are still a number of patterns that use the identity fsat(1-X) == 1 - fsat(X). If X is NaN, the former is zero while the latter is 1.0. I haven't evaluated these patterns yet. If changes are needed in these patterns, it should be a separate commit anyway. v2: Replace arrow `=>` with `->` in comments because the `=>` looks a lot like `<=` comparison. Suggested by Rhys. Fixes: `92b75c126b` ("nir/algebraic: Replace checks that a value is between (or not) [0, 1]") Fixes: `a7f0c57673` ("nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1)") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel hardware had similar results. 
(Ice Lake shown) total instructions in shared programs: 20029060 -> 20029670 (<.01%) instructions in affected programs: 69236 -> 69846 (0.88%) helped: 0 HURT: 263 HURT stats (abs) min: 1 max: 20 x̄: 2.32 x̃: 1 HURT stats (rel) min: 0.30% max: 11.11% x̄: 1.35% x̃: 0.98% 95% mean confidence interval for instructions value: 1.86 2.78 95% mean confidence interval for instructions %-change: 1.18% 1.52% Instructions are HURT. total cycles in shared programs: 979821278 -> 979834425 (<.01%) cycles in affected programs: 1476848 -> 1489995 (0.89%) helped: 49 HURT: 204 helped stats (abs) min: 1 max: 812 x̄: 102.31 x̃: 20 helped stats (rel) min: 0.01% max: 21.43% x̄: 2.23% x̃: 0.52% HURT stats (abs) min: 2 max: 2600 x̄: 89.02 x̃: 16 HURT stats (rel) min: 0.04% max: 27.27% x̄: 1.49% x̃: 0.72% 95% mean confidence interval for cycles value: 13.18 90.75 95% mean confidence interval for cycles %-change: 0.29% 1.25% Cycles are HURT. No fossil-db changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
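The fsat identity mentioned in d69ba58644 can be checked numerically. This is a small sketch in Python, not NIR: `fsat` here is a hypothetical model that follows the commit's statement that fsat(NaN) is zero, and `lhs`/`rhs` are illustrative names for the two sides of fsat(1-X) == 1 - fsat(X).

```python
import math

def fsat(x):
    """Clamp to [0, 1]; NaN maps to 0.0, as the commit message states."""
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)

def lhs(x):
    return fsat(1.0 - x)

def rhs(x):
    return 1.0 - fsat(x)

for x in (0.25, 2.0, -3.0, math.nan):
    print(x, lhs(x), rhs(x))
```

The two sides agree for every non-NaN input tried here. With NaN, `1.0 - x` is still NaN, so `lhs` gives 0.0 while `rhs` gives 1.0.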
Jason Ekstrand	b447f5049b	nir: Add a discard optimization pass Many fragment shaders do a discard using relatively little information but still put the discard fairly far down in the shader for no good reason. If the discard is moved higher up, we can possibly avoid doing some or almost all of the work in the shader. When this lets us skip texturing operations, it's an especially high win. One of the biggest offenders here is DXVK. The D3D APIs have different rules for discards than OpenGL and Vulkan. One effective way (which is what DXVK uses) to implement DX behavior on top of GL or Vulkan is to wait until the very end of the shader to discard. This ends up in the pessimal case where we always do all of the work before discarding. This pass helps some DXVK shaders significantly. v2 (Jason Ekstrand): - Fix a couple of typos (Grazvydas, Ian) - Use the new nir_instr_move helper - Find all movable discards before moving anything so we don't accidentally re-order anything and break dependencies v3 (Pierre-Eric): remove the call to nir_opt_conditional_discard based on Daniel Schürmann comment. v4 (Pierre-Eric): - handle demote intrinsics and drop derivatives_safe_after_discard - add early return if discards/demotes aren't used v5 (Pierre-Eric): - use pass_flags instead of instr set (Daniel Schürmann) v6 (Daniel Schürmann): - cleanup and fix pass_flags handling Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00
Jason Ekstrand	3033410b10	nir/gather_info: Expose a nir_intrinsic_writes_external_memory helper Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00
Jason Ekstrand	f97fb1fa55	nir: Add a nir_instr_move helper Removes an instruction from one place and inserts it at another while working around a weird cursor corner-case. v2: change return value to bool (Daniel Schürmann) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00
Bas Nieuwenhuizen	2d6a6469b8	nir: Add bvh64_intersect_ray_amd intrinsic. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10818>	2021-05-18 23:01:47 +02:00
Bas Nieuwenhuizen	aa82f91c38	nir: Add load_sbt_amd intrinsic. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9767>	2021-05-18 18:29:36 +00:00
Samuel Pitoiset	1b1c726ca9	nir/opt_access: fix getting variables in presence of similar bindings/desc It's perfectly legal to declare multiple SSBOs that point to the same binding/descriptor_set with different access masks. Currently, it will always get the first one in the list that matches binding/desc_set regardless of the access mask, but other variables might have a different access mask. Fix this by being conservative if another variable uses the same binding/desc_set because we can't get it reliably without adding a new field to vulkan_resource_index. This fixes rendering issues in Resident Evil Village with vkd3d-proton. This bug has been uncovered by ("spirv: Don't remove variables used by resource indexing intrinsics") because variables are no longer removed. No fossils-db changes. Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10692>	2021-05-18 06:25:24 +00:00
Connor Abbott	a40714abf7	nir/lower_phis_to_scalar: Add "lower_all" option We don't want to have to deal with vector phis in freedreno, because vectors are always split/unsplit around vectorized instructions anyways, and the stated reason for not scalarising them (it hurting coalescing) won't apply to us because we won't be using nir_from_ssa. Add this option so that we don't have to do the equivalent thing while translating from NIR. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>	2021-05-17 09:59:45 +00:00
Mike Blumenkrantz	6df187df13	nir/builder: add nir_pad_vector and nir_pad_vec4 util functions these pad a given value to vec4 or arbitrary number of components Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10630>	2021-05-16 14:15:14 +00:00
Gert Wollny	4c045ad11e	nir/linker: add option to ignore the IO precisions for better varying packing Backends that don't handle IO component precision can pack more varyings into one slot if the linker ignores the precision. If the IO is vectorized then this can save IO instructions. Related: `165a69d2f7` nir: handle mediump varyings in varying compaction helpers Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10722>	2021-05-15 09:58:27 +02:00
Caio Marcelo de Oliveira Filho	09984fd02f	nir: Rename nir_is_per_vertex_io to nir_is_arrayed_io VS outputs are "per vertex" but not the kind of I/O we want to match with this helper. Change to a name that covers the "arrayness" required by the type. Name inspired by the GLSL spec definition of arrayed I/O. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10493>	2021-05-14 16:17:45 +00:00
Gert Wollny	e418710f8b	compiler/nir: check whether var is an input in lower_fragcoord_wtrans Otherwise the lowering pass might try to lower any other load from a deref if its data.location value happens to be zero. Fixes: `418c4c0d7d` compiler/nir: extend lower_fragcoord_wtrans to support VARYING_SLOT_POS Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10577>	2021-05-14 13:26:13 +00:00
Timur Kristóf	0d6b6c850f	nir: Add AMD specific intrinsics for merged shaders and NGG. These intrinsics represent what the hardware can actually do. Lowering our shaders to use these intrinsics will allow us to deal with mapping the classic VS, TES, GS (and the future MS) stages to the hardware capabilities using NIR, which makes our backend compilers simpler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	641707a807	nir: Allow load_primitive_id in VS in nir_divergence_analysis. The lowered NIR code of NGG VS shaders uses this intrinsic when the VS has to export the primitive ID. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	e905e0938a	nir: Support upper bound of unsigned bit size conversions. These allow us to generate slightly better code in some cases, eg. multiplications in ACO. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	9a2ffe1abb	nir: Support upper bound of subgroup_id/num_subgroups for non-compute. These intrinsics will be used when lowering NGG shaders, including currently supported stages like VS, TES, GS and also by mesh shaders in the future. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Marcin Ślusarz	2c3e2d69bd	nir: handle float atomics in nir_lower_memory_model Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `2adb337256` ("nir,radv/aco: add and use pass to lower make available/visible barriers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10766>	2021-05-12 11:09:07 +00:00
Marcin Ślusarz	27073b59bc	nir: handle float atomics in nir_gather_info Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10766>	2021-05-12 11:09:07 +00:00
Tapani Pälli	181beece3c	nir: skip assert check with empty structs Fixes issues with upcoming CTS test testing empty structs. v2: decorate with UNUSED as only used in assert (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10681>	2021-05-10 08:07:29 +03:00
Alyssa Rosenzweig	db2f6b87a3	nir/divergence_analysis: Add intrinsics for Bifrost Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10022>	2021-05-07 18:20:30 +00:00
Alyssa Rosenzweig	f3de2bd6c2	nir: Add blend lowering pass This pass was originally developed for Panfrost, where it passes the relevant dEQP tests. Upstreaming so it can be extended and then shared with: * Asahi, for blending * Zink, for logic ops * Lavapipe, for advanced blending Note that using this with MRT in a fragment shader (as non-panfrost drivers will) has not yet been tested. Logic ops with integer framebuffers are probably todo. It's been enough for Panfrost, will suffice for ES2 on Asahi, and provides an upstream base for kusma's work on advanced blending, so overall the merge is a net benefit. v2: Remove bogus assert that the format layout is PLAIN. We need to render R11G11B10, which Mesa reports as layout OTHER. The code is still correct. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10601>	2021-05-07 17:25:21 +00:00
Gert Wollny	b4600d9352	nir: Add filter callback for lower_to_scalar to the options Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9943>	2021-05-07 12:09:03 +00:00
Mike Blumenkrantz	37545418cd	nir: add nir_isub_imm Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10654>	2021-05-06 13:01:03 +00:00
Jesse Natalie	d7ca0319d7	nir: Add relaxed 24bit opcodes These are equivalent to the 32bit opcodes if there are no more efficient 24bit opcodes available, but inputs are guaranteed to already be 24bit, so the 24bit opcodes can be used instead if they exist and are efficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>	2021-05-05 22:06:42 +00:00
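The reasoning behind the relaxed 24-bit opcodes in d7ca0319d7 can be illustrated with plain integer arithmetic. The sketch below is an assumption-laden model, not the NIR definition: `umul32` and `umul24` are hypothetical helpers, and the 24-bit multiply is modeled on the common "use only the low 24 bits of each input" semantics.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1

def umul32(a, b):
    """32-bit unsigned multiply, keeping the low 32 bits of the product."""
    return (a * b) & MASK32

def umul24(a, b):
    """Assumed 24-bit multiply: only the low 24 bits of each input are read."""
    return ((a & MASK24) * (b & MASK24)) & MASK32

# When both inputs already fit in 24 bits, the two multiplies agree,
# so either opcode may be chosen.
for a, b in ((0, 7), (12345, 678), (MASK24, MASK24)):
    assert umul24(a, b) == umul32(a, b)

# Outside the 24-bit range the results can differ, which is why the
# guarantee on the inputs matters.
print(umul32(1 << 24, 1), umul24(1 << 24, 1))
```

Under these assumptions a backend with a fast 24-bit multiplier can use it, and a backend without one can fall back to the 32-bit opcode with no change in result.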
Jason Ekstrand	e1edf74dde	nir/builder: Move clamp helpers to nir_builder.h Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10631>	2021-05-04 22:51:34 +00:00
Caio Marcelo de Oliveira Filho	dd48683cfd	nir: Move shared_memory_explicit_layout bit into common shader_info Move it out of the "cs" sub-struct, since the bit can be used for other shader stages in the future. This also removes a subtle issue in spirv_to_nir: info.cs.shared_memory_explicit_layout was used without checking for the CS shader stage. It ended up being "harmless" since the effects also depended on presence of shared variables. Fixes: `5de6c5973a` ("spirv: Implement SPV_KHR_workgroup_memory_explicit_layout") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10529>	2021-05-04 20:54:58 +00:00
Iago Toral Quiroga	aebb47b7d1	compiler/nir: add a divergence analysis option for non-uniform workgroup id The V3D hardware allows us to pack multiple workgroups together to avoid wasting execution lanes in shader cores. For example, if we dispatch 16 workgroups with a local size of 1 element, we can pack all 16 workgroups in a single 16-wide dispatch where each lane executes a different workgroup, instead of 16 1-wide dispatches. When we do this, we don't have a uniform workgroup id any more. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
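The packing example in aebb47b7d1 (16 workgroups of local size 1 in one 16-wide dispatch) can be modeled in a few lines. This is a toy sketch only: `pack_lanes` and `workgroup_id_is_uniform` are hypothetical names, and the real V3D packing rules are not described in the commit message and may differ.

```python
def pack_lanes(num_workgroups, local_size, width=16):
    """Lay out (workgroup_id, local_id) pairs back to back across dispatches
    of `width` lanes. A toy model of workgroup packing."""
    lanes = [(wg, lid)
             for wg in range(num_workgroups)
             for lid in range(local_size)]
    return [lanes[i:i + width] for i in range(0, len(lanes), width)]

def workgroup_id_is_uniform(dispatch):
    """True when every lane in the dispatch belongs to the same workgroup."""
    return len({wg for wg, _ in dispatch}) == 1

packed = pack_lanes(num_workgroups=16, local_size=1)
print(len(packed), workgroup_id_is_uniform(packed[0]))

unpacked = pack_lanes(num_workgroups=2, local_size=16)
print(len(unpacked), [workgroup_id_is_uniform(d) for d in unpacked])
```

In the packed case a single dispatch holds 16 different workgroup ids, so a divergence analysis that treats the workgroup id as uniform would be wrong for that dispatch.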
Caio Marcelo de Oliveira Filho	7cc846788c	nir: Remove now unnecessary conditions from emit_load/store helpers The mode one was used before `0bc5a829dd` ("nir: Remove shared support from lower_io"). The others were used before `5f7c7c9a7f` ("nir: add src and dest types to all IO loads and stores for mediump"). All conditions now are always true, so drop them. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10533>	2021-05-04 06:33:24 -07:00
Gert Wollny	a199697642	nir/opt_algebraic: optimizations for add umax/umin with zero For unsigned comparisons with zero these ops can be eliminated. v2: Add comparison optimizations with -1 (Rhys Perry) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10583>	2021-05-04 09:33:32 +02:00
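The commit message for a199697642 does not list the patterns it adds, so the following is only an illustration of the kind of identity involved, not the actual rules in nir_opt_algebraic.py. It brute-forces four umax/umin comparison identities against zero and against all-ones (the unsigned reading of -1) over 8-bit values; the 8-bit width and the name `check_identities` are choices made for this sketch.

```python
BITS = 8                  # small width so exhaustive checking is cheap
UMAX = (1 << BITS) - 1    # all-ones, i.e. -1 reinterpreted as unsigned

def check_identities():
    """Exhaustively verify that the umax/umin can be dropped from
    comparisons against 0 and against all-ones."""
    for a in range(UMAX + 1):
        for b in range(UMAX + 1):
            # umax(a, b) == 0 only if both operands are zero
            assert (max(a, b) == 0) == (a == 0 and b == 0)
            # umin(a, b) == 0 if either operand is zero
            assert (min(a, b) == 0) == (a == 0 or b == 0)
            # umin(a, b) == ~0 only if both operands are all-ones
            assert (min(a, b) == UMAX) == (a == UMAX and b == UMAX)
            # umax(a, b) == ~0 if either operand is all-ones
            assert (max(a, b) == UMAX) == (a == UMAX or b == UMAX)
    return True

print(check_identities())
```

The identities hold because 0 and all-ones are the smallest and largest unsigned values, so the same argument applies at any bit width.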
Alyssa Rosenzweig	a976101da5	nir/opcodes: Reword confusing comment Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10578>	2021-05-03 12:51:47 +00:00
Alyssa Rosenzweig	0ea67e57e5	nir: Add fsin_agx opcode Used to split up the fsin/fcos lowering for AGX between NIR and the backend, to permit algebraic optimizations without polluting NIR with too many hardware details. The backend NIR lowering produces an fmul/ffma of the input so we can optimize code like sin(2*x). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10582>	2021-05-02 17:41:09 -04:00
Rhys Perry	7a7838529a	nir/lower_non_uniform: allow lowering with vec2 handles Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523>	2021-04-27 15:56:07 +00:00
Connor Abbott	77fcb01f7f	nir/lower_clip_disable: Fix store writemask We're storing into the array element, not the whole variable. Fixes: `fb2fe80` ("nir: add lowering pass for clip plane enabling") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>	2021-04-26 17:07:02 +00:00
Jesse Natalie	2775b9139b	nir_lower_readonly_images_to_tex: Use nir_shader_lower_instructions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Jesse Natalie	fa677c8644	nir_lower_readonly_images_to_tex: Support non-CL semantics For non-CL, intrinsic access isn't set, because the image type doesn't have access qualifier. Instead, the access qualifier is set on the variable. So, add a mode to this pass which can chase back to the variable in addition to the intrinsic access. Also, update the variable type and the deref chain types so everything is consistent, that the tex is accessing a sampler. Note we can't do this for CL, because void-typed samplers don't exist. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Jesse Natalie	29c9731400	nir: Rename nir_lower_cl_images_to_tex, replace 'cl' with 'readonly' Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Alyssa Rosenzweig	c84804f167	nir/lower_fragcolor: Take max cbufs as argument One step closer to generalizing this pass to more drivers. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10411>	2021-04-23 17:20:43 +00:00
Alyssa Rosenzweig	73eb497b86	nir/lower_fragcolor: Fix driver_location assignment Fixes crash in dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data when using this pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10411>	2021-04-23 17:20:43 +00:00
Alyssa Rosenzweig	0f4ba349e9	nir/lower_fragcolor: Handle fp16 outputs Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10391>	2021-04-21 22:17:28 +00:00
Alyssa Rosenzweig	49c6157b15	nir/lower_fragcolor: Use shader_instructions_pass While I was in the area. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10391>	2021-04-21 22:17:28 +00:00
Rhys Perry	89b759c4f9	nir/opt_load_store_vectorize: loop internally To vectorize to vec8/16 or vec4 (without vec3), we can't incrementally add components to a load/store. This patch loops vectorization so that two new vec2/4/8 operations can be combined into a larger operation. fossil-db (GFX10.3): Totals from 22 (0.02% of 139391) affected shaders: SpillVGPRs: 1749 -> 1771 (+1.26%) CodeSize: 901212 -> 892532 (-0.96%); split: -1.19%, +0.22% Scratch: 178176 -> 184320 (+3.45%) Instrs: 159358 -> 158027 (-0.84%); split: -0.99%, +0.16% Cycles: 37046772 -> 36738544 (-0.83%); split: -1.00%, +0.17% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	447820d003	nir/opt_load_store_vectorize: ignore load_vulkan_descriptor These mess with alignment calculation. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	6ca11b4a66	nir/opt_load_store_vectorize: improve handling of swizzles Previously (for simplicity), it could have skipped vectorization if swizzles were involved. fossil-db (GFX10.3): Totals from 498 (0.36% of 139391) affected shaders: SGPRs: 25328 -> 26608 (+5.05%); split: -1.36%, +6.41% VGPRs: 9988 -> 9996 (+0.08%) SpillSGPRs: 40 -> 65 (+62.50%) CodeSize: 1410188 -> 1385584 (-1.74%); split: -1.76%, +0.02% Instrs: 257149 -> 250579 (-2.55%); split: -2.57%, +0.01% Cycles: 1096892 -> 1070600 (-2.40%); split: -2.41%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	4df3654c79	nir/load_store_vectorize: assume CAN_REORDER ops don't alias with stores fossil-db (GFX10.3): Totals from 20 (0.01% of 139391) affected shaders: SGPRs: 688 -> 712 (+3.49%); split: -1.16%, +4.65% CodeSize: 35488 -> 34424 (-3.00%); split: -3.04%, +0.05% Instrs: 6405 -> 6259 (-2.28%); split: -2.44%, +0.16% Cycles: 51768 -> 51268 (-0.97%); split: -1.21%, +0.24% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Mike Blumenkrantz	3ccd0891d3	nir/lower_fragcolor: set outputs_written for fragdata members normal gather_info stuff Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10080>	2021-04-21 19:36:16 +00:00
Jesse Natalie	09440ce3fb	nir: Fix MSVC warning C4334 (32bit shift cast to 64bit) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-By: Bill Kristiansen <billkris@microsoft.com> Cc: mesa-stable@lists.freedesktop.org Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10331>	2021-04-20 00:28:34 +00:00
Alyssa Rosenzweig	899dd8e60a	nir: Update some comments referring to imov This was renamed when I was in high school. I remember updating the Midgard compiler while sitting in AP Physics. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10296>	2021-04-19 20:07:35 +00:00
Danylo Piliaiev	f17b41ab4f	nir: add lowering pass for helperInvocationEXT() Some hardware doesn't have a way to check if invocation was demoted, in such case we have to track it ourselves. OpIsHelperInvocationEXT is specified as: "An invocation is currently a helper invocation if it was originally invoked as a helper invocation or if it has been demoted to a helper invocation by OpDemoteToHelperInvocationEXT." Therefore we: - Set gl_IsHelperInvocationEXT = gl_HelperInvocation - Add "gl_IsHelperInvocationEXT = true" right before each demote - Add "gl_IsHelperInvocationEXT = gl_IsHelperInvocationEXT || condition" right before each demote_if Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>	2021-04-19 17:11:36 +00:00
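
The three steps listed in commit f17b41ab4f can be illustrated with a toy model. This is only a sketch under stated assumptions: the tuple-based instruction list, the `is_helper` variable name, and the function `lower_is_helper_invocation` are hypothetical and are not NIR's data structures or API. Replacing each query with a read of the tracked flag is also an assumption; the commit message only describes how the flag is maintained.

```python
# Toy model of the lowering described in commit f17b41ab4f.
# Everything here is hypothetical illustration, not Mesa/NIR code.

def lower_is_helper_invocation(instrs):
    """Rewrite a flat instruction list so helper status is tracked in a flag."""
    # Step 1: seed the tracked flag from the initial helper status.
    out = [("store", "is_helper", ("load", "gl_HelperInvocation"))]
    for ins in instrs:
        op = ins[0]
        if op == "demote":
            # Step 2: set the flag to true right before each demote.
            out.append(("store", "is_helper", True))
        elif op == "demote_if":
            # Step 3: OR the condition into the flag right before each demote_if.
            cond = ins[1]
            out.append(
                ("store", "is_helper", ("or", ("load", "is_helper"), cond))
            )
        elif op == "is_helper_invocation":
            # Assumption: a query becomes a read of the tracked flag.
            out.append(("assign", ins[1], ("load", "is_helper")))
            continue
        # Keep the original instruction after any inserted bookkeeping.
        out.append(ins)
    return out


if __name__ == "__main__":
    program = [("demote_if", "c"), ("is_helper_invocation", "r"), ("demote",)]
    for line in lower_is_helper_invocation(program):
        print(line)
```

The flag update is placed before the demote rather than after it, matching the "right before each demote" wording in the commit message.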


3272 commits