fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Dave Airlie	f76f4be301	intel/compiler: move gen5 final pass to actually be final pass This got broken by the register conversion, this pass needs to be after all the others. Fixes: `ce75c3c3fe` ("intel: Switch to intrinsic-based registers") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26731>	2023-12-18 07:24:37 +00:00
Lionel Landwerlin	6dbb5f1e07	intel/fs: rerun divergence analysis prior to convert_from_ssa Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9964 Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26235>	2023-11-17 06:40:49 +00:00
Rhys Perry	f695a9fed2	intel/compiler: use nir_lower_fp16_casts Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25566>	2023-11-16 11:02:31 +00:00
Caio Oliveira	d2125dac85	intel/compiler: Take more precise params in brw_nir_optimize() Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25986>	2023-11-08 18:10:31 +00:00
Caio Oliveira	c4be90b4ba	intel/compiler: Remove unused parameter from brw_nir_adjust_payload() Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25986>	2023-11-08 18:10:31 +00:00
Iván Briano	54498937c5	intel/compiler: round f2f16 correctly for RTNE case v2: bcsel -> b2i32 (Ian) Fixes upcoming Vulkan CTS tests: dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_conv_from_fp64_up dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_conv_from_fp64_up_nostorage dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp64_up_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp64_up_nostorage_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp64_up_frag dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp64_up_nostorage_frag Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>	2023-10-09 23:37:52 +00:00
Connor Abbott	4282386311	nir/spirv: Add inverse_ballot intrinsic This is actually a no-op on AMD, so we really don't want to lower it to something more complicated. There may be a more efficient way to do this on Intel too. In addition, in the future we'll want to use this for lowering boolean reduce operations, where the inverse ballot will operate on the backend's "natural" ballot type as indicated by options->ballot_bit_size, instead of uvec4 as produced by SPIR-V. In total, there are now three possible lowerings we may have to perform: - inverse_ballot with source type of uvec4 from SPIR-V to inverse_ballot with natural source type, when the backend supports inverse_ballot natively. - inverse_ballot with source type of uvec4 from SPIR-V to arithmetic, when the backend doesn't support inverse_ballot. - inverse_ballot with natural source type from reduce operation, when the backend doesn't support inverse_ballot. Previously we just did the second lowering unconditionally in vtn, but it's just a combination of the first and third. We add support here for the first and third lowerings in nir_lower_subgroups, instead of simply moving the second lowering, to avoid unnecessary churn. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>	2023-09-20 14:41:18 +00:00
Pavel Ondračka	1c72c71bdf	nir/move_vec_src_uses_to_dest: allow to skip reuse of constant sources And enable this for r300 and intel-vec4 crocus HSW (mostly helps a few dolphin ubershaders): total instructions in shared programs: 1576736 -> 1576589 (<.01%) instructions in affected programs: 38235 -> 38088 (-0.38%) helped: 12 HURT: 0 total cycles in shared programs: 111025838 -> 110944796 (-0.07%) cycles in affected programs: 5646582 -> 5565540 (-1.44%) helped: 15 HURT: 6 total spills in shared programs: 447 -> 432 (-3.36%) spills in affected programs: 186 -> 171 (-8.06%) helped: 12 HURT: 0 total fills in shared programs: 792 -> 774 (-2.27%) fills in affected programs: 291 -> 273 (-6.19%) helped: 12 HURT: 0 r300 RV530: total instructions in shared programs: 96655 -> 96304 (-0.36%) instructions in affected programs: 15020 -> 14669 (-2.34%) helped: 79 HURT: 18 total temps in shared programs: 13027 -> 12952 (-0.58%) temps in affected programs: 677 -> 602 (-11.08%) helped: 41 HURT: 9 total cycles in shared programs: 147745 -> 147314 (-0.29%) cycles in affected programs: 21831 -> 21400 (-1.97%) helped: 84 HURT: 19 r300 RV370: total instructions in shared programs: 63678 -> 63669 (-0.01%) instructions in affected programs: 931 -> 922 (-0.97%) helped: 12 HURT: 6 total temps in shared programs: 10028 -> 10013 (-0.15%) temps in affected programs: 339 -> 324 (-4.42%) helped: 33 HURT: 10 total cycles in shared programs: 101118 -> 101087 (-0.03%) cycles in affected programs: 2659 -> 2628 (-1.17%) helped: 22 HURT: 6 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>	2023-09-19 18:05:37 +02:00
Alyssa Rosenzweig	d1eb17e92e	treewide: Drop nir_ssa_for_src users Via Coccinelle patch: @@ expression b, s, n; @@ -nir_ssa_for_src(b, *s, n) +s->ssa @@ expression b, s, n; @@ -nir_ssa_for_src(b, s, n) +s.ssa Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>	2023-09-18 10:25:17 -04:00
Ian Romanick	5eddf60e56	intel/compiler: Combine control barriers with identical memory semantics This prevents the second barrier generating a spurious, identical fence message as the first barrier. fossil-db stats on Alchemist: Totals: Instrs: 196513342 -> 196512777 (-0.00%); split: -0.00%, +0.00% Cycles: 14271426028 -> 14271404569 (-0.00%); split: -0.00%, +0.00% Send messages: 8021892 -> 8021770 (-0.00%) Totals from 46 (0.01% of 653252) affected shaders: Instrs: 76761 -> 76196 (-0.74%); split: -0.75%, +0.01% Cycles: 2027946 -> 2006487 (-1.06%); split: -1.45%, +0.39% Send messages: 7589 -> 7467 (-1.61%) Nothing in shader-db was affected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>	2023-09-09 04:41:25 +00:00
Lionel Landwerlin	10e75aae1b	intel/nir: rerun lower_tex if it lowers something nir_lower_tex can lower tg4 coords into tg4 offset which on DG2+ we also need to lower into constant offsets. Unfortunately the nir_lower_tex pass is not able to lower the instructions it itself generates, so the easy fix for when nir_lower_tex lowers tg4 coords into tg4 offsets is to rerun the pass. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9735 Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25015>	2023-09-05 13:35:51 +00:00
Lionel Landwerlin	74a40cc4b6	intel/fs: move lower of non-uniform at_sample barycentric to NIR We use a non-uniform lowering loop in the backend which we can do better in NIR because we can also use divergence analysis there. This change also limits VGRF usage to a single VGRF to hold the sample ID in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>	2023-08-29 23:19:13 +00:00
Alyssa Rosenzweig	cda1961835	treewide: Also handle struct nir_builder form Via Coccinelle patch: @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(struct nir_builder* builder, -nir_instr instr, +nir_intrinsic_instr intr, ...) { ( - if (instr->type != nir_instr_type_intrinsic) - return false; - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); | - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); - if (instr->type != nir_instr_type_intrinsic) - return false; ) <... ( -instr->x +intr->instr.x | -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_intrinsics_pass(shader, fn, ...) | -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_intrinsics_pass, fn, ...) | -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_intrinsics_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24852>	2023-08-24 15:48:02 +00:00
Alyssa Rosenzweig	465b138f01	treewide: Use nir_shader_intrinsic_pass sometimes This converts a lot of trivial passes. Nice boilerplate deletion. Via Coccinelle patch (with a small manual fix-up for panfrost where coccinelle got confused by genxml + ninja clang-format squashed in, and for Zink because my semantic patch was slightly buggy). @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(nir_builder* builder, -nir_instr instr, +nir_intrinsic_instr intr, ...) { ( - if (instr->type != nir_instr_type_intrinsic) - return false; - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); | - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); - if (instr->type != nir_instr_type_intrinsic) - return false; ) <... ( -instr->x +intr->instr.x | -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_intrinsics_pass(shader, fn, ...) | -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_intrinsics_pass, fn, ...) | -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_intrinsics_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24852>	2023-08-24 15:48:02 +00:00
Faith Ekstrand	b5d6b7c402	nir: Drop most uses of nir_instr_rewrite_src() Generated by the following semantic patch: @@ expression I, S, D; @@ -nir_instr_rewrite_src(I, S, nir_src_for_ssa(D)); +nir_src_rewrite(S, D); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24729>	2023-08-18 01:00:15 +00:00
Faith Ekstrand	4695bebc79	nir: Drop nir_dest Instead, we replace every use of it with nir_def. Most of this commit was generated by sed: sed -i -e 's/dest.ssa/def/g' src/*/.h src/*/.c src/*/.cpp A few manual fixups were required in lima and the nir_legacy code. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Faith Ekstrand	6c1d32581a	nir: Drop nir_alu_dest Instead, we replace it directly with nir_def. We could replace it with nir_dest but the next commit gets rid of that so this avoids unnecessary churn. Most of this commit was generated by sed: sed -i -e 's/dest.dest.ssa/def/g' src/*/.h src/*/.c src/*/.cpp There were a few manual fixups required in the nir_legacy.c and nir_from_ssa.c as nir_legacy_reg and nir_parallel_copy_entry both have a similar pattern. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Faith Ekstrand	ed9affa02f	nir: Drop most instances of nir_ssa_dest_init() Generated using the following two semantic patches: @@ expression I, J, NC, BS; @@ -nir_ssa_dest_init(I, &J->dest, NC, BS); +nir_def_init(I, &J->dest.ssa, NC, BS); @@ expression I, J, NC, BS; @@ -nir_ssa_dest_init(I, &J->dest.dest, NC, BS); +nir_def_init(I, &J->dest.dest.ssa, NC, BS); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24658>	2023-08-13 17:12:52 +00:00
Alyssa Rosenzweig	09d31922de	nir: Drop "SSA" from NIR language Everything is SSA now. sed -e 's/nir_ssa_def/nir_def/g' \ -e 's/nir_ssa_undef/nir_undef/g' \ -e 's/nir_ssa_scalar/nir_scalar/g' \ -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \ -e 's/nir_gather_ssa_types/nir_gather_types/g' \ -i $(git grep -l nir | grep -v relnotes) git mv src/compiler/nir/nir_gather_ssa_types.c \ src/compiler/nir/nir_gather_types.c ninja -C build/ clang-format cd src/compiler/nir && find .c .h -type f -exec clang-format -i \{} \; Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>	2023-08-12 16:44:41 -04:00
Lionel Landwerlin	9934613c74	anv/hasvk: track robustness per pipeline stage And split them into UBO and SSBO v2 (Lionel): - Get rid of robustness fields in anv_shader_bin v3 (Lionel): - Do not pass unused parameters around Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>	2023-08-09 09:00:12 +03:00
Alyssa Rosenzweig	5fead24365	treewide: Drop is_ssa asserts We only see SSA now. Via Coccinelle patch: @@ expression x; @@ -assert(x.is_ssa); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	17d66055ae	nir: Remove reg_intrinsics parameter to convert_from_ssa All users must set it. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24450>	2023-08-02 10:26:45 -04:00
Lionel Landwerlin	fe81d40bff	intel/nir: add lower for sparse images & textures We have to lower images into image load + sampler residency. There is also a restriction on sampler access with a compare, lower those as 2 sampler instructions to meet the restriction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:59 +03:00
Iván Briano	377c2a045f	intel/compiler: call brw_nir_adjust_payload from brw_postprocess_nir Calling anything after nir_trivialize_registers() risks undoing some of its work. In this case, brw_nir_adjust_payload() will do a constant folding pass if any payload adjusting happened, and that can turn a bunch of @store_regs into basically noops. Fixes dEQP-VK.subgroups.*task Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24325>	2023-07-25 22:48:09 +00:00
Marcin Ślusarz	a252123363	intel/compiler/mesh: compactify MUE layout Instead of using 4 dwords for each output slot, use only the amount of memory actually needed by each variable. There are some complications from this "obvious" idea: - flat and non-flat variables can't be merged into the same vec4 slot, because flat inputs mask has vec4 stride - multi-slot variables can have different layout: float[N] requires N 1-dword slots, but i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot - some output variables occur both in single-channel/component split and combined variants - crossing vec4 boundary requires generating more writes, so avoiding them if possible is beneficial This patch fixes some issues with arrays in per-vertex and per-primitive data (func.mesh.ext.outputs.*.indirect_array.q0 in crucible) and by reduction in single MUE size it allows spawning more threads at the same time. Note: this patch doesn't improve vk_meshlet_cadscene performance because default layout is already optimal enough. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Alyssa Rosenzweig	1466014184	nir: Rename lower_locals_to_reg_intrinsics back The short name is freed up. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>	2023-07-21 11:25:49 +00:00
Faith Ekstrand	ce75c3c3fe	intel: Switch to intrinsic-based registers Results on HSW (vec4 only): total instructions in shared programs: 2978400 -> 2974135 (-0.14%) instructions in affected programs: 77870 -> 73605 (-5.48%) helped: 143 HURT: 48 helped stats (abs) min: 1 max: 100 x̄: 30.22 x̃: 9 helped stats (rel) min: 0.03% max: 30.49% x̄: 8.02% x̃: 6.39% HURT stats (abs) min: 1 max: 4 x̄: 1.19 x̃: 1 HURT stats (rel) min: 0.08% max: 16.67% x̄: 3.71% x̃: 3.23% 95% mean confidence interval for instructions value: -26.69 -17.97 95% mean confidence interval for instructions %-change: -6.24% -3.90% Instructions are helped. total cycles in shared programs: 45345924 -> 44742666 (-1.33%) cycles in affected programs: 29083466 -> 28480208 (-2.07%) helped: 4785 HURT: 3879 helped stats (abs) min: 2 max: 8072 x̄: 276.00 x̃: 24 helped stats (rel) min: 0.02% max: 54.43% x̄: 7.78% x̃: 1.95% HURT stats (abs) min: 2 max: 14736 x̄: 184.95 x̃: 20 HURT stats (rel) min: 0.02% max: 97.00% x̄: 7.69% x̃: 1.53% 95% mean confidence interval for cycles value: -83.49 -55.77 95% mean confidence interval for cycles %-change: -1.16% -0.55% Cycles are helped. total spills in shared programs: 1093 -> 539 (-50.69%) spills in affected programs: 772 -> 218 (-71.76%) helped: 74 HURT: 0 total fills in shared programs: 760 -> 757 (-0.39%) fills in affected programs: 66 -> 63 (-4.55%) helped: 3 HURT: 0 Results on TGL (all stages): total instructions in shared programs: 21486982 -> 21488266 (<.01%) instructions in affected programs: 2245938 -> 2247222 (0.06%) helped: 1288 HURT: 1385 helped stats (abs) min: 1 max: 93 x̄: 4.05 x̃: 2 helped stats (rel) min: 0.02% max: 3.82% x̄: 0.61% x̃: 0.46% HURT stats (abs) min: 1 max: 134 x̄: 4.69 x̃: 2 HURT stats (rel) min: <.01% max: 5.59% x̄: 0.65% x̃: 0.44% 95% mean confidence interval for instructions value: 0.13 0.83 95% mean confidence interval for instructions %-change: <.01% 0.08% Instructions are HURT. 
total cycles in shared programs: 809326677 -> 809475669 (0.02%) cycles in affected programs: 447781659 -> 447930651 (0.03%) helped: 1924 HURT: 1994 helped stats (abs) min: 1 max: 74567 x̄: 1217.49 x̃: 10 helped stats (rel) min: <.01% max: 38.44% x̄: 1.09% x̃: 0.17% HURT stats (abs) min: 1 max: 76426 x̄: 1249.47 x̃: 8 HURT stats (rel) min: <.01% max: 137.11% x̄: 1.64% x̃: 0.17% 95% mean confidence interval for cycles value: -125.61 201.67 95% mean confidence interval for cycles %-change: 0.12% 0.48% Inconclusive result (value mean confidence interval includes 0). LOST: 4 GAINED: 4 Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>	2023-07-19 02:11:57 +00:00
Marcin Ślusarz	36ff6c0004	intel/compiler: remove NV_mesh_shader support Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24071>	2023-07-14 08:27:14 +00:00
Faith Ekstrand	73e191924c	nir: Add a reg_intrinsics flag to nir_convert_from_ssa It doesn't do anything yet. We leave that to the subsequent patches so we can keep the tree-wide refactor as simple as possible. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>	2023-07-12 01:34:27 +00:00
Lionel Landwerlin	c26c0a36d3	intel/fs: disable coarse pixel shader with interpolater messages at sample Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9292 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23962>	2023-07-06 12:48:52 +00:00
Marcin Ślusarz	1ac1d5d62e	anv,intel/compiler: enable shortcut in wg id to wg idx lowering on >= gfx12.5 This speeds up vk_meshlet_cadscene in "VK mesh ext" renderer by 1.4% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22334>	2023-07-04 09:15:08 +00:00
Marcin Ślusarz	7ec1ef75d3	intel/compiler: pass num_workgroups from task to mesh shaders Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22334>	2023-07-04 09:15:08 +00:00
Yonggang Luo	68b8aa788d	intel/compiler: Switch to use nir_foreach_function_impl Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23920>	2023-06-29 11:29:54 +00:00
Alyssa Rosenzweig	815efcdf7e	nir: Use nir_builder_create perl -p0e 's/nir_builder ([^;]);\snir_builder_init\(&\1, /nir_builder \1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>	2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig	6689c678fe	nir/lower_locals_to_regs: Add bool bitsize knob GLSL booleans (and hence bool derefs) may be translated either as 1-bit or 32-bit NIR registers, depending whether the backend uses nir_lower_bool_to_int32 or not. Add a knob for this and choose the right type for different backends. Fixes nir_validate failure on dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_bvec3 run under lavapipe. That test indexes into a bvec3 array, and gallivm first lowers bools and then lowers derefs to registers, resulting in random 1-bit booleans mixed in with 32-bit bools. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>	2023-06-26 08:22:06 -04:00
Caio Oliveira	59cc77f0fa	compiler: Move from nir_scope to mesa_scope Just moving the enum and performing renames, no behavior change. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>	2023-06-19 23:29:26 +00:00
Ian Romanick	e419eefd34	intel/fs: Use nir_opt_reassociate_bfi All Skylake and newer Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19907072 -> 19907054 (<.01%) instructions in affected programs: 8859 -> 8841 (-0.20%) helped: 9 / HURT: 0 total cycles in shared programs: 855791238 -> 855779334 (<.01%) cycles in affected programs: 3308294 -> 3296390 (-0.36%) helped: 12 / HURT: 13 Broadwell total instructions in shared programs: 17818231 -> 17817440 (<.01%) instructions in affected programs: 9887 -> 9096 (-8.00%) helped: 9 / HURT: 0 total cycles in shared programs: 902970035 -> 902941221 (<.01%) cycles in affected programs: 2767243 -> 2738429 (-1.04%) helped: 14 / HURT: 5 total spills in shared programs: 17784 -> 17718 (-0.37%) spills in affected programs: 318 -> 252 (-20.75%) helped: 1 / HURT: 0 total fills in shared programs: 25458 -> 24949 (-2.00%) fills in affected programs: 1346 -> 837 (-37.82%) helped: 1 / HURT: 0 Haswell total instructions in shared programs: 16707799 -> 16707586 (<.01%) instructions in affected programs: 24049 -> 23836 (-0.89%) helped: 41 / HURT: 0 total cycles in shared programs: 882730648 -> 882723174 (<.01%) cycles in affected programs: 5096737 -> 5089263 (-0.15%) helped: 25 / HURT: 12 total spills in shared programs: 14937 -> 14909 (-0.19%) spills in affected programs: 436 -> 408 (-6.42%) helped: 4 / HURT: 0 total fills in shared programs: 17569 -> 17529 (-0.23%) fills in affected programs: 444 -> 404 (-9.01%) helped: 4 / HURT: 0 No shader-db changes on any older Intel platforms. All Intel platforms had similar results. 
(Ice Lake shown) Totals: Instrs: 153118594 -> 153117340 (-0.00%); split: -0.00%, +0.00% Cycles: 15011967556 -> 15011904351 (-0.00%); split: -0.00%, +0.00% Fill count: 203692 -> 203684 (-0.00%) Totals from 703 (0.11% of 662496) affected shaders: Instrs: 192826 -> 191572 (-0.65%); split: -0.65%, +0.00% Cycles: 29937640 -> 29874435 (-0.21%); split: -0.25%, +0.04% Fill count: 4146 -> 4138 (-0.19%) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Lionel Landwerlin	6b9f838d62	intel/fs: handle load_global_constant_uniform_block_intel Again, load the data just once in GRF, share it across lanes. Shader-db on dg2: total instructions in shared programs: 23214555 -> 23215400 (<.01%) instructions in affected programs: 199977 -> 200822 (0.42%) helped: 3 HURT: 38 helped stats (abs) min: 5 max: 670 x̄: 283.67 x̃: 176 helped stats (rel) min: 1.34% max: 49.41% x̄: 22.15% x̃: 15.70% HURT stats (abs) min: 1 max: 185 x̄: 44.63 x̃: 32 HURT stats (rel) min: 0.13% max: 42.86% x̄: 10.25% x̃: 9.30% 95% mean confidence interval for instructions value: -18.65 59.87 95% mean confidence interval for instructions %-change: 3.29% 12.47% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 5928 -> 5928 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 851137495 -> 851152449 (<.01%) cycles in affected programs: 16406137 -> 16421091 (0.09%) helped: 9 HURT: 32 helped stats (abs) min: 10 max: 13498 x̄: 6443.22 x̃: 5581 helped stats (rel) min: 0.11% max: 4.75% x̄: 1.45% x̃: 0.34% HURT stats (abs) min: 3 max: 15056 x̄: 2279.47 x̃: 735 HURT stats (rel) min: 0.10% max: 23.71% x̄: 4.58% x̃: 4.65% 95% mean confidence interval for cycles value: -1315.40 2044.87 95% mean confidence interval for cycles %-change: 1.71% 4.80% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 11856 -> 11825 (-0.26%) spills in affected programs: 2368 -> 2337 (-1.31%) helped: 4 HURT: 0 total fills in shared programs: 16258 -> 16207 (-0.31%) fills in affected programs: 2930 -> 2879 (-1.74%) helped: 4 HURT: 0 total sends in shared programs: 1038194 -> 1038185 (<.01%) sends in affected programs: 40 -> 31 (-22.50%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 2.25 x̃: 2 helped stats (rel) min: 10.00% max: 33.33% x̄: 21.46% x̃: 21.25% 95% mean confidence interval for sends value: -4.64 0.14 95% mean confidence interval for sends %-change: -40.41% -2.51% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 0 Some VK/DX titles result (on DG2 only), it's mostly additional instruction counts except for the unity spaceship demo where a CS shader gets additional SIMDness. The reason for additional instructions is that since we're doing block loads, we need to find the live channels in control flow to select a single lane value that is valid. 
aztec_ruins_high: Totals from 3 (1.12% of 269) affected shaders: Instrs: 17732 -> 17896 (+0.92%) Cycles: 796518 -> 819302 (+2.86%) cyberpunk_2077: Totals from 17 (0.17% of 10301) affected shaders: Instrs: 10848 -> 11658 (+7.47%) Cycles: 248243 -> 259168 (+4.40%); split: -0.57%, +4.97% fallout_4_dxvk_g2: Totals from 2 (0.12% of 1638) affected shaders: Instrs: 3157 -> 3368 (+6.68%) Cycles: 487807 -> 490426 (+0.54%); split: -0.26%, +0.79% Max live registers: 139 -> 141 (+1.44%) red_dead_redemption2: Totals from 68 (1.14% of 5970) affected shaders: Instrs: 34871 -> 36486 (+4.63%) Cycles: 551430 -> 565211 (+2.50%) Send messages: 2074 -> 2072 (-0.10%) Max live registers: 5078 -> 5077 (-0.02%) total_war_warhammer2: Totals from 5 (1.05% of 478) affected shaders: Instrs: 6905 -> 6971 (+0.96%); split: -0.16%, +1.12% Cycles: 97035 -> 97989 (+0.98%); split: -0.07%, +1.05% unity spaceship demo (instruction count going up due to a CS shader bump from SIMD8->16): Totals from 53 (9.71% of 546) affected shaders: Instrs: 223748 -> 233223 (+4.23%); split: -0.01%, +4.25% Cycles: 23134697 -> 25207080 (+8.96%); split: -0.17%, +9.13% Subgroup size: 480 -> 488 (+1.67%) Spill count: 2156 -> 2242 (+3.99%); split: -0.19%, +4.17% Fill count: 4617 -> 4845 (+4.94%); split: -0.09%, +5.02% Max live registers: 5991 -> 6050 (+0.98%); split: -0.40%, +1.39% Max dispatch width: 480 -> 488 (+1.67%) witcher_3_dxvk_g2: Totals from 27 (2.51% of 1074) affected shaders: Instrs: 57067 -> 57677 (+1.07%); split: -0.03%, +1.10% Cycles: 1397871 -> 1436704 (+2.78%); split: -0.35%, +3.13% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	5ae8a78d8c	intel/fs: make use of load_ubo_uniform_block_intel
The principle is the same as the load_ssbo_uniform_block_intel. Whenever we see a uniform offset, load the data only once in GRFs to reduce register pressure.
Iris shader-db run on DG2 :
total instructions in shared programs: 23001325 -> 23094969 (0.41%)
instructions in affected programs: 1775989 -> 1869633 (5.27%)
helped: 764
HURT: 2097
helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2
helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63%
HURT stats (abs) min: 1 max: 2461 x̄: 47.19 x̃: 7
HURT stats (rel) min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60%
95% mean confidence interval for instructions value: 25.43 40.03
95% mean confidence interval for instructions %-change: 3.60% 4.33%
Instructions are HURT.
total loops in shared programs: 5847 -> 5847 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 839329852 -> 845491482 (0.73%)
cycles in affected programs: 130229434 -> 136391064 (4.73%)
helped: 1098
HURT: 2228
helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22
helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71%
HURT stats (abs) min: 1 max: 185309 x̄: 3426.24 x̃: 87
HURT stats (rel) min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82%
95% mean confidence interval for cycles value: 1342.16 2362.97
95% mean confidence interval for cycles %-change: 3.70% 4.52%
Cycles are HURT.
total spills in shared programs: 10768 -> 11856 (10.10%)
spills in affected programs: 9717 -> 10805 (11.20%)
helped: 25
HURT: 28
total fills in shared programs: 13720 -> 16258 (18.50%)
fills in affected programs: 12016 -> 14554 (21.12%)
helped: 25
HURT: 28
total sends in shared programs: 1034790 -> 1031266 (-0.34%)
sends in affected programs: 33416 -> 29892 (-10.55%)
helped: 1005
HURT: 0
helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3
helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08%
95% mean confidence interval for sends value: -3.72 -3.29
95% mean confidence interval for sends %-change: -15.82% -14.57%
Sends are helped.
LOST: 26
GAINED: 183
shader-db on a number of VK/DX titles on DG2 (PERCENTAGE DELTAS per title) :
age_of_wonders_III: Shaders 1928, Instrs +0.02%, Cycles -0.19%
assassins_creed_odyssey: Shaders 2119, Instrs +1.12%, Cycles -0.42%, Subgroup size -0.03%, Send messages -0.29%, Spill count -9.10%, Fill count -4.26%, Max live registers -0.64%, Max dispatch width +0.65%
aztec_ruins_high: Shaders 269, Instrs -0.05%, Cycles -0.45%, Spill count -0.29%, Fill count -7.27%, Max live registers -0.33%
dark_souls_3_dxvk_g2: Shaders 1420, Instrs +0.09%, Cycles +0.24%, Max live registers +0.21%, Max dispatch width +0.12%
(stats look bad, but it's just one shader affected)
fallout_4_dxvk_g2: Shaders 1638, Instrs +0.67%, Cycles +8.32%, Spill count +16.02%, Fill count +7.17%, Scratch Memory Size +100.00%, Max live registers +0.48%
red_dead_redemption2: Shaders 5969, Instrs +0.16%, Cycles -0.04%, Send messages -0.04%, Spill count +0.01%, Fill count +0.05%, Max live registers -0.20%, Max dispatch width +0.04%
rise_of_the_tomb_raider_g2: Shaders 12129, Instrs +2.19%, Cycles +1.36%, Send messages -1.23%, Max live registers -0.36%, Max dispatch width +2.04%
shooter-game: Shaders 693, Instrs +0.07%, Cycles -0.89%, Send messages -0.09%, Max live registers -0.09%
talos_g2: Shaders 1140, Instrs +0.37%, Cycles +3.80%, Send messages -0.86%, Max live registers -0.67%, Max dispatch width +0.19%
total_war_warhammer2: Shaders 477, Instrs +0.25%, Cycles +0.66%, Max live registers -0.17%, Max dispatch width +0.10%
witcher_3_dxvk_g2: Shaders 1074, Instrs +0.75%, Cycles -10.45%, Send messages -0.15%, Max live registers -0.16%, Max dispatch width -0.16%
wolfenstein_youngblood: Shaders 1111, Instrs +0.52%, Cycles +0.66%, Send messages -0.59%, Max live registers -0.03%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Jesse Natalie	082eba6165	nir_lower_mem_access_bit_sizes: Move options into a struct Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Jesse Natalie	4217353e2d	nir_lower_mem_access_bit_sizes: Add a bit_size input to the callback We'd like to use this callback to adjust loads and stores from things that are unsupported to things that are supported, but if the input is already supported, we'd prefer not to change it. Rather than making up a bit size that'd work and doing a bunch of pack/unpack bit math, only return a different bit size if the input one doesn't work for us (i.e. can't load enough memory or just an unsupported size entirely). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Mark Janes	a98f246857	isl: use generated workaround helpers for Wa_1806565034 This workaround was enabled for gen12+, but only applies to gen12.0. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21912>	2023-06-02 16:17:34 +00:00
Mark Janes	d0669f3ede	intel/dev: switch defect identifiers to use lineage numbers Update existing workarounds when necessary to match changed identifiers. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23226>	2023-05-30 22:13:41 +00:00
Erik Faye-Lund	20d619cd84	nir: use more nir_fmul_imm This simplifies things a bit. Note that in some cases, the arguments are swapped, because multiplications are commutative, and nir_fmul_imm only allows the second operand to be an immediate. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>	2023-05-25 06:59:24 +00:00
Lionel Landwerlin	429ef02f83	intel/fs: make tcs input_vertices dynamic
We need to do 3 things to accomplish this :
1. make all the register access consider the maximal case when unknown at compile time
2. move the clamping of load_per_vertex_input prior to lowering nir_intrinsic_load_patch_vertices_in (in the dynamic cases, the clamping will use the nir_intrinsic_load_patch_vertices_in to clamp), meaning clamping using derefs rather than lowered nir_intrinsic_load_per_vertex_input
3. in the known cases, lower nir_intrinsic_load_patch_vertices_in in NIR (so that the clamped elements can still be vectorized into the smallest number of URB read messages)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>	2023-05-24 18:32:07 +00:00
Kenneth Graunke	a2d384a5c0	intel/compiler: Fix 64-bit ufind_msb, find_lsb, and bit_count We only support 32-bit versions of ufind_msb, find_lsb, and bit_count, so we need to lower them via nir_lower_int64. Previously, we were failing to do so on platforms older than Icelake and let those operations fall through to nir_lower_bit_size, which used a callback to determine it should lower them for bit_size != 32. However, that pass only emulates small bit-size operations by promoting them to supported, larger bit-sizes (i.e. 16-bit using 32-bit). It doesn't support emulating larger operations (i.e. 64-bit using 32-bit). So nir_lower_bit_size would just u2u32 the 64-bit source, causing us to flat ignore half of the bits. Commit `78a195f252` (intel/compiler: Postpone most int64 lowering to brw_postprocess_nir) provoked this bug on Icelake and later as well, by moving the nir_lower_int64 handling for ufind_msb until late in compilation, allowing it to reach nir_lower_bit_size which broke it. To fix this, we always set int64 lowering for these opcodes, and also correct the nir_lower_bit_size callback to ignore 64-bit operations. Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>	2023-05-19 22:44:37 +00:00
Rohan Garg	6b8fe32322	intel: infer scalar'ness locally for brw_vectorize_lower_mem_access Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	3a8f5c2783	intel: update comments about non-existent function parameter Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	a15cc833f9	intel: drop unused is_scalar function parameter in brw_nir_apply_key Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	212810ac8a	intel: infer scalar'ness locally for brw_postprocess_nir Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
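
The bug described in commit a2d384a5c0 (64-bit ufind_msb reaching a pass that only narrows the source with u2u32) can be illustrated outside of NIR. The following is a minimal sketch in plain C, not Mesa's nir_lower_int64 code: all three function names are hypothetical, and the loop-based 32-bit helper stands in for whatever 32-bit instruction the hardware provides. It contrasts the truncating behaviour with a lowering that splits the 64-bit value into two 32-bit halves.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit-only ufind_msb:
 * returns the index of the highest set bit, or -1 if v is zero. */
static int ufind_msb32(uint32_t v)
{
    int idx = -1;
    while (v != 0) {
        v >>= 1;
        idx++;
    }
    return idx;
}

/* The failure mode from the commit message: the 64-bit source is
 * narrowed to 32 bits first, so the upper half is silently dropped. */
static int ufind_msb64_truncating(uint64_t v)
{
    return ufind_msb32((uint32_t)v);
}

/* A lowering built only from 32-bit operations: check the high half
 * first and fall back to the low half when the high half is zero. */
static int ufind_msb64_split(uint64_t v)
{
    uint32_t hi = (uint32_t)(v >> 32);
    uint32_t lo = (uint32_t)v;

    if (hi != 0)
        return 32 + ufind_msb32(hi);
    return ufind_msb32(lo);
}
```

For an input such as `1ull << 40`, the truncating variant sees a zero low half and reports -1, while the split variant reports 40. The same split-into-halves idea applies to find_lsb (check the low half first) and bit_count (sum the counts of both halves).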

331 commits