fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 13:30:12 +01:00

Author	SHA1	Message	Date
Rohan Garg	4de065f6a2	intel/compiler: Adjust fence message lengths for new register width on Xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Rohan Garg	e1289d6135	intel/compiler: Adjust CS payload registers for new register width on Xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	150b3e87c8	intel/fs/xe2+: Round up fs_builder::vgrf() size calculation to HW register unit. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	24dcc3269b	intel/fs/xe2+: Update encoding of FB write message payload. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	a573531785	intel/compiler/xe2+: Represent dispatch_grf_start_reg in native GRF units. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	17ef5e7ead	intel/fs/xe2+: Allow increased SIMD width for various get_fpu_lowered_simd_width() restrictions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	6423cb9bfa	intel/eu/xe2+: Update validation of GRF region size to account for Xe2 reg size Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	00b614a5a7	intel/fs/xe2+: Scale MAX_SAMPLER_MESSAGE_SIZE by native register size. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	421d43fe62	intel/fs/xe2+: Fixes for increased accumulator register width. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	80e9031b44	intel/fs/xe2+: Fix grf_count in post-RA scheduling for updated register file size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	571ddf8516	intel/fs/xe2+: Fix payload node live range calculations for change in register size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	2b7419d090	intel/fs: Fix signedness of payload_node_count argument of calculate_payload_ranges(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	abf8111560	intel/eu/xe2+: Fix encoding of various message descriptors for change in register size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	6d39b3d6ae	intel/fs/ra/xe2: Scale up register allocation granularity by 2x on Xe2+ platforms. v2: Fix spill register allocation. Switch to brw_reg::nr representation in fake 256b units. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	bd98df5d8e	intel/compiler: Make MAX_VGRF_SIZE macro depend on devinfo and update it for Xe2. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	a7d521e556	intel/vec4/ra: Define REG_CLASS_COUNT constant specifying the number of register classes. Rework: * Jordan: 16=>20 following `d33aff783d` ("intel/fs: add support for sparse accesses") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:35 -07:00
Francisco Jerez	5d87f41a54	intel/fs/ra: Define REG_CLASS_COUNT constant specifying the number of register classes. Rework: * Jordan: 16=>20 following `d33aff783d` ("intel/fs: add support for sparse accesses") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:35 -07:00
Connor Abbott	4282386311	nir/spirv: Add inverse_ballot intrinsic This is actually a no-op on AMD, so we really don't want to lower it to something more complicated. There may be a more efficient way to do this on Intel too. In addition, in the future we'll want to use this for lowering boolean reduce operations, where the inverse ballot will operate on the backend's "natural" ballot type as indicated by options->ballot_bit_size, instead of uvec4 as produced by SPIR-V. In total, there are now three possible lowerings we may have to perform: - inverse_ballot with source type of uvec4 from SPIR-V to inverse_ballot with natural source type, when the backend supports inverse_ballot natively. - inverse_ballot with source type of uvec4 from SPIR-V to arithmetic, when the backend doesn't support inverse_ballot. - inverse_ballot with natural source type from reduce operation, when the backend doesn't support inverse_ballot. Previously we just did the second lowering unconditionally in vtn, but it's just a combination of the first and third. We add support here for the first and third lowerings in nir_lower_subgroups, instead of simply moving the second lowering, to avoid unnecessary churn. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>	2023-09-20 14:41:18 +00:00
Pavel Ondračka	1c72c71bdf	nir/move_vec_src_uses_to_dest: allow to skip reuse of constant sources And enable this for r300 and intel-vec4. crocus HSW (mostly helps a few Dolphin ubershaders): total instructions in shared programs: 1576736 -> 1576589 (<.01%) instructions in affected programs: 38235 -> 38088 (-0.38%) helped: 12 HURT: 0 total cycles in shared programs: 111025838 -> 110944796 (-0.07%) cycles in affected programs: 5646582 -> 5565540 (-1.44%) helped: 15 HURT: 6 total spills in shared programs: 447 -> 432 (-3.36%) spills in affected programs: 186 -> 171 (-8.06%) helped: 12 HURT: 0 total fills in shared programs: 792 -> 774 (-2.27%) fills in affected programs: 291 -> 273 (-6.19%) helped: 12 HURT: 0 r300 RV530: total instructions in shared programs: 96655 -> 96304 (-0.36%) instructions in affected programs: 15020 -> 14669 (-2.34%) helped: 79 HURT: 18 total temps in shared programs: 13027 -> 12952 (-0.58%) temps in affected programs: 677 -> 602 (-11.08%) helped: 41 HURT: 9 total cycles in shared programs: 147745 -> 147314 (-0.29%) cycles in affected programs: 21831 -> 21400 (-1.97%) helped: 84 HURT: 19 r300 RV370: total instructions in shared programs: 63678 -> 63669 (-0.01%) instructions in affected programs: 931 -> 922 (-0.97%) helped: 12 HURT: 6 total temps in shared programs: 10028 -> 10013 (-0.15%) temps in affected programs: 339 -> 324 (-4.42%) helped: 33 HURT: 10 total cycles in shared programs: 101118 -> 101087 (-0.03%) cycles in affected programs: 2659 -> 2628 (-1.17%) helped: 22 HURT: 6 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>	2023-09-19 18:05:37 +02:00
Alyssa Rosenzweig	d1eb17e92e	treewide: Drop nir_ssa_for_src users Via Coccinelle patch: @@ expression b, s, n; @@ -nir_ssa_for_src(b, *s, n) +s->ssa @@ expression b, s, n; @@ -nir_ssa_for_src(b, s, n) +s.ssa Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>	2023-09-18 10:25:17 -04:00
Sviatoslav Peleshko	b1a63d5418	intel/fs: Check if the whole ubo load range is in the push const range Before this, we were checking only the beginning of the ubo range, so partially overlapping loads were trying to load undefined data. Fixes: `b2da1238` ("i965: Use pushed UBO data in the scalar backend.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9748 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25111>	2023-09-15 10:55:24 +00:00
Ian Romanick	92f5442489	intel/fs: Merge copy prop dataflow loops This is kept as a separate commit because the change looks like a lot more than it is. The order of the two loops is swapped, then the two loops are merged. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	fa2757aa97	intel/fs: Use rb_tree for copy prop dataflow Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	35644bb483	intel/fs: Use rb_tree to store ACP entries by destination Using a single data structure seems better. There's no appreciable performance change. On batman_arkham_city_goty.foz, the difference reported was 0.48%±0.36% (n=20). Several commits in the MR, including some that should have no effect at all, reported similar changes. I attribute this primarily to changes in loop alignment and similar effects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	c28bf1a249	intel/fs: Use rb_tree to store ACP entries by source On batman_arkham_city_goty.foz, this improves fossil-db time by -3.83%±0.24% (n=20). This fossil takes the longest time of any in my database. v2: Add some comments for cmp_entry_src_entry_src and cmp_entry_src_nr. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	06bdd3eac0	intel/fs: Encapsulate per-block ACP in a structure This simplifies some later changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	c262752d74	intel/fs: Make opt_copy_propagation_local file private This annoyed me during development of this MR. Every time I changed the parameters to this internal function, I had to modify a public header file... and trigger a much larger rebuild. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	0946108298	intel/fs: Simplify check in can_propagate_from The larger predicate here already requires that inst->opcode must be BRW_OPCODE_MOV, so it can't be BRW_OPCODE_SEL. With that removed, the other simplifications are pretty straightforward. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	1f15a0f8b2	intel/fs: Don't loop in try_constant_propagate The caller already loops over the sources. This means that the caller must loop over the sources in reverse because constant propagation prefers to propagate into the last sources first. The shader-db and fossil-db changes (below) are all due to SEL instructions. Changing the order sources are visited changes whether a SEL with two immediate sources is (+f0.0) sel g12 IMM_A IMM_B or (-f0.0) sel g12 IMM_B IMM_A The ordering of the sources affects the order in which constant combining encounters the values, and that determines which value is "combined" and which value remains an immediate. This affects the results by luck. If there are two instructions: (+f0.0) sel g12 IMM_A IMM_B (+f0.0) sel g13 IMM_A IMM_C Picking IMM_A is advantageous over picking IMM_B and IMM_C. Since the selection algorithm in constant combining is greedy, this case requires the algorithm to see the values in just the right order for the right thing to happen. v2: Rebase on many, many changes. Move instruction source fixup reordering out of try_constant_propagate. v3: Rebase on !7698. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	ab23d89ade	intel/fs: Move src.file checks out of try_constant_propagate and try_copy_propagate Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	b5b2338c5c	intel/fs: Make try_constant_propagate and try_copy_propagate file private This annoyed me during development of this MR. Every time I changed the parameters to these internal functions, I had to modify a public header file... and trigger a much larger rebuild. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:22 +00:00
Ian Romanick	8665e37960	intel/fs: Don't try to copy propagate into a source again after progress is made If the linked list structure used depended on the list head to know when to terminate, this would be a pretty serious bug. If try_constant_propagate or try_copy_propagate makes progress, inst->src[i].nr will change. This results in the foreach_in_list using a different list header on later iterations of the loop. This causes two shaders in shader-db and 9 shaders in fossil-db to change. Looking at the code changes, these are cases where there was a copy of a copy that gets propagated. The part that confuses me is that the VGRF numbers involved should not hash to the same bucket, so it should be impossible to find the original source from the intermediate VGRF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:22 +00:00
Ian Romanick	e488b46419	intel/fs: Don't continue fixed point iteration just because liveout changes Unless the change in liveout also causes livein to change, updates to liveout cannot have any global effect. Changes to livein already flag additional iteration. I had additional changes in this area that didn't pan out. While working on those changes, I was a little confused about this bit of code. It's unnecessary, so it's better to delete it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:22 +00:00
Caio Oliveira	3890c60584	compiler/types: Remove unused GLSL_TYPE_FUNCTION and related functions GLSL doesn't use that type. SPIR-V used it for a while but later started relying on its own data structures and stopped using it. See `ca62e849d3` ("nir/spirv: Stop using glsl_type for function types") If we were ever to add this one again, it would be better to have a way to grab a key for lookup that did not require allocations; right now that's needed to inject the return type as the first element in the params array. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25160>	2023-09-12 23:18:12 +00:00
Iván Briano	f1bc58cb7b	intel/fs: use ffsll so we don't explode on 32 bits Fixes: `b200e5765c` ("anv: use a simpler MUE layout for fast linked libraries") Tested-by: Mark Janes <markjanes@swizzler.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25192>	2023-09-12 22:42:38 +00:00
Iván Briano	4eddeea7bf	intel/fs: handle URB setup for fast linked mesh pipelines Up until now, the mesh pipeline assumed it would be always linked to the fragment shader, and so the calculated MUE map would always be available. That is not the case for fast linked pipeline libraries, so the URB setup needs to account for this. We do this by replicating what's done for non-mesh pipelines, defining the URB based on the FS inputs, and always assuming they will be laid out in order of varying number, except that we also account for per-primitive attributes. Fixes all GPL using tests under dEQP-VK.mesh_shader.ext.smoke.* Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>	2023-09-12 02:51:31 +00:00
Iván Briano	17d7f7a292	intel/fs: read viewport and layer from the FS payload Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>	2023-09-12 02:51:31 +00:00
Iván Briano	d36da7c5f8	anv: track what kind of pipeline a fragment shader may be used with Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>	2023-09-12 02:51:31 +00:00
Iván Briano	b200e5765c	anv: use a simpler MUE layout for fast linked libraries The compaction introduced in `a252123363` ("intel/compiler/mesh: compactify MUE layout") is not suitable for the case where graphics pipeline libraries are fast linked, as the fragment shader won't receive the mue_map to know where to locate its inputs. For that case, keep doing what we did before and lay things down in the order varyings are defined, which is also how it works for the non-mesh case. Fixes dEQP-VK.fragment_shading_rate.fast_linked_library.ms Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>	2023-09-12 02:51:31 +00:00
Dave Airlie	bfe152916c	nir: move the libclc lowering over to functions file. This lowering is pretty generic, and I want to enhance it for times when we don't want to inline. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24687>	2023-09-12 01:57:50 +00:00
Ian Romanick	8ce4d7a08d	intel/compiler: Don't evict for workgroup-scope fences Flushing and invalidating caches isn't necessary for workgroup scope fences. In fact, the DP_FLUSH_TYPE docs (BSpec 54041) say: "If the fence scope is Local or Threadgroup, HW ignores the flush type and operates as if it was set to None(no flush)" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>	2023-09-09 04:41:25 +00:00
Ian Romanick	5eddf60e56	intel/compiler: Combine control barriers with identical memory semantics This prevents the second barrier generating a spurious, identical fence message as the first barrier. fossil-db stats on Alchemist: Totals: Instrs: 196513342 -> 196512777 (-0.00%); split: -0.00%, +0.00% Cycles: 14271426028 -> 14271404569 (-0.00%); split: -0.00%, +0.00% Send messages: 8021892 -> 8021770 (-0.00%) Totals from 46 (0.01% of 653252) affected shaders: Instrs: 76761 -> 76196 (-0.74%); split: -0.75%, +0.01% Cycles: 2027946 -> 2006487 (-1.06%); split: -1.45%, +0.39% Send messages: 7589 -> 7467 (-1.61%) Nothing in shader-db was affected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>	2023-09-09 04:41:25 +00:00
Timothy Arceri	84e0f5ce75	nir: remove unused param from nir_alu_src_copy() Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24986>	2023-09-08 03:01:39 +00:00
Lionel Landwerlin	c9739e8912	intel/fs: limit register flag interaction of FIND_*LIVE_CHANNEL Those instructions do not access the flag registers on Gfx8+. Removing the interaction enables CSE to remove more of those instructions. Results are a bit mixed (DG2 vulkan fossils): ACO: Totals from 127 (5.97% of 2128) affected shaders: Instrs: 139966 -> 138972 (-0.71%); split: -0.85%, +0.14% Cycles: 1685747 -> 1667480 (-1.08%); split: -2.35%, +1.26% Max live registers: 10582 -> 10544 (-0.36%) Max dispatch width: 1048 -> 1040 (-0.76%) Cyberpunk 2077: Totals from 2879 (27.95% of 10301) affected shaders: Instrs: 4264789 -> 4225666 (-0.92%); split: -1.01%, +0.09% Cycles: 72380209 -> 71619521 (-1.05%); split: -1.63%, +0.58% Subgroup size: 30624 -> 30632 (+0.03%) Spill count: 98 -> 101 (+3.06%) Fill count: 90 -> 93 (+3.33%) Scratch Memory Size: 8192 -> 9216 (+12.50%) Max live registers: 217807 -> 217098 (-0.33%); split: -0.59%, +0.26% Max dispatch width: 23792 -> 24112 (+1.34%) Gaining 40 SIMD16 shaders Rise Of The Tomb Raider: Totals from 622 (5.06% of 12289) affected shaders: Instrs: 437380 -> 434760 (-0.60%); split: -0.72%, +0.12% Cycles: 261843085 -> 261580703 (-0.10%); split: -0.73%, +0.63% Max live registers: 27731 -> 27766 (+0.13%); split: -1.01%, +1.14% Max dispatch width: 5832 -> 5432 (-6.86%); split: +0.27%, -7.13% Losing 26 SIMD32 shaders Strange Brigade: Totals from 1298 (31.48% of 4123) affected shaders: Instrs: 1504408 -> 1487968 (-1.09%); split: -1.17%, +0.08% Cycles: 20735976 -> 20443216 (-1.41%); split: -1.60%, +0.19% Max live registers: 89911 -> 89957 (+0.05%) DG2 shader-db run: total instructions in shared programs: 23130895 -> 23130036 (<.01%) instructions in affected programs: 260956 -> 260097 (-0.33%) helped: 234 HURT: 101 helped stats (abs) min: 1 max: 54 x̄: 6.36 x̃: 4 helped stats (rel) min: 0.05% max: 8.16% x̄: 2.01% x̃: 1.90% HURT stats (abs) min: 1 max: 37 x̄: 6.23 x̃: 3 HURT stats (rel) min: 0.02% max: 5.67% x̄: 0.89% x̃: 0.55% 95% mean confidence interval for instructions value: -3.62 -1.51 95% mean confidence interval for instructions %-change: -1.33% -0.94% Instructions are helped. total loops in shared programs: 6071 -> 6071 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 898610645 -> 898557166 (<.01%) cycles in affected programs: 18308201 -> 18254722 (-0.29%) helped: 315 HURT: 48 helped stats (abs) min: 1 max: 19312 x̄: 404.23 x̃: 128 helped stats (rel) min: 0.02% max: 28.98% x̄: 3.92% x̃: 2.65% HURT stats (abs) min: 2 max: 14478 x̄: 1538.60 x̃: 409 HURT stats (rel) min: <.01% max: 23.24% x̄: 3.34% x̃: 0.41% 95% mean confidence interval for cycles value: -333.68 39.03 95% mean confidence interval for cycles %-change: -3.51% -2.41% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 5964 -> 5964 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 6909 -> 6909 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 total sends in shared programs: 1040266 -> 1040266 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 3 GAINED: 1 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24553>	2023-09-06 14:47:40 +00:00
Lionel Landwerlin	10e75aae1b	intel/nir: rerun lower_tex if it lowers something nir_lower_tex can lower tg4 coords into tg4 offset which on DG2+ we also need to lower into constant offsets. Unfortunately the nir_lower_tex pass is not able to lower the instructions it itself generates, so the easy fix for when nir_lower_tex lowers tg4 coords into tg4 offsets is to rerun the pass. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9735 Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25015>	2023-09-05 13:35:51 +00:00
Matt Turner	28c1053c07	intel: Allow using intel_clc from the system With -Dintel-clc=system, the build system will search for an `intel_clc` binary and use it instead of building `intel_clc` itself. This allows Intel Vulkan ray tracing support to be built when cross compiling without terrible hacks (that would otherwise be necessary due to `intel_clc`'s dependence on SPIRV-LLVM-Translator, libclc, clang, and LLVM). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24983>	2023-09-01 21:36:02 +00:00
Alyssa Rosenzweig	f80c57c38f	treewide: Use nir_before/after_impl for more elaborate cases Via Coccinelle patch: @@ expression func_impl; @@ -nir_before_block(nir_start_block(func_impl)) +nir_before_impl(func_impl) @@ expression func_impl; @@ -nir_after_block(nir_impl_last_block(func_impl)) +nir_after_impl(func_impl) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24910>	2023-08-30 19:30:58 +00:00
Alyssa Rosenzweig	25cc04c59b	treewide: Use nir_before/after_impl in easy cases These open-code the same idiom as the helper. Via Coccinelle patch: @@ expression func_impl; @@ -nir_before_cf_list(&func_impl->body) +nir_before_impl(func_impl) @@ expression func_impl; @@ -nir_after_cf_list(&func_impl->body) +nir_after_impl(func_impl) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24910>	2023-08-30 19:30:58 +00:00
Karol Herbst	202fe3de31	intel/compiler: drop 64 bit handling for cl workgroup intrinsics Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24905>	2023-08-30 07:04:33 +00:00
Lionel Landwerlin	74a40cc4b6	intel/fs: move lower of non-uniform at_sample barycentric to NIR We use a non-uniform lowering loop in the backend which we can do better in NIR because we can also use divergence analysis there. This change also limits VGRF usage to a single VGRF to hold the sample ID in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>	2023-08-29 23:19:13 +00:00

2931 commits