fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-16 14:08:07 +02:00

Author	SHA1	Message	Date
Rohan Garg	7f6e6eb8ec	anv: partially revert `2e8b1f6d` set_image_compressed_bit checks for the image aux usage whereas cmd_buffer_mark_image_written checks for the subresource's aux usage. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Fixes: `2e8b1f6d` ('anv: drop duplicate checks when setting the compressed bit') Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24363>	2023-07-31 15:06:39 +00:00
Lionel Landwerlin	c1c0311d42	anv: enable EDS3 ConservativeRasterizationMode Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24395>	2023-07-31 12:30:37 +00:00
Lionel Landwerlin	a0179c32b6	anv: fix 3DSTATE_RASTER::APIMode field setting The APIMode field is set in the dynamic part in gfx8_cmd_buffer.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `55951ac28e` ("anv: fix emitting dynamic primitive topology") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24395>	2023-07-31 12:30:37 +00:00
Jordan Justen	5df97c27dc	intel/compiler: Use nir SUBGROUP_INVOCATION for RT TOPOLOGY_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21774>	2023-07-28 22:54:59 +00:00
Jordan Justen	dbf19b76e8	intel/isl: Use intel_needs_workaround() for MTL CCS WA Also use parent WA number of 14017240301 instead of 14017353530. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22401>	2023-07-28 15:23:47 -07:00
José Roberto de Souza	2b7599dc49	intel: Rename intel_gem_add_ext() to intel_i915_gem_add_ext() gem_add_ext() is i915 specific so adding it to the name. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>	2023-07-28 15:36:52 +00:00
José Roberto de Souza	c9950786f6	intel/common: Move functions inside of C++ ifdef Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>	2023-07-28 15:36:52 +00:00
José Roberto de Souza	4198a301b3	intel: Move i915_drm.h specific code from common/intel_gem.h to common/i915/intel_gem.h This allows us to remove one more i915_drm.h include from code shared by both backends. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>	2023-07-28 15:36:52 +00:00
José Roberto de Souza	1174e7412e	intel/dev: Port intel_dev_info tool to Xe KMD Only hwconfig was calling an i915-specific function, so it was only necessary to split the function that fetches it from backends and call it from intel_get_and_print_hwconfig_table() depending on the KMD loaded. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>	2023-07-28 15:36:52 +00:00
Illia Polishchuk	56e0aff530	anv, drirc: Add workaround to speed up Cyberpunk 2077 reg allocation Calling the ra_allocate function after each register spill can take several minutes. This option speeds up shader compilation by spilling more registers after the ra_allocate failure. Required for Cyberpunk 2077, which uses a watchdog thread to terminate the process in case the render thread hasn't responded within 2 minutes. Execution time of my Cyberpunk2077 shader compilation test: https://gitlab.freedesktop.org/illia.a.polishchuk/cyberpunk-vulkan-compute-hang-test-anv Before the patch: real 1m28,738s user 1m28,329s sys 0m0,400s After the patch: real 0m33,245s user 32m,835s sys 0m0,404s I think it's an acceptable patch because Cyberpunk benchmarks have the same FPS with and without the patch. (I ran it without the patch using a patched binary with the watchdog thread disabled.) Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com> Requires: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9241 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>	2023-07-28 14:51:42 +00:00
Jason Ekstrand	739e21fa9a	intel/fs: Add a parameter to speed up register spilling Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>	2023-07-28 14:51:42 +00:00
Mike Blumenkrantz	e68e612826	nir: add a helper for calculating variable slots this will maybe avoid future bugs, but probably not Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>	2023-07-28 13:14:35 +00:00
Mike Blumenkrantz	59396eefe6	nir: fix slot calculations for compact variables with location_frac a variable with a component offset may span multiple slots, and this cannot be inferred from its type alone (e.g., compacted clip+cull distances) cc: mesa-stable Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>	2023-07-28 13:14:35 +00:00
Lionel Landwerlin	87149cc545	blorp: update and move fast clear PIPE_CONTROLs to drivers Before this patch, when updating the indirect clear color, BLORP only invalidated the texture cache on gfx11. The hardware docs state that the texture cache invalidation is also needed on gfx12 however. Add this invalidation for gfx12 and move the fast-clear related cache invalidations to the drivers for clarity and performance. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5850 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23588>	2023-07-28 00:07:15 +00:00
Lionel Landwerlin	c94bd56114	blorp: switch blorp_update_clear_color to early return Avoid even going to the function if we don't need to. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23588>	2023-07-28 00:07:15 +00:00
Iván Briano	71ebd9b9d7	anv,hasvk: respect provoking vertex setting on geometry shaders We need to set the right value on ReorderMode based on the provoking vertex mode, or the order in which the vertices for tristrip[_adj] are delivered to the geometry shader doesn't match what Vulkan expects. Fixes dEQP-VK.transform_feedback.primitives_generated_query.concurrent.triangle_strip_with_adjacency Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23243>	2023-07-27 18:52:49 +00:00
Lionel Landwerlin	365b14489d	anv: wire image sparse loads Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:03:02 +03:00
Lionel Landwerlin	fe81d40bff	intel/nir: add lower for sparse images & textures We have to lower images into image load + sampler residency. There is also a restriction on sampler access with a compare, lower those as 2 sampler instructions to meet the restriction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:59 +03:00
Lionel Landwerlin	300cc829de	intel/nir: handle image_sparse_load in storage format lowering The last component of sparse load is the residency data. We don't want to touch/convert that value with the format lowering. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:34 +03:00
Lionel Landwerlin	d33aff783d	intel/fs: add support for sparse accesses Purely from the backend point of view it's just an additional parameter to sampler messages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:30 +03:00
Lionel Landwerlin	50c29e1ffa	anv: simplify buffer address+size loads from descriptor buffer Only found a couple titles that have been helped by this : PERCENTAGE DELTAS Shaders Instrs Cycles cyberpunk_2077 10388 -0.00% -0.00% ----------------------------------------------- All affected 1 -2.24% -0.39% ----------------------------------------------- Total 10388 -0.00% -0.00% PERCENTAGE DELTAS Shaders Instrs Cycles red_dead_redemption2 5949 -0.10% -0.00% -------------------------------------------------- All affected 111 -0.74% -0.14% -------------------------------------------------- Total 5949 -0.10% -0.00% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23318>	2023-07-26 09:41:23 +00:00
Lionel Landwerlin	f1f58c3bea	isl: add ability to store buffer size in unused RENDER_SURFACE_STATE fields Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23318>	2023-07-26 09:41:23 +00:00
Lionel Landwerlin	d099e47de0	intel/fs: add more UNDEFs around SEND messages lower_find_live_channel() in particular is used a lot in control flow to find the live channel for the surface/sampler handle. Adding UNDEFs on the temporary registers used for finding the live channels helps reduce the liveness of those temporary registers, especially in loops. Some titles affected : Rise Of The Tomb Raider: Totals from 2780 (22.58% of 12311) affected shaders: Instrs: 1294455 -> 1294592 (+0.01%); split: -0.15%, +0.16% Cycles: 1473136441 -> 1471302617 (-0.12%); split: -1.52%, +1.40% Max live registers: 144282 -> 143595 (-0.48%) Max dispatch width: 22200 -> 22232 (+0.14%) Red Dead Redemption 2: Totals from 435 (7.28% of 5972) affected shaders: Instrs: 488472 -> 487594 (-0.18%); split: -0.31%, +0.14% Cycles: 11354732 -> 11384928 (+0.27%); split: -0.44%, +0.71% Spill count: 1217 -> 1172 (-3.70%) Fill count: 3521 -> 3447 (-2.10%) Scratch Memory Size: 64512 -> 62464 (-3.17%) Max live registers: 35997 -> 35798 (-0.55%) Fallout 4: Totals from 8 (0.49% of 1638) affected shaders: Instrs: 41908 -> 40509 (-3.34%) Cycles: 3638464 -> 3555680 (-2.28%); split: -2.67%, +0.39% Spill count: 717 -> 665 (-7.25%) Fill count: 2542 -> 2438 (-4.09%) Scratch Memory Size: 32768 -> 16384 (-50.00%) Max live registers: 567 -> 534 (-5.82%) Cyberpunk 2077: Totals from 2984 (28.97% of 10301) affected shaders: Instrs: 3888874 -> 3891600 (+0.07%); split: -0.20%, +0.27% Cycles: 67906489 -> 67767721 (-0.20%); split: -0.68%, +0.47% Spill count: 200 -> 98 (-51.00%) Fill count: 237 -> 90 (-62.03%) Scratch Memory Size: 10240 -> 8192 (-20.00%) Max live registers: 215715 -> 212727 (-1.39%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>	2023-07-26 08:48:33 +00:00
Lionel Landwerlin	5c72724819	intel/fs: consider UNDEF as non-partial write A few titles show max live register reductions, but nothing significant in instruction count or other stats. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>	2023-07-26 08:48:32 +00:00
Lionel Landwerlin	d62e494b37	intel/vec4: fix log_data pointer Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3384f029be` ("intel/compiler: rework input parameters") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9421 Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24307>	2023-07-26 06:36:18 +00:00
Iván Briano	377c2a045f	intel/compiler: call brw_nir_adjust_payload from brw_postprocess_nir Calling anything after nir_trivialize_registers() risks undoing some of its work. In this case, brw_nir_adjust_payload() will do a constant folding pass if any payload adjusting happened, and that can turn a bunch of @store_regs into basically noops. Fixes dEQP-VK.subgroups.*task Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24325>	2023-07-25 22:48:09 +00:00
Ian Romanick	cb0de0a1d3	intel/fs: Constant fold OR and AND The path taken in fs_visitor::swizzle_nir_scratch_addr for DG2 generates some AND and OR instructions before the SHL. This commit folds those so the whole calculation becomes a constant (like on older platforms). v2: Fix return type of src_as_uint. Noticed by Marcin. shader-db results: DG2 total instructions in shared programs: 23190475 -> 23179540 (-0.05%) instructions in affected programs: 36026 -> 25091 (-30.35%) helped: 7 / HURT: 0 total cycles in shared programs: 841196807 -> 841142563 (<.01%) cycles in affected programs: 1660670 -> 1606426 (-3.27%) helped: 7 / HURT: 0 No shader-db changes on any older Intel platforms. fossil-db results: DG2 Totals: Instrs: 197780372 -> 197773966 (-0.00%) Cycles: 14066410782 -> 14066399378 (-0.00%); split: -0.00%, +0.00% Subgroup size: 8438104 -> 8438112 (+0.00%) Send messages: 8049445 -> 8049446 (+0.00%) Scratch Memory Size: 14263296 -> 14264320 (+0.01%) Totals from 9 (0.00% of 668055) affected shaders: Instrs: 24547 -> 18141 (-26.10%) Cycles: 1984791 -> 1973387 (-0.57%); split: -0.98%, +0.40% Subgroup size: 88 -> 96 (+9.09%) Send messages: 867 -> 868 (+0.12%) Scratch Memory Size: 69632 -> 70656 (+1.47%) No fossil-db changes on any older Intel platforms. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>	2023-07-25 22:11:21 +00:00
Ian Romanick	61c786bad5	intel/fs: Constant fold SHL This is a modified version of a commit originally in !7698. This version adds the changes to brw_fs_copy_propagation. If the address passed to fs_visitor::swizzle_nir_scratch_addr is a constant, that function will generate SHL with two constant sources. DG2 uses a different path to generate those addresses, so the constant folding can't occur there yet. That will be addressed in the next commit. What follows is the commit change history from that older MR. v2: Previously this commit was after `intel/fs: Combine constants for integer instructions too`. However, this commit can create invalid instructions that are only cleaned up by `intel/fs: Combine constants for integer instructions too`. That would potentially affect the shader-db results of each commit, but I did not collect new data for the reordering. v3: Fix masking for W/UW and for Q/UQ types. Add an assertion for !saturate. Both suggested by Ken. Also add an assertion that B/UB types don't magically come back. v4: Fix sources count. See also `ed3c2f73db` ("intel/fs: fixup sources number from opt_algebraic"). v5: Fix typo in comment added in v3. Noticed by Marcin. Fix a typo in a comment added when pulling this commit out of !7698. Noticed by Ken. shader-db results: DG2 No changes. Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown) total instructions in shared programs: 20655696 -> 20651648 (-0.02%) instructions in affected programs: 23125 -> 19077 (-17.50%) helped: 7 / HURT: 0 total cycles in shared programs: 858436639 -> 858407749 (<.01%) cycles in affected programs: 8990532 -> 8961642 (-0.32%) helped: 7 / HURT: 0 Broadwell and Haswell had similar results. 
(Broadwell shown) total instructions in shared programs: 18500780 -> 18496630 (-0.02%) instructions in affected programs: 24715 -> 20565 (-16.79%) helped: 7 / HURT: 0 total cycles in shared programs: 946100660 -> 946087688 (<.01%) cycles in affected programs: 5838252 -> 5825280 (-0.22%) helped: 7 / HURT: 0 total spills in shared programs: 17588 -> 17572 (-0.09%) spills in affected programs: 1206 -> 1190 (-1.33%) helped: 2 / HURT: 0 total fills in shared programs: 25192 -> 25156 (-0.14%) fills in affected programs: 156 -> 120 (-23.08%) helped: 2 / HURT: 0 No shader-db changes on any older Intel platforms. fossil-db results: DG2 Totals: Instrs: 197780415 -> 197780372 (-0.00%); split: -0.00%, +0.00% Cycles: 14066412266 -> 14066410782 (-0.00%); split: -0.00%, +0.00% Totals from 16 (0.00% of 668055) affected shaders: Instrs: 16420 -> 16377 (-0.26%); split: -0.43%, +0.17% Cycles: 220133 -> 218649 (-0.67%); split: -0.69%, +0.01% Tiger Lake, Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 153425977 -> 153423678 (-0.00%) Cycles: 14747928947 -> 14747929547 (+0.00%); split: -0.00%, +0.00% Subgroup size: 8535968 -> 8535976 (+0.00%) Send messages: 7697606 -> 7697607 (+0.00%) Scratch Memory Size: 4380672 -> 4381696 (+0.02%) Totals from 6 (0.00% of 662749) affected shaders: Instrs: 13893 -> 11594 (-16.55%) Cycles: 5386074 -> 5386674 (+0.01%); split: -0.42%, +0.43% Subgroup size: 80 -> 88 (+10.00%) Send messages: 675 -> 676 (+0.15%) Scratch Memory Size: 91136 -> 92160 (+1.12%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>	2023-07-25 22:11:21 +00:00
Ian Romanick	56e6186dcf	intel/fs: Always do opt_algebraic after opt_copy_propagation makes progress opt_copy_propagation can create invalid instructions like shl(8) vgrf96:UD, 2d, 8u These instructions will be cleaned up by opt_algebraic. The irony is opt_algebraic converts these to simple mov instructions that opt_copy_propagation should clean up. I don't think we want a loop like do { progress = false; if (OPT(opt_copy_propagation)) { OPT(opt_algebraic); OPT(dead_code_eliminate); } } while (progress); But maybe we do? Maybe this would be sufficient: while (OPT(opt_copy_propagation)) OPT(opt_algebraic); OPT(dead_code_eliminate); No shader-db or fossil-db changes (yet) on any Intel platform. This is expected. v2: Do opt_algebraic immediately after every call to opt_copy_propagation instead of being clever. Suggested by Lionel. Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>	2023-07-25 22:11:21 +00:00
José Roberto de Souza	f59d272e93	anv: Request Xe KMD to place BOs to CPU visible VRAM when required This is required to support discrete GPUs placed in systems where large PCI BAR or resizable PCI BAR is not available or is disabled. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>	2023-07-25 19:33:16 +00:00
José Roberto de Souza	f9fcd7168a	intel/dev/xe: Add support for small-bar setups This adds support for discrete GPUs placed in systems where large PCI BAR or resizable PCI BAR is not available or is disabled. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>	2023-07-25 19:33:15 +00:00
Faith Ekstrand	94f36cfaa3	intel/fs: Assume NIR is in SSA form Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>	2023-07-25 16:25:11 +00:00
Faith Ekstrand	965bbe5286	intel/fs: Rework the overlapping mov/vec case Now that we're using load/store_reg intrinsics, the previous checks for registers aren't what we want. Instead, we need to be looking for a mov or vec where both the destination and a source are load/store_reg with matching decl_reg. Fixes: `b8209d69ff` ("intel/fs: Add support for new-style registers") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>	2023-07-25 16:25:11 +00:00
Faith Ekstrand	45ee952efb	intel/fs: Use write masks from store_reg intrinsics Fixes: `b8209d69ff` ("intel/fs: Add support for new-style registers") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>	2023-07-25 16:25:10 +00:00
Marcin Ślusarz	4f1125e4ae	intel/compiler/test: fix crashes when TEST_DEBUG is set Dumping instructions requires that ISA info is not empty. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24274>	2023-07-25 15:13:29 +00:00
Illia Polishchuk	c2724b4d37	s/Intel: fix/anv: fix: potentially overflowing expression in genX CID 1528164 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN) overflow_before_widen: Potentially overflowing expression pool->n_passes * pool->khr_perf_preamble_stride with type unsigned int (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type uint64_t (64 bits, unsigned). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20893>	2023-07-25 08:55:56 +00:00
Jianxun Zhang	75452f611e	intel/common: Only set op mask on instructions in decoder When a default value of a struct's field, which is in the higher half of the first dword, is specified in a gen xml file, setting op mask makes decoder treat the field as a header (intel_field_is_header()). As a result, it won't output the field in batch dump. This is not a common case but can happen once a gen xml file includes such fields. The op mask is only meaningful to instructions, so we fix the above issue by not setting op mask of structs (also registers). Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24268>	2023-07-24 22:56:59 +00:00
Nanley Chery	1d12b29b3f	intel/blorp: Ambiguate after CCS resolves on gfx7-8 ISL's state-machine of CCS_D describes full resolves as leaving the aux buffer in the pass-through state. Hardware doesn't behave this way on gfx8 however. On that platform, full resolves transition the aux buffer to the resolved state. This was verified by dumping the CCS before and after a full resolve on BDW (gfx7 is simply assumed to behave the same). Ambiguate after resolving to match driver expectations. Prevents iris from failing piglit's fcc-write-after-clear on BDW with a future patch which relies on fast-clear encodings being removed after a resolve. The avoided failure is: Testing implicit read of partial block UNORM -> SNORM Probe color at (0,1,0) Expected: 1.000000 1.000000 1.000000 1.000000 Observed: 0.000000 0.000000 0.000000 0.000000 Cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>	2023-07-24 22:29:01 +00:00
Lionel Landwerlin	8cbf730145	intel/fs: don't try to rebuild sequences of non ssa values Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `04777171e0` ("intel/fs: try to rematerialize surface computation code") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9378 Reviewed-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228>	2023-07-24 20:04:24 +00:00
Emma Anholt	61ec26db26	ci/tgl: Improve the info for ANGLE's MSAA regression on TGL. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24200>	2023-07-24 16:07:28 +00:00
Faith Ekstrand	079e8a9674	anv,hasvk,iris: sampler_prog_key::swizzles is only used on crocus The field is no longer consumed by brw_compile_* and is instead handled directly by the crocus driver. Therefore, it's safe to leave it zero and not even bother setting it. This removes our reliance on the SWIZZLE_* macros in prog_instructions.h. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24288>	2023-07-24 15:40:40 +00:00
Marcin Ślusarz	48885c7fe3	intel/compiler: load debug mesh compaction options once Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Marcin Ślusarz	c1685f08dd	intel/compiler,anv: put some vertex and primitive data in headers Both per-primitive and per-vertex space is allocated in MUE in 8 dword chunks and those 8-dword chunks (granularity of 3DSTATE_SBE_MESH.Per[Primitive\|Vertex]URBEntryOutputReadLength) are passed to fragment shaders as inputs (either non-interpolated for per-primitive and flat vertex attributes or interpolated for non-flat vertex attributes). Some attributes have a special meaning and must be placed in separate 8/16-dword slot called Primitive Header or Vertex Header. Primitive Header contains 4 such attributes (Cull Primitive, ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword slot) potentially unused. Vertex Header is similar - it starts with 3 unused dwords, 1 dword for Point Size (but if we declare that shader doesn't produce Point Size then we can reuse it), followed by 4 dwords for Position and optionally 8 dwords for clip distances. This means we have an interesting optimization problem - we can put some user attributes into holes in Primitive and Vertex Headers, which may lead to smaller MUE size and potentially more mesh threads running in parallel, but we have to be careful to use those holes only when we need it, otherwise we could force HW to pass too much data to fragment shader. Example 1: Let's assume that Primitive Header is enabled and user defined 12 dwords of per-primitive attributes. Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader. With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to fragment shader. 16/16 is better than 24/16, so packing makes sense. Example 2: Now let's assume that Primitive Header is enabled and user defined 16 dwords of per-primitive attributes. 
Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of MUE space and pass ALIGN(16, 16) = 16 dwords to fragment shader. With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to fragment shader. 24/24 is worse than 24/16, so packing doesn't make sense. This change doesn't affect vk_meshlet_cadscene in default configuration, but it speeds it up by up to 25% with "-extraattributes N", where N is some small value divisible by 2 (by default N == 1) and we are bound by URB size. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Marcin Ślusarz	a252123363	intel/compiler/mesh: compactify MUE layout Instead of using 4 dwords for each output slot, use only the amount of memory actually needed by each variable. There are some complications from this "obvious" idea: - flat and non-flat variables can't be merged into the same vec4 slot, because flat inputs mask has vec4 stride - multi-slot variables can have different layout: float[N] requires N 1-dword slots, but i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot - some output variables occur both in single-channel/component split and combined variants - crossing vec4 boundary requires generating more writes, so avoiding them if possible is beneficial This patch fixes some issues with arrays in per-vertex and per-primitive data (func.mesh.ext.outputs.*.indirect_array.q0 in crucible) and by reduction in single MUE size it allows spawning more threads at the same time. Note: this patch doesn't improve vk_meshlet_cadscene performance because default layout is already optimal enough. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Zhang Ning	06db9bd3f6	Revert "intel/ci: disable iris-jsl-deqp because it always fails for an AMD MR" This reverts commit `da4b5b4a47`. Signed-off-by: Zhang Ning <zhangn1985@outlook.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23815>	2023-07-24 03:02:14 +00:00
Alyssa Rosenzweig	1466014184	nir: Rename lower_locals_to_reg_intrinsics back The short name is freed up. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>	2023-07-21 11:25:49 +00:00
Alyssa Rosenzweig	a08286f993	intel/fs: Don't read reg.base_offset It's not set in the new intrinsics path. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>	2023-07-21 11:25:48 +00:00
Rohan Garg	01965a2fe9	anv: drop CFE state validation checks anv no longer needs to track if the CFE state is valid since we ensure that the state is valid at pipeline creation time. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23934>	2023-07-21 10:46:08 +00:00
Rohan Garg	e7e7042093	anv,iris: program the maximum number of threads on compute queue init Fixes: `90a39cac87` ("intel/blorp: Emit compute program based on BLORP_BATCH_USE_COMPUTE") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23934>	2023-07-21 10:46:08 +00:00
Marcin Ślusarz	06046a02f8	anv: merge cases leading to the same code Added in: `688968e888` ("anv: add support for direct descriptor in allocation/writes") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24260>	2023-07-21 07:22:22 +00:00
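
The Primitive Header packing arithmetic from commit c1685f08dd in the list above can be re-checked with a short sketch. This is not Mesa code: the function names `align`, `unpacked`, `packed` and `packing_helps` are hypothetical, and the decision rule in `packing_helps` (pack only when neither number gets worse and at least one gets better) is an assumption inferred from the two worked examples in the commit message, which only state that 16/16 beats 24/16 and that 24/24 is worse than 24/16.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment (like ALIGN in the commit text)."""
    return (value + alignment - 1) // alignment * alignment


def unpacked(user_dwords):
    """Return (MUE dwords, dwords passed to the fragment shader) without packing.

    The 8-dword Primitive Header sits in its own slot; user per-primitive
    attributes follow in 8-dword chunks.
    """
    mue = 8 + align(user_dwords, 8)
    fs = align(user_dwords, 8)
    return mue, fs


def packed(user_dwords):
    """Return (MUE dwords, dwords passed to the fragment shader) with packing.

    Four user dwords go into the unused half of the Primitive Header, so the
    header chunk itself must now be passed to the fragment shader as well.
    """
    rest = max(user_dwords - 4, 0)
    mue = 4 + 4 + align(rest, 8)
    fs = align(4, 8) + align(rest, 8)
    return mue, fs


def packing_helps(user_dwords):
    """Assumed rule: pack only if nothing gets worse and something gets better."""
    p_mue, p_fs = packed(user_dwords)
    u_mue, u_fs = unpacked(user_dwords)
    return p_mue <= u_mue and p_fs <= u_fs and (p_mue, p_fs) != (u_mue, u_fs)


for n in (12, 16):
    print(n, unpacked(n), packed(n), packing_helps(n))
```

With 12 user dwords this reproduces the 24/16 versus 16/16 comparison of Example 1, and with 16 user dwords the 24/16 versus 24/24 comparison of Example 2.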


9920 commits