fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 22:20:14 +01:00

Author	SHA1	Message	Date
Lionel Landwerlin	5ae8a78d8c	intel/fs: make use of load_ubo_uniform_block_intel The principle is the same as the load_ssbo_uniform_block_intel. Whenever we see a uniform offset, load the data only once in GRFs to reduce register pressure. Iris shader-db run on DG2 : total instructions in shared programs: 23001325 -> 23094969 (0.41%) instructions in affected programs: 1775989 -> 1869633 (5.27%) helped: 764 HURT: 2097 helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2 helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63% HURT stats (abs) min: 1 max: 2461 x̄: 47.19 x̃: 7 HURT stats (rel) min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60% 95% mean confidence interval for instructions value: 25.43 40.03 95% mean confidence interval for instructions %-change: 3.60% 4.33% Instructions are HURT. total loops in shared programs: 5847 -> 5847 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 839329852 -> 845491482 (0.73%) cycles in affected programs: 130229434 -> 136391064 (4.73%) helped: 1098 HURT: 2228 helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22 helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71% HURT stats (abs) min: 1 max: 185309 x̄: 3426.24 x̃: 87 HURT stats (rel) min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82% 95% mean confidence interval for cycles value: 1342.16 2362.97 95% mean confidence interval for cycles %-change: 3.70% 4.52% Cycles are HURT. total spills in shared programs: 10768 -> 11856 (10.10%) spills in affected programs: 9717 -> 10805 (11.20%) helped: 25 HURT: 28 total fills in shared programs: 13720 -> 16258 (18.50%) fills in affected programs: 12016 -> 14554 (21.12%) helped: 25 HURT: 28 total sends in shared programs: 1034790 -> 1031266 (-0.34%) sends in affected programs: 33416 -> 29892 (-10.55%) helped: 1005 HURT: 0 helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3 helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08% 95% mean confidence interval for sends value: -3.72 -3.29 95% mean confidence interval for sends %-change: -15.82% -14.57% Sends are helped. LOST: 26 GAINED: 183 shader-db on a number of VK/DX titles on DG2 : PERCENTAGE DELTAS Shaders Instrs Cycles age_of_wonders_III 1928 +0.02% -0.19% PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width assassins_creed_odyssey 2119 +1.12% -0.42% -0.03% -0.29% -9.10% -4.26% -0.64% +0.65% PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Max live registers aztec_ruins_high 269 -0.05% -0.45% -0.29% -7.27% -0.33% PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width dark_souls_3_dxvk_g2 1420 +0.09% +0.24% +0.21% +0.12% (stats look bad, but it's just one shader affected) PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Scratch Memory Size Max live registers fallout_4_dxvk_g2 1638 +0.67% +8.32% +16.02% +7.17% +100.00% +0.48% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Spill count Fill count Max live registers Max dispatch width red_dead_redemption2 5969 +0.16% -0.04% -0.04% +0.01% +0.05% -0.20% +0.04% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width rise_of_the_tomb_raider_g2 12129 +2.19% +1.36% -1.23% -0.36% +2.04% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers shooter-game 693 +0.07% -0.89% -0.09% -0.09% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width talos_g2 1140 +0.37% +3.80% -0.86% -0.67% +0.19% PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width total_war_warhammer2 477 +0.25% +0.66% -0.17% +0.10% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width witcher_3_dxvk_g2 1074 +0.75% -10.45% -0.15% -0.16% -0.16% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers wolfenstein_youngblood 1111 +0.52% +0.66% -0.59% -0.03% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	7eb1e2a690	intel/fs: avoid reusing the VGRF for uniform load_ubo Only found 3 shaders affected in Red Dead Redemption : Totals from 3 (0.05% of 5969) affected shaders: Instrs: 2246 -> 2230 (-0.71%) Cycles: 156506 -> 148402 (-5.18%); split: -5.23%, +0.05% This will have a larger effect when we add the load_ubo_uniform_block_intel intrinsic where we will have larger blocks (vec8/vec16 vs vec4 only now). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	ff3494fce3	intel/fs: print identation for control flow INTEL_DEBUG=optimizer output changes from : { 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f { 10} 41: (+f0.0) if(8) (null):UD, { 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d { 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u { 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d { 10} 45: (+f0.0) if(8) (null):UD, { 11} 46: mov(8) vgrf270:D, -1082130432d { 12} 47: mov(8) vgrf271:D, 1082130432d { 14} 48: mov(8) vgrf274+0.0:D, 0d { 14} 49: mov(8) vgrf274+1.0:D, 0d to : { 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f { 10} 41: (+f0.0) if(8) (null):UD, { 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d { 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u { 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d { 10} 45: (+f0.0) if(8) (null):UD, { 11} 46: mov(8) vgrf270:D, -1082130432d { 12} 47: mov(8) vgrf271:D, 1082130432d { 14} 48: mov(8) vgrf274+0.0:D, 0d { 14} 49: mov(8) vgrf274+1.0:D, 0d Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	0cd9f0c3d3	intel/fs: fix bindless/shared surface mistake Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `068bf1378d` ("intel/fs: enable SSBO accesses through the bindless heap") Tested-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23536>	2023-06-14 07:42:57 +00:00
Alyssa Rosenzweig	1d4a59448c	treewide: Remove use_scoped_barrier It is now set by all relevant drivers and not checked anywhere. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>	2023-06-13 16:36:10 +00:00
Jesse Natalie	082eba6165	nir_lower_mem_access_bit_sizes: Move options into a struct Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Jesse Natalie	4217353e2d	nir_lower_mem_access_bit_sizes: Add a bit_size input to the callback We'd like to use this callback to adjust loads and stores from things that are unsupported to things that are supported, but if the input is already supported, we'd prefer not to change it. Rather than making up a bit size that'd work and doing a bunch of pack/unpack bit math, only return a different bit size if the input one doesn't work for us (i.e. can't load enough memory or just an unsupported size entirely). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Caio Oliveira	26f6ea5c30	intel/compiler: Remove unused functions and declarations Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23539>	2023-06-09 20:09:51 +00:00
Lionel Landwerlin	25de091753	intel/nir: switch ray query state tracking to local variables uint16_t We should be able to use uint8_t but there appears to be a backend bug. Q2RTX shader compute shader improvement with ray queries : Totals: Instrs: 102221 -> 101499 (-0.71%); split: -0.82%, +0.12% Cycles: 4451260 -> 4396025 (-1.24%) Send messages: 3587 -> 3585 (-0.06%) Spill count: 717 -> 658 (-8.23%) Fill count: 1248 -> 1214 (-2.72%); split: -3.21%, +0.48% Scratch Memory Size: 21504 -> 16384 (-23.81%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19982>	2023-06-09 08:29:43 +03:00
Caio Oliveira	2bb26cc01d	intel/compiler: Refactor dump_instruction(s) Delete unnecessary virtual functions, we need just two. Refactor code so the 'default behavior' logic (stderr and/or creating file) is not duplicated. Rename the virtuals so overrides don't hide the common convenience functions. Finally, provide a variant of dump_instructions() with a `FILE *` parameter. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23457>	2023-06-08 22:00:21 +00:00
Alyssa Rosenzweig	99a00e2247	treewide: Use nir_trim_vector more Via Coccinelle patches @@ expression a, b, c; @@ -nir_channels(b, a, (1 << c) - 1) +nir_trim_vector(b, a, c) @@ expression a, b, c; @@ -nir_channels(b, a, BITFIELD_MASK(c)) +nir_trim_vector(b, a, c) @@ expression a, b; @@ -nir_channels(b, a, 3) +nir_trim_vector(b, a, 2) @@ expression a, b; @@ -nir_channels(b, a, 7) +nir_trim_vector(b, a, 3) Plus a fixup for pointless trimming an immediate in RADV and radeonsi. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>	2023-06-06 18:52:25 +00:00
Lionel Landwerlin	049c791a63	intel/fs: fix pull-constant-load prior to gfx7 In `ad9bc1ffb5` ("intel/fs: enable UBO accesses through bindless heap") we added a new source, we need to fixup the source index for the generator. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ad9bc1ffb5` ("intel/fs: enable UBO accesses through bindless heap") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23405>	2023-06-06 14:47:41 +00:00
Ian Romanick	78dd15d8e8	intel/eu/validate: Add some validation of ADD3 v2: Remove spurious ALIGN_1 checks. Suggested by Matt. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	1c4c76032b	intel/eu/validate: Add Gfx12.5 This required updating the expected results in a number of test. The vast majority of these are cases where Gfx12.5 platforms don't allow mixing F and HF sources. In all honesty... I just updated the half_float_conversion expected results until the test passed. The next commit will add changes specific to Gfx12.5. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	a3cfec0690	intel/eu/validate: Use a single macro define half_float_conversion cases This is what other tests do. The next commit will add a third set of possible results (for Gfx12.5+), and the multiple macro method does not scale. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	7ef45e661f	intel/fs: Add constant propagation for ADD3 v2: Require that the constant value be representable as either uint16_t or int16_t. Suggested by Matt. v3: Remove redundant patterns. Noticed by Matt. shader-db: DG2 total instructions in shared programs: 23103767 -> 23103577 (<.01%) instructions in affected programs: 51822 -> 51632 (-0.37%) helped: 98 / HURT: 15 total cycles in shared programs: 842347714 -> 842380017 (<.01%) cycles in affected programs: 1942595 -> 1974898 (1.66%) helped: 97 / HURT: 32 Nearly all of the affected shaders (around 9,900) are shaders in Cyberpunk 2077. It's about an even split between vertex and fragment shaders. The majority of the remaining affected shaders (3,600) are from Strange Brigade. This was also a nearly even split between fragment and vertex. All but two of the lost shaders are SIMD32 fragment shaders in Cyberpunk 2077. The other two are SIMD32 fragment shaders in Dota2. fossil-db: DG2 Instructions in all programs: 196379107 -> 196248608 (-0.1%) helped: 13467 / HURT: 1210 Cycles in all programs: 13931355281 -> 13929955971 (-0.0%) helped: 11801 / HURT: 2922 Lost: 90 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	9a9a86013c	intel/fs: Allow HF const in MAD on Gfx12.5 if all sources are HF Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	4f272bf001	intel/fs: Fix handling of W, UW, and HF constants in combine_constants Sources that are already W, UW, or HF can be represented as those types by definition. Pass them through. Previously an HF source on a MAD would have been marked as !can_promote. I'm pretty sure this means it would get moved out to a register, but I did not verify this. For ADD3, a constant source could be D or UD. In this case, the value must be tested to determine whether it can be represented as W or UW. The patterns in opt_algebraic won't generate an ADD3 with constant source, so this problem cannot occur yet. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	4cc3206218	intel/fs: Don't munge source order of 3-src instructions in opt_algebraic This only impacts ADD3, so at this point it should not have any affect. As soon as constants are propagated into ADD3 instructions, it will be a problem. The worst part is, the ADD3 instrutions that are broken by the old code aren't even "progress" on this pass. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Erik Faye-Lund	6d142078bc	nir: use generated immediate comparison helpers This makes the code a bit less verbose, so let's use the helpers. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23393>	2023-06-05 13:40:08 +00:00
Erik Faye-Lund	28b1c5bca1	nir: use nir_i{ne,eq}_imm helpers We already have these, so let's use them more. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23393>	2023-06-05 13:40:07 +00:00
Yonggang Luo	12256136e0	compiler: Rename shader_prim to mesa_prim and replace all usage of pipe_prim_type with mesa_prim This is a prepare step to remove depends on p_defines.h in src/util/* This is done by: replace pipe_prim_type with mesa_prim replace shader_prim with mesa_prim replace PIPE_PRIM_MAX with MESA_PRIM_COUNT replace SHADER_PRIM_ with MESA_PRIM_ replace PIPE_PRIM_ with MESA_PRIM_ This patch only replace code only Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23369>	2023-06-03 03:29:03 +00:00
Mark Janes	a98f246857	isl: use generated workaround helpers for Wa_1806565034 This workaround was enabled for gen12+, but only applies to gen12.0. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21912>	2023-06-02 16:17:34 +00:00
Kenneth Graunke	2d9a3bb093	intel/compiler: Fix a fallthrough in components_read() for atomics In commit `284f0c9a57` I refactored the handling of the data source to just call a helper rather than special casing opcodes with 0 or 2 sources. Unfortunately, I also dropped the "else return 1", creating a fallthrough for all sources other than SURFACE_LOGICAL_SRC_ADDRESS and SURFACE_LOGICAL_SRC_DATA. The case below happened to return the correct value for all cases except SURFACE_LOGICAL_SRC_SURFACE, which has been returning 2 instead of 1 since that commit. Restore the else case. Thanks to Marcin Ślusarz for catching this. Fixes: `284f0c9a57` ("intel/compiler: Add an lsc_op_num_data_values() helper") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23347>	2023-06-01 21:06:57 +00:00
Lionel Landwerlin	018e306b8e	intel/fs: fix a couple of descriptor mistakes I found those issues while testing DOOM eternal and Ian also ran into it with other shaders. We write the desc register in SIMD1 exec_all, so all the data is in the first component. We need to make sure to pass that component in the lower SEND instructions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23354>	2023-06-01 19:53:41 +00:00
Rohan Garg	ef2b763d9c	anv: fix incorrect asserts when combining CPS and per sample interpolation CPS is dynamically turned off when per sample interpolation is active. Update the asserts to reflect this. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5644011f06` ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23103>	2023-05-31 19:26:59 +00:00
Mark Janes	d0669f3ede	intel/dev: switch defect identifiers to use lineage numbers Update existing workarounds when necessary to match changed identifiers. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23226>	2023-05-30 22:13:41 +00:00
Alyssa Rosenzweig	ebf4eff7eb	treewide: Use nir_replicate Via coccinelle. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23259>	2023-05-30 16:24:21 -04:00
Lionel Landwerlin	3f1ff326e0	anv: reduce push constant size for descriptor sets Now that descriptor sets are located a in a 1Gb area, we can avoid storing the whole address to the descriptor and add the base address of the area to a 32bit offset. Replay a bunch of fossils with this and changes not really significant one way or another : Totals: Instrs: 9278246 -> 9277148 (-0.01%); split: -0.01%, +0.00% Cycles: 3547598421 -> 3547579435 (-0.00%); split: -0.00%, +0.00% Totals from 353 (1.14% of 31021) affected shaders: Instrs: 581546 -> 580448 (-0.19%); split: -0.23%, +0.04% Cycles: 25885422 -> 25866436 (-0.07%); split: -0.31%, +0.24% No difference on send messages or spills/fills. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:38 +00:00
Lionel Landwerlin	04777171e0	intel/fs: try to rematerialize surface computation code This helps a lot with accessing surface handles in control flow. Our resource_intel intrinsic has a non_uniform flag, in which case we cannot apply this optimization. But in uniform cases, this is just a massive win. We drop all kind of pipeline stalls due to find_live_channel. We also reduce register pressure by doing the surface handle computation in a single GRF (instead of 2 or 4). There are some regressions in max dispatch width but those I think are only on SIMD32 and due to the current heuristic disabling it after throughput comparison with SIMD16. We know this heuristic is not perfect, it should probably be updated in another change. Here are some stats (all titles seem to have similar gains) : PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Scratch Memory Size Max live registers Max dispatch width red_dead_redemption2 5860 -36.80% -5.67% +0.77% +0.06% -81.26% -79.16% -70.62% -8.63% -6.93% --------------------------------------------------------------------------------------------------------------------------------------------------------------- All affected 4716 -37.29% -5.67% +0.95% +0.07% -81.26% -79.16% -70.62% -9.15% -8.47% --------------------------------------------------------------------------------------------------------------------------------------------------------------- Total 5860 -36.80% -5.67% +0.77% +0.06% -81.26% -79.16% -70.62% -8.63% -6.93% PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Scratch Memory Size Max live registers Max dispatch width rise_of_the_tomb_raider_g2 12010 -37.19% -22.12% +0.01% +0.00% -99.01% -99.14% -98.65% -7.62% -4.96% --------------------------------------------------------------------------------------------------------------------------------------------------------------------- All affected 11732 -37.27% -22.14% +0.01% +0.00% -99.01% -99.14% -98.65% -7.67% -5.11% --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Total 12010 -37.19% -22.12% +0.01% +0.00% -99.01% -99.14% -98.65% -7.62% -4.96% PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Scratch Memory Size Max live registers Max dispatch width total_war_warhammer2 462 -27.45% -12.42% -82.35% -88.46% -66.67% -5.52% -5.62% ----------------------------------------------------------------------------------------------------------------------------------- All affected 335 -28.31% -12.77% -82.35% -88.46% -66.67% -6.25% -7.24% ----------------------------------------------------------------------------------------------------------------------------------- Total 462 -27.45% -12.42% -82.35% -88.46% -66.67% -5.52% -5.62% PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Scratch Memory Size Max live registers Max dispatch width witcher_3_dxvk_g2 1049 -36.94% -57.82% +0.06% +0.01% -98.52% -97.29% -98.10% -7.81% -1.00% ------------------------------------------------------------------------------------------------------------------------------------------------------------ All affected 693 -41.93% -58.45% +0.09% +0.01% -98.52% -97.29% -98.10% -10.25% -1.33% ------------------------------------------------------------------------------------------------------------------------------------------------------------ Total 1049 -36.94% -57.82% +0.06% +0.01% -98.52% -97.29% -98.10% -7.81% -1.00% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	b28609a756	intel/fs: enable uniform block accesses through bindless heap Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	05089f305f	intel/fs: enable bindless sampler state offsets Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	6d6877bf99	intel/fs: enable extended bindless surface offset Gives use 4Gb of bindless surface state on Gfx12.5+ instead of 64Mb. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	01fc9a06bd	intel/fs: enable get_buffer_size on bindless heap Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	ad9bc1ffb5	intel/fs: enable UBO accesses through bindless heap Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	068bf1378d	intel/fs: enable SSBO accesses through the bindless heap Using the information coming from surface_index_intel, we can tell whether we should use the BTI or bindless heap for a particular SSBO access. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	3d0cc3f63b	intel/fs: keep track of new resource_intel information Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	86e9943b00	intel/fs: teach ubo range analysis pass about resource_intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	12540dfb6b	intel/fs: add a pass to move resource_intel closer to user Non uniform lower can insert read_first_invocation on the result of resource_intel. We want to keep that intrinsic directly in front of the user (load_ubo/load_ssbo/load_image/etc...) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	e09cfda0de	intel/fs: lower get_buffer_size like other logical sends This will also enable the use of the bindless heap. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:36 +00:00
Lionel Landwerlin	a66944dfbc	intel/fs: reuse descriptor helper Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:36 +00:00
Erik Faye-Lund	20d619cd84	nir: use more nir_fmul_imm This simplifies things a bit. Note that in some cases, the arguments are swapped, because multiplications are commutative, and nir_fmul_imm only allows the second operand to be an immediate. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>	2023-05-25 06:59:24 +00:00
Lionel Landwerlin	429ef02f83	intel/fs: make tcs input_vertices dynamic We need to do 3 things to accomplish this : 1. make all the register access consider the maximal case when unknown at compile time 2. move the clamping of load_per_vertex_input prior to lowering nir_intrinsic_load_patch_vertices_in (in the dynamic cases, the clamping will use the nir_intrinsic_load_patch_vertices_in to clamp), meaning clamping using derefs rather than lowered nir_intrinsic_load_per_vertex_input 3. in the known cases, lower nir_intrinsic_load_patch_vertices_in in NIR (so that the clamped elements still be vectorized to the smallest number of URB read messages) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>	2023-05-24 18:32:07 +00:00
Lionel Landwerlin	21c7b55f6f	intel/fs: fix size_read() for LOAD_PAYLOAD With Anv/Zink, the piglit test : arb_shader_storage_buffer_object-max-ssbo-size -auto -fbo fsexceed is failing validation after copy propagation : load_payload(8) vgrf15:F, vgrf1+0.12<0>:F, vgrf1+0.0<0>:F, vgrf1+0.4<0>:F, vgrf1+0.8<0>:F, vgrf1+0.12<0>:F ../src/intel/compiler/brw_fs_validate.cpp:191: A <= B failed A = inst->src[i].offset / REG_SIZE + regs_read(inst, i) = 2 B = alloc.sizes[inst->src[i].nr] = 1 In most cases it works because src[0] would be at offset 0 and so reading a full reg passes validation, but Anv/Zink started emitting slightly different code adding an offset maybe the size read 2 GRFs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23126>	2023-05-23 12:39:08 +00:00
Kenneth Graunke	a2d384a5c0	intel/compiler: Fix 64-bit ufind_msb, find_lsb, and bit_count We only support 32-bit versions of ufind_msb, find_lsb, and bit_count, so we need to lower them via nir_lower_int64. Previously, we were failing to do so on platforms older than Icelake and let those operations fall through to nir_lower_bit_size, which used a callback to determine it should lower them for bit_size != 32. However, that pass only emulates small bit-size operations by promoting them to supported, larger bit-sizes (i.e. 16-bit using 32-bit). It doesn't support emulating larger operations (i.e. 64-bit using 32-bit). So nir_lower_bit_size would just u2u32 the 64-bit source, causing us to flat ignore half of the bits. Commit `78a195f252` (intel/compiler: Postpone most int64 lowering to brw_postprocess_nir) provoked this bug on Icelake and later as well, by moving the nir_lower_int64 handling for ufind_msb until late in compilation, allowing it to reach nir_lower_bit_size which broke it. To fix this, we always set int64 lowering for these opcodes, and also correct the nir_lower_bit_size callback to ignore 64-bit operations. Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>	2023-05-19 22:44:37 +00:00
Erik Faye-Lund	185001a86f	meson: remove needless c++17-overrides C++17 is the project-wide default since `f9057cea51` ("fix(FTBFS): meson: raise C++ standard to C++17"), so let's drop these local overrides. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23048>	2023-05-19 12:45:31 +00:00
Rohan Garg	6b8fe32322	intel: infer scalar'ness locally for brw_vectorize_lower_mem_access Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	3a8f5c2783	intel: update comments about non-existent function parameter Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	a15cc833f9	intel: drop unused is_scalar function parameter in brw_nir_apply_key Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	212810ac8a	intel: infer scalar'ness locally for brw_postprocess_nir Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00

1 2 3 4 5 ...

2553 commits