fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 17:18:11 +02:00

Author	SHA1	Message	Date
Georg Lehmann	807b267c4d	nir/lower_wpos_ytransform: clean up baryc_at_offset Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31951>	2024-11-05 21:42:37 +00:00
Georg Lehmann	5d8adf92e7	nir/lower_wpos_ytransform: remove redundant state shader Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31951>	2024-11-05 21:42:37 +00:00
Georg Lehmann	63f828d262	nir/lower_wpos_ytransform: remove unnecessary state variable Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31951>	2024-11-05 21:42:37 +00:00
Georg Lehmann	3738c69796	nir/opt_frag_coord_to_pixel_coord: optimize trunc/floor Foz-DB Navi21: Totals from 207 (0.26% of 79206) affected shaders: MaxWaves: 5924 -> 5980 (+0.95%) Instrs: 83164 -> 83144 (-0.02%); split: -0.06%, +0.04% CodeSize: 457296 -> 459092 (+0.39%); split: -0.00%, +0.39% VGPRs: 5336 -> 5160 (-3.30%) Latency: 1308811 -> 1307754 (-0.08%); split: -0.16%, +0.08% InvThroughput: 232768 -> 222979 (-4.21%); split: -4.21%, +0.00% VClause: 1359 -> 1370 (+0.81%); split: -0.07%, +0.88% SClause: 3300 -> 3293 (-0.21%); split: -0.24%, +0.03% Copies: 4992 -> 4985 (-0.14%); split: -0.56%, +0.42% PreVGPRs: 3757 -> 3619 (-3.67%) VALU: 58366 -> 58338 (-0.05%); split: -0.08%, +0.03% Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31966>	2024-11-05 21:09:45 +00:00
Marek Olšák	9d043e138d	nir: add nir_clear_divergence_info, use it in nir_opt_varyings nir_opt_varyings computes vertex divergence, which isn't exactly expected by any other passes. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>	2024-11-05 14:13:40 +00:00
Marek Olšák	b71edce77a	nir/lower_io: change INTERP_MODE_NONE to SMOOTH when NONE means SMOOTH to improve CSE of load_barycentric_* and IO vectorization. This is only for load_interpolated_input, which can never be FLAT. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>	2024-11-05 14:13:40 +00:00
Marek Olšák	aee1ebb992	nir: print interp_mode better Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>	2024-11-05 14:13:40 +00:00
Marek Olšák	2ca56376a4	nir: rename nir_io_glsl_lower_derefs -> nir_io_has_io_intrinsics Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>	2024-11-05 14:13:40 +00:00
Marek Olšák	adc40aee25	glsl: lower IO in the linker if enabled, don't lower it later This removes the useless codepath that kept IO derefs until st_finalize_nir. It was used before nir_opt_varyings existed. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>	2024-11-05 14:13:40 +00:00
Georg Lehmann	bedd6310dc	nir: add nir_opt_frag_coord_to_pixel_coord Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:31 +00:00
Georg Lehmann	2f830f9b94	nir: add SYSTEM_VALUE_PIXEL_COORD Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:30 +00:00
Alyssa Rosenzweig	506b9a5ff5	nir/divergence_analysis: add AGX atomics Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31909>	2024-10-30 19:04:32 +00:00
Alyssa Rosenzweig	85b3dc90e0	nir,agx: lower fmin/fmax in NIR we want to elide flushes, doing so requires more sophisticated analysis than I'd like in the middle of isel. also, it should be done before forming preambles for efficiency (notice the uniform reduction here). let's do it with a NIR pass. total instructions in shared programs: 2768481 -> 2757832 (-0.38%) instructions in affected programs: 644084 -> 633435 (-1.65%) helped: 2242 HURT: 18 helped stats (abs) min: 1 max: 349 x̄: 4.77 x̃: 3 helped stats (rel) min: 0.01% max: 34.91% x̄: 3.19% x̃: 2.19% HURT stats (abs) min: 1 max: 19 x̄: 2.89 x̃: 1 HURT stats (rel) min: 0.24% max: 7.94% x̄: 1.27% x̃: 0.81% 95% mean confidence interval for instructions value: -5.20 -4.22 95% mean confidence interval for instructions %-change: -3.30% -3.01% Instructions are helped. total alu in shared programs: 2182880 -> 2172352 (-0.48%) alu in affected programs: 513166 -> 502638 (-2.05%) helped: 2235 HURT: 16 helped stats (abs) min: 1 max: 349 x̄: 4.73 x̃: 3 helped stats (rel) min: 0.02% max: 37.65% x̄: 3.70% x̃: 2.59% HURT stats (abs) min: 1 max: 19 x̄: 2.50 x̃: 1 HURT stats (rel) min: 0.33% max: 3.74% x̄: 1.04% x̃: 0.91% 95% mean confidence interval for alu value: -5.16 -4.20 95% mean confidence interval for alu %-change: -3.83% -3.49% Alu are helped. total fscib in shared programs: 2178643 -> 2168059 (-0.49%) fscib in affected programs: 514666 -> 504082 (-2.06%) helped: 2243 HURT: 17 helped stats (abs) min: 1 max: 349 x̄: 4.74 x̃: 3 helped stats (rel) min: 0.02% max: 37.65% x̄: 3.74% x̃: 2.59% HURT stats (abs) min: 1 max: 19 x̄: 2.65 x̃: 1 HURT stats (rel) min: 0.33% max: 14.71% x̄: 1.85% x̃: 0.93% 95% mean confidence interval for fscib value: -5.16 -4.20 95% mean confidence interval for fscib %-change: -3.87% -3.53% Fscib are helped. 
total bytes in shared programs: 18467348 -> 18403042 (-0.35%) bytes in affected programs: 4403648 -> 4339342 (-1.46%) helped: 2247 HURT: 20 helped stats (abs) min: 2 max: 2132 x̄: 28.73 x̃: 18 helped stats (rel) min: 0.01% max: 33.53% x̄: 2.80% x̃: 1.94% HURT stats (abs) min: 4 max: 72 x̄: 12.60 x̃: 6 HURT stats (rel) min: 0.23% max: 6.58% x̄: 1.06% x̃: 0.75% 95% mean confidence interval for bytes value: -31.29 -25.45 95% mean confidence interval for bytes %-change: -2.90% -2.64% Bytes are helped. total regs in shared programs: 864605 -> 864442 (-0.02%) regs in affected programs: 4692 -> 4529 (-3.47%) helped: 68 HURT: 48 helped stats (abs) min: 1 max: 54 x̄: 7.25 x̃: 3 helped stats (rel) min: 4.26% max: 43.20% x̄: 13.21% x̃: 10.53% HURT stats (abs) min: 1 max: 36 x̄: 6.88 x̃: 6 HURT stats (rel) min: 3.64% max: 91.67% x̄: 23.12% x̃: 24.00% 95% mean confidence interval for regs value: -3.60 0.79 95% mean confidence interval for regs %-change: -2.10% 5.75% Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120927 -> 2120911 (<.01%) uniforms in affected programs: 770 -> 754 (-2.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.67 x̃: 2 helped stats (rel) min: 1.79% max: 2.70% x̄: 2.13% x̃: 1.96% 95% mean confidence interval for uniforms value: -3.75 -1.58 95% mean confidence interval for uniforms %-change: -2.50% -1.76% Uniforms are helped. total threads in shared programs: 27612224 -> 27613056 (<.01%) threads in affected programs: 7168 -> 8000 (11.61%) helped: 6 HURT: 3 helped stats (abs) min: 64 max: 192 x̄: 170.67 x̃: 192 helped stats (rel) min: 8.33% max: 23.08% x̄: 20.62% x̃: 23.08% HURT stats (abs) min: 64 max: 64 x̄: 64.00 x̃: 64 HURT stats (rel) min: 8.33% max: 9.09% x̄: 8.59% x̃: 8.33% 95% mean confidence interval for threads value: -3.17 188.06 95% mean confidence interval for threads %-change: -0.92% 22.69% Inconclusive result (value mean confidence interval includes 0). 
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>	2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig	e3f91fb13c	nir/serialize: fix name no more nir_register Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: M Henning <drawoc@darkrefraction.com> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>	2024-10-30 12:59:11 +00:00
Alyssa Rosenzweig	b8624d5c6b	nir: correct comment Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: M Henning <drawoc@darkrefraction.com> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>	2024-10-30 12:59:11 +00:00
Alyssa Rosenzweig	33299354e0	nir/opt_algebraic: optimize patterns hit with OpenCL These patterns were all found in the AGX quads tessellator, a medium-sized OpenCL kernel. LLVM generates a lot of garbage around booleans which we need to chew through. Though there's nothing AGX or really OpenCL specific here, so some of this could help graphics shaders too. Together, their effect is significant for that kernel instr count & occupancy: before: 2966 inst, 2310 alu, 2310 fscib, 1216 ic, 23148 bytes, 239 regs, 384 threads after: 2848 inst, 2246 alu, 2246 fscib, 1000 ic, 22260 bytes, 231 regs, 448 threads No significant changes on GL shaderdb (a single godot shader regressed 1 instruction, 1344->1345). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>	2024-10-30 12:59:10 +00:00
Marek Olšák	ee452129c6	nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset for radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	2227f5be9d	nir: rename load_cull_small_primitive_precision -> triangle, add line_precision for radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	0914e0d02f	nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines for radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Georg Lehmann	d6535f2602	nir/opt_algebraic: create ubfe with non constant mask Foz-DB Navi21: Totals from 278 (0.35% of 79395) affected shaders: MaxWaves: 7444 -> 7448 (+0.05%) Instrs: 316069 -> 314584 (-0.47%); split: -0.47%, +0.00% CodeSize: 1608064 -> 1593204 (-0.92%) VGPRs: 11128 -> 11120 (-0.07%) Latency: 796599 -> 797786 (+0.15%); split: -0.19%, +0.34% InvThroughput: 141195 -> 139472 (-1.22%); split: -1.22%, +0.00% Copies: 28565 -> 29796 (+4.31%); split: -0.15%, +4.46% PreSGPRs: 14335 -> 14336 (+0.01%) VALU: 161342 -> 159426 (-1.19%) SALU: 87794 -> 88305 (+0.58%); split: -0.03%, +0.61% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>	2024-10-29 10:51:10 +00:00
Timur Kristóf	be68aeafdc	nir/opt_algebraic: Add various bitfield extract patterns. v2 (Georg Lehmann): - fixed incorrect imin in ubfe_ubfe - simplified outer_bits of ushr((ubfe, ...), ...) opt - added is_used_once to iand(ushr(), ...) opt to improve stats Foz-DB Navi21: Totals from 3309 (4.18% of 79206) affected shaders: Instrs: 5295291 -> 5282128 (-0.25%); split: -0.28%, +0.03% CodeSize: 28299320 -> 28298456 (-0.00%); split: -0.07%, +0.06% Latency: 51566173 -> 51521923 (-0.09%); split: -0.09%, +0.01% InvThroughput: 13222050 -> 13204557 (-0.13%); split: -0.14%, +0.01% VClause: 116451 -> 116458 (+0.01%); split: -0.02%, +0.02% SClause: 160356 -> 160324 (-0.02%); split: -0.03%, +0.01% Copies: 424152 -> 423670 (-0.11%); split: -0.20%, +0.09% Branches: 156701 -> 156192 (-0.32%); split: -0.33%, +0.01% PreSGPRs: 168507 -> 168500 (-0.00%); split: -0.02%, +0.01% PreVGPRs: 151477 -> 151474 (-0.00%) VALU: 3486077 -> 3476675 (-0.27%); split: -0.31%, +0.04% SALU: 786467 -> 783109 (-0.43%); split: -0.45%, +0.03% VMEM: 188035 -> 188060 (+0.01%) SMEM: 259632 -> 259630 (-0.00%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>	2024-10-29 10:51:09 +00:00
Georg Lehmann	695d2414cd	nir,radv: optimize shared atomic offsets Foz-DB Navi21: Totals from 87 (0.11% of 79395) affected shaders: Instrs: 140877 -> 140873 (-0.00%) CodeSize: 747760 -> 747164 (-0.08%); split: -0.09%, +0.01% Latency: 4528171 -> 4528162 (-0.00%) InvThroughput: 826358 -> 826349 (-0.00%) Copies: 10888 -> 10884 (-0.04%) VALU: 84634 -> 84630 (-0.00%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080>	2024-10-29 09:31:08 +00:00
Rob Clark	7f63fa34da	nir/lower_amul: Fix ASAN error We shouldn't assume the bindings are sparse when we allocate an array indexed on the binding. See, for example: dEQP-GLES31.functional.program_interface_query.buffer_variable.random.55 Fixes: `2e833b16bc` ("nir/lower_amul: Use num_ubos/ssbos instead of recomputing it.") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31611>	2024-10-25 15:38:51 +00:00
Pierre-Eric Pelloux-Prayer	60578df33a	nir: skip offset=0 in nir_io_add_const_offset_to_base When offset=0, the pass was a no-op but was setting the progress flag which could cause infinite loops when this pass is going to be added to gl_nir_opts. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>	2024-10-25 13:36:54 +00:00
Rhys Perry	8efc765a3d	nir/algebraic: fix shfr optimization with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Rhys Perry	b2abd3bdba	nir: fix shfr constant folding with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Daniel Schürmann	87cb42f953	treewide: don't lower to LCSSA before calling nir_divergence_analysis() Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	95ed72922e	nir/divergence: Don't assume that LCSSA phis are not loop-invariant Since we check for loop-invariance, we don't have to unconditionally flag LCSSA phis as divergent in presence of divergent breaks. This ensures consistency, with or without LCSSA form. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	c5f142a695	nir/divergence: skip expensive nir_src_is_divergent() check in most cases Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	0eff03d385	nir/divergence: calculate divergence without requiring LCSSA form Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	d34d2f8fa8	nir: consider loop invariance in nir_src_is_divergent() By doing so, this function does not require LCSSA form anymore in order to provide correct results. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	1a55d6c23b	nir/divergence: Introduce and set nir_def::loop_invariant Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	c0b3d7a916	nir/divergence: require nir_metadata_block_index This allows for fast checks whether some value is defined inside a loop. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	8d1abd4996	treewide: use nir_src_is_divergent() rather than checking the divergence of the SSA Without LCSSA, divergence between src and def might differ. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	c8348139fd	nir: change signature of nir_src_is_divergent() Now, it takes nir_src * instead of nir_src. Also move the implementation to nir_divergence_analysis.c. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	421b42637d	nir: remove nir_update_instr_divergence() This function has obscure limitations. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	ce0a3fe645	nir/opt_uniform_atomics: don't preserve divergence information Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	c25c63ebc0	nir/divergence: separately indicate whether loops have divergent continues or breaks bool nir_loop_is_divergent(nir_loop *) replaces the previous loop->divergent indicator. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Georg Lehmann	1f9b82bb2a	nir/opt_algebraic: optimize -0.0 + a Foz-DB Navi21: Totals from 428 (0.54% of 79395) affected shaders: MaxWaves: 8510 -> 8512 (+0.02%) Instrs: 731062 -> 729665 (-0.19%); split: -0.19%, +0.00% CodeSize: 3735788 -> 3728324 (-0.20%); split: -0.20%, +0.00% VGPRs: 27328 -> 27336 (+0.03%); split: -0.03%, +0.06% SpillSGPRs: 315 -> 314 (-0.32%) Latency: 3872986 -> 3873236 (+0.01%); split: -0.08%, +0.09% InvThroughput: 971001 -> 970056 (-0.10%); split: -0.17%, +0.08% VClause: 11954 -> 11956 (+0.02%); split: -0.02%, +0.03% SClause: 17361 -> 17358 (-0.02%) Copies: 59038 -> 59045 (+0.01%); split: -0.22%, +0.24% Branches: 17685 -> 17656 (-0.16%) PreSGPRs: 26103 -> 26102 (-0.00%) PreVGPRs: 23220 -> 23206 (-0.06%) VALU: 515293 -> 513963 (-0.26%); split: -0.26%, +0.00% SALU: 91591 -> 91544 (-0.05%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31770>	2024-10-23 08:58:34 +00:00
Marek Olšák	0226922384	nir: add nir_gather_tcs_info, new gathering/analysis pass This does shader analysis that is more niche than regular shader info. It's planned to be used by nir_restructure_tcs_flow as discussed here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11910 It's also useful for driver-specific passes. The code for gathering "all_invocations_define_tess_levels" is copied from radeonsi. The rest is new. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31447>	2024-10-23 03:17:16 +00:00
Amber	a3afe22dc9	nir: add pass to lower atomic arithmetic to a loop with cmpxchg. Signed-off-by: Amber Harmonia <amber@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27776>	2024-10-21 21:47:44 +00:00
Mary Guillemard	84d57e1fb1	nir: Move atomic_op_to_alu to common code Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27776>	2024-10-21 21:47:44 +00:00
Marek Olšák	fb6184f89c	nir: add shader_info::tess::tcs_same_invocation_inputs_read(_indirect) We need both the same-invocation usage mask and cross-invocation usage mask. The AMD reason is below. Cross-invocation TCS input access doesn't prevent the same-invocation fast path in AMD hw because it's just a different way to load the same data, and we want to use both paths for the same TCS input based on the load instruction. The fast path can't be used for indirect access, which is gathered separately for same-invocation access. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31645>	2024-10-21 18:53:51 +00:00
Pavel Ondračka	33c8dc4f18	nir/nir_group_loads: reduce chance of max_distance check overflow Helps for the case when max_distance is set to ~0, where the pass would now only create groups of two loads together due to overflow. Found while experimenting with this pass on r300, however the only driver currently affected is i915. With i915 this change gains around 20 shaders in my small shader-db (most notably some GLMark2, Unigine Tropics, Tesseract, Amnesia) at the expense of increased register pressure in few other cases. I'm assuming this is a good deal for such old HW, and this seems like what was intended when the pass was introduced to i915, but anyway this could be tweaked further driver side with a more optimized max_distance value. Only shader-db tested. Relevant i915 shader-db stats (lpt): total tex_indirect in shared programs: 1529 -> 1493 (-2.35%) tex_indirect in affected programs: 96 -> 60 (-37.50%) helped: 29 HURT: 2 total temps in shared programs: 3015 -> 3200 (6.14%) temps in affected programs: 465 -> 650 (39.78%) helped: 1 HURT: 91 GAINED: 20 Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: GKraats <vd.kraats@hccnet.nl> Fixes: `33b4eb149e` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31529>	2024-10-18 09:21:22 +00:00
Job Noorman	509606e56d	nir/lower_subgroups: scan/reduce for multiple ballot components lower_scan_reduce only worked when ballot_components equals one. This commit adds support for arbitrary ballot_components. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>	2024-10-18 06:57:52 +00:00
Job Noorman	58b199f7ed	nir/lower_subgroups: add build_cluster_mask helper This functionality will become more complex in the next commit so separate it into a helper function. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>	2024-10-18 06:57:52 +00:00
Job Noorman	e0cb4a94a3	nir/lower_subgroups: move up some helper functions build_subgroup_mask and build_ballot_imm_ishl will be needed by other functions higher-up the file. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>	2024-10-18 06:57:52 +00:00
Lionel Landwerlin	97b17aa0b1	brw/nir: rework inline_data_intel to work with compute This intrinsic was initially dedicated to mesh/task shaders, but the mechanism it exposes also exists in the compute shaders on Gfx12.5+. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Georg Lehmann	dbf63a0788	nir: remove nir_op_is_derivative Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	f9d2aad7a3	nir: remove alu ddx/ddy Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00

6469 commits