fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 15:58:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	045ddb992a	nir/opt_algebraic: optimize 16bit vec2 comparison followed by b2i16 using usub_sat Helps vectorized emulated fp16 -> fp8 conversions No Foz-DB changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35876>	2025-07-03 20:08:39 +00:00
Alyssa Rosenzweig	f853d285ef	nir/lower_tex: optimize LOD bias lower for txl make sure we can fold the f2f away. alternatively f2fmp would work here but details. elden ring: Totals from 137 (4.27% of 3206) affected shaders: Instrs: 485455 -> 484904 (-0.11%) CodeSize: 3218638 -> 3215338 (-0.10%) ALU: 308071 -> 307520 (-0.18%) FSCIB: 308071 -> 307520 (-0.18%) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35909>	2025-07-03 16:41:51 +00:00
Alyssa Rosenzweig	b992703477	nir/lower_system_values: optimize global ID for drivers where we need to lower a base_workgroup_id but not global IDs. rather than lowering the whole global ID to stick the base workgroup ID in there, just add the workgroup offset to the final thread position. Elden ring fossils: Totals from 52 (1.62% of 3206) affected shaders: Instrs: 48355 -> 48233 (-0.25%); split: -0.31%, +0.06% CodeSize: 331912 -> 331148 (-0.23%); split: -0.28%, +0.05% ALU: 30853 -> 30674 (-0.58%); split: -0.70%, +0.12% FSCIB: 30853 -> 30674 (-0.58%); split: -0.70%, +0.12% IC: 9054 -> 8958 (-1.06%) GPRs: 4184 -> 4216 (+0.76%) Uniforms: 6703 -> 6677 (-0.39%); split: -1.61%, +1.22% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35909>	2025-07-03 16:41:51 +00:00
Alejandro Piñeiro	0003e16fc6	nir/lower_clip: update comment As the lowering mentioned there got renamed twice: commit `b085016f94` Author: Rob Clark <robclark@freedesktop.org> Date: Fri Mar 25 13:52:26 2016 -0400 nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries Since it will gain support to lower inputs, give it a more generic name. commit `1754507d49` Author: Marek Olšák <maraeo@gmail.com> Date: Wed Jun 25 19:05:19 2025 -0400 nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35855>	2025-07-02 20:56:53 +00:00
Marek Olšák	4263b49778	ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass This is a cleanup. Old gs LDS layout: [es outputs][gs outputs][scratch] Old nogs LDS layout: [xfb/cull][scratch] New gs LDS layout: [es outputs][scratch|gs outputs] New nogs LDS layout: [scratch|xfb/cull] The LDS scratch is moved to the beginning of the preceding buffer in LDS, while the addresses in that LDS buffer are offset by the scratch size. It effectively merges the LDS scratch with the preceding buffer in LDS. Thanks to that, we no longer need the ngg_scratch ABI and the offset in a user SGPR. The lowering passes now return the LDS scratch size, which is used by the drivers to determine the final LDS size. The ngg_lds_layout SGPR is now unused without GS in RADV. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:41 +00:00
David Neto	673f684ddd	nir: Support printing cmat constants A cooperative matrix can only be constructed from a single scalar value. Print that value, wrapped by a function call that looks like a type-constructor. This adds a test case that will otherwise assert out in spirv2nir. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35757>	2025-07-02 16:48:51 +00:00
Alyssa Rosenzweig	3c2f46fcac	treewide: use nir_break_if with named if Via Coccinelle patch: @@ expression builder, condition; identifier nif; @@ -nir_if *nif = nir_push_if(builder, condition); -{ -nir_jump(builder, nir_jump_break); -} -nir_pop_if(builder, nif); +nir_break_if(builder, condition); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>	2025-06-30 14:51:54 -04:00
Alyssa Rosenzweig	67237b6f1b	treewide: use nir_break_if Via Coccinelle patch: @@ expression builder, condition; @@ -nir_push_if(builder, condition); -{ -nir_jump(builder, nir_jump_break); -} -nir_pop_if(builder, NULL); +nir_break_if(builder, condition); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>	2025-06-30 14:51:24 -04:00
Alyssa Rosenzweig	7fd7b18b38	nir: rename AGX geom/tess intrinsics to the new common code name. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>	2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig	d13b321201	nir/lower_gs_intrinsics: drop stuff added for AGX AGX now vendors a significantly different version of this pass, so the common one doesn't need the stuff added for AGX. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>	2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig	16b53d356a	nir: add rasterization_stream sysval for plumbing transformFeedbackRasterizationStreamSelect (in turn for exercising more CTS and proving out my design). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>	2025-06-30 16:24:06 +00:00
Alyssa Rosenzweig	805ef6cc17	nir: add intrinsics for geometry shader lowering Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>	2025-06-30 16:24:05 +00:00
Alyssa Rosenzweig	4f7cae5e61	nir/opt_algebraic: add trichotomy identity In https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802 we will significantly rework geometry shaders & transform feedback. In the new approach, transform feedback is executed as part of the hardware vertex shader, meaning the vertex shader needs to write out all the "copies" of the same value into different parts of the XFB buffer. In the general case of a GS writing triangle strips, we get 0-3 copies. This is good and lets us parallelize XFB better with GS. In the case of a VS alone with XFB, we insert a passthrough GS. In that special case, we can only get at most 1 copy, so if we can prove the length of the output strip is 3 we can delete 2/3 of the shader. Anyway, the only thing preventing NIR from doing that optimization is failing to see through some conditionals, fixed by optimizing with the law of trichotomy. We could add other variants of this pattern (signed vs unsigned, iand vs ior/ixor) if we expect anything else to hit this other than my boutique use case. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>	2025-06-30 16:24:04 +00:00
Robert Mader	a166d7609f	gles: Add support for 10/12/16 bit SW decoder YCbCr formats Signed-off-by: Robert Mader <robert.mader@collabora.com> Co-Authored-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303>	2025-06-30 11:56:23 +00:00
Rhys Perry	7b291a33d4	nir/search: fix dumping of conversions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>	2025-06-30 10:41:39 +00:00
Rhys Perry	08859cbe50	nir/lower_bit_size: fix bitz/bitnz Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `6585209cdd` ("nir/lower_bit_size: mask bitz/bitnz src1 like shifts") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>	2025-06-30 10:41:39 +00:00
Mel Henning	8795006994	nir/opt_uniform_subgroup: Handle vote_feq Brings the vertex shader in dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_dvec4_vertex from 234 to 169 instructions on NAK. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>	2025-06-28 16:10:50 +00:00
Mel Henning	70fccc59fc	nir/opt_uniform_subgroup: Handle vote_ieq No shader-db changes here, but it does improve some cts shaders, eg. the vertex shader in dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_i64vec4_vertex goes from 80 to 56 instructions with NAK Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>	2025-06-28 16:10:50 +00:00
Mel Henning	10acb44c64	nir: Split lower_vote_eq into int/float versions Recent nvidia hardware has a native instruction for nir_intrinsic_vote_ieq but not for nir_intrinsic_vote_feq. So, split this boolean into two so we can control the lowering separately for each instruction. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>	2025-06-28 16:10:50 +00:00
Lionel Landwerlin	fcf4401824	brw: handle wa_18019110168 with independent shader compilation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:35 +00:00
Matt Turner	102d7409ef	nir: Add convert_cmat_intel intrinsic This intrinsic will be used to implement matrix type and layout conversions in the backend compiler. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>	2025-06-27 01:26:22 +00:00
James Price	10ae673368	spirv: Fix cooperative matrix in OpVariable initializer Check for cooperative matrix types first in the nir_lower_variable_initializers pass, since they are also considered to be scalar types. Fixes: `7e6cd395c7` ("nir: Handle cmat types in lower_variable_initializers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13388 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35668>	2025-06-26 22:24:31 +00:00
Konstantin Seurer	aacfc663cb	nir: Add nir_lower_halt_to_return This is a lowering pass that was implemented by multiple drivers. Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33003>	2025-06-26 20:12:12 +00:00
Marek Olšák	1754507d49	nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:54 +00:00
Marek Olšák	1e03827c77	nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements same for *_no_indirects Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:54 +00:00
Marek Olšák	3713e2d580	nir: rename nir_lower_clip_cull_distance_arrays -> nir_lower_clip_cull_distance_array_vars Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:53 +00:00
Marek Olšák	adb17a8609	nir: move nir_recompute_io_bases into its own file Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:53 +00:00
Marek Olšák	97743980ce	nir: remove unused nir_force_mediump_io & nir_unpack_16bit_varying_slots I think I added these. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:52 +00:00
Marek Olšák	aefea49dad	nir: move lots of code from nir_lower_io.c into new nir_lower_explicit_io.c nir_lower_io is just for regular inputs/outputs. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:52 +00:00
Marek Olšák	5bd3e0c08c	nir: move nir_assign_var_locations to freedreno (its only use) Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:52 +00:00
Marek Olšák	c8cda0dc1a	nir: move nir_io_add_const_offset_to_base into its own file Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:51 +00:00
Marek Olšák	d78070ded5	nir: move nir_io_add_intrinsic_xfb_info into its own file Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:51 +00:00
Marek Olšák	12df9b3def	nir: rename nir_vectorize_tess_levels -> nir_lower_tess_level_array_vars_to_vec Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:50 +00:00
Marek Olšák	2aa94caf82	nir: rename nir_lower_io_to_vector -> nir_opt_vectorize_io_vars Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:50 +00:00
Marek Olšák	944f8f6db2	nir: move nir_lower_io_vars_to_scalar into its own file Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:49 +00:00
Marek Olšák	439d805291	nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:49 +00:00
Alyssa Rosenzweig	6efe557718	nir/search_helpers: add has_multiple_uses helper heuristic for the next patch. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>	2025-06-26 16:41:55 +00:00
Alyssa Rosenzweig	63ce73a601	nir,hk: sink lowered UBOs this is better than doing it once we've lowered to hardware ops which makes it more challenging to sink since then we'd have to sink the whole tree instead of a single intrinsic. Totals from 17617 (32.81% of 53701) affected shaders: MaxWaves: 16863872 -> 16901504 (+0.22%); split: +0.24%, -0.02% Instrs: 12406405 -> 12430375 (+0.19%); split: -0.15%, +0.35% CodeSize: 87055248 -> 87180802 (+0.14%); split: -0.18%, +0.33% Spills: 10350 -> 9301 (-10.14%); split: -11.57%, +1.43% Fills: 5215 -> 3733 (-28.42%); split: -31.49%, +3.07% Scratch: 113164 -> 110472 (-2.38%); split: -2.63%, +0.25% ALU: 9552550 -> 9558513 (+0.06%); split: -0.22%, +0.28% FSCIB: 9552545 -> 9558508 (+0.06%); split: -0.22%, +0.28% IC: 2874032 -> 2876442 (+0.08%); split: -0.00%, +0.09% GPRs: 1470040 -> 1459283 (-0.73%); split: -1.00%, +0.27% Uniforms: 5113254 -> 5115158 (+0.04%); split: -0.82%, +0.85% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Job Noorman <job@noorman.info> [NIR] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>	2025-06-26 16:41:55 +00:00
Alyssa Rosenzweig	caa0854da8	nir: plumb load_global_bounded this lets the backend implement bounded loads (i.e. robust SSBOs) in a way that's more clever than a full branch. similar idea to load_global_constant_bound which should eventually be merged into this. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Job Noorman <job@noorman.info> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>	2025-06-26 16:41:53 +00:00
Georg Lehmann	7de352e99e	nir,radv: add an option to not move 8/16bit vecs ACO will overestimate the register demand of the sources, so we don't want to create the vector later. Foz-DB Navi48: Totals from 240 (0.30% of 80265) affected shaders: MaxWaves: 6429 -> 6435 (+0.09%) Instrs: 3406069 -> 3406646 (+0.02%); split: -0.01%, +0.03% CodeSize: 18231596 -> 18233288 (+0.01%); split: -0.01%, +0.02% VGPRs: 14768 -> 14732 (-0.24%) Latency: 18981274 -> 18979170 (-0.01%); split: -0.02%, +0.01% InvThroughput: 4247331 -> 4246634 (-0.02%); split: -0.02%, +0.01% VClause: 85453 -> 85458 (+0.01%); split: -0.01%, +0.01% Copies: 262046 -> 261971 (-0.03%); split: -0.06%, +0.03% PreVGPRs: 10899 -> 10775 (-1.14%) VALU: 1923441 -> 1923485 (+0.00%); split: -0.01%, +0.01% SALU: 457983 -> 457982 (-0.00%) VOPD: 4980 -> 4861 (-2.39%); split: +0.48%, -2.87% Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35729>	2025-06-26 09:29:43 +00:00
Georg Lehmann	7ac9a87572	nir/opt_sink: don't assume moving conversion can't increase register pressure Foz-DB Navi48: Totals from 11311 (14.09% of 80265) affected shaders: MaxWaves: 337664 -> 337648 (-0.00%); split: +0.00%, -0.01% Instrs: 10102221 -> 10101625 (-0.01%); split: -0.05%, +0.04% CodeSize: 55000184 -> 54999292 (-0.00%); split: -0.04%, +0.03% VGPRs: 571052 -> 571064 (+0.00%); split: -0.03%, +0.03% Latency: 59247189 -> 59204726 (-0.07%); split: -0.13%, +0.06% InvThroughput: 10236407 -> 10215675 (-0.20%); split: -0.26%, +0.06% VClause: 211730 -> 211677 (-0.03%); split: -0.07%, +0.04% SClause: 284802 -> 284762 (-0.01%); split: -0.07%, +0.06% Copies: 702890 -> 702539 (-0.05%); split: -0.18%, +0.13% Branches: 205117 -> 205112 (-0.00%) PreSGPRs: 475898 -> 475825 (-0.02%); split: -0.02%, +0.00% PreVGPRs: 366318 -> 366449 (+0.04%); split: -0.14%, +0.17% VALU: 5764791 -> 5764349 (-0.01%); split: -0.02%, +0.01% SALU: 1259529 -> 1259517 (-0.00%); split: -0.04%, +0.04% VOPD: 5854 -> 5724 (-2.22%); split: +0.70%, -2.92% Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35729>	2025-06-26 09:29:43 +00:00
Rob Clark	6f5ff6be44	nir: Fix lower_readonly_images_to_tex bitsize The txf instruction could be returning something smaller than 32b. Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35758>	2025-06-26 02:48:16 +00:00
Georg Lehmann	e6d208b1f9	nir/opt_shrink_vectors: also split vecs into distinct smaller vecs if possible Foz-DB Navi48: Totals from 17 (0.02% of 80265) affected shaders: Instrs: 75085 -> 74912 (-0.23%); split: -0.23%, +0.00% CodeSize: 428968 -> 427028 (-0.45%); split: -0.45%, +0.00% Latency: 1306841 -> 1306080 (-0.06%); split: -0.06%, +0.00% InvThroughput: 598998 -> 598719 (-0.05%) Copies: 15733 -> 15561 (-1.09%) Branches: 2435 -> 2422 (-0.53%) PreVGPRs: 1723 -> 1721 (-0.12%) VALU: 43019 -> 42847 (-0.40%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35676>	2025-06-25 05:34:48 +00:00
Georg Lehmann	22d7dd69b2	nir/shrink_vectors: shrink larger vectors too Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35676>	2025-06-25 05:34:48 +00:00
Matt Turner	6100dbc3d0	compiler: Generate files with newline at end These generator scripts use the `write` function that, unlike `print`, doesn't print a trailing newline. So let's add one to the template. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35697>	2025-06-24 14:01:04 +00:00
Georg Lehmann	b729ad1742	nir/loop_analyze: consider movs/vecs free They are free more likely than not. Foz-DB Navi31: Totals from 462 (0.58% of 80251) affected shaders: Instrs: 1464013 -> 1868466 (+27.63%) CodeSize: 8476352 -> 10745544 (+26.77%) VGPRs: 27412 -> 27560 (+0.54%) SpillSGPRs: 0 -> 16 (+inf%) SpillVGPRs: 83 -> 76 (-8.43%) Scratch: 6072832 -> 6071808 (-0.02%) Latency: 19282476 -> 19552323 (+1.40%); split: -0.11%, +1.51% InvThroughput: 2198357 -> 2258490 (+2.74%); split: -0.47%, +3.21% VClause: 32986 -> 43491 (+31.85%) SClause: 72760 -> 126112 (+73.33%) Copies: 165286 -> 223368 (+35.14%) Branches: 60530 -> 79743 (+31.74%); split: -0.03%, +31.77% PreSGPRs: 24885 -> 25077 (+0.77%) PreVGPRs: 23004 -> 22494 (-2.22%); split: -2.26%, +0.04% VALU: 760978 -> 898136 (+18.02%) SALU: 187786 -> 252995 (+34.73%); split: -0.03%, +34.75% VMEM: 58469 -> 69164 (+18.29%); split: -0.07%, +18.36% SMEM: 87926 -> 158175 (+79.90%); split: -0.00%, +79.90% VOPD: 580 -> 732 (+26.21%); split: +31.38%, -5.17% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>	2025-06-24 12:18:47 +00:00
Georg Lehmann	b1290fdf20	nir/loop_analyze: handle vector selections properly Consider all conditions, not just the first. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>	2025-06-24 12:18:47 +00:00
Georg Lehmann	47aba15489	nir/loop_analyze: always consider comparisions between induction var and constant free There is no reason why this should be restricted to single uses. Foz-DB Navi31: Totals from 21 (0.03% of 80251) affected shaders: Instrs: 54424 -> 65851 (+21.00%) CodeSize: 286688 -> 346896 (+21.00%) Latency: 2980310 -> 2959904 (-0.68%) InvThroughput: 403744 -> 400782 (-0.73%) VClause: 923 -> 1316 (+42.58%) SClause: 1217 -> 1705 (+40.10%) Copies: 3226 -> 3393 (+5.18%); split: -0.87%, +6.04% Branches: 1014 -> 1130 (+11.44%); split: -0.39%, +11.83% PreSGPRs: 1327 -> 1306 (-1.58%) PreVGPRs: 1896 -> 1868 (-1.48%) VALU: 36083 -> 43560 (+20.72%) SALU: 4471 -> 4708 (+5.30%); split: -2.75%, +8.05% VMEM: 2225 -> 2743 (+23.28%) SMEM: 1662 -> 2273 (+36.76%); split: -0.06%, +36.82% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>	2025-06-24 12:18:47 +00:00
Georg Lehmann	8c4225b99b	nir: add cmat_transpose Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34793>	2025-06-24 07:14:34 +00:00
Alyssa Rosenzweig	d37bf148d2	nir/lower_blend: fix snorm factor clamping The spec says (emphasis mine): If the color attachment is fixed-point, the components of the source and destination values AND BLEND FACTORS are each clamped to [0,1] or [-1,1] respectively for an unsigned normalized or signed normalized color attachment prior to evaluating the blend operations. If the color attachment is floating-point, no clamping occurs. However, neither the CTS nor any hardware implement this semantic. For unsigned normalized formats, the definitions are roughly equivalent (except perhaps around constant colours). 0 <= x <= 1 implies that 0 <= 1 - x <= 1. Therefore if the source/destination colours are clamped to [0, 1], then their complements are also in [0, 1], so clamping any blend factor (except constant colour) has no effect if the source/dest were already clamped. For signed normalized formats, however, this difference matters. -1 <= x <= 1 implies that 0 <= 1 - x <= 2... so to implement the spec text faithfully, we would need to clamp again the complemented colour blend factors to return back to signed normalized range. Software blending implementations can of course do that... but doing so causes CTS fails, as the CTS reference renderer does not do this. This commit adjusts nir_lower_blend to match what actual hardware does, what CTS requires, and what the spec should have said. See https://gitlab.khronos.org/vulkan/vulkan/-/issues/4293 for the spec resolution. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35519>	2025-06-23 19:38:27 +00:00
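
The range argument in the nir/lower_blend commit above can be checked numerically. The following is a minimal sketch, not code from nir_lower_blend: the function names, the `fmt` strings, and the `clamp_factor` flag are all hypothetical and exist only to illustrate a single-channel "one minus source colour" blend factor. It assumes the source colour is clamped to the format's range first, as the quoted spec text describes, and then compares the factor with and without the extra clamp.

```python
def clamp(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def one_minus_src_factor(src, fmt, clamp_factor):
    """Compute a (1 - src) blend factor for one channel.

    fmt is "unorm" (range [0, 1]) or "snorm" (range [-1, 1]).
    clamp_factor selects the literal spec reading (clamp the factor too)
    versus the behaviour the commit message attributes to hardware and
    the CTS (leave the factor unclamped). Hypothetical helper.
    """
    lo = 0.0 if fmt == "unorm" else -1.0
    src = clamp(src, lo, 1.0)       # source colour is clamped either way
    factor = 1.0 - src              # complement of the clamped colour
    if clamp_factor:
        factor = clamp(factor, lo, 1.0)
    return factor


if __name__ == "__main__":
    # unorm: 0 <= x <= 1 implies 0 <= 1 - x <= 1, so the extra clamp is a no-op.
    for i in range(9):
        x = i / 8
        assert one_minus_src_factor(x, "unorm", False) == \
               one_minus_src_factor(x, "unorm", True)

    # snorm: -1 <= x <= 1 implies 0 <= 1 - x <= 2, so the two readings differ.
    print(one_minus_src_factor(-1.0, "snorm", clamp_factor=False))
    print(one_minus_src_factor(-1.0, "snorm", clamp_factor=True))
```

For unsigned normalized input the two variants agree on every sampled value, while for signed normalized input a source of -1 gives a factor that leaves [-1, 1] unless it is clamped again, which is the difference the commit message describes.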

6321 commits