fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 15:58:06 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	7bc9bbcc6e	nir/lower_printf: support dynamic buffer size this is required for vtn_bindgen2 where we don't know the buffer size until the driver-specific code paths, but we need to lower printf (to hash format strings) in common code. so defer the buffer size decision to an intrinsic. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	6db9218ec3	nir/lower_printf: add option to hash format strings Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	e1368f0a30	nir,util: move printf serializing into util there's nothing NIR specific here and these routines will be useful otherwise. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	47e16cab5e	nir/lower_printf: drop default max buffer size no uses and it doesn't make sense. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	621ff262bc	nir/lower_printf: drop null check we dereference options above. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Qiang Yu	e5041ef036	docs,src: replace doc and comments for PIPE_CAP with pipe_caps Use command: find . -type d \( -path "./.git" -o -path "./docs/relnotes" \) -prune -o -type f -exec sed -i 's/PIPE_CAP_\([A-Za-z0-9_]\)/pipe_caps.\L\1/g' {} + find . -type d \( -path "./.git" -o -path "./docs/relnotes" \) -prune -o -type f -exec sed -i 's/PIPE_CAPF_\([A-Za-z0-9_]\)/pipe_caps.\L\1/g' {} + With manual adjustment for docs/gallium/screen.rst to merge pipe_cap and pipe_capf section. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32955>	2025-01-17 04:39:47 +00:00
Marek Olšák	ff6e3e9f76	nir: add next_stage param to nir_slot_is_varying & nir_remove_sysval_output The result of nir_slot_is_varying depends on what the next shader stage is, and nir_remove_sysval_output uses it. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32855>	2025-01-16 16:28:15 +00:00
Marek Olšák	0d961b0723	nir: add barycentric coordinates src to load_point_coord_maybe_flipped Just like other input loads, radeonsi needs to know the barycentric coordinates for it. This adds the src and determines the optimal barycentric coordinates in nir_lower_point_smooth, the only producer of the intrinsic. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Sil Vilerino	e061792e25	src/compiler: Fix warning C4389: An == or != operation involved signed and unsigned variables. This could result in a loss of data. Reviewed-By: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Jesse Natalie <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32979>	2025-01-15 21:40:20 +00:00
Sil Vilerino	8ecb7bc2a2	src/compiler: Fix warning C4244 'argument' : conversion from 'type1' to 'type2', possible loss of data Reviewed-By: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Jesse Natalie <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32979>	2025-01-15 21:40:20 +00:00
Alyssa Rosenzweig	401b400de3	nir,asahi,hk: add barrier argument to MESA_DISPATCH_PRECOMP In the current API, precomp implicitly assumes full barriers both before & after every dispatch. That's not good for performance. However, dropping the barriers and requiring user to explicitly call barrier functions before/after would have bad ergonomics. So, we add a new parameter to the standard MESA_DISPATCH_PRECOMP signature representing the barriers required around the dispatch. As usual, the actual type & semantic is left to drivers to define what makes sense for their hardware. We just reserve the place for it. (I think most drivers will want bitflags here, but I don't think the actual flags are worth. If a driver wanted to use a struct here, that would work too.) Since the asahi stack doesn't do anything clever with barriers yet, we mechanically add an AGX_BARRIER_ALL barrier to all precomp users in-tree. We can optimize that later, this just gets the flag-day change in with no functional change. For JM panfrost, this will provide a convenient place to stash both their "job barrier" bit and their "suppress prefetch" bit (which is really a sort of barrier / cache flush, if you think about it). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32980>	2025-01-14 16:39:57 +00:00
Kenneth Graunke	2f334e8baf	nir: Add a nir_def_first_component_read() helper Similar to nir_def_last_component_read(). Just a little nicer than prodding at the bitmask of components read directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32888>	2025-01-10 22:44:09 +00:00
Alyssa Rosenzweig	d9b4867e2a	nir/lower_robust_access: fix robustness with atomic swap this was missed in the original v3d pass, and then the common code port inherited the bug. (so strictly this fix "should" be backported even farther back but it won't apply before the Fixes here, and I don't think we do LTS that far back anyway). in theory this should fix a corner case with robustness on the gl (but not vulkan, at least for apple) drivers on broadcom & apple. Fixes: f0fb8d05e3 ("nir: Add nir_lower_robust_access pass") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32907>	2025-01-08 15:59:05 +00:00
Alyssa Rosenzweig	7a4469681e	nir: pass a callback to nir_lower_robust_access rather than try to enumerate everything a driver might want with an unmanageable collection of booleans, just do a filter callback + data. this ends up simpler overall, and will allow Intel to use this pass for just 64-bit images without needing to add even more booleans. while we're churning the pass signature, also do a quick port to nir_shader_intrinsics_pass Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [NIR and V3D] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32907>	2025-01-08 15:59:05 +00:00
Daniel Schürmann	d2f52e61c2	nir/divergence: change nir_has_divergent_loop() to return true only for divergent breaks The important information is whether a loop has a uniform number of iterations. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28627>	2025-01-08 13:33:54 +01:00
Mary Guillemard	ecdccae990	nir,agx: Allow nir_precomp_print_blob to print a static array This makes it stop leaking shader binary blobs definition and is required for panfrost clc. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32939>	2025-01-08 11:37:27 +00:00
Georg Lehmann	67d74a04b9	nir/peephole_select: allow load_vector/scalar_arg_amd Foz-DB Navi21: Totals from 1507 (1.90% of 79395) affected shaders: MaxWaves: 31830 -> 31870 (+0.13%); split: +0.20%, -0.08% Instrs: 938704 -> 937232 (-0.16%); split: -0.19%, +0.03% CodeSize: 4970860 -> 4964652 (-0.12%); split: -0.14%, +0.02% VGPRs: 79536 -> 79512 (-0.03%); split: -0.08%, +0.05% Latency: 5194524 -> 5218285 (+0.46%); split: -0.38%, +0.84% InvThroughput: 1200152 -> 1207251 (+0.59%); split: -0.02%, +0.61% VClause: 20728 -> 20741 (+0.06%); split: -0.11%, +0.17% SClause: 33612 -> 32871 (-2.20%); split: -2.78%, +0.57% Copies: 70601 -> 68847 (-2.48%); split: -2.62%, +0.13% Branches: 20032 -> 17521 (-12.53%) PreSGPRs: 47828 -> 47801 (-0.06%) VALU: 637446 -> 638094 (+0.10%); split: -0.02%, +0.13% SALU: 88627 -> 88462 (-0.19%); split: -1.08%, +0.90% VMEM: 36664 -> 36659 (-0.01%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32792>	2025-01-08 09:56:39 +00:00
Boris Brezillon	2af6e4beeb	pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex} load_vertex_id_zero_base() is supposed to return the zero-based vertex ID, which is then offset by load_first_vertex() to get an absolute vertex ID. At the same time, when we're in a Vulkan environment, load_first_vertex() also encodes the vertexOffset passed to the indexed draw. Midgard/Bifrost have slightly different semantics, where load_first_vertex() returns vertexOffset + minVertexIdInIndexRange, and load_vertex_id_zero_base() returns an ID that needs to be offset by this vertexOffset + minVertexIdInIndexRange to get the absolute vertex ID. Everything works fine as long as all the load_first_vertex() and load_vertex_id_zero_base() calls are coming from the load_vertex_id() lowering. But as mentioned above, that's no longer the case in Vulkan, where gl_BaseVertexARB will be turned into load_first_vertex() and expect a value of vertexOffset in an indexed draw context. We thus need to fix the mismatch by introducing two new panfrost-specific intrinsics so we can stop abusing load_first_vertex() and load_vertex_id_zero_base(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32415>	2025-01-07 08:15:19 +00:00
Marek Olšák	3800f0af41	nir/algebraic: optimize pack_split(unpack(a).x, unpack(a).y) -> a This is required to optimize FP64 and Int64 shaders generated by virglrenderer. It generates pack/unpack around every 64-bit op, which NIR currently can't eliminate. This fixes that. There is a new constraint ".y", which means that the use of an instruction should have swizzle.y. This allows us to add patterns that have Y swizzle on results of instructions. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	b1bc691b0f	nir/algebraic: add and improve pack/unpack patterns Some duplicated patterns are removed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	ebec182b04	nir/algebraic: use is_used_once for comparison patterns otherwise we are just creating new instructions while not removing any Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	ee8916c414	nir: use IO intrinsics in nir_lower_drawpixels Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	0de28a9fd0	nir: use IO intrinsics in nir_lower_bitmap Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	a7ad1b302b	nir: remove redundant option linker_ignore_precision Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	730c8d506f	nir: flip the early exit condition in nir_lower_io_temporaries no change in behavior other than skipping COMPUTE as well. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	7b55ee999d	nir: don't set num_slots/src/dest_type/write_mask when they're set automatically to those values Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	55a4a8a2a8	nir: set src_type and dest_type to float implicitly for IO build helpers If you want to set it to int/uint, set .src_type or .dest_type. If you want to set it to float, you don't need to set the type at all. It's implicitly set to float. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	b9f9d001d7	nir: set nir_io_semantics::num_slots to at least 1 in build helpers Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Benjamin Lee	081438ad39	panfrost: add nir pass to lower noperspective varyings Mali only supports perspective-correct varying interpolation in hardware, so we have to emulate noperspective with lowering in both the VS and FS. Both vulkan and opengl allow mismatched interpolation qualifiers between stages. Because we need all varyings that are noperspective in the FS to be lowered in the VS, we cannot rely on the interpolation qualifiers in the VS. Loading the set of noperspective varyings as a sysval allows the implementation to pass them as a compile-time constant when known statically, or a runtime push constant when not. Passing noperspective varyings dynamically has a performance cost with unnecessary branches and fmuls. This sysval is not hooked up yet in either panfrost or panvk, so shader compilation will fail. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>	2025-01-03 07:04:05 +00:00
Benjamin Lee	6f541e2016	panfrost: add intrinsic to load frag coord at a barycentric This is needed for noperspective lowering, where we need to multiply the varying value by gl_FragCoord.w at the same barycentric as the varying. Normal nir_load_frag_coord_zw instructions are lowered to the new intrinsic on bifrost with the pan_lower_frag_coord_zw pass. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>	2025-01-03 07:04:05 +00:00
Timur Kristóf	ec548fd37b	Revert "nir/opt_varyings: Add workaround for RADV mesh shader multiview." The workaround is not needed anymore, because RADV now implements the FS layer ID input as a sysval. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32641>	2025-01-02 14:07:51 +00:00
Marek Olšák	c21bc65ba7	nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads A negative hole size means the loads overlap. This will be used by drivers to handle overlapping loads in the callback easily. Reviewed-by: Mel Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>	2025-01-01 00:03:55 +00:00
Georg Lehmann	e112e2b047	nir,amd: optimize front_face ? a : -a Foz-DB Navi31: Totals from 3345 (4.21% of 79395) affected shaders: MaxWaves: 96182 -> 96174 (-0.01%) Instrs: 3135439 -> 3129508 (-0.19%); split: -0.24%, +0.05% CodeSize: 16776088 -> 16718048 (-0.35%); split: -0.38%, +0.03% VGPRs: 190884 -> 190848 (-0.02%); split: -0.03%, +0.01% Latency: 32624132 -> 32621734 (-0.01%); split: -0.16%, +0.16% InvThroughput: 5759987 -> 5749957 (-0.17%); split: -0.23%, +0.05% VClause: 51044 -> 51086 (+0.08%); split: -0.12%, +0.20% SClause: 103415 -> 103223 (-0.19%); split: -0.64%, +0.45% Copies: 170398 -> 170555 (+0.09%); split: -0.64%, +0.74% PreSGPRs: 135567 -> 133887 (-1.24%) PreVGPRs: 140569 -> 141317 (+0.53%) VALU: 1959144 -> 1953839 (-0.27%); split: -0.30%, +0.03% SALU: 217956 -> 217676 (-0.13%); split: -0.20%, +0.07% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	9bd4296845	nir: add nir_alu_srcs_negative_equal_typed Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	15d754fefa	nir: add load_front_face_fsign Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:34 +00:00
Georg Lehmann	b8fa9daf0c	nir: sink/move alu with two identical, non constant sources. Foz-DB Navi21: Totals from 32363 (40.76% of 79395) affected shaders: MaxWaves: 787499 -> 787675 (+0.02%); split: +0.02%, -0.00% Instrs: 28783404 -> 28783464 (+0.00%); split: -0.01%, +0.01% CodeSize: 156763536 -> 156765148 (+0.00%); split: -0.01%, +0.02% VGPRs: 1493304 -> 1492848 (-0.03%); split: -0.04%, +0.01% Latency: 243022511 -> 243051994 (+0.01%); split: -0.08%, +0.09% InvThroughput: 57827398 -> 57828129 (+0.00%); split: -0.05%, +0.05% VClause: 582208 -> 582298 (+0.02%); split: -0.07%, +0.08% SClause: 959634 -> 959312 (-0.03%); split: -0.07%, +0.04% Copies: 1965821 -> 1965826 (+0.00%); split: -0.17%, +0.17% Branches: 710593 -> 710596 (+0.00%); split: -0.00%, +0.01% PreSGPRs: 1313513 -> 1313632 (+0.01%); split: -0.00%, +0.01% PreVGPRs: 1210596 -> 1209103 (-0.12%); split: -0.12%, +0.00% VALU: 19463445 -> 19463497 (+0.00%); split: -0.02%, +0.02% SALU: 3319529 -> 3319500 (-0.00%); split: -0.01%, +0.01% Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32783>	2024-12-30 13:28:30 +00:00
Georg Lehmann	5b4b195f1b	nir: optimize unpacking 8bit values from a 64bit source Useful for load vectorization. Foz-DB Navi21: Totals from 299 (0.38% of 79395) affected shaders: Instrs: 287818 -> 284333 (-1.21%); split: -1.21%, +0.00% CodeSize: 1557124 -> 1540544 (-1.06%); split: -1.07%, +0.00% Latency: 4009407 -> 4012389 (+0.07%); split: -0.05%, +0.12% InvThroughput: 1260613 -> 1262530 (+0.15%); split: -0.01%, +0.17% VClause: 5472 -> 5369 (-1.88%); split: -1.92%, +0.04% SClause: 5419 -> 5305 (-2.10%); split: -2.58%, +0.48% Copies: 36709 -> 36060 (-1.77%); split: -1.81%, +0.04% PreSGPRs: 11861 -> 11655 (-1.74%) SALU: 66920 -> 64310 (-3.90%) Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32778>	2024-12-26 17:50:32 +00:00
Marek Olšák	58132d6fc8	radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass This uses si_shader_info to store whether gl_FragDepth can be removed, and it uses the kill_z epilog flag to do the removal without recompilation. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	a50d069d1c	nir/opt_varyings: clear info->clip/cull_distance_array_size if relocated svga breaks if shader_info declares these, but the shader is missing the outputs. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>	2024-12-20 02:32:08 +00:00
Marek Olšák	9d129505b5	nir/opt_varyings: set all IO types to float to facilitate full vectorization If types differ between components of a vec4 slot, IO vectorization can't be done. This also helps drivers like d3d12 that require matching types between shaders. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>	2024-12-20 02:32:08 +00:00
Caterina Shablia	f4fcfa8016	pan,nir: introduce load_attribute_pan load_attribute_pan is a panfrost-specific intrinsic for loading vertex attributes. Takes explicit vertex and instance IDs which we need in order to implement vertex attribute divisor with non-zero base instance on v9+. Passes which are used by panvk are modified to be aware of load_attribute_pan. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32039>	2024-12-18 08:33:16 +00:00
Georg Lehmann	c695043e81	nir/opt_algebraic: optimize min(max(a, b), a) Foz-DB Navi21: Totals from 105 (0.13% of 79395) affected shaders: MaxWaves: 2638 -> 2646 (+0.30%) Instrs: 76531 -> 75077 (-1.90%) CodeSize: 413668 -> 406484 (-1.74%) VGPRs: 4856 -> 4848 (-0.16%) Latency: 333684 -> 328438 (-1.57%); split: -1.57%, +0.00% InvThroughput: 80417 -> 78579 (-2.29%) VClause: 1818 -> 1768 (-2.75%) SClause: 3028 -> 2964 (-2.11%) Copies: 4708 -> 4513 (-4.14%); split: -4.50%, +0.36% PreVGPRs: 3792 -> 3715 (-2.03%); split: -2.08%, +0.05% VALU: 54734 -> 53528 (-2.20%) SALU: 6195 -> 6137 (-0.94%) VMEM: 2363 -> 2313 (-2.12%) SMEM: 5219 -> 5119 (-1.92%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32634>	2024-12-16 22:29:21 +00:00
Georg Lehmann	0e6d32777f	nir/opt_remove_phis: rematerialize equal alu Foz-DB Navi31: Totals from 943 (1.19% of 79395) affected shaders: MaxWaves: 24672 -> 24722 (+0.20%) Instrs: 1541665 -> 1544956 (+0.21%); split: -0.23%, +0.44% CodeSize: 8085180 -> 8109212 (+0.30%); split: -0.16%, +0.46% VGPRs: 57768 -> 57624 (-0.25%) Latency: 18043743 -> 17948245 (-0.53%); split: -1.28%, +0.75% InvThroughput: 2692605 -> 2677049 (-0.58%); split: -2.07%, +1.49% VClause: 25321 -> 25343 (+0.09%); split: -0.48%, +0.57% SClause: 38473 -> 38614 (+0.37%); split: -0.00%, +0.37% Copies: 86089 -> 86236 (+0.17%); split: -0.46%, +0.63% Branches: 36719 -> 36777 (+0.16%); split: -0.60%, +0.76% PreSGPRs: 44138 -> 44303 (+0.37%); split: -0.05%, +0.42% PreVGPRs: 43319 -> 43009 (-0.72%) VALU: 893684 -> 894272 (+0.07%); split: -0.42%, +0.48% SALU: 189561 -> 191358 (+0.95%); split: -0.05%, +1.00% VMEM: 42294 -> 42313 (+0.04%); split: -0.44%, +0.49% SMEM: 72916 -> 73144 (+0.31%) Instruction count regressions are largely caused by additional loop unrolling. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31028>	2024-12-16 20:38:38 +00:00
Qiang Yu	129e37bab6	nir: do not generate b2i64 when driver want to lower it This is found on GFX12 by: KHR-GL43.shader_ballot_tests.ShaderBallotBitmasks ACO does not support it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>	2024-12-16 07:35:07 +00:00
Alyssa Rosenzweig	bd89279dd4	nir: add lower_scratch_to_var pass to ease opencl pain. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>	2024-12-12 21:16:13 +00:00
Rhys Perry	26790e90d3	nir: make ballot ALU and mbcnt_amd operations reorderable These can be lowered to ALU and load_subgroup_invocation, all of which are reorderable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>	2024-12-11 14:47:12 +00:00
Rhys Perry	650468fbdf	nir/move_discards_to_top: don't move across more intrinsics This missed dpp16_shift_amd, lane_permute_16_amd, last_invocation and ballot_relaxed. Instead, list the non-reorderable intrinsics which are allowed to be moved after discards. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>	2024-12-11 14:47:12 +00:00
Rhys Perry	5368569d06	nir: make load_helper_invocation non-reorderable This can't be moved to after demote, so it's not reorderable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>	2024-12-11 14:47:12 +00:00
Georg Lehmann	e8b29abb25	nir: add unsigned upper bound support for fsat Foz-DB Navi21: Totals from 89 (0.11% of 79395) affected shaders: Instrs: 97018 -> 96995 (-0.02%) CodeSize: 492996 -> 492488 (-0.10%) Latency: 504649 -> 504555 (-0.02%) InvThroughput: 121968 -> 121875 (-0.08%) VALU: 67427 -> 67404 (-0.03%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>	2024-12-10 20:53:53 +00:00
Georg Lehmann	e78e63e3fe	nir: add unsigned upper bound support for f2i32 Foz-DB Navi21: Totals from 649 (0.82% of 79395) affected shaders: CodeSize: 2330592 -> 2314112 (-0.71%) Latency: 2068161 -> 2053370 (-0.72%) InvThroughput: 346583 -> 329425 (-4.95%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>	2024-12-10 20:53:53 +00:00

5891 commits