fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 11:38:06 +02:00

Author	SHA1	Message	Date
Connor Abbott	2d45836c95	ir3: Plumb through ray_intersection intrinsic Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>	2025-01-20 01:22:23 +00:00
Connor Abbott	91f19bcbe0	ir3: Plumb through two-dimensional UAV loads There is native support for D3D-style untyped UAVs, which are an unsized array of "records." This will be needed for acceleration structures, because normal SSBO descriptors aren't large enough to cover all the 128-byte instance descriptors for the maximum number of instances (2**24). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>	2025-01-20 01:22:23 +00:00
Konstantin Seurer	01ec2f59a4	nir/print: Do not print trailing spaces after preds/succs Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>	2025-01-18 11:02:25 +00:00
Konstantin Seurer	eb3ab68e5e	nir/tests: Add reference shaders Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>	2025-01-18 11:02:25 +00:00
Konstantin Seurer	8838a0c595	nir/tests: Add a helper for comparing a shader against a string This allows unit tests to compare against a reference nir shader instead of implementing checks for interesting instructions/CF nodes. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>	2025-01-18 11:02:25 +00:00
Konstantin Seurer	6d1d15183f	nir/tests: Improve shader creation Sets some fields so they are not printed and allows specifying a stage. This decreases the size of reference shaders. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>	2025-01-18 11:02:25 +00:00
Konstantin Seurer	305be9cf5e	nir/print: Print less unused shader info Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>	2025-01-18 11:02:25 +00:00
Lionel Landwerlin	2603dbd796	nir: make lower-level printf helper respect buffer size Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	43e79b26de	nir/lower_printf: drop static buffer addr lowering no longer used, replaced by the new pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	07ad850787	nir: add nir_lower_printf_buffer pass this is a helper for lowering the printf buffer intrinsics to constants for backend convenience. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	7bc9bbcc6e	nir/lower_printf: support dynamic buffer size this is required for vtn_bindgen2 where we don't know the buffer size until the driver-specific code paths, but we need to lower printf (to hash format strings) in common code. so defer the buffer size decision to an intrinsic. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	6db9218ec3	nir/lower_printf: add option to hash format strings Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	e1368f0a30	nir,util: move printf serializing into util there's nothing NIR specific here and these routines will be useful otherwise. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	47e16cab5e	nir/lower_printf: drop default max buffer size no uses and it doesn't make sense. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Alyssa Rosenzweig	621ff262bc	nir/lower_printf: drop null check we dereference options above. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>	2025-01-17 18:09:45 +00:00
Qiang Yu	e5041ef036	docs,src: replace doc and comments for PIPE_CAP with pipe_caps Use command: find . -type d \( -path "./.git" -o -path "./docs/relnotes" \) -prune -o -type f -exec sed -i 's/PIPE_CAP_\([A-Za-z0-9_]\)/pipe_caps.\L\1/g' {} + find . -type d \( -path "./.git" -o -path "./docs/relnotes" \) -prune -o -type f -exec sed -i 's/PIPE_CAPF_\([A-Za-z0-9_]\)/pipe_caps.\L\1/g' {} + With manual adjustment for docs/gallium/screen.rst to merge pipe_cap and pipe_capf section. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32955>	2025-01-17 04:39:47 +00:00
Marek Olšák	ff6e3e9f76	nir: add next_stage param to nir_slot_is_varying & nir_remove_sysval_output The result of nir_slot_is_varying depends on what the next shader stage is, and nir_remove_sysval_output uses it. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32855>	2025-01-16 16:28:15 +00:00
Marek Olšák	0d961b0723	nir: add barycentric coordinates src to load_point_coord_maybe_flipped Just like other input loads, radeonsi needs to know the barycentric coordinates for it. This adds the src and determines the optimal barycentric coordinates in nir_lower_point_smooth, the only producer of the intrinsic. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Sil Vilerino	e061792e25	src/compiler: Fix warning C4389: An == or != operation involved signed and unsigned variables. This could result in a loss of data. Reviewed-By: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Jesse Natalie <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32979>	2025-01-15 21:40:20 +00:00
Sil Vilerino	8ecb7bc2a2	src/compiler: Fix warning C4244 'argument' : conversion from 'type1' to 'type2', possible loss of data Reviewed-By: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Jesse Natalie <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32979>	2025-01-15 21:40:20 +00:00
Alyssa Rosenzweig	401b400de3	nir,asahi,hk: add barrier argument to MESA_DISPATCH_PRECOMP In the current API, precomp implicitly assumes full barriers both before & after every dispatch. That's not good for performance. However, dropping the barriers and requiring user to explicitly call barrier functions before/after would have bad ergonomics. So, we add a new parameter to the standard MESA_DISPATCH_PRECOMP signature representing the barriers required around the dispatch. As usual, the actual type & semantic is left to drivers to define what makes sense for their hardware. We just reserve the place for it. (I think most drivers will want bitflags here, but I don't think the actual flags are worth. If a driver wanted to use a struct here, that would work too.) Since the asahi stack doesn't do anything clever with barriers yet, we mechanically add an AGX_BARRIER_ALL barrier to all precomp users in-tree. We can optimize that later, this just gets the flag-day change in with no functional change. For JM panfrost, this will provide a convenient place to stash both their "job barrier" bit and their "suppress prefetch" bit (which is really a sort of barrier / cache flush, if you think about it). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32980>	2025-01-14 16:39:57 +00:00
Kenneth Graunke	2f334e8baf	nir: Add a nir_def_first_component_read() helper Similar to nir_def_last_component_read(). Just a little nicer than prodding at the bitmask of components read directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32888>	2025-01-10 22:44:09 +00:00
Alyssa Rosenzweig	d9b4867e2a	nir/lower_robust_access: fix robustness with atomic swap this was missed in the original v3d pass, and then the common code port inherited the bug. (so strictly this fix "should" be backported even farther back but it won't apply before the Fixes here, and I don't think we do LTS that far back anyway). in theory this should fix a corner case with robustness on the gl (but not vulkan, at least for apple) drivers on broadcom & apple. Fixes: `f0fb8d05e3` ("nir: Add nir_lower_robust_access pass") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32907>	2025-01-08 15:59:05 +00:00
Alyssa Rosenzweig	7a4469681e	nir: pass a callback to nir_lower_robust_access rather than try to enumerate everything a driver might want with an unmanageable collection of booleans, just do a filter callback + data. this ends up simpler overall, and will allow Intel to use this pass for just 64-bit images without needing to add even more booleans. while we're churning the pass signature, also do a quick port to nir_shader_intrinsics_pass Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [NIR and V3D] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32907>	2025-01-08 15:59:05 +00:00
Daniel Schürmann	d2f52e61c2	nir/divergence: change nir_has_divergent_loop() to return true only for divergent breaks The important information is whether a loop has a uniform number of iterations. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28627>	2025-01-08 13:33:54 +01:00
Mary Guillemard	ecdccae990	nir,agx: Allow nir_precomp_print_blob to print a static array This makes it stop leaking shader binary blobs definition and is required for panfrost clc. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32939>	2025-01-08 11:37:27 +00:00
Georg Lehmann	67d74a04b9	nir/peephole_select: allow load_vector/scalar_arg_amd Foz-DB Navi21: Totals from 1507 (1.90% of 79395) affected shaders: MaxWaves: 31830 -> 31870 (+0.13%); split: +0.20%, -0.08% Instrs: 938704 -> 937232 (-0.16%); split: -0.19%, +0.03% CodeSize: 4970860 -> 4964652 (-0.12%); split: -0.14%, +0.02% VGPRs: 79536 -> 79512 (-0.03%); split: -0.08%, +0.05% Latency: 5194524 -> 5218285 (+0.46%); split: -0.38%, +0.84% InvThroughput: 1200152 -> 1207251 (+0.59%); split: -0.02%, +0.61% VClause: 20728 -> 20741 (+0.06%); split: -0.11%, +0.17% SClause: 33612 -> 32871 (-2.20%); split: -2.78%, +0.57% Copies: 70601 -> 68847 (-2.48%); split: -2.62%, +0.13% Branches: 20032 -> 17521 (-12.53%) PreSGPRs: 47828 -> 47801 (-0.06%) VALU: 637446 -> 638094 (+0.10%); split: -0.02%, +0.13% SALU: 88627 -> 88462 (-0.19%); split: -1.08%, +0.90% VMEM: 36664 -> 36659 (-0.01%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32792>	2025-01-08 09:56:39 +00:00
Boris Brezillon	2af6e4beeb	pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex} load_vertex_id_zero_base() is supposed to return the zero-based vertex ID, which is then offset by load_first_vertex() to get an absolute vertex ID. At the same time, when we're in a Vulkan environment, load_first_vertex() also encodes the vertexOffset passed to the indexed draw. Midgard/Bifrost have slightly different semantics, where load_first_vertex() returns vertexOffset + minVertexIdInIndexRange, and load_vertex_id_zero_base() returns an ID that needs to be offset by this vertexOffset + minVertexIdInIndexRange to get the absolute vertex ID. Everything works fine as long as all the load_first_vertex() and load_vertex_id_zero_base() calls are coming from the load_vertex_id() lowering. But as mentioned above, that's no longer the case in Vulkan, where gl_BaseVertexARB will be turned into load_first_vertex() and expect a value of vertexOffset in an indexed draw context. We thus need to fix the mismatch by introducing two new panfrost-specific intrinsics so we can stop abusing load_first_vertex() and load_vertex_id_zero_base(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32415>	2025-01-07 08:15:19 +00:00
Marek Olšák	3800f0af41	nir/algebraic: optimize pack_split(unpack(a).x, unpack(a).y) -> a This is required to optimize FP64 and Int64 shaders generated by virglrenderer. It generates pack/unpack around every 64-bit op, which NIR currently can't eliminate. This fixes that. There is a new constraint ".y", which means that the use of an instruction should have swizzle.y. This allows us to add patterns that have Y swizzle on results of instructions. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	b1bc691b0f	nir/algebraic: add and improve pack/unpack patterns Some duplicated patterns are removed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	ebec182b04	nir/algebraic: use is_used_once for comparison patterns otherwise we are just creating new instructions while not removing any Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	ee8916c414	nir: use IO intrinsics in nir_lower_drawpixels Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	0de28a9fd0	nir: use IO intrinsics in nir_lower_bitmap Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	a7ad1b302b	nir: remove redundant option linker_ignore_precision Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	730c8d506f	nir: flip the early exit condition in nir_lower_io_temporaries no change in behavior other than skipping COMPUTE as well. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	7b55ee999d	nir: don't set num_slots/src/dest_type/write_mask when they're set automatically to those values Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	55a4a8a2a8	nir: set src_type and dest_type to float implicitly for IO build helpers If you want to set it to int/uint, set .src_type or .dest_type. If you want to set it to float, you don't need to set the type at all. It's implicitly set to float. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	b9f9d001d7	nir: set nir_io_semantics::num_slots to at least 1 in build helpers Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Benjamin Lee	081438ad39	panfrost: add nir pass to lower noperspective varyings Mali only supports perspective-correct varying interpolation in hardware, so we have to emulate noperspective with lowering in both the VS and FS. Both vulkan and opengl allow mismatched interpolation qualifiers between stages. Because we need all varyings that are noperspective in the FS to be lowered in the VS, we cannot rely on the interpolation qualifiers in the VS. Loading the set of noperspective varyings as a sysval allows the implementation to pass them as a compile-time constant when known statically, or a runtime push constant when not. Passing noperspective varyings dynamically has a performance cost with unnecessary branches and fmuls. This sysval is not hooked up yet in either panfrost or panvk, so shader compilation will fail. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>	2025-01-03 07:04:05 +00:00
Benjamin Lee	6f541e2016	panfrost: add intrinsic to load frag coord at a barycentric This is needed for noperspective lowering, where we need to multiply the varying value by gl_FragCoord.w at the same barycentric as the varying. Normal nir_load_frag_coord_zw instructions are lowered to the new intrinsic on bifrost with the pan_lower_frag_coord_zw pass. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>	2025-01-03 07:04:05 +00:00
Timur Kristóf	ec548fd37b	Revert "nir/opt_varyings: Add workaround for RADV mesh shader multiview." The workaround is not needed anymore, because RADV now implements the FS layer ID input as a sysval. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32641>	2025-01-02 14:07:51 +00:00
Marek Olšák	c21bc65ba7	nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads A negative hole size means the loads overlap. This will be used by drivers to handle overlapping loads in the callback easily. Reviewed-by: Mel Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>	2025-01-01 00:03:55 +00:00
Georg Lehmann	e112e2b047	nir,amd: optimize front_face ? a : -a Foz-DB Navi31: Totals from 3345 (4.21% of 79395) affected shaders: MaxWaves: 96182 -> 96174 (-0.01%) Instrs: 3135439 -> 3129508 (-0.19%); split: -0.24%, +0.05% CodeSize: 16776088 -> 16718048 (-0.35%); split: -0.38%, +0.03% VGPRs: 190884 -> 190848 (-0.02%); split: -0.03%, +0.01% Latency: 32624132 -> 32621734 (-0.01%); split: -0.16%, +0.16% InvThroughput: 5759987 -> 5749957 (-0.17%); split: -0.23%, +0.05% VClause: 51044 -> 51086 (+0.08%); split: -0.12%, +0.20% SClause: 103415 -> 103223 (-0.19%); split: -0.64%, +0.45% Copies: 170398 -> 170555 (+0.09%); split: -0.64%, +0.74% PreSGPRs: 135567 -> 133887 (-1.24%) PreVGPRs: 140569 -> 141317 (+0.53%) VALU: 1959144 -> 1953839 (-0.27%); split: -0.30%, +0.03% SALU: 217956 -> 217676 (-0.13%); split: -0.20%, +0.07% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	9bd4296845	nir: add nir_alu_srcs_negative_equal_typed Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	15d754fefa	nir: add load_front_face_fsign Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:34 +00:00
Georg Lehmann	b8fa9daf0c	nir: sink/move alu with two identical, non constant sources. Foz-DB Navi21: Totals from 32363 (40.76% of 79395) affected shaders: MaxWaves: 787499 -> 787675 (+0.02%); split: +0.02%, -0.00% Instrs: 28783404 -> 28783464 (+0.00%); split: -0.01%, +0.01% CodeSize: 156763536 -> 156765148 (+0.00%); split: -0.01%, +0.02% VGPRs: 1493304 -> 1492848 (-0.03%); split: -0.04%, +0.01% Latency: 243022511 -> 243051994 (+0.01%); split: -0.08%, +0.09% InvThroughput: 57827398 -> 57828129 (+0.00%); split: -0.05%, +0.05% VClause: 582208 -> 582298 (+0.02%); split: -0.07%, +0.08% SClause: 959634 -> 959312 (-0.03%); split: -0.07%, +0.04% Copies: 1965821 -> 1965826 (+0.00%); split: -0.17%, +0.17% Branches: 710593 -> 710596 (+0.00%); split: -0.00%, +0.01% PreSGPRs: 1313513 -> 1313632 (+0.01%); split: -0.00%, +0.01% PreVGPRs: 1210596 -> 1209103 (-0.12%); split: -0.12%, +0.00% VALU: 19463445 -> 19463497 (+0.00%); split: -0.02%, +0.02% SALU: 3319529 -> 3319500 (-0.00%); split: -0.01%, +0.01% Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32783>	2024-12-30 13:28:30 +00:00
Georg Lehmann	5b4b195f1b	nir: optimize unpacking 8bit values from a 64bit source Useful for load vectorization. Foz-DB Navi21: Totals from 299 (0.38% of 79395) affected shaders: Instrs: 287818 -> 284333 (-1.21%); split: -1.21%, +0.00% CodeSize: 1557124 -> 1540544 (-1.06%); split: -1.07%, +0.00% Latency: 4009407 -> 4012389 (+0.07%); split: -0.05%, +0.12% InvThroughput: 1260613 -> 1262530 (+0.15%); split: -0.01%, +0.17% VClause: 5472 -> 5369 (-1.88%); split: -1.92%, +0.04% SClause: 5419 -> 5305 (-2.10%); split: -2.58%, +0.48% Copies: 36709 -> 36060 (-1.77%); split: -1.81%, +0.04% PreSGPRs: 11861 -> 11655 (-1.74%) SALU: 66920 -> 64310 (-3.90%) Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32778>	2024-12-26 17:50:32 +00:00
Marek Olšák	58132d6fc8	radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass This uses si_shader_info to store whether gl_FragDepth can be removed, and it uses the kill_z epilog flag to do the removal without recompilation. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	a50d069d1c	nir/opt_varyings: clear info->clip/cull_distance_array_size if relocated svga breaks if shader_info declares these, but the shader is missing the outputs. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>	2024-12-20 02:32:08 +00:00
Marek Olšák	9d129505b5	nir/opt_varyings: set all IO types to float to facilitate full vectorization If types differ between components of a vec4 slot, IO vectorization can't be done. This also helps drivers like d3d12 that require matching types between shaders. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>	2024-12-20 02:32:08 +00:00
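
As a rough illustration of the signed hole size described in c21bc65ba7 (nir/opt_load_store_vectorize), the sketch below computes the gap between two loads. The function name `hole_size` and its parameters are hypothetical and are not the Mesa callback signature. The sketch assumes the hole is measured in bytes from the end of the lower load to the start of the higher one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: signed gap in bytes between the end of
 * the lower load and the start of the higher load.
 *   > 0  : there is a hole of that many bytes between the two loads
 *   == 0 : the loads are exactly adjacent
 *   < 0  : the loads overlap by that many bytes
 * Assumes low_offset <= high_offset. */
static int64_t
hole_size(uint32_t low_offset, uint32_t low_size, uint32_t high_offset)
{
   assert(low_offset <= high_offset);

   /* Widen before subtracting so the result can go negative without
    * wrapping around as an unsigned value would. */
   return (int64_t)high_offset - ((int64_t)low_offset + (int64_t)low_size);
}
```

With these assumptions, a 16-byte load at offset 0 followed by a load at offset 20 leaves a 4-byte hole, while a second load at offset 8 yields -8, which a callback could treat as an overlap rather than a gap.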

5901 commits