fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 13:48:06 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	7a4469681e	nir: pass a callback to nir_lower_robust_access rather than try to enumerate everything a driver might want with an unmanageable collection of booleans, just do a filter callback + data. this ends up simpler overall, and will allow Intel to use this pass for just 64-bit images without needing to add even more booleans. while we're churning the pass signature, also do a quick port to nir_shader_intrinsics_pass Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [NIR and V3D] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32907>	2025-01-08 15:59:05 +00:00
Daniel Schürmann	d2f52e61c2	nir/divergence: change nir_has_divergent_loop() to return true only for divergent breaks The important information is whether a loop has a uniform number of iterations. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28627>	2025-01-08 13:33:54 +01:00
Mary Guillemard	ecdccae990	nir,agx: Allow nir_precomp_print_blob to print a static array This makes it stop leaking shader binary blobs definition and is required for panfrost clc. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32939>	2025-01-08 11:37:27 +00:00
Georg Lehmann	67d74a04b9	nir/peephole_select: allow load_vector/scalar_arg_amd Foz-DB Navi21: Totals from 1507 (1.90% of 79395) affected shaders: MaxWaves: 31830 -> 31870 (+0.13%); split: +0.20%, -0.08% Instrs: 938704 -> 937232 (-0.16%); split: -0.19%, +0.03% CodeSize: 4970860 -> 4964652 (-0.12%); split: -0.14%, +0.02% VGPRs: 79536 -> 79512 (-0.03%); split: -0.08%, +0.05% Latency: 5194524 -> 5218285 (+0.46%); split: -0.38%, +0.84% InvThroughput: 1200152 -> 1207251 (+0.59%); split: -0.02%, +0.61% VClause: 20728 -> 20741 (+0.06%); split: -0.11%, +0.17% SClause: 33612 -> 32871 (-2.20%); split: -2.78%, +0.57% Copies: 70601 -> 68847 (-2.48%); split: -2.62%, +0.13% Branches: 20032 -> 17521 (-12.53%) PreSGPRs: 47828 -> 47801 (-0.06%) VALU: 637446 -> 638094 (+0.10%); split: -0.02%, +0.13% SALU: 88627 -> 88462 (-0.19%); split: -1.08%, +0.90% VMEM: 36664 -> 36659 (-0.01%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32792>	2025-01-08 09:56:39 +00:00
Boris Brezillon	2af6e4beeb	pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex} load_vertex_id_zero_base() is supposed to return the zero-based vertex ID, which is then offset by load_first_vertex() to get an absolute vertex ID. At the same time, when we're in a Vulkan environment, load_first_vertex() also encodes the vertexOffset passed to the indexed draw. Midgard/Bifrost have slightly different semantics, where load_first_vertex() returns vertexOffset + minVertexIdInIndexRange, and load_vertex_id_zero_base() returns an ID that needs to be offset by this vertexOffset + minVertexIdInIndexRange to get the absolute vertex ID. Everything works fine as long as all the load_first_vertex() and load_vertex_id_zero_base() calls are coming from the load_vertex_id() lowering. But as mentioned above, that's no longer the case in Vulkan, where gl_BaseVertexARB will be turned into load_first_vertex() and expect a value of vertexOffset in an indexed draw context. We thus need to fix the mismatch by introducing two new panfrost-specific intrinsics so we can stop abusing load_first_vertex() and load_vertex_id_zero_base(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32415>	2025-01-07 08:15:19 +00:00
Marek Olšák	3800f0af41	nir/algebraic: optimize pack_split(unpack(a).x, unpack(a).y) -> a This is required to optimize FP64 and Int64 shaders generated by virglrenderer. It generates pack/unpack around every 64-bit op, which NIR currently can't eliminate. This fixes that. There is a new constraint ".y", which means that the use of an instruction should have swizzle.y. This allows us to add patterns that have Y swizzle on results of instructions. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	b1bc691b0f	nir/algebraic: add and improve pack/unpack patterns Some duplicated patterns are removed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	ebec182b04	nir/algebraic: use is_used_once for comparison patterns otherwise we are just creating new instructions while not removing any Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32172>	2025-01-07 05:47:52 +00:00
Marek Olšák	ee8916c414	nir: use IO intrinsics in nir_lower_drawpixels Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	0de28a9fd0	nir: use IO intrinsics in nir_lower_bitmap Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	a7ad1b302b	nir: remove redundant option linker_ignore_precision Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	730c8d506f	nir: flip the early exit condition in nir_lower_io_temporaries no change in behavior other than skipping COMPUTE as well. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	7b55ee999d	nir: don't set num_slots/src/dest_type/write_mask when they're set automatically to those values Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	55a4a8a2a8	nir: set src_type and dest_type to float implicitly for IO build helpers If you want to set it to int/uint, set .src_type or .dest_type. If you want to set it to float, you don't need to set the type at all. It's implicitly set to float. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Marek Olšák	b9f9d001d7	nir: set nir_io_semantics::num_slots to at least 1 in build helpers Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>	2025-01-06 19:09:17 +00:00
Benjamin Lee	081438ad39	panfrost: add nir pass to lower noperspective varyings Mali only supports perspective-correct varying interpolation in hardware, so we have to emulate noperspective with lowering in both the VS and FS. Both vulkan and opengl allow mismatched interpolation qualifiers between stages. Because we need all varyings that are noperspective in the FS to be lowered in the VS, we cannot rely on the interpolation qualifiers in the VS. Loading the set of noperspective varyings as a sysval allows the implementation to pass them as a compile-time constant when known statically, or a runtime push constant when not. Passing noperspective varyings dynamically has a performance cost with unnecessary branches and fmuls. This sysval is not hooked up yet in either panfrost or panvk, so shader compilation will fail. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>	2025-01-03 07:04:05 +00:00
Benjamin Lee	6f541e2016	panfrost: add intrinsic to load frag coord at a barycentric This is needed for noperspective lowering, where we need to multiply the varying value by gl_FragCoord.w at the same barycentric as the varying. Normal nir_load_frag_coord_zw instructions are lowered to the new intrinsic on bifrost with the pan_lower_frag_coord_zw pass. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>	2025-01-03 07:04:05 +00:00
Timur Kristóf	ec548fd37b	Revert "nir/opt_varyings: Add workaround for RADV mesh shader multiview." The workaround is not needed anymore, because RADV now implements the FS layer ID input as a sysval. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32641>	2025-01-02 14:07:51 +00:00
Marek Olšák	c21bc65ba7	nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads A negative hole size means the loads overlap. This will be used by drivers to handle overlapping loads in the callback easily. Reviewed-by: Mel Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>	2025-01-01 00:03:55 +00:00
Georg Lehmann	e112e2b047	nir,amd: optimize front_face ? a : -a Foz-DB Navi31: Totals from 3345 (4.21% of 79395) affected shaders: MaxWaves: 96182 -> 96174 (-0.01%) Instrs: 3135439 -> 3129508 (-0.19%); split: -0.24%, +0.05% CodeSize: 16776088 -> 16718048 (-0.35%); split: -0.38%, +0.03% VGPRs: 190884 -> 190848 (-0.02%); split: -0.03%, +0.01% Latency: 32624132 -> 32621734 (-0.01%); split: -0.16%, +0.16% InvThroughput: 5759987 -> 5749957 (-0.17%); split: -0.23%, +0.05% VClause: 51044 -> 51086 (+0.08%); split: -0.12%, +0.20% SClause: 103415 -> 103223 (-0.19%); split: -0.64%, +0.45% Copies: 170398 -> 170555 (+0.09%); split: -0.64%, +0.74% PreSGPRs: 135567 -> 133887 (-1.24%) PreVGPRs: 140569 -> 141317 (+0.53%) VALU: 1959144 -> 1953839 (-0.27%); split: -0.30%, +0.03% SALU: 217956 -> 217676 (-0.13%); split: -0.20%, +0.07% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	9bd4296845	nir: add nir_alu_srcs_negative_equal_typed Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	15d754fefa	nir: add load_front_face_fsign Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:34 +00:00
Georg Lehmann	b8fa9daf0c	nir: sink/move alu with two identical, non constant sources. Foz-DB Navi21: Totals from 32363 (40.76% of 79395) affected shaders: MaxWaves: 787499 -> 787675 (+0.02%); split: +0.02%, -0.00% Instrs: 28783404 -> 28783464 (+0.00%); split: -0.01%, +0.01% CodeSize: 156763536 -> 156765148 (+0.00%); split: -0.01%, +0.02% VGPRs: 1493304 -> 1492848 (-0.03%); split: -0.04%, +0.01% Latency: 243022511 -> 243051994 (+0.01%); split: -0.08%, +0.09% InvThroughput: 57827398 -> 57828129 (+0.00%); split: -0.05%, +0.05% VClause: 582208 -> 582298 (+0.02%); split: -0.07%, +0.08% SClause: 959634 -> 959312 (-0.03%); split: -0.07%, +0.04% Copies: 1965821 -> 1965826 (+0.00%); split: -0.17%, +0.17% Branches: 710593 -> 710596 (+0.00%); split: -0.00%, +0.01% PreSGPRs: 1313513 -> 1313632 (+0.01%); split: -0.00%, +0.01% PreVGPRs: 1210596 -> 1209103 (-0.12%); split: -0.12%, +0.00% VALU: 19463445 -> 19463497 (+0.00%); split: -0.02%, +0.02% SALU: 3319529 -> 3319500 (-0.00%); split: -0.01%, +0.01% Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32783>	2024-12-30 13:28:30 +00:00
Georg Lehmann	5b4b195f1b	nir: optimize unpacking 8bit values from a 64bit source Useful for load vectorization. Foz-DB Navi21: Totals from 299 (0.38% of 79395) affected shaders: Instrs: 287818 -> 284333 (-1.21%); split: -1.21%, +0.00% CodeSize: 1557124 -> 1540544 (-1.06%); split: -1.07%, +0.00% Latency: 4009407 -> 4012389 (+0.07%); split: -0.05%, +0.12% InvThroughput: 1260613 -> 1262530 (+0.15%); split: -0.01%, +0.17% VClause: 5472 -> 5369 (-1.88%); split: -1.92%, +0.04% SClause: 5419 -> 5305 (-2.10%); split: -2.58%, +0.48% Copies: 36709 -> 36060 (-1.77%); split: -1.81%, +0.04% PreSGPRs: 11861 -> 11655 (-1.74%) SALU: 66920 -> 64310 (-3.90%) Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32778>	2024-12-26 17:50:32 +00:00
Marek Olšák	58132d6fc8	radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass This uses si_shader_info to store whether gl_FragDepth can be removed, and it uses the kill_z epilog flag to do the removal without recompilation. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	a50d069d1c	nir/opt_varyings: clear info->clip/cull_distance_array_size if relocated svga breaks if shader_info declares these, but the shader is missing the outputs. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>	2024-12-20 02:32:08 +00:00
Marek Olšák	9d129505b5	nir/opt_varyings: set all IO types to float to facilitate full vectorization If types differ between components of a vec4 slot, IO vectorization can't be done. This also helps drivers like d3d12 that require matching types between shaders. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>	2024-12-20 02:32:08 +00:00
Caterina Shablia	f4fcfa8016	pan,nir: introduce load_attribute_pan load_attribute_pan is a panfrost-specific intrinsic for loading vertex attributes. Takes explicit vertex and instance IDs which we need in order to implement vertex attribute divisor with non-zero base instance on v9+. Passes which are used by panvk are modified to be aware of load_attribute_pan. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32039>	2024-12-18 08:33:16 +00:00
Georg Lehmann	c695043e81	nir/opt_algebraic: optimize min(max(a, b), a) Foz-DB Navi21: Totals from 105 (0.13% of 79395) affected shaders: MaxWaves: 2638 -> 2646 (+0.30%) Instrs: 76531 -> 75077 (-1.90%) CodeSize: 413668 -> 406484 (-1.74%) VGPRs: 4856 -> 4848 (-0.16%) Latency: 333684 -> 328438 (-1.57%); split: -1.57%, +0.00% InvThroughput: 80417 -> 78579 (-2.29%) VClause: 1818 -> 1768 (-2.75%) SClause: 3028 -> 2964 (-2.11%) Copies: 4708 -> 4513 (-4.14%); split: -4.50%, +0.36% PreVGPRs: 3792 -> 3715 (-2.03%); split: -2.08%, +0.05% VALU: 54734 -> 53528 (-2.20%) SALU: 6195 -> 6137 (-0.94%) VMEM: 2363 -> 2313 (-2.12%) SMEM: 5219 -> 5119 (-1.92%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32634>	2024-12-16 22:29:21 +00:00
Georg Lehmann	0e6d32777f	nir/opt_remove_phis: rematerialize equal alu Foz-DB Navi31: Totals from 943 (1.19% of 79395) affected shaders: MaxWaves: 24672 -> 24722 (+0.20%) Instrs: 1541665 -> 1544956 (+0.21%); split: -0.23%, +0.44% CodeSize: 8085180 -> 8109212 (+0.30%); split: -0.16%, +0.46% VGPRs: 57768 -> 57624 (-0.25%) Latency: 18043743 -> 17948245 (-0.53%); split: -1.28%, +0.75% InvThroughput: 2692605 -> 2677049 (-0.58%); split: -2.07%, +1.49% VClause: 25321 -> 25343 (+0.09%); split: -0.48%, +0.57% SClause: 38473 -> 38614 (+0.37%); split: -0.00%, +0.37% Copies: 86089 -> 86236 (+0.17%); split: -0.46%, +0.63% Branches: 36719 -> 36777 (+0.16%); split: -0.60%, +0.76% PreSGPRs: 44138 -> 44303 (+0.37%); split: -0.05%, +0.42% PreVGPRs: 43319 -> 43009 (-0.72%) VALU: 893684 -> 894272 (+0.07%); split: -0.42%, +0.48% SALU: 189561 -> 191358 (+0.95%); split: -0.05%, +1.00% VMEM: 42294 -> 42313 (+0.04%); split: -0.44%, +0.49% SMEM: 72916 -> 73144 (+0.31%) Instruction count regressions are largely caused by additional loop unrolling. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31028>	2024-12-16 20:38:38 +00:00
Qiang Yu	129e37bab6	nir: do not generate b2i64 when driver want to lower it This is found on GFX12 by: KHR-GL43.shader_ballot_tests.ShaderBallotBitmasks ACO does not support it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>	2024-12-16 07:35:07 +00:00
Alyssa Rosenzweig	bd89279dd4	nir: add lower_scratch_to_var pass to ease opencl pain. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>	2024-12-12 21:16:13 +00:00
Rhys Perry	26790e90d3	nir: make ballot ALU and mbcnt_amd operations reorderable These can be lowered to ALU and load_subgroup_invocation, all of which are reorderable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>	2024-12-11 14:47:12 +00:00
Rhys Perry	650468fbdf	nir/move_discards_to_top: don't move across more intrinsics This missed dpp16_shift_amd, lane_permute_16_amd, last_invocation and ballot_relaxed. Instead, list the non-reorderable intrinsics which are allowed to be moved after discards. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>	2024-12-11 14:47:12 +00:00
Rhys Perry	5368569d06	nir: make load_helper_invocation non-reorderable This can't be moved to after demote, so it's not reorderable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>	2024-12-11 14:47:12 +00:00
Georg Lehmann	e8b29abb25	nir: add unsigned upper bound support for fsat Foz-DB Navi21: Totals from 89 (0.11% of 79395) affected shaders: Instrs: 97018 -> 96995 (-0.02%) CodeSize: 492996 -> 492488 (-0.10%) Latency: 504649 -> 504555 (-0.02%) InvThroughput: 121968 -> 121875 (-0.08%) VALU: 67427 -> 67404 (-0.03%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>	2024-12-10 20:53:53 +00:00
Georg Lehmann	e78e63e3fe	nir: add unsigned upper bound support for f2i32 Foz-DB Navi21: Totals from 649 (0.82% of 79395) affected shaders: CodeSize: 2330592 -> 2314112 (-0.71%) Latency: 2068161 -> 2053370 (-0.72%) InvThroughput: 346583 -> 329425 (-4.95%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>	2024-12-10 20:53:53 +00:00
Georg Lehmann	0b366a7ab2	nir/uub: properly limit float support to 32bit Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>	2024-12-10 20:53:53 +00:00
Alyssa Rosenzweig	83dd4889a7	nir/lower_point_size: skip non-var derefs these can happen depending on pass order, otherwise we crash on the null pointer. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>	2024-12-10 19:13:07 +00:00
Alyssa Rosenzweig	69a0962c70	nir/lower_printf: use 64-bit math this lets load_store_vectorize vectorize the stores we produce. it also matches how actual OpenCL kernel code looks, so drivers need to have an optimized path for these 64+32 patterns regardless. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>	2024-12-10 19:13:07 +00:00
Alyssa Rosenzweig	da967416db	nir/lower_printf: use unsigned math negative offsets/sizes don't make sense, and zero-extension is often easier to optimize/lower than sign-extension. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>	2024-12-10 19:13:07 +00:00
Alyssa Rosenzweig	8db0751eb8	nir/lower_printf: lower aborts Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>	2024-12-10 19:13:07 +00:00
Alyssa Rosenzweig	0b9072e2e5	nir/lower_printf: allow fixed address fixed address printf buffers can avoid a lot of complexity, especially with the general case of (e.g.) DGC-enqueued precompiled kernels. so add a knob for that and save the driver the need to write a lowering pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>	2024-12-10 19:13:07 +00:00
Alyssa Rosenzweig	816c14d33d	nir: add printf_abort intrinsic abort() for the gpu, implemented with the printf infrastructure since they go together. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>	2024-12-10 19:13:07 +00:00
Georg Lehmann	c5c22fc3a3	nir: add constant clip/cull distance optimization Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32518>	2024-12-10 16:35:01 +00:00
Benjamin Lee	b01afd06cd	nir: update docs for nir_get_io_arrayed_index_src Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00
Benjamin Lee	74ccf6cbdc	nir: add option to use compact view indices In panvk we pass absolute view indices to the hardware, so we need to do the conversion from compacted to absolute at some point. Emitting absolute indices from nir_lower_multiview initially looks like the simplest option, but nir_lower_io_to_temporaries will emit a write for every element of array varyings. This results in unnecessary writes to disabled views. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00
Benjamin Lee	becb014d27	nir: treat per-view outputs as arrayed IO This is needed for implementing multiview in panvk, where the address calculation for multiview outputs is not well-represented by lowering to nir_intrinsic_store_output with a single offset. The case where a variable is both per-view and per-{vertex,primitive} is now unsupported. This would come up with drivers implementing NV_mesh_shader or using nir_lower_multiview on geometry, tessellation, or mesh shaders. No drivers currently do either of these. There was some code that attempted to handle the nested per-view case by unwrapping per-view/arrayed types twice, but it's unclear to what extent this actually worked. ANV and Turnip both rely on per-view outputs being assigned a unique driver location for each view, so I've added an option to configure that behavior rather than removing it. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00
Benjamin Lee	6d843cde45	nir: document index semantics in nir_lower_multiview Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00
Benjamin Lee	975c3ecd1e	nir: handle arbitrary per-view outputs in nir_lower_multiview This is needed for panvk, where multiview is "all or nothing". When multiview is enabled, all outputs may be written with separate values for each view. The edge case mentioned in the previous `nir_can_lower_multiview` is now handled because we now handle an arbitrary number of per-view output vars instead of expecting to find exactly one. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00

5878 commits