None of the known etnaviv GPUs support this feature in hardware
and the binary blob provides sizes via uniforms too.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24217>
This intrinsic (vec2 tess_coord) is generally useful for non-r600 backends.
Promote it.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24159>
Read-after-write hazards require special handling on AGX, since image loads are
implemented with texturing. Add intrinsics to handle these hazards.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24148>
Note the writemask handling is chosen for consistency with the rest of NIR. In
every other instance, writemask=w requires a vec4 source. This is hardcoded into
nir_validate and nir_print as what it means to have a writemask.
More importantly, it is consistent with how register writemasks currently work.
nir_print hides it, but r0.w = fneg ssa_1.x is actually a vec4 instruction with
source ssa_1.xxxx. As a silly example, nir_dest_num_components(that) = 4 in the
old model. I realize this is quite strange coming from a scalar ISA, but it's
perfectly natural for the class of vec4 hardware for which this was designed. In
that hardware, conceptually all instructions are vec4, so the sequence "fneg
ssa_1 and write to channel w" is implemented as "fneg a vec4 with ssa_1.x in the
last component and write that vec4 out, masked to write only the w channel".
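As a hedged illustration using the helpers this series adds (builder names and
exact signatures may differ):

   /* Sketch only: "r0.w = fneg ssa_1.x" is really a vec4 fneg of ssa_1.xxxx
    * whose store is masked to the w channel. */
   static void
   example_masked_store(nir_builder *b, nir_def *ssa_1, nir_def *reg)
   {
      nir_def *vec = nir_swizzle(b, ssa_1, (unsigned[]){0, 0, 0, 0}, 4);
      nir_store_reg(b, nir_fneg(b, vec), reg, .write_mask = BITFIELD_BIT(3));
   }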
Isn't this inefficient? It can be. To save power, Midgard has scalar ALUs in
addition to vec4 ALUs. Those details are confined to the backend VLIW scheduler;
the instruction selection is still done as vec4. This mechanism has little in
common with AMD's SALUs. Midgard has a wave size of 1, with special hacks for
derivatives.
As a result, all backends consuming register writemasks are expecting this
pattern of code. Changing the store to take a vec1 instead of a vec4 would
require changing every backend to reswizzle the sources to resurrect the vec4. I
started typing a branch to do this yesterday, but it made a mess of both Midgard
and nir-to-tgsi. Without any good reason to think it'd actually help
performance, I abandoned the idea. Getting all 15 backends converted to the
helpers is enough of a challenge without forcing 10 backends to reswizzle their
sources too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
On some architectures, gl_FragCoord.xy is available as an integer but
gl_FragCoord.zw requires interpolation. Add dedicated intrinsics so we can
lower it all in NIR.
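A minimal sketch of the lowering this enables, with hypothetical intrinsic
names (the integer pixel-position and interpolated z/w loads are assumptions,
not the exact builders):

   /* Rebuild gl_FragCoord from an integer xy position plus interpolated zw. */
   static nir_def *
   build_frag_coord(nir_builder *b)
   {
      nir_def *xy = nir_fadd_imm(b, nir_u2f32(b, nir_load_pixel_coord(b)), 0.5);
      nir_def *zw = nir_load_frag_coord_zw(b);
      return nir_vec4(b, nir_channel(b, xy, 0), nir_channel(b, xy, 1),
                      nir_channel(b, zw, 0), nir_channel(b, zw, 1));
   }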
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>
sample_mask_agx corresponds directly to the hardware's 2-source instruction, but
it's hard to use correctly and even harder to legalize after the fact, since
it's responsible for not only discard but also late depth/stencil testing. For
our various high-level lowering passes, it's easier to use a one-source discard
(where we don't have to worry about sample masks), which the compiler will
internally lower to the two-source instruction. Introduce such an instruction.
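A hedged sketch of that internal lowering (two-source semantics and 16-bit
mask sizes assumed from the description above):

   /* Sketch only: discard a set of samples by writing a zero mask for
    * exactly those samples via the hardware's two-source form. */
   static void
   lower_discard_agx(nir_builder *b, nir_def *samples)
   {
      nir_sample_mask_agx(b, samples, nir_imm_intN_t(b, 0, 16));
   }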
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>
Derefs have index-based access semantics, which means we don't need
custom intrinsics to encode an index instead of a byte offset.
Remove the "masked" store intrinsics and just emit the pair of atomics
directly. This massively reduces duplication between scratch, shared,
and constant, while also moving more things into nir so more optimizations
can be done.
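For reference, a hedged sketch of the emitted pair using the unified deref
atomics (exact builder signatures may differ):

   /* Sketch only: a masked 32-bit store is "clear the masked bits, then OR
    * in the new ones", so concurrent lanes never clobber bits outside
    * their mask. */
   static void
   emit_masked_store(nir_builder *b, nir_deref_instr *deref,
                     nir_def *value, uint32_t mask)
   {
      nir_deref_atomic(b, 32, &deref->def, nir_imm_int(b, ~mask),
                       .atomic_op = nir_atomic_op_iand);
      nir_deref_atomic(b, 32, &deref->def, nir_iand_imm(b, value, mask),
                       .atomic_op = nir_atomic_op_ior);
   }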
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
There are a few changes in here that are very interrelated.
First, we stop lowering load_deref on shader_temp to load_ptr_dxil,
and just leave it as load_deref. In order for that to work, we need
the derefs to be in a shape that's acceptable to DXIL, so the only
current producer of shader_temp loads (the CLC frontend) needs to
run some lowering passes on them first.
The DXIL backend is augmented to just write out deref indices while
walking a deref chain, which will get combined in the load op into
a GEP instruction. For non-mesh/raytracing shaders, these are required
to be single-level scalar arrays, but the complexity here is preparation
for when we don't need to do that anymore.
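A hedged sketch of that walk (a standalone helper for illustration; the real
backend code differs):

   /* Sketch only: collect one index per deref level, variable to leaf;
    * these become the GEP operands of the eventual load/store. */
   static unsigned
   collect_gep_indices(nir_builder *b, nir_deref_instr *leaf,
                       nir_def *indices[], unsigned max)
   {
      nir_deref_path path;
      nir_deref_path_init(&path, leaf, NULL);
      unsigned n = 0;
      /* path.path[0] is the variable deref itself; start at [1] */
      for (nir_deref_instr **p = &path.path[1]; *p != NULL && n < max; p++) {
         if ((*p)->deref_type == nir_deref_type_array)
            indices[n++] = (*p)->arr.index.ssa;
         else if ((*p)->deref_type == nir_deref_type_struct)
            indices[n++] = nir_imm_int(b, (*p)->strct.index);
      }
      nir_deref_path_finish(&path);
      return n;
   }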
Additionally, the const lookups are changed from using a hash table
to just putting an index on the variable.
All of this together is enough to enable the authored-forever-ago test
which uses indirect array access into a const packed struct. The
load_ptr_dxil handling didn't deal with packed structs / unaligned
accesses, but now that we're in a logical address space with derefs
instead of physical, there's no alignment to deal with anymore and
the fact that it's packed goes out the window.
This removes one custom DXIL intrinsic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
This is a piece of cake with unified atomics :-) This will let us do our
addressing math tricks nice and easily.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23529>
For VK_KHR_fragment_shader_barycentric, AMD needs to know the primitive
topology in the fragment shader but with fast-link GPL this is unknown
at compile time and it needs to be passed dynamically.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16742>
This introduces new 3-component intrinsics
nir_intrinsic_load_barycentric_coord_xxx rather than expanding the existing
ones, which are meant to interpolate input varyings; BaryCoord is a sysval
on most hardware.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23254>
Now that radeonsi supports passing a descriptor to SSBO atomic ops,
we can use SSBO atomics instead. aco does not implement
nir_buffer_atomic_add either.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23096>
Intel HW has multiple ways to access resources like UBO/SSBO/images:
- binding tables: a small heap of ~240 surfaces
- bindless surfaces: a 64MB heap of surfaces up to Gfx12, 4GB on Gfx12.5+
- surfaces: a 4GB heap on Gfx12.5+ (mostly unused at the moment,
  only available through the LSC)
For samplers, we have 2 options since Gfx11:
- samplers indexed from the Dynamic State Heap (4GB)
- samplers indexed from the Bindless Sampler Heap (4GB)
Additionally our whole push constant promotion mechanism is based
around binding table indices. This is problematic if you want to also
promote to push constants things that would be accessed through the
bindless heap.
To solve this issue, we introduce a new intrinsic that will carry a
block index that is based on neither the binding table index nor the
bindless table offset.
We will also use this intrinsic to identify whether the buffer/surface
index in load_ubo/load_ssbo/store_ssbo/etc. is relative to the
binding table or the bindless heap.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
sample_mask_agx maps to the AGX instruction used to write out a sample mask.
api_sample_mask_agx is a system value that returns the value of glSampleMask
(or its Vulkan equivalent), used to lower glSampleMask (etc).
This is distinct from sample_mask_in, which we map to the hardware coverage
mask and AND with this value as a lowering.
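Roughly, as a sketch (assuming the generated builder names):

   /* gl_SampleMaskIn = hardware coverage mask AND the API sample mask. */
   nir_def *lowered = nir_iand(b, nir_load_sample_mask_in(b),
                               nir_load_api_sample_mask_agx(b));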
sample_positions_agx is a system value returning the sample positions in a
packed fixed-point format matching the hardware register, used to lower
gl_SamplePositions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23040>
Some hardware has an instruction to load the address of a texel in a writeable
image, given the coordinates ("LEA_IMAGE"). This operation is defined only for
uncompressed images, but it is well-defined regardless of the underlying
twiddling. As such, it is not expected to be produced by APIs but is useful for
internal lowering when it is known that images will be uncompressed (e.g.
because image_store does not support compression on the hardware).
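A hedged sketch of such a lowering (build_texel_address is a hypothetical
stand-in for the new intrinsic's builder; image_store source layout assumed):

   /* Hypothetical helper wrapping the "LEA_IMAGE" operation. */
   extern nir_def *build_texel_address(nir_builder *b,
                                       nir_intrinsic_instr *intr);

   /* Sketch only: once the image is known to be uncompressed, image_store
    * becomes an address computation plus a plain global store. */
   static void
   lower_image_store(nir_builder *b, nir_intrinsic_instr *intr)
   {
      nir_def *addr = build_texel_address(b, intr);
      nir_def *data = intr->src[3].ssa; /* value being stored */
      nir_store_global(b, addr, 4, data, 0xf);
   }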
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
I was scratching my head about this for a few minutes until I found the answer
in spirv_to_nir. Hopefully this saves someone else some head scratching in turn.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
This is needed to lower smooth lines conditionally in fragment shaders for
RADV, because the line rasterization mode in Vulkan can be dynamic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21587>
The intrinsics are now totally dead and can be removed.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23036>
committed has to be a constant, so there is no need to have a src and
depend on constant folding to remove the i2b.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22963>
Currently, we have an atomic intrinsic for each combination of memory type
(global, shared, image, etc) and atomic operation (add, sub, etc). So for m
types of memory supported by the driver and n atomic opcodes, the driver has to
handle O(mn) intrinsics. This makes a total mess in every single backend I've
looked at, without fail.
It would be a lot nicer to unify the intrinsics. There are two obvious ways:
1. Make the memory type a constant index, keep different intrinsics for
different operations. The problem with this is that different memory types
imply different intrinsic signatures (number of sources, etc). As an
example, it doesn't make sense to unify global_atomic_amd with
global_atomic_2x32: the first takes 3 scalar sources, the second takes 1
vector and 1 scalar. Also, in any single backend, there are a lot more
operations than there are memory types.
2. Make the opcode a constant index, keep different intrinsics for different
memory types. This works well, with one exception: compswap and fcompswap
take an extra argument that other atomics don't, so there's an extra axis of
variation for the intrinsic signatures.
So, the solution is to have 2 intrinsics for each memory type -- for atomics
taking 1 argument and atomics taking 2, respectively. Both of these intrinsics
take a nir_atomic_op enum to describe their operation. We don't use a nir_op for
this purpose, as there are some atomics (cmpxchg, inc_wrap, etc) that don't
cleanly map to any ALU op and it would be weird to force it.
The plan is to transition to these new opcodes gradually. This series adds a
lowering pass producing these opcodes from the existing opcodes, so that
backends can opt-in to the new forms one-by-one. Then we can convert backends
separately without any cross-tree flag day. Once everything is converted, we can
convert the producers and core NIR as a flag day, but we have far fewer
producers than backends so this should be fine. Finally we can drop the old
stuff.
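For illustration, a hedged sketch of the two shapes for global memory
(builder names as generated from the new intrinsics; signatures may differ
slightly):

   /* One intrinsic per memory type for 1-source atomics, one for the
    * 2-source swap family; the operation rides along as a constant index. */
   static void
   example_atomics(nir_builder *b, nir_def *addr, nir_def *cmp, nir_def *data)
   {
      nir_def *old = nir_global_atomic(b, 32, addr, data,
                                       .atomic_op = nir_atomic_op_iadd);
      nir_def *prev = nir_global_atomic_swap(b, 32, addr, cmp, data,
                                             .atomic_op = nir_atomic_op_cmpxchg);
   }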
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22914>
When rendering a scaled tile, we need to use the original, hardware
FragCoord when accessing input attachments that are on-tile (i.e. were
rendered to in a previous subpass) because they are also scaled in the
same way that FragCoord is scaled. For input attachments that aren't
already on-tile, however, we need to use the fixed gl_FragCoord. Add a
new intrinsic and a bitfield of input attachments which should use it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20304>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Contains the global wave ID for legacy GS waves.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
This intrinsic is going to be used for simplifying GS code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
For GFX11, export dual-source blend outputs when using ACO.
ACO needs a pseudo instruction to emit a block of
code, which can't be done in NIR currently.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22199>
Previously, the passthrough GS shader loaded some values with uniform
loads using several hardcoded values.
This was not flexible for other drivers and started becoming too
inflexible for zink itself.
Use system values instead and use a lowering pass in zink.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22667>
They use the same instruction. nir_load_smem_buffer_amd only existed
because, at the time it was introduced, radeonsi didn't support
passing a buffer descriptor to nir_load_ubo directly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22523>
With independent sets, we're not able to compute immediate values for
the index at which to read anv_push_constants::dynamic_offsets to get
the offset of a dynamic buffer. This is because the pipeline layout
may not have all the descriptor set layouts when we compile the
shader.
To solve that issue, we insert a layer of indirection.
This reworks the dynamic buffer offset storage with a 2D array in
anv_cmd_pipeline_state:
   dynamic_offsets[MAX_SETS][MAX_DYN_BUFFERS]
When the pipeline or the dynamic buffer offsets are updated, we
flatten that array into the
anv_push_constants::dynamic_offsets[MAX_DYN_BUFFERS] array.
For shaders compiled with independent sets, the bottom 6 bits of
element X in anv_push_constants::desc_sets[] are used to specify the
base offset into anv_push_constants::dynamic_offsets[] for
set X.
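A hedged CPU-side sketch of that flattening (field names beyond the ones
mentioned here are assumptions):

   /* Sketch only: give each set a base slot in the flat array and stash
    * that base in the low 6 bits of desc_sets[set]. */
   uint32_t base = 0;
   for (uint32_t set = 0; set < MAX_SETS; set++) {
      uint32_t count = layout->set[set].dynamic_buffer_count; /* assumed */
      memcpy(&push->dynamic_offsets[base], state->dynamic_offsets[set],
             count * sizeof(uint32_t));
      push->desc_sets[set] |= base & 0x3f;
      base += count;
   }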
The computation in the shader is now something like:
   base_dyn_buffer_set_idx = anv_push_constants::desc_sets[set_idx] & 0x3f
   dyn_buffer_offset = anv_push_constants::dynamic_offsets[base_dyn_buffer_set_idx + dynamic_buffer_idx]
Faith suggested instead using a different push constant buffer with
dynamic_offsets prepared for each stage when using independent sets,
but it feels easier to understand this way. And there is some
room for optimization: if, for set X, you know all the sets in
the range [0, X], then you can still avoid the indirection. Separate
push constant allocations per stage do have a CPU cost.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
Add more system values for XFB. This should be good enough for lowering GL3.1 +
transform_feedback2 + transform_feedback3. More will probably be needed for
geom/tess but that will be easier to work with when I'm actually bringing up
geom/tess. At any rate, we're splitting out XFB from the rasterization pipeline
and since XFB happens only in the last shader pre-rasterization stage, VS+XFB is
an orthogonal problem from e.g. VS+GS+XFB. Yeah, the combinatorics suck.
These will be used by Asahi, and hopefully eventually Panfrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22123>
This involves two new system values.
Reviewed-by: Faith Ekstrand <faith@gfxstrand.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20303>