In a NIR generated using SPIR-V initializers to variables, copy
propagation can end up transforming
vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
vec1 32 ssa_35 = mov ssa_33
vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_35 (shared mat2x4) /* ptr_stride=0 */
into
vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_33 (shared mat2x4) /* ptr_stride=0 */
Before the optimization, the "head" of a path of deref that uses ssa_7
will be the cast. After, it will be the variable in ssa_33. Since
the types are the same, this is a trivial cast that would be picked up
by nir_opt_deref.
If we need to compare such deref-chain after optimization with another
deref-chain for the same variable, the compare function will get
confused by the cast in the middle.
One alternative would be to add nir_opt_deref to places that compare
derefs, but that might not scale well, so skip the trivial casts when
generating the paths instead.
Motivated by the discussion in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047#note_383660.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
This introduces a new NIR intrinsic for loading inputs at a specific
vertex index.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
From the SPV_AMD_shader_explicit_vertex_parameter extension:
"Returns the value of the input <interpolant> without any
interpolation, i.e. the raw output value of previous shader
stage."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
We need the LINEAR versions for AMD_shader_explicit_vertex_parameter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
This introduces one more interpolation mode INTERP_MODE_EXPLICIT,
which is needed for AMD_shader_explicit_vertex_parameter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
These instructions are allowed to fetch from multisampled
subpass input attachments.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
nir_tex_src_ms_index is re-used for the fragment index with
nir_texop_fragment_fetch to avoid introducing a new texture source type.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
This introduces:
- nir_texop_fragment_mask_fetch (fetch a fragment mask from a
compressed multisampled color surface)
- nir_texop_fragment_fetch (fetch a color fragment for a
particular sample at corresponding fragment mask index).
These two texture operations are necessary for implementing
SPV_AMD_shader_fragment_mask.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
This new capability is for SPV_AMD_shader_fragment_mask.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
This helps avoid incorrect validation error when linking glsl
shaders and avoids assigning uniform storage slots that will
never be used.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
We need to stripe any arrays before checking the type. Here we
just use the uniform type which has already be stripped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
I noticed that we can do better for these kinds of comparisons while
working on the lowering for iadd_sat@64 and isub_sat@64. This
eliminated 11 instruction from the fs-addSaturate-int64.shader_test.
My hope is that this will improve the run-time of int64 tests on Ice
Lake. I have no data to support or refute this.
Unsurprisingly, no changes on shader-db.
v2: Condition the min and max patterns with nir_lower_minmax64.
Suggested by Caio. Very long discussion in the MR. :)
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")
v3: Add missing SpvOpUCountTrailingZerosINTEL case to switch in
vtn_handle_body_instruction. Remove stray semicolon in
vtn_nir_alu_op_for_spirv_opcode. Use umin instead of umax for
SpvOpUCountTrailingZerosINTEL "lowering" in vtn_handle_alu.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
v2: Rearranged and expand the comment about the optimizations applied to
the lowering. Suggested by Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")
v3: Add a new lower_usub_sat64 flag that only applies to the 64-bit
version of the nir_op_usub_sat instruction.
v4: Also enable the lowering when nir_lower_iadd64 is set.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")
v3: Add a new lower_hadd64 flag that only applies to the 64-bit versions
of the instructions.
v4: Also enable the lowering when nir_lower_iadd64 is set.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
uctz isn't added because it will implemented in the GLSL path and the
SPIR-V path using other pre-existing instructions.
v2: Avoid signed integer overflow for uabs_isub(0, INT_MIN). Noticed by
Caio.
v3: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN).
I tried the previous methon in a small test program with -ftrapv, and it
failed.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
v2: Re-write iadd64_saturate and isub64_saturate to avoid undefined
overflow behavior. Also fix copy-and-paste bug in isub64_saturate.
Suggested by Caio.
v3: Avoid signed integer overflow for abs_sub(0, INT_MIN). Noticed by
Caio.
v4: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN).
I tried the previous methon in a small test program with -ftrapv, and it
failed.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
These numbers are always confusing, and it's particularly so for this
field where it has a different meaning in different info structs.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
The arguments passed in were:
- prog->info.num_ssbos
- prog->nir->info.num_ssbos
- arbitrary values for standalone compilers
The num_ssbos should match between the prog's info and prog->nir's info
until this lowering happens.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
Gallium arbitrarily (it seems) put atomics below SSBOs, resulting in a
bunch of extra index management, and surprising shader code when you would
see your SSBOs up at index 16. It makes a lot more sense to see atomics
converted to SSBOs appear as magic high numbers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
This is required for the subgroupBroadcastDynamicId feature that was
added in Vulkan 1.2.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
To implement NIR-to-TGSI, we need to be able to get the size of the
uniform variable for the TGSI declaration, not just the
.driver_location. With its location in mesa/st, drivers couldn't link
to it from nir-to-tgsi.
This feels like a common enough function to want, so let's share it in
the core compiler.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
The only bit that gallium varied on was handling of bindless. We can
retain previous behavior for count_attribute_slots() by passing in
"true" (though I suspect this is just giving a silly answer to a silly
question), and delete our recursive function from mesa/st.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>