v2: update shader info input/output masks when pack components
v3: make sure interpolation loc matches, this is required for the
radeonsi NIR backend.
v4: 33dca36f4f fixed nir_gather_info to update outputs_read
correct, make sure we also adjust this correctly when
packing components.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
V2:
- fix matrix support, non-array matrices were being skipped in v1
v3:
- handle lowering of tcs output loads correctly
- correctly mark indirect locations for either in or out not both
when processing a stage.
- use nir_src_copy() when lowering stores.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint. This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
We have a nir_builder and it has an impl field.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This fixes issues seen with certain versions of Unreal Engine 4 editor
and games built with that using GLSL 4.30.
v2: add driinfo_gallium change (Emil Velikov)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
Acked-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is more inline with what the functions name suggests it should
do, and makes the code much easier to follow.
This will also make adding uniform packing support much simpler.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
nir_validate.c's #endif already had the correct NDEBUG comment
Fixes: dcb1acdea0 "nir/validate: Only build in debug mode"
Fixes: 9ff71b649b "i965/nir: Validate that NIR passes call nir_metadata_preserve()"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
driver_cache_blob was introduced with the i965 disk cache, it allows
us to simplify the cache a little and possibly offers some minor
speed improvements since we load the GLSL metadata and TGSI from
disk in one pass.
Using driver_cache_blob should also make it straight forward to
implement binary support for ARB_get_program_binary in gallium.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
GL doesn't have this, but some hardware supports it. This is convenient
for lowering tg4 to plain texture calls, which is necessary on Adreno
A4xx hardware.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
We don't support ARB_vertex_blend.
Note that the attribute aliasing check for ARB_vertex_program had to be
rewritten.
vbo_context: 20344 -> 20008 bytes
gl_context: 74672 -> 74616 bytes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.
Fixes: fd01840c0b ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438
Declare glsl_type::sampled_type as glsl_base_type as we do for the
base_type field. And make base_type a bitfield to save a few bytes.
Update glsl_type constructor to take glsl_base_type instead of unsigned
and pass GLSL_TYPE_VOID instead of zero.
No Piglit regressions with llvmpipe.
v2:
- Declare both base_type and sampled_type as 8-bit fields
- Use the new ASSERT_BITFIELD_SIZE() macro.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gather operations in both GLSL and SPIR-V require a sampler. Fixes
gathers returning garbage when using separate texture/samplers (on AMD,
was using an invalid sampler descriptor).
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We should use the result type of the OpSampledImage opcode, rather than
the type of the underlying image/samplers.
This resolves an issue when using separate images and shadow samplers
with glslang. Example:
layout (...) uniform samplerShadow s0;
layout (...) uniform texture2D res0;
...
float result = textureLod(sampler2DShadow(res0, s0), uv, 0);
For this, for the combined OpSampledImage, the type of the base image
was being used (which does not have the Depth flag set, whereas the
result type does), therefore it was not being recognised as a shadow
sampler. This led to the wrong LLVM intrinsics being emitted by RADV.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is what we do in the condition too, so it makes sense.
v2: Only compute without_array() once (Ilia).
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Overlooked initially, be we need to remap the SSBO index for this as
well.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
According to the GLSL ES 3.20, GLSL 4.50, and GLSL 1.20 specs:
"To force all output variables to be invariant, use the pragma
#pragma STDGL invariant(all)
before all declarations in a shader."
Notably, this is only supposed to affect output variables. Furthermore,
"Only variables output from a shader can be candidates for invariance."
It looks like this has been wrong since we first supported the pragma in
2011 (commit 86b4398cd1).
Fixes dEQP-GLES2.functional.shaders.preprocessor.pragmas.pragma_fragment.
v2: Now that all cases are identical (other than compute shaders, which
have no output variables anyway), we can drop the switch statement
entirely. We also don't need the current_function == NULL check;
this was a hold over from when we had a single var_mode_out for both
function parameters and shader varyings, in the bad old days.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The GL spec will soon be revised to clarify that a buffer binding for
a transform feedback buffer is only required if a variable is actually
defined to use the buffer binding point. Previously a declaration for
the default transform buffer would make it require a binding even if
nothing was declared to use the default buffer.
Affects:
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
I think it's more clear to only call emit_access once. The only
difference between the two calls is the value of size_mul used for the
offset parameter... but you really have to look at it to be sure.
The s/is_64bit/is_double/ change is because there are no int64_t or
uint64_t matrix types.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
I was going to squash this with the previous commit, but there's a lot
of churn in that commit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Without this, the SPIR-V generator has to deal with a bunch of junk
like:
(swiz z (swiz xxx (swiz x (var_ref packed:binormal.z,light_dir))))
It seems better to cull that stuff out than to add code to deal with
it. The problem is the way swizzles to and from scalars have to be
handled in SPIR-V.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
If there is a long sequence of swizzled swizzles, compact all of them
down to a single swizzle.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: <thomashelland90@gmail.com>
This turned out to be a dead end, it is much easier and less error
prone to just cache the IR used by the drivers backend e.g. TGSI or
NIR.
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
... as can happen with various types like mat4, or else we'll smash the
stack writing past the end of components_local[].
Fixes: 5a0d3e1129 ("nir: Print the components referenced for split or
packed shader in/outs.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 94d669b0d2 ("glsl: enforce fragment shader input restrictions in
GLSL ES 3.10")
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This patch is mostly a patch done by Ilia Mirkin.
It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
v2: fix locations for TCS/TES/GS inputs and outputs (Ilia)
CC: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Most of NIR doesn't allow doing array indexing on a vector (though it
does on a matrix). However, nir_lower_io handles it just fine and this
behavior is needed for shared variables in Vulkan. This commit makes
glsl_get_array_element do something sensible for vector types and makes
nir_validate happy with them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
We were already validating that the parent type goes along with the
child type but we weren't actually validating that the parent type is
reasonable. This fixes that.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
as a uniform. This means that it cannot change across an invocation
such as a draw call or a compute dispatch. For compute shaders, we're
ok because we only ever use one dispatch size. For fragment, however,
the hardware dynamically chooses between SIMD8 and SIMD16 which violates
the spec. Instead, let's just pick a subgroup size based on the shader
stage. The fixed size we choose for compute shaders is a bit higher
than strictly needed but there's no real harm in that. The advantage is
that, if they do anything interesting with the value, NIR will see it as
an immediate and can optimize better.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Ballot intrinsics return a bitfield of subgroups. In GLSL and some
SPIR-V extensions, they return a uint64_t. In SPV_KHR_shader_ballot,
they return a uvec4. Also, some back-ends would rather pass around
32-bit values because it's easier than messing with 64-bit all the time.
To solve this mess, we make nir_lower_subgroups take a new parameter
called ballot_bit_size and it lowers whichever thing it gets in from the
source language (uint64_t or uvec4) to a scalar with the specified
number of bits. This replaces a chunk of the old lowering code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This lets you easily build integer immediates of arbitrary bit size.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL
but uvec4 when coming in from SPIR-V. Lowering based on type allows us
to nicely handle both.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>