This decreases memory usage, because serialized NIR is more compact.
If shader_has_one_variant is true and the shader is uncached, the first
variant is created from nir_shader, otherwise the first variant and
all other variants are created from serialized NIR.
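
The selection rule described above can be sketched as a small decision
function. This is only an illustration: the enum, the function name and the
is_cached/variant_index parameters are hypothetical and do not come from the
state tracker code.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum variant_source {
   VARIANT_FROM_NIR_SHADER,      /* built directly from the in-memory nir_shader */
   VARIANT_FROM_SERIALIZED_NIR,  /* built by deserializing the compact NIR blob */
};

static enum variant_source
choose_variant_source(bool shader_has_one_variant, bool is_cached,
                      unsigned variant_index)
{
   /* Only the first variant of an uncached, single-variant shader is
    * created from nir_shader. */
   if (variant_index == 0 && shader_has_one_variant && !is_cached)
      return VARIANT_FROM_NIR_SHADER;

   /* Every other case goes through serialized NIR. */
   return VARIANT_FROM_SERIALIZED_NIR;
}
```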
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
A later commit will add back st_vertex_program as a subclass of
st_common_program.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
This matches the uncached codepath.
affected_states was used before initialization, which was technically
a bug, but probably not reproducible due to _NEW_PROGRAM rebinding
everything.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
We could enable it on GFX10 if LLVM wasn't used as a fallback for
unsupported stages. Note that the CTS only tests it if
VK_KHR_shader_float16_int8 is enabled, even though it's not a
requirement.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
The multiplication reduction is larger than it could be, but it should be
easier to implement this way.
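
For context, a 64-bit integer add or multiply has to be assembled from
several 32-bit operations, which is why these reductions take more
instructions than their 32-bit counterparts. The sketch below shows the
arithmetic in plain C. It is not the ACO implementation, and the function
names are made up for illustration.

```c
#include <stdint.h>

/* 64-bit add built from two 32-bit adds, propagating the carry-out of the
 * low half into the high half. */
static uint64_t
add64_from_32(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   uint32_t lo = a_lo + b_lo;
   uint32_t carry = lo < a_lo;        /* 1 if the low add wrapped around */
   uint32_t hi = a_hi + b_hi + carry;

   return ((uint64_t)hi << 32) | lo;
}

/* Low 64 bits of a 64-bit multiply built from 32-bit pieces. */
static uint64_t
mul64_from_32(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   /* Full 32x32 -> 64-bit product of the low halves. */
   uint64_t lo_lo = (uint64_t)a_lo * b_lo;

   /* The cross terms only contribute to the high 32 bits, so they can be
    * computed modulo 2^32. The a_hi * b_hi term falls outside 64 bits. */
   uint32_t cross = a_lo * b_hi + a_hi * b_lo;
   uint32_t hi = (uint32_t)(lo_lo >> 32) + cross;

   return ((uint64_t)hi << 32) | (uint32_t)lo_lo;
}
```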
No failures with dEQP-VK.subgroups.*int64* except those caused by LLVM
being used for other stages.
v2: don't call setFixed() for v_add carry-out, since setHint sets physReg
v3: add and use emit_vadd32() helper
v4: use num_opcodes instead of last_opcode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
Should make 64-bit integer reductions easier to implement.
v4: use num_opcodes instead of last_opcode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
This extension allows subgroup operations to be used with 8-bit and
16-bit types. It is untested on GFX6-GFX7, and most subgroup operations
are broken on GFX10, so don't enable it there for now. It is not enabled
on ACO because ACO still doesn't support 8-bit/16-bit types.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The return type is always the src type (32 or 64 bits).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The opcode selection should rely on the source type, not on the return
type, which is always a boolean anyway, so vote_feq was never selected.
For OpSubgroupAllEqualKHR it's always an integer comparison.
This fixes some VK_KHR_shader_subgroup_extended_types tests with RADV.
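
A minimal sketch of the corrected selection, keyed on the source type. The
enums and the function are hypothetical stand-ins, not the spirv_to_nir
code; VOTE_IEQ and VOTE_FEQ stand for the vote_ieq and vote_feq opcodes.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, for illustration only. */
enum base_type { TYPE_BOOL, TYPE_INT, TYPE_FLOAT };
enum vote_op { VOTE_IEQ, VOTE_FEQ };

static enum vote_op
choose_vote_op(enum base_type src_type, bool is_subgroup_all_equal_khr)
{
   /* OpSubgroupAllEqualKHR is always an integer comparison. */
   if (is_subgroup_all_equal_khr)
      return VOTE_IEQ;

   /* Otherwise decide from the source type. The result type is always a
    * boolean, so keying on it could never select the float opcode. */
   return src_type == TYPE_FLOAT ? VOTE_FEQ : VOTE_IEQ;
}
```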
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Always enabled; this doesn't require any driver work, it's just
core mesa bits.
quick_gl.txt is also updated because previously piglit ext_dsa
tests were skipped.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The spec is unclear on how to handle the buffer argument so we reuse
the logic from the EXT_direct_state_access spec.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We can't simply alias ARB_direct_state_access functions because
those fail if the vao has never been bound before.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The wording in ARB_framebuffer_no_attachments and EXT_direct_state_access
is different.
In the former, framebuffer names must have been generated with
glGenFramebuffers before they are used with the named functions.
In the latter, framebuffer names have no such constraint, so we can't use
the _mesa_lookup_framebuffer_dsa function.
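
The difference can be sketched with a toy name table. Everything here
(struct fb_table, lookup_named_fb, the result enum, the table size) is
hypothetical and only models the "must have been generated" check, not
Mesa's actual lookup code or GL error codes.

```c
#include <assert.h>
#include <stdbool.h>

#define FB_TABLE_SIZE 16

/* Hypothetical toy model: tracks which names came from glGenFramebuffers. */
struct fb_table {
   bool generated[FB_TABLE_SIZE];
};

enum lookup_result { LOOKUP_OK, LOOKUP_REJECTED };

static enum lookup_result
lookup_named_fb(const struct fb_table *table, unsigned name,
                bool ext_dsa_semantics)
{
   assert(name < FB_TABLE_SIZE);

   /* ARB_framebuffer_no_attachments wording: the name must have been
    * generated beforehand. */
   if (!ext_dsa_semantics && !table->generated[name])
      return LOOKUP_REJECTED;

   /* EXT_direct_state_access wording: no such constraint on the name. */
   return LOOKUP_OK;
}
```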
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
It doesn't make sense to have nonlinear layouts for a buffer that can be
accessed as direct memory for a compute kernel. Turn that off so things
work as expected.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We can take the OpenCL kernel inputs and interpret them as uniforms by
simply reusing the Gallium callback.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Chrome OS would like to import and render to any supported format that has
a corresponding display plane format, and this prevents throwing
framebuffer incomplete for FBOs using these textures.
See: crbug.com/949260
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>