This is needed for RADV to support explicit component packing.
This is also required to use the new NIR component splitting /
packing passes.
V2:
- add commponent packing support for interpolate_at* intrinsics
- improve store packing support when not all varyings are scalar
as spotted by Bas the store source was incorrectly offset.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.
I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.
This at least fixes radv dEQP-VK.texture.shadow.* on VI.
Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Otherwise we just need to write them to the tf ring.
this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
They don't take a single wave anymore and we need the barriers.
Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
For Vulkan SPIR-V the spec states
fma() Inherited from OpFMul followed by OpFAdd.
Matt says the backend will do the right thing depending on the
hardware being compiled for, if you use the fmuladd intrinsic.
Using the Mad Max pts test, on high settings at 4K:
CHP: 55->60
HGDD: 46->50
LM: 55->60
No change on Stronghold.
Thanks to Feral for spending the time to track this down.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.
The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.
Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*
Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Avoid a v_cndmask: the absolute value is free due to input modifiers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Fix the custom cube coord selection sequence to be identical to
the hardware v_cubesc/tc and OpenGL spec. Affects texture sampling
with user-provided derivatives.
Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Redundant with the recently added ac_llvm_context::chip_class.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This builds, installs, and has been tested on a r290x (Hawaii) with the Vulkan
CTS. It dies horribly in a fire at the same point for the meson build as the
autotools build.
v2: - enable radv by default
- add shader cache support and enforce that it's built for radv
v3: - Fix typo in meson_options (Nicholas)
- strip trailing 'svn' from llvm version before setting the version
preprocessor flag (Bas)
- Check for LLVM module requirements
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
If we don't have a depth piece, we don't get a correct
swizzle mode and we hit an assert in addrlib.
In case of no depth get the preferrred swizzle mode for
stencil alone.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>