Setting zs_update_operation = FORCE_EARLY for color preloads triggers
hangs in the dEQP-VK.rasterization.rasterization_order_attachment_access
depth/stencil tests. I didn't determine why this is the case, but the
DDK uses WEAK_EARLY for color preload, and doing the same here fixes the
hang.
WEAK_EARLY requires ATEST, so I removed .is_blit=true from the compiler
inputs.
There aren't any known hangs outside of the one set of vulkan CTS tests,
and in particular no known hangs in the gallium driver. Because the
reason for the hangs is not understood, I also changed the gallium
driver to use WEAK_EARLY, under the assumption that the same conditions
that trigger the hang in vulkan might occur in GL.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: edd98aac3f ("panfrost: Add support for native wallpapering on Bifrost")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32954>
On bifrost, zs_update_operation=STRONG_EARLY, is equivalent to
WEAK_EARLY except that it may test/update without waiting for pixel
dependencies if it can prove that the test will pass.
STRONG_EARLY no longer exists on valhall, and the value 2 is reserved.
Even on bifrost, all of our current uses of STRONG_EARLY are
incorrect. For color preload, the shader skips ATEST, so FORCE_EARLY is
required. In the no-FS case, ATEST is skipped by definition (because
there is no shader), so FORCE_EARLY is required.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 519643bbe0 ("panfrost: Adjust the renderer state definition")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32954>
The pass is currently turning this :
mul(16) %17:F, %1:F, 0.5f
mul(16) %19:F, %1:F, -0.5f
(+f0.0) sel(16) %27:UD, %19:UD, %17:UD
into this :
{ 12} mul(16) %17:F, %1:F, 0.5f
{ 14} (+f0.0) sel(16) %27:UD, -%17:F, %17:UD
The type change in the SEL instruction incurs a type conversion that
produces invalid values.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 234c45c929 ("intel/brw: Write a new global CSE pass that works on defs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12477
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33070>
These failures were all caused by CTS bugs affecting Vulkan 1.0. But
since we now expose Vulkan 1.1 on V10, these issues no longer affect us.
Let's update the results to reflect this.
Fixes: 1a81bff6aa ("panvk: expose vk1.1 on v10 hardware")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33180>
This exptension requires Vulkan 1.1, which we don't expose there yet.
While we're at it, put panvk into the normal sorted order of the list of
drivers.
Fixes: d46b80249b ("panvk: enable subgroupSizeControl")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33180>
On mt8192, the 6.13 kernel fails to reliably initialize the GPU, causing
a fallback to llvmpipe. Return to the 6.6 kernel until this is resolved.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33181>
We don't need such argument and this way of specifying them is deprecated anyway.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Sergi Blanch Torné <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33173>
Since mesa is used in drm-ci, the artifacts in drm-ci jobs have
the 'mesa' prefix. This change replaces the hardcoded 'mesa'
prefix in the artifacts name with the CI_PROJECT_NAME variable.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33154>
a7xx introduced support for aliasing render target components using
alias.rt. This allows components to be bound to uniform (const or
immediate) values in the preamble:
alias.rt.f32.0 rt0.y, c0.x
alias.rt.f32.0 rt1.z, (1.000000)
This aliases the 2nd component of RT0 to c0.x and the 3rd component of
RT1 to the immediate 1.0. All components of all 8 render targets can be
aliased.
This is implemented by replacing const and immediate components of the
RT sources of end with alias.rt instructions in the preamble. If no
preamble exists, an empty one is created.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
a7xx introduced support for aliasing render target components using
alias.rt. This allows components to be bound to uniform (const or
immediate) values in the preamble:
alias.rt.f32.0 rt0.y, c0.x
alias.rt.f32.0 rt1.z, (1.000000)
This aliases the 2nd component of RT0 to c0.x and the 3rd component of
RT1 to the immediate 1.0. All components of all 8 render targets can be
aliased.
In addition to using alias.rt, the hardware needs to be informed about
which render target components are being aliased using the
SP_PS_ALIASED_COMPONENTS{_CONTROL} registers. This commit implements
those registers.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
a7xx introduced support for aliasing render target components using
alias.rt. This allows components to be bound to uniform (const or
immediate) values in the preamble:
alias.rt.f32.0 rt0.y, c0.x
alias.rt.f32.0 rt1.z, (1.000000)
This aliases the 2nd component of RT0 to c0.x and the 3rd component of
RT1 to the immediate 1.0. All components of all 8 render targets can be
aliased.
In addition to using alias.rt, the hardware needs to be informed about
which render target components are being aliased using the
SP_PS_ALIASED_COMPONENTS{_CONTROL} registers. This commit implements
those registers.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
The clear FS shaders will statically use slot numbers from 0 up to the
number of supported render targets. However, the driver will remap those
slots to the actual render targets being cleared.
This means that ir3 should not make any assumptions about the static
slot number in those cases. This is especially important when
implementing alias.rt, which statically encodes the render target.
Add an new ir3_shader_option (fragdata_dynamic_remap) which allows the
driver to indicate to ir3 that it will perform such dynamic remapping.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
When alias.rt is used to alias certain output components, we might end
up with a situation where some, but not all, of the components of
collects end up being unused. This is currently not supported which
means we end up with useless moves (coming from copy lowering) for
aliased output components.
Fix this by adding support for partial wrmasks for collects in DCE. The
wrmasks are initially zeroed out and then updated based on the wrmask of
their users. Sources of collects for which the corresponding dst ends up
being unused are treated as unused as well. This allows us to remove
the useless output moves by simply updating the wrmask of the end
sources.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
Allocate alias registers for an alias group while trying to minimize the
number of needed aliases. That is, if the allocated GPRs for the group
are (partially) consecutive, only allocate aliases to fill-in the gaps.
For example:
sam ..., @{r1.x, r5.z, r1.z}, ...
only needs a single alias:
alias.tex.b32.0 r1.y, r5.z
sam ..., r1.x, ...
Also, try to reuse allocations of previous groups. For example, this is
relatively common:
sam ..., @{r2.z, 0}, @{0}
Reusing the allocation of the first group for the second one gives this:
alias.tex.b32.0 r2.w, 0
sam ..., r2.z, r2.w
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
alias.tex allows us to construct an "alias table" that creates a mapping
between virtual alias registers and concrete GPRs, consts, or
immediates. The following texture instruction will lookup its sources in
this table and use the mapped value instead. This has a few advantages:
- We don't have to allocate consecutive registers (necessary for many
tex sources) as we can just map them to consecutive alias registers.
- We don't have to allocate GPRs at all for consts and immediates.
- There's no delay penalty when initializing alias registers with consts
or immediates.
For example, this code:
mov.u32u32 r1.x, r3.z
mov.u32u32 r1.y, c0.x
mov.u32u32 r1.z, 0
(rpt2)nop
sam ..., r1.x, ...
Can be implemented as follows:
alias.tex.b32.2 r40.x, r3.z
alias.tex.b32.0 r40.y, c0.x
alias.tex.b32.0 r40.z, 0
sam ..., r40.x, ...
Note that the alias registers (r40.xyz in this case) do not occupy GPR
space.
(More intelligent allocation strategies are possible; e.g., just mapping
r3.w and r4.x to c0.x and 0. This is implemented by the next commit.)
Support for alias.tex is implemented in two passes in ir3.
In a first pass, sources of tex instructions are replaced by alias
sources (IR3_REG_ALIAS) as follows:
- movs from const/imm: replace with the const/imm;
- collects: replace with the sources of the collect;
- GPR sources: simply mark as alias.
This way, RA won't be forced to allocate consecutive registers for
collects and useless collects/movs can be DCE'd. Note that simply
lowering collects to aliases doesn't work because RA would assume that
killed sources of aliases are dead, while they are in fact live until
the tex instruction that uses them.
The second pass inserts alias.tex instructions in front of the tex
instructions that need them and fixes up the tex instruction's sources.
This pass needs to run post-RA as discussed above. It also needs to run
post-legalization as all the sync flags need to be inserted based on the
registers instructions actually use, not on the alias registers they
have as sources.
This commit uses a very simple allocation strategy for alias registers:
simply allocate consecutive registers starting from r40.x. Note that
this works because the alias table is reset after a tex instruction is
executed so we don't have to worry about aliasing a live register.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
Alias registers allow us to allocate non-consecutive registers and remap
them to consecutive ones using alias.tex. We implement this by adding
the sources of collects directly to the sources of their users. This
way, RA treats them as scalar registers and we can remap them to
consecutive registers afterwards. To keep track of the scalar sources
that should be remapped together, the IR3_REG_FIRST_ALIAS flag is
introduced. Every source of such an "alias group" will have the
IR3_REG_ALIAS set, while the first one will also have
IR3_REG_FIRST_ALIAS set.
This commit also adds a number of helpers to iterate over sources while
keeping track of the original src index (i.e., before they were expanded
to alias goups), and to iterate the sources within an alias group. It
also introduces a new notation (@{regs...}) to clearly show alias groups
when printing instructions.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
Take the properties of alias.{rt,tex} and its registers into account:
- Don't count alias registers for GPR usage;
- Allow all immediates in alias regs;
- Fix properties like is_barrier and (ss) support;
- alias.rt dst is not a GPR, don't use it in legalize/postsched to track
dependencies;
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
The UNK field encodes the table size for alias.tex: the first alias.tex
instruction uses it to indicate how many follow (i.e., it is the total
table size minus one).
Also switch from using a src to a cat7 field to store this value which
makes it a bit easier to handle.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
The alias scope and type bits are intertwined in the encoding:
- bit 47: low scope
- bit 48: type
- bit 49: high scope
- bit 50: type size
Combining the low and high scope bits, the value is used as follows:
- 0: tex
- 1: rt
- 2: mem
- 3: mem
I don't know what the difference between 2 and 3 is. The blob currently
doesn't use mem at all.
The type bit seems to be used to make a distinction between floating
point (f) and integer (b) sources. There doesn't seem to be any
functional difference and it only affects how immediates are displayed.
Note that I haven't exactly mimicked the blob in these cases:
- alias.tex.f16/32: the blob uses b16/32 while printing immediates in
floating point notation. I think it make more sense to use f16/32.
- alias.rt.b16/32: the blob uses i16/32 here. I think it makes more
sense to stick to a single notation (b).
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31222>
There is a single swap field for each color attachment, regardless of
whether it's in GMEM or not, and this does appear to be used in
GMEM mode when MUTABLEEN is set on the attachment. This means that when
a color attachment has a non-identity swap because it's mutable on a750,
we have to use the same corresponding swap when it's a source in a
GMEM resolve. When using the fastpath, we have to make sure that the
swaps match because there aren't separate fields for GMEM and sysmem
swap.
This fixes dEQP-VK.image.mutable.2d.*_b8g8r8a8_unorm_draw_copy_resolve
with TU_DEBUG=gmem.
Fixes: 247d11d635 ("tu: Allow UBWC with images with swapped formats.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33115>
We don't currently support building acceleration structures on the CPU
or indirect building in the common framework, and drivers using it don't
either, but drivers have to return non-NULL entrypoints for CPU building
functions if they claim to support VK_KHR_acceleration_structure. Add
stub entrypoints here so that drivers don't have to have this
boilerplate.
Fixes dEQP-VK.api.version_check.entry_points on turnip.
Fixes: 671e3a65a6 ("tu: Support VK_KHR_acceleration_structure")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33153>