These can be used directly in vk_meta_rendering_info.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31159>
The first step conditionally updates the tiler flags based on dirty bits,
and the second step is the override flags, which are unconditionally
updated at draw time.
Use pan_pack_nodefaults() to avoid default initialized fields.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31159>
We need both the same-invocation usage mask and cross-invocation usage
mask. The AMD reason is below.
Cross-invocation TCS input access doesn't prevent the same-invocation
fast path in AMD hw because it's just a different way to load the same
data, and we want to use both paths for the same TCS input based on
the load instruction. The fast path can't be used for indirect access,
which is gathered separately for same-invocation access.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31645>
This is a per-component bit mask (0xf for each output).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 0e0c2574d1 ("radv: Add shader stats for inputs and outputs.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31593>
Unigine Heaven with AMD_DEBUG=mono has incorrect rendering on gfx11
because it doesn't set nir_io_semantics::dual_source_blend_index for
the second output, resulting in garbage asm.
Instead of trying to find out what's wrong, I decided to rewrite this
to make it the same as the LLVM IR path. It simplifies the code and fixes
Unigine Heaven with AMD_DEBUG=mono.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31669>
This also means the infrastructure added by @gallo in 1dc64d0613
("ci: Use merge-skips files during merge pipelines") can be used and all
the manual adding of these files can be dropped, reducing the likeliness
of bugs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31739>
Add opcodes for VOTE_ALL, VOTE_ANY and VOTE_EQUAL. The first two
are also used for the quad variants. Move their lowering from
NIR conversion to brw_lower_subgroup_ops.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>
If we have a non-register-aligned source, MOV it to a new register
so that the invariant expected when generating SHADER_OPCODE_BROADCAST
is respected.
Added to ensure a later patch won't hit the `src.subnr == 0` assertion
in brw_broadcast() generation code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>
Include in the helper which already take care of using exec_all() and
taking the first component of the result. Both are expected by
SHADER_OPCODE_BROADCAST.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>
ANGLE has a waiver for certain XFB tests, but this wasn't properly applied
on Alder Lake and these tests weren't skipped there.
Add a global angle-skips.txt file so that we don't have to keep copy-pasting
these skips.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31721>
There is a fresher device type with a CML GPU, with also a bigger number of
boards. Those are more reliable, so also we can remove the manual rules.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26830>
Reverse engineered with the following OpenCL kernel:
kernel void add(global float* out, float a, float b) {
float r;
_viv_asm(CLAMP0MAX, r, a, b);
out[0] = r;
}
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31674>
Switch NAK to it.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31706>
For surfaces without a modifier, the surf_size check wasn't
necessary, but it was also invalid since surf_size is set later
(in gfx12_compute_miptree).
Since it's not required anyway, drop this check.
Fixes: 060d5dacfd ("ac: add gfx12 DCC shared code")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31683>
ffs(VRAM, GTT) returns the GTT bit as it's the smaller.
Simplify the code by explicitely selecting VRAM when both
domains are active, otherwise assert that only 1 bit is set.
Fixes: 593f72aa21 ("winsys/amdgpu-radeon: rework how we describe heaps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31683>
ANGLE currently pulls absolutely loads of stuff that we don't need. Fix
it up so we don't need to do that anymore, so it's much faster to build.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31716>
Helps for the case when max_distance is set to ~0, where the pass would now
only create groups of two loads together due to overflow. Found while
experimenting with this pass on r300, however the only driver currently
affected is i915.
With i915 this change gains around 20 shaders in my small shader-db
(most notably some GLMark2, Unigine Tropics, Tesseract, Amnesia) at
the expense of increased register pressure in few other cases.
I'm assuming this is a good deal for such old HW, and this seems like what
was intended when the pass was introduced to i915, but anyway this
could be tweaked further driver side with a more optimized max_distance
value. Only shader-db tested.
Relevant i915 shader-db stats (lpt):
total tex_indirect in shared programs: 1529 -> 1493 (-2.35%)
tex_indirect in affected programs: 96 -> 60 (-37.50%)
helped: 29
HURT: 2
total temps in shared programs: 3015 -> 3200 (6.14%)
temps in affected programs: 465 -> 650 (39.78%)
helped: 1
HURT: 91
GAINED: 20
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: GKraats <vd.kraats@hccnet.nl>
Fixes: 33b4eb149e
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31529>