Move shared variables and configuration for the android-angle-venus-anv
job into a new template so that it can be reused by other jobs.
This follows the structure used everywhere else.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35737>
Update the kernel to enable CONFIG_LRU_GEN.
This is the only change from the previous kernel build.
Mutli-gen LRU improves how Linux handles high memory pressure, which
allows Cuttlefish to boot on devices with limited RAM.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35737>
Now that the workaround for stencil valuemask 0 is gone on more
recent cores, we can drop the failures that were caused by this
workaround.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35704>
Older cores have a hardware bug where the stencil test misbehaves when
the valuemask is 0 in the back stencil. The workaround is to effectively
switch to single sided stencil and hope for the best. This seems to be
okay for most actual use-cases and tests, as a 0 valuemask is quite rare.
However, as 0 is a valid value for the mask the workaround breaks some
edge cases. Newer cores don't seem to require the workaround and operate
correctly with all stencil valuemasks, so drop the workaround for them.
I don't know which feature bit tells us about the bugfix. I made an
educated guess by checking behavior on multiple GPUs: GC880r5106 and
GC2000r5108 need the workaround, GC3000r5450, GC7000r6204 and GC600r4653
work fine without it. By comparing the set BugFixes feature bits between
the cores and the fact that BugFixes15 already fixes some PE issues, I
think this one is the most likely candidate.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35704>
Drop the check for front stencil enable when selecting which stencil
ref value to put into the back stencil state. If the front stencil
isn't enabled then stencil operations as a whole aren't enabled and
we don't care which value ends up in the state.
While at it fix indentation.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35704>
There's a few things that will prevent enabling the more efficient
PE_COLOR_FORMAT_OVERWRITE mode, but pixel dicard due to alpha or
stencil testing isn't one of those conditions, as this behaves the
same way as pixel discard due to a failed depth test.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35704>
There is no question anymore: we need to swap front/back stencil
depending on the frontface winding order, as the hardware doesn't
have a front face configuration option. This is what is implemented
in the driver.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35704>
These temps are used to create the scratch_rsrc. Spilling them will
never benefit anything, because assign_spill_slots will insert code
that keeps them live. Since the spiller assumes all spilled variables
to be dead, this can cause more variables being live than intended and
spilling to fail.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
movs supports the same conversions as cov which means that any cov of
its dst can be folded into the movs if all uses of its dst are the same
type of cov.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32624>
movs is just nir_intrinsic_read_invocation so this is a matter of
disabling the current lowering to nir_intrinsic_read_invocation_cond_ir3
and adding lowering to movs.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32624>
movs works like subgroupBroadcast/OpGroupNonUniformBroadcast. The fiber
id can either be an immediate or a value in a0.x. In the latter case,
the value has to be dynamically uniform or else the behavior is
undefined. The dst register has to be shared or else the behavior seems
to be equivalent to a normal mov (i.e., the fiber id is not taken into
account anymore).
movs can do a cov on the broadcasted value. It works exactly like a
normal cov except that using u8 as the src type does not seem to work at
all.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32624>
The emitted mov had a hard coded dst type of u32. This was fine because
we needed this to work around a HW bug with shared half movs. However,
since this bug was fixed in a7xx, we need to support half movs as well.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32624>
update_renames() assumes that killed operands are already removed from
the register file, except for precolored and copy-kill operands.
When dealing with vector-operands, however, unrelated operands might
also be moved, in order to make space.
Fixes: fb689f133e ('aco/ra: handle register assignment of vector-aligned operands')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35735>
If all draw calls of a job were submitted with GL_RASTERIZER_DISCARD
enabled we can avoid all the rasterization by not submitting the
supertile coordinates in the CLE.
There is a performance improvement in scenarios where only running the
geometry stages is needed like using TF. Altought the load/stores were
already avoided. Before this patch, the FS was still being executed for
each tile.
It helps on manhattan benchmarks.
fps_avg helped: gl_manhattan.trace: 12.71 -> 13.16 (3.54%)
fps_avg helped: gl_manhattan31.trace: 7.86 -> 8.02 (2.03%)
total fps_avg in affected (through threshold) runs: 20.57 -> 21.18 (2.96%)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35739>
The new layout affects the whole buffer so it needs to be done
on a full clear.
This fixes this piglit test on a RX 6800 XT:
ext_framebuffer_multisample-accuracy 6 depth_resolve small depthstencil
Fixes: 75a03d733a ("radeonsi: simplify and fix enable_tc_compatible_htile_next_clear logic")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35582>
cs_create returns a bool not an int so check correctly for
success (sdma was still used because sdma_cs would be non
NULL when entering this function the second time).
While at it update debug_flags to not retry creating sdma_cs
if it failed.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35582>
Otherwise, outputs_read/outputs_written might not be up-to-date
(mostly after nir_remove_dead_variables) and remove_point_size() might
reach an assertion later because the output variable isn't found.
It seems better to run nir_shader_gather_info() at the very end of
radv_optimize_nir() which can change a lot of things anyways.
No fossils-db changes.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35707>
The register footprint could limit occupancy. We need to take this into
account to avoid deadlocks when a kernel is using barriers.
Fixes: 6d85cd6a3b ("freedreno: Implement get_compute_state_info for Adreno 6xx/7xx")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35745>
fourcc_to_pipe_format() was using the endian specific pipe
formats but drilConfigs which guards the supported formats
was using the little endian definitions directly so we would
always skip the formats on big endian. The little endian
pipe format is the correct one to use since that is how
DRM_FORMAT_* formats are defined.
Fixes: 20b3400701 ("dril: rework config creation")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35726>
Assembler supports them, so allow them on @-macro lines. For now
we don't bother with multiline comments, if becomes a thing we
can add them later.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35699>
After commit 88309a9818, SFID names were renamed
- "dp data 1" became "hdc1"
- "thread_spawner" became "ts/btd"
Update macros in executor to use the new SFID names so the
generated assembly can be parsed correctly.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35701>