Take advantage of XOR associativity to break the loop-carried
dependency chain and help compiler auto-vectorization.
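As an illustration only (this is not the code touched by the commit), the
idea can be sketched with a hypothetical XOR fold over an array of words.
The function names and the choice of four accumulators are assumptions.
Because XOR is associative and commutative, splitting the fold across
independent accumulators gives the same result while removing the single
serial dependency on one variable:

```c
#include <stddef.h>
#include <stdint.h>

/* Serial form: every iteration depends on the previous value of acc. */
static uint32_t
xor_fold_serial(const uint32_t *words, size_t count)
{
   uint32_t acc = 0;

   for (size_t i = 0; i < count; i++)
      acc ^= words[i];

   return acc;
}

/* Reassociated form (hypothetical): four independent accumulators that
 * are only combined once at the end. */
static uint32_t
xor_fold_split(const uint32_t *words, size_t count)
{
   uint32_t acc[4] = {0, 0, 0, 0};
   uint32_t tail = 0;
   size_t i = 0;

   /* Main loop: the four XORs have no dependency on each other. */
   for (; i + 4 <= count; i += 4) {
      acc[0] ^= words[i + 0];
      acc[1] ^= words[i + 1];
      acc[2] ^= words[i + 2];
      acc[3] ^= words[i + 3];
   }

   /* Remaining 0 to 3 words. */
   for (; i < count; i++)
      tail ^= words[i];

   return acc[0] ^ acc[1] ^ acc[2] ^ acc[3] ^ tail;
}
```

Whether a compiler actually vectorizes the second form depends on the
target and optimization flags.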
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Empty Tile Elimination is an extension to Transaction Elimination that
allows skipping the pre-loading of clear tiles that were also clear at
the previous render on the selected RT. When empty_tile_write_enable is
set, crc_clear_color is written as-is as the CRC value for clear
tiles. If empty_tile_read_enable is set and a tile is clear at the
next render on the selected RT, the written CRC is compared to
crc_clear_color and the processing of the tile is short-circuited if
the values are equal.
This commit enables Empty Tile Elimination when supported. It also
fixes the crc_clear_color value in order to reflect changes of clear
values on any of the RTs. This is done by storing a hash of the clear
value channels of each cleared RT in the crc_clear_base sub-field.
Fixes: 5d5f7552a5 ("panfrost: XML-ify the multi-target framebuffer descriptors")
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Don't set the CRC buffer pointer or the row stride on v4. Even if the
image props request Transaction Elimination, setting only these fields
isn't enough to enable it. The CRC read/write enable fields would need
to be appropriately set on the framebuffer descriptor too.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
v6 supports Transaction Elimination with multiple RTs provided that
the write buffer size of the enabled color attachments for a tile
doesn't exceed 1600 bytes.
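A minimal sketch of such a check, assuming the per-tile write buffer size
of each RT is already known. The struct, the macro and the function name
are hypothetical and are not taken from the driver:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limit stated in the commit message. */
#define V6_TE_MAX_TILE_WB_BYTES 1600u

/* Hypothetical per-RT description used only for this sketch. */
struct rt_wb {
   bool enabled;           /* color attachment is enabled */
   uint32_t tile_wb_bytes; /* write buffer bytes used per tile */
};

/* Returns true if the enabled RTs fit within the v6 limit. */
static bool
v6_mrt_te_allowed(const struct rt_wb *rts, size_t rt_count)
{
   uint32_t total = 0;

   for (size_t i = 0; i < rt_count; i++) {
      if (rts[i].enabled)
         total += rts[i].tile_wb_bytes;
   }

   return total <= V6_TE_MAX_TILE_WB_BYTES;
}
```

How the per-RT size is derived from the format and tile size is left out
on purpose, since the commit message doesn't describe it.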
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Add the pan_crc_state structure and use it to store CRC state. The
struct only has a valid boolean for now but will be extended
later. This removes some explicit dereferencing, allows wrapping
state handling inside functions and helps readability.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Retrieve and cache temporary CRC info once at the beginning of
pan_emit_fbd(). This makes CRC info retrieval more localized and
avoids duplication.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Transaction Elimination on a RT is disabled until there's a full frame
render with all tiles forcefully written back. This is currently done
by letting the Gallium driver track states and fix up FB preload by
disabling clean_fragment_write on the pre-frame DCD and by setting the
pre-frame mode to "always" (instead of "intersect").
This commit forces the write-back of all the tiles by setting
clean_tile_write_enable on the FBD instead. This simplifies the code
and removes most of the CRC state tracking from the Gallium driver.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
This commit doesn't really change the selection logic but tries to
make the reasoning more straightforward and prepare for future commits
where the CRC state will be cached.
A usable RT must pass a few conditional checks like the availability
of a CRC buffer. A selected RT must be usable and either have a valid
CRC buffer or be fully covered. In the MRT case, the first usable RT
with a valid CRC buffer is selected. If no RT has a CRC buffer
initialized, then the first usable RT is selected provided it's fully
covered.
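The selection rules above can be sketched as follows. This is a
simplified illustration, not the driver code: the struct and function
names are hypothetical, the "usable" checks are reduced to a single
boolean, and full coverage is assumed to be a per-framebuffer property.

```c
#include <stdbool.h>

/* Hypothetical per-RT state used only for this sketch. */
struct rt_crc {
   bool usable;    /* passed the conditional checks (CRC buffer available, ...) */
   bool crc_valid; /* CRC buffer content is valid */
};

/* Returns the index of the selected RT, or -1 if none can be selected. */
static int
select_crc_rt(const struct rt_crc *rts, int rt_count, bool fully_covered)
{
   int first_usable = -1;

   for (int i = 0; i < rt_count; i++) {
      if (!rts[i].usable)
         continue;

      /* The first usable RT with a valid CRC buffer wins. */
      if (rts[i].crc_valid)
         return i;

      if (first_usable < 0)
         first_usable = i;
   }

   /* No valid CRC buffer: fall back to the first usable RT, but only
    * if the framebuffer is fully covered. */
   return (first_usable >= 0 && fully_covered) ? first_usable : -1;
}
```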
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
CRC RT selection for v5 and v6 (v4 isn't supported) currently returns
0 (instead of -1) as long as the CRC buffer is usable, without
checking its validity as is done for v7+. While this doesn't
incorrectly enable Transaction Elimination, it needlessly causes
dependent CRC code paths to be taken.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Arch v5 and v6 should test the AFBC render block size too. In the
non-AFBC case, there's no need to check the tile size, which is
already checked earlier by the caller.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Add a function to retrieve whether the area of a pan_fb_info data
structure is fully covered by the draw extent.
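A rough sketch of what such a helper could look like. The struct below is
a hypothetical stand-in for pan_fb_info, and the inclusive min/max extent
convention is an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant pan_fb_info fields. */
struct fb_area {
   uint32_t width, height; /* framebuffer size in pixels */
   uint32_t minx, miny;    /* draw extent, inclusive */
   uint32_t maxx, maxy;    /* draw extent, inclusive */
};

/* Returns true if the draw extent covers the whole framebuffer area. */
static bool
fb_area_fully_covered(const struct fb_area *a)
{
   if (a->width == 0 || a->height == 0)
      return false;

   return a->minx == 0 && a->miny == 0 &&
          a->maxx >= a->width - 1 &&
          a->maxy >= a->height - 1;
}
```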
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Restrict the creation of a CRC buffer for an image to the 1st mipmap
level. At emit_fbd() time, Transaction Elimination is only enabled if
CRC is enabled for the selected RT and if its first configured level
is 0.
This was previously enforced at the Gallium driver level but it needs
to be done at the lib level to later support PanVK too.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
It turns out we need the color sysvals recorded in system_values_read,
and PARAM_GEN is for point smoothing.
Acked-by: Pierre-Eric
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40556>
Some Max Payne 3 shaders are impacted by this, so it will probably fix
some issue there. The VK CTS isn't testing this, but it was verified to
fix a real problem by inserting 0 offsets into the instruction and
having CTS tests fail with the old ordering.
Totals from 3 (0.00% of 1163204) affected shaders:
CodeSize: 2496 -> 2736 (+9.62%)
Static cycle count: 732 -> 741 (+1.23%)
Fixes: ad01fbdda0 ("nak: Add a NIR texture lowering pass")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40957>
With the previous commit ("ac/surface: Filter swizzle modes for VCN"),
only video-compatible swizzle modes will be picked, so we can enable
tiling for VCN2+.
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40948>
This will allow compatible swizzle modes to be picked for RADV (radeonsi
filters modifiers when creating video surfaces).
This mirrors the logic from ac_modifier_supports_video, and in
addition ensures that XOR swizzle modes are disabled for image arrays
because VCN does not support slice indices.
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40948>
Map X6R10X6G10X6B10X6A10_UNORM to the native R10X6G10X6B10X6A10X6_UNORM
HW format on PAN_ARCH >= 11 where it is supported.
Enable the extension with formatRgba10x6WithoutYCbCrSampler in the
physical device, allowing VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16
to be used as a regular color format without YCbCr sampler conversion.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>