Instead, check if it's a vector or scalar and use 1 explicitly. In FS
output case where we were only using it assert we don't have any arrays,
assert that directly.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22580>
Otherwise there's no way to target PIPE_FORMAT_B4G4R4A4_UNORM instead
of the B5G6R5 or B5G5R5A1 if those are supported. This gets the behavior
closer to the Windows PFD selection.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25979>
To get MacOS to build, some extra dependencies need to be added to a couple of build targets.
This mainly shows up when not installing the dependencies in the default prefix locations.
On MacOS, this happens when using a custom build of brew to install the dependencies to 'odd' locations.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25992>
Partial revert of: 68e8e40163 ("ci/zink: reduce premerge testing on a618 to ~ 12 minutes")
Weston is kept, and reduction to the 2 devices, because we have only 9
at maximum capacity available (with 3 parallel jobs we would need at least 10).
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25982>
Patch adds a fallback to calculate_tile_dimensions if such case is hit,
this happened when running CTS tests on simulation.
Fixes: d13c81a2c3 ("iris/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25989>
Removes the link-time dependency on tgsi_get_gl_varying_semantic from
Gallium auxiliary.
ps_prim_id_input linkage removed due to redundancy - the SPI SID is
calculated for VARYING_SLOT_PRIMITIVE_ID on both sides.
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25695>
Currently is run around 20 minutes.
Except this, also use Weston (more likely use for zink+a618)
as an effort to stop testing X11 bugs instead of Mesa.
Rest will get covered by nightly job.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25843>
The embedding driver could be failing the allocation for whatever
reason, in which case we should skip the surface state writes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25955>
ldunifa works exactly the same as ldunif: the hw will prefetch the
next 4 bytes after a read, so if a buffer is exactly a multiple of
a page size and a shader uses ldunifa to read exactly the last 4 bytes
the prefetch will read out of bounds and spam the error on the kernel
log. Avoid that by allocating extra bytes in this scenario.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25752>
Existing binning settings which are required for gfx10.3 and newer cause
performance drop. Keep existing settings for gfx10.3 and newer version
and follow previous rules to set values for gfx9 to improve performance
of gfx9.
Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25933>
Makes global load/stores coherent on a device level if requested by the
shader.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25937>
Host pointer allocations are all linear laid out, so just tell the drivers
in case they don't assume this implicitly.
Fixes: 71a9af4910 ("rusticl/mem: support read/write/copy ops for images")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25937>
This allows the hardware to behave as if TBIMR was disabled until a
polygon is processed which spans at least one tile. This is a rather
heavy-handed heuristic meant to prevent regressions in heavily
geometry-bound workloads that render large numbers of tiny primitives
much smaller than a TBIMR tile.
A particularly bad example of this was observed in SoTR, where certain
draw calls with a long-running VS and a mostly trivial PS render more
triangles than pixels, filling up the URB and TBIMR batch pretty
quickly, which causes EU utilization to tank (since once the URB has
filled up the parallelism of the VS is limited by the number of
polygons that fit in a TBIMR batch at the completion of each tile
walk, which isn't a lot in relation to the total EU count of a DG2),
and causes the bottleneck to be the rate at which the tile sequencer
performs additional tile passes, each one processing a small number
(<1024 polygons) of the hundreds of thousands of triangles of the
draw call.
Enabling this heuristic seems effective at avoiding that scenario in
SoTR among other titles (e.g. Total War Warhammer 3), but it's a bit
of a compromise since one could imagine cases where TBIMR is helpful
even if the geometry doesn't pass the box check, so a better heuristic
or a driconf rule may be useful in the future.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
This programs a TBIMR batch size equal to 128 polygons per slice in
order to match the hardware spec recommendation (BSpec 68436). This
has been confirmed to improve performance slightly relative to the
hardware default batch size of 256 polygons.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
This enables a couple of TBIMR performance tunables in
CHICKEN_RASTER_2 that default to disabled. TBIMR fast clip appears to
help slightly with some geometry-bound workloads. TBIMR open batch
allows the rasterizer to start working immediately on the first tile
of the framebuffer, even before the batch has been closed, which helps
reduce the latency cost of the tile walk.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
This sets up the basic parameters needed for tiled rendering based on
a back-of-the-envelope estimate of the amount of memory used by the
pixel pipeline during the tile pass. The actual cache footprint of a
tile can vary wildly based on runtime factors which aren't easily
predictable based on static analysis, so this is only intended to
provide a rough approximation within the right order of magnitude.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>