This adds an idiomatic way to insert NIR function calls with the builder. Since
functions have variable numbers of arguments, this is a variadic function.
v2: Define with a variadic macro instead, for safety with the argument count.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>
It is close enough, and a lot better than the defaults when gitlab doesn't
recognize the file format as currently happens for .cl
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25498>
In order to include a reference to the Raspberry Pi 5, and that the
support for 3.3 and 4.1 got dropped.
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25851>
This has been confusing for some time, as from a xml file with the
suffix v33 (so suggesting just one version) we were generating the
headers for v33, v40, v41 and v71.
So now there is a header for the vc4 driver, and one header for the
v3d/v3dv (so v3d "platform") drivers.
FWIW, this means that now the name of the original xml and the header
files generated doesn't maintain a so similar pattern, but again the
equivalence were not there anyway.
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25851>
Add two unit tests related to the LAVA job definition.
test_generate_lava_job_definition_sanity checks for the most important
fields, deploy actions, namespaces etc.
test_lava_job_definition compares the generated definition with static
skeleton YAML files committed inside tests/data folder.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
Simplify both UART and SSH job definitions module to share common
building blocks themselves.
- generate_lava_yaml_payload is now a LAVAJobDefinition method, so
dropped the Strategy pattern between both modules
- if SSH is supported and UART is not enforced, default to SSH
- when SSH is enabled, wrap the last deploy action to run the SSH server
and rewrite the test actions, which should not change due to the boot
method
- create a constants module to load environment variables
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
To absorb complexity from the building blocks to generate job
definitions for each mode:
- fastboot-uart
- uboot-uart
- uboot-ssh
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
Break it to smaller pieces with variable size (fastboot has 3 deploy
actions and uboot only one) to build the base definition nicely in the
end.
Extract kernel/dtb attachment and init_stage1 extraction into functions
to be later reused by SSH job definition.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
The LAVA job submitter is being used by other fd.o projects, such as
`drm/ci`, so let's make it generate more generic job definitions and
test cases.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
Fixes performance regression introduced by prior refactoring of
pipe control code that unnecessarily added CS_FLUSH to query start
and end. Issue was diagnosed by Ben L (thank you!)
Confirmed this restores performance on:
* Borderlands3 +2%
* Payday +3%
* Factorio +3%
* HogwartsLegacy +4%
* Ghostrunner +7%
Fixes: 6dc95685 (convert genX_query pipe controls to use pc helper)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25983>
Partial revert of: 68e8e40163 ("ci/zink: reduce premerge testing on a618 to ~ 12 minutes")
Weston is kept, and reduction to the 2 devices, because we have only 9
at maximum capacity available (with 3 parallel jobs we would need at least 10).
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25982>
This fixes intel_clc builds with llvm 17 on gfx125_bvh_build_DFS_DFS
where it dies in the lower indirect derefs pass.
Co-authored-by: Dave Airlie <airlied@redhat.com>
Fixes: 4a4e175738 ("nir: Support deref instructions in lower_var_copies")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25536>
The usage after spilling is usually either the same as before or the
maximum.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25559>
This lets us measure optimizations without interference of waitcnt
instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25559>
A somewhat random collection of fossils:
N Min Max Median Avg Stddev
x 6 16.59 16.61 16.605 16.603333 0.0081649658
+ 6 15.99 16 16 15.998333 0.0040824829
Difference at 95.0% confidence
-0.605 +/- 0.00830327
-3.64385% +/- 0.0485573%
(Student's t, pooled s = 0.00645497)
I'm not sure if nir_opt_if and nir_opt_loop_unroll are actually idempotent
or not.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24197>
For example, in the loop:
while (more_late_algebraic) {
more_late_algebraic = false;
NIR_PASS(more_late_algebraic, nir, nir_opt_algebraic_late);
NIR_PASS(_, nir, nir_opt_constant_folding);
NIR_PASS(_, nir, nir_copy_prop);
NIR_PASS(_, nir, nir_opt_dce);
NIR_PASS(_, nir, nir_opt_cse);
}
if nir_opt_algebraic_late makes no progress, later passes might be
skippable depending on which ones made progress in the previous iteration.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24197>
Patch adds a fallback to calculate_tile_dimensions if such case is hit,
this happened when running CTS tests on simulation.
Fixes: d13c81a2c3 ("iris/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25989>
When the pitch or slice pitch isn't properly aligned,
the SDMA HW is unable to copy between tiled images and buffers.
To work around this, we process the image chunk by chunk,
copying the data to a temporary buffer which uses supported
pitches, and then copy it to the intended destination.
The implementation assumes that at least one pixel row of the
image will fit into the temporary buffer, and will try to copy
as many rows at once as possible. Sadly, this still results in
a lot of packets being generated for large images.
A possibe future improvement is to copy the image slice by slice
when only the slice pitch is misaligned. However, that is out
of scope for this commit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
Some copy operations are poorly supported by the SDMA hardware,
meaning that the built-in packets don't support them, so we will
need to work around that by copying to and from a temporary BO.
The size of the temporary buffer was chosen so that it can fit
at least one full pixel row of the largest possible image.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
Previously, RADV only had a simple implementation of
image to buffer copies using the SDMA for the PRIME copy.
This commit replaces that with a full-featured implementation
that includes buffer to image and image to buffer copies and
removes the assumptions that the PRIME copy had, as well as
adds new helper functions which will be shared with other copy
functions in upcoming commits.
Unaligned buffer/image copies require a workaround, which
will be implemented by a future commit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
This register seems needed to enable compute shader shader invocations
on GFX7. On GFX8+ it's working fine without emitting this register but
I think it doesn't hurt.
This fixes dEQP-VK.query_pool.statistics_query.*_cq on GFX7.
Fixes: a9945216ba ("radv: fix COMPUTE_SHADER_INVOCATIONS query on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25957>
Looks like GFX6 always writes the number of compute shader invocations
at offset 0 when used on compute queue.
This fixes dEQP-VK.query_pool.statistics_query.*_cq on GFX6.
Fixes: a9945216ba ("radv: fix COMPUTE_SHADER_INVOCATIONS query on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25957>
This partially reverts 74ab940156 which
was a fix for random GPU hangs with binning on GFX10+. Though,
according to RadeonSI, only GFX10+ is affected and this reduced perf
on GFX9 chips.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25951>
The following sequence is valid (although weird) but many other drivers
(including RADV) were broken:
- bind pipeline with some static state
- set state command for that static state (to a bad value)
- bind the same pipeline again
- draw
Fixes new dEQP-VK.dynamic_state.*.double_static_bind.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25954>