Commit graph

184774 commits

Author SHA1 Message Date
Friedrich Vock
b2067001d4 radv/rt: bsearch inlined shaders
When there are lots of inlined shaders, going over each one and checking
if the call index matches becomes expensive. Instead, use a
binary-search-like selection to skip most of the checks.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26380>
2023-12-21 12:39:06 +00:00
Joshua Ashton
6b1fafe716 nvk: Enable KHR_present_id and KHR_present_wait
This is needed for DXVK and VKD3D-Proton in order to implement
DXGI frame latency and frame latency waitable objects.

Gamescope also requires this to keep in-step with the host.

Using the same enablement system as RADV, etc to be safe.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26752>
2023-12-21 05:38:47 +00:00
Joshua Ashton
edb5229538 nvk: Hook up driconf for nvk_instance
We will use this in the future to enable present_id + present_wait like
in RADV.

This also enables the common WSI driconf entries for image count, etc
overrides to work by default, fixing some games.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26752>
2023-12-21 05:38:47 +00:00
Vinson Lee
2464cd81d3 nvk: Fix tautological-overlap-compare warning
../src/nouveau/vulkan/nvk_nir_lower_descriptors.c:145:55: warning: overlapping comparisons always evaluate to false [-Wtautological-overlap-compare]
  145 |    if (desc_type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER &&
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  146 |        desc_type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC)
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: f1c909edd5 ("nvk/nir: Add cbuf analysis to nvi_nir_lower_descriptors()")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26772>
2023-12-20 21:16:17 +00:00
Roland Scheidegger
e61fae6eb8 lavapipe: bump image alignment up to 64 bytes
We can't query the actually required alignment from llvmpipe, but 16
bytes is insufficient. In particular, llvmpipe's pixel backend code for
depth/stencil will load/store 4 values at a time, which crashes
with d32s8x24 format if alignment is only 16 bytes (the color backend
does similar things but will use unaligned loads if alignment exceeds
16 bytes at least for now).
32 bytes would be enough, however the "ordinary" llvmpipe resource
layout code currently always aligns to at least 64 bytes, so use
this value as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26724>
2023-12-20 16:36:01 +00:00
Timur Kristóf
4d93aac74d radv: Use correct plane and binding index with SDMA.
This lets us support multi-planar images properly on transfer queues.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Timur Kristóf
ab4720691c radv: Clean up SDMA chunked copy info struct.
Remove redundant fields.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Timur Kristóf
7fe899a3b6 radv: Use SDMA surface structs for determining unaligned buffer copies.
This removes a bunch of redundant calculations, thus making the code
less error-prone.

Also move the row pitch calculation to a separate function.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Timur Kristóf
dab4863396 radv: Pass radv_sdma_surf from copy functions to SDMA.
This makes it possible to gather the surface information
only once and pass it to the various SDMA functions, therefore
the code can be less error-prone.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Timur Kristóf
85fa749c63 radv: Refactor and simplify SDMA surface info functions.
This makes it possible to call the function just once instead
of calling different functions repeatedly.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Timur Kristóf
a21cba6799 radv: Unify SDMA surface struct for linear and tiled images.
Using just one struct for both types of images (and buffers)
will simplify a lot of code.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Timur Kristóf
65dfdd3fff radv: Move SDMA function and struct declarations to a new header.
Very few parts of RADV actually need the SDMA functions, so
moving them to a separate header makes the driver cleaner and
also improves compilation time when SDMA functions change.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
2023-12-20 13:22:26 +00:00
Samuel Pitoiset
2ce0ea8e7c radv/ci: update CI lists for NAVI10,NAVI31 and RENOIR
These dynamic rendering failures/flakes are a known issue that will
be resolved soon as part of VKCTS. Until that, make sure CI is green.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26734>
2023-12-20 13:06:41 +00:00
Lang Yu
27c46dd207 radeonsi: emit SQ_NON_EVENT for GFX11_5
Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:28 +00:00
Pierre-Eric Pelloux-Prayer
981fbafa18 radeonsi: fix extra_md handling with fmask
Setting metadata on textures with fmask isn't allowed.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:28 +00:00
Pierre-Eric Pelloux-Prayer
5371fca829 radeonsi/sqtt: handle COMPUTE queues as well
Use cs_get_ip_type to support both type of queues instead
of restricting ourselves to GFX.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
2efd1baa64 radeonsi/sqtt: fix capturing RGP on RDNA3 with more than one Shader Engine
Based on radv 2cc981a0cd.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
e0507ec50b radeonsi/sqtt: fix emitting SQTT userdata when CAM is needed
Based on radv 6caae898dd.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
a2cfd4186f radeonsi/winsys: add cs_get_ip_type function
Will be used in the next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
c1f08608b8 radeonsi/sqtt: fix capturing indirect dispatches with SQTT
Ported from radv 083e7d3a92.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
5139441c96 radeonsi/sqtt: reformat with clang-format
To fix the 2-space indentation used in this file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
af8e6c9347 radeonsi/sqtt: use calloc instead of malloc
This makes sure the record is fully initialized and
fixes RGP crashes or missing shaders.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
b55a2065e0 radeonsi/sqtt: rework pm4.reg_va_low_idx
The initial logic was to remember the place were SPI_SHADER_PGM_LO_*
are written, then assume that we can get the register offset because
the sequence would always be:

   PKT3_SET_SH_REG
   SPI_SHADER_PGM_LO_* register offset
   VA low 32 bits value <- reg_va_low_idx

The problem is that this sequence isn't guaranteed, for instance we
can get this instead:

   0   c0067600 |
   1   00000046 |
   2   003ffffd | SPI_SHADER_PGM_RSRC3_VS
   3   00000020 | SPI_SHADER_LATE_ALLOC_VS
   4 * 00002080 | SPI_SHADER_PGM_LO_VS
   5   00000080 | SPI_SHADER_PGM_HI_VS

So the assert in si_state_draw.cpp would fail as well as the VA
update logic.

So instead remember which the SPI_SHADER_PGM_LO_* offset, and the low
32 bits of the VA in si_update_shaders.

Fixes: 8034a71430 ("radeonsi/sqtt: re-export shaders in a single bo")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
e4d537fb84 radeonsi/sqtt: clear record_counts variable
This avoids hitting the asserts in ac_sqtt_finish.

Fixes: 94ce6540d8 ("ac/sqtt: add helpers for initializing ac_thread_trace_data")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Pierre-Eric Pelloux-Prayer
77098ec467 radeonsi/sqtt: fix RGP pm4 state emit function
It was missing in c3129b2b83.

Fixes: c3129b2b83 ("radeonsi: add a simple version of si_pm4_emit_state for non-shader states")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Karol Herbst
63e08bd61d rusticl/nir: add missing nir include
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26764>
2023-12-20 11:31:31 +00:00
Karol Herbst
c4d8f257ce rusticl: fix constant and printf buffer size
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26764>
2023-12-20 11:31:31 +00:00
Karol Herbst
7e74ee07e3 rusticl: silence clippy::arc-with-non-send-sync for now
Allows compilation with newer clippy

Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26764>
2023-12-20 11:31:30 +00:00
Karol Herbst
382718e0e1 rusticl: do not warn on empty RUSTICL_DEBUG or RUSTICL_FEATURES
Fixes: b90d1cfbfe ("rusticl/platform: add RUSTICL_FEATURES boilerplate")
Fixes: ca1e9917a9 ("rusticl/program: allow dumping compilation logs through RUSTICL_DEBUG")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26764>
2023-12-20 11:31:30 +00:00
Karol Herbst
f8afd41667 clc: add workaround for clang always defining __IMAGE_SUPPORT_ and __opencl_c_int64
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26764>
2023-12-20 11:31:30 +00:00
Bas Nieuwenhuizen
07ad6fd34a radv: Use correct writemask for cooperative matrix ordering.
Not expecting this to actually fix anything externally visible,
but reduces some invalid usage when the resulting vector is
not 16 elements long (e.g. the C/result matrix).

Fixes: 9df4703fbb ("radv: Add cooperative matrix lowering.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26768>
2023-12-20 11:02:30 +00:00
David Heidelberg
16af090908 ci/lava: separate HW definitions from SW
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26722>
2023-12-20 10:15:44 +00:00
Bas Nieuwenhuizen
d04ee07712 radeonsi: Add support to clear LDS at the end of a shader.
No hash updates as I didn't find a facility to do it in radeonsi
(even though there are flags like forcing fma32).

Note that we do this very late to avoid any optimizations that
might remove the dead stores. (Checked that LLVM doesn't remove
them, but it is admittedly potentially brittle)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
2023-12-20 09:15:45 +00:00
Bas Nieuwenhuizen
eaf61adea5 radv: Add option to clear LDS at the end of a shader.
Only shaders which explicitly allow shared memory are included for
now. The pass is very late to avoid optimizations removing the stores
and to ensure the clear gets added after MS outputs get loaded from LDS.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
2023-12-20 09:15:45 +00:00
Bas Nieuwenhuizen
da6a5e1f63 nir: Add pass for clearing memory at the end of a shader.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
2023-12-20 09:15:45 +00:00
Bas Nieuwenhuizen
bc99b73d70 nir: Add nir_static_workgroup_size helper.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
2023-12-20 09:15:45 +00:00
Qiang Yu
21d569b081 radeonsi: unify elf and raw shader binary upload
RAW shader did not have dma shader upload, this commit share
the pre/post upload code with ELF, so RAW and ELF can have same
upload mechanism.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26750>
2023-12-20 06:51:07 +00:00
Faith Ekstrand
f11b4d1ebe nvk: Advertise shaderFloat64
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9661
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
4a4815b855 nak/nir: Lower a bunch of fp64
All we have are add, mul, ffma, and comparisons.  Everything else has to
be lowered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
3e042173e4 nir/lower_doubles: Add lowering for fmin/fmax/fsat
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
e1fecd83ed nak/sm50: Add DMnMx and use it for fp64 fmin/fmax
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
1a7e83c87f nak/sm50: Properly legalize OpSel and drop an assert
While we're here, update sm70 to be a bit more modern.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
7f5c6642d8 nak/sm50: Fix encoding of iadd with imm32
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Daniel Almeida
0ac6a81ab5 nak: sm50: fix ineg legalization
There is nothing to be done, as we lower this op unconditionally.

We need to add an arm in the match to not panic, though.

Fixes a panic in
dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_immutable.vertex_fragment.multiple_contiguous_descriptors.2d

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
73a1acef18 nak/sm50: Fix encoding of f20 immediates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
17d2b2f2cc nak/sm50: Add encoding and legalization for dadd/dfma/dmul/dsetp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
1f5623c557 nak: Implement 64-bit nir_op_fsign
There is NIR lowering for this but this implementation is more
efficient.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:25 +00:00
Faith Ekstrand
d03cbac05a nak: Fix encoding of dsetp with RZ on SM70+
The `as_reg().is_some()` check returns false when src[1].src_ref is
Zero but we want to handle that as a register case.  Replace it with a
match instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
2023-12-20 02:40:24 +00:00
Timothy Arceri
52dbf44d2e glsl: add support for inout params to glsl_to_nir()
Supporting these means we don't have to depend on calling the GLSL
IR optimisation loop for shaders that contain these parameter types.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26755>
2023-12-20 01:47:27 +00:00
Timothy Arceri
3d3ba9f428 glsl: move glsl ir lowering out of glsl_to_nir()
The main motivation for doing this is that some tests and even the
st tracker linking code dump out the GLSL IR for debugging before
glsl_to_nir() is called expecting it to already be in its final
form. Moving these to the linker makes those assumptions true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26755>
2023-12-20 01:47:27 +00:00