Commit graph

13769 commits

Author SHA1 Message Date
Samuel Pitoiset
58d2a78dba radv: move dri options to radv_instance::drirc
To make it clearer that such an option is a per-application option.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26946>
2024-01-10 10:07:40 +00:00
Samuel Pitoiset
1854d03c20 radv: query drirc options in only one place
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26946>
2024-01-10 10:07:40 +00:00
Tatsuyuki Ishi
71e486d1cf radv: Add layer to skip UnmapMemory for Quantic Dream Engine
Detroit: Become Human has an optimization issue where the same BO is
mapped and unmapped on a per-frame basis, which leads to massive
overhead in the kernel, both creating the mapping and taking page
faults.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26942>
2024-01-10 07:39:58 +00:00
Friedrich Vock
0b55a3cf64 radv/rt: Acceleration structure updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26729>
2024-01-09 18:47:45 +00:00
Friedrich Vock
62fe4f0b1b radv/rt: Move per-geometry build info into a geometry_data struct
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26729>
2024-01-09 18:47:45 +00:00
Samuel Pitoiset
4c7486c9aa radv/winsys: replace '<= GFX6' by '== GFX6'
GFX6 is the first AMD family supported by RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26867>
2024-01-09 16:31:31 +00:00
Samuel Pitoiset
ae4628d3d6 radv: do not program COMPUTE_MAX_WAVE_ID (GDS register) on GFX6
Ported from RadeonSI c2359797.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26867>
2024-01-09 16:31:30 +00:00
Konstantin Seurer
658ce711d5 radv/rt: Lower ray payloads to registers
This should allow for cross stage optimizations and it reduces latency
caused by scratch access.

Totals from 44 (9.69% of 454) affected shaders:
MaxWaves: 432 -> 436 (+0.93%)
Instrs: 2740662 -> 1610327 (-41.24%); split: -41.24%, +0.00%
CodeSize: 14616932 -> 8573620 (-41.34%)
VGPRs: 4880 -> 4816 (-1.31%)
SpillSGPRs: 464 -> 294 (-36.64%)
Latency: 18548886 -> 11465281 (-38.19%); split: -38.19%, +0.00%
InvThroughput: 5195964 -> 3066729 (-40.98%); split: -40.98%, +0.00%
VClause: 99672 -> 55611 (-44.21%)
SClause: 65827 -> 38697 (-41.21%)
Copies: 231231 -> 137676 (-40.46%); split: -40.47%, +0.01%
Branches: 111379 -> 65865 (-40.86%); split: -40.87%, +0.00%
PreSGPRs: 3854 -> 3812 (-1.09%); split: -1.19%, +0.10%
PreVGPRs: 4518 -> 4439 (-1.75%); split: -1.84%, +0.09%

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26431>
2024-01-09 13:02:11 +00:00
Konstantin Seurer
2e4951d3fb radv: Remove the BVH depth heuristics
It only helps Quake II RTX and hurts everything else.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26481>
2024-01-09 09:00:24 +00:00
Konstantin Seurer
719619c477 radv: Use PLOC for TLAS builds
Improves control performance by about 1%.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26481>
2024-01-09 09:00:24 +00:00
Dave Airlie
71bd479a7f radv: don't emit cp dma packets on video rings.
Only emit this on the gfx/ace rings.

Fixes hangs with CTS on video decode with navi3x.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26945>
2024-01-09 07:39:52 +00:00
Sergi Blanch Torne
ab6f7170e0 Revert "ac/nir: Export clip distances according to clip_cull_mask"
This reverts commit b38c776690.

This commit seems to offend radeonsi-raven-piglit and radeonsi-stoney-gl.

Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26941>
2024-01-09 02:36:19 +00:00
Konstantin Seurer
da647e7e42 radv/rt/rmv: Log pipeline library creation
Pipeline libraries own shaders which take up GPU memory.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26668>
2024-01-08 19:29:13 +00:00
Konstantin Seurer
84cc494e51 radv/rmv: Fix tracing ray tracing pipelines
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26668>
2024-01-08 19:29:13 +00:00
Timur Kristóf
55e5c4e089 radv: Expose transfer queues, hidden behind a perftest flag.
This is highly experimental and only recommended
for users who know what they are doing.

To fully support the spec we are going to need
gang submissions which are going to be implemented later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 16:00:19 +01:00
Timur Kristóf
bd3f2567cc radv: Implement T2T scanline copy workaround.
The built-in tiled-to-tiled copy packet doesn't support copying
between images that don't meet certain criteria such as alignment,
micro tile format, compression state etc.

To work around this, we copy the image piece by piece to a
temporary buffer that we know is supported,
and then copy it to the intended destination.

The implementation assumes that at least one pixel row of the
image fits into the temporary buffer, and will try to copy as
many rows as fit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 16:00:15 +01:00
Timur Kristóf
a4b4c9b723 radv: Implement image copies on transfer queues.
When either of the images is linear then the implementation can
use the same packets as used by the buffer/image copies.
However, tiled to tiled image copies use a separate packet.

Several variations of tiled to tiled copies are not supported
by the built-in packet and need a scanline copy as a workaround,
this will be implemented by an upcoming commit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 16:00:10 +01:00
Timur Kristóf
1405f9b68b radv: Correct binding index for transfer buffer-image copies.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 15:58:37 +01:00
Georg Lehmann
fddd866b27 aco: apply fneg/fabs to VOP3P
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
72ac6a5251 aco: clean up fneg/fabs combining
This technically fixes some bugs with fneg(fneg(a)) and fabs(fneg(a)), but
those shouldn't be present in the input NIR.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
a90d154f62 aco: fix applying input modifiers to DPP8
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
1d61770dd5 aco: apply packed fneg commutatively
If only one component is negated, isel does not ensure that the constant
operand is in src1 because then the negate was a fmul, not a fneg.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Daniel Schürmann
dce695b24f aco: refactor and speed-up dead code analysis
Assuming that no loop header phis are dead code,
we can perform the dead code analysis in a single iteration.

Totals from 25 (0.03% of 79330) affected shaders: (GFX11)

MaxWaves: 664 -> 662 (-0.30%)
Instrs: 487618 -> 488822 (+0.25%)
CodeSize: 2451548 -> 2459756 (+0.33%)
VGPRs: 1296 -> 1332 (+2.78%)
Latency: 2337256 -> 2338098 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 560682 -> 576158 (+2.76%)
VClause: 15782 -> 15790 (+0.05%)
Copies: 37905 -> 38731 (+2.18%)
PreVGPRs: 1124 -> 1156 (+2.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26901>
2024-01-08 09:43:53 +00:00
Konstantin Seurer
b38c776690 ac/nir: Export clip distances according to clip_cull_mask
Avoids a mismatch between PA_CL_VS_OUT_CNTL and what the shader exports.
Outputs that are not written export zero.

Fixes: f823581 ("ac/nir: add ac_nir_export_position")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9187
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26812>
2024-01-08 08:35:35 +00:00
Samuel Pitoiset
907afddf97 radv: stop disabling DCC for mutable with 0 formats on GFX11
On GFX11, all formats are DCC compatible, so we can completely ignore
MUTABLE with a missing formats list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25664>
2024-01-08 08:02:58 +00:00
Samuel Pitoiset
fbe4e16db2 Revert "radv: disable DCC with signedness reinterpretation on GFX11"
This was affecting Cyberpunk and A Plague Tale Requiem but both issues
should be fixed now. The issue with A Plague Tale Requiem was because
of a game bug and vkd3d-proton now has a workaround.

This reverts commit e6735409ee.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25664>
2024-01-08 08:02:58 +00:00
Konstantin Seurer
ba7b08e324 radv/rt: Repurpose radv_ray_tracing_stage_is_compiled
Replace it with radv_ray_tracing_stage_is_always_inlined and use it inside
radv_rt_compile_shaders.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:28:32 +01:00
Konstantin Seurer
73cc952870 radv/sqtt: Avoid duplicate stage check
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:28:26 +01:00
Konstantin Seurer
77b9a6f9e2 radv/rt: Use radv_shader for compiled shaders
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:28:19 +01:00
Konstantin Seurer
59d490b8aa radv/rt: Remove useless assert
If it's NULL, the code will segfault anyways.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:28:11 +01:00
Konstantin Seurer
8198805e1f radv: Skip compiling chit and miss shaders
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:28:06 +01:00
Konstantin Seurer
0f87d406b5 radv/rt: Skip compiling a traversal shader
If we don't need one, we don't compile one.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:28:02 +01:00
Konstantin Seurer
aaa64217ca radv: Add more ray tracing data to the cache
This makes the cache more flexible when it comes to missing stages. This
will be used to skip compiling unused ray tracing stages.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:27:56 +01:00
Konstantin Seurer
a784477269 radv: Don't store library stack sizes
They are already imported in radv_rt_fill_stage_info.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:27:53 +01:00
Konstantin Seurer
92a951db6a radv: Make pipeline cache object data generic
Pipeline cache objects can hold some generic data. Anything concerning
that should not be handled in "common" code paths.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25224>
2024-01-07 21:27:27 +01:00
Daniel Schürmann
023e78b4d7 aco: add new post-RA scheduler for ILP
Totals from 77247 (97.37% of 79330) affected shaders: (GFX11)

Instrs: 44371374 -> 43215723 (-2.60%); split: -2.64%, +0.03%
CodeSize: 227819532 -> 223188224 (-2.03%); split: -2.06%, +0.03%
Latency: 301016823 -> 290147626 (-3.61%); split: -3.70%, +0.09%
InvThroughput: 48551749 -> 47646212 (-1.87%); split: -1.88%, +0.01%
VClause: 870581 -> 834655 (-4.13%); split: -4.13%, +0.00%
SClause: 1487061 -> 1340851 (-9.83%); split: -9.83%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676>
2024-01-06 11:30:42 +00:00
Daniel Schürmann
72a5c659d4 aco: form clauses for LDS instructions
No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676>
2024-01-06 11:30:42 +00:00
Daniel Schürmann
8f16745821 aco: fix should_form_clause() for memory instructions without operands
In particular, this applies to s_memtime and s_memrealtime.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676>
2024-01-06 11:30:41 +00:00
Vinson Lee
568f61787a ac/rgp: Fix single-bit-bitfield-constant-conversion warning
../src/amd/common/ac_rgp.c:119:48: warning: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]
  119 |    header->flags.is_semaphore_queue_timing_etw = 1;
      |                                                ^ ~

Fixes: ed0c852243 ("radv: add initial SQTT files generation support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26839>
2024-01-05 22:42:58 -08:00
Yonggang Luo
d6c258d9ee util: Add align_uintptr and use it treewide to replace ALIGN that works on size_t and uintptr_t
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26866>
2024-01-05 21:54:35 +00:00
Rhys Perry
ae54cbeb3f nir: remove sad_u8x4
All uses of this can be replaced with msad_4x8.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>
2024-01-05 18:55:22 +00:00
Rhys Perry
5fd747a502 radv: enable msad_4x8
This helps some FSR3 shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>
2024-01-05 18:55:22 +00:00
Rhys Perry
a339699b5c ac/llvm: implement msad_4x8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>
2024-01-05 18:55:22 +00:00
Rhys Perry
1410735a62 aco: implement msad_4x8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>
2024-01-05 18:55:22 +00:00
Konstantin Seurer
c511b8968a radv: Implement VK_KHR_ray_tracing_position_fetch
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26895>
2024-01-05 18:20:20 +00:00
Rhys Perry
24ef827f71 radv: remove radv_shader_info's cs.subgroup_size
This is the same as wave_size.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26894>
2024-01-05 17:35:48 +00:00
Rhys Perry
59dbe633e3 radv: use CS wave selection for task shaders
This uses wave32 for small workgroups and wave64 when certain subgroup
operations are used.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26894>
2024-01-05 17:35:48 +00:00
Rhys Perry
3009dcd102 aco: correctly set min/max_subgroup_size for wave32-as-wave64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26894>
2024-01-05 17:35:48 +00:00
Friedrich Vock
1e3541728b radv,aco: Convert 1D ray launches to 2D
Because we use unaligned dispatches, 1D launches only use 8 threads per
wave. Converting to 2D and fixing up launch IDs in the prolog
significantly increases occupancy.

Gives ~30% uplift in Ghostwire Tokyo.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26105>
2024-01-05 17:08:05 +00:00
Georg Lehmann
71edf4de5e aco/gfx12: implement broadcast dmask shrink behavior
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26897>
2024-01-05 12:03:54 +00:00