Commit graph

13796 commits

Author SHA1 Message Date
Samuel Pitoiset
570faac1c1 radv: fix segfault when getting device vm fault info
pFaultInfo can be NULL.

Fixes: 8097becc7f ("radv: add initial VK_EXT_device_fault support")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27292>
(cherry picked from commit c68f96878c)
2024-01-31 19:31:13 +00:00
Rhys Perry
aba20a934d aco: fix labelling of s_not with constant
Fixes RADV compilation of a Cyberpunk 2077 RT pipeline with
PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: dfaa3c0af6 ("aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27194>
(cherry picked from commit 6dc182b6b2)
2024-01-29 21:07:00 +00:00
Rhys Perry
b85673b086 radv: do nir_shader_gather_info after radv_nir_lower_rt_abi
Fixes compilation of a Doom Eternal shader with
PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT.

ac_nir_lower_resinfo() was not happening because it is predicated on
uses_resource_info_query and no later optimization updated it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27195>
(cherry picked from commit 90939e93f6)
2024-01-23 20:34:31 +00:00
Karol Herbst
ee57c9df39 nir: rework and fix rotate lowering
No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16
and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8,
16 and 32 bit.

Also turn the `lower` into a `has` option as only two drivers actually
support `uror` and `urol` at this momemt.

Fixes crashes with CL integer_rotate on iris and nouveau since we emit
urol for `rotate`.

v2: always lower 64 bit

Fixes: fe0965afa6 ("spirv: Don't use libclc for rotate")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com>

Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>
(cherry picked from commit f2b7c4ce29)
2024-01-23 20:34:30 +00:00
Samuel Pitoiset
812bcc29af radv: fix indirect draws with NULL index buffer on GFX10
GFX10 has a hw bug and it can't handle 0-sized index buffer. The
non-indirect draw path was fine but not the indirect path where RADV
emits the index buffer.

This fixes flakes with dEQP-VK.*maintenance6* on NAVI14, and possibly
GPU hangs if there is an indirect draw with a valid index buffer right
before because it would re-use the same index buffer.

Fixes: db9816fd66 ("radv: add support for NULL index buffer")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27142>
(cherry picked from commit 783e3c096f)
2024-01-23 13:21:10 +00:00
Samuel Pitoiset
21d22653da radv: fix indirect dispatches on the compute queue on GFX7
GFX7 CP requires the indirect dispatch VA to be aligned to 32-bytes.

This fixes dEQP-VK.api.command_buffers.many_indirect_disps_on_secondary,
but it's unexpected that it uncovered this bug.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27148>
(cherry picked from commit 5c03cdbd02)
2024-01-23 13:21:09 +00:00
Georg Lehmann
ebd56d7a76 aco: stop scheduling at p_logical_end
No Foz-DB changes, but this fixes some issues when the spiller inserts
scratch loads after p_logical_end for p_return.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27119>
(cherry picked from commit 74fc2e287f)
2024-01-23 13:21:08 +00:00
Daniel Schürmann
e1d20b69c4 aco: give spiller more room to assign spilled SGPRs to VGPRs
On chordal graphs, a greedy coloring can be done in a way that never uses
more colors than are required for the largest clique. However, since we
have vector values and force phi resources into the same spill slots, the
interference graphs are not chordal, and thus, this assumption doesn't hold.

Use twice as many spill slots as upper bound.

Totals from 10 (0.01% of 79242) affected shaders: (GFX11)
MaxWaves: 52 -> 54 (+3.85%)
Instrs: 271386 -> 271779 (+0.14%)
CodeSize: 1362544 -> 1365432 (+0.21%)
VGPRs: 2536 -> 2532 (-0.16%)
SpillVGPRs: 778 -> 818 (+5.14%)
Scratch: 73472 -> 76800 (+4.53%)
Latency: 3331718 -> 3328798 (-0.09%); split: -0.14%, +0.05%
InvThroughput: 1665860 -> 1643350 (-1.35%); split: -1.40%, +0.05%
VClause: 3292 -> 3329 (+1.12%); split: -0.06%, +1.18%
Copies: 46082 -> 46257 (+0.38%)

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27011>
(cherry picked from commit e3098bb232)
2024-01-23 13:21:08 +00:00
Friedrich Vock
0392e4bf5c radv: Fix shader replay allocation condition
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26891>
(cherry picked from commit 43bdfebbff)
2024-01-23 13:21:07 +00:00
Konstantin Seurer
484a051aaa ac/llvm: Enable helper invocations for quad OPs
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9239
cc: mesa-stable

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27110>
(cherry picked from commit 220c912080)
2024-01-23 13:19:58 +00:00
Dave Airlie
085612fce5 radv: don't submit empty command buffers on encoder ring.
the vcn enc/unified rings don't do nop packets, and hang with 0 sized
cmd buffers. This just stops submitting 0 sized cmd buffers to the hw.

Fixes hangs with dEQP-VK.video.decode.h264_i on navi3x

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25932>
(cherry picked from commit f33683e4da)
2024-01-18 13:07:57 +00:00
Dave Airlie
74ee323c17 radv/video: refactor sq start/end code to avoid decode hangs.
The extra cmd buffer layer was done wrong, need to emit the
sq start and ends around every reset/decode packet.

Fixes dEQP-VK.video.decode.h264_i on navi3x

Fixes: d8f3060bd9 ("radv/video: start adding gfx11 vcn decoder")
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25932>
(cherry picked from commit d32f2ee7b6)
2024-01-18 13:07:56 +00:00
David Heidelberg
a062b0432a ci/deqp: uprev deqp-runner for Linux too to 0.18.0
Previous commit upreved deqp only for the Android

Fixes: 1ff4687e86 ("ci: uprev deqp-runner from 0.16.1 to 0.18.0")

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>

[Eric]
- rename the deqp-runner version to DEQP_RUNNER_VERSION instead of DEQP_VERSION
- update image tags
- fix expectations lists

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27062>
(cherry picked from commit 4ff77f08e4)
2024-01-17 23:29:18 +00:00
Friedrich Vock
9d1a064663 radv/rt: Add workaround to make leaves always active
DOOM Eternal builds acceleration structures with inactive primitives and
tries to make them active in later AS updates. This is disallowed by the
spec and triggers a GPU hang. Fix the hang by working around the bug.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27034>
(cherry picked from commit a9831caa14)
2024-01-17 21:42:02 +00:00
Tatsuyuki Ishi
c5b8590e6d radv: never set DISABLE_WR_CONFIRM for CP DMA clears and copies
This mirrors the changes in 69ff9c16bb ("radeonsi: never set
DISABLE_WR_CONFIRM for CP DMA clears and copies").

Cc: mesa-stable
Suggested-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27053>
(cherry picked from commit 43fb43ba2c)
2024-01-16 18:41:34 +00:00
Pierre-Eric Pelloux-Prayer
6febac5c96 Revert "ci/radeonsi: disable VA-API testing on raven"
This reverts commit 9017852de4.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26947>
(cherry picked from commit e2f39e8aca)
2024-01-15 21:56:36 +00:00
Pierre-Eric Pelloux-Prayer
ab960ee0bf radeonsi: compute epitch when modifying surf_pitch
In the linear case with no mipmaps addrlib sets epitch to surf_pitch - 1
so lets do the same thing here.

The change in si_descriptors.c looks like it's papering over a bug but I
couldn't find any other changes that wouldn't break at least one use case.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10375
Fixes: 115b61e51f ("ac/surface: don't oversize surf_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26947>
(cherry picked from commit 4e76c4ecb4)
2024-01-15 21:56:10 +00:00
Tatsuyuki Ishi
fc11cbb37e radv: Recompute max_waves after postprocessing RT config
The max waves for RT prolog need to be recalculated after merging the
resource usage of all shaders invoked from it.

Note that there is no need to panic, as the info was only used to
calculate maximum scratch size and with the RT prolog being low
footprint, this likely only caused overestimation rather than
underestimation.

Fixes: 533ec9843e ("radv: Precompute shader max_waves.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26998>
(cherry picked from commit 63827751e1)
2024-01-15 21:56:09 +00:00
Timur Kristóf
3753919715 radv: Correctly select SDMA support for PRIME blit.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10317
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27015>
(cherry picked from commit 436b89e838)
2024-01-15 21:56:00 +00:00
Samuel Pitoiset
6f6905fc94 radv: move all per-device keys from radv_pipeline_key to radv_device_cache_key
radv_device_cache_key contains everything per-device, while
radv_pipeline_key is more like per-pipeline keys.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26948>
2024-01-11 08:37:36 +00:00
Samuel Pitoiset
4d44cea3e0 radv: introduce radv_device_cache_key for per-device cache compiler options
This replaces RADV_HASH_SHADER_xxx by radv_device_cache_key which is
a new struct that contains per-device compiler options. More options
will be moved there.

Blake3 is used to replace sha1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26948>
2024-01-11 08:37:36 +00:00
Samuel Pitoiset
dc857e5c9e radv: initialize radv_device::disable_trunc_coord earlier
The per-device cache key will need to be initialized before compiling
any meta shaders, so this needs to be done earlier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26948>
2024-01-11 08:37:36 +00:00
Samuel Pitoiset
10a25f39df radv: move RADV_HASH_SHADER_KEEP_STATISTICS to radv_pipeline_key
This is more like a per-pipeline option.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26948>
2024-01-11 08:37:35 +00:00
Samuel Pitoiset
4455c79299 radv: add missing disable_shrink_image_store to the pipeline key
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26948>
2024-01-11 08:37:35 +00:00
Samuel Pitoiset
3f655bc47c radv: do not issue SQTT marker with DISPATCH_MESH_INDIRECT_MULTI
According to PAL, only DISPATCH_TASKMESH_GFX is supposed to emit a
SQTT marker as part of the packet, probably because there is also
a packet emitted on ACE for executing task shaders.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10401
Fixes: 312103e0ff ("radv: set THREAD_TRACE_MARKER_ENABLE for mesh/task draws")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26936>
2024-01-11 07:59:40 +00:00
Samuel Pitoiset
59c05b9cfa Revert "radv/rt: Lower ray payloads to registers"
This causes GPU hangs with vkd3d-proton in CI.

This reverts commit 658ce711d5.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26983>
2024-01-10 18:12:55 +00:00
Samuel Pitoiset
41cbd6f735 radv: rework declaring color arguments for PS epilogs
Rely on the actual FS outputs instead of the spi shader col format,
this is safer and it will help for future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26903>
2024-01-10 16:10:15 +00:00
Samuel Pitoiset
58d2a78dba radv: move dri options to radv_instance::drirc
To make it clearer that such an option is a per-application option.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26946>
2024-01-10 10:07:40 +00:00
Samuel Pitoiset
1854d03c20 radv: query drirc options in only one place
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26946>
2024-01-10 10:07:40 +00:00
Tatsuyuki Ishi
71e486d1cf radv: Add layer to skip UnmapMemory for Quantic Dream Engine
Detroit: Become Human has an optimization issue where the same BO is
mapped and unmapped on a per-frame basis, which leads to massive
overhead in the kernel, both creating the mapping and taking page
faults.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26942>
2024-01-10 07:39:58 +00:00
Friedrich Vock
0b55a3cf64 radv/rt: Acceleration structure updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26729>
2024-01-09 18:47:45 +00:00
Friedrich Vock
62fe4f0b1b radv/rt: Move per-geometry build info into a geometry_data struct
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26729>
2024-01-09 18:47:45 +00:00
Samuel Pitoiset
4c7486c9aa radv/winsys: replace '<= GFX6' by '== GFX6'
GFX6 is the first AMD family supported by RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26867>
2024-01-09 16:31:31 +00:00
Samuel Pitoiset
ae4628d3d6 radv: do not program COMPUTE_MAX_WAVE_ID (GDS register) on GFX6
Ported from RadeonSI c2359797.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26867>
2024-01-09 16:31:30 +00:00
Konstantin Seurer
658ce711d5 radv/rt: Lower ray payloads to registers
This should allow for cross stage optimizations and it reduces latency
caused by scratch access.

Totals from 44 (9.69% of 454) affected shaders:
MaxWaves: 432 -> 436 (+0.93%)
Instrs: 2740662 -> 1610327 (-41.24%); split: -41.24%, +0.00%
CodeSize: 14616932 -> 8573620 (-41.34%)
VGPRs: 4880 -> 4816 (-1.31%)
SpillSGPRs: 464 -> 294 (-36.64%)
Latency: 18548886 -> 11465281 (-38.19%); split: -38.19%, +0.00%
InvThroughput: 5195964 -> 3066729 (-40.98%); split: -40.98%, +0.00%
VClause: 99672 -> 55611 (-44.21%)
SClause: 65827 -> 38697 (-41.21%)
Copies: 231231 -> 137676 (-40.46%); split: -40.47%, +0.01%
Branches: 111379 -> 65865 (-40.86%); split: -40.87%, +0.00%
PreSGPRs: 3854 -> 3812 (-1.09%); split: -1.19%, +0.10%
PreVGPRs: 4518 -> 4439 (-1.75%); split: -1.84%, +0.09%

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26431>
2024-01-09 13:02:11 +00:00
Konstantin Seurer
2e4951d3fb radv: Remove the BVH depth heuristics
It only helps Quake II RTX and hurts everything else.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26481>
2024-01-09 09:00:24 +00:00
Konstantin Seurer
719619c477 radv: Use PLOC for TLAS builds
Improves control performance by about 1%.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26481>
2024-01-09 09:00:24 +00:00
Dave Airlie
71bd479a7f radv: don't emit cp dma packets on video rings.
Only emit this on the gfx/ace rings.

Fixes hangs with CTS on video decode with navi3x.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26945>
2024-01-09 07:39:52 +00:00
Sergi Blanch Torne
ab6f7170e0 Revert "ac/nir: Export clip distances according to clip_cull_mask"
This reverts commit b38c776690.

This commit seems to offend radeonsi-raven-piglit and radeonsi-stoney-gl.

Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26941>
2024-01-09 02:36:19 +00:00
Konstantin Seurer
da647e7e42 radv/rt/rmv: Log pipeline library creation
Pipeline libraries own shaders which take up GPU memory.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26668>
2024-01-08 19:29:13 +00:00
Konstantin Seurer
84cc494e51 radv/rmv: Fix tracing ray tracing pipelines
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26668>
2024-01-08 19:29:13 +00:00
Timur Kristóf
55e5c4e089 radv: Expose transfer queues, hidden behind a perftest flag.
This is highly experimental and only recommended
for users who know what they are doing.

To fully support the spec we are going to need
gang submissions which are going to be implemented later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 16:00:19 +01:00
Timur Kristóf
bd3f2567cc radv: Implement T2T scanline copy workaround.
The built-in tiled-to-tiled copy packet doesn't support copying
between images that don't meet certain criteria such as alignment,
micro tile format, compression state etc.

To work around this, we copy the image piece by piece to a
temporary buffer that we know is supported,
and then copy it to the intended destination.

The implementation assumes that at least one pixel row of the
image fits into the temporary buffer, and will try to copy as
many rows as fit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 16:00:15 +01:00
Timur Kristóf
a4b4c9b723 radv: Implement image copies on transfer queues.
When either of the images is linear then the implementation can
use the same packets as used by the buffer/image copies.
However, tiled to tiled image copies use a separate packet.

Several variations of tiled to tiled copies are not supported
by the built-in packet and need a scanline copy as a workaround,
this will be implemented by an upcoming commit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 16:00:10 +01:00
Timur Kristóf
1405f9b68b radv: Correct binding index for transfer buffer-image copies.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26913>
2024-01-08 15:58:37 +01:00
Georg Lehmann
fddd866b27 aco: apply fneg/fabs to VOP3P
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
72ac6a5251 aco: clean up fneg/fabs combining
This technically fixes some bugs with fneg(fneg(a)) and fabs(fneg(a)), but
those shouldn't be present in the input NIR.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
a90d154f62 aco: fix applying input modifiers to DPP8
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
1d61770dd5 aco: apply packed fneg commutatively
If only one component is negated, isel does not ensure that the constant
operand is in src1 because then the negate was a fmul, not a fneg.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Daniel Schürmann
dce695b24f aco: refactor and speed-up dead code analysis
Assuming that no loop header phis are dead code,
we can perform the dead code analysis in a single iteration.

Totals from 25 (0.03% of 79330) affected shaders: (GFX11)

MaxWaves: 664 -> 662 (-0.30%)
Instrs: 487618 -> 488822 (+0.25%)
CodeSize: 2451548 -> 2459756 (+0.33%)
VGPRs: 1296 -> 1332 (+2.78%)
Latency: 2337256 -> 2338098 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 560682 -> 576158 (+2.76%)
VClause: 15782 -> 15790 (+0.05%)
Copies: 37905 -> 38731 (+2.18%)
PreVGPRs: 1124 -> 1156 (+2.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26901>
2024-01-08 09:43:53 +00:00