Commit graph

15893 commits

Author SHA1 Message Date
Roy Chan
b2ce281319 amd/vpelib: fix zero input handling
[why]
Fix in 0 input stream, it would crash in accessing null param->streams.

[how]
- Only use stream_ctx->stream instead of param->stream in color
  handling.
- Revise the geometric downscaling support.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Roy Chan <roy.chan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31274>
2024-09-30 10:12:14 +00:00
Roy Chan
acde345606 amd/vpelib: Optimize the CPU usage by caching all the LUT configs
[why]
The fix point conversion takes quite a bit of CPU time.
And if there are no changes, we don't need to convert all sw points
into hw points and generate the corresponding configs.

[how]
Introduce a config cache header so that all config caching handlings
can be done in the same way.

Luts won't be cached if it is in bypass mode, only cache when it is
non bypass and some dirty flag is set by the upper layer.
The config cache handling will re-apply the cached config if not
dirty and not in bypass mode.

Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Roy Chan <roy.chan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31274>
2024-09-30 10:12:14 +00:00
Jesse
ee590ee91a amd/vpelib: Config Writer hook and CDC refinement
Generalize CDC and config writer hook.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Jesse Agate <jesse.agate@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31274>
2024-09-30 10:12:14 +00:00
Rhys Perry
7f092cbd91 aco: workaround hazards in emit_long_jump
fossil-db (navi31):
Totals from 29 (0.04% of 79395) affected shaders:
CodeSize: 17612888 -> 17615096 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31316>
2024-09-30 09:04:35 +00:00
Rhys Perry
9fb97085d1 aco/tests: update assembler tests for llvm
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31316>
2024-09-30 09:04:35 +00:00
Mike Blumenkrantz
3178170516 ci: bump gl cts versions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31118>
2024-09-29 12:18:49 +00:00
Martin Roukala (né Peres)
e8cf44a71a radv/ci: document more vkcts flakes
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31415>
2024-09-28 07:37:11 +03:00
Marek Olšák
246051ebc6 ac/gpu_info: print 32bpp modifiers
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
f7199b9971 ac/llvm: don't use the 64-bit umul_hi workaround with LLVM 19.1
It's fixed there.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
89db355cc4 ac/llvm: use LLVM processor gfx942 for GFX940 when it's available
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
163222abd0 ac/nir: set .image_dim and .image_array for all opcodes
for consistency

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
14b576e023 ac: make sure VEGA20 and MI200 version ranges don't overlap with other chips
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Guilherme Gallo
f25fb0c128 ci/amd: Rebalance jobs via DEQP_FRACTION
As we don't have capacity for more parallelism atm, increase the
fraction of the CTS for those jobs.

job name		average min

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
---			---
radv-stoney-angle	16
radv-stoney-vkcts	15

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31414>
2024-09-27 16:38:26 +00:00
Guilherme Gallo
82633b08e7 ci/amd: Rebalance radeonsi-stoney-gl:x86_64
It is taking 19 min on average to run this job.
As we don't have more capacity, introduce a fraction of 2 and create the
`full` version to be run on nightly pipelines.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31414>
2024-09-27 16:38:26 +00:00
Rhys Perry
93372ea9af aco: do not use inline constants for 16-bit pseudo scalar trancendentals
Like https://github.com/llvm/llvm-project/pull/104395

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30729>
2024-09-27 11:11:42 +00:00
Samuel Pitoiset
ad95cc1a5c radv: simplify determining conformant products
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31367>
2024-09-27 06:29:16 +00:00
Georg Lehmann
5ccee0fe83 radv: remove nir_opt_reuse_constants call
aco now rematerializes constants per block, so this nir pass
no longer does anything meaningful besides adding noise:

Foz-DB Navi31:
Totals from 4674 (5.89% of 79395) affected shaders:
Instrs: 11497431 -> 11497335 (-0.00%); split: -0.02%, +0.02%
CodeSize: 60720620 -> 60725904 (+0.01%); split: -0.02%, +0.03%
VGPRs: 294440 -> 294428 (-0.00%)
SpillSGPRs: 3486 -> 3488 (+0.06%); split: -0.06%, +0.11%
Latency: 109298610 -> 109319617 (+0.02%); split: -0.02%, +0.04%
InvThroughput: 18606377 -> 18640872 (+0.19%); split: -0.01%, +0.19%
VClause: 232602 -> 232622 (+0.01%); split: -0.00%, +0.01%
SClause: 299675 -> 299746 (+0.02%); split: -0.01%, +0.04%
Copies: 840683 -> 840105 (-0.07%); split: -0.14%, +0.07%
Branches: 304581 -> 304646 (+0.02%); split: -0.00%, +0.03%
PreSGPRs: 233651 -> 233611 (-0.02%); split: -0.02%, +0.00%
VALU: 6624760 -> 6625051 (+0.00%); split: -0.00%, +0.01%
SALU: 1236841 -> 1236103 (-0.06%); split: -0.12%, +0.06%
VOPD: 2993 -> 2970 (-0.77%); split: +0.17%, -0.94%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>
2024-09-27 05:19:16 +00:00
Georg Lehmann
a9f8089240 nir: replace nir_opt_remove_phis_block with a single source version
This is what callers actually want, and it simplifies nir_opt_remove_phis
because we can assume dominance meta data is valid.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>
2024-09-27 05:19:16 +00:00
Georg Lehmann
78b8ec9c93 aco: optimize lanecount_to_mask
s_bfe uses 7 bits for the size, so when we extract from -1,
we can get all possible lane masks in one instruction.

Foz-DB Navi31:
Totals from 38601 (48.62% of 79395) affected shaders:
Instrs: 13670163 -> 13509738 (-1.17%)
CodeSize: 68011644 -> 67368308 (-0.95%)
Latency: 61203404 -> 61065419 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 6897028 -> 6894634 (-0.03%); split: -0.05%, +0.01%
SALU: 1491291 -> 1342553 (-9.97%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
63b45767f8 aco/ssa_elimination: optimize branching sequence with SALU that has multiple definitions
Foz-DB Navi31:
Totals from 1801 (2.27% of 79395) affected shaders:
Instrs: 1595030 -> 1591942 (-0.19%); split: -0.19%, +0.00%
CodeSize: 8442656 -> 8430140 (-0.15%); split: -0.15%, +0.00%
Latency: 12885611 -> 12879201 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 2420596 -> 2419800 (-0.03%); split: -0.03%, +0.00%
Copies: 125726 -> 123572 (-1.71%)
SALU: 249990 -> 247836 (-0.86%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
f129ae647a aco/ssa_elimination: don't check for VALU limitation when optimizing branching sequence
These instructions need exec, so we would never see them here because
try_optimize_branching_sequence will not be called.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
151cd9c92b ac/lower_ngg: use is_subgroup_invocation_lt_amd offset
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
bcfc5c09fa amd: add offset to is_subgroup_invocation_lt_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:13 +00:00
Dave Airlie
a59efe40b2 radv/video: handle missing h265 feedback struct.
I'm not sure this should be missing, but handle if if it is.

Reviewed-by: Lynne <dev@lynne.ee>
Fixes: 7c6e3c70b6 ("radv/video/enc: report pps overrides in feedback for h265")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31158>
2024-09-26 04:56:34 +00:00
Dave Airlie
db5312f842 radv/video: add encode field for vcn4
Reviewed-by: Lynne <dev@lynne.ee>
Fixes: 967e4e09de ("radv/video: add h265 encode support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31158>
2024-09-26 04:56:34 +00:00
Dave Airlie
c78e32da3b radv/video/enc: report pictureAccessGranularity of CTB size.
Reviewed-by: Lynne <dev@lynne.ee>
Fixes: 967e4e09de ("radv/video: add h265 encode support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31158>
2024-09-26 04:56:34 +00:00
Dave Airlie
9fab2072a3 radv/video: use the h264 defines for macroblock w/h
Just a cleanup, add some comments as well.

Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31158>
2024-09-26 04:56:34 +00:00
Colin Marc
88dacc3d80 radv/video: set TemporalId correctly
This is only relevant for hierarchical coding using sub-layers.

Fixes: 967e4e09de ("radv/video: add h265 encode support")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Colin Marc <hi@colinmarc.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31303>
2024-09-26 04:18:11 +00:00
Samuel Pitoiset
a9095f0dbf radv: do not keep executable info when compiling shaders for ESO
This is completely useless and it's wasting memory.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31345>
2024-09-25 08:33:31 +00:00
Samuel Pitoiset
f7482e85ba radv: move updating compute scratch for RT when stack size is emitted
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31347>
2024-09-25 07:56:58 +00:00
Samuel Pitoiset
ebe66dee08 radv: move emitting some RT user SGPRs when the RT pipeline is emitted
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31347>
2024-09-25 07:56:58 +00:00
Timothy Arceri
c1b97415fa ci: disable gimark trace
gimark requires a mesa environment variable to be set to work around
a shader bug, disable it for now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31353>
2024-09-25 09:25:52 +10:00
Samuel Pitoiset
087ef34b9c aco: fix descriptor leaking when printing assembly with CLRX
This can explode the maximum number of descriptors.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31344>
2024-09-24 18:11:36 +00:00
Konstantin Seurer
25b09b9c5a radv: Fix report_ray_intersection affecting terminated rays
Fixes dEQP-VK.ray_tracing_pipeline.amber.flags-accept-first.

cc: mesa-stable

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31186>
2024-09-24 16:18:31 +00:00
Rhys Perry
bf41cf2eef radv/rt: don't split array/struct payload variables
If the shader has multiple payload variables, split passes might not
preserve the order and this can cause the offsets used for the stores to
not match the payload offsets for nir_intrinsic_trace_ray.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31204>
2024-09-24 15:41:04 +00:00
Rhys Perry
204e446bcd radv/rt: align constant data by 64 when inlining shaders
There's never any need for anything higher. If this were too high (such
as NIR_ALIGN_MUL_MAX), it would have caused issues.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31204>
2024-09-24 15:41:04 +00:00
Eric Engestrom
76c5c49fca radeonsi/ci: mark KHR-GL46.shader_image_load_store.basic-allTargets-atomic as fixed
Fixed by a commit in the range e1a53d41...1b4e1007

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31337>
2024-09-24 07:34:11 +00:00
Eric Engestrom
bc6eae7d0a radeonsi/ci: document spec@egl_ext_surface_compression@create as crashing
Fixes: 213f5e9152 ("Uprev Piglit to e9ab30aeaed97b69868cf4d6d6a3f70f3b53c362")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31337>
2024-09-24 07:34:11 +00:00
Samuel Pitoiset
cf536f63d1 radv: introduce dirty flags for shaders state
Instead of re-emitting some dynamic states when a new shader is bound,
only re-emit the user SGPR states. This is slightly more optimal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>
2024-09-24 06:00:00 +00:00
Samuel Pitoiset
8beea85232 radv: rename shader_query_state to task_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>
2024-09-24 06:00:00 +00:00
Samuel Pitoiset
a0951bae70 radv: use only one user SGPR for all NGG state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>
2024-09-24 06:00:00 +00:00
Samuel Pitoiset
3022282ba3 radv: make sure to re-emit shader query state when a task shader is bound
This doesn't change anything in practice because if we have a task
shader, we also have a mesh shader and the state was already re-emitted.

Though, this will prevent a regression from the upcoming patches because
the user SGPR layout will change.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>
2024-09-24 06:00:00 +00:00
Samuel Pitoiset
16341f41e1 radv: emit all shader related user SGPR states in one place
This will allow us to use only one user SGPR for NGG shaders, and also
further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>
2024-09-24 05:59:59 +00:00
David Rosca
1459193b99 ac: Add VCN IB parser
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31275>
2024-09-23 19:25:08 +00:00
Thong Thai
8da847560b ci: partially emulate cdna devices using lower image opcodes
Use the AMD_IMAGE_OPCODES=0 environment variable to test the lower image
opcode code path used by AMD CDNA/compute devices without graphics.

Runs a very limited subset of GL tests.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31180>
2024-09-23 17:46:33 +00:00
Konstantin Seurer
00c94e0cd4 radv: Workaround apps using ray tracing when it is unsupported
Emitting bvh64_intersect_ray_amd will crash the compiler on pre-GFX10_3
hardware.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11786
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30886>
2024-09-23 14:02:28 +00:00
Georg Lehmann
0e21cd9e15 aco/gfx10+: work around non uniform ds_append wave64 result
In wave64 for hw with native wave32, ds_append seems to be split in a load for
the low half and an atomic for the high half, and other LDS instructions can
be scheduled between the two.
Which means the result of the low half is unusable because it might be out of date.

I was only able to reproduce this issue in WGP mode, but be conservative and
apply the workaround in CU mode too.

Foz-DB Navi31:
Totals from 13 (0.02% of 79395) affected shaders:
Instrs: 7599 -> 7656 (+0.75%)
CodeSize: 39708 -> 39972 (+0.66%)
Latency: 83174 -> 83572 (+0.48%)
InvThroughput: 8271 -> 8357 (+1.04%)
Copies: 718 -> 717 (-0.14%)
VALU: 3689 -> 3703 (+0.38%)
SALU: 935 -> 965 (+3.21%)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11921
Fixes: 45e935800a ("aco: implement nir_shared_append/consume_amd")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31301>
2024-09-23 13:17:58 +00:00
Samuel Pitoiset
5c897d00ef radv: fix assigning mesh shader outputs when clip/cull distances are read in FS
The per-primitive output offsets need to be recomputed.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31224>
2024-09-23 12:12:13 +00:00
Samuel Pitoiset
80d60acb77 radv: only export KHR_video_maintenance1 with KHR_video_queue
It's required, otherwise dEQP-VK.info.device_extensions fails.

Fixes: b30462535b ("radv/video: add KHR_video_maintenance1 support")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31285>
2024-09-20 15:26:28 +00:00
Samuel Pitoiset
cbae7792f9 radv: stop emulating GS invocations for legacy GS on RDNA1-2
This is basically a revert of !24231. This emulation was introduced
to fix counting the number of GS invocations in cases it's VS/TES as
NGG (hw does that), but it's unnecessary.

Since the specification has been clarified and pipeline stat queries
are undefined for stages not present in pipelines.
See https://gitlab.khronos.org/vulkan/vulkan/-/merge_requests/6547

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31153>
2024-09-20 12:28:08 +00:00