Commit graph

19458 commits

Author SHA1 Message Date
Samuel Pitoiset
0b5ae0758e ac/spm: add support for new Memory percentage counters in RGP 2.6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:51:14 +01:00
Samuel Pitoiset
3d2bb52a81 ac/spm: add support for new Memory bytes counters in RGP 2.6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:50:44 +01:00
Samuel Pitoiset
84ecdc534c ac/spm: add support for new LDS counters in RGP 2.6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:50:41 +01:00
Samuel Pitoiset
07d9fc574c ac/spm: implement the new derived SPM chunk for performance counters
This is the new method to add performance counters to RGP captures.
This will be used to add the new RGP 2.6 counters too.

The previous SPM code will be deprecated at some point but it's hard
to support all generations in one batch. So, I will implement this
step by step.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:59 +01:00
Samuel Pitoiset
3e4d629458 ac/spm: add an ID to raw performance counters
This will be used to compute derived values for the new RGP/SPM chunk.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:29 +01:00
Samuel Pitoiset
21ad7e4e32 ac/spm: print an error message when a group is unknown
Help debugging.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:21 +01:00
Samuel Pitoiset
7da6fe6a00 ac/spm: fix programming more than one counter slot
Some blocks have two or more SPM counters and they should be used when
more than 4 counters are programmed (ie. 16-bit per counter).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:14 +01:00
Samuel Pitoiset
e5a041ee1c ac/spm: add an assertion to check the number of global instances
To make sure counters aren't silently discarded.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:06 +01:00
Samuel Pitoiset
eca9c00430 ac/spm: adjust configuration of some GPU blocks
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:58 +01:00
Samuel Pitoiset
6613dfb234 ac/perfcounter: add GCEA block description on GFX10-11
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:29 +01:00
Samuel Pitoiset
25e28819bd ac/perfcounter: adjust the number of events for TD on GFX10.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:21 +01:00
Samuel Pitoiset
a4cb114f5a ac/perfcounter: add a separate group for GFX10.3
This is just a copy&paste but GFX10.3 has way more counters than GFX10
that will be added later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:09 +01:00
Samuel Pitoiset
044e7f6017 radv/nir: fix front_face opts for points/lines and unknown prim
Fixes new VKCTS coverage dEQP-VK.glsl.builtin_var.frontfacing.*.

Fixes: af375c6756 ("radv: Optimize fs builtins using static gfx state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39041>
2025-12-22 07:59:30 +00:00
Daniel Schürmann
7b1f6fa6fc aco: remove radeon_family from aco::Program
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:48 +00:00
Daniel Schürmann
1e8d367537 amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:48 +00:00
Daniel Schürmann
febc29907c amd: add and use ac_cu_info::has_gfx6_mrt_export_bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
7b7bdb76ab amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
cfb745592d amd: add ac_cu_info::has_mad32 flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
1e3db50170 aco: use additional flags from ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f7c4aa48a0 ac/gpu_info: add some more flags to ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f791e46c47 aco: add ac_cu_info to aco_compiler_options
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
addd4ea59f aco: pass aco_compiler_options to init_program()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
bf9bec07c2 aco/tests: don't pass CHIP_UNKNOWN to ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
6f4e8046b5 ac/gpu_info: create separate function ac_fill_cu_info() to fill out CU info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:45 +00:00
Daniel Schürmann
749c619c45 ac/gpu_info: correct some SGPR and VGPR allocation values in ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:45 +00:00
Daniel Schürmann
553b431aca ac/gpu_info: move some CU information into separate struct ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:44 +00:00
Daniel Schürmann
0db1ae1f01 aco: disable XNACK on all GPUs
Affects code generation on GFX8 and GFX9 APUs where we misunderstood
the feature. XNACK replay is not being used with graphics APIs.

Totals from 41759 (65.90% of 63370) affected shaders: (Raven)

MaxWaves: 298672 -> 299000 (+0.11%)
Instrs: 19200726 -> 19138227 (-0.33%); split: -0.33%, +0.00%
CodeSize: 98501904 -> 98253196 (-0.25%); split: -0.26%, +0.00%
SGPRs: 3058544 -> 2831492 (-7.42%)
VGPRs: 1644896 -> 1643660 (-0.08%)
Latency: 193383803 -> 193224047 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 92741082 -> 92698975 (-0.05%); split: -0.05%, +0.00%
SClause: 678580 -> 630107 (-7.14%); split: -7.15%, +0.00%
Copies: 1863375 -> 1863406 (+0.00%); split: -0.04%, +0.04%
VALU: 13791245 -> 13791267 (+0.00%); split: -0.00%, +0.00%
SALU: 2066726 -> 2066741 (+0.00%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:43 +00:00
Daniel Schürmann
d94e90df25 amd/common: link with libamdgpu_addrlib
ac_surface needs that.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:41 +00:00
Samuel Pitoiset
045b778ed6 radv: add the SQTT relocated shaders BO to the cmdbuf list
Found this while debugging another thing with amdgpu.debug_mask=0x1 (VM).

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39002>
2025-12-22 07:13:06 +00:00
Daniel Schürmann
f930ecdc55 amd: add newer small APUs to get_task_num_entries()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38999>
2025-12-19 13:03:49 +00:00
Benjamin Cheng
fa8b0b6bbb radv/video: Enable write combine for decode
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39025>
2025-12-18 15:25:57 -05:00
Pierre-Eric Pelloux-Prayer
645fff5dae ac/descriptors: account for num_storage_samples for gfx10
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This fixes a page fault when nr_samples=4 but nr_storage_samples=2.
Based on si_is_format_supported this is only supported for color
formats and when has_eqaa_surface_allocator is true (< GFX11).

The referenced commit below didn't introduce the issue but it
exposed it by forcing the gfx blit path to be used.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13255
Fixes: 3424e16ece ("radeonsi: add decision code to select when to use CB_RESOLVE for performance")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38925>
2025-12-18 10:45:49 +00:00
Marek Olšák
3c5c96fedb radv: double pixel throughput in certain cases of PS without interpolated inputs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This reduces the number of initialized VGPRs by 1 when no barycentric
coordinates are used.

I have verified with zink that this indeed increases performance for
cases where sysvals like frag_coord and front_face are used without
interpolated PS inputs.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38936>
2025-12-18 03:37:58 +00:00
Emma Anholt
059d301c79 nir: Drop the mode argument of nir_lower_vars_to_scratch().
It only makes sense for function temps, and that's the only way it's been
used.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>
2025-12-17 19:50:28 +00:00
Samuel Pitoiset
f8feed17e1 ac,radv,radeonsi: add tracked register macros to common code
Because the tracked registers are really driver dependant, the driver
is expected to handle the tracked_registers struct itself.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:26 +00:00
Samuel Pitoiset
c580fc667f ac,radv: add ac_cmdbuf::context_roll and use it
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:26 +00:00
Samuel Pitoiset
f3b385859a ac,radv: add more cmdbuf emit helpers
Some can't be shared with RadeonSI because it uses templates in some
places.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:25 +00:00
Samuel Pitoiset
b444dc145a radv: remove redundant assertions in radeon_emit_{array}()
The common helpers already have assertions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:25 +00:00
Samuel Pitoiset
262fc80e45 ac,radv,radeonsi: add functions to initialize tracked regs
Also initialize the new slots for RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:25 +00:00
Samuel Pitoiset
44314e1ea6 ac,radv,radeonsi: add ac_tracked_regs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:24 +00:00
Samuel Pitoiset
c97bd17d4d radv: switch to AC_TRACKED_xxx
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:23 +00:00
Samuel Pitoiset
fad24d6fcc ac/cmdbuf: add new slots to ac_tracked_reg
For RADV registers that aren't tracked in RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:23 +00:00
Samuel Pitoiset
18bdb76408 ac,radeonsi: move si_tracked_reg to common code
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>
2025-12-17 15:09:22 +00:00
Samuel Pitoiset
5d76202b6d radv: create descriptors for color/depth-stencil surfaces earlier
For less CPU overhead when rendering begins and also because it's
easy to pre-compute those descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>
2025-12-17 11:11:18 +00:00
Samuel Pitoiset
c8729cdd3c radv/meta: stop passing a stencil attachment for depth decompress
It should only be the depth aspect.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>
2025-12-17 11:11:18 +00:00
Samuel Pitoiset
43d7d97b13 radv/meta: inject image view usage info
This will be used to initialize color/depth-stencil descriptors earlier
when the image view is created.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>
2025-12-17 11:11:18 +00:00
Samuel Pitoiset
ce69cabb60 radv: constify radv_{cb,ds}_buffer_info parameters
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>
2025-12-17 11:11:18 +00:00
Georg Lehmann
0478021fdc aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)
Foz-DB Navi48:
Totals from 2484 (2.54% of 97637) affected shaders:
Instrs: 10368279 -> 10361892 (-0.06%); split: -0.06%, +0.00%
CodeSize: 55161104 -> 55150752 (-0.02%); split: -0.02%, +0.00%
SpillSGPRs: 14665 -> 14666 (+0.01%)
Latency: 87694014 -> 87689324 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16595764 -> 16594448 (-0.01%); split: -0.01%, +0.00%
VClause: 209922 -> 209918 (-0.00%); split: -0.01%, +0.00%
SClause: 205195 -> 205251 (+0.03%); split: -0.01%, +0.04%
Copies: 843771 -> 843765 (-0.00%); split: -0.01%, +0.01%
Branches: 275985 -> 275962 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 170608 -> 170494 (-0.07%)
VALU: 5840893 -> 5838038 (-0.05%); split: -0.05%, +0.00%
SALU: 1481388 -> 1479037 (-0.16%); split: -0.16%, +0.00%
VOPD: 7496 -> 7485 (-0.15%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>
2025-12-17 08:41:32 +00:00
Georg Lehmann
a8f5ced670 aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)
Foz-DB Navi48:
Totals from 14608 (14.96% of 97637) affected shaders:
MaxWaves: 364201 -> 364421 (+0.06%)
Instrs: 28051720 -> 28022503 (-0.10%); split: -0.13%, +0.03%
CodeSize: 148938740 -> 148943480 (+0.00%); split: -0.04%, +0.04%
VGPRs: 994520 -> 994004 (-0.05%); split: -0.05%, +0.00%
SpillSGPRs: 45182 -> 45179 (-0.01%)
Latency: 187734461 -> 187725301 (-0.00%); split: -0.07%, +0.06%
InvThroughput: 33967002 -> 33949881 (-0.05%); split: -0.11%, +0.06%
VClause: 495237 -> 495207 (-0.01%); split: -0.03%, +0.02%
Copies: 2048324 -> 2047937 (-0.02%); split: -0.12%, +0.10%
Branches: 598445 -> 598431 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 877715 -> 877684 (-0.00%)
PreVGPRs: 778146 -> 776383 (-0.23%); split: -0.23%, +0.00%
VALU: 16413380 -> 16391508 (-0.13%); split: -0.15%, +0.01%
SALU: 3685279 -> 3677655 (-0.21%); split: -0.23%, +0.02%
VOPD: 26219 -> 25926 (-1.12%); split: +0.43%, -1.55%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>
2025-12-17 08:41:31 +00:00
Daniel Schürmann
125ac1626d radv: remove precomputed registers from radv_shader_binary
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It is enough to compute them after upload.
This saves some disk space and eliminates an unlikely
bug where the shader cache is shared between two GPUs
with the same chip but a different number of enabled CUs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38970>
2025-12-17 08:16:06 +00:00