Samuel Pitoiset
0b5ae0758e
ac/spm: add support for new Memory percentage counters in RGP 2.6
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:51:14 +01:00
Samuel Pitoiset
3d2bb52a81
ac/spm: add support for new Memory bytes counters in RGP 2.6
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:50:44 +01:00
Samuel Pitoiset
84ecdc534c
ac/spm: add support for new LDS counters in RGP 2.6
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:50:41 +01:00
Samuel Pitoiset
07d9fc574c
ac/spm: implement the new derived SPM chunk for performance counters
...
This is the new method to add performance counters to RGP captures.
This will be used to add the new RGP 2.6 counters too.
The previous SPM code will be deprecated at some point but it's hard
to support all generations in one batch. So, I will implement this
step by step.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:48:59 +01:00
Samuel Pitoiset
3e4d629458
ac/spm: add an ID to raw performance counters
...
This will be used to compute derived values for the new RGP/SPM chunk.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:48:29 +01:00
Samuel Pitoiset
21ad7e4e32
ac/spm: print an error message when a group is unknown
...
Help debugging.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:48:21 +01:00
Samuel Pitoiset
7da6fe6a00
ac/spm: fix programming more than one counter slot
...
Some blocks have two or more SPM counters and they should be used when
more than 4 counters are programmed (i.e. 16-bit per counter).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:48:14 +01:00
Samuel Pitoiset
e5a041ee1c
ac/spm: add an assertion to check the number of global instances
...
To make sure counters aren't silently discarded.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:48:06 +01:00
Samuel Pitoiset
eca9c00430
ac/spm: adjust configuration of some GPU blocks
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:47:58 +01:00
Samuel Pitoiset
6613dfb234
ac/perfcounter: add GCEA block description on GFX10-11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:47:29 +01:00
Samuel Pitoiset
25e28819bd
ac/perfcounter: adjust the number of events for TD on GFX10.3
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:47:21 +01:00
Samuel Pitoiset
a4cb114f5a
ac/perfcounter: add a separate group for GFX10.3
...
This is just a copy and paste for now, but GFX10.3 has far more counters
than GFX10; they will be added later.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013 >
2025-12-22 09:47:09 +01:00
Samuel Pitoiset
044e7f6017
radv/nir: fix front_face opts for points/lines and unknown prim
...
Fixes new VKCTS coverage dEQP-VK.glsl.builtin_var.frontfacing.*.
Fixes: af375c6756 ("radv: Optimize fs builtins using static gfx state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39041 >
2025-12-22 07:59:30 +00:00
Daniel Schürmann
7b1f6fa6fc
aco: remove radeon_family from aco::Program
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:48 +00:00
Daniel Schürmann
1e8d367537
amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:48 +00:00
Daniel Schürmann
febc29907c
amd: add and use ac_cu_info::has_gfx6_mrt_export_bug
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:47 +00:00
Daniel Schürmann
7b7bdb76ab
amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:47 +00:00
Daniel Schürmann
cfb745592d
amd: add ac_cu_info::has_mad32 flag and use in ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:47 +00:00
Daniel Schürmann
1e3db50170
aco: use additional flags from ac_cu_info
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f7c4aa48a0
ac/gpu_info: add some more flags to ac_cu_info
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f791e46c47
aco: add ac_cu_info to aco_compiler_options
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
addd4ea59f
aco: pass aco_compiler_options to init_program()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
bf9bec07c2
aco/tests: don't pass CHIP_UNKNOWN to ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
6f4e8046b5
ac/gpu_info: create separate function ac_fill_cu_info() to fill out CU info
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:45 +00:00
Daniel Schürmann
749c619c45
ac/gpu_info: correct some SGPR and VGPR allocation values in ac_cu_info
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:45 +00:00
Daniel Schürmann
553b431aca
ac/gpu_info: move some CU information into separate struct ac_cu_info
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:44 +00:00
Daniel Schürmann
0db1ae1f01
aco: disable XNACK on all GPUs
...
Affects code generation on GFX8 and GFX9 APUs where we misunderstood
the feature. XNACK replay is not being used with graphics APIs.
Totals from 41759 (65.90% of 63370) affected shaders: (Raven)
MaxWaves: 298672 -> 299000 (+0.11%)
Instrs: 19200726 -> 19138227 (-0.33%); split: -0.33%, +0.00%
CodeSize: 98501904 -> 98253196 (-0.25%); split: -0.26%, +0.00%
SGPRs: 3058544 -> 2831492 (-7.42%)
VGPRs: 1644896 -> 1643660 (-0.08%)
Latency: 193383803 -> 193224047 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 92741082 -> 92698975 (-0.05%); split: -0.05%, +0.00%
SClause: 678580 -> 630107 (-7.14%); split: -7.15%, +0.00%
Copies: 1863375 -> 1863406 (+0.00%); split: -0.04%, +0.04%
VALU: 13791245 -> 13791267 (+0.00%); split: -0.00%, +0.00%
SALU: 2066726 -> 2066741 (+0.00%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:43 +00:00
Daniel Schürmann
d94e90df25
amd/common: link with libamdgpu_addrlib
...
ac_surface needs that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:41 +00:00
Samuel Pitoiset
045b778ed6
radv: add the SQTT relocated shaders BO to the cmdbuf list
...
Found this while debugging another thing with amdgpu.debug_mask=0x1 (VM).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39002 >
2025-12-22 07:13:06 +00:00
Daniel Schürmann
f930ecdc55
amd: add newer small APUs to get_task_num_entries()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38999 >
2025-12-19 13:03:49 +00:00
Benjamin Cheng
fa8b0b6bbb
radv/video: Enable write combine for decode
...
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39025 >
2025-12-18 15:25:57 -05:00
Pierre-Eric Pelloux-Prayer
645fff5dae
ac/descriptors: account for num_storage_samples for gfx10
...
This fixes a page fault when nr_samples=4 but nr_storage_samples=2.
Based on si_is_format_supported this is only supported for color
formats and when has_eqaa_surface_allocator is true (< GFX11).
The referenced commit below didn't introduce the issue but it
exposed it by forcing the gfx blit path to be used.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13255
Fixes: 3424e16ece ("radeonsi: add decision code to select when to use CB_RESOLVE for performance")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38925 >
2025-12-18 10:45:49 +00:00
Marek Olšák
3c5c96fedb
radv: double pixel throughput in certain cases of PS without interpolated inputs
...
This reduces the number of initialized VGPRs by 1 when no barycentric
coordinates are used.
I have verified with zink that this indeed increases performance for
cases where sysvals like frag_coord and front_face are used without
interpolated PS inputs.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38936 >
2025-12-18 03:37:58 +00:00
Emma Anholt
059d301c79
nir: Drop the mode argument of nir_lower_vars_to_scratch().
...
It only makes sense for function temps, and that's the only way it's been
used.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245 >
2025-12-17 19:50:28 +00:00
Samuel Pitoiset
f8feed17e1
ac,radv,radeonsi: add tracked register macros to common code
...
Because the tracked registers are really driver-dependent, the driver
is expected to handle the tracked_registers struct itself.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:26 +00:00
Samuel Pitoiset
c580fc667f
ac,radv: add ac_cmdbuf::context_roll and use it
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:26 +00:00
Samuel Pitoiset
f3b385859a
ac,radv: add more cmdbuf emit helpers
...
Some can't be shared with RadeonSI because it uses templates in some
places.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:25 +00:00
Samuel Pitoiset
b444dc145a
radv: remove redundant assertions in radeon_emit_{array}()
...
The common helpers already have assertions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:25 +00:00
Samuel Pitoiset
262fc80e45
ac,radv,radeonsi: add functions to initialize tracked regs
...
Also initialize the new slots for RADV.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:25 +00:00
Samuel Pitoiset
44314e1ea6
ac,radv,radeonsi: add ac_tracked_regs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:24 +00:00
Samuel Pitoiset
c97bd17d4d
radv: switch to AC_TRACKED_xxx
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:23 +00:00
Samuel Pitoiset
fad24d6fcc
ac/cmdbuf: add new slots to ac_tracked_reg
...
For RADV registers that aren't tracked in RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:23 +00:00
Samuel Pitoiset
18bdb76408
ac,radeonsi: move si_tracked_reg to common code
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740 >
2025-12-17 15:09:22 +00:00
Samuel Pitoiset
5d76202b6d
radv: create descriptors for color/depth-stencil surfaces earlier
...
For less CPU overhead when rendering begins and also because it's
easy to pre-compute those descriptors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714 >
2025-12-17 11:11:18 +00:00
Samuel Pitoiset
c8729cdd3c
radv/meta: stop passing a stencil attachment for depth decompress
...
It should only be the depth aspect.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714 >
2025-12-17 11:11:18 +00:00
Samuel Pitoiset
43d7d97b13
radv/meta: inject image view usage info
...
This will be used to initialize color/depth-stencil descriptors earlier
when the image view is created.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714 >
2025-12-17 11:11:18 +00:00
Samuel Pitoiset
ce69cabb60
radv: constify radv_{cb,ds}_buffer_info parameters
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714 >
2025-12-17 11:11:18 +00:00
Georg Lehmann
0478021fdc
aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)
...
Foz-DB Navi48:
Totals from 2484 (2.54% of 97637) affected shaders:
Instrs: 10368279 -> 10361892 (-0.06%); split: -0.06%, +0.00%
CodeSize: 55161104 -> 55150752 (-0.02%); split: -0.02%, +0.00%
SpillSGPRs: 14665 -> 14666 (+0.01%)
Latency: 87694014 -> 87689324 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16595764 -> 16594448 (-0.01%); split: -0.01%, +0.00%
VClause: 209922 -> 209918 (-0.00%); split: -0.01%, +0.00%
SClause: 205195 -> 205251 (+0.03%); split: -0.01%, +0.04%
Copies: 843771 -> 843765 (-0.00%); split: -0.01%, +0.01%
Branches: 275985 -> 275962 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 170608 -> 170494 (-0.07%)
VALU: 5840893 -> 5838038 (-0.05%); split: -0.05%, +0.00%
SALU: 1481388 -> 1479037 (-0.16%); split: -0.16%, +0.00%
VOPD: 7496 -> 7485 (-0.15%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730 >
2025-12-17 08:41:32 +00:00
Georg Lehmann
a8f5ced670
aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)
...
Foz-DB Navi48:
Totals from 14608 (14.96% of 97637) affected shaders:
MaxWaves: 364201 -> 364421 (+0.06%)
Instrs: 28051720 -> 28022503 (-0.10%); split: -0.13%, +0.03%
CodeSize: 148938740 -> 148943480 (+0.00%); split: -0.04%, +0.04%
VGPRs: 994520 -> 994004 (-0.05%); split: -0.05%, +0.00%
SpillSGPRs: 45182 -> 45179 (-0.01%)
Latency: 187734461 -> 187725301 (-0.00%); split: -0.07%, +0.06%
InvThroughput: 33967002 -> 33949881 (-0.05%); split: -0.11%, +0.06%
VClause: 495237 -> 495207 (-0.01%); split: -0.03%, +0.02%
Copies: 2048324 -> 2047937 (-0.02%); split: -0.12%, +0.10%
Branches: 598445 -> 598431 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 877715 -> 877684 (-0.00%)
PreVGPRs: 778146 -> 776383 (-0.23%); split: -0.23%, +0.00%
VALU: 16413380 -> 16391508 (-0.13%); split: -0.15%, +0.01%
SALU: 3685279 -> 3677655 (-0.21%); split: -0.23%, +0.02%
VOPD: 26219 -> 25926 (-1.12%); split: +0.43%, -1.55%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730 >
2025-12-17 08:41:31 +00:00
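Note on the two aco/optimizer reassociations above: the sketch below illustrates the arithmetic they rely on, not the ACO code itself. It assumes the output modifier (omod) can scale a result by 2, 4 or 0.5, and that only those constants are folded. The function names are hypothetical, and the real pass presumably checks further conditions that this log does not describe.

```python
# Output-modifier factors assumed for this sketch: *2, *4 and *0.5.
OMOD_FACTORS = (2.0, 4.0, 0.5)


def fold_mul_mul(const):
    """mul(mul(a, const), b) -> mul_omod(a, b): the constant itself
    becomes the output modifier. Returns None if it cannot be folded."""
    return const if const in OMOD_FACTORS else None


def fold_rcp_mul(const):
    """rcp(mul(a, const)) -> rcp_omod(a): the reciprocal of the constant
    becomes the output modifier. Returns None if it cannot be folded."""
    if const == 0.0:
        return None
    inverse = 1.0 / const
    return inverse if inverse in OMOD_FACTORS else None


a, b = 3.0, 5.0

# Scaling by a power of two is exact (barring overflow/underflow),
# so moving the factor to the end does not change the result.
assert (a * 2.0) * b == (a * b) * fold_mul_mul(2.0)
assert 1.0 / (a * 0.5) == (1.0 / a) * fold_rcp_mul(0.5)

# Constants outside the assumed omod set are left alone.
assert fold_mul_mul(3.0) is None
assert fold_rcp_mul(4.0) is None  # would need a 0.25 modifier
```

Either rewrite removes one multiply per match, which is consistent with the small VALU reductions reported in the Foz-DB totals.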
Daniel Schürmann
125ac1626d
radv: remove precomputed registers from radv_shader_binary
...
It is enough to compute them after upload.
This saves some disk space and eliminates an unlikely
bug where the shader cache is shared between two GPUs
with the same chip but a different number of enabled CUs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38970 >
2025-12-17 08:16:06 +00:00