mesa/src/amd/common
Rhys Perry 7d552d71e9 ac/nir: optimize txd(coord, ddx/ddy(coord))
This is done in ac_nir_lower_tex so that we can optimize derivative
calculations with a different exec mask than the texture sample by using
the nir_strict_wqm_coord_amd path.

ac_nir_lower_tex is also more aware of divergence than nir_lower_tex is.
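
The pattern named in the commit title can be illustrated with a toy model. The sketch below is not the NIR API and not the code from this commit: every type and function name in it (toy_def, toy_tex, toy_opt_txd) is hypothetical. It only shows the shape of the match, assuming that a txd whose explicit derivatives are ddx/ddy of its own coordinate can be treated like a sample with implicit derivatives. Exec masks, the nir_strict_wqm_coord_amd path and divergence, which the commit message says the real pass takes into account, are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy IR for illustration only. None of these names exist in Mesa. */
enum toy_op { TOY_VALUE, TOY_DDX, TOY_DDY };

struct toy_def {
   enum toy_op op;
   const struct toy_def *src; /* operand of TOY_DDX/TOY_DDY, NULL otherwise */
};

enum toy_tex_op { TOY_TEX, TOY_TXD };

struct toy_tex {
   enum toy_tex_op op;
   const struct toy_def *coord;
   const struct toy_def *ddx; /* explicit derivatives, only set for TOY_TXD */
   const struct toy_def *ddy;
};

/* Rewrite txd(coord, ddx(coord), ddy(coord)) to tex(coord).
 * Returns true if the instruction was changed. */
static bool
toy_opt_txd(struct toy_tex *tex)
{
   assert(tex != NULL);

   if (tex->op != TOY_TXD || !tex->ddx || !tex->ddy)
      return false;

   /* Both derivatives must be taken of the sampled coordinate itself. */
   if (tex->ddx->op != TOY_DDX || tex->ddx->src != tex->coord)
      return false;
   if (tex->ddy->op != TOY_DDY || tex->ddy->src != tex->coord)
      return false;

   /* The explicit derivatives are redundant: drop them. */
   tex->op = TOY_TEX;
   tex->ddx = NULL;
   tex->ddy = NULL;
   return true;
}
```

In this toy, a txd whose derivatives come from some other value is left untouched, so the rewrite only fires when the explicit derivatives carry no extra information.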

fossil-db (gfx1201):
Totals from 103 (0.13% of 79839) affected shaders:
MaxWaves: 2610 -> 2620 (+0.38%)
Instrs: 347283 -> 345912 (-0.39%); split: -0.40%, +0.00%
CodeSize: 1892380 -> 1883824 (-0.45%); split: -0.46%, +0.00%
VGPRs: 8028 -> 7824 (-2.54%)
Latency: 3942575 -> 3939623 (-0.07%); split: -0.08%, +0.01%
InvThroughput: 867147 -> 865281 (-0.22%); split: -0.24%, +0.02%
VClause: 6230 -> 6221 (-0.14%); split: -0.19%, +0.05%
SClause: 3910 -> 3914 (+0.10%); split: -0.26%, +0.36%
Copies: 16091 -> 15721 (-2.30%); split: -2.74%, +0.44%
PreSGPRs: 4651 -> 4658 (+0.15%)
PreVGPRs: 6389 -> 6320 (-1.08%); split: -1.17%, +0.09%
VALU: 228715 -> 227490 (-0.54%); split: -0.54%, +0.01%
SALU: 32763 -> 32767 (+0.01%); split: -0.06%, +0.07%
VMEM: 9027 -> 9024 (-0.03%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>
2025-10-23 11:22:00 +00:00
nir ac/nir: optimize txd(coord, ddx/ddy(coord)) 2025-10-23 11:22:00 +00:00
virtio ac/virtio: fix alignment of metadata command 2025-06-27 08:15:50 +00:00
.clang-format amd: import libdrm_amdgpu ioctl wrappers 2024-11-25 21:03:41 -05:00
ac_binary.c ac/llvm: remove LDS linking code 2025-07-12 10:28:21 +00:00
ac_binary.h radv: only call radv_should_use_wgp_mode() once 2025-10-15 13:37:48 +01:00
ac_cmdbuf.c amd: move CP emit helpers to ac_cmdbuf_cp.c/h 2025-10-21 13:31:20 +02:00
ac_cmdbuf.h amd,radv,radeonsi: add and use more ac_cmdbuf_XXX helpers 2025-10-23 08:29:26 +00:00
ac_cmdbuf_cp.c amd,radv,radeonsi: add ac_emit_cp_release_mem() 2025-10-23 08:29:27 +00:00
ac_cmdbuf_cp.h amd,radv,radeonsi: add ac_emit_cp_release_mem() 2025-10-23 08:29:27 +00:00
ac_cmdbuf_sdma.c amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
ac_cmdbuf_sdma.h amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
ac_debug.c radv: Avoid calls to strlen when parsing umr output to speed up hang progressing 2025-08-07 06:44:41 +00:00
ac_debug.h ac: Add VCN IB parser 2024-09-23 19:25:08 +00:00
ac_descriptors.c ac/descriptors: add a function to create a descriptor for HiZ surfaces 2025-08-12 13:48:09 +00:00
ac_descriptors.h ac/descriptors: add a function to create a descriptor for HiZ surfaces 2025-08-12 13:48:09 +00:00
ac_drm_fourcc.h ac/surface: add radeonsi exported modifiers to supported list 2025-09-15 09:39:19 +00:00
ac_fake_hw_db.h ac: define physical VGPRs for fake hw overrides 2025-04-07 19:44:22 +00:00
ac_formats.c ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_formats.h ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_gather_context_rolls.c ac: Improve context roll readability 2024-03-19 16:08:14 +00:00
ac_gpu_info.c amd: change radeon_info::lds_size_per_workgroup for GFX10+ to 64KB 2025-10-15 11:20:09 +00:00
ac_gpu_info.h amd/common: remove radeon_info::lds_alloc_granularity and radeon_info::lds_encode_granularity 2025-10-15 11:20:08 +00:00
ac_hw_stage.h amd: Move ac_hw_stage to its own file 2023-07-03 21:12:45 +00:00
ac_ib_parser.c ac/parse_ib: Replace the parameter list with ac_ib_parser 2024-03-19 16:08:13 +00:00
ac_linux_drm.c ac/info: add ac_drm_query_pci_bus_info 2025-06-27 08:15:50 +00:00
ac_linux_drm.h radv: Move the amdgpu.h defines for Win32 to ac_linux_drm.h 2025-08-07 07:47:42 +00:00
ac_msgpack.c ac: remove unused code 2024-12-26 10:12:43 +00:00
ac_msgpack.h ac: remove unused code 2024-12-26 10:12:43 +00:00
ac_parse_ib.c ac/parse_ib: Update vcn ib parser to include missing commands 2025-10-03 14:44:07 +00:00
ac_perfcounter.c ac: fix potential overflows 2025-07-04 15:26:38 +00:00
ac_perfcounter.h ac/perfcounter: compute the number of global instances of TCP,SQ,GL1C and GL2C 2023-09-14 14:17:19 +00:00
ac_pm4.c amd,radv,radeonsi: add ac_pm4_emit_commands() 2025-10-23 08:29:24 +00:00
ac_pm4.h amd,radv,radeonsi: add ac_pm4_emit_commands() 2025-10-23 08:29:24 +00:00
ac_rgp.c amd: change radeon_info::lds_size_per_workgroup for GFX10+ to 64KB 2025-10-15 11:20:09 +00:00
ac_rgp.h ac/rgp: update dumping queue event records to the capture 2023-11-13 08:53:09 +00:00
ac_rgp_elf_object_pack.c all: rename gl_shader_stage to mesa_shader_stage 2025-08-06 10:28:40 +08:00
ac_rtld.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_rtld.h all: rename gl_shader_stage to mesa_shader_stage 2025-08-06 10:28:40 +08:00
ac_shader_args.c ac: Don't include full nir.h anymore. 2025-02-12 22:33:07 +01:00
ac_shader_args.h radv: declare a new user SGPR for dynamic descriptors 2025-10-14 15:34:43 +00:00
ac_shader_debug_info.h amd: Add ac_shader_debug_info 2024-11-11 08:39:13 +00:00
ac_shader_util.c amd: stop using custom gl_access_qualifier for access type 2025-08-15 08:26:10 +00:00
ac_shader_util.h amd: add and use utility functions for LDS size encoding 2025-10-15 11:20:08 +00:00
ac_shadowed_regs.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_shadowed_regs.h radv: remove useless radeon_cmdbuf forwarded declaration 2025-10-08 18:00:13 +00:00
ac_spm.c amd,radv,radeonsi: add ac_emit_spm_setup() 2025-10-23 08:29:27 +00:00
ac_spm.h amd,radv,radeonsi: add ac_emit_spm_setup() 2025-10-23 08:29:27 +00:00
ac_sqtt.c ac: fix potential overflows 2025-07-04 15:26:38 +00:00
ac_sqtt.h radv: replace radeon_cmdbuf by ac_cmdbuf completely 2025-10-08 18:00:15 +00:00
ac_surface.c ac/surface: Limit video modifiers to 64K_S also for VCN 2.2 2025-10-15 10:24:29 +00:00
ac_surface.h ac/surface: add ac_compute_surface_modifier 2025-09-15 09:39:19 +00:00
ac_surface_meta_address_test.c radv: don't include amdgpu.h directly 2025-08-28 18:08:20 +00:00
ac_surface_modifier_test.c radv: don't include amdgpu.h directly 2025-08-28 18:08:20 +00:00
ac_uvd_dec.c ac/uvd: Add ac_uvd_alloc_stream_handle 2025-05-13 09:36:47 +00:00
ac_uvd_dec.h ac/uvd: Add ac_uvd_alloc_stream_handle 2025-05-13 09:36:47 +00:00
ac_vcn.h ac/vcn: Add RADEON_VCN_IB_COMMON_OP_RESOLVEINPUTPARAMLAYOUT 2025-09-08 10:52:05 +00:00
ac_vcn_av1_default.h ac/radeonsi: add av1 defaults header file from radeonsi 2023-06-16 05:53:44 +00:00
ac_vcn_dec.c radeonsi/vcn: vcn5 av1 decoding context buffer fix 2025-07-18 16:45:42 +00:00
ac_vcn_dec.h ac/vcn_dec: Add av1_intrabc_workaround 2025-08-20 09:51:32 +00:00
ac_vcn_enc.c ac: Add ac_vcn_init_enc_cmds 2024-09-20 06:58:29 +00:00
ac_vcn_enc.h radeonsi/vcn: Support BT2020 matrix with EFC 2025-10-15 06:06:44 +00:00
ac_vcn_enc_av1_default_cdf.h ac,radeonsi: move vcn enc av1 default cdf file to common 2023-09-14 07:51:24 +00:00
ac_vcn_vp9_default.h amd: move vp9 probs table to common code. 2025-06-09 20:46:03 +00:00
amd_family.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
amd_family.h amd: add vpe_version 2025-06-12 07:44:27 +00:00
gfx10_format_table.h amd/common: only pass gfx_level to ac_get_gfx10_format_table() 2024-05-22 08:31:39 +00:00
gfx10_format_table.py ac/gfx10_format_table: Use new names for 422 subsampled formats 2025-10-14 09:33:28 +00:00
meson.build amd: move CP emit helpers to ac_cmdbuf_cp.c/h 2025-10-21 13:31:20 +02:00
sid.h radv,radeonsi: emit UPDATE_DB_SUMMARIZER_TIMEOUT on GFX12 2025-06-02 07:30:18 +00:00
sid_tables.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00