Commit graph

20134 commits

Author SHA1 Message Date
Marek Olšák
a9e47751d2 ac: lower load_subgroup_id for ACO in NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Marek Olšák
0a9bdcac79 ac: lower load_workgroup_ids for ACO in NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Daniel Schürmann
97f095f6e0 aco/lower_branches: Add try_rotate_latch_block() optimization
This optimization looks for unconditional back-edges and aims
to rotate the loop in a way that the final block is emitted
before the loop header, essentially turning

BB1:
  if (cond)
    goto BB3;
BB2:
  <loop body>
  goto BB1;
BB3:
  ...

into

  goto BB1;
BB2:
  <loop body>
BB1:
  if (!cond)
    goto BB2;
BB3:
  ...

Totals from 4969 (5.89% of 84383) affected shaders: (Navi48)

Instrs: 15253038 -> 15254019 (+0.01%); split: -0.00%, +0.01%
CodeSize: 81225300 -> 81227696 (+0.00%); split: -0.02%, +0.02%
Latency: 320796283 -> 320693480 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 51395922 -> 51376156 (-0.04%); split: -0.04%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
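A minimal sketch of the rotation described in 97f095f6e0, written as plain C with goto rather than ACO IR. The function names and the summing loop body are illustrative assumptions; only the block layout follows the commit message.

```c
#include <assert.h>

/* Original layout: the header (BB1) tests the exit condition and the
 * body (BB2) ends with an unconditional back-edge to the header. */
int sum_header_first(int n)
{
    int sum = 0;
BB1:
    if (n <= 0)
        goto BB3;
    /* BB2: loop body */
    sum += n;
    n--;
    goto BB1;
BB3:
    return sum;
}

/* Rotated layout: one jump into the condition block up front, then the
 * body falls through into the condition, which branches back. */
int sum_rotated(int n)
{
    int sum = 0;
    goto BB1;
BB2:
    sum += n;
    n--;
BB1:
    if (!(n <= 0))
        goto BB2;
    /* BB3: code after the loop */
    return sum;
}

/* Both layouts should compute the same result for any trip count. */
void check_rotation_equivalence(void)
{
    for (int n = -1; n <= 10; n++)
        assert(sum_header_first(n) == sum_rotated(n));
}
```

In this sketch the original form executes a conditional branch and an unconditional jump per iteration, while the rotated form executes a single conditional branch per iteration after the initial jump.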
Daniel Schürmann
ade5e300ab aco/insert_delay_alu: handle loop latch block before loop body
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
102aca9843 aco/assembler: emit block_kind_loop_latch before the loop header
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
da1594f8bb aco: introduce notion of block_kind_loop_latch
A block annotated with block_kind_loop_latch denotes
the re-entry point for a loop back-edge. It is emitted after
the loop preheader and (potentially) before the loop header.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
9887ce6709 aco/print_asm: Sort block markers by block offset
We are going to emit blocks in a different order.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
800a4957bb aco/lower_branches: Consider branch target of nested conditional branches
Totals from 1470 (1.74% of 84383) affected shaders: (Navi48)

Instrs: 5128451 -> 5126842 (-0.03%)
CodeSize: 29359832 -> 29353656 (-0.02%); split: -0.02%, +0.00%
Latency: 41047203 -> 41040786 (-0.02%)
InvThroughput: 6040459 -> 6039619 (-0.01%); split: -0.01%, +0.00%
Branches: 146219 -> 144648 (-1.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
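The percentages in these shader-stats reports can be reproduced from the raw counts, assuming they are plain ratios rounded to two decimals. The helper names below are hypothetical and are not part of the tooling that produced the numbers.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
double percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}

/* Share of `part` in `total`, in percent. */
double percent_of(double part, double total)
{
    assert(total != 0.0);
    return part / total * 100.0;
}
```

For the 800a4957bb report, percent_change(146219, 144648) is roughly -1.07 (the Branches line) and percent_of(1470, 84383) is roughly 1.74 (the share of affected shaders).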
Daniel Schürmann
fbf2083b8f aco/isel: Don't emit ELSE side of divergent branches which jump
Totals from 50 (0.06% of 84383) affected shaders: (Navi48)

Instrs: 402490 -> 402444 (-0.01%); split: -0.01%, +0.00%
CodeSize: 2239024 -> 2238864 (-0.01%); split: -0.01%, +0.00%
SpillSGPRs: 1493 -> 1496 (+0.20%)
Latency: 5836785 -> 5836747 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1120893 -> 1120909 (+0.00%); split: -0.00%, +0.00%
Copies: 46128 -> 46082 (-0.10%)
VALU: 222708 -> 222715 (+0.00%); split: -0.00%, +0.00%
SALU: 53039 -> 52993 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ba32219cf8 aco/isel: Don't emit ELSE side of uniform branches which jump
Totals from 4 (0.00% of 84383) affected shaders: (Navi48)

Instrs: 16473 -> 16468 (-0.03%)
CodeSize: 85276 -> 85300 (+0.03%)
SpillSGPRs: 175 -> 176 (+0.57%)
Latency: 267907 -> 267885 (-0.01%)
InvThroughput: 36302 -> 36298 (-0.01%)
Copies: 1353 -> 1345 (-0.59%)
VALU: 9025 -> 9029 (+0.04%)
SALU: 2635 -> 2627 (-0.30%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
96a639918c aco: don't emit p_logical_start / p_logical_end after divergent branches
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
3743230252 aco/isel: Do IF-simplification if that didn't happen during NIR optimizations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:43 +00:00
Daniel Schürmann
50b093ec90 aco/builder: Fix v_add_co_u32 carry-out to VCC if post_ra
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:43 +00:00
Eric Engestrom
fb1cb00a96 radv/ci: add vulkan fluster job on navi48
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39861>
2026-02-13 13:48:03 +00:00
Samuel Pitoiset
1be4ffdff9 ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
This fixes new VKCTS coverage
dEQP-VK.api.copy_and_blit.core.use_after_copy.*.

is_stencil isn't set for RadeonSI because it doesn't do SDMA copies
with Z/S.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39800>
2026-02-13 07:52:29 +01:00
Samuel Pitoiset
4ec3840184 radv/meta: move the barrier for depth/stencil compute resolves outside
This barrier is only needed for rendering resolves (i.e. not for
vkCmdResolveImage()). It's similar to color compute resolves now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805>
2026-02-12 20:17:20 +00:00
Samuel Pitoiset
3e41b04de9 radv/meta: optimize a barrier with depth/stencil compute resolves
The compute resolve doesn't use HTILE of the destination image, so the
potential HTILE clear can run in parallel.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805>
2026-02-12 20:17:20 +00:00
Samuel Pitoiset
85a3f7816d radv/meta: add HTILE support to radv_fixup_resolve_dst_metadata()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805>
2026-02-12 20:17:20 +00:00
Samuel Pitoiset
6a454dabda radv/meta: stop fixing up HTILE after a partial resolve using compute
The decompression pass already resets HTILE to its uncompressed state,
so this is just redundant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805>
2026-02-12 20:17:19 +00:00
Samuel Pitoiset
c3cc6fd051 radv: cleanup barriers after a depth/stencil expand
Synchronizing in radv_expand_depth_stencil() is more robust.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805>
2026-02-12 20:17:19 +00:00
Samuel Pitoiset
7dd7731ac7 radv/meta: fix partial depth/stencil resolves with compute
HTILE must be decompressed for partial resolves when the hw doesn't
write the decompressed DWORD to HTILE. The driver must also
synchronize the depth/stencil expand if using graphics (the compute
path is already correctly synchronized in the helper).

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805>
2026-02-12 20:17:18 +00:00
David Rosca
5d4f977573 radv/video: Support UVD decode on hawaii and older
H264 requires extra allocation in DPB. Use helper function
to get the required size, same as we do for encode.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:27 +00:00
David Rosca
24c74f522c ac/vcn_dec: Make the helper functions static
They are only used in ac_vcn_dec.c now.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
7ad4f501fa radv: Drop videoarraypath debug option
It's not really useful and only works for H264/5.
On AV1/VP9 it would cause a hang.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
19a8b7121e radv/video: Remove old VCN and UVD decode implementation
Only ac_video_dec is now used.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
Benjamin Cheng
6aed906410 radv/video: Use ac_video_dec for decode
Supports VCN and UVD.

Co-authored-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
26979becec radeonsi/video: Add video decoder using ac_video_dec
Supports VCN, VCN JPEG and UVD.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
4d06fb9acd ac: Add UVD ac_video_dec implementation
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
9608abb26b ac: Add VCN JPEG ac_video_dec implementation
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
79af03556c ac: Add VCN ac_video_dec implementation
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
b5028e84c8 ac: Add video decode interface
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:25 +00:00
Samuel Pitoiset
02a2451e1f radv: rename radv_image_use_dcc_image_stores()
To radv_image_compress_dcc_on_image_stores() because it seems more
informative.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39803>
2026-02-12 15:18:26 +00:00
Samuel Pitoiset
d58080f787 radv/meta: add a function to fixup DCC metadata for compute resolves
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39803>
2026-02-12 15:18:25 +00:00
Samuel Pitoiset
ed166804f6 radv/meta: remove a useless barrier when fixing up DCC for compute resolves
The resolve operation doesn't use DCC of the destination image, so the
clear can run in parallel.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39803>
2026-02-12 15:18:25 +00:00
Samuel Pitoiset
a673c9e414 radv/meta: stop fixing up DCC after a partial resolve using compute
The decompression pass already resets DCC to its uncompressed state,
so this is just redundant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39803>
2026-02-12 15:18:25 +00:00
Konstantin Seurer
f574de2249 radv: Fix setting the viewport for depth stencil FS resolves
Fixes: 704fbbb ("radv/meta: rework depth/stencil resolves using graphics")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39836>
2026-02-12 14:25:31 +00:00
Konstantin Seurer
bc86c5adae radv: Stop saving descriptors before acceleration structure OPs
They only use compute+constants.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39836>
2026-02-12 14:25:31 +00:00
Ansari, Muhammad
d42268f3e5 amd/vpelib: Adding new wrapper for register profiling
[WHY]
To read back register read/write counts from VPEs, we need to add a new
wrapper function.

[HOW]
Added a wrapper that calls the build command and populates the register
profiling data structure.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Muhammad Ansari <Muhammad.Ansari@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:26 +00:00
Ali, Nawwar
2a5124a09f amd/vpelib: Fix crash during encoding test
[WHY]
Fix crash during encoding test

Co-authored-by: Agate, Jesse <Jesse.Agate+amdeng@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Nawwar Ali <Nawwar.Ali@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
Agate, Jesse
39187b36b5 amd/vpelib: Add RGB 601 Primaries to BG Color
[WHY]
RGB 601 Primaries are missing from vpe_is_limited_cs

[HOW]
Add 601 primaries to the switch statement

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-Off-by: Jesse Agate <Jesse.Agate@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
Rouf, Farhan
edf352a71a amd/vpelib: Embedded Buffer Size for 3DLUT FL
[WHY]
The embedded-buffer usage decision should be based on the stream's 3DLUT
mode rather than a loosely defined tm_enabled boolean.

[HOW]
- Replace cmd_info.tm_enabled with cmd_info.lut3d_type
- Add vpe_get_stream_lut3d_type() helper and use it in cmd info/buffer req
- Prefix internal helpers (vpe_calculate_scaling_ratios, vpe_should_generate_cmd_info)

Signed-Off-by: Farhan Rouf <Farhan.Rouf@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
Assadian, Navid
dd7c2f9528 amd/vpelib: Reorder function pointers
[HOW]
- Re-order the function pointer assignments to have the same order as
defined.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-Off-by: Navid Assadian <Navid.Assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
You, Min-Hsuan
e33bbe7ee7 amd/vpelib: refactor minor change
Make dscl_set_scaler_position be a function pointer

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Min-Hsuan You <Min-Hsuan.You@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
Chan, Roy
3d750ed881 amd/vpelib: fix uninitialized variable
[WHY]
The packet header has uninitialized fields that can introduce 1'b1 in
reserved bits.

[HOW]
initialize the header to 0

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Roy Chan <Roy.Chan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
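A minimal sketch of the fix described in 3d750ed881: zero the whole header before filling in the meaningful fields, so reserved bits cannot pick up stack garbage. The struct layout and function name are hypothetical and do not reflect the actual vpelib packet definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet header; field names and widths are illustrative. */
struct packet_header {
    uint32_t opcode;
    uint32_t size;
    uint32_t reserved;
};

/* Start from an all-zero header, then set only the fields that carry
 * meaning. Reserved fields are never written again. */
struct packet_header make_header(uint32_t opcode, uint32_t size)
{
    struct packet_header hdr;

    memset(&hdr, 0, sizeof(hdr));
    hdr.opcode = opcode;
    hdr.size = size;

    /* Reserved bits must stay clear. */
    assert(hdr.reserved == 0);
    return hdr;
}
```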
Ali, Nawwar
3216b0c193 amd/vpelib: coding style rectify
Revised the coding style

Co-authored-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Nawwar Ali <Nawwar.Ali@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:25 +00:00
Ansari, Muhammad
58c544a9bd amd/vpelib: Fix potential overflow calculation
[WHY]
Multiplication result may overflow int before it is converted to long
long

[HOW]
Updated the expression to avoid possible overflow

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Muhammad Ansari <Muhammad.Ansari@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:24 +00:00
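The class of bug described in 58c544a9bd can be sketched as follows. The commit does not show the affected expression, so the operand names and the function are assumptions chosen for illustration.

```c
#include <assert.h>

/* Overflow-prone form, shown only as a comment because signed overflow
 * is undefined behaviour in C:
 *
 *     long long area = width * height;
 *
 * Both operands are int, so the product is computed as int and only the
 * (possibly overflowed) result is converted to long long.
 */

/* Widening one operand first makes the multiplication itself happen in
 * long long, so the full product is representable. */
long long area_ll(int width, int height)
{
    assert(width >= 0 && height >= 0);
    return (long long)width * height;
}
```

With 32-bit int, area_ll(65536, 65536) yields 4294967296, a value the int-typed product could not hold.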
Lin, Ricky
dbff0fabf0 amd/vpelib: Augment swizzling modes
[WHY]
Support different generations of swizzle mode.

[HOW]
Added different swizzle mode parameters for supporting plane
description.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Ricky Lin <Ricky.Lin@amdeng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:24 +00:00
Ali, Nawwar
f3db1d5f46 amd/vpelib: update 3dlut and shaper FL
[WHY]
Fast load support is required for 3DLUT and Shaper features.
The calculation logic needs to be modularized and exposed via
the resource interface to support this.

[HOW]
1. Add `calculate_shaper` and `program_fastload` function pointers to the `resource` struct.
2. Move shaper normalization, HDR multiplier update, and 3DLUT update logic from
   `vpe_color_update_movable_cm` into a new core function `vpe_calculate_shaper`.
3. Implement `vpe10_calculate_shaper` and assign it to the resource function table for VPE10 and VPE11.
4. Update `vpe_create_engine` return signature to remove `const` qualifier.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Nawwar Ali <Nawwar.Ali@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:24 +00:00
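The function-table approach in f3db1d5f46 can be illustrated with a heavily simplified sketch. The member names calculate_shaper and program_fastload and the name vpe10_calculate_shaper come from the commit message; the signatures, the placeholder bodies, and the example_ helpers are assumptions, since the real definitions are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the vpelib resource struct. */
struct resource {
    int (*calculate_shaper)(int input);
    int (*program_fastload)(int lut_size);
};

/* Placeholder implementations; the real logic is not reproduced. */
int vpe10_calculate_shaper(int input) { return input * 2; }
int example_program_fastload(int lut_size) { return lut_size > 0; }

/* Fill the table once. Callers then dispatch through the pointers
 * without knowing which hardware generation provides them. */
void example_construct_resource(struct resource *res)
{
    assert(res != NULL);
    res->calculate_shaper = vpe10_calculate_shaper;
    res->program_fastload = example_program_fastload;
}
```

Under this layout, sharing one implementation between generations amounts to assigning the same function to both tables, which is what item 3 of the commit message describes for VPE10 and VPE11.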
Chang, Tomson
4ffd5a1c31 amd/vpelib: avoid using reg_update for multi-thread
[WHY]
The reg_update macro and its lastWritten_value rely on static global
variables and cannot support multi-threaded usage

[HOW]
remove reg_update usage and combine the separated calls together

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Tomson Chang <tomson.chang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39848>
2026-02-12 11:56:24 +00:00
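A hedged sketch of why a cache in a static global is a problem and what combining the calls can look like. This is a hypothetical reconstruction, not the vpelib macro: reg_update_global, reg_compose, and the 8-bit field layout are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Pattern being removed (reconstructed): every caller, on every thread,
 * reads and modifies the same file-scope cached value. */
static uint32_t last_written_value;

uint32_t reg_update_global(uint32_t mask, uint32_t bits)
{
    last_written_value = (last_written_value & ~mask) | (bits & mask);
    return last_written_value;
}

/* Alternative: build the complete register value from all fields in a
 * single call. Nothing outlives the call, so there is no shared state
 * for concurrent callers to race on. */
uint32_t reg_compose(uint32_t field_a, uint32_t field_b)
{
    assert(field_a <= 0xFFu && field_b <= 0xFFu);
    return field_a | (field_b << 8);
}
```

Two sequential reg_update_global() calls and one reg_compose() call produce the same value on a single thread; only the second form avoids depending on what another thread wrote last.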
Rhys Perry
c811348dc2 radv: include ahit/isec shaders in radv_get_shader_from_executable_index
This is necessary for GetPipelineExecutablePropertiesKHR, RADV_DEBUG and
fossil-db.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39827>
2026-02-12 11:31:37 +00:00