Sil Vilerino
19cbb13255
d3d12: Support slice NAL prefixes on slice notifications mode
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
df995c9631
d3d12: Implement multi-slice notifications
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
bedd423893
d3d12: Prepare d3d12_video_encoder_encode_bitstream for sliced encoding. Checked working with single slice buffer at this point
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
555a13661a
pipe: Add sliced encoding API and caps
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
c5f5ee41c8
d3d12: Add support for QP, SATD and RC bits output stats
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
25726509ff
pipe: Add PIPE_VIDEO_CAP_ENC_GPU_STATS_* and pipe_resource textures in H264/H265 encode pic params
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
56bbfb9598
d3d12: Add support for pipe_enc_move_rects for H264/H265 encode
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
7c490bb860
pipe: Add PIPE_VIDEO_CAP_ENC_MOVE_RECTS and pipe_enc_move_rects for H264/H265 encode
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
6d81bc0cdd
d3d12: Add support for pipe_enc_dirty_rects for H264/H265 encode
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
700c6fff58
pipe: Add PIPE_VIDEO_CAP_ENC_DIRTY_RECTS and pipe_enc_dirty_rects for H264/H265 encode
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
4e632ed891
d3d12: Use D3D12_FEATURE_VIDEO_ENCODER_SUPPORT2 when D3D12_VIDEO_USE_NEW_ENCODECMDLIST4_INTERFACE is set
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
731bc92e87
d3d12: Add #if guards for using new ID3D12VideoEncodeCommandList4
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
2465dcf4cc
d3d12: Fix reporting for PIPE_VIDEO_CAP_ENC_MAX_DPB_CAPACITY
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
0577c77a4c
d3d12: Report pipe_enc_cap_roi.log2_roi_min_block_pixel_size
...
Reviewed-By: Pohsiang Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Sil Vilerino
17cdbc5729
pipe: Add pipe_enc_cap_roi.log2_roi_min_block_pixel_size
...
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Pohsiang (John) Hsu
295ecc8d96
d3d12: enable D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_FLAG_ENABLE_LONG_TERM_REFERENCES when max_num_ltr_frames > 0
...
Signed-off-by: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Pohsiang (John) Hsu
ebd82f39f8
d3d12: Add support for retreiving PIPE_VIDEO_CAP_ENC_MAX_DPB_CAPACITY for H264/H265 encode
...
Signed-off-by: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:22 +00:00
Pohsiang (John) Hsu
7557ce41be
pipe: add PIPE_VIDEO_CAP_ENC_MAX_DPB_CAPACITY for H264/H265 encode
...
Signed-off-by: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:21 +00:00
Pohsiang (John) Hsu
ea7ef6d575
d3d12: Add support for retrieving PIPE_VIDEO_CAP_ENC_MAX_LONG_TERM_REFERENCES_PER_FRAME for H264/H265 encode
...
Signed-off-by: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:21 +00:00
Pohsiang (John) Hsu
743f0a8df1
pipe: add PIPE_VIDEO_CAP_ENC_MAX_LONG_TERM_REFERENCES_PER_FRAME for H264/H265 encode
...
Signed-off-by: Pohsiang Hsu <pohhsu@microsoft.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34844 >
2025-05-08 14:17:21 +00:00
Rhys Perry
2704a30df0
radv: perform nir_opt_access before the first radv_optimize_nir
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Two lowered loads might not be CSE'd after nir_lower_explicit_io if one of
them is shrinked. This doesn't happen for deref loads, but it needs the
CAN_REORDER flag first.
fossil-db (gfx1201):
Totals from 556 (0.70% of 79377) affected shaders:
MaxWaves: 14936 -> 14940 (+0.03%); split: +0.05%, -0.03%
Instrs: 2140334 -> 2140942 (+0.03%); split: -0.07%, +0.10%
CodeSize: 11137948 -> 11145416 (+0.07%); split: -0.07%, +0.13%
SpillSGPRs: 2385 -> 2527 (+5.95%); split: -0.34%, +6.29%
Latency: 12310570 -> 12305011 (-0.05%); split: -0.08%, +0.04%
InvThroughput: 2136142 -> 2135516 (-0.03%); split: -0.06%, +0.03%
VClause: 47419 -> 47420 (+0.00%); split: -0.01%, +0.01%
SClause: 58423 -> 58290 (-0.23%); split: -0.36%, +0.14%
Copies: 160626 -> 161321 (+0.43%); split: -0.25%, +0.68%
Branches: 69693 -> 69710 (+0.02%); split: -0.04%, +0.06%
PreSGPRs: 34824 -> 34945 (+0.35%); split: -0.24%, +0.58%
PreVGPRs: 28682 -> 28649 (-0.12%); split: -0.36%, +0.24%
VALU: 1080800 -> 1081171 (+0.03%); split: -0.04%, +0.08%
SALU: 353112 -> 353770 (+0.19%); split: -0.15%, +0.34%
SMEM: 81587 -> 81364 (-0.27%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
18a53230eb
aco: don't check dst_bitsize in apply_load_extract
...
I don't think this is necessary.
fossil-db (gfx1201):
Totals from 12 (0.02% of 79377) affected shaders:
Instrs: 73041 -> 72669 (-0.51%); split: -0.51%, +0.00%
CodeSize: 417376 -> 413852 (-0.84%); split: -0.85%, +0.00%
Latency: 1301862 -> 1301533 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 599874 -> 599723 (-0.03%)
VClause: 1344 -> 1346 (+0.15%)
Copies: 15855 -> 15832 (-0.15%); split: -0.37%, +0.23%
VALU: 42138 -> 41883 (-0.61%); split: -0.61%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
eb95f7cc0e
aco: support sign extension in apply_load_extract
...
fossil-db (gfx1201):
Totals from 10 (0.01% of 79377) affected shaders:
Instrs: 28954 -> 28938 (-0.06%)
CodeSize: 164552 -> 164472 (-0.05%)
Latency: 1249341 -> 1247037 (-0.18%)
InvThroughput: 297077 -> 296618 (-0.15%)
VALU: 15951 -> 15941 (-0.06%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
0de0fd38b4
aco: support more opcodes in apply_ds_extract
...
fossil-db (gfx1201):
Totals from 320 (0.40% of 79377) affected shaders:
Instrs: 3439754 -> 3432384 (-0.21%)
CodeSize: 18008696 -> 17973180 (-0.20%); split: -0.20%, +0.00%
VGPRs: 16016 -> 15404 (-3.82%)
Latency: 20246168 -> 20295740 (+0.24%); split: -0.08%, +0.33%
InvThroughput: 4462916 -> 4478546 (+0.35%); split: -0.08%, +0.43%
VClause: 87123 -> 87099 (-0.03%)
Copies: 261779 -> 261948 (+0.06%); split: -0.05%, +0.12%
Branches: 94611 -> 94601 (-0.01%); split: -0.01%, +0.00%
VALU: 1870695 -> 1865738 (-0.26%)
SALU: 488351 -> 487557 (-0.16%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
3b42626973
ac/nir: allow 8/16-bit smem loads
...
fossil-db (gfx1201):
Totals from 295 (0.37% of 79377) affected shaders:
Instrs: 314018 -> 313355 (-0.21%); split: -0.22%, +0.00%
CodeSize: 1697996 -> 1696528 (-0.09%); split: -0.11%, +0.02%
Latency: 4197719 -> 4197106 (-0.01%)
InvThroughput: 1258891 -> 1258744 (-0.01%)
PreSGPRs: 12232 -> 12230 (-0.02%)
SALU: 66762 -> 66269 (-0.74%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
5b116c4de9
ac/nir: allow vectorization of unsupported 8/16-bit loads
...
These can later be lowered to a vectorized 32-bit load.
fossil-db (gfx1201):
Totals from 1259 (1.59% of 79377) affected shaders:
MaxWaves: 36821 -> 36817 (-0.01%)
Instrs: 4363702 -> 4355749 (-0.18%); split: -0.23%, +0.05%
CodeSize: 22779980 -> 22706504 (-0.32%); split: -0.37%, +0.05%
VGPRs: 69672 -> 69792 (+0.17%); split: -0.02%, +0.19%
SpillSGPRs: 675 -> 673 (-0.30%)
Latency: 26684053 -> 26663819 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 5617687 -> 5614798 (-0.05%); split: -0.10%, +0.04%
VClause: 106830 -> 106654 (-0.16%); split: -0.17%, +0.00%
SClause: 75523 -> 75495 (-0.04%); split: -0.04%, +0.01%
Copies: 323199 -> 323525 (+0.10%); split: -0.10%, +0.20%
Branches: 109475 -> 109480 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 55036 -> 55040 (+0.01%)
PreVGPRs: 47538 -> 47582 (+0.09%); split: -0.12%, +0.21%
VALU: 2377777 -> 2389977 (+0.51%); split: -0.02%, +0.53%
SALU: 578272 -> 578385 (+0.02%); split: -0.02%, +0.04%
VMEM: 190065 -> 181204 (-4.66%)
SMEM: 99709 -> 99565 (-0.14%)
VOPD: 244 -> 243 (-0.41%); split: +0.41%, -0.82%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
6dbf44ad9c
ac/nir: allow less than one register of overfetch
...
This is to allow vectorization of 8/16-bit loads, which can later be
cheaply lowered to a 32-bit load.
fossil-db (gfx1201):
Totals from 178 (0.22% of 79377) affected shaders:
MaxWaves: 4138 -> 4102 (-0.87%)
Instrs: 619714 -> 617917 (-0.29%); split: -0.32%, +0.03%
CodeSize: 3364396 -> 3352724 (-0.35%); split: -0.38%, +0.03%
VGPRs: 12896 -> 12980 (+0.65%); split: -0.19%, +0.84%
SpillSGPRs: 546 -> 545 (-0.18%)
Latency: 7589585 -> 7406076 (-2.42%); split: -2.45%, +0.04%
InvThroughput: 1926356 -> 1879866 (-2.41%); split: -2.42%, +0.00%
VClause: 12301 -> 11750 (-4.48%)
SClause: 13614 -> 13583 (-0.23%); split: -0.45%, +0.22%
Copies: 82207 -> 82265 (+0.07%); split: -0.10%, +0.17%
Branches: 19284 -> 19266 (-0.09%)
PreSGPRs: 9525 -> 9457 (-0.71%)
PreVGPRs: 12366 -> 12421 (+0.44%)
VALU: 347928 -> 348020 (+0.03%); split: -0.01%, +0.04%
SALU: 82620 -> 82519 (-0.12%); split: -0.19%, +0.07%
VMEM: 22248 -> 21430 (-3.68%)
SMEM: 17951 -> 17843 (-0.60%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
ddef4bddf8
ac/nir: round components when lowering 8/16-bit loads to 32-bit
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
f538cae743
nir/algebraic: optimize ior(unpack_4x8, unpack_4x8<<8) to unpack_32_2x16
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
10f4264936
nir/search: extend swizzle_y
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
4fa1c92862
aco/gfx12: allow 8/16-bit smem loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
75efc218f5
aco: support 8/16-bit loads in smem_combine()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
8abb787c6b
radv/gfx12: use dword3 smem loads for push constants
...
fossil-db (gfx1201):
Totals from 5 (0.01% of 79377) affected shaders:
(no affected stats)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
13b0131edc
aco/gfx12: select dwordx3 smem loads
...
fossil-db (gfx1201):
Totals from 47814 (60.24% of 79377) affected shaders:
MaxWaves: 1436871 -> 1436855 (-0.00%); split: +0.00%, -0.00%
Instrs: 36653621 -> 36649461 (-0.01%); split: -0.07%, +0.06%
CodeSize: 194102884 -> 194076060 (-0.01%); split: -0.06%, +0.04%
VGPRs: 2267944 -> 2269648 (+0.08%); split: -0.01%, +0.08%
SpillSGPRs: 8301 -> 8295 (-0.07%); split: -0.08%, +0.01%
Latency: 249627561 -> 249631829 (+0.00%); split: -0.04%, +0.04%
InvThroughput: 40004042 -> 40003575 (-0.00%); split: -0.02%, +0.02%
VClause: 680488 -> 680429 (-0.01%); split: -0.08%, +0.07%
SClause: 1062835 -> 1066206 (+0.32%); split: -0.20%, +0.52%
Copies: 2393981 -> 2393607 (-0.02%); split: -0.23%, +0.22%
Branches: 728117 -> 728113 (-0.00%); split: -0.00%, +0.00%
VALU: 20358269 -> 20358585 (+0.00%); split: -0.01%, +0.01%
SALU: 4737317 -> 4736411 (-0.02%); split: -0.07%, +0.06%
SMEM: 1712349 -> 1710075 (-0.13%); split: -0.13%, +0.00%
VOPD: 5808 -> 5813 (+0.09%); split: +0.12%, -0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
90a5c93ea5
aco: prepare for dwordx3 smem loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
208d62430f
aco/gfx12: use s_load_dwordx3 to load ray launch sizes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:49 +00:00
Rhys Perry
cbd718506b
aco: add smem opcode helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:49 +00:00
Daniel Stone
fa27cacdd7
ci/panfrost: Really document T860 array flakes
...
It looks like this entire class of test is going to sometimes fail
causing GPU timeouts. As the random class presumably includes array
tests, flake the lot of those as well, rather than trying to figure out
which seed includes which subtests.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34878 >
2025-05-08 15:51:57 +03:00
Lionel Landwerlin
fa2627aefb
vulkan/runtime: add a multialloc variant for pipeline create
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34874 >
2025-05-08 11:22:55 +00:00
Lionel Landwerlin
565ac1ee6a
vulkan/runtime: fixup assert with link_geom_stages
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34874 >
2025-05-08 11:22:55 +00:00
Lionel Landwerlin
a29d0cfaf0
vulkan/runtime: track dynamics descriptor in a set layout
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34874 >
2025-05-08 11:22:55 +00:00
Lionel Landwerlin
fead813644
vulkan/runtime: store index of the push descriptor in pipeline layout
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34874 >
2025-05-08 11:22:55 +00:00
Zan Dobersek
b8cc891e6e
ir3: allow asm roundtrip testing of compiled shader variants
...
The `asmroundtrip` IR3_SHADER_DEBUG option enables roundtrip testing of
ir3 asm facilities by generating disassembly for each compiled shader
variant, parsing that disassembly back into ir3 and assembling back into
binary, with the expectation that the initial binary and the post-roundtrip
binary are identical.
This should give some guarantee that any shader that ir3 can produce can
also be constructed through assembly and fed back into ir3.
When enabled, each shader variant has a parallel roundtrip variant created.
At the moment this variant is discarded after validation, but it could
replace the initial variant in the future to also test behavior of such
roundtrip-generated binary and accompanying metadata.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34076 >
2025-05-08 09:44:31 +00:00
Zan Dobersek
0acf46b973
ir3: fix parsing of texture prefetch headers
...
Adjust ir3 parsing rules for texture prefetches to the current state. Those
rules expect the write mask to always be present, so the disassembly
production code is adjusted accordingly.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34076 >
2025-05-08 09:44:31 +00:00
Zan Dobersek
c2f4d3d139
ir3: fix display of dot-product instructions
...
For dp2acc and dp4acc, don't display the derived NOP value by default, but
do display repeat flags for source registers. When the nop encoding
condition is met, the derived NOP value should be shown, mirroring what the
base cat3 instruction specification does.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34076 >
2025-05-08 09:44:31 +00:00
Juan A. Suarez Romero
19fe1e5b5b
v3d/v3dv/ci: update expected results
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add new timeouts and flakes.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34869 >
2025-05-08 08:47:40 +00:00
Job Noorman
b038cb3df1
tu: scalarize IO before linking
...
This allows nir_link_opt_varyings, nir_remove_unused_varyings and
nir_compact_varyings to find a lot more optimization opportunities.
The implementation has been shamelessly copied, with some minor tweaks,
from radv_link_shaders.
Note that the regression in "Early Preamble" is caused by more texture
operations becoming uniform and being hoisted to the preamble (where
they need GPRs).
Totals from 72221 (43.88% of 164575) affected shaders:
MaxWaves: 924390 -> 929534 (+0.56%); split: +0.62%, -0.06%
Instrs: 29657203 -> 29265425 (-1.32%); split: -1.63%, +0.31%
CodeSize: 61509010 -> 61032290 (-0.78%); split: -1.46%, +0.68%
NOPs: 4810811 -> 4799957 (-0.23%); split: -2.49%, +2.27%
MOVs: 923221 -> 830062 (-10.09%); split: -14.80%, +4.71%
Full: 949533 -> 933312 (-1.71%); split: -1.82%, +0.11%
(ss): 685957 -> 678810 (-1.04%); split: -3.68%, +2.63%
(sy): 326800 -> 324295 (-0.77%); split: -2.56%, +1.79%
(ss)-stall: 2710956 -> 2682550 (-1.05%); split: -4.19%, +3.15%
(sy)-stall: 9480654 -> 9332777 (-1.56%); split: -4.39%, +2.83%
STPs: 5907 -> 5885 (-0.37%)
LDPs: 2622 -> 2596 (-0.99%)
Preamble Instrs: 6728019 -> 6671785 (-0.84%); split: -1.75%, +0.92%
Early Preamble: 52865 -> 52319 (-1.03%); split: +0.26%, -1.30%
Cat0: 5280863 -> 5268118 (-0.24%); split: -2.33%, +2.08%
Cat1: 1385055 -> 1271076 (-8.23%); split: -11.33%, +3.10%
Cat2: 11333273 -> 11194153 (-1.23%); split: -1.25%, +0.02%
Cat3: 8735603 -> 8618710 (-1.34%); split: -1.34%, +0.00%
Cat4: 958143 -> 952511 (-0.59%)
Cat5: 840520 -> 836190 (-0.52%); split: -0.53%, +0.02%
Cat6: 242192 -> 232244 (-4.11%)
Cat7: 881554 -> 892423 (+1.23%); split: -1.25%, +2.48%
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34784 >
2025-05-08 08:18:24 +00:00
Job Noorman
6a57bfb004
nir/lower_io_to_vector: remove can_read_output assert
...
Since we're not creating new output reads, just vectorizing existing
ones, this isn't the place to assert whether we can actually read
outputs.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <anholt@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34784 >
2025-05-08 08:18:24 +00:00
Lionel Landwerlin
386decce41
panvk/ci: add more flaky tests
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <llandwerlin@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109 >
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
5c7c1eceb5
anv/brw: handle pipeline libraries with mesh
...
I always thought there was a massive issue with pipeline libraries &
mesh shaders. Indeed recent CTS tests have exposed a number of issues.
Some values delivered to the fragment shader are coming from different
places depending on whether the preceding shader is Mesh or not. For
example PrimitiveID is delivered in the per-primitive block in Mesh
pipelines whereas for other pipelines it's coming as a VUE slot (which
is per-vertex). Those are 2 different locations in the payload.
We have to find a layout for fragment shaders that is compatible with
everything. Leaving gaps here and there in the thread payload.
Fixes the following test pattern :
dEQP-VK.mesh_shader.ext.smoke.fast_lib.shared_*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109 >
2025-05-08 06:48:35 +00:00