Ian Romanick
e9ae997ffc
brw: Only apply GRF 127 send workaround to Gfx9
...
The portion of the Bspec dedicated to Gfx6-Gfx11 says that this
workaround applies to "Pre-CNL" (with CNL being Gfx10). There is no
mention of this workaround in the sections for Xe or Xe2.
No shader-db or fossil-db changes on Skylake or older Intel platforms.
shader-db:
Lunar Lake, Meteor Lake, DG2, Tiger Lake, and Ice Lake (Lunar Lake shown)
total instructions in shared programs: 17107031 -> 17107027 (<.01%)
instructions in affected programs: 32182 -> 32178 (-0.01%)
helped: 16 / HURT: 14
total cycles in shared programs: 895016760 -> 894975410 (<.01%)
cycles in affected programs: 312774834 -> 312733484 (-0.01%)
helped: 9279 / HURT: 8091
LOST: 40
GAINED: 33
The pre-Xe2 platforms had a lot more lost / gained shaders. This appears
to be due to churn in the cycle counts and the SIMD32 heuristic.
fossil-db:
Lunar Lake
Totals:
Instrs: 208667436 -> 208671853 (+0.00%); split: -0.00%, +0.01%
Subgroup size: 14241168 -> 14241200 (+0.00%)
Cycle count: 31495149690 -> 31481397970 (-0.04%); split: -0.17%, +0.13%
Spill count: 508467 -> 508701 (+0.05%); split: -0.10%, +0.14%
Fill count: 611979 -> 612583 (+0.10%); split: -0.07%, +0.17%
Scratch Memory Size: 35288064 -> 35311616 (+0.07%); split: -0.07%, +0.14%
Totals from 205773 (29.10% of 707019) affected shaders:
Instrs: 103153541 -> 103157958 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 4563584 -> 4563616 (+0.00%)
Cycle count: 12979963010 -> 12966211290 (-0.11%); split: -0.42%, +0.32%
Spill count: 494741 -> 494975 (+0.05%); split: -0.10%, +0.15%
Fill count: 597988 -> 598592 (+0.10%); split: -0.07%, +0.17%
Scratch Memory Size: 33351680 -> 33375232 (+0.07%); split: -0.08%, +0.15%
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 233063764 -> 233057897 (-0.00%); split: -0.01%, +0.00%
Subgroup size: 9892840 -> 9892856 (+0.00%)
Cycle count: 25387597341 -> 25373885583 (-0.05%); split: -0.36%, +0.31%
Spill count: 518469 -> 517940 (-0.10%); split: -0.19%, +0.09%
Fill count: 559444 -> 558537 (-0.16%); split: -0.29%, +0.13%
Scratch Memory Size: 19694592 -> 19658752 (-0.18%); split: -0.21%, +0.03%
Max dispatch width: 7135248 -> 7131672 (-0.05%); split: +0.13%, -0.18%
Totals from 301996 (37.49% of 805603) affected shaders:
Instrs: 144535999 -> 144530132 (-0.00%); split: -0.01%, +0.01%
Subgroup size: 3768528 -> 3768544 (+0.00%)
Cycle count: 18687102311 -> 18673390553 (-0.07%); split: -0.50%, +0.42%
Spill count: 515687 -> 515158 (-0.10%); split: -0.20%, +0.09%
Fill count: 557638 -> 556731 (-0.16%); split: -0.29%, +0.13%
Scratch Memory Size: 18662400 -> 18626560 (-0.19%); split: -0.22%, +0.03%
Max dispatch width: 2029872 -> 2026296 (-0.18%); split: +0.44%, -0.62%
Tiger Lake
Totals:
Instrs: 238813279 -> 238766482 (-0.02%); split: -0.04%, +0.02%
Subgroup size: 9851320 -> 9851328 (+0.00%)
Cycle count: 23668877036 -> 23646286421 (-0.10%); split: -0.51%, +0.42%
Spill count: 559060 -> 554241 (-0.86%); split: -1.12%, +0.26%
Fill count: 595926 -> 591843 (-0.69%); split: -1.46%, +0.78%
Scratch Memory Size: 19929088 -> 19764224 (-0.83%); split: -1.19%, +0.36%
Max dispatch width: 7102184 -> 7101840 (-0.00%); split: +0.13%, -0.13%
Totals from 284125 (35.42% of 802235) affected shaders:
Instrs: 144695094 -> 144648297 (-0.03%); split: -0.06%, +0.03%
Subgroup size: 3567312 -> 3567320 (+0.00%)
Cycle count: 11303753658 -> 11281163043 (-0.20%); split: -1.07%, +0.87%
Spill count: 554624 -> 549805 (-0.87%); split: -1.13%, +0.26%
Fill count: 592252 -> 588169 (-0.69%); split: -1.47%, +0.78%
Scratch Memory Size: 19553280 -> 19388416 (-0.84%); split: -1.21%, +0.37%
Max dispatch width: 1895488 -> 1895144 (-0.02%); split: +0.48%, -0.50%
Ice Lake
Totals:
Instrs: 239034316 -> 239049108 (+0.01%); split: -0.03%, +0.04%
Subgroup size: 9926440 -> 9926448 (+0.00%)
Cycle count: 24944253156 -> 24919967386 (-0.10%); split: -0.25%, +0.15%
Spill count: 575498 -> 571612 (-0.68%); split: -1.18%, +0.51%
Fill count: 709760 -> 716665 (+0.97%); split: -1.31%, +2.28%
Scratch Memory Size: 20699136 -> 20599808 (-0.48%); split: -1.45%, +0.97%
Max dispatch width: 7140856 -> 7143568 (+0.04%); split: +0.15%, -0.12%
Totals from 233451 (29.01% of 804669) affected shaders:
Instrs: 127440610 -> 127455402 (+0.01%); split: -0.07%, +0.08%
Subgroup size: 2835784 -> 2835792 (+0.00%)
Cycle count: 11818511030 -> 11794225260 (-0.21%); split: -0.53%, +0.32%
Spill count: 559557 -> 555671 (-0.69%); split: -1.22%, +0.52%
Fill count: 694460 -> 701365 (+0.99%); split: -1.34%, +2.33%
Scratch Memory Size: 19774464 -> 19675136 (-0.50%); split: -1.52%, +1.02%
Max dispatch width: 1602736 -> 1605448 (+0.17%); split: +0.69%, -0.52%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903 >
2025-07-15 19:35:42 +00:00
Dave Airlie
d5037a34bb
nak: don't set the divergent flag on uniform sysvals
...
The list of sysvals allowed with S2UR comes from the NVIDIA NDA docs.
With the open shader-db:
Totals:
CodeSize: 14022144 -> 12291088 (-12.35%); split: -12.94%, +0.59%
Number of GPRs: 41467 -> 40560 (-2.19%)
SLM Size: 92344 -> 68824 (-25.47%)
Static cycle count: 5312651 -> 3856674 (-27.41%); split: -27.95%, +0.54%
Spills to memory: 54216 -> 51018 (-5.90%)
Fills from memory: 54216 -> 51018 (-5.90%)
Spills to reg: 7533 -> 7204 (-4.37%); split: -4.42%, +0.05%
Fills from reg: 8406 -> 7987 (-4.98%)
Max warps/SM: 61508 -> 61780 (+0.44%)
Totals from 689 (51.73% of 1332) affected shaders:
CodeSize: 12873552 -> 11142496 (-13.45%); split: -14.09%, +0.64%
Number of GPRs: 26789 -> 25882 (-3.39%)
SLM Size: 89176 -> 65656 (-26.37%)
Static cycle count: 5058667 -> 3602690 (-28.78%); split: -29.35%, +0.57%
Spills to memory: 54216 -> 51018 (-5.90%)
Fills from memory: 54216 -> 51018 (-5.90%)
Spills to reg: 7533 -> 7204 (-4.37%); split: -4.42%, +0.05%
Fills from reg: 8406 -> 7987 (-4.98%)
Max warps/SM: 30908 -> 31180 (+0.88%)
PERCENTAGE DELTAS                 Shaders  CodeSize  Number of GPRs  SLM Size  Static cycle count  Spills to memory  Fills from memory  Spills to reg  Fills from reg  Max warps/SM
google-meet-clvk/BgBlur                49    +6.46%          -5.10%         .              +6.81%                 .                  .              .               .        +1.48%
google-meet-clvk/Relight               81    +5.47%          -4.74%         .              +6.29%                 .                  .              .               .        +1.23%
parallel-rdp/small_subgroup           246    -2.88%          -4.10%         .              +0.41%                 .                  .         -3.65%          -2.39%        +0.73%
parallel-rdp/small_uber_subgroup       55   -23.04%          -1.32%   -36.28%             -42.86%            -1.61%             -1.61%         -6.88%          -9.55%        +0.68%
parallel-rdp/subgroup                 327    -2.78%          -2.64%         .              -0.26%                 .                  .         -3.17%          -2.07%        +0.53%
parallel-rdp/uber_subgroup             55   -25.59%          -1.32%   -29.98%             -41.29%            -9.04%             -9.04%         -7.06%         -10.08%        +0.68%
q2rtx/q2rtx-rt-pipeline                42    -0.38%          -0.25%   -49.40%              +0.84%           -97.48%            -97.48%              .               .             .
sascha-willems/bloom                   12         .               .         .                   .                 .                  .              .               .             .
sascha-willems/computecloth             7    +0.20%               .         .              +0.51%                 .                  .              .               .             .
sascha-willems/computecullandlod        5    +0.21%               .         .              +0.84%                 .                  .              .               .             .
sascha-willems/computeheadless          1   -28.85%               .         .             +27.71%                 .                  .              .               .             .
sascha-willems/computenbody             6    +0.73%               .         .              +1.78%                 .                  .              .               .             .
sascha-willems/computeparticles         5    +0.53%               .         .              +1.24%                 .                  .              .               .             .
sascha-willems/computeraytracing        5    +0.14%               .         .              +0.26%                 .                  .              .               .             .
sascha-willems/computeshader            7    +1.29%               .         .              +2.97%                 .                  .              .               .             .
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36105 >
2025-07-15 19:07:11 +00:00
Dave Airlie
4726c08f53
nak: add uniform support for s2r
...
This adds s2ur support to the backend compiler.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36105 >
2025-07-15 19:07:11 +00:00
Dave Airlie
2273b6c46a
nak: add divergent attribute and wrapper for nir_load_sysval_nv
...
This wraps the sysval load in a builder where we can add
proper divergence for ctaid later.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36105 >
2025-07-15 19:07:11 +00:00
Marek Olšák
d12bc87dda
aco: implement upcasting 16-bit types for 32-bit color buffers in PS epilog
...
This was missed when implementing the change for LLVM.
Fixes: fbbf029529 - radeonsi: enable 16-bit mediump IO for PS outputs only, and VS->PS with env var
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36112 >
2025-07-15 18:28:30 +00:00
Christian Gmeiner
004abdc767
etnaviv: nir: Preserve dot product instructions
...
Modify the ALU width callback to return 0 for dot product operations,
preventing nir_lower_alu_width from decomposing them into multiply-add
chains. This preserves fdot2, fdot3, and fdot4 as single instructions
rather than expanding them into multiple fmul+fadd operations.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13531
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36130 >
2025-07-15 18:13:06 +00:00
Christian Gmeiner
76f2735fe2
etnaviv: nir: Enable vectorization with 4-component width limit
...
Add a custom ALU width callback that returns 4 to enable vectorization
of scalar operations up to 4 components. Without this callback, the
nir_lower_alu_width pass generates excessive scalar code instead of
utilizing etnaviv's vector capabilities.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13531
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36130 >
2025-07-15 18:13:06 +00:00
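The two etnaviv commits above describe one ALU width callback with two behaviours: return 4 to cap vector width, and return 0 to leave dot products untouched. The following is a toy Python model of that decision, not the driver code. The opcode names come from the commit messages; `alu_width_cb`, `split_components`, and the chunking rule are hypothetical names and assumptions made for illustration.

```python
# Toy model of an ALU width callback. Only the opcode names and the
# return values 0 and 4 come from the commit messages; the rest is a
# hypothetical sketch and not the etnaviv or NIR implementation.
DOT_PRODUCT_OPS = {"fdot2", "fdot3", "fdot4"}
MAX_VECTOR_WIDTH = 4


def alu_width_cb(op: str) -> int:
    """Return the widest vector the op may keep; 0 means leave the op alone."""
    if op in DOT_PRODUCT_OPS:
        return 0
    return MAX_VECTOR_WIDTH


def split_components(op: str, num_components: int) -> list[int]:
    """Split an op into chunks no wider than the callback allows (assumed rule)."""
    width = alu_width_cb(op)
    if width == 0 or num_components <= width:
        # Kept as a single instruction.
        return [num_components]
    full, rest = divmod(num_components, width)
    return [width] * full + ([rest] if rest else [])
```

In this model a 4-component `fadd` stays one instruction, an 8-component `fadd` becomes two 4-wide chunks, and `fdot4` is never split because the callback returns 0 for it.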
David Rosca
850a3b0cae
radv/video: Set correct VP9 decode minCodedExtent
...
Fixes: b8ac2d47e7 ("radv/video: add KHR_video_decode_vp9 support.")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35997 >
2025-07-15 17:44:15 +00:00
David Rosca
50eaa0c19f
radv/video: Set correct H264/5 decode minCodedExtent
...
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35997 >
2025-07-15 17:44:15 +00:00
David Rosca
42a25b2493
radeonsi/video: Set correct minimum size for VP9 decode
...
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35997 >
2025-07-15 17:44:14 +00:00
Faith Ekstrand
cf654be16b
vulkan/wsi/x11: Refuse to connect to thread-unsafe Displays
...
We heavily rely on threading in Vulkan WSI. There's no way this is safe
if XInitThreads() hasn't been called. Fortunately, this should never be
the case since 2022 or so.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13530
Cc: mesa-stable
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36120 >
2025-07-15 17:13:46 +00:00
Marek Olšák
6286c1c66f
nir/opt_vectorize_io: optionally vectorize loads with holes
...
e.g. load X; load W; ==> load XYZW. Verified with a shader test.
This will be used by AMD drivers. See the code comments.
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36098 >
2025-07-15 16:29:30 +00:00
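The "load X; load W ==> load XYZW" example in the commit above can be modelled in a few lines of Python. This is only a sketch of the idea of filling holes between loaded components. It assumes the merged load spans from the first to the last requested component, which matches the single example given but may differ from what the pass does in other cases. `vectorize_with_holes` is a hypothetical name.

```python
# Toy model of merging scalar loads that have gaps between them.
# Assumption: the merged load covers first..last requested component.
COMPONENTS = "XYZW"


def vectorize_with_holes(loads: list[str]) -> str:
    """Merge single-component loads into one load that also covers the holes."""
    if not loads:
        raise ValueError("at least one load is required")
    indices = sorted(COMPONENTS.index(c) for c in loads)
    return COMPONENTS[indices[0]:indices[-1] + 1]
```

Here `vectorize_with_holes(["X", "W"])` yields `"XYZW"`: the unused Y and Z components are loaded as well, trading a wider load for fewer load instructions.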
Romaric Jodin
b4977a1605
nir/lower_bit_size: Avoid round-trip conversion when possible
...
When we detect that the source is a conversion generated by the pass,
try to get the real source instead of doing a round-trip conversion.
Make sure that the nir_alu_type and the bit_size are the same between what
we need and what precedes the detected conversion.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35744 >
2025-07-15 15:32:58 +00:00
Valentine Burley
f34ddff0bd
freedreno/ci: Update expectations
...
Update the expectations from the latest nightly.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36132 >
2025-07-15 14:11:18 +00:00
Valentine Burley
13d9570ec9
panfrost/ci: Update expectations
...
Update the expectations from the latest nightly.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36132 >
2025-07-15 14:11:18 +00:00
Marek Olšák
37ae4df3e4
glsl: remove most IO optimizations that are replaced by nir_opt_varyings
...
The only remaining users of nir_link_opt_varyings are Vulkan drivers.
One linker error thrown by the optimizations is reimplemented
at the call site.
No interesting shader-db changes (other than random noise).
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091 >
2025-07-15 13:38:30 +00:00
Marek Olšák
0fdd6de65f
nir/lower_io: validate locations more accurately
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091 >
2025-07-15 13:38:29 +00:00
Marek Olšák
b0494f9485
nir/opt_varyings: optimize the consumer after constant propagation and dedupli.
...
A TF2 shader propagates 0 to the consumer, which eliminates 1 input
if we run algebraic opts and DCE before compaction.
This is a prerequisite for removing all IO var optimizations from the GLSL
linker that are redundant with nir_opt_varyings.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091 >
2025-07-15 13:38:29 +00:00
Marek Olšák
9607852c30
nir/opt_varyings: use nir_scalar
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091 >
2025-07-15 13:38:29 +00:00
Rob Clark
2e06da1597
gbm: Add more formats
...
Add additional formats to allow allocation via gbm. Rather than define
new GBM_FORMAT_x, just use the drm_fourcc.h format (they are the same,
and the distinction will be going away in the future).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:10 +00:00
Rob Clark
6d3f266406
dri: Add additional 16/32b float/int formats
...
Add additional 16 and 32b float formats, and the missing BGR161616.
For the dri2_format_table, just use the pipe formats twice, rather than
introducing new __DRI_IMAGE_FORMAT_x in this day and age (they are the
same thing).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:10 +00:00
Daniel Stone
0fcb0ac1c5
dri: Expand pipe_format <-> FourCC lookup table
...
This pulls in every known mapping from the core DRI frontend, as well as
from GBM and Wayland. We can then start using it to de-duplicate the
lookups all throughout the tree.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:10 +00:00
Daniel Stone
d6e639816f
dri: Convert pipe_format <-> FourCC lookup to a table
...
Saves typing it out twice.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:10 +00:00
Daniel Stone
b6332eb43b
dri: Convert DRI_IMAGE_FORMAT to pipe_format
...
No functional change.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:09 +00:00
Rob Clark
3fcf63f364
dri2: Allow allocating suboptimal for sampling
...
Fixes allocations via gbm, which doesn't necessarily need to be
renderable.
Fixes: ba7454a155 ("dri2+gallium: Support to import suboptimal formats")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:08 +00:00
Rob Clark
a1cd9f917f
mesa/main: Add MESA_FORMAT_RGB_UNORM16
...
Needed for importing DRM_FORMAT_BGR161616.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:08 +00:00
Rob Clark
3931d5ad18
drm-uapi: update drm_fourcc.h
...
Bring in new f16/f32 formats, etc.
Taken from the following commit in the drm tree, from the drm-next
branch:
commit 203dcde881561f1a4ee1084e2ee438fb4522c94a
Merge: 69d09a26096c 8290d37ad2b0
Author: Simona Vetter <simona.vetter@ffwll.ch>
Merge tag 'drm-msm-next-2025-07-05' of https://gitlab.freedesktop.org/drm/msm into drm-next
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081 >
2025-07-15 12:37:08 +00:00
Eric Engestrom
aade8a919d
ci: uprev apitrace
...
A new major version (13.0) was just released, a good excuse to uprev :)
New release notes:
https://github.com/apitrace/apitrace/releases/tag/13.0
Difference in this uprev:
b6102d1096...45a005875d
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35816 >
2025-07-15 11:56:39 +00:00
Jose Maria Casanova Crespo
7d51a10cda
v3d: Fix depth resource invalidation with separate_stencil
...
If there is a separate stencil in use, the resource invalidation
flag was not being removed for the depth buffer as rsc was assigned
to the separate stencil.
Fixes: 6ff509593c ("v3d: Only apply TLB load invalidation on first job after FB state update")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36030 >
2025-07-15 10:23:10 +00:00
Samuel Pitoiset
b59895140d
radv: add a way to disable the HIZ/HiS events based workaround on GFX12
...
This workaround doesn't mitigate the issue reliably/completely. An
alternative (but complex) solution also exists.
This introduces a small option that allows disabling the current
workaround as preliminary work.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36110 >
2025-07-15 10:01:54 +00:00
Pavel Gribov
24cb745460
radv: small fix for sam check
...
For the exact PCIe 3.0 x8 case, the check fails:
pcie_bandwidth_mbps >= bandwidth_mbps_threshold => (8069 >= 8069.12) == false
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36109 >
2025-07-15 09:37:32 +00:00
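The failing comparison in the radv commit above is easy to reproduce. The sketch below reads the commit's threshold as the decimal value 8069.12 and shows why an integer bandwidth of 8069 misses it. The function names are hypothetical, and `passes_truncated` is only one possible remedy shown for illustration. The commit message does not say how the driver fixes the check.

```python
import math

# Values taken from the commit message; the threshold is read as 8069.12.
PCIE_BANDWIDTH_MBPS = 8069
BANDWIDTH_THRESHOLD_MBPS = 8069.12


def passes_exact(bandwidth: int, threshold: float) -> bool:
    """The comparison as quoted in the commit message."""
    return bandwidth >= threshold


def passes_truncated(bandwidth: int, threshold: float) -> bool:
    """Hypothetical remedy: compare against the threshold without its fraction."""
    return bandwidth >= math.floor(threshold)
```

With these values `passes_exact` returns `False` and `passes_truncated` returns `True`, so a link reporting exactly PCIe 3.0 x8 bandwidth would only pass once the fractional part of the threshold no longer decides the result.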
Samuel Pitoiset
763ff92ad9
zink/ci: enable RADV_PERFTEST=hic for GFX10+ jobs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:17 +00:00
Samuel Pitoiset
d510f95f67
radv/ci: enable RADV_PERFTEST=hic for GFX10+ jobs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
fbea486854
radv: advertise VK_EXT_host_image_copy on GFX10+ behind RADV_PERFTEST=hic
...
This exposes an experimental implementation of HIC with
RADV_PERFTEST=hic. It passes 100% of VKCTS, but it needs benchmarking
first to verify that performance is acceptable.
No addrlib support for GFX6-9.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
ea4ad51eb1
radv: implement vkTransitionImageLayout()
...
It's a no-op.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
b2e338a9c7
radv: implement vkCopyImageToImageEXT()
...
Because there is no surface<->surface helper in addrlib, this allocates
a temporary buffer on the CPU to do image->buffer->image. It's a naive
implementation which is probably not the best for performance, but it
works at least.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
c9ea920da0
radv: implement vkCopyMemoryToImageEXT()/vkCopyImageToMemoryEXT()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
4a5370819c
radv: do not use MRT counters for host-transfer images
...
Otherwise, the tile swizzle changes and addrlib is confused.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
8d38b25cb3
radv: add support for querying HIC memcpy size
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
031843ebb1
radv: add support for querying HIC performance info
...
On GFX12, everything is compressed with DCC and it's completely
transparent to the userspace driver, so that should be optimal.
On older gens, using HIC disables compression which isn't optimal
for device access.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
d89b11011f
radv: add support for formats with host-transfer
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
545d5a0675
radv: set RADEON_SURF_HOST_TRANSFER for host-transfer images
...
To forbid some swizzles on GFX11.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:14 +00:00
Samuel Pitoiset
37f3997edf
radv: disable compression for host-transfer images
...
HIC isn't supposed to have compression.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:14 +00:00
Samuel Pitoiset
afa7509207
radv: map images with host-transfer at bind time
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
d85b7b6c62
radv: only expose host visible memory types for images with host-transfer
...
Because the memory must be mapped on the CPU.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
a75bd251df
ac/surface: add a flag to forbid some swizzles for surface<->memory copies
...
256KiB (also block variables) aren't supported on GFX11.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
f5f2392cf7
ac/surface: add support for surface<->memory copy using addrlib
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
16be376cc5
ac/surface: constify bpe_to_format()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974 >
2025-07-15 09:12:12 +00:00
Calder Young
3c7a834ebc
anv: Add support for AV1 video decoding on Gfx125 and Xe2
...
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36015 >
2025-07-15 01:21:53 +00:00
Calder Young
3456a65619
intel/genxml: Update AVP instructions for Gfx125 and Xe2
...
Acked-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36015 >
2025-07-15 01:21:53 +00:00