e.g. load X; load W; ==> load XYZW. Verified with a shader test.
This will be used by AMD drivers. See the code comments.
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36098>
When we detect that the source is a conversion generated by the pass,
try to get the real source instead of doing a round-trip conversion.
Make sure that the nir_alu_type and the bit_size is the same between what
we need and what's before the detected conversion.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35744>
The only last users of nir_link_opt_varyings are Vulkan drivers.
One linker error thrown by the optimizations is reimplemented
at the call site.
No interesting shader-db changes (other than random noise).
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
A TF2 shader propagates 0 to the consumer, which eliminates 1 input
if we run algebraic opts and DCE before compaction.
This is a prerequisite for removing all IO var optimizations from the GLSL
linker that are redundant with nir_opt_varyings.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
Add additional formats to allow allocation via gbm. Rather than define
new GBM_FORMAT_x, just use the drm_fourcc.h format (they are the same,
and the distiction will be going away in the future).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081>
Add additional 16 and 32b float formats, and the missing BGR161616.
For the dri2_format_table, just use the pipe formats twice, rather than
introducing new __DRI_IMAGE_FORMAT_x in this day and age (they are the
same thing).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081>
This pulls in every known mapping from the core DRI frontend, as well as
from GBM and Wayland. We can then start using it to de-duplicate the
lookups all throughout the tree.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081>
Bring in new f16/f32 formats, etc.
Taken from the following commit in the drm tree, from the drm-next
branch:
commit 203dcde881561f1a4ee1084e2ee438fb4522c94a
Merge: 69d09a26096c 8290d37ad2b0
Author: Simona Vetter <simona.vetter@ffwll.ch>
Merge tag 'drm-msm-next-2025-07-05' of https://gitlab.freedesktop.org/drm/msm into drm-next
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081>
If there is a separate stencil in use, the resource invalidation
flag was not being removed for the depth buffer as rsc was assigned
to the separate stencil.
Fixes: 6ff509593c ("v3d: Only apply TLB load invalidation on first job after FB state update")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36030>
This workaround doesn't mitigate the issue reliably/completely. An
alternative (but complex) solution also exists.
This introduces a small option that allows to disable the current
workaround as preliminary work.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36110>
This exposes an experimental implementation of HIC with
RADV_PERFTEST=hic. It's passing 100% of VKCTS but it requires some
benchmarks first to verify if performance is acceptable or not.
No addrlib support for GFX6-9.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
Because there is no surface<->surface helper in addrlib, this allocates
a temporary buffer on the CPU to do image->buffer->image. It's a naive
implementation which is probably not the best for performance, but it
works at least.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
On GFX12, everything is compressed with DCC and it's completely
transparent to the userspace driver, so that should be optimal.
On older gens, using HIC disables compression which isn't optimal
for device access.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
rustfmt would change not only lib.rs, which is output by this script,
but also all of the other files in the library. The problem is, those
other files are inputs to lib_rs_gen so the script was changing its own
inputs which meant ninja needed to be run twice before we reached a
fixed point. Remove rustfmt for now to prevent this issue.
Fixes: 591b5da4 ("nouveau/headers: Run rustfmt on generated files")
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36121>
This was missed when we documented vulkan 1.3 support in features.txt.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 1145cac490 ("docs/features: mark vk 1.3 as complete on panvk/v10+")
Reviewed-by: Chris Healy <cphealy@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36103>
The unordered mode stored in dependencies might be a bitmask and not
only a single mode. In practice, only the "stronger" mode will stick.
Make sure that the code testing for the mode uses "&" instead of "==",
to avoid prevent some valid combinations to happen, e.g.
```
// ...
add(16) g104<1>F g94<1,1,0>F g34<1,1,0>F { align1 1H @7 $7.dst compacted };
```
which without the fix ends up as
```
// ...
sync nop(1) null<0,1,0>UB { align1 WE_all 1N F@7 };
add(16) g104<1>F g94<1,1,0>F g34<1,1,0>F { align1 1H $7.dst compacted };
```
Enables two tests for the scoreboard pass that illustrate this case.
For measuring the effect, re-enabled the sync.nop accounting on total of
instructions and got the following results.
```
Totals:
Instrs: 322041261 -> 321748285 (-0.09%)
Cycle count: 22864587567 -> 22863073741 (-0.01%)
Max dispatch width: 7989040 -> 7989024 (-0.00%); split: +0.00%, -0.00%
Totals from 88212 (9.78% of 902056) affected shaders:
Instrs: 102282050 -> 101989074 (-0.29%)
Cycle count: 12787629859 -> 12786116033 (-0.01%)
Max dispatch width: 525336 -> 525320 (-0.00%); split: +0.01%, -0.01%
```
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36096>