Commit graph

224300 commits

Author SHA1 Message Date
Peyton Lee
96fc9cdfd5 amd/vpelib: refine coding style
refine coding style

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Mike Han <SzuChih.Han@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Han, Mike
c171124a34 amd/vpelib: add format support check
Add a format support check for the newly supported formats.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Mike Han <SzuChih.Han@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Rouf, Farhan
156c62fcf3 amd/vpelib: Chroma coefficient select for sampled formats corrected
[WHY]
Turning on Video Upscale exposed an issue with AYUV formats going
through SPL, resulting in color corruption. The coefficient select bit in
dscl_mode was set to 1 as if AYUV were handled like the 420 or 422
formats, which is not the case.

[HOW]
Changing the ycbcr format check to a check for subsampled formats like
420 and 422 allows the correct coefficient to be selected. The
dpp1_dscl_is_video_subsampled function was a static bool function. It is
now a regular bool function that can be called from vpe20_dpp_dscl.c.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Farhan Rouf <Farhan.Rouf@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
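The check described in 156c62fcf3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the vpelib code: the `sketch_*` enum, the format list, and the helper names are hypothetical. Only the idea of keying the coefficient select on chroma subsampling (420 or 422) instead of on "any YCbCr format" comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical format tags for illustration only; not vpelib's real enum. */
enum sketch_format {
    SKETCH_FMT_RGBA8888, /* RGB, no chroma planes */
    SKETCH_FMT_AYUV,     /* packed YCbCr 4:4:4 with alpha, not subsampled */
    SKETCH_FMT_NV12,     /* YCbCr 4:2:0 */
    SKETCH_FMT_YUY2      /* YCbCr 4:2:2 */
};

/* The broader test: true for every YCbCr format, including AYUV. */
bool sketch_is_ycbcr(enum sketch_format fmt)
{
    return fmt != SKETCH_FMT_RGBA8888;
}

/* The narrower test: true only when chroma is stored at reduced resolution. */
bool sketch_is_video_subsampled(enum sketch_format fmt)
{
    return fmt == SKETCH_FMT_NV12 || fmt == SKETCH_FMT_YUY2;
}

/* Behavior before the fix (as described): every YCbCr format selects 1. */
unsigned sketch_coef_select_old(enum sketch_format fmt)
{
    return sketch_is_ycbcr(fmt) ? 1u : 0u;
}

/* Behavior after the fix (as described): only subsampled formats select 1. */
unsigned sketch_coef_select_new(enum sketch_format fmt)
{
    return sketch_is_video_subsampled(fmt) ? 1u : 0u;
}
```

With these stand-in helpers, AYUV is the only format whose result changes between the old and new selection.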
Peyton Lee
79002d905b amd/vpelib: tighten external LUT compound color pipeline updates
[WHY]
Fix remaining mismatches in external LUT compound handling.

[HOW]

 - Move indirect shaper programming to the frontend flow with proper enable checks.
 - Bypass internal shaper/gamut rebuild in LUT compound mode.
 - Update 3DLUT enable state and blend gamma on relevant dirty/force-update paths.

Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Shih, Jude
1ea42fb3b8 amd/vpelib: Realign DPP callback initialization with the updated interface layout
Realign DPP callback initialization with the updated interface layout

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Ali, Nawwar
cc61f110b1 amd/vpelib: update shaper config size
[WHY]
Update size to match format

[HOW]
By updating corresponding value

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Nawwar Ali <nawwar.ali@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Han, Mike
791eaa2cb2 amd/vpelib: complete 16bpc RGBA format mapping for 10/12bpc msb/lsb support
RGBA 16bpc channel-order variants (BGRA/RGBA/ABGR/ARGB, UNORM/SNORM)
were not fully covered across VPE format programming and SPL translation.
This can lead to wrong format/crossbar selection for 10/12bpc MSB/LSB use cases.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Mike Han <SzuChih.Han@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Rouf, Farhan
84c893a000 amd/vpelib: Changed cmd_info input for background segment
[WHY]
When background segments are required, they create a 2x2 image that
needs to have a background generated onto it. The 2x2 image was being
stored in cmd_info[bg_index], where bg_index is 2. As a result, when the
plane descriptor was constructed, cmd_info[0] was empty, which produced
an out-of-range value and a VPE hang.

[HOW]
Changed the bg_index to be the first index in the cmd_info inputs.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Farhan Rouf <Farhan.Rouf@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Okenczyc, Andrzej
c7c74dc612 amd/vpelib: Report unsupported status if streams target rect equals 0
Add explicit error reporting for target_rect = 0, and use the zero-rect helper uniformly so that invalid input is not mistaken for an empty segmentation that can be processed normally.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Andrzej Okenczyc <Andrzej.Okenczyc@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
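A zero-rect check of the kind c7c74dc612 describes might look like the sketch below. Everything here is an assumption made for illustration: the struct layout, the status names, and the reading of "target_rect = 0" as "width or height is zero" are not taken from vpelib.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rectangle and status types; not vpelib's definitions. */
struct sketch_rect {
    int32_t  x;
    int32_t  y;
    uint32_t width;
    uint32_t height;
};

enum sketch_status {
    SKETCH_STATUS_OK,
    SKETCH_STATUS_NOT_SUPPORTED
};

/* Assumption: a rect counts as "zero" when it covers no pixels. */
bool sketch_is_zero_rect(const struct sketch_rect *rect)
{
    return rect->width == 0 || rect->height == 0;
}

/* Reject a zero target rect up front instead of treating it as an
 * empty but otherwise valid piece of work. */
enum sketch_status sketch_check_target_rect(const struct sketch_rect *target)
{
    if (sketch_is_zero_rect(target))
        return SKETCH_STATUS_NOT_SUPPORTED;
    return SKETCH_STATUS_OK;
}
```

Routing every caller through one helper keeps the definition of "zero" in a single place, which is the consistency the commit message asks for.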
Kovac, Krunoslav
c1b02ea29e amd/vpelib: Fixes for external lut compound
[WHY]
A collection of fixes:
1. Always set degamma to bypass.
2. Use 0.0/1.0 for bias+scale in 3dlut compound case.
3. SPL doesn't handle custom tf/cs. It doesn't actually use it, but we
need to pass in something or it will assert and bail out. Using PQ+2020
as default.
4. In lut compound case, we may not have CPU addressable 3DLUT. Don't
fail vpe_check_tonemap_support if only GPU mem address given.

Co-authored-by: Chan, Roy <Roy.Chan@amd.com>
Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Shih, Jude
663a7b03a3 amd/vpelib: Fix Compiler Warnings
Refine the code to avoid compiler warnings.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Shih, Jude
a491749717 amd/vpelib: Refactor DPP function table layout
Refactor DPP function table layout and align callback mapping with build flags

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Agate, Jesse
1b18bc9a31 amd/vpelib: Fix blending hang issue
[WHY]
When command 1 is a two-pipe blending command and command 2 is a
three-pipe blending command, the config caching mechanism cannot handle
the case correctly.
Because the second command is three-pipe blending, it requires the
mpc mux of the second pipe to be programmed with bot select coming from
the third pipe.
However, in the current flow both commands have the same op, so the mux
programming is reused, which caused the issue.

[HOW]
Move the mpc mux and blend config from stream op to per segment, as it is
not possible to cache those configs under the current system. The number
of register writes involved is small, so there is no tangible performance
penalty.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jesse Agate <Jesse.Agate@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Agate, Jesse
b5746a010e amd/vpelib: separating frontend programming
[WHY]
With reuse=0, per-frame frontend programming ran but frame-specific cache wasn’t updated. When the same pipe stream/op was reused, stream-op config saw cached entries and skipped programming; since frame cache wasn’t added, stream-op programming was missing, causing a hang on the last descriptor in pipe 1.

[HOW]
Split frontend programming into per-frame, per-op, and per-segment, and add a dedicated reuse bit for stream-op config to ensure programming occurs when needed.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jesse Agate <Jesse.Agate@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Shih, Jude
6b16d3d0ba amd/vpelib: Alpha blending enhancement
Enhance the capability of alpha blending in the latest version

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Rouf, Farhan
34404c2df2 amd/vpelib: Introduced reset to frontend
Add a reset mechanism in the frontend to fix artifacts caused by improper disablement affecting subsequent blending. Perform reset and configuration only when necessary, keeping behavior consistent with the frontend.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Farhan Rouf <Farhan.Rouf@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Kovac, Krunoslav
25d2f09e9e amd/vpelib: Enable VPE cap for 3DLUT
Set enabled according to HW support. Upper layer can override.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Zhao, Jiali
13a188a4ef amd/vpelib: revert predication fix
[WHY]
The previous predication fix assumed bufs->cmd_buf.cpu_va and bufs->cmd_buf.size referred to the same underlying allocation/layout. In practice they can differ, which leads to incorrect predication parameters and breaks command submission.

[HOW]
Revert that change and restore the previous predication handling. A follow-up will reintroduce a correct fix with proper address/size pairing.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Jiali Zhao <Jiali.Zhao@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Hsieh, Mike
12e5605e1d amd/vpelib: add indirect shaper config support
[WHY]
Users can provide prepared GPU-accessible data for the shaper config.

[HOW]
Add indirect shaper config setup mechanism.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Saeed, Ghamr
bfe39a8625 amd/vpelib: variable was accumulating size and not reset properly
[WHY]
This was a bug in the buffer size setting

[HOW]
A variable needed resetting after every loop

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Ghamr Saeed <Ghamr.Saeed@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
Hsieh, Mike
78891a8df7 amd/vpelib: add optional __stdcall calling convention via build option
[WHY]
Some platforms require the __stdcall calling convention for ABI compatibility.

[HOW]
Introduce the VPE_USE_STDCALL build option to apply __stdcall to public API function declarations when enabled.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
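An opt-in calling convention like the one in 78891a8df7 is commonly wired up with a macro. The sketch below assumes that VPE_USE_STDCALL is visible as a preprocessor define; `SKETCH_VPE_API` and `sketch_vpe_add()` are hypothetical names, and the commit does not say how vpelib itself spells the macro.

```c
/* VPE_USE_STDCALL is the build option named in the commit. The macro
 * SKETCH_VPE_API and the function below are hypothetical. */
#if defined(VPE_USE_STDCALL)
#define SKETCH_VPE_API __stdcall
#else
#define SKETCH_VPE_API
#endif

/* Public declaration, as it might appear in an API header. */
int SKETCH_VPE_API sketch_vpe_add(int a, int b);

/* The definition uses the same macro so that declaration and
 * definition always agree on the calling convention. */
int SKETCH_VPE_API sketch_vpe_add(int a, int b)
{
    return a + b;
}
```

When the option is off the macro expands to nothing, so callers and existing builds are unaffected. The `__stdcall` keyword itself is only meaningful on toolchains and targets that support it.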
You, Min-Hsuan
e20692b4af amd/vpelib: fix FROD alignment handling after interface change
[WHY]
The frod_param.enable_frod field was changed from a bool to an int to support multiple FROD stages. The previous boolean-style condition did not correctly handle non-boolean integer values (e.g., multiple stages encoded as a non-zero value).

[HOW]
Treat any non-zero enable_frod as enabled and apply 16-pixel alignment regardless of which stage(s) are enabled.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Min-Hsuan You <Min-Hsuan.You@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
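The rule from e20692b4af (any non-zero enable_frod means enabled, and enabled means 16-pixel alignment) can be sketched as below. The function names are hypothetical, and rounding the width up to the next multiple of 16 is an assumption: the commit only says that 16-pixel alignment is applied.

```c
#include <stdint.h>

#define SKETCH_FROD_ALIGNMENT 16u

/* Round value up to a multiple of align; align must be a power of two. */
uint32_t sketch_align_up(uint32_t value, uint32_t align)
{
    return (value + align - 1u) & ~(align - 1u);
}

/* enable_frod is an int that may encode several stages, so compare
 * against zero rather than against a single "true" value. */
uint32_t sketch_frod_aligned_width(uint32_t width, int enable_frod)
{
    if (enable_frod != 0)
        return sketch_align_up(width, SKETCH_FROD_ALIGNMENT);
    return width;
}
```

A comparison such as `enable_frod == 1` would miss a value like 3 that encodes more than one stage, which is the class of bug the commit describes.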
Kovac, Krunoslav
0ff61b7e76 amd/vpelib: fix custom color space handling
[WHY]
Input format checking only rejects CS_UNKNOWN, so CUSTOM ends up being
supported even if 3dlut compound is not enabled, which is invalid.

[HOW]
Add internal type for COLOR_SPACE_CUSTOM and translations. Rejecting at
check_input_format support is rather inelegant as I would need to pass
lut_compound.enabled to a bunch of helpers, so I will still report
custom as supported.
However, checking for 3dlut compound will now no longer trivially report
true if not enabled - it will check that space is not custom as this is
only valid with lut compound enabled.

Acked-by: Peyton Lee <peytolee@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42033>
2026-06-15 10:02:26 +00:00
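The two-step validation described in 0ff61b7e76 can be sketched as follows. The enum and function names are hypothetical stand-ins, not vpelib's API. The sketch only mirrors the stated logic: the input check still accepts a custom color space, and the 3dlut compound check rejects it when lut compound is disabled.

```c
#include <stdbool.h>

/* Hypothetical color space tags; not vpelib's real enum. */
enum sketch_color_space {
    SKETCH_CS_UNKNOWN,
    SKETCH_CS_BT709,
    SKETCH_CS_BT2020,
    SKETCH_CS_CUSTOM
};

/* Input check: as before, only the unknown color space is rejected. */
bool sketch_check_input_color_space(enum sketch_color_space cs)
{
    return cs != SKETCH_CS_UNKNOWN;
}

/* Compound check: no longer trivially true when the feature is off. */
bool sketch_check_lut_compound(bool lut_compound_enabled,
                               enum sketch_color_space cs)
{
    if (!lut_compound_enabled) {
        /* Custom color space is only valid with lut compound enabled. */
        return cs != SKETCH_CS_CUSTOM;
    }
    /* Validation of the compound parameters themselves would go here. */
    return true;
}
```

Putting the rejection in the compound check avoids threading the enabled flag through the input-format helpers, which is the trade-off the commit message explains.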
Georg Lehmann
1981646c4f aco/tests: test creating v_dual_dot2acc_f32_f16 from v_dot2c_f32_f16 with inline constant
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42187>
2026-06-15 09:10:02 +00:00
Georg Lehmann
f91e0038b4 aco/tests: test v_pk_fmac_f16 and v_dotc_f32_f16 with inline constants
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42187>
2026-06-15 09:10:02 +00:00
Francisco Jerez
c3cdcd09ed intel/brw: Add NIR pass to vectorize dot products into DPAS matrix multiplications.
Add a new optimization pass that identifies sequences of scalar dot
product operations and combines them into DPAS (Dot Product Accumulate
Systolic) matrix multiplication instructions for XeHP+ EUs that have a
systolic array pipeline (AKA XMX engine).

This is possible because a matrix multiplication as performed by DPAS
can be expressed like:

  E^i_k = D^i_k + Sum_j A^i_j B^j_k

I.e. each scalar component of a matrix multiplication is just a
(possibly large) dot product.  This pass identifies such chains of
sdot_4x8_iadd dot products in the program and bins them according to
the A and B arguments used.  Sets of dot products with consecutive
components are transformed into a matrix product for each densely
occupied interval of indices within each bin, as long as there is an
efficient way to transpose one of the arguments in the register file.

This enables programs to opportunistically take advantage of the
systolic array pipeline for linear arithmetic, which has massively
greater throughput than the regular FPUs (roughly a factor of 4x the
throughput for the specific instructions replaced currently), without
the application having to be updated in order to take advantage of it
through a matrix multiplication API like KHR_cooperative_matrix.

The immediate motivation for this is getting the open source driver to
accelerate the matrix multiplications used for inference by the XeSS
ML-driven upscaling library, since the Mesa driver was currently
limited to the generic HLSL path that doesn't take advantage of the
XMX pipeline.  Alternative AI-driven upscaling libraries can be
supported in theory though this hasn't been pursued yet, and there are
some assumptions in the optimization pass that might get in the way
currently:

 - Currently only the sdot_4x8_iadd intrinsic is supported for no
   particular reason other than it being the intrinsic generated by
   the XeSS library in its multivendor path.  It would be
   straightforward to add support for additional types supported by
   the systolic pipeline.

 - Currently one of the arguments of the dot products is restricted to
   be an SSBO load because that's what we encounter in the XeSS
   library, but any other kind of memory load intrinsic could be
   supported easily.

 - Also accidental is the current limitation to run on Xe2+
   hardware. Getting it to work on XeHP (e.g. DG2) is theoretically
   possible beyond some minor differences so it will probably be a
   future area for improvement.

 - The limitation of the shader subgroup size to 16 done at the end of
   the optimization pass is less accidental, because on all Intel Xe
   platforms released so far the DPAS instruction is limited to run at
   a fixed execution width (8 on XeHP and 16 on Xe2-3), so the backend
   would need a way to expose variable-width DPAS intrinsics e.g. by
   lowering them using SIMD splitting.  I have some code to try to
   achieve that, but the naïf SIMD splitting approach of DPAS
   instructions appears to hurt more cases than it helps so I don't
   have a ready solution to lift this restriction yet.

Evaluating the impact of this on the performance of XeSS kernels using
our internal microbenchmarks shows a performance improvement for XeSS
inference between 26% and 44% depending on the quality preset and
resolution, with a geomean improvement of 35% across the rendering
modes tested.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>
2026-06-15 08:10:51 +00:00
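The identity behind c3cdcd09ed, that each element of E = D + A·B is one chain of dot products, can be checked with a small scalar model. This is a sketch under stated assumptions and not the NIR pass or the DPAS instruction: the `sketch_*` names, the 2x2 matrix size, and the row-major packed layout (with B stored transposed) are all hypothetical. The scalar function models sdot_4x8_iadd as a signed dot product of four packed 8-bit lanes added to a 32-bit accumulator.

```c
#include <stdint.h>

/* Pack four signed 8-bit values into one word, lane 0 in the low byte. */
uint32_t sketch_pack4(int8_t v0, int8_t v1, int8_t v2, int8_t v3)
{
    return (uint32_t)(uint8_t)v0 |
           ((uint32_t)(uint8_t)v1 << 8) |
           ((uint32_t)(uint8_t)v2 << 16) |
           ((uint32_t)(uint8_t)v3 << 24);
}

/* Scalar model of sdot_4x8_iadd: acc + sum over 4 lanes of a[l] * b[l]. */
int32_t sketch_sdot_4x8_iadd(uint32_t a, uint32_t b, int32_t acc)
{
    for (int lane = 0; lane < 4; lane++) {
        int8_t av = (int8_t)(uint8_t)(a >> (8 * lane));
        int8_t bv = (int8_t)(uint8_t)(b >> (8 * lane));
        acc += (int32_t)av * (int32_t)bv;
    }
    return acc;
}

#define SKETCH_ROWS  2 /* range of index i */
#define SKETCH_COLS  2 /* range of index k */
#define SKETCH_WORDS 2 /* packed words along j, i.e. 8 values per row */

/* E^i_k = D^i_k + Sum_j A^i_j B^j_k, computed one element at a time.
 * a holds the rows of A, b_t holds the columns of B (B transposed),
 * both packed along j; d and e are row-major. */
void sketch_matmul_as_dot_chains(const uint32_t *a, const uint32_t *b_t,
                                 const int32_t *d, int32_t *e)
{
    for (int i = 0; i < SKETCH_ROWS; i++) {
        for (int k = 0; k < SKETCH_COLS; k++) {
            int32_t acc = d[i * SKETCH_COLS + k];
            /* One chain of dot products per output element. */
            for (int w = 0; w < SKETCH_WORDS; w++)
                acc = sketch_sdot_4x8_iadd(a[i * SKETCH_WORDS + w],
                                           b_t[k * SKETCH_WORDS + w], acc);
            e[i * SKETCH_COLS + k] = acc;
        }
    }
}
```

The inner loop is the kind of chain the commit describes; chains that share A and B arguments and cover consecutive components are the candidates for merging into one matrix operation.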
Francisco Jerez
8857da4db5 intel/brw/swsb: Omit redundant read-after-read synchronization for back-to-back DPAS.
Multiple DPAS instructions executed on the same functional unit are
guaranteed to read their source operands in program order, so no
scoreboard synchronization is required between a DPAS read and another
DPAS read of the same register.

To achieve that, track the pipeline (DPAS vs. other) of each
out-of-order dependency via a new field on the dependency struct along
with the token ID of the out-of-order dependency.  When a read
dependency for a DPAS instruction is encountered whose producer is
also a DPAS unit, strip the SRC synchronization flag so that no
redundant wait is emitted.  The DST synchronization flag is preserved
since write-after-read hazards still require ordering.

This reduces the number of scoreboard stalls emitted within chains of
DPAS instructions that have overlapping sources (common in matrix
multiplication kernels), improving occupancy of the systolic pipeline.
It avoids performance regressions in XeSS kernels in combination with
the following vectorization optimization, and could also be helpful in
theory with other workloads that utilize the systolic pipeline via
KHR_cooperative_matrix.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>
2026-06-15 08:10:51 +00:00
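The rule stated in 8857da4db5 can be modeled loosely as below. This is only a sketch of the rule as the message words it, not the brw scoreboard pass: the struct, the flag values, and the function name are hypothetical.

```c
/* Hypothetical pipeline and flag definitions for illustration only. */
enum sketch_pipe {
    SKETCH_PIPE_OTHER,
    SKETCH_PIPE_DPAS
};

enum {
    SKETCH_SYNC_SRC = 1u << 0, /* wait for the producer's source reads */
    SKETCH_SYNC_DST = 1u << 1  /* wait for the producer's destination write */
};

/* An out-of-order dependency, tagged with the pipeline that issued it. */
struct sketch_dependency {
    unsigned         token_id;
    enum sketch_pipe pipe;
    unsigned         sync_flags;
};

/* Synchronization needed when an instruction on reader_pipe reads a
 * register covered by dep. Back-to-back DPAS reads happen in program
 * order, so the SRC flag is dropped; DST is always preserved. */
unsigned sketch_sync_for_read(const struct sketch_dependency *dep,
                              enum sketch_pipe reader_pipe)
{
    unsigned flags = dep->sync_flags;

    if (dep->pipe == SKETCH_PIPE_DPAS && reader_pipe == SKETCH_PIPE_DPAS)
        flags &= ~(unsigned)SKETCH_SYNC_SRC;

    return flags;
}
```

Storing the pipeline next to the token ID is what lets the decision be made per dependency, as the commit describes.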
Francisco Jerez
b394292085 intel/brw: Sort scheduling modes by performance after initial RA failure.
Previously when the first register allocation attempt failed,
brw_allocate_registers() would iterate over scheduling modes in the
fixed order specified by pre_modes[], assuming that the first
successful mode would be the most performant.  However that wasn't
ever a very reliable guarantee, and it becomes less so on Xe3+ where a
lower-register-pressure schedule can have higher thread parallelism.

But actually that's a bit of a silly situation since the pre_modes[]
loop that runs before the first brw_assign_regs() attempt already
iterates over multiple scheduling heuristics in order to choose which
one to try first, so it has a static analysis model of the relative
performance of the different heuristics which we can use in order to
properly sort the pre_modes[] list and make a more informed decision
about the iteration order at little extra cost.

This seems to be helpful even before xe3 in cases where
BRW_SCHEDULE_PRE_(NON_)LIFO outperforms BRW_SCHEDULE_PRE(_LATENCY), in
particular when the critical path heuristic used by
BRW_SCHEDULE_PRE_(NON_)LIFO does a better job at minimizing the
latency of the program than the mostly backward-looking heuristic of
BRW_SCHEDULE_PRE(_LATENCY).

That is apparently the case in several shaders from the XeSS library,
where the BRW_SCHEDULE_PRE heuristic hoists most of the memory loads
of the shader aggressively to the top creating a bottleneck instead of
interleaving the messages more effectively with the arithmetic along
the critical path of the program.  This patch avoids performance
regressions with the subsequent DPAS vectorization patch as a result
of this inversion of performance between the PRE and PRE_NON_LIFO
scheduling heuristics.

Note that this doesn't necessarily run the scheduler more times, it
just changes the order in which the different scheduling modes are
attempted. No significant difference in the compile-time of shader-db
or fossil-db has been observed.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>
2026-06-15 08:10:51 +00:00
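The reordering idea in b394292085 can be sketched as "sort the candidate modes by estimated performance, then take the first one that allocates". The struct and function names below are hypothetical. The `est_perf` field stands in for the static performance estimate the commit mentions, and the `ra_succeeds` flag stands in for an actual brw_assign_regs() attempt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of one scheduling mode. */
struct sketch_mode {
    int    id;          /* which heuristic this entry represents */
    double est_perf;    /* estimated performance, higher is better */
    bool   ra_succeeds; /* stand-in for a register allocation attempt */
};

/* qsort comparator: highest estimated performance first. */
static int sketch_cmp_perf_desc(const void *lhs, const void *rhs)
{
    const struct sketch_mode *a = lhs;
    const struct sketch_mode *b = rhs;

    if (a->est_perf > b->est_perf)
        return -1;
    if (a->est_perf < b->est_perf)
        return 1;
    return 0;
}

/* Sort the modes, then return the id of the best one that allocates,
 * or -1 if none does. The array is reordered in place. */
int sketch_pick_mode(struct sketch_mode *modes, size_t count)
{
    qsort(modes, count, sizeof(*modes), sketch_cmp_perf_desc);

    for (size_t i = 0; i < count; i++) {
        if (modes[i].ra_succeeds)
            return modes[i].id;
    }
    return -1;
}
```

The number of allocation attempts is the same as with a fixed order in the worst case; only the order of the attempts changes, which matches the compile-time note in the commit.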
Francisco Jerez
f638b30c03 nir/divergence: Allow local_invocation_id.z to be treated as uniform.
Add a new nir_divergence_uniform_local_invocation_id_z divergence
option that allows the Z component of the local invocation ID to be
treated as uniform across the subgroup, for cases where the driver
knows that as a result of the hardware's subgroup walk order the Z
component is guaranteed to remain constant across a subgroup.

On Intel hardware, for the walk order currently in use, all invocations
within a single subgroup are guaranteed to share the same
local_invocation_id.z value when the product of the X and Y workgroup
dimensions is a multiple of the SIMD width (32 at most).

This allows the subsequent vectorization optimization to have an
effect for many dot products in XeSS kernels whose two arguments
currently appear divergent. One of them only appears divergent
due to the dependency on local_invocation_id.z, which is actually
subgroup-uniform for these kernels.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>
2026-06-15 08:10:51 +00:00
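The uniformity condition in f638b30c03 can be checked by brute force under a simplified model. The sketch assumes a linear walk order (X fastest, then Y, then Z) with subgroups formed from consecutive linear indices. That model is an assumption made for illustration and may not match the hardware's actual walk order; the function name is hypothetical.

```c
#include <stdbool.h>

/* Returns true if, under the assumed linear walk order, every subgroup
 * of simd_width consecutive invocations sees a single value of
 * local_invocation_id.z. */
bool sketch_z_is_subgroup_uniform(unsigned size_x, unsigned size_y,
                                  unsigned size_z, unsigned simd_width)
{
    const unsigned slice = size_x * size_y; /* invocations per Z slice */
    const unsigned total = slice * size_z;

    if (simd_width == 0)
        return false;

    for (unsigned base = 0; base < total; base += simd_width) {
        const unsigned first_z = base / slice;

        for (unsigned lane = 0;
             lane < simd_width && base + lane < total; lane++) {
            /* z of this invocation under the assumed walk order */
            if ((base + lane) / slice != first_z)
                return false;
        }
    }
    return true;
}
```

When X*Y is a multiple of the SIMD width, every subgroup starts and ends inside one Z slice, so the check passes. The condition is sufficient rather than necessary: a workgroup with a single Z slice is also uniform.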
Sergi Blanch Torne
7c018be258 ci: disable Collabora's farm due to maintenance
Planned downtime in the farm:
* Start: 2026-06-15 07:00 UTC
* End: 2026-06-15 13:00 UTC

Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41745>
2026-06-15 06:47:55 +00:00
Marek Olšák
ce4654ead6 radv: rename vrs_coarse_shading -> vrs_flat_shading
All coarse shading is VRS, but this code is about flat shading.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42234>
2026-06-13 19:29:59 +00:00
Marek Olšák
339945833c radv,radeonsi: disallow VRS flat shading if SubgroupInvocationID is used
The sysval is affected by VRS.

More subgroup sysvals might have to be added here.

Cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42234>
2026-06-13 19:29:59 +00:00
Mary Guillemard
8f272b1fe1 nouveau/mme: Add a test for MME Shadow RAM behavior
Add a test to prove MME Shadow RAM behavior.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41506>
2026-06-13 14:16:39 +02:00
Mary Guillemard
8a1092712c nouveau/mme: Add some simple MME shadow RAM dumper
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41506>
2026-06-13 14:16:39 +02:00
Mike Blumenkrantz
8faf71d84f aux/tc: enforce strict resolve semantics
chromium/skia (stupidly) hits this path when drawing transparent svgs,
and it's definitely a bug in the browser engine, but no human can possibly
comprehend how any of that works

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42222>
2026-06-12 21:55:41 +00:00
Mike Blumenkrantz
2c5a2d8b39 util/tc: iterate the rp info more accurately during batch execution
these cases all trigger rp ends, but the info wasn't being iterated to
reflect the driver's expectation, leading to desync

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42222>
2026-06-12 21:55:41 +00:00
Mike Blumenkrantz
a4c07ed881 zink: tag tc info update in a few more places
these are places which might trigger rp ends

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42222>
2026-06-12 21:55:41 +00:00
Lionel Landwerlin
4e2abd872a anv: align storage texel buffer support on image support
We've enabled image support based on typed write without format
support. Let's do the same for texel buffers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/12384
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42225>
2026-06-12 21:28:55 +00:00
Faith Ekstrand
d7f9fede84 kraid: Re-materialize constants
This isn't a great long-term solution but it cuts down on register
pressure for now and lets more shaders compile.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>
2026-06-12 17:10:28 -04:00
Faith Ekstrand
d7a3276386 kraid: Allocate whole registers for staging destinations
The LOAD and LD_PKA instructions have i8, i16, and i24 forms that can,
in theory, operate on partial registers.  However, there are issues with
races between ALU and message instructions on partial registers.  We
could probably come up with a complex model for this but for now it's
easiest to just force whole registers for message destinations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>
2026-06-12 17:10:28 -04:00
Faith Ekstrand
5ebd05b8ea kraid: Add a Model::op_dst_is_staging_reg() helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>
2026-06-12 17:10:28 -04:00
Faith Ekstrand
fb7817bc71 kraid: Add a Model::op_src_is_staging_reg() helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>
2026-06-12 17:10:28 -04:00
Faith Ekstrand
cda7d27ad7 kraid: Fix RA for dead destinations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>
2026-06-12 17:10:28 -04:00
Faith Ekstrand
33e2ed7168 kraid: Only dump shaders if KRAID_DEBUG=print is set
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>
2026-06-12 17:10:28 -04:00
Collabora's Gfx CI Team
8aa1ad6cd4 Uprev ANGLE to 8e09325ebad45c7e11630a79754361e965e5fab0
196d1b79ea...8e09325eba

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41977>
2026-06-12 18:04:00 +00:00
Pohsiang (John) Hsu
6617b5b3fb mediafoundation: detach xThreadProc frame processing from apiLock to unblock concurrent ProcessOutput calls
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42217>
2026-06-12 17:49:48 +00:00
Pohsiang (John) Hsu
6d4f890182 mediafoundation: fix a few minor variant bool handling issues
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42217>
2026-06-12 17:49:48 +00:00
Pohsiang (John) Hsu
6c49c1083c mediafoundation: extract code to ProcessDX12EncodeContext
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42217>
2026-06-12 17:49:48 +00:00
Valentine Burley
51325d9ac3 venus/ci: Revert ADL jobs to stable 6.17 kernel
Xe is unstable on 6.18+, so we need to revert to the previous stable
kernel if we want to have pre-merge jobs on ADL.

Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42041>
2026-06-12 17:23:02 +00:00
Valentine Burley
24d707d7e2 venus/ci: Retire Intel Comet Lake runner
The Flip-hatch devices are getting retired in the Collabora lab.

We can also drop a few skips that were only needed for CML.

Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42041>
2026-06-12 17:23:02 +00:00