Commit graph

216778 commits

Author SHA1 Message Date
Samuel Pitoiset
7c9e5b4c1c radv: remove unreachable code for prefetch in radv_cs_emit_cp_dma()
CP DMA prefetches are implemented with a separate function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449>
2025-11-19 08:03:38 +00:00
Samuel Pitoiset
60d438e517 radv: always use MALL for CP DMA operations on GFX12
CP DMA isn't coherent with L2 on GFX12, but {SRC,DST}_ADDR_TC_L2 means
MALL.

Only small buffers are using copy/fill CP DMA operations, so this
shouldn't have much effect.

Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449>
2025-11-19 08:03:38 +00:00
Samuel Pitoiset
b2a13ce92c radv/tests: require drm-shim and use it instead of RADV_FORCE_FAMILY
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38507>
2025-11-19 07:11:05 +00:00
Samuel Pitoiset
8fd91a1ee9 ci: build drm-shim for RADV tests in debian-vulkan
RADV tests will require AMDGPU drm-shim.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38507>
2025-11-19 07:11:05 +00:00
David Rosca
1f83e73145 radeonsi/vcn: Reduce allocated size for pre-encode recon pics
We use 4x downscale for pre-encode, so we don't need full size
pre-encode reconstructed pictures.

Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38303>
2025-11-19 05:06:33 +00:00
Yiwei Zhang
a49b7adad8 venus: add error log coverage for virtgpu backend
Make life easier for CI debugging, remote debugging, and any kind of bug
report inspection. This has been needed for a long time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38443>
2025-11-19 04:28:49 +00:00
Yiwei Zhang
0afc408cb9 venus: properly fix the blob mem mapping size
There's a single underlying bo mapping shared by the initial alloc here
and the later import of the same. The mapping size has to be initialized
with the real size of the created blob resource, since the app can query
the exported native handle size for re-import. e.g. lseek dma-buf size

Similar to virtgpu_bo_create_from_device_memory, the app can do multiple
imports with different sizes for suballocation. So on the initial
import, the mapping size has to be initialized with the real size of the
backing blob resource.

Backport-to: 25.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38443>
2025-11-19 04:28:49 +00:00
Yiwei Zhang
c259ea24ee venus: avoid re-imported dma-buf to have a larger map size
If the allocation originates from the same instance, the tracker map
size follows the allocationSize. After export and re-import, mapping the
whole dma-buf can exceed the original map size. This change backs out
the offending changes.

Test: dEQP-VK.api.external.memory.*.suballocated.host_visible.*
Fixes: 442f242a49 ("venus: requests whole blob mem size for non-dedicated import")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38443>
2025-11-19 04:28:48 +00:00
Qiang Yu
a6bf07e7c2 dri: avoid sending too many present requests when app starts or pauses
Found when running glxgears with vblank enabled and the modesetting DDX.
glxgears sends many present requests at startup, but most of them
receive a complete event with skip mode. As a result, glxgears reports
~75fps on a 60Hz monitor in its first measurement.
This change reduces it to 60fps.

The Vulkan X11 WSI does not have this problem, as it waits for the first
present request's complete event before sending the second present request.

How the problem happens:
1. The client sends present request 1 with target msc = 1.
2. The server's current msc is 100, so it finds that request 1 is
   outdated and queues it for vblank with target msc = 101.
3. The client sends present request 2 with target msc = 2.
4. The server's current msc is still 100, so it finds that request 2
   is outdated and queues it with target msc = 101. It also finds that
   request 1 will be overridden, so it marks request 1 as skipped and
   sends an idle notify for it.
5. The client gets the idle notify for request 1 and reuses the
   request 1 buffer as the new back buffer to send present
   request 3.
6. This keeps going until the client sends present request N and the
   server finally processes the vblank queue before msc 101
   arrives, sending complete events for all these requests back
   to the client.
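
The sequence above can be reproduced with a toy model. This is only a
sketch under simplifying assumptions, not the X server or Mesa logic:
the function name `simulate_presents` and its parameters are
hypothetical, and the server's msc is assumed not to advance during the
interval being modelled.

```python
def simulate_presents(requests_per_vblank, wait_for_complete):
    """Count present requests sent during one vblank interval.

    Toy model (assumption): the server's current msc does not advance
    during the interval, so every request is outdated and is queued for
    the next vblank. A newly queued request overrides the previous one,
    which is marked skipped and has its buffer returned by idle notify.
    """
    queued = None   # request currently queued for the next vblank
    skipped = []    # requests overridden before the vblank arrived
    sent = 0

    for request_id in range(1, requests_per_vblank + 1):
        if wait_for_complete and queued is not None:
            # The client blocks until the queued request completes.
            break
        if queued is not None:
            # Overridden request: its buffer becomes reusable at once.
            skipped.append(queued)
        queued = request_id
        sent += 1

    return sent, skipped


# Eager client: every freed buffer is immediately presented again.
eager = simulate_presents(5, wait_for_complete=False)
# Client that waits for the first complete event before sending more.
throttled = simulate_presents(5, wait_for_complete=True)
```

In this model the eager client sends all five requests and four of them
are skipped, while the throttled client sends a single request per
interval.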

Reviewed-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38178>
2025-11-19 10:01:50 +08:00
Felix DeGrood
198537039a anv/rt: reduce writes to block_incr_and_start_prim
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36937>
2025-11-18 22:41:21 +00:00
Felix DeGrood
768bb1c7a3 anv/rt: multithread writing of invalid leaves
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36937>
2025-11-18 22:41:21 +00:00
Felix DeGrood
09c218e8aa anv/rt: fully restore code to write instance_count
Conformance tests and games still pass without this code, but
spec says we need it:
  https://registry.khronos.org/vulkan/specs/latest/html/
  vkspec.html#vkCmdCopyAccelerationStructureToMemoryKHR

This is potentially expensive code. There may be a future
opportunity to optimize this out. Need to research.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36937>
2025-11-18 22:41:21 +00:00
Felix DeGrood
cff9d82c66 anv/rt: rewrite encode.comp for better performance
Rewrite ANV's encode.comp, the final intel-specific raytracing shader
used for bvh-build. Performance is greatly improved for this shader
by adding the following features:

1) Find children early. All threads speculatively find their children
before they know if they are valid (not collapsed). This makes more
work overall but reduces latency for propagating valid nodes from
root to leaves. Nodes find out if they are valid faster if all nodes
know who their children are upfront.

2) Hoist code used for intra-thread communication. Communicate
to children as soon as possible, minimizing wait time for later
threads.

3) Multithread encoding. Still launching 1 simd lane per node, same
as before, but encoding of nodes and children are parallelized across
multiple lanes. This works well because most nodes are collapsed
without any encode work required.

4) Hash globalID. Reduce chance that the thread processing a node
will also need to process node's children, which was found to
degrade performance, particularly for root node processing.

Measured RT game speedups:
 * Hitman3 +48%
 * F1'22 +10%
 * Indiana Jones +8%
 * GravityMark +2.5%

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36937>
2025-11-18 22:41:20 +00:00
Olivia Lee
443ddace70 panvk/csf: merge v10 and v11 paths in issue_fragment_jobs
This is quite a lot of logic to duplicate verbatim just to deal with the
slightly different synchronization.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38496>
2025-11-18 21:33:40 +00:00
Olivia Lee
5c6c4cbebd panvk/csf: factor out cs_match_iter_sb helper macro
This simplifies cases where we need to match on all of the possible
iter_sb values, which occurs frequently.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38496>
2025-11-18 21:33:40 +00:00
Ryan Mckeever
298ad17b81 panfrost: enable EXT_shader_pixel_local_storage
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Ryan Mckeever
c15a43cce0 pan/lib: prepare for pixel local storage support
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Ryan Mckeever
dfddcae916 pan/bi: introduce EXT_shader_pixel_local_storage support to compiler
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Ryan Mckeever
cc12bac4f8 gallium, mesa: keep track of pixel local storage state
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Ryan Mckeever
7f7c4ebbba glsl, mesa: add EXT_shader_pixel_local_storage extension
This commit also checks for and issues errors for the following:

INVALID_OPERATION is generated if the application attempts to enable pixel
local storage while the value of SAMPLE_BUFFERS is one.

INVALID_OPERATION is generated if the application attempts to enable pixel
local storage while the current draw framebuffer is a user-defined
framebuffer object and has an image attached to any color attachment other
than color attachment zero.

INVALID_OPERATION is generated if the application attempts to enable pixel
local storage while the current draw framebuffer is a user-defined
framebuffer and the draw buffer for any color output other than color
output zero is not NONE.

INVALID_FRAMEBUFFER_OPERATION is generated if the application attempts to
enable pixel local storage while the current draw framebuffer is
incomplete.

INVALID_OPERATION is generated if pixel local storage is disabled and the
application attempts to issue a rendering command while a program object
that accesses pixel local storage is bound.

INVALID_OPERATION is generated if pixel local storage is enabled and the
application attempts to bind a new draw framebuffer, delete the currently
bound draw framebuffer, change color buffer selection via DrawBuffers, or
modify any attachment of the currently bound draw framebuffer including
their underlying storage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Ryan Mckeever
04d3da19c6 glapi: add EXT_shader_pixel_local_storage extension
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Ryan Mckeever
05795a1bd2 compiler/glsl: replace tabs with spaces
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Boris Brezillon
98bd0850da nir: Add a pass to downgrade inout PLS vars to {in,out} only ones
Shaders might declare PLS vars as inout but might just use them as in
or out but not both. This pass detects those cases and adjusts the
variable/deref modes accordingly.

This pass should be called before nir_lower_io_vars_to_temporaries(),
otherwise the copy_derefs will be inserted, turning unused variables
into used ones.

This should ideally be called after DCE to make sure we don't leave
PLS inout variables behind.
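
The downgrade decision described above can be sketched as follows. This
is an illustration only and does not use the NIR API: the function
`downgrade_inout` and the mode strings are hypothetical stand-ins for
the real variable/deref modes.

```python
def downgrade_inout(is_read, is_written):
    """Return the narrowest mode for a variable declared inout.

    Hypothetical helper: the mode strings are placeholders, not NIR
    variable modes.
    """
    if is_read and is_written:
        return "inout"   # genuinely used both ways, keep as declared
    if is_read:
        return "in"      # only loaded, never stored
    if is_written:
        return "out"     # only stored, never loaded
    return "unused"      # no accesses left, e.g. after DCE


# A variable that is only written can be treated as a plain output.
mode = downgrade_inout(is_read=False, is_written=True)
```

If copies were inserted for every inout variable first, each variable
would appear both read and written, and nothing could be downgraded.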

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Boris Brezillon
2cc254d8cb nir: Teach nir_lower_io_vars_to_temporaries() about PLS vars
Pixel local storage variables are like fragment shader outputs that
might be read, written or both. Teach nir_lower_io_vars_to_temporaries()
about these variables so they can be lowered along with the regular
fragment outputs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:43 +00:00
Boris Brezillon
ea4d4d2a77 nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering
Rather than adding another boolean to optionally lower PLS vars, pass
the types we want to lower through a nir_variable_mode bitmask.
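
A minimal sketch of the bitmask approach, assuming made-up flag names:
`VarMode` and `vars_to_lower` are hypothetical and only mirror the idea
of selecting variable kinds with one mask instead of one boolean per
kind. They are not the nir_variable_mode values.

```python
from enum import IntFlag


class VarMode(IntFlag):
    # Hypothetical flags standing in for variable modes.
    SHADER_IN = 1
    SHADER_OUT = 2
    PIXEL_LOCAL = 4


def vars_to_lower(variables, modes):
    """Return the names of variables whose mode is selected by the mask."""
    return [name for name, mode in variables if mode & modes]


variables = [
    ("color", VarMode.SHADER_OUT),
    ("pls", VarMode.PIXEL_LOCAL),
    ("uv", VarMode.SHADER_IN),
]

# One argument selects any combination of kinds.
selected = vars_to_lower(variables, VarMode.SHADER_OUT | VarMode.PIXEL_LOCAL)
```

Adding a new kind of variable then only needs a new flag rather than
another parameter on every caller.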

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:42 +00:00
Eric R. Smith
ab867cc3cd nir: add intrinsics for pixel local storage
The pixel local storage load and store instructions keep track of the
format of the pixel local storage variables. This allows drivers to insert
the appropriate conversions on load/store.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:42 +00:00
Ryan Mckeever
75263ce911 nir: add support for pixel_local_storage variables
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:42 +00:00
Allen Ballway
bfee8d3a14 android: support longer property names
Property names no longer have a maximum length in Android 26+.
Support longer names to fix truncated property names.

Signed-off-by: Allen Ballway <ballway@google.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13043
Test: vendor.mesa.custom.border.colors.without.format is untruncated
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38453>
2025-11-18 19:50:00 +00:00
Anna Maniscalco
9a72696e02 nir/lower_tex: copy is_sparse when lowering txd
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38497>
2025-11-18 19:03:36 +00:00
Natalie Vock
1243d575a5 aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.

This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38290>
2025-11-18 18:43:00 +00:00
Samuel Pitoiset
9f512d8f93 radv: advertise VK_EXT_custom_resolve
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442>
2025-11-18 17:03:13 +00:00
Samuel Pitoiset
91469bcc30 radv: implement VK_EXT_custom_resolve
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442>
2025-11-18 17:03:13 +00:00
Samuel Pitoiset
d700205a9d vulkan: add support for vkCustomResolveCreateInfoEXT
This basically remaps color attachment formats for the resolve
operation.

Co-Authored-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442>
2025-11-18 17:03:13 +00:00
Karol Herbst
d46be8fbf2 rusticl/kernel: Do not run kernels with a workgroup size beyond work_dim
When no workgroup size is specified, we try to run with the best one
possible. However, we didn't take into account that we shouldn't run a
workgroup of higher dimensionality than requested by the application.
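
The constraint can be sketched as below. This is not the rusticl code:
`clamp_local_size` is a hypothetical helper, and forcing the extra
dimensions to 1 is just one simple way to satisfy the rule.

```python
def clamp_local_size(local_size, work_dim):
    """Force every dimension at or beyond work_dim to a size of 1.

    Hypothetical helper: a real implementation might instead fold the
    unused dimensions' share into the dimensions that remain.
    """
    return [
        size if dim < work_dim else 1
        for dim, size in enumerate(local_size)
    ]


# A 3D-shaped guess is not valid for a 2D dispatch.
clamped = clamp_local_size([8, 8, 4], work_dim=2)
```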

Fixes: 376d1e6667 ("rusticl: implement cl_khr_suggested_local_work_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38375>
2025-11-18 15:42:43 +00:00
Karol Herbst
810dca450c rusticl/kernel: fix clGetKernelSuggestedLocalWorkSizeKHR implementation
There were two issues:
1. The global_work_offset parameter is optional but we errored on NULL
2. We didn't return the reqd_work_group_size when set on the kernel.
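
Both fixes can be illustrated with a small sketch. The function below
is hypothetical and is not the rusticl implementation; truncating the
required size to work_dim entries and the all-ones fallback are
assumptions made only to keep the example self-contained.

```python
def suggested_local_work_size(global_size, global_offset=None,
                              reqd_work_group_size=None):
    """Toy model of the two fixed behaviours."""
    work_dim = len(global_size)

    # 1. A missing offset is valid and means zero in every dimension.
    offset = [0] * work_dim if global_offset is None else list(global_offset)
    if len(offset) != work_dim:
        raise ValueError("offset needs one entry per dimension")

    # 2. A size required by the kernel wins over any heuristic.
    if reqd_work_group_size is not None:
        return list(reqd_work_group_size[:work_dim])

    # Placeholder heuristic (assumption): one work-item per group.
    return [1] * work_dim


# No offset supplied, kernel declares a required size.
size = suggested_local_work_size([128, 128], None, [16, 16, 1])
```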

Fixes: 376d1e6667 ("rusticl: implement cl_khr_suggested_local_work_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38375>
2025-11-18 15:42:43 +00:00
David Rosca
2587a565d8 radeonsi/vcn: Remove unnecessary vars for AV1 encode
These are just copied from picture desc.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:13 +00:00
David Rosca
698de5360c radeonsi/vcn: Cleanup AV1 screen content tools coding
There is no disable_screen_content_tools in AV1 spec, instead this
should be seq_choose_screen_content_tools. But we don't need that either
as we keep the effective value in force_screen_content_tools.
Same for seq_choose_integer_mv and force_integer_mv.
Also stop overriding these values and instead fix frame header coding
to work with all combinations.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:13 +00:00
David Rosca
6050bda231 radeonsi/uvd_enc: Cleanup HEVC encode deblock params handling
This should consider values from PPS and overrides from slice header
if enabled.
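
As a rough sketch of that rule: the field names below follow the HEVC
syntax element names, but `effective_deblock` and the dictionary layout
are hypothetical, and the model is simplified (it assumes the slice
offsets are always present when the override flag is set).

```python
def effective_deblock(pps, slice_hdr):
    """Pick deblocking parameters from the PPS or the slice header."""
    override = (
        pps["deblocking_filter_override_enabled_flag"]
        and slice_hdr.get("deblocking_filter_override_flag", False)
    )
    if override:
        # The slice header replaces the PPS values for this slice.
        return {
            "disabled": slice_hdr["slice_deblocking_filter_disabled_flag"],
            "beta_offset_div2": slice_hdr["slice_beta_offset_div2"],
            "tc_offset_div2": slice_hdr["slice_tc_offset_div2"],
        }
    # No override: the PPS values apply unchanged.
    return {
        "disabled": pps["pps_deblocking_filter_disabled_flag"],
        "beta_offset_div2": pps["pps_beta_offset_div2"],
        "tc_offset_div2": pps["pps_tc_offset_div2"],
    }


pps = {
    "deblocking_filter_override_enabled_flag": True,
    "pps_deblocking_filter_disabled_flag": False,
    "pps_beta_offset_div2": 0,
    "pps_tc_offset_div2": 0,
}
slice_hdr = {
    "deblocking_filter_override_flag": True,
    "slice_deblocking_filter_disabled_flag": False,
    "slice_beta_offset_div2": 2,
    "slice_tc_offset_div2": -1,
}
params = effective_deblock(pps, slice_hdr)
```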

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:13 +00:00
David Rosca
153ff5dd8a radeonsi/vcn: Cleanup HEVC encode deblock params handling
This should consider values from PPS and overrides from slice header
if enabled.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:13 +00:00
David Rosca
10e274af62 radeonsi/video: Make helper radeon_bitstream functions static
Those are now only used in radeon_bitstream.c

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:12 +00:00
David Rosca
094e20f134 radeonsi/vcn: Use radeon_bitstream functions to code headers
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:12 +00:00
David Rosca
628694c16c radeonsi/vce: Use radeon_bitstream functions to code headers
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:12 +00:00
David Rosca
332ec608ad radeonsi/uvd_enc: Use radeon_bitstream functions to code headers
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:12 +00:00
David Rosca
1f51401dae radeonsi/video: Add VPS/SPS/PPS and sequence header functions to radeon_bitstream
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:12 +00:00
David Rosca
c5f898edb4 frontends/va: Add AV1 encode high_bitdepth flag
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38260>
2025-11-18 09:28:12 +00:00
Christoph Pillmayer
617f0562bb pan: Use bitset instead of bool array in bi_find_loop_blocks
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38393>
2025-11-18 09:02:58 +00:00
Christoph Pillmayer
6535a3b6b3 pan: Fix bi_find_loop_blocks
Before this commit, nested loops weren't counted correctly:
   -------------
   V           |
-> A --> B --> C ->
         ^     |
         -------
A is both predecessor and successor of B but A isn't in B's loop.

Instead, a block B is in loop header H's loop if H is a successor
of B and H dominates B.
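
The dominance test can be tried on the CFG drawn above. This is a
textbook natural-loop construction written for illustration, not the
bi_find_loop_blocks implementation: `dominators`, `loop_blocks`, the
dictionary CFG encoding and the `exit` block are all assumptions.

```python
def dominators(succs, entry):
    """Iterative dominator sets: dom[b] holds every block dominating b.

    Assumes every block other than the entry has a predecessor.
    """
    blocks = list(succs)
    preds = {b: [p for p in blocks if b in succs[p]] for b in blocks}
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            new = set.intersection(*(dom[p] for p in preds[b])) | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def loop_blocks(succs, entry):
    """Map each loop header to the set of blocks in its natural loop."""
    dom = dominators(succs, entry)
    preds = {b: [p for p in succs if b in succs[p]] for b in succs}
    loops = {}
    for tail, targets in succs.items():
        for header in targets:
            if header not in dom[tail]:
                continue  # header does not dominate tail: no back edge
            body = loops.setdefault(header, {header})
            stack = [tail]
            while stack:  # walk predecessors until the header is reached
                block = stack.pop()
                if block not in body:
                    body.add(block)
                    stack.extend(preds[block])
    return loops


# The CFG from the diagram: C branches back to both A and B.
cfg = {"A": ["B"], "B": ["C"], "C": ["A", "B", "exit"], "exit": []}
loops = loop_blocks(cfg, "A")
```

Here the loop headed by B contains only B and C, while the loop headed
by A contains A, B and C, so A is correctly left out of B's loop.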

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38393>
2025-11-18 09:02:56 +00:00
Christoph Pillmayer
5ae1b68cb0 pan: Adapt calc_dominance from nir to bi
Mostly "s/nir_block/bi_block/g" and some small fixups.

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38393>
2025-11-18 09:02:55 +00:00
Christoph Pillmayer
dd01573207 pan: Copy nir_dominance.c to bi_dominance.c
Next commit will actually convert it to be bifrost.

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38393>
2025-11-18 09:02:52 +00:00
Christoph Pillmayer
ca9c9957e2 pan: Avoid some redundant SSA spills
Instead of inserting the spill instruction before the instruction that
caused the spill, insert it either right after the definition
or at the end of the block that contains the definition.
This helps reduce code size and, on average, moves STOREs outside of
loops.

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38238>
2025-11-18 08:42:23 +00:00