Commit graph

216778 commits

Author SHA1 Message Date
Alyssa Rosenzweig
583b25e806 util: fix container_of on MSVC
otherwise &container_of(..)->foo won't work, need extra parens. gcc version is
fine.
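
The precedence issue described above can be sketched as follows. This is a minimal illustration, not Mesa's actual macro: `container_of_parens`, `struct item`, and `struct list_node` are hypothetical names. Because `->` binds tighter than a cast, a macro that expands to a bare cast expression cannot be followed directly by `->foo`; wrapping the whole expansion in parentheses makes it usable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not Mesa's definitions. */
struct list_node {
   struct list_node *next;
};

struct item {
   int id;
   struct list_node link;
};

/* The outer parentheses matter: without them, `MACRO(...)->id` would apply
 * `->id` to the inner `char *` expression before the cast takes effect. */
#define container_of_parens(ptr, type, member) \
   ((type *)((char *)(ptr) - offsetof(type, member)))

/* Returns the address of the `id` field of the item that embeds `node`. */
int *item_id_from_link(struct list_node *node)
{
   return &container_of_parens(node, struct item, link)->id;
}
```

With the parenthesized form, `&container_of_parens(node, struct item, link)->id` parses as taking the address of a member of the recovered `struct item`.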

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
2025-11-12 21:22:10 +00:00
Jose Maria Casanova Crespo
a0b8ee614d v3dv: only apply simulator stride alignment for from_wsi images
This adds a from_wsi field to v3dv_image, so we can apply simulator stride
alignment only to WSI images.

Handling VK_STRUCTURE_TYPE_WSI_IMAGE_CREATE_INFO_MESA at
v3dv_GetPhysicalDeviceFormatProperties2 also removes debug warnings like:

MESA: debug: v3dv_GetPhysicalDeviceImageFormatProperties2: ignored VkStructureType Unknown VkStructureType value.(1000001002)

Fixes: 562bb8b62b ("v3dv: align width to 256 when using simulator")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38374>
2025-11-12 21:03:42 +00:00
Faith Ekstrand
7411acaa77 panvk/dispatch: s/shader/cs/g
Even though "shader" uniquely means something here and we don't need to
specify what stage, "cs" is still shorter, obvious, and matches what we
do all over the 3D code so it saves some cognitive load when bouncing
back and forth.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38403>
2025-11-12 20:44:13 +00:00
Faith Ekstrand
1046f5ed48 panvk: Make noperspective_varyings const
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38403>
2025-11-12 20:44:12 +00:00
Joshua Simmons
7ac1f7777d vtn: Fix OpCopyLogical destination type
Previously the type info for nested values was copied from the source
operand, rather than propagating the new type from the destination
operand.

Fixes: 4c363acf94 ("vtn: Allow for OpCopyLogical with different but compatible types")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38248>
2025-11-12 20:30:30 +01:00
Eric Engestrom
43fe66c26a docs: add 25.2.8 to the calendar
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38407>
2025-11-12 19:09:53 +01:00
Eric Engestrom
0644351297 docs: add sha sum for 25.2.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38407>
2025-11-12 19:08:44 +01:00
Eric Engestrom
2b9def5042 docs: add release notes for 25.2.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38407>
2025-11-12 19:08:44 +01:00
Eric Engestrom
42d82fb6c0 docs: update calendar for 25.2.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38407>
2025-11-12 19:08:38 +01:00
Samuel Pitoiset
6cf1f3b39a radv: fix supporting more tess parameters with TCS for ESO unlinked shaders
VGT_OUTPRIM_TYPE should be programmed correctly when PointMode is only
set in TCS with ESO.

Fixes dEQP-VK.shader_object.tessellation.hlsl.point_mode.

Fixes: c6d9b9b4e0 ("radv: support more tessellation parameters with TCS for ESO unlinked shaders")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38376>
2025-11-12 16:21:17 +00:00
Gurchetan Singh
5826a0aad9 gfxstream: meson format -i {all meson files}
More readable, allows meson format to be used in the future.

Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38388>
2025-11-12 16:05:12 +00:00
Danylo Piliaiev
c73c737eb2 tu/lrz: Try harder to have LRZ fast-clear enabled with FDM offset
Non-fast-clear path to clear LRZ is rather slow, plus without LRZ
fast-clear we cannot enable concurrent binning.

VK_EXT_fragment_density_map_offset makes us add the maximum possible
tile size to the depth image size, because we don't know the tile
size that will be selected later on for a framebuffer. In practice,
this caused LRZ fast-clear to be disabled in many cases due to
tile_max_w/tile_max_h being rather large.

Now, instead of the most pessimistic case we can do the following:
- Calculate the biggest possible tile size that could be added to
  the depth image that won't disable LRZ fast-clear.
- When calculating tiling config use the info about maximum tile
  size from the images, or in case of image-less framebuffer
  recalculate maximum possible tile size and use it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38218>
2025-11-12 15:36:43 +00:00
Danylo Piliaiev
4baf82b406 freedreno/fdl: Move LRZ FC size calculation to a separate function
Will be needed later to calculate max tile size for FDM offset case.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38218>
2025-11-12 15:36:43 +00:00
Marek Olšák
7d22e4c7ba gallium/noop: don't unref buffers passed to set_vertex_buffers to fix crashes
this code is invalid after the refcounting rework

Fixes: b3133e250e - gallium: add pipe_context::resource_release to eliminate buffer refcounting

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38329>
2025-11-12 15:02:20 +00:00
Lionel Landwerlin
c4e2878537 anv: disable software detiling on Xe2+ for image atomics 64bits
This is what happens when you leave MR unreviewed for months.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38400>
2025-11-12 14:33:01 +00:00
Timur Kristóf
0651fd4e6d radeonsi/ci, zink+radv/ci: Remove GS primitive_counter tests from flakes
These should be fixed now.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>
2025-11-12 13:40:55 +00:00
Timur Kristóf
7f5f8b3932 ac/nir/ngg: Use align() instead of ALIGN()
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>
2025-11-12 13:40:55 +00:00
Timur Kristóf
8f99d736d0 ac/nir/ngg: Fix scratch space for NGG GS streamout
For GS streamout, we need the following LDS scratch space:

- Repacking streamout vertices takes 1 dword per 4 waves per stream
  (max 16 bytes for Wave64, max 32 bytes for Wave32)
- 1 dword per stream for buffer info
  (16 bytes)
- 1 dword per buffer for buffer info
  (16 bytes)

Previously, the space used for buffer info aliased with the
space for repacking the output vertices in ngg_gs_finale(),
and there was no barrier in between, which caused a race
condition, resulting in random failure.

Fix this by allocating a few more LDS dwords so that aliasing
is not required, which also allows us to remove an extra
workgroup barrier.
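
The byte counts quoted above can be reproduced with a small calculation. This is a rough sketch, not the code from the commit: it assumes a maximum workgroup size of 256 invocations, 4 streams, and 4 streamout buffers (values inferred from the stated maxima), and all function and constant names are hypothetical.

```c
#include <assert.h>

/* Assumed limits, inferred from the sizes quoted in the commit message. */
enum { DWORD_BYTES = 4, MAX_STREAMS = 4, MAX_BUFFERS = 4 };

unsigned div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

/* Repacking: 1 dword per 4 waves per stream. */
unsigned repack_scratch_bytes(unsigned workgroup_size, unsigned wave_size)
{
   unsigned num_waves = div_round_up(workgroup_size, wave_size);
   return div_round_up(num_waves, 4) * MAX_STREAMS * DWORD_BYTES;
}

/* Total when nothing aliases: repacking space, plus 1 dword per stream
 * and 1 dword per buffer for buffer info. */
unsigned gs_streamout_scratch_bytes(unsigned workgroup_size, unsigned wave_size)
{
   return repack_scratch_bytes(workgroup_size, wave_size) +
          MAX_STREAMS * DWORD_BYTES +
          MAX_BUFFERS * DWORD_BYTES;
}
```

Under these assumptions a 256-invocation workgroup needs 16 bytes of repacking space in Wave64 and 32 bytes in Wave32, matching the figures in the list.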

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12705
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>
2025-11-12 13:40:55 +00:00
Gert Wollny
13148afd0e etnaviv: isa: Add "thread" info to TEX instruction
Blob generates this with the glmark2:texture benchmark on STM32MP257.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38363>
2025-11-12 13:00:09 +00:00
Valentine Burley
13a20f6571 intel/ci: Drop timeout overrides for pre-merge jobs
LAVA jobs already have a global 1h timeout in GitLab. This exists because
GitLab jobs must start before we can determine whether a device is
available for testing.

Jobs themselves do not normally run that long; most of the delay comes
from waiting in the LAVA queue.

Dropping these overrides for pre-merge jobs fixes cases where the LAVA
job isn't picked up in time.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38395>
2025-11-12 12:35:27 +00:00
Christian Gmeiner
e9341568fa meson: require sysprof-capture-4 >= 4.49.0
When Mesa is compiled with sysprof support, applications can crash with a
segfault during shutdown. This happens because sysprof_collector_mark()
registers thread-local storage destructors that get called after the library
containing the destructor code has been unloaded.

The problem was fixed in sysprof https://gitlab.gnome.org/GNOME/sysprof/-/merge_requests/152

CC: mesa-stable
Closes: mesa/mesa#13571
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38347>
2025-11-12 12:00:46 +00:00
Dmitry Baryshkov
9a33edca35 ci: drop google-freedreno remnants
Drop remnants of the google-freedreno lab entries.

Fixes: 6541b911bd ("freedreno/ci: Remove baremetal job templates")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38386>
2025-11-12 09:43:43 +00:00
Samuel Pitoiset
74a66d102f ac/parse_ib: decode SDMA_OPCODE_POLL_REGMEM
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38366>
2025-11-12 08:52:17 +00:00
Samuel Pitoiset
75a1380355 radv: add RADV_DEBUG=dumpibs to dump command buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38366>
2025-11-12 08:52:17 +00:00
Samuel Pitoiset
842603dc4f radv/amdgpu: add a way to identify preamble/postamble when dumping CS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38366>
2025-11-12 08:52:17 +00:00
Autumn Ashton
2705d8bd8b radv/video: Implement VK_VALVE_video_encode_rgb_conversion
This is used by Steam Link VR (driver_vrlink) to avoid doing YUV conversion itself.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37500>
2025-11-12 08:25:36 +00:00
Natalie Vock
73a31dafbc radv: Fix PSO history with RT pipelines
1. The prolog needs to have a null check. Libraries don't have prologs.
2. We only need to print the shaders actually included in this pipeline.
   Libraries were already printed separately.
3. The traversal shader was wrongly omitted from the output.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38355>
2025-11-12 08:00:54 +00:00
Konstantin Seurer
c4aee84426 radv: Add re-format commit to .git-blame-ignore-revs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38336>
2025-11-12 07:55:36 +00:00
Samuel Pitoiset
0dba538643 radv/meta: fuse depth/stencil aspects copy with the GFX path
Depth/stencil copies on graphics are twice as fast now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:33 +00:00
Samuel Pitoiset
9d3dd174b8 radv/meta: rework radv_meta_nir_texel_fetch_build_func
This adds a binding parameter that will be used for fused depth/stencil
copies.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:33 +00:00
Samuel Pitoiset
332f881375 radv/meta: simplify aspect/formats in radv_gfx_copy_image()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:32 +00:00
Samuel Pitoiset
cd59db45f9 radv/meta: simplify radv_gfx_copy_memory_to_image() even more
Selecting formats can be simplified.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:32 +00:00
Samuel Pitoiset
ed05c3fc31 radv/meta: remove multiple aspects in radv_gfx_copy_memory_to_image()
Only one aspect at any time is valid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:31 +00:00
Samuel Pitoiset
a1884dc737 radv/meta: remove radv_meta_blit2d_rect
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:31 +00:00
Samuel Pitoiset
1319b2bef6 radv/meta: split radv_meta_blit2d() into two separate functions
It's more code but it's definitely easier to read and it will allow us
to do more cleanups/optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:30 +00:00
Samuel Pitoiset
bb3f69fefe radv/meta: remove useless blit2d_src_temps
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:29 +00:00
Andy Hsu
d226c0d97d u_trace: remove redundant char* to string conversion (v2)
Add a string length parameter to the set_name() and
set_value() functions to remove the conversion from
char* to std::string, which takes extra work like
calling strlen() to compute the string length.

From the callback sampling in the perfetto tracing,
the ratio of trace_payload_as_extra_intel_end_draw_indexed
to intel_ds_end_draw_indexed drops from 63.80% to 59.65%
with this change.

v2: Add the data of the callback sampling to the description.

Signed-off-by: Andy Hsu <hwandy@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38073>
2025-11-12 06:17:16 +00:00
Aitor Camacho
93460e969e docs,kk: Add KosmicKrisp documentation
Adds build instructions and workarounds documentation.
The workarounds documentation only covers the biggest offenders;
there are probably many more in the code that have yet to be
documented.

Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38232>
2025-11-12 04:23:59 +00:00
Faith Ekstrand
f187b537b5 pan: Use nir_lower_point_size for the float16 conversion
This is more robust than smashing the variable to mediump and then
asking for mediump to be lowered later.  It's also faster because it
only involves one compiler pass, not two.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
2025-11-12 01:34:36 +00:00
Faith Ekstrand
6ee4ea5ea3 nir: Add a type parameter to nir_lower_point_size()
On Mali, we need to not only clamp but also convert to float16 on Valhall+.
We could have a separate pass for this but it fits in nicely with the
rest of nir_lower_point_size() so we might as well put it there.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
2025-11-12 01:34:36 +00:00
Sviatoslav Peleshko
5af8abbf8b driconf: Add vertex_program_default_out option for Penumbra: Overture
Penumbra's vertex program Diffuse_EnvMap_Reflect_vp.cg produces 3-component
texture coordinates and primitive colors while using the FF fragment
program. Add this WA to fix the misrenderings.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14170
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38295>
2025-11-11 22:16:46 +00:00
Sviatoslav Peleshko
f03432c81a mesa,driconf: Add WA to initialize vertex program outputs to vec4(0,0,0,1)
Per the ARB_vertex_program spec, result registers are 4-component and initially
undefined, and the FF fragment program expects its inputs to be
4-component too. So, if the client's vertex program does not write the
whole vector it will cause misrenderings unless the same client also
supplies a fragment program that expects fewer than 4 components.

This commit adds a workaround that initializes results to vec4(0, 0, 0, 1)
which seems to be an expected behavior for such clients.
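
The effect of the workaround can be illustrated with a tiny simulation. This is only a sketch under stated assumptions, not Mesa code: `struct vec4`, `write_xyz_only`, and `run_with_default_init` are hypothetical names, and the "vertex program" is a stand-in that writes just three components.

```c
#include <assert.h>

struct vec4 {
   float x, y, z, w;
};

/* Hypothetical stand-in for a client vertex program that writes only
 * the first three components of a result register. */
void write_xyz_only(struct vec4 *result)
{
   result->x = 0.25f;
   result->y = 0.5f;
   result->z = 0.75f;
}

/* With the workaround, the result starts as vec4(0, 0, 0, 1), so the
 * component the program never writes has a defined value. */
struct vec4 run_with_default_init(void)
{
   struct vec4 result = { 0.0f, 0.0f, 0.0f, 1.0f };
   write_xyz_only(&result);
   return result;
}
```

A consumer that reads all four components then sees `w == 1.0f` instead of an undefined value.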

Cc: mesa-stable
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38295>
2025-11-11 22:16:46 +00:00
Eric Engestrom
f30e5ff44b ci: uprev vkd3d
03cca4cd97...4acd227131

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38370>
2025-11-11 20:15:21 +00:00
Faith Ekstrand
51a68ecc87 panvk: Optimize in the preprocess hook
NIR is actually pretty good at optimizing UBO, SSBO, and shared memory
access but in order to do so, we actually have to run the optimizations
before we lower it all.  Same for I/O.  By doing all our lowering in
panvk before we ever run the optimization loop, we risk hampering it
significantly.

Ignoring loop changes (several get unrolled now), fossil-db on Sascha
Willems demos and a few others looks like:

    Instrs: 189054 -> 187802 (-0.66%); split: -0.67%, +0.01%
    CodeSize: 1756160 -> 1747072 (-0.52%); split: -0.52%, +0.01%
    Estimated normalized CVT cycles: 771.367106999997 -> 766.0311719999971 (-0.69%); split: -1.05%, +0.36%
    Estimated normalized SFU cycles: 1407.21875 -> 1406.9375 (-0.02%); split: -0.03%, +0.01%
    Estimated normalized Load/Store cycles: 17477.0 -> 16917.0 (-3.20%)
    Maximum number of threads: 1257 -> 1213 (-3.50%); split: +0.08%, -3.58%
    Number of hardware loops: 283 -> 278 (-1.77%)

    Totals from 186 (19.81% of 939) affected shaders:
    Instrs: 102588 -> 101336 (-1.22%); split: -1.23%, +0.01%
    CodeSize: 834432 -> 825344 (-1.09%); split: -1.10%, +0.02%
    Estimated normalized CVT cycles: 463.226562 -> 457.890627 (-1.15%); split: -1.74%, +0.59%
    Estimated normalized SFU cycles: 1021.84375 -> 1021.5625 (-0.03%); split: -0.05%, +0.02%
    Estimated normalized Load/Store cycles: 8425.0 -> 7865.0 (-6.65%)
    Maximum number of threads: 334 -> 290 (-13.17%); split: +0.30%, -13.47%
    Number of hardware loops: 63 -> 58 (-7.94%)

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
1a9c7f8c8a panvk: Only lower outputs to temporaries
We need to lower outputs to get rid of output reads and so that we can
fix up layer writes on Bifrost.  However, there's really no point in
lowering reads besides moving them to the top.  Even then, NIR can
probably copy propagate the copies and we'll end up reading straight
from the input variable anyway.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
a8b6213983 panvk: Lower copy_deref and indirect derefs before nir_lower_io
Neither nir_lower_io() nor nir_lower_indirect_derefs() know what to do
with copy_deref so we need to get rid of those first.  Also, there are
some NIR passes which can insert more copy_deref or propagate an
indirect load to the I/O variable so we want to lower those away right
before lowering I/O.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
d6dc0ea5ae panvk: Split var copies and lower local vars early
These two passes are a prerequisite for basically anything that
optimizes on variables.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
586e1ac2b8 pan/compiler: Expose the bifrost optimization loop
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
0e9fcb33c3 nir: Add a couple panfrost sysvals to divergence analysis
Fixes: 2af6e4beeb ("pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex}")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Daniel Schürmann
5682e39e6b amd: enable load/store_shared2_amd for GFX6
Totals from 1509 (2.43% of 62200) affected shaders: (Pitcairn)

MaxWaves: 8078 -> 8057 (-0.26%); split: +0.09%, -0.35%
Instrs: 977182 -> 951746 (-2.60%); split: -2.62%, +0.02%
CodeSize: 4951468 -> 4758192 (-3.90%); split: -3.92%, +0.01%
SGPRs: 76704 -> 76696 (-0.01%)
VGPRs: 81092 -> 81068 (-0.03%); split: -0.34%, +0.31%
Latency: 11663237 -> 11526070 (-1.18%); split: -1.19%, +0.01%
InvThroughput: 6198904 -> 6114851 (-1.36%); split: -1.43%, +0.07%
VClause: 26656 -> 26655 (-0.00%); split: -0.05%, +0.05%
SClause: 22304 -> 22307 (+0.01%); split: -0.03%, +0.04%
Copies: 107503 -> 109564 (+1.92%); split: -0.23%, +2.15%
Branches: 22917 -> 22918 (+0.00%)
PreSGPRs: 42246 -> 42242 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 64561 -> 64761 (+0.31%); split: -0.01%, +0.32%
VALU: 600285 -> 601139 (+0.14%); split: -0.26%, +0.40%
SALU: 130622 -> 130851 (+0.18%); split: -0.16%, +0.33%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>
2025-11-11 17:12:17 +00:00