Commit graph

222200 commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
b2db3e1ddc radeonsi: add si_gfx_context.c and move code from si_pipe.c
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:18:04 +02:00
Pierre-Eric Pelloux-Prayer
a335f4be7a radeonsi/gfx: move code from si_get to si_gfx_screen
These functions can be moved to the gfx subfolder and made static.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:18:02 +02:00
Pierre-Eric Pelloux-Prayer
d1c57f742e radeonsi/gfx: add si_gfx_screen.c
And move code specific to gfx/compute from radeonsi_screen_create_impl there.

ac_init_llvm_once has to stay in si_pipe.c because it has to be called very
early to avoid conflicts with u_queue initialisation.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:59 +02:00
Pierre-Eric Pelloux-Prayer
5f56a0e057 radeonsi: add si_resource_copy_buffer
Same as si_resource_copy_resource except it only supports buffers.

Also make sure that si_compute_clear_copy_buffer doesn't do
anything when has_gfx_compute is false.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:56 +02:00
Pierre-Eric Pelloux-Prayer
838ce62f3a radeonsi: extract si_init_gfx_caps from si_init_screen_caps
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:54 +02:00
Pierre-Eric Pelloux-Prayer
a325be9548 radeonsi: move shader cache code to new file
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:53 +02:00
Pierre-Eric Pelloux-Prayer
68a383531d radeonsi: add gfx subfolder
Same idea as for mm.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:47 +02:00
Pierre-Eric Pelloux-Prayer
4c08b87fe1 radeonsi: add si_init_screen_nir_options
Extract code from si_init_screen_get_functions to new helper.
The code assigning nir_options[] is moved out to help future
changes.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:45 +02:00
Pierre-Eric Pelloux-Prayer
e8cdd8ccb1 radeonsi: create a mm subfolder for multimedia code
Start moving code that's only for radeonsi multimedia support in this
folder to declutter si_pipe.c and si_get.c.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:42 +02:00
Pierre-Eric Pelloux-Prayer
b819ad62c2 radeonsi/vce: deal with has_gfx_compute being false
In this case the workaround can't be implemented so we must
report a failure.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:40 +02:00
Pierre-Eric Pelloux-Prayer
6e2d8c04be radeonsi: don't use staging texture when we can't blit
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:38 +02:00
Pierre-Eric Pelloux-Prayer
3a1c466084 radeonsi: add has_gfx_compute property to si_screen
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:36 +02:00
Pierre-Eric Pelloux-Prayer
931fc57f2a radeonsi: delay aux context initialization to first use
This avoids creating unneeded contexts.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:33 +02:00
Pierre-Eric Pelloux-Prayer
01c7a82760 gallium/vl: only release created sampler views
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:31 +02:00
Pierre-Eric Pelloux-Prayer
d4c23daffc radeonsi: handle NULL return value from amdgpu_cs
The next commits will make it possible that sctx->gfx_cs isn't
initialized, so we can no longer assume that amdgpu_cs() always
returns a valid cs.

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41133>
2026-05-07 14:17:27 +02:00
Christian Gmeiner
4dbdd4c0ee panvk: Advertise VK_EXT_extended_dynamic_state3
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40882>
2026-05-07 10:56:49 +00:00
Christian Gmeiner
fd2d3992ce panvk: Apply sample mask in single-sample mode
Per Vulkan spec, the pipeline sample mask applies to all rasterization
sample counts, including single-sample. Drop the msaa-conditional clamp
that forced the sample mask to UINT16_MAX when rasterizationSamples == 1
and just use vk_dynamic_graphics_state's value directly. The default
when no static pSampleMask is provided is already all-ones, so existing
behaviour is preserved for pipelines that don't set the mask.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40882>
2026-05-07 10:56:49 +00:00
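
The behaviour change described in fd2d3992ce can be illustrated with a minimal Python sketch. The function names and the 16-bit masking of the state value are illustrative assumptions, not panvk code; only the UINT16_MAX clamp and the rasterizationSamples == 1 condition come from the commit message.

```python
UINT16_MAX = 0xFFFF


def sample_mask_before(rasterization_samples: int, state_mask: int) -> int:
    """Old behaviour as described: force an all-ones mask when single-sampled."""
    if rasterization_samples == 1:
        return UINT16_MAX
    # Assumption: the mask is treated as a 16-bit value.
    return state_mask & UINT16_MAX


def sample_mask_after(rasterization_samples: int, state_mask: int) -> int:
    """New behaviour: use the dynamic-state value for every sample count."""
    return state_mask & UINT16_MAX


# A pipeline that masks out its only sample now really discards it.
print(hex(sample_mask_before(1, 0x0)))
print(hex(sample_mask_after(1, 0x0)))
```

With a default all-ones mask both functions return the same value, which matches the commit's note that existing behaviour is preserved for pipelines that don't set the mask.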
Icenowy Zheng
3afc792dc8 pvr: setup viewindex if the shader wants it even when multiview disabled
It's possible to use a shader that has ViewIndex input when multiview
isn't enabled. According to the Vulkan specification, when multiview
isn't enabled in a renderpass, the value of the ViewIndex input should
be 0.

However, the driver currently does not emit execution of the PDS code
that sets up the view index, which leaves a stale value in
ViewIndex.

Set up the PDS code for setting the view index and emit the command stream
for executing that PDS code when the shader wants ViewIndex, even if
multiview isn't enabled.

Fixes: 9d48088428 ("pvr: add view index support for vertex shaders")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40972>
2026-05-07 10:41:37 +00:00
Julia Zhang
d4b2e53ef3 radv: advertise VK_EXT_pipeline_protected_access
Advertise VK_EXT_pipeline_protected_access when TMZ is supported by the
physical device.

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41224>
2026-05-07 10:00:30 +00:00
Eric Engestrom
665ebce297 docs: fix unescaped *
Fixes: 10f2c308c1 ("docs: add release notes for 26.1.0")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41406>
2026-05-07 09:49:34 +00:00
Eric Engestrom
c000356228 docs: add calendar for the 26.1 cycle, and 26.2 branchpoint and release candidates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41406>
2026-05-07 09:49:34 +00:00
Eric Engestrom
2d78d1bd84 docs: add sha sum for 26.1.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41406>
2026-05-07 09:49:34 +00:00
Eric Engestrom
6829dc1c3a docs: add release notes for 26.1.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41406>
2026-05-07 09:49:34 +00:00
Eric Engestrom
26039e2040 docs: update calendar for 26.1.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41406>
2026-05-07 09:49:34 +00:00
Kenneth Graunke
2729b1608f brw: Limit SIMD width based on NIR rather than first backend compile
I originally added this mechanism to have the first (SIMD8) compile
note that certain features were in use which would prevent SIMD16/32
from compiling, so we could skip the work of trying those.

But these days, there aren't many cases, and the ones we have are
easily detectable based on the NIR.  We can detect it earlier without
even having to do the SIMD8 compile.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Kenneth Graunke
c5928d40ae brw: Drop dead code from dispatch limit check for dual source blending
We checked that ver is 11 or 12.  It can't be >= 20.  This is dead code.

Dual source blending on Xe2 does not have native SIMD32 RT write message
support, but SIMD splitting is currently lowering it to low/high SIMD16
message pairs when using SIMD32 dispatch.  I'm not aware of any of the
hardware errata from previous platforms still applying.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Kenneth Graunke
599d26db00 brw: Set prog_data::dual_src_blend from NIR outputs written bitfield
Simpler and set earlier.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Kenneth Graunke
afb97ff2af brw: Switch FS outputs to semantic IO and FRAG_RESULT_DUAL_SRC_BLEND
The new FRAG_RESULT_DUAL_SRC_BLEND option is easier to work with than
looking for FRAG_RESULT_DATA0 with an index of 1.  This also means we
no longer care about the dual source blend index, and can just use the
FRAG_RESULT location.  That cascades to meaning we no longer have to
store a tuple in driver_location.  And, if we just need location, we
can avoid populating that at all and use nir_io_semantics to get it.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Kenneth Graunke
4018aea9fa nir: Set FRAG_RESULT_DUAL_SRC_BLEND in outputs_written when lowering
Detecting dual source blending is currently annoying: you can either
look at info->fs.color_is_dual_source, or FRAG_RESULT_DUAL_SRC_BLEND
being in the info->outputs_written bitfield.

The former is only set if nir_shader_gather_info runs prior to
nir_lower_io lowering it to FRAG_RESULT_DUAL_SRC_BLEND.

The latter is only set if nir_shader_gather_info runs after the
nir_lower_io lowering.

Just make the IO lowering also set the outputs_written flag so if
you're trying to use FRAG_RESULT_DUAL_SRC_BLEND, you can always
check outputs_written without worrying about pass ordering.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
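
The pass-ordering problem that 4018aea9fa addresses can be modelled with a small Python sketch. This is a toy model, not NIR: the ShaderInfo class, the helper names, and the bit index chosen for FRAG_RESULT_DUAL_SRC_BLEND are all hypothetical stand-ins. It only shows the idea that the lowering sets the outputs_written bit itself, so a later check does not depend on when the info-gathering pass ran.

```python
from dataclasses import dataclass

# Hypothetical bit index for illustration; not the real enum value.
FRAG_RESULT_DUAL_SRC_BLEND = 5


@dataclass
class ShaderInfo:
    outputs_written: int = 0


def lower_io_dual_source(info: ShaderInfo) -> None:
    """Toy stand-in for the IO lowering: record the dual-source output directly."""
    info.outputs_written |= 1 << FRAG_RESULT_DUAL_SRC_BLEND


def uses_dual_source_blend(info: ShaderInfo) -> bool:
    """Check the bitfield; valid as soon as the lowering has run."""
    return bool(info.outputs_written & (1 << FRAG_RESULT_DUAL_SRC_BLEND))


info = ShaderInfo()
lower_io_dual_source(info)
print(uses_dual_source_blend(info))
```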
Kenneth Graunke
ff34135d05 iris: Call elk_nir_lower_fs_outputs for Gen8 RT reads, not brw
This older code really only exists for Gen8 and elk.  On brw platforms,
we moved to handling this in brw_nir_lower_fs_load_output, and don't
need to lower FS outputs before iris_setup_binding_tables.

Call the right compiler's lowering so that things continue working
on elk once we change brw_lower_fs_outputs in the next commit.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Kenneth Graunke
fbaa5ad0c3 iris: Implement force_dual_color_blend_by_location via NIR
We can just have iris look at its own program key and change the
fragment shader output variable's location/index in the NIR.  By
doing this before lowering fragment shader outputs, the rest of
the output lowering does the right thing, and the backend no longer
has to consider hacks for broken OpenGL apps.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Pierre-Eric Pelloux-Prayer
e3beb262bd amd/virtio: fix amdgpu_sw_info_address_prt_wa_control_bit handling
Fixes: 60b406e233 ("ac/gpu_info: query the PRT workaround control bit from libdrm")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41381>
2026-05-07 07:59:00 +00:00
Pierre-Eric Pelloux-Prayer
760ed3e888 amd/virtio: use AMDGPU_VA_MGR_RESERVE_HALF_VA_FOR_PRT
To match what libdrm_amdgpu does in non-virtualized env.

Fixes: e0b5724e85 ("meson: bump required libdrm to 2.4.133 for AMDGPU")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41381>
2026-05-07 07:59:00 +00:00
Samuel Pitoiset
e5e375593b radv/tests: add tests for global pipeline keys compatibility
To verify that some GPUs are compatible and that shader binaries can be
shared to avoid precompiling twice for SteamOS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41346>
2026-05-07 08:53:24 +02:00
Karol Herbst
75d9cb0b32 llvmpipe: never pass a NULL function name to LLVMAddFunction
LLVM seems to crash otherwise.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41404>
2026-05-07 05:35:14 +00:00
Karol Herbst
4bad47e991 gallivm/nir/soa: use uint for booleans
Otherwise we'll hit a LLVM assert when handling load_const:
llvm/include/llvm/ADT/APInt.h:121: llvm::APInt::APInt(unsigned int, uint64_t, bool, bool): Assertion `llvm::isIntN(BitWidth, val) && "Value is not an N-bit signed value"' failed.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41404>
2026-05-07 05:35:13 +00:00
Karol Herbst
3df48dec23 nir/lower_cl_images: call nir_progress on every function
llvmpipe supports real function calls, so we need to call nir_progress on
every function, not just the entry point.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41404>
2026-05-07 05:35:12 +00:00
Faith Ekstrand
593e3b3916 panvk: Let the compiler handle texture queries on v9+
The panfrost compiler is now able to handle these on v9+ and we don't
need to lower them ourselves anymore.  We only need the lowering on
Bifrost because we don't have the magic LD_PKA there.

Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41352>
2026-05-07 00:36:02 +00:00
Faith Ekstrand
8e6adcad7d pan/nir: Lower image queries in NIR on Valhall+
This new pass, pan_nir_lower_image(), will eventually subsume all image
lowering.  For now, though, it only lowers image_size and only on
Valhall.

Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41352>
2026-05-07 00:36:02 +00:00
Faith Ekstrand
48827ffb21 panfrost: Also remap image handles for image_size/samples
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41352>
2026-05-07 00:36:02 +00:00
Faith Ekstrand
e3fcc704ab pan/nir: Lower texture queries in nir_lower_tex() on Valhall+
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41352>
2026-05-07 00:36:02 +00:00
Calder Young
efc6a3053d anv: Fix some usage flags not propagated to ISL for explicit layouts
Some vulkancts tests rely on vkGetImageMemoryRequirements to return the same
exact size after exporting and importing an image. This broke when we started
adding padding to sampled surfaces to manage overfetch, because the texture
usage flag does not get applied to the ISL surface when the image is recreated
using an explicit layout.

Fixes: 8d13628f7 ("isl: Add additional alignment/padding requirements to prevent overfetch")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41376>
2026-05-07 00:02:43 +00:00
Alyssa Rosenzweig
5636a57f60 jay/lower_scoreboard: use SYNC.allrd/allwr
This collapses piles of silliness.

Totals:
CodeSize: 71626288 -> 70710000 (-1.28%)

Totals from 1634 (61.73% of 2647) affected shaders:
CodeSize: 66319376 -> 65403088 (-1.38%)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:26 +00:00
Alyssa Rosenzweig
c1dc9d3b1a jay/lower_scoreboard: be the sole emitter of SYNC
This gets closer to something we can schedule and avoids some pointless syncs.

Totals from 491 (18.55% of 2647) affected shaders:
Instrs: 602994 -> 602946 (-0.01%)
CodeSize: 9063888 -> 9015904 (-0.53%)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:26 +00:00
Alyssa Rosenzweig
0885ed10f5 jay/lower_scoreboard: use .src annotations
This is less heavy handed, avoiding unnecessary stalls after SENDs in a
bunch of common cases. The stats (SIMD32) are:

Totals:
CodeSize: 70345392 -> 71674272 (+1.89%)

Totals from 1774 (67.02% of 2647) affected shaders:
CodeSize: 67359248 -> 68688128 (+1.97%)

What's happening here is we are inserting extra SYNC.nop instructions in a
bunch of cases for the .src preceding the eventual .dst. However, putting aside
the i-cache impact for a moment, this is showing the optimization doing what it
should (deferring dst syncs and inserting cheaper src syncs first). So this
should be positive in reality despite the negative stat impact.

The most hurt shaders are pooling up SYNC.nop's at the end of blocks due to
local-only SWSB and lack of SYNC.allwr optimization. The latter is added later
in this MR. The former is planned.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:25 +00:00
Alyssa Rosenzweig
130e724d5e jay/lower_scoreboard: refactor SYNC.nop insertion
for next commit

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:25 +00:00
Alyssa Rosenzweig
1ecd75a397 jay/lower_scoreboard: fix tracking for A@* and *@7
Update the tracking with what we actually waited on, not what we ideally wanted
to wait on. This reduces extra annotations in some cases.

SIMD32:

Totals from 194 (7.33% of 2647) affected shaders:
CodeSize: 14473840 -> 14469088 (-0.03%)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:25 +00:00
Alyssa Rosenzweig
93edf9a3fd jay/lower_scoreboard: refactor wait pipe code
for next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:25 +00:00
Alyssa Rosenzweig
18e09858eb jay/lower_scoreboard: elide more dependencies
IGC does these optimizations and I think they should be safe given my mental
model. Given a sequence like:

   r0 = add.f32 r1, r2
   r1 = add.f32 r3, r4

Each ALU pipe is pipelined but in-order. Therefore, the second add cannot
possibly complete before the first add, so it cannot write r1 before the first
add reads r1, so we can elide the write-after-read dependency. That in turn
avoids a pipeline bubble between the two instructions. Ditto for
write-after-write.

Similarly, we can elide the dependency if the distance within an in-order pipe
is too great, since the pipeline has a finite maximum length.

Note that if there are cross-pipe dependencies, we do need the annotation since
the pipes themselves are parallel.

SIMD32:

Totals from 58 (2.19% of 2647) affected shaders:
CodeSize: 3316592 -> 3315056 (-0.05%); split: -0.05%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:25 +00:00
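
The elision rules described in 18e09858eb can be sketched in Python as follows. This is one reading of the commit message, not the jay compiler's code: the Instr class, the pipe names, the helper names, and the MAX_PIPE_DEPTH value are hypothetical. The sketch assumes that a same-pipe write-after-read or write-after-write hazard needs no annotation, that any same-pipe hazard can be dropped once the distance reaches the pipeline length, and that cross-pipe hazards always keep their annotation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical pipeline length, in instructions; the real value is hardware-specific.
MAX_PIPE_DEPTH = 8


@dataclass(frozen=True)
class Instr:
    pipe: str               # e.g. "float" or "int" (illustrative names)
    dst: str
    srcs: Tuple[str, ...]


def dependency_kind(earlier: Instr, later: Instr) -> Optional[str]:
    """Classify the hazard between two instructions, most restrictive first."""
    if earlier.dst in later.srcs:
        return "RAW"
    if later.dst in earlier.srcs:
        return "WAR"
    if later.dst == earlier.dst:
        return "WAW"
    return None


def needs_annotation(earlier: Instr, later: Instr, distance: int) -> bool:
    """Return True if the later instruction must wait explicitly on the earlier one."""
    kind = dependency_kind(earlier, later)
    if kind is None:
        return False
    if earlier.pipe != later.pipe:
        return True   # pipes run in parallel, so ordering is not implied
    if distance >= MAX_PIPE_DEPTH:
        return False  # the earlier instruction has already left the pipe
    return kind == "RAW"  # in-order pipe: WAR and WAW resolve themselves


first = Instr("float", "r0", ("r1", "r2"))
second = Instr("float", "r1", ("r3", "r4"))
print(dependency_kind(first, second), needs_annotation(first, second, 1))
```

For the two adds from the commit message the hazard is write-after-read on r1 within one pipe, so no annotation is required; moving the second add to a different pipe brings the annotation back.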
Alyssa Rosenzweig
e4dc161277 jay: assign accumulators post-RA
Greedy post-RA substitution pass, similar to IGC's AccSubstitution pass.
Stats together with the previous commits.

SIMD16:

   Totals from 2209 (83.45% of 2647) affected shaders:
   Instrs: 2701029 -> 2696350 (-0.17%)
   CodeSize: 39166720 -> 40372272 (+3.08%); split: -0.36%, +3.44%

SIMD32:

   Totals from 2211 (83.53% of 2647) affected shaders:
   Instrs: 4691165 -> 4641188 (-1.07%)
   CodeSize: 69365792 -> 69341616 (-0.03%); split: -0.50%, +0.47%

The instruction count reduction is from RA shuffle code getting coalesced via
accumulators. The code size changes are from:

* Fewer moves from the instr count reduction (helped)
* Smaller MADs encoded as MACs (helped)
* Fewer SYNC.nop due to fewer scoreboarding annotations (helped)
* Less compaction due to explicit accumulator operands (hurt)

I expect significant cycle count changes from this but we don't have a cycle
model wired up yet, so reading the assembly will have to do.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
2026-05-06 23:25:25 +00:00