Commit graph

203775 commits

Author SHA1 Message Date
Lionel Landwerlin
713cb0fdc1 anv/hasvk: sort out debug options
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34332>
2025-04-04 15:18:28 +00:00
Lionel Landwerlin
8a51e097af docs: remove unused env variable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34332>
2025-04-04 15:18:28 +00:00
Lionel Landwerlin
43e0f02391 anv/hasvk: consider timeline semaphore support stable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34332>
2025-04-04 15:18:28 +00:00
Vlad Zahorodnii
c57da522fa vulkan/wsi/wayland: Document why wl_surface_damage() code path ignores provided damage
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34227>
2025-04-04 14:38:44 +00:00
Vlad Zahorodnii
0c943bbb64 vulkan/wsi/wayland: Damage whole surface using wl_surface_damage_buffer()
Most compositors work with damage in the buffer local coordinate space.
This change spares the compositors some work converting the provided
INT32_MAX x INT32_MAX damage region to the buffer local coordinate
space. It has no significant performance impact, but it'd still be nice
to use wl_surface_damage_buffer() if possible.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34227>
2025-04-04 14:38:44 +00:00
Vlad Zahorodnii
fd146d04d1 egl/wayland: Damage whole surface using wl_surface_damage_buffer()
Most compositors work with damage in the buffer local coordinate space.
This change spares the compositors some work converting the provided
INT32_MAX x INT32_MAX damage region to the buffer local coordinate
space. It has no significant performance impact, but it'd still be nice
to use wl_surface_damage_buffer() if possible.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34227>
2025-04-04 14:38:43 +00:00
Job Noorman
78ef51aa04 ir3/opt_preamble: take alias.rt into account for rewrite cost
FS outputs can use const registers in alias.rt without a mov so take
this into account when calculating the rewrite cost of instructions.

Totals:
MaxWaves: 2765084 -> 2765130 (+0.00%); split: +0.00%, -0.00%
Instrs: 56289002 -> 56285073 (-0.01%); split: -0.01%, +0.00%
CodeSize: 118071672 -> 118076808 (+0.00%); split: -0.00%, +0.01%
NOPs: 9491112 -> 9492474 (+0.01%); split: -0.00%, +0.02%
MOVs: 1790085 -> 1786768 (-0.19%); split: -0.19%, +0.00%
Full: 2156693 -> 2156607 (-0.00%); split: -0.00%, +0.00%
(ss): 1329812 -> 1329546 (-0.02%); split: -0.03%, +0.01%
(sy): 686396 -> 686386 (-0.00%); split: -0.00%, +0.00%
(ss)-stall: 4995295 -> 4995185 (-0.00%); split: -0.02%, +0.01%
(sy)-stall: 19828966 -> 19828624 (-0.00%); split: -0.01%, +0.01%
Cat0: 10450369 -> 10451731 (+0.01%); split: -0.00%, +0.02%
Cat1: 2787946 -> 2784566 (-0.12%); split: -0.12%, +0.00%
Cat2: 21265787 -> 21264447 (-0.01%)
Cat3: 16207098 -> 16206536 (-0.00%)
Cat7: 1597849 -> 1597840 (-0.00%); split: -0.00%, +0.00%

Totals from 730 (0.36% of 200220) affected shaders:
MaxWaves: 6308 -> 6354 (+0.73%); split: +0.79%, -0.06%
Instrs: 258235 -> 254306 (-1.52%); split: -1.59%, +0.07%
CodeSize: 698806 -> 703942 (+0.73%); split: -0.28%, +1.02%
NOPs: 21040 -> 22402 (+6.47%); split: -1.85%, +8.33%
MOVs: 9426 -> 6109 (-35.19%); split: -35.52%, +0.33%
Full: 8914 -> 8828 (-0.96%); split: -1.03%, +0.07%
(ss): 5118 -> 4852 (-5.20%); split: -6.58%, +1.39%
(sy): 2118 -> 2108 (-0.47%); split: -1.18%, +0.71%
(ss)-stall: 17360 -> 17250 (-0.63%); split: -4.57%, +3.94%
(sy)-stall: 34921 -> 34579 (-0.98%); split: -5.90%, +4.92%
Cat0: 24734 -> 26096 (+5.51%); split: -1.58%, +7.09%
Cat1: 12311 -> 8931 (-27.46%); split: -27.70%, +0.24%
Cat2: 106329 -> 104989 (-1.26%)
Cat3: 100547 -> 99985 (-0.56%)
Cat7: 3646 -> 3637 (-0.25%); split: -0.91%, +0.66%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34279>
2025-04-04 14:17:10 +00:00
Vignesh Raman
7959250d1e s3_upload: improve url validation and error message
Ensure s3_upload correctly validates the S3 folder url by requiring it
to end with /. This prevents wrong uploads to invalid paths, such as
file urls. Also improve the error message.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34255>
2025-04-04 13:32:45 +00:00
Zan Dobersek
248edb43c3 tu: allow D3D-compatible texture coordinate rounding
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When running under DXVK or vkd3d, the texture coordinate rounding behavior
should match D3D expectations. On Adreno, this behavior can be toggled
through the SP_TP_MODE_CNTL register.

A driconf-based option is introduced to help set the relevant register flag
that enables this behavior.

This fixes the cause of test_sampler_rounding test case failure in vkd3d on
Turnip's side, but a small change in vkd3d is also required, so the test
failure expectation isn't removed yet.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33987>
2025-04-04 10:09:47 +00:00
Zan Dobersek
3b1ca55b40 freedreno/registers: add useful A6XX_SP_TP_MODE_CNTL bitfields
Add additional bitfields for the A6XX_SP_TP_MODE_CNTL registers, ones that
we already use and the texcoord rounding mode bitfield that we'll need for
D3D-over-Vulkan implementations.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33987>
2025-04-04 10:09:47 +00:00
Benjamin Lee
e183650aa4 panvk/csf: fix uninitialized read in utrace_clone_init_builder
Previous code assumed that the caller of utrace_clone_init_builder would
fill some parameters of the builder config, but we were not. Instead,
initialize these from the csif props the same as all the other builder
instances.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 3096cf2a5d ("panvk/csf: flush and process trace events for all cmdbufs")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34270>
2025-04-04 09:43:02 +00:00
Hyunjun Ko
2b0df6c564 anv: Use vk_video_derive_h265_scaling_list
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>
2025-04-04 07:23:48 +00:00
MaciejDziuban
f31a33905a radv: Use vk_video_derive_h265_scaling_list
This commit makes radv use vk_video_derive_h265_scaling_list, which properly
applies default scaling lists whenever they're needed. It also simplifies
update_h265_scaling function into a simple memcpy. The firmware interface
struct and Vulkan's StdVideoH265ScalingLists struct both have identical memory
layouts, so it's not neccessary divide it into multiple copies with offsets.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>
2025-04-04 07:23:48 +00:00
MaciejDziuban
4072286f07 vulkan: Add default scaling lists for H265
H265 specification defines default scaling lists to use whenever scaling lists
are not specified in neither sps nor pps. Currently drivers ignore this
requirement and set the lists to zero. This commits adds a helper function
vk_video_derive_h265_scaling_list (similar to its h264 counterpart) that
selects either sps or pps lists and falls back to default values if neither
were specified. The default values were taken from ITU-T H265 specification
(revision 8), section 7.4.5.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>
2025-04-04 07:23:48 +00:00
MaciejDziuban
a1bf7192e5 vulkan: handle use_default_scaling_matrix_mask in h264 decoder
H264 specification defines this field to force usage of the default
scaling lists even if they are specified in ScalingList4x4 and
ScalingList8x8.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>
2025-04-04 07:23:47 +00:00
Ian Romanick
20cce95ce5 brw/opt: Don't call brw_opt_copy_propagation before brw_lower_load_reg
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On a 36c/72t Xeon system, performance of replaying
hogwarts_legacy.dx12vk-ultra.foz was improved 1.3% +/- 0.77% (n=10).

I picked MTL for the fossil-db results because it was the most negative.

shader-db:

All Intel platforms had fairly similar results. (Lunar Lake)
total instructions in shared programs: 16964217 -> 16964216 (<.01%)
instructions in affected programs: 51777 -> 51776 (<.01%)
helped: 20 / HURT: 27

total cycles in shared programs: 892934916 -> 893041912 (0.01%)
cycles in affected programs: 51245298 -> 51352294 (0.21%)
helped: 96 /HURT: 78

fossil-db:

All Intel platforms had similar results. (Meteor Lake shown)
Totals:
Instrs: 233678547 -> 233678944 (+0.00%); split: -0.00%, +0.00%
Cycle count: 24398049850 -> 24400490877 (+0.01%); split: -0.01%, +0.02%
Max live registers: 42145052 -> 42145038 (-0.00%); split: -0.00%, +0.00%

Totals from 1141 (0.14% of 805934) affected shaders:
Instrs: 1546001 -> 1546398 (+0.03%); split: -0.01%, +0.03%
Cycle count: 1201746062 -> 1204187089 (+0.20%); split: -0.14%, +0.34%
Max live registers: 84247 -> 84233 (-0.02%); split: -0.03%, +0.01%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
991a2f510b brw/sat: Eliminate non-defs saturate propagation
The intervening_saturating_copy test is removed. The defs version of the
pass does not handle this case. It should not occur often in practice
anyway. Copy propagation and brw_nir_opt_fsat should prevent this
scenario from happening.

No shader-db changes on any Intel platform.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 212677275 -> 212677278 (+0.00%)
Cycle count: 30466062848 -> 30466056040 (-0.00%)

Totals from 1 (0.00% of 706300) affected shaders:
Instrs: 1343 -> 1346 (+0.22%)
Cycle count: 411664 -> 404856 (-1.65%)

v2: Stop counting ip. The non-defs part of the pass was the only thing
that used it.

v3: Also delete "if (block != def->block) continue;" code. I noticed
this while working on some other changes to this function. It's the last
thing in the loop, so it's totally useless. Delete some other spurious
continues too.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
cc5a6a5ae8 brw/sat: Convert tests to use load_reg
This is in prepartion for a commit that removes the non-defs version of
the pass.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
2d13acf9d9 brw: Add passes to generate and lower load_reg
v2: Add support for WE_all instructions... this already just worked, so
I only had to delete the check and the FINISHME comment.

v3: Use logic more like def_analysis::update_for_reads to determine when
to not insert LOAD_REG instructions. Based on a suggestion by Ken.

v4: Eliminate "store" from all the names since STORE_REG does not exist
anymore. Fold insert_load_reg into brw_insert_load_reg. Elminate extra
call to s.def_analysis.require() after progress. Pull a loop-invariant
check out of the inst->srouces loop. Drop call to
brw_opt_split_virtual_grfs after lowering load_reg. All suggested by
Caio.

v5: Assert that LOAD_REG doesn't already exist in
brw_insert_load_reg. Update comment before fully_defines. Both
suggested by Caio.

v6: Don't explicitly special-case SHADER_OPCODE_MEMORY_STORE_LOGICAL.
Move the inst->dst.file != VGRF check earlier to avoid the loop over
sources. Both suggested by Ken. Move the call the brw_insert_load_reg
a little bit later, and explain why it's at that location. Suggested
by Caio.

v7: Many changes to the for-each-source loop in brw_insert_load_reg.
Removes incorrect multiplication of s.alloc.sizes with reg_unit. Adds
checks for matching SIMD size and NoMask in the search for pre-existing
LOAD_REG of same value.

v8: Add some unit tests. Suggested by Caio.

shader-db:

Lunar Lake
total instructions in shared programs: 16923237 -> 16921895 (<.01%)
instructions in affected programs: 450565 -> 449223 (-0.30%)
helped: 251 / HURT: 377

total cycles in shared programs: 910428418 -> 889920590 (-2.25%)
cycles in affected programs: 719248184 -> 698740356 (-2.85%)
helped: 9076 / HURT: 9082

total fills in shared programs: 2242 -> 2218 (-1.07%)
fills in affected programs: 116 -> 92 (-20.69%)
helped: 2 / HURT: 0

total sends in shared programs: 848635 -> 848421 (-0.03%)
sends in affected programs: 810 -> 596 (-26.42%)
helped: 10 / HURT: 0

LOST:   82
GAINED: 78

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19875784 -> 19871694 (-0.02%)
instructions in affected programs: 1050091 -> 1046001 (-0.39%)
helped: 251 / HURT: 2403

total cycles in shared programs: 905328238 -> 882446458 (-2.53%)
cycles in affected programs: 682736344 -> 659854564 (-3.35%)
helped: 7869 / HURT: 7911

total spills in shared programs: 5512 -> 5032 (-8.71%)
spills in affected programs: 1830 -> 1350 (-26.23%)
helped: 8 / HURT: 0

total fills in shared programs: 5648 -> 4782 (-15.33%)
fills in affected programs: 3312 -> 2446 (-26.15%)
helped: 8 / HURT: 0

total sends in shared programs: 1032942 -> 1032722 (-0.02%)
sends in affected programs: 572 -> 352 (-38.46%)
helped: 10 / HURT: 0

LOST:   138
GAINED: 53

Tiger Lake
total instructions in shared programs: 19711930 -> 19715591 (0.02%)
instructions in affected programs: 1040623 -> 1044284 (0.35%)
helped: 317 / HURT: 2474

total cycles in shared programs: 862988990 -> 860573870 (-0.28%)
cycles in affected programs: 612392461 -> 609977341 (-0.39%)
helped: 7447 / HURT: 7686

total sends in shared programs: 1034763 -> 1034555 (-0.02%)
sends in affected programs: 784 -> 576 (-26.53%)
helped: 8 / HURT: 0

LOST:   56
GAINED: 143

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20545461 -> 20545220 (<.01%)
instructions in affected programs: 422405 -> 422164 (-0.06%)
helped: 180 / HURT: 459

total cycles in shared programs: 872697345 -> 866874523 (-0.67%)
cycles in affected programs: 573117917 -> 567295095 (-1.02%)
helped: 6783 / HURT: 6980

total spills in shared programs: 4335 -> 4336 (0.02%)
spills in affected programs: 90 -> 91 (1.11%)
helped: 1 / HURT: 2

total fills in shared programs: 4194 -> 4196 (0.05%)
fills in affected programs: 463 -> 465 (0.43%)
helped: 1 / HURT: 2

total sends in shared programs: 1079446 -> 1079238 (-0.02%)
sends in affected programs: 784 -> 576 (-26.53%)
helped: 8 / HURT: 0

LOST:   117
GAINED: 37

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209708136 -> 209695617 (-0.01%); split: -0.02%, +0.01%
Send messages: 10927753 -> 10927640 (-0.00%)
Cycle count: 30540172048 -> 30427084732 (-0.37%); split: -0.99%, +0.62%
Spill count: 511621 -> 510932 (-0.13%); split: -0.22%, +0.08%
Fill count: 621166 -> 618440 (-0.44%); split: -0.56%, +0.12%
Scratch Memory Size: 35574784 -> 35648512 (+0.21%); split: -0.06%, +0.26%
Max live registers: 65453860 -> 65453140 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 75374990 -> 35195764 (-53.31%)

Totals from 503284 (71.25% of 706391) affected shaders:
Instrs: 180203778 -> 180191259 (-0.01%); split: -0.02%, +0.01%
Send messages: 9699732 -> 9699619 (-0.00%)
Cycle count: 30080349592 -> 29967262276 (-0.38%); split: -1.01%, +0.63%
Spill count: 511584 -> 510895 (-0.13%); split: -0.22%, +0.08%
Fill count: 621120 -> 618394 (-0.44%); split: -0.56%, +0.12%
Scratch Memory Size: 35443712 -> 35517440 (+0.21%); split: -0.06%, +0.27%
Max live registers: 52566092 -> 52565372 (-0.00%); split: -0.01%, +0.00%
Non SSA regs after NIR: 70110949 -> 29931723 (-57.31%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
8b2be206f3 brw/algebraic: Constant folding for BROADCAST and SHUFFLE
This prevents assertion failures in brw_eu_emit in a later commit in
this MR. Even though they have not been previously observed, these
assertion failures could happen even without that commit.

No shader-db or fossil-db changes on any Intel platform.

Fixes: 04e1783278 ("brw: Call brw_fs_opt_algebraic less often")

v2: Add SHUFFLE. Suggested by Ken. Fixed indentation.

v3: Update BROADCAST exec_size after rebasing on "brw/build: Use SIMD8
temporaries in emit_uniformize".

v4: Explain why munging the exec_size is correct.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
1b997c7bcc brw/coalesce: Prepare brw_opt_register_coalesce for load_reg
v2: Explain the problematic situation a little better in the
comment. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
15637334ce brw/copy: Prepare copy_propagation for load_reg
The changes to try_copy_propagate will be removed later in the series.

v2: Fix up some comments to note that offset != 0 is allowed only when
stride == 0. Apply same offset=0 restriction in try_copy_propagate_def
too. Allow copy propagation if the source is either a def or
UNIFORM. Don't copy prop a load_reg through a non-def value.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
cfc50390fb brw: Add basic infrastructure for load_reg pseudo op
load_reg is something like load_payload except it has a single
source. It copies the entire source to the destination. Its purpose is
to convert a non-SSA VGRF into an SSA value. This copy is marked as
volatile so that it will act as a scheduling barrier.

v2: Fix some typos in the commit message. Eliminate the
brw_builder::LOAD_REG overload that returns a brw_inst*. This is
unlikely to ever be used. Add some checks to brw_validate. All
suggested by Caio.

v3: Force the source and destination types of the LOAD_REG to by
integer. This will (eventually) simplify the creating of unit tests for
the pass that adds LOAD_REG instructions.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
b9656d51c0 brw/opt: Move non-SSA register accounting after first brw_opt_split_virtual_grfs
v2: Move to immediately before the main optimization loop. Most
importantly, this is after the first call to DCE.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Non SSA regs after NIR: 237045283 -> 100183460 (-57.74%); split: -58.12%, +0.39%

Totals from 701423 (99.26% of 706657) affected shaders:
Non SSA regs after NIR: 236868848 -> 100007025 (-57.78%); split: -58.17%, +0.39%

Suggested-by: Ken
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Kenny Levinsen
cd4820d6ac device-select: Support linux-dmabuf feedback
device-select-layer needs to obtain the display server's preferred
display device, and has so far relied on wl_drm for this. wl_drm is
superseded by linux-dmabuf with some Wayland servers having dropped
support for wl_drm entirely.

Implement linux-dmabuf as preferred mechanism for obtaining the main
device, with wl_drm support retained as a fallback for now.

Signed-off-by: Kenny Levinsen <kl@kl.wtf>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34219>
2025-04-04 06:00:17 +00:00
llyyr
dc90e33ad2 vulkan/wsi/wayland: initialize surface colorspace with PASS_THROUGH_EXT
Starting with sRGB meant we would refcount to -1 if an application
chooses PASS_THROUGH. Instead, just initialize with PASS_THROUGH so the
initial refcount of 0 reflects reality.

Previously, we would segfault if an application chose PASS_THROUGH at
swapchain initialization then switched to a color managed colorspace
later in the runtime, because we would increment refcount from -1 -> 0
and this would result in not creating a new color managed surface.

Fixes: 789507c99c ("vulkan/wsi: implement the Wayland color management protocol")
Signed-off-by: llyyr <llyyr.public@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34353>
2025-04-04 05:10:08 +00:00
David Rosca
1610841f0f egl/x11: Fix swap interval setup
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Calling dri2_x11_setup_swap_interval with swap_available = false
sets the min/max/default swap interval values to zero.
EGL_MIN/MAX_SWAP_INTERVAL is always reported as 0 and the interval
value set by eglSwapInterval gets clamped to 0.

Set swap_available to true before calling dri2_x11_setup_swap_interval,
as was done before.

Fixes: c00701c83a ("egl/x11: unify swrast/kopper/dri3 paths a bit")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34235>
2025-04-04 04:15:01 +00:00
Jose Maria Casanova Crespo
efc87e0d6a glapi: import noop_array and public stubs earlier.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
After 711fc10ea3 "glapi: merge all shared-glapi source files
into one .c file" the V3D simulator started crashing. After
testing the changes of the merge one by one, it was identified
that previously shared_glapi_mapi_tmp.h was being imported twice
instead of only once as it happens after the merge. Although
the change done in the merge seems to be equivalent it seems
it was breaking the the debug builds.

Here can find an explanation why this problem was affecting debug
builds https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34363#note_2850196

Fixes: 711fc10ea3 ("glapi: merge all shared-glapi source files into one .c file")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12908
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34363>
2025-04-04 00:18:28 +00:00
Mike Blumenkrantz
12b57b34f8 gallium/util: check nr_samples in pipe_surface_equal()
this is otherwise broken

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34367>
2025-04-03 23:41:30 +00:00
Timur Kristóf
a530890e75 nir/print: Fix variable mode for arrayed output load intrinsics.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This helps print the names of varyings correctly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
e258492a8f radv: Remove radv_streamout_info::num_outputs.
This field was never used for determining the number of outputs,
just for determining whether streamout was enabled, which makes
it unnecessary. We can use enabled_stream_buffers_mask for that.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
ce2138d73a radv: Call nir_opt_undef too after nir_opt_varyings.
Shaders may have undefined output stores after nir_opt_varyings.
These must be optimized out, otherwise they hit an assertion.

Fixes: 17f6ab28cc
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
15d0804670 radv: Use buffers_written mask when gathering XFB info.
We need to enable these buffers regardless of whether or not the
shader actually writes any outputs to them, otherwise we break
XFB queries.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
96d11d0f56 nir/opt_varyings: Fix assertion when deduplicating TCS outputs.
When deduplicating TCS outputs, we may find outputs that aren't
loaded by the shader itself. This previously hit a bad assertion.

Fixes: c66967b5cb
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12410
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
a29b5857f7 nir/xfb: Preserve some xfb information when gathering from intrinsics.
We need to remember which streamout buffers and streams were enabled,
even if the shader doesn't actually write any outputs to them,
because the API requires that we count vertices created by this shader
towards queries against those streams.

That information can be gathered by nir_gather_xfb_info_with_varyings
from the original NIR I/O variables that we get from the frontend,
but it isn't included in any intrinsics so would be otherwise lost here.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Jan Alexander Steffens (heftig)
1deb0536a1 gfxstream: Use proper log format for 32-bit Vulkan
On i686, where VK_USE_64_BIT_PTR_DEFINES is unset and Vulkan handles are
represented as 64-bit integers instead, the code used the wrong format
specifier, causing a build error.

Fixes: 7fb31361f4 ("Handle external fences in vkGetFenceStatus()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34124>
2025-04-03 19:35:20 +00:00
Zan Dobersek
335cc96069 tu: disable logic operations for float and sRGB formats
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Per spec, logic operations between fragment values and color attachments
should be disabled when attachments are using float or sRGB formats.
Regardless of attachment's format, enabled logic operations should keep
blending disabled.

Fixes: dEQP-VK.pipeline.*.logic_op_na_formats.*

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34212>
2025-04-03 15:48:19 +00:00
Lucas Stach
d917625226 etnaviv: add context flush sw query
Context flushes can be caused by all kinds of operations that aren't
obvious to a GL API user. As those are quite heavy-weight operations
it is nice to have some insight into how many of those are happening
per frame. Add a sw query to make this information easily accessible.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34350>
2025-04-03 14:27:55 +00:00
Stéphane Cerveau
ee535aa039 radv: video: rework maxActiveReferenceSlot/MaxDpbSlots
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
For the pReferenceSlots.slotIndex, the max
value should the maxDpbSlots which is
h264: 16 + 1
h265 : 15 + 2
av1: 7+2

Fixing SVA_CL1_E test vector in JVT-AVC_V1
fluster test suite.

Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33094>
2025-04-03 13:20:45 +00:00
Georg Lehmann
c21a53440f spirv: clamp/sign-extend non 32bit ldexp exponents
GLSL.std.450 allows any integer size here.
OpenCL only allows i32.

Cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34071>
2025-04-03 12:35:59 +00:00
Job Noorman
45a5ccbf07 ir3/ra: create merge sets for splits/collects inserted for shared RA
Since shared RA happens after creating merge sets, newly inserted
splits/collects did not have merge sets created for them. Fix this by
creating merge sets for new instructions after shared RA.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
2025-04-03 12:06:18 +00:00
Job Noorman
0cafd07b0c ir3: add ir3_aggressive_coalesce helper
To allow us to create merge sets outside of ir3_merge_regs.c.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
2025-04-03 12:06:18 +00:00
Job Noorman
a0db2f9737 ir3/ra: assign interval offsets to new defs after shared RA
Shared RA might insert new defs to be handled by regular RA (e.g.,
shared spills). However, their interval offsets were not initialized
which caused their intervals to sometimes be mistakenly matched with
those containing offset 0. Fix this by calling index_merge_sets after
shared RA and modifying that function to only index new defs in that
case.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
2025-04-03 12:06:18 +00:00
Eric Engestrom
6331441e24 ci: rename ci-tron priority tag to avoid conflict with the generic fdo runners
Otherwise, ci-tron runners with that tag could pick up jobs meant for the fdo
runners, as happened here:
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/73883719

The inverse (fdo runners picking up a job meant for a ci-tron runner) is not
possible though, as ci-tron jobs always include a `farm:$RUNNER_FARM_LOCATION`
tag, so the problem only exists in the other direction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34358>
2025-04-03 11:25:12 +00:00
Eric Engestrom
f84578e308 ci/build: drop LTO from fedora build
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's been broken for a few months by now and nobody has been interested
in fixing it, so let's drop LTO so that we get the rest of the benefits
from having that build at all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34318>
2025-04-03 10:23:55 +00:00
Samuel Pitoiset
ef3363ef71 radv: rework suspend/resume user conditional rendering
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Better to suspend/resume in the top level function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
2025-04-03 08:54:36 +00:00
Samuel Pitoiset
4bc971a0bd radv: add new helper to suspend/resume user conditional rendering
Instead of duplicating same code everywhere.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
2025-04-03 08:54:36 +00:00
Samuel Pitoiset
4d1d6d4147 radv: fix ignoring conditional rendering with vkCmdResolveImage()
This command isn't supposed to be affected by conditional rendering.

This fixes new VKCTS coverage
dEQP-VK.conditional_rendering.conditional_ignore.resolve_image*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
2025-04-03 08:54:36 +00:00
Job Noorman
dd1ba74777 ir3: make shpe a terminator
shpe is a bit of a special instruction: it's not really a terminator
(i.e., it does not perform a jump) but it does have to stay at the end
of its block. Up to now, we tried to enforce this by creating const
write barriers on shpe; the assumption being that everything that
happens in the preamble ends in a write to the const file so shpe stays
at the end. Alas, it turns out this is not true: things like sampler
prefetches do not write the const file and nothing was preventing those
from being scheduled after shpe.

Instead of trying to create even more barrier dependencies, fix this by
making shpe a terminator. Both sched and postsched treat terminators
specially to make sure they always stay at the end of their block.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34290>
2025-04-03 08:16:59 +00:00
Danylo Piliaiev
f5019ee0d4 ir3: Fix shaders that write only color classified as empty
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Shader may have zero instructions and no prefetches but have inputs
that without modifications are used as output.

Fixed vkd3d test:
 test_depth_bias_behaviour

Fixes: b0a98d3b13
("ir3: Detect empty fragment shaders")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34348>
2025-04-03 06:47:43 +00:00