fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 11:20:11 +01:00

Author	SHA1	Message	Date
MaciejDziuban	4072286f07	vulkan: Add default scaling lists for H265 H265 specification defines default scaling lists to use whenever scaling lists are specified in neither sps nor pps. Currently drivers ignore this requirement and set the lists to zero. This commit adds a helper function vk_video_derive_h265_scaling_list (similar to its h264 counterpart) that selects either sps or pps lists and falls back to default values if neither were specified. The default values were taken from ITU-T H265 specification (revision 8), section 7.4.5. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>	2025-04-04 07:23:48 +00:00
MaciejDziuban	a1bf7192e5	vulkan: handle use_default_scaling_matrix_mask in h264 decoder H264 specification defines this field to force usage of the default scaling lists even if they are specified in ScalingList4x4 and ScalingList8x8. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>	2025-04-04 07:23:47 +00:00
Ian Romanick	20cce95ce5	brw/opt: Don't call brw_opt_copy_propagation before brw_lower_load_reg On a 36c/72t Xeon system, performance of replaying hogwarts_legacy.dx12vk-ultra.foz was improved 1.3% +/- 0.77% (n=10). I picked MTL for the fossil-db results because it was the most negative. shader-db: All Intel platforms had fairly similar results. (Lunar Lake) total instructions in shared programs: 16964217 -> 16964216 (<.01%) instructions in affected programs: 51777 -> 51776 (<.01%) helped: 20 / HURT: 27 total cycles in shared programs: 892934916 -> 893041912 (0.01%) cycles in affected programs: 51245298 -> 51352294 (0.21%) helped: 96 /HURT: 78 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 233678547 -> 233678944 (+0.00%); split: -0.00%, +0.00% Cycle count: 24398049850 -> 24400490877 (+0.01%); split: -0.01%, +0.02% Max live registers: 42145052 -> 42145038 (-0.00%); split: -0.00%, +0.00% Totals from 1141 (0.14% of 805934) affected shaders: Instrs: 1546001 -> 1546398 (+0.03%); split: -0.01%, +0.03% Cycle count: 1201746062 -> 1204187089 (+0.20%); split: -0.14%, +0.34% Max live registers: 84247 -> 84233 (-0.02%); split: -0.03%, +0.01% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	991a2f510b	brw/sat: Eliminate non-defs saturate propagation The intervening_saturating_copy test is removed. The defs version of the pass does not handle this case. It should not occur often in practice anyway. Copy propagation and brw_nir_opt_fsat should prevent this scenario from happening. No shader-db changes on any Intel platform. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 212677275 -> 212677278 (+0.00%) Cycle count: 30466062848 -> 30466056040 (-0.00%) Totals from 1 (0.00% of 706300) affected shaders: Instrs: 1343 -> 1346 (+0.22%) Cycle count: 411664 -> 404856 (-1.65%) v2: Stop counting ip. The non-defs part of the pass was the only thing that used it. v3: Also delete "if (block != def->block) continue;" code. I noticed this while working on some other changes to this function. It's the last thing in the loop, so it's totally useless. Delete some other spurious continues too. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v2] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	cc5a6a5ae8	brw/sat: Convert tests to use load_reg This is in preparation for a commit that removes the non-defs version of the pass. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	2d13acf9d9	brw: Add passes to generate and lower load_reg v2: Add support for WE_all instructions... this already just worked, so I only had to delete the check and the FINISHME comment. v3: Use logic more like def_analysis::update_for_reads to determine when to not insert LOAD_REG instructions. Based on a suggestion by Ken. v4: Eliminate "store" from all the names since STORE_REG does not exist anymore. Fold insert_load_reg into brw_insert_load_reg. Eliminate extra call to s.def_analysis.require() after progress. Pull a loop-invariant check out of the inst->sources loop. Drop call to brw_opt_split_virtual_grfs after lowering load_reg. All suggested by Caio. v5: Assert that LOAD_REG doesn't already exist in brw_insert_load_reg. Update comment before fully_defines. Both suggested by Caio. v6: Don't explicitly special-case SHADER_OPCODE_MEMORY_STORE_LOGICAL. Move the inst->dst.file != VGRF check earlier to avoid the loop over sources. Both suggested by Ken. Move the call to brw_insert_load_reg a little bit later, and explain why it's at that location. Suggested by Caio. v7: Many changes to the for-each-source loop in brw_insert_load_reg. Removes incorrect multiplication of s.alloc.sizes with reg_unit. Adds checks for matching SIMD size and NoMask in the search for pre-existing LOAD_REG of same value. v8: Add some unit tests. Suggested by Caio.
shader-db: Lunar Lake total instructions in shared programs: 16923237 -> 16921895 (<.01%) instructions in affected programs: 450565 -> 449223 (-0.30%) helped: 251 / HURT: 377 total cycles in shared programs: 910428418 -> 889920590 (-2.25%) cycles in affected programs: 719248184 -> 698740356 (-2.85%) helped: 9076 / HURT: 9082 total fills in shared programs: 2242 -> 2218 (-1.07%) fills in affected programs: 116 -> 92 (-20.69%) helped: 2 / HURT: 0 total sends in shared programs: 848635 -> 848421 (-0.03%) sends in affected programs: 810 -> 596 (-26.42%) helped: 10 / HURT: 0 LOST: 82 GAINED: 78 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19875784 -> 19871694 (-0.02%) instructions in affected programs: 1050091 -> 1046001 (-0.39%) helped: 251 / HURT: 2403 total cycles in shared programs: 905328238 -> 882446458 (-2.53%) cycles in affected programs: 682736344 -> 659854564 (-3.35%) helped: 7869 / HURT: 7911 total spills in shared programs: 5512 -> 5032 (-8.71%) spills in affected programs: 1830 -> 1350 (-26.23%) helped: 8 / HURT: 0 total fills in shared programs: 5648 -> 4782 (-15.33%) fills in affected programs: 3312 -> 2446 (-26.15%) helped: 8 / HURT: 0 total sends in shared programs: 1032942 -> 1032722 (-0.02%) sends in affected programs: 572 -> 352 (-38.46%) helped: 10 / HURT: 0 LOST: 138 GAINED: 53 Tiger Lake total instructions in shared programs: 19711930 -> 19715591 (0.02%) instructions in affected programs: 1040623 -> 1044284 (0.35%) helped: 317 / HURT: 2474 total cycles in shared programs: 862988990 -> 860573870 (-0.28%) cycles in affected programs: 612392461 -> 609977341 (-0.39%) helped: 7447 / HURT: 7686 total sends in shared programs: 1034763 -> 1034555 (-0.02%) sends in affected programs: 784 -> 576 (-26.53%) helped: 8 / HURT: 0 LOST: 56 GAINED: 143 Ice Lake and Skylake had similar results. 
(Ice Lake shown) total instructions in shared programs: 20545461 -> 20545220 (<.01%) instructions in affected programs: 422405 -> 422164 (-0.06%) helped: 180 / HURT: 459 total cycles in shared programs: 872697345 -> 866874523 (-0.67%) cycles in affected programs: 573117917 -> 567295095 (-1.02%) helped: 6783 / HURT: 6980 total spills in shared programs: 4335 -> 4336 (0.02%) spills in affected programs: 90 -> 91 (1.11%) helped: 1 / HURT: 2 total fills in shared programs: 4194 -> 4196 (0.05%) fills in affected programs: 463 -> 465 (0.43%) helped: 1 / HURT: 2 total sends in shared programs: 1079446 -> 1079238 (-0.02%) sends in affected programs: 784 -> 576 (-26.53%) helped: 8 / HURT: 0 LOST: 117 GAINED: 37 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 209708136 -> 209695617 (-0.01%); split: -0.02%, +0.01% Send messages: 10927753 -> 10927640 (-0.00%) Cycle count: 30540172048 -> 30427084732 (-0.37%); split: -0.99%, +0.62% Spill count: 511621 -> 510932 (-0.13%); split: -0.22%, +0.08% Fill count: 621166 -> 618440 (-0.44%); split: -0.56%, +0.12% Scratch Memory Size: 35574784 -> 35648512 (+0.21%); split: -0.06%, +0.26% Max live registers: 65453860 -> 65453140 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 75374990 -> 35195764 (-53.31%) Totals from 503284 (71.25% of 706391) affected shaders: Instrs: 180203778 -> 180191259 (-0.01%); split: -0.02%, +0.01% Send messages: 9699732 -> 9699619 (-0.00%) Cycle count: 30080349592 -> 29967262276 (-0.38%); split: -1.01%, +0.63% Spill count: 511584 -> 510895 (-0.13%); split: -0.22%, +0.08% Fill count: 621120 -> 618394 (-0.44%); split: -0.56%, +0.12% Scratch Memory Size: 35443712 -> 35517440 (+0.21%); split: -0.06%, +0.27% Max live registers: 52566092 -> 52565372 (-0.00%); split: -0.01%, +0.00% Non SSA regs after NIR: 70110949 -> 29931723 (-57.31%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	8b2be206f3	brw/algebraic: Constant folding for BROADCAST and SHUFFLE This prevents assertion failures in brw_eu_emit in a later commit in this MR. Even though they have not been previously observed, these assertion failures could happen even without that commit. No shader-db or fossil-db changes on any Intel platform. Fixes: `04e1783278` ("brw: Call brw_fs_opt_algebraic less often") v2: Add SHUFFLE. Suggested by Ken. Fixed indentation. v3: Update BROADCAST exec_size after rebasing on "brw/build: Use SIMD8 temporaries in emit_uniformize". v4: Explain why munging the exec_size is correct. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	1b997c7bcc	brw/coalesce: Prepare brw_opt_register_coalesce for load_reg v2: Explain the problematic situation a little better in the comment. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	15637334ce	brw/copy: Prepare copy_propagation for load_reg The changes to try_copy_propagate will be removed later in the series. v2: Fix up some comments to note that offset != 0 is allowed only when stride == 0. Apply same offset=0 restriction in try_copy_propagate_def too. Allow copy propagation if the source is either a def or UNIFORM. Don't copy prop a load_reg through a non-def value. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	cfc50390fb	brw: Add basic infrastructure for load_reg pseudo op load_reg is something like load_payload except it has a single source. It copies the entire source to the destination. Its purpose is to convert a non-SSA VGRF into an SSA value. This copy is marked as volatile so that it will act as a scheduling barrier. v2: Fix some typos in the commit message. Eliminate the brw_builder::LOAD_REG overload that returns a brw_inst*. This is unlikely to ever be used. Add some checks to brw_validate. All suggested by Caio. v3: Force the source and destination types of the LOAD_REG to be integer. This will (eventually) simplify the creation of unit tests for the pass that adds LOAD_REG instructions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	b9656d51c0	brw/opt: Move non-SSA register accounting after first brw_opt_split_virtual_grfs v2: Move to immediately before the main optimization loop. Most importantly, this is after the first call to DCE. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Non SSA regs after NIR: 237045283 -> 100183460 (-57.74%); split: -58.12%, +0.39% Totals from 701423 (99.26% of 706657) affected shaders: Non SSA regs after NIR: 236868848 -> 100007025 (-57.78%); split: -58.17%, +0.39% Suggested-by: Ken Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Kenny Levinsen	cd4820d6ac	device-select: Support linux-dmabuf feedback device-select-layer needs to obtain the display server's preferred display device, and has so far relied on wl_drm for this. wl_drm is superseded by linux-dmabuf with some Wayland servers having dropped support for wl_drm entirely. Implement linux-dmabuf as preferred mechanism for obtaining the main device, with wl_drm support retained as a fallback for now. Signed-off-by: Kenny Levinsen <kl@kl.wtf> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34219>	2025-04-04 06:00:17 +00:00
llyyr	dc90e33ad2	vulkan/wsi/wayland: initialize surface colorspace with PASS_THROUGH_EXT Starting with sRGB meant we would refcount to -1 if an application chooses PASS_THROUGH. Instead, just initialize with PASS_THROUGH so the initial refcount of 0 reflects reality. Previously, we would segfault if an application chose PASS_THROUGH at swapchain initialization then switched to a color managed colorspace later in the runtime, because we would increment refcount from -1 -> 0 and this would result in not creating a new color managed surface. Fixes: `789507c99c` ("vulkan/wsi: implement the Wayland color management protocol") Signed-off-by: llyyr <llyyr.public@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34353>	2025-04-04 05:10:08 +00:00
David Rosca	1610841f0f	egl/x11: Fix swap interval setup Calling dri2_x11_setup_swap_interval with swap_available = false sets the min/max/default swap interval values to zero. EGL_MIN/MAX_SWAP_INTERVAL is always reported as 0 and the interval value set by eglSwapInterval gets clamped to 0. Set swap_available to true before calling dri2_x11_setup_swap_interval, as was done before. Fixes: `c00701c83a` ("egl/x11: unify swrast/kopper/dri3 paths a bit") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34235>	2025-04-04 04:15:01 +00:00
Jose Maria Casanova Crespo	efc87e0d6a	glapi: import noop_array and public stubs earlier. After `711fc10ea3` "glapi: merge all shared-glapi source files into one .c file" the V3D simulator started crashing. After testing the changes of the merge one by one, it was identified that previously shared_glapi_mapi_tmp.h was being imported twice instead of only once as it happens after the merge. Although the change done in the merge seems to be equivalent, it seems it was breaking the debug builds. An explanation of why this problem was affecting debug builds can be found here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34363#note_2850196 Fixes: `711fc10ea3` ("glapi: merge all shared-glapi source files into one .c file") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12908 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34363>	2025-04-04 00:18:28 +00:00
Mike Blumenkrantz	12b57b34f8	gallium/util: check nr_samples in pipe_surface_equal() this is otherwise broken cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34367>	2025-04-03 23:41:30 +00:00
Timur Kristóf	a530890e75	nir/print: Fix variable mode for arrayed output load intrinsics. This helps print the names of varyings correctly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	e258492a8f	radv: Remove radv_streamout_info::num_outputs. This field was never used for determining the number of outputs, just for determining whether streamout was enabled, which makes it unnecessary. We can use enabled_stream_buffers_mask for that. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	ce2138d73a	radv: Call nir_opt_undef too after nir_opt_varyings. Shaders may have undefined output stores after nir_opt_varyings. These must be optimized out, otherwise they hit an assertion. Fixes: `17f6ab28cc` Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	15d0804670	radv: Use buffers_written mask when gathering XFB info. We need to enable these buffers regardless of whether or not the shader actually writes any outputs to them, otherwise we break XFB queries. Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	96d11d0f56	nir/opt_varyings: Fix assertion when deduplicating TCS outputs. When deduplicating TCS outputs, we may find outputs that aren't loaded by the shader itself. This previously hit a bad assertion. Fixes: `c66967b5cb` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12410 Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	a29b5857f7	nir/xfb: Preserve some xfb information when gathering from intrinsics. We need to remember which streamout buffers and streams were enabled, even if the shader doesn't actually write any outputs to them, because the API requires that we count vertices created by this shader towards queries against those streams. That information can be gathered by nir_gather_xfb_info_with_varyings from the original NIR I/O variables that we get from the frontend, but it isn't included in any intrinsics so would be otherwise lost here. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Jan Alexander Steffens (heftig)	1deb0536a1	gfxstream: Use proper log format for 32-bit Vulkan On i686, where VK_USE_64_BIT_PTR_DEFINES is unset and Vulkan handles are represented as 64-bit integers instead, the code used the wrong format specifier, causing a build error. Fixes: `7fb31361f4` ("Handle external fences in vkGetFenceStatus()") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34124>	2025-04-03 19:35:20 +00:00
Zan Dobersek	335cc96069	tu: disable logic operations for float and sRGB formats Per spec, logic operations between fragment values and color attachments should be disabled when attachments are using float or sRGB formats. Regardless of attachment's format, enabled logic operations should keep blending disabled. Fixes: dEQP-VK.pipeline..logic_op_na_formats. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34212>	2025-04-03 15:48:19 +00:00
Lucas Stach	d917625226	etnaviv: add context flush sw query Context flushes can be caused by all kinds of operations that aren't obvious to a GL API user. As those are quite heavy-weight operations it is nice to have some insight into how many of those are happening per frame. Add a sw query to make this information easily accessible. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34350>	2025-04-03 14:27:55 +00:00
Stéphane Cerveau	ee535aa039	radv: video: rework maxActiveReferenceSlot/MaxDpbSlots For the pReferenceSlots.slotIndex, the max value should be the maxDpbSlots which is h264: 16 + 1 h265: 15 + 2 av1: 7 + 2 Fixing SVA_CL1_E test vector in JVT-AVC_V1 fluster test suite. Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33094>	2025-04-03 13:20:45 +00:00
Georg Lehmann	c21a53440f	spirv: clamp/sign-extend non 32bit ldexp exponents GLSL.std.450 allows any integer size here. OpenCL only allows i32. Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34071>	2025-04-03 12:35:59 +00:00
Job Noorman	45a5ccbf07	ir3/ra: create merge sets for splits/collects inserted for shared RA Since shared RA happens after creating merge sets, newly inserted splits/collects did not have merge sets created for them. Fix this by creating merge sets for new instructions after shared RA. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>	2025-04-03 12:06:18 +00:00
Job Noorman	0cafd07b0c	ir3: add ir3_aggressive_coalesce helper To allow us to create merge sets outside of ir3_merge_regs.c. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>	2025-04-03 12:06:18 +00:00
Job Noorman	a0db2f9737	ir3/ra: assign interval offsets to new defs after shared RA Shared RA might insert new defs to be handled by regular RA (e.g., shared spills). However, their interval offsets were not initialized which caused their intervals to sometimes be mistakenly matched with those containing offset 0. Fix this by calling index_merge_sets after shared RA and modifying that function to only index new defs in that case. Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `fa22b0901a` ("ir3/ra: Add specialized shared register RA/spilling") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>	2025-04-03 12:06:18 +00:00
Eric Engestrom	6331441e24	ci: rename ci-tron priority tag to avoid conflict with the generic fdo runners Otherwise, ci-tron runners with that tag could pick up jobs meant for the fdo runners, as happened here: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/73883719 The inverse (fdo runners picking up a job meant for a ci-tron runner) is not possible though, as ci-tron jobs always include a `farm:$RUNNER_FARM_LOCATION` tag, so the problem only exists in the other direction. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34358>	2025-04-03 11:25:12 +00:00
Samuel Pitoiset	ef3363ef71	radv: rework suspend/resume user conditional rendering Better to suspend/resume in the top level function. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>	2025-04-03 08:54:36 +00:00
Samuel Pitoiset	4bc971a0bd	radv: add new helper to suspend/resume user conditional rendering Instead of duplicating same code everywhere. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>	2025-04-03 08:54:36 +00:00
Samuel Pitoiset	4d1d6d4147	radv: fix ignoring conditional rendering with vkCmdResolveImage() This command isn't supposed to be affected by conditional rendering. This fixes new VKCTS coverage dEQP-VK.conditional_rendering.conditional_ignore.resolve_image*. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>	2025-04-03 08:54:36 +00:00
Job Noorman	dd1ba74777	ir3: make shpe a terminator shpe is a bit of a special instruction: it's not really a terminator (i.e., it does not perform a jump) but it does have to stay at the end of its block. Up to now, we tried to enforce this by creating const write barriers on shpe; the assumption being that everything that happens in the preamble ends in a write to the const file so shpe stays at the end. Alas, it turns out this is not true: things like sampler prefetches do not write the const file and nothing was preventing those from being scheduled after shpe. Instead of trying to create even more barrier dependencies, fix this by making shpe a terminator. Both sched and postsched treat terminators specially to make sure they always stay at the end of their block. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34290>	2025-04-03 08:16:59 +00:00
Danylo Piliaiev	f5019ee0d4	ir3: Fix shaders that write only color classified as empty Shader may have zero instructions and no prefetches but have inputs that without modifications are used as output. Fixed vkd3d test: test_depth_bias_behaviour Fixes: `b0a98d3b13` ("ir3: Detect empty fragment shaders") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34348>	2025-04-03 06:47:43 +00:00
Connor Abbott	75178c4655	tu: Implement VK_QCOM_fragment_density_map_offset Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	7351f8d587	tu/fdm: Skip some patchpoints when binning In order to implement FDM offset, we will have to offset the viewport and scissor in the binning pass. In order to do this, we have to pass a bin with nonsensical negative offsets to the patchpoint function, which would result in asserts when patching the load/store sequences. But we don't really need to patch these anyways as they are unused during binning, so add the ability to skip them when binning. FS params and some implementations of CmdClearAttachments (that don't contribute to visibility) can similarly be skipped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	df0c17f76e	tu: Fix CmdClearAttachments with fragment density map The clear may be a partial clear, in which case we need to make sure that the clear rectangle is transformed into GMEM space so that it is clipped correctly. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	0d4eed0e46	tu: Split out part of tiling config to vsc config For FDM offset, we will need to expand the number of bins by 1, which can change how pipes are allocated. We don't necessarily know whether FDM offset will be used when creating the VkFramebuffer, so we'll have to create two different configs when FDM is enabled. Split out the parts that are affected by the number of bins into a separate "VSC config" struct that will be duplicated with FDM offset. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	304af47ba2	tu: Only allow power-of-two fragment areas Non-power-of-two fragment areas can result in precision loss and missed fragments, which was seen in an upcoming CTS test. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Caleb Callaway	5ad00bae8b	intel/compiler: fix lingering i965 references Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34351>	2025-04-03 03:17:25 +00:00
Job Noorman	02ff26be38	ir3: run opt_if after opt_vectorize nir_opt_vectorize could replace swizzled movs with vectorized movs in a different block. If this happens with swizzled movs in a then block, it could leave this block empty. ir3 assumes only the else block can be empty (e.g., when lowering predicates) so make sure ifs are in that canonical form again. This fixes empty predication blocks in some shaders, for example: predt predf ... prede Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34272>	2025-04-03 00:19:31 +00:00
Job Noorman	ee0ee2a317	ir3: don't sync every TCS/GEOM block TCS/GEOM shaders need (sy)(ss) on their first instruction but we accidentally set it on the first instruction of every block. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34257>	2025-04-02 23:37:35 +00:00
Connor Abbott	3ba315f205	ir3: Split mad with scalar ALU At least on all a6xx/a7xx, mad.f32 and mad.f16 are not fused. This means that when the sources of a NIR ffma are all uniform we can split it in two to execute it on the scalar ALU. This is important to reduce register pressure and make more preambles executed early. On fossil-db the statistics are mostly a wash as expected, but with early preambles increasing dramatically: Totals: MaxWaves: 2249180 -> 2249230 (+0.00%); split: +0.01%, -0.01% Instrs: 49668884 -> 49662951 (-0.01%); split: -0.12%, +0.11% CodeSize: 103662656 -> 103831154 (+0.16%); split: -0.22%, +0.38% NOPs: 8502571 -> 8495568 (-0.08%); split: -0.61%, +0.53% MOVs: 1554442 -> 1538804 (-1.01%); split: -2.01%, +1.01% Full: 1820906 -> 1814292 (-0.36%); split: -0.39%, +0.03% (ss): 1168628 -> 1165868 (-0.24%); split: -1.01%, +0.78% (sy): 616751 -> 616521 (-0.04%); split: -0.52%, +0.49% (ss)-stall: 4384397 -> 4361662 (-0.52%); split: -1.44%, +0.93% (sy)-stall: 17850227 -> 17858949 (+0.05%); split: -0.58%, +0.63% Early-preamble: 102262 -> 115702 (+13.14%) Cat0: 9375820 -> 9367978 (-0.08%); split: -0.57%, +0.48% Cat1: 2470212 -> 2454318 (-0.64%); split: -1.28%, +0.64% Cat2: 18673655 -> 18707106 (+0.18%) Cat3: 14227810 -> 14211106 (-0.12%) Cat5: 1424184 -> 1424150 (-0.00%) Cat7: 1404718 -> 1405808 (+0.08%); split: -0.39%, +0.47% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34115>	2025-04-02 23:08:39 +00:00
Sviatoslav Peleshko	64980c4f05	vulkan/wsi/headless: Remove unnecessary wsi_configure_image() wsi_configure_image() with the same info is already called by configure_image() in wsi_swapchain_init(), so this second call is unnecessary. Furthermore, calling it the second time caused a memory leak of queue family indices array. Fixes: `d4a2c0fc` ("vulkan/wsi: add a headless swapchain implementation/option") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12811 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34194>	2025-04-02 21:17:30 +00:00
Dylan Baker	ff4b1b1e43	intel/decoder: free memory in error case This was handled in other instances in a previous patch, but this instance remains, as the zlib decompression routine is slightly different. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34118>	2025-04-02 19:26:55 +00:00
Dylan Baker	da14c0af67	intel/tools: move ascii85_decode to common code We have 3 copies of this function, so put it in the shared static library. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34118>	2025-04-02 19:26:55 +00:00
Dylan Baker	7b791cd0b4	intel/tools: deduplicate zlib_inflate function There are three copies of this function, all of them have the same memory leak in them. Instead of fixing them one by one, just use a common implementation for all three, since they already all have a shared helper lib. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34118>	2025-04-02 19:26:55 +00:00
David Rosca	a5edb9faac	radeonsi/vcn: Disable AV1 unidir compound with rate control It causes significant bitrate overshoot currently. Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34237>	2025-04-02 17:55:23 +00:00

190082 commits