fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 15:18:09 +02:00

Author	SHA1	Message	Date
Job Noorman	af8105d085	ir3/ra: ignore phis handled by shared RA If shared RA is used, it may have handled some phis. These are already ignored by regular RA in handle_phi but were used before that in potentially dangerous ways. More specifically, the interval of such phis was accessed which may cause an out-of-bounds read since it was never created. Fix this by skipping such phis earlier. Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `c6a932d4b3` ("ir3/ra: handle phis with preferred regs first") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34503>	2025-04-15 06:04:04 +00:00
Job Noorman	d8033ba173	ir3/ra: add helper for getting a dst interval There have been multiple issues related to accessing intervals through invalid register names. This usually results in a (difficult to diagnose) out-of-bounds access. Wrap all the interval accesses in a helper where we can assert that the name is in-bounds. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34503>	2025-04-15 06:04:04 +00:00
Connor Abbott	74531094cb	ir3: Vectorize shared memory loads/stores This drastically helps a Path of Exile 2 compute dispatch, going from 4.6ms to 2.7ms. Totals from 969 (0.59% of 164134) affected shaders: MaxWaves: 9586 -> 9560 (-0.27%); split: +0.02%, -0.29% Instrs: 1252433 -> 1234724 (-1.41%); split: -1.47%, +0.05% CodeSize: 2237424 -> 2195238 (-1.89%); split: -1.91%, +0.03% NOPs: 362213 -> 360913 (-0.36%); split: -0.92%, +0.56% MOVs: 58879 -> 59591 (+1.21%); split: -0.62%, +1.83% Full: 15817 -> 15867 (+0.32%); split: -0.04%, +0.36% (ss): 35671 -> 35434 (-0.66%); split: -1.80%, +1.14% (sy): 23953 -> 23964 (+0.05%); split: -0.38%, +0.43% (ss)-stall: 127807 -> 124930 (-2.25%); split: -3.43%, +1.18% (sy)-stall: 583947 -> 585886 (+0.33%); split: -0.61%, +0.94% Early-preamble: 317 -> 316 (-0.32%) Cat0: 394577 -> 393316 (-0.32%); split: -0.85%, +0.53% Cat1: 100335 -> 101057 (+0.72%); split: -0.36%, +1.08% Cat2: 415880 -> 415835 (-0.01%); split: -0.05%, +0.04% Cat3: 187928 -> 187929 (+0.00%); split: -0.00%, +0.00% Cat5: 19143 -> 19148 (+0.03%) Cat6: 69630 -> 52523 (-24.57%) Cat7: 47160 -> 47136 (-0.05%); split: -0.56%, +0.51% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34441>	2025-04-14 17:22:47 +00:00
Connor Abbott	9977c4d682	ir3: Move load/store vectorization to finalize Some frontends such as rusticl and turnip call the optimization loop before choosing the shared memory layout, in order to be able to delete variables that turn out to be unused. This means that we can't vectorize them until after the first run of the optimization loop. Other drivers also seem to do something similar. This also has the benefit that by delaying vectorization of UBOs until after they are lowered from derefs, we don't insert casts which remove the ability of nir_lower_explicit_io to insert a range, which was blocking the pushing of vectorized indirect UBO loads. This has a significant positive impact on fossil-db: Only doing vectorization later exposes a bug where vectorization could change the bitsize after we used it to determine which descriptor to use. It happened to work before because vectorization was usually done early. To fix it, move adjusting the descriptor to a new pass that happens after finalizing. 
Totals: MaxWaves: 2249140 -> 2281068 (+1.42%); split: +1.43%, -0.01% Instrs: 49624230 -> 49143117 (-0.97%); split: -1.14%, +0.17% CodeSize: 103796862 -> 104143744 (+0.33%); split: -0.98%, +1.31% NOPs: 8489860 -> 8512218 (+0.26%); split: -1.55%, +1.81% MOVs: 1531650 -> 1574911 (+2.82%); split: -1.37%, +4.20% Full: 1814334 -> 1748906 (-3.61%); split: -3.64%, +0.03% (ss): 1155395 -> 1128249 (-2.35%); split: -3.48%, +1.13% (sy): 608650 -> 567972 (-6.68%); split: -7.32%, +0.64% (ss)-stall: 4352550 -> 4340473 (-0.28%); split: -2.08%, +1.80% (sy)-stall: 17852259 -> 16943647 (-5.09%); split: -6.25%, +1.16% STPs: 24568 -> 24215 (-1.44%) LDPs: 37799 -> 37468 (-0.88%) Early-preamble: 115698 -> 113694 (-1.73%); split: +0.17%, -1.90% Cat0: 9345228 -> 9367782 (+0.24%); split: -1.41%, +1.65% Cat1: 2445265 -> 2549122 (+4.25%); split: -0.81%, +5.06% Cat2: 18704736 -> 18377519 (-1.75%); split: -1.76%, +0.01% Cat3: 14210303 -> 14130558 (-0.56%); split: -0.56%, +0.00% Cat4: 1346895 -> 1346462 (-0.03%); split: -0.03%, +0.00% Cat5: 1420418 -> 1420417 (-0.00%); split: -0.07%, +0.07% Cat6: 745590 -> 549358 (-26.32%); split: -26.66%, +0.34% Cat7: 1405795 -> 1401899 (-0.28%); split: -0.96%, +0.68% Totals from 79089 (48.19% of 164134) affected shaders: MaxWaves: 947648 -> 979576 (+3.37%); split: +3.40%, -0.03% Instrs: 38664140 -> 38183027 (-1.24%); split: -1.47%, +0.22% CodeSize: 80179110 -> 80525992 (+0.43%); split: -1.27%, +1.70% NOPs: 6880907 -> 6903265 (+0.32%); split: -1.91%, +2.23% MOVs: 1183855 -> 1227116 (+3.65%); split: -1.78%, +5.43% Full: 1107056 -> 1041628 (-5.91%); split: -5.96%, +0.05% (ss): 939342 -> 912196 (-2.89%); split: -4.28%, +1.39% (sy): 457959 -> 417281 (-8.88%); split: -9.73%, +0.85% (ss)-stall: 3664495 -> 3652418 (-0.33%); split: -2.47%, +2.14% (sy)-stall: 12266805 -> 11358193 (-7.41%); split: -9.10%, +1.69% STPs: 7494 -> 7141 (-4.71%) LDPs: 7050 -> 6719 (-4.70%) Early-preamble: 46339 -> 44335 (-4.32%); split: +0.43%, -4.75% Cat0: 7548630 -> 7571184 (+0.30%); split: 
-1.75%, +2.05% Cat1: 1823872 -> 1927729 (+5.69%); split: -1.09%, +6.78% Cat2: 14767716 -> 14440499 (-2.22%); split: -2.22%, +0.01% Cat3: 10630582 -> 10550837 (-0.75%); split: -0.75%, +0.00% Cat4: 1150090 -> 1149657 (-0.04%); split: -0.04%, +0.00% Cat5: 1068913 -> 1068912 (-0.00%); split: -0.09%, +0.09% Cat6: 554910 -> 358678 (-35.36%); split: -35.82%, +0.45% Cat7: 1119427 -> 1115531 (-0.35%); split: -1.20%, +0.86% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34441>	2025-04-14 17:22:46 +00:00
Connor Abbott	ec780eb0e7	ir3: Pass through access flags when lowering global accesses This will let us do optimizations such as moving loads to a preamble. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483>	2025-04-14 16:53:34 +00:00
Job Noorman	35ec960f6f	ir3: run cp after ir3_imm_const_to_preamble Now that ir3_cp has an option to not lower immediates to const registers, we can use it after ir3_imm_const_to_preamble instead of manually propagating immediates. This fixes a lot of missed opportunities for early-preamble as we didn't propagate the mova1 immediate which caused a GPR to be used in many preambles. Totals: Instrs: 49704517 -> 49703700 (-0.00%); split: -0.16%, +0.16% CodeSize: 103917968 -> 103187072 (-0.70%); split: -0.82%, +0.11% NOPs: 8516944 -> 8511764 (-0.06%); split: -0.78%, +0.72% MOVs: 1534023 -> 1536385 (+0.15%); split: -1.12%, +1.27% Full: 1816517 -> 1816548 (+0.00%); split: -0.05%, +0.06% (ss): 1162108 -> 1161490 (-0.05%); split: -1.03%, +0.98% (sy): 611398 -> 610311 (-0.18%); split: -0.80%, +0.62% (ss)-stall: 4384529 -> 4388096 (+0.08%); split: -1.22%, +1.30% (sy)-stall: 17858701 -> 17837101 (-0.12%); split: -0.87%, +0.74% STPs: 25096 -> 25491 (+1.57%); split: -0.05%, +1.63% LDPs: 37635 -> 38030 (+1.05%); split: -0.03%, +1.08% Preamble Instrs: 12589113 -> 11391946 (-9.51%); split: -9.75%, +0.24% Early Preamble: 115946 -> 122893 (+5.99%); split: +6.05%, -0.06% Cat0: 9374513 -> 9370393 (-0.04%); split: -0.71%, +0.67% Cat1: 2443348 -> 2446546 (+0.13%); split: -0.82%, +0.95% Cat2: 18731502 -> 18731478 (-0.00%); split: -0.00%, +0.00% Cat7: 1410092 -> 1410221 (+0.01%); split: -0.61%, +0.62% Totals from 39189 (23.81% of 164575) affected shaders: Instrs: 30656115 -> 30655298 (-0.00%); split: -0.26%, +0.26% CodeSize: 61714230 -> 60983334 (-1.18%); split: -1.37%, +0.19% NOPs: 6074700 -> 6069520 (-0.09%); split: -1.10%, +1.01% MOVs: 1010392 -> 1012754 (+0.23%); split: -1.70%, +1.93% Full: 617108 -> 617139 (+0.01%); split: -0.16%, +0.16% (ss): 778842 -> 778224 (-0.08%); split: -1.54%, +1.46% (sy): 362803 -> 361716 (-0.30%); split: -1.35%, +1.05% 
(ss)-stall: 3203827 -> 3207394 (+0.11%); split: -1.67%, +1.78% (sy)-stall: 9507680 -> 9486080 (-0.23%); split: -1.63%, +1.40% STPs: 23004 -> 23399 (+1.72%); split: -0.06%, +1.77% LDPs: 33942 -> 34337 (+1.16%); split: -0.04%, +1.20% Preamble Instrs: 8090918 -> 6893751 (-14.80%); split: -15.18%, +0.38% Early Preamble: 12246 -> 19193 (+56.73%); split: +57.25%, -0.52% Cat0: 6656706 -> 6652586 (-0.06%); split: -1.00%, +0.94% Cat1: 1546399 -> 1549597 (+0.21%); split: -1.30%, +1.50% Cat2: 11642214 -> 11642190 (-0.00%); split: -0.00%, +0.00% Cat7: 943911 -> 944040 (+0.01%); split: -0.91%, +0.92% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>	2025-04-14 04:37:28 +00:00
Job Noorman	226ec669d8	ir3/cp: ignore alias sources for sam.s2en ir3_cp asserts that the first source of a sam.s2en is a collect which isn't necessarily true after creating alias registers. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>	2025-04-14 04:37:28 +00:00
Job Noorman	1618c2495b	ir3/cp: add option to disable immediate to const lowering This will allow it to be used after ir3_imm_const_to_preamble so that we don't have to do the propagation of immediates manually there. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>	2025-04-14 04:37:27 +00:00
Job Noorman	6546a40225	ir3: remove spaces in shader stats The shaderdb scripts don't like them. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>	2025-04-14 04:37:27 +00:00
Valentine Burley	b49eaf0966	ci/lava: Consolidate piglit trace job definitions Clean up LAVA job definitions. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>	2025-04-11 07:05:07 +00:00
Valentine Burley	87d58ea57a	ci/piglit: Consolidate HWCI_TEST_SCRIPT for piglit traces The HWCI_TEST_SCRIPT variable was always getting overwritten for these definitions. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>	2025-04-11 07:05:06 +00:00
Valentine Burley	1aeedddbb6	ci/piglit: Drop redundant PIGLIT_PROFILES variable PIGLIT_PROFILES was only used with the piglit-runner.sh script, which no jobs were using anymore. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>	2025-04-11 07:05:06 +00:00
Renato Pereyra	7190949927	perfetto/android: align datasource names with tooling expectations A few Android tools are based on/assume the datasource names gpu.renderstages and gpu.counters. It is less effort to align with that naming for Android builds than to chase down those tools and fix them, not to mention account for new tools that may be created in the future. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34330>	2025-04-08 18:29:10 +00:00
Rob Clark	ea6e69e9d3	tu: vdrm vtest support In a few places, we need to deal with not having direct access to the rendernode device. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>	2025-04-08 15:38:39 +00:00
Rob Clark	bf0e3d6274	virtio/vdrm: Add vtest backend This allows for testing drm native ctx support without spinning up a VM. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>	2025-04-08 15:38:39 +00:00
Rob Clark	28ad8fd5b1	tu: Add some func traces Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>	2025-04-08 15:38:38 +00:00
Rob Clark	db88a490b8	tu: Avoid extraneous set_iova The GEM_NEW ccmd already passes the iova, so we don't need an extra SET_IOVA for newly created BOs. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>	2025-04-08 15:38:38 +00:00
Rob Clark	081869e591	tu/vdrm: Fix userspace fence cmds Somehow the update of the fence value to write was dropped, so the cmdstream that wrote the fence value would simply write zero over and over again. Fixes: `84d6eedd5e` ("tu: Refactor the submit path") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>	2025-04-08 15:38:38 +00:00
Rob Clark	471961d0ca	ir3: Comment re-indent To make this more readable. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>	2025-04-08 15:38:38 +00:00
Mike Blumenkrantz	b14c8128bf	tu: check for valid descriptor set when binding descriptors these pointers can be null, and they are checked as null in pipeline layout creation, but here if the pointer is null it will crash cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34412>	2025-04-07 18:49:10 +00:00
Collabora's Gfx CI Team	fcf19bf335	Uprev ANGLE to 3818d37d5e94317f01810053b8f28c1f1e8b98e6 `1b34d2a18a...3818d37d5e` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34378>	2025-04-07 18:16:00 +00:00
Eric Engestrom	90844640a1	freedreno/ci: update expectations Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34386>	2025-04-04 23:49:23 +00:00
Connor Abbott	536b2b13c8	tu: Implement VK_EXT_fragment_density_map_offset Implement support for dynamic rendering, including suspending and resuming render passes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34159>	2025-04-04 22:35:20 +00:00
Danylo Piliaiev	0e9854a894	tu: Implement VK_KHR_shader_clock There is a special address defined in kernel from which ALWAYSON counter could be read. Blob uses this sequence to read it: getone #l15 mov.s32s32 r2.y, -4096 mov.s32s32 r2.z, 131071 (rpt5)nop ldg.u32 r2.w, g[r2.y], 1 ldg.u32 r2.y, g[r2.y+4], 1 (sy)(ss)mov.s32s32 r48.x, (last)r2.w mov.s32s32 r48.y, (last)r2.y l15: Passes: dEQP-VK.glsl.shader_clock.* Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29860>	2025-04-04 18:22:49 +00:00
Danylo Piliaiev	4b1b4ee10c	freedreno,tu: Read and pass to compiler uche_trap_base KGSL always exposed uche_trap_base, and MSM only recently got support for it. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29860>	2025-04-04 18:22:49 +00:00
Mark Collins	e4359cc49c	tu/kgsl: Fix KGSL syncobj lifetime in no CB submit The temporary syncobj created in the fast path of kgsl_queue_submit was not being destroyed, and potentially being assigned to multiple syncobjs without being properly duplicated. This could lead to a use-after-free or double-free since multiple syncobjs could be assigned the same FD. Signed-off-by: Mark Collins <mark@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34328>	2025-04-04 16:54:17 +00:00
Mark Collins	cf4bd2e412	tu/kgsl: Revert "Remove zero CB queue submission fast path" This reverts commit `0342d34bdb` which introduced a regression in the Turnip's KGSL backend, causing various sync issues since KGSL doesn't advance the GPU timeline when a submit without cmdbufs is made. A comment explaining the issue was added to the code, and the fast path is reintroduced. Signed-off-by: Mark Collins <mark@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34328>	2025-04-04 16:54:17 +00:00
Job Noorman	78ef51aa04	ir3/opt_preamble: take alias.rt into account for rewrite cost FS outputs can use const registers in alias.rt without a mov so take this into account when calculating the rewrite cost of instructions. Totals: MaxWaves: 2765084 -> 2765130 (+0.00%); split: +0.00%, -0.00% Instrs: 56289002 -> 56285073 (-0.01%); split: -0.01%, +0.00% CodeSize: 118071672 -> 118076808 (+0.00%); split: -0.00%, +0.01% NOPs: 9491112 -> 9492474 (+0.01%); split: -0.00%, +0.02% MOVs: 1790085 -> 1786768 (-0.19%); split: -0.19%, +0.00% Full: 2156693 -> 2156607 (-0.00%); split: -0.00%, +0.00% (ss): 1329812 -> 1329546 (-0.02%); split: -0.03%, +0.01% (sy): 686396 -> 686386 (-0.00%); split: -0.00%, +0.00% (ss)-stall: 4995295 -> 4995185 (-0.00%); split: -0.02%, +0.01% (sy)-stall: 19828966 -> 19828624 (-0.00%); split: -0.01%, +0.01% Cat0: 10450369 -> 10451731 (+0.01%); split: -0.00%, +0.02% Cat1: 2787946 -> 2784566 (-0.12%); split: -0.12%, +0.00% Cat2: 21265787 -> 21264447 (-0.01%) Cat3: 16207098 -> 16206536 (-0.00%) Cat7: 1597849 -> 1597840 (-0.00%); split: -0.00%, +0.00% Totals from 730 (0.36% of 200220) affected shaders: MaxWaves: 6308 -> 6354 (+0.73%); split: +0.79%, -0.06% Instrs: 258235 -> 254306 (-1.52%); split: -1.59%, +0.07% CodeSize: 698806 -> 703942 (+0.73%); split: -0.28%, +1.02% NOPs: 21040 -> 22402 (+6.47%); split: -1.85%, +8.33% MOVs: 9426 -> 6109 (-35.19%); split: -35.52%, +0.33% Full: 8914 -> 8828 (-0.96%); split: -1.03%, +0.07% (ss): 5118 -> 4852 (-5.20%); split: -6.58%, +1.39% (sy): 2118 -> 2108 (-0.47%); split: -1.18%, +0.71% (ss)-stall: 17360 -> 17250 (-0.63%); split: -4.57%, +3.94% (sy)-stall: 34921 -> 34579 (-0.98%); split: -5.90%, +4.92% Cat0: 24734 -> 26096 (+5.51%); split: -1.58%, +7.09% Cat1: 12311 -> 8931 (-27.46%); split: -27.70%, +0.24% Cat2: 106329 -> 104989 (-1.26%) Cat3: 100547 -> 99985 (-0.56%) Cat7: 3646 -> 3637 (-0.25%); split: -0.91%, +0.66% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: 
<https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34279>	2025-04-04 14:17:10 +00:00
Zan Dobersek	248edb43c3	tu: allow D3D-compatible texture coordinate rounding When running under DXVK or vkd3d, the texture coordinate rounding behavior should match D3D expectations. On Adreno, this behavior can be toggled through the SP_TP_MODE_CNTL register. A driconf-based option is introduced to help set the relevant register flag that enables this behavior. This fixes the cause of test_sampler_rounding test case failure in vkd3d on Turnip's side, but a small change in vkd3d is also required, so the test failure expectation isn't removed yet. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33987>	2025-04-04 10:09:47 +00:00
Zan Dobersek	3b1ca55b40	freedreno/registers: add useful A6XX_SP_TP_MODE_CNTL bitfields Add additional bitfields for the A6XX_SP_TP_MODE_CNTL registers, ones that we already use and the texcoord rounding mode bitfield that we'll need for D3D-over-Vulkan implementations. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33987>	2025-04-04 10:09:47 +00:00
Zan Dobersek	335cc96069	tu: disable logic operations for float and sRGB formats Per spec, logic operations between fragment values and color attachments should be disabled when attachments are using float or sRGB formats. Regardless of attachment's format, enabled logic operations should keep blending disabled. Fixes: dEQP-VK.pipeline..logic_op_na_formats. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34212>	2025-04-03 15:48:19 +00:00
Job Noorman	45a5ccbf07	ir3/ra: create merge sets for splits/collects inserted for shared RA Since shared RA happens after creating merge sets, newly inserted splits/collects did not have merge sets created for them. Fix this by creating merge sets for new instructions after shared RA. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>	2025-04-03 12:06:18 +00:00
Job Noorman	0cafd07b0c	ir3: add ir3_aggressive_coalesce helper To allow us to create merge sets outside of ir3_merge_regs.c. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>	2025-04-03 12:06:18 +00:00
Job Noorman	a0db2f9737	ir3/ra: assign interval offsets to new defs after shared RA Shared RA might insert new defs to be handled by regular RA (e.g., shared spills). However, their interval offsets were not initialized which caused their intervals to sometimes be mistakenly matched with those containing offset 0. Fix this by calling index_merge_sets after shared RA and modifying that function to only index new defs in that case. Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `fa22b0901a` ("ir3/ra: Add specialized shared register RA/spilling") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>	2025-04-03 12:06:18 +00:00
Job Noorman	dd1ba74777	ir3: make shpe a terminator shpe is a bit of a special instruction: it's not really a terminator (i.e., it does not perform a jump) but it does have to stay at the end of its block. Up to now, we tried to enforce this by creating const write barriers on shpe; the assumption being that everything that happens in the preamble ends in a write to the const file so shpe stays at the end. Alas, it turns out this is not true: things like sampler prefetches do not write the const file and nothing was preventing those from being scheduled after shpe. Instead of trying to create even more barrier dependencies, fix this by making shpe a terminator. Both sched and postsched treat terminators specially to make sure they always stay at the end of their block. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34290>	2025-04-03 08:16:59 +00:00
Danylo Piliaiev	f5019ee0d4	ir3: Fix shaders that write only color classified as empty Shader may have zero instructions and no prefetches but have inputs that without modifications are used as output. Fixed vkd3d test: test_depth_bias_behaviour Fixes: `b0a98d3b13` ("ir3: Detect empty fragment shaders") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34348>	2025-04-03 06:47:43 +00:00
Connor Abbott	75178c4655	tu: Implement VK_QCOM_fragment_density_map_offset Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	7351f8d587	tu/fdm: Skip some patchpoints when binning In order to implement FDM offset, we will have to offset the viewport and scissor in the binning pass. In order to do this, we have to pass a bin with nonsensical negative offsets to the patchpoint function, which would result in asserts when patching the load/store sequences. But we don't really need to patch these anyways as they are unused during binning, so add the ability to skip them when binning. FS params and some implementations of CmdClearAttachments (that don't contribute to visibility) can similarly be skipped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	df0c17f76e	tu: Fix CmdClearAttachments with fragment density map The clear may be a partial clear, in which case we need to make sure that the clear rectangle is transformed into GMEM space so that it is clipped correctly. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	0d4eed0e46	tu: Split out part of tiling config to vsc config For FDM offset, we will need to expand the number of bins by 1, which can change how pipes are allocated. We don't necessarily know whether FDM offset will be used when creating the VkFramebuffer, so we'll have to create two different configs when FDM is enabled. Split out the parts that are affected by the number of bins into a separate "VSC config" struct that will be duplicated with FDM offset. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Connor Abbott	304af47ba2	tu: Only allow power-of-two fragment areas Non-power-of-two fragment areas can result in precision loss and missed fragments, which was seen in an upcoming CTS test. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33500>	2025-04-03 05:37:56 +00:00
Job Noorman	02ff26be38	ir3: run opt_if after opt_vectorize nir_opt_vectorize could replace swizzled movs with vectorized movs in a different block. If this happens with swizzled movs in a then block, it could leave this block empty. ir3 assumes only the else block can be empty (e.g., when lowering predicates) so make sure ifs are in that canonical form again. This fixes empty predication blocks in some shaders, for example: predt predf ... prede Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34272>	2025-04-03 00:19:31 +00:00
Job Noorman	ee0ee2a317	ir3: don't sync every TCS/GEOM block TCS/GEOM shaders need (sy)(ss) on their first instruction but we accidentally set it on the first instruction of every block. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34257>	2025-04-02 23:37:35 +00:00
Connor Abbott	3ba315f205	ir3: Split mad with scalar ALU At least on all a6xx/a7xx, mad.f32 and mad.f16 are not fused. This means that when the sources of a NIR ffma are all uniform we can split it in two to execute it on the scalar ALU. This is important to reduce register pressure and make more preambles executed early. On fossil-db the statistics are mostly a wash as expected, but with early preambles increasing dramatically: Totals: MaxWaves: 2249180 -> 2249230 (+0.00%); split: +0.01%, -0.01% Instrs: 49668884 -> 49662951 (-0.01%); split: -0.12%, +0.11% CodeSize: 103662656 -> 103831154 (+0.16%); split: -0.22%, +0.38% NOPs: 8502571 -> 8495568 (-0.08%); split: -0.61%, +0.53% MOVs: 1554442 -> 1538804 (-1.01%); split: -2.01%, +1.01% Full: 1820906 -> 1814292 (-0.36%); split: -0.39%, +0.03% (ss): 1168628 -> 1165868 (-0.24%); split: -1.01%, +0.78% (sy): 616751 -> 616521 (-0.04%); split: -0.52%, +0.49% (ss)-stall: 4384397 -> 4361662 (-0.52%); split: -1.44%, +0.93% (sy)-stall: 17850227 -> 17858949 (+0.05%); split: -0.58%, +0.63% Early-preamble: 102262 -> 115702 (+13.14%) Cat0: 9375820 -> 9367978 (-0.08%); split: -0.57%, +0.48% Cat1: 2470212 -> 2454318 (-0.64%); split: -1.28%, +0.64% Cat2: 18673655 -> 18707106 (+0.18%) Cat3: 14227810 -> 14211106 (-0.12%) Cat5: 1424184 -> 1424150 (-0.00%) Cat7: 1404718 -> 1405808 (+0.08%); split: -0.39%, +0.47% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34115>	2025-04-02 23:08:39 +00:00
Connor Abbott	15660caa90	tu: Fix layer_count with dynamic rendering + multiview With "classic" renderpasses, the VkFramebuffer's layerCount must be 1 if multiview is enabled. We accidentally rely on this to not disable GMEM for multiview, and possibly for other things too. Apparently the dynamic rendering equivalent, VkRenderingInfo::layerCount, can be anything when multiview is enabled, and some CTS tests set it to the number of views. Sanitize it when constructing the internal framebuffer for dynamic rendering. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34080>	2025-04-02 15:47:47 +00:00
Danylo Piliaiev	c538a9ec6e	tu: Use EARLY_Z also for stencil tests EARLY tests can test and write out stencil values. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33851>	2025-04-02 12:03:30 +00:00
Danylo Piliaiev	534cf4feeb	tu/lrz: Improve LRZ around stencil tests and reads_dest cases We were a bit too conservative and fully disabled LRZ for when stencil or blending were involved. There is no need to fully disable LRZ in those cases, only LRZ writes should be disabled. The final rules are: LRZ is DISABLED until depth attachment is cleared when: - Depth Write + changing direction of depth test e.g. from OP_GREATER to OP_LESS; - Depth Write + OP_ALWAYS or OP_NOT_EQUAL; - Clearing depth with vkCmdClearAttachments; - Depth image is a target of blit commands. - (pre-a650) Not clearing depth attachment with LOAD_OP_CLEAR; - (pre-a650) Using secondary command buffers; LRZ WRITE is DISABLED until depth attachment is cleared when: - Depth Write + blending (color blend, logic ops, partial color mask, etc.); - Fragment may be killed by stencil; LRZ is disabled for CURRENT draw when: - Fragment shader side-effects (writing to SSBOs, atomic operations, etc); - Fragment shader writes depth or stencil; LRZ WRITE is DISABLED (via LATE_Z) for CURRENT draw when: - Fragment may be killed via alpha-to-coverage, discard, sample coverage; Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33851>	2025-04-02 12:03:30 +00:00
Aaron Ruby	8513bcbd2f	virtio: Remove virglrenderer_hw.h entirely Capset definitions replaced by those in virtgpu_drm.h Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34300>	2025-04-01 22:11:10 +00:00
Collabora's Gfx CI Team	1ce0cef6bf	Uprev ANGLE to 1b34d2a18af12cc55a3bc74dd679c2937d10cc5c `6abdc11741...1b34d2a18a` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34277>	2025-04-01 12:51:06 +00:00
Danylo Piliaiev	be481e6615	tu: Disable FS in certain cases even if FS is not empty If FS doesn't have side-effects and color write mask is zero. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33735>	2025-03-31 12:15:56 +02:00
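
The percentages in the shader-stat totals quoted in the messages above appear to be plain relative changes between the "before" and "after" counts. A minimal sketch of that arithmetic follows, assuming the (after - before) / before convention and two-decimal rounding. The `pct_change` helper is hypothetical, written only for illustration, and is not part of the fossil-db or shaderdb tooling.

```python
def pct_change(before: int, after: int) -> str:
    """Format a relative change as a signed percentage with two decimals.

    Assumption: the stats use (after - before) / before, which matches
    the figures quoted in the commit messages checked below.
    """
    change = (after - before) / before * 100.0
    return f"{change:+.2f}%"


# Figures taken from the commit messages in the table above.
print("Cat6:", pct_change(69630, 52523))            # 74531094cb
print("Early Preamble:", pct_change(12246, 19193))  # 35ec960f6f
print("MOVs:", pct_change(9426, 6109))              # 78ef51aa04
```

A negative value on an instruction-count row (Instrs, NOPs, Cat6) means fewer instructions after the change, while a positive value on rows such as MaxWaves or Early Preamble means more waves or more shaders with an early preamble.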

6090 commits