fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 13:58:04 +02:00

Author	SHA1	Message	Date
David Rosca	c1610da677	vulkan/video: Add intra refresh support Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36718>	2025-08-20 10:57:59 +00:00
Georg Lehmann	639b91bb48	aco/isel: fix vectorized i2i16 with 8bit vec8 source The extract index is in dwords, not bytes. Fixes: `92d433c54a` ("aco: vectorize conversions from 8bit to 16bit") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869>	2025-08-20 10:13:22 +00:00
David Rosca	638fa01203	radv/video: Enable AV1 decode workaround for gfx1153 Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>	2025-08-20 09:51:32 +00:00
David Rosca	4893e09c10	radeonsi/vcn: Enable AV1 decode workaround for gfx1153 Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>	2025-08-20 09:51:32 +00:00
David Rosca	231d877cc8	ac/vcn_dec: Add av1_intrabc_workaround Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>	2025-08-20 09:51:32 +00:00
Valentine Burley	021a3f768b	zink/ci: Update expectations from nightly jobs Document current failures and flakes from the nightly jobs. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Valentine Burley	c4d8c5ed4a	zink/ci: Switch to quick_gl profile for nightly ANV jobs The full nightly jobs have been failing for a while without much interest in them. Reduce Piglit coverage by switching to the `quick_gl` profile, which is what the pre-merge jobs run. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Valentine Burley	6b88e2bd38	anv/ci: Update expectations from nightly jobs Document current failures and flakes from the nightly jobs, and add a skip for tests that are timing out. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Valentine Burley	e4fc3e4ee6	anv/ci: Lower concurrency for nightly jobs The nightly jobs can hit OOMs on JSL and ADL, so reduce the number of threads used by deqp-runner to avoid that. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Job Noorman	7752cc26c4	ir3: use offset_shift for SSBO intrinsics Our SSBO access instructions expect offsets in units of the accessed type's size. However, we were ingesting SSBO intrinsics that use byte addresses. We were fixing this up in ir3_nir_lower_io_offsets by inserting a ushr or, if possible, propagating this shift into another shift that's part of the address calculation. Having to insert a ushr is unfortunate, as for most accesses, it should be possible to extract this shift directly from the access chain because the array strides and struct offsets would be properly aligned. It also prevents nir_opt_offsets from finding constant additions to extract, as they would be hidden behind a ushr that often cannot be optimized away. `57ea689273` ("ir3: optimize SSBO offset shifts for nir_opt_offsets") tried to overcome the latter problem somewhat by pushing a ushr into additions. This turned out to be unsound because even though SSBO offsets are unsigned, intermediate results in the offset calculation might be negative values, which means we should use ishr in those cases. Unfortunately, we cannot know when to use ushr or ishr. This commit switches ir3 to the newly introduced offset_shift index for SSBO intrinsics. This allows the shift to be extracted when lowering derefs in nir_lower_explicit_io. In some cases, we still might have to add an extra shift to make sure the offset uses the correct units. 
It turns out that this is very rare and using offset_shift greatly improves the shader stats: Totals from 33267 (20.20% of 164705) affected shaders: MaxWaves: 440368 -> 455258 (+3.38%); split: +3.40%, -0.01% Instrs: 22974358 -> 21844188 (-4.92%); split: -4.98%, +0.06% CodeSize: 45456418 -> 43099334 (-5.19%); split: -5.22%, +0.03% NOPs: 4612549 -> 4524353 (-1.91%); split: -2.97%, +1.05% MOVs: 802018 -> 817547 (+1.94%); split: -3.29%, +5.23% COVs: 381987 -> 382061 (+0.02%); split: -0.03%, +0.05% Full: 514078 -> 477339 (-7.15%); split: -7.18%, +0.04% (ss): 544419 -> 502332 (-7.73%); split: -9.12%, +1.39% (sy): 292099 -> 304697 (+4.31%); split: -3.19%, +7.50% (ss)-stall: 2106134 -> 2104011 (-0.10%); split: -1.82%, +1.71% (sy)-stall: 9704720 -> 10324864 (+6.39%); split: -4.64%, +11.03% STPs: 11301 -> 10074 (-10.86%) LDPs: 18654 -> 17202 (-7.78%) Preamble Instrs: 4652214 -> 4580289 (-1.55%); split: -1.59%, +0.04% Early Preamble: 13977 -> 13978 (+0.01%) Constlen: 1881764 -> 1881304 (-0.02%); split: -0.03%, +0.01% Last helper: 5157587 -> 5074042 (-1.62%); split: -1.86%, +0.24% Subgroup size: 2262976 -> 2263232 (+0.01%) Cat0: 5065452 -> 4976324 (-1.76%); split: -2.73%, +0.97% Cat1: 1241085 -> 1251974 (+0.88%); split: -2.52%, +3.40% Cat2: 8462897 -> 7723367 (-8.74%); split: -8.74%, +0.01% Cat3: 5738382 -> 5735312 (-0.05%); split: -0.06%, +0.00% Cat5: 761945 -> 763017 (+0.14%); split: -0.00%, +0.14% Cat6: 199819 -> 197766 (-1.03%); split: -1.34%, +0.31% Cat7: 890192 -> 581842 (-34.64%); split: -35.20%, +0.57% Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	30716cc524	nir/lower_explicit_io: add support for offset_shift The goal here is to generate addresses that are a right-shifted version of the actual byte address and record the shift amount in the offset_shift index. While we could just insert a ushr at the end of deref chains, this will prevent the shift from being optimized away in many cases. Instead, we try to extract the shift from the array strides and struct offsets that make up the deref chain, and only insert a ushr when absolutely necessary (i.e., for casts). This means we have to walk the entire deref chain at once for accesses that support offset_shift and we don't use the standard algorithm of replacing each deref one at a time. To be able to legally right-shift casts, we use the alignment information and never shift more than what the alignment could support. It should also be noted that casts generally have two sources: something provided by the driver (e.g., a Vulkan resource index) or a variable pointer coming from a phi/bcsel. For the latter, the entire access chain consists of multiple parts that are ended by either a phi/bcsel or an access. Only the part that ends in an access is handled by this new algorithm; the other parts are handled as usual. This is necessary because we have no way to encode the offset shift or to even know how much we would be able to shift without knowing how it is accessed. This commit adds the general implementation for lowering accesses using offset_shift and adds a compiler option for drivers to enable it for SSBO accesses. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	1406eafbcd	nir/lower_explicit_io: add alignment parameters to address builder We will need this when building shifted addresses. Since adding these parameters has a lot of code churn which would distract from the main changes, it is split-off in a separate commit. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	553a439b54	nir/lower_explicit_io: use nir_io_offset to pass around addresses We will add support for shifted addresses; this commit makes sure the APIs of the functions already support passing shifts. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	4c9afbd01d	nir/lower_explicit_io: add helper to build address The helper is used to build the address passed to build_explicit_io_load/store. For now, it simply takes care of adding the component offset when scalarizing. In the future, this can be used to do more complex address manipulations, like calculating the full deref chain address. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	1fffba12a0	nir/lower_explicit_io: make offset calculation reusable nir_explicit_io_address_from_deref implicitly builds the offset but only makes the full address available. Split-out the offset calculation in a separate function so we can reuse it elsewhere. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	b0bc97cb43	nir/opt_load_store_vectorize: fix wrap check for scaled offsets Hardware will typically do bounds checking on the final scaled address so the wrap check should do the same. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	cb773dec8c	nir/opt_load_store_vectorize: add support for offset_shift Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	249e27c9c7	nir/opt_load_store_vectorize: allow per-instruction offset scaling We currently support offset scaling on a per-intrinsic type basis. Since the introduction of the offset_shift index, different instantiations of the same type can now have a different scale. Add support for this by calculating the offset scale on the fly for instructions that have offset_shift. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	7fe484e373	nir/lower_mem_access_bit_sizes: add partial support for offset_shift Note: this was implemented and tested for ir3. The code paths that are never used there [1] seem non-trivial to implement. Since they cannot be easily tested, asserts and TODOs are added to ensure we don't accidentally hit them for intrinsics with offset_shift. [1]: these paths are never used on ir3 since lower_mem_access_bit_sizes is only used for SSBO accesses to lower 64b accesses (which are 64b aligned) to 32b ones. So we'll never request an increase of alignment. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	cd72d8e366	nir/opt_shrink_vectors: add support for offset_shift Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	f876adc372	nir/lower_wrmasks: add support for offset_shift Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	b85c379945	nir/lower_wrmasks: don't adjust BASE The immediate addition can easily be handled by nir_opt_offsets, which will also take any driver limits into account. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	dd296a6d80	nir/lower_io_to_scalar: add support for offset_shift Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	59eb95cd2f	nir/lower_atomics: add support for offset_shift Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	513412c893	nir,ir3: add offset_shift index to SSBO access intrinsics In ir3, SSBO offsets are in units of the accessed type size so we want to start using the new offset_shift index. Even though the shift is implicit for the ir3 intrinsics, we use nir_intrinsic_copy_const_indices when creating them so we need to make sure our indices match the ones used by the generic intrinsics. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	355c9b88f7	nir: add some helpers for dealing with offset_shift For intrinsics supporting offset_shift, dealing with their offset is a bit tricky as we cannot simply add a byte offset to it anymore (which is what most passes want to do). This commit adds some helpers to add byte offsets (and adjusting offset_shift accordingly) so that individual passes don't have to worry about this. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	7cc09e9952	nir: add offset_shift intrinsic index For load/store intrinsics that take an offset, this specifies the amount the offset is shifted left to calculate the final offset: offset = (offset_src + base) << offset_shift This is useful for backends that have memory operations that use offset units other than bytes (i.e., where the shift is implicit). Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	ebea9ce825	nir: add nir_src_is_deref helper Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	65d559fcf6	tu: pass SSBO/UBO min alignment to SPIR-V frontend Values are taken from minStorageBufferOffsetAlignment and minUniformBufferOffsetAlignment. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:29 +00:00
Samuel Pitoiset	e10d955bc4	radv/ci: document a very recent ACO regression on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>	2025-08-20 06:31:15 +00:00
Samuel Pitoiset	eaaef8db5a	radv/ci: make radv-gfx1201-vkcts a pre-merge job Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>	2025-08-20 06:31:14 +00:00
Samuel Pitoiset	640aed5727	radv/ci: reduce the timeout for radv-gfx1201-vkcts Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>	2025-08-20 06:31:14 +00:00
Samuel Pitoiset	9b9f62125b	radv/ci: use 3 parallel jobs for radv-gfx1201-vkcts For pre-merge testing, it's required to be around 10 minutes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>	2025-08-20 06:31:14 +00:00
Samuel Pitoiset	d25952c3d3	radv/ci: update expected list of failures/flakes on GFX1201 50 runs in a row without any unexpected failures/hangs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>	2025-08-20 06:31:14 +00:00
Job Noorman	2a8c5ebc77	ir3: enable scalar predicates Enable the use of scalar predicates by marking predicate dsts as uniform when possible during instruction emission and in opt_predicates. Totals: Instrs: 48207402 -> 47967272 (-0.50%); split: -0.54%, +0.05% CodeSize: 101907026 -> 101768626 (-0.14%); split: -0.15%, +0.01% NOPs: 8386320 -> 8165410 (-2.63%); split: -2.88%, +0.25% MOVs: 1468853 -> 1470546 (+0.12%); split: -0.17%, +0.28% COVs: 823724 -> 823746 (+0.00%); split: -0.01%, +0.01% Full: 1716708 -> 1716767 (+0.00%); split: -0.00%, +0.01% (ss): 1113167 -> 1168194 (+4.94%); split: -0.15%, +5.09% (sy): 552317 -> 552288 (-0.01%); split: -0.10%, +0.09% (ss)-stall: 4013046 -> 4261336 (+6.19%); split: -0.11%, +6.30% (sy)-stall: 16741190 -> 16748983 (+0.05%); split: -0.17%, +0.22% STPs: 18895 -> 18901 (+0.03%); split: -0.02%, +0.05% LDPs: 23853 -> 23762 (-0.38%); split: -0.39%, +0.01% Preamble Instrs: 11506988 -> 11493425 (-0.12%); split: -0.12%, +0.01% Early Preamble: 121339 -> 121695 (+0.29%) Last helper: 11686328 -> 11628618 (-0.49%); split: -0.72%, +0.23% Cat0: 9241457 -> 9020508 (-2.39%); split: -2.62%, +0.22% Cat1: 2353411 -> 2354860 (+0.06%); split: -0.17%, +0.23% Cat2: 17468471 -> 17447932 (-0.12%); split: -0.12%, +0.00% Cat6: 515728 -> 515643 (-0.02%); split: -0.02%, +0.00% Cat7: 1637795 -> 1637789 (-0.00%); split: -0.05%, +0.05% Totals from 33275 (20.20% of 164705) affected shaders: Instrs: 30329487 -> 30089357 (-0.79%); split: -0.86%, +0.07% CodeSize: 59715922 -> 59577522 (-0.23%); split: -0.26%, +0.03% NOPs: 6265422 -> 6044512 (-3.53%); split: -3.86%, +0.33% MOVs: 1058197 -> 1059890 (+0.16%); split: -0.23%, +0.39% COVs: 427513 -> 427535 (+0.01%); split: -0.02%, +0.03% Full: 548495 -> 548554 (+0.01%); split: -0.01%, +0.02% (ss): 769340 -> 824367 (+7.15%); split: -0.21%, +7.36% (sy): 368276 -> 368247 (-0.01%); split: -0.14%, +0.13% (ss)-stall: 3076333 -> 3324623 (+8.07%); split: -0.15%, +8.22% (sy)-stall: 10740547 -> 10748340 (+0.07%); split: -0.27%, 
+0.34% STPs: 12872 -> 12878 (+0.05%); split: -0.02%, +0.07% LDPs: 20808 -> 20717 (-0.44%); split: -0.45%, +0.01% Preamble Instrs: 6354490 -> 6340927 (-0.21%); split: -0.22%, +0.01% Early Preamble: 15233 -> 15589 (+2.34%) Last helper: 8106631 -> 8048921 (-0.71%); split: -1.04%, +0.32% Cat0: 6888653 -> 6667704 (-3.21%); split: -3.51%, +0.30% Cat1: 1541452 -> 1542901 (+0.09%); split: -0.25%, +0.35% Cat2: 10963398 -> 10942859 (-0.19%); split: -0.19%, +0.00% Cat6: 265945 -> 265860 (-0.03%); split: -0.03%, +0.00% Cat7: 1164800 -> 1164794 (-0.00%); split: -0.07%, +0.07% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	cccb3ecc6a	ir3/opt_predicates: move some helpers up We'll need them earlier in the next commit. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	0223ab01b7	ir3/isa: add encoding for scalar predicates Predicate registers can be written from the scalar ALU by using a special cat2 encoding: if the dst is encoded as a0.c, the instruction will execute on the scalar ALU and write to p0.c. This commit follows the blob and disassembles scalar predicates as up0.c. The "u" presumably stands for "uniform". Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	25ab37ae5b	ir3: make backend aware of scalar predicates Predicate registers can be written from the scalar ALU by using a special cat2 encoding: if the dst is encoded as a0.c, the instruction will execute on the scalar ALU and write to p0.c. This commit makes the ir3 backend aware of scalar predicates. A new register flag (IR3_REG_UNIFORM) is added that can be used to mark predicate dsts as being written by the scalar ALU. For such dsts, the same synchronization rules apply as for shared registers written by the scalar ALU (e.g., (ss) is needed to read them from the vector ALU). Scalar predicates can be used in the early preamble, which makes control flow available there. In many ways, the backend treats IR3_REG_UNIFORM the same as IR3_REG_SHARED. A new flag was added because IR3_REG_SHARED is mainly used to denote a separate register file, not as a flag to indicate usage by the scalar ALU. Scalar predicates still use the normal predicate register file but allow it to be written from the scalar ALU. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	bd28a40bd4	ir3/legalize: don't special-case early-preamble a1 reads We can just generically read from the regmask. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	8760c36579	ir3: use shared srcs for demote/kill condition No reason to force vector srcs. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	dbfed965ae	ir3: use ir3_get_predicate for demote/kill Instead of duplicating its functionality. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>	2025-08-20 06:14:02 +00:00
Job Noorman	2158211eeb	ir3: allow shared srcs for ldc.k This works just fine and opens up a lot more opportunities for early preamble. Note that I haven't seen actual cases where the index is large enough to need a register but verified in computerator that it works. Totals: MaxWaves: 2377648 -> 2377666 (+0.00%) Instrs: 48207402 -> 48219491 (+0.03%); split: -0.01%, +0.03% CodeSize: 101907026 -> 101929790 (+0.02%); split: -0.01%, +0.03% NOPs: 8386320 -> 8392647 (+0.08%); split: -0.03%, +0.10% MOVs: 1468853 -> 1474439 (+0.38%); split: -0.19%, +0.57% Full: 1716708 -> 1716655 (-0.00%) (ss): 1113167 -> 1115183 (+0.18%); split: -0.05%, +0.23% (sy): 552317 -> 552334 (+0.00%); split: -0.10%, +0.10% (ss)-stall: 4013046 -> 4011814 (-0.03%); split: -0.10%, +0.06% (sy)-stall: 16741190 -> 16738674 (-0.02%); split: -0.20%, +0.19% Preamble Instrs: 11506988 -> 11422360 (-0.74%); split: -0.79%, +0.06% Early Preamble: 121339 -> 123955 (+2.16%) Last helper: 11686328 -> 11688700 (+0.02%); split: -0.01%, +0.03% Cat0: 9241457 -> 9248390 (+0.08%); split: -0.02%, +0.10% Cat1: 2353411 -> 2359061 (+0.24%); split: -0.12%, +0.36% Cat7: 1637795 -> 1637301 (-0.03%); split: -0.18%, +0.14% Totals from 5370 (3.26% of 164705) affected shaders: MaxWaves: 66838 -> 66856 (+0.03%) Instrs: 4127945 -> 4140034 (+0.29%); split: -0.08%, +0.37% CodeSize: 8376584 -> 8399348 (+0.27%); split: -0.08%, +0.35% NOPs: 892650 -> 898977 (+0.71%); split: -0.24%, +0.95% MOVs: 199423 -> 205009 (+2.80%); split: -1.42%, +4.22% Full: 76648 -> 76595 (-0.07%) (ss): 106018 -> 108034 (+1.90%); split: -0.56%, +2.46% (sy): 48427 -> 48444 (+0.04%); split: -1.10%, +1.13% (ss)-stall: 479348 -> 478116 (-0.26%); split: -0.80%, +0.54% (sy)-stall: 1880900 -> 1878384 (-0.13%); split: -1.81%, +1.68% Preamble Instrs: 1096452 -> 1011824 (-7.72%); split: -8.34%, +0.62% Early Preamble: 0 -> 2616 (+inf%) 
Last helper: 1313193 -> 1315565 (+0.18%); split: -0.10%, +0.29% Cat0: 992161 -> 999094 (+0.70%); split: -0.23%, +0.93% Cat1: 234329 -> 239979 (+2.41%); split: -1.21%, +3.62% Cat7: 118722 -> 118228 (-0.42%); split: -2.42%, +2.00% The regressions in NOPs/MOVs seem to be cases of bad luck in RA/scheduling. I looked at a couple of cases and the main shader is essentially the same before RA. It's a bit unfortunate the differences in the preamble can have such an impact on the main shader... Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36673>	2025-08-20 05:10:23 +00:00
Tapani Pälli	ef09df004e	compiler/types: handle BFLOAT16 when decoding blob The new type was not handled in the switch, which led to hitting the following assert when running tests with pipeline cache: deqp-vk: ../src/compiler/glsl_types.c:3334: decode_type_from_blob: Assertion `!"Cannot decode type!"' failed. Fixes: `9e5d7eb88d` ("compiler/types: add a bfloat16 type") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36833>	2025-08-20 04:12:00 +00:00
Kovac, Krunoslav	9452f2ca3f	amd/vpelib: Minor Refactor [WHY] There will be more conditions for bypassing degamma, so refactor. Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Chan, Roy	dda6a76b54	amd/vpelib: check stream_count as well before accessing streams [WHY] It was found that the caller may call with stream_count = 0 while the streams array contains garbage. This randomly ends up with output_ctx being modified, leading to validation failure. [HOW] Add a check on stream_count. Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Roy Chan <Roy.Chan@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Zhao, Jiali	2b50600a71	amd/vpelib: Extend TMZ value to 8 bit Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Jiali Zhao <Jiali.Zhao@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Ansari, Muhammad	c26cf7f74d	amd/vpelib: VPE Events [WHY] For further debugging, we need to know about the build cmd variables. [HOW] Added these input and output parameters to vpe events. Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Muhammad Ansari <Muhammad.Ansari@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Leder, Brendan Steve (Brendan)	a486404e4d	amd/vpelib: General cleanup / optimization tasks Various small optimizations that have been accumulating, deal with them in one commit: - Add erase functionality for vector util, remove memsets for time opt. - Update should_gen_cmd_info to take in any stream variables. - Program funcs should directly program - update mpcc mux hook func to take in blend_mode. - Add reserved bits for debug flags. Signed-off-by: Brendan Steven, Leder <BrendanSteven.Leder@amd.com> Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Okenczyc, Andrzej	e5cdc78e0e	amd/vpelib: Move predication size calculation to bufs_req Calculation for the worst case scenario in bufs_req should also include predication command size. Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Andrzei Okenczyc <Andrzej.Okenczyc@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Assadian, Navid	fbeaca1202	amd/vpelib: Add necessary pointer casting Add necessary pointer casting to prevent unexpected behavior Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Navid Assadian <Navid.Assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
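
Illustration for the offset_shift series (7cc09e9952 through 7752cc26c4). The sketch below is not Mesa code; it only works through the arithmetic stated in the commit messages. The first helper evaluates the quoted formula offset = (offset_src + base) << offset_shift. The remaining helpers show, on one hand-picked pair of values, why pushing a ushr into an addition can give a different result than shifting the sum when an intermediate value is negative, while a signed shift agrees. All function names are hypothetical, and the signed variant uses division by a power of two on aligned values as a portable stand-in for ishr.

```c
#include <assert.h>
#include <stdint.h>

/* Final byte offset, following the formula quoted in 7cc09e9952:
 *   offset = (offset_src + base) << offset_shift
 * Hypothetical helper, not a Mesa function. */
static uint32_t final_byte_offset(uint32_t offset_src, uint32_t base,
                                  unsigned offset_shift)
{
   return (offset_src + base) << offset_shift;
}

/* Reference result: add the byte offsets first, then shift the sum. */
static uint32_t shift_after_add(uint32_t a, uint32_t b, unsigned shift)
{
   return (a + b) >> shift;
}

/* A logical shift (ushr) pushed into each addend. */
static uint32_t ushr_pushed_into_add(uint32_t a, uint32_t b, unsigned shift)
{
   return (a >> shift) + (b >> shift);
}

/* A signed shift (ishr) pushed into each addend. Division is exact here
 * because the example values are multiples of (1 << shift). */
static uint32_t ishr_pushed_into_add(int32_t a, int32_t b, unsigned shift)
{
   int32_t scale = (int32_t)(1u << shift);
   return (uint32_t)(a / scale + b / scale);
}

/* Works through the two cases with concrete numbers. */
void run_examples(void)
{
   /* (3 + 1) << 2: element offset 4 with 4-byte elements is byte 16. */
   assert(final_byte_offset(3, 1, 2) == 16);

   /* Intermediate term -4 plus 8 gives the byte offset 4, i.e. element 1. */
   uint32_t a = (uint32_t)-4; /* 0xFFFFFFFC after unsigned wraparound */
   uint32_t b = 8;
   assert(shift_after_add(a, b, 2) == 1);

   /* ushr on the negative term loses the sign: 0x3FFFFFFF + 2. */
   assert(ushr_pushed_into_add(a, b, 2) == 0x40000001u);

   /* Treating the term as signed gives -1 + 2 = 1, matching the reference. */
   assert(ishr_pushed_into_add(-4, 8, 2) == 1);
}
```

Calling run_examples() from a main function exercises all four checks. The mismatch in the third check is the kind of case the message of 7752cc26c4 describes when it says that pushing a ushr into additions turned out to be unsound.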

210541 commits