fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 11:48:05 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	624dd006f4	anv: add lowering of descriptor heap intrinsics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Lionel Landwerlin	f309f0b1a0	intel: add resource intrinsic support for heaps Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Lionel Landwerlin	25bc517ef5	brw: add heap support to brw_lower_storage_image Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Lionel Landwerlin	7b95d82240	anv: split sampler state packing from API object creation We'll reuse the state packing somewhere else later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Lionel Landwerlin	5ec7d31e20	brw/lower_texel_address: add heap support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Konstantin Seurer	04463fe91e	vulkan: Rename radix_sort to radix_sort_u64 Preparation for optionally building with 96bit radix sort. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>	2026-05-04 20:42:49 +00:00
José Roberto de Souza	cbc1ec206d	intel: Add support for madvise purgeable VMAs in Xe KMD Initially this uAPI was part of the first public version of the Xe KMD uAPI, but as it did not have any users it was removed in some of the fix releases of the Linux version that added Xe KMD, and I missed updating the comment in Mesa. At that time this uAPI had a restriction that did not allow us to use it: it was not compatible with VMs created with DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE. Now this flag is supported, so implement it here. Link: https://patchwork.freedesktop.org/series/156651/ Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40573>	2026-05-04 20:11:23 +00:00
Marek Olšák	f583f6e717	nir: use nir_build_frag_coord everywhere nir_build_frag_coord generates the correct sysval loads based on NIR options. nir_load_frag_coord shouldn't be used directly because drivers don't have to support it. v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:01 +00:00
Yiwei Zhang	2065c589c0	intel: use stable NDK __android_log_print helper The NDK API __android_log_print has been available since API level 3 and is preferred because the NDK API is more stable. Acked-by: Valentine Burley <valentine.burley@collabora.com> Reviewed-by: Dhruv Mark Collins <mark@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41254>	2026-05-01 20:23:23 +00:00
Calder Young	ebe835e94c	intel_hang_replay: Don't force scratch page on Xe KMD unless explicitly requested Added a --scratch flag instead of always forcing the scratch page enabled; this allows the hang replay tool to be used to debug page faults. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	04bfdb287b	anv: Disable scratch page by default on Xe KMD Page faults will now cause the device to be lost instead of being ignored. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	4120ae4963	brw: Avoid vectorizing loads in NIR if it could extend into a different page Took inspiration from RADV to make nir_opt_load_store_vectorize robust against page faults, by checking the align_offset and align_mul to see if any extra components could be overlapping into a different page. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	3ac6233655	brw: Avoid rounding every convergent block load up to a full register To simplify things, our backend rounds convergent block loads up to a full register. This causes page faults with the scratch page disabled since the address is not always aligned to a register size. Loading smaller blocks is slightly more difficult because the SEND instruction can only write back a multiple of full registers, even if the actual data is smaller. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	8ce98fedc4	anv: Make sure robust UBO access does not fault We can just conditionally replace the address with an address to a zero initialized cacheline if the read is going to go out of bounds. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	64b5823d33	blorp: Work around sampler overfetch for buffer copies First, the surface dimensions are used to determine the range of valid pages that the data in the buffer overlaps, then rows are removed from the surface until it does not overfetch into any neighboring pages. If any rows were removed, an extra BTI is set up with a texel buffer that views the contents of all the rows that were removed, and the shader is compiled with a branch to sample the last rows through the texel buffer instead of the main surface. Using the texel buffer allows it to access the last rows without dealing with overfetch or weird alignment hacks, and restricting texel buffer usage to just the part of the surface that can't be accessed safely ensures that we don't significantly impact performance for any buffer to image copy that is unlucky enough to be close to a page boundary. Co-authored-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	fd7c094f7b	isl: Add and use isl_tiling_get_intratile_range_el/sa Consolidates the logic for calculating the intratile extent of a slice of a surface to avoid duplicating code in the next patch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	f5c848ef57	isl: Add function to calculate the amount of overfetch for an unpadded surface Adds a function to calculate the total size of a 2D linear sampling engine surface, including overfetch, for a buffer to image copy. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	3cd9b14c80	isl: Optimize the sampler cache to overlap as few 64B cachelines as possible Since we now have an ISL_SURF_USAGE_NO_OVERFETCH_PADDING_BIT flag to turn extra padding calculations on and off, we can align the row pitch of linear surfaces that are accessed through the sampler to minimize the number of L3 cachelines that each sampler cacheline overlaps for added efficiency. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	8d13628f7f	isl: Add additional alignment/padding requirements to prevent overfetch Bspec 58779 describes various cases where additional padding is required on the bottom and right sides of a sampling engine surface to avoid page faults. Since we don't want to mess up the other drivers that also use ISL, there's now a requires_padding boolean in isl_dev that can be used to enable/disable the extra padding calculations per device and driver. The extra padding can also be disabled per-surface by adding the usage flag ISL_SURF_USAGE_NO_OVERFETCH_PADDING_BIT, like when a specific row pitch is needed. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	aee9602fea	isl: Add usage flag to force SurfaceArray to false When sampling BUFFER, 1D, or 2D surfaces, with no MSAA, no mipmap levels, linear tiling, and SurfaceArray set to false, the surface padding requirements are relaxed and it's much easier to use the sampler to do buffer-to-image copies in BLORP. We can't have it like this by default though because we need SurfaceArray true for robustness. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	bd88042f57	anv: Add padding to the shader heap to manage EU prefetch Like the command streamer, the EUs will also blindly prefetch up to 3.5KiB ahead of a shader. We can manage this in the shader heap by adding the required padding when we allocate the buffers to back a shader allocation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:40 +00:00
Calder Young	5fb78a26db	anv: Store batch buffers in a null-initialized VMA heap The command streamer will blindly prefetch up to 4KiB ahead of a batch buffer depending on the engine. To avoid page faults with the scratch page disabled, we can create a special VMA heap for batch buffers that has pages initialized with the null tile bit by default. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:40 +00:00
Calder Young	6aabe5482e	anv: Fix support for indirect SBTs on Xe3+ Fixes: `6deb195` ("anv: Update RT dispatch globals to use 64bit data structure") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41004>	2026-05-01 00:18:23 +00:00
Calder Young	8f7309d9a9	anv: Fix address bit masking for indirect SBTs Fixes: `ce68824` ("anv: fix invalid masking of 48bit address") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41004>	2026-05-01 00:18:23 +00:00
Paulo Zanoni	8ced368644	anv: don't silently convert view ranges from u64 to u32 then u64 Both anv_buffer_view->vk.range and VkDescriptorAddressInfoEXT->range are VkDeviceSize, which is uint64_t. In Anv, we pass this to align_down_npot_u32(), anv_fill_buffer_surface_state() and anv_fill_buffer_view_surface_state(), all of which convert it down to uint32_t. Then we call isl_buffer_fill_state(), converting the value back to uint64_t as size_B. Remove the intermediate u32 truncation everywhere. If some place does not accept values bigger than UINT_MAX, it is that place that should have a check. We shouldn't silently convert a u64 value to u32 and then back to u64. I'm not aware of any workloads that are affected by this bug today. Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41251>	2026-04-30 21:32:23 +00:00
Paulo Zanoni	4167b7d51f	intel/isl: warn about excessive num_elements only once Commit `f3c7e14f09` ("isl: don't assert(num_elements > (1ull << 27))") replaced an assert(num_elements <= (1 << 27)) with a mesa_logw(). At that time, the only games I knew that printed this message (Marvel's Spider-Man Remastered and Assassin's Creed: Valhalla) only printed it a few times during startup. It turns out that The Last Of Us Part II Remastered constantly prints this message during gameplay. Downgrade it to mesa_logw_once() so we don't spam the terminal, don't fill disks with log messages and don't make things slower in general. Fixes: `f3c7e14f09` ("isl: don't assert(num_elements > (1ull << 27))") Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41251>	2026-04-30 21:32:23 +00:00
Paulo Zanoni	c4b6df29bf	intel/isl: fix assert when surf->size_B is > UINT_MAX I have some local tests for Sparse Resources that I wrote when I was working on that for Anv. One of them tries to create a sparse buffer with size 4294967296 (which doesn't fit in a uint32_t). Without this patch, the right side of the assertion overflows and we get: sparse: ../../src/intel/isl/isl.c:3787: isl_surf_from_mem: Assertion `surf->size_B == surf->row_pitch_B * extent.h * extent.a' failed Fixes: `fcdae4d4c0` ("intel: Add and use isl_surf_from_mem()") Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41253>	2026-04-30 21:07:02 +00:00
Caio Oliveira	1ebc14bcb9	brw: Stop tracking inline parameter usage in prog_key/prog_data Since inline parameter is the last field of the thread payload, the backend can always assume they may exist. They won't affect the position of other payload fields and the register allocator will reuse any unused space. In Anv, also update EmitInlineParameter for Task/Mesh/CS to reflect previous changes in inline parameter setup. Remove/Update some stale comments since we are here. Finally, remove the prog_key/prog_data bits that tracked whether inline data or a push address was needed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41230>	2026-04-30 16:39:22 +00:00
Samuel Pitoiset	f2ce2868c5	ci: uprev vkd3d This contains new tests for DGC+multiview which are valid in DX12 but invalid in Vulkan, unless RADV allows support for it. Important to have coverage for us because it's used for Crimson Desert. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41193>	2026-04-30 15:00:02 +00:00
Lionel Landwerlin	b795a1a20c	intel/tools: add eu stall viewer Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>	2026-04-30 10:59:45 +00:00
Lionel Landwerlin	d595529475	imgui: update copy and port all tools using it Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>	2026-04-30 10:59:45 +00:00
Lionel Landwerlin	0a965c0bce	anv: add a shader-dump debug option Will use this with EU stall monitor. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>	2026-04-30 10:59:45 +00:00
Lionel Landwerlin	3951a00d86	anv: reorder debug options Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>	2026-04-30 10:59:43 +00:00
Lionel Landwerlin	5a462d77ff	anv: remove a bunch of KHR alias uses Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41233>	2026-04-30 09:04:01 +00:00
Lionel Landwerlin	4c7948ec0d	anv: stop using queue priority KHR aliases Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41233>	2026-04-30 09:04:01 +00:00
Lionel Landwerlin	dad8f65611	anv: fix null pointer access Reproduces with dEQP-VK.pipeline.no_queues.pipeline_binary.compute Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `595889018a` ("anv: implement VK_KHR_maintenance9") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41233>	2026-04-30 09:04:01 +00:00
Caio Oliveira	e1745e0bd9	brw: Fix max_dispatch_width collection for CS with variable size The intention of the original commit was to make all the shaders report the same max_dispatch_width. When CS has multiple variants, this was not happening as expected. Fixes: `2acc2f18ea` ("intel/compiler: report max dispatch width statistic") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41209>	2026-04-29 15:52:04 +00:00
Alyssa Rosenzweig	a78634ccb0	jay/to_binary: rename grf -> phys_reg since it covers accumulators too Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	ab87a035c9	jay: drop a bunch of stale TODO and XXX These are either done, or never going to be done, or otherwise stale or silly or unnecessary. Drop a bunch. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	70d09d97ef	jay: predicate NoMask instructions in uniform IF's Totals: Instrs: 4742391 -> 4742257 (-0.00%) CodeSize: 70245120 -> 70243520 (-0.00%); split: -0.00%, +0.00% Totals from 81 (3.06% of 2647) affected shaders: Instrs: 337727 -> 337593 (-0.04%) CodeSize: 4992992 -> 4991392 (-0.03%); split: -0.03%, +0.00% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	f199f00564	jay: adjust flag replication Now instructions still read/write UFLAG, which preserves the information about lane 0 we need for proper predication etc. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	930d36b54a	jay: smarten predication pass Merge the empty else optimization, the then-block predication, and the break-while fusion into a unified "try to predicate each side of an if, peephole optimizing control flow" optimization. This is simpler and more general. Totals: Instrs: 4783809 -> 4775647 (-0.17%) CodeSize: 70766656 -> 70674064 (-0.13%); split: -0.13%, +0.00% Totals from 1109 (41.90% of 2647) affected shaders: Instrs: 4130644 -> 4122482 (-0.20%) CodeSize: 61180848 -> 61088256 (-0.15%); split: -0.15%, +0.00% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	80081ef7b2	jay: check for inverse-ballots in jay_uses_flag Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	86f19bc983	jay: propagate inverse-ballots only locally Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	d7283a25d7	jay: do not copyprop ballots globally Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	5828b66b65	jay: convert to LCSSA for correctness with loops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	fed6b7bea0	jay: drop UGPR->UMEM spilling path This is totally broken now that we have a physical CFG for UGPRs. And of course, UGPRs generally were totally broken without the physical CFG. So I conclude this code basically never worked. Which is good because it was also basically always dead too. Just delete it and replace with a clear error message, instead of pretending it works and either randomly splatting validation or just straight up miscompiling silently or whatever. We might need an alternative UGPR->GPR spill path some day but that day is not today. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	ad040f2fbb	jay: introduce a physical control flow graph Consider: u0 = foo() if (divergent) { u0 = bar() r0 = baz(u0) } else { r0 = quux(u0) } Logically, this is fine, there is no interference between bar() and u0. But physically, both sides of the if execute so the bar() write to u0 overwrites the variable the else reads. So this is a miscompile. The solution is to model the extra edges in the physical control flow graph, which lives next to the existing logical control flow graph. Liveness for UGPRs now follows the physical CFG, while liveness for GPRs continues to follow the logical CFG. That models the interference properly, while still allowing phis to work as before (since phis writing UGPRs follow uniform bits of control flow that are necessarily critical edge free for the same reason the logical CFG is). Because our RA copies shuffled registers back at block ends (following Colombet), there's no issue with live range splits here (unlike aco which inserts phis for this case and then needs to worry about critical edges around those phis). There might still be an extremely-challenging-to-hit bug here with UGPR spilling which I need to think more about. It might be fine as-is? Not convinced though. But this is big enough and strictly less broken than what we have right now and the full solution will build on this, so here we are. Fixes artifacting in SuperTuxKart and Celestia knows what else. Totals: Instrs: 2770938 -> 2771269 (+0.01%); split: -0.00%, +0.02% CodeSize: 40133712 -> 40138480 (+0.01%); split: -0.01%, +0.02% Totals from 158 (5.97% of 2647) affected shaders: Instrs: 514523 -> 514854 (+0.06%); split: -0.02%, +0.09% CodeSize: 7603040 -> 7607808 (+0.06%); split: -0.03%, +0.09% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	fadb826515	jay/opt_propagate: disable f64 opts for now could be done but would need more work. No stats change. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig	8e4145948f	jay/opt_propagate: fold uflag copies Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>	2026-04-28 23:13:50 +00:00
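
Commit 4120ae4963 above describes checking align_mul and align_offset to decide whether a vectorized load could extend into a different page. The following is only a minimal sketch of that kind of check, not the code from the commit: the helper name, the 4 KiB page size, and the exact condition are assumptions made for illustration. It relies on the address satisfying address % align_mul == align_offset, with align_mul a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this sketch; the commit message does not state it. */
#define SKETCH_PAGE_SIZE_B 4096u

/*
 * Hypothetical helper, not Mesa's actual implementation.
 * Returns true if a load of size_B bytes might touch a second page, given
 * only that (address % align_mul) == align_offset.
 */
static bool
load_may_cross_page(uint32_t align_mul, uint32_t align_offset, uint32_t size_B)
{
   /* This sketch requires align_mul to be a nonzero power of two. */
   assert(align_mul != 0 && (align_mul & (align_mul - 1)) == 0);

   /* Alignment beyond one page says nothing more about page crossing. */
   uint32_t window = align_mul < SKETCH_PAGE_SIZE_B ? align_mul
                                                    : SKETCH_PAGE_SIZE_B;

   /* Offset of the first loaded byte inside its aligned window. */
   uint32_t start = align_offset % window;

   /* A load that fits inside the window stays within one page. Otherwise
    * the window boundary it crosses could also be a page boundary. */
   return (uint64_t)start + size_B > window;
}
```

Under these assumptions, a 16-byte load with align_mul 16 and align_offset 0 is reported as safe, while the same load at align_offset 8 is reported as possibly crossing, so a vectorizer using such a check would decline to widen it.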

15978 commits
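
Commit c4b6df29bf above reports that the right side of the assertion `surf->size_B == surf->row_pitch_B * extent.h * extent.a` overflows for a 4294967296-byte buffer. The sketch below illustrates that kind of 32-bit wraparound and one common way to avoid it. The function names, the uint32_t operand types, and the 65536 x 65536 x 1 split of the size are hypothetical choices for illustration, and the widening cast is not necessarily the fix the commit applied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the product with all operands kept 32-bit. */
static uint32_t
surf_size_u32(uint32_t row_pitch_B, uint32_t h, uint32_t a)
{
   /* Unsigned 32-bit multiplication wraps modulo 2^32. */
   return row_pitch_B * h * a;
}

/* The same product, widened before multiplying. */
static uint64_t
surf_size_u64(uint32_t row_pitch_B, uint32_t h, uint32_t a)
{
   /* Casting the first operand makes the whole product 64-bit. */
   return (uint64_t)row_pitch_B * h * a;
}

/* Mirrors the shape of the assertion, using the widened product. */
static void
check_surf_size(uint64_t size_B, uint32_t row_pitch_B, uint32_t h, uint32_t a)
{
   assert(size_B == surf_size_u64(row_pitch_B, h, a));
}
```

With a row pitch of 65536 bytes, a height of 65536 and an array length of 1, the 32-bit product wraps to 0 while the widened product equals 4294967296, so only the widened comparison can match a 64-bit size_B.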