fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-06 11:38:05 +02:00

Author	SHA1	Message	Date
Yogesh Mohan Marimuthu	6e813b99af	winsys/amdgpu: wait for vm syncobj before creating userq Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	2a499412e5	winsys/amdgpu: pass job fences to VM ioctl In case of userq, fences are not installed in the kernel; they are handled internally in mesa. So when unmapping a buffer, mesa has to pass the fences to the kernel so that the kernel can wait on them before unmapping the buffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	37b217b0fb	winsys/amdgpu: destroy bo_fence_lock late in do_winsys_deinit() In case of userq, when destroying a bo, fences are gathered and passed to the kernel. Fences are gathered using bo_fence_lock. In do_winsys_deinit(), bo_cache is currently destroyed after bo_fence_lock, which leads to a crash. Fix this by destroying bo_fence_lock later in do_winsys_deinit(). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	30e95cfd87	winsys/amdgpu: use timeline syncobj for userq vm operations With the kernel queue method of job submission, the buffer list for the job is passed to the amdgpu_cs ioctl, so the kernel can ensure that VM mapping is completed before submitting the job. With user queues the amdgpu_cs ioctl is not called, so the kernel can't determine automatically when a BO should be prepared for submission. To achieve this, a timeline syncobj is attached to the gem_va ioctls, which can then be used as a dependency for future jobs. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	7c1ba1078b	winsys/amdgpu: use bo_va_op_raw() function instead of bo_va_op() This will make it easy when adding timeline syncobj parameter for user queue. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	94c41852bd	ac: add inherit vmid field to indirect buffer packet Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	cda75d6497	ac: add new userq signal and wait packet id Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	086741b3ae	winsys/amdgpu: call userq init and destroy functions Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	093cf74b26	ac/gpuinfo: add use_userq and AMD_USERQ variable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	0182629411	winsys/amdgpu: add userq helper functions This patch adds init(), deinit(), and ring packet helper macros and functions for userq. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	48ea133c97	winsys/amdgpu: add CLEAR_VRAM flag to zero vram when creating bo Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	468ea03c6e	winsys/amdgpu: add DOORBELL domain to bo Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Shashank Sharma	42d49faee5	amd: add new AMDGPU_INFO subquery for userqueue metadata This patch: - adds a new subquery (AMDGPU_INFO_UQ_FW_AREAS) in AMDGPU_INFO_IOCTL to get the size and alignment of shadow and csa objects from the kernel. This information is required for a userqueue consumer (like MESA/libdrm) to create the userqueue metadata objects properly. - also adds supporting metadata structures and a high level wrapper function (amdgpu_query_uq_metadata_info) to the query, to make it easy to use. The corresponding kernel changes for this UAPI extension can be found in amd-gfx mailing list, link: https://patchwork.freedesktop.org/patch/621390/?series=139715&rev=2 This patch adds support only for the GFX IP, and the other engines may be supported in subsequent development. This patch was reviewed in libdrm library at https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/400 Cc: Marek Olsak <marek.olsak@amd.com> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Arvind Yadav <arvind.yadav@amd.com> Reviewed-by: Marek Olsak <marek.olsak@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Arvind Yadav	b0a70da496	amd: Add amdgpu userqueue IOCTL functions This patch adds new IOCTL functions to support userqueue create, remove, signal and wait etc. This patch was reviewed in libdrm library at https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 Cc: Deucher, Alexander <alexander.deucher@amd.com> Cc: Koenig, Christian <christian.koenig@amd.com> Cc: Sharma, Shashank <shashank.sharma@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	3981b017eb	amd: include amdgpu_drm.h from mesa instead of system for ac_fake_hw_db.h Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	367856bc72	amd: update amdgpu_drm.h for new userq ioctl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:05 +00:00
Lucas Stach	e6b018c9dd	etnaviv: stall after RS/BLT operation when draw_stall debug option is enabled RS and BLT operations can exhibit issues in some cases. To help in debugging such issues stall after RS and BLT operations when ETNA_MESA_DEBUG=draw_stall is enabled. In that case the FE will point right at the faulty RS/BLT operation, instead of the next stall which may be many state loads later. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32444>	2024-12-03 11:18:45 +00:00
Jose Maria Casanova Crespo	a5485a9414	v3d: Don't load/store if rasterizer discard is enabled This moves the tlb job load/store logic to the new helper v3d_update_job_tlb_load_store. Then an early return is included so if the rasterizer discard is enabled, no load/stores are emitted because of the draw call. This helps in situations where transform feedback is used and there is only interest in the geometry results. We identified that some jobs were not rendering at all, but they were having the performance cost of doing several loads and stores. This generates a huge performance improvement on manhattan benchmarks. fps_avg helped: gl_gfxbench_manhattan.trace: 8.37 -> 11.54 (37.85%) fps_avg helped: gl_gfxbench_manhattan31.trace: 6.02 -> 7.51 (24.62%) total fps_avg in affected (through threshold) runs: 14.39 -> 19.04 (32.32%) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32351>	2024-12-03 10:56:17 +00:00
Samuel Pitoiset	9535f27d8f	radv/ci: mark few tests as expected failures RADV is the only driver in Mesa CI to use VKCTS main but it doesn't recognize 1.4 correctly yet. This will be fixed with a VKCTS uprev. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	40f95c63f4	radv: bump VKCTS conformance version to 1.4.0.0 for some GFX8+ GPUs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	00afc4e353	radv: advertise Vulkan 1.4 on GFX8+ GFX6-7 can't support Vulkan 1.4 because indexTypeUint8 isn't supported in hardware, and emulating features for very old hardware isn't the option I would personally choose. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	db61d45b94	radv: add new Vulkan 1.4 features/properties Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	75691dd59c	radv: promote VK_EXT_pipeline_robustness to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	7892e8600b	radv: promote VK_KHR_shader_subgroup_rotate to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	8c2ff0a80b	radv: promote VK_KHR_push_descriptor to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	e20d5173fd	radv: promote VK_KHR_map_memory2 to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	5b9ebe331c	radv: promote VK_KHR_maintenance6 to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	54cd43f93e	radv: promote VK_KHR_maintenance5 to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	81798d9ebe	radv: promote VK_KHR_line_rasterization to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	5917f70a6e	radv: promote VK_KHR_index_type_uint8 to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:55 +00:00
Samuel Pitoiset	64101baecf	radv: promote VK_KHR_global_priority to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:54 +00:00
Samuel Pitoiset	ac26c5af52	radv: promote VK_KHR_dynamic_rendering_local_read to core 1.4 API Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32432>	2024-12-03 10:21:54 +00:00
Samuel Pitoiset	a437af59fc	zink/ci: skip one more modifier test on POLARIS10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32453>	2024-12-03 08:46:00 +00:00
Samuel Pitoiset	3d804851be	radv: try to detect use-after-free with address binding report This performs some very basic verifications with the faulty VA we get from the kernel. This will probably be improved over time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32403>	2024-12-03 08:13:13 +00:00
Samuel Pitoiset	1b68a92c59	radv: dump address binding report with RADV_DEBUG=hang This contains much more info than the BO history from the winsys and it will be helpful for debugging. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32403>	2024-12-03 08:13:13 +00:00
Samuel Pitoiset	1ae6fcfbaf	radv: add a small helper to dump VM fault with the GPU hang report Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32403>	2024-12-03 08:13:13 +00:00
Samuel Pitoiset	f8af89aaa0	radv: add address binding report support for BOs imported with a ptr Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32403>	2024-12-03 08:13:13 +00:00
Samuel Pitoiset	723cbc95d8	radv: add address binding report support for BOs imported with a fd Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32403>	2024-12-03 08:13:13 +00:00
Deborah Brouwer	caa6ccd7d6	ci: move pipeline_summary tool to .marge/hooks Move the tool to summarize a failed pipeline to a generic .marge/hooks directory. This will allow the fdo-bots repo to handle all marge hooks in a consistent way across repositories that use this service. Add a symlink to the bin/ci directory so that the pipeline summary tool can still be run locally as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32413>	2024-12-02 19:22:59 -08:00
Timothy Arceri	fd431a5b71	glsl: drop unused ir_equals.cpp Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32448>	2024-12-03 02:46:39 +00:00
Kenneth Graunke	6fd10a6620	brw: Tune vectorizer conditions to allow overfetching with holes Notably, our convergent block loads were already overfetching - we rounded up to block sizes of 8, 16, 32, or 64 (LSC-only). But we did so in the backend, rather than NIR. With recent changes, nir_opt_load_store_vectorizer allows holes of up to 28 bytes (7 components at 4 bytes each). This allows us to detect cases where we did a convergent block load for 1 component (but loaded a whole vec8), then another load for the next vec8, and combine them into a single V16 load. Single component loads aren't the most common, but convergent loads of a vec2 in one group and a vec3 in another are quite common, and it makes no sense to do V8+V8 loads instead of V16. For non-block loads, we allow a max hole of 4 bytes. This allows the common case of XYZ_ + XYZ_ loads (where the last component is unread) to combine into a single larger load. fossil-db results on Lunarlake: Totals: Instrs: 146692608 -> 146246432 (-0.30%); split: -0.33%, +0.02% Subgroup size: 11100528 -> 11100512 (-0.00%) Send messages: 7003425 -> 6862529 (-2.01%); split: -2.01%, +0.00% Cycle count: 22396273274 -> 22523048654 (+0.57%); split: -1.08%, +1.64% Spill count: 67671 -> 67594 (-0.11%); split: -1.59%, +1.48% Fill count: 128999 -> 130223 (+0.95%); split: -1.73%, +2.68% Scratch Memory Size: 5986304 -> 6042624 (+0.94%); split: -1.40%, +2.34% Max live registers: 48898858 -> 48881655 (-0.04%); split: -0.05%, +0.01% Non SSA regs after NIR: 172397792 -> 167577380 (-2.80%); split: -2.80%, +0.00% Totals from 451003 (80.87% of 557667) affected shaders: Instrs: 134111754 -> 133665578 (-0.33%); split: -0.36%, +0.03% Subgroup size: 9039104 -> 9039088 (-0.00%) Send messages: 6127775 -> 5986879 (-2.30%); split: -2.30%, +0.00% Cycle count: 20306336726 -> 20433112106 (+0.62%); split: -1.19%, +1.81% Spill count: 56230 -> 56153 (-0.14%); split: -1.92%, +1.78% Fill count: 112920 -> 114144 (+1.08%); split: -1.97%, +3.06% Scratch Memory Size: 3769344 -> 3825664 (+1.49%); split: -2.23%, +3.72% Max live registers: 43750259 -> 43733056 (-0.04%); split: -0.05%, +0.01% Non SSA regs after NIR: 158449343 -> 153628931 (-3.04%); split: -3.04%, +0.00% In particular, sends get cut by 20.85% for Borderlands 3 DX12, 13.82% on Cyberpunk 2077, 10.75% on Strange Brigade, and 10.20% on Red Dead Redemption 2. Yet, spill/fills remain about the same. fossil-db results on Alchemist are similar though not quite as good. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Kenneth Graunke	f88eb48ff2	anv: Don't consider nir_var_mem_global for vectorizer robustness checks nir_opt_load_store_vectorize checks for potential address wrapping when vectorizing two loads ("low" and "high"). It looks for cases where "low" might have a large address, and "high" has a positive offset which, when added together, could trigger integer wraparound. The issue here is that if the large address of "low" was considered out-of-bounds, adding offset could wrap around to a small address, which might actually be in-bounds. Thus, when loaded separately, "low" will fail and trigger robustness out-of-bound-read behavior, but "high" would read correctly. When vectorized, the entire load would fail. This is explicitly tested for with 32-bit SSBO addresses in the Vulkan CTS. However, anv's 64-bit global addresses and VMA handling effectively prevent this case. Addresses 0-4095 are a reserved page so that if people try to use 0 as a NULL pointer, it never maps to a valid BO. That alone guarantees that the above case where "high" gets a small address would never be in-bounds, so we don't need to check for it. In fact, we allocate most user allocations out of high addresses, and have specialized allocation heaps for certain types of GPU data structures in the lower GB of memory. For a load to wrap around and successfully land in the right heap, it would have to load gigabytes. Disabling this allows load vectorization and overfetching in more cases. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Kenneth Graunke	5712fc48a9	nir: Allow large overfetching holes in the load store vectorizer The load_*_uniform_block_intel intrinsics always load either 8x or 16x 32-bit components worth of data (so 32 byte increments). This leads to cases where we load a few components from one vec8, followed by a few components of an adjacent vec8. We want to combine those into a vec16 load, as that loads a whole cacheline at a time, and requires less hoops to calculate addresses and request memory loads. So, we allow 7 * 4 = 28 bytes of holes, which handles vec8+vec8 where only the .x component is read. Most drivers and intrinsics will not want such large holes. I thought about adding a per-intrinsic max_hole to the core code, but decided that since we already have driver callbacks, we can just rely on them to reject what makes sense to them. No driver callbacks currently allow holes, so this should not currently affect any drivers. But any work in progress branches may need to be updated to reject larger holes. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Kenneth Graunke	01680a66a9	brw: Simplify choose_oword_block_size_dwords() Just calculate the block size using util_logbase2() - it's simpler. Also drop the name "oword" as this refers to legacy HDC messages, rather than the newer LSC "vector size" field. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Kenneth Graunke	e8c85f8476	brw: Only consider components read for UBO push analysis Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Kenneth Graunke	e703ff5e02	brw: Only consider components read for UBO loads This will matter more with overfetching, where we may suggest loading additional data that we don't actually need for vectorization purposes. We want to make sure that push ranges have the data we actually need; any extra padding is irrelevant. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:33 +00:00
Kenneth Graunke	da93b13f8b	brw: Use nir_combined_align in brw_nir_should_vectorize_mem Better than open-coding this. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:32 +00:00
Kenneth Graunke	8c795af0b8	brw: Drop a few crocus references in comments crocus no longer uses brw. It uses elk. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:32 +00:00
Kenneth Graunke	46af23649c	brw: Drop "regular uniform" concept from UBO push analysis i965 used to upload its own regular GL uniforms and push those in addition to UBO ranges. st/mesa instead uploads regular uniforms and presents those to us as UBO 0. So this really isn't a thing anymore. nir_intrinsic_load_uniform is still used today but it represents Vulkan push constants. anv_nir_compute_push_layout already takes care of ensuring too many ranges aren't present, so it doesn't need the pass to do so. iris doesn't use this intrinsic at all. We can also drop the compute shader check, because neither iris nor anv use UBO push analysis for compute shaders - except for anv's internal kernels, which already have well specified push layouts. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:32 +00:00
Kenneth Graunke	586a470a00	brw: Drop image deref handling from brw_analyze_ubo_ranges This was for pre-Skylake image load/store handling with image params. We don't support that in brw anymore. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>	2024-12-03 02:02:32 +00:00
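
The hole sizes quoted in 5712fc48a9 and 6fd10a6620, and the 32-bit wraparound case described in f88eb48ff2, can be checked with a few lines of arithmetic. The following is only a minimal sketch of that arithmetic, not the Mesa implementation: `struct load_range`, `hole_bytes()`, `hole_is_acceptable()` and `wraps_u32()` are hypothetical names introduced here for illustration, and the real vectorizer callbacks take different arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one load: where it starts and how many
 * bytes it actually reads. Not a Mesa type. */
struct load_range {
   uint32_t offset; /* byte offset of the first component read */
   uint32_t size;   /* number of bytes actually read */
};

/* Bytes of unread data between two non-overlapping loads, where "low"
 * comes first in memory and "high" starts at or after the end of "low". */
static inline uint32_t
hole_bytes(struct load_range low, struct load_range high)
{
   assert(high.offset >= low.offset + low.size);
   return high.offset - (low.offset + low.size);
}

/* Driver-style policy check (an assumption about how a callback might
 * decide): accept the merge only if the hole does not exceed max_hole. */
static inline bool
hole_is_acceptable(struct load_range low, struct load_range high,
                   uint32_t max_hole)
{
   return hole_bytes(low, high) <= max_hole;
}

/* True when adding a positive offset to a 32-bit address wraps around,
 * i.e. the "high" load would land at a small address even though the
 * "low" load sits at a large one. */
static inline bool
wraps_u32(uint32_t low_addr, uint32_t high_offset)
{
   return (uint32_t)(low_addr + high_offset) < low_addr;
}
```

With 4-byte components, two loads that each read only `.x` from adjacent 32-byte vec8 blocks (offsets 0 and 32) leave a hole of 32 - 4 = 28 bytes, which matches the 7 * 4 = 28 figure. An XYZ_ + XYZ_ pair (12 bytes read at offsets 0 and 16) leaves a 4-byte hole, which matches the non-block limit. For the wraparound case, a 32-bit base of 0xFFFFFFF0 plus an offset of 0x20 wraps to 0x10.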

198662 commits