fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 11:18:11 +02:00

Author	SHA1	Message	Date
Nanley Chery	caf007ff27	anv: Drop can_fast_clear_with_non_zero_color() This got dropped during a rebase. Fixes: `35f02d8f36` ("anv: Inline can_fast_clear_with_non_zero_color") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33035>	2025-01-15 15:43:19 +00:00
Matthew Brost	2a053b2e60	anv/xe: Bind queue per anv_queue The Xe uAPI is designed to use bind queues such that binds without input dependencies (sync objects) do not block on binds with input dependencies. For example: - Bind A (sparse) is submitted with a list of input dependencies. - Bind B (immediate) is subsequently submitted without a list of input dependencies. If Bind A and Bind B share a single bind queue, Bind B will not be scheduled until Bind A completes. Using individual bind queues decouples Bind A and Bind B, allowing Bind B to make immediate progress. This change creates a separate bind queue for each ANV queue, enabling support for sparse bindings that may have input dependencies. v2: - Bail on bind queue creation failure (Lionel) - Only create bind queue if VK_QUEUE_SPARSE_BINDING_BIT is set (Jose) v3: - Add comment around submit->queue usage (Jose) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32873>	2025-01-14 14:39:53 +00:00
Nanley Chery	cd8e120b97	anv: Allow more single subresource fast-clears with FCV Format re-interpretation is no longer a problem with texture views. The clear color address now points to a clear color that is in the expected format. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31374>	2025-01-14 03:43:55 +00:00
Nanley Chery	35f02d8f36	anv: Inline can_fast_clear_with_non_zero_color Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31374>	2025-01-14 03:43:55 +00:00
Nanley Chery	5549cb921d	Revert "anv: turn off non zero fast clears for CCS_E" This reverts commit `25a232238f`. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11110 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11325 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31374>	2025-01-14 03:43:55 +00:00
Nanley Chery	3e62401df3	anv: Drop bpc check for non-zero fast clears Use the matching clear color address for an image view format to support any clear color. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31374>	2025-01-14 03:43:55 +00:00
Nanley Chery	83cd73385a	anv: Use L3 Fabric flush in fast-clear post-amble on TGL Replace the Tile Cache flush with an L3 Fabric flush. According to HSD 1604687438, this should be faster. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31600>	2025-01-14 03:14:00 +00:00
Nanley Chery	cec086a074	anv: Reduce fast-clear post-amble synchronization On gfx12+, the pre-amble and post-amble flushes contain the stalls necessary to ensure the prior operation is complete. Remove the extra uses of ANV_PIPE_END_OF_PIPE_SYNC_BIT in post-amble flushes. Also do this for the pre-amble flushes, but this doesn't have any impact. The flush application function will implicitly add the bit. For A750, this improves the TWWH3 trace in the performance CI by 0.52% (n=2). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31600>	2025-01-14 03:14:00 +00:00
Nanley Chery	052d7e1a9c	anv: Slow clear if fast-clear cost is not mitigated Fast-clears require expensive flushes beforehand and afterwards. The cost of flushes are decreased in a series of back-to-back fast-clears as no extra fast-clear flushes are required in between them. If the ratio of a command buffer's recorded back-to-back fast clears over independent fast-clears falls below 1/2, prevent that command buffer from recording any further fast-clears. Averaging two runs of our Factorio trace on an A750 shows a +14.37% improvement in FPS. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32984>	2025-01-13 20:42:31 +00:00
Hyunjun Ko	638fc5e472	anv: change bool to VkResult Fixes: `41caf3665c` ("anv/image: allocate some memory for mv storage after video images.") Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Hyunjun Ko	ec60462a65	anv: fix to set default cdf buf correctly. v1. Store cdf index values to the state of the command buffer. (Lionel Landwerlin <lionel.g.landwerlin@intel.com>) Fixes: dEQP-VK.video.decode.av1.sizeup_8_separated_dpb Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Hyunjun Ko	e510efed05	anv: support in-loop super resolution for AV1 decoding Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Hyunjun Ko	788263501d	anv: calculate global parmeters correctly for AV1 decoding Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Dave Airlie	8432b8b282	anv: add initial support for AV1 decoding Co-authored-by: Hyunjun Ko <zzoon@igalia.com> - Allow intrabc - Fix to manage reference frames using referenceNameSlotIndices - Fix to set bitmask of motion field projection correctly - Set destination buffer offset to the BSD_OBJECT - Support 10-bit decoding. - Fix small bugs. - Change to C-style comment. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Hyunjun Ko	0fd0a51df6	anv/video: Fix to return supported video format correctly. Since 8-bit decoding is not default, we need to check the flag too. Fixes: `a64ae20d0` ("anv: support HEVC 10-bit decoding" ) Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Dave Airlie	6a28e7a6c7	anv: add default av1 tables from media-driver Co-authored-by: Hyunjun Ko <zzoon@igalia.com> - Change to C-style comment. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>	2025-01-10 21:45:04 +00:00
Michael Cheng	c3c05ffb5f	intel : Expose Shader hashes for utrace and Perfetto This patch exposes shader hashes (computes and draws) to Perfetto and utrace. By including these hashes in traces, developers can correlate compute and draw calls with their associated ASM dumps when analyzing the traces. To achieve this, intel_tracepoint.py has been reworked to preprocess tracepoint arguments dynamically. Any argument containing "hash" in its variable name is now formatted as hexadecimal before being passed to the tracepoint definition. Signed-off-by: Michael <michael.cheng@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32708>	2025-01-10 17:38:16 +00:00
Sagar Ghuge	710624fcc0	anv: Use 3DSTATE_URB_ALLOC_* instructions Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device config. In case only one slice is available in the device, SliceN fields will be ignored by HW. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>	2025-01-09 21:26:40 +00:00
Lionel Landwerlin	08e82b28e8	anv: use the correct MOCS for depth destinations Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31778>	2025-01-09 17:47:27 +00:00
José Roberto de Souza	1d1d5653ac	anv: Check VkResult main batch buffer before start companion batch buffer It could run the companion batch buffer even if the main batch buffer failed; this could happen with both the i915 and Xe KMDs. If the main context/queue is banned and the companion is not, it could still report that the submission started properly when it had not. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>	2025-01-09 13:47:28 +00:00
José Roberto de Souza	4c6194cae0	anv: Check VkResult of perf query batch buffer On i915 it could be executing the main batch buffer in i915_queue_exec_locked() even if the perf query batch buffer failed. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>	2025-01-09 13:47:28 +00:00
Sagar Ghuge	33d9a685a5	anv: Add pipelined coarse pixel state 3DSTATE_CPS_POINTERS is deprecated on PTL, so let's switch to 3DSTATE_COARSE_PIXEL to deliver CPS state as pipelined state. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32737>	2025-01-07 23:53:44 +00:00
Chia-I Wu	83dec767da	anv: use common calibrated timestamp support partially Use the common GetPhysicalDeviceCalibrateableTimeDomainsKHR. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32689>	2025-01-07 03:39:29 +00:00
José Roberto de Souza	7ac9ac0f93	anv: Allow larger SLM sizes for task and mesh shader It was hard-coded to 64k but Xe2 platforms and newer supports larger SLM sizes. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Cc: mesa-stable Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32874>	2025-01-06 18:31:20 +00:00
Tapani Pälli	72351afe24	anv: handle mesh in sbe_primitive_id_override This prevents crashes seen in some upcoming cts tests. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32861>	2025-01-06 08:41:18 +00:00
Hyunjun Ko	5ecea6ec4a	anv: handle negative value of slot index for h265 decoding. Fixes: `8d519eb5` ("anv: add initial video decode support for h265") Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32823>	2025-01-06 01:02:14 +00:00
Hyunjun Ko	168298b891	anv: Enable remapping picture ID Fix to handle 16 refs. v1. handle the case where a slot index is negative. (Lionel Landwerlin <lionel.g.landwerlin@intel.com>) Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32823>	2025-01-06 01:02:14 +00:00
Hyunjun Ko	9221feaf79	anv: define ANV_VIDEO_H264_MAX_DPB_SLOTS prep work for remapping slot ids for h264 decoding. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32823>	2025-01-06 01:02:13 +00:00
Lionel Landwerlin	98cdb9349a	anv: ensure null-rt bit in compiler isn't used when there is ds attachment Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `15987f49bb` ("anv: avoid setting up a null RT unless needed") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12396 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32867>	2025-01-03 23:12:22 +00:00
Lionel Landwerlin	1448778385	anv: rework tbimr push constant workaround We'll want to know about the empty push constant for device generated commands. It's easier if the information is stored in anv_pipeline_bind_map::push_ranges[]. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32828>	2025-01-03 11:48:42 +00:00
Lionel Landwerlin	6281b207db	anv: add tracepoints timestamp mode for empty dispatches When the runtime is going to potentially emit no dispatch, we need to have a way to capture a timestamp. Add a new flag for this to tell whether we don't have a HW instruction to capture the timestamp and rely on MI_STORE_REGISTER_MEM instead. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `de00fe3f66` ("anv: add BVH building tracking through u_trace") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12382 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32835>	2025-01-03 10:36:49 +00:00
Lionel Landwerlin	6fb2d3b163	anv: limit the memcpy data for push constants Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32824>	2025-01-02 16:48:04 +00:00
Sagar Ghuge	76e85df2d2	anv: Switch to ANISOTROPIC_FAST filter mode Same thing as ANISOTROPIC including all restrictions, except HW is allowed to take liberties with precision to speed things up. Currently this only has an effect on formats of type *_sRGB. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32738>	2024-12-31 21:49:41 +00:00
Lionel Landwerlin	5e4aeb3ad7	anv: fix index buffer size changes With vkCmdBindIndexBuffer2KHR only the provided size can change which currently fails to reprogram the index buffer properly. Signed-off-by: Lionel Landwerlin <llandwerlin@gmail.com> Fixes: `5c2aca456e` ("anv: implement vkCmdBindIndexBuffer2KHR") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12376 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32785>	2024-12-27 13:20:49 +00:00
Rohan Garg	308c2b9828	anv: refactor choose_isl_tiling_flags to pass fewer arguments Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32771>	2024-12-23 19:33:36 +00:00
Erik Faye-Lund	e17abeca44	anv: use vk_descriptor_type_is_dynamic No need to open-code this one now that we have a generic helper. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32627>	2024-12-19 15:12:58 +00:00
José Roberto de Souza	2bd3df75e5	anv: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS According to HAS it is necessary to emit this instruction once per context so MI_MEM_FENCE works properly. Fixes: `86813c60a4` ("mi-builder: add read/write memory fencing support on Gfx20+") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32680>	2024-12-18 17:16:05 +00:00
José Roberto de Souza	b8f93bfd38	anv: Always create anv_async_submit in init_copy_video_queue_state() A next patch will emit more instructions in video and copy queues for Gfx 200 and newer but the current code only creates anv_async_submit if device has aux_map. Instead we can always create anv_async_submit and only submit it to hardware if any instruction was emitted. Fixes: `86813c60a4` ("mi-builder: add read/write memory fencing support on Gfx20+") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32680>	2024-12-18 17:16:05 +00:00
Kevin Chuang	1b55f10105	anv/bvh: Dump BVH synchronously upon command buffer completion Modified the BVH dumping mechanism to synchronously wait for the command buffer to complete before saving BVH data to files. This approach is more robust compared to the previous method of dumping during acceleration structure destruction. Note: if DEBUG_BVH_ANY is enabled but intel-rt is disabled, we will wait for nothing. Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32585>	2024-12-16 23:01:11 +00:00
Felix DeGrood	0f46c53b0c	anv: Use vfg distribution mode = RR_STRICT for Xe2+ Performance tuning. Round Robin strict faster on Xe2 for some workloads. Speedup: - Borderlands3-dx11-trace: +4% - WolfensteinYoungblood-vk.g6: +1.5% - Cyberpunk2077-dx12vk-2160p-ultra: +0.5% Acked-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32566>	2024-12-13 19:15:48 +00:00
Sagar Ghuge	d3f9139e49	intel: Use Morton compute walk order According to HSD 14016252163 if compute shader uses the sample operation, morton walk order and set the thread group batch size to 4 is expected to increase sampler cache hit rates by increasing sample address locality within a subslice. Rework: * Caio: "||" => "&&" for type checking in instr_uses_sampler() * Jordan: Use nir's foreach macros rather than nir_shader_lower_instructions() Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32430>	2024-12-12 19:56:47 -08:00
Lionel Landwerlin	2bb98a8f99	anv: document UBO descriptor range alignments Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32347>	2024-12-12 07:35:18 +00:00
Benjamin Lee	becb014d27	nir: treat per-view outputs as arrayed IO This is needed for implementing multiview in panvk, where the address calculation for multiview outputs is not well-represented by lowering to nir_intrinsic_store_output with a single offset. The case where a variable is both per-view and per-{vertex,primitive} is now unsupported. This would come up with drivers implementing NV_mesh_shader or using nir_lower_multiview on geometry, tessellation, or mesh shaders. No drivers currently do either of these. There was some code that attempted to handle the nested per-view case by unwrapping per-view/arrayed types twice, but it's unclear to what extent this actually worked. ANV and Turnip both rely on per-view outputs being assigned a unique driver location for each view, so I've added on option to configure that behavior rather than removing it. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00
Benjamin Lee	975c3ecd1e	nir: handle arbitrary per-view outputs in nir_lower_multiview This is needed for panvk, where multiview is "all or nothing". When multiview is enabled, all outputs may be written with separate values for each view. The edge case mentioned in the previous `nir_can_lower_multiview` is now handled because we now handle an arbitrary number of per-view output vars instead of expecting to find exactly one. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>	2024-12-09 20:31:49 +00:00
Mi, Yanfeng	06d3eb8e01	anv:increase instruction heap to 3Gb Black Myth Wukong is generating more than 2Gb of shaders in pre-compiling stage after VK_EXT_shader_image_atomic_int64 extension enabled. Driver will crash in create shader stages due to dereference null pointer of kernel map. Signed-off-by: Mi, Yanfeng <yanfeng.mi@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32548>	2024-12-09 19:14:38 +00:00
Mi, Yanfeng	0a5a04f509	anv:Fix memory grow calculation overflow issue when old buffer size is larger than 2G, 32bit cannot hold 2 times buffer size (>4G). Fixes: `8d813a90d6` ("anv: fail pool allocation when over the maximal size") Signed-off-by: Mi, Yanfeng <yanfeng.mi@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32551>	2024-12-09 18:49:17 +00:00
Lionel Landwerlin	de00fe3f66	anv: add BVH building tracking through u_trace Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32483>	2024-12-09 14:45:00 +00:00
Alyssa Rosenzweig	972f8aa287	vulkan: rename depth bias graphics states "constant" is a special keyword in OpenCL C, and we'd like to #define it suitably in host C23 to facilitate compatibility between host/device headers. That means we can't have any identifiers named "global" or "constant". Fortunately, this is the only 'constant' in any file I'm hitting. To avoid the clash, don't abbreviate "constant factor", use "constant_factor" instead. For consistency, "slope factor" then becomes "slope_factor". The new names are longer but match the Vulkan API exactly. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [Intel] Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> [NVK and panvk] Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [V3DV] Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> [IMG] Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32505>	2024-12-06 13:48:26 -05:00
Nanley Chery	483c40a21d	anv: Allow compressed memtypes with default buffer types Source 2 games segfault if certain buffers are not able to use the same memory types as images. CS2 specifically expects this to be the case for vertex and index buffers (VK_BUFFER_USAGE_2_INDEX_BUFFER_BIT, VK_BUFFER_USAGE_2_VERTEX_BUFFER_BIT). I have not tested other Source 2 games to see how much the requirement differs for the usage (if at all). Up until now, we've disabled CCS for the Source 2 engine with the anv_disable_xe2_ccs driconf option. However, this option is not great for performance. So, replace this with a new option to allow the same memory types we use for images on buffers - anv_enable_buffer_comp. Compression of buffers is generally not good for performance. I collected the result of unconditionally enabling the feature in the performance CI on BMG. I used the default configuration to average the result of two runs of each trace. The CI reports that 4 game traces would regress between 0.44-1.01% FPS with buffer compression. However, the CI actually shows it to be beneficial in three of our game traces: * Cyberpunk-trace-dx12-1080p-high 106.51% * Hitman3-trace-dx12-1080p-med 101.59% * Blackops3-trace-dx11-1080p-high 100.44% So, enable the option for the two games we already have driconf entries for, Cyberpunk and Hitman3. Of course, also enable the option for Source 2 games. Casey Bowman reports that on BMG, some frame times drop from ~15ms to ~7ms in CS2. This is in large part due to the removal of HiZ resolves, which is a consequence of the game now using HIZ_CCS_WT instead of plain HIZ. Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11520 Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32519>	2024-12-06 17:21:06 +00:00
Lionel Landwerlin	371b7a9b0d	anv: set pipeline flags correct for imported libs Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3d49cdb71e` ("anv: implement VK_EXT_graphics_pipeline_library") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32507>	2024-12-05 19:53:34 +00:00
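
Note on commit 0a5a04f509 ("anv:Fix memory grow calculation overflow issue"): the message says that doubling a buffer size above 2G does not fit in 32 bits. The sketch below only illustrates that arithmetic. It is not the driver's code; the function names `grow_32bit` and `grow_64bit` are hypothetical, and the masking merely emulates fixed-width unsigned integers in Python.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def grow_32bit(old_size: int) -> int:
    """Double old_size the way a 32-bit unsigned multiply would (wraps modulo 2**32)."""
    return (old_size * 2) & MASK32


def grow_64bit(old_size: int) -> int:
    """Double old_size in 64-bit unsigned arithmetic (hypothetical widened variant)."""
    return (old_size * 2) & MASK64


# An old size just above 2 GiB: doubling it needs more than 32 bits.
old_size = 2 * 1024**3 + 4096

print(grow_32bit(old_size))  # wrapped result, far smaller than old_size
print(grow_64bit(old_size))  # the intended doubled size
```

With the 32-bit variant the "grown" size ends up smaller than the old size, which is the kind of overflow the commit message describes; widening the intermediate value avoids the wrap.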


6117 commits