fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 21:50:12 +01:00

Author	SHA1	Message	Date
Lionel Landwerlin	06ad9a25e5	brw: fix Wa_22013689345 emission 2 problems : - not detecting null destination correctly - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering already happened Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>	2025-04-10 16:44:28 +00:00
Benjamin Lee	22fa3e88dd	panvk: advertise VK_KHR_float_controls2 This is all supported by the common nir code, no changes needed on our end. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	7612dc4713	panvk: advertise VK_KHR_shader_float_controls Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	95056fa75a	panvk/va: don't advertise independent denorm behavior Valhall supports all combinations of ftz/preserve denorm behavior between FP16 and FP32 except FP16=ftz, FP32=preserve. Because of this, we can't advertise independent denorm behavior. Even with INDEPENDENCE_NONE, it is still possible for shaders to set denorm behavior for one size and leave the other size unspecified. Previously we were defaulting to preserve for any unspecified size, but with FP16=ftz, we need to default unspecified FP32 to ftz. When advertising INDEPENDENCE_NONE, the CTS checks that the shaderDenormFlushToZeroFloat* and shaderDenormPreserveFloat* features are equal for all sizes, so we need to advertise the same supported denorm behavior for FP64 even though we don't support FP64 at all. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	b6406c179b	pan/bi: implement denorm behavior float controls On bifrost independent float controls are implementable, just potentially expensive because it requires scheduling FP16 and FP32 instructions in separate clauses. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	9737c1fa15	pan/bi: ignore ftz mode when scheduling int instructions This allows more efficient scheduling by putting a 16-bit int instruction in the same clause as a 32-bit float instruction even when the 16-bit and 32-bit float controls are different. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	08765d53c9	pan/bi: refactor bi_instr_ftz to allow dontcare FTZ states The current behavior is identical, but we can express that some instructions may be packed in either FTZ and no-FTZ clauses in the future. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	5bb85e965e	pan/va: preserve signed zero in f32->f16 conversions Using 'FADD.f32 x, +0' for f32->f16 conversions strips signed zero, which we can't do if we advertise shaderSignedZeroInfNanPreserveFloat16. Adding -0 instead preserves the original sign. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Fixes: `b63ef74e73` ("pan/bi: Stop using V2F32_TO_V2F16 on Valhall") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	239c6b833a	panfrost: implement float controls rounding mode Many float instructions do not have a rounding mode modifier, but all of the operations that are listed as requiring correct rounding in the vulkan spec are supported in hardware. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Benjamin Lee	6f68649400	pan/va: add roundmode modifier to additional instructions These are needed to implement VK_KHR_shader_float_controls rounding mode. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>	2025-04-10 16:21:09 +00:00
Lars-Ivar Hesselberg Simonsen	20c0d169e4	vk/sync: Fix execution only barriers With vkCmdPipelineBarrier, it's possible to specify a barrier with pipeline stages but without any memory barriers. These might not be practical, but are legal Vulkan code. Barriers like this are currently ignored in mesa, as we only convert barriers with passed memory barriers into vkCmdPipelineBarrier2. This commit adds handling of execution only barriers by converting them into a memory barrier without access masks. Fixes: `97f0a4494b` ("vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34187>	2025-04-10 15:28:22 +00:00
Daniel Stone	7c73b9a498	doc/ci: Update nginx caching snippets Fix the nginx cache snippets - I'd missed the file nesting somehow. Tested on a debian:bookworm image with nginx-full installed, checked that we could pull an arbitrary external site, as well as S3, as well as GitLab artifacts. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34341>	2025-04-10 15:21:51 +00:00
Ludvig Lindau	6393ebbdbb	panvk: Get flush_id once per submit Get flush_id once per command buffer in the submit and use it for all subqueues instead of getting a new flush_id for every subqueue. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34448>	2025-04-10 15:00:57 +00:00
Tapani Pälli	30d78dc942	mesa: various fixes for ClearTexImage/ClearTexSubImage Fixes some upcoming CTS tests for texture clears. * some drivers will attempt to issue clears with zero range and hit asserts/crashes (spec clarification for negative values) * fix error thrown with negative values to match spec * fix cases for clearing generic compressed formats * fix negative case of using color format while having depth/stencil internalformat and vice versa Cc: mesa-stable Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>	2025-04-10 14:32:56 +00:00
Tapani Pälli	3bc016bb6c	mesa: clamp texbuf query size to MAX_TEXTURE_BUFFER_SIZE Fixes upcoming CTS test checking for clamping. Cc: mesa-stable Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>	2025-04-10 14:32:56 +00:00
Boris Brezillon	24b1aa6c28	panvk/csf: Optimize read-only tile buffer access When the color/input attachment map is known at compile time, we can determine the set of read-only render targets and replace .wait by .wait_resource flows, in order to avoid read-after-read serialization. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:54 +00:00
Boris Brezillon	4f4ac56145	pan/va: Support relaxed waits on read-only render targets On Valhall we can optimize lower waits, which wait for both readers and writers, into resource_waits which only wait for writers, allowing threads accessing read-only resources to execute concurrently. Let's use that on LD_TILE instructions so we can optimize the read-only case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	fbb2805575	panvk: Advertise KHR_dynamic_rendering_local_read support Now that we support local reads we can safely advertise KHR_dynamic_rendering_local_read. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	7a2b23b0bd	panvk: Skip BY_REGION barriers if we're in a render pass If we are in a render pass, the intra-draw synchronization happens through the FPK parameters, shader waits and draw dependencies, so we can safely skip the barrier in that case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	bfd5ddbf32	panvk: Optimize input attachment loads when we can When we know the input attachment is also an active color attachment we can load the value from the tile buffer instead of going back to the texture. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	20275d6521	pan/bi: Introduce two intrinsics to support input attachment remapping In order to dynamically load the content of the tile buffer, we need to know the target (color, depth or stencil) and the conversion to apply. Let's define the load_input_attachment_{target,conv}_pan intrinsics so we can dissociate the logic lowering input attachment loads into load_converted_output_pan, and the part optimizing the shader when input attachment map is passed at compile time. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	244995e4af	panvk: Support color attachment remapping We take the color attachment remapping into account when emitting blend descriptors, and we make sure we re-emit those when this color attachment map is dirty. We also need to take the remapping into account when checking the render targets written by the fragment shader, hence the addition of a color_attachment_written_mask() helper. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	9d5d03bf78	panvk/jm: Move cmd_prepare_draw_sysvals() out of the layer loop The only sysval that changes is the layer_id, so let's call cmd_prepare_draw_sysvals() outside of the layer loop, and manually update the sysval there. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	fe21da08ed	pan/earlyzs: Support the shader ZS read-only case and its optimization on v10+ We are about to allow ZS tile buffer reads in panvk in order to support VK_KHR_dynamic_rendering_local_read, and this requires dealing with a new case in the early ZS logic. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	d2cd5ca609	panvk: Generate the earlyzs LUT at shader creation time Do what the gallium driver does and generate the LUT when creating the shader to avoid regenerating this LUT in the draw path. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	b8174b21d2	panvk: Isolate CS specific bits in panvk_shader We are about to add FS specific info there, so let's make sure all the per-stage bits are part of a union and are conditionally filled/[de]serialized based on the shader type. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	8a16636444	panvk: Re-order things in panvk_deserialize_shader() Re-order things in panvk_deserialize_shader() to avoid declaring local variables for stuff we feed the panvk_shader with. The only exception is pan_shader_info, because we need to know the shader stage to call vk_shader_zalloc(), which is part of pan_shader_info. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	864ea81dcf	panvk/csf: Set invalidate_inherited_ctx only if the render pass is inherited Secondary command buffers don't necessarily inherit their render context. If we flush draws, we should only set invalidate_inherited_ctx when the render context is inherited, otherwise it messes up the primary command buffer state. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	f3be0836b7	pan/bi: Pass an explicit sampleid to load_converted_output_pan Needed if we want to lower multisample input attachment loads to tile buffer loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	cdeda45282	pan/bi: Pass load_converted_output_pan target through a source This allows us to pass a dynamic render target which will be needed to support VK_KHR_dynamic_rendering_local_read. While at it, we also enable support for depth/stencil tile loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	2e8829f54a	pan/bi: Allow depth/stencil tile buffer access using LD_TILE LD_TILE has a .z_stencil modifier we can use to access the depth/stencil tile buffer. This will be needed for native depth/stencil input attachments support in panvk. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	08307ecf3c	panvk/jm: Don't force a preload if the previous batch didn't have draws We should only force a preload after a batch split if the batch we flush had draws, otherwise we might lose the effect of clears asked by the user. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	3669cc66c6	vulkan/state: Fix default input attachment map values When no input attachment location info is provided, the depth/stencil attachments are supposed to be NO_INDEX, not UNUSED, and we should also set the color_attachment_count to UNKNOWN. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Lionel Landwerlin	e321c438dc	anv: fix self dependency computation Some upcoming changes in the runtime will make it impossible to rely on the pipeline or runtime information to know whether a fragment shader has input attachments. Instead we gather that information at compile time and store it in our shader bind_map. At runtime we check whether the fragment shader has input attachments and whether those map to the runtime depth/stencil input attachments to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d2f7b6d5a7` ("anv: implement VK_KHR_dynamic_rendering_local_read") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	be2532fc00	vk/pass: Add input attachment location info For drivers using the render pass emulation provided by the runtime, it's important to express the mapping between depth/stencil/color attachments and input attachments using VkRenderingInputAttachmentIndexInfoKHR, otherwise those drivers have to special-case emulated render passes in their CmdBeginRendering() implementation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	38e546c202	vulkan/state: Fix input attachment map state initialization/copy vk_dynamic_graphics_state_copy() is not copying the input attachment map, and color_attachment_count is not initialized in vk_dynamic_graphics_state_init_ial(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Corentin Noël	34a5f4ac7c	virgl: Use drmCloseBufferHandle instead of calling drmIoctl directly Makes the code a bit lighter. Signed-off-by: Corentin Noël <corentin.noel@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34437>	2025-04-10 12:55:51 +00:00
Corentin Noël	5144a4f56c	virgl: Close handle on resource info failure We just opened the GEM handle a few lines before (or used drmPrimeFDToHandle to acquire it), so on failure it is better to close it. Signed-off-by: Corentin Noël <corentin.noel@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34437>	2025-04-10 12:55:51 +00:00
Martin Krastev	60d815d1bf	docs/svga: Add steps how to get VMware Workstation Pro on Linux Signed-off-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Signed-off-by: Martin Krastev <martin.krastev@broadcom.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12829 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34429>	2025-04-10 10:38:56 +00:00
Caterina Shablia	e5bdb41200	panfrost: move the comment closer to what it's about Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>	2025-04-10 08:05:21 +00:00
Caterina Shablia	83383cb4b8	panfrost: require buffer_count and pushed_words to be passed to panfrost_emit_const_buf Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>	2025-04-10 08:05:21 +00:00
Alyssa Rosenzweig	59a3e12039	panfrost: do not push "true" UBOs Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it does this with a pass that pushes UBOs. That's good! The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend at this point. This is where the trouble comes in: - True UBOs are allocated in GPU BOs, not CPU allocated buffers. That means it's write-combine memory, which we cannot read from efficiently (at least depending on coherency details that were never plumbed through panfrost.ko and unlikely to be replumbed now that panthor is the new hot stuff). So, pushing true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is dubious... When I benchmarked this on MT8192 in early 2023, this pushing improved FPS in SuperTuxKart but hurt FPS in Dolphin. - True UBOs can be written on the GPU. In OpenGL, we have batch tracking infrastructure to sort this mess out in theory. What this means is that pushing UBOs requires us to flush writers AND STALL at draw-time. If this is ever hit, our performance is utterly trashed. But it gets worse. - True UBOs can be written in the same batch that reads them. For example, we could bind a buffer as a transform feedback buffer, do a draw with XFB, then rebind as a UBO and do a draw reading. This is where we collapse -- our logic will flush the writer, which is the same batch we were in the middle of enqueueing a draw to. When we try to push words, we'll crash with theatrics. This could be solved by smartening the batch tracking logic but it's not trivial by any means. So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing it! Long term, the solution will be to push on the GPU instead. This avoids all of these issues. This can be done with a compute kernel or with CSF instructions. 
The Vulkan driver will likely have to do this for performance, since pushing UBOs from the CPU is utterly broken in Vulkan for the above reasons. I have a branch somewhere doing this on v9 but I'm doing this on NIR time to unblock a core change that was crashing piglit due to this pile of unsoundness. Let's fix the correctness issues first, then someone can look at recovering performance later when we're not blocking unrelated work. Fixes corruption in Piglit test gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with transform feedback. (I suspect the test still doesn't pass for the same reason it's broken on other tilers. But that's a better place to be than oodles of memory corruption.) According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset. Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>	2025-04-10 08:05:21 +00:00
Caterina Shablia	2c75b6bb01	panfrost: update nr_uniform_buffers before dispatching XFB Currently nr_uniform_buffers will be whatever the previous draw set for its vertex shader, which is not what the XFB shader usually expects. Fixes: `c246af0d` ("panfrost: Only upload UBOs when needed") Cc: mesa-stable Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>	2025-04-10 08:05:21 +00:00
Caterina Shablia	6948ab727f	panfrost: don't overwrite push uniforms and sysvals UBO with user's UBO ss->info.ubo_mask includes the push+sysval UBO so if there's a user UBO bound at the same index as the push+sysval UBO, without this change we end up writing a descriptor for the user UBO at that index. Fixes: `3b3cd59f` ("panfrost: Launch transform feedback shaders") Cc: mesa-stable Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>	2025-04-10 08:05:21 +00:00
Alyssa Rosenzweig	f179f6952f	panfrost: invert and rename no_ubo_to_push flag Only the GL driver actually wants this; neither panvk nor internal shaders do. Cc'd as a prereq to the next patch. Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>	2025-04-10 08:05:21 +00:00
Samuel Pitoiset	2f00daf67a	radv: tidy up radv_emit_hw_ngg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>	2025-04-10 06:56:25 +00:00
Samuel Pitoiset	1290b38f57	radv: tidy up radv_emit_raster_state() Better isolation between configuration and emission. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>	2025-04-10 06:56:25 +00:00
Samuel Pitoiset	4b2d119d90	radv: reduce the number of emitted DWORDS for MSAA 8x user sample locs From 24 DWORDS to 16 DWORDS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>	2025-04-10 06:56:25 +00:00
Samuel Pitoiset	c1ebf82700	radv: track redundant DB_RENDER_OVERRIDE register writes on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>	2025-04-10 06:56:25 +00:00
Samuel Pitoiset	7f5727b313	radv: use consecutive registers for PA_SC_WINDOW_SCISSOR_{TL,BR} For fewer DWORDS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>	2025-04-10 06:56:25 +00:00
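
Note on 5bb85e965e (pan/va: preserve signed zero in f32->f16 conversions): the commit message relies on an IEEE 754 rule that is easy to check outside the driver. Under round-to-nearest, adding +0 to -0 yields +0, while adding -0 leaves every input, including -0, unchanged. The sketch below only models that arithmetic rule with Python floats (doubles) and the standard `struct` half-precision format; it is not the Valhall FADD instruction, and the helper names are hypothetical.

```python
import math
import struct


def is_negative_zero(x: float) -> bool:
    """True only for -0.0 (a zero whose sign bit is set)."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


def add_pos_zero(x: float) -> float:
    # Models the old approach: x + (+0). Loses the sign of -0.0.
    return x + 0.0


def add_neg_zero(x: float) -> float:
    # Models the fixed approach: x + (-0). Identity for all x, including -0.0.
    return x + (-0.0)


def f16_bits(x: float) -> int:
    """Raw 16-bit pattern of x after conversion to IEEE half precision."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


if __name__ == "__main__":
    for name, fn in (("x + (+0)", add_pos_zero), ("x + (-0)", add_neg_zero)):
        result = fn(-0.0)
        print(f"{name}: sign preserved = {is_negative_zero(result)}, "
              f"f16 bits = {f16_bits(result):#06x}")
```

The half-precision bit pattern makes the difference visible: a preserved -0 has the sign bit (0x8000) set after conversion, a stripped one does not.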

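Note on 20c0d169e4 (vk/sync: Fix execution only barriers): the conversion described in the commit message can be modelled in a few lines. This is a simplified sketch, not the Mesa runtime code: the class and function names are hypothetical, stage and access masks are plain integers, and only global memory barriers are modelled (buffer and image barriers are left out).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemoryBarrier:
    """Legacy-style barrier: access masks only, stages are passed separately."""
    src_access: int
    dst_access: int


@dataclass(frozen=True)
class MemoryBarrier2:
    """synchronization2-style barrier: stages live next to the access masks."""
    src_stage: int
    src_access: int
    dst_stage: int
    dst_access: int


def convert_legacy_barrier(src_stage: int, dst_stage: int,
                           barriers: List[MemoryBarrier]) -> List[MemoryBarrier2]:
    # Attach the call-level stage masks to every legacy memory barrier.
    converted = [
        MemoryBarrier2(src_stage, b.src_access, dst_stage, b.dst_access)
        for b in barriers
    ]
    # Execution-only barrier: no memory barriers were passed, so emit one
    # barrier with empty access masks instead of dropping the dependency.
    if not converted:
        converted.append(MemoryBarrier2(src_stage, 0, dst_stage, 0))
    return converted


if __name__ == "__main__":
    # Hypothetical stage bits, chosen only for illustration.
    print(convert_legacy_barrier(0x1, 0x2, []))
```

Without the fallback branch, an empty barrier list would convert to an empty list and the execution dependency between the two stage masks would be lost, which is the behaviour the commit message describes as the bug.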

204065 commits