fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-01 01:38:06 +02:00

Author	SHA1	Message	Date
Emma Anholt	870e233ca5	vulkan/wsi/display: Avoid holding drm master for the device's fd. We get a display fd passed in to us through wsi_display_init_wsi(), and when that was the first open of the display device with no previous DRM master, it got master privs and we saved that as the display fd to use for KHR_display. However, that meant that no other client can get DRM master, preventing things like vkAcquireDRMDisplayEXT() users from getting a master fd to pass in to us. Instead, we can drop master at device init time, and pick it back up when a VK_KHR_display swapchain is created that uses that fd. This allows dEQP-VK.wsi.acquire_drm and dEQP-VK.wsi.direct_drm CTS tests to run, which was previously impossible (those tests try to create a custom VK instance, while the CTS already has an instance that had been created with KHR_display enabled, so they're not the first open of the fd). It also means that you could successfully implement VT switching between a KHR_display client and other userspace DRM clients. Also, we can finally implement the text about vkAcquireDRMDisplayEXT's drmFd needing to match the device's fd. The risk of this change, though, is if you're implementing a compositor, and your clients have a chance to open the DRM fd before you've created your swapchain, they may inadvertently have master and DOS you. However, this is no different than the previous situation, where someone with permissions to open DRM could hold master and DOS you already. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38502>	2026-01-26 19:42:33 +00:00
Emma Anholt	fa72be80d9	wsi/display: Fix up the swapchain init error paths. Lots of unwinding was broken, and the CTS caught some of it once I fixed CTS testing. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38502>	2026-01-26 19:42:33 +00:00
Emma Anholt	1a172efa20	vulkan/wsi/display: Add some super useful debug messaging. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38502>	2026-01-26 19:42:33 +00:00
Emma Anholt	f8831ccb2d	vulkan/wsi/display: Rename XCB RandR functions to mention "randr" Otherwise, it can be unclear when reading this code what part is talking to X11 and what is talking to the kernel. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38502>	2026-01-26 19:42:32 +00:00
Emma Anholt	cf32a5f0d1	ci: Skip dEQP-VK.wsi.direct_drm. While I want these to be tested given that I'm hacking on the code, we can't run them in parallel with each other or you'll get unstable results. Note that these are effectively all skips currently, due to https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/6168 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38502>	2026-01-26 19:42:31 +00:00
Mike Blumenkrantz	a842e641d9	ntv: emit demote extension/capability when emitting demote this is cleaner and more accurate cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39540>	2026-01-26 19:24:00 +00:00
Connor Abbott	b7a492630e	tu: Implement bin skipping for zero-density regions Follow the semi-documented behavior of the blob driver and skip rendering bins whose fragment density is 0 (i.e. fragment area is infinite). Some Oculus VR apps using an earlier version of the Unity SDK rely on this instead of VK_QCOM_multiview_per_view_render_areas. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>	2026-01-26 18:58:25 +00:00
Connor Abbott	54b50094a0	tu: Implement bin merging for views When apps use VK_QCOM_multiview_per_view_render_areas, there may be some bins which are only visible (i.e. overlapping the render area) in one view. In the typical VR use-case, there is a strip of bins to the right of the left eye and to the left of the right eye that are not used with that eye. By making sure that the right eye is never rendered to, we can reuse that space to double the GMEM height and merge two bins along the left edge, partially offsetting the cost of extra bins from offsetting the left and right viewports and render areas. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>	2026-01-26 18:58:25 +00:00
Connor Abbott	25202d3e47	tu: Remove fdm argument from tu6_emit_tile_select We can just check whether the list of patchpoints is non-empty. This is simpler and will help if we want to add patchpoints without FDM. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>	2026-01-26 18:58:25 +00:00
Connor Abbott	b311397151	tu: Support VK_QCOM_multiview_per_view_render_areas In order to implement this we have to modify all of the cases where we set a scissor and then loop over attachments to conditionally set the scissor inside each layer of the attachment based on whether per-view render areas are supported. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>	2026-01-26 18:58:25 +00:00
Connor Abbott	ff8f5074c6	tu/autotune: Take render pass layers into account I noticed when adding support for render areas per view that this didn't take the number of views into account at all. Based on the code, the right thing to do seems to be to multiply by the layer count. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>	2026-01-26 18:58:24 +00:00
Connor Abbott	b3a8302147	tu: Implement VK_QCOM_multiview_per_view_viewports We already had to implement per-view viewports for fragment density map. When multiviewPerViewViewports is enabled, we just have to do what we did before, except we also have to stop sharing the same original viewport across all views when FDM is enabled. The app can specify a different viewport for each view and on top of that we will also transform it differently depending on the fragment area for that view, instead of only the transform being different. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>	2026-01-26 18:58:24 +00:00
Mel Henning	e32bfc5efe	nvk: Ignore meta ops in occlusion queries Fixes: `052bbd65c9` ("nvk: Implement pipeline statistics and occlusion queries") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39510>	2026-01-26 18:41:54 +00:00
Faith Ekstrand	c081ab864f	nvk: Enable ZPASS_PIXEL_COUNT in draw_state_init() Fixes: `052bbd65c9` ("nvk: Implement pipeline statistics and occlusion queries") Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39510>	2026-01-26 18:41:54 +00:00
Gurchetan Singh	ea5d69eb52	gfxstream: fix build after vk.xml update This is a backport of f134cc5a1e: ("Update <type category="funcpointer"> schema to simplify") in vulkan-docs, essentially. It changed things about how vk.xml is parsed. Fixes: `b30f780c` ("vulkan: update spec to 1.4.340") Reviewed-by: Aaron Ruby <aruby@qnx.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39502>	2026-01-26 18:25:51 +00:00
Connor Abbott	9e63224424	tu: Use a patchpoint for subpass clears with FDM The rectangle to clear, which is the render area for subpass clears, is specified in framebuffer coordinates, but the hardware uses GMEM coordinates with FDM. I assumed this was ok for subpass clears, because the end of the bin in GMEM coordinates is always less than or equal to the end in framebuffer coordinates, so we would clear past the end of the bin which is still safe because only the render area would be stored to sysmem: bin 0 bin 1 bin 2 \|---\| \|---\| \|---\| GMEM coordinates (what the HW "sees") \|-------\|-------\|-------\| framebuffer coordinates (used e.g. as STORE_OP_STORE destination) \|-----------------------\| render area/clear rectangle (past end of bin in GMEM coordinates!) There was a hack for FDM offset, where framebuffer coordinates are shifted to the left, but that was it. However this breaks down if the render area doesn't start at (0,0), because it can miss pixels in GMEM coordinates that should be cleared: bin 0 bin 1 bin 2 \|---\| \|---\| \|---\| GMEM coordinates (what the HW "sees") \|-------\|-------\|-------\| framebuffer coordinates (used e.g. as STORE_OP_STORE destination) \|------------------\| render area/clear rectangle (we don't clear bin 0!) Here we should clear the right half of bin 0 but instead we don't clear it at all. Instead of adding yet more hacks to expand the render area, just add a patchpoint to transform the render area into GMEM coordinates. We already do this for CmdClearAttachments where we didn't have a choice, so just reuse that. As a bonus, we can also delete the hack for FDM offset. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39495>	2026-01-26 12:17:12 -05:00
Connor Abbott	66952a6c56	tu: Handle FDM-per-layer in CmdClearAttachments paths We need to re-emit the scissor per layer if FDM-per-layer is enabled. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39495>	2026-01-26 12:15:59 -05:00
Patrick Lerda	0b8d8f2b17	r600: update cubearray imagesize calculation The previous method to calculate imageSize().z was incorrect for a cubearray view. This change was tested on palm and cayman. Here is the test fixed: spec/arb_texture_view/rendering-layers-image/layers rendering of imagecubearray: fail pass Fixes: `6c1432f0be` ("r600/eg: fix cube map array buffer images.") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39063>	2026-01-26 16:38:10 +00:00
Patrick Lerda	dbe2ec0299	r600: enable GL_EXT_shader_realtime_clock This extension seems to work. This change was tested with the current piglit repository: spec/ext_shader_realtime_clock/execution/clock2x32: skip pass Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37954>	2026-01-26 16:24:00 +00:00
José Roberto de Souza	1bd83ba819	intel/dev: Add INTEL_DEVICE_INFO_MMAP_MODE_INVALID Adding this mmap mode makes explicit in code that PAT compressed buffers should not be mmaped. Although there is no CPU access Xe KMD uAPI still requires a cpu_caching to be set, so setting WC. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>	2026-01-26 15:24:55 +00:00
José Roberto de Souza	ac23454d1c	anv: Move anv_bo_get_mmap_mode() to i915 backend That function is only called from the i915 backend, so it does not need to be in common code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>	2026-01-26 15:24:55 +00:00
José Roberto de Souza	90249d93d9	intel/dev: Improve PAT entries comment XD is transient display, meaning that GT caches are flushed when display IP needs to access the buffer. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>	2026-01-26 15:24:55 +00:00
José Roberto de Souza	60e38344a0	intel/dev: Remove INTEL_DEVICE_INFO_MMAP_MODE_UC This is not used and we don't have any future plans to use it, so removing it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>	2026-01-26 15:24:54 +00:00
José Roberto de Souza	85ea85dd9a	intel/dev: Remove INTEL_DEVICE_INFO_MMAP_MODE_XD This is not used and doesn't make sense as the transient display is on the GPU side. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>	2026-01-26 15:24:54 +00:00
David Rosca	62f07b8c63	radeonsi/vcn: Add low latency decode debug option Similar to the low latency option for encode, this reduces latency of decoding at the cost of increased power usage. Can be enabled with AMD_DEBUG=lowlatencydec Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39450>	2026-01-26 15:00:06 +00:00
David Rosca	ce25865e8f	radeonsi/vcn: Clean up decode flags Always OR the flags and replace numeric value with a define. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39450>	2026-01-26 15:00:06 +00:00
Benjamin Cheng	c10ebb0fda	radv/video: Use a more reliable way of computing tile sizes Some apps (old FFmpeg, contemporary CTS) send down pMi{Col,Row}Starts in SB units, not MI units. Instead of depending on those values which could be unreliable, derive the tile sizes in SB using other parameters. Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39492>	2026-01-26 14:41:20 +00:00
Wenfeng Gao	98f5fa618b	mediafoundation: Support externally provided motion hints Added support for externally provided motion hints by reading the MFSampleExtension_MoveRegions sample attribute. The motion hint data is converted into pipe_enc_move_info and passed down to the driver for use during encoding. Reviewed-by: Yubo Xie <yuboxie@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39515>	2026-01-26 14:26:21 +00:00
Patrick Lerda	afcead9158	r600: fix rv770 clamp to max_texel_buffer_elements This change fixes the clamp to max_texel_buffer_elements issue related to rv770 and older gpus. Here are the tests fixed on rv770: spec/arb_texture_buffer_object/texture-buffer-size-clamp/r8ui_texture_buffer_size_via_sampler: fail pass spec/arb_texture_buffer_object/texture-buffer-size-clamp/rg8ui_texture_buffer_size_via_sampler: fail pass spec/arb_texture_buffer_object/texture-buffer-size-clamp/rgba8ui_texture_buffer_size_via_sampler: fail pass Fixes: `1a441ad5cb` ("r600: clamp to max_texel_buffer_elements") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39385>	2026-01-26 14:10:55 +00:00
Patrick Lerda	2ed761021f	r600: make vertex r10g10b10a2_sscaled conformant on palm and beyond This is a gl4.3 issue very similar to `e8fa3b4950`. The mode r10g10b10a2_sscaled processed as vertex on palm at the hardware level doesn't follow the current standard. Indeed, the .w component (2-bits) is not calculated as expected. The table below describes the situation. This change fixes this issue by adding two gpu instructions at the vertex fetch shader stage. An equivalent C representation and a gpu asm dump of the generated sequence are available below. .w(2-bits) expected palm cypress 0 0 0 0 1 1 1 1 2 -2 2 -2 3 -1 3 -1 w_out = w_in - (w_in > 1. ? 4. : 0.); 0002 00000024 A0040000 ALU 2 @72 0072 801F2C0A 600004C0 1 w: SETGT*4 __.w, R10.w, 1.0 0074 839FCC0A 61400010 2 w: ADD R10.w, R10.w, -PV.w Note: cypress returns the expected value, and does not need this correction. This change was tested on palm, barts and cayman. Here are the tests fixed: khr-gl4[3-6]/vertex_attrib_binding/basic-input-case6: fail pass khr-gles31/core/vertex_attrib_binding/basic-input-case6: fail pass Cc: mesa-stable Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38849>	2026-01-26 13:40:22 +00:00
Patrick Lerda	da1108dcc4	r600: fix rv770 dot4 operations Using a PV register which is not PV.x, after a dot4 operation, does not work on rv770. Anyway, this does work on evergreen but this is not documented. This change updates this behavior for all the r600 gpus which fixes the issue on rv770. It adds max4 which has the same requirement in the case of max4 being implemented. Here are some of the affected tests on rv770: piglit/bin/fp-abs-01 -auto -fbo glcts --deqp-case=KHR-GL31.buffer_objects.triangles piglit/bin/shader_runner generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-distance-vec2-vec2.shader_test -auto -fbo Fixes: `942e6af40b` ("r600/sfn: use PS and PV inline registers when possible") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39101>	2026-01-26 13:09:48 +00:00
Patrick Lerda	98c5ada8d1	r600: disable l8_srgb on r700 and older gpus The gamma is not processed by the hardware when processing a one component format texture (FMT_8). This change triggers a fall back to the r8g8b8a8_srgb format which is properly supported by the hardware of these older gpus. Here are the tests fixed on rv770: spec/arb_framebuffer_srgb/fbo-fast-clear: fail pass spec/ext_texture_srgb/fbo-fast-clear: fail pass spec/!opengl 1.1/teximage-colors gl_sluminance8/gl_sluminance8 texture with gl_.*: fail pass Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39159>	2026-01-26 12:52:10 +00:00
Patrick Lerda	d5d844bfc4	r600: fix cayman msaa shading behavior The functionality was working properly at glMinSampleShading(0.) and glMinSampleShading(1.). The issue was with the intermediary values. This change makes this function compatible with the evergreen setup. Note: this was one of the few functionalities which were working properly on evergreen but not on cayman. Here are the tests fixed: spec/arb_sample_shading/samplemask 4 all/0.500000 partition: fail pass spec/arb_sample_shading/samplemask 4/0.500000 partition: fail pass spec/arb_sample_shading/samplemask 6 all/0.250000 partition: fail pass spec/arb_sample_shading/samplemask 6 all/0.500000 partition: fail pass spec/arb_sample_shading/samplemask 6/0.250000 partition: fail pass spec/arb_sample_shading/samplemask 6/0.500000 partition: fail pass spec/arb_sample_shading/samplemask 8 all/0.250000 partition: fail pass spec/arb_sample_shading/samplemask 8 all/0.500000 partition: fail pass spec/arb_sample_shading/samplemask 8/0.250000 partition: fail pass spec/arb_sample_shading/samplemask 8/0.500000 partition: fail pass deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_rbo_4: fail pass deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_rbo_8: fail pass deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_texture_4: fail pass deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_texture_8: fail pass Fixes: `f7796a966d` ("radeonsi: add basic code for overrasterization") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38615>	2026-01-26 12:37:54 +00:00
Daniel Schürmann	6313e9f549	nir/opt_loop: Relax restrictions on opt_loop_peel_initial_break() for more loops In addition to loops where the break condition can be constant-folded, we also allow to peel the initial break from loops which have at least one phi with a constant loop-carried source, effectively removing that phi from the loop. Totals from 172 (0.22% of 79377) affected shaders: (Navi31) Instrs: 372798 -> 369181 (-0.97%); split: -1.07%, +0.10% CodeSize: 1907312 -> 1891948 (-0.81%); split: -0.89%, +0.09% VGPRs: 8436 -> 8460 (+0.28%) Latency: 3646016 -> 3396657 (-6.84%) InvThroughput: 434848 -> 389079 (-10.53%) Copies: 28436 -> 27118 (-4.63%); split: -4.79%, +0.15% Branches: 26504 -> 25344 (-4.38%); split: -4.44%, +0.06% PreSGPRs: 8585 -> 8603 (+0.21%) VALU: 148291 -> 148355 (+0.04%); split: -0.01%, +0.06% SALU: 95625 -> 92649 (-3.11%); split: -3.22%, +0.11% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33666>	2026-01-26 12:02:49 +00:00
Daniel Schürmann	71d68d9166	asahi/clc: call nir_opt_remove_phis after nir_opt_loop Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33666>	2026-01-26 12:02:49 +00:00
Daniel Schürmann	028da14e2a	panfrost/clc: call nir_opt_remove_phis after nir_opt_loop Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33666>	2026-01-26 12:02:49 +00:00
Georg Lehmann	809fb0fba3	ac/nir/lower_ps_late: emit scalar f2f16_rtz for when one half of a packed export is undef Foz-DB Navi48: Totals from 7200 (8.74% of 82405) affected shaders: Instrs: 9056391 -> 9048177 (-0.09%); split: -0.09%, +0.00% CodeSize: 48681288 -> 48640684 (-0.08%); split: -0.09%, +0.00% VGPRs: 413088 -> 413784 (+0.17%) Latency: 76340711 -> 76320080 (-0.03%); split: -0.03%, +0.00% InvThroughput: 12692959 -> 12684618 (-0.07%); split: -0.07%, +0.00% VClause: 148823 -> 148821 (-0.00%) Copies: 601739 -> 601874 (+0.02%); split: -0.01%, +0.03% VALU: 5213356 -> 5207253 (-0.12%); split: -0.12%, +0.00% SALU: 1160815 -> 1160817 (+0.00%); split: -0.00%, +0.00% VOPD: 79520 -> 79444 (-0.10%); split: +0.09%, -0.18% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:23 +00:00
Georg Lehmann	8c895c5c61	ac/nir/lower_ps_late: CSE partial packed exports Foz-DB Navi48: Totals from 425 (0.52% of 82405) affected shaders: Instrs: 1110029 -> 1109658 (-0.03%); split: -0.03%, +0.00% CodeSize: 6135272 -> 6133848 (-0.02%); split: -0.02%, +0.00% VGPRs: 29856 -> 29844 (-0.04%) Latency: 10258411 -> 10258043 (-0.00%); split: -0.00%, +0.00% InvThroughput: 1898177 -> 1897661 (-0.03%) Copies: 88221 -> 88173 (-0.05%) VALU: 575276 -> 574894 (-0.07%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Georg Lehmann	e74323577f	aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for salu Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Georg Lehmann	6cbd16daae	aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx8+ Do this late because the v_cvt_pkrtz_f16_f32 can be applied to its operand. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Georg Lehmann	57ca974d1d	aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx6/7 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:21 +00:00
Georg Lehmann	ba73792de0	aco/optimizer: fix parsing salu p_insert as shift Fixes: `88f7e3fff3` ("aco/optimizer: parse pseudo alu instructions") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:21 +00:00
Georg Lehmann	830d6de9ff	aco/isel: optimize pack_32_2x16_split(undef, const) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:20 +00:00
Georg Lehmann	b2d9615000	nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:20 +00:00
Georg Lehmann	d06b627d23	nir/opt_algebraic: optimize f2f16_rtz of bcsel with constants Foz-DB Navi48: Totals from 145 (0.18% of 82405) affected shaders: Instrs: 1706001 -> 1705669 (-0.02%); split: -0.03%, +0.01% CodeSize: 9621036 -> 9620784 (-0.00%); split: -0.02%, +0.02% SpillSGPRs: 711 -> 726 (+2.11%); split: -0.56%, +2.67% Latency: 20066360 -> 20066193 (-0.00%); split: -0.00%, +0.00% InvThroughput: 4326789 -> 4326763 (-0.00%); split: -0.00%, +0.00% Copies: 192041 -> 191995 (-0.02%); split: -0.03%, +0.01% Branches: 75673 -> 75675 (+0.00%); split: -0.00%, +0.01% VALU: 765163 -> 764835 (-0.04%); split: -0.05%, +0.00% SALU: 351758 -> 351715 (-0.01%); split: -0.01%, +0.00% VOPD: 65236 -> 65282 (+0.07%); split: +0.17%, -0.10% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:20 +00:00
Georg Lehmann	ee5492e6dd	nir/opt_algebraic: remove f2f16 roundtrip conversions Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:20 +00:00
Georg Lehmann	592b6579da	nir/opt_algebraic: optimize f2f16_rtz(min/max) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:18 +00:00
Georg Lehmann	2b92c0f06e	nir/opt_algebraic: optimize f2f16_rtz(b2f(a)) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:18 +00:00
Danylo Piliaiev	096e0aae74	tu: Avoid disabling LRZ when possible for suspend/resume+depth-only draws We don't actually care if previous suspended RP had depth-only draws with color writes skipped, we only care if previous RP disabled LRZ writes due to this; the mere fact of first draws being depth-only doesn't affect LRZ of next draws in any way. However, for next RPs in suspend-resume chain we have to assume that previous RP may have had color writes. For secondary cmdbufs with ordinary renderpasses it is easy to be less pessimistic, and that's what we do in order to not regress DXVK performance. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39293>	2026-01-26 09:48:44 +00:00
Rhys Perry	928ecfc6c0	radv: fix RADV_DEBUG=shaderstats with RT pipelines radv_dump_shader_stats() printed stats for every shader with a certain stage, and we called this function each time an RT shader is compiled. This means we could repeat the stats for a shader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39484>	2026-01-26 09:26:14 +00:00
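
Commit 2ed761021f above gives a one-line C equivalent of its fix for the 2-bit .w component of r10g10b10a2_sscaled vertex data: `w_out = w_in - (w_in > 1. ? 4. : 0.)`. The sketch below only illustrates that arithmetic on the CPU; it is not the driver code. The function name `fix_sscaled_w` is hypothetical, and the two steps are written to mirror the SETGT*4 and ADD instructions from the commit's asm dump.

```c
/*
 * Illustration of the correction described in commit 2ed761021f.
 * fix_sscaled_w is a hypothetical name, not a Mesa function.
 *
 * w_in is the value palm produces for the 2-bit field (0, 1, 2 or 3);
 * the return value is the signed result listed as "expected" in the
 * commit's table (0, 1, -2, -1).
 */
static float fix_sscaled_w(float w_in)
{
    /* Mirrors "SETGT*4 __.w, R10.w, 1.0": (w_in > 1.0 ? 1.0 : 0.0) * 4 */
    float pv_w = (w_in > 1.0f ? 1.0f : 0.0f) * 4.0f;

    /* Mirrors "ADD R10.w, R10.w, -PV.w" */
    return w_in - pv_w;
}
```

Feeding in the four raw values 0, 1, 2, 3 yields 0, 1, -2, -1, which matches the expected column in the commit message. Per the same message, cypress already returns these values and does not need the correction.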

217689 commits