Commit graph

187529 commits

Author SHA1 Message Date
Lionel Landwerlin
bbade676f4 anv/iris: centralize TBIMR drirc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33795>
2025-02-27 21:10:59 +00:00
Yiwei Zhang
af1b4f61b5 venus: added passthrough extension support - Part V
Below extensions are added:
1. VK_KHR_fragment_shader_barycentric
2. VK_EXT_legacy_vertex_attributes
3. VK_EXT_ycbcr_image_arrays

Test:
- dEQP-VK.fragment_shading_barycentric.*
- dEQP-VK.pipeline.*.vertex_input.legacy_vertex_attributes.*
- dEQP-VK.ycbcr.format.*

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33783>
2025-02-27 20:35:36 +00:00
Yiwei Zhang
b02e8a9f1d venus: added passthrough extension support - Part IV
Below extensions are added:
1. VK_EXT_shader_atomic_float
2. VK_EXT_shader_atomic_float2
3. VK_EXT_shader_image_atomic_int64
4. VK_EXT_shader_replicated_composites

Test:
- dEQP-VK.glsl.atomic_operations.*
- dEQP-VK.image.atomic_operations.*

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33783>
2025-02-27 20:35:35 +00:00
Yiwei Zhang
1fe8be9215 venus: added passthrough extension support - Part III
Below are added:
1. VK_KHR_shader_maximal_reconvergence
2. VK_KHR_shader_subgroup_uniform_control_flow
3. VK_KHR_shader_quad_control
4. VK_EXT_shader_subgroup_vote

Test:
- dEQP-VK.reconvergence.*
- dEQP-VK.subgroups.subgroup_uniform_control_flow.*
- dEQP-VK.subgroups.shader_quad_control.*

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33783>
2025-02-27 20:35:35 +00:00
Yiwei Zhang
f16345b2f6 venus: added passthrough extension support - Part II
Below are added:
1. VK_KHR_compute_shader_derivatives
2. VK_NV_compute_shader_derivatives
3. VK_KHR_workgroup_memory_explicit_layout

Test:
- dEQP-VK.compute.*workgroup_memory_explicit_layout.*
- dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.*

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33783>
2025-02-27 20:35:35 +00:00
Yiwei Zhang
48b50c77df venus: added passthrough extension support - Part I
Below are added:
1. VK_KHR_depth_clamp_zero_one
2. VK_EXT_depth_clamp_zero_one
3. VK_EXT_depth_range_unrestricted
4. VK_EXT_post_depth_coverage
5. VK_ARM_rasterization_order_attachment_access

Test: dEQP-VK.depth.*

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33783>
2025-02-27 20:35:35 +00:00
Yiwei Zhang
785f44adc8 venus: sync protocol for the passthrough extensions
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33783>
2025-02-27 20:35:35 +00:00
Kevin Chuang
87ff7b061f anv/bvh: Fix copy shader handling sparse buffer
Fixes: 692b5fa9f2 ("anv: Add shader to copy acceleration structures")

This commit fixes the future test "sparse_binding_structures" for
"header_bottom_address" for ray tracing pipeline.

Even on 48-bit ray tracing (Xe1/2), the software-defined part
instance_leaf_part1.bvh_ptr has to be in canonical form for copy.comp
to deference a bvh, which means we have to preserve the upper 16bits.
This is especially relevant in cases where the acceleration structure buffer
is located high, such as sparse buffer.

Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
2025-02-27 20:10:10 +00:00
Kevin Chuang
b9a980ea73 anv/bvh: Fix encoder handling sparse buffer
Fixes: 2fe57947e3 ("anv: Implement encode shader to fit in ANV BVH")

This commit resolves the failures in the future tests
"sparse_binding_structures" for rayquery. Sparse buffers' heaps are
located high, and since it's in canonical form, the higher 16bits are
all set to 1. However, the existing encoder did not expect any non-zero
values at the higher 16bits. As a result, the instance flags got
corrupted, causing most triangle tests to fail.

Thanks for Paulo providing insights about sparse buffer properties.

Co-developed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
2025-02-27 20:10:10 +00:00
Sagar Ghuge
2c8148a76e anv: CPS LOD Compensation Enable is deprecated on Xe2+
On Xe2+, Hardware will always have scale.x and scale.y as 1.0.
This is not fixing any issues.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33726>
2025-02-27 19:49:02 +00:00
Job Noorman
739ca77e66 nir/lower_subgroups: use build_cluster_mask for quad mask
build_subgroup_quad_mask can now be written in terms of
build_cluster_mask.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31732>
2025-02-27 18:53:19 +00:00
Jason Macnak
14bc2e2d39 gfxstream: Remove duplicated boxed handle func declarations
... and fix up include paths.

Test: cvd create --gpu_mode=gfxstream_guest_angle_host_swiftshader

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Jason Macnak
039e64264a gfxstream: Move the handle replay buffer into BoxedHandleManager
... since `BoxedHandleManager` should, well, manager the handles.

This simplifies `VkDecoderGlobalState` a little bit and should also
allow us to remove a bunch of functions that no longer need to
depend on `VkDecoderGlobalState`.

Test: cvd create --gpu_mode=gfxstream_guest_angle_host_swiftshader
Test: cvd snapshot_take --force \
                        --auto_suspend \
                        --snapshot_path=/tmp/snapshot1
Test: cvd reset -y
Test: cvd create --snapshot_path=/tmp/snapshot1

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Jason Macnak
4ddd8bd96e gfxstream: Remove unused handling mappers
Not used.

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Serdar Kocdemir
6bf253b8e8 gfxstream: Add VK_KHR_multiview support
Enable the extension to be advertised for the guest.

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Serdar Kocdemir
35dd4b4fc2 gfxstream: Track more fence functions on host
Add vkGetFenceStatus and vkWaitForFences functions to the
global state tracking list for the host.
This will allow adding more functionality to the fences
and perform additional operations before waiting for and
signaling them.

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Sergii Ushakov
3449c3c98a gfxstream: Emulate DMABUF with OPAQUE_FD
Enables software rendering via swiftshader on host side and angle
on guest when using DMABUF based framebuffers.

TEST=Run internal application successfully

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Jason Macnak
18afdaa168 gfxstream: Move snapshot decoder replay into VkDecoderGlobalState
... to break the recursive behavior of the replay calling into
VkDecoderSnapshot so that locking and thread safety annotations can be
preserved in VkDecoderSnapshot.

Follow up to aosp/3412302.

Test: cvd create --gpu_mode=gfxstream_guest_angle_host_swiftshader
Test: cvd snapshot_take --snapshot_path=<>
Test: cvd create --snapshot_path=<>

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Aditya Kumar
63de837a8b gfxstream: Fix compiling gfxstream for musl libs
musl has the unistd.h in top level.

Test: m USE_HOST_MUSL=true

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Bo Hu
0a0a350499 gfxstream: Adding support for VK_KHR_global_priority extension
According to
https://registry.khronos.org/vulkan/specs/latest/man/html/VK_KHR_global_priority.html

This device extension allows applications to query
the global queue priorities supported by a queue
family, and then set a priority when creating queues

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
2025-02-27 17:37:55 +00:00
Benjamin Lee
55c476efed panvk: advertise shaderFloat16
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Benjamin Lee
252c59602e panfrost: implement 16-bit ldexp
Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy
lowering. The codegen for this is quite bad currently, but would be
improved by implementing unpack_32_2x16_split_*, and by fusing
comparisons with CSEL.

The main alternative is converting to F32, then LDEXP.f32, then
converting back to F16. This has better codegen for dynamic exponents
currently, but worse in the common case with a constant exponent where
all the saturating cast logic can be folded.

Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2
when shaderFloat16 is enabled in panvk.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Benjamin Lee
2a70665df7 panfrost/va: remove swizzle mod from LDEXP
This instruction does not support swizzles. This information is not used
for anything, but will be if we use the instruction tables for
bi_lower_swizzle.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 316486dd9f ("pan/va: Add initial ISA.xml for Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Benjamin Lee
810351ad03 panfrost: fix condition in bi_nir_is_replicated
The original implementation of this returned false when the src was
replicated, and true when it was not.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 21bdee7bcc ("pan/bi: Switch to lower_bool_to_bitsize")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Benjamin Lee
fb9583cd53 panfrost: reorder lower_bit_size pass
nir_lower_int64 may generate 16-bit fexp2 instructions, which need to be
lowered.

Fixes dEQP-VK.spirv_assembly.instruction.compute.convertstof.int64_to_float16_m1234
when shaderFloat16 is enabled in panvk. I don't believe it's possible to
trigger this with mediump, so it's not a bug without shaderFloat16.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Benjamin Lee
a33cd3def2 panfrost: fix large int32->float16 conversions
On vulkan, truncating to S/U16 before converting is not valid, because
out-of-range conversions are specified to be correctly rounded. IEEE 754
requires that out-of-range values round to ±inf with RTNE and ±F16_MAX
with RTZ.

On gl, truncating is valid for U16->F16, because out-of-range int->float
conversions are undefined behavior. For S16->F16, it is not valid
because S16_MAX < F16_MAX, so some in-range values will be truncated as
well.

Instead, just handle S/U16->F16 as S/U16->F32->F16.

Fixes dEQP-VK.spirv_assembly.instruction.compute.convertstof.int32_to_float16_*
when shaderFloat16 is enabled in panvk.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: be74b84e6f ("pan/bi: Fill in some more conversions")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Alejandro Piñeiro
142311258d nir: aliasing checks should be also done with index != 0
Right now the aliasing/overlapping checks are only done with index
0. I guess that was done because variables don't get a different
internal location even if you have a different index.

But doing that, the checks would not detect a case like this:
  layout(location = 0, index = 1) out vec4 color;
  layout(location = 0, index = 1) out vec4 factor;

That was used on the following piglit parser test:
spec/arb_explicit_attrib_location/1.10/compiler/layout-13.frag

And as the spec included on that test, is a link error case:

" * if more than one varying out variable is bound to the same
    number and index; or"

This commit executes the aliasing checks for index 1 too, and moves
the skip down, to only skip if the current variable and all previous
location-assigned variables has different index and location.

The bad news is that now such assigned variables need to be tracked on
OpenGL-ES. Before that commit that was avoided.

With this commit the mentioned parser test properly fails to link in
any driver.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33093>
2025-02-27 15:10:52 +00:00
Job Noorman
2619d576e7 nir/lower_phis_to_scalar: don't create moves for undef sources
Creating moves out of undefs makes it more difficult for other passes to
detects undefs without having to chase moves. Instead, just create a new
1-component undef.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29889>
2025-02-27 13:18:14 +00:00
Job Noorman
5ae12b6a5a nir/lower_phis_to_scalar: use nir_builder API where possible
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29889>
2025-02-27 13:18:14 +00:00
Job Noorman
66407e3d24 nir/lower_phis_to_scalar: remove unused mem_ctx
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29889>
2025-02-27 13:18:14 +00:00
Lionel Landwerlin
f8af4b597e vulkan/runtime: store flags on descriptor set layouts
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33799>
2025-02-27 13:26:58 +02:00
Peyton Lee
9c97b2bf9b radeonsi/vpe: fix background issue
Fixed the issue where the background color was specified but not displayed.
Fixed the issue where the color would be different from the expected.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33790>
2025-02-27 11:02:19 +00:00
Daniel Schürmann
3c27a9f0e2 aco/tests: add more tests for chained branches
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
2025-02-27 10:40:01 +00:00
Daniel Schürmann
713396ec8e aco/assembler: Don't insert chained branches into otherwise empty blocks
No fossil changes, but keeps block offsets of the empty blocks intact.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
2025-02-27 10:40:01 +00:00
Daniel Schürmann
6659db285a aco/assembler: Fix short jumps over chained branches
If we insert

   <code>
   s_branch 1
   s_branch Target

at the end of some block, and later hide an additional chained branch
after the existing one, then we have to update the 's_branch 1' to
also jump over the newly added branch.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Closes: #12673
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
2025-02-27 10:40:01 +00:00
Christian Gmeiner
dd896828ba etnaviv/ci: Bring back GC7000
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33768>
2025-02-27 10:18:13 +00:00
Yiwei Zhang
acd5497067 venus: support wsi maintenance1 extensions
Venus has long supported creating swapchain image alias via binding. So
below are exposed without extra work needed:
- VK_EXT_surface_maintenance1
- VK_EXT_swapchain_maintenance1

Test: dEQP-VK.wsi.*.maintenance1.*

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33782>
2025-02-27 09:53:57 +00:00
Yiwei Zhang
673a95e5b4 venus: align on wsi frontends support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33782>
2025-02-27 09:53:57 +00:00
Job Noorman
1673824908 ir3/opt_prefetch_descriptors: fix crash after nir_progress rewrite
nir_progress was being called on the preamble even if it was NULL.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 9a58a8257e ("treewide: Switch to nir_progress")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33791>
2025-02-27 09:25:06 +00:00
Kenneth Graunke
88309a9818 brw: Rename shared function enums for clarity
Our name for this enum was brw_message_target, but it's better known as
shared function ID or SFID.  Call it brw_sfid to make it easier to find.

Now that brw only supports Gfx9+, we don't particularly care whether
SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5.  Also, the LSC SFIDs
were confusingly tagged "GFX12" but aren't available on Gfx12.0; they
were introduced with Alchemist/Meteorlake.

GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing.  It sounds
like the SFID to use for the sampler on Gfx6+, however it has nothing to
do with the sampler at all.  BRW_SFID_SAMPLER remains the sampler SFID.
On Haswell, we ran out of messages on the main data cache data port, and
so they introduced two additional ones, for more messages.  The modern
Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2.  I think
the "sampler" name came from some idea about reorganizing messages that
never materialized (instead, the LSC came as a much larger cleanup).

Recently we've adopted the term "HDC" for the legacy data cluster, as
opposed to "LSC" for the modern Load/Store Cache.  To make clear which
SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2.

We were also citing the G45, Sandybridge, and Ivybridge PRMs for a
compiler that supports none of those platforms.  Cite modern docs.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>
2025-02-27 08:49:24 +00:00
Lionel Landwerlin
dcb5cfbfcc vulkan/runtime: add a multialloc vk_shader allocator
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33792>
2025-02-27 10:01:17 +02:00
Lionel Landwerlin
009ef67c8d vulkan/runtime: pass robustness state to preprocess vfunc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33792>
2025-02-27 10:01:16 +02:00
Lionel Landwerlin
4dba1ad93f vulkan/runtime: ensure robustness state is fully initialized
This is part of the hashing key :

==25753== Uninitialised byte(s) found during client check request
==25753==    at 0x93D29AE: blob_write_bytes (blob.c:164)
==25753==    by 0x93A62C6: vk_pipeline_precomp_shader_serialize (vk_pipeline.c:722)
==25753==    by 0x93AC55E: vk_pipeline_cache_add_object (vk_pipeline_cache.c:433)
==25753==    by 0x93A691B: vk_pipeline_precompile_shader (vk_pipeline.c:875)
==25753==    by 0x93A8FB9: vk_create_graphics_pipeline (vk_pipeline.c:1715)
==25753==    by 0x93A9799: vk_common_CreateGraphicsPipelines (vk_pipeline.c:1860)
==25753==  Address 0xf1adf82 is 82 bytes inside a block of size 152 alloc'd
==25753==    at 0x64FA858: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==25753==    by 0x99AAC38: vk_default_alloc (vk_alloc.c:26)
==25753==    by 0x93A403B: vk_alloc (vk_alloc.h:48)
==25753==    by 0x93A406B: vk_zalloc (vk_alloc.h:56)
==25753==    by 0x93A60A0: vk_pipeline_precomp_shader_create (vk_pipeline.c:680)
==25753==    by 0x93A689D: vk_pipeline_precompile_shader (vk_pipeline.c:866)
==25753==    by 0x93A8FB9: vk_create_graphics_pipeline (vk_pipeline.c:1715)
==25753==    by 0x93A9799: vk_common_CreateGraphicsPipelines (vk_pipeline.c:1860)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33792>
2025-02-27 10:01:02 +02:00
Tapani Pälli
78e5157a9c intel/compiler: add a spec note about L1WT types being uncached
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33755>
2025-02-27 05:38:35 +00:00
Peyton Lee
7c8d58c26c radeonsi/vpe: vpe support hdr input
when an application asks for supported formats
will return HDR formats(2020, explicit) is supported.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33731>
2025-02-27 03:15:17 +00:00
Peyton Lee
43ce5b1138 radeonsi/vpe: vpe support tonemapping
if input source is HDR stream, vpe can use gmlib generating tonemapping
table to convert HDR image to SDR image.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33731>
2025-02-27 03:15:17 +00:00
Peyton Lee
2e46c41448 amd/gmlib: add gmlib for radeonsi
radeonsi drivers can use gmlib to generate 3dlut used to do tonemapping.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33731>
2025-02-27 03:15:16 +00:00
Marek Olšák
2e124dd389 util: remove glthread enablement from app profiles
It's mature, so if you want it, just enable it for your driver by default.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33480>
2025-02-27 02:28:58 +00:00
Faith Ekstrand
8fffcdb18b nak/nir: Re-materialize load_const instructions in use blocks
This is useful both for correctness (to ensure that things we think are
constant stay constant) and it improves performance a bit by reducing
register pressure and avoiding spilling.

Pipeline-db stats:

    CodeSize: 29665072 -> 29437344 (-0.77%); split: -0.92%, +0.16%
    Number of GPRs: 157124 -> 156082 (-0.66%)
    SLM Size: 148900 -> 146436 (-1.65%)
    Static cycle count: 6840286 -> 6805711 (-0.51%); split: -0.98%, +0.47%
    Spills to memory: 177779 -> 173337 (-2.50%)
    Fills from memory: 177779 -> 173337 (-2.50%)
    Spills to reg: 17692 -> 16731 (-5.43%)
    Fills from reg: 12013 -> 11897 (-0.97%)
    Max warps/SM: 309128 -> 309456 (+0.11%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
2025-02-27 00:26:54 +00:00
Faith Ekstrand
8de37b142e nvk: Only support compute shader derivatives on Turing+
Fixes: e0e7d8d910 ("nvk: Advertise VK_NV/KHR_compute_shader_derivatives")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
2025-02-27 00:26:54 +00:00