We don't support CCS on block-compressed textures prior to Xe2. On Xe2,
CCS is enabled on every image.
Improves INTEL_DEBUG=perf output. For example, in the Naraka trace on
DG2, we now report that r32_uint is CCS_E-incompatible instead of
bptc_rgba. This incompatibility is due to the storage usage flag and
will be clarified in future commits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41034>
Print the image format which is incompatible (or has an incompatible
format list). On gfx12+, the format list shouldn't impact CCS_E-compatibility.
So, not printing the entire list should be sufficient on those
platforms.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41034>
anv emits performance warnings about compression being disabled earlier,
so there is no need to emit this for AUX_NONE. Do provide the tiling,
however, as Xe2+ supports compressed linear surfaces.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41034>
We're already grabbing the VS for VERTEX_VARY_SPD on v11 and earlier and
we're already carrying the code to check for IA_PRIMITIVE_TOPOLOGY. It
makes sense to have the code which selects the shader descriptor there, too.
Otherwise the helper is a little too magic and can lead to bugs if
someone isn't paying attention. (See also the previous commit.)
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41638>
Intersection shaders work on custom procedural geometry, which is
present only at the BLAS (object) level, not at the TLAS (world) level.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41739>
Stops incorrectly assuming cached-coherent memory is supported on
hardware that does not support it, such as a610 and a619-holi.
Fixes: 5a59410962 ("turnip: add cached and cached-coherent memory types")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41761>
Zink sets multiple external memory handle types (such as opaque FD and DMA-BUF)
without confirming that the Vulkan driver actually supports them. This may lead
to failures when attempting to allocate external memory with vkAllocateMemory.
This patch introduces query_external_memory_compatibility() to verify
handle type support via vkGetPhysicalDeviceImageFormatProperties2.
Handle types are combined only if they are compatible; otherwise, a single
supported type is used as a fallback.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40212>
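The combine-or-fall-back rule for external memory handle types can be sketched roughly as below. This is a minimal illustration, not Zink's code: `pick_handle_types` and the `query_compatible` callback are hypothetical names, and the callback merely stands in for whatever query_external_memory_compatibility() learns from the driver (for instance the `compatibleHandleTypes` mask that Vulkan reports in `VkExternalMemoryProperties`).

```python
# Bit values of VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT and
# VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.
OPAQUE_FD = 0x1
DMA_BUF = 0x200


def pick_handle_types(wanted, query_compatible):
    """Return the handle-type mask to request (hypothetical helper).

    wanted: handle-type bits in preference order.
    query_compatible(bit): stand-in for the driver query; returns the mask
    of handle types compatible with `bit`, or 0 if `bit` is unsupported.
    """
    combined = 0
    for bit in wanted:
        combined |= bit

    # Combine only if every wanted type reports all the others as compatible.
    if all(query_compatible(bit) & combined == combined for bit in wanted):
        return combined

    # Otherwise fall back to the first type that is supported on its own.
    for bit in wanted:
        if query_compatible(bit) & bit:
            return bit
    return 0
```

With a driver that reports each type as compatible only with itself, the sketch returns the first preferred type instead of the combined mask.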
find_good_mod was accumulating DISJOINT_BIT across iterations and
setting ici->usage on success. Change it to return results via
out-parameters and save/restore ici->flags around each modifier
attempt. The caller (negotiate_image_config) now explicitly sets
ici->usage and ici->flags after find_good_mod returns.
Also save/restore flags in the LINEAR modifier fallback path.
Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
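A rough sketch of the find_good_mod save/restore pattern follows. It is an illustration under assumptions, not the Zink implementation: the dict-based `ici`, the `try_modifier` callback, and `negotiate_modifier` are hypothetical stand-ins, and keeping the last good modifier follows the max-by-position behaviour described elsewhere in this series.

```python
DISJOINT_BIT = 0x200  # illustrative flag bit that one attempt may set


def find_good_mod(ici, modifiers, try_modifier):
    """Return (modifier, usage, flags) for the last good modifier, or None.

    try_modifier(ici, mod) is a hypothetical callback: it may set bits in
    ici["flags"] and returns a usage mask on success or None on failure.
    """
    base_flags = ici["flags"]
    result = None
    for mod in modifiers:
        usage = try_modifier(ici, mod)
        attempt_flags = ici["flags"]
        # Restore the flags so nothing accumulates across iterations.
        ici["flags"] = base_flags
        if usage is not None:
            result = (mod, usage, attempt_flags)
    return result


def negotiate_modifier(ici, modifiers, try_modifier):
    """Caller side: commit usage and flags explicitly, only on success."""
    found = find_good_mod(ici, modifiers, try_modifier)
    if found is None:
        return None
    mod, usage, flags = found
    ici["usage"] = usage
    ici["flags"] = flags
    return mod
```

The point of the out-parameter style is that a failed or superseded attempt leaves no trace in `ici`.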
Replace the nested retry loops (eval_ici, set_image_usage,
double_check_ici, suboptimal_check_ici, try_set_image_usage_or_EXTENDED)
with a flat candidate array that encodes the same fallback order.
Instead of mutating a shared VkImageCreateInfo through deeply nested
function calls and retrying with toggled flags, we now:
1. build_usage_candidates() generates an array of (tiling, usage,
   flags, has_format_list) tuples in preference order
2. try_image_config() applies each candidate and calls check_ici
3. negotiate_image_config() iterates tiling/extended combos, builds
   candidates for each, and takes the first passing one
The modifier path (find_good_mod) is kept separate since it iterates
modifiers and takes the last good one (max-by-position, matching
the GBM worst-to-best convention), which is fundamentally different
from the candidate model's first-match-from-fallback-chain.
Duplicate candidates from the old code's redundant retry paths are
eliminated via dedup_configs(). The pNext chain surgery in
double_check_ici (manually unlinking VkImageFormatListCreateInfo) is
replaced by try_image_config's explicit format list chain/unchain.
The cube-compatible post-pass is simplified to a single check_ici
call instead of re-running the full negotiation.
Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
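The flat candidate model from the negotiation rewrite can be sketched as below. The function names mirror the commit message, but the bodies are assumptions: in particular the fallback order inside `build_usage_candidates` is made up for illustration, and `check_ici` is passed in as a plain callback rather than a Vulkan query.

```python
from typing import NamedTuple

MUTABLE_FORMAT = 0x8  # value of VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT


class Candidate(NamedTuple):
    tiling: str
    usage: int
    flags: int
    has_format_list: bool


def dedup_configs(candidates):
    """Drop repeated candidates while keeping preference order."""
    seen = set()
    unique = []
    for cand in candidates:
        if cand not in seen:
            seen.add(cand)
            unique.append(cand)
    return unique


def build_usage_candidates(tiling, usage, flags):
    """Hypothetical fallback order: as requested, then without the format
    list, then additionally without the mutable-format flag."""
    return dedup_configs([
        Candidate(tiling, usage, flags, True),
        Candidate(tiling, usage, flags, False),
        Candidate(tiling, usage, flags & ~MUTABLE_FORMAT, False),
    ])


def negotiate_image_config(tilings, usage, flags, check_ici):
    """Take the first candidate that passes check_ici, or None."""
    for tiling in tilings:
        for cand in build_usage_candidates(tiling, usage, flags):
            if check_ici(cand):
                return cand
    return None
```

When the requested flags do not include the mutable-format bit, the second and third entries coincide and `dedup_configs` removes the repeat, which is the kind of redundant retry the flat array is meant to eliminate.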
Move disjoint/non-disjoint BindImageMemory into bind_image_memory().
create_image now reads as a clear sequence: format list, init_ici,
negotiate, pNext chain, CreateImage, allocate, bind. No functional
change.
Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
Move VkExternalMemoryImageCreateInfo, DRM modifier explicit/list
create info, and user memory pNext chain building into
setup_image_pnext(). The Vk*Info structs now live in
image_pnext_state on the caller's stack. No functional change.
Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
While trying to use that feature on RADV I ran into an infinite
recursion.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 97b4a6d0e3 ("compiler: SPIR-V shader replacement")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41751>
The kevins are increasingly creaky and unreliable after a decade of
excellent service, so it's time to send them off to the farm and move
our T860 jobs to a device type which can actually run jobs.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41749>
Forcing a flush by setting initial_gfx_cs_size to zero requires
that packets are always emitted when starting a new gfx IB.
This is not the case with userq, as there is no preamble.
Add a new flag to be used with si_flush_gfx_cs to force a flush.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41530>
Dishonored 2 or DXVK is creating pipelines with empty fragment
shaders. With alpha-to-coverage as a dynamic state, we currently assume
that a render target is needed, but if the shader is not writing
anything, it is not.
This change only considers the color output writes as it's the alpha
channel there that is used for coverage computation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41711>
Suggested by @gurchetansingh.
Android's Soong build system treats several compiler warnings as errors
by default: https://android.googlesource.com/platform/build/soong/+/27f57506/cc/config/global.go/#218
To catch these issues in Mesa, introduce `soong_compat_c_args`
and `soong_compat_cpp_args` with the following flags treated as errors:
-D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS
-Werror=date-time
-Werror=gnu-alignof-expression
-Werror=ignored-qualifiers
-Werror=implicit-fallthrough
-Werror=int-conversion
-Werror=missing-prototypes
-Werror=pragma-pack
-Werror=pragma-pack-suspicious-include
-Werror=sizeof-array-div
-Werror=string-plus-int
-Werror=unreachable-code-loop-increment
These compatibility flags are added to the meson configurations
for ANV, Gfxstream, Lavapipe, PanVK, Turnip, and Venus.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
Based on the approach in e0eea5ea4e.
When a file is too large, -Wmisleading-indentation will give the warning
below, which we can't suppress with a #pragma:
../src/freedreno/vulkan/tu_perfetto.cc: In function 'void setup_incremental_state(MesaRenderpassDataSource<TuRenderpassDataSource, TuRenderpassTraits>::TraceContext&, tu_device*)':
../src/freedreno/vulkan/tu_perfetto.cc:162: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
162 | if (!state->was_cleared)
../src/freedreno/vulkan/tu_perfetto.cc:162: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89549 for details.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
Only adding the workarounds that have an actual effect on that driver.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41664>
We want to extract the leaf type from the potential hit and assign it
to the committed hit.
Instead, we were simply assigning leaf type 0x7 to the committed hit.
This patch masks out the leaf type with nir_iand_imm and also updates the
incorrect field comment.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41667>
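The difference between assigning the constant 0x7 and masking with it can be shown with a small sketch. It assumes 0x7 is the mask of a 3-bit leaf type field; the field position (`LEAF_TYPE_SHIFT`) and both function names are hypothetical, and the real fix is expressed with nir_iand_imm in NIR rather than in Python.

```python
LEAF_TYPE_MASK = 0x7   # assumed: 3-bit leaf type field
LEAF_TYPE_SHIFT = 0    # hypothetical field position, for illustration only


def leaf_type_buggy(potential_hit_dword):
    # Ignores the input: every committed hit ends up with leaf type 0x7.
    return LEAF_TYPE_MASK


def leaf_type_fixed(potential_hit_dword):
    # Extract the field from the potential hit by shifting and masking.
    return (potential_hit_dword >> LEAF_TYPE_SHIFT) & LEAF_TYPE_MASK
```

For a potential hit dword of `0b10100011`, the fixed version yields leaf type 3, while the buggy version yields 7 regardless of the input.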
Add optional OA performance counter collection around each execute()
call. Examples:
```
# List all profiles and counters, with descriptions.
$ executor --oa list
# Collect all counters from a profile.
$ executor --oa ComputeBasic file.lua
# Collect a subset of counters from a profile, separated by comma.
$ executor --oa ComputeBasic:GpuTime,AvgGpuCoreFrequency file.lua
# The ComputeBasic profile is used by default, so bare counter names also work.
$ executor --oa GpuTime file.lua
```
The selected counters are printed to stdout after the script finishes,
or written to a file specified by --oa-csv FILENAME.
Assisted-by: Pi coding agent (GPT-5.5)
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41610>
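The `--oa` argument forms shown in the examples could be parsed along the lines below. This is only a sketch consistent with those examples, not the executor's actual parser: `parse_oa_spec` and `known_profiles` are hypothetical, the `list` keyword is assumed to be handled by the caller, and telling a bare profile name from a bare counter name by looking it up in the set of known profiles is an assumption.

```python
DEFAULT_PROFILE = "ComputeBasic"


def parse_oa_spec(spec, known_profiles):
    """Return (profile, counters); counters is None for 'all counters'."""
    if ":" in spec:
        # "Profile:CounterA,CounterB"
        profile, names = spec.split(":", 1)
        return profile, names.split(",")
    if spec in known_profiles:
        # Bare profile name: collect every counter in it.
        return spec, None
    # Bare counter names: fall back to the default profile.
    return DEFAULT_PROFILE, spec.split(",")
```

For example, `parse_oa_spec("GpuTime", {"ComputeBasic"})` resolves to the default profile with a single counter.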
ORCJIT expects every function's prototype to be present even when using
object caches. Code for adding stubs for entry point functions was added
previously when implementing shader cache for ORCJIT, but when using
OpenCL, extra functions could be present in compute shaders which need
stubs too.
Reuse the code for constructing references for extra functions to
generate function stubs for them.
This fixes function calls with Rusticl on llvmpipe with ORCJIT.
Fixes: bb0efdd4d8 ("llvmpipe: add shader cache support for ORCJIT implementation")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41532>
This struct was initially packed to fit in a slot in NIR intrinsics
indices. Nowadays NIR supports larger indices and cooperative matrix
has extensions that allow it to go beyond the existing limit. This
patch changes the struct to be larger and removes the manual bit packing.
The hash table change is to use the specialized version for u64 keys
that's available in src/util.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41691>
This is not about memory traffic but about updating the Tmax
value/distance, so that on the next intersection we compare against the
updated Tmax value/distance instead of the original distance.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41709>
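Why the Tmax update matters can be seen in a generic closest-hit loop. The sketch below is a plain ray tracing illustration with hypothetical names, not the driver's code: committing a hit shrinks `t_max`, so each later candidate is compared against the updated distance.

```python
def closest_hit(candidate_distances, t_min, t_max):
    """Return the nearest accepted hit distance, or None if nothing hits."""
    committed = None
    for t in candidate_distances:
        if t_min <= t < t_max:
            committed = t
            # Commit: later candidates must beat this distance, not the
            # original t_max.
            t_max = t
    return committed
```

With candidates `[5.0, 3.0, 4.0]` and an initial range of `[0.0, 10.0)`, the loop commits 5.0, then 3.0, and rejects 4.0. Without the `t_max` update it would wrongly end on 4.0.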
The execution mask gets applied to the last thread in the threadgroup to
mask off SIMD lanes. But with BTD enabled, we are seeing that only the last 4
components have valid stack IDs and the upper 4 components of the register
are zero.
Changing the execution mask somehow populates the stack IDs properly.
This is on simulator, before changing the execution mask:
00000000 00000000 00000000 00000000 000F000E 000D000C 000B000A 00090008 00000000 00000000 00000000 00000000 000F000E 000D000C 000B000A 00090008 r1
After changing execution mask:
000F000E 000D000C 000B000A 00090008 00070006 00050004 00030002 00010000 000F000E 000D000C 000B000A 00090008 00070006 00050004 00030002 00010000 r1
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41409>