It doesn't make sense to use a key from a random intersection shader.
fossil-db (gfx1201):
Totals from 3 (0.00% of 210263) affected shaders:
Instrs: 9728 -> 10024 (+3.04%)
CodeSize: 60140 -> 60012 (-0.21%)
Latency: 95724 -> 95905 (+0.19%)
InvThroughput: 15015 -> 15044 (+0.19%)
VALU: 2985 -> 2997 (+0.40%)
VMEM: 345 -> 429 (+24.35%)
VOPD: 307 -> 323 (+5.21%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
This should be done if the any-hit enables robustness but the intersection
does not.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
The traversal stage key is still a bit nonsense.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
It doesn't really make sense to make this per-mesa_shader_stage. Each
VkPipelineShaderStageCreateInfo can have different flags.
This is just a refactor at the moment. Actually letting them differ within
a mesa_shader_stage is for a later commit.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
This is more appropriate, and can be done now that the function is called
in radv_shader_deserialize().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
This improves RADV_DEBUG=hang's pipeline.log when shader caching is
enabled.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
Instead of passing various fields from stage, just pass the entire object.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
NIR printing is done earlier without nir_string, so I don't know why this
was done.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
No need to create these again and pass them around as parameters. These
functions already have plenty of those.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
It points to the heap variable.
This fixes
dEQP-VK.binding_model.descriptor_heap.basic.raygen.acceleration_structure_untyped.
Fixes: 20d11c59a4 ("vulkan: Add a lowering pass for descriptor heap mappings")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42252>
This isn't intended to be used for sparse BOs and it was incorrect
anyway because flags isn't initialized, so it was only clearing the
original VA range, not including the padding. Since sparse is still
experimental on GFX6-7, let's just apply the workaround to non-sparse
BOs.
This fixes sparse support on VEGA10, since addc719ec2
("radv: workaround has_smem_partial_oob_access_bug").
Fixes: 10a5e5e4f3 ("radv/amdgpu: Add ability to pad BOs with a read-only VM page")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42245>
Because free_list is always NULL for REPLAYED arenas, freed blocks were
never passed to add_hole() and freelist.prev was still NULL. So,
adjacent blocks were never merged together and that caused a memleak
with unreachable blocks.
This fixes a memleak detected by ASAN in
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.s0_l11_check_capture_replay_handles
and similar tests.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42012>
Add a semantic location check, because multiple variables can share the
same component location.
Fixes: ea863c0c1c ("nir/print: Do not access invalid indices of load_uniform")
Signed-off-by: Caius Moldovan <caius.moldovan@imgtec.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42114>
UNORM16/SNORM16 render targets are backed by 16-bit-integer TLB
formats, which V3D HW cannot blend. The compiler already supports
software blend lowering in NIR, but V3DV only enabled it for dual-src
blending. As a result format_supports_blending refused the BLEND_BIT
for these formats and Dawn could not advertise the WebGPU
Unorm16TextureFormats feature.
Set pipeline->blend.use_software when any color attachment uses a
software-normalised format so the existing NIR blend lowering kicks
in, and expose VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT for
those formats.
Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42176>
Add a Foldable trait, similar to the one already used in NAK, for software
emulation of opcodes. Mali has many variations, like V4I8, that run the
exact same operation independently on each component of a vector, so this
commit also adds a FoldableComp trait that lets the implementor focus on a
single component and automatically implements Foldable.
Also add tests for OpShiftLop as an initial subject. Most of the
arithmetic opcodes will be added over time, to build a tight description
of the hardware.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
Add the generic infrastructure to load/store the test data and compile
the shader, along with simple tests that use the hw_runner.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
This is a very small driver that just sends compute jobs to the graphics
card without any of the Vulkan or OpenGL indirections. For now it only
supports v10-v13, since that's what Kraid is targeting. Much of the
low-level code that handles CSF encoding and descriptor handling is in C
for simplicity (and because there is no genxml equivalent for Rust yet).
device.rs also implements a bare-bones memory-safe Rust abstraction for
Mali GPUs, as a treat.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
We'll need the extra assurance if we want to share the model across
threads.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
The compiler will also implement a very small driver that depends
on genxml and libpanfrost, so it needs to be defined after them, but
before clc.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
Previously, libpanfrost depended on the panfrost compiler solely for
pan_disassemble, the function used to disassemble and print shaders.
We'll need to add a dependency from the kraid tests to libpanfrost, and
this made things harder due to meson shenanigans.
This commit breaks the dependency between libpanfrost and the compiler by
taking the disassembler as a callback, so that the user can provide their
own disassembler.
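The decoupling described above is ordinary dependency inversion through a function pointer. A minimal C sketch of the idea follows; every type and function name in it is hypothetical and does not reflect libpanfrost's actual signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback type: the library only knows this signature. */
typedef void (*shader_disasm_cb)(FILE *fp, const void *code, size_t size,
                                 void *user_data);

/* Hypothetical decode context owned by the library user. */
struct decode_ctx {
   FILE *fp;
   shader_disasm_cb disasm; /* may be NULL if the user provides none */
   void *user_data;
};

/* Library side: no link-time dependency on any compiler. */
static void
decode_shader(const struct decode_ctx *ctx, const void *code, size_t size)
{
   assert(ctx != NULL && ctx->fp != NULL);

   if (ctx->disasm)
      ctx->disasm(ctx->fp, code, size, ctx->user_data);
   else
      fprintf(ctx->fp, "<%zu bytes of shader code, no disassembler>\n", size);
}

/* User side: a trivial stand-in "disassembler" that counts invocations. */
static void
count_calls_disasm(FILE *fp, const void *code, size_t size, void *user_data)
{
   (void)code;
   fprintf(fp, "shader: %zu bytes\n", size);
   ++*(unsigned *)user_data;
}
```

A caller would fill in a struct decode_ctx with its own callback (here count_calls_disasm) and pass it to decode_shader(); a caller without a disassembler leaves the pointer NULL and still links.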
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
If tests are enabled with the same name as the original crate, two
entries with identical names are placed in rust-project.json, and
rust-analyzer does not like that. Rename the tests to "kraid_test" to
fix it.
Also, meson rust tests are weird: they call rustc with the --test flag
directly, so rust-project.json does not see any test cfg option.
To get proper code analysis in #[cfg(test)] code, we need to specify that
option directly in meson. This means rustc will see --test and
--cfg test at the same time, but it doesn't seem to mind.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
Rust bindgen creates include dependencies that are relative to the
project root. That works perfectly if the build root is inside the
project root, but breaks when it's a separate directory.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>
The gen_opcodes custom target generates gen_opcodes.h,
gen_opcodes_private.h, and gen_opcodes.cpp, but idep_gen_opcodes_h only
declared gen_opcodes.h.
Declare gen_opcodes_private.h as well so that generated-header
dependencies are exported correctly to downstream hermetic build systems.
Fixes ninja-to-soong build failures due to missing gen_opcodes_private.h.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42248>
Felix:
- Fix typo in the end debug marker for update
Thanks to Kevron, who tested a couple of workloads on BMG:
- Hitman +50.3%
- F122 +26.8%
- SOTR +18%
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39617>
This commit adds a new debug option, INTEL_DEBUG=bvh_pcrel_map, to dump
the parent-child relationship map.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39617>
Track where each leaf_id is encoded in the final BVH.
It's a map of leaf_id -> final_bvh_offset. This will help us navigate
the BVH layout in the update pass.
The leaf block offset map gives us: leaf id -> BVH block,
and the parent-child map can be used for: BVH block -> parent offset.
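The two lookups can be chained as in the following C sketch. The flat-array representation, the use of block indices rather than byte offsets, and all names are assumptions made for illustration; they are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BVH_INVALID_BLOCK UINT32_MAX

/* Hypothetical host-side view of the two maps. */
struct bvh_maps {
   const uint32_t *leaf_block;   /* leaf id -> block index of the encoded leaf */
   const uint32_t *parent_block; /* block index -> block index of its parent */
   size_t leaf_count;
   size_t block_count;
};

/* Resolve leaf id -> BVH block -> parent block, with bounds checks. */
static uint32_t
bvh_parent_of_leaf(const struct bvh_maps *maps, uint32_t leaf_id)
{
   assert(maps != NULL);

   if (leaf_id >= maps->leaf_count)
      return BVH_INVALID_BLOCK;

   uint32_t block = maps->leaf_block[leaf_id];
   if (block >= maps->block_count)
      return BVH_INVALID_BLOCK;

   return maps->parent_block[block];
}
```

An update pass built along these lines could start from a modified leaf and repeat the second lookup to walk towards the root.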
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39617>
This map stores the parent BVH offset for each child. This will
help us walk the BVH layout later in the update pass.
Since we are tracking block indices, even a BVH as large as 2^32 bytes
needs at most 2^26 indices (each block is 64B wide). That leaves us 6 bits
in which we can track child slot index occupancy in the parent.
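As a rough illustration of the arithmetic: 2^32 bytes divided into 64B (2^6) blocks gives 2^26 block indices, so a 32-bit entry has 32 - 26 = 6 bits to spare. The C sketch below packs one such entry. The field order (index in the low bits, spare bits on top), the treatment of the 6 bits as an opaque value, and all names are assumptions, not the layout used by the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits [25:0] parent block index, bits [31:26] slot bits. */
#define PCREL_BLOCK_SIZE 64u
#define PCREL_INDEX_BITS 26u
#define PCREL_INDEX_MASK ((1u << PCREL_INDEX_BITS) - 1u)
#define PCREL_SLOT_MASK  0x3fu

/* Pack a block-aligned parent byte offset and 6 slot bits into 32 bits. */
static inline uint32_t
pcrel_pack(uint32_t parent_byte_offset, uint32_t slot_bits)
{
   assert(parent_byte_offset % PCREL_BLOCK_SIZE == 0);
   assert(slot_bits <= PCREL_SLOT_MASK);
   return (parent_byte_offset / PCREL_BLOCK_SIZE) |
          (slot_bits << PCREL_INDEX_BITS);
}

/* Recover the parent byte offset from a packed entry. */
static inline uint32_t
pcrel_parent_offset(uint32_t entry)
{
   return (entry & PCREL_INDEX_MASK) * PCREL_BLOCK_SIZE;
}

/* Recover the 6 slot bits from a packed entry. */
static inline uint32_t
pcrel_slot_bits(uint32_t entry)
{
   return entry >> PCREL_INDEX_BITS;
}
```

The largest block-aligned offset, 0xFFFFFFC0, maps to index 2^26 - 1, so the index and the 6 extra bits never overlap.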
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39617>
Extract leaf encoding into encode.h and move some of the helpers into
anv_build_helper.h.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39617>