Some apps exhibit bind patterns that can easily be implemented with
fewer vm_bind ops than we currently issue.
For now let's only optimize the case where a vm_bind op is contiguous
with the previous one, extending it on the right, in both the VA and
BO (if applicable) ranges. With this optimization alone we already
get a decent reduction in some CTS sparse tests.
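
The contiguity check described above can be sketched roughly as follows.
This is an illustrative Python model only, not the driver code: the
`BindOp` fields and the `can_extend()` and `coalesce()` helpers are
hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BindOp:
    # Hypothetical model of a bind op; field names are assumptions.
    va: int                    # start of the virtual address range
    size: int                  # size of the range in bytes
    bo: Optional[int] = None   # BO handle, or None when no BO is involved
    bo_offset: int = 0         # offset into the BO (ignored when bo is None)


def can_extend(prev: BindOp, op: BindOp) -> bool:
    """True if `op` starts exactly where `prev` ends, in VA and in BO."""
    if op.va != prev.va + prev.size:
        return False
    if op.bo != prev.bo:
        return False
    if op.bo is None:
        return True  # no BO range to compare
    return op.bo_offset == prev.bo_offset + prev.size


def coalesce(ops: List[BindOp]) -> List[BindOp]:
    """Merge each op into the previous one when it extends it on the right."""
    merged: List[BindOp] = []
    for op in ops:
        if merged and can_extend(merged[-1], op):
            merged[-1].size += op.size
        else:
            merged.append(BindOp(op.va, op.size, op.bo, op.bo_offset))
    return merged
```

Only the immediately preceding op is considered, which mirrors the
limited scope stated above; ops that are contiguous but arrive out of
order are left untouched in this sketch.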
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
There are additional conditions that must be met before
DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED can be used. These
conditions are verified by the handler of this modifier, but not by
panvk_image_can_use_mod. Let's call the modifier's handler so it can
make the final decision on whether this modifier can be used.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
It might be that the radv_pipeline_cache_lookup_nir_handle() in
radv_ray_tracing_pipeline_cache_search() fails but we will later need the
NIR. If rt_stages[i].shader was non-NULL, then we would not have created
the NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.2
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38263>
The non-dynamic members of xfb_info are already included in
sizeof(hk_passthrough_gs_key), so adding nir_xfb_info_size counts them
twice. Because of this we were including uninitialized memory in the key
in hk_handle_passthrough_gs, which is undefined behavior.
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39574>
This fixes a bunch of CTS tests hitting issues when attempting
anv_image_mcs_op with compute.
Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39581>
When proceeding with rendering, any transient attachment that will be used
as an LRZ buffer should also be allocated. With GMEM rendering, these
attachments otherwise remained unloaded and subsequent LRZ clears produced
GPU faults.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Fixes: #14604
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39535>
The samples per tile calculation was incorrect for sample counts 4 and 8.
Fix:
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.depth.samples_4.*
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.stencil.samples_4.*
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39580>
Add a new vfunction to support shader capture/replay, needed for RT
pipeline capture/replay.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33022>
Geometry shaders load from separate handles for each vertex, so they
don't incorporate the vertex index in the URB offset like tessellation
shaders do. This means we can have a constant offset (within a vertex's
section) but not have a constant vertex index.
Prior to 41d7debcfe we were emitting non-folded ALU so we thought the
offset was non-constant at this point. Now we can properly detect
constant offsets...but still don't want to use push inputs for
non-constant vertex indices.
Fixes: 41d7debcfe ("brw: Use nir_imul_imm in per-vertex/per-primitive offset calculation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39603>
This now also removes dead variables created by split_array_vars,
and in the future it is reasonable to expect other optimizations inside
the optimization loop to make temp variables dead.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39596>
Previously the matching logic was designed to match names
like this
```
99993681767ac...32132a.anv.mda.tar/CS/NIR8/046-ssa
```
So up until the first slash of a pattern, a prefix match would be used,
followed by fuzzy matching for the remaining pattern. This doesn't
work well when there are subdirectories in the name, so when we see
```
before/99993681767ac...32132a.anv.mda.tar/CS/NIR8/046-ssa
before/91132154353bd...090919.anv.mda.tar/CS/NIR8/046-ssa
after/91132154353bd...090919.anv.mda.tar/CS/NIR8/046-ssa
```
the first entry can't be matched by `before/9999/first` since the fuzzy
match will kick in for the 9999 and if the second entry has four 9s
(which it does here) there would be multiple choices.
In practice the flexibility of fuzzy matching is not really needed
since we've been using consistent small prefixes (like CS, NIR8, BRW,
etc). The exception is the last part (the object versions, i.e.
"pass names"), where it is sometimes convenient to match by a substring.
The new matching logic is to use prefix match by default, except when
matching the "object version", where substring match is used. In the
example above, a possible set of patterns to identify each entry is
`b/99/ssa`, `b/91/ssa` and `a/91/ssa`.
The patch adds a few tests for `is_match()` to clarify the behavior.
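
As a rough illustration of the rule above (prefix match for every
component, substring match for the object version), here is a minimal
sketch. It is not the `is_match()` from this patch: the function name
`is_match_sketch` is hypothetical, and it assumes that name components
not mentioned in the pattern (such as `CS` and `NIR8`) may be skipped.

```python
def is_match_sketch(pattern: str, name: str) -> bool:
    """Prefix-match leading components, substring-match the last one."""
    *prefix_pats, version_pat = pattern.split("/")
    *containers, version = name.split("/")

    # The object version ("pass name") is matched by substring.
    if version_pat not in version:
        return False

    # Every other pattern component must be a prefix of some name
    # component, in order; unmentioned components are skipped (assumption).
    i = 0
    for pat in prefix_pats:
        while i < len(containers) and not containers[i].startswith(pat):
            i += 1
        if i == len(containers):
            return False
        i += 1
    return True


names = [
    "before/99993681767ac...32132a.anv.mda.tar/CS/NIR8/046-ssa",
    "before/91132154353bd...090919.anv.mda.tar/CS/NIR8/046-ssa",
    "after/91132154353bd...090919.anv.mda.tar/CS/NIR8/046-ssa",
]
for pattern in ("b/99/ssa", "b/91/ssa", "a/91/ssa"):
    print(pattern, [n for n in names if is_match_sketch(pattern, n)])
```

Under these assumptions each of the three patterns selects exactly one
entry, because `99` is only tried as a prefix and therefore can no
longer match digits in the middle of the second entry's name.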
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39506>
This macro will stop the loop early if there's no chance to make further
progress.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39504>
Add a pass tracker struct that can live for the whole lifetime
of the brw_compile() functions. It keeps track of the debug_archiver
and also stores some metadata that allows us to name the passes.
With that, we can also embed the loop tracking in the same struct,
so that any loop is free to use the "early break" optimization.
There are other brw_nir_* passes that are called in the pre-processing
phase. These are not included in the mda yet. They will be
handled when we hook debug_archiver or similar to the runtime/driver.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39504>
The properly terminated regex automatically detects this case now.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>