It might be that the radv_pipeline_cache_lookup_nir_handle() in
radv_ray_tracing_pipeline_cache_search() fails but we will later need the
NIR. If rt_stages[i].shader was non-NULL, then we would not have created
the NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.2
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38263>
The non-dynamic members of xfb_info are already included in
sizeof(hk_passthrough_gs_key), so adding nir_xfb_info_size counts them
twice. Because of this we were including uninitialized memory in the key
in hk_handle_passthrough_gs, which is undefined behavior.
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39574>
This fixes bunch of cts tests hitting issues when attempting
anv_image_mcs_op with compute.
Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39581>
When proceeding with rendering, any transient attachment that will be used
as LRZ buffer should also be allocated. With GMEM rendering, these
attachments otherwise remained unloaded and subsequent LRZ clears produced
GPU faults.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Fixes: #14604
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39535>
The samples per tile calculation was incorrect for sample count 4 and 8.
Fix:
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.depth.samples_4.*
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.stencil.samples_4.*
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39580>
Add a new vfunction to support shader capture/replay, needed for RT
pipeline capture/replay.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33022>
Geometry shaders load from separate handles for each vertex, so they
don't incorporate the vertex index in the URB offset like tessellation
shaders do. This means we can have a constant offset (within a vertex's
section) but not have a constant vertex index.
Prior to 41d7debcfe we were emitting non-folded ALU so we thought the
offset was non-constant at this point. Now we can properly detect
constant offsets...but still don't want to use push inputs for
non-constant vertex indices.
Fixes: 41d7debcfe ("brw: Use nir_imul_imm in per-vertex/per-primitive offset calculation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39603>
This now also removes dead variables created by split_array_vars,
and in the future it is reasonable other optimizations inside the
optimization loop to make temp variables dead.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39596>
Previously the matching logic was designed to match names
like this
```
99993681767ac...32132a.anv.mda.tar/CS/NIR8/046-ssa
```
So up until the first slash of a pattern, a prefix match would be used,
followed by fuzzy matching for the remaining pattern. This don't
work well when there are subdirectories in the name, so when we see
```
before/99993681767ac...32132a.anv.mda.tar/CS/NIR8/046-ssa
before/91132154353bd...090919.anv.mda.tar/CS/NIR8/046-ssa
after/91132154353bd...090919.anv.mda.tar/CS/NIR8/046-ssa
```
the first entry can't be matched by `before/9999/first` since the fuzzy
match will kick in for the 9999 and if the second entry has four 9s
(which it does here) there would be multiple choices.
In practice the flexibility of fuzzy matching is not really needed
since we've been using consistent small prefixes (like CS, NIR8, BRW,
etc). The exception is the last part (the object versions, i.e.
"pass names"), where sometimes is convenient to reach by a substring.
The new matching logic is to use prefix match by default, except when
matching the "object version", where substring match is used. In the
example a possible set of the patterns to identify each entry can be
`b/99/ssa`, `b/91/ssa` and `a/91/ssa`.
The patch adds a few tests to the `is_match()` to clarify the behavior.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39506>
This macro will stop the loop early if there's no chance to make further
progress.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39504>
Add a pass tracker struct that can live the whole lifetime
of brw_compile() functions, it will keep track of the debug_archiver
and also store some metadata that allow us to name the passes.
With that, we can also embed the loop tracking in the same struct,
so that is free for any loop to use the "early break" optimization.
There are other brw_nir_* passes that are called in the pre-processing
phase. These are not currently included in the mda yet. Will be
handled when we hook debug_archiver or similar to the runtime/driver.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39504>
The properly terminated regex automatically detects this case now.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
Compares versions of two objects one by one. Useful to compare two
shader compilations and find the first pass that changed.
This could already be done by using something like
`diff <(mda log ...) <(mda log ...)` but it is useful enough to become
a builtin.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39420>
This is a no-op from a codegen PoV since both SrcSwizzle::Xx and
SrcSwizzle::None will result in .high not being set. However, it allows
other parts of the compiler to more easily reason about the fact that it
only reads the bottom 16 bits.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39572>
Instead of depending on a global "high" bit that affects both source and
destination, this models f2f.32.16 as an F16v2 op which ignores one of
the two components. This makes encoding the op a tiny bit more complex
(though that's easy enough to shove in a helper) in exchange for letting
copy-prop propagate OpPrmt and swizzles into it.
Shader-db stats:
Totals:
CodeSize: 24304240 -> 24298928 (-0.02%)
Static cycle count: 274812403 -> 274809320 (-0.00%)
Totals from 39 (0.57% of 6891) affected shaders:
CodeSize: 266672 -> 261360 (-1.99%)
Static cycle count: 138321 -> 135238 (-2.23%)
PERCENTAGE DELTAS Shaders CodeSize Static cycle count
google-meet-clvk/BgBlur 49 -0.49% -0.44%
google-meet-clvk/Relight 81 -0.55% -0.18%
q2rtx/q2rtx-rt-pipeline 42 -0.31% -0.10%
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39572>