In SIMD32, the fence instruction is currently going to read grf0-3
leading to such assertions in the backend :
../src/intel/compiler/brw_fs_reg_allocate.cpp:206:
void fs_visitor::calculate_payload_ranges(bool, unsigned int, int*) const:
Assertion `j < payload_node_count' failed.
The reason we haven't seen the problem yet is that there always enough
payload register to accomodate this. But the following change is going
to make the inline parameter register optional.
Since SHADER_OPCODE_MEMORY_FENCE is emitted in the generator as SIMD1
NoMask (see brw_memory_fence), we can limit ourselves to SIMD1
exec_all() in the IR as well so that the IR accounts for grf0 as a
source.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>
Useful to insert debug traces a bit later in the lowering process (in
particular after load/store vectorization).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>
These values are programmed by the kernel and not determined by the
hardware, but provide a default value that should match what drm/msm
programs for older kernels that can't report it. kgsl has always
supported returning the highest_bank_bit, although it hardcodes some of
the other parameters so we have to follow what it does instead of using
this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26578>
When creating and caching the bo_export object for a given zink_bo, the
screen file descriptor was used. Since no buffer object's file descriptor
would match that, bo_export objects were continuously added to the exports
list and no bo_export was effectively picked from the cache. Using the
buffer object's file descriptor avoids that.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: b0fe621459 ("zink: add back kms handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31715>
"Backport" of the llvmpip fix.
Nearest sampling was being done using coordinates
on texel boundaries, which caused aliasing bugs.
Shift coordinates by half a texel to correct this.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
Based on the original code from sp_tex_sample.c,
this was supposed to be a comparison with pmax2,
not pmin2.
This mostly seemed to result in anisotropic filtering
turning on to "maximum" at any value of max_aniso > 1.
Most apparent when runing the texfilt test in demos.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
Nearest sampling was being done using coordinates
on texel boundaries, which caused aliasing bugs.
Shift coordinates by half a texel to correct this.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
We need to pass through the robust_modes flag to nir_opt_vectorize based
on a flag set when compiling the shader, not globally in the compiler,
for VK_EXT_pipeline_robustness. Refactor the ir3 compiler interface
to add an ir3_shader_nir_options struct that can be passed around to
the appropriate places, and wire it up in turnip to the shader key. The
shader key replaces the old mechanism of hashing in the compiler
options.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31687>
The geometry BO should be released in csf_cleanup_context().
Fixes: 447075eeee ("panfrost: Add support for the CSF job frontend")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31705>
Always enable the level covering the whole FB, and disable the finest
levels if we don't have enough to cover everything.
This is suboptimal for small primitives, since it might force primitives
to be walked multiple times even if they don't cover the the tile being
processed. On the other hand, it's hard to guess the draw pattern, so
it's probably good enough for now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31682>
When flushing the render target cache for future operations, we need a
stall at pixel scoreboard. We likely didn't see any issue until now
because a change in render target added the pb-stall.
When using a 2 compute shaders with the following pattern :
vkCmdDispatch()
vkCmdPipelineBarrier() ImageBarrier with (src|dst)AccessMask=0 & identical layout
vkCmdDispatch()
we should ensure that the first dispatch is completed before executing
the second one, otherwise they can race to on resource accesses. This
fixes failures in some new CTS tests.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31676>
This assumes that the best case jump latency is higher than the worst case
ALU latency.
Foz-DB Navi31:
Totals from 17720 (22.32% of 79395) affected shaders:
Instrs: 26009663 -> 25929989 (-0.31%); split: -0.31%, +0.00%
CodeSize: 136571496 -> 136254420 (-0.23%); split: -0.23%, +0.00%
Latency: 215731308 -> 215722059 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 36534197 -> 36532070 (-0.01%); split: -0.01%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
there was some confusion over exactly where ici->usage should be set,
since the value must be set when doing all the format checks but then
also it was re-set again later to a potentially different value based
on an unchecked return
now get_image_usage() is set_image_usage() with a more consistent policy
around exactly where the usage is set
this code still sucks tho
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31686>
Unigine Heaven has a TCS that reads pos.xyz and tescoord.w from all
invocations in every invocation. By putting those two in the same vec4,
AMD hw can reduce the amount of shared memory that is allocated for those
inputs from 2 vec4s to 1 vec4.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
Skip the code for other shader stages because it doesn't do anything there.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
Now we have a few extra methods and things are diverging a bit between
Linux and Windows. Add a few unit tests to make sure this works.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
Move the memstream code into common code. Other Rust code interfacing
with FILE pointers will find the memstream abstraction useful.
Most notably, pinning is actually enforced this time with PhantomPinned.
Add a .clang-format from a sibling dir (i.e.: compiler/nir) while we're
at it to keep things tidy.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
Flush support is needed by the Rust code, which will switch from
its own memstream to u_memstream in a future patch.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>