When we start generating indirect draws, we'll generate values ourself
and point the HW vertex buffer entries to right location from the
device.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
Iris does not use Gfx11+ SGVS extended parameters, so we have to rely
on the old Gfx9 method of providing the parameters through vertex
buffers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
We're about to introduce a hard dependency on OpenCL functions in Iris
& Anv to generate commands. Intel-clc has been modified to generate
serialized NIR.
A number of builders are doing cross builds, so we can't use the
intel-clc built in that cross build. Other builds like ASAN/MSAN also
complain when running the built version of intel-clc because of
uninitialized values in the packaged LLVM libraries from the
x86_64-base image.
To solve those problems we build a host version of intel-clc and use
that binary in the cross build to generate the serialized NIR.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
This will be used to generate a serialized NIR of functions for
internal shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
When doing query result copies in 3D mode, we're flushing the render
target cache, but the shader writes go through the dataport.
Fixes flakes/fails in piglit with shader query copies forced with Zink :
$ query_copy_with_shader_threshold=0 ./bin/arb_query_buffer_object-coherency -auto -fbo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b3b12c2c27 ("anv: enable CmdCopyQueryPoolResults to use shader for copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
Without this, on some "buggy" qemu cpu setup, LLVM could crash
if LLVM detects the wrong CPU type.
Fixes: f92cadccc6 ("llvmpipe: Always using util_get_cpu_caps to get cpu caps for llvm on x86")
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27539>
These regs are written by blob, for some of them blob could
write non-zero values. So executing Turnip after blob without
writing these regs could lead to nasty GPU crashes.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
A750 added a new optimization - placement of vertex attributes
into GMEM, so part of GMEM is carved out for it and needs to
be considered during GMEM allocations.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
A7XX introduces some changes into the CCU such as having different
amounts of memory per CCU for depth and color and dividing up CCU
control into two registers A7XX_RB_CCU_CNTL and A7XX_RB_CCU_CNTL2
where CNTL2 no longer requires a complete flush to be updated, we
currently don't take advantage of this as any CCU updates set both
registers but it's a potential optimization we can add in the future.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
Inline consts suffer the same issue as driver params, so they also
should be preloaded via preamble. There is special instruction to
load from global memory into consts.
Co-Authored-By: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
Add a separate pass which uses the analyze_ubo_ranges machinery to
construct ranges of readonly globals accessed in the shader and push
them to constants in the preamble, using ldg.k if possible. This is
enough to handle inline uniforms in turnip but also provides a base for
OpenCL, although the pass would need further work for that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
a750 has SS6_DIRECT path broken, we should either use UBO lowering
or SS6_INDIRECT path.
It is implemented as INDIRECT load even on a750+ because with UBO
lowering it would be tricky to get const offset for to use in multidraw,
also we would need to ensure the offset is not 0.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
A750 expects driver params loaded through the preamble, old path
does work but has issues when the same LOAD_STATE is used between
several draw calls (it seems that LOAD_STATE is executed only for
the first draw call).
To solve this we now lower driver params to UBOs and let NIR deal with
them.
Notes:
- VS params are loaded via old path since blob do the same and there
are no issues observed.
- FDM is not supported at the moment.
- For now driver params data is emitted via CP_NOP because it's tricky
to allocate space for the data. (It is emitted when we are already in
sub_cs)
Co-Authored-By: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
3d blits used DIRECT consts upload path, which doesn't work
properly on a750+, however uploading them via SS6_INDIRECT
seem to be working.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
SP_FS_VGPR_CONFIG was found to be correlated with blob using avgs/uvgs.
Other SP_*_VGPR_CONFIG where undefined per-stage regs and it was tested
via rddecompiler that they "fix" hangs in respective shader stage,
when such stage uses the following instructions pattern:
avgs.s.1.tex.0
(ss) avgs.e;
uvgs.s.tex.0;
uvgs.e
The exact meaning of SP_*_VGPR_CONFIG is to be investigated.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
We have to push the lowering of texture operations a bit further in
pipeline since nir_lower_tex gets invoked twice and if there is no LOD
source present, nir_lower_tex adds that as a source. Once that's all
done we can easily combine the LOD and array index into a single 32-bit
value.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
In previous patches, we have moved the Intel specific lowering code in
brw_nir_lower_texture file. We can go ahead and drop the Intel specific
texture source too.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
Since this lowering is totally Intel specific, we don't have to
introduce the new texture source. We can use the nir_tex_src_backend1
source to pack LOD/LOD Bias and array index into 32 bit single value.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>