We already show the address range, which is most of why I'd think you'd be
looking at hex values. I find a more human-readable number nice for
debugging, instead of counting zeroes to decide if it's 1.5MB or 96kb.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20540>
Some binding table entries have been identify as unused in the shaders
by the push constant analysis pass. We can just put the null entry in
there.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b49b18f0b7 ("anv: reduce BT emissions & surface state writes with push descriptors")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20555>
We can't just check the load_ubo range is contained in the push entry,
we also need to check that the push entry set/binding matches the
load_ubo set/binding.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ff91c5ca42 ("anv: add analysis for push descriptor uses and store it in shader cache")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20555>
We'll use those to fill the push constant addresses, so we can't have
them turned to null.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ff91c5ca42 ("anv: add analysis for push descriptor uses and store it in shader cache")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20555>
This 2 PIPELINE_CONTROL flushes are not necessary for TGL and newer
and also it have different requirements of flush, so here doing
this two changes at the same time.
As no ANV_PIPE_INVALIDATE_BITS is set as parameter of
anv_add_pending_pipe_bits(),
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer) will only emit one
PIPELINE_CONTROL.
BSpec: 44505
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20501>
For INTEL_MEASURE, ensure all prior instructions completed before
timestamp taken. Continue to support no CS flush case for Perfetto.
CS stall was dropped from pipecontrol when adding u_trace support.
Fixes: cc5843a573 ("anv: implement u_trace support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20502>
After using task shader, we need to emit a zero URB state and a
nullprim (empty pipe control) before rendering with primitives.
After this, a normal URB state needs to be returned, this will
happen when pipeline batch is emitted during pipeline switch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20334>
To avoid the replication of code to properly emit PIPELINE_SELECT.
init_compute_queue_state() had a different emit of PIPELINE_SELECT but
as there is no compute engine in GFX VER 11 we are safe with the
differences.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20444>
The loop going from 0 to max_draw_count multiplies the value which
could potentially overflow.
Fixes Coverity CID 1517852
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3596a8ea7a ("anv: factor out some indirect draw count entry points")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20436>
The issue we're addressing here is that we have 2 batches and the both
grow at different rate. We want to keep doubling the main batch size
as the application writes more and more commands to limit the number
of GEM BOs. But we don't want to have the generation batch size to be
linked to the main batch.
v2: remove gfx7 code
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15642>
anv_bo has information that will be needed by a future patch in
anv_gem_mmap(), so here already preparing for that.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20423>
Add the PSS stall bit to ANV_PIPE_STALL_BITS so that it get's cleared on
flush.
Fixes: f3c62973 ("anv,iris: PSS Stall Sync around color fast clears")
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20317>
Pre-patch, anv_descriptor_pool used a free list for host allocations
that never merged adjacent free blocks. If the pool only allocated
fixed-sized blocks, then this would not be a problem. But the pool
allocations are variable-sized, and this caused over half of the pool's
memory to be consumed by unusable free blocks in some workloads, causing
unnecessary memory footprint.
Replacing the free list with util_vma_heap, which does merge adjacent
free blocks, fixes the memory explosion in the target workload.
Disdavantges of util_vma_heap compared to the free list:
- The heap calls malloc() when a new hole is created.
- The heap calls free() when a hole disappears or is merged with an
adjacent hole.
- The Vulkan spec expects descriptor set creation/destruction to be
thread-local lockless in the common case. For workloads that
create/destroy with high frequency, malloc/free may cause overhead.
Profiling is needed.
Tested with a ChromeOS internal TensorFlow benchmark, provided by
package 'tensorflow', running with its OpenCL backend on clvk.
cmdline: benchmark_model --graph=mn2.tflite --use_gpu=true --min_secs=60
gpu: adl
memory footprint from start of benchmark:
before: init=132.691MB max=227.684MB
after: init=134.988MB max=134.988MB
Reported-by: Romaric Jodin <rjodin@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20289>
This reverts commit b1126abb38.
This breaks all hell at least on DG2, as there are several cases left
where current_pipeline gets checked against GPGPU to decide what to do,
and the value doesn't match that of ANV_HW_PIPELINE_STATE_COMPUTE.
On top of that, it also misses checking for
ANV_HW_PIPELINE_STATE_RAYTRACING.
Then there's the fact that in some cases, current_pipeline will be
UINT32_MAX, because it's the original undefined state and also used
after executing a secondary command buffer because we are not tracking
on which pipeline did the secondary left us.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7910
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20349>
This replaces brw_fs_get_dispatch_enables(), which was added in
b9403b1c47 ("intel: factor out dispatch PS enabling logic"), but this
function will not work well for future changes to 3DSTATE_PS.
So, instead, this moves the related code into a "genX" file which can
directly update 3DSTATE_PS for the given platform.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20329>
This is another field that, after the recent commits, became unused.
It's either zero-initialized (by the memset) or copy-initialized
(which means it's also zero). And it never even gets used anywhere
anyway, so even if the value was non-zero it wouldn't matter.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20309>
As a consequence of the last two commits, reloc_bos is always NULL and
never used anywhere, so remove it.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20309>
The last commit made it clear that anv_reloc_list_grow() only ever
gets called with zero as num_additional_relocs, which means it will
always immediately return VK_SUCCESS without doing anything. That
means we can remove it.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20309>
There are only a few places in the code where num_relocs gets set:
- During anv_reloc_list_init() where it gets memset() to 0.
- At anv_reloc_list_init_clone() where it gets set with the value of
another anv_reloc_list->num_relocs.
- During anv_reloc_list_clear(), where it gets set to 0.
- During anv_reloc_list_append(), where it gets added with the value
of another anv_reloc_list->num_relocs.
As you can see, either we explicitly set the value to 0 or we copy the
value that's present in another anv_reloc_list, which should be 0. The
one place where we used to increment num_relocs was in
anv_reloc_list_add(), but that was deleted by:
7b7381e8d7 ("anv: Delete anv_reloc_list_add()")
So in this commit we delete the num_relocs field from struct
anv_reloc_list and we also delete some lines where, if the value is 0,
nothing will happen.
There's more we could be deleting here, but I wanted this commit to be
minimal so it's very clear that num_relocs can't be non-zero. We were
having some speculation that anv_reloc_list may still be important for
actually adding BOs to the batch and building the validation list, so
let's go slowly with the removal to make everything more easily
reviewable.
The one possibility I could be missing here is another situation like
the memset() we have at anv_reloc_list_init() or some other crazy
indirect overwrite, but as far as I have checked, that is not the
case.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20309>
Now that we removed relocations, this is not being used anywhere.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20309>
We'll want 2 batches :
* the main one
* another to contain dispatch commands to generate stuff in the
main batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>
And allow the function to get the very first address in the batch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>