Most of the time, we can infer the type to append in
util_dynarray_append using __typeof__, which is standardized in C23 and
supported in Jesse's MSVC. This patch drops the type argument most of
the time, making util_dynarray a little more ergonomic to use.
This is done in four steps.
First, rename util_dynarray_append -> util_dynarray_append_typed
bash -c "find . -type f -exec sed -i -e 's/util_dynarray_append(/util_dynarray_append_typed(/g' \{} \;"
Then, add a new append that infers the type. This is much more ergonomic
for what you want most of the time.
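As a rough illustration of the idea (not Mesa's actual definitions), the inferred append can be layered on top of the typed one by passing __typeof__ of the value as the type. Every toy_* name below is hypothetical and only stands in for util_dynarray; the sketch assumes a compiler that provides __typeof__ (GCC or Clang) and aborts on allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable byte buffer; a stand-in for util_dynarray, not Mesa's code. */
struct toy_dynarray {
   void *data;
   size_t size;     /* bytes in use */
   size_t capacity; /* bytes allocated */
};

/* Reserve room for `bytes` more bytes and return a pointer to them,
 * or NULL if the allocation fails. */
static void *
toy_dynarray_grow(struct toy_dynarray *arr, size_t bytes)
{
   if (arr->size + bytes > arr->capacity) {
      size_t cap = arr->capacity ? arr->capacity * 2 : 64;
      while (cap < arr->size + bytes)
         cap *= 2;
      void *data = realloc(arr->data, cap);
      if (!data)
         return NULL;
      arr->data = data;
      arr->capacity = cap;
   }
   void *slot = (char *)arr->data + arr->size;
   arr->size += bytes;
   return slot;
}

/* Explicitly typed append: the caller names the element type. */
#define toy_dynarray_append_typed(arr, type, value)              \
   do {                                                          \
      type tmp_ = (value);                                       \
      void *slot_ = toy_dynarray_grow((arr), sizeof(type));      \
      assert(slot_ != NULL);                                     \
      memcpy(slot_, &tmp_, sizeof(type));                        \
   } while (0)

/* Inferred append: the element type comes from the expression itself. */
#define toy_dynarray_append(arr, value)                          \
   toy_dynarray_append_typed(arr, __typeof__(value), value)
```

With this shape, appending a variable such as a double copies sizeof(double) bytes without the caller spelling out the type, while the typed variant remains available for values whose expression type differs from the element type.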
Next, use the type-inferred append as much as possible, via a Coccinelle
patch (plus manual fixup):
@@
expression dynarray, element;
type type;
@@
-util_dynarray_append_typed(dynarray, type, element);
+util_dynarray_append(dynarray, element);
Finally, fix up by hand the cases that Coccinelle missed or incorrectly
translated, of which there were several because we can't use the
untyped append with a literal (since the sizeof won't do what you want).
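The literal problem can be seen in isolation with a small sketch. The inferred_append_size macro and the two helper functions are hypothetical names used only for illustration; the sketch assumes C11 and a compiler with __typeof__.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes an inferred append would copy for a given expression. */
#define inferred_append_size(value) sizeof(__typeof__(value))

/* A bare integer literal has type int, whatever the array is meant to hold. */
static_assert(inferred_append_size(1) == sizeof(int),
              "the literal 1 has type int");

/* A variable carries its declared type, so inference matches the array. */
static size_t
size_from_variable(void)
{
   uint64_t handle = 1;
   return inferred_append_size(handle);
}

/* Appending the literal directly would copy only sizeof(int) bytes. */
static size_t
size_from_literal(void)
{
   return inferred_append_size(1);
}
```

So an array of uint64_t elements would receive an int-sized element from an inferred append of a literal, which is why such call sites keep the _typed form.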
All four steps are squashed to produce a single patch changing every
util_dynarray_append call site in tree to either drop a type parameter
(if possible) or insert a _typed suffix (if we can't infer). As such,
the final patch is best reviewed by hand even though it was
tool-assisted.
No Long Linguine Meals were involved in the making of this patch.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38038>
The previous implementation was horribly slow with a larger number
of descriptor sets.
The new approach uses util_vma_heap (like ANV) which is a perfect fit.
This fixes stuttering in Indiana Jones because that game seems to use
a huge number of descriptor sets which can also be freed.
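For readers unfamiliar with this kind of allocator, the general idea of a free-range heap can be sketched as below. This is not util_vma_heap's implementation or API: the toy_* names, the fixed-size hole array, the first-fit policy and the lack of alignment handling are all simplifying assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_HOLES 64

/* One free range [offset, offset + size) inside the pool. */
struct toy_hole {
   uint64_t offset;
   uint64_t size;
};

/* Free ranges kept sorted by offset. */
struct toy_heap {
   struct toy_hole holes[TOY_MAX_HOLES];
   unsigned num_holes;
};

static void
toy_heap_init(struct toy_heap *heap, uint64_t start, uint64_t size)
{
   heap->holes[0] = (struct toy_hole){ .offset = start, .size = size };
   heap->num_holes = 1;
}

/* First-fit allocation. Returns UINT64_MAX when no hole is large enough. */
static uint64_t
toy_heap_alloc(struct toy_heap *heap, uint64_t size)
{
   assert(size > 0);
   for (unsigned i = 0; i < heap->num_holes; i++) {
      struct toy_hole *hole = &heap->holes[i];
      if (hole->size < size)
         continue;
      uint64_t offset = hole->offset;
      hole->offset += size;
      hole->size -= size;
      if (hole->size == 0) {
         /* Drop the exhausted hole and close the gap in the array. */
         memmove(hole, hole + 1, (heap->num_holes - i - 1) * sizeof(*hole));
         heap->num_holes--;
      }
      return offset;
   }
   return UINT64_MAX;
}

/* Return a range to the heap, merging it with adjacent holes. */
static void
toy_heap_free(struct toy_heap *heap, uint64_t offset, uint64_t size)
{
   struct toy_hole *h = heap->holes;
   unsigned i = 0;

   /* Find the first hole that starts at or after the freed range. */
   while (i < heap->num_holes && h[i].offset < offset)
      i++;

   if (i > 0 && h[i - 1].offset + h[i - 1].size == offset) {
      /* Extend the previous hole, then absorb the next one if they touch. */
      h[i - 1].size += size;
      if (i < heap->num_holes &&
          h[i - 1].offset + h[i - 1].size == h[i].offset) {
         h[i - 1].size += h[i].size;
         memmove(&h[i], &h[i + 1], (heap->num_holes - i - 1) * sizeof(*h));
         heap->num_holes--;
      }
   } else if (i < heap->num_holes && offset + size == h[i].offset) {
      /* Extend the next hole downwards. */
      h[i].offset = offset;
      h[i].size += size;
   } else {
      /* No neighbour touches the range: insert a new hole at index i. */
      assert(heap->num_holes < TOY_MAX_HOLES);
      memmove(&h[i + 1], &h[i], (heap->num_holes - i) * sizeof(*h));
      h[i] = (struct toy_hole){ .offset = offset, .size = size };
      heap->num_holes++;
   }
}
```

Because freed ranges are merged back into neighbouring holes, a pool that sees many descriptor sets allocated and freed does not fragment permanently in this model.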
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13901
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37976>
When NIR_DEBUG=serialize or NIR_DEBUG=clone is used, NIR_PASS recreates
nir_function_impl and nir_variable objects, causing use-after-free since
insert_rt_case() keeps pointers to those in local variables and var_remap.
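The hazard can be modelled outside of NIR with a toy example. The toy_* types and functions are hypothetical and do not mirror the NIR API or the exact change made here; they only show one common way to avoid this class of bug, which is to look the object up again after a pass that may have recreated it.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_impl {
   int num_blocks;
};

struct toy_shader {
   struct toy_impl *impl;
};

/* Stand-in for a pass run under a clone/serialize debug mode: it replaces
 * the impl object with a fresh one and frees the old one. */
static void
toy_pass_recreating_impl(struct toy_shader *shader)
{
   struct toy_impl *fresh = malloc(sizeof(*fresh));
   assert(fresh != NULL);
   fresh->num_blocks = shader->impl->num_blocks + 1;
   free(shader->impl);
   shader->impl = fresh;
}

/* Safe pattern: re-fetch the impl after the pass instead of reusing a
 * pointer cached before it. Returns how many blocks the pass added. */
static int
toy_count_blocks_added(struct toy_shader *shader)
{
   struct toy_impl *impl = shader->impl; /* valid only until the pass runs */
   int before = impl->num_blocks;

   toy_pass_recreating_impl(shader);

   impl = shader->impl; /* the previously cached pointer now dangles */
   return impl->num_blocks - before;
}
```

Reading impl->num_blocks through the pointer cached before the pass would be the use-after-free described above.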
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>
This doesn't work with NIR_DEBUG=serialize or NIR_DEBUG=clone.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>
When NIR_DEBUG=serialize or NIR_DEBUG=clone is used, NIR_PASS recreates
nir_function_impl and nir_variable objects, causing use-after-free since
radv_nir_lower_rt_abi() keeps pointers to those in local variables.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>
The framebuffer dimension exposed to apps is still 16k but since the
driver allows 32k images on GFX12+, meta operations might perform
operations (like a copy) using graphics.
While we are at it, use the correct bitfield for setting BR_X/BR_Y on
GFX12.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37974>
The normal encode pass writes batches to a section in build scratch
memory. Those batches contain information about the internal node and
the primitive nodes. The encoder is split to avoid the register
pressure of the compressor and maximize occupancy.
The compressor works in two passes because one pass cannot guarantee
that every primitive node (except) has at least two triangles. This
guarantee is used to advertise a smaller acceleration structure size to
the application.
During compression, every invocation processes at most two triangles.
Groups of 8 invocations are used to support the maximum triangle count
of 16 that the hardware supports.
The first step of compression is loading the triangle(s). Shared
vertices are deduplicated early to avoid doing it in the compression
loop. The compression loop tries to add triangles to a list of triangles
until the computed node size needed for storing the triangles reaches
the hardware node size. For this, each invocation first deduplicates
vertices with the triangles that have already been picked. It then
computes the node size of the picked triangles plus the candidate
triangles of the current invocation. The candidate triangles of the
invocation that computed the smallest size are added to the list.
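A sequential model of that greedy loop might look like the sketch below. It is not the shader code: the real cost is the encoded node size, whereas this sketch uses the number of distinct vertices as a stand-in, one candidate triangle per step, and a made-up budget (TOY_VERTEX_BUDGET). All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TRIS 16     /* maximum triangles per batch, from the text */
#define TOY_VERTEX_BUDGET 8 /* hypothetical stand-in for the node size */

struct toy_tri {
   unsigned v[3]; /* vertex IDs, already deduplicated across the batch */
};

/* Distinct vertex count if `tri` were added to the picked vertices. */
static unsigned
toy_cost_with(const unsigned *verts, unsigned num_verts,
              const struct toy_tri *tri)
{
   unsigned cost = num_verts;
   for (unsigned i = 0; i < 3; i++) {
      bool seen = false;
      for (unsigned j = 0; j < num_verts && !seen; j++)
         seen = verts[j] == tri->v[i];
      for (unsigned j = 0; j < i && !seen; j++)
         seen = tri->v[j] == tri->v[i];
      cost += !seen;
   }
   return cost;
}

/* Greedily fill one node. Marks the triangles it picked and returns how
 * many were picked; unpicked triangles are left for the next node. */
static unsigned
toy_fill_node(const struct toy_tri *tris, unsigned num_tris, bool *picked)
{
   unsigned verts[TOY_VERTEX_BUDGET];
   unsigned num_verts = 0, count = 0;

   assert(num_tris <= TOY_MAX_TRIS);

   for (;;) {
      unsigned best = num_tris, best_cost = TOY_VERTEX_BUDGET + 1;

      /* In the shader, each candidate is evaluated by its own invocation. */
      for (unsigned i = 0; i < num_tris; i++) {
         if (picked[i])
            continue;
         unsigned cost = toy_cost_with(verts, num_verts, &tris[i]);
         if (cost < best_cost) {
            best = i;
            best_cost = cost;
         }
      }

      if (best == num_tris)
         break; /* nothing left, or nothing fits within the budget */

      /* Commit the winner: record the vertices it introduces. */
      for (unsigned i = 0; i < 3; i++) {
         bool seen = false;
         for (unsigned j = 0; j < num_verts && !seen; j++)
            seen = verts[j] == tris[best].v[i];
         if (!seen)
            verts[num_verts++] = tris[best].v[i];
      }
      picked[best] = true;
      count++;
   }
   return count;
}
```

Calling toy_fill_node repeatedly until every triangle is picked yields one node per call, which corresponds to the multiple hardware nodes a batch can produce.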
Because it may not be possible to fit every triangle into the same node,
there can be multiple hardware nodes which are written in parallel for
optimal performance. If there are no nodes with only one triangle, all
nodes are written. If there is one, compression of the batch is aborted
and the index of the batch is written to build scratch memory. The second
compression pass will repeat the steps above but only for those aborted
batches. The nodes with only one triangle can now be merged, and are.
It cannot be determined during box node encode which triangles will be
compressed together, so the encoder also has to fix up the parent box
node's child infos.
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>