We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f`
sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f`
sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f`
There are two kinds of "parent" in relation to a src/def:
- the instruction where the def or src's def is defined
- the instruction which the src is a part of and where the def is used
Clarify that the parent here is where the src's def is used, not where
it's defined.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>
We have 4 image intrinsic variants now. This enum is useful for
nir_rewrite_image_intrinsic() and it will be used by other NIR passes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40709>
The new common code gives us 32-bit values, handle this.
This fixes the various crashes on CTS since the common code changes of
last week.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 22a061fb91 ("nir: Use better calculation for alpha-to-coverage mask")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40691>
Since 31af989270 ("nir/lower_continue_constructs: Simplify loops before
lowering continue constructs"), we never ingest loops with continues. That lets
us delete a bunch of now dead code (and outdated comments) around control flow.
This patch is part of the treewide effort to improve loops in NIR. I already
sent the Intel patch earlier this week and this weekend hit delete here too.
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40609
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenz.ca>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40690>
This gets us an in-tree user for the new code and a net code deletion for the
MR across the tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40074>
To demonstrate that this helper is legitimately useful cross-tree and should
be made common code :-)
Sample AGX IR with this patch:
10 = mov_imm #0x3f5680f1 /* 0.837905 */
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
For large constants inlined into phis, this would overread the remap[] array,
which could crash. No CTS tests affected though.
Christoph found the bug and fixed it for Bifrost over in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39305. I just did a
quick CTS run of the obvious AGX backport over this morning's breakfast.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reported-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39313>
Unifies nir per instruction float control.
In the future this can be split into contract/reassoc/transform
like SPIR-V.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (except SPIR-V)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
Mix of Coccinelle patch, manual fix ups, sed, etc. Probably best to review the diff
as-if hand written:
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38955>
This is done by grep ALIGN( to align(
docs,*.xml,blake3 is excluded
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38365>
We add a bunch of new helpers to avoid the need to touch >parent_instr,
including the full set of:
* nir_def_is_*
* nir_def_as_*_or_null
* nir_def_as_* [assumes the right instr type]
* nir_src_is_*
* nir_src_as_*
* nir_scalar_is_*
* nir_scalar_as_*
Plus nir_def_instr() where there's no more suitable helper.
Also an existing helper is renamed to unify all the names, while we're
churning the tree:
* nir_src_as_alu_instr -> nir_src_as_alu
..and then we port the tree to use the helpers as much as possible, using
nir_def_instr() where that does not work.
Acked-by: Marek Olšák <maraeo@gmail.com>
---
To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm
taking this opportunity to clean up a lot of NIR patterns.
Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
fixes O(N^2) memory usage and runtime around liveness/scheduling/spilling/RA,
and proves out the design for the common code sparse bitsets (I did need to make
an adjustment for this - worth the effort).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37908>
Most of the time, we can infer the type to append in
util_dynarray_append using __typeof__, which is standardized in C23 and
support in Jesse's MSMSVCV. This patch drops the type argument most of
the time, making util_dynarray a little more ergonomic to use.
This is done in four steps.
First, rename util_dynarray_append -> util_dynarray_append_typed
bash -c "find . -type f -exec sed -i -e 's/util_dynarray_append(/util_dynarray_append_typed(/g' \{} \;"
Then, add a new append that infers the type. This is much more ergonomic
for what you want most of the time.
Next, use type-inferred append as much as possible, via Coccinelle
patch (plus manual fixup):
@@
expression dynarray, element;
type type;
@@
-util_dynarray_append_typed(dynarray, type, element);
+util_dynarray_append(dynarray, element);
Finally, hand fixup cases that Coccinelle missed or incorrectly
translated, of which there were several because we can't used the
untyped append with a literal (since the sizeof won't do what you want).
All four steps are squashed to produce a single patch changing every
util_dynarray_append call site in tree to either drop a type parameter
(if possible) or insert a _typed suffix (if we can't infer). As such,
the final patch is best reviewed by hand even though it was
tool-assisted.
No Long Linguine Meals were involved in the making of this patch.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38038>
Introduce new NIR intrinsics to handle getting a "sink" read-only
address and another intrinsic to handle conversion of address to
read-write (allowing implementation to replace the "sink" read-only with
another address like required for Asahi)
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37914>
We run agx_preprocess_nir as the last step of each new compute shaders
in agx_nir_lower_gs but we could move this out of the pass and makes it
the driver responsability to call it.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37914>
This was something that came up in the slop MR. Not sure it's actually a
good idea or not but kind of curious what people think, given we have a
sound tool (Coccinelle) to do the transform. Saves a redundant branch
but means extra noninlined function calls.. likely no actual perf impact
but saves some code.
Via Coccinelle patches:
@@
expression ptr;
@@
-if (ptr) {
-free(ptr);
-}
+free(ptr);
@@
expression ptr;
@@
-if (ptr) {
-FREE(ptr);
-}
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr) {
-ralloc_free(ptr);
-}
+ralloc_free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-free(ptr);
-}
-
+free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-FREE(ptr);
-}
-
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-ralloc_free(ptr);
-}
-
+ralloc_free(ptr);
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3d]
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> [venus]
Reviewed-by: Frank Binns <frank.binns@imgtec.com> [powervr]
Reviewed-by: Janne Grunau <j@jannau.net> [asahi]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [radv]
Reviewed-by: Job Noorman <jnoorman@igalia.com> [ir3]
Acked-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37892>
While Intel will use the full feature set of util_lut3 in the future, AGX can
use the cut down 2-source versions right now to get us an in-tree user.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37200>
This was added with the goal to eventually replace the per
pass subgroup/ballot size options, but that won't work because
some backends don't have a fixed subgroup size across the compilation
process.
It was also mostly added to hack around mesa state tracker behavior,
and we have a better solution there now.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37164>
fixes a bunch of OpenCL CTS including test_basic vload_private due to failing
to relower the derefs but also lol.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36631>
It's not only for GL, change to a generic name.
Use command:
find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
no other users now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
When I went to use opt_reassociate for tu, I was advised that you want to
do this loop to get the best results. If everyone needs it, let's make it
common code and explain what's going on.
In the process, also make it skip work appropriately when there's no
progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>