It's a lot more explicit to just have an intrinsic for this than to
treat blend shaders as their own weird stage. Also, the new intrinsic
uses the same io_semantics as a fragment store so the back-end code is a
little easier to read because it now checks sem.dual_source_blend_index
instead of the generic load_input offset.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
The one non-trivial change here is that we're now using BLEND with a
constant descriptor instead of ST_TILE for MSAA blend shaders. However,
this shouldn't make any practical difference.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
This helper is shorter and it also caches the result in the collect
cache so it can be used as a vector (or, in this case, a 64-bit value).
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
ANGLE is a massive pain to debug because it threads like mad. This at
least ensures the shaders aren't weirdly interleaved from multiple
threads.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
We're not using the mesa log functions for any of our back-end compiler
stuff so we should make NIR log the same way.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
v2: invert logic (Francesco Ansanelli)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39136>
We all know who wrote a bunch of Panfrost code. No need to repeat this a million
places, the copyright line is plenty.
in cases where there's a joint me & Italo/Eric/.. tag, i've left it alone to
respect others' potential wishes.
$ find . -type f -exec perl -i -p0e 's/ \*\s+\* Author[^\n]+\s+\*\s+Alyssa[^\n]+\n \*\// \*\//' \{} \;
v2: delete more tags (Boris).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39136>
Move the update of pan_shader_info into its own function. For now this
does nothing, but it will allow us to update the info before printing
stats, which will be useful in the future.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38961>
This isn't very useful yet (the info isn't fully filled
out at the point of the call) so the printed values should
be viewed skeptically.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38961>
We need a subset of the pan_shader_info struct for each variant,
and previously we were building a struct containing that subset and
passing the struct to bi_compile_variant_nir. Instead we should pass
a pointer to the whole struct and let bi_compile_variant_nir build the
substructure. This is both more efficient, and also gives the stats
code access to the full information.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38961>
Add the actual registers used (including uniforms used) by the shader
to stats. This is calculated by the stats gathering code, because the
scheduler and scoreboard passes run after register allocation and can
sometimes change the results.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38961>
s/util_dynarray_clear/util_dynarray_fini/ to fix the leak.
Fixes: 7dc4f28507 ("pan/bi: schedule simple iterators to avoid extra move")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38923>
We were running this in the preprocess step and then trusting that it
would clean up everything before we got to the back-end. However, we
were running the entire optimization loop in between as well as drivers
potentially adding stuff (since panvk has it's own passes after
postprocess). Instead, this should be one of the last things run, right
before we go into the back-end.
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
The comments pretty explicitly say that they assume lowered UBOs and
SSBOs so preprocess is the wrong place for them.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
Other resource variables like UBOs and SSBOs are still useful for
debugging and not harming anything. Also, some Vulkan-focused passes
look up variables from the otherwise detached resource intrinsics and
use them for things. We really want to keep them around.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
bit_size <= 32 does not actually guarantee a single component, which
nir_src_as_uint() requires. We could just check num_components == 1 but
it's easy enough to support any vector that fits in 32 bits.
Cc: mesa-stable
Reviewed-by: Romaric Jodin <rjodin@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38772>
It existed entirey to save us a switch statement. It's very unlikely
that's worth pulling GENX API command stream stuff into the compiler.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38788>
We multiply by 16 correctly but then drop that in the case where vbase
is non-zero. We typically lower FS input indirects so we don't see this
often but there are a few cases where they still sneak through.
Fixes: 0fcddd4d2c ("pan/bi: Rework varying linking on Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38788>
There is nothing IR about it. It's really the compiler interface file,
so it should all go in pan_compiler.h.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38753>
While we're at it, rename pan_lower_* to pan_nir_lower_* for consistency
with everything else. The intention is that eventually this will be a
private header and drivers will stop calling these passes themselves.
However, that is a long way off so for now we'll just move them to the
sensibly named place. Notably, pan_preprocess/optimize/postprocess_nir
stay in pan_compiler.h because those are intended to be external APIs
indefinitely.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38753>
This is going to be where we put the compiler interface. For now,
disassembly wrappers are as good a place to start as any.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38753>
The only thing that only pulls in bifrost is our CL compiler for
pre-builts. Everything that uses pan_shader.h pulls it in because
that's header-only and calls into both back-ends anyway. Some day
we might want to figure out how to properly dead-code the midgard
compiler for Vulkan but today is not that day.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38753>
src/panfrost/util is all compiler stuff, we just put it in util/ so it
could be used by both midgard and bifrost. We also rename all the files
to have a pan_nir prefix. We'll rename the functions later.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38753>
It was this way for a while but then someone decided that we should try
to hide midgard and get it off the main path. However, we still try to
present a unified interface to the GL driver. The combination of these
two things means that those common compiler interfaces have to live
outside of panfrost/compiler which makes everything more awkward than
it needs to be. We either need to move it back into src/gallium and
make abstracting across the two the GL driver's problem (which breaks
other tooling) or need to put both under src/panfrost/compiler. The
midgard compiler can just as easily bitrot in panfrost/compiler/midgard.
The first step is moving the bifrost compiler.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38753>
`comps <= (1 << subword_shift)` cannot be guarantee.
Here is an example:
```
8x2 %27 = @load_ssbo (%26 (0x1000001), %4) (access=readonly|reorderable, align_mul=2, align_offset=0, offset_shift=0)
8x2 %32 = ior %25, %31
32 %34 = ult32 %33 (0x7), %12
8x2 %35 = b32csel %34.xx, %27, %32
```
When processing `%34.xx` in `bi_emit_alu` (for `instr->src[0]`),
`comps` is computed from the instr definition (`%35`), but
`subword_shift` from the src bitsize.
In that case comps is greater than `1 << subword_shift`, but this is
supported by `bi_alu_src_index`.
This example is extracted from `dEQP-VK.spirv_assembly.type.vec2.i8.bit_field_insert_offset16_count16_comp`
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36638>
When v4i8 instruction are using to compute a v2i8, it puts the 2
result values in b0 & b2, thus we need to swizzle the destination to
have them in b0 & b1 as expected by the consumer of the v2i8 produced.
example: dEQP-VK.spirv_assembly.type.vec2.i8.mul_frag
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36638>
Vectorizing it prevents optimisation related to the store
instruction. This is having negative impact from a shader-db
point-of-view.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36638>
All drivers use the same callback and it is unlikely that new drivers will use
this pass since it has better replacements today (lower_mem_bit_sizes for
memory, and it never worked for I/O). This should discourage as much.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38533>