Input attachment lowering really has nothing to do with descriptor
lowering. It just needs to be called first so we still have access to
the variables.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
We're about to move this to another file which won't have anything to do
with descriptor sets. This makes the one potentially functional change
easily bisectable and then the next commit is just moving code.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
This is needed for intershader optimization and GS lowering.
We now pass all NIR variants with pan_compile_inputs to
panvk_compile_shader and handle sysvals/push consts lowering in there.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
We were running this in the preprocess step and then trusting that it
would clean up everything before we got to the back-end. However, we
were running the entire optimization loop in between as well as drivers
potentially adding stuff (since panvk has it's own passes after
postprocess). Instead, this should be one of the last things run, right
before we go into the back-end.
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
The comments pretty explicitly say that they assume lowered UBOs and
SSBOs so preprocess is the wrong place for them.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
Other resource variables like UBOs and SSBOs are still useful for
debugging and not harming anything. Also, some Vulkan-focused passes
look up variables from the otherwise detached resource intrinsics and
use them for things. We really want to keep them around.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
cat2_may_use_scalar_alu() was incorrect because the instruction could
use an indirectly-accessed const where a0.x (i.e. the offset) is
non-uniform. Fortunately, we already know whether this is the case,
because the original instruction would then write a non-shared GPR.
Also, the restrictions for scalar ALU are the same regardless of whether
we write up0.x or a shared GPR, and vice versa the restrictions for
normal cat2 are the same regardless of whether we write p0.x or a
non-shared GPR, so it should always be safe to write p0.x if non-shared
and up0.x if shared. So, just do that.
Fixes: 2a8c5ebc77 ("ir3: enable scalar predicates")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38895>
This is about writes to shared regs, not GPR (as early-preamble can only
use shared regs). It's a pretty hypothetical case, but might as well
get it correct.
Fixes: 189e494249 ("ir3: Add (sy) before end of preamble when necessary")
Reported-by: Job Noorman <jnoorman@igalia.com>
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38868>
This calculation needs to happen in the same loop as the
geometry/triangle id calculations in case the selected invocation is
before all invocations that were already selected.
Totals from 1269 (15.10% of 8406) affected BVHs:
compacted_size: 137581888 -> 137606464 (+0.02%); split: -0.08%, +0.10%
sah: 6496048424 -> 6496048450 (+0.00%); split: -0.00%, +0.00%
primitive_node_count: 604384 -> 604656 (+0.05%); split: -0.14%, +0.19%
Fixes: c18a7d0 ("radv: Emit compressed primitive nodes on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38462>
We already have a function called pvr_render_targets_init() in
pvr_device.c. Having two symbols with the same name, even though they
both are static only leads to confusion.
So let's rename one of them to clear things up.
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38888>
The only HW-definition we depend on here is the occlusion query minimum
alignment. But this value doesn't differ per arch, so let's just make it
a global define for now.
This should probably have a better name and location. But AFAICT, this
applies to a lot in this header. So let's untangle this down the road
instead.
Acked-by: Ashish Chauhan <Ashish.Chauhan@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38832>
This just pre-emptively fixes up some includes that will bite us in the
future if not fixed.
Acked-by: Ashish Chauhan <Ashish.Chauhan@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38832>
These are only implemented for Rogue so far, which is fine because
that's all we currently support. Once we add support for more
generations, these helpers need updating.
Acked-by: Ashish Chauhan <Ashish.Chauhan@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38832>
This takes us from one "supported"-bit to per-bind flags for each of
vertex-buffer, sampler-view, render-target, depth-stencil and
storage-image bindings.
While we chould have stored extra bits for storage storage/uniform
texel-buffers as well, that seems a bit excessive to me.
Acked-by: Ashish Chauhan <Ashish.Chauhan@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38832>
The PBE details here aren't relevant for all GPUs, so let's split them
out to a separate table here.
Acked-by: Ashish Chauhan <Ashish.Chauhan@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38832>
The VkFormat is the key into the format-table, so we always know from a
higher level up what VkFormat we're talking about. So let's instead pass
VkFormat around, and look up pvr_format from the format-table later on
instead.
This leads to a few more lookups, but those are relatively cheap, and
doesn't happen in critical code-paths.
Acked-by: Ashish Chauhan <Ashish.Chauhan@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38832>
On the assumption that nobody will use a sample id greater than the sample
count, have loop unrolling guess based on the driver's max sample count.
This unrolls a simple resolve shader with a uniform max samples on ir3 to:
value = vec4(0);
if (max_samples > 0) {
value += txf_ms(coord, 0);
if (max_samples > 1 {
value += txf_ms(coord, 1);
if (max_samples > 2){
value += txf_ms(coord, 2);
if (max_samples > 3) {
value += txf_ms(coord, 3);
for (i = 4; i < max_samples; i++)
value += txf_ms(coord, i);
}
}
}
}
...
This is only worth a 1% win on our microbenchmark as-is, but if we could
flatten those ifs out and pull the fadds out to the end, avoiding syncs
per load would be a big win. This seems like a first step.
I've taken a shot at updating drivers to set the value, and tried to leave
notes in places that drivers might update, and want to follow up with
updating the compiler option.
This affects over half the DX11 apps in shader-db-private.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38585>
In preparation for dynamic rendering and for removing renderpasses
move relevant code so it doesn't need to move again.
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38744>
In preparation for dynamic rendering make the code in start of renders
clear more robust and generic so not to depend on render passes when
it doesn't have / need to.
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38744>
The assert here was too strict and dynamic rendering tests do exercise this
part. The correct fix is to either re-enable RTA support or handle properly
multi-layered emulation.
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38744>