Without TC, we'd get draw->count==0.. which is obviously not correct.
With TC we get random garbage for draw->count. Which turns into
exciting things like trying to allocate multi-gigabyte buffers for
tess param/factor buffers.
But we can just tell the CP to split up large tess draws, and put an
upper bound on the tess param/factor buffer sizes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9864>
nir_to_tgsi lowers nir_op_isub to UADD and the negate source mod
on the second operator. Since r600 doesn't support source mods
for integers but support SUB_INT we switch the opcode and clear the
negate flag.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9579>
Since the last round of enhanced layout packing, it seems like this code
is no longer needed. This allows us to use the HANDLE_EMIT_BUILTIN()
macro, which simplifies things further.
It would have been incorrect anyway, because it's the only code that
uses the location directly as an index. All other locations multiplies
it by four first.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9860>
We might only need to emit a read or write cap for a given image. This
could provide the Vulkan driver with the chance to optimize things
slightly.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9855>
We're using it for in-memory cache as well, so it needs to be computed
unconditionally.
Fixes: bf09ba5385 ("lima: implement shader disk cache")
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9838>
OpEntryPoint is only supposed to include the Input and Output storage
classes prior to SPIR-V 1.4.
Since we're always emitting SPIR-V 1.0, we should simply omit these from
OpEntryPoint for now. If we start emitting SPIR-V 1.4 or later, we
should add these back in.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9828>
The liveness code in ppir can be a bottleneck in complicated shaders
with multiple spills, and the use of the mesa set data structure is one
of the main reasons it is expensive to run.
With some changes, it can be adapted to using bitsets which makes it run
substantially faster.
ppir liveness can't run with a regular bitset for registers since we
need to track inviditual component masks for non-ssa registers, but we
can switch to using a separate packed bit array just for the masks,
rather than a full blown hash set. This also makes operations such as
liveness propagation much more straightforward.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9745>
The live_out set was serving mostly as a temporary storage of data
propagated from the next instructions.
If we do an early copy to preserve the instruction live_in set and
propagate the liveness information directly to the live_in set, we can
skip having a live_out set entirely.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9745>
liveness information was stored in blocks so that the algorithm doesn't
have to find a valid instruction to inherit liveness information from.
It turns out this part of the code can be a performance bottleneck in
applications loading with complicated shaders, so skip the liveness
information in blocks to reduce the number of times we have to copy
liveness information.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9745>
Loosely based on ANV implementation.
For executable's internal representation we output:
- Initial NIR after spirv_to_nir
- Final optimized NIR
- IR3 disassembly
Note, that vkGetPipelineExecutablePropertiesKHR is required to
return executable properties even if pipeline was not created with
CAPTURE_STATISTICS or CAPTURE_INTERNAL_REPRESENTATIONS bits set.
So the executables array is unconditionally populated, however
NIR and IR3 disassemlies are filled only when
CAPTURE_INTERNAL_REPRESENTATIONS is set.
Passes dEQP-VK.pipeline.executable_properties.*
Works with RenderDoc.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8877>
We already have code to lower away these to something we don't need so
much special-case code to handle.
The sad part here is that we generate slightly worse code; trees of
OpLogicalAnd / OpLogicalOr instead of OpAny OpAll. But I would imagine
that for any GPU where this matters, these would be easy to combine
back, so I'm not losing a lot of sleep over this.
But this makes things simpler. And if we *really* care about OpAny or
OpAll, we should add NIR ALU instructions for them so we can optimize
them on other places as well, not open-code all places where it could
improve things.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9797>
Wire up disk cache routines and change fs and vs keys to use nir_sha1
instead of pointer to uncompiled shader to be able to reuse them for
disk cache.
Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9672>
Makes calling code more explicit about what is being set, and allows
take advantage of zero initialization for the ones the callsite don't
care.
Besides moving to the struct, two extra "ergonomic" changes were done:
- Add a new shader_time boolean, so shader_time_index is ignored when
unused -- this allow taking advantage of the zero initialization of
unset fields.
- Since we have a struct, provide space for the error_str pointer.
Both iris and i965 were using it, and the extra rstrdup in case of
failure shouldn't be a burden for the others.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>
Makes calling code more explicit about what is being set, and allows
take advantage of zero initialization for the ones the callsite don't
care.
Besides moving to the struct, two extra "ergonomic" changes were done:
- Add a new shader_time boolean, so shader_time_index is ignored when
unused -- this allow taking advantage of the zero initialization of
unset fields.
- Since we have a struct, provide space for the error_str pointer.
Both iris and i965 were using it, and the extra rstrdup in case of
failure shouldn't be a burden for the others.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>