Commit graph

198406 commits

Author SHA1 Message Date
Erik Faye-Lund
0284e7fedb mesa/main: properly check for EXT_memory_object
This extension isn't supported in GLES 1.x, so let's tighten the check.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32349>
2024-11-29 13:48:26 +00:00
Philipp Zabel
dddec9a66d teflon: Support fused ReLU6 activation via output saturation
If the output tensor quantization range does not exceed 6.0, ReLU6 can
be replaced with ReLU: output values larger than 6.0 are clipped by
output saturation.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32388>
2024-11-29 13:32:42 +00:00
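The reasoning in dddec9a66d can be checked numerically. The following is a minimal sketch, assuming affine uint8 quantization of the form real = scale * (q - zero_point); all function names and the example scales are hypothetical and not taken from the teflon sources.

```python
def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Affine-quantize a real value and saturate it to [qmin, qmax]."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def relu(x):
    return max(0.0, x)


def relu6(x):
    return min(6.0, max(0.0, x))


def relu_can_replace_relu6(scale, zero_point, qmax=255):
    """True if the largest representable output value does not exceed 6.0."""
    return scale * (qmax - zero_point) <= 6.0


def outputs_match(scale, zero_point, samples):
    """Compare quantized ReLU and ReLU6 outputs over the given samples."""
    return all(
        quantize(relu(x), scale, zero_point) == quantize(relu6(x), scale, zero_point)
        for x in samples
    )


samples = [-3.0, 0.0, 1.5, 5.9, 6.0, 7.25, 100.0]

# Range tops out at 0.0234375 * 255 = 5.9765625, so saturation does the clipping.
narrow = (0.0234375, 0)
# Range tops out at 0.0625 * 255 = 15.9375, so ReLU6 cannot be dropped here.
wide = (0.0625, 0)
```

With the narrow range, every value above 6.0 already saturates to the maximum quantized value, so plain ReLU gives the same output as ReLU6. With the wide range the two activations diverge for inputs above 6.0.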
Hans-Kristian Arntzen
6370acbead radv: Add sparse mappings to radv_check_va.py.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32146>
2024-11-29 12:57:42 +00:00
Hans-Kristian Arntzen
cb15b34295 radv/winsys: Report VA mappings in bo_log too.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32146>
2024-11-29 12:57:42 +00:00
Philipp Zabel
a9f0624d6b teflon: Reject per-axis quantization
Until a workaround for missing hardware support is implemented, stop
pretending to support convolution operations on tensors with per-axis
quantization.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32387>
2024-11-29 11:20:27 +00:00
Philipp Zabel
0501a3b5c1 etnaviv/ml: Create combined input tensors for addition first
Fix addition where one summand was already used as input to an earlier
operation, for example in the last operation of MobileNet V2 residual
blocks.

Fixes an assertion when trying to run MobileNet V2:

  .../src/gallium/drivers/etnaviv/etnaviv_ml.c:58: etna_ml_create_tensor: Assertion `size == pipe_buffer_size(res)' failed.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31987>
2024-11-29 10:46:33 +00:00
Philipp Zabel
47b4aef5db teflon/tests: Enable int8 tests
Enable signed 8-bit convolution tests.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31979>
2024-11-29 10:53:01 +01:00
Philipp Zabel
563316417a teflon/tests: prep test executor for signed convolutions
Subtract 128 from the input and output tensor zero points, to keep
them in int8_t range (conv2d.tflite is set up for uint8_t).

Set weight tensor zero point to zero, as required by TensorFlow Lite
for int8_t weight tensors.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31979>
2024-11-29 10:52:58 +01:00
Philipp Zabel
4153154423 etnaviv/nn: Add support for signed 8-bit tensors
The hardware only supports unsigned 8-bit tensors, but with the
configurable zero point we can map signed 8-bit integers to unsigned
8-bit integers by adding a constant offset of 128 to all values and to
the zero point setting.

This requires adding 128 to all input tensors and subtracting 128
from all output tensors during inference.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31979>
2024-11-29 10:52:56 +01:00
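The offset mapping described in 4153154423 can be illustrated as follows. This is a minimal sketch, assuming affine quantization of the form real = scale * (q - zero_point); the helper names and sample values are hypothetical and do not reflect the etnaviv driver code.

```python
OFFSET = 128  # constant offset that maps the int8 range onto the uint8 range


def signed_to_unsigned(values, zero_point):
    """Shift int8 values and their zero point into the uint8 range."""
    return [v + OFFSET for v in values], zero_point + OFFSET


def unsigned_to_signed(values):
    """Shift uint8 outputs back into the int8 range."""
    return [v - OFFSET for v in values]


def dequantize(values, scale, zero_point):
    """Recover the real values represented by quantized integers."""
    return [scale * (v - zero_point) for v in values]


signed_values = [-128, -5, 0, 17, 127]
signed_zero_point = -3
scale = 0.5

unsigned_values, unsigned_zero_point = signed_to_unsigned(
    signed_values, signed_zero_point
)
```

Because the same constant is added to both the values and the zero point, the difference q - zero_point is unchanged, so both representations describe the same real numbers.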
Philipp Zabel
f9c34a3eb0 teflon: Add is_signed parameter to ml_subgraph_invoke and ml_subgraph_read_output
There probably is a better way to provide this information to the
gallium driver, but this allows the driver to apply conversions as
needed when writing input tensors and reading back output tensors.

Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31979>
2024-11-29 10:52:48 +01:00
Peyton Lee
1ca2137a84 radeonsi/vpe: optimize software functions
1. Break down the configuration functions
2. Remove unnecessary debug messages and redundant code
3. Add support for color primaries

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32296>
2024-11-29 08:37:47 +00:00
Timothy Arceri
05d2fe2372 glsl: remove glsl/program.h
It is now unused.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32402>
2024-11-29 14:31:30 +11:00
Timothy Arceri
8142797721 glsl: move _mesa_glsl_compile_shader() declaration
The function is in glsl_parser_extras.cpp so move the declaration to
glsl_parser_extras.h

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32402>
2024-11-29 14:30:03 +11:00
Benjamin Cheng
323b59a5b5 radv/video: support event for pre-VCN4 decode queues
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32400>
2024-11-29 10:03:48 +10:00
Benjamin Cheng
1689d88e4a radv/video: support event for pre-VCN4 encode queues
Prior to VCN4, the encode queue is separate from the decode queue. For
encode, the WRITE_MEMORY command can be executed with similar framing as
for VCN4, but notably there is no signature support, so it must be
skipped.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32400>
2024-11-29 10:02:14 +10:00
Benjamin Cheng
152b06acd8 ac/vcn: allow sq signature package to be skipped
This is preparing for radv event support on pre-VCN4 encode queues.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32400>
2024-11-29 10:01:49 +10:00
Boris Brezillon
25c0a11cf7 panvk: Add a flag to force SIMULTANEOUS_USE
Turns out we have a bunch of tests that fail when the descriptor
ring-buffer is involved. Add a flag so we can extend testing coverage
without adding more CTS tests.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
46a0231c9c panvk/csf: Don't disable SIMULTANEOUS_USE when tracing is enabled
Now that we switched to event-based tracing, we can keep the
SIMULTANEOUS_USE flag even when tracing is enabled.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
bd49fa68b0 panvk/csf: Use event-based CS tracing
Use the new event-based tracing system to capture IDVS/COMPUTE/FRAGMENT
jobs and their context.

When tracing is enabled, the descriptor ring buffer is replaced by
a bigger linear buffer such that descriptors are not recycled before
we get a chance to decode the trace.

If the decode buffer is too small and an OOB access is detected, the driver
will suggest that the user allocate a bigger buffer with the
PANVK_{DESC,CS}_TRACEBUF_SIZE env vars.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
bf05842a8d pan/cs: Add an event-based tracing mechanism
Interpreting the command buffer only really works if everything is
static, but panvk started to make extensive use of loops and
conditionals which depend on memory values that get updated by the
command stream itself. This makes it impossible to walk back to the
original state in order to replay the CS actions.

Move away from this approach in favor of an event-based tracing
mechanism recording particular CS commands and their context at
execution time. Of course, that means the auxiliary descriptors
shouldn't be recycled until the traces are decoded, but that's more
tractable. We just need to turn the descriptor ring buffers into
linear buffers with a guard page, and crash on OOB, with a message
suggesting that the user tweak the maximum trace buffer sizes.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
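The trade-off described in bf05842a8d, a ring buffer that recycles entries versus a linear buffer that fails loudly when full, can be sketched as below. This is an illustrative model only: the class and exception names are hypothetical, and the real driver relies on a guard page rather than an explicit bounds check.

```python
from collections import deque


class TraceBufferOverflow(Exception):
    """Raised when the linear buffer is full (stand-in for the guard-page crash)."""


class LinearTraceBuffer:
    """Fixed-capacity buffer that never recycles entries before decoding."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = []

    def record(self, event):
        if len(self.events) >= self.capacity:
            raise TraceBufferOverflow(
                "trace buffer too small; consider increasing its maximum size"
            )
        self.events.append(event)

    def decode(self):
        """Return all recorded events and make the space reusable."""
        decoded = list(self.events)
        self.events.clear()
        return decoded


# A ring buffer silently overwrites the oldest entry once it is full.
ring = deque(maxlen=2)
for event in ["idvs", "compute", "fragment"]:
    ring.append(event)

# The linear buffer keeps every event until decode() and reports overflow.
linear = LinearTraceBuffer(capacity=2)
linear.record("idvs")
linear.record("compute")
try:
    linear.record("fragment")
    overflowed = False
except TraceBufferOverflow:
    overflowed = True
```

The ring buffer ends up holding only the last two events, while the linear buffer preserves the first two and signals that a larger buffer is needed.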
Boris Brezillon
4e5f75d1d7 pan/cs: Add a LOAD_IP pseudo instruction
Will be useful if we want to be able to make the trace events point to
the instruction they are recording.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
8c30c2924f pan/decode: Provide a helper to print messages outside of the decoding path
Just a wrapper around pandecode_log() taking the lock and making sure
the dump stream is opened.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
7d0dc3d30c pan/decode: Add a helper to print CS binaries without interpreting them
In panvk, we want to switch from interpretation-based decoding to
event-tracing based decoding, so we no longer depend on the memory state
to get accurate job information.

Even if we're not interested in interpreting the CS, we still want to
dump CS binaries so developers can know what's passed to the GPU.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
41d3f16a28 pan/decode: Rename pandecode_cs() into pandecode_interpret_cs()
pandecode_cs() does both the decoding and the interpretation.
Rename the function to avoid the confusion.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
1a8ef18aeb pan/decode: s/interpret_ceu/interpret_cs/
Everything else is prefixed cs, not ceu, so let's drop the remaining
ceu occurrences.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Boris Brezillon
3778df8778 pan/decode: Untangle CS disassembling and interpretation
Despite the name, disassemble_ceu_instr() does more than disassembling
the instruction, it also partially interprets it.

Add a print_cs_instr() helper that does just the disassembling/printing
part, and move the remainder of disassemble_ceu_instr() to
interpret_ceu_instr().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
2024-11-28 20:21:52 +00:00
Rob Clark
dfd519ed80 vdrm+tu+fd: Make cross-device optional
Similar to commit 087e9a96d1 ("venus: make cross-device optional"),
make VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE use optional, because qemu does
not support this.

Fixes: 06e57e3231 ("virtio: Add vdrm native-context helper")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32392>
2024-11-28 19:55:11 +00:00
Caio Oliveira
a9acc0bea4 util/ra: Remove unimplemented function declaration
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32395>
2024-11-28 19:19:26 +00:00
Timur Kristóf
2089bf7b57 radv: Use default 0 for undefined builtin PS inputs.
The previous code not only left them undefined, but also
didn't increment the array index, so subsequent PS inputs
would be broken after the undefined one.

Note that this doesn't affect any valid Vulkan apps, but it makes
the code a bit simpler and it makes undefined inputs a little more
forgiving, at no expense for valid PS.

This code actually uncovers a bug in Zink, so I'm also documenting
the failing Zink test case.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:57 +00:00
Timur Kristóf
b0b1a07193 radv: Remove now unused num_prim_interp from shader_info.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:57 +00:00
Timur Kristóf
12b9b461e5 radv: Emit SPI_PS_IN_CONTROL when emitting PS inputs on GFX10.3.
GFX10.3 keeps track of per-vertex and per-primitive PS inputs
separately in NUM_INTERP / NUM_PRIM_INTERP,
which we only really know when emitting the inputs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:57 +00:00
Timur Kristóf
e2b8c4a9ac radv, aco: Consolidate num_interp + num_prim_interp into num_inputs.
num_inputs contains the total number of FS inputs.

Note that this also fixes a bug where some calculations in RADV
and ACO were missing the per-primitive attributes from the LDS
usage of PS.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:57 +00:00
Timur Kristóf
e5a9ae912b radv: Slightly simplify potentially per-primitive FS inputs.
Add export_prim_id_per_primitive for mesh shaders.
This prepares to also configure some of these to be per-primitive
in the future, even in the traditional pipeline.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:56 +00:00
Timur Kristóf
930243bf36 radv: Reorder potentially per-primitive FS builtins.
There are some FS built-ins that can be per-vertex or
per-primitive depending on whether a mesh shader is used:
primitive ID (implicit in VS), layer and viewport.

However, the HW requires per-primitive FS inputs to be ordered last.
This causes bugs when the same unlinked FS is used together
with VS/TES/GS and MS (with unlinked ESO or fast-linked GPL).

To solve this problem, we reorder the FS inputs so that these
potentially per-primitive inputs go after per-vertex inputs but
before per-primitive inputs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:56 +00:00
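The ordering rule from 930243bf36 can be sketched as a stable three-way sort. This is a minimal illustration, assuming each FS input is described by a name and a per-primitive flag; the class constants, input names and helper functions are hypothetical and not taken from RADV.

```python
# Ordering classes, lowest first.
PER_VERTEX = 0
POTENTIALLY_PER_PRIMITIVE = 1
PER_PRIMITIVE = 2

# Built-ins that are per-vertex with VS/TES/GS but per-primitive with a mesh shader.
POTENTIALLY_PER_PRIMITIVE_BUILTINS = {"primitive_id", "layer", "viewport"}


def classify(name, per_primitive):
    """Return the ordering class of one FS input."""
    if name in POTENTIALLY_PER_PRIMITIVE_BUILTINS:
        return POTENTIALLY_PER_PRIMITIVE
    return PER_PRIMITIVE if per_primitive else PER_VERTEX


def order_fs_inputs(inputs):
    """Stable sort: per-vertex, then potentially per-primitive, then per-primitive."""
    return sorted(inputs, key=lambda item: classify(*item))


inputs = [
    ("layer", False),
    ("color", False),
    ("prim_attr", True),
    ("primitive_id", False),
    ("uv", False),
]
ordered_names = [name for name, _ in order_fs_inputs(inputs)]
```

Placing the ambiguous built-ins at the boundary presumably means they sit at the end of the per-vertex group when used with VS/TES/GS and at the start of the per-primitive group when used with a mesh shader, so one unlinked FS layout can serve both cases.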
Alyssa Rosenzweig
f4a3ba5302 asahi,vtn: precompile kernels
switch libagx to the precompilation pipeline. see the big comment in the
previous commit for why we're doing this.

while doing so, we move some dispatch stuff. there was so much churn from
precompile that this avoids doing the churn twice. that new header will be used
for DGC down the road.

there's also a small vtn/bindgen patch in here to skip bindgen'ing entrypoints,
as that conflicts with the new dispatch macros. this is the sane behaviour, we
just need to do the full precomp switch across the tree at once.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32339>
2024-11-28 17:34:12 +00:00
Alyssa Rosenzweig
e3001352ad nir: add helpers for precompiled shaders
v2: generalize function signatures.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> [v1]
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> [v1]
Acked-by: Mary Guillemard <mary.guillemard@collabora.com> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32339>
2024-11-28 17:34:12 +00:00
Rhys Perry
4c3809e7fc aco: use small_vec in RegCounterMap
This seems to be a little faster.

insert_NOPs (navi31):
Difference at 95.0% confidence
	-11.484 +/- 6.13377
	-1.62767% +/- 0.860593%
	(Student's t, pooled s = 5.71913)

insert_NOPs (gfx1200):
Difference at 95.0% confidence
	-35.6745 +/- 4.97972
	-8.1236% +/- 1.10453%
	(Student's t, pooled s = 4.6431)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32374>
2024-11-28 17:07:34 +00:00
Rhys Perry
7a500c8b22 aco: make small_vec copyable
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32374>
2024-11-28 17:07:34 +00:00
Marek Olšák
c26da94b4c nir/opt_varyings: replace options::lower_varying_from_uniform with a cost number
This is a simple way for drivers to enable uniform expression propagation
without having to set any callbacks for it. It replaces the old option.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>
2024-11-28 15:39:46 +00:00
Marek Olšák
428613b690 nir/opt_varyings: add a default callback for varying_estimate_instr_cost
used when the driver doesn't set it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>
2024-11-28 15:39:46 +00:00
Marek Olšák
1f238f0a2e nir/opt_varyings: always call remove_dead_varyings in init_linkage
so that we don't have to do it after every init_linkage call.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>
2024-11-28 15:39:46 +00:00
Scott Moreau
7d1a32fafd dri: Fix hardware cursor for cards without modifier support
After the breaking commit, gbm_bo_create_with_modifiers({LINEAR}) returns
a BO with gbm_bo_get_modifier() = INVALID. This restores the functionality
and fixes, most notably, hardware cursors for cards without modifiers.

Fixes #12039.

Fixes: 361f362258 ("dri: Unify createImage and createImageWithModifiers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31725>
2024-11-28 14:52:42 +00:00
Marek Olšák
c1442030ec vc4: lower clip planes in st/mesa
This fixes:
    spec@glsl-1.20@execution@clipping@vs-clip-vertex-enables
with the latest nir_lower_clip changes.

The driver breaks when POS is stored before CLIP_DIST.
That's the only change caused by previous commits according to
VC4_DEBUG=nir.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
2024-11-28 14:14:47 +00:00
Marek Olšák
c50c9e9bf9 nir/lower_clip: implement ClipVertex lowering for GS + lowered IO correctly
This is currently needed to fix d3d12 for st_unlower_io_to_vars.

The idea is to track the current value of ClipVertex in a temporary
variable, and for every emit_vertex, we load the ClipVertex value from
the temporary (which matches the stored value) and insert new CLIP_DIST
stores before emit_vertex.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
2024-11-28 14:14:47 +00:00
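The idea in c50c9e9bf9 can be modelled on a toy instruction list. This is a simplified sketch, not NIR: instructions are (opcode, argument) tuples, values are constants evaluated while the pass runs rather than at shader execution time, and every name is hypothetical. It assumes the usual definition of a clip distance as the dot product of ClipVertex with a user clip plane.

```python
def dot4(a, b):
    """Dot product of two 4-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def lower_clip_vertex_gs(instrs, planes):
    """Insert clip-distance stores before every emit_vertex.

    The latest ClipVertex value is tracked in a temporary that mirrors the
    stored value; each emit_vertex reads that temporary.
    """
    lowered = []
    clip_vertex_tmp = (0.0, 0.0, 0.0, 0.0)  # assumed initial value for this sketch
    for opcode, arg in instrs:
        if opcode == "store_clip_vertex":
            clip_vertex_tmp = arg  # keep the temporary in sync with the store
        elif opcode == "emit_vertex":
            for index, plane in enumerate(planes):
                distance = dot4(clip_vertex_tmp, plane)
                lowered.append(("store_clip_dist", (index, distance)))
        lowered.append((opcode, arg))
    return lowered


planes = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 1.0)]
instrs = [
    ("store_clip_vertex", (1.0, 2.0, 3.0, 1.0)),
    ("emit_vertex", 0),
    ("store_clip_vertex", (-1.0, 0.5, 0.0, 1.0)),
    ("emit_vertex", 0),
]
lowered = lower_clip_vertex_gs(instrs, planes)
clip_dists = [arg for opcode, arg in lowered if opcode == "store_clip_dist"]
```

Each emitted vertex gets its own set of clip-distance stores computed from the ClipVertex value that was current at that point, which is why the temporary is needed in a geometry shader that emits several vertices.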
Marek Olšák
a648acc287 nir/lower_clip: convert nir_lower_clip_gs to nir_shader_intrinsics_pass
and add struct lower_clip_state to hold the state for both
nir_lower_clip_gs and nir_lower_clip_vs.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
2024-11-28 14:14:47 +00:00
Marek Olšák
3b8e4a71fe nir/lower_clip: set clip_distance_array_size outside of create_clipdist_vars
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
2024-11-28 14:14:47 +00:00
Marek Olšák
b4ef50bca8 nir/lower_clip: separate code for IO variables and intrinsics
The code for IO variables was interleaved with code for IO intrinsics,
which was difficult to follow.

lower_clip_outputs is split and replaced by more accurate names:
lower_clip_vertex_var and lower_clip_vertex_intrin

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
2024-11-28 14:14:47 +00:00
Marek Olšák
3e40c2010e nir/lower_clip: don't set cursor to fix crashes due to removed instructions
The original builder already points at the end of the function impl.
Just use that.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
2024-11-28 14:14:47 +00:00
Job Noorman
1a0b4531d1 ir3: add workaround for predication hardware bug
Predication instructions sometimes need extra nops to work around what
seems to be a hardware bug: prede needs 6 nops and the second
predt/predf of a predt/predf pair needs 4 nops.

The prede workaround is enabled starting from a6xx gen3 and the
predf/predt workaround from a6xx gen4, following the blob.

Fixes rendering corruption in God of War (2018).

Signed-off-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32366>
2024-11-28 13:08:36 +00:00
Job Noorman
c129547d9c ir3/isa: allow rpt6/rpt7
The blob sometimes uses this for nop.

Signed-off-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32366>
2024-11-28 13:08:36 +00:00