Commit graph

176834 commits

Author SHA1 Message Date
Rhys Perry
9fe3af1e2a aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
Forgot about this one.

fossil-db (gfx1100):
Totals from 3920 (2.94% of 133461) affected shaders:
Instrs: 6632088 -> 6636008 (+0.06%)
CodeSize: 34165376 -> 34181056 (+0.05%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 37fbfa655a ("aco: insert s_nop before VGPR deallocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29770>
2024-06-18 20:17:38 +00:00
Erico Nunes
08ecb39789 lima/ci: update piglit ci expectations
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29474>
2024-06-18 19:13:27 +00:00
Maaz Mombasawala
9cadf45ddf svga: Retry DRM_VMW_SYNCCPU ioctl on failure.
The ioctl DRM_VMW_SYNCCPU may sometimes fail with ERESTART or EBUSY, which
in turn bubbles up to the application as a GL_OUT_OF_MEMORY error.
We are seeing this in glamor, while this does not cause any real issues, it
does pollute the system log.
Retrying DRM_VMW_SYNCCPU fixes this issue.

Reviewed-by: Neha Bhende <neha.bhende@broadcom.com>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29755>
2024-06-18 19:01:53 +00:00
Caio Oliveira
f982d2bb79 intel/brw: Fix typo in DPAS emission code
The enums were mixed up.  Code was working because they were being
used only for their numerical values.

Fixes: e666872c75 ("intel/compiler: Initial bits for DPAS instruction")
Acked-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29762>
2024-06-18 18:25:21 +00:00
Georg Lehmann
c3c398d56d aco: make local functions static in files without anonymous namespace
I don't think adding an anonymous namespace in these files is worth it
given the amount of global functions

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Georg Lehmann
046414e061 aco: add more anonymous namespaces
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Connor Abbott
c9c483bf02 ir3: Enable early preamble
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Danylo Piliaiev
d8d192f3f4 ir3: Correctly assemble mova1 with (r) on const
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Danylo Piliaiev
e9c764c825 freedreno/ir3: mova has special meaning for (r) flag
It prevents the hazard when in the following case:

 ldc.1.k.imm c[a1.x], 0, 1
 (ss)mova1 a1.x, 8

The correct way is:
 ldc.1.k.imm c[a1.x], 0, 1
 (ss)mova1 a1.x, (r)8

Without it ldc may use a1.x which is set after ldc.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
0a4afef6ea freedreno/a6xx: Implement early preamble
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
53ba1613ec tu: Implement early preamble
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
3ce04c1111 ir3: Add ir3_info::early_preamble
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
d35c1e5051 freedreno/a6xx: Workaround early preamble HW bug
Port of the previous commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
472ce31e56 tu: Workaround early preamble HW bug
This seems to be reproducable only by running CTS in parallel with
deqp-runner.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
1f1f42e9d4 freedreno,ir3: Add has_early_preamble
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
aa1603bcb0 ir3/legalize: Insert dummy bary.f after preamble
Otherwise it will not get executed with early preamble.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
b38fef99ac ir3: Put VS->TCS barrier after preamble
Putting it beforehand doesn't work with early preamble.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Adam Jackson
10d21d4100 mesa: Enable EXT_shadow_samplers for GLES2
I thought this was just the funny GLES spelling of the extn name, but
there's also some ESSL bits you need to add. Most of which you could
probably yoink from the old Unity glsl-optimizer (which itself yoinked
most of the GLSL compiler from Mesa):

94a9b2959b

Signed-off-by: Adam Jackson <ajax@redhat.com>
Co-authored-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6691>
2024-06-18 14:40:33 +00:00
Samuel Pitoiset
33a849e004 radv: emit indirect sets for indirect compute pipelines with DGC
This used to work by luck because the current DGC prepare shader
is using one descriptor set and it was the currently bound compute
shader... Using two descriptor sets or starting from 1 would just fail.

For indirect compute pipelines, descriptors must be emitted from the
DGC shader because there is no bound compute pipeline at all. This
solution is using indirect descriptor sets because it's much shorter
and easier to implement. This could be improved but nothing uses
indirect compute pipelines and this is like experimental stuff.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29700>
2024-06-18 13:50:16 +00:00
Samuel Pitoiset
b1ba02e707 radv: force using indirect descriptor sets for indirect compute pipelines
Emitting descriptors in DGC is a huge pain but using indirect descriptor
sets is much easier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29700>
2024-06-18 13:50:16 +00:00
Timothy Arceri
ef21df917f glsl: remove do_function_inlining()
This no longer has any users. nir based inlining should be used for any
new code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29519>
2024-06-18 12:34:52 +00:00
Timothy Arceri
f1ef6517e8 glsl: remove Par-linking from the standalone linker
lima was the last user of this feature so lets remove it. This will
allow us to drop more soon to be unused glsl ir code once full nir
linker support lands.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29519>
2024-06-18 12:34:52 +00:00
Timur Kristóf
0bf10ad4ad radv: Use number of TES inputs for TCS-TES linking.
This is to match what ac_nir_lower_tess_io_to_mem also does.
Doesn't address any known bug, but it's theoretically possible
that TCS outputs_written and TES inputs_read mismatch, so let's
be on the safe side here.

Fixes: be49b02f05
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696>
2024-06-18 12:06:22 +00:00
Timur Kristóf
0355364743 ac/nir/tess: Fix per-patch output VRAM mapping.
VARYING_SLOT_PATCH0 is greater than 64 so it is wrong to use it
with BITFIELD64_BIT. Check for VARYING_SLOT_TESS_LEVEL_* properly
when mapping output locations in VRAM.

Fixes: 2cf7f282df
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11253
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696>
2024-06-18 12:06:21 +00:00
Timur Kristóf
0f0ebd8512 ac/nir/tess: Fix per-patch output LDS mapping.
VARYING_SLOT_PATCH0 is greater than 64 so it is wrong to use it
with BITFIELD64_BIT. Check for VARYING_SLOT_TESS_LEVEL_* properly
when mapping output locations in LDS.

Fixes: c61eb54806
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696>
2024-06-18 12:06:21 +00:00
Timur Kristóf
348b8859dc ac/nir/tess: Only write tess factors that the TES reads.
Otherwise we would write to a memory location reserved
for another per-patch output.

Fixes: 2cf7f282df
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11324
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696>
2024-06-18 12:06:21 +00:00
Zan Dobersek
9845e99960 tu: avoid memory polling in occlusion query endings using ZPASS_DONE
On newer hardware where ZPASS_DONE events are used for sample count writes
the memory polling in occlusion query endings can be wholly avoided. A WFI
is still required, but the performance gain is still in the range of 10% on
the trivial occlusionquery demo.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29403>
2024-06-18 11:39:57 +00:00
Zan Dobersek
5653c52151 tu: fix ZPASS_DONE interference between occlusion queries and autotuner
On newer devices where ZPASS_DONE events have sample count writing
abilities the firmware expects these events to come in begin-end pairs,
essentially corresponding to a typical occlusion query usage. Since this
event is also used in the autotuner we have to avoid event pairs to be
emitted in an interleaved fashion.

Additional renderpass state now tracks whether a given renderpass contains
an occlusion query. If so, autotuner will emit miscellaneous ZPASS_DONE
events in order to form its own begin-end pairs before and after the
renderpass commands.

Occlusion query behavior inside a renderpass doesn't change. But when used
outside of a renderpass, possible autotuner usage requires to again emit
ZPASS_DONE events that end up forming begin-end pairs of these events both
at the start and the end of the query.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 4e6a1f8852 ("tu/autotune: Use `CP_EVENT_WRITE7::ZPASS_DONE` on A7XX")
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29403>
2024-06-18 11:39:57 +00:00
Job Noorman
6bc7cd6108 ir3: only add live-in phis for top-level intervals while spilling
When both an interval and some of its children would be live-in, we used
to add phis for all of them. This could lead to cases where the pressure
after spilling was higher than before.

This happens, for example, when both a split and its parent are live-in.
Before spilling, the split wouldn't add to the pressure because its
parent had already been inserted. After spilling, since we created a phi
for the split, the link with its parent would be lost and it would add
to the pressure.

Fix this by only adding phis for top-level intervals and adding splits
after them.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
18cd803cef ir3: refactor ir3_spill.c to use the ir3_cursor/ir3_builder API
There were a few places that used an instruction pointer to decide where
new instructions should be created. NULL was used to add them at the end
of the block. While fixing a spilling bug, a new option was needed to
add instructions at the beginning of the block. This will be much easier
to implement using cursors.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
1972db36c6 ir3: add ir3_cursor/ir3_builder helpers
Whenever instructions need to be created at specific locations, ir3
often passes around an instruction pointer. When set, new instructions
are added before or after it (depending on the context). When NULL, new
instructions are added at the end of the block. This whole scheme is
confusing.

This patch adds ir3_cursor and ir3_builder structs and the associated
helper functions. The API mirrors the one from nir_cursor/nir_builder.

This patch does not refactor existing code to use the new API. This will
happen in future patches.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
dc04fd8e62 ir3: restore interval_offset after liveness recalculation in shared RA
This value is usually set by ir3_merge_regs. Since we don't need to call
this again after shared RA, we have to copy it manually to the new
liveness struct.

Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
3f3c190649 ir3: move liveness recalculation inside ir3_ra_shared
Similar to how ir3_spill does it. This will make it easier to optimize
this in the future. E.g., we only need to recalculate liveness when any
instruction were added.

Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
7a5b198a44 ir3: index instructions before fixing up merge sets after spilling
ir3_force_merge (through merge_merge_sets) expects instructions to be
indexed. However, the instructions created during spilling would not be
automatically indexed at this point.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
018d0ab805 ir3: make indexing instructions optional in ir3_merge_regs
While fixing up merge sets after spilling, we need to index before
calling ir3_merge_regs so it would be a waste to index again.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
17b155fede ir3: expose instruction indexing helper for merge sets
We will need it to fix up merge sets after spilling.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
1bc3b819e6 ir3: don't remove collects early while spilling
It might happen that a collect that cannot be coalesced with one of its
sources while spilling can be coalesced with it afterwards. In this
case, we might be able to remove it in remove_src_early during spilling
but not afterwards (because it may have a child interval). If this
happens, we could end up with a register pressure that is higher after
spilling than before. Prevent this by never removing collects early
while spilling.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
eaec57ab6b ir3: don't remove intervals for non-killed tex prefetch sources
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
70e10babea ir3: correctly set wrmask for reload.macro
We used to set it MASK(elems) which would break when not all elements
are contiguous (which could happen for tex instructions after dce).

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
37c929ce5d ir3: set offset on splits created while spilling
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
af6f82b954 ir3: fix handling of early clobbers in calc_min_limit_pressure
Early clobbers should always add to the register pressure since they
cannot overlap with sources. handle_instr in ir3_spill.c handles this
properly but calc_min_limit_pressure did not.

Fixes: 2ff5826f09 ("ir3/ra: Add IR3_REG_EARLY_CLOBBER")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
023c7351f2 ir3: fix crash in try_evict_regs with src reg
try_evict_regs might end up calling check_dst_overlap which only works
for dst regs. Make sure this doesn't happen for src regs.

Fixes: 34803d15ab ("ir3/ra: Add proper support for multiple destinations")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
c24aad5867 ir3: set current instruction before all validation asserts
The first assert happened before setting the current instruction which
caused the error message to refer to the previous instruction.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
f57bee676f ir3: debug print limit pressure and post-spill max pressure
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
eadabc2eab ir3: print dst_offset of spill.macro
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
ac2a582fac ir3: print intervals when dumping merge sets
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:23 +00:00
Job Noorman
0a0ac6a72f ir3: print sharedness/halfness of merge set regs
This helps debugging issues when different types of registers end up in
the same merge set.

Signed-off-bype Job Noorman <jnoorman@igalia.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
2024-06-18 11:09:22 +00:00
Danylo Piliaiev
e602a7a392 freedreno/replay: Fix replaying without SET_IOVA
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29753>
2024-06-18 09:46:19 +00:00
Danylo Piliaiev
7c07c44d57 freedreno/rddecompiler: Make possible to use original shader
Sometimes decompiled shader isn't easily compiled back into the
same binary, e.g. when some part of bitset is not decoded.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29753>
2024-06-18 09:46:19 +00:00
Kenneth Graunke
9e750f00c3 intel/brw: Make opt_copy_propagation_defs clean up its own trash
Copy propagation often eliminates all uses of an instruction.  If we
detect that we've done so, we can eliminate the instruction ourselves
rather than leaving it hanging until the next DCE pass.

This saves some CPU time as other passes don't see dead code.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>
2024-06-18 09:02:25 +00:00