This is a temp fix. Currently we mix use llvm and aco to compile
shaders when AMD_DEBUG=useaco, but disk cache need function
identifier when creation, aco compiled shader should not use llvm
function identifier, so we have to disable disk cache when use
aco for now.
After aco is able to compile all shaders, we can re-enable disk
cache by removing the llvm function identifier when aco.
Fixes: d1dd36a74e ("radeonsi: be able to use aco compiler for mono ps")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9673
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25607>
Unlike other dirty bits, the vis_test dirty bit wasn't being taken into
consideration when determining whether PPP state needed to be emitted as part
of a draw call.
Fixes a large number of tests in dEQP-VK.query_pool.occlusion_query.*.
Fixes: 2b1992a000 ("pvr: Implement vkCmdBeginQuery API.")
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25491>
Bits were being assigned rather than ORed into the mask during setup. Noticed
through code inspection.
Fixes: e089166776 ("pvr: Add support for VK_ATTACHMENT_LOAD_OP_LOAD.")
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25487>
The number of const shared registers was being used for the allocation size
rather than the number of bytes. In practice this doesn't make a difference as
the max allocation size is 24 bytes, which then gets rounded up to 64 bytes by
the buffer allocation function. However, we might as well make the allocation
size correct to avoid any future confusion. Noticed through code inspection.
Fixes: 7509e259f8 ("pvr: Implement color/depth/depth+stencil attachment clear.")
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25489>
pvr_is_stencil_store_load_needed() may be called on secondary command buffers,
which don't have any attachments. This wasn't being taken into account, meaning
a segfault could occur.
Fixes a segfault seen in:
dEQP-VK.renderpass.suballocation.attachment_allocation.input_output.39
Fixes: 54876512a1 ("pvr: Add mid fragment pipeline barrier if needed.")
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25486>
Android-14/clang-17 throws an error with it:
ld.lld: error: version script assignment of 'global' to symbol
'__driDriverExtensions' failed: symbol not defined
Fixes: d43e6a9a49 ("dri: Remove the megadriver compat stub")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25587>
It works with just one counter.
This mitigates https://gitlab.freedesktop.org/drm/amd/-/issues/2902
quite a lot when you run dEQP-VK.transform_feedback.* in parallel on
more than 16 threads with RDNA3.
For example, on my GPU the kernel reports 16 GDS OA counters which means
that if you run VKCTS with 16 threads (ie. 16 Vulkan devices are
created) it's fine. Otherwise, the kernel might report ENOMEM.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25619>
Otherwise, we have dangling BO pointers in the global BO list. Not
quite sure why this hasn't been triggered before.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25623>
In general, it is unsafe to speculatively hoist conditionally executed loads
into the preamble. For example, if the shader does:
if (ptr is valid) {
foo(*ptr)
}
we cannot dereference ptr in the preamble without knowing that the pointer is
valid (which may not be determinable, since it might not be uniform).
nir_opt_preamble needs to stop speculating in this case, or otherwise using
preambles can cause faults on legal shaders.
However, some platforms may be able to speculate loads safely. For example,
Apple hardware is able to suppress MMU faults, making speculation safe. This is
controlled global register to control this behaviour, set at boot-time by the
kernel. (macOS suppresses these faults unconditionally, this feature may be
used in their implementation of sparse textures. Currently Linux does not
suppress any faults but this may change later.)
Since nir_opt_preamble should work soundly and optimally on a variety of
platforms, we need to respect the ACCESS flag.
Thanks to the if-else hoisting implemented earlier in the series, this isn't too
terrible of a band-aid on Asahi:
total instructions in shared programs: 1499674 -> 1507699 (0.54%)
instructions in affected programs: 78865 -> 86890 (10.18%)
helped: 0
HURT: 337
Instructions are HURT.
total bytes in shared programs: 10238284 -> 10279308 (0.40%)
bytes in affected programs: 554504 -> 595528 (7.40%)
helped: 3
HURT: 334
Bytes are HURT.
total halfregs in shared programs: 452049 -> 454015 (0.43%)
halfregs in affected programs: 7569 -> 9535 (25.97%)
helped: 7
HURT: 150
Halfregs are HURT.
There are no shader-db changes on ir3 as expected, since ir3 can safely
speculate all instructions in my shader-db.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
Add infrastructure to reconstruct if's. Later in the series, this will let us
hoist loads from inside uniform if's without speculating. For now, it lets us
handle phi's in nir_opt_preamble in a straightforward way.
Results on AGX are good:
total instructions in shared programs: 1504730 -> 1499674 (-0.34%)
instructions in affected programs: 153673 -> 148617 (-3.29%)
helped: 496
HURT: 0
Instructions are helped.
total bytes in shared programs: 10287768 -> 10238284 (-0.48%)
bytes in affected programs: 1113724 -> 1064240 (-4.44%)
helped: 496
HURT: 0
Bytes are helped.
total halfregs in shared programs: 452669 -> 452049 (-0.14%)
halfregs in affected programs: 14825 -> 14205 (-4.18%)
helped: 152
HURT: 99
Halfregs are helped.
total threads in shared programs: 16469504 -> 16470784 (<.01%)
threads in affected programs: 8960 -> 10240 (14.29%)
helped: 10
HURT: 0
Threads are helped.
Results on ir3 is a bit more of a wash but still should be a win overall: The
regression in moves seems scary, but the cost model already accounts for them as
evidenced by instruction count coming out ahead.
total instructions in shared programs: 3108750 -> 3105993 (-0.09%)
instructions in affected programs: 317367 -> 314610 (-0.87%)
helped: 675
HURT: 242
Instructions are helped.
total nops in shared programs: 673152 -> 675048 (0.28%)
nops in affected programs: 74551 -> 76447 (2.54%)
helped: 353
HURT: 347
Inconclusive result (%-change mean confidence interval includes 0).
total non-nops in shared programs: 2435598 -> 2430945 (-0.19%)
non-nops in affected programs: 232664 -> 228011 (-2.00%)
helped: 816
HURT: 38
Non-nops are helped.
total mov in shared programs: 78201 -> 84011 (7.43%)
mov in affected programs: 10726 -> 16536 (54.17%)
helped: 60
HURT: 781
Mov are HURT.
total cov in shared programs: 74964 -> 74906 (-0.08%)
cov in affected programs: 273 -> 215 (-21.25%)
helped: 17
HURT: 0
Cov are helped.
total dwords in shared programs: 6716814 -> 6748726 (0.48%)
dwords in affected programs: 879778 -> 911690 (3.63%)
helped: 12
HURT: 948
Dwords are HURT.
total full in shared programs: 193210 -> 193212 (<.01%)
full in affected programs: 278 -> 280 (0.72%)
helped: 12
HURT: 22
Inconclusive result (value mean confidence interval includes 0).
total constlen in shared programs: 493632 -> 494816 (0.24%)
constlen in affected programs: 19904 -> 21088 (5.95%)
helped: 9
HURT: 306
Constlen are HURT.
total cat0 in shared programs: 742476 -> 745046 (0.35%)
cat0 in affected programs: 84455 -> 87025 (3.04%)
helped: 277
HURT: 489
Cat0 are HURT.
total cat1 in shared programs: 153303 -> 159059 (3.75%)
cat1 in affected programs: 17810 -> 23566 (32.32%)
helped: 69
HURT: 780
Cat1 are HURT.
total cat2 in shared programs: 1144508 -> 1140731 (-0.33%)
cat2 in affected programs: 121284 -> 117507 (-3.11%)
helped: 841
HURT: 0
Cat2 are helped.
total cat3 in shared programs: 942098 -> 934804 (-0.77%)
cat3 in affected programs: 87140 -> 79846 (-8.37%)
helped: 855
HURT: 1
Cat3 are helped.
total cat4 in shared programs: 65261 -> 65249 (-0.02%)
cat4 in affected programs: 42 -> 30 (-28.57%)
helped: 12
HURT: 0
Cat4 are helped.
total sstall in shared programs: 237311 -> 241281 (1.67%)
sstall in affected programs: 33755 -> 37725 (11.76%)
helped: 179
HURT: 493
Sstall are HURT.
total (ss) in shared programs: 58166 -> 58795 (1.08%)
(ss) in affected programs: 4535 -> 5164 (13.87%)
helped: 35
HURT: 664
(ss) are HURT.
total systall in shared programs: 503784 -> 503805 (<.01%)
systall in affected programs: 3170 -> 3191 (0.66%)
helped: 16
HURT: 13
Inconclusive result (value mean confidence interval includes 0).
total (sy) in shared programs: 27261 -> 27259 (<.01%)
(sy) in affected programs: 76 -> 74 (-2.63%)
helped: 8
HURT: 5
Inconclusive result (value mean confidence interval includes 0).
total waves in shared programs: 439848 -> 439872 (<.01%)
waves in affected programs: 160 -> 184 (15.00%)
helped: 12
HURT: 0
Waves are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
The way backends walk NIR when translating. This will make it easy to filter
can_move based on the parent control flow.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
It can be beneficial to move phi nodes, even though they can often be coalesced.
Model this cost so nir_opt_preamble can make good decisions about hoisting phi
nodes (and by extension, if-statements) into the preamble.
At this point in the series, this has no effect, but it will avoid certain
shader-db regressions associated with the nir_opt_preamble changes later in the
series.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
Speculating these loads is safe, but nir_opt_preamble doesn't know that. Set the
ACCESS bits appropriately to let it know.
This will avoid any code gen regression from the nir_opt_preamble change.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
Determining whether it is safe to hoist a load instruction out of control flow
depends on complex hardware and driver details. Rather than encoding this as
knobs in every NIR pass that wants to do so (notably nir_opt_preamble and
nir_opt_peephole_select), add a per-load ACCESS flag for backends to set.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24011>
The type is called vk_object_base, not vk_vk_objet_base... This should
fix the cross-referencing of this type.
Fixes: f6d4641433 ("vulkan,docs: Document vk_instance")
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25634>
Right now, S3 always returns something, so we need to check
the content-type .
Acked-by: Emma Anholt <emma@anholt.net>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25606>
The way we init render pass related structures is dangerous with when
structs are not zero initialized - too easy to miss a field. There
were already at least two issues with it.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25592>
This allows us to pack the is_if boolean into the bottom bit of the parent
pointer, eliminating the boolean and hence shrinking the nir_src by 8 bytes (due
to the extra 63 bits of padding incurred in the old layout).
Because all access is forced through helpers now, this is a local change.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
It is undefined behaviour in C to read a different member of a union than was
written. Nothing in-tree should be using this behaviour with the nir_src union:
nir_if should never be read as nir_instr and vice versa. Assert this.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
First, we need to give the parent_instr field a unique name to be able to
replace with a helper. We have parent_instr fields for both nir_src and
nir_def, so let's rename nir_src::parent_instr in preparation for rework.
This was done with a combination of sed and manual fix-ups.
Then we use semantic patches plus manual fixups:
@@
expression s;
@@
-s->renamed_parent_instr
+nir_src_parent_instr(s)
@@
expression s;
@@
-s.renamed_parent_instr
+nir_src_parent_instr(&s)
@@
expression s;
@@
-s->parent_if
+nir_src_parent_if(s)
@@
expression s;
@@
-s.renamed_parent_if
+nir_src_parent_if(&s)
@@
expression s;
@@
-s->is_if
+nir_src_is_if(s)
@@
expression s;
@@
-s.is_if
+nir_src_is_if(&s)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
These will become nontrivial later in the series. For now these have no smarts
in them, in order to make the conversion completely mechanical.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
It is invalid to read parent_instr for an if-use (or parent_if for a
non-if-use). Make sure we read the right one when handling if-uses.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
This re-introduces "radv: fix alignment of DGC command buffers" and
"radv/amdgpu: fix alignment of command buffers" which were valid
changes.
IBs need to be aligned to the IB size requirement, not the number of
padded NOPs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25588>
This enables some extensions and a bunch of formats for ycbcr
support.
dEQP-VK.api.info.format_properties.g8_b8_r8_3plane_420_unorm,Fail
dEQP-VK.api.info.format_properties.g8_b8r8_2plane_420_unorm,Fail
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25609>