Previously, we walked all the instructions in the block for each
first-rpt-instruction in the block. Instead, on the first query per
block, build a set of all the rpts in the block, so that the remaining
queries are O(1) lookups.
shader-db runtime for deadspace3 -7.60909% +/- 2.28996% (n=10) on a
debugoptimized build.
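As a rough illustration of the caching idea (not the actual patch), the
sketch below builds a per-block set on the first query and answers later
queries from it. The struct layout, the serial field, and all function
names are hypothetical stand-ins for the compiler's real types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the compiler's real types. */
struct instr {
   uint32_t serial; /* unique, nonzero id */
   bool is_rpt;
};

struct block {
   struct instr *instrs;
   unsigned num_instrs;
   uint32_t *rpt_set; /* open-addressing table of serials, 0 = empty slot */
   unsigned rpt_cap;  /* power of two, 0 until the first query */
   unsigned walks;    /* number of full block walks, for illustration only */
};

static unsigned
hash_serial(uint32_t serial, unsigned cap)
{
   return (serial * 2654435761u) & (cap - 1);
}

/* Walk the block once and record every rpt instruction in the set. */
static void
build_rpt_set(struct block *b)
{
   unsigned cap = 8;
   while (cap < 2 * b->num_instrs) /* keep the load factor at or below 50% */
      cap *= 2;

   b->rpt_set = calloc(cap, sizeof(uint32_t));
   assert(b->rpt_set != NULL);
   b->rpt_cap = cap;
   b->walks++;

   for (unsigned i = 0; i < b->num_instrs; i++) {
      if (!b->instrs[i].is_rpt)
         continue;
      unsigned slot = hash_serial(b->instrs[i].serial, cap);
      while (b->rpt_set[slot] != 0)
         slot = (slot + 1) & (cap - 1);
      b->rpt_set[slot] = b->instrs[i].serial;
   }
}

/* The first query per block pays for the walk; later queries are lookups. */
static bool
block_contains_rpt(struct block *b, uint32_t serial)
{
   assert(serial != 0);
   if (b->rpt_cap == 0)
      build_rpt_set(b);

   unsigned slot = hash_serial(serial, b->rpt_cap);
   while (b->rpt_set[slot] != 0) {
      if (b->rpt_set[slot] == serial)
         return true;
      slot = (slot + 1) & (b->rpt_cap - 1);
   }
   return false;
}
```

The walks counter exists only to show that repeated queries on the same
block trigger a single walk; a real implementation would also free the
set when the block is destroyed or its instructions change.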
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37625>
Android has to enable dumping globally: there is no per-app env var at
runtime, since most apps just fork from the Zygote process. So we have
to add the process name to the dump file name. With pandecode.dump as
the base name, the dump files look like this on Android:
- pandecode.dump.com.example.VkCube.ctx-*
- pandecode.dump.com.google.android.apps.nexuslauncher.ctx-*
This is also generally useful on Linux when debugging different things,
as it avoids accidentally touching existing dumps.
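A minimal sketch of the naming scheme, assuming the process name has
already been obtained by some other means. The helper name and its
fallback behaviour are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<base>.<process_name>" into out, or just
 * "<base>" when no process name is available. Returns false if the result
 * did not fit, so the caller can skip dumping instead of using a truncated
 * name. The ".ctx-*" suffix from the examples above is not handled here. */
static bool
dump_name_with_process(char *out, size_t out_size, const char *base,
                       const char *process_name)
{
   assert(out != NULL && out_size > 0 && base != NULL);

   int n;
   if (process_name != NULL && strlen(process_name) > 0)
      n = snprintf(out, out_size, "%s.%s", base, process_name);
   else
      n = snprintf(out, out_size, "%s", base);

   return n >= 0 && (size_t)n < out_size;
}
```

With base "pandecode.dump" and process name "com.example.VkCube", this
yields "pandecode.dump.com.example.VkCube".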
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37729>
This avoids a native crash on Android when a privileged system process is
involved during the app launch animation but does not have the specified
storage access (e.g. system_server can't access the common location
/sdcard/*).
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37729>
A 64-bit atomic load/store should be considered entirely out-of-bounds if
any part of it is out-of-bounds. Since we implemented these as 32-bit vec2
load/store, it would have been possible for the first half to be in-bounds
while the second half is out-of-bounds.
From 9.6.1. Robust Buffer Access of Vulkan 1.4.324 specification:
> Any non-atomic access to a uniform, storage, uniform texel, or storage
> texel buffer wider than 32-bits may be treated as multiple 32-bit
> accesses that are separately bounds checked.
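The difference between the two bounds checks can be sketched as follows.
This is a host-side illustration only; the function names are
hypothetical and the real check is emitted as shader code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-dword check, as for a 64-bit access lowered to a 32-bit vec2.
 * Returns a mask of in-bounds halves: bit 0 = low dword, bit 1 = high. */
static unsigned
split_access_inbounds_mask(uint64_t offset, uint64_t buf_size)
{
   assert(offset <= UINT64_MAX - 8); /* keep offset + 4 from wrapping */

   unsigned mask = 0;
   for (unsigned i = 0; i < 2; i++) {
      uint64_t start = offset + 4 * i;
      if (start <= buf_size && buf_size - start >= 4)
         mask |= 1u << i;
   }
   return mask;
}

/* Whole-access check, as needed for a 64-bit atomic: the access is in
 * bounds only if all 8 bytes are. */
static bool
atomic64_inbounds(uint64_t offset, uint64_t buf_size)
{
   return offset <= buf_size && buf_size - offset >= 8;
}
```

For a 12-byte buffer and an access at offset 8, the split check reports
the low half as in-bounds and the high half as out-of-bounds, while the
whole-access check rejects the access entirely.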
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
This has existed since ccfe9813fb because NIR had no atomic loads/stores.
This is no longer the case.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
For non-atomic loads, this situation would require a data race.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
This is so that passes and backends can tell if a coherent load/store is
atomic or not, instead of having to assume it could be either.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
Because TU_DEBUG options are divided into runtime and envvar options,
the places an option could be set from were limited when TU_DEBUG_FILE
was in use. This commit addresses that by allowing the envvar to set
runtime debug options even when TU_DEBUG_FILE is active, while also
allowing the file to set non-runtime options if it included them at
startup.
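One possible reading of this behaviour, sketched with bitmasks. The
option bits, the function names, and the assumption that envvar runtime
options stay set across file reloads are all hypothetical; the real
TU_DEBUG handling may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical option bits; the real TU_DEBUG flags and their split differ. */
#define DBG_RUNTIME_A (1u << 0)
#define DBG_RUNTIME_B (1u << 1)
#define DBG_STARTUP_A (1u << 2)
#define DBG_STARTUP_B (1u << 3)
#define DBG_RUNTIME_MASK (DBG_RUNTIME_A | DBG_RUNTIME_B)

/* At startup, both sources may contribute any option. */
static uint32_t
debug_flags_at_startup(uint32_t env_flags, uint32_t file_flags)
{
   return env_flags | file_flags;
}

/* When the file changes later, only runtime options are re-evaluated.
 * Assumption: runtime options from the envvar remain in effect. */
static uint32_t
debug_flags_on_file_reload(uint32_t current, uint32_t env_flags,
                           uint32_t file_flags)
{
   uint32_t startup_only = current & ~DBG_RUNTIME_MASK;
   uint32_t runtime = (env_flags | file_flags) & DBG_RUNTIME_MASK;
   uint32_t result = startup_only | runtime;

   /* Non-runtime options never change after startup. */
   assert((result & ~DBG_RUNTIME_MASK) == startup_only);
   return result;
}
```

In this sketch, a non-runtime option that first appears in the file
after startup is ignored, matching the "if the file included them at
startup" condition above.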
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37580>
The primary CS doesn't need to use chaining in order to use IB2.
Allow using IB2 packets when chaining is disabled.
Rationale for this patch:
When chaining is enabled (the default), this patch removes a
useless check.
When chaining is disabled (by noibchaining), this patch allows us
to use IB2 without chaining.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37280>
All CS always use IBs, so the naming was confusing.
Rename these fields to chain_ib to better reflect
what they actually mean, which is enabling chaining:
radv_amdgpu_winsys::use_ib_bos
radv_amdgpu_cs::chain_ib
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37280>
These instructions don't need a sampler.
This doesn't fix anything now because this helper isn't used yet, but
it will help for descriptor heap.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37720>
We form LDS clauses because heavily interleaving LDS and VALU leads to false
dependencies. But LDS is completely uncached, so splitting the clause with
waitcnts shouldn't hurt; it might even be beneficial because the first
LDS store can start earlier.
Foz-DB Navi48:
Totals from 170 (0.21% of 80287) affected shaders:
Instrs: 239633 -> 240148 (+0.21%)
CodeSize: 1276584 -> 1278532 (+0.15%)
Latency: 3788507 -> 3789876 (+0.04%); split: -0.01%, +0.04%
InvThroughput: 841637 -> 841694 (+0.01%); split: -0.01%, +0.02%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37701>
Lowering them earlier, right after VTN, would allow us to implement
embedded samplers for descriptor heap properly for merged shaders.
Non-immediate samplers are still lowered in
radv_nir_apply_pipeline_layout because they require shader arguments.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37688>
The DUTs have been in use for over 2 weeks and the new jobs landed over
1 week ago, without new unknown problems cropping up (not bullet-proof
ethernet gadget).
Additionally, the high temperature (up to 95°C) was discussed with
@lumag and he is not concerned by it... so let's move the jobs to the
merge pipeline!
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37612>
Use vk_video_is_profile_supported first, and add AMD specific
restrictions later.
vulkaninfo reports on Navi31:
H.264 Decode (4:2:0 8-bit) Baseline progressive
H.264 Decode (4:2:0 8-bit) Main progressive
H.264 Decode (4:2:0 8-bit) High progressive
H.264 Decode (4:2:0 8-bit) Baseline interlaced (interleaved lines)
H.264 Decode (4:2:0 8-bit) Main interlaced (interleaved lines)
H.264 Decode (4:2:0 8-bit) High interlaced (interleaved lines)
H.264 Decode (monochrome 8-bit) High progressive
H.264 Decode (monochrome 8-bit) High interlaced (interleaved lines)
H.265 Decode (4:2:0 8-bit) Main
H.265 Decode (4:2:0 8-bit) Main 10
H.265 Decode (4:2:0 8-bit) Main Still Picture
H.265 Decode (4:2:0 10-bit) Main 10
VP9 Decode (4:2:0 8-bit) Profile 0
VP9 Decode (4:2:0 10-bit) Profile 2
AV1 Decode (4:2:0 8-bit) Main with film grain support
AV1 Decode (4:2:0 8-bit) Main without film grain support
AV1 Decode (4:2:0 10-bit) Main with film grain support
AV1 Decode (4:2:0 10-bit) Main without film grain support
AV1 Decode (4:2:0 12-bit) Professional with film grain support
AV1 Decode (4:2:0 12-bit) Professional without film grain support
AV1 Decode (monochrome 8-bit) Main with film grain support
AV1 Decode (monochrome 8-bit) Main without film grain support
AV1 Decode (monochrome 10-bit) Main with film grain support
AV1 Decode (monochrome 10-bit) Main without film grain support
AV1 Decode (monochrome 12-bit) Professional with film grain support
AV1 Decode (monochrome 12-bit) Professional without film grain support
H.264 Encode (4:2:0 8-bit) Baseline
H.264 Encode (4:2:0 8-bit) Main
H.264 Encode (4:2:0 8-bit) High
H.265 Encode (4:2:0 8-bit) Main
H.265 Encode (4:2:0 8-bit) Main 10
H.265 Encode (4:2:0 8-bit) Main Still Picture
H.265 Encode (4:2:0 10-bit) Main 10
AV1 Encode (4:2:0 8-bit) Main
AV1 Encode (4:2:0 10-bit) Main
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37656>