Commit graph

57163 commits

Author SHA1 Message Date
Alyssa Rosenzweig
40372bd720 panfrost: Implement a disk cache
Wire up the Mesa shader disk cache into Panfrost. Coupled with the
precompiles from the previous patch, this should greatly reduce shader
recompile jank.

This is a bare bones implementation. Obvious future work includes:

- Caching internal (outside of Gallium) shaders
- Implement finalize_nir to reduce on disk size of shaders

That doesn't need to come in this patch.

This patch does shuffle some allocation patterns around to avoid extra
nir_shader_clones, but the result should be pretty clean.

---

Consider dEQP-GLES31.functional.ssbo.layout.basic_unsized_array.* in the CTS.
With a cold cache:

   44.11user 0.66system 0:45.44elapsed 98%CPU (0avgtext+0avgdata 267804maxresident)
   k 0inputs+0outputs (130major+74725minor)pagefaults 0swaps

But with this commit and a warm cache:

   4.07user 0.35system 0:04.56elapsed 96%CPU (0avgtext+0avgdata 211012maxresident)
   k0inputs+0outputs (1major+49489minor)pagefaults 0swaps

That's an 11x improvement!

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
b35a55bb42 panfrost: Precompile shaders
We have no vertex shader key, and unless legacy GL features are used, the
fragment shader key is known ahead-of-time. That means we can precompile shaders
at CSO create time, hopefully avoiding some draw-time jank.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
01bbf8e2df panfrost: Precompile transform feedback program
This avoids the weird compiled_shader pointer inside of compiled_shader. Because
we don't have a nonempty vertex shader key, there will only ever be a single
transform feedback program per CSO.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
b290ac960b panfrost: Make fixed_varying_mask a fragment-only key
This makes it clear that there are no VS variants.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
6d6f25e97e panfrost: Use u_dynarray for variants
No need to open code our own "special" dynarray. Unify the graphics/compute CSO
creation to make this work without duplicating more code.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
7bc34fbe84 panfrost: Remove uncompiled_shader->active_variant
The active compiled shader (variant) is context state, it is inappropriate to
stash it on the uncompiled shader. Add compiled shader pointers to the context
and get rid of the active_variant mutation. Names from iris.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
52b4181eed panfrost: Rename structs to panfrost_(un)compiled_shader
Consistency with other drivers, this makes the language less variant-centric.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
ea45460f55 panfrost: Remove unused req_input_mem copy
Cloverism.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
78f7128dad panfrost: Merge pan_assemble.c into pan_shader.c
We now have a common place for the driver side of shader compilation. As a bonus
this gets rid of the old "assemble" name which hasn't been accurate since 2018
or so.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
5ef46b4f72 panfrost: Consolidate all shader compiling code
Compute and graphics shaders will need similar paths for the disk cache. Let's
consolidate the code to make it easier to work with.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
ecbeb6a335 panfrost: Remove bogus assert
Nothing enforces this except perhaps the implicit structure of shader keys.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
4860b0f59e panfrost: Move small compute functions to pan_context.c
So we can use pan_compute.c for just programs.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
2e1a69105d panfrost: Delete set_global_resources
Cloverism.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
93bf7104d0 panfrost: Don't allocate space for empty varyings
PIPE_FORMAT_NONE has a block size of 1, oddly, but we don't actually
need to allocate any space for it. This acts as a small optimization for
a few shaders with the new varying linker.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Rob Clark
4087374deb freedreno/a6xx: Mark gl45 supported
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
bb52332b50 freedreno/a6xx: ARB_query_buffer_object support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
41455c6369 freedreno: Core ARB_query_buffer_object support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
27250d67e5 freedreno/batch: Add a global epilogue
Rename the existing one to make it clear that it is per-tile, and add a
new one that runs after all the tile passes.  Will be needed in the next
commit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
c9b0cd6e80 freedreno/a5xx+a6xx: Add base class for query samples
For PIPE_CAP_QUERY_BUFFER_OBJECT we'll need to write on the GPU a flag
when the query result is available, which means the buffers used for
query results should have a header with availability flag.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
46f84ce20a freedreno/a6xx: Remove unused field
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
5c5e4238ff freedreno/a6xx: Fix occlusion queries
WFI is not a strong enough barrier, which shows up in piglit qbo tests
which do a single draw.

Fixes: 13fc03f4c0 ("freedreno/a6xx: Avoid stalling for occlusion queries")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
701c0fdca2 freedreno/a6xx: Enable ARB_shader_group_vote
Already supported for at least a6xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
5b50332a14 freedreno/a3xx+: Enable ARB_derivative_control
Also already supported by ir3.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
5ceff032ad freedreno/a3xx+: Enable ARB_shader_texture_image_samples
This is already supported for ir3

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
7598db41ae freedreno/a6xx: Implement ARB_clear_texture
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
13946b8a6a freedreno/a6xx: Use box to pass 2d clear params
Simplifies the interface slightly and makes it possible to re-use the
path for pctx->clear_texture() in the next commit.  The z dimensions
still come from the surface.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
cd181b6140 freedreno: Add ARB_gl_spirv support
All the heavy lifting is done in nir.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Erik Faye-Lund
fe6a84729d zink: put union fields into structs named by the shader-stages
This makes it easier to see that a field is only valid in a given stage,
to avoid undefined behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19457>
2022-11-02 15:01:11 +00:00
Erik Faye-Lund
090a111c5d zink: do not read is_generated unless in tcs shader
It's undefined behavior in C to read a union member if another member
has been written to more recently. Let's be more careful here!

Fixes: a9d2b86c2c ("zink: store the spirv_shader to the zink_shader struct for generated tcs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19457>
2022-11-02 15:01:11 +00:00
Erik Faye-Lund
7d7e94066d zink: consider polygon-mode for rast_prim
But because polygon-offset needs to consider the primitive-type *before*
overriding the type, add a zink_prim_type()-helper for the partially
resolved state.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
2022-11-02 14:30:58 +00:00
Erik Faye-Lund
1859941768 zink: only set line-width if drawing lines
This might seem like a premature optimization, but it's going to make a
bit more sense with the next commit, to prevent needlessly regressing
performance.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
2022-11-02 14:30:58 +00:00
Erik Faye-Lund
53721827ea zink: correct depth-bias enable condition
This should be based on the fill_mode, not on the primitive type. We
*also* need to check if we'll rasterize triangles in the end, though.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
2022-11-02 14:30:58 +00:00
Adam Jackson
b78afc2c73 rusticl: meson devenv support
This gets 'meson devenv -C build clinfo' working on iris for me.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19451>
2022-11-02 13:44:12 +00:00
Pierre-Eric Pelloux-Prayer
4147add280 radeonsi: update db_eqaa even if msaa is disabled
This seems to fix rendering in application toggling MSAA on and
off between draw calls.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7537
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19326>
2022-11-02 11:24:36 +01:00
Pierre-Eric Pelloux-Prayer
abf3dea738 radeonsi/gfx11: enable sdma copy DRI_PRIME
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19326>
2022-11-02 11:24:05 +01:00
Illia Abernikhin
aa4ac5ff8b utils: Merge util/debug.* into util/u_debug.* and remove util/debug.*
Rename env_var_as_unsigned() -> debug_get_num_option(), because duplicate
Rename env_var_as_bool() -> debug_get_bool_option(), because duplicate

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177

Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>
2022-11-02 07:25:39 +00:00
Thomas Debesse
d375a0ff8a crocus: set clear_buffer = u_default_clear_buffer
This is required when crocus is enabled in rusticl,
the lack of it contributes to this error:

thread '<unnamed>' panicked at 'Context missing features. This should never happen!', ../src/gallium/frontends/rusticl/mesa/pipe/context.rs:44:13

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19001>
2022-11-02 06:52:15 +00:00
Thomas Debesse
e74e82ea77 gallium/clover: pass -opaque-pointers to Clang on LLVM 15 and 16
This does the exact opposite of 06e96074 from !16129.

Before LLVM commit 702d5de4 opaque pointers were supported but not enabled
by default when building LLVM. They were made default in commit 702d5de4.
LLVM commit d69e9f9d introduced -opaque-pointers/-no-opaque-pointers cc1
options to enable or disable them whatever the LLVM default is.

Those two commits follow llvmorg-15-init and precede llvmorg-15.0.0-rc1 tags.

Since LLVM commit d785a8ea, the CLANG_ENABLE_OPAQUE_POINTERS build option of
LLVM is removed, meaning there is no way to build LLVM with opaque pointers
enabled by default.
It was said at the time it was still possible to explicitly disable opaque
pointers via cc1 -no-opaque-pointers option, but it is known a later commit
broke backward compatibility provided by -no-opaque-pointers as verified with
arbitrary commit d7d586e5, so there is no way to use opaque pointers starting
with LLVM 16.

Those two commits follow llvmorg-16-init and precede llvmorg-16.0.0-rc1 tags.

Since Mesa commit 977dbfc9 opaque pointers are properly implemented in Clover
and used.

If we don't pass -opaque-pointers to Clang on LLVM versions supporting opaque
pointers but disabling them by default, there will be an API mismatch between
Mesa and LLVM and Clover will not work.

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19290>
2022-10-25 07:18:16 +02:00
Thomas Debesse
1a06dbcaed Revert "gallium/clover: pass -no-opaque-pointers to Clang", opaque pointers are now implemented
This reverts commit 06e9607478 from !16129.

Clover passed -no-opaque-pointers option to Clang to workaround the fact
the Clover code was not ported to opaque pointers yet.

Opaque pointers are now implemented thanks to !19103 so passing this
option to tell Clang to not do opaque pointers while Clover does
is actually breaking Clover.

Here is an example of what happens when using opaque pointers while
passing -no-opaque-pointers at the same time:

  fatal error: cannot open file 'hawaii-amdgcn-mesa-mesa3d.bc':
   Opaque pointers are only supported in -opaque-pointers mode

This fixes one of the last remaining bits to fully support opaque pointers
in Mesa as referenced in #7468, this is the last remaining bit to fully support
opaque points in Clover.

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19290>
2022-10-25 05:20:29 +02:00
Alyssa Rosenzweig
45a111c21c nir/opt_algebraic: Fuse c - a * b to FMA
Algebraically it is clear that

   -(a * b) + c = (-a) * b + c = fma(-a, b, c)

But this is not clear from the NIR

   ('fadd', ('fneg', ('fmul', a, b)), c)

Add rules to handle this case specially. Note we don't necessarily want
to  solve this by pushing fneg into fmul, because the rule opt_algebraic
(not the late part where FMA fusing happens) specifically pulls fneg out
of fmul to push fneg up multiplication chains.

Noticed in the big glmark2 "terrain" shader, which has a cycle count
reduced by 22% on Mali-G57 thanks to having this pattern a ton and being
FMA bound.

BEFORE: 1249 inst, 16.015625 cycles, 16.015625 fma, ... 632 quadwords
AFTER: 997 inst, 12.437500 cycles, .... 504 quadwords

Results on the same shader on AGX are also quite dramatic:

BEFORE: 1294 inst, 8600 bytes, 50 halfregs, ...
AFTER: 1154 inst, 8040 bytes, 50 halfregs, ...

Similar rules apply for fabs.

v2: Use a loop over the bit sizes (suggested by Emma).

shader-db on Valhall (open + small subset of closed), results on Bifrost
are similar:

total instructions in shared programs: 167975 -> 164970 (-1.79%)
instructions in affected programs: 92642 -> 89637 (-3.24%)
helped: 492
HURT: 25
helped stats (abs) min: 1.0 max: 252.0 x̄: 6.25 x̃: 3
helped stats (rel) min: 0.30% max: 20.18% x̄: 3.21% x̃: 2.91%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 2.80 x̃: 3
HURT stats (rel)   min: 0.46% max: 9.09% x̄: 3.89% x̃: 3.37%
95% mean confidence interval for instructions value: -6.95 -4.68
95% mean confidence interval for instructions %-change: -3.08% -2.65%
Instructions are helped.

total cycles in shared programs: 10556.89 -> 10538.98 (-0.17%)
cycles in affected programs: 265.56 -> 247.66 (-6.74%)
helped: 88
HURT: 2
helped stats (abs) min: 0.015625 max: 3.578125 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.65% max: 22.34% x̄: 5.65% x̃: 4.25%
HURT stats (abs)   min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 8.33% max: 12.50% x̄: 10.42% x̃: 10.42%
95% mean confidence interval for cycles value: -0.28 -0.12
95% mean confidence interval for cycles %-change: -6.30% -4.30%
Cycles are helped.

total fma in shared programs: 1582.42 -> 1535.06 (-2.99%)
fma in affected programs: 871.58 -> 824.22 (-5.43%)
helped: 502
HURT: 9
helped stats (abs) min: 0.015625 max: 3.578125 x̄: 0.09 x̃: 0
helped stats (rel) min: 0.60% max: 25.00% x̄: 5.46% x̃: 4.82%
HURT stats (abs)   min: 0.015625 max: 0.0625 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 4.35% max: 12.50% x̄: 6.22% x̃: 4.35%
95% mean confidence interval for fma value: -0.11 -0.08
95% mean confidence interval for fma %-change: -5.58% -4.93%
Fma are helped.

total cvt in shared programs: 665.55 -> 665.95 (0.06%)
cvt in affected programs: 61.72 -> 62.12 (0.66%)
helped: 33
HURT: 43
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.04 x̃: 0
helped stats (rel) min: 1.01% max: 25.00% x̄: 6.68% x̃: 4.35%
HURT stats (abs)   min: 0.015625 max: 0.109375 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.78% max: 38.46% x̄: 10.85% x̃: 6.90%
95% mean confidence interval for cvt value: -0.01 0.02
95% mean confidence interval for cvt %-change: 0.23% 6.24%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 93376 -> 91736 (-1.76%)
quadwords in affected programs: 25376 -> 23736 (-6.46%)
helped: 169
HURT: 1
helped stats (abs) min: 8.0 max: 128.0 x̄: 9.75 x̃: 8
helped stats (rel) min: 1.52% max: 33.33% x̄: 8.35% x̃: 8.00%
HURT stats (abs)   min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
95% mean confidence interval for quadwords value: -11.18 -8.11
95% mean confidence interval for quadwords %-change: -8.95% -7.36%
Quadwords are helped.

total threads in shared programs: 4697 -> 4701 (0.09%)
threads in affected programs: 4 -> 8 (100.00%)
helped: 4
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.00 1.00
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Ol<C5><A1><C3><A1>k <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19312>
2022-11-01 22:39:45 -04:00
Emma Anholt
07bac4094a gallium: update docs about PIPE_CAP_PREFER_IMM_ARRAYS_AS_CONSTBUF.
We can provide better guidance on when to (un-)set this given that
everyone's on NIR now.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
2022-11-01 14:55:56 -07:00
Emma Anholt
467ee94001 iris: Disable GLSL lower_const_arrays_to_uniforms.
We want to use nir_opt_large_constants() instead (which is already
enabled), since that doesn't involve uploading the large immediate data
array again on each CB0 update.  The downside is a bit of addressing math,
since constant_data is accessed using 64-bit global addresses.

The shader-db results are a bit all over:

All Iris driver platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19910185 -> 19913931 (0.02%)
instructions in affected programs: 225374 -> 229120 (1.66%)
helped: 3 / HURT: 348

total cycles in shared programs: 856004856 -> 855016808 (-0.12%)
cycles in affected programs: 22832422 -> 21844374 (-4.33%)
helped: 277 / HURT: 101

total spills in shared programs: 6580 -> 6609 (0.44%)
spills in affected programs: 516 -> 545 (5.62%)
helped: 1 / HURT: 4

total fills in shared programs: 8235 -> 8267 (0.39%)
fills in affected programs: 1022 -> 1054 (3.13%)
helped: 1 / HURT: 3

total sends in shared programs: 1039347 -> 1039095 (-0.02%)
sends in affected programs: 16367 -> 16115 (-1.54%)
helped: 251 / HURT: 0

LOST:   5
GAINED: 2

LOST:
- 3 SIMD16 fragment shaders (Superposition)
- 2 SIMD16 compute shaders (Aztec Ruins)

GAINED:
- fake news... 2 SIMD8 compute shaders that replace the lost SIMD16
  compute shaders.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
2022-11-01 14:55:33 -07:00
Kenneth Graunke
96054f8eba iris: Use nir_intrinsic_load_global_constant for large constants
We were using the old load_global intrinsic still, which can't be
reordered, limiting optimization opportunities.  We know the data here
is constant, so we can use the newer load_global_constant intrinsic.

This doesn't seem to have any impact on shader-db or fossil-db on any
Intel platform.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
2022-11-01 14:55:13 -07:00
Emma Anholt
e4d61f37d4 rusticl: Fix the invalid memory migration flags check.
We want to know if you have any invalid flags set, not if you don't have
any valid flags set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19446>
2022-11-01 21:30:52 +00:00
Ruijing Dong
4bf116d440 frontends/va: fixed an av1 dec image corruption.
[problem]
When decoding an av1 bitstream, it shows image corruption
in the middle of the bitstream around key frames.

[analysis]
in av1_spec.pdf page 38/669, there is a sentence below:

if ( frame_type == KEY_FRAME && show_frame ) {
   for ( i = 0; i < NUM_REF_FRAMES; i++) {
      RefValid[ i ] = 0
      ......
   }
   ......
}

This shows that the condition of invalidating current
DPB frames should be the coming frame_type is KEY_FRAME plus
show_frame is equal to 1. Otherwise, some of the frames
in sequence after KEY_FRAME still refer to the reference frames
before KEY_FRAME, and if these before KEY_FRAME reference
frames were invalidated, these frames could not find their
reference frames, and it could cause image corruption.

[solution]
Add condition of show_frame, with the corresponding fix
in ffmpeg, we cannot see this issue any longer.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19386>
2022-11-01 10:24:11 -04:00
Gert Wollny
b1e9065fe4 r600/sfn: remove load_uniform handling
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19416>
2022-11-01 14:00:44 +00:00
Gert Wollny
350c56b1c3 r600/sfn: lower uniforms to UBOs
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19416>
2022-11-01 14:00:44 +00:00
Thomas Debesse
981bc603b4 clover: implement CLOVER_DEVICE_TYPE like RUSTICL_DEVICE_TYPE
Allows to make Clover devices appearing as cpu, gpu or accelerator
by setting the CLOVER_DEVICE_TYPE environment variable like
the RUSTICL_DEVICE_TYPE environment variable does.

For example it can make the CPU llvmpipe device appear as GPU or GPU devices
appear as CPU. This is useful for testing OpenCL with applications that may
use different code path given the OpenCL device is a CPU or a GPU.

The initial motivation for RUSTICL_DEVICE_TYPE implementation was to test
rusticl with llvmipe on applications ignoring CPU devices.

This brings Clover on par with rusticl on that topic.

CL_DEVICE_TYPE_CUSTOM isn't implemented or applications may crash when
iterating devices because CL_DEVICE_TYPE_CUSTOM is OpenCL 1.2 and Clover
is OpenCL 1.1.

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18931>
2022-11-01 13:32:01 +00:00
Gert Wollny
3b9f36db47 r600/sfn: Handle load_workgroup_size
Fixes: 79ca456b48
   r600/sfn: rewrite NIR backend

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19417>
2022-11-01 08:04:48 +00:00
Alyssa Rosenzweig
fda7d17e81 gallium: Default to PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT
Supported in all hardware and software drivers. Only that don't support
are virgl and svga, depending on host capabilities. I don't think
there's anything to be done there. This does give fewer places to screw
up the CAPs, though, because everyone wants ARB_buffer_storage.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Ol<C5><A1><C3><A1>k <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19392>
2022-10-31 23:35:33 -04:00