Commit graph

150129 commits

Author SHA1 Message Date
Alyssa Rosenzweig
52b4181eed panfrost: Rename structs to panfrost_(un)compiled_shader
For consistency with other drivers. This also makes the language less variant-centric.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
ea45460f55 panfrost: Remove unused req_input_mem copy
Cloverism.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
78f7128dad panfrost: Merge pan_assemble.c into pan_shader.c
We now have a common place for the driver side of shader compilation. As a bonus
this gets rid of the old "assemble" name which hasn't been accurate since 2018
or so.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
5ef46b4f72 panfrost: Consolidate all shader compiling code
Compute and graphics shaders will need similar paths for the disk cache. Let's
consolidate the code to make it easier to work with.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
ecbeb6a335 panfrost: Remove bogus assert
Nothing enforces this except perhaps the implicit structure of shader keys.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
4860b0f59e panfrost: Move small compute functions to pan_context.c
So we can use pan_compute.c for just programs.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
2e1a69105d panfrost: Delete set_global_resources
Cloverism.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
2316b80d77 panfrost: Don't use nir_variable to link varyings
NIR deemphasizes nir_variable. We want to transition off it. Instead of walking
the list of variables and playing games with the GLSL types to collect varying
information, walk the list of instructions and use the I/O semantics to collect
similar information.

In addition to avoiding the reliance on nir_variable, this fixes handling of
struct varyings under certain circumstances. Such programs are compiled by the
GLES3.1 CTS but not used, so without this fix, the affected tests would regress
when precompiling.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
93bf7104d0 panfrost: Don't allocate space for empty varyings
PIPE_FORMAT_NONE has a block size of 1, oddly, but we don't actually
need to allocate any space for it. This acts as a small optimization for
a few shaders with the new varying linker.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
d0281fc16a pan/mdg: Use bifrost_nir_lower_store_component
Move the pass from the Bifrost compiler to the Midgard/Bifrost common code
directory, and take advantage of it on Midgard, where it fixes the same
tests as it fixed originally on Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
17589be72b pan/mdg: Use .u32 for flat shading
This is simple and matches what we do on Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
225a8f6e27 pan/mdg: Don't pair ST_VARY.a32 with other instrs
For some reason, LD_ATTR/ST_VARY.a32 bundles raise INSTR_INVALID_ENC, at
least on Mali-T860. Don't construct such pairs. This is a blunt hack but
I don't know where this curveball requirement is coming from and this
unblocks the rest of this series.

total instructions in shared programs: 99879 -> 99788 (-0.09%)
instructions in affected programs: 3179 -> 3088 (-2.86%)
helped: 49
HURT: 9
helped stats (abs) min: 1.0 max: 6.0 x̄: 2.04 x̃: 2
helped stats (rel) min: 0.93% max: 10.53% x̄: 5.46% x̃: 4.88%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.61% max: 2.13% x̄: 1.41% x̃: 1.14%
95% mean confidence interval for instructions value: -1.93 -1.20
95% mean confidence interval for instructions %-change: -5.37% -3.41%
Instructions are helped.
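
The percentage in each "total" line of these shader-db reports is the relative change between the two counts. A minimal sketch of that arithmetic, assuming a hypothetical helper name that is not part of shader-db:

```python
def relative_change_percent(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures taken from the report lines in this commit message.
print(f"instructions: {relative_change_percent(99879, 99788):.2f}%")
print(f"bundles: {relative_change_percent(43778, 45102):.2f}%")
```

A negative value means the count shrank, which the reports label "helped" for cost metrics such as instructions, bundles and quadwords.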

total bundles in shared programs: 43778 -> 45102 (3.02%)
bundles in affected programs: 10737 -> 12061 (12.33%)
helped: 10
HURT: 369
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.50 x̃: 1
helped stats (rel) min: 2.90% max: 18.75% x̄: 6.93% x̃: 5.21%
HURT stats (abs)   min: 1.0 max: 10.0 x̄: 3.63 x̃: 4
HURT stats (rel)   min: 0.82% max: 44.44% x̄: 15.27% x̃: 13.33%
95% mean confidence interval for bundles value: 3.29 3.69
95% mean confidence interval for bundles %-change: 13.68% 15.69%
Bundles are HURT.

total quadwords in shared programs: 76783 -> 77914 (1.47%)
quadwords in affected programs: 18633 -> 19764 (6.07%)
helped: 9
HURT: 370
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.22 x̃: 1
helped stats (rel) min: 0.87% max: 8.33% x̄: 3.71% x̃: 3.85%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 3.09 x̃: 3
HURT stats (rel)   min: 0.82% max: 35.00% x̄: 7.82% x̃: 6.11%
95% mean confidence interval for quadwords value: 2.82 3.15
95% mean confidence interval for quadwords %-change: 7.02% 8.06%
Quadwords are HURT.

total registers in shared programs: 7266 -> 7076 (-2.61%)
registers in affected programs: 1224 -> 1034 (-15.52%)
helped: 171
HURT: 25
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.27 x̃: 1
helped stats (rel) min: 8.33% max: 50.00% x̄: 21.85% x̃: 20.00%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.12 x̃: 1
HURT stats (rel)   min: 10.00% max: 100.00% x̄: 35.73% x̃: 33.33%
95% mean confidence interval for registers value: -1.10 -0.84
95% mean confidence interval for registers %-change: -17.69% -11.32%
Registers are helped.

total threads in shared programs: 4956 -> 5019 (1.27%)
threads in affected programs: 99 -> 162 (63.64%)
helped: 43
HURT: 6
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.74 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.91 1.66
95% mean confidence interval for threads %-change: 67.36% 95.90%
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig
e04156b42a pan/mdg: Disassemble the .a32 bit
Corresponds to .auto32 on Bifrost. This is helpful for a conformant
implementation of flat shading.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
2022-11-02 16:52:11 +00:00
Rob Clark
4087374deb freedreno/a6xx: Mark gl45 supported
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
bb52332b50 freedreno/a6xx: ARB_query_buffer_object support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
41455c6369 freedreno: Core ARB_query_buffer_object support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
27250d67e5 freedreno/batch: Add a global epilogue
Rename the existing one to make it clear that it is per-tile, and add a
new one that runs after all the tile passes.  Will be needed in the next
commit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
c9b0cd6e80 freedreno/a5xx+a6xx: Add base class for query samples
For PIPE_CAP_QUERY_BUFFER_OBJECT we'll need to write a flag on the GPU
when the query result is available, which means the buffers used for
query results should have a header with an availability flag.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
46f84ce20a freedreno/a6xx: Remove unused field
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
5c5e4238ff freedreno/a6xx: Fix occlusion queries
WFI is not a strong enough barrier, which shows up in piglit qbo tests
which do a single draw.

Fixes: 13fc03f4c0 ("freedreno/a6xx: Avoid stalling for occlusion queries")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
701c0fdca2 freedreno/a6xx: Enable ARB_shader_group_vote
Already supported for at least a6xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
6edac0aaed freedreno/ir3: Unconditionally lower subgroup ops
For devices that don't support getfiberid, we force the subgroup size
to 1 for things other than compute stage.  This matches what zink does.
And fixes spec@arb_shader_group_vote@vs-eq-uniform once we expose
ARB_shader_group_vote.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
5b50332a14 freedreno/a3xx+: Enable ARB_derivative_control
Also already supported by ir3.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
5ceff032ad freedreno/a3xx+: Enable ARB_shader_texture_image_samples
This is already supported by ir3.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
7598db41ae freedreno/a6xx: Implement ARB_clear_texture
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
13946b8a6a freedreno/a6xx: Use box to pass 2d clear params
Simplifies the interface slightly and makes it possible to re-use the
path for pctx->clear_texture() in the next commit.  The z dimensions
still come from the surface.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Rob Clark
cd181b6140 freedreno: Add ARB_gl_spirv support
All the heavy lifting is done in nir.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
2022-11-02 15:42:14 +00:00
Erik Faye-Lund
fe6a84729d zink: put union fields into structs named by the shader-stages
This makes it easier to see that a field is only valid in a given stage,
to avoid undefined behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19457>
2022-11-02 15:01:11 +00:00
Erik Faye-Lund
090a111c5d zink: do not read is_generated unless in tcs shader
It's undefined behavior in C to read a union member if another member
has been written to more recently. Let's be more careful here!

Fixes: a9d2b86c2c ("zink: store the spirv_shader to the zink_shader struct for generated tcs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19457>
2022-11-02 15:01:11 +00:00
Erik Faye-Lund
7d7e94066d zink: consider polygon-mode for rast_prim
But because polygon-offset needs to consider the primitive-type *before*
overriding the type, add a zink_prim_type()-helper for the partially
resolved state.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
2022-11-02 14:30:58 +00:00
Erik Faye-Lund
1859941768 zink: only set line-width if drawing lines
This might seem like a premature optimization, but it's going to make a
bit more sense with the next commit, to prevent needlessly regressing
performance.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
2022-11-02 14:30:58 +00:00
Erik Faye-Lund
53721827ea zink: correct depth-bias enable condition
This should be based on the fill_mode, not on the primitive type. We
*also* need to check if we'll rasterize triangles in the end, though.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
2022-11-02 14:30:58 +00:00
Adam Jackson
b78afc2c73 rusticl: meson devenv support
This gets 'meson devenv -C build clinfo' working on iris for me.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19451>
2022-11-02 13:44:12 +00:00
Rhys Perry
a71d068fd0 radv/llvm: fix GS shaders on GFX8/9
6698753cdb switched our GS output stores to use MUBUF.

The stride doesn't matter for the ESGS descriptor (because idxen=false and
the index stride is 64), but this fixes it anyway.

This also changes ACO to use MUBUF store too, since MTBUF doesn't seem to
work correctly with an invalid data format in the descriptor.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6698753cdb ("ac/llvm: don't use tbuffer_store as a fallback for swizzled stores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18885>
2022-11-02 12:48:01 +00:00
Joan Bruguera
6014a642ae nv50/ir/nir: ignore sampler for TXF/TXQ ops.
Recently, a regression was reported where videos in Firefox had shifted/
glitched colors on certain Kepler hardware. This was bisected to
bf02bffe15, however, the issue already
existed but didn't hit users until TGSI was switched to NIR as default.

The issue was traced to a YUV-to-RGB fragment shader used by Firefox,
which uses three samplers for the Y/U/V components. The Y component was
handled correctly, but the U/V components were bogus, causing the issue.

After analysis, it appears the TXF/TXQ ops. should only handle the texture
(r) but not the sampler (s), see 63b850403c
and 346ce0b988.
Similarly, handleTXQ/handleTXF on nv50_ir_from_tgsi always sets s=0.
Only Kepler was affected because other hardware ignores s at codegen.

Always set s=0 on NIR for TXF/TXQ, to keep TGSI behavior and fix the
regression.

Thanks: Karol Herbst and M Henning for help diagnosing the issue.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7416
Cc: mesa-stable
Suggested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19453>
2022-11-02 12:29:34 +00:00
Pierre-Eric Pelloux-Prayer
4147add280 radeonsi: update db_eqaa even if msaa is disabled
This seems to fix rendering in applications toggling MSAA on and
off between draw calls.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7537
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19326>
2022-11-02 11:24:36 +01:00
Pierre-Eric Pelloux-Prayer
abf3dea738 radeonsi/gfx11: enable sdma copy DRI_PRIME
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19326>
2022-11-02 11:24:05 +01:00
Marcin Ślusarz
dcaaeb56ef anv: program 3DSTATE_MESH_DISTRIB with the recommended values
It improves performance of vk_meshlet_cadscene on A770.

Fixes: f083df8710 ("anv: update task/mesh distribution with the recommended values")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>
2022-11-02 08:56:53 +00:00
Marcin Ślusarz
d1d2dee970 anv: set 3DSTATE_[MESH|TASK]_CONTROL.MaximumNumberofThreadGroups
The documentation is worded in a confusing way, which may be read as saying
that we don't have to set this field to get good results.

MESH part of this commit improves performance of vk_meshlet_cadscene
by a factor of 2 on A380.

Fixes: ef04caea9b ("anv: Implement Mesh Shading pipeline")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>
2022-11-02 08:56:53 +00:00
Marcin Ślusarz
11612d81b7 intel/genxml: fix width of 3DSTATE_TASK_CONTROL.MaximumNumberofThreadGroups
Fixes: 3567d47f3e ("intel/genxml: Inline the BODY structs into the instructions")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>
2022-11-02 08:56:53 +00:00
Illia Abernikhin
aa4ac5ff8b utils: Merge util/debug.* into util/u_debug.* and remove util/debug.*
Rename env_var_as_unsigned() -> debug_get_num_option(), because they are duplicates
Rename env_var_as_bool() -> debug_get_bool_option(), because they are duplicates

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177

Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>
2022-11-02 07:25:39 +00:00
Illia Abernikhin
0e47171abe utils: Move functions from debug.* to u_debug.*
Add unit tests for debug_get_bool_option and debug_get_num_option
Merge env_var_as_boolean and debug_get_bool_option and implement
 env_var_as_boolean with debug_get_bool_option in a stricter side.
Merge env_var_as_unsigned and debug_get_num_option and implement
 env_var_as_unsigned with debug_get_num_option in a stricter side.
Move debug_control, parse_debug_string, parse_enable_string,
 comma_separated_list_contains from debug.* to u_debug.*

Main changes:
os_get_option() is used instead of getenv() for env_var_as_boolean
 and env_var_as_unsigned;
debug_get_bool_option() used to return true for any value other than "false",
 while env_var_as_boolean() returned the default value when the variable was
 neither "true" nor "false". We kept the second behavior; if you want the
 same behavior as the old version of debug_get_bool_option(), use dfault=true
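
The stricter lookup described above can be sketched as follows. This is an illustration only, not the u_debug implementation: the function name, the `environ` parameter and the accepted spellings are assumptions made for the example.

```python
import os

# Assumed spellings; the real helper may accept a different set.
TRUE_WORDS = {"1", "true", "yes", "y"}
FALSE_WORDS = {"0", "false", "no", "n"}


def get_bool_option(name, default, environ=os.environ):
    """Return the boolean value of an environment variable.

    Unset or unrecognized values fall back to `default` instead of
    being treated as true.
    """
    value = environ.get(name)
    if value is None:
        return default
    word = value.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    return default  # neither "true" nor "false": keep the default
```

With this logic an unrecognized value yields whatever default the caller passed, so passing a default of true reproduces the older "true unless false" behavior for such values.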

Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>
2022-11-02 07:25:39 +00:00
Thomas Debesse
d375a0ff8a crocus: set clear_buffer = u_default_clear_buffer
This is required when crocus is enabled in rusticl;
the lack of it contributes to this error:

thread '<unnamed>' panicked at 'Context missing features. This should never happen!', ../src/gallium/frontends/rusticl/mesa/pipe/context.rs:44:13

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19001>
2022-11-02 06:52:15 +00:00
Thomas Debesse
e74e82ea77 gallium/clover: pass -opaque-pointers to Clang on LLVM 15 and 16
This does the exact opposite of 06e96074 from !16129.

Before LLVM commit 702d5de4 opaque pointers were supported but not enabled
by default when building LLVM. They were made default in commit 702d5de4.
LLVM commit d69e9f9d introduced -opaque-pointers/-no-opaque-pointers cc1
options to enable or disable them whatever the LLVM default is.

Those two commits follow llvmorg-15-init and precede llvmorg-15.0.0-rc1 tags.

Since LLVM commit d785a8ea, the CLANG_ENABLE_OPAQUE_POINTERS build option of
LLVM is removed, meaning there is no way to build LLVM with opaque pointers
disabled by default.
It was said at the time it was still possible to explicitly disable opaque
pointers via cc1 -no-opaque-pointers option, but it is known a later commit
broke backward compatibility provided by -no-opaque-pointers as verified with
arbitrary commit d7d586e5, so there is no way to avoid opaque pointers starting
with LLVM 16.

Those two commits follow llvmorg-16-init and precede llvmorg-16.0.0-rc1 tags.

Since Mesa commit 977dbfc9 opaque pointers are properly implemented in Clover
and used.

If we don't pass -opaque-pointers to Clang on LLVM versions supporting opaque
pointers but disabling them by default, there will be an API mismatch between
Mesa and LLVM and Clover will not work.

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19290>
2022-10-25 07:18:16 +02:00
Thomas Debesse
1a06dbcaed Revert "gallium/clover: pass -no-opaque-pointers to Clang", opaque pointers are now implemented
This reverts commit 06e9607478 from !16129.

Clover passed the -no-opaque-pointers option to Clang to work around the fact
that the Clover code was not ported to opaque pointers yet.

Opaque pointers are now implemented thanks to !19103 so passing this
option to tell Clang to not do opaque pointers while Clover does
is actually breaking Clover.

Here is an example of what happens when using opaque pointers while
passing -no-opaque-pointers at the same time:

  fatal error: cannot open file 'hawaii-amdgcn-mesa-mesa3d.bc':
   Opaque pointers are only supported in -opaque-pointers mode

This fixes one of the last remaining bits needed to fully support opaque pointers
in Mesa as referenced in #7468, and the last remaining bit needed to fully support
opaque pointers in Clover.

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19290>
2022-10-25 05:20:29 +02:00
Alyssa Rosenzweig
2a6338722e panfrost: Don't use nir_variable in the compilers
More future proof, simpler, and works with early I/O lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19456>
2022-11-02 04:22:06 +00:00
Alyssa Rosenzweig
6a87719d35 pan/bi: Don't lower outputs for compute
Useless.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19456>
2022-11-02 04:22:06 +00:00
Kenneth Graunke
fde99747e9 nir: Drop infer_non_readable option for nir_opt_access()
Everybody sets it to true now, and the only reason for the option to
exist was to work around a bug that's now been fixed.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19162>
2022-11-02 03:42:04 +00:00
Kenneth Graunke
1462a61b5d st/mesa: Let nir_opt_access() infer non-readable
In issue #3278, Danylo noted that nir_opt_access() could desynchronize
the prog->sh.ImageAccess[] and prog->sh.BindlessImage[].access fields,
which are filled out as part of uniform linking, prior to running this
optimization pass.  Those fields are used to fill out pipe_image_view's
shader_access field, which is used by a lot of drivers these days.

There's an easy solution to this issue however: we can simply call the
pass prior to linking, a few lines earlier.

This lets us infer that images are non-readable, which may let drivers
do additional optimizations.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3278
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19162>
2022-11-02 03:42:04 +00:00
Alyssa Rosenzweig
45a111c21c nir/opt_algebraic: Fuse c - a * b to FMA
Algebraically it is clear that

   -(a * b) + c = (-a) * b + c = fma(-a, b, c)

But this is not clear from the NIR

   ('fadd', ('fneg', ('fmul', a, b)), c)

Add rules to handle this case specially. Note we don't necessarily want
to solve this by pushing fneg into fmul, because the rule opt_algebraic
(not the late part where FMA fusing happens) specifically pulls fneg out
of fmul to push fneg up multiplication chains.
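
The rewrite can be illustrated with a toy tuple-based rewriter. This is a sketch for explanation only and is not the nir_opt_algebraic machinery: the function names and the small evaluator are hypothetical, and the evaluator computes ffma unfused purely to check that the two forms agree.

```python
def fuse_neg_mul_add(expr):
    """Rewrite fadd(fneg(fmul(a, b)), c) into ffma(fneg(a), b, c), bottom-up.

    Expressions are nested tuples ('op', operand, ...) with string leaves.
    """
    if not isinstance(expr, tuple):
        return expr  # leaf: a variable name
    op, *args = expr
    args = [fuse_neg_mul_add(arg) for arg in args]
    if op == 'fadd' and len(args) == 2:
        # fadd is commutative, so try both operand orders.
        for neg, addend in (args, args[::-1]):
            if (isinstance(neg, tuple) and neg[0] == 'fneg'
                    and isinstance(neg[1], tuple) and neg[1][0] == 'fmul'):
                _, a, b = neg[1]
                return ('ffma', ('fneg', a), b, addend)
    return (op, *args)


def evaluate(expr, env):
    """Evaluate an expression tree with ordinary Python floats."""
    if not isinstance(expr, tuple):
        return env[expr]
    op, *args = expr
    vals = [evaluate(arg, env) for arg in args]
    if op == 'fadd':
        return vals[0] + vals[1]
    if op == 'fmul':
        return vals[0] * vals[1]
    if op == 'fneg':
        return -vals[0]
    if op == 'ffma':
        return vals[0] * vals[1] + vals[2]  # unfused; for checking only
    raise ValueError(f"unknown op: {op}")


before = ('fadd', ('fneg', ('fmul', 'a', 'b')), 'c')
after = fuse_neg_mul_add(before)
print(after)
```

Keeping the fneg on an ffma source rather than on the fmul result matters on hardware where source negation is a free modifier, an assumption that holds for many GPUs but is not stated in this commit.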

Noticed in the big glmark2 "terrain" shader, which has a cycle count
reduced by 22% on Mali-G57 thanks to having this pattern a ton and being
FMA bound.

BEFORE: 1249 inst, 16.015625 cycles, 16.015625 fma, ... 632 quadwords
AFTER: 997 inst, 12.437500 cycles, .... 504 quadwords

Results on the same shader on AGX are also quite dramatic:

BEFORE: 1294 inst, 8600 bytes, 50 halfregs, ...
AFTER: 1154 inst, 8040 bytes, 50 halfregs, ...

Similar rules apply for fabs.

v2: Use a loop over the bit sizes (suggested by Emma).

shader-db on Valhall (open + small subset of closed), results on Bifrost
are similar:

total instructions in shared programs: 167975 -> 164970 (-1.79%)
instructions in affected programs: 92642 -> 89637 (-3.24%)
helped: 492
HURT: 25
helped stats (abs) min: 1.0 max: 252.0 x̄: 6.25 x̃: 3
helped stats (rel) min: 0.30% max: 20.18% x̄: 3.21% x̃: 2.91%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 2.80 x̃: 3
HURT stats (rel)   min: 0.46% max: 9.09% x̄: 3.89% x̃: 3.37%
95% mean confidence interval for instructions value: -6.95 -4.68
95% mean confidence interval for instructions %-change: -3.08% -2.65%
Instructions are helped.

total cycles in shared programs: 10556.89 -> 10538.98 (-0.17%)
cycles in affected programs: 265.56 -> 247.66 (-6.74%)
helped: 88
HURT: 2
helped stats (abs) min: 0.015625 max: 3.578125 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.65% max: 22.34% x̄: 5.65% x̃: 4.25%
HURT stats (abs)   min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 8.33% max: 12.50% x̄: 10.42% x̃: 10.42%
95% mean confidence interval for cycles value: -0.28 -0.12
95% mean confidence interval for cycles %-change: -6.30% -4.30%
Cycles are helped.

total fma in shared programs: 1582.42 -> 1535.06 (-2.99%)
fma in affected programs: 871.58 -> 824.22 (-5.43%)
helped: 502
HURT: 9
helped stats (abs) min: 0.015625 max: 3.578125 x̄: 0.09 x̃: 0
helped stats (rel) min: 0.60% max: 25.00% x̄: 5.46% x̃: 4.82%
HURT stats (abs)   min: 0.015625 max: 0.0625 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 4.35% max: 12.50% x̄: 6.22% x̃: 4.35%
95% mean confidence interval for fma value: -0.11 -0.08
95% mean confidence interval for fma %-change: -5.58% -4.93%
Fma are helped.

total cvt in shared programs: 665.55 -> 665.95 (0.06%)
cvt in affected programs: 61.72 -> 62.12 (0.66%)
helped: 33
HURT: 43
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.04 x̃: 0
helped stats (rel) min: 1.01% max: 25.00% x̄: 6.68% x̃: 4.35%
HURT stats (abs)   min: 0.015625 max: 0.109375 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.78% max: 38.46% x̄: 10.85% x̃: 6.90%
95% mean confidence interval for cvt value: -0.01 0.02
95% mean confidence interval for cvt %-change: 0.23% 6.24%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 93376 -> 91736 (-1.76%)
quadwords in affected programs: 25376 -> 23736 (-6.46%)
helped: 169
HURT: 1
helped stats (abs) min: 8.0 max: 128.0 x̄: 9.75 x̃: 8
helped stats (rel) min: 1.52% max: 33.33% x̄: 8.35% x̃: 8.00%
HURT stats (abs)   min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
95% mean confidence interval for quadwords value: -11.18 -8.11
95% mean confidence interval for quadwords %-change: -8.95% -7.36%
Quadwords are helped.

total threads in shared programs: 4697 -> 4701 (0.09%)
threads in affected programs: 4 -> 8 (100.00%)
helped: 4
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.00 1.00
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19312>
2022-11-01 22:39:45 -04:00