This adds a new base vk_shader object along with vtables for creating,
binding, and working with shader objects.
Unlike other parts of the runtime, the new shader object code is a bit
more sanitized and opinionated than just handing you the Vulkan
entrypoints. For one thing, the create_shaders() calback takes a NIR
shader, not SPIR-V. Conversion of SPIR-V into NIR, handling of magic
meta NIR shaders, etc. is all done in common code. [De]serialization is
done via `struct blob` and the common code does a checksum of the binary
and handles rejecting invalid binaries based on shaderBinaryUUID and
shaderBinaryVersion. This should make life a bit easier for driver
authors as well as provides a bit nicer interface for building the
common pipeline implementation on top of shader objects.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
Because these functions were introduced by VK_EXT_shader_objects, we
technically have to expose them even though they have to do with NV
extensions that no one else supports.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
We're supposed to just ignore samples above what we support, and there's
no VU matching this assert. Fixes a crash in
dEQP-VK.pipeline.shader_object_unlinked_spirv.extended_dynamic_state.misc.sample_shading_dynamic_sample_count.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
nir->info.outputs_written isn't used for fragment shaders except as an
early out a few lines above this, so we don't rely on this property.
My best guess is that this was intended to check if the information
from nir_gather_info is stale, but dead variables fail the assert
even if the info is up to date.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27595>
Reference frame list is formed by each of the provided
recon_frame, while the assumption here is to use the API
provided by VAAPI interface, when a frame is marked as
"long term reference" by
av1->picture_flags.bits.long_term_reference
Its recon_frame will be kept in DPB marked by its
recon_frame as signature. When a future input requests
refering to it, it can go this way:
1. set av1->ref_frame_ctrl_l0.field.search_idx2 to indicate
which ref_frame_idx slot will be used.
x = av1->ref_frame_ctrl_l0.field.search_idx2;
2. n = av1->ref_frame_idx[x-1];
av1->reference_frames[n] as the signature to compare with.
if av1->reference_frames[n] is pointing to the
same video buffer (signature) as the one marked as
"long term reference". Then the new input is refering to
it only.
3. in SVC case, long terms are used for temproal_id 0 only,
because using long term means potentially scene change
could happen.
4. the "long term reference" recon_frame should be kept,
instead of being reused until it is no longer needed to
avoid signature duplication.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
Add vcn4 av1 long term reference support.
So that frames can be controlled from application
side to refer to the identified reference, which
usually could provide better coding efficiency in
the case of scene chagne back and forth, just it
needs to identify and mark these frames before
using them.
We assume 2 long term reference frames should be
good in a key frame period, and these long term
references can be overwritten by marking new ones.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
Here it borrowed the term "long term reference" to represent
the customized reference frame rather than the default ones used.
To enable that, it needs application to leverage VAAPI existing
interface to mark a frame as "long term reference", and then
it will be preserved in the DPB for later usage. This preserved
frame later could be refered to by having its signature used in
the ref_frame_idx[] list, and the index can be indicated by
RefFrameCtrl index2, which has not been used for other purpose.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
No shader-db changes on any Intel platform.
fossil-db:
All Ice Lake and newer platforms had similar results. (Ice Lake)
Totals:
Instrs: 165513303 -> 165511820 (-0.00%)
Cycles: 15125314947 -> 15125211500 (-0.00%); split: -0.00%, +0.00%
Totals from 82 (0.01% of 656120) affected shaders:
Instrs: 544627 -> 543144 (-0.27%)
Cycles: 22616493 -> 22513046 (-0.46%); split: -0.46%, +0.00%
No fossil-db changes on Gfx9.
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This adds optimizations for iadd, fadd, and ixor with reduce,
inclusive scan, and exclusive scan.
NOTE: The fadd and ixor optimizations had no shader-db or fossil-db
changes on any Intel platform.
NOTE 2: This change "fixes" arb_compute_variable_group_size-local-size
and base-local-size.shader_test on DG2 and MTL. This is just changing
the code path taken to not use whatever path was not working properly
before.
This is a subset of the things optimized by ACO. See also
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. The
min, max, iand, and ior exclusive_scan optimizations are not
implemented.
Broadwell on shader-db is not happy. I have not investigated.
v2: Silence some warnings about discarding const.
v3: Rename mbcnt to count_active_invocations. Add a big comment
explaining the differences between the two paths. Suggested by Rhys.
shader-db:
All Gfx9 and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20300384 -> 20299545 (<.01%)
instructions in affected programs: 19167 -> 18328 (-4.38%)
helped: 35 / HURT: 0
total cycles in shared programs: 842809750 -> 842766381 (<.01%)
cycles in affected programs: 2160249 -> 2116880 (-2.01%)
helped: 33 / HURT: 2
total spills in shared programs: 4632 -> 4626 (-0.13%)
spills in affected programs: 206 -> 200 (-2.91%)
helped: 3 / HURT: 0
total fills in shared programs: 5594 -> 5581 (-0.23%)
fills in affected programs: 664 -> 651 (-1.96%)
helped: 3 / HURT: 1
fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165551893 -> 165513303 (-0.02%)
Cycles: 15132539132 -> 15125314947 (-0.05%); split: -0.05%, +0.00%
Spill count: 45258 -> 45204 (-0.12%)
Fill count: 74286 -> 74157 (-0.17%)
Scratch Memory Size: 2467840 -> 2451456 (-0.66%)
Totals from 712 (0.11% of 656120) affected shaders:
Instrs: 598931 -> 560341 (-6.44%)
Cycles: 184650167 -> 177425982 (-3.91%); split: -3.95%, +0.04%
Spill count: 983 -> 929 (-5.49%)
Fill count: 2274 -> 2145 (-5.67%)
Scratch Memory Size: 52224 -> 35840 (-31.37%)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This doesn't help very much now. A later commit adds a NIR optimization
pass, tentatively called nir_opt_uniform_subgroup, that converts many
kinds of subgroup operations to things involving
bitCount(ballot(true)). This commit makes a huge difference in the
results of that later commit.
No shader-db changes on any Intel platform.
Fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165558033 -> 165557519 (-0.00%)
Cycles: 15156188362 -> 15156178922 (-0.00%); split: -0.00%, +0.00%
Totals from 299 (0.05% of 656117) affected shaders:
Instrs: 88293 -> 87779 (-0.58%)
Cycles: 3709498 -> 3700058 (-0.25%); split: -0.28%, +0.03%
v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
Otherwise the compiler generates an extra MOV to load the constant into
a register first because reasons. 🤷 vote_any, vote_all, vote_ieq,
and vote_feq handling already do this.
No shader-db changes on any Intel plaform.
Fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165592451 -> 165557937 (-0.02%)
Cycles: 15133282615 -> 15133059360 (-0.00%); split: -0.00%, +0.00%
Totals from 33779 (5.15% of 656115) affected shaders:
Instrs: 4396576 -> 4362062 (-0.79%)
Cycles: 86867412 -> 86644157 (-0.26%); split: -0.37%, +0.11%
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
The problem seems to have been related to
nir_intrinsic_load_global_block_intel being marked as non-divergent.
No shader-db or fossil-db changes on any Intel platform.
v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This is divergent because it specifically loads sequential values into
successive SIMD lanes.
No shader-db or fossil-db changes on any Intel platform.
Fixes: 9f44a26462 ("nir/divergence: handle load_global_block_intel")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This is almost a 1:1 copy of the same function in libwayland. If the version
with the symbol propagates far enough the fallback can be removed again.
Signed-off-by: Sebastian Wick <sebastian.wick@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27511>
The fast-clear value is now the same for SNORM and UNORM, so our trick
of reinterpreting SNORM as UNORM when copying now works with UBWC. We
can also freely reinterpret UNORM, SNORM, and UINT formats, as tested by
dEQP-VK.image.mutable.*.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27506>