Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable fp going out of scope leaks the storage it points to.
Fixes: 4fd7495c69 ("intel/clc: add ability to output NIR")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27778>
Not only do we want to know where an Op originated from, but also how it got
transformed along the way if possible. Preferably all the way to the final
machine code emitted.
This commit inserts OpAnnotates in some of the optimization passes when
map_instr() or Instr::new_boxed is used.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27158>
For submissions with an empty last batch containing no cmd buffers but
with semaphores as zink does, adding feedback to that batch would make
it no longer empty and increase submission overhead on some drivers.
Since feedback order is enforced by barriers, the feedback cmds can
instead be appended to the previous batch (if it exists) so that the
last batch remains empty.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27830>
This pulls everything into nvk_cmd_draw.c where it's a bit easier to
manage. When the time comes for switching to EXT_shader_object, this
will let us handle VK_EXT_dynamic_rendering_unused_attachments via the
common vk_pipeline code.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
These implementations are built on top of vk_shader. For the most part,
the driver shouldn't notice a difference between draws consuming
pipelines vs. draws consuming shaders. The only real difference is
that, when vk_driver_shader_ops::compile() is called for pipelines, a
struct vk_graphics_pipeline_state is provided. For shader objects, the
state object will be NULL indicating that all state is unknown. Besides
that, all the rest of the differences between Vulkan 1.0 pipelines,
VK_EXT_graphics_pipeline_library, and VK_EXT_shader_object are handled
by the Vulkan runtime code.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
We need to be able to thunk through a destroy callback if we want to
have different kinds of pipelines implemented in different parts of the
stack.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
This adds a new base vk_shader object along with vtables for creating,
binding, and working with shader objects.
Unlike other parts of the runtime, the new shader object code is a bit
more sanitized and opinionated than just handing you the Vulkan
entrypoints. For one thing, the create_shaders() calback takes a NIR
shader, not SPIR-V. Conversion of SPIR-V into NIR, handling of magic
meta NIR shaders, etc. is all done in common code. [De]serialization is
done via `struct blob` and the common code does a checksum of the binary
and handles rejecting invalid binaries based on shaderBinaryUUID and
shaderBinaryVersion. This should make life a bit easier for driver
authors as well as provides a bit nicer interface for building the
common pipeline implementation on top of shader objects.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
Because these functions were introduced by VK_EXT_shader_objects, we
technically have to expose them even though they have to do with NV
extensions that no one else supports.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
We're supposed to just ignore samples above what we support, and there's
no VU matching this assert. Fixes a crash in
dEQP-VK.pipeline.shader_object_unlinked_spirv.extended_dynamic_state.misc.sample_shading_dynamic_sample_count.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024>
nir->info.outputs_written isn't used for fragment shaders except as an
early out a few lines above this, so we don't rely on this property.
My best guess is that this was intended to check if the information
from nir_gather_info is stale, but dead variables fail the assert
even if the info is up to date.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27595>
Reference frame list is formed by each of the provided
recon_frame, while the assumption here is to use the API
provided by VAAPI interface, when a frame is marked as
"long term reference" by
av1->picture_flags.bits.long_term_reference
Its recon_frame will be kept in DPB marked by its
recon_frame as signature. When a future input requests
refering to it, it can go this way:
1. set av1->ref_frame_ctrl_l0.field.search_idx2 to indicate
which ref_frame_idx slot will be used.
x = av1->ref_frame_ctrl_l0.field.search_idx2;
2. n = av1->ref_frame_idx[x-1];
av1->reference_frames[n] as the signature to compare with.
if av1->reference_frames[n] is pointing to the
same video buffer (signature) as the one marked as
"long term reference". Then the new input is refering to
it only.
3. in SVC case, long terms are used for temproal_id 0 only,
because using long term means potentially scene change
could happen.
4. the "long term reference" recon_frame should be kept,
instead of being reused until it is no longer needed to
avoid signature duplication.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
Add vcn4 av1 long term reference support.
So that frames can be controlled from application
side to refer to the identified reference, which
usually could provide better coding efficiency in
the case of scene chagne back and forth, just it
needs to identify and mark these frames before
using them.
We assume 2 long term reference frames should be
good in a key frame period, and these long term
references can be overwritten by marking new ones.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
Here it borrowed the term "long term reference" to represent
the customized reference frame rather than the default ones used.
To enable that, it needs application to leverage VAAPI existing
interface to mark a frame as "long term reference", and then
it will be preserved in the DPB for later usage. This preserved
frame later could be refered to by having its signature used in
the ref_frame_idx[] list, and the index can be indicated by
RefFrameCtrl index2, which has not been used for other purpose.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27771>
No shader-db changes on any Intel platform.
fossil-db:
All Ice Lake and newer platforms had similar results. (Ice Lake)
Totals:
Instrs: 165513303 -> 165511820 (-0.00%)
Cycles: 15125314947 -> 15125211500 (-0.00%); split: -0.00%, +0.00%
Totals from 82 (0.01% of 656120) affected shaders:
Instrs: 544627 -> 543144 (-0.27%)
Cycles: 22616493 -> 22513046 (-0.46%); split: -0.46%, +0.00%
No fossil-db changes on Gfx9.
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This adds optimizations for iadd, fadd, and ixor with reduce,
inclusive scan, and exclusive scan.
NOTE: The fadd and ixor optimizations had no shader-db or fossil-db
changes on any Intel platform.
NOTE 2: This change "fixes" arb_compute_variable_group_size-local-size
and base-local-size.shader_test on DG2 and MTL. This is just changing
the code path taken to not use whatever path was not working properly
before.
This is a subset of the things optimized by ACO. See also
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. The
min, max, iand, and ior exclusive_scan optimizations are not
implemented.
Broadwell on shader-db is not happy. I have not investigated.
v2: Silence some warnings about discarding const.
v3: Rename mbcnt to count_active_invocations. Add a big comment
explaining the differences between the two paths. Suggested by Rhys.
shader-db:
All Gfx9 and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20300384 -> 20299545 (<.01%)
instructions in affected programs: 19167 -> 18328 (-4.38%)
helped: 35 / HURT: 0
total cycles in shared programs: 842809750 -> 842766381 (<.01%)
cycles in affected programs: 2160249 -> 2116880 (-2.01%)
helped: 33 / HURT: 2
total spills in shared programs: 4632 -> 4626 (-0.13%)
spills in affected programs: 206 -> 200 (-2.91%)
helped: 3 / HURT: 0
total fills in shared programs: 5594 -> 5581 (-0.23%)
fills in affected programs: 664 -> 651 (-1.96%)
helped: 3 / HURT: 1
fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165551893 -> 165513303 (-0.02%)
Cycles: 15132539132 -> 15125314947 (-0.05%); split: -0.05%, +0.00%
Spill count: 45258 -> 45204 (-0.12%)
Fill count: 74286 -> 74157 (-0.17%)
Scratch Memory Size: 2467840 -> 2451456 (-0.66%)
Totals from 712 (0.11% of 656120) affected shaders:
Instrs: 598931 -> 560341 (-6.44%)
Cycles: 184650167 -> 177425982 (-3.91%); split: -3.95%, +0.04%
Spill count: 983 -> 929 (-5.49%)
Fill count: 2274 -> 2145 (-5.67%)
Scratch Memory Size: 52224 -> 35840 (-31.37%)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This doesn't help very much now. A later commit adds a NIR optimization
pass, tentatively called nir_opt_uniform_subgroup, that converts many
kinds of subgroup operations to things involving
bitCount(ballot(true)). This commit makes a huge difference in the
results of that later commit.
No shader-db changes on any Intel platform.
Fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165558033 -> 165557519 (-0.00%)
Cycles: 15156188362 -> 15156178922 (-0.00%); split: -0.00%, +0.00%
Totals from 299 (0.05% of 656117) affected shaders:
Instrs: 88293 -> 87779 (-0.58%)
Cycles: 3709498 -> 3700058 (-0.25%); split: -0.28%, +0.03%
v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>