We're setting the same field a dozen lines below to the exact same
value.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13169>
In preparation for adding support for the graphics mesh pipeline,
identify all the paths that are specific the primitive pipeline.
This shouldn't change any behavior since the code currently only
supports the primitive pipeline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
So that we can use the active_stages in copy_non_dynamic_state() later.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
This is already handled by vk_device_init(); drivers no longer
need to do it themselves.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12867>
There are exceptions in spec where the framebuffer image format and
format given for render pass attachment may differ. This happens in
particular when subpass has resolve attachment that might resolve
only depth from a combined depth+stencil format. There the formats do
not need to match but be 'compatible' with each other.
As example using VK_FORMAT_D32_SFLOAT format is considered compatible
when actual framebuffer format is VK_FORMAT_D32_SFLOAT_S8_UINT.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13082>
emit_slice_hashing_state gets called once per queue, meaning
device->slice_hash can get allocated multiple times. This can be
reproduced by setting the env-var ANV_QUEUE_OVERRIDE=gc=2.
Reworks:
* Only pack the struct once (s-b Lionel, Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13114>
Reworks:
* Set BLORP_BATCH_USE_COMPUTE in anv_blorp_batch_init (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Reworks:
* Let blorp_clear handle DEBUG_BLOCS
* Old subject was: "Use compute blorp for vkCmdFillBuffer with
INTEL_DEBUG=blocs"
* Old subject was: "anv/blorp: Support params.cs_prog_data being set"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
This knowledge was repeated in multiple places so move the values to
intel_device_info struct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13014>
Since only Anv uses the value, I'm only enabling this on anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 518693c3ec ("spirv: Handle the SubgroupSize execution mode")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13034>
The populate_base_prog_key will set VARYING depending if the pipeline
flag is used. Later, when full subgroups flag is set, it will flip to
UNIFORM -- which for compute shaders is effectively the same, so don't
bother setting it again.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12946>
The lowering passes will soon be moved to another function, so there
won't be any choice.
As a side benefit, this allows eliminating the uses_atomic_load_store
**pointer** parameter from brw_nir_lower_storage_image. For some reason
crocus was passing false instead of NULL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
This consolidates several duplicated pieces of code into devinfo.
max_scratch_ids is an array that provides the max number of threads
for the rendering and compute stages.
This fixes some exceptions missed by crocus for scratch ids on haswell
and cherryview.
It also fills out devinfo->max_scratch_ids properly for stages VS
through CS on Gfx12.5. But, functionally this should not make a
difference as Gfx12.5 already uses COMPUTE for all stages.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12799>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
With LSC support, we can do 64-bit atomics with A32/64 messages.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12566>
For now, only mark the 4x8BitPacked variants as accelerated.
Applications are unlikely to use the "add with saturate" opcodes from
VK_INTEL_shader_integer_functions2, so, technically, all of the
AccumulatingSaturating variants "[provide] a performance advantage over
user-provided code composed from elementary instructions..." on all
Intel platforms. If we encounter an application that cares, we can do
things differently then. Ditto for the non-packed 8Bit, 4-element
vector variants.
v2: Don't memset props as this also zeros sType and pNext. Noticed by
Georg Lehmann in !12617.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>