So that we can use the active_stages in copy_non_dynamic_state() later.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
This is already handled by vk_device_init(); drivers no longer
need to do it themselves.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12867>
There are exceptions in spec where the framebuffer image format and
format given for render pass attachment may differ. This happens in
particular when subpass has resolve attachment that might resolve
only depth from a combined depth+stencil format. There the formats do
not need to match but be 'compatible' with each other.
As example using VK_FORMAT_D32_SFLOAT format is considered compatible
when actual framebuffer format is VK_FORMAT_D32_SFLOAT_S8_UINT.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13082>
emit_slice_hashing_state gets called once per queue, meaning
device->slice_hash can get allocated multiple times. This can be
reproduced by setting the env-var ANV_QUEUE_OVERRIDE=gc=2.
Reworks:
* Only pack the struct once (s-b Lionel, Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13114>
Reworks:
* Set BLORP_BATCH_USE_COMPUTE in anv_blorp_batch_init (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Reworks:
* Let blorp_clear handle DEBUG_BLOCS
* Old subject was: "Use compute blorp for vkCmdFillBuffer with
INTEL_DEBUG=blocs"
* Old subject was: "anv/blorp: Support params.cs_prog_data being set"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Reworks:
* Use BLORP_BATCH_USE_COMPUTE flag rather than compute param to
blorp_copy (s-b Jason)
* Squash "intel/blorp: Set shader_pipeline for compute"
* Squash "intel/blorp: Add blorp_copy_supports_compute function"
* Squash "intel: Support compute for image/buffer copy if INTEL_DEBUG=blocs
is set"
* Squash "intel/blorp: Support compute for some blit operations"
* Use nir_image_store (s-b Jason)
* Use nir_push_if (s-b Jason)
* Require gfx12 for ccs in blorp_copy_supports_compute (s-b Jason)
* Add nir_pop_if (s-b Ken)
* Fix aux_usage check on gfx12 blorp_copy_supports_compute (s-b Ken)
* Use blorp_set_cs_dims (s-b Jason)
* Use dim=2d with array=true for nir_image_store (s-b Jason, Francisco)
* Restructure gen checks in blorp_copy_supports_compute (s-b Ken)
* Use nir_load_global_invocation_id (s-b Jason)
* Fix inefficient calculation of store_pos (s-b Jason)
* Use bounds_if being NULL/non-NULL for nir_pop_if (s-b Jason)
* discard => bounds (s-b Ken)
* Re-add ISL_AUX_USAGE_CCS_E in *_supports_compute (s-b Sagar)
* Skip duplicated in_bounds calculation (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Based the blorp_params, blorp_get_cs_local_y returns a recommended
local_y size for a compute shader.
blorp_set_cs_dims sets the compute program dims based on a given
local_y size.
Reworks:
* Add blorp_set_cs_dims (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Reworks:
* Don't pack params, just memcpy param struct (s-b Jason)
* Old subject: "intel/blorp: Emit compute program if
params.cs_prog_data is set"
* Various cleanups of push-const size/alignment (s-b Jason)
* Fix subslice count by moving to devinfo (s-b Ken)
* Simplify cw.InterfaceDescriptor code (s-b Ken)
* Drop some comments from i965 (s-b Ken)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Render will use blorp_setup_binding_table and blorp_emit_btp, but
compute will only use blorp_setup_binding_table.
Rework:
* Use blorp_setup_binding_table, blorp_emit_btp (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
The compute path will use blorp_emit_sampler_state, whereas the render
path will use blorp_emit_sampler_state_ps.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Make INTEL_DEBUG=blorp dump the blorp compute shaders instead using
the general INTEL_DEBUG=cs which is now reserved for actual compute
programs.
Ref: 05933fb0f7 ("intel/compiler: Use INTEL_DEBUG=blorp to dump blorp shaders")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
When they re-arranged all the dataport stuff and added the LSC, doing
URB fencing through the dataport no longer makes sense. Instead, there
is now a fence message on the URB shared function.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13092>
Found this nugget while looking at the ACO driver. It seems sensible to
avoid SLM fences if there is no SLM. This also makes the check depend
on SLM usage rather than just shader stage which will be useful if we
ever implement task/mesh because task shaders also have SLM.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13092>
Start off making everything look like LSC where we have three types of
fences: TGM, UGM, and SLM. Then, emit the actual code in a generation-
aware way. There are three HW generation cases we care about:
XeHP+ (LSC), ICL-TGL, and IVB-SKL. Even though it looks like there's a
lot to deduplicate, it only increases the number of ubld.emit() calls
from 5 to 7 and entirely gets rid of the SFID juggling and other
weirdness we've introduced along the way to make those cases "general".
While we're here, also clean up the code for stalling after fences and
clearly document every case where we insert a stall.
There are only three known functional changes from this commit:
1. We now avoid the render cache fence on IVB if we don't need image
barriers.
2. On ICL+, we no longer unconditionally stall on barriers. We still
stall if we have more than one to help tie them together but
independent barriers are independent. Barrier instructions will
still operate in write-commit mode and still be scheduling barriers
but won't necessarily stall.
3. We now assert-fail for URB fences on LSC platforms. We'll be adding
in the new URB fence message for those platforms in a follow-on
commit.
It is a big enough refactor, however, that other minor changes may be
present.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13092>
This knowledge was repeated in multiple places so move the values to
intel_device_info struct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13014>
Since only Anv uses the value, I'm only enabling this on anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 518693c3ec ("spirv: Handle the SubgroupSize execution mode")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13034>