Commit graph

54 commits

Author SHA1 Message Date
Lionel Landwerlin
487586fefa anv: implement inline parameter promotion from push constants
Push constants on bindless stages of Gfx12.5+ don't get the data
delivered in the registers automatically. Instead the shader needs to
load the data with SEND messages.

Those stages do get a single InlineParameter 32B block of data
delivered into the EU. We can use that to promote some of the push
constant data that has to be pulled otherwise.

The driver will try to promote all push constant data (app + driver
values) if it can, if it can't it'll try to promote only the driver
values (usually a shader will only use a few driver values). If even
the drivers values won't fit, give up and don't use the inline
parameter at all.

LNL internal fossil-db:

Totals from 315738 (20.08% of 1572649) affected shaders:
Instrs: 155053691 -> 154920901 (-0.09%); split: -0.09%, +0.00%
CodeSize: 2578204272 -> 2574991568 (-0.12%); split: -0.15%, +0.02%
Send messages: 8235628 -> 8184485 (-0.62%); split: -0.62%, +0.00%
Cycle count: 43911938816 -> 43901857748 (-0.02%); split: -0.05%, +0.03%
Spill count: 481329 -> 473185 (-1.69%); split: -1.82%, +0.13%
Fill count: 405617 -> 399243 (-1.57%); split: -1.86%, +0.28%
Max live registers: 34309395 -> 34309300 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 8298224 -> 8299168 (+0.01%)
Non SSA regs after NIR: 18492887 -> 17631285 (-4.66%); split: -4.73%, +0.08%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
2026-02-25 10:44:09 +00:00
Lionel Landwerlin
789bb544f5 anv: add a shrinking push constant loading pass
Shaders will often contains things like this :

con 32    %469 = @load_push_constant (%468 (0x30)) (base=0, range=128, align_mul=256, align_offset=48)

We don't need 128 bytes of push constants to do that load.

This will become important when we rely more on base/range in the next
commit to promote things to inline parameters (only 32B of space
available).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
2026-02-25 10:44:08 +00:00
Lionel Landwerlin
e94cb92cb0 anv: use internal surface state on Gfx12.5+ to access descriptor buffers
As a result on Gfx12.5+ we're not holding any binding table entry to
access descriptor buffers.

This should reduce the amount of binding table allocations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10711
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>
2026-02-12 16:45:26 +00:00
Lionel Landwerlin
87abf57764 anv: drop unused argument for compute_push_layout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>
2026-02-12 16:45:26 +00:00
Lionel Landwerlin
faa857a061 intel: rework push constant handling
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nr_params & params array are gone.

brw_ubo_range is not stored on the prog_data structure anymore (Anv
already stored a copy of that with its own additional information)

The backend now only deals with load_push_data_intel. load_uniform &
load_push_constant have to be lowered by the driver.

Pre Gfx12.5 platforms have to provide a subgroup_id_param to specify
where the subgroup_id value is located in the push constants.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
2026-01-09 14:19:52 +00:00
Lionel Landwerlin
049adad4f4 anv: split non binding related intrinsics from apply_layout
Trying to cut down apply_pipeline_layout a bit and also allowing some
reuse for a new extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38495>
2025-11-19 10:27:27 +00:00
Lionel Landwerlin
1de9f367e8 anv: remove unused gfx/compute pipeline code
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872>
2025-09-05 07:46:20 +00:00
Lionel Landwerlin
50fd669294 anv: prep work for separate tessellation shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872>
2025-09-05 07:46:17 +00:00
Lionel Landwerlin
a91e0e0d61 brw: add support for separate tessellation shader compilation
Tessellation factors have to be written dynamically (based on the next
shader primitive topology) and the builtins read using a dynamic
offset (based on the preceeding shader's VUE).

Anv is updated to use this new infrastructure for dynamic
patch_control_points.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872>
2025-09-05 07:46:17 +00:00
Sagar Ghuge
cac3b4f404 anv: Mask off excessive invocations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
For unaligned invocations, don't launch two COMPUTE_WALKER, instead we
can mask off excessive invocations in the shader itself at nir level and
launch one additional workgroup.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36245>
2025-08-12 23:17:02 +00:00
Lionel Landwerlin
18f234a8a2 anv: avoid looking at the pipeline to flush push descriptors
We do this at the cost of recomputing some values that where available
on the pipeline at vkCmdBindPipeline() time.

We can look at the shaders on graphics/compute which will work nicely
with the runtime.

The runtime doesn't have support for ray tracing pipelines so we keep
using them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:07 +00:00
Lionel Landwerlin
8d5cb999f9 anv: store layout_type on the bind_map for convenience
Pipeline layout is going away.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:06 +00:00
Lionel Landwerlin
fe6e9284c9 anv: stop using anv_pipeline_sets_layout
The vulkan runtime code doesn't allow to use the pipeline layout and
instead just provides an array of set layouts.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:03 +00:00
Lionel Landwerlin
f156af9ec6 anv: expose helper function outside of anv_pipeline.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:01 +00:00
jhananit
a74ac59220 anv: Remove NIR_PASS_V usage
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

anv: Fix for metadata failure

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35889>
2025-07-14 19:25:52 +00:00
Lionel Landwerlin
8e7e0ef75a anv: make Wa_18019110168 deal with dynamic provoking vertex
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:32 +00:00
Lionel Landwerlin
df15968813 anv/brw: stop turning load_push_constants into load_uniform
Those intrinsics have different semantics in particular with regards
to divergence. Turning one into the other without invalidating the
divergence information breaks NIR validation. But also the conversion
means we get artificially less convergent values in the shaders.

So just handle load_push_constants in the backend and stop changing
things in Anv.

Fixes a bunch of tests in
   dEQP-VK.descriptor_indexing.*
   dEQP-VK.pipeline.*.push_constant.graphics_pipeline.dynamic_index_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>
2025-05-22 07:49:20 +00:00
Lionel Landwerlin
5c7c1eceb5 anv/brw: handle pipeline libraries with mesh
I always thought there was a massive issue with pipeline libraries &
mesh shaders. Indeed recent CTS tests have exposed a number of issues.

Some values delivered to the fragment shader are coming from different
places depending on whether the preceding shader is Mesh or not. For
example PrimitiveID is delivered in the per-primitive block in Mesh
pipelines whereas for other pipelines it's coming as a VUE slot (which
is per-vertex). Those are 2 different locations in the payload.

We have to find a layout for fragment shaders that is compatible with
everything. Leaving gaps here and there in the thread payload.

Fixes the following test pattern :

  dEQP-VK.mesh_shader.ext.smoke.fast_lib.shared_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
4717382f84 anv: lower input vertices for TCS unconditionally
Take the opportunity to reuse the backend pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
45117c0ed5 anv: simplify loading driver internal constants
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>
2024-08-22 19:44:39 +00:00
Francisco Jerez
01118a3fbb anv/xe2+: Align push constant ranges to GRF boundaries.
This fixes corruption of push constants on Xe2 due to a mismatch in
the uniform layout implemented by the compiler and assumed by the
driver.  To fix it we need to align the push constant ranges computed
by the Vulkan driver to a multiple of the GRF size of the platform.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29926>
2024-06-27 07:39:17 +00:00
Lionel Landwerlin
1de44b1951 anv: add pipeline/shader support for descriptor buffers
Lowering/layout is pretty much the same as direct descriptors. The
caveats is that since the descriptor buffers are not visible from the
binding tables we can't promote anything to the binding table (except
push descriptors).

The reason for this is that there is nothing that prevents an
application to use both types of descriptors and because descriptor
buffers have visible address + capture replay, we can't merge the 2
types in the same virtual address space location (limited to 4Gb max,
limited 2Gb with binding tables).

If we had the guarantee that both are not going to be used at the same
time, we could consider a 2Gb VA for descriptor buffers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22151>
2024-02-29 07:05:06 +00:00
Lionel Landwerlin
9934613c74 anv/hasvk: track robustness per pipeline stage
And split them into UBO and SSBO

v2 (Lionel):
 - Get rid of robustness fields in anv_shader_bin
v3 (Lionel):
 - Do not pass unused parameters around

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>
2023-08-09 09:00:12 +03:00
Lionel Landwerlin
06dfd216d3 anv: add direct descriptor support to apply_layout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:38 +00:00
Lionel Landwerlin
02cecffe2b anv: add a pass to partially lower resource_intel
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:38 +00:00
Lionel Landwerlin
e9fa840eed anv: implement EDS2.extendedDynamicState2PatchControlPoints
We make the compiler assume the worst possible case (it's not great
because we have to burn 32 GRFs of potential input data) and then we
push the actual value through push constants.

This enables VK_EXT_gpl usage on zink, which causes two traces to change
their results.  Raven is an imperceptible change, blender has missing
original pngs but looks plausible.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>
2023-05-24 18:32:07 +00:00
Lionel Landwerlin
3d49cdb71e anv: implement VK_EXT_graphics_pipeline_library
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
2023-04-17 22:43:37 +00:00
Lionel Landwerlin
0b8a2de2a1 anv: add dynamic buffer offsets support with independent sets
With independent sets, we're not able to compute immediate values for
the index at which to read anv_push_constants::dynamic_offsets to get
the offset of a dynamic buffer. This is because the pipeline layout
may not have all the descriptor set layouts when we compile the
shader.

To solve that issue, we insert a layer of indirection.

This reworks the dynamic buffer offset storage with a 2D array in
anv_cmd_pipeline_state :

   dynamic_offsets[MAX_SETS][MAX_DYN_BUFFERS]

When the pipeline or the dynamic buffer offsets are updated, we
flatten that array into the
anv_push_constants::dynamic_offsets[MAX_DYN_BUFFERS] array.

For shaders compiled with independent sets, the bottom 6 bits of
element X in anv_push_constants::desc_sets[] is used to specify the
base offsets into the anv_push_constants::dynamic_offsets[] for the
set X.

The computation in the shader is now something like :

  base_dyn_buffer_set_idx = anv_push_constants::desc_sets[set_idx] & 0x3f
  dyn_buffer_offset = anv_push_constants::dynamic_offsets[base_dyn_buffer_set_idx + dynamic_buffer_idx]

It was suggested by Faith to use a different push constant buffer with
dynamic_offsets prepared for each stage when using independent sets
instead, but it feels easier to understand this way. And there is some
room for optimization if you are set X and that you know all the sets in
the range [0, X], then you can still avoid the indirection. Separate
push constant allocations per stage do have a CPU cost.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
2023-04-17 22:43:37 +00:00
Lionel Landwerlin
ff91c5ca42 anv: add analysis for push descriptor uses and store it in shader cache
We'll use this information to avoid :
   - binding table emission
   - allocation of surface states

v2: Fix anv_nir_push_desc_ubo_fully_promoted()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
2022-10-14 23:03:16 +00:00
Kenneth Graunke
9cb57c9a7a anv: Delete has_a64_buffer_access flag
It's always true.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18208>
2022-09-02 09:40:46 +00:00
Jason Ekstrand
30251aaca2 anv: Stop looking at the pipeline in multiview lowering
Passing all the data we need in directly avoids issues where we might
forget what is and isn't set on the pipeline object at the time the
shader call happens.  This will be especially important once we start
splitting things up for pipeline libraries.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17602>
2022-08-31 02:00:18 +00:00
Lionel Landwerlin
eac5a2fdfa anv: make apply_pipeline_layout/compute_push_layout visible to NIR debug
Useful for debug.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17209>
2022-06-24 07:12:18 +00:00
Jason Ekstrand
b704d03efd anv: Do UBO loads with global addresses for bindless
This makes UBO loads in the variable pointers or bindless case work just
like SSBO loads in the sense that they use A64 messages and 64-bit
global addresses.  The primary difference is that we have an
optimization in anv_nir_lower_ubo_loads which uses a (possibly
predicated) block load message when the offset is constant so we get
roughly the same performance as we would from plumbing load_ubo all the
way to the back-end.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>
2021-03-17 17:49:59 +00:00
Jason Ekstrand
61749b5a15 anv: Add a pass for lowering A64 UBO access
Instead of load_global_constant_offset/bounded, we want to use the
Intel-specific block load intrinsic whenever we can.  This way we get
the same wide block loads that we usually use for constant offset UBO
pulls with a binding table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>
2021-03-17 17:49:59 +00:00
Jason Ekstrand
e06144a818 anv: Use 64bit_global_32bit_offset for SSBOs
This has the advantage of giving us cheaper address calculations because
we can calculate in 32 bits first and then do a single 64x32 add.  It
also lets us delete a bunch of code for dealing with descriptor
dereferences (vulkan_resource_reindex, and friends) because our bindless
SSBO pointers are now vec4s regardless of whether or not we're doing
bounds checking.  This also unifies UBOs and SSBOs.  The one down-side
is that, in certain variable pointers cases, it may end up burning more
memory and/or increasing register pressure.  This seems like a worth-
while trade-off.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>
2021-03-17 17:49:59 +00:00
Caio Marcelo de Oliveira Filho
cf54785239 anv/gen12: Lower VK_KHR_multiview using Primitive Replication
Identify if view_index is used only for position calculation, and use
Primitive Replication to implement Multiview in Gen12.  This feature
allows storing per-view position information in a single execution of
the shader, treating position as an array.

The shader is transformed by adding a for-loop around it, that have an
iteration per active view (in the view_mask).  Stores to the position
now store into the position array for the current index in the loop,
and load_view_index() will return the view index corresponding to the
current index in the loop.

The feature is controlled by setting the environment variable
ANV_PRIMITIVE_REPLICATION_MAX_VIEWS, which defaults to 2 if unset.
For pipelines with view counts larger than that, the regular
instancing will be used instead of Primitive Replication.  To disable
it completely set the variable to 0.

v2: Don't assume position is set in vertex shader; remove only stores
    for position; don't apply optimizations since other passes will
    do; clone shader body without extract/reinsert; don't use
    last_block (potentially stale). (Jason)

    Fix view_index immediate to contain the view index, not its order.
    Check for maximum number of views supported.
    Add guard for gen12.

v3: Clone the entire shader function and change it before reinsert;
    disable optimization when shader has memory writes. (Jason)

    Use a single environment variable with _DEBUG on the name.

v4: Change to use new nir_deref_instr.
    When removing stores, look for mode nir_var_shader_out instead
    of the walking the list of outputs.
    Ensure unused derefs are removed in the non-position part of the
    shader.
    Remove dead control flow when identifying if can use or not
    primitive replication.

v5: Consider all the active shaders (including fragment) when deciding
    that Primitive Replication can be used.
    Change environment variable to ANV_PRIMITIVE_REPLICATION.
    Squash the emission of 3DSTATE_PRIMITIVE_REPLICATION into this patch.
    Disable Prim Rep in blorp_exec_3d.

v6: Use a loop around the shader, instead of manually unrolling, since
    the regular unroll pass will kick in.
    Document that we don't expect to see copy_deref or load_deref
    involving the position variable.
    Recover use_primitive_replication value when loading pipeline from
    the cache.
    Set VARYING_SLOT_LAYER to 0 in the shader.  Earlier versions were
    relying on ForceZeroRTAIndexEnable but that might not be
    sufficient.
    Disable Prim Rep in cmd_buffer_so_memcpy.

v7: Don't use Primitive Replication if position is not set, fallback
    to instancing; change environment variable to be
    ANV_PRIMITVE_REPLICATION_MAX_VIEWS and default it to 2 based on
    experiments.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
2020-04-07 17:16:09 +00:00
Jason Ekstrand
e03f965280 anv: Bounds-check pushed UBOs when robustBufferAccess = true
We also have to add nir_intrinsic_load_push_constant to the list of
intrinsics which use push constants in brw_nir_analyze_ubo_ranges
because we're moving the loop where we rewrite the intrinsics to after
we've analyzed UBO loads.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3777>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3777>
2020-03-07 04:51:29 +00:00
Lionel Landwerlin
c056193288 anv: drop unused parameter from apply layout pass
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:25 +02:00
Lionel Landwerlin
7c223cf316 anv: constify pipeline layout in nir passes
Was hoping to find potential issues but nothing. Still probably a good
idea.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:22 +02:00
Jason Ekstrand
9baa33cef0 anv: Rework push constant handling
This substantially reworks both the state setup side of push constant
handling and the pipeline compile side.  The fundamental change here is
that we're no longer respecting the prog_data::param array and instead
are just instructing the back-end compiler to leave the array alone.
This makes the state setup side substantially simpler because we can now
just memcpy the whole block of push constants and don't have to
upload one DWORD at a time.

This also means that we can compute the full push constant layout
up-front and just trust the back-end compiler to not mess with it.
Maybe one day we'll decide that the back-end compiler can do useful
things there again but for now, this is functionally no different from
what we had before this commit and makes the NIR handling cleaner.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
aecde23519 anv: Pre-compute push ranges for graphics pipelines
It turns off that emitting push constants is one of the hottest paths in
the driver and ANY work we do there costs us.  By pre-computing things a
bit ahead of time, we shave 5% off the runtime of a CPU-limited example
running with the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Daniel Schürmann
c31f470066 anv,nir: Move lower_input_attachments pass from ANV to NIR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:02:50 +02:00
Jason Ekstrand
9ce7c29724 anv/nir: Add a central helper for figuring out SSBO address formats
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
a24654b49d anv/nir: Rework arguments to apply_pipeline_layout
Instead of taking a whole pipeline (which could be anything!), just take
a physical device and robust_buffer_access boolean.  This makes it
easier to verify that only the things in the hash actually affect
pipeline compilation.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-11-22 09:17:28 -06:00
Jason Ekstrand
dfe18be09e anv: Implement vkCmdDispatchBase
This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
e1a49f974b anv/pipeline: don't take the layout from the pipeline to compile shaders
The Vulkan spec states that VkPipelineLayout objects must not be
destroyed while any command buffer that uses them is in the recording
state, but it permits them to be destroyed otherwise. This means that
applications are allowed to free pipeline layouts after command recording
is finished even if there are pipeline objects that still exist and were
created with these layouts.

There are two solutions to this, one is to use reference counting on
pipeline layout objects. The other is to avoid holding references to
pipeline layouts where they are not really needed.

This patch takes a step towards the second option by making the
pipeline shader compile code take pipeline layout from the
VkGraphicsPipelineCreateInfo provided rather than the pipeline
object.

A follow-up patch will remove any remaining uses of the layout field
so we can remove it from the pipeline object and avoid the need
for reference counting.

v2: Use ANV_FROM_HANDLE, remove unnecessary braces (Jason)

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Lionel Landwerlin
f3e91e78a3 anv: add nir lowering pass for ycbcr textures
This pass implements all the implicit conversions required by the
VK_KHR_sampler_ycbcr_conversion specification.

It also inserts plane sources onto sampling instructions that we then
let the pipeline layout pass deal with, when mapping things correctly
to descriptors.

v2: Add new file to meson build (Lionel)
    Use nir_frcp() rather than (1.0f / x) (Jason)
    Reuse nir_tex_instr_dest_size() rather than handwritten one (Jason)
    Return progress (Jason)
    Account for array of samplers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
0db7070330 anv/pipeline: Add shader lowering for multiview
v2 (Jason Ekstrand):
 - Take a view_mask rather than a whole subpass
 - Build the view mask into the VS shader key

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
0bed97006f anv/nir: Delete the apply_dynamic_offsets prototype
That pass hasn't existed since dd4db84640
but the prototype stuck around for no reason.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
347f43c8ec anv: Add an input attachment lowering pass
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-11-22 13:44:55 -08:00