Commit graph

652 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
dab7a4d82c anv: Remove unused field urb.total_size
This was used before the URB calculation functions were shared by GL
and Vulkan.  Also drop the substruct for the remaining, `l3_config` is
a good name on its own.

Also-written-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3981>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3981>
2020-02-27 14:45:10 -08:00
Caio Marcelo de Oliveira Filho
89a3856714 anv: Add pipe_state_for_stage() helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3911>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3911>
2020-02-21 13:09:44 -08:00
Lionel Landwerlin
f9febfae41 anv: set MOCS on push constants
v2: Also set MOCS on 3DSTATE_CONSTANT_ALL (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 67d2cb3e93 ("anv: Add get_push_range_address() helper.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3732>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3732>
2020-02-06 10:10:11 +00:00
Lionel Landwerlin
bcb611361b anv: implement gen12 post sync pipe control workaround
Same as Skylake.

v2: Restrict to A0

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
2020-02-05 00:25:48 +00:00
Lionel Landwerlin
8949d27bb8 anv: implement gen9 post sync pipe control workaround
We've been missing this workaround for a while and since it's required
for Gen12, let's implement it for Gen9 first.

v2: Update comment for Gen9.

v3: Fix clearing of bits... (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
2020-02-05 00:25:48 +00:00
Jason Ekstrand
8c5fd2942b anv: Always fill out the AUX table even if CCS is disabled
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
2020-01-30 18:46:31 -06:00
Jason Ekstrand
73434b665b intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11
SML is no longer in the L3$ on Gen11+.  It's not incredibly clear from
the docs but no Gen11 platforms are in the list of platforms on which
this bit exists.  Also, we've been always setting it false on Gen11 in
ANV and i965 thanks to GEN_L3P_SLM being zero with no ill effects.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
2020-01-30 18:45:53 -06:00
Jason Ekstrand
a2e9dd51b3 anv: Set actual state pool sizes when we have softpin
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-01-29 09:43:42 -06:00
Jordan Justen
2969012d03 anv: Emit CS Stall before Instruction Cache flush for gen12 WA
Before flushing the instruction cache with a pipe control, we need to
use a CS Stall pipe control.

Ref: GEN:BUG:1409226450
Rework: Add stall-at-scoreboard (Lionel)
Rework: Merge with other anvil pre-invalidate stalls (Lionel)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>
2020-01-28 21:57:17 +00:00
Jason Ekstrand
06657e1dda anv: Replace one more aux_surface.isl.size_B check
This one was missed in 41bffe0913.

Fixes: 41bffe0913 "anv: Replace aux_surface.isl.size_B checks with..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>
2020-01-28 18:15:29 +00:00
Jason Ekstrand
07a441d53f anv: Rework CCS memory handling on TGL-LP
The previous way we were attempting to handle AUX tables on TGL-LP was
very GL-like.  We used the same aux table management code that's shared
with iris and we updated the table on image create/destroy.  The problem
with this is that Vulkan allows multiple VkImage objects to be bound to
the same memory location simultaneously and the app can ping-pong back
and forth between them in the same command buffer.  Because the AUX
table contains format-specific data, we cannot support this ping-pong
behavior with only CPU updates of the AUX table.

The new mechanism switches things around a bit and instead makes the aux
data part of the BO.  At BO creation time, a bit of space is appended to
the end of the BO for AUX data and the AUX table is updated in bulk for
the entire BO.  The problem here, of course, is that we can't insert the
format-specific data into the AUX table at BO create time.

Fortunately, Vulkan has a requirement that every TILING_OPTIMAL image
must be initialized prior to use by transitioning the image from
VK_IMAGE_LAYOUT_UNDEFINED to something else.  When doing the above
described ping-pong behavior, the app has to do such an initialization
transition every time it corrupts the underlying memory of the VkImage
by using it as something else.  We can hook into this initialization and
use it to update the AUX-TT entries from the command streamer.  This way
the AUX table gets its format information, apps get aliasing support,
and everyone is happy.

One side-effect of this is that we disallow CCS on shared buffers.
We'll need to fix this for modifiers on the scanout path but that's a
task for another patch.  We should be able to do it with dedicated
allocations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
fd0f9d1196 anv: Make AUX table invalidate a PIPE_* bit
This commit moves it in with all the other cache invalidation operations
as if it were done by PIPE_CONTROL even though it's a pair of register
writes.  This means we only have to write the GFX_AUX_TABLE_BASE_ADDR
register once at device initialization instead of every invalidate.
Invalidates are now a single LRI instead of two.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Caio Marcelo de Oliveira Filho
c1a2ac2abe anv: Always initialize target_stencil_layout
Pass down stencil data from the subpass attachment like we do
elsewhere.  Only stencil attachments will make use of it.

Fixes warnings like

    ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’:
    ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     4656 |       att_state->current_stencil_layout = target_stencil_layout;
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
2020-01-24 14:01:38 -08:00
Jason Ekstrand
41bffe0913 anv: Replace aux_surface.isl.size_B checks with aux_usage checks
Now that aux_usage has a unified meaning, aux_usage == NONE if and only
if aux_surface.isl.size_B > 0.  In most of these cases, the question
we're asking is "does have compression?" and not "have we allocated an
aux surface for compression?".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Jason Ekstrand
e693a57232 anv: Rework the meaning of anv_image::planes[]::aux_usage
Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant
CCS_D.  This sort-of made sense before we had anv_layout_to_aux_usage
but now that we have that helper.  However, in our more modern aux
tracking model, all aux usage goes through anv_layout_to_* and we're
better off making the meaning of anv_image::planes[]::aux_usage be
AUX_USAGE_NONE if and only if there is no compression.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Jason Ekstrand
c70a786c77 anv: Improve BTI change cache flushing
This commit makes two changes:

 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly
    for the flush at the end of cmd_buffer_begin_subpass.

 2. Because BLORP ops such as vkCmdClearAttachments may come in the
    middle of a render pass, we have to also flag the need for a cache
    flush after the blorp op.

Fixes: 185630c6bc "anv/blorp: Do the gen11 BTI flush"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:26 +00:00
Jason Ekstrand
bf3a262a80 anv: Add a usage parameter to anv_layout_to_aux_usage
Most places we actually know the usage and can provide it.  There are
two exceptions to this:

 1. We pass 0 into get_blorp_surf_for_anv_image when we use
    ANV_IMAGE_LAYOUT_EXPLICIT_AUX because anv_layout_to_aux_usage is
    never actually called so it doesn't matter.

 2. We pass 0 into anv_layout_to_aux_usage in transition_color_buffer.
    However, the coming commits which will begin using the usage
    parameter only care about depth.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
f8a4de6316 anv: Use isl_aux_state for HiZ resolves
Rather than looking at the aux usage, we look at the isl_aux_state which
provides us with more detailed information.  This commit adds a couple
helpers to isl which let us quickly determine if we have valid depth/hiz
on the initial layout and if we need valid depth/hiz for the final
layout.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
769d6ba200 anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves
As of 52ad1712ed, TRANSFER_SRC_OPTIMAL and SHADER_READ_ONLY_OPTIMAL
are now identical for depth buffers so there's no reason why we need to
use the "wrong" layout.  Technically, according to Vulkan, blits and
MSAA resolves are transfer ops so we should use the transfer layout now
that we can.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Tapani Pälli
5fede43fe0 anv: initialize clear_color_is_zero_one
Fixes following valgrind warning:

   ==12508== Conditional jump or move depends on uninitialised value(s)
   ==12508==    at 0x2CCD8B79: cmd_buffer_begin_subpass (genX_cmd_buffer.c:4599)
   ==12508==    by 0x2CCDA72B: gen9_CmdBeginRenderPass (genX_cmd_buffer.c:5275)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3487>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3487>
2020-01-21 17:47:30 +02:00
Jason Ekstrand
1ec84bd208 anv: Take a device in anv_perf_warn
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
cb6ea77045 anv: Take an anv_device in vk_errorf
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
70e8064e13 anv: Add an anv_physical_device field to anv_device
Having to always pull the physical device from the instance has been
annoying for almost as long as the driver has existed.  It also won't
work in a world where we ever have more than one physical device.  This
commit adds a new field called "physical" to anv_device and switches
every location where we use device->instance->physicalDevice to use the
new field instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
44f5a92c0b anv: Drop some VK_IMAGE_TILING_OPTIMAL checks
The DRM format modifiers extension adds a TILING_DRM_FORMAT_MODIFIER
which will be used for modifiers so we can no longer use OPTIMAL to
indicate tiled inside the driver.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Tapani Pälli
630cbb45ac anv: set depth stall enabled when depth flush enabled on gen12
This implements HW workaround #1409600907 for anv driver.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
2020-01-16 14:05:54 +02:00
Lionel Landwerlin
308efbf2f3 anv: implement another workaround for non pipelined states
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:51:30 +02:00
Iván Briano
4ef3f7e3d3 anv: Enable Vulkan 1.2 support
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-15 08:34:57 -06:00
Lionel Landwerlin
a014105498 anv: fix pipeline switch back for non pipelined states
Setting state base address can happen even before pipeline is
selected. Also we must ensure it is set to 3D for Gen12, we can't
switch back to an invalid pipeline value (UINT32_MAX).

v2: Reuse helpers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b34422db5e ("anv: Implement Gen12 workaround for non pipelined state")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>
2020-01-15 11:14:43 +02:00
Lionel Landwerlin
b34422db5e anv: Implement Gen12 workaround for non pipelined state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
2020-01-14 11:52:36 +02:00
Jason Ekstrand
21bc16a723 anv: Drop an unused variable 2020-01-13 12:20:48 -06:00
Jason Ekstrand
9b71171442 anv: Re-use flush_descriptor_sets in flush_compute_state
There's no reason to hand-roll all of the memory re-allocation fall-back
code for compute shaders.  It's  just duplicated complexity.  This also
makes it more clear in flush_compute_state where the
MEDIA_INTERFACE_DESCRIPTOR_LOAD command gets emitted relative to other
packets in the command stream.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:45:00 -06:00
Jason Ekstrand
ae72d1238c anv: Flag descriptors dirty when gl_NumWorkgroups is used
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:45:00 -06:00
Jason Ekstrand
ca6b3b11af anv: Don't add dynamic state base address to push constants on Gen7
Because Gen7 push constants are already relative to dynamic state base
address, they aren't really an address.  It's deceptive to return an
address from the helper function.  Instead, let's leave it as a
special-case in the gen7-11 helper; we don't need the helper for code
de-duplication for Gen7 anyway.

Fixes: 67d2cb3e93 "anv: Add get_push_range_address() helper"
Closes: #2323
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:44:06 -06:00
Jason Ekstrand
0f60aa4037 anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
46af0ecc1d anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
1b5cb92b62 anv: Apply cache flushes after setting index/draw VBs
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
7ce39a55c1 anv: Always invalidate the VF cache in BeginCommandBuffer
I think the reason why we only do this for primaries is that we didn't
expect to have blorp calls in secondaries.  However, you are allowed to
have a full render pass in a secondary command buffer so resolves and
clears can end up in there.  We should just always invalidate.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Rafael Antognolli
d3e339364f anv: Use 3DSTATE_CONSTANT_ALL when possible.
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).

There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32.

v2:
 - Rebased on top of the lasted changes from Jason.
 - Added review suggestions by Caio.
 - Removed struct push_bos and merged some code into
 anv_nir_compute_push_layout().

v3:
 - Remove code churn due to gen8+ workaround in
 anv_nir_compute_push_layout(). This code has been removed in an earlier
 commit, and implemented in cmd_buffer_emit_push_constant().

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
7d5da53d27 anv: Move code for emitting push constants into its own function.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
67d2cb3e93 anv: Add get_push_range_address() helper.
Add a helper function to get the push range address. Once we have a
separate function for emitting gen12 push constants, we can use this
helper and avoid duplicating code.

v3: Do not add range->start to the address in gen7 (Caio).
v4: Do not drop range->start from gen7 (Caio, Jason).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
c0225a728e anv: Move gen8+ push constant packet workaround.
Store push_ranges in ascending order, and only "shift" them to the end
of the array during state packet emission.

We don't need this workaround with the new 3DSTATE_CONSTANT_ALL packet.
So instead of applying the workaround here just for GEN < 12 (which
requires and extra loop through all the ranges to figure out if we
should shift them or not), we simply move the whole logic to the state
emission code. At that point, in a later commit, we are already looping
through all of the ranges anyway to check which packet we will be using,
so we might as well implement the workaround there, where it is going to
be used.

v3: Move gen8+ workaround to the state emission code (Caio).
v4: Add explanation of why we moved the workaroudn (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Jason Ekstrand
178a2946c0 anv: Respect the always_flush_cache driconf option
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-03 17:10:51 -06:00
Jason Ekstrand
a8965c076b anv: Push constants are relative to dynamic state on IVB
Fixes: aecde2351 "anv: Pre-compute push ranges for graphics pipelines"
Closes: #2136
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 22:15:54 +00:00
Rafael Antognolli
dadb6ebbd1 intel: Add workaround for stencil state.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-11-19 21:43:09 +00:00
Jason Ekstrand
fdaf8144a8 anv: Emit a NULL vertex for zero base_vertex/instance
If both are zero (the common case), we can emit a null vertex buffer
rather than emitting a vertex buffer with zeros in it.  The packing of
the VERTEX_BUFFER_STATE is faster because no relocation is emitted and
we can avoid creating the vertex buffer which means one less
anv_state_stream_alloc.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
98dc179c1e anv: More carefully dirty state in BindPipeline
Instead of blindly dirtying descriptors and push constants the moment we
see a pipeline change, check to see if it actually changes the bind
layout or push constant layout.  This doubles the runtime performance of
one CPU-limited example running with the Dawn WebGPU implementation when
running on my laptop.

NOTE: This effectively reverts beca63c6c0.  While it was a nice
optimization, it was based on prog_data and we can't do that anymore
once we start allowing the same binding table to be used with multiple
different pipelines.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
22f16ff54a anv: More carefully dirty state in BindDescriptorSets
Instead of dirtying all graphics or all compute based on binding point,
we're now much more careful.  We first check to see if the actual
descriptor set changed and then only dirty the stages used by that
descriptor set.  For dynamic offsets, we keep a bitfield per-stage of
which offsets are actually used in that stage and we only dirty push
constants and descriptors if that stage has dynamic offsets AND those
offsets actually change.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
ca8117b5d5 anv: Use a switch statement for binding table setup
It theoretically could be more efficient but the real point here is that
it's no longer really a matter of dealing with special cases and then
the "real" thing.  The way we're handling binding tables, it's more of a
multi-step process and a switch is more natural.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
ca91ab8015 anv: Re-arrange push constant data a bit
This moves the compute stuff into a anv_push_constants::cs sub-struct.
It also moves dynamic offsets into the push constants.  This means we
have to duplicate the data per-stage but that doesn't seem like the end
of the world and one day we may wish to make dynamic offsets per-stage
anyway.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
aecde23519 anv: Pre-compute push ranges for graphics pipelines
It turns off that emitting push constants is one of the hottest paths in
the driver and ANY work we do there costs us.  By pre-computing things a
bit ahead of time, we shave 5% off the runtime of a CPU-limited example
running with the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00