Commit graph

239 commits

Author SHA1 Message Date
Jason Ekstrand
d9ea015ced anv/pipeline: Lower pipeline layouts etc. after linking
This allows us to use the link-optimized shader for determining binding
table layouts and, more importantly, URB layouts.  For apps running on
DXVK, this is extremely important as DXVK likes to declare max-size
inputs and outputs and this lets is massively shrink our URB space
requirements.

VkPipeline-db results (Batman pipelines only) on KBL:

    total instructions in shared programs: 820403 -> 790008 (-3.70%)
    instructions in affected programs: 273759 -> 243364 (-11.10%)
    helped: 622
    HURT: 42

    total spills in shared programs: 8449 -> 5212 (-38.31%)
    spills in affected programs: 3427 -> 190 (-94.46%)
    helped: 607
    HURT: 2

    total fills in shared programs: 11638 -> 6067 (-47.87%)
    fills in affected programs: 5879 -> 308 (-94.76%)
    helped: 606
    HURT: 3

Looking at shaders by hand, it makes the URB between TCS and TES go from
containing 32 per-vertex varyings per tessellation shader pair to a more
reasonable 8-12.  For a 3-vertex patch, that's at least half the URB
space no matter how big the patch section is.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-17 10:50:28 -05:00
Jason Ekstrand
f210a5f4bb anv/pipeline: Set tess IO read/written key fields in compile_*
We want these to be set as close to the final compile as possible so
that they are guaranteed to happen after nir_shader_gather_info is
called.  The next commit is going to move nir_shader_gather_info to
after the linking step which makes this necessary.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-17 10:50:28 -05:00
Jason Ekstrand
2e4094cd8f anv/pipeline: Use more fields from stage in compile_cs
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-17 10:50:28 -05:00
Eric Engestrom
fcf259ef97 anv: set error in all failure paths
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 5b196f39bd "anv/pipeline: Compile to NIR in compile_graphics"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-09 11:20:27 +01:00
Jason Ekstrand
1d900e55fd anv/pipeline: Disable FS dispatch for pointless fragment shaders
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-03 05:52:23 -07:00
Jason Ekstrand
7ef6cd0ee8 anv/pipeline: Do cross-stage linking optimizations
This appears to help the Aztec Ruins benchmark by about 2% on my Kaby
Lake gt2 laptop.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
a5bffa061d anv/pipeline: Pull most of the anv_pipeline_compile_* into common code
This leaves us with a series of little anv_pipeline_compile_* functions
which each take a compiler object, a mem_ctx, the stage to compile, and
the previous stage for VUE linking purposes.  Some of them do
interesting things but most are little more than wrappers around
brw_compile_*.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
5351339554 anv/pipeline: Add a separate "link" stage
This breaks compilation up a bit into "link" and "compile".  In the
"link" stage, new anv_pipeline_link_* helpers are called which are
responsible for setting up the binding table and doing anything needed
to properly link with the next stage in the pipeline if one exists.
They are called in reverse order starting with the fragment shader so
you can assume linking in later stages is already done.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
5b196f39bd anv/pipeline: Compile to NIR in compile_graphics
This pulls the SPIR-V to NIR step out into common code.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
946fcd02a9 anv/pipeline: Recompile all shaders if any are missing from the cache
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
f76d6d8a63 anv/pipeline: Drop anv_pipeline_add_compiled_stage
We can set active_stages much more directly and then it's just candy
around setting pipeline->stages[stage].

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
703a24932a anv/pipeline: Pull shader compilation out into a helper.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
f3c59ca947 anv/pipeline: Call anv_pipeline_compile_* in a loop
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
bdc3565c8c anv/pipeline: Hash the entire pipeline in one go
Instead of hashing each stage separately (and TES and TCS together), we
hash the entire pipeline.  This means we'll get fewer cache hits if
they, for instance, re-use the same VS over and over again but it also
means we can now safely do cross-stage optimizations.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
4a8236ae17 anv/pipeline: Populate keys up-front
Instead of having each anv_pipeline_compile_* function populate the
shader key, make it part of the anv_pipeline_stage struct and fill it
out up-front.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
76503b319a anv/pipline: Add a helper struct for per-stage info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-02 10:29:20 -07:00
Jason Ekstrand
de9e5cf35a anv/pipeline: Add populate_tcs/tes_key helpers
They don't really do anything interesting, but it's more consistent this
way.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
e621f57556 anv/pipeline: Rework the parameters to populate_wm_prog_key
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
b2e0b0dad6 anv/pipeline: More aggressively optimize away color attachments
Instead of just looking at the number of color attachments, look at
which ones are actually used by the subpass.  This lets us potentially
throw away chunks of the fragment shader.  In DXVK, for example, all
subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this
is very helpful in that case.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
80bc0b728c anv: Restrict the number of color regions to those actually written
The back-end compiler emits the number of color writes specified by
wm_prog_key::nr_color_regions regardless of what nir_store_outputs we
have.  Once we've gone through and figured out which render targets
actually exist and are written by the shader, we should restrict the key
to avoid extra RT write messages.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Jason Ekstrand
4d57e543b8 anv/pipeline: Fix up deref modes if we delete a FS output
With the new deref instructions, we have to keep the modes consistent
between the derefs and the variables they reference.  Since we remove
outputs by changing them to local variables, we need to run the fixup
pass to fix the modes.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-01 18:02:28 -07:00
Kenneth Graunke
488972222c i965: Combine both gl_PatchVerticesIn lowering passes.
Until now, we had separate passes for lowering gl_PatchVerticesIn to
a statically known constant (for TES inputs when linked against a TCS),
and a uniform in the other cases.  Annoyingly, one had to be run before
nir_lower_system_values, and the other afterward.  This simplified the
passes, but made life painful for the callers.

This patch combines both into a single pass.  If you give it a non-zero
static count, it uses that.  If you give it Mesa state slots, it turns
it back into a built-in uniform.  Otherwise, it does nothing.

This also moves the i965 uniform lowering out to shared code.

v2: Make token arrays const.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-26 21:51:36 -07:00
Jason Ekstrand
820d5e51b7 intel/compiler: Account for built-in uniforms in analyze_ubo_ranges
The original pass only looked for load_uniform intrinsics but there are
a number of other places that could end up loading a push constant.  One
obvious omission was images which always implicitly use a push constant.
Legacy VS clip planes also get pushed into the shader.  This fixes some
new Vulkan CTS tests that test random combinations of bindings and, in
particular, test lots of UBOs and images together.

Cc: mesa-stable@lists.freedesktop.org
Cc: Kenneth Graunke <kenneth@whitecape.org>
2018-07-23 15:28:17 -07:00
Ilia Mirkin
257128079c anv/gen9: expose VK_EXT_post_depth_coverage
Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver.
Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted?
Not sure, so I left it as-is.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-22 14:56:44 -07:00
Jason Ekstrand
227dabc266 anv: Implement VK_EXT_vertex_attribute_divisor
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jason Ekstrand
2caf6c0392 anv/pipeline: Add a per-VB instance divisor
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jason Ekstrand
32f4feb5a0 anv/pipeline: Use a per-VB struct instead of separate arrays
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jose Maria Casanova Crespo
6db20229ab anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage
Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+
using the VK_KHR_get_physical_device_properties2 functionality
to expose if the extension is supported or not.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-07-10 00:14:50 +02:00
Jason Ekstrand
208be8eafa anv: Make subpass::depth_stencil_attachment a pointer
This makes certain checks a bit easier and means that we don't have
the attachment information duplicated in the attachment list and in
depth_stencil_attachment.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
f5c38f4a30 anv: Add device-level helpers for searching for and uploading kernels
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
eae192bf5f anv/pipeline: Stop optimizing for not having a cache
Before, we were only hashing the shader if we had a shader cache to
cache things in.  This means that if we ever get it wrong, we could end
up trying to cache a shader with an undefined hash.  Since not having a
shader cache is an extremely uncommon case, let's optimize for code
clarity and obvious correctness over avoiding a hash operation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-07-02 13:07:06 -07:00
Jason Ekstrand
3a5ed18c51 anv: Add support for shader constant data to the pipeline cache
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:47 -07:00
Francisco Jerez
244a0ff3a8 i965: Add plumbing for shader time in 32-wide FS dispatch mode.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
032b845edf anv/pipeline: Convert apply_pipeline_layout to deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
d57e724a45 anv/pipeline: Convert YCbCr lowering to deref instructiosn
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
38f1b89805 anv/pipeline: Convert lower_input_attachments to deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
5cd7324a57 anv/pipeline: Do less deref instruction lowering
This commit removes most of the deref instruction lowering.  Instead of
lowering early, we only lower textures and images and we only do so
right before any of the anv image lowering passes.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:54:00 -07:00
Jason Ekstrand
c11833ab24 nir,spirv: Rework function calls
This commit completely reworks function calls in NIR.  Instead of having
a set of variables for the parameters and return value, nir_call_instr
now has simply has a number of sources which get mapped to load_param
intrinsics inside the functions.  It's up to the client API to build an
ABI on top of that.  In SPIR-V, out parameters are handled by passing
the result of a deref through as an SSA value and storing to it.

This virtue of this approach can be seen by how much it allows us to
delete from core NIR.  In particular, nir_inline_functions gets halved
and goes from a fairly difficult pass to understand in detail to almost
trivial.  It also simplifies spirv_to_nir somewhat because NIR functions
never were a good fit for SPIR-V.

Unfortunately, there is no good way to do this without a mega-commit.
Core NIR and SPIR-V have to be changed at the same time.  This also
requires changes to anv and radv because nir_inline_functions couldn't
handle deref instructions before this change and can't work without them
after this change.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:58 -07:00
Jason Ekstrand
b0c643d8f5 spirv: Use NIR per-member splitting
Before, we were doing structure splitting in spirv_to_nir.
Unfortunately, this doesn't really work when you think about passing
struct pointers into functions.  Doing it later in NIR is a much better
plan.

Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:57 -07:00
Jason Ekstrand
74212c2414 anv,i965,radv,st,ir3: Call nir_lower_deref_instrs
This inserts a call to nir_lower_deref_instrs at every call site of
glsl_to_nir, spirv_to_nir, and prog_to_nir.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 20:15:54 -07:00
Gustavo Lima Chaves
7dfaf025c5 anv: enable VK_EXT_shader_stencil_export
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-08 11:16:01 -07:00
Samuel Iglesias Gonsálvez
2cf64fdb46 anv: ignore pColorBlendState if all color attachments of the subpass are unused
According to Vulkan spec:

  "pColorBlendState is a pointer to an instance of the
   VkPipelineColorBlendStateCreateInfo structure, and is ignored if the
   pipeline has rasterization disabled or if the subpass of the render pass the
   pipeline is created against does not use any color attachments."

Fixes tests from CL#2505:

   dEQP-VK.renderpass.*.simple.color_unused_omit_blend_state

v2:
- Check that blend is not NULL before usage.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-09 07:01:10 +02:00
Iago Toral Quiroga
002cb6f2b3 anv/pipeline: support SpvCapabilityInt16 in gen8+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Caio Marcelo de Oliveira Filho
c9bdc7f7e2 anv: enable VK_EXT_shader_viewport_index_layer
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 15:32:05 -07:00
Jason Ekstrand
c3f9d5c235 anv/pipeline: Lower more constant initializers earlier
Once we've gotten rid of everything but the main entrypoint, there's no
reason why we should go ahead and lower them all.  This is what radv
does and it will make future work easier.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Caio Marcelo de Oliveira Filho
f6338c3b85 anv/pipeline: set active_stages early
Since the intermediate states of active_stages are not used,
i.e. active_stages is read only after all stages were set into it,
just set its value before compiling the shaders.

This will allow to conditionally run certain passes based on what
other shaders are being used, e.g. a certain pass might only be
applicable to the vertex shader if there's no geometry or tessellation
shader being used.

v2: Use vk_to_mesa_shader_stage. (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho
318073ce66 anv/pipeline: fail if TCS/TES compile fail
v2: Add Fixes tag. (Lionel)

Fixes: e50d4807a3 ("anv: Compile TCS/TES shaders.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Jason Ekstrand
03c07ac548 anv: Add support for SPIR-V 1.3 subgroup operations
This requires us to bump the subgroup size to 32 for all shader stages
because Vulkan requires that to be a physical device query.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68df93ecbc anv: Trivially implement VK_KHR_device_group
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
dfe18be09e anv: Implement vkCmdDispatchBase
This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00