Commit graph

346 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
395de69b1f intel/fs: Allow multiple slots for position
Change brw_compute_vue_map() to also take the number of pos slots.  If
more than one slot is used, the VARYING_SLOT_POS is treated as an
array.

When using Primitive Replication, instead of a single position, the
VUE must contain an array of positions.  Padding might be
necessary (after clip distance) to ensure rest of attributes start
aligned.

v2: Add note about array in the commit message and assert that
    pos_slots >= 1 to make clear 0 is invalid. (Jason)
    Move padding to be after the clip distance.

v3: Apply the correct offset when gathering the sources from outputs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
2020-04-07 17:16:09 +00:00
Tapani Pälli
e8f0483ec4 intel/compiler: detect if atomic load store operations are used
Patch adds a new arg and modifies existing calls from i965, anv
pass NULL but iris stores this information for later use.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>
2020-03-16 10:34:21 +00:00
Jason Ekstrand
4432dd6ea4 anv: Dump push ranges via VK_KHR_pipeline_executable_properties
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>
2020-03-13 16:31:44 +00:00
Caio Marcelo de Oliveira Filho
0a5053b687 anv: Reduce compute pipeline batch_data size
The batch associated with the compute pipeline only needs room for a
MEDIA_VFE_STATE. So this patch moves the batch_data to each pipeline
struct and cap the one in compute pipeline.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho
925df46b7e anv: Split graphics and compute bits from anv_pipeline
Add two new structs that use the anv_pipeline as base.  Changed all
functions that work on a specific pipeline to use the corresponding
struct.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho
af33f0d767 anv: Use a separate field in the pipeline for compute shader
This is a preparation for splitting the compute and graphics pipelines
into separate structs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho
88df3bf79a anv: Keep the shader stage in anv_shader_bin
This will be used to decouple the logic flush_descriptor_sets() from
the position in the shader array, allowing us to store just the
shaders needed for each pipeline.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho
9bf044d254 anv: Use a dynamic array for storing executables in pipeline
Avoids waste for pipelines that don't use all the shaders, and is
flexible enough to cover cases where there are multiple variants per
shader (e.g. SIMD8/16/32 for fragment shader).

Even though we could pre-calculate the exact size of the array, this
is not a critical path so it is worth preventing the bug that will
likely happen when new variants are added but not accounted for.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho
9b0682df82 anv: Use pipeline type to decide whether or not lower multiview
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho
613c9b78e3 anv: Add a new enum to identify the pipeline type
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
2020-03-12 13:18:54 -07:00
Jason Ekstrand
e03f965280 anv: Bounds-check pushed UBOs when robustBufferAccess = true
We also have to add nir_intrinsic_load_push_constant to the list of
intrinsics which use push constants in brw_nir_analyze_ubo_ranges
because we're moving the loop where we rewrite the intrinsics to after
we've analyzed UBO loads.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3777>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3777>
2020-03-07 04:51:29 +00:00
Caio Marcelo de Oliveira Filho
dab7a4d82c anv: Remove unused field urb.total_size
This was used before the URB calculation functions were shared by GL
and Vulkan.  Also drop the substruct for the remaining, `l3_config` is
a good name on its own.

Also-written-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3981>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3981>
2020-02-27 14:45:10 -08:00
Jason Ekstrand
5dfd83d7a1 anv: Always enable the data cache
Because we set the needs_data_cache bit from the NIR during compilation,
any time a shader was pulled out of the pipeline cache, we wouldn't set
the bit and the data cache was disabled.  Fortunately, on Gen8+, this
bit is ignored because we always use the ALL section in the L3$ config
instead of separate DC and RO sections.  On Gen7, however, this meant
that we were basically never running with the data cache enabled and our
compute performance was suffering massively because of it.  This commit
improves Geekbench 5 scores on my Haswell GT3 by roughly 330% (no,
that's not a typo).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>
2020-02-25 20:12:10 +00:00
Caio Marcelo de Oliveira Filho
956e4b2d37 nir, intel: Move use_scoped_memory_barrier to nir_options
This option will be used later by GLSL, so move to a common struct.

Because nir_options is filled in the compiler instead of the Vulkan
driver, fix that up.  GLSL will ignore that for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>
2020-02-24 19:12:11 +00:00
Caio Marcelo de Oliveira Filho
7df5d36078 anv: Use intel_debug_flag_for_shader_stage()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3911>
2020-02-21 13:09:44 -08:00
Arcady Goldmints-Orlov
e9f83185a2 Rename nir_lower_constant_initializers to nir_lower_variable_initalizers
This is naming is more clear as nir_variables can be initializes not
just with a nir_constant but with a pointer to another nir_variable.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
2020-02-12 15:41:49 +00:00
Ian Romanick
c57338b924 anv: Enable SPV_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2
Currently only implemented in the scalar backend, so only enable for
Gen8+.  If support for the other opcodes is added to the vec4 backend,
Gen7 could be supported.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Jason Ekstrand
78ff747408 anv: Drop the instance pointer from anv_device
There are very few times when we actually want to fetch the instance
from the anv_device.  We can put up with a bit of pain there in exchange
for strongly discouraging people from doing this in general.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
70e8064e13 anv: Add an anv_physical_device field to anv_device
Having to always pull the physical device from the instance has been
annoying for almost as long as the driver has existed.  It also won't
work in a world where we ever have more than one physical device.  This
commit adds a new field called "physical" to anv_device and switches
every location where we use device->instance->physicalDevice to use the
new field instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Caio Marcelo de Oliveira Filho
75a19186b2 anv: Ignore some CreateInfo structs when rasterization is disabled
According to the description of VkGraphicsPipelineCreateInfo(),
pViewportState, pMultisampleState, pDepthStencilState and
pColorBlendState must be ignored when rasterization is not enabled.

This avoids potentially invalid pointers being dereferenced when
rasterization is disabled.  Tested with `demos_x64 VK_Parameter_Zoo`
from Renderdoc repository.

v2: Don't store the `raster_enabled` as part of anv_pipeline, just
    query it from the create info.  This avoids storing a state that's
    only used during pipeline creation. (Jason)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2258
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch> [v1]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-03 13:57:31 -08:00
Lionel Landwerlin
c056193288 anv: drop unused parameter from apply layout pass
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:25 +02:00
Jason Ekstrand
98dc179c1e anv: More carefully dirty state in BindPipeline
Instead of blindly dirtying descriptors and push constants the moment we
see a pipeline change, check to see if it actually changes the bind
layout or push constant layout.  This doubles the runtime performance of
one CPU-limited example running with the Dawn WebGPU implementation when
running on my laptop.

NOTE: This effectively reverts beca63c6c0.  While it was a nice
optimization, it was based on prog_data and we can't do that anymore
once we start allowing the same binding table to be used with multiple
different pipelines.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
9baa33cef0 anv: Rework push constant handling
This substantially reworks both the state setup side of push constant
handling and the pipeline compile side.  The fundamental change here is
that we're no longer respecting the prog_data::param array and instead
are just instructing the back-end compiler to leave the array alone.
This makes the state setup side substantially simpler because we can now
just memcpy the whole block of push constants and don't have to
upload one DWORD at a time.

This also means that we can compute the full push constant layout
up-front and just trust the back-end compiler to not mess with it.
Maybe one day we'll decide that the back-end compiler can do useful
things there again but for now, this is functionally no different from
what we had before this commit and makes the NIR handling cleaner.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
aecde23519 anv: Pre-compute push ranges for graphics pipelines
It turns off that emitting push constants is one of the hottest paths in
the driver and ANY work we do there costs us.  By pre-computing things a
bit ahead of time, we shave 5% off the runtime of a CPU-limited example
running with the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
0709c0f6b4 anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout
This lets us stop tracking the pipeline layout.  It also means less
indirection on a very hot path.  As an extra bonus, we can make some of
our data structures smaller.  No measurable CPU overhead improvement.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
abfd4651ed anv/pipeline: Assume layout != NULL
In the early days of the driver we allowed layout to be VK_NULL_HANDLE
and used that for some internal pipelines when we wanted to be lazy.
Vulkan doesn't actually allow NULL layouts, however, so there's no
reason to have this check.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
6a8f43030c anv: Stop compacting render targets in the binding table
Instead, always emit one entry for every color attachment in the subpass
or one NULL if there are no color attachments.  This will let us adjust
an Ice Lake workaround so we don't get a stall on every draw call.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Jason Ekstrand
c765e2156a anv: Don't claim the null RT as a valid color target
If it's NULL, we can let the compiler go ahead and delete it or flag it
as NULL.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Jason Ekstrand
df7a730b4f anv: Don't delete fragment shaders that write sample mask
Also, use color_outputs_valid rather than nr_color_outputs since it
should be a bit more accurate.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Caio Marcelo de Oliveira Filho
06aecb14c0 anv: Implement VK_KHR_vulkan_memory_model
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-24 11:39:56 -07:00
Jason Ekstrand
c7e5d24d8f anv/pipeline: Capture serialized NIR
This allows the serialized NIR to be displayed in RenderDoc and similar
tools.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-09 22:28:01 +00:00
Caio Marcelo de Oliveira Filho
f7ca072ab2 anv: Implement VK_KHR_shader_clock
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 09:12:12 -07:00
Samuel Iglesias Gonsálvez
f5dd6dfe01 anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls
This adds support for
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and
enables de Vulkan and SPIR-V extensions.

Also, notice that this includes the updates applied to the
VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension
VK_KHR_shader_float_controls v4 and Vulkan 1.1.116.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Jason Ekstrand
f58e0405b6 intel/fs: Drop the gl_program from fs_visitor
It's not used by anything anymore now that so much lowering has been
moved into NIR.  Sadly, we still need on in brw_compile_gs() for
geometry shaders on Sandy Bridge.  Short of a lot of pointless work,
that one's probably not going away.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-25 01:02:52 -05:00
Jason Ekstrand
d787a2d05e anv: Implement VK_KHR_pipeline_executable_properties
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
67cb55ad11 anv: Add a ralloc context to anv_pipeline
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
fec4bdff40 anv: Force a full re-compile when CAPTURE_INTERNAL_REPRESENTATION_TEXT is set
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
651fbbf9b8 anv/pipeline: Split setting up per-stage keys into its own loop
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
78f3dfb4a2 anv: Record shader compile stats in the pipeline cache
We're going to want these to be available regardless of caching.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
2af380d20f anv/pipeline: Stash generated code in the pipeline stage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
134607760a intel/compiler: Fill a compiler statistics struct
This commit is all annoying plumbing work which just adds support for a
new brw_compile_stats struct.  This struct provides a binary driver
readable form of the same statistics we dump out to stderr when we
INTEL_DEBUG is set with a shader stage.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Rhys Perry
c52c54a746 anv,i965,iris: deduplicate setting of total_shared
v5: add patch

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Rhys Perry
024a46a407 anv: use derefs for shared memory access
vkpipeline-db for my Skylake GPU:
total instructions in shared programs: 8847602 -> 8847896 (<.01%)
instructions in affected programs: 10165 -> 10459 (2.89%)
helped: 8
HURT: 2

total cycles in shared programs: 1606273555 -> 1606251634 (<.01%)
cycles in affected programs: 2201803 -> 2179882 (-1.00%)
helped: 7
HURT: 3

The shaders with more instructions is due to a loop over a shared array
in Three Kingdoms being unrolled (and creating a lot of nested ifs). Not sure
if that's good or bad.

One of the shaders with worse cycles is only worse by 0.04% and the other
two are the shaders with loops unrolled.

v2: add patch
v4: don't set spirv_options.shared_addr_format
v4: move comment concerning the shared address format used and NULL
v4: add vkpipeline-db results
v5: rename to nir_lower_vars_to_explicit_types
v5: move setting of total_shared to outside brw_compile_cs
v6: set shared_addr_format
v6: formatting changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Jason Ekstrand
f6e7de41d7 anv: Implement VK_EXT_line_rasterization
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 02:05:28 +00:00
Jason Ekstrand
abf9e10488 anv: Use dirty bits for dynamic state tracking
Previously, we assumed that the dirty bit was always 1 << VK_DYNAMIC_*
and this assumption is about to be false.  Extensions which define new
VK_DYNAMIC_* enums won't be nice and tightly packed which this really
requires.  Instead, add functions to don the conversions and rework the
bits a bit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 02:05:28 +00:00
Jason Ekstrand
4bb6e6817e intel: Use a system value for gl_FragCoord
It's kind-of an anomaly that the Intel drivers are still treating
gl_FragCoord as an input.  It also makes zero sense because we have to
special-case it in the back-end.

Because ANV is the only user of nir_lower_wpos_center, we go ahead and
just update it to look for nir_intrinsic_load_frag_coord as part of this
patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-29 23:30:26 +00:00
Jason Ekstrand
d10de25309 anv: Implement VK_EXT_subgroup_size_control
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand
bcef32d49b anv/pipeline: Plumb pipeline shader stage create flags
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand
c84b8eeeac intel/compiler: Be more conservative about subgroup sizes in GL
The rules for gl_SubgroupSize in Vulkan require that it be a constant
that can be queried through the API.  However, all GL requires is that
it's a uniform.  Instead of always claiming that the subgroup size in
the shader is 32 in GL like we have to do for Vulkan, claim 8 for
geometry stages, the maximum for fragment shaders, and the actual size
for compute.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand
14781e2122 intel/compiler: Add a "base class" for program keys
Right now, all keys have two things in common: a program string ID and a
sampler_prog_key_data.  I'd like to add another thing or two and need a
place to put it.  This commit adds a new brw_base_prog_key struct which
contains those two common bits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-10 19:35:55 +00:00