Commit graph

6168 commits

Author SHA1 Message Date
Jordan Justen
4656be70dd anv: Support multiple engines with DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT
v2 (Jason Ekstrand):
 - Separate the anv_gem interface from anv_queue internals
 - Rework on top of the new anv_queue_family stuff

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jordan Justen
c5e7c91487 anv: Add anv_gem_count_engines
v2 (Jason Ekstrand):
 - Take a drm_i915_query_engine_info

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jordan Justen
5d84c764fd anv: Gather engine info from i915 if available
v2 (Jason Ekstrand):
 - Don't take an anv_physical_device in anv_gem_get_engine_info()
 - Return the engine info from anv_gem_get_engine_info()
 - Free the engine info in anv_physical_device_destroy()

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jordan Justen
c0d07c838a anv: Support i915 query (DRM_IOCTL_I915_QUERY) from Linux v4.17
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jordan Justen
8d07f71918 anv: Print queue number with INTEL_DEBUG=bat
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jordan Justen
9fd0806621 anv: Turn device->queue into an array
Rework: Lionel
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jordan Justen
40d4799d8a anv: Add exec_flags to anv_queue
This may vary based on the newer kernel engines based contexts.

v2 (Jason Ekstrand):
 - Initialize anv_queue::exec_flags in anv_queue_init
 - Don't conflate this with refactors to get_reset_stats

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:33 +00:00
Jason Ekstrand
89ae945730 anv: Add an anv_queue_family struct
This is modeled on anv_memory_type and anv_memory_heap which we already
use for managing memory types.  Each anv_queue_family contains some data
which is returned by vkGetPhysicalDeviceQueueFamilyProperties() verbatim
as well as some internal book-keeping bits.  An array of queue families
along with a count is stored in the physical device.  Each anv_queue
then contains a pointer to the anv_queue_family to which it belongs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:32 +00:00
Lionel Landwerlin
4b920ba5ab anv: store queue creation flags on anv_queue
v2 (Jason Ekstrand):
 - Pass the whole VkDeviceQueueCreateInfo into anv_queue_init()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:32 +00:00
Jason Ekstrand
e18d045b69 anv: Refactor anv_queue_finish()
By moving vk_object_base_finish() to the end and putting the thread
clean-up in an if block we both better mimic anv_queue_init() and have a
more correct object destruction order.  It comes at the cost of a level
of indentation but that seems to actually make the function more clear.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:32 +00:00
Lionel Landwerlin
34721e2af4 anv: pass context to reset stats helper
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:32 +00:00
Jason Ekstrand
e2cd83fbc5 anv: Fix an old parameter name in GetDeviceQueue
I don't know if this is a typo or an artifact of ancient versions of the
Vulkan API.  In any case, it's wrong.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:32 +00:00
Jason Ekstrand
dc8d74a555 anv: Drop anv_dump
I originally wrote this several years ago to aid in app debugging.  Now
that we have nice tools like RenderDoc, it's no longer needed.  I don't
think anyone's really used it in 4 years or more.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
2021-01-28 18:26:32 +00:00
Jason Ekstrand
f3a43e36e0 intel/fs: Add an ex_desc field to fs_inst for SHADER_OPCODE_SEND
I meant to do this years ago when I first added SHADER_OPCODE_SEND.  At
the time, the only use for the extended descriptor was bindless handles
which were always one thing and never non-constant.  However, it doesn't
actually require any extra instructions because we have to OR in ex_mlen
anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8748>
2021-01-28 17:57:48 +00:00
Eleni Maria Stea
4ad4cd8906 anv: Enabled the VK_EXT_sample_locations extension
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device.info->gen >= 7 with True, as Anv doesn't support
    anything below Gen7. (Lionel Landwerlin)

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Eleni Maria Stea
6ab5dc45f6 anv: Removed unused header file
In src/intel/vulkan/genX_blorp_exec.c we included the file:
common/gen_sample_positions.h but not use it. Removed.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Eleni Maria Stea
27ee40f4c9 anv: Add support for sample locations
Allowing the user to set custom sample locations, by filling the
extension structs and chaining them to the pipeline structs according
to the Vulkan specification section [26.5. Custom Sample Locations]
for the following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and need
to be re-emitted again in the next pipeline creation. For that, we emit
the 3DSTATE_SAMPLE_PATTERN at every pipeline creation.

v2: In v1, we used the custom anv_sample struct to store the location
    and the distance from the pixel center because we would then use
    this distance to sort the locations and send them in increasing
    monotonical order to the GPU. That was because the Skylake PRM Vol.
    2a "3DSTATE_SAMPLE_PATTERN" says that the samples must have
    monotonically increasing distance from the pixel center to get the
    correct centroid computation in the device. However, the Vulkan
    spec seems to require that the samples occur in the order provided
    through the API and this requirement is only for the standard
    locations. As long as this only affects centroid calculations as
    the docs say, we should be ok because OpenGL and Vulkan only
    require that the centroid be some lit sample and that it's the same
    for all samples in a pixel; they have no requirement that it be the
    one closest to center. (Jason Ekstrand)
    For that we made the following changes:
    1- We removed the custom structs and functions from anv_private.h
       and anv_sample_locations.h and anv_sample_locations.c (the last
       two files were removed). (Jason Ekstrand)
    2- We modified the macros used to take also the array as parameter
       and we renamed them to start by GEN_. (Jason Ekstrand)
    3- We don't sort the samples anymore. (Jason Ekstrand)

v3 (Jason Ekstrand):
    Break the refactoring out into multiple commits

v4: Merge dynamic/non-dynamic changes into a single commit (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Lionel Landwerlin
43acc10bd0 intel/common: store sample position in plain arrays
Allows to extract the values in different ways than just the genxml
format.

v2 (Jason Ekstrand):
 - Add a struct gen_sample_location so that we can re-use the array
   macros from the earlier commit.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Eleni Maria Stea
cb082d8260 anv/state: Take explicit sample locations in emit helpers
This commit adds a "locations" parameter to emit_multisample and
emit_sample_pattern which, if provided, will override the default
sample locations.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Jason Ekstrand
a02891fdfd anv: Break SAMPLE_PATTERN and MULTISAMPLE emit into helpers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Eleni Maria Stea
983cebb5d2 anv: Implement physical device properties for VK_EXT_sample_locations
The VkPhysicalDeviceSampleLocationPropertiesEXT struct is filled with
implementation dependent values and according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname | max | min
pname:sampleLocationSampleCounts   |-            |ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize    |-            |(1, 1)
pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
pname:sampleLocationSubPixelBits   |-            |4
pname:variableSampleLocations      | true        |implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to true because we can set sample
locations per draw.

Implement the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
    2- Used the isl_device_sample_count to take the number of samples
    per platform to avoid extra checks. (Sagar Ghuge)

v3: 1- Replaced VK_FALSE with false as Jason has sent a patch to replace
    VK_FALSE with false in other places. (Jason Ekstrand)
    2- Removed unecessary defines and set the grid size to 1 (Jason Ekstrand)

v4: Fix properties reporting in GetPhysicalDeviceProperties2, not
    GetPhysicalDeviceFeatures2 (Lionel)
    Use same alignment as other functions (Lionel)
    Report variableSampleLocations=true (Lionel)

v5: Don't overwrite the pNext in GetPhysicalDeviceMultisamplerPropertiesEXT

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Eleni Maria Stea
ecd8477e93 anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
2021-01-27 23:25:27 +00:00
Caio Marcelo de Oliveira Filho
804c90e256 anv: Implement VK_KHR_workgroup_memory_explicit_layout
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8699>
2021-01-27 22:20:53 +00:00
Kenneth Graunke
a710145b5b intel: Produce a "constrained" output from gen_get_urb_config()
When calculating a URB configuration, we start with a notion of how
much space each stage /wants/ (to achieve the maximum amount of
concurrency), but sometimes fall back to giving it less than that,
because we don't have enough space.  (Typically, this happens when
the per-stage size is large, or there are many stages, or both.)

We now output a "constrained" boolean which is true if we weren't
able to satisfy all the "wants" due to a lack of space.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>
2021-01-27 18:30:54 +00:00
Marcin Ślusarz
2fc5411e5e intel/perf: export information about units of performance counters
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8580>
2021-01-27 11:38:06 +00:00
Abhishek Kumar
26c9574bdb intel: change urb max shader geometry for KBL GT1
Below Deqp CTS failure is seen on KBL GT1(tested on 0x5906) only ,
GT2 all test passes, changing the max shader geometry to 256
(previous 640) fixes all failure tests.Similar issues on
CML GT1 (Gen9) is fixed
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8550

dEQP-GLES31.functional.geometry_shading.layered.
	 render_with_default_layer_cubemap
	 render_with_default_layer_3d
	 render_with_default_layer_2d_array

Signed-off-by: Abhishek Kumar <abhishek4.kumar@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8731>
2021-01-27 09:46:44 +00:00
Caio Marcelo de Oliveira Filho
9f3d5e99ea compiler: Use util/bitset.h for system_values_read
It is currently a bitset on top of a uint64_t but there are already
more than 64 values.  Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>
2021-01-26 20:20:47 +00:00
Sagar Ghuge
001722b3a3 anv: Skip CCS ambiguate which preceed fast-clears
We can skip CCS ambiguate if followed by a fast clear within render
pass.

v2: (Jason)
- Check array layer as well since we only fast clear first layer and
  first LOD.
- Don't drop fast clear check while doing resolve operation.

Fixes: d5849bc840 "anv: Skip HiZ and CCS ambiguates which preceed fast-clears"
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6988>
2021-01-26 00:08:59 +00:00
Francisco Jerez
e2c5ef6cd6 intel/gen12: Fix memory corruption issues in fused Gen12 parts.
According to the BSpec page for MEDIA_VFE_STATE, on Gen12 platforms
"if a fused configuration has fewer threads than the native POR
configuration, the scratch space allocation is based on the number of
threads in the base native POR configuration".  However we currently
use the subslice count from devinfo->num_subslices[0], which only
includes the subslices currently enabled by the platform fusing.  This
leads to scratch space underallocation and occasional hangs.

The problem is likely to affect most Gen12 GPUs with less than 96 EUs.
GFXBench5 Aztec Ruins is able to reproduce the issue fairly reliably.

Fixes: 9e5ce30da7 "intel: fix the gen 12 compute shader scratch IDs"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8636>
2021-01-26 00:01:27 +00:00
Sagar Ghuge
dab229ef69 anv: Invalidate the correct AUX-TT entry
While invalidating the AUX-TT entries, we have to consider the surface
offset as well otherwise, we will end up invalidating another surface's
CCS portion.

For eg. when we have HiZ+CCS and STC_CCS enabled, both will use the CCS
portion allocated at the end of BO. While invalidating the CCS portion
of stencil buffer, we will end up invalidating the CCS portion that
belongs to the depth main surface and vice-versa, if the surface offset
is not considered.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4123
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8677>
2021-01-25 22:37:39 +00:00
Lionel Landwerlin
998f38bd99 anv: fix invalid programming of BLEND_STATE
We can't enable Logic Op & Color Buffer Blend. The Vulkan spec seems
to say Logic Op discards blending.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3767
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8691>
2021-01-25 22:03:58 +00:00
Connor Abbott
5c41a416c1 anv: Use sized types for nir_tex_instr::dest_type
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:42 +01:00
Connor Abbott
fe45fefe57 intel/blorp: Use sized types for nir_tex_instr::dest_type
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:42 +01:00
Connor Abbott
68969cbcb7 brw/vec4: Don't convert tex dest type to glsl_type
We were using nir_tex_instr::dest_type to a glsl_type, then passing it
to emit_texture(), only to just check the number of components. Just
pass the number of components directly. This lets us delete
brw_glsl_base_type_for_nir_type, which was asserting with
nir_texop_all_samples_equal because it didn't handle bool32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:42 +01:00
Lionel Landwerlin
65f7b93435 intel: silence unused var warnings in release builds
v2: Use ASSERTED

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4162
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8681>
2021-01-25 09:04:32 +00:00
Jason Ekstrand
cca257d596 anv: Advertise shaderInt64 on Gen11+
On Gen11, they took away our hardware int64 support.  We have lowering
for all of it in NIR except for subgroup ops.  Now that all the subgroup
ops are implemented, we can enable the feature.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
8c2543d037 intel/fs: Implement umin/umax shuffle
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
a6500236e3 intel/fs: Refactor our shuffle emit code
This adds an emit_scan_step helper which gives us a place to do
something a bit more interesting than emitting a single op.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
44571c6a68 intel/fs: Properly lower 64-bit MUL on 64-bit-incapable platforms
There are two problems this commit solves:  First, is that the 64x64 MUL
lowering generates a Q MOV which, because of how late it runs in the
compile pipeline, it never gets removed.  Second, it generates 32x32
MULs and we have to run it a second time to lower those.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
c80db6611a intel/fs: Support 64-bit CLUSTER_BROADCAST on Gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
b90921ec0c intel/fs: Support 64-bit SHUFFLE on Gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
cdedc82329 intel/fs: Support 64-bit SEL_EXEC on Gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
58bcb5401d intel/fs: QUAD_SWIZZLE requires packed data
We could probably support some strides if we tried hard enough but the
whole point of this opcode is to accelerate things with crazy Align16 or
crazy regions.  It's ok if we have to emit an extra MOV to get a packed
source.

Fixes: 8b4a5e641b "intel/fs: Add support for subgroup quad operations"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
69a3559efd intel/reg,fs: Handle immediates properly in subscript()
Just returning the original type isn't what we want in basically any
case.  Mask and shift the immediate as needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
e797daba53 intel/compiler: Move brw_reg_type_for_bit_size to brw_reg_type.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
4c8cbe9b13 intel/compiler: Return 1 for immediates in regs_read
Previously, we were returning 2 whenever the source was a Q type.  As
far as I can tell, the only reason why this hasn't blown up before is
that it was only ever used for VGRFs until the SWSB pass landed which
uses it for everything.  This wasn't a problem because Q types generally
aren't a thing on TGL.  However, they are for a small handful of
instructions.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
0e1447eb1b anv: Early-exit from cmd_buffer_flush_state
If we don't have any dynamic state, pipeline, or descriptor changes,
we can do a very quick early-exit instead of checking for a bunch of
stuff bit-by-bit.

Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8594>
2021-01-22 17:21:11 +00:00
Jason Ekstrand
18fc1dfea3 anv: Only flush descriptors used by the pipeline
Previously, if we had a pipeline transition from something which used,
say, tessellation to something which didn't and we ended up with
tessellation descriptors dirty, we could end up re-emitting far more
than necessary.  With this commit, we mask off unused stages so we only
update when necessary.

Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8594>
2021-01-22 17:21:11 +00:00
Jason Ekstrand
72c7a68c2b anv: Take the set of stages to flush in flush_descriptor_sets
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8594>
2021-01-22 17:21:11 +00:00
Jason Ekstrand
16a81cabb5 anv: Exit early from cmd_buffer_apply_pipe_flushes
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8594>
2021-01-22 17:21:11 +00:00