Commit graph

7207 commits

Author SHA1 Message Date
Ian Romanick
2c51e96672 intel/fs: Fix gl_FrontFacing optimization on Gfx12+
It's not obvious why the (gl_FrontFacing ? -1.0 : 1.0) case was handled
different for Gfx12+ than for previous generations, and it's not
correct.  It tries to negate the result as an integer, and it does this
before the mask operation that clears the other bits in the value.

When we eventually support dual-SIMD8 dispatch, the other front-facing
bit is in g1.6 at bit 15, so similar code should be possible there.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: c92fb60007 ("intel/fs/gen12: Implement gl_FrontFacing on gen12+.")
Closes: #5876
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14625>
(cherry picked from commit 945fb51fb5)
2022-01-26 18:28:30 +00:00
Lionel Landwerlin
f7a52a16cf anv: fix missing descriptor copy of bufferview/surfacestate content
When doing copies of descriptors from one set to another, that contain
either a UNIFORM_BUFFER or STORAGE_BUFFER, both the buffer view &
surface state are allocated from the source descriptor. Therefore we
need to copy their content otherwise we could run into lifecycle
issues when the source descriptor is destroyed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14585>
(cherry picked from commit acebea9cf1)
2022-01-26 18:28:30 +00:00
Lionel Landwerlin
b4fb2974de intel/fs: disable VRS when omask is written
As indicated by
VkPhysicalDeviceFragmentShadingRatePropertiesKHR::fragmentShadingRateWithShaderSampleMask
our implementation will clamp to 1x1 when reading samplemask or
writing to samplemask.

This fixes vkd3d-proton tests test_sample_mask_dxbc & test_sample_mask_dxil

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b6332fc4a8 ("intel/compiler: handle coarse pixel in render target writes descriptors")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14553>
(cherry picked from commit 30a8b8d2df)
2022-01-26 18:28:29 +00:00
Lionel Landwerlin
33461292fb intel/dev: fixup chv workaround
We're using the wrong helper to get the subslice total count.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c24ba6cecb ("intel/dev: Handle CHV CS thread weirdness in get_device_info_from_fd")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14492>
(cherry picked from commit d6c0d16791)
2022-01-12 19:54:27 +00:00
Lionel Landwerlin
d987419d8a anv: limit compiler valid color outputs using NIR variables
This fixes a test from the vkd3d-proton test_dual_source_blending_dxbc
test which asserts in the backend with :

   brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend || key->nr_color_regions == 1' failed.

This is because there is 2 color attachments provided by the
renderpass so we initially set nr_color_regions = 2. But once we've
parsed the shader, we can see it's only using one output (with dual
source color blending).

This change looks at the output variables to update the valid output
variables.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417>
(cherry picked from commit 07bc6b7ed9)
2022-01-12 19:54:26 +00:00
Lionel Landwerlin
886b86f601 anv: don't leave anv_batch fields undefined
Because the extend_cb vfunc is not initialized, there is a risk that
the emission code calls into a random pointer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14418>
(cherry picked from commit 1d40d53e03)
2022-01-12 19:54:26 +00:00
Rohan Garg
7241ec2ee5 intel/fs: OpImageQueryLod does not support arrayed images as an operand
When we lower SPIR-V to NIR for textures in vtn_handle_texture, we only
bump the number of coordinate components when the op is not a lod query.
Update the assert to take this into account.

This fixes:
  - dEQP-VK.robustness.robustness2.bind.template.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
  - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag

Fixes: 231337a1 ("intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.")
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13925>
(cherry picked from commit af13119993)
2022-01-12 19:54:25 +00:00
Henry Goffin
8be9711422 intel/compiler/test: Fix build with GCC 7
Without this change, test_fs_scoreboard.cpp does not compile on GCC 7
due to the use of C99 initializers in a C++ source file.

Fixes: c847bfaaf5 ("intel/fs/gen12: Add tests for scoreboard pass")

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14349>
(cherry picked from commit fe617bcca0)
2022-01-12 19:54:24 +00:00
Dave Airlie
5f32bdee91 intel/genxml/gen4-5: fix more Raster Operation in BLT to be a uint
This has been partly fixed twice before, but looks like some got missed.

Fixes arb_copy_image on gen4 with crocus

Fixes: de625dddee ("intel/genxml: fix raster operation field in blt genxml")

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14345>
(cherry picked from commit a2293e33fd)
2022-01-12 19:54:23 +00:00
Jason Ekstrand
4e143b9c0b Revert "anv: Stop doing too much per-sample shading"
This reverts commit 1f559930b6.  Turns
out, this approach won't work.

Fixes: 1f559930b6 ("anv: Stop doing too much per-sample shading")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14196>
(cherry picked from commit b05d228695)
2022-01-01 18:16:22 +00:00
Francisco Jerez
03f6edd5b7 intel/fs: Add physical fall-through CFG edge for unconditional BREAK instruction.
This adds a missing CFG edge that represents a possible physical
control flow path the EU might take under some conditions which isn't
part of the logical CFG of the program.  This possibility shouldn't
have led to problems on platforms prior to Gfx12, since the missing
control flow edge cannot possibly influence liveness intervals.
However on Gfx12+ it becomes the compiler's responsibility to resolve
data dependencies across instructions, and the missing physical
control flow paths may lead to a WaR data hazard currently not visible
to the software scoreboard pass, which could lead to data corruption.

Worse, the possibility for this path to be taken by the EU increases
on Gfx12+ due to a hardware bug affecting EU fusion -- However the
same physical path can be potentially taken on earlier platforms as
well, so this patch extends the CFG on all platforms for consistency,
even though the lack of this edge shouldn't lead to any functional
issues on platforms earlier than Gfx12.  There are no shader-db
changes on earlier platforms, so there seems to be no disadvantage
from using the same CFG representation as on later platforms.

This issue has ben reported on TGL with the following conformance
test, thanks to Ian for bringing the FULSIM dependency check warning
to my attention:

   dEQP-VK.graphicsfuzz.spv-stable-pillars-volatile-nontemporal-store

Fixes: 4d1959e693 ("intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4940
Reported-by: Tapani Pälli <tapani.palli@intel.com>
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14248>
(cherry picked from commit e7470a40c5)
2021-12-29 20:56:29 +00:00
Ian Romanick
fd54eeb897 intel/stub: Silence "initialized field overwritten" warning
src/intel/tools/intel_noop_drm_shim.c:459:36: warning: initialized field overwritten [-Woverride-init]
  459 |    [DRM_I915_GEM_EXECBUFFER2_WR] = i915_ioctl_noop,
      |                                    ^~~~~~~~~~~~~~~

Fixes: 0f4f1d70bf ("intel: add stub_gpu tool")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14218>
(cherry picked from commit 2dc7c24b80)
2021-12-17 22:30:50 +00:00
Jason Ekstrand
29fcb94b9e anv: Stop doing too much per-sample shading
We were setting anv_pipeline::sample_shading_enable based on
sampleShadingEnable without looking at minSampleShading.  We would then
pass this value into nir_lower_wpos_center which would add sample_pos to
frag_coord.  Then the back-end compiler picks up on the existence of
sample_pos and forces persample dispatch.  This leads to doing
per-sample dispatch whenever sampleShadingEnable = VK_TRUE regardless of
the value of minSampleShading.  This is almost certainly costing us
perf somewhere.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14022>
(cherry picked from commit 1f559930b6)
2021-12-17 22:30:50 +00:00
Ian Romanick
55d80bc20a intel/compiler: Don't predicate a WHILE if there is a CONT
Previously a predicated BREAK that appeared immediately before the WHILE
would get merged into the WHILE.  This doesn't work if other flow
control (e.g., a CONT) can transfer directly to the WHILE.

On Intel platforms, this fixes the CTS test
dEQP-VK.graphicsfuzz.stable-binarysearch-tree-nested-if-and-conditional.

No shader-db changes on any Intel platform.

When this commit was first created (over a month before it is going to
land), there were some regressions that were prevented by other commits
in MR !13095.  That does not appear to be the case now, so I don't know
what changed.  Basically, the treatment of discard as a combination of
demote and terminate causes additional continues in some loops, and
those continues trigger this bug.  The other commits from that MR
prevent those continues from being generated in the first place.

All Intel platforms had simlar fossil-db results. (Ice Lake shown)
Instructions in all programs: 144419989 -> 144419995 (+0.0%)
SENDs in all programs: 6947332 -> 6947332 (+0.0%)
Loops in all programs: 38277 -> 38277 (+0.0%)
Spills in all programs: 204075 -> 204075 (+0.0%)
Fills in all programs: 319480 -> 319480 (+0.0%)

A few shaders in Doom 2016 were hurt by one instruction each.  It seems
likely that these shaders would have experienced at least some
mis-rendering.

Closes: #4213
Fixes: d13bcdb3a9 ("i965/fs: Extend predicated break pass to predicate WHILE.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14128>
(cherry picked from commit 4563261ad1)
2021-12-17 22:30:49 +00:00
Francisco Jerez
d1f2820154 intel/fs/xehp: Teach SWSB pass about the exec pipeline of FS_OPCODE_PACK_HALF_2x16_SPLIT.
This virtual instruction is translated into multiple half float
physical instructions, even though its destination is typically of
integer type, which prevents the software scoreboard pass from
inferring the correct execution pipeline for the virtual instruction
on XeHP+ platforms.  Teach the SWSB lowering pass about this
inconsistency between the IR and physical instruction types.

Fixes among other tests:

dEQP-GLES31.functional.shaders.builtin_functions.pack_unpack.packhalf2x16_compute

Fixes: d4537770bb ("intel/fs: Add helper functions inferring sync and exec pipeline of an instruction.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5685
Reported-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14002>
(cherry picked from commit de55fd358f)
2021-12-17 22:30:49 +00:00
Lionel Landwerlin
b0ce6f5f58 intel/nir: preserve access value when duping intrinsic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6339aba775 ("intel/compiler: Lower SSBO and shared loads/stores in NIR")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>
(cherry picked from commit 7661237a31)
2021-12-17 22:30:48 +00:00
Tapani Pälli
0025ef9880 anv: allow VK_IMAGE_LAYOUT_UNDEFINED as final layout
From VK_KHR_synchronization2:
   "Image memory barriers that do not perform an image layout
    transition can be specified by setting oldLayout equal to
    newLayout.

    E.g. the old and new layout can both be set to
    VK_IMAGE_LAYOUT_UNDEFINED, without discarding data in the
    image."

v2: make assert more readable (Lionel Landwerlin)

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14008>
(cherry picked from commit d44d2e823f)
2021-12-17 22:30:48 +00:00
Lionel Landwerlin
73f5d5053e intel/fs: fix shader call lowering pass
Now that we removed the intel intrinsic and just use the generic one,
we can skip it in the intel call lowering pass and just deal with it
in the intel rt intrinsic lowering.

v2: rewrite with nir_shader_instructions_pass() (Jason)

v3: handle everything in switch (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 423c47de99 ("nir: drop the btd_resume_intel intrinsic")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12113>
(cherry picked from commit c5a42e4010)
2021-12-01 18:55:47 +00:00
Lionel Landwerlin
f5c31a44a8 anv: don't try to close fd = -1
CID: 1464334

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13879>
(cherry picked from commit 04bd5bb69b)
2021-12-01 18:55:46 +00:00
Iván Briano
5d130ac3ff intel/nir: also allow unknown format for getting the size of a storage image
Fixes: fa251cf111 ("intel/nir: allow unknown format in lowering of storage images")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13847>
(cherry picked from commit 0388783a03)
2021-12-01 18:55:46 +00:00
Kenneth Graunke
a6c713f8c5 intel/genxml: Fix MI_FLUSH_DW to actually specify the length properly
Fixes: 569afd37f1 ("intel/genxml: Copy gen12.xml to gen125.xml")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
(cherry picked from commit 29025f66fd)
2021-11-17 20:06:22 +00:00
Lionel Landwerlin
5380104201 anv: fix multiple wait/signal on same binary semaphore
We need to guarantee that when vkQueueSubmit() returns the application
can actually wait on a signaled semaphore/syncobj.

When using a thread to do the submission to i915, this gets a bit
tricky in the following case :

   A syncobj is used both as a wait & signal semaphore and has been
   signaled once already. It contains a fence before entering
   vkQueueSubmit().

   This means we need to reset the syncobj to ensure when we return
   from vkQueueSubmit(), the syncobj contains no stale fence.

   Currently in the Anv, the submission thread is in charge of putting
   the new fence in the syncobj and also picks up the wait fence
   directly from the syncobj. This means we can't reset the syncobj
   from vkQueueSubmit().

The solution to this has been pointed by Bas & Jason :

   In vkQueueSubmit(), clone the wait syncobj fence into a new
   temporary syncobj that will be destroy after submission and use
   this temporary syncobj as a wait fence for i915. This allows us to
   reset the original syncobj in vkQueueSubmit().

   For this to work with wait_before_signal behavior, we also need to
   do a wait-on-materialize on binary semaphores from vkQueueSubmit().
   Otherwise the application thread calling vkQueueSubmit() could race
   the submission thread and pick up the wrong fence when cloing.

v2: Use copy semantic for clone_syncobj_dma_fence() (Jason)
    Do the cloning prior to adding the syncobj to anv_queue_submit so
    that if the cloning fails don't have an invalid syncobj in
    anv_queue_submit (Jason)

v3: Fix another syncobj leak (Jason)

v4: Fix invalid argument order (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4945
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11474>
(cherry picked from commit d2ff2b9e4a)
2021-11-17 20:06:20 +00:00
Lionel Landwerlin
a4ba277451 anv: don't forget to add scratch buffer to BO list
We reference the scratch BO using a bindless index in the command
streamer instructions, but we forgot to add them to the BO list.

v2: Make use of pipeline reloc list (Jason)

v3: Don't add NULL BOs to the reloc list (Lionel)

v4: Don't add BOs twice to reloc list when dealing with addresses
    (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eeeea5cb87 ("anv: Add support for scratch on XeHP")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13544>
(cherry picked from commit 46c37c8600)
2021-11-10 21:58:07 +00:00
Jason Ekstrand
6108b3c7f2 anv: Also disallow CCS_E for multi-LOD images
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4616
Fixes: e3101c96bb ("anv/image: Disable multi-layer CCS_E on TGL+")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13680>
(cherry picked from commit e614789588)
2021-11-10 21:58:06 +00:00
Jason Ekstrand
6bbf2110db anv: Fix FlushMappedMemoryRanges for odd mmap offsets
When the client calls vkMapMemory(), we have to align the requested
offset down to the nearest page or else the map will fail.  On platforms
where we have DRM_IOCTL_I915_GEM_MMAP_OFFSET, we always map the whole
buffer.  In either case, the original map may start before the requested
offset and we need to take that into account when we clflush.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
(cherry picked from commit 90ac06e502)
2021-11-10 21:58:05 +00:00
Lionel Landwerlin
03ee3a9dd0 intel/devinfo: fix wrong offset computation
A bit difficult to find what commit introduced the issue because of
all the renaming, but it was my bug :)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit 349bfb7275)
2021-11-10 21:58:04 +00:00
Lionel Landwerlin
198f6463ee intel/perf: fix perf equation subslice mask generation for gfx12+
v2: Fix comment change (Marcin)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit 67619d8153)
2021-11-10 21:58:04 +00:00
Lionel Landwerlin
f4fe896423 intel/dev: fix subslice/eu total computations with some fused configurations
When a device has its first slice/subslice fused off, we can't use the
number of slices/subslices to iterate the mask array.

v2: Fix spelling (Marcin)
    Use size_t for iterator (Marcin)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Matt Roper <matthew.d.roper@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5601
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit a543a94404)
2021-11-10 21:58:03 +00:00
Lionel Landwerlin
8b4c231932 intel/dev: reuse internal functions to set mask
Rather than having 2 paths to set the slice/subslice/eu masks, reuse
the other internal functions. This simplifies finding bugs within this
code :

  * If we have i915 query topology support, update_from_topology() is
    called.

  * If we don't have query topology support but we have getparam for
    slice/subslice/EU, we generate a topology data and call
    update_from_topology()

  * If we have no kernel support to query any kind of topology, we
    generate the values return by the kernel for slice/subslice/EU and
    call update_from_masks() which in turns calls
    update_from_topology()

v2: Fixup typo (Adam)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit e10c641f00)
2021-11-10 21:58:03 +00:00
Lionel Landwerlin
546a870459 intel/dev: don't forget to set max_eu_per_subslice in generated topology
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit d7c6a90c26)
2021-11-10 21:58:02 +00:00
Lionel Landwerlin
84140c8792 intel/dev: fix HSW GT3 number of subslices in slice1
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit d1db5d562a)
2021-11-10 21:58:02 +00:00
Vadym Shovkoplias
bca13e8fc8 intel/fs: Fix a cmod prop bug when cmod is set to inst that doesn't support it
Fixes dEQP-VK.reconvergence.*nesting* tests.

There are cases when cmod is set to an instruction that cannot have
conditional modifier. E.g. following:

 find_live_channel(32) vgrf166:UD,  NoMask
 cmp.z.f0.0(32) null:D, vgrf166+0.0<0>:D, 0d

is optimized to:

 find_live_channel.z.f0.0(32) vgrf166:UD,  NoMask

v2:
 - Add unit test to check cmod is not set to 'find_live_channel' (Matt Turner)
 - Update flag_subreg when conditonal_mod is updated (Ian Romanick)

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5431
Fixes: 32b7ba66b0 ("intel/compiler: fix cmod propagation optimisations")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13268>
(cherry picked from commit 2dbb66997e)
2021-11-03 20:15:49 +00:00
Lionel Landwerlin
a82babccd1 anv: fix push constant lowering with bindless shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9fa1cdfe7f ("intel/rt: Implement push constants as global memory reads")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13529>
(cherry picked from commit a6031cd9bd)
2021-10-27 19:58:10 +01:00
Marcin Ślusarz
2d6c11843d intel: fix INTEL_DEBUG environment variable on 32-bit systems
INTEL_DEBUG is defined (since 4015e1876a) as:

 #define INTEL_DEBUG __builtin_expect(intel_debug, 0)

which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.

Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.

Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
    perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
    perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
(cherry picked from commit d05f7b4a2c)
2021-10-20 20:40:58 +01:00
Vinson Lee
d19e28c139 anv: Fix assertion.
Fix defect reported by Coverity Scan.

Assign instead of compare (PW.ASSIGN_WHERE_COMPARE_MEANT)
assign_where_compare_meant: use of "=" where "==" may have been intended

Fixes: 35315c68a5 ("anv: Use the common wrapper for GetPhysicalDeviceFormatProperties")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13395>
(cherry picked from commit 9eb010ee1e)
2021-10-20 20:40:58 +01:00
Clayton Craft
3f61f84fe3 anv: don't advertise vk conformance on GPUs that aren't conformant
This sets the conformance version to 0.0.0.0 for GPUs that have
incomplete support for vulkan, so that it's easier to check if vulkan is
fully supported by a GPU at runtime for applications/libraries.

    $ vulkaninfo|grep conf
    MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
        conformanceVersion = 0.0.0.0

Signed-off-by: Clayton Craft <clayton@craftyguy.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13275>
(cherry picked from commit b2ef7e6d6b)
2021-10-20 20:40:57 +01:00
Caio Marcelo de Oliveira Filho
94e07058ee intel/compiler: Remove unused ret declaration
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13340>
2021-10-13 17:24:29 +00:00
Caio Marcelo de Oliveira Filho
bd2cc4b916 intel/compiler: Convert test_eu_compact to use gtest
Be consistent with the other test suites in intel/compiler.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13340>
2021-10-13 17:24:29 +00:00
Lionel Landwerlin
9fb2c84768 isl: only bump the min row pitch for display when not specified
If the ISL caller didn't specify a row_pitch_B, let's use the
NVIDIA/AMD requirements. Otherwise keep using the Intel requirement,
as the caller is likely trying to import a buffer and if we can deal
with that row_pitch_B, we should accept it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a3a4517f41 ("isl: Work around NVIDIA and AMD display pitch requirements")
Reported-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13024>
2021-10-13 14:46:49 +00:00
Lionel Landwerlin
47ff6767ea anv: fill correct surface state for lowered storage image
Small typo/copy-paste.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c0093c4668 ("anv: Flip around the way we reason about storage image lowering")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13332>
2021-10-13 14:33:14 +00:00
Tapani Pälli
d729038c07 anv: use vk_object_zalloc for wsi fences created
Otherwise we hit assert in vk_object_base_assert_valid when attemping to
create handle from anv_fence with unknown base type.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13330>
2021-10-13 11:59:17 +00:00
Tapani Pälli
840c79fc9b anv/android: fix parameters given for vk_common_QueueSubmit
Common queue submit expects pWaitDstStageMask to be set per each
semaphore (as per Vulkan spec) and crashes if these are not given
properly.

This fixes crashes seen when running vulkan apps on Android.

v2: change the VkPipelineStageFlags given (Lionel)

Fixes: b996fa8efa ("anv: implement VK_KHR_synchronization2")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13305>
2021-10-13 06:00:56 +00:00
Felix DeGrood
5bf6987873 anv: dirty only state impacted by blorp_exec
Instead of dirtying all state after blorp operations,
avoid dirtying state that blorp never touches.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5077
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12567>
2021-10-13 04:31:34 +00:00
Jason Ekstrand
a64d90026b anv: Use the common WSI wrappers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13234>
2021-10-13 00:06:15 +00:00
Jason Ekstrand
916c9335b4 meson: Add and use an idep for Vulkan WSI
Acked-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13234>
2021-10-13 00:06:15 +00:00
Nanley Chery
10be870c72 anv: Tile cache flush for depth before fast clear
Instead of doing a tile cache flush after slow clears, resolves, and
ambiguates, do it before fast clears of HIZ_CCS_WT surfaces. This agrees
with the Bspec.

Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11539>
2021-10-12 18:05:46 +00:00
Nanley Chery
81e9c25c1b anv: Allow HIZ_CCS_WT with subpass self-dependencies
This unblocks later commits that aim to align the driver with the tile
cache flushing requirements in the Bspec.

Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11539>
2021-10-12 18:05:46 +00:00
Lionel Landwerlin
ce3dd1375f anv: implement VK_KHR_format_feature_flags2
v2: fix SAMPLED_IMAGE_DEPTH_COMPARISON_BIT_KHR (Ivan)

v3: Fixup VK_FORMAT_FEATURE_2_STORAGE_IMAGE_BIT_KHR setting (Ivan)
    Add missing drm-modifiers/android bits (Lionel)

v4: Avoid duplicating get_ahw_buffer_format_properties() (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>
2021-10-11 10:29:12 -05:00
Lionel Landwerlin
01d1ec292a anv: start computing KHR_format_features2 flags for storage images
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>
2021-10-11 10:29:12 -05:00
Jason Ekstrand
c0093c4668 anv: Flip around the way we reason about storage image lowering
There are roughly two cases when it comes to storage images.  In the
easy case, we have full hardware support and we can just emit a typed
read/write message in the shader and we're done.  In the more complex
cases, we may need to fall back to a typed read with a different format
or even to a raw (SSBO) read.

The hardware has always had basically full support for typed writes all
the way back to Ivy Bridge but typed reads have been harder to come by.
Starting with Skylake, we finally have enough that we at least have a
format of the right bit size but not necessarily the right format so we
can use a typed read but may still have to do an int->unorm or similar
cast in the shader.

Previously, in ANV, we treated lowered images as the default and write-
only as a special case that we can optimize.  This flips everything
around and treats the cases where we don't need to do any lowering as
the default "vanilla" case and treats the lowered case as special.
Importantly, this means that read-write access to surfaces where the
native format handles typed writes now use the same surface state as
write-only access and the only thing that uses the lowered surface state
is access read-write access with a format that doesn't support typed
reads.  This has the added benefit that now, if someone does a read
without specifying a format, we can default to the vanilla surface and
it will work as long as it's a format that supports typed reads.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>
2021-10-11 10:29:09 -05:00