Commit graph

5224 commits

Author SHA1 Message Date
D Scott Phillips
4724bad429 anv/gen11+: Disable object level preemption
An unknown issue is causing vs push constants to become corrupted
during object-level preemption. For now, restrict to command
buffer level preemption to avoid rendering corruption.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5110>
(cherry picked from commit 81201e4617)
2020-05-28 11:16:10 -07:00
Jason Ekstrand
54973a393b anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+
If this gets run right after something which uses
VK_VERTEX_INPUT_RATE_INSTANCE on its first vertex binding, we could end
up in serious trouble.

Fixes: 3d9747780b "anv: Add a helper for doing buffer copies with..."

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5090>
(cherry picked from commit 164aed6c81)
2020-05-28 11:16:08 -07:00
Ian Romanick
fcc8debd5a anv/tests: Don't rely on assert or changing NDEBUG in tests
This is the last part of the fix for #2903.

v2: Add test_common.h.

Fixes: f7c56475d2 ("anv/tests: compile to something sensible in release builds")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4994>
(cherry picked from commit f4638cfdad)
2020-05-14 10:20:01 -07:00
Danylo Piliaiev
3705ec33a5 anv: Fix deadlock in anv_timelines_wait
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2945
Fixes: 34f32a6d66
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5005>
(cherry picked from commit 06b6c687e2)
2020-05-14 10:20:00 -07:00
Danylo Piliaiev
6b4950f2d3 anv: Translate relative timeout to absolute when calling anv_timelines_wait
Fixes: 34f32a6d66
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5025>
(cherry picked from commit 15dd7933bc)
2020-05-14 10:19:59 -07:00
Lionel Landwerlin
299d0f9a81 anv: don't expose VK_INTEL_performance_query without kernel support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
Acked-by: Timothy Strelchun <timothy.strelchun@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4937>
(cherry picked from commit 4f17e9eef6)
2020-05-08 10:29:06 -07:00
Lionel Landwerlin
071ba3898a intel/perf: store the probed i915-perf version
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit aad0e6f810)
2020-05-08 10:29:05 -07:00
D Scott Phillips
7b7a921c1c anv,iris: Fix input vertex max for tcs on gen12
gen12 does away with the single patch dispatch mode for tcs, and
increases some limits so that 8_patch mode can always work. Make the
necessary changes so we don't try to fall back to single patch mode.

Fixes KHR-GL46.tessellation_shader.single.max_patch_vertices and others

Fixes: 44754279ac ("intel/fs/gen12: Use TCS 8_PATCH mode.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4843>
(cherry picked from commit 65b05ebdda)
2020-05-04 10:21:04 -07:00
D Scott Phillips
2ddc07ee08 intel/fs: Update location of Render Target Array Index for gen12
Render Target Array Index has moved from R0.0[26:16] to
R1.1[26:16] on gen12.

Fixes dEQP-VK.multiview.input_attachments.*

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4836>
(cherry picked from commit 7bd15135a6)
2020-05-04 10:21:03 -07:00
Jason Ekstrand
90934659dc intel/fs: Don't delete coalesced MOVs if they have a cmod
Shader-db results on ICL:

    total instructions in shared programs: 17133088 -> 17133287 (<.01%)
    instructions in affected programs: 61300 -> 61499 (0.32%)
    helped: 0
    HURT: 199

This means it's likely fixing 199 bugs. :-)  All the changed shaders are
in Mad Max.  It's surprisingly difficult to get the back-end compiler to
generate a pattern that hits this we don't tend to emit a lot coalescable
MOVs.  The pattern in Mad Max that's able to hit is fsign(fsat(x)) under
the right conditions.

Closes: #2820
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4773>
(cherry picked from commit e581ddeeee)
2020-04-29 16:12:07 -07:00
Jason Ekstrand
4072515d57 anv: Expose CS workgroup sizes based on a maximum of 64 threads
Otherwise, we'll hit asserts in brw_compile_cs.

Fixes: cf12faef61 "intel/compiler: Restrict cs_threads to 64"
Closes: #2835
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4746>
(cherry picked from commit 81ac741f89)
2020-04-28 12:01:49 -07:00
Jason Ekstrand
002f718dfa intel/devinfo: Compute the correct L3$ size for Gen12
Fixes: 8125d7960b "intel/dev: Add preliminary device info for Tigerlake"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4782>
(cherry picked from commit 86f67952d3)
2020-04-28 12:01:48 -07:00
Jason Ekstrand
74e0db6171 anv: Drop an assert
Ever since Vulkan 1.2, this feature has been in core so enabling the
extension is no longer required.

Fixes: 4ef3f7e3d3 "anv: Enable Vulkan 1.2 support"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4694>
(cherry picked from commit 9c009da208)
2020-04-27 10:33:12 -07:00
Jason Ekstrand
ca9452e34c anv: Properly handle all sizes of specialization constants
Closes: #2812
cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
(cherry picked from commit a44e63398b)
2020-04-27 10:33:08 -07:00
Lionel Landwerlin
ea37c93a6b intel/perf: Enable MDAPI queries for Gen12
We're missing the cases for gen12 leading to those metrics going
missing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15b7b56eb2 ("intel/perf: add TGL support")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4586>
(cherry picked from commit 086ea1ac7e)
2020-04-23 09:37:03 -07:00
Lionel Landwerlin
96750187c1 intel/perf: move mdapi query definitions to their own file
Where they belong.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit dde96d31b7)
2020-04-23 09:37:01 -07:00
Lionel Landwerlin
265c4537ab intel/perf: break GL query stuff away
This stuff is somewhat specific to the GL extension & drivers. On
Vulkan we won't use this, it also made a rather large file.

v2: Fix Android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit 33b9c7a7f6)
2020-04-23 09:36:41 -07:00
Lionel Landwerlin
c4b110d33b intel/perf: move register definition to special file
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit f5c5574f42)
2020-04-23 09:36:29 -07:00
Jason Ekstrand
dde2cac42a anv: Apply any needed PIPE_CONTROLs before emitting state
Push constants in particular can get picked up by the hardware at weird
times that happen *before* 3DPRIMITIVE.  Therefore, we need to flush
before we emit all our state to ensure that any data they may pick up is
in memory in time.  This fixes an app which does vkCmdCopyBuffers
immediately followed by a vkCmdBeginRenderPass and vkCmdDraw which uses
the destination of the copy as a UBO which we push.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4601>
(cherry picked from commit 969aeb6a93)
2020-04-22 22:00:45 -07:00
Jason Ekstrand
928580a13d anv: Move vb_emit setup closer to where it's used in flush_state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4601>
(cherry picked from commit ffc84eac0d)
2020-04-22 22:00:40 -07:00
Abhishek Kumar
dd7fdda487 anv/android: fix assert in anv_import_ahw_memory
Commit fixes assert that triggers when running
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.buffer#bind_export_import_bind

on a debug build of Mesa.

Fixes: c79a528d ("anv/android: support import/export of AHardwareBuffer objects")
Signed-off-by: Abhishek Kumar <abhishek4.kumar@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4655>
(cherry picked from commit f06e4ab319)
2020-04-22 15:03:11 -07:00
Jason Ekstrand
4dfb9d9fce anv: Report correct SLM size
Fixes: d787a2d0 "anv: Implement VK_KHR_pipeline_executable_properties"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
(cherry picked from commit b8acf9a3d4)
2020-04-20 10:03:00 -07:00
Jason Ekstrand
e16cb98ce2 intel: Add _const versions of prog_data cast helpers
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
(cherry picked from commit e003104605)
2020-04-20 10:02:56 -07:00
Jason Ekstrand
1cf4f626d0 anv/image: Use align_u64 for image offsets
The ALIGN functions in util/u_math.h work on uintptr_t whose size
changes depending on your platform.  Use ones which take an explicit
64-bit type instead to avoid 32-bit platform issues.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
(cherry picked from commit 5cc27d59a1)
2020-04-09 14:06:48 -07:00
Juan A. Suarez Romero
49a4ba5e05 anv/pipeline: allow more than 16 FS inputs
A fragment shader can have more than 16 inputs, so SBE emission should
deal with all of them.

This fixes dEQP-VK.pipeline.max_varyings.*

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
(cherry picked from commit 191ced539a)
2020-04-09 14:06:44 -07:00
Juan A. Suarez Romero
04067fbe59 intel/compiler: store the FS inputs in WM prog data
Store the fragment shader inputs in the program data so we can use them
later when required without needing the NIR shader.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
(cherry picked from commit 460de2159e)
2020-04-09 14:06:42 -07:00
Mathias Fröhlich
3744a31d6c i965: Move down genX_upload_sbe in profiles.
Avoid looping over all VARYING_SLOT_MAX urb_setup array
entries from genX_upload_sbe. Prepare an array indirection
to the active entries of urb_setup already in the compile
step. On upload only walk the active arrays.

v2: Use uint8_t to store the attribute numbers.
v3: Change loop to build up the array indirection.
v4: Rebase.
v5: Style fix.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
(cherry picked from commit 630154e77b)
2020-04-09 14:06:40 -07:00
Jason Ekstrand
e51b749f1d anv: Account for the header in anv_state_stream_alloc
If we have an allocation that's exactly the block size, we end up
computing a new block size to allocate that's exactly the block size,
add in the header, and then assert fail.  When computing the block size,
we need to account for the header.

Fixes: 955127db93 "anv/allocator: Add support for large stream..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
(cherry picked from commit 63bec07e14)
2020-03-31 12:29:11 +02:00
Francisco Jerez
cbc7ba2e47 intel/fs/gen12: Fix interaction of SWSB dependency combination with EU fusion workaround.
This has been reported to fix a hang in Shadow of Mordor on Gen12.
One of its compute shaders seems to cause an in-order exec_all
dependency to be merged into an out-of-order SET dependency slot,
which would prevent us from baking the SET dependency into the parent
instruction, leading to an assert failure in emit_inst_dependencies()
(Thanks to Rafael for noticing that).  Prevent that by avoiding
combination of in-order dependencies whenever that would cause a SET
dependency to be demoted to a SYNC.NOP instruction.

Fixes: e14529ff32 "intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow."
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 36c155a017)
2020-03-30 12:26:45 +02:00
Lionel Landwerlin
7e7722dca3 isl: drop min row pitch alignment when set by the driver
When the caller of the isl_surf_init() specifies a row pitch, do not
consider the minimum CCS requirement if it's incompatible with the
caller's value.

isl_surf_get_ccs_surf() will check that the main surface alignment
matches CCS expectations.

v2: Simplify checks (Nanley)

v3: Add Comment about isl_surf_get_ccs_surf() (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: a3f6db2c4e ("isl: drop CCS row pitch requirement for linear surfaces")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit 507abc3959)
2020-03-20 00:21:51 +01:00
Lionel Landwerlin
b1cbf7d9fa isl: only apply main surface ccs pitch constraint with CCS
We could be creating a Y-tiled surface that isn't going to use CCS
(this could be the case when clearly indicated through modifiers).
Don't apply the main surface pitch alignment constraint in that case.

v2: Use logical NOT (Sagar)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a3f6db2c4e ("isl: drop CCS row pitch requirement for linear surfaces")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit def3470e9b)
2020-03-20 00:21:50 +01:00
Lionel Landwerlin
18e76206b0 isl: properly filter supported display modifiers on Gen9+
Y tiling is supported for display on Gen9+ so don't filter it from the
possible flags.

v2: Drop Yf from display supported tilings on Gen12+ (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit dab0aadea9)
2020-03-20 00:21:49 +01:00
Lionel Landwerlin
0414dba695 isl: implement linear tiling row pitch requirement for display
We're missing a requirement for alignment of row pitch for the display
HW. In linear tiling, the row pitch must be a 64bytes aligned.

v2: Use correct formula to align to 64bytes (Chad)

v3: Matching {} (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit 157a3cf3ec)
2020-03-20 00:21:49 +01:00
Jason Ekstrand
226ff465b7 anv: Swizzle fast-clear values
Starting with Gen12, we can fast-clear a lot more surface formats and we
are suddenly in the position of having to fast-clear surfaces with
formats with an implicit swizzle such as VK_FORMAT_R4G4B4A4_UNORM_PACK16
which is represented as ISL_FORMAT_A4B4G4R4 with a BGRA swizzle.  In
order for blorp to do the fast-clear color conversion for us, it needs
a properly swizzled color.

This fixes the following Vulkan CTS groups on TGL:

 - dEQP-VK.pipeline.blend.format.b4g4r4a4_unorm_pack16.*
 - dEQP-VK.api.image_clearing.core.clear_color_image.*.b4g4r4a4*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
(cherry picked from commit 46187bb54f)
2020-03-19 09:51:48 -07:00
Jason Ekstrand
7eb4b33a9a intel/blorp: Add support for swizzling fast-clear colors
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
(cherry picked from commit 3fb8f19481)
2020-03-19 09:51:47 -07:00
Jason Ekstrand
c15220de7e anv: Do an end-of-pipe sync before updating AUX table entries
We've found in GL that an actual end-of-pipe sync is required before
invalidating the aux tables and that a simple CS stall is insufficient.
If we're about to modify the actual AUX table entries from the GPU, we
should definitely make sure it's stopped dead before we do so.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
(cherry picked from commit d60375cbc2)
2020-03-18 10:28:45 -07:00
Rafael Antognolli
98cd8c666d anv: Wait for the GPU to be idle before invalidating the aux table.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit 43dc842cb9)
2020-03-18 10:28:24 -07:00
Jason Ekstrand
44e9b6ab62 anv: Do end-of-pipe sync around MCS/CCS ops instead of CS stall
v2: Do end-of-pipe sync after clear depth stencil too (Jason).
v3: Also do end-of-pipe sync before clear depth stencil too (Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit 3ca3050de5)
2020-03-18 10:28:19 -07:00
Jason Ekstrand
8bc42bf9db anv: Use a proper end-of-pipe sync instead of just CS stall
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit 2db471953a)
2020-03-18 10:28:12 -07:00
Jason Ekstrand
5d2f7e96ad anv: Use the PIPE_CONTROL instead of bits for the CS stall W/A
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit ac8d412ba3)
2020-03-18 10:28:05 -07:00
Francisco Jerez
868286d10b intel/fs: Fix workaround for VxH indirect addressing bug under control flow.
The current workaround for this hardware bug involved marking the ADD
instruction used to initialize the address register as NoMask on
Gen12, which was based on the assumption that the problem was caused
by a hardware bug affecting the application of the execution mask to
the address register write.

However that doesn't seem to be the case: The address register write
was working correctly, the real problem leading to hangs on TGL is
that the indirect addressing logic is unable to deal with garbage
values in the address register (e.g. misaligned offsets), even for
channels which are currently inactive due to non-uniform control flow.
The current workaround isn't able to avoid that situation in general,
since the result of the NoMask ADD instruction for a dead channel is
calculated based on the corresponding (dead) component of the
indirect_byte_offset source, which would still be undefined in the
likely case that the source was initialized under control flow itself.

This would lead to hangs whenever MOV_INDIRECT was used under
non-uniform control flow in some scenarios like a tessellation shader
from GFXBench5/gl_4 (AKA Car Chase) on TGL.  In addition I've managed
to reproduce the same issue on earlier platforms by initializing the
whole address register with garbage before the ADD instruction, so
this seems to be a long-standing issue we have avoided mostly by luck.

This patch fixes the problem and applies the workaround to all
platforms, since even when the hardware is able to deal with garbage
address values without hanging there might be a significant
performance cost from reading random GRF registers due to the useless
extra EU cycles spent fetching registers for dead channels and due to
the potential for unintended serialization with respect to other
random instructions that could be executed in parallel, which may have
had a cost of the order of hundreds of cycles in the worst case
scenario.

Fixes: f93dfb509c "intel/fs: Write the address register with NoMask for MOV_INDIRECT"
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 45d4665dc7)
2020-03-10 13:30:15 +01:00
Jason Ekstrand
1e3edbfbd3 anv: Parse VkPhysicalDeviceFeatures2 in CreateDevice
The client may enable robustBufferAccess2 via either
pCreateInfo->pEnabledFeatures or via a chained-in
VkPhysicalDeviceFeatures2 struct.  We need to parse both.

Fixes: 022e5c7e5a "anv: Implement VK_KHR_get_physical_device_properties2"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3777>
(cherry picked from commit 35ca2ad22e)
2020-03-10 13:30:15 +01:00
Jason Ekstrand
18508b37cb isl: Set 3DSTATE_DEPTH_BUFFER::Depth correctly for 3D surfaces
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3717>
(cherry picked from commit 9f5f4269a6)
2020-03-06 16:43:09 -08:00
Rafael Antognolli
3d203789a9 intel/gen12+: Disable mid thread preemption.
Fixes a GPU hang in Car Chase.

Cc: mesa-stable@lists.freedesktop.org

v2: Add comment explaining why (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4035>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4035>
(cherry picked from commit 5f13996262)
2020-03-04 08:27:37 -08:00
Paulo Zanoni
d808374674 intel/device: bdw_gt1 actually has 6 eus per subslice
Found by inspection, I'm not aware of any bugs caused by this typo.

According to Lionel, it seems we only use this to generate masks
of available EUs for perfromance queries, and it's only used when we
can't query the fused parts of the GPU through DRM_IOCTL_I915_QUERY.
So this patch should help for the corner case where the Kernel is too
old to support the query ioctl.

v2: improve commit message, cc stable (Lionel).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4006>
(cherry picked from commit aa78801f0a)
2020-03-04 08:27:34 -08:00
Paulo Zanoni
e7b8a304bd intel: fix the gen 12 compute shader scratch IDs
This is the same idea as "intel: fix the gen 11 compute shader scratch
IDs".

The number of EUs on TGL is not the same as ICL, but the
MEDIA_VFE_STATE restrictions stay the same, so adapt the code to it.
Also, consider the base configuration instead of what we read from the
Kernel.

According to Mark, this fixes the following piglit tests on TGL:

    piglit.spec.arb_compute_shader.execution.shared-atomicmax-uint.tglm64
    piglit.spec.arb_compute_shader.execution.shared-atomicmax-int.tglm64
    piglit.spec.intel_shader_atomic_float_minmax.execution.shared-atomicmax-float.tglm64

v2: s/ICL+/Gen11+/ (Jason).

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4006>
(cherry picked from commit 9e5ce30da7)
2020-03-04 08:27:33 -08:00
Paulo Zanoni
dd4df57ad9 intel: fix the gen 11 compute shader scratch IDs
Scratch space allocation is based on the number of threads in the base
configuration, and we only have one base configuration for ICL, with 8
subslices.

This fixes an issue with Aztec on Vulkan in a machine with a
configuration that's not the base. The issue looks like a regression
from b9e93db208, but it seems things are broken since forever, just
not easily reproducible.

v2: Reimplement it using the subslices variable. Don't touch TGL.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4006>
(cherry picked from commit 1efe139cad)
2020-03-04 08:27:32 -08:00
Jordan Justen
eb4a39ba2f intel/compiler: Restrict cs_threads to 64
Our current GPGPU_WALKER code only supports up to 64 threads.

On HSW we could use up to 70 and TGL up to 112, but only if the walker
is adjusted so the width does not exceed 64. Work to support this is
in progress.

Previous to this change, we might try to downgrade to SIMD8 if the
SIMD16 shader spilled. Since HSW and TGL have the max number of
threads above 64, we would then try to emit an invalid GPGPU walker
command.

Fixes: 932045061b ("i965/cs: Emit compute shader code and upload programs")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit cf12faef61)
2020-03-02 11:29:18 -08:00
Jason Ekstrand
47bfaa8795 anv: Always enable the data cache
Because we set the needs_data_cache bit from the NIR during compilation,
any time a shader was pulled out of the pipeline cache, we wouldn't set
the bit and the data cache was disabled.  Fortunately, on Gen8+, this
bit is ignored because we always use the ALL section in the L3$ config
instead of separate DC and RO sections.  On Gen7, however, this meant
that we were basically never running with the data cache enabled and our
compute performance was suffering massively because of it.  This commit
improves Geekbench 5 scores on my Haswell GT3 by roughly 330% (no,
that's not a typo).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>
(cherry picked from commit 5dfd83d7a1)
2020-02-28 14:30:31 -08:00
Ian Romanick
01020aef25 intel/fs: Correctly handle multiply of fsign with a source modifier
The other source of the multiply will be interpreted as a uint32_t in an
XOR instruction.  Any source modifiers with either not be interpreted at
all or will be misinterpreted due to the differing types.

If the other operand of the multiplication has a source modifier, just
emit an extra move to resolve the source modifiers.

The negation source modifier problem is difficult to reproduce due to an
algebraic optimization that changes (-a*b) to -(a*b).  However, changes
in MR !1359 push the negations back down.

On Gen7+ it might be possible to do slightly better for an abs() source
modifier by using BFI2 as a glorified copysign().

On Gen8+ it might be possible to do slightly better for a neg() source
modifier by emitting (~a ^ b).

There were no shader-db changes on any Intel platform, so I think we can
deal with that problem when it arises.

See also piglit!224.

Fixes: 06d2c11641 ("intel/fs: Add a scale factor to emit_fsign")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3780>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3780>
(cherry picked from commit 273b8cd1ca)
2020-02-20 09:10:40 -08:00