fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-03-14 15:10:31 +01:00

Author	SHA1	Message	Date
Ian Romanick	2c51e96672	intel/fs: Fix gl_FrontFacing optimization on Gfx12+ It's not obvious why the (gl_FrontFacing ? -1.0 : 1.0) case was handled different for Gfx12+ than for previous generations, and it's not correct. It tries to negate the result as an integer, and it does this before the mask operation that clears the other bits in the value. When we eventually support dual-SIMD8 dispatch, the other front-facing bit is in g1.6 at bit 15, so similar code should be possible there. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `c92fb60007` ("intel/fs/gen12: Implement gl_FrontFacing on gen12+.") Closes: #5876 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14625> (cherry picked from commit `945fb51fb5`)	2022-01-26 18:28:30 +00:00
Lionel Landwerlin	f7a52a16cf	anv: fix missing descriptor copy of bufferview/surfacestate content When doing copies of descriptors from one set to another, that contain either a UNIFORM_BUFFER or STORAGE_BUFFER, both the buffer view & surface state are allocated from the source descriptor. Therefore we need to copy their content otherwise we could run into lifecycle issues when the source descriptor is destroyed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14585> (cherry picked from commit `acebea9cf1`)	2022-01-26 18:28:30 +00:00
Lionel Landwerlin	b4fb2974de	intel/fs: disable VRS when omask is written As indicated by VkPhysicalDeviceFragmentShadingRatePropertiesKHR::fragmentShadingRateWithShaderSampleMask our implementation will clamp to 1x1 when reading samplemask or writing to samplemask. This fixes vkd3d-proton tests test_sample_mask_dxbc & test_sample_mask_dxil Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b6332fc4a8` ("intel/compiler: handle coarse pixel in render target writes descriptors") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14553> (cherry picked from commit `30a8b8d2df`)	2022-01-26 18:28:29 +00:00
Lionel Landwerlin	33461292fb	intel/dev: fixup chv workaround We're using the wrong helper to get the subslice total count. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c24ba6cecb` ("intel/dev: Handle CHV CS thread weirdness in get_device_info_from_fd") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14492> (cherry picked from commit `d6c0d16791`)	2022-01-12 19:54:27 +00:00
Lionel Landwerlin	d987419d8a	anv: limit compiler valid color outputs using NIR variables This fixes a test from the vkd3d-proton test_dual_source_blending_dxbc test which asserts in the backend with : brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend \|\| key->nr_color_regions == 1' failed. This is because there are 2 color attachments provided by the renderpass so we initially set nr_color_regions = 2. But once we've parsed the shader, we can see it's only using one output (with dual source color blending). This change looks at the output variables to update the valid output variables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417> (cherry picked from commit `07bc6b7ed9`)	2022-01-12 19:54:26 +00:00
Lionel Landwerlin	886b86f601	anv: don't leave anv_batch fields undefined Because the extend_cb vfunc is not initialized, there is a risk that the emission code calls into a random pointer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14418> (cherry picked from commit `1d40d53e03`)	2022-01-12 19:54:26 +00:00
Rohan Garg	7241ec2ee5	intel/fs: OpImageQueryLod does not support arrayed images as an operand When we lower SPIR-V to NIR for textures in vtn_handle_texture, we only bump the number of coordinate components when the op is not a lod query. Update the assert to take this into account. This fixes: - dEQP-VK.robustness.robustness2.bind.template.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - 
dEQP-VK.robustness.robustness2.bind.template.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - 
dEQP-VK.robustness.robustness2.push.notemplate.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag Fixes: `231337a1` ("intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13925> (cherry picked from commit `af13119993`)	2022-01-12 19:54:25 +00:00
Henry Goffin	8be9711422	intel/compiler/test: Fix build with GCC 7 Without this change, test_fs_scoreboard.cpp does not compile on GCC 7 due to the use of C99 initializers in a C++ source file. Fixes: `c847bfaaf5` ("intel/fs/gen12: Add tests for scoreboard pass") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14349> (cherry picked from commit `fe617bcca0`)	2022-01-12 19:54:24 +00:00
Dave Airlie	5f32bdee91	intel/genxml/gen4-5: fix more Raster Operation in BLT to be a uint This has been partly fixed twice before, but looks like some got missed. Fixes arb_copy_image on gen4 with crocus Fixes: `de625dddee` ("intel/genxml: fix raster operation field in blt genxml") Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14345> (cherry picked from commit `a2293e33fd`)	2022-01-12 19:54:23 +00:00
Jason Ekstrand	4e143b9c0b	Revert "anv: Stop doing too much per-sample shading" This reverts commit `1f559930b6`. Turns out, this approach won't work. Fixes: `1f559930b6` ("anv: Stop doing too much per-sample shading") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14196> (cherry picked from commit `b05d228695`)	2022-01-01 18:16:22 +00:00
Francisco Jerez	03f6edd5b7	intel/fs: Add physical fall-through CFG edge for unconditional BREAK instruction. This adds a missing CFG edge that represents a possible physical control flow path the EU might take under some conditions which isn't part of the logical CFG of the program. This possibility shouldn't have led to problems on platforms prior to Gfx12, since the missing control flow edge cannot possibly influence liveness intervals. However on Gfx12+ it becomes the compiler's responsibility to resolve data dependencies across instructions, and the missing physical control flow paths may lead to a WaR data hazard currently not visible to the software scoreboard pass, which could lead to data corruption. Worse, the possibility for this path to be taken by the EU increases on Gfx12+ due to a hardware bug affecting EU fusion. However, the same physical path can potentially be taken on earlier platforms as well, so this patch extends the CFG on all platforms for consistency, even though the lack of this edge shouldn't lead to any functional issues on platforms earlier than Gfx12. There are no shader-db changes on earlier platforms, so there seems to be no disadvantage from using the same CFG representation as on later platforms. This issue has been reported on TGL with the following conformance test, thanks to Ian for bringing the FULSIM dependency check warning to my attention: dEQP-VK.graphicsfuzz.spv-stable-pillars-volatile-nontemporal-store Fixes: `4d1959e693` ("intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4940 Reported-by: Tapani Pälli <tapani.palli@intel.com> Reported-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14248> (cherry picked from commit `e7470a40c5`)	2021-12-29 20:56:29 +00:00
Ian Romanick	fd54eeb897	intel/stub: Silence "initialized field overwritten" warning src/intel/tools/intel_noop_drm_shim.c:459:36: warning: initialized field overwritten [-Woverride-init] 459 \| [DRM_I915_GEM_EXECBUFFER2_WR] = i915_ioctl_noop, \| ^~~~~~~~~~~~~~~ Fixes: `0f4f1d70bf` ("intel: add stub_gpu tool") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14218> (cherry picked from commit `2dc7c24b80`)	2021-12-17 22:30:50 +00:00
Jason Ekstrand	29fcb94b9e	anv: Stop doing too much per-sample shading We were setting anv_pipeline::sample_shading_enable based on sampleShadingEnable without looking at minSampleShading. We would then pass this value into nir_lower_wpos_center which would add sample_pos to frag_coord. Then the back-end compiler picks up on the existence of sample_pos and forces persample dispatch. This leads to doing per-sample dispatch whenever sampleShadingEnable = VK_TRUE regardless of the value of minSampleShading. This is almost certainly costing us perf somewhere. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14022> (cherry picked from commit `1f559930b6`)	2021-12-17 22:30:50 +00:00
Ian Romanick	55d80bc20a	intel/compiler: Don't predicate a WHILE if there is a CONT Previously a predicated BREAK that appeared immediately before the WHILE would get merged into the WHILE. This doesn't work if other flow control (e.g., a CONT) can transfer directly to the WHILE. On Intel platforms, this fixes the CTS test dEQP-VK.graphicsfuzz.stable-binarysearch-tree-nested-if-and-conditional. No shader-db changes on any Intel platform. When this commit was first created (over a month before it is going to land), there were some regressions that were prevented by other commits in MR !13095. That does not appear to be the case now, so I don't know what changed. Basically, the treatment of discard as a combination of demote and terminate causes additional continues in some loops, and those continues trigger this bug. The other commits from that MR prevent those continues from being generated in the first place. All Intel platforms had similar fossil-db results. (Ice Lake shown) Instructions in all programs: 144419989 -> 144419995 (+0.0%) SENDs in all programs: 6947332 -> 6947332 (+0.0%) Loops in all programs: 38277 -> 38277 (+0.0%) Spills in all programs: 204075 -> 204075 (+0.0%) Fills in all programs: 319480 -> 319480 (+0.0%) A few shaders in Doom 2016 were hurt by one instruction each. It seems likely that these shaders would have experienced at least some mis-rendering. Closes: #4213 Fixes: `d13bcdb3a9` ("i965/fs: Extend predicated break pass to predicate WHILE.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14128> (cherry picked from commit `4563261ad1`)	2021-12-17 22:30:49 +00:00
Francisco Jerez	d1f2820154	intel/fs/xehp: Teach SWSB pass about the exec pipeline of FS_OPCODE_PACK_HALF_2x16_SPLIT. This virtual instruction is translated into multiple half float physical instructions, even though its destination is typically of integer type, which prevents the software scoreboard pass from inferring the correct execution pipeline for the virtual instruction on XeHP+ platforms. Teach the SWSB lowering pass about this inconsistency between the IR and physical instruction types. Fixes among other tests: dEQP-GLES31.functional.shaders.builtin_functions.pack_unpack.packhalf2x16_compute Fixes: `d4537770bb` ("intel/fs: Add helper functions inferring sync and exec pipeline of an instruction.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5685 Reported-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14002> (cherry picked from commit `de55fd358f`)	2021-12-17 22:30:49 +00:00
Lionel Landwerlin	b0ce6f5f58	intel/nir: preserve access value when duping intrinsic Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6339aba775` ("intel/compiler: Lower SSBO and shared loads/stores in NIR") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718> (cherry picked from commit `7661237a31`)	2021-12-17 22:30:48 +00:00
Tapani Pälli	0025ef9880	anv: allow VK_IMAGE_LAYOUT_UNDEFINED as final layout From VK_KHR_synchronization2: "Image memory barriers that do not perform an image layout transition can be specified by setting oldLayout equal to newLayout. E.g. the old and new layout can both be set to VK_IMAGE_LAYOUT_UNDEFINED, without discarding data in the image." v2: make assert more readable (Lionel Landwerlin) Cc: mesa-stable Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14008> (cherry picked from commit `d44d2e823f`)	2021-12-17 22:30:48 +00:00
Lionel Landwerlin	73f5d5053e	intel/fs: fix shader call lowering pass Now that we removed the intel intrinsic and just use the generic one, we can skip it in the intel call lowering pass and just deal with it in the intel rt intrinsic lowering. v2: rewrite with nir_shader_instructions_pass() (Jason) v3: handle everything in switch (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `423c47de99` ("nir: drop the btd_resume_intel intrinsic") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12113> (cherry picked from commit `c5a42e4010`)	2021-12-01 18:55:47 +00:00
Lionel Landwerlin	f5c31a44a8	anv: don't try to close fd = -1 CID: 1464334 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13879> (cherry picked from commit `04bd5bb69b`)	2021-12-01 18:55:46 +00:00
Iván Briano	5d130ac3ff	intel/nir: also allow unknown format for getting the size of a storage image Fixes: `fa251cf111` ("intel/nir: allow unknown format in lowering of storage images") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13847> (cherry picked from commit `0388783a03`)	2021-12-01 18:55:46 +00:00
Kenneth Graunke	a6c713f8c5	intel/genxml: Fix MI_FLUSH_DW to actually specify the length properly Fixes: `569afd37f1` ("intel/genxml: Copy gen12.xml to gen125.xml") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809> (cherry picked from commit `29025f66fd`)	2021-11-17 20:06:22 +00:00
Lionel Landwerlin	5380104201	anv: fix multiple wait/signal on same binary semaphore We need to guarantee that when vkQueueSubmit() returns the application can actually wait on a signaled semaphore/syncobj. When using a thread to do the submission to i915, this gets a bit tricky in the following case : A syncobj is used both as a wait & signal semaphore and has been signaled once already. It contains a fence before entering vkQueueSubmit(). This means we need to reset the syncobj to ensure when we return from vkQueueSubmit(), the syncobj contains no stale fence. Currently in Anv, the submission thread is in charge of putting the new fence in the syncobj and also picks up the wait fence directly from the syncobj. This means we can't reset the syncobj from vkQueueSubmit(). The solution to this was pointed out by Bas & Jason : In vkQueueSubmit(), clone the wait syncobj fence into a new temporary syncobj that will be destroyed after submission and use this temporary syncobj as a wait fence for i915. This allows us to reset the original syncobj in vkQueueSubmit(). For this to work with wait_before_signal behavior, we also need to do a wait-on-materialize on binary semaphores from vkQueueSubmit(). Otherwise the application thread calling vkQueueSubmit() could race the submission thread and pick up the wrong fence when cloning. v2: Use copy semantic for clone_syncobj_dma_fence() (Jason) Do the cloning prior to adding the syncobj to anv_queue_submit so that if the cloning fails we don't have an invalid syncobj in anv_queue_submit (Jason) v3: Fix another syncobj leak (Jason) v4: Fix invalid argument order (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4945 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11474> (cherry picked from commit `d2ff2b9e4a`)	2021-11-17 20:06:20 +00:00
Lionel Landwerlin	a4ba277451	anv: don't forget to add scratch buffer to BO list We reference the scratch BO using a bindless index in the command streamer instructions, but we forgot to add them to the BO list. v2: Make use of pipeline reloc list (Jason) v3: Don't add NULL BOs to the reloc list (Lionel) v4: Don't add BOs twice to reloc list when dealing with addresses (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `eeeea5cb87` ("anv: Add support for scratch on XeHP") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13544> (cherry picked from commit `46c37c8600`)	2021-11-10 21:58:07 +00:00
Jason Ekstrand	6108b3c7f2	anv: Also disallow CCS_E for multi-LOD images Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4616 Fixes: `e3101c96bb` ("anv/image: Disable multi-layer CCS_E on TGL+") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13680> (cherry picked from commit `e614789588`)	2021-11-10 21:58:06 +00:00
Jason Ekstrand	6bbf2110db	anv: Fix FlushMappedMemoryRanges for odd mmap offsets When the client calls vkMapMemory(), we have to align the requested offset down to the nearest page or else the map will fail. On platforms where we have DRM_IOCTL_I915_GEM_MMAP_OFFSET, we always map the whole buffer. In either case, the original map may start before the requested offset and we need to take that into account when we clflush. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610> (cherry picked from commit `90ac06e502`)	2021-11-10 21:58:05 +00:00
Lionel Landwerlin	03ee3a9dd0	intel/devinfo: fix wrong offset computation A bit difficult to find what commit introduced the issue because of all the renaming, but it was my bug :) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015> (cherry picked from commit `349bfb7275`)	2021-11-10 21:58:04 +00:00
Lionel Landwerlin	198f6463ee	intel/perf: fix perf equation subslice mask generation for gfx12+ v2: Fix comment change (Marcin) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015> (cherry picked from commit `67619d8153`)	2021-11-10 21:58:04 +00:00
Lionel Landwerlin	f4fe896423	intel/dev: fix subslice/eu total computations with some fused configurations When a device has its first slice/subslice fused off, we can't use the number of slices/subslices to iterate the mask array. v2: Fix spelling (Marcin) Use size_t for iterator (Marcin) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Matt Roper <matthew.d.roper@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5601 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015> (cherry picked from commit `a543a94404`)	2021-11-10 21:58:03 +00:00
Lionel Landwerlin	8b4c231932	intel/dev: reuse internal functions to set mask Rather than having 2 paths to set the slice/subslice/eu masks, reuse the other internal functions. This simplifies finding bugs within this code : * If we have i915 query topology support, update_from_topology() is called. * If we don't have query topology support but we have getparam for slice/subslice/EU, we generate a topology data and call update_from_topology() * If we have no kernel support to query any kind of topology, we generate the values returned by the kernel for slice/subslice/EU and call update_from_masks() which in turn calls update_from_topology() v2: Fixup typo (Adam) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015> (cherry picked from commit `e10c641f00`)	2021-11-10 21:58:03 +00:00
Lionel Landwerlin	546a870459	intel/dev: don't forget to set max_eu_per_subslice in generated topology Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015> (cherry picked from commit `d7c6a90c26`)	2021-11-10 21:58:02 +00:00
Lionel Landwerlin	84140c8792	intel/dev: fix HSW GT3 number of subslices in slice1 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015> (cherry picked from commit `d1db5d562a`)	2021-11-10 21:58:02 +00:00
Vadym Shovkoplias	bca13e8fc8	intel/fs: Fix a cmod prop bug when cmod is set to inst that doesn't support it Fixes dEQP-VK.reconvergence.nesting tests. There are cases when cmod is set to an instruction that cannot have conditional modifier. E.g. following: find_live_channel(32) vgrf166:UD, NoMask cmp.z.f0.0(32) null:D, vgrf166+0.0<0>:D, 0d is optimized to: find_live_channel.z.f0.0(32) vgrf166:UD, NoMask v2: - Add unit test to check cmod is not set to 'find_live_channel' (Matt Turner) - Update flag_subreg when conditional_mod is updated (Ian Romanick) Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5431 Fixes: `32b7ba66b0` ("intel/compiler: fix cmod propagation optimisations") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13268> (cherry picked from commit `2dbb66997e`)	2021-11-03 20:15:49 +00:00
Lionel Landwerlin	a82babccd1	anv: fix push constant lowering with bindless shaders Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9fa1cdfe7f` ("intel/rt: Implement push constants as global memory reads") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13529> (cherry picked from commit `a6031cd9bd`)	2021-10-27 19:58:10 +01:00
Marcin Ślusarz	2d6c11843d	intel: fix INTEL_DEBUG environment variable on 32-bit systems INTEL_DEBUG is defined (since `4015e1876a`) as: #define INTEL_DEBUG __builtin_expect(intel_debug, 0) which unfortunately chops off upper 32 bits from intel_debug on platforms where sizeof(long) != sizeof(uint64_t) because __builtin_expect is defined only for the long type. Fix this by changing the definition of INTEL_DEBUG to be function-like macro with "flags" argument. New definition returns 0 or 1 when any of the flags match. Most of the changes in this commit were generated using: for c in `git grep INTEL_DEBUG \| grep "&" \| grep -v i915 \| awk -F: '{print $1}' \| sort \| uniq`; do perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ \|]+\))/INTEL_DBG\1/" $c done but it didn't handle all cases and required minor cleanups (like removal of round brackets which were not needed anymore). Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334> (cherry picked from commit `d05f7b4a2c`)	2021-10-20 20:40:58 +01:00
Vinson Lee	d19e28c139	anv: Fix assertion. Fix defect reported by Coverity Scan. Assign instead of compare (PW.ASSIGN_WHERE_COMPARE_MEANT) assign_where_compare_meant: use of "=" where "==" may have been intended Fixes: `35315c68a5` ("anv: Use the common wrapper for GetPhysicalDeviceFormatProperties") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13395> (cherry picked from commit `9eb010ee1e`)	2021-10-20 20:40:58 +01:00
Clayton Craft	3f61f84fe3	anv: don't advertise vk conformance on GPUs that aren't conformant This sets the conformance version to 0.0.0.0 for GPUs that have incomplete support for vulkan, so that it's easier to check if vulkan is fully supported by a GPU at runtime for applications/libraries. $ vulkaninfo\|grep conf MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete conformanceVersion = 0.0.0.0 Signed-off-by: Clayton Craft <clayton@craftyguy.net> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13275> (cherry picked from commit `b2ef7e6d6b`)	2021-10-20 20:40:57 +01:00
Caio Marcelo de Oliveira Filho	94e07058ee	intel/compiler: Remove unused `ret` declaration Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13340>	2021-10-13 17:24:29 +00:00
Caio Marcelo de Oliveira Filho	bd2cc4b916	intel/compiler: Convert test_eu_compact to use gtest Be consistent with the other test suites in intel/compiler. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13340>	2021-10-13 17:24:29 +00:00
Lionel Landwerlin	9fb2c84768	isl: only bump the min row pitch for display when not specified If the ISL caller didn't specify a row_pitch_B, let's use the NVIDIA/AMD requirements. Otherwise keep using the Intel requirement, as the caller is likely trying to import a buffer and if we can deal with that row_pitch_B, we should accept it. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a3a4517f41` ("isl: Work around NVIDIA and AMD display pitch requirements") Reported-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13024>	2021-10-13 14:46:49 +00:00
Lionel Landwerlin	47ff6767ea	anv: fill correct surface state for lowered storage image Small typo/copy-paste. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c0093c4668` ("anv: Flip around the way we reason about storage image lowering") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13332>	2021-10-13 14:33:14 +00:00
Tapani Pälli	d729038c07	anv: use vk_object_zalloc for wsi fences created Otherwise we hit assert in vk_object_base_assert_valid when attempting to create handle from anv_fence with unknown base type. Cc: mesa-stable Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13330>	2021-10-13 11:59:17 +00:00
Tapani Pälli	840c79fc9b	anv/android: fix parameters given for vk_common_QueueSubmit Common queue submit expects pWaitDstStageMask to be set per each semaphore (as per Vulkan spec) and crashes if these are not given properly. This fixes crashes seen when running vulkan apps on Android. v2: change the VkPipelineStageFlags given (Lionel) Fixes: `b996fa8efa` ("anv: implement VK_KHR_synchronization2") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13305>	2021-10-13 06:00:56 +00:00
Felix DeGrood	5bf6987873	anv: dirty only state impacted by blorp_exec Instead of dirtying all state after blorp operations, avoid dirtying state that blorp never touches. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5077 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12567>	2021-10-13 04:31:34 +00:00
Jason Ekstrand	a64d90026b	anv: Use the common WSI wrappers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13234>	2021-10-13 00:06:15 +00:00
Jason Ekstrand	916c9335b4	meson: Add and use an idep for Vulkan WSI Acked-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13234>	2021-10-13 00:06:15 +00:00
Nanley Chery	10be870c72	anv: Tile cache flush for depth before fast clear Instead of doing a tile cache flush after slow clears, resolves, and ambiguates, do it before fast clears of HIZ_CCS_WT surfaces. This agrees with the Bspec. Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11539>	2021-10-12 18:05:46 +00:00
Nanley Chery	81e9c25c1b	anv: Allow HIZ_CCS_WT with subpass self-dependencies This unblocks later commits that aim to align the driver with the tile cache flushing requirements in the Bspec. Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11539>	2021-10-12 18:05:46 +00:00
Lionel Landwerlin	ce3dd1375f	anv: implement VK_KHR_format_feature_flags2 v2: fix SAMPLED_IMAGE_DEPTH_COMPARISON_BIT_KHR (Ivan) v3: Fixup VK_FORMAT_FEATURE_2_STORAGE_IMAGE_BIT_KHR setting (Ivan) Add missing drm-modifiers/android bits (Lionel) v4: Avoid duplicating get_ahw_buffer_format_properties() (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>	2021-10-11 10:29:12 -05:00
Lionel Landwerlin	01d1ec292a	anv: start computing KHR_format_features2 flags for storage images Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>	2021-10-11 10:29:12 -05:00
Jason Ekstrand	c0093c4668	anv: Flip around the way we reason about storage image lowering There are roughly two cases when it comes to storage images. In the easy case, we have full hardware support and we can just emit a typed read/write message in the shader and we're done. In the more complex cases, we may need to fall back to a typed read with a different format or even to a raw (SSBO) read. The hardware has always had basically full support for typed writes all the way back to Ivy Bridge but typed reads have been harder to come by. Starting with Skylake, we finally have enough that we at least have a format of the right bit size but not necessarily the right format so we can use a typed read but may still have to do an int->unorm or similar cast in the shader. Previously, in ANV, we treated lowered images as the default and write-only as a special case that we can optimize. This flips everything around and treats the cases where we don't need to do any lowering as the default "vanilla" case and treats the lowered case as special. Importantly, this means that read-write access to surfaces where the native format handles typed writes now use the same surface state as write-only access and the only thing that uses the lowered surface state is read-write access with a format that doesn't support typed reads. This has the added benefit that now, if someone does a read without specifying a format, we can default to the vanilla surface and it will work as long as it's a format that supports typed reads. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>	2021-10-11 10:29:09 -05:00
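
The decision rule described in the message of `c0093c4668` can be sketched roughly as follows. This is an illustration only, not code from anv: the names `SurfaceState` and `pick_storage_surface_state` and both boolean parameters are hypothetical, and the real driver works on C structures and per-format capability tables.

```python
from enum import Enum


class SurfaceState(Enum):
    # Hypothetical labels for the two surface states the commit message mentions.
    VANILLA = "vanilla"  # native format, no lowering needed
    LOWERED = "lowered"  # fallback format or raw access, shader does the cast


def pick_storage_surface_state(shader_reads: bool,
                               format_supports_typed_read: bool) -> SurfaceState:
    """Choose a surface state for a storage image access (illustrative sketch).

    Assumption taken from the commit message: typed writes are supported for
    the native format, so only reads can force lowering.
    """
    if shader_reads and not format_supports_typed_read:
        # Reading with a format that lacks typed-read support: the one
        # case that is treated as special.
        return SurfaceState.LOWERED
    # Write-only access, or reads with a format that supports typed reads.
    return SurfaceState.VANILLA
```

Under this reading, write-only access and read-write access with a typed-read-capable format share the vanilla state, which matches the "default vanilla, lowered is special" framing in the message.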

7207 commits