NOTE: This commit needs "nir: All set-on-comparison opcodes can take all
float types" or regressions will occur in other Vulkan SPIR-V tests.
No shader-db changes on any Intel platform.
NOTE: This commit depends on "nir: All set-on-comparison opcodes can
take all float types".
v2: Fix handling 16-bit (and presumably 64-bit) values.
About 280 shaders in Talos are hurt by a few instructions, and a couple
shaders in Doom 2016 are hurt by a few instructions.
Tiger Lake
Instructions in all programs: 159893290 -> 159895026 (+0.0%)
SENDs in all programs: 6936431 -> 6936431 (+0.0%)
Loops in all programs: 38385 -> 38385 (+0.0%)
Cycles in all programs: 7019260087 -> 7019254134 (-0.0%)
Spills in all programs: 101389 -> 101389 (+0.0%)
Fills in all programs: 131532 -> 131532 (+0.0%)
Ice Lake
Instructions in all programs: 143624235 -> 143625691 (+0.0%)
SENDs in all programs: 6980289 -> 6980289 (+0.0%)
Loops in all programs: 38383 -> 38383 (+0.0%)
Cycles in all programs: 8440083238 -> 8440090702 (+0.0%)
Spills in all programs: 102246 -> 102246 (+0.0%)
Fills in all programs: 131908 -> 131908 (+0.0%)
Skylake
Instructions in all programs: 134185495 -> 134186618 (+0.0%)
SENDs in all programs: 6938790 -> 6938790 (+0.0%)
Loops in all programs: 38356 -> 38356 (+0.0%)
Cycles in all programs: 8222366923 -> 8222365826 (-0.0%)
Spills in all programs: 98821 -> 98821 (+0.0%)
Fills in all programs: 125218 -> 125218 (+0.0%)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 1feeee9cf4 ("nir/spirv: Add initial support for GLSL 4.50 builtins")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
(cherry picked from commit 75ef5991f5)
This (sort of) matches the behavior of nir_opt_algebraic. This ensures
that subnormal values are properly flushed to zero.
With the aid of "nir/search: Float sources of texture instructions are
float users" and "nir/search: Transitively apply is_only_used_as_float",
there would have been no shader-db regressions on Intel platforms.
However, those caused a significant increase in compile time. Since the
instruction regressions were so small, I just dropped those commits
rather than improve them.
All Haswell and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20125042 -> 20125094 (<.01%)
instructions in affected programs: 7184 -> 7236 (0.72%)
helped: 0
HURT: 32
HURT stats (abs) min: 1 max: 4 x̄: 1.62 x̃: 2
HURT stats (rel) min: 0.11% max: 1.49% x̄: 0.85% x̃: 0.78%
95% mean confidence interval for instructions value: 1.39 1.86
95% mean confidence interval for instructions %-change: 0.74% 0.96%
Instructions are HURT.
total cycles in shared programs: 862745586 -> 862746551 (<.01%)
cycles in affected programs: 109872 -> 110837 (0.88%)
helped: 12
HURT: 23
helped stats (abs) min: 2 max: 774 x̄: 90.83 x̃: 19
helped stats (rel) min: 0.07% max: 25.23% x̄: 3.06% x̃: 0.40%
HURT stats (abs) min: 2 max: 1106 x̄: 89.35 x̃: 12
HURT stats (rel) min: 0.08% max: 45.40% x̄: 3.01% x̃: 0.47%
95% mean confidence interval for cycles value: -60.09 115.23
95% mean confidence interval for cycles %-change: -2.21% 4.07%
Inconclusive result (value mean confidence interval includes 0).
All of the shaders hurt are in either UE4 shooter-game or shooter_demo.
Tiger Lake
Instructions in all programs: 159893213 -> 159893290 (+0.0%)
SENDs in all programs: 6936431 -> 6936431 (+0.0%)
Loops in all programs: 38385 -> 38385 (+0.0%)
Cycles in all programs: 7019259514 -> 7019260087 (+0.0%)
Spills in all programs: 101389 -> 101389 (+0.0%)
Fills in all programs: 131532 -> 131532 (+0.0%)
Ice Lake
Instructions in all programs: 143624164 -> 143624235 (+0.0%)
SENDs in all programs: 6980289 -> 6980289 (+0.0%)
Loops in all programs: 38383 -> 38383 (+0.0%)
Cycles in all programs: 8440082767 -> 8440083238 (+0.0%)
Spills in all programs: 102246 -> 102246 (+0.0%)
Fills in all programs: 131908 -> 131908 (+0.0%)
Skylake
Instructions in all programs: 134185424 -> 134185495 (+0.0%)
SENDs in all programs: 6938790 -> 6938790 (+0.0%)
Loops in all programs: 38356 -> 38356 (+0.0%)
Cycles in all programs: 8222366529 -> 8222366923 (+0.0%)
Spills in all programs: 98821 -> 98821 (+0.0%)
Fills in all programs: 125218 -> 125218 (+0.0%)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: f5dd6dfe01 ("anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
(cherry picked from commit 38a94c82e6)
Extend 4195a9450b so that the next poor fool doesn't come along and
say, "sge does the right thing for 16-bit sources, but slt gives a NIR
validation failure. What the deuce?"
NOTE: This commit is necessary to prevent regressions in GLSLstd450Step
tests of 16-bit sources at "spriv: Produce correct result for
GLSLstd450Step with NaN".
Fixes: 4195a9450b ("nir: sge operation is defined for floating-point types")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
(cherry picked from commit 38800b385c)
Fast clear with the resource format instead. This is safe to do because
can_fast_clear_color ensures that the clear color generates the same
pixel with either the view format or the resource format.
On SKL, this prevents us from using an invalid surface state. This platform
doesn't support CCS_E with sRGB formats, but prior to this patch we allowed
fast-clearing with this combination. Piglit's fcc-write-after-clear test
can trigger this.
Fixes: 230952c210 ("iris: Don't support sRGB + Y_TILED_CCS on gen9")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14806>
(cherry picked from commit 6778b3a379)
the idx param for LLVMBuildInsertElement is zero-indexed based on the
value of 'vector_length' (always 4), and the vector length is (obviously)
sized to 'vector_length', so this should be the member of the vec that is being
inserted, not the invocation index
cc: mesa-stable
fixes (zink, but only on my one machine):
KHR-GL46.tessellation_shader.single.max_patch_vertices
KHR-GL46.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls
dEQP-GLES31.functional.tessellation.shader_input_output.barrier
dEQP-GLES31.functional.tessellation.shader_input_output.patch_vertices_5_in_10_out
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_isolines_geometry_output_points
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_isolines_point_mode_geometry_output_triangles
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_quads_geometry_output_points
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_quads_point_mode_geometry_output_lines
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_triangles_geometry_output_points
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_triangles_point_mode_geometry_output_lines
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14949>
(cherry picked from commit 68c1b50e48)
All of the opcodes in nir_opt_algebraic_late are the unsized (1-bit)
versions. If the lowering to int32 happens first, many of the
optimizations and lowerings won't happen.
Of particular importance is the lowering of fisfinite. If a shader
happens to contain fisfinite of an fp16 value, it will assert later
during compliation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 78b4e417d4 ("gallivm: handle fisfinite/fisnormal")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14942>
(cherry picked from commit e3cbc328e0)
The root cause is unknown but PAL always update IA_MULTI_VGT_PARAM
whenever primitive type change.
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Singed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14944>
(cherry picked from commit fe560aeb12)
When enabling the caching of index,vertex data in the L3 RO Cache
(L3BypassDisable), we need to use L3ReadOnlyCacheInvalidationEnable
to invalidate cache when buffer is modified by CPU/GPU.
Ref: bspec 46314
Fixes: ed8f2c4cbe ("iris: Cache VB/IB in L3$ for Gen12")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14815>
(cherry picked from commit 562f7eef5b)
When enabling the caching of index,vertex data in the L3 RO Cache
(L3BypassDisable), we need to use L3ReadOnlyCacheInvalidationEnable
to invalidate cache when buffer is modified by CPU/GPU.
Ref: bspec 46314
Fixes: 6c345ddbe4 ("anv: Cache VB/IB in L3$ for Gfx12")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5941
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14815>
(cherry picked from commit 7a6ea04795)
It is being overwritten by the memset. Just set the only remaining
member RelAddr explicitly.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@zoho.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14915>
(cherry picked from commit 1f5330de3a)
We just forgot about conditional render for this entry point.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2be89cbd82 ("anv: Implement vkCmdDrawIndirectByteCountEXT")
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14891>
(cherry picked from commit 93a90fc85d)
We're replacing a generic instruction by an intel specific one, we
need to remove the previous instruction.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c5a42e4010 ("intel/fs: fix shader call lowering pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit 39f6cd5d79)
If we have batch a + b, and writing to batch b, causes batch a
to flush, all the bo->index get reset, and we try to submit a -1
to the kernel.
Look the bo index up when creating relocations.
Fixes crash seen in KHR-GL46.compute_shader.pipeline-post-fs
and a trace from Wasteland 3
Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14905>
(cherry picked from commit 37c3be6947)
Conflicts:
src/gallium/drivers/crocus/ci/crocus-hsw-flakes.txt
I've deleted this file, which the original removed an entry from as it
doesn't exist, and the CI isn't run on the 22.0 branch.
these regions might not have the coords in the correct order, which will
cause them to fail intersection tests, resulting in clears that are never
applied
cc: mesa-stable
fixes:
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_all_buffer_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_depth_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_stencil_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_linear_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_magnifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_minifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_missing_buffers_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_nearest_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_dimensions_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_height_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_width_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_scissor_blit
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14867>
(cherry picked from commit 388f23eabe)
Matches radeonsi.
Seems Vulkan CTS doesn't really test cull distances. Removing
VARYING_SLOT_CULL_DIST0/VARYING_SLOT_CULL_DIST1 variables doesn't break
any of dEQP-VK.clipping.*, except for tests which read the variables in
the fragment shader.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5984
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14882>
(cherry picked from commit 7ddad1b93a)
A buffer age of 0 means that the buffer is uninitialised or has unknown
content. We rely on the buffer age initially being 0 through zalloc when
the surface is first created; when they are first used for a swap, we
set their age to 1, and then we increment the age of every buffer in the
chain with a non-zero age when we swap.
Now that we can release buffers, both through dmabuf-feedback as well as
detecting when we're using a deeper swapchain than the compositor needs,
make sure to reset their age as they are released. Without doing this,
the age will stay as it was before it was released and be incremented,
returning the wrong age to the user the first time a previously-released
buffer slot has been reused.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5977
Fixes: 22d796feb8 ("egl/wayland: break double/tripple buffering feedback loops")
Fixes: b5848b2dac ("egl/wayland: use surface dma-buf feedback to allocate surface buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14873>
(cherry picked from commit 3da8300562)
The most famous RADV revert over the past months. This was an issue
in RADV and not an use-after-free (descriptor set layouts can be
destroyed almost at any time).
This reverts commit b775aaff1e.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14621>
(cherry picked from commit 9ea4029f9f)
In cases where the to-be-allocated node size with padding exceeds BLOCK_SIZE
but without padding doesn't, a new block is not created and no padding is done
to the previous instruction, causing a misaligned pointer to be returned.
v2: Per Ilia Mirkin's suggestion, remove the extra condition in the first
if statement, let it unconditionally pad the last instruction if needed.
The updated currentPos will then be taken into account in the
block size checking.
This fixes crash seen with lightsmark and Optuma apitraces
Fixes: 05605d7f53 (' mesa: remove display list OPCODE_NOP')
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14871>
(cherry picked from commit 945a1e0b8c)
When new context was created, shared_mem_size was getting overwritten.
This fixes glretrace failure seen with manhattan, aztec and BASS2_intro
apitraces
Fixes: 247c61f2d0 ('svga: Add support for compute shader, shader buffers and image views')
Tested with glretrace, piglit
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit dd6793ec9218782b1b716a87582d7219bae4e75f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14870>
(cherry picked from commit 9230b28533)
We didn't remove desc set from the pool's list if pool was
host_memory_base. On the other hand in there is no point in removing
desc set from the list in DestroyDescriptorPool/ResetDescriptorPool.
Fixes: da7a4751
("turnip: Drop references to layout of all sets on pool reset/destruction")
Fixes cts tests:
dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.draw
dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.dispatch
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14855>
(cherry picked from commit 183bc15bdb)
IRIS_BATCH_BLITTER isn't supported prior to Tigerlake; in general,
batches may not be supported on all hardware. In most cases, querying
them is harmless (if useless): they reference nothing, have no commands
to flush, and so on. However, the fence code does need to know that
certain batches don't exist, so it can avoid adding inter-batch fences
involving them.
This patch introduces a new iris_foreach_batch() iterator macro that
walks over all batches that are actually supported on the platform,
while skipping the others. It provides a central place to update should
we add or reorder more batches in the future.
Fixes various tests in the piglit.spec.ext_external_objects.* category.
Thanks to Tapani Pälli for catching this.
Fixes: a90a1f15 ("iris: Create an IRIS_BATCH_BLITTER for using the BLT command streamer")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14834>
(cherry picked from commit fd0e4aedeb)
Without this change LLVM 12 hits this error:
"""
LLVM ERROR: Error while trying to spill SGPR0_SGPR1 from class SReg_64:
Cannot scavenge register without an emergency spill slot!
"""
when running glcts KHR-GL46.arrays_of_arrays_gl.AtomicUsage test.
Fixes: 9ff086052a ("radeonsi: unroll loops of up to 128 iterations")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14848>
(cherry picked from commit eaa87b1a46)