Unlike Gen11, Gen12 hardware supports up to three pixel pipes per
slice.
Unfortunately the kernel interface is somewhat inconsistent between
Gen11 and Gen12: I915_PARAM_SUBSLICE_MASK returns a mask of enabled
*dual* subslices since TGL, so there is half the number of bits per
pixel pipe in the mask. This is worked around here so we're able to
calculate the correct size of each pixel pipe, but the result is
returned in dual subslice units, inheriting the inconsistency from the
kernel -- Reason is that as of now all our Gen12 subslice counts
returned by gen_device_info.c are really dual subslice counts, and the
num_eu_per_subslice counts are also scaled accordingly, so it seems
like it would only make the matter worse if I fixed the units of this
field only without also fixing the rest.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
This command allows programming custom pixel hashing tables
controlling the balancing of load across pixel pipes. Rather
confusingly 3DSTATE_SLICE_TABLE_STATE_POINTERS was serving the same
purpose on Gen11: A pixel is mapped to the pixel pipe with index
specified by the entry in the table corresponding to the LSBs of the
pixel coordinates [Yes you read right the entries are neither subslice
nor slice indices!]. Either a 2-way or a 3-way table can be
programmed based on whether the platform has two or three pixel pipes
per slice. In addition the 16x8 tables defined below can hold two
separate 8x8 tables when in DUAL_TABLE mode (which AFAIA is only
useful for platforms with multiple asymmetric slices -- I.e. no
production platforms as of today to my knowledge).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
The former "Subslice Hashing Mode" field is no longer used by the
hardware, Gen12 parts always do 16x16 subslice pixel hashing -- Remove
it since it's no longer useful. In addition add a couple of bits that
will be useful in order to make some adjustments to the default pixel
pipe hashing behavior.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
Previously can_do_source_mods was used to determine whether a value with
a source modifier or a value from a scalar source (e.g., a uniform)
could be copy propagated. The former is a superset of the latter, so
this always produces correct results, but it is overly restrictive. For
example, a BFI instruction can't have source modifiers, but it can have
scalar sources.
This was originally authored to prevent a small number of shader-db
regressions in a commit that marked SHR has not being able to have
source modifiers. That commit has since been dropped in favor of a
different method.
v2: Refactor register region restriction detection to a helper function.
Suggested by Jason.
No fossil-db changes on any Intel platform.
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20039111 -> 20038943 (<.01%)
instructions in affected programs: 31736 -> 31568 (-0.53%)
helped: 104
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.30% max: 0.88% x̄: 0.45% x̃: 0.42%
95% mean confidence interval for instructions value: -2.03 -1.20
95% mean confidence interval for instructions %-change: -0.47% -0.42%
Instructions are helped.
total cycles in shared programs: 980309750 -> 980308897 (<.01%)
cycles in affected programs: 591078 -> 590225 (-0.14%)
helped: 70
HURT: 26
helped stats (abs) min: 2 max: 622 x̄: 23.94 x̃: 4
helped stats (rel) min: <.01% max: 2.85% x̄: 0.33% x̃: 0.12%
HURT stats (abs) min: 2 max: 520 x̄: 31.65 x̃: 6
HURT stats (rel) min: 0.02% max: 2.45% x̄: 0.34% x̃: 0.15%
95% mean confidence interval for cycles value: -26.41 8.64
95% mean confidence interval for cycles %-change: -0.27% -0.03%
Inconclusive result (value mean confidence interval includes 0).
No shader-db changes on earlier Intel platforms.
Reviewed-by: Anuj Phogat anuj.phogat@gmail.com [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9237>
Some nir lowerings might need to know if txs is supported by
the backend.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>
Before we introduced the submission thread in 829699ba63, once we
returned from vkQueueSubmit, all signaled syncobj would have a
i915_request/dma-fence waiting to be signaled by some work that would
submitted to HW by i915.
After this submission thread that is no longer the case. We added a
few checks in places like vkQueuePresentKHR() to wait for the binary
semaphores to materialize before we would hand things over to the WSI
code.
Unfortunately 829699ba63 forgot to reset the signaled binary
semaphore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 829699ba63 ("anv: implement shareable timeline semaphores")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4276
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9188>
Without this check, the INTEL_MEASURE environment variable could be
misused to overwrite arbitrary files.
Fixes: 0f4143ec37 ("intel: Print GPU timing data based on INTEL_MEASURE")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9166>
This doesn't actually assume the top 32 bits of the source value are
zero. Instead, it does (src >> shift) & UINT32_MAX regardless of what
the top bits of src are.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9116>
It actually returns ~0/0. We're about to make things more self-
documenting so we can delete the comment instead of fixing it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9116>
Instead of doing a vkQueueSubmit() and hoping for the best, use the
actual sync FD that gets passed in from SurfaceFlinger. The semaphore
and fence FD import functions already handle the -1 case for us so the
implementation is almost trivial.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad@kiwitree.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8814>
They've all supported it since either forever or Iron Lake which is
equivalent to forever for Vulkan.
From Kenneth Graunke's GitLab review:
"Linear blending of depth buffer data is usually fairly nonsense
(something's 2 meters away? another thing's 6 meters away? let's
just report 4 meters?)...but it's definitely a thing we can do, so
we may as well let apps do it, and trust them not when it doesn't
make sense."
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9110>
Test the sampler->conversion for NULL pointer before dereferencing it.
Fixes: Regressions in VulkanCTS.
Fixes: 226316116c "intel/anv: Fix condition to set MipModeFilter for YUV surface"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
On Intel platforms before Gen6, there is no min or max instruction.
Instead, a comparison instruction (*more on this below) and a SEL
instruction are used. Per other IEEE rules, the regular comparison
instruction, CMP, will always return false if either source is NaN. A
sequence like
cmp.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F
(+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F
will generate the wrong result for min if g22 is NaN. The CMP will
return false, and the SEL will pick g22.
To account for this, the hardware has a special comparison instruction
CMPN. This instruction behaves just like CMP, except if the second
source is NaN, it will return true. The intention is to use it for min
and max. This sequence will always generate the correct result:
cmpn.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F
(+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F
The problem is... for whatever reason, we don't emit CMPN. There was
even a comment in lower_minmax that calls out this very issue! The bug
is actually older than the "Fixes" below even implies. That's just when
the comment was added. That we know of, we never observed a failure
until #4254.
If src1 is known to be a number, either because it's not float or it's
an immediate number, use CMP. This allows cmod propagation to still do
its thing. Without this slight optimization, about 8,300 shaders from
shader-db are hurt on Iron Lake.
Fixes the following piglit tests (from piglit!475):
tests/spec/glsl-1.20/execution/fs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/fs-nan-builtin-min.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-min.shader_test
Closes: #4254
Fixes: 2f2c00c727 ("i965: Lower min/max after optimization on Gen4/5.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8115134 -> 8115135 (<.01%)
instructions in affected programs: 229 -> 230 (0.44%)
helped: 0
HURT: 1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>
Since the CMPN builder was never used, there was no reason to make its
interface usable. :)
Fixes: 2f2c00c727 ("i965: Lower min/max after optimization on Gen4/5.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>
These checks were originally assertions elsewhere either in the existing
code or later in this MR.
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>
Mip Mode Filter must be set to MIPFILTER_NONE for Planar YUV surfaces.
Add the missing condition to check for planar format.
Fixes: b24b93d584 "anv: enable VK_KHR_sampler_ycbcr_conversion"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
On Gen7, we have to split shuffles into two MOVs for 64-bit types so we
can't handle source modifiers. On Gen12.5, we have to use integer types
all the time so we can't use them there either. Fixing that will be a
different commit but it interacts with this one.
Fixes: 90c9f29518 "i965/fs: Add support for nir_intrinsic_shuffle"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>
Ensure that all resources are properly released by
properly parenting them to a memory context and releasing
the context during test teardown.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8162>
Fixes crash in ppgtt_lookup when the same bo is used twice
with different offsets.
It's possible to hit this with i965 and always_flush_batch=true.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9008>
This function is used as a callback and the other instance
of this callback doesn't add its own new line.
Messages printed by this function already end with a new line.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8988>
Aubinator creates lots of 4k mappings, so for large traces it's
possible to hit system limit on the number of mappings created
by a single process.
Ideally, aubinator should merge those mappings, but that's tricky
and I'm not sure it's worth spending time on.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8988>
By the Vulkan specification, and similarly to many other Vulkan calls,
it is allowed to destroy a null descriptor update template.
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Fixes: af5f13e58c ("anv: add VK_KHR_descriptor_update_template support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9005>