This commit fixes performance regressions introduced by e03f965280
in which we started bounds checking our push constants. This added a
LOT of shader code to shaders which use the robustBufferAccess feature
and led to substantial spilling. The checking we just added to the FS
back-end is far more efficient for two reasons:
1. It can be done at a whole register granularity rather than per-
scalar and so we emit one SIMD8 SEL per 32B GRF rather than one
SIMD16 SEL (executed as two SELs) for each component loaded.
2. Because we do it with NoMask instructions, we can do it on whole
pushed GRFs without splatting them out to SIMD8 or SIME16 values.
This means that robust buffer access no longer explodes our register
pressure for no good reason.
As a tiny side-benefit, we're now using can use AND instead of SEL which
means no need for the flag and better scheduling.
Vulkan pipeline database results on ICL:
Instructions in all programs: 293586059 -> 238009118 (-18.9%)
SENDs in all programs: 13568515 -> 13568515 (+0.0%)
Loops in all programs: 149720 -> 149720 (+0.0%)
Cycles in all programs: 88499234498 -> 84348917496 (-4.7%)
Spills in all programs: 1229018 -> 184339 (-85.0%)
Fills in all programs: 1348397 -> 246061 (-81.8%)
This also improves the performance of a few apps:
- Shadow of the Tomb Raider: +4%
- Witcher 3: +3.5%
- UE4 Shooter demo: +2%
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4447>
Even though vkCmdRenderPassBegin() isn't allowed inside a secondary
command buffer, vkCmdDispatch() is, and we emit an IB with compute
dispatches, which means that if the secondary command buffer records a
vkCmdDispatch() then we'll have an IB inside an IB, which is illegal.
Fixes hangs in e.g.
dEQP-VK.api.command_buffers.record_simul_use_secondary_one_primary.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4605>
This is an added interaction in NV_compute_shader_derivatives added in
Sep 2019.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4583>
One notable change is that DRAW_INFO_START_WITH_USER_INDICES is enabled.
An audit of the code indicates that it should work, and a number of
piglit tests exercising glMultiDrawElements continue to function.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4520>
We make some assumptions in opt_large_constants such as the size_align
function returning the obvious sizes for vectors. Now that we've got
the deref_size lying around, we may as well assert it's consistent with
our assumptions. In particular, we now assert that it really claims
booleans are 32-bit. If anyone's driver ever decides to be clever and
change this, we'll now catch the breakage earlier.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4468>
In f1883cc73d we tried to pass through alignments from load_constant
intrinsics when rewriting them to load_ubo in iris. However, those
intrinsics don't have ALIGN_MUL or ALIGN_OFFSET indices. It's easy
enough to add them. We just call the size/align function on the vector
type at the end of our deref chain and use the alignment returned from
there. It's possible we could do better by walking the whole deref
chain but this should be good enough.
Fixes: f1883cc73d "iris: Set alignments on cbuf0 and constant reads"
Closes: #2739
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4468>
clang-cpp.so is broken in LLVM-9 and doesn't exist in LLVM<9,
however meson will find and try to use system libraries in these cases.
v2: Use helper variable to dedpulicate test code
Move second test inside the condition to avoid testing good clang-cpp twice
v3: Check for cross compilation
v4: style fixes
Fixes: ff1a3a00cb
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4457>
The ISA doc is inconsistent whether this instruction writes SCC. It does.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
It was implemented in 1df871f8ff, but to
really enable it we need to enable PIPE_CAP_DEPTH_BOUNDS_TEST.
v2: Add release notes (Ian).
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4540>
flexint.h uses stdint.h if the compiler claims to support C99. MSVC
doesn't support enough of C99 to enable this flag, but it supports
enough to keep flex happy.
Without this, we end up with *both* some flex-specific definitions as
well as our own definitions from mesa-headers, producing a slew of
compiler warnings.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
On Windows, main/glheader.h ends up including windows.h which in turn
includes wingdi.h unless the NOGDI macro is defined. And wingdi.h
defines a macro called "ERROR", which we end up redefining below.
To avoid a warning on the redefinition, we can define NOGDI to prevent
wingdi.h from implicitly being included.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
This cast-expression was meant to cast the result of the terniary
expression, but it just casted the condition expression instead. Let's
correct this, to silence a compiler-warning.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
This just silences a compiler-warning about a potentially uninitialized
variable. It's not uninitialized, but it's a bit hard for the compiler
to see. So let's just initialize it to zero.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
These casts cause warnings on x64. We're passing integers through
pointers, which works fine.
So let's make the casts a bit more explicit, to silence that warning.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
Instead of exposing various layout functions, move image-related logic
into tu_image.c and have the image_view pre-fill relevant register values.
This changes the clear/blit code to use image_view.
This will make it much easier to deal with aspect masks, in particular for
planar formats and D32_S8.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>
Otherwise reading from an imported mapped GTT+WC linear texture
is painfully slow.
Sadly no radeon winsys implementation, as I don't know a suitable
kernel driver operation.
Hit this in vaGetImage with an image imported from minigbm (which
we are switching to allocate WC for SCANOUT images).
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4542>
An update seems to have poisoned gitlab-runner, and it can no longer
resolve DNS even though the host system can.
Disable this until we can figure out what's going on.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4589>
Choosing LLVM's link mode is legitimate on UNIX systems, but only static
actually really works under Windows.
Give shared-llvm a default 'auto' mode which will pick the previous
default of true (shared) on UNIX systems, but newly defaulting to false
(static) on Windows.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4555>
Even though we normally use the CP_BLIT path with resolves that aren't
aligned, there's a special case when we're resolving the entire image
and there's enough padding so that we can still use CP_EVENT_WRITE::BLIT
when the render area isn't aligned. The hardware seems to not like
unaligned scissors when not clearing, and sometimes hangs rather than
silently round the scissor. This causes hangs in e.g.
dEQP-VK.glsl.derivate.dfdx.texture.msaa4.float_highp.
There was some concern that the CP_BLIT path might use this scissor
also, but I confirmed that this isn't the case by setting it to 0 before
resolving and then noting that CP_BLIT still works (but CP_EVENT_WRITE
doesn't). Furthermore, this is actually impossible because of how the 2D
engine is set up: it gets its own pair of register banks, which can be
switched independently of the 3D register banks, so that 2D events
(CP_BLIT) normally aren't synchronized relative to 3D events
(CP_EVENT_WRITE, CP_DRAW_*, and CP_EXEC_CS) and therefore they can't
share any registers except for non-pipelined registers like RB_CCU_CNTL
that don't use the register bank mechanism at all.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4585>