When we hit a GPU hang, we failed to reset Surface State Base Address
right away, and would keep hanging until we filled up the binder. Then
we'd finally get it right after a lot of repeated stumbles. Update it
right away so we hopefully hang fewer times before succeeding.
Shader-db results on Kaby Lake:
total instructions in shared programs: 15306230 -> 15304726 (<.01%)
instructions in affected programs: 4570 -> 3066 (-32.91%)
helped: 16
HURT: 0
total cycles in shared programs: 361703436 -> 361680041 (<.01%)
cycles in affected programs: 129388 -> 105993 (-18.08%)
helped: 16
HURT: 0
LOST: 0
GAINED: 2
The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal
Space Program.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It will be true for the constant/system value buffer because they use a
constant zero but it's not true in general. If we ever got here when
the source wasn't constant, nir_src_as_uint would assert.
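
A minimal sketch of the guarded pattern, assuming simplified stand-in types.
`example_src`, `example_src_as_uint` and `example_try_get_const_index` are
hypothetical names used only for illustration; they are not the NIR API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a source operand: either a known constant or
 * a value that is only available at run time. */
struct example_src {
   bool is_const;
   uint64_t value; /* only meaningful when is_const is true */
};

/* Mirrors the behavior described above: asserts on non-constant input. */
static uint64_t
example_src_as_uint(const struct example_src *src)
{
   assert(src->is_const);
   return src->value;
}

/* Guarded variant: reports whether a constant index was available
 * instead of assuming it, and leaves *index untouched otherwise. */
static bool
example_try_get_const_index(const struct example_src *src, uint64_t *index)
{
   if (!src->is_const)
      return false;

   *index = example_src_as_uint(src);
   return true;
}
```

A caller would fall back to its dynamic-index path whenever the guarded
helper returns false.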
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Silence two unused var warnings. And init elem_size, elem_align to
zero to silence "maybe uninitialized" warnings.
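
As an illustration only, the sketch below shows the kind of code that
triggers a "maybe uninitialized" warning and how zero-initializing avoids
it. The enum and function names are hypothetical; only elem_size and
elem_align come from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type enum for the example. */
enum example_base_type { EXAMPLE_FLOAT, EXAMPLE_DOUBLE, EXAMPLE_STRUCT };

static bool
example_elem_size_align(enum example_base_type type,
                        unsigned *size, unsigned *align)
{
   assert(size && align);

   /* Without the initializers, the compiler cannot prove that every path
    * through the switch assigns both variables. */
   unsigned elem_size = 0, elem_align = 0;

   switch (type) {
   case EXAMPLE_FLOAT:
      elem_size = 4;
      elem_align = 4;
      break;
   case EXAMPLE_DOUBLE:
      elem_size = 8;
      elem_align = 8;
      break;
   default:
      /* Types not handled here keep the zero defaults. */
      break;
   }

   *size = elem_size;
   *align = elem_align;
   return elem_size != 0;
}
```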
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jason pointed out that we don't need to keep an entire copy of the
serialized NIR around, we just need the SHA1. This does change our
disk cache key to be taking a SHA1 of a SHA1, which is a bit odd,
but should work out and be faster and use less memory.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
spirv_to_nir() returned the nir_function corresponding to the
entrypoint, as a way to identify it. There's now a bool is_entrypoint
in nir_function and also a helper function to get the entry_point from
a nir_shader.
The return type reflects better what the function name suggests. It
also helps drivers avoid the mistake of reusing internal shader
references after running NIR_PASS on it. When using NIR_TEST_CLONE or
NIR_TEST_SERIALIZE, those would be invalidated right in the first pass
executed.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Replace its uses with checking for is_entrypoint and calling
nir_shader_get_entrypoint().
This is a preparation to change spirv_to_nir() return type.
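
A rough sketch of the lookup this relies on, using simplified stand-in
types. `example_function`, `example_shader` and
`example_shader_get_entrypoint` are hypothetical; the real NIR structures
and helper differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a shader and its functions. */
struct example_function {
   const char *name;
   bool is_entrypoint;
};

struct example_shader {
   const struct example_function *functions;
   size_t num_functions;
};

/* Find the entrypoint by its flag rather than by holding on to a pointer
 * that was returned when the shader was created. */
static const struct example_function *
example_shader_get_entrypoint(const struct example_shader *shader)
{
   assert(shader != NULL);

   for (size_t i = 0; i < shader->num_functions; i++) {
      if (shader->functions[i].is_entrypoint)
         return &shader->functions[i];
   }
   return NULL;
}
```

Looking the entrypoint up from the shader each time means the caller does
not depend on a function pointer staying valid across passes.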
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Replace its use with checking for is_entrypoint.
This is a preparation to change spirv_to_nir() return type.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Replace its uses with nir_shader_get_entrypoint(), and change the
helper function to return nir_shader *.
This is a preparation to change spirv_to_nir() return type.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
When readback is true, and there are pending writes in the transfer
queue, we should flush to avoid reading back outdated data. This
fixes piglit arb_copy_buffer/dlist and a subtest of
arb_copy_buffer/data-sync.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
It is possible and valid for a pointer to be selected based on a
conditional before being used, and depending on the mode, those cases
will result in a phi with derefs as sources.
To achieve this, we don't rematerialize derefs that are used by phis.
As a consequence, when converting from SSA to regs, we may have phis
that come from different blocks and are used by phis. We now convert
those to regs too.
Validation was added to ensure only derefs of certain modes can be
used as phi sources. No extra validation is needed for the presence
of a cast: any instruction that uses derefs will validate that the
deref chain is complete (ending in a cast or a var).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This shouldn't be allowed in GLES 1/2.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x, so
we shouldn't allow its enums there either.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This enum shouldn't be allowed on OpenGL ES 1.x, so let's instead
use the extension-helpers, and check for desktop and gles extensions
separately.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This extension isn't enabled for GLES 1.x, so we shouldn't allow the
state there. Let's use the extension-helpers instead of CHECK_EXTENSION
for this.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This just makes the logic of the checks for this enum the same for
gl{Enable,Disable} and for glIsEnabled. They are already functionally
the same, so this is just a minor code-cleanup.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
{En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on
compatibility contexts. While we're at it, modernize the code a bit,
by using the extension helpers instead of open-coding.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
It makes sense to use the image view formats when resolving
inside subpasses, while we have to use the image formats for
normal resolves.
Original patch by Philip Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write query
data after the reset operation.
This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This makes the following packets use actual driver provided sizes rather
than guessing an arbitrary number:
- CC_VIEWPORT
- SF_CLIP_VIEWPORT
- BLEND_STATE
- COLOR_CALC_STATE
- SCISSOR_RECT
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Link all Gallium drivers with ld_args_build_id to prevent failures in
Iris, which uses GNU_BUILD_ID.
Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=110757
Fixes: 4756864cdc "iris: Start wiring up on-disk shader cache"
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
If the driver waits for CP DMA to be idle and emits an EOP event,
we need more space.
This fixes a crash with Quake Champions.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(Technically this is common code, but it doesn't affect i965 or anv.)
Improves performance of GFXBench5/gl_tess_off on Skylake GT4e at 1080p
by 9.3933% +/- 0.0305157% by eliminating all spilling in the GS.
Improves performance of GFXBench5/gl_4_off (Car Chase) on Skylake GT4e
at 1080p by 0.325208% +/- 0.0842233% (n=18).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We scalarize IO to enable further optimizations, such as propagating
constant components across shaders, eliminating dead components, and
so on. This patch attempts to re-vectorize those operations after
the varying optimizations are done.
Intel GPUs are a scalar architecture, but IO operations work on whole
vec4's at a time, so we'd prefer to have a single IO load per vector
rather than 4 scalar IO loads. This re-vectorization can help a lot.
Broadcom GPUs, however, really do want scalar IO. radeonsi may want
this, or may want to leave it to LLVM. So, we make a new flag in the
NIR compiler options struct, and key it off of that, allowing drivers
to pick. (It's a bit awkward because we have per-stage settings, but
this is about IO between two stages...but I expect drivers to globally
prefer one way or the other. We can adjust later if needed.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This fixes the egl_mesa_platform_surfaceless piglit test as well
as the new egl_ext_device_base piglit test on classic swrast.
v2: Fix swrast surfaceless contexts on the driver side.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
This makes use of radv_meta_resolve_compute_image() by filling
a VkImageResolve region instead of duplicating code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This reverts commit 55376cb31e.
It's been over a year and both Qt 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
When using the binding tables to access arrays of YCbCr descriptors we
did not consider the offset of the accessed element. We can't do a
simple multiplication because the binding table entries are tightly packed.
For example element 0 of the array could use 2 entries/planes and
element 1 could use 2 entries/planes.
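
One way to picture the offset computation is a running sum over the
preceding elements. The sketch below is an assumption-laden illustration:
`example_ycbcr_element` and `example_element_binding_offset` are
hypothetical names, and the plane counts in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-element description: how many binding table entries
 * (one per plane) the element occupies. */
struct example_ycbcr_element {
   uint8_t plane_count;
};

/* Return the binding table index of the first plane of elements[index].
 * array_base is the index of the array's first entry. Because entries are
 * tightly packed, the offset is the sum of the plane counts of all
 * preceding elements, not index times a fixed stride. */
static uint32_t
example_element_binding_offset(const struct example_ycbcr_element *elements,
                               size_t count, size_t index,
                               uint32_t array_base)
{
   assert(index < count);

   uint32_t offset = array_base;
   for (size_t i = 0; i < index; i++)
      offset += elements[i].plane_count;
   return offset;
}
```

With illustrative plane counts of 2, 3 and 1 and a base of 4, the elements
start at entries 4, 6 and 9 respectively.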
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bb8768b9d ("anv: toggle on support for VK_EXT_ycbcr_image_arrays")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Fixes the following piglit test and does not introduce any regressions:
spec@ext_packed_depth_stencil@fbo-depth-gl_depth24_stencil8-blit
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
This commit also adds codegen for branch since we need it
for discard_if.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
The old code was not wrong because the transitions performed
after the resolves should re-emit the framebuffer if needed.
This change is mostly a no-op but it improves consistency
regarding other meta operations that need to save/restore subpasses.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This helper will be useful for clearing HTILE after some
depth/stencil resolves.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The libmesa_anv_gen* modules require anv_extensions.h. This patch makes
sure it gets generated as a dependency before building them.
Signed-off-by: Chenglei Ren <chenglei.ren@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Just move the block that checks the availability bit into the
switch like other query types.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>