Sampler state prefetching is broken on Gen11, and WA_160668216 says
to disable it. Apparently sampler state prefetching also has basically
zero impact on performance, so we don't need to worry there.
i965, anv, and iris already handle this correctly, but we missed BLORP.
Ideally the kernel should globally disable this by writing SARCHKMD, at
which point we wouldn't have to worry about it. But let's be defensive
and handle it ourselves too.
v2: separate out from BTP workaround in case we change that eventually
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> [v1]
When immutable samplers are set we call write_image_view with a NULL
image view. This causes issues on IVB where we have to fake texture
swizzling.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999
Fixes: d2aa65eb18 "anv: Emulate texture swizzle in the shader when..."
When set, do as requested and skip any transfer optimization.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Alexandros Frantzis <alexandros.frantzis@collabora.com>
When set, wait after each flush.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Alexandros Frantzis <alexandros.frantzis@collabora.com>
VIRGL_DEBUG_BGRA_DEST_SWIZZLE should use bit 3. Make some cosmetic
changes as well.
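
As an illustration only, a debug flag set in which every flag owns a
distinct bit might look like the sketch below. Only the name
VIRGL_DEBUG_BGRA_DEST_SWIZZLE and its bit position (3) come from this
message; the other flag names and the helper are hypothetical
placeholders.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag set. Only VIRGL_DEBUG_BGRA_DEST_SWIZZLE and its bit
 * position are taken from the commit message; EXAMPLE_* names are
 * placeholders for whatever occupies the lower bits. */
enum example_debug_flags {
   EXAMPLE_DEBUG_FLAG_0          = 1u << 0,
   EXAMPLE_DEBUG_FLAG_1          = 1u << 1,
   EXAMPLE_DEBUG_FLAG_2          = 1u << 2,
   VIRGL_DEBUG_BGRA_DEST_SWIZZLE = 1u << 3,
};

/* Returns true when the given flag bit is set in the mask. */
static bool
example_debug_enabled(uint32_t mask, uint32_t flag)
{
   return (mask & flag) != 0;
}
```

If two flags share a bit, setting one silently enables the other, which
is the kind of bug a wrong bit assignment produces.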
Fixes: a478e56fbd ("virgl: Add debug flag to bypass driconf to enable the BGRA tweaks")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Alexandros Frantzis <alexandros.frantzis@collabora.com>
SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Ported from RadeonSI, will be emitted for GFX10 too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This reduces the size of fill operations needed to clear CMASK
for layered color textures.
GFX9 unsupported for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This reduces the size of fill operations needed to clear FMASK
for layered color textures.
GFX9 unsupported for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes a rendering issue with RoTR/DXVK.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
A leak may happen when any of the following VS members is enabled:
uses_firstvertex, uses_baseinstance, uses_drawid,
uses_is_indexed_draw.
The call to gen6_upload_push_constants allocates
stage_state->push_const_bo. brw_prepare_shader_draw_parameters then,
through brw_upload_data, makes draw_params_bo point at the same BO
as push_const_bo and takes a reference on it, and that reference
is never dropped.
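
The general pattern behind this class of leak can be sketched as
follows. This is a simplified model with hypothetical names
(example_bo and its helpers), not the driver's brw_bo API and not
necessarily the shape of the actual fix: whenever a slot is repointed
at a buffer, the reference it held before has to be dropped.

```c
#include <stdlib.h>

/* Hypothetical reference-counted buffer object. */
struct example_bo {
   int refcount;
};

static struct example_bo *
example_bo_alloc(void)
{
   struct example_bo *bo = calloc(1, sizeof(*bo));
   if (bo)
      bo->refcount = 1;
   return bo;
}

static void
example_bo_reference(struct example_bo *bo)
{
   bo->refcount++;
}

/* Returns 1 if the last reference was dropped and the BO was freed. */
static int
example_bo_unreference(struct example_bo *bo)
{
   if (--bo->refcount > 0)
      return 0;
   free(bo);
   return 1;
}

/* Repoint *slot at bo: take the new reference first, then drop the
 * reference the slot held before, so nothing is left dangling. */
static void
example_bo_assign(struct example_bo **slot, struct example_bo *bo)
{
   if (*slot == bo)
      return;
   if (bo)
      example_bo_reference(bo);
   if (*slot)
      example_bo_unreference(*slot);
   *slot = bo;
}
```

Skipping the unreference step leaves the count above zero forever,
which is what valgrind reports as "definitely lost".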
Fixes leak:
136 bytes in 1 blocks are definitely lost in loss record 6 of 13
at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596)
by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672)
by 0xC314BB3: brw_upload_space (intel_upload.c:88)
by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155)
by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300)
by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540)
by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659)
by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681)
by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052)
by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175)
by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386)
by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
st/egl used to support eglCreatePbufferFromClientBuffer, but now that
it's gone, any call to it would segfault.
Let's return a nice error instead.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Nobody ever uses these, so let's just hard code them instead.
If an EGL driver ever comes around that needs them they're trivial to
re-add.
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The driver doesn't use these values and ac_rtld has assertions
expecting the value of 0.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We'll have to extend this at some point, and using a bitfield union in
this way makes it easier to get the right index without excessive
branching.
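
A minimal sketch of the idea, with hypothetical field names (the real
structure is not shown in this message): overlaying the bitfields with
an integer lets the combined flags be used directly as a table index.
Bitfield layout is implementation-defined, so this assumes the usual
GCC/Clang layout on a little-endian target.

```c
#include <string.h>

#define EXAMPLE_NUM_VARIANTS 8

/* Hypothetical key: three one-bit properties overlaid with an integer. */
union example_variant_key {
   struct {
      unsigned flag_a : 1;
      unsigned flag_b : 1;
      unsigned flag_c : 1;
   } bits;
   unsigned index;
};

/* Builds the key and returns it as an index, with no if/else chain. */
static unsigned
example_variant_index(unsigned flag_a, unsigned flag_b, unsigned flag_c)
{
   union example_variant_key key;

   /* Clear all bits first so the unused ones do not leak into index. */
   memset(&key, 0, sizeof(key));
   key.bits.flag_a = flag_a != 0;
   key.bits.flag_b = flag_b != 0;
   key.bits.flag_c = flag_c != 0;
   return key.index;
}
```

Adding a property later means adding one bitfield and doubling the
table size, without touching the lookup code.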
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The initial prototype used a processor-specific symbol type, but
feedback suggests that the preferred approach is a processor-specific
section name that encodes the alignment, analogous to SHN_COMMON
symbols.
This patch keeps both variants around for now to reduce problems
with LLVM compatibility as we switch branches around.
This also cleans up the error reporting in this function.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Incrementing the iteration count was intended to fix an off-by-one error
when the first terminator was superseded by a later terminator. If
there is no first terminator or later terminator, there is no off-by-one
error. Incrementing the loop count creates one. This can be seen in
loops like:
    do {
        if (something) {
            // No breaks or continues here.
        }
    } while (false);
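
To make the expected count concrete, the following C sketch (the
function name is hypothetical) counts how often the body of such a
loop runs. It runs exactly once, so an unroller that emits one copy
more than that would execute the body twice.

```c
/* Counts how many times the body of the loop above executes. */
static int
example_count_body_runs(int something)
{
   int runs = 0;

   do {
      runs++;
      if (something) {
         /* No breaks or continues here. */
      }
   } while (0);

   return runs;
}
```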
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Abel Briggs <abelbriggs1@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
Fixes: 646621c66d ("glsl: make loop unrolling more like the nir unrolling path")
We were pessimistically uploading all of it in case of indirection,
but we can just bump that when we encounter indirection.
total constlen in shared programs: 2529623 -> 2485933 (-1.73%)
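
The approach can be sketched roughly as below. The struct and helper
names are hypothetical and sizes are in arbitrary units; this is not
the ir3 code itself. Direct accesses only grow the upload length as
far as they reach, and the length is bumped to the full size only when
an indirect access is actually seen.

```c
/* Hypothetical bookkeeping for how much constant data to upload. */
struct example_const_state {
   unsigned size;       /* total size of the constant data */
   unsigned upload_len; /* how much of it we currently plan to upload */
};

/* A direct access at [offset, offset + count) extends the upload only
 * as far as the access reaches, clamped to the total size. */
static void
example_note_direct_access(struct example_const_state *s,
                           unsigned offset, unsigned count)
{
   unsigned end = offset + count;

   if (end > s->size)
      end = s->size;
   if (end > s->upload_len)
      s->upload_len = end;
}

/* An indirect access may touch anything, so upload everything. */
static void
example_note_indirect_access(struct example_const_state *s)
{
   s->upload_len = s->size;
}
```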
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we
need to upload (all of it, since it will lower indirect UBO 0 accesses
from load_ubo back to indirection on the constant buffer).
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
If the NIR-level analysis decided to move UBO loads to the constant
file, but the backend decided not to load those constants, we could
upload past the end of constlen. This is particularly relevant for
pre-a6xx, where we emit a different constlen between bin and render
variants.
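
One way to picture the guard against this, as a sketch with a
hypothetical helper rather than the actual patch: clamp the range
being uploaded so that it never extends past the constlen of the
variant being emitted.

```c
/* Returns how many units starting at offset may be uploaded without
 * going past constlen. All values share the same (arbitrary) unit. */
static unsigned
example_clamp_upload(unsigned offset, unsigned count, unsigned constlen)
{
   unsigned room;

   if (offset >= constlen)
      return 0;

   room = constlen - offset;
   return count < room ? count : room;
}
```

With different constlen values for the bin and render variants, the
same requested range can clamp to different sizes.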
(Fix by Rob, commit message by anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
UBOs and uniforms now use a common code path with an explicit `index`
argument passed, enabling UBO reads.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Prevents an assert(0) later in this (not so edge) case. We still have to
have a dummy there.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We've known about this for a while, but it was never formally in the
machine header files / decoder, so let's add them in.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Now that all the counting is sorted, it's a matter of passing along a
GPU address and going.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We already uploaded UBOs, but only a fixed number (1) for uniforms;
let's upload as many as we compute we need.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>