The V3D driver has an open-coded solution for this, and we need the
same thing for Panfrost, so let's add a generic way to lower TXS(LOD)
into max(TXS(0) >> LOD, 1).
Changes in v2:
* Use == 0 instead of !
* Rework the minification logic as suggested by Jason
* Assign cursor pos at the beginning of the function
* Patch the LOD just after retrieving the old value
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
get_texture_size() will create a txs instruction with ->sampler_dim set
to the original tex->sampler_dim. The condition to call lower_rect()
only checks the value of ->sampler_dim and whether lower_rect is
requested or not. This leads to an infinite loop when calling
nir_lower_tex() with the same options until it returns false.
In order to avoid that, let's move the tex->sampler_dim patching before
get_texture_size() is called. This way the txs instruction will have
->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try
to lower it on the subsequent passes.
Changes in v2:
* Add Jason R-b
* Add a comment explaining why we patch ->sampler_dim at the beginning
of the lower_rect() func
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The code considers that projector lowering was done even if it's not
really the case. Change the project_src() prototype to return a bool
encoding whether projector lowering happened or not and update the
progress var accordingly in nir_lower_tex_block().
---
Changes in v2:
* Add Jason R-b
* Drop the part suggesting that nir_lower_rect() could be called in
a do-while(progress) loop.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We hadn't updated the kernel header after the driver got into mainline.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Only skip levels without DCC when it's a DCC decompression.
Whoops.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
In other words, make use of radv_dcc_enabled() instead of
radv_image_has_dcc() all over the places.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Without them, the state tracker falls back to an RGBA format, but it
doesn't always manage to override the swizzle for us. So we lose the
information that the API expects an X channel, where alpha is garbage
and reads back as 1. We have no equivalent ISL RGBX format for these,
so we just use RGBA directly and override the swizzle in all cases.
The VaryingNames array has NumVaryings entries. But BufferStride is
a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with
more than 4 varyings would read out of bounds.
Also, BufferStride is set based on the shader itself, which means that
it's inherently already included in the hash, and doesn't need to be
included again. At the point when shader_cache_read_program_metadata
is called, the linker hasn't even set those fields yet. So, just drop
it entirely.
Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.
Fixes: 6d830940f7 glsl/shader_cache: Allow shader cache usage with transform feedback
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This should fix floating-point border color on all gen7 HW. Integer is
still thoroughly busted on gen7 because it doesn't exist on IVB and it's
crazy on HSW.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Whenever stencil texturing is not required (most of the time), we can
use VK_EXT_separate_stencil_usage to only create the shadow image when
VK_IMAGE_USAGE_SAMPLED_BIT is required for stencil. Of course, this
depends on applications to use the extension but hopefully DXVK and
similar translators are doing so and that covers most of the apps.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Intel hardware didn't get support for sampling from W-tiled (required
for stencil) images until Broadwell so we can't directly sample from
stencil. Instead, if we want to support stencil texturing on gen7
hardware, we have to keep a texture-capable shadow copy around and use
BLORP to update when stencil changes. The one thing this commit does
not implement is self-dependencies with stencil input attachments.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99493
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This should ensure the TC invalidate happens after the stall.
Fixes KHR-GL43.copy_image.functional which does a CopyImage (blorp_copy)
from a buffer (using R8G8B8A8_UINT), then GetTexImage to read back the
original image (using R10G10B10A2_UNORM).
When copying/blitting with format reinterpretation, we invalidate the
texture cache before/after. Before is so the source of the copy works,
and after is to get rid of our new data in the "wrong" format to protect
future attempts to sample.
When I ported these hacks to iris, I tried to be cautious by only
bothering with the hacks if the batch referenced the BO. This makes
some sense for the before case. If it isn't referenced, the texture
cache can't really have any data for the BO (since it's also invalidated
between batches). But we still need to do the after case regardless,
as we've just polluted the cache with hazardous entries.
When the host virglrenderer is an older version that doesn't check the sRGB write
control feature, or when the guest kernel doesn't support CAPS v2, then the guest
will only report support for GL 2.1 on a GL 3.3 host, even though it was supporting
3.3 with earlier guest mesa versions.
By also checking the host feature check version this regression can be avoided.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110921
Fixes: 2845939d6a
virgl: Set sRGB write control CAP based on host capabilities
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Fixes:
dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d
dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d
dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect
Signed-off-by: Rob Clark <robdclark@chromium.org>
The stencil is actually in the .w component, but we used to use SWAP to
remap the channels. This doesn't work when tiled/ubwc.
Fixes:
dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array
dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube
dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array
dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube
dEQP-GLES31.functional.stencil_texturing.misc.base_level
dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot
dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot
dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil
Signed-off-by: Rob Clark <robdclark@chromium.org>
Inline the ring buffer and signal logic into lp_scene_queue instead of
using a u_ringbuffer. The code ends up simpler since there's no need
to handle serializing data from / to packets.
This fixes a crash when compiling Mesa with LTO, that happened because
of util_ringbuffer_dequeue() was writing data after the "header
packet", as shown below
struct scene_packet {
struct util_packet header;
struct lp_scene *scene;
};
/* Snippet of old lp_scene_deque(). */
packet.scene = NULL;
ret = util_ringbuffer_dequeue(queue->ring,
&packet.header,
sizeof packet / 4,
return packet.scene;
but due to the way aliasing analysis work the compiler didn't
considered the "&packet->header" to alias with "packet->scene". With
the aggressive inlining done by LTO, this would end up always
returning NULL instead of the content read by
util_ringbuffer_dequeue().
Issue found by Marco Simental and iThiago Macieira.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110884
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Just encode the Mali magic number for `replace` rather than awkwardly
forcing Gallium structures through.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We switch all fmov to (i)mov, following the NIR switch. This simplifies
some code surrounding blend shaders and should have no functional
changes elsewhere.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
When the resource to be mapped is busy and the backing storage can
be discarded, reallocate the backing storage to avoid waiting.
In this new path, we allocate a new buffer, emit a state change,
write, and add the transfer to the queue . In the
PIPE_TRANSFER_DISCARD_RANGE path, we suballocate a staging buffer,
write, and emit a copy_transfer (which may allocate, memcpy, and
blit internally). The win might not always be clear. But another
win comes from that the new path clears res->valid_buffer_range and
does not clear res->clean_mask. This makes it much more preferable
in scenarios such as
access = enough_space ? GL_MAP_UNSYNCHRONIZED_BIT :
GL_MAP_INVALIDATE_BUFFER_BIT;
glMapBufferRange(..., GL_MAP_WRITE_BIT | access);
memcpy(...); // append new data
glUnmapBuffer(...);
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
We are going support reallocating the HW resource for a
virgl_resource. When that happens, the virgl_resource needs to be
rebound to the context.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
When PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE is properly supported,
virgl_transfer might refer to a different virgl_hw_res than
virgl_resource does. We need to save the virgl_hw_res and use the
saved one.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>