We have some serious leaks, so plug some and also move to ralloc to
limit the lifetime of some objects to that of their parent.
Lots more such work to do.
For some reason, this fixes:
dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer
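The idea of limiting an object's lifetime to that of its parent can be sketched with a toy allocator. This is not Mesa's ralloc or its API; the toy_* names are hypothetical and only illustrate the ownership pattern, where every allocation is recorded on a parent context and freeing the parent releases all of them.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy parent-owned allocator. All names are hypothetical; this is NOT
 * Mesa's ralloc, only an illustration of the ownership idea. */
union toy_header {
    union toy_header *next;   /* next allocation owned by the same parent */
    max_align_t align;        /* keeps the payload suitably aligned */
};

struct toy_ctx {
    union toy_header *blocks; /* singly linked list of owned allocations */
    unsigned live;            /* number of allocations still owned */
};

/* Allocate zeroed memory whose lifetime is tied to ctx. */
static void *toy_alloc(struct toy_ctx *ctx, size_t size)
{
    union toy_header *h = calloc(1, sizeof(*h) + size);
    if (!h)
        return NULL;
    h->next = ctx->blocks;
    ctx->blocks = h;
    ctx->live++;
    return h + 1;             /* payload starts right after the header */
}

/* Free every allocation owned by ctx; returns how many were freed. */
static unsigned toy_free_all(struct toy_ctx *ctx)
{
    unsigned freed = 0;
    while (ctx->blocks) {
        union toy_header *next = ctx->blocks->next;
        free(ctx->blocks);
        ctx->blocks = next;
        freed++;
    }
    ctx->live = 0;
    return freed;
}
```

With this pattern, a child object cannot outlive its parent by accident, which is the kind of leak the change above is trying to plug.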
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.
Fixes: f014ae3c7c ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as
GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into
    ptr = glMapBufferRange(buf, 0, size,
                           GL_MAP_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT);
    memcpy(ptr, data1, size);
    glUnmapBuffer(buf);
    ptr = glMapBufferRange(buf, size, size,
                           GL_MAP_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT);
    memcpy(ptr, data2, size);
    glUnmapBuffer(buf);
we never want data1 to be copy_transfer'ed, because that would mean
that data2 might overwrite valid data.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Fixes: a22c5df079 ("virgl: Use buffer copy transfers to avoid waiting when mapping")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Now that sRGB formats are supported for both rendering and sampling,
advertise support.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The performance impact is slightly mitigated by tiling the render
target, but it's undeniably still slow compared to AFBC. Unfortunately,
it doesn't look like AFBC and sRGB play nice...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
For fixed-function, we have hardware to handle sRGB so we just set a
flag. For blend shaders, it's rather more involved; this is currently
unimplemented. Assert it out for now; we don't need it quite yet.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We already can sample from Mali's linear/tiled encoding (the one from
Utgard -- AFBC is mostly unrelated); let's be able to render to it as
well.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
A mode for rendering tiled/uncompressed was noticed, so we reshuffle the
MFBD render target definitions to explicitly include block type.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a
single 2-bit texture target selection, noticing it's the same as the
2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we
share this definition and add the missing entry for 1D/buffer textures.
This requires a nontrivial (but functionally similar) refactor of all
parts of the driver to use the new definitions appropriately.
Theoretically, this should add support for buffer textures, but that's
obviously not tested and probably wouldn't work.
While doing so, we notice the sRGB enable bit, which we document and
decode as well here so we don't forget about it.
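As a rough sketch of how two flag bits collapse into one 2-bit selector: the bit order (is_3d as the low bit, is_not_cubemap as the high bit) and the enum values below are assumptions made for illustration, not definitions taken from the driver headers. Under that assumption, the one combination the old flags left unnamed is the slot for 1D/buffer textures.

```c
#include <stdbool.h>

/* Hypothetical 2-bit texture target; values assume is_3d is bit 0 and
 * is_not_cubemap is bit 1. Not copied from the driver headers. */
enum tex_target {
    TEX_TARGET_CUBE = 0x0,  /* is_not_cubemap = 0, is_3d = 0 */
    TEX_TARGET_1D   = 0x1,  /* combination the two flags never named */
    TEX_TARGET_2D   = 0x2,  /* is_not_cubemap = 1, is_3d = 0 */
    TEX_TARGET_3D   = 0x3,  /* is_not_cubemap = 1, is_3d = 1 */
};

/* Pack the two legacy flags into the single 2-bit selector. */
static enum tex_target tex_target_from_bits(bool is_3d, bool is_not_cubemap)
{
    return (enum tex_target)(((unsigned)is_not_cubemap << 1) |
                             (unsigned)is_3d);
}
```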
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Requirements for a job should be figured out in pan_job.c
v2: [Alyssa] Fix early return
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
It's nice to keep these two files in sync, as they define
guest userspace <---> host userspace communication.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This patch adds support for nir_texop_txs instructions which are needed
to support the OpenGL textureSize() function. This is also needed to
support RECT texture sampling which is currently lowered to 2D sampling +
a TXS() instruction by the nir_lower_tex() helper.
Changes in v2:
* Split options for the 1st and 2nd tex lowering passes
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We are about to add support for the TXS (texture size) op, which is not
implemented using a midgard texture instruction. Let's rename emit_tex()
to emit_texop_native() and repurpose emit_tex() as a dispatcher.
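The dispatcher shape described above might look roughly like the following. The enums and the returned "path" value are hypothetical stand-ins so the sketch compiles on its own; the real functions take compiler and NIR instruction arguments that are not reproduced here.

```c
/* Hypothetical stand-ins for the NIR texture ops handled here. */
enum texop { TEXOP_TEX, TEXOP_TXB, TEXOP_TXL, TEXOP_TXS };

/* Which code path handled the op; only used to make the sketch testable. */
enum emit_path { EMIT_PATH_NATIVE, EMIT_PATH_SYSVAL, EMIT_PATH_UNSUPPORTED };

/* Ops that map onto a hardware texture instruction. */
static enum emit_path emit_texop_native(enum texop op)
{
    (void)op;
    return EMIT_PATH_NATIVE;
}

/* Dispatcher: native ops go to emit_texop_native(); TXS is assumed to be
 * served by a sysval/uniform read rather than a texture instruction. */
static enum emit_path emit_tex(enum texop op)
{
    switch (op) {
    case TEXOP_TEX:
    case TEXOP_TXB:
    case TEXOP_TXL:
        return emit_texop_native(op);
    case TEXOP_TXS:
        return EMIT_PATH_SYSVAL;
    default:
        return EMIT_PATH_UNSUPPORTED;
    }
}
```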
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We're about to add more sysval types, and panfrost_emit_for_draw()
is big enough, so let's move the sysval upload logic into a separate
function.
We also add one sub-function per sysval type to keep the
panfrost_upload_sysvals() small/readable.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We are about to add support for nir_texop_txs which requires adding a
sysval/uniform containing the texture size. Let's change the
emit_sysval_read() prototype to take a nir_instr object instead of
a nir_intrinsic_instr one so we can re-use this function when emitting
a sysval for a txs instruction.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We hadn't updated the kernel header after the driver got into mainline.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Without them, the state tracker falls back to an RGBA format, but it
doesn't always manage to override the swizzle for us. So we lose the
information that the API expects an X channel, where alpha is garbage
and reads back as 1. We have no equivalent ISL RGBX format for these,
so we just use RGBA directly and override the swizzle in all cases.
This should ensure the TC invalidate happens after the stall.
Fixes KHR-GL43.copy_image.functional which does a CopyImage (blorp_copy)
from a buffer (using R8G8B8A8_UINT), then GetTexImage to read back the
original image (using R10G10B10A2_UNORM).
When copying/blitting with format reinterpretation, we invalidate the
texture cache before/after. Before is so the source of the copy works,
and after is to get rid of our new data in the "wrong" format to protect
future attempts to sample.
When I ported these hacks to iris, I tried to be cautious by only
bothering with the hacks if the batch referenced the BO. This makes
some sense for the before case. If it isn't referenced, the texture
cache can't really have any data for the BO (since it's also invalidated
between batches). But we still need to do the after case regardless,
as we've just polluted the cache with hazardous entries.
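The resulting rule can be summarized in a small sketch. The struct and function names are hypothetical and do not correspond to iris code; they only encode the decision described above: the "before" invalidate may be skipped when the batch does not reference the BO, while the "after" invalidate is needed whenever the copy reinterprets the format.

```c
#include <stdbool.h>

/* Hypothetical description of which texture cache invalidates to emit
 * around a copy/blit. Not an iris data structure. */
struct tex_cache_plan {
    bool invalidate_before;
    bool invalidate_after;
};

static struct tex_cache_plan
plan_tex_cache_invalidates(bool reinterprets_format, bool batch_references_bo)
{
    struct tex_cache_plan plan;

    /* Before: only useful if the cache could hold data for this BO. */
    plan.invalidate_before = reinterprets_format && batch_references_bo;

    /* After: the copy itself pollutes the cache, so always invalidate. */
    plan.invalidate_after = reinterprets_format;

    return plan;
}
```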
When the host virglrenderer is an older version that doesn't check the sRGB write
control feature, or when the guest kernel doesn't support CAPS v2, then the guest
will only report support for GL 2.1 on a GL 3.3 host, even though it was supporting
3.3 with earlier guest mesa versions.
By also checking the host feature check version this regression can be avoided.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110921
Fixes: 2845939d6a ("virgl: Set sRGB write control CAP based on host capabilities")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Fixes:
dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d
dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d
dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect
Signed-off-by: Rob Clark <robdclark@chromium.org>
The stencil is actually in the .w component, but we used to use SWAP to
remap the channels. This doesn't work when tiled/ubwc.
Fixes:
dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array
dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube
dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array
dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube
dEQP-GLES31.functional.stencil_texturing.misc.base_level
dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot
dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot
dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil
Signed-off-by: Rob Clark <robdclark@chromium.org>
Inline the ring buffer and signal logic into lp_scene_queue instead of
using a u_ringbuffer. The code ends up simpler since there's no need
to handle serializing data from / to packets.
This fixes a crash when compiling Mesa with LTO, which happened because
util_ringbuffer_dequeue() was writing data after the "header
packet", as shown below
    struct scene_packet {
       struct util_packet header;
       struct lp_scene *scene;
    };

    /* Snippet of old lp_scene_deque(). */
    packet.scene = NULL;
    ret = util_ringbuffer_dequeue(queue->ring,
                                  &packet.header,
                                  sizeof packet / 4,
                                  wait);
    return packet.scene;
but due to the way aliasing analysis works, the compiler didn't
consider "&packet->header" to alias with "packet->scene". With
the aggressive inlining done by LTO, this would end up always
returning NULL instead of the content read by
util_ringbuffer_dequeue().
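A minimal sketch of the inlined approach, assuming a fixed-size ring that stores scene pointers directly so that nothing is written through a pointer to a "header" member. The names and capacity are hypothetical, and the sketch is single-threaded: the mutex and condition-variable signalling that a real queue needs is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define SCENE_QUEUE_SIZE 4   /* hypothetical capacity; must be a power of two */

struct scene { int id; };    /* stand-in for struct lp_scene */

struct scene_queue {
    struct scene *slots[SCENE_QUEUE_SIZE];
    unsigned head;           /* index of the next slot to read */
    unsigned tail;           /* index of the next slot to write */
};

/* Store the pointer itself; no packet serialization is involved. */
static bool scene_queue_push(struct scene_queue *q, struct scene *s)
{
    if (q->tail - q->head == SCENE_QUEUE_SIZE)
        return false;        /* full */
    q->slots[q->tail++ % SCENE_QUEUE_SIZE] = s;
    return true;
}

/* Return the oldest queued scene, or NULL when the queue is empty. */
static struct scene *scene_queue_pop(struct scene_queue *q)
{
    if (q->head == q->tail)
        return NULL;         /* empty */
    return q->slots[q->head++ % SCENE_QUEUE_SIZE];
}
```

Because the dequeue reads a typed slot directly, the compiler has no partial-struct write to reason about.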
Issue found by Marco Simental and Thiago Macieira.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110884
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Just encode the Mali magic number for `replace` rather than awkwardly
forcing Gallium structures through.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We switch all fmov to (i)mov, following the NIR switch. This simplifies
some code surrounding blend shaders and should have no functional
changes elsewhere.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
When the resource to be mapped is busy and the backing storage can
be discarded, reallocate the backing storage to avoid waiting.
In this new path, we allocate a new buffer, emit a state change,
write, and add the transfer to the queue. In the
PIPE_TRANSFER_DISCARD_RANGE path, we suballocate a staging buffer,
write, and emit a copy_transfer (which may allocate, memcpy, and
blit internally). The win might not always be clear. But another
win is that the new path clears res->valid_buffer_range and
does not clear res->clean_mask. This makes it far preferable
in scenarios such as
access = enough_space ? GL_MAP_UNSYNCHRONIZED_BIT :
GL_MAP_INVALIDATE_BUFFER_BIT;
glMapBufferRange(..., GL_MAP_WRITE_BIT | access);
memcpy(...); // append new data
glUnmapBuffer(...);
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
We are going to support reallocating the HW resource for a
virgl_resource. When that happens, the virgl_resource needs to be
rebound to the context.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
When PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE is properly supported,
virgl_transfer might refer to a different virgl_hw_res than
virgl_resource does. We need to save the virgl_hw_res and use the
saved one.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>