Re-allocate rather than re-use. Originally we had an unnecessarily
complex design to avoid re-allocating cmdstream buffers. But now that
support for "growable" cmdstream buffers has been in place for a couple
years, I guess we can care a bit less about the extra overhead on older
kernels.
But making the batches one-shot removes a class of potential race
conditions vs the flush_queue.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Instead of the reading batch setting a dependency on the writing batch,
simply flush the writing batch immediately. This avoids situations
where we have to flush the context's current batch later.
Signed-off-by: Rob Clark <robdclark@gmail.com>
This was basically to avoid a zero-dword IB (indirect-branch), but
instead just don't emit the IB packet in that case.
Signed-off-by: Rob Clark <robdclark@gmail.com>
pipe_framebuffer_state can have samples=0 in various cases, which is
actually the same thing as samples=1. So use the _get_num_samples()
helper to populate the key, to avoid this looking like two distinct
fb states to the cache.
Signed-off-by: Rob Clark <robdclark@gmail.com>
It is possible for a batch to be freed under our feet when flushing, so
it is best to hold a reference to all of them up-front.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
This allows us to implement glMinSampleShading correctly, which up
until now just got ignored.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Converting from a switch statement that would not allow intermediate sample counts
to use an if-else chain went a bit wrong, so that in some cases the range that
should be inclusive was exclusive and the line for 16 samples was copies wrongly.
v2: elaborate commit message.
Fixes: 91f48cdfe5
virgl: Add support for glGetMultisample
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)
fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run.
v2: simplify the changes a bit
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
If you load S and clear Z or vice versa, the clear may get lost. Just
fall back to drawing a quad.
Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8
These are not unit tests, as they rely on the host's XVMC and some user
configuration. Switch them over to being general installed tools, to fix
unit testing.
Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
The current spill code checks in each instruction of an instruction group whether
spilling is needed and if so, it adds spilling for each component as a seperate
instruction and it allocates a new temporary for each component and since it takes
the write mask from the TGSI representation, all components might be written
each time and as a result already written components might be overwritten with
garbage like:
...
y: MOV R9.y, [0x42140000 37].x
t: MOV R8.x, [0x42040000 33].y
...
MEM_SCRATCH WRITE_IND_ACK 0 R9.xy__, @R4.x ES:3
MEM_SCRATCH WRITE_IND_ACK 0 R8.xy__, @R4.x ES:3
...
To resolve this isse accumulate spills to the same memory location so that only one
memory write instruction is emitted for an instruction group that writes up to all
four components.
This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/):
spec/glsl-1.30/execution
fs-large-local-array-vec2.shader_test
fs-large-local-array-vec3.shader_test
fs-large-local-array-vec4.shader_test
v2: fix some typos and add comment about piglits (Roland Scheidegger)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
For V3D, the HW will interpolate slightly differently along the shared
edge of the trifan. The conformance tests manage to catch this in the
nearest_consistency_* group. To get interpolation to match, we need the
last vertex of the triangle to be shared.
I first tried implementing draw_rectangle to do triangles instead, but
that was quite a bit (147 lines) of code duplication from u_blitter, and
this seems much simpler and less likely to break as u_blitter changes.
Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
These helpers have been unused, and were definitely not useful since
330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
made it so that they never had an index buffer passed in.
For an upcoming u_blitter change to use these helpers, I have just 6 bytes
of index data, so pass it as user data until a more interesting caller
comes along.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
I had mistakenly used the COHERENT flag, which can only be set when
PERSISTENT is mapped, but isn't always.
Fixes: a2014c2eb9 ("vc4: Simplify the DISCARD_RANGE handling")
All of our other texture arrays will be tiled, but 1D is an array of
raster mappings and we had the wrong value plugged in here. Fixes piglit
getteximage-targets 1D_ARRAY
We were only emitting the RT blend state for RT 0 and only enabling it for
RT 0, when the gallium API for !independent_blend is for rt0's state to
apply to all of them.
Fixes piglit fbo-drawbuffers-blend-add.
We just never set the value that was returned for MSAA mappings (directly
reading back an MSAA framebuffer). Since we're handing back ss_map, it
should be ss_map's stride from our nested transfer.
Fixes piglit /home/anholt/src/piglit/bin/fbo-depthstencil -samples=4
cases.
Reviewed-by: Rob Clark <robdclark@gmail.com>
We created a temporary with box->{width,height} and then tried to map
width,height from a nonzero offset when we meant to just map the whole
temporary.
Fixes segfaults in V3D in dEQP-GLES3.functional.prerequisite.read_pixels
with --deqp-egl-config-name=rgba8888d24s8ms4 and also piglit's read-front
clear-front-first -samples=4
Reviewed-by: Rob Clark <robdclark@gmail.com>
It's optional, only implemented by the etnaviv driver so far.
Fixes: 501d0edeca "st/mesa: call resource_changed when binding a
EGLImage to a texture"
Fixes: a37cf630b4 "gallium: add pipe_screen::resource_changed callback
wrappers"
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
The vtest protocol is pretty simple but also pretty dumb, and
the v1 caps query was fixed size, with no nice way to expand it,
however the server also ignores any command it doesn't understand.
So we can query v2 caps by sending a v2 followed by a v1, if the
v2 is ignored we know it's an old vtest server, and the we get
a v2 answer then we can just read the v1 answer and discard it.
Acked-by: Jakob Bornecrantz <jakob@collabora.com> (sounds good)
The logic was duplicated in a pretty gross way, when what we really need
is just a helper function for stuffing the values in the packet. This
will make implementing noperspective easier.
Java2d opengl pipeline passes NULL piAttribList to
wglCreatePbufferARB(). So skip parsing the attribute list
if it is NULL.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
OpenGL 4.4 requires a max vertex attrib of 2048 or higher, but
r600 only supports 2047. Technically, this makes it an GL4.3 GPU,
but it's currently exposing GL4.4.
To avoid regressing the GL version supported in the following
patches, let's just lie and pretend like we support 2048. Any
applications using 2048 are already broken anyway.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
fold_assoc() called from fold_alu_op3() can lower the number of src to 2,
which then leads to an invalid access to n.src[2]->gvalue().
This didn't seem to have caused much harm in the past, but on Fedora 28
it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although
with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was
needed to show the issue).
An alternative fix would be to instead call fold_alu_op2() from within
fold_assoc() when the number of src is reduced and return always TRUE
from fold_assoc() in this case, with the only actual difference being
the return value from fold_alu_op3() then. I'm not sure what the return
value actually should be in this case (or whether it even can make a
difference).
https://bugs.freedesktop.org/show_bug.cgi?id=106928
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
phi instructions don't have the same results by simply having the same sources.
They need to be inside the same BasicBlock or share an equal condition
resulting into a path through the shader selecting equal sources as well.
short example:
cond = ...;
const0 = 0;
const1 = 1;
if (cond) {
ssa_1 = const0;
} else {
ssa_2 = const1;
}
ssa_3 = phi ssa_1 ssa_2;
if (!cond) {
ssa_4 = const0;
} else {
ssa_5 = const1;
}
ssa_6 = phi ssa_4 ssa_5;
allthough both phis actually have sources with equal results, merging them
would be wrong due to having a different condition selecting which source to
take.
For now we also stick an assert into GlobalCSE, because it should never end up
having to merge phi instructions.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
total instructions in shared programs : 5804448 -> 5804690 (0.00%)
total gprs used in shared programs : 670065 -> 670065 (0.00%)
total shared used in shared programs : 548832 -> 548832 (0.00%)
total local used in shared programs : 21068 -> 21068 (0.00%)
local shared gpr inst bytes
helped 0 0 0 5 5
hurt 0 0 0 191 191
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Ported from i965 including the comment.
This fixes:
dEQP-EGL.functional.reusable_sync.valid.wait_server
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
In Python 2, `print` was a statement, but it became a function in
Python 3.
Using print functions everywhere makes the script compatible with Python
versions >= 2.6, including Python 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Cleans up the CL of fbo-drawbuffers2-blend a bit. We could do better on
more complicated cases by noticing if multiple RTs have the same blend
state and emitting them in a single packet.
Commit f69bc797e1 did the following:
- if format.layout in ('bptc', 'astc'):
+ if format.layout in ('astc'):
The intention was to go from matching either 'bptc' or 'astc' to
matching only 'astc'.
But the new code doesn't respect this intention any more, because in
Python `('astc')` is not a tuple containing a string, it is just the
string. (the parentheses are simply ignored)
That means we now match any substring of 'astc', for example 'a'.
This commit fixes the test to respect the original intention.
Fixes: f69bc797e1 "gallium/auxiliary: Add helper support for
bptc format compress/decompress"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>