Commit graph

34781 commits

Author SHA1 Message Date
Rob Clark
4b847b38ae freedreno: make fd_batch a one-shot thing
Re-allocate rather than re-use.  Originally we had an unnecessarily
complex design to avoid re-allocating cmdstream buffers.  But now that
support for "growable" cmdstream buffers has been in place for a couple
years, I guess we can care a bit less about the extra overhead on older
kernels.

But making the batches one-shot removes a class of potential race
conditions vs the flush_queue.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
f129971e71 freedreno: flush immediately when reading a pending batch
Instead of the reading batch setting a dependency on the writing batch,
simply flush the writing batch immediately.  This avoids situations
where we have to flush the context's current batch later.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
20f677f6bc freedreno: get rid of noop render
This was basically to avoid a zero-dword IB (indirect-branch), but
instead just don't emit the IB packet in that case.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
15f6c0509a freedreno: fix samples=0 vs samples=1 confusion
pipe_framebuffer_state can have samples=0 in various cases, which is
actually the same thing as samples=1.  So use the _get_num_samples()
helper to populate the key, to avoid this looking like two distinct
fb states to the cache.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
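The effect of normalizing the sample count before building a cache key can be sketched as follows. This is a minimal Python illustration, not the driver's C code; `get_num_samples` and `make_key` are hypothetical names, and the real key holds more framebuffer state than is shown here.

```python
def get_num_samples(samples):
    """Treat samples=0 (unspecified) as single-sampled, i.e. samples=1."""
    return max(samples, 1)


def make_key(width, height, samples):
    # Hypothetical cache key: only the fields needed for the illustration.
    return (width, height, get_num_samples(samples))


cache = {}
cache[make_key(64, 64, 0)] = "batch A"

# A lookup with samples=1 finds the entry stored with samples=0,
# so the cache no longer sees two distinct framebuffer states.
assert make_key(64, 64, 1) in cache
```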
Rob Clark
d77fcdeb59 freedreno: comment for _invalidate_batch()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Rob Clark
f2570409f9 freedreno: hold batch references when flushing
It is possible for a batch to be freed under our feet when flushing, so
it is best to hold a reference to all of them up-front.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-17 11:00:00 -04:00
Erik Faye-Lund
591b700944 virgl: respect max_vertex_attrib_stride cap
This is required for OpenGL 4.4 and OpenGL ES 3.1 support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 15:45:37 +10:00
Lepton Wu
04e278f793 virgl: Fix flush in virgl_encoder_inline_write.
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 14:56:25 +10:00
Erik Faye-Lund
b5db3aa6e8 virgl: implement set_min_samples
This allows us to implement glMinSampleShading correctly, which up
until now just got ignored.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 13:59:47 +10:00
Marek Olšák
4054133dcc r600: fix build after the removal of RADEON_PRIO_* flags
2018-07-16 14:33:31 -04:00
Marek Olšák
f8aa116c3c winsys/amdgpu: clean up error handling in amdgpu_cs_submit_ib
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
6b1e0e51e6 radeonsi: rework RADEON_PRIO flags to be <= 31
This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
54ad9b444c radeonsi: merge DCC/CMASK/HTILE priority flags
For a later simplification.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
3e6888e5d7 radeonsi: remove non-GFX BO priority flags
For a later simplification.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
342fff6cbc winsys/amdgpu: use alloca when using global_bo_list
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
6ec44b7055 winsys/amdgpu: remove label bo_list_error
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
7346e5296e winsys/amdgpu: always update gfx_bo_list_counter
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Marek Olšák
caf41fb96d winsys/amdgpu: make amdgpu_cs_context::flags & handles local
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-16 13:32:33 -04:00
Gert Wollny
78887e99e3 mesa/virgl: Fix off-by-one and copy-paste error in multisample position evaluation
Converting from a switch statement that would not allow intermediate sample counts
to use an if-else chain went a bit wrong, so that in some cases the range that
should be inclusive was exclusive and the line for 16 samples was copied wrongly.

v2: elaborate commit message.

Fixes: 91f48cdfe5
       virgl: Add support for glGetMultisample
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)
2018-07-16 12:51:39 +02:00
Karol Herbst
4d0d911875 nouveau: fix 3D blitter for unsigned to signed integer conversions
Fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run.

v2: simplify the changes a bit

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-15 19:28:37 +02:00
Jason Ekstrand
b52d79514c vc4: Tell NIR to lower fdiv instructions
This should allow us to use them in nir_lower_tex

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-13 14:02:18 -07:00
Eric Anholt
d009463a65 vc4: Switch to using u_transfer_helper for MSAA maps.
No requirement, just reduces code duplication.
2018-07-13 13:29:29 -07:00
Eric Anholt
afcc714c98 v3d: Work around GFXH-1461 bug losing our Z/S clears.
If you load S and clear Z or vice versa, the clear may get lost.  Just
fall back to drawing a quad.

Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8
2018-07-13 13:29:29 -07:00
Eric Anholt
162fcdad6a meson: Move xvmc test tools from unit tests to installed tools.
These are not unit tests, as they rely on the host's XVMC and some user
configuration.  Switch them over to being general installed tools, to fix
unit testing.

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-13 13:29:29 -07:00
Gert Wollny
695a4cb0f6 r600: Add spill output to group only if register or target index changes
The current spill code checks, for each instruction of an instruction group,
whether spilling is needed. If so, it adds spilling for each component as a
separate instruction and allocates a new temporary for each component. Because
it takes the write mask from the TGSI representation, all components might be
written each time, so already written components might be overwritten with
garbage like:

   ...
   y: MOV                R9.y,  [0x42140000 37].x
   t: MOV                R8.x,  [0x42040000 33].y
   ...
   MEM_SCRATCH  WRITE_IND_ACK 0     R9.xy__, @R4.x  ES:3
   MEM_SCRATCH  WRITE_IND_ACK 0     R8.xy__, @R4.x  ES:3
   ...

To resolve this issue, accumulate spills to the same memory location so that only one
memory write instruction is emitted for an instruction group that writes up to all
four components.

This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/):
   spec/glsl-1.30/execution
       fs-large-local-array-vec2.shader_test
       fs-large-local-array-vec3.shader_test
       fs-large-local-array-vec4.shader_test

v2: fix some typos and add comment about piglits (Roland Scheidegger)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
2018-07-13 21:11:34 +02:00
Marek Olšák
2e0b00ab7d radeonsi: add support for Vega20
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-12 16:48:12 -04:00
Eric Anholt
e8dc3c0c36 u_blitter: Add an option to draw the triangles using an index buffer.
For V3D, the HW will interpolate slightly differently along the shared
edge of the trifan.  The conformance tests manage to catch this in the
nearest_consistency_* group.  To get interpolation to match, we need the
last vertex of the triangle to be shared.

I first tried implementing draw_rectangle to do triangles instead, but
that was quite a bit (147 lines) of code duplication from u_blitter, and
this seems much simpler and less likely to break as u_blitter changes.

Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-12 11:49:22 -07:00
Eric Anholt
c17dac0534 u_draw: Add some indices to the util_draw_elements() helpers.
These helpers have been unused, and were definitely not useful since
330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
made it so that they never had an index buffer passed in.

For an upcoming u_blitter change to use these helpers, I have just 6 bytes
of index data, so pass it as user data until a more interesting caller
comes along.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-12 11:49:20 -07:00
Eric Anholt
50a3a283d0 vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.
I had mistakenly used the COHERENT flag, which can only be set when
PERSISTENT is mapped, but isn't always.

Fixes: a2014c2eb9 ("vc4: Simplify the DISCARD_RANGE handling")
2018-07-12 11:31:08 -07:00
Eric Anholt
7714896256 v3d: Don't automatically reallocate a PERSISTENT-mapped buffer.
I had mistakenly used the COHERENT flag, which can only be set when
PERSISTENT is mapped, but isn't always.

Fixes piglit bufferstorage-persistent read
2018-07-12 11:31:08 -07:00
Eric Anholt
e48c615292 v3d: Fix stride of 1D_ARRAY mappings.
All of our other texture arrays will be tiled, but 1D is an array of
raster mappings and we had the wrong value plugged in here.  Fixes piglit
getteximage-targets 1D_ARRAY
2018-07-12 11:31:08 -07:00
Eric Anholt
97ddeed949 v3d: Fix MRT blending with independent blending disabled.
We were only emitting the RT blend state for RT 0 and only enabling it for
RT 0, when the gallium API for !independent_blend is for rt0's state to
apply to all of them.

Fixes piglit fbo-drawbuffers-blend-add.
2018-07-12 11:31:08 -07:00
Eric Anholt
e0dbbf9987 gallium/u_transfer_helper: Initialize the stride of MSAA maps.
We just never set the value that was returned for MSAA mappings (directly
reading back an MSAA framebuffer).  Since we're handing back ss_map, it
should be ss_map's stride from our nested transfer.

Fixes piglit /home/anholt/src/piglit/bin/fbo-depthstencil -samples=4
cases.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-07-12 11:31:06 -07:00
Eric Anholt
589bb5bd65 gallium/u_transfer_helper: Fix MSAA mappings with nonzero x/y.
We created a temporary with box->{width,height} and then tried to map
width,height from a nonzero offset when we meant to just map the whole
temporary.

Fixes segfaults in V3D in dEQP-GLES3.functional.prerequisite.read_pixels
with --deqp-egl-config-name=rgba8888d24s8ms4 and also piglit's read-front
clear-front-first -samples=4

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-07-12 11:31:00 -07:00
Michel Dänzer
34e89e4d38 gallium: Check pipe_screen::resource_changed before dereferencing it
It's optional, only implemented by the etnaviv driver so far.

Fixes: 501d0edeca "st/mesa: call resource_changed when binding a
                     EGLImage to a texture"
Fixes: a37cf630b4 "gallium: add pipe_screen::resource_changed callback
                     wrappers"
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-07-12 17:39:12 +02:00
Dave Airlie
45e25adfe8 virgl/vtest: add support to vtest for new cap getting.
The vtest protocol is pretty simple but also pretty dumb, and
the v1 caps query was fixed size, with no nice way to expand it,
however the server also ignores any command it doesn't understand.

So we can query v2 caps by sending a v2 followed by a v1. If the
v2 is ignored we know it's an old vtest server, and if we get
a v2 answer then we can just read the v1 answer and discard it.

Acked-by: Jakob Bornecrantz <jakob@collabora.com> (sounds good)
2018-07-10 09:07:37 +10:00
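The probing scheme described above can be modelled with a toy client and server. This is only a sketch in Python under stated assumptions: the command names `GET_CAPS` and `GET_CAPS2`, the reply tags, and the idea that each reply identifies its own kind are all hypothetical and are not taken from the vtest protocol itself.

```python
from collections import deque


def make_server(understands_v2):
    """Toy server: answers known commands, silently ignores unknown ones."""
    def handle(commands):
        replies = deque()
        for cmd in commands:
            if cmd == "GET_CAPS":
                replies.append(("caps_v1", "fixed-size caps"))
            elif cmd == "GET_CAPS2" and understands_v2:
                replies.append(("caps_v2", "extended caps"))
            # Any other command produces no reply at all.
        return replies
    return handle


def query_caps(server):
    # Always send the v2 query first, followed by the v1 query.
    replies = server(["GET_CAPS2", "GET_CAPS"])
    kind, caps = replies.popleft()
    if kind == "caps_v2":
        # New server: the v1 answer is still queued, so read and discard it.
        replies.popleft()
    # Old server: the v2 query was ignored and the first reply is already v1.
    return kind, caps


print(query_caps(make_server(understands_v2=False)))
print(query_caps(make_server(understands_v2=True)))
```

Sending both queries unconditionally means the client never has to wait for a reply that an old server would not send.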
Eric Anholt
beeb94402f v3d: Implement noperspective varyings on V3D 4.x.
Fixes a bunch of piglit interpolation tests, and reduces my concern about
some MSAA blit shaders with noperspective varyings.
2018-07-09 11:48:32 -07:00
Eric Anholt
4b4795be9d v3d: Refactor flat shade/centroid flag emission.
The logic was duplicated in a pretty gross way, when what we really need
is just a helper function for stuffing the values in the packet.  This
will make implementing noperspective easier.
2018-07-09 11:48:32 -07:00
Charmaine Lee
097952abaa st/wgl: check for NULL piAttribList in wglCreatePbufferARB()
Java2d opengl pipeline passes NULL piAttribList to
wglCreatePbufferARB(). So skip parsing the attribute list
if it is NULL.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-07-06 17:32:49 -07:00
Erik Faye-Lund
747cf468ff r600: report incorrect max-vertex-attrib for GL 4.4
OpenGL 4.4 requires a max vertex attrib of 2048 or higher, but
r600 only supports 2047. Technically, this makes it a GL4.3 GPU,
but it's currently exposing GL4.4.

To avoid regressing the GL version supported in the following
patches, let's just lie and pretend like we support 2048. Any
applications using 2048 are already broken anyway.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-09 17:32:31 +02:00
Roland Scheidegger
817efd8968 r600/sb: fix crash in fold_alu_op3
fold_assoc() called from fold_alu_op3() can lower the number of src to 2,
which then leads to an invalid access to n.src[2]->gvalue().
This didn't seem to have caused much harm in the past, but on Fedora 28
it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although
with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was
needed to show the issue).

An alternative fix would be to instead call fold_alu_op2() from within
fold_assoc() when the number of src is reduced and return always TRUE
from fold_assoc() in this case, with the only actual difference being
the return value from fold_alu_op3() then. I'm not sure what the return
value actually should be in this case (or whether it even can make a
difference).

https://bugs.freedesktop.org/show_bug.cgi?id=106928
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-09 07:17:29 +01:00
Karol Herbst
de13978733 nv50/ir: fix Instruction::isActionEqual for PHI instructions
phi instructions don't have the same results by simply having the same sources.
They need to be inside the same BasicBlock or share an equal condition
resulting in a path through the shader selecting equal sources as well.

short example:

cond = ...;
const0 = 0;
const1 = 1;

if (cond) {
  ssa_1 = const0;
} else {
  ssa_2 = const1;
}
ssa_3 = phi ssa_1 ssa_2;

if (!cond) {
  ssa_4 = const0;
} else {
  ssa_5 = const1;
}
ssa_6 = phi ssa_4 ssa_5;

Although both phis actually have sources with equal results, merging them
would be wrong due to having a different condition selecting which source to
take.

For now we also stick an assert into GlobalCSE, because it should never end up
having to merge phi instructions.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-07-07 20:32:33 +02:00
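The short example in this message can be checked by evaluating both phis for each value of the condition. The following Python sketch models only the example's control flow; `first_phi` and `second_phi` are hypothetical names, and nothing here reflects the nv50 IR data structures.

```python
def first_phi(cond):
    # if (cond) ssa_1 = const0; else ssa_2 = const1;  ssa_3 = phi ssa_1 ssa_2
    return 0 if cond else 1


def second_phi(cond):
    # if (!cond) ssa_4 = const0; else ssa_5 = const1;  ssa_6 = phi ssa_4 ssa_5
    return 0 if not cond else 1


for cond in (True, False):
    # Same pair of sources (0 and 1), but the opposite one is selected.
    print(cond, first_phi(cond), second_phi(cond))
```

Both phis draw from the same two constants, yet they disagree for every value of `cond`, which is why comparing sources alone is not enough to treat them as equal.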
Rhys Perry
f2cc694d8e nvc0/ir: use the combined tid special register
total instructions in shared programs : 5804448 -> 5804690 (0.00%)
total gprs used in shared programs    : 670065 -> 670065 (0.00%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21068 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0           0           5           5
      hurt           0           0           0         191         191

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-07-07 20:31:56 +02:00
Marek Olšák
0eaf069679 st/dri: fix a crash in server_wait_sync
Ported from i965 including the comment.

This fixes:
    dEQP-EGL.functional.reusable_sync.valid.wait_server

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-07-06 16:23:37 -04:00
Mathieu Bridon
0f7b18fa0d python: Use the print function
In Python 2, `print` was a statement, but it became a function in
Python 3.

Using print functions everywhere makes the script compatible with Python
versions >= 2.6, including Python 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-07-06 10:04:22 -07:00
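As a small illustration of the difference, the call form below runs unchanged on Python 2.6+ and Python 3. It is a sketch, not code from the converted scripts, and `format_line` is a hypothetical helper.

```python
def format_line(name, count):
    """Build the text separately so that print() receives a single argument."""
    return "%s: %d" % (name, count)


# Python 2 only (statement form, a syntax error on Python 3):
#     print format_line("formats", 3)

# Works on both: with one parenthesized argument, Python 2 prints the same
# text as Python 3. Calls with several arguments or with file= additionally
# need "from __future__ import print_function" at the top of the file on
# Python 2.
print(format_line("formats", 3))
```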
Eric Anholt
9d0406c52f v3d: Fix leak of the default attributes BOs.
The GLES3 CTS makes a lot more progress on a run now.
2018-07-05 15:50:54 -07:00
Eric Anholt
6b11131373 v3d: Fix leak of the spill BO on context destruction.
2018-07-05 15:50:52 -07:00
Eric Anholt
03f6d26b62 v3d: Skip emitting per-RT blend state for RTs with blend disabled.
Cleans up the CL of fbo-drawbuffers2-blend a bit.  We could do better on
more complicated cases by noticing if multiple RTs have the same blend
state and emitting them in a single packet.
2018-07-05 12:39:36 -07:00
Eric Anholt
572f6ab489 v3d: Add proper support for GL_EXT_draw_buffers2's blending enables.
I had flagged it as enabled on V3D 4.x, but not actually implemented the
per-RT enables.  Fixes piglit fbo_drawbuffers2-blend.
2018-07-05 12:39:36 -07:00
Mathieu Bridon
3153bcc73e gallium/auxiliary: Fix string matching
Commit f69bc797e1 did the following:

-        if format.layout in ('bptc', 'astc'):
+        if format.layout in ('astc'):

The intention was to go from matching either 'bptc' or 'astc' to
matching only 'astc'.

But the new code doesn't respect this intention any more, because in
Python `('astc')` is not a tuple containing a string, it is just the
string. (the parentheses are simply ignored)

That means we now match any substring of 'astc', for example 'a'.

This commit fixes the test to respect the original intention.

Fixes: f69bc797e1 "gallium/auxiliary: Add helper support for
                             bptc format compress/decompress"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-07-05 11:48:47 +01:00
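The pitfall described in this message is easy to reproduce. In the Python sketch below, `matches_buggy` and `matches_tuple` are hypothetical names; the one-element tuple is just one way to express the intended test (a plain `==` comparison would do as well), and it is not claimed to be the exact form used in the fix.

```python
def matches_buggy(layout):
    # ('astc') is just the string 'astc', so "in" performs a substring test.
    return layout in ('astc')


def matches_tuple(layout):
    # The trailing comma makes a one-element tuple, so "in" tests membership.
    return layout in ('astc',)


print(type(('astc')).__name__)   # str
print(type(('astc',)).__name__)  # tuple

for layout in ('astc', 'a', 'bptc'):
    print(layout, matches_buggy(layout), matches_tuple(layout))
```

Only the layout `'a'` separates the two versions in this run: the substring test accepts it, while the tuple membership test rejects it.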