GS is tested, tessellation is untested.
Have outputs_written_before_ps for HW VS and outputs_written for other
stages. The reason is that COLOR and BCOLOR alias for HW VS, which
drives elimination of VS outputs based on PS inputs.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by
about 4x, MSVC compiler was going crazy making temporaries and
split-loading inputs onto the stack unless explicit AVX-512 load ops
were added
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Added two new files for a wrapper function for initialization
v2: added missing include for single architecture builds
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
This fixes 4 out of 5 cases in:
arb_framebuffer_no_attachments-atomic on cayman.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
This unifies the explicit rasterization discard as well as the implicit
rasterization disabled logic (which we need for another state tracker),
which really should do the exact same thing.
We'll now toss out the prims early on in setup with (implicit or
explicit) discard, rather than do setup and binning with them, which
was entirely pointless.
(We should eventually get rid of implicit discard, which should also
enable us to discard stuff already in draw, hence draw would be
able to skip the pointless clip and fallback stages in this case.)
We still need separate logic for only null ps - this is not the same
as rasterization discard. But simplify the logic there and don't count
primitives simply when there's an empty fs, regardless of depth/stencil
tests, which seems perfectly acceptable by d3d10.
While here, also fix statistics for primitives if face culling is
enabled.
No piglit changes.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Fix build error.
CC v3d_blit.lo
In file included from v3d_blit.c:27:0:
v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory
#include "v3d_drm.h"
^~~~~~~~~~~
Fixes: 8a793d42f1 ("v3d: Switch the vc5 driver to using the finalized V3D UABI.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The host side hasn't got support for this feature yet, so don't enable it
unless we get the caps from the host.
This makes the texture buffer range piglit tests skip now.
Fixes: fe0647df5a (virgl: add offset alignment values to to v2 caps struct)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
I've confirmed after 77554d220d we no
longer need this to pass some tests from another api (as we no longer
generate the bogus extra null tris in the first place).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders.
Fix the code to check for num_instructions <= 1 instead.
Fixes: 8fde9429c3 ("tgsi: fix incorrect tgsi_shader_info::num_tokens
computation")
Tested-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
We were incrementing num_tokens in each loop iteration while parsing
the shader. But each call to tgsi_parse_token() can consume more than
one token (and often does). Instead, just call the tgsi_num_tokens()
function.
Luckily, this issue doesn't seem to effect any current users of this
field (llvmpipe just checks for <= 1, for example).
Reviewed-by: Neha Bhende<bhenden@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
With the syncobj support in place, lets use it to implement the
EGL_ANDROID_native_fence_sync extension. This mostly follows previous
implementations in freedreno and etnaviv.
v2: Drop the flags (Eric)
Handle in_fence_fd already in job_submit (Eric)
Drop extra vc4_fence_context_init (Eric)
Dup fds with CLOEXEC (Eric)
Mention exact extension name (Eric)
Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This gives us access to the fence created for the render job.
v2: Drop flag (Eric)
Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
We need to know if the kernel supports syncobj submission since otherwise
all the DRM syncobj calls fail.
v2: Use drmGetCap to detect syncobj support (Eric)
Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows the driver to load against the merged kernel DRM driver. In
the process, rename most of the build system variables and gallium
plumbing functions.
In the process of merging to the kernel, I renamed the driver to the
general product line's name (since we have both vc5 and vc6 supported
already). Since the ABI is finalized, move the header to include/drm-uapi.
At buffer resource validation time, if the resource handle is not yet
created and if the initial buffer bind flags and the tobind flags are
incompatible, just use the tobind flags to create the resource handle.
On the other hand, if the bind flags are compatible, we can combine
the bind flags for the resource handle creation.
Fixes piglit gl-3.1-buffer-bindings crash.
Reviewed-by: Brian Paul <brianp@vmware.com>
Seems that when the rnndb files for etniviv were updated/included back
in Nov 2017, hw/texdesc_3d.xml.h was missed from Makefile.sources and
meson.build. This was all during the conversion to meson, so it apears
to have slipped through the cracks. As such, this file has been missing
from the official tarballs since inclusion in Mesa, so the git trees
and tarballs differ.
Found due to lintian errors in the Debian packages.
Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Thanks for your comment. This version has an additional boolean in the
fps_info struct to distinguish between fps and frame time calculation.
The struct is initialised in the respecting install functions for this
purpose.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Use pipe_reference to release old RAT surfaces.
RAT surface adds a reference to pool bo, so use reference counting for pool->bo
as well.
v2: Use the same pattern for both defrag paths
Drop confusing comment
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Use a single allocation of array type instead of the old-style array
allocation for the temp and immediate arrays.
Probably only makes a difference if they aren't used indirectly (so,
if we used them solely because there's too many temps or immediates).
In this case the sroa and early-cse passes can sometimes do some
optimizations which they otherwise cannot.
(As a side note, for the temp reg array, we actually really should
use one allocation per array id, not just one for everything.)
Note that the instcombine pass would actually promote such
allocations to single alloc of array type as well, but it's too late
for some artificial shaders we've seen to help (we don't want to run
instcombine at the beginning due to its cost, hence would need
another sroa/cse pass after instcombine). sroa/early-cse help there
because they can actually eliminate all of the huge shader, reducing
it to a single const output (don't ask...).
(Interestingly, instcombine also removes all the bitcasts we do on that
allocation for single-value gathering, and in the end directly indexes
into the single vector elements, which according to spec is only
semi-valid, but this happens regardless. Another thing instcombine also
does is use inbound GEPs, which is probably something we should do
manually as well - for indirectly indexed reg files llvm may not be
able to figure it out on its own, but we should be able to guarantee
all pointers are always inbound. In any case, by the looks of it
using single allocation with array type seems to be the right thing
to do even for ordinary shaders.)
No piglit change.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The code didn't expect that, leading to crashes.
Fixes: 86d63b53a2 "gallium: remove aux_vertex_buffer_slot code"
Tested-by: Michel Dänzer <michel.daenzer@amd.com>