While we are at it, make it static and change the return values
policy to be consistent.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This silences a divergent error found with F1 2015.
Basically, the NDV bit has to be set when a FSWZ instruction is
inside divergent code, but it's not needed otherwise. The correct
fix should be to set it only in divergent code situations.
GM107 emitter already sets that bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Commit 7413625ad3 flipped a few functions too many to use
pipe_shader_type. These functions actually take an integer that does not
correspond 1:1 with the enum.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Backing views/surfaces are used to handle the case when a resource is
bound both as a render target and as a sampler source (such as when
doing auto mipmap generation).
This patch fixes a bug where mapping a resource (to do a glReadPixels)
was reading the stale data in the original surface rather than the
backing surface which was rendered to.
We need to propagate the backing resource (which we rendered to) back
to the original resource before we read from it. The problem was the
svga_propagate_rendertargets() function was examining the wrong surface
views.
This fixes the "poc9" test described in VMware bug 1686661.
Also tested with Piglit, Cinebench, Lightsmark, etc.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
This should be all that is required for cull distances to work
on radeonsi.
v1.1: whitespace cleanup, add docs fix clipdist_mask usage.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
v2: document the new cap
v3: fix 80 char limit in screen.rst
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Small code clean up that removes magic numbers where a TGSI
opcode has been defined.
No functional change expected as each opcode is unsupported on
the respective hardware.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: James Harvey <lothmordor@gmail.com>
As reported by Clang, TGSI_OPCODE_DFMA (defined magic number 118) is
currently initialized twice for Cayman and Evergreen.
When Jan Vesely added double precision FMA opcode it did make sense
to locate it immediately after TGSI_OPCODE_DMAD, although this is
out of order.
This change cleans up the prior magic number definition and ensures
any later reordering of this struct will not create problems.
Prior change was:
commit 015e2e0fce
Author: Jan Vesely <jan.vesely@rutgers.edu>
Date: Sat Jul 2 16:14:54 2016 -0400
r600g: Add double precision FMA ops
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782
Fixes: 54c4d525da ("r600g: Enable FMA on chips that support it")
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: James Harvey <lothmordor@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Noticed this error in a debug message whilst reviewing
https://bugs.freedesktop.org/show_bug.cgi?id=97477
This patch doesn't go towards fixing that bug, but at
least may clarify future debug output.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Implement SCATTERPS as a dynamic loop based on mask set bits
instead of a static compile time loop.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
- use per-primitive viewports throughout the pipeline.
- track whether all available scissor rects are tile aligned.
Causes failures, so not taken into account when choosing rasterizer yet.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
We were allocating global variables for the maximum LDS size
which made the compiler think we were using all of LDS, which
isn't the case.
Reviewed-By: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
v1 → v2:
- Fixed indentation (noted by Brian Paul)
- Removed second assert from nouveau's switch statements (suggested by
Brian Paul)
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This is a direct port of Marek Olšáks patch
"radeonsi: increase performance for DRI PRIME
offloading if 2nd GPU is CIK or VI" to r600.
It uses SDMA for the detiling blit from renderoffload VRAM
to GTT, as SDMA is much faster for tiled->linear blits from
VRAM to GTT.
Testing on a dual Radeon HD-5770 setup reduced the time
for the render offload gpu to get its rendering into
system RAM from approximately 16 msecs for simple rendering
at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup!
This was measured using ftrace to trace the time the radeon kms
driver waited on the dmabuf fence of the renderoffload gpu to
complete.
All in all this brought the time for a flip down from 20 msecs
to 9 msecs, so the prime setup can display at full 60 fps instead
of barely 30 fps vsync'ed.
The current r600 implementation supports SDMA on Evergreen and
later, but not R600/R700 due to some bugs apparently present
in their SDMA implementation.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
This file was supposed to be added with the previous "svga: add guest
statistic gathering interface" patch but went MIA for some reason.
Reviewed-by: Brian Paul <brianp@vmware.com>