Commit graph

28513 commits

Author SHA1 Message Date
Rhys Kidd
d4cb3ee95c r600g: Avoid duplicated initialization of TGSI_OPCODE_DFMA
As reported by Clang, TGSI_OPCODE_DFMA (defined magic number 118) is
currently initialized twice for Cayman and Evergreen.

When Jan Vesely added the double precision FMA opcode, it made sense
to locate it immediately after TGSI_OPCODE_DMAD, although this is
out of order.

This change cleans up the prior magic number definition and ensures
any later reordering of this struct will not create problems.

Prior change was:

  commit 015e2e0fce
  Author: Jan Vesely <jan.vesely@rutgers.edu>
  Date:   Sat Jul 2 16:14:54 2016 -0400

      r600g: Add double precision FMA ops

      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782
      Fixes: 54c4d525da ("r600g: Enable FMA on chips that support it")

      Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
      Tested-by: James Harvey <lothmordor@gmail.com>
      Signed-off-by: Marek Olšák <marek.olsak@amd.com>

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: James Harvey <lothmordor@gmail.com>
2016-08-29 11:03:20 -07:00
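To illustrate the r600g fix above, here is a minimal sketch of the kind of double initialization Clang reports (its -Winitializer-overrides warning). The names OP_DMAD, OP_DFMA, struct op_info, op_table and op_handler are hypothetical and are not the r600g source; only the value 118 comes from the commit message, and the value given to OP_DMAD is arbitrary.

```c
#include <stddef.h>

/* Hypothetical opcode numbers; only 118 is taken from the commit message. */
enum { OP_DMAD = 5, OP_DFMA = 118, OP_LAST = 119 };

struct op_info {
    int opcode;          /* which opcode this slot describes */
    const char *handler; /* name of a (hypothetical) emit function */
};

/*
 * Problematic pattern, kept as a comment so the sketch compiles cleanly:
 *
 *     [OP_DMAD] = { OP_DMAD, "emit_dmad" },
 *     [OP_DFMA] = { OP_DFMA, "emit_dfma" },    placed right after DMAD
 *     ...
 *     [118]     = { 118, "unsupported" },      same slot, magic number
 *
 * Both entries name slot 118. C keeps the last one, so which entry wins
 * depends on the order of the initializer list.
 */
static const struct op_info op_table[OP_LAST] = {
    [OP_DMAD] = { OP_DMAD, "emit_dmad" },
    [OP_DFMA] = { OP_DFMA, "emit_dfma" }, /* single, symbolic definition */
};

/* Look up the handler name, treating empty slots as unsupported. */
static const char *op_handler(int opcode)
{
    if (opcode < 0 || opcode >= OP_LAST || op_table[opcode].handler == NULL)
        return "unsupported";
    return op_table[opcode].handler;
}
```

With each slot initialized once and by name, reordering the table cannot silently change which entry takes effect.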
Rhys Kidd
8ba1fd339c i915g: Fix typo in i915_translate_instruction()
Noticed this error in a debug message whilst reviewing
https://bugs.freedesktop.org/show_bug.cgi?id=97477

This patch doesn't go towards fixing that bug, but at
least may clarify future debug output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-29 11:03:20 -07:00
Eric Anholt
60bed14d0f vc4: Handle discards while in control flow.
I missed this while adding loop support because the discard test inside a
loop was crashing before, anyway.  Fixes piglit glsl-fs-discard-04.
2016-08-29 11:03:11 -07:00
Eric Anholt
b9a74fbec7 vc4: Mark when we add discards while lowering blend state.
2016-08-29 10:57:04 -07:00
Tim Rowley
fa8f87132a swr: [rasterizer core] fix GetRasterizerFunc selection
Only rasterize scissor edges if one or more scissor/viewport
rects are not hottile aligned.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:42:36 -05:00
Tim Rowley
8e41a65fc5 swr: [rasterizer core] whitespace cleanup
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:42:30 -05:00
Tim Rowley
cc7f655177 swr: [rasterizer jitter] reimplement SCATTERPS
Implement SCATTERPS as a dynamic loop based on mask set bits
instead of a static compile time loop.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:42:23 -05:00
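A scalar sketch of the idea in the SCATTERPS commit above: visit only the lanes whose mask bit is set, rather than running a fixed loop over every lane. The real change is in the SWR JIT and is not reproduced here; the function name scatter_ps_masked, the 8-lane width, and the use of element-index offsets are assumptions made for illustration only.

```c
#include <stdint.h>

#define LANES 8 /* assumed SIMD width for this sketch */

/*
 * Store src[lane] to base[offsets[lane]] for every lane whose bit is set
 * in mask. The outer loop runs once per set bit: each iteration finds the
 * lowest set bit, stores that lane, then clears the bit.
 */
static void scatter_ps_masked(float *base, const int32_t offsets[LANES],
                              const float src[LANES], uint32_t mask)
{
    mask &= (1u << LANES) - 1u;            /* ignore bits beyond the lanes */
    while (mask != 0) {
        unsigned lane = 0;
        while (((mask >> lane) & 1u) == 0) /* portable count-trailing-zeros */
            lane++;
        base[offsets[lane]] = src[lane];
        mask &= mask - 1u;                 /* clear the lowest set bit */
    }
}
```

The trip count equals the number of set bits, so a sparse mask does correspondingly little work.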
Tim Rowley
c7e21183a1 swr: [rasterizer core] upper left rule for scissors
Fixes upper left rule for scissors and viewport/scissor macrotile alignment.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:42:15 -05:00
Tim Rowley
e54df2c7e4 swr: [rasterizer scripts] undef DEFINE_KNOB after usage
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:42:10 -05:00
Tim Rowley
a4efbd14d3 swr: [rasterizer core] minor cleanup to thread initialization
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:42:04 -05:00
Tim Rowley
7472a8ee75 swr: [rasterizer core] remove KNOB_MAX_THREADS
Use dynamic memory allocation for per-thread data

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:41:58 -05:00
Tim Rowley
9e4a482d46 swr: [rasterizer core] track guardbands per viewport rect
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:41:51 -05:00
Tim Rowley
b473bec878 swr: [rasterizer core] per-primitive viewports/scissors
- use per-primitive viewports throughout the pipeline.
- track whether all available scissor rects are tile aligned.
  Causes failures, so not taken into account when choosing rasterizer yet.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-29 12:41:16 -05:00
Tom Stellard
63ed11cde9 radeonsi: Don't use global variables for tess lds
We were allocating global variables for the maximum LDS size
which made the compiler think we were using all of LDS, which
isn't the case.

Reviewed-By: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-29 16:36:46 +00:00
Roland Scheidegger
f48ccb8c07 softpipe: (trivial) honor render_condition_enabled for clear_rt/clear_ds
2016-08-29 18:15:08 +02:00
Roland Scheidegger
c5d7624e1d llvmpipe: (trivial) honor render_condition_enabled for clear_rt/clear_ds
2016-08-29 18:14:49 +02:00
Kai Wasserbäch
4c53267b8f gallium: Use enum pipe_shader_type in set_shader_images()
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 09:07:37 -06:00
Kai Wasserbäch
15fe288dea gallium: Use enum pipe_shader_type in set_shader_buffers()
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 09:07:33 -06:00
Kai Wasserbäch
532db3b788 gallium: Use enum pipe_shader_type in set_sampler_views()
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 09:07:25 -06:00
Kai Wasserbäch
7413625ad3 gallium: Use enum pipe_shader_type in bind_sampler_states() (v2)
v1 → v2:
 - Fixed indentation (noted by Brian Paul)
 - Removed second assert from nouveau's switch statements (suggested by
   Brian Paul)

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 08:45:48 -06:00
Marek Olšák
ed24d79ed7 gallium/radeon: clear dirty_level_mask when discarding CMASK
This fixes: GL45-CTS.texture_barrier.*

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-08-29 14:23:58 +02:00
Marek Olšák
d301efb400 tgsi/scan: remember sampler view types
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 14:16:57 +02:00
Nayan Deshmukh
5f0ea3db16 st/vdpau: use temporary buffers while applying filters
Use temporary buffers so that we don't read and write to the
same surface at the same time. We don't need to use linear
layout now.

v2: rebase the patch against reverted change

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-08-29 11:23:56 +02:00
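The hazard the st/vdpau commit above avoids (reading and writing the same surface at once) can be shown with a CPU analogy. This is only a sketch under assumptions: box_filter_via_temp is a hypothetical helper operating on one row of 8-bit pixels, not the VDPAU state tracker code, which applies its filters on GPU surfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Apply a 3-tap box filter to a row of pixels. Output goes to a temporary
 * buffer so every tap reads unfiltered input; the result is copied back
 * afterwards. Returns 0 on success, -1 if the allocation fails.
 */
static int box_filter_via_temp(uint8_t *surface, size_t n)
{
    if (n == 0)
        return 0;
    uint8_t *tmp = malloc(n);
    if (tmp == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        unsigned left  = surface[i > 0 ? i - 1 : i];     /* clamp at edges */
        unsigned right = surface[i + 1 < n ? i + 1 : i];
        tmp[i] = (uint8_t)((left + surface[i] + right) / 3u);
    }
    memcpy(surface, tmp, n);
    free(tmp);
    return 0;
}
```

For the row {0, 30, 0} this yields {10, 10, 10}. Writing each result straight back into the row would instead give {10, 13, 4}, because later taps would read pixels that had already been filtered.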
Christian König
77e4424106 st/vdpau: Revert "change the order in which filters are applied(v3)"
This reverts commit 09dff7ae2e.

Turned out this can cause some artifacts in the output. Let's revert
it for now until we have sorted out all issues.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
2016-08-29 11:23:51 +02:00
Brian Paul
ea33df7b58 svga: minor whitespace, etc clean-ups in svga_pipe_misc.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
8433b43337 svga: move some code in svga_propagate_surface()
Move computation of zslice, layer inside the conditional where they're
used.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
1a10b37ac3 svga: simplify surface propagation code in svga_set_framebuffer_state()
Rewrite the comment too.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
bb7f094b37 svga: add some comments in the svga_surface struct
Give more info about backing resources/surfaces.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
dcf63339e7 svga: use new svga_check_sampler_framebuffer_resource_collision()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
ff500ed5a1 svga: add new svga_check_sampler_framebuffer_resource_collision()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
d3d20d650d svga: remove assertions in svga_surface cast wrappers
We don't do this for other cast wrappers.  And this will simplify some
code at call sites.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
c6e89fa215 svga: minor code simplification in svga_texture_transfer_unmap()
Use the tex variable instead of using svga_texture() again.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
fe5a2704ec svga: reformat some expressions in svga_texture_transfer_map()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
10ef6ddcf9 svga: remove duplicated variable in svga_texture_transfer_map()
tex was already declared at the function body scope.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:19 -06:00
Brian Paul
09d2780b39 svga: move some assignments in svga_texture_transfer_map()
Put near other assignments to the svga_transfer variable.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:18 -06:00
Brian Paul
4a52512666 svga: minor simplifications in svga_texture_transfer_map()
Use local vars instead of jumping through a pointer.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:18 -06:00
Brian Paul
088dd8f45e svga: minor reformatting of svga_texture() cast wrapper
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:18 -06:00
Brian Paul
e206f67261 svga: rewrite svga_buffer() cast wrapper
To make it symmetric with the svga_texture() cast wrapper.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:18 -06:00
Brian Paul
c72dcd9a71 svga: remove local variable in create_backed_surface_view()
To simplify the code a bit.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2016-08-26 14:20:18 -06:00
Mario Kleiner
2cc880cba5 r600: increase performance for DRI PRIME offloading if 2nd GPU is Evergreen+
This is a direct port of Marek Olšák's patch
"radeonsi: increase performance for DRI PRIME
offloading if 2nd GPU is CIK or VI" to r600.

It uses SDMA for the detiling blit from renderoffload VRAM
to GTT, as SDMA is much faster for tiled->linear blits from
VRAM to GTT.

Testing on a dual Radeon HD-5770 setup reduced the time
for the render offload gpu to get its rendering into
system RAM from approximately 16 msecs for simple rendering
at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup!

This was measured using ftrace to trace the time the radeon kms
driver waited on the dmabuf fence of the renderoffload gpu to
complete.

All in all this brought the time for a flip down from 20 msecs
to 9 msecs, so the prime setup can display at full 60 fps instead
of barely 30 fps vsync'ed.

The current r600 implementation supports SDMA on Evergreen and
later, but not R600/R700 due to some bugs apparently present
in their SDMA implementation.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-08-26 19:57:21 +02:00
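The figures in the r600 commit above can be checked with a little arithmetic: 16 msecs down to 5 msecs is a factor of 16 / 5 = 3.2, and with vsync a 20 msec flip misses every other refresh on a 60 Hz display while a 9 msec flip does not. The sketch below assumes a 60 Hz display and a simple model in which each frame occupies a whole number of refresh intervals; vsynced_fps is a hypothetical helper, not part of any driver.

```c
/*
 * Effective frame rate when every flip must wait for the next vertical
 * blank: a frame that takes longer than one refresh interval occupies
 * the next whole number of intervals.
 */
static double vsynced_fps(double frame_ms, double refresh_hz)
{
    double interval_ms = 1000.0 / refresh_hz;
    int intervals = (int)(frame_ms / interval_ms); /* whole intervals */
    if (intervals * interval_ms < frame_ms)
        intervals++;                               /* round up */
    if (intervals < 1)
        intervals = 1;                             /* at most one frame per refresh */
    return refresh_hz / intervals;
}
```

Under this model vsynced_fps(20.0, 60.0) is 30 and vsynced_fps(9.0, 60.0) is 60, which matches the "barely 30 fps" and "full 60 fps" figures quoted in the commit message.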
Charmaine Lee
0035f7f136 svga: add guest statistic gathering interface
This file was supposed to be added with the previous "svga: add guest
statistic gathering interface" patch but went MIA for some reason.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 08:04:02 -06:00
Marek Olšák
49c798e902 radeonsi: disable CE on SI + AMDGPU
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
281f1a5980 winsys/amdgpu: disable IB chaining on SI
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
a6869e7c06 winsys/amdgpu: finish up SI addrlib integration
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-08-26 15:50:10 +02:00
Ronie Salgado
97b55243fb winsys/amdgpu: initial SI support
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-08-26 15:50:10 +02:00
Marek Olšák
971ef7518f gallium/radeon: add a driver query for AMDGPU_INFO_NUM_EVICTIONS
If the kernel driver doesn't support it, it returns 0.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
7172906c0c radeonsi: fix printing shaders and states on a VM fault
This was missed while rewriting the PIPE_DUMP flags.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
5ee3cac138 radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI
SDMA is much faster for tiled->linear blits from VRAM to GTT.
I have Bonaire in my second PCIe slot.

$ glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD TONGA ...

$ DRI_PRIME=1 glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ...

Without SDMA:
$ DRI_PRIME=1 glxgears
8796 frames in 5.0 seconds = 1759.074 FPS
8899 frames in 5.0 seconds = 1779.672 FPS

With SDMA:
$ DRI_PRIME=1 glxgears
12765 frames in 5.0 seconds = 2552.788 FPS
12888 frames in 5.0 seconds = 2577.495 FPS

The 1st GPU is irrelevant. The improvement should be much lower at 60 fps,
but definitely measurable.

SI will get this once we add SDMA blit support for it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
0241d8300f radeonsi: enable SDMA on CIK
It passes R600_DEBUG=testdma on Bonaire/radeon.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
bcfd49e511 gallium/radeon: increase priority for shader binaries
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00