Commit graph

96 commits

Author SHA1 Message Date
George Kyriazis
669d8f626f swr: add fetch shader cache
For now, the cache key is all of FETCH_COMPILE_STATE.

Use new/delete for swr_vertex_element_state, since we have to call the
constructors/destructors of the struct elements.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-23 16:36:13 -06:00
George Kyriazis
e259efd805 swr: Update fs texture & sampler state logic
In swr_update_derived() update texture and sampler state on a new fragment
shader.  GALLIUM_HUD can update fs using a previously bound texture and
sampler.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-01-25 10:02:50 -06:00
Bruce Cherniak
79b66ec05e swr: Implement fence attached work queues for deferred deletion.
Work can now be added to fences and triggered by fence completion. This
allows for deferred resource deletion, and other asynchronous tasks.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-12-16 11:29:02 -06:00
Ilia Mirkin
ef59cb0820 swr: add streamout buffer offset into pBuffer pointer
The buffer_size does not take the offset into account. Just add the
offset into the pointer which lines up the structures much better.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:36:03 -05:00
Ilia Mirkin
3d837a8871 swr: fix assertion for max number of so targets
The number has to be less than or equal to the max, not just less than.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:36:00 -05:00
Ilia Mirkin
632c11e857 swr: fix range computation for instanced client-side arrays
We need to take the instance divisor and number of instances into
account for instanced client-side arrays, rather than the vertex
parameters.

Loosely based on the comparable nvc0 logic.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-30 20:35:33 -05:00
Ilia Mirkin
53ca06be8f swr: use util_copy_framebuffer_state helper
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:50 -05:00
Ilia Mirkin
0a5e1b02cf swr: don't clear all dirty bits when changing so targets
Among other things, blits would clear existing SO targets which would
cause a bunch of updates from u_blitter to be missed.

Fixes fbo-scissor-blit fbo, probably among many others.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-28 19:41:23 -05:00
Ilia Mirkin
2595aebd91 swr: flatshading makes color outputs flat, it doesn't affect others
We were previously not marking the "regular" flat outputs as flat when
flatshading was enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-22 20:27:20 -05:00
Ilia Mirkin
7cfb364b1a swr: rework resource layout and surface setup
This is a bit of a mega-commit, but unfortunately there's no great way
to break this up since a lot of different pieces have to match up. Here
we do the following:
 - change surface layout to match swr's Load/StoreTile expectations
 - fix sampler settings to respect all sampler view parameters
 - fix stencil sampling to read from secondary resource
 - respect pipe surface format, level, and layer settings
 - fix resource map/unmap based on the new layout logic
 - fix resource map/unmap to copy proper parts of stencil values in and
   out of the matching depth texture

These fix a massive quantity of piglits, including all the
tex-miplevel-selection ones.

Note that the swr native miptree layout isn't extremely space-efficient,
and we end up using it for all textures, not just the renderable ones. A
back-of-the-envelope calculation suggests about 10%-25% increased memory
usage for miptrees, depending on the number of LODs. Single-LOD textures
should be unaffected.

There are a handful of regressions as a result of this change:
 - Some textureGrad tests, these failures match llvmpipe. (There are
   debug settings allowing improved gallivm sampling accurancy.)
 - Some layered clearing tests as swr doesn't currently support that. It
   was getting lucky before because enough other things were broken.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-22 20:27:20 -05:00
Ilia Mirkin
807bc6ea9e swr: calculate viewport width/height based on the scale
The former calculations were for min/max y. The width/height don't take
translate into account.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-21 21:11:26 -05:00
Ilia Mirkin
d48740568f swr: allocate all scratch space in one go for vertex buffers
Multiple buffers may reference client arrays. When this happens, we
might reach for scratch space multiple times, which could cause later
arrays to invalidate the pointers allocated for the earlier ones.

This fixes copyteximage 2D_ARRAY.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-21 21:11:26 -05:00
Ilia Mirkin
ea276512a0 swr: mark streamout buffers as written
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-19 10:40:37 -05:00
Ilia Mirkin
2b6b15ab3f swr: always enable adding start/base vertex to gl_VertexId
Fixes gl-3.2-basevertex-vertexid

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-15 20:26:29 -05:00
Ilia Mirkin
a2c1d58ddb swr: make sure that all rendering is finished on shader destroy
Rendering could still be ongoing (or have yet to start) when the shader
is deleted. There's no refcounting on the shader text, so insert a
pipeline stall unconditionally when this happens.

[Note, we should instead introduce a way to attach work to
fences, so that the freeing can be done in the current fence.]

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-15 20:25:48 -05:00
Ilia Mirkin
7caed50ff4 swr: disable blending for integer formats
The EXT_texture_integer test says that blending and alphatest should
all be disabled. st/mesa takes care of alphatest already.

Fixes the ext_texture_integer-fbo-blending piglit test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-15 20:25:43 -05:00
Ilia Mirkin
96291478ea swr: mark both frag and vert textures read, don't forget about cbs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-15 20:25:22 -05:00
Ilia Mirkin
828faaef40 swr: correct setting of independentAlphaBlendEnable
This setting is for whether color and alpha have different blend
settings, not for whether blending is enabled on a per-RT basis.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-09 20:11:57 -05:00
Ilia Mirkin
36e5d68cad swr: set halfz rasterizer setting
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-09 20:11:10 -05:00
Ilia Mirkin
4af25e7131 swr: fix support for inverted depth scales
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-09 20:10:44 -05:00
Ilia Mirkin
f037afb701 swr: disable logic op when the rt format is float or srgb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-08 19:28:35 -05:00
Ilia Mirkin
bef4a48d1c swr: add support for EXT_depth_bounds_test
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-08 19:28:35 -05:00
Tim Rowley
95ed1c19bf swr: allow alphatest without blend or logicop
We need to compile a blend function when alphatest is enabled.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-08 14:18:47 -06:00
Tim Rowley
114f7a92c6 swr: [rasterizer jitter] canonicalize blend compile state
Canonicalize to prevent unnecessary JIT compiles.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:31 -05:00
Kai Wasserbäch
532db3b788 gallium: Use enum pipe_shader_type in set_sampler_views()
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 09:07:25 -06:00
Kai Wasserbäch
7413625ad3 gallium: Use enum pipe_shader_type in bind_sampler_states() (v2)
v1 → v2:
 - Fixed indentation (noted by Brian Paul)
 - Removed second assert from nouveau's switch statements (suggested by
   Brian Paul)

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 08:45:48 -06:00
Tim Rowley
0ff57446e3 swr: [rasterizer core] only use Viewport/Scissors during SwrDraw* operations
Add explicit rects for:

- SwrClearRenderTarget
- SwrDiscardRect
- SwrInvalidateTiles
- SwrStoreTiles

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-17 17:08:55 -05:00
Tim Rowley
ad153189ec swr: [rasterizer core] viewport array support
Change viewport matrix storage from AOS to SOA to support viewport arrays.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-10 11:08:40 -05:00
Marek Olšák
54272e18a6 gallium: add a pipe_context parameter to fence_finish
required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush
the context

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-10 01:11:10 +02:00
Tim Rowley
efdaf5fa3e swr: [rasterizer] attribute swizzling and linkage
Add support for enhanced attribute swizzling. Currently supports constant
source overrides to handle PrimitiveID support. No support yet for input
select swizzling or wrap shortest. Removes obsoleted linkageMask and
associated code.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-07-20 10:22:15 -05:00
Tim Rowley
9ca741c645 swr: push/pop DEBUG macro around llvm includes
llvm redefines DEBUG; adding push/pop prevents a undefined reference
to debug_refcnt_state in llvm-3.7+.

v2: add undef DEBUG

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 09:58:08 -05:00
Rob Clark
ef534b9389 gallium: make constant_buffer const
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-20 12:36:20 -04:00
Bruce Cherniak
6b0ac95c28 swr: Update screen->context pointer with multiple contexts.
A pipe pointer in the screen allows for access to current device context
 in flush_frontbuffer and resource_destroy.  This wasn't tracking current
context in multi-context situations.

v2: More caffeine.  Corrected compare, removed unnecessary set of
screen-pipe in create_context, and added a few comments.
2016-06-17 13:56:03 -05:00
Tim Rowley
2c85128e01 swr: implement clipPlanes/clipVertex/clipDistance/cullDistance
v2: only load the clip vertex once

v3: fix clip enable logic, add cullDistance

v4: remove duplicate fields in vs jit key, fix test of clip fixup needed

v5: fix clipdistance linkage for slot!=0,4

v6: support clip+cull; passes most piglit clip (failures understood)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-09 13:28:35 -05:00
Tim Rowley
87f0a0448f swr: fix provoking vertex
Use rasterizer provoking vertex API.

Fix rasterizer provoking vertex for tristrips and quad list/strips.

v2: make provoking vertex tables static const

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-07 11:47:52 -05:00
Bruce Cherniak
c8835a5924 swr: [rasterizer] Correctly select optimized primitive assembly.
Indexed primitives were always using cut-aware primitive assembly,
whether primitive_restart was enabled or not.  Correctly pass down
primitive_restart and select optimized PA when possible.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-05-25 18:47:16 -05:00
Tim Rowley
124a5d4ca0 swr: remove duplicated constant update code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-28 16:16:46 -05:00
Tim Rowley
504df3a1d7 swr: s/Elements/ARRAY_SIZE/
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 11:07:34 -05:00
Tim Rowley
24f23817d2 swr: [rasterizer core] implement legacy depth bias enable
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:45 -05:00
Tim Rowley
ee9621e2f5 swr: fix memory leaks from vs/fs compilation
v2: varient -> variant

Reviewed by: George Kyriazis <George.Kyriazis@intel.com>
2016-04-22 18:05:02 -05:00
Tim Rowley
3bbe8a09ea swr: fix resource backed constant buffers
Code was using an incorrect address for the base pointer.

v2: use swr_resource_data() utility function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Markus Wick <markus@selfnet.de>
2016-04-20 09:57:55 -05:00
Tim Rowley
b19d214b23 swr: support samplers in vertex shaders
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-04-12 20:18:00 -05:00
George Kyriazis
dd63fa28f1 gallium/swr: Cleaned up some context-resource management
Removed bound_to_context.  We now pick up the context from the screen
instead of the resource itself.  The resource could be out-of-date
and point to a pipe that is already freed.

Fixes manywin mesa xdemo.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-03-17 20:57:52 -05:00
Bruce Cherniak
e9d68cc3da gallium/swr: Resource management
Better tracking of resource state and synchronization.
A follow on commit will clean up resource functions into a new
swr_resource.cpp file.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2016-03-14 14:07:48 -05:00
Tim Rowley
84f857bef7 gallium/swr: remove use of BYTE from swr driver
Remove use of a win32-style type leaked from the swr rasterizer.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-03-10 11:20:58 -06:00
Tim Rowley
2b2d3680bf gallium/swr: add OpenSWR driver
OpenSWR is a new software rasterizer for x86 processors designed
for high performance and high scalablility on visualization workloads.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00