Commit graph

82974 commits

Author SHA1 Message Date
Marek Olšák
4d1f32376d radeonsi: don't interpolate colors if flatshading is enabled
use v_interp_mov for those

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák
4accb02d7a radeonsi: enable the barycentric optimization in all cases
Handle the bc_optimize SGPR bit if both CENTER and CENTROID are enabled.
This should increase the PS launch rate for big primitives with MSAA.
Based on discussion with SPI guys.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák
476e9cee1d radeonsi: compute only one set of interpolation (i,j) when MSAA is disabled
This should increase the PS launch rate for shaders using at least 2 pairs
of perspective (i,j) and same for linear.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák
a675c6a000 radeonsi: split ps.prolog.force_persample_interp into persp and linear bits
This reduces the number of v_mov's in the prolog.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák
61010cfac0 radeonsi: don't dump the shader key for non-monolithic shaders early
It's always zero.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Jan Vesely
015e2e0fce r600g: Add double precision FMA ops
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782
Fixes: 54c4d525da ("r600g: Enable FMA on chips that support it")

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: James Harvey <lothmordor@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-07-05 00:47:12 +02:00
Francesco Ansanelli
9827fc3f03 r600: fix duplicate 'const' declaration
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-04 21:26:31 +02:00
Topi Pohjolainen
2a60654f56 i965/urb: Allow blorp to record current settings
This makes it possible to skip urb re-configuration if the
subsequent renders agree with the settings.

Also allows blorp to allocate the maximun amount of vs entries
available. Core upload logic already knows how to calculate this.
Helps one synthetic benchmark.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
39fdee6b2d i965/blorp/gen7+: Do not trigger push constant space reconfig
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
cc2d0e64c0 i965/blorp/gen7+: Stop trashing push constant allocation
Packet 3DSTATE_CONSTANT_PS is still emitted explicitly as ps stage
itself is enabled and hardware may try to prefetch constants from
the buffer. From the BSpec: 3D Pipeline - Windower -
3DSTATE_PUSH_CONSTANT_ALLOC_PS

  "Specifies the size of the PS constant buffer. This value will
   determine the amount of data the command stream can pre-fetch
   before the buffer is full."

This is not possible on gen6. From the BSpec about 3DSTATE_CONSTANT_PS:

"This packet must be followed by WM_STATE."

Binding table emissions for stages other than PS can be now dropped,
they were only needed for the 3DSTATE_CONSTANT_XS to be effective:

From the BSpec:

  "The 3DSTATE_CONSTANT_* command is not committed to the shader unit
   until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_*
   command is parsed."

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
175e095744 i965/blorp: Remove support for push constants
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
46e1132b80 i965/blorp: Use flat inputs instead of uniforms
v2 (Jason): Use LOAD_INPUT() macro

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
07db95c24d i965/blorp: Fix the size requirement for vertex elements
v2: Rebased as this is needed before flat inputs are enabled

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
741a245ae4 i965/blorp: Load tranformation coordinates as vec4
In preparation for loading as flat vertex input.

v2: Use LOAD_INPUT() macro

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
01f2f364d4 i965/blorp: Rename LOAD_UNIFORM to LOAD_INPUT
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
641868103c i965/blorp: Organize pixel kill and blend/scaled inputs into vec4s
In addition, as these are never used in parallel, add a few
assertions.

v2 (Jason): Skip some complexity by putting them into a union but
            pad rectangle grid into a vec4 instead. Also keep the
            LOAD_UNIFORM macro.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Lionel Landwerlin
dbbc4fb4cc anv/wsi: create swapchain images using specified image usage
The image usage specified by the caller of vkCreateSwapchainKHR should be
passed onto the internal image creation. Otherwise the driver might later
crash when the user tries to use the image as a combined sampler even though
the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT.

Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be
expected even if the swapchain is created without any flag.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-04 10:15:48 -07:00
Indrajit Das
51227b41c6 radeon/uvd: fix overflow error while calculating bit stream buffer size
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-04 11:38:05 +02:00
Topi Pohjolainen
9e3774a460 i965/blorp: Prepare for more than two vertex attributes
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:05:02 +03:00
Topi Pohjolainen
e762354309 i965/blorp: Tell vertex fetcher about flat inputs
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:04:38 +03:00
Topi Pohjolainen
89e6b4ef5d i965/blorp: Add support for flat input buffer
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:04:00 +03:00
Topi Pohjolainen
9b2fa17e97 i965/blorp: Store input read mask
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:03:41 +03:00
Topi Pohjolainen
73f78ab44b i965/blorp: Rename push constants to inputs
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:37:51 +03:00
Topi Pohjolainen
f2c472fcb3 i965/blorp: Use core vertex buffer state setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:37:44 +03:00
Topi Pohjolainen
4f7e68799f i965/blorp: Split vertex data and element setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Topi Pohjolainen
575c8cbb54 i965: Unify vertex buffer setup
On gen >= 8 one doesn't provide ending address but number of bytes
available. This is relative to the given offset.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Topi Pohjolainen
bdab945edd i965/draw: Expose vertex buffer state setup
Also change the interface to use start and end offsets.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Rob Clark
7295428e41 freedreno: fix crash on smaller gpus and higher resolutions
Devices with smaller GMEM size need more tiles.  On db410c at 2048x1152,
glmark2 shadow needed ~330 tiles for fullscreen.  Lets bump it up to
512.  (Maybe with MRT you could end up needing more, but at that point
things are probably going to be painfully slow.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-03 11:16:28 -04:00
Rob Clark
01ccb0d91e i965: don't drop const initializers in vector splitting
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-02 09:00:19 -04:00
Rob Clark
f78a6b1ce3 glsl: add driconf to zero-init unintialized vars
Some games are sloppy.. perhaps because it is defined behavior for DX or
perhaps because nv blob driver defaults things to zero.

So add driconf param to force uninitialized variables to default to zero.

This issue was observed with rust, from steam store.  But has surfaced
elsewhere in the past.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-02 09:00:19 -04:00
Rob Clark
202710d110 freedreno/ir3: support glsl linking for cmdline compiler
For .vert/.frag, now multiple can be specified on the cmdline for
purposes of linking, and the last one specified is the one that is
fed into the ir3 backend (and dumped along the way if --verbose is
specified)

Without this, varyings in frag shaders would appear as undefined.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-07-02 09:00:19 -04:00
Rob Clark
07cfe4e6aa glsl/standalone: initialize MaxUserAssignableUniformLocations
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-07-02 09:00:19 -04:00
Rob Clark
1759eb1d19 freedreno: update valid_buffer_range for SO buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
da39ac9c51 freedreno/ir3: support non-user_buffer consts
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
2081c1ecc0 freedreno/a2xx: move setup/restore cmds into binning pass
Rather than doing a separate submit at context create, move these cmds
to before first tile, as is done on a3xx/a4xx.  Otherwise state can
be overwritten by other contexts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
2c3b54c278 freedreno: pass index buffer as a pipe_resource
This will be useful in a following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
88cc11e971 freedreno: switch emit_const_bo() to take prsc's
We can push the unwrap of pipe_resource down.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Hans de Goede
d7dfd4cb51 nv30: Fix "array subscript is below array bounds" compiler warning
gcc6 does not like the trick where we point to one entry before the
array start and then start a while with a pre-increment.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-02 12:21:28 +02:00
Hans de Goede
110ef733dc nouveau: Fix a couple of "foo may be used uninitialized' compiler warnings
These are all new false positives with gcc6.

In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer
to a variable into a function initialises that variable.

In nv50_ir_from_tgsi.cpp op and mode are not set if there are 0
enabled dst channels, this never happens, but gcc cannot know this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-02 12:21:28 +02:00
Hans de Goede
1f3c8f3664 nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
2aa1197eee nouveau: Add support for SV_WORK_DIM
Add support for SV_WORK_DIM for nvc0 and nve4.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
3345f70f63 nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument
This brings it inline with the other macros like NVC0_CB_AUX_UBO_INFO
and NVC0_CB_AUX_TEX_INFO.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
ef8e50a841 clover: Pass work_dim parameter of clEnqueueNDRangeKernel() to driver
In order to implement get_work_dim() the driver may need to know the
clEnqueueNDRangeKernel() work_dim parameter, so pass it to the driver.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
d386cef246 tgsi: Add WORK_DIM System Value
Add a new WORK_DIM SV type, this is will return the grid dimensions
(1-4) for compute (opencl) kernels.

This is necessary to implement the opencl get_work_dim() function.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Alejandro Piñeiro
da7efadf04 mesa/main: fix error checking logic on CopyImageSubData
For the case (both src or dst) where we had a texobject, but the
texobject target was not the same that the method target, this spec
paragraph was appplied:

 /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
  * Profile spec says:
  *
  *     "An INVALID_VALUE error is generated if either name does not
  *     correspond to a valid renderbuffer or texture object according
  *     to the corresponding target parameter."
  */

But for that case, the correct spec paragraph should be:
 /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
  * Profile spec says:
  *
  *     "An INVALID_ENUM error is generated if either target is
  *      not RENDERBUFFER or a valid non-proxy texture target;
  *      is TEXTURE_BUFFER or one of the cubemap face selectors
  *      described in table 8.18; or if the target does not
  *      match the type of the object."
  */

specifically the last sentence: "or if the target does not match the
type of the object".

This patch fixes the error returned (s/INVALID/ENUM) for that case,
and moves up the INVALID_VALUE spec paragraph, as that case (invalid
texture object) was handled before.

Fixes:
GL44-CTS.copy_image.target_miss_match

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-02 11:54:40 +02:00
Dave Airlie
27d456cc87 st/glsl_to_tgsi: don't increase immediate index by 1.
Immediates are stored into a separate table, and are
consolidated, so if we get an immediate we don't need
to offset it as the index it has is correct.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-07-02 17:01:25 +10:00
Ilia Mirkin
6f4d35212b st/mesa: get max supported number of image samples from driver
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-01 23:01:03 -04:00
Ilia Mirkin
b2b5075e04 nvc0: fix up image support for allowing multiple samples
Basically we just have to scale up the coordinates and then add the
relevant sample offset. The code to handle this was already largely
present from Christoph's earlier attempts to pipe images through back in
the dark ages, this just hooks it all up.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 23:01:02 -04:00
Nicolai Hähnle
07cc838b10 st/mesa: check the texture image level in st_texture_match_image
Otherwise, 1x1 images of arbitrarily high level are accepted.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 17:55:19 +02:00
Nicolai Hähnle
0ba053b34c st/mesa: an incomplete texture may have a zero-size first image
Fixes a regression introduced by commit 42624ea83 which triggered
an assertion in
dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0

While stImage must have a non-zero size as verified by the caller, we also
look at the size of the base image in an attempt to make a better guess at
the level0 size (this is important when the base image size is odd). However,
the base image may have a zero size even when it exists.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 17:54:40 +02:00