Commit graph

71368 commits

Author SHA1 Message Date
Marek Olšák
cd9c07e7cd radeonsi: add ETC1 support for Stoney
It's a subset of ETC2. Tested.

For more information, see page 42 and onward:
http://www.graphicshardware.org/previous/www_2007/presentations/strom-etc2-gh07.pdf

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-01-22 22:05:42 +01:00
Marek Olšák
b3bac55621 radeonsi: change LLVM intrinsics for BREV, CLAMP, EX2
Requested by Matt Arsenault.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 22:05:42 +01:00
Marek Olšák
ce1e7784d0 radeonsi: add max waves / SIMD to shader stats (v2)
v2: account for LDS usage in PS
    the limit is per SIMD, not per CU

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 22:05:42 +01:00
Marek Olšák
5944f3d2fc radeonsi: enable late VS allocation (v3)
v2: take the number of CUs into account
v3: change in LS allocation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 22:05:42 +01:00
Marek Olšák
97648229e4 radeonsi: allow using all CUs for tessellation and on-chip GS (v2)
v2: After more discussion with hw teams, the kernel already contains the
    optimal settings allowing us to use all CUs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 22:05:42 +01:00
Jeremy Huddleston Sequoia
7c99557f53 Revert "mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB"
This reverts commit 739ac3d39d.

This will be done a differnet way.
See http://lists.freedesktop.org/archives/mesa-dev/2016-January/105642.html
2016-01-22 13:02:01 -08:00
Jason Ekstrand
107a109d1c isl/format_layout: R11G11B10_FLOAT is unsigned 2016-01-22 11:57:49 -08:00
Jason Ekstrand
e5558ffa64 anv/image: Move common code to anv_image.c 2016-01-22 11:57:01 -08:00
Jason Ekstrand
84612f4014 anv/state: Refactor surface state setup into a "fill" function 2016-01-22 11:40:56 -08:00
Francisco Jerez
448285ebf2 anv/state: Add missing clflushes for storage image surface state. 2016-01-22 11:12:09 -08:00
Francisco Jerez
d533c3796d anv/state: Factor out surface state calculation from genX_image_view_init.
Some fields of the surface state template were dependent on the
surface type, which is dependent on the usage of the image view, which
wasn't known until the bottom of the function after the template had
been constructed.  This caused failures in all image load/store CTS
tests using cubemaps.  Refactor the surface state calculation into a
function that is called once for each required usage.
2016-01-22 11:12:09 -08:00
Jason Ekstrand
16780632c2 i965/nir: Temporariliy disable mul+add fusion
We don't want to do this in the long-run but it's needed for passing the
NoContraction tests at the moment.  Eventually, we want to plumb this
through NIR properly.
2016-01-22 11:10:54 -08:00
Ben Widawsky
315cda6715 i965/fs: Remove unused count from vs urb setup
This was originally removed here:
commit 031d350132
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Tue Aug 25 16:59:12 2015 -0700

    i965/vs: Unify URB entry size/read length calculations between backends.

Then added back:
commit bd198b9f0a
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Aug 14 16:01:33 2015 -0700

    i965/vs: Simplify fs_visitor's ATTR file.

Note that the authorship dates are out of order, but the above reflects the
order of the commit dates.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-01-22 10:38:41 -08:00
Chad Versace
d9abbbe0d8 isl: Fix indentation of isl_format_layout comment 2016-01-22 09:48:11 -08:00
Chad Versace
65f3c420c3 isl/tests: Give tests less cryptic names 2016-01-22 09:46:48 -08:00
Chad Versace
f9d4d09549 isl: Fix isl_surf_get_image_offset_sa for gen4_3d layout
Bug found by unit test
test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0.
2016-01-22 09:45:22 -08:00
Chad Versace
891ed5ca8c isl/tests: Add test for bdw 3d surface
test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0

Currently fails.
2016-01-22 09:45:21 -08:00
Nicolai Hähnle
d76bd85c35 Revert "radeonsi: fix discard-only fragment shaders (v2)"
This reverts commit 843855bbf0.

It became redundant due to Marek's earlier pushed 8667a1ae which achieves
the same thing.
2016-01-22 12:40:26 -05:00
Nicolai Hähnle
843855bbf0 radeonsi: fix discard-only fragment shaders (v2)
When a fragment shader is used that has no outputs but does conditional
discard (KILL_IF), all fragments are killed without this patch.

By comparing various register settings, my conclusion is that the exec mask
is either not properly forwarded to the DB by NULL exports or ends up being
unused, at least when there is _only_ a NULL export (the ISA documentation
claims that NULL exports can be used to override a previously exported exec
mask).

Of the various approaches I have tried to work around the problem, this one
seems to be the least invasive one.

v2: take discard by alpha test into account as well

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-22 11:59:50 -05:00
Marta Lofstedt
3e640c256a mesa: Update _mesa_has_geometry_shaders
Updates the _mesa_has_geometry_shaders function to also look
for OpenGL ES 3.1 contexts that has OES_geometry_shader enabled.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-01-22 17:13:55 +01:00
Marta Lofstedt
ae4e4ba06d glsl: add support for GL_OES_geometry_shader
This adds glsl support of GL_OES_geometry_shader for
OpenGL ES 3.1.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-22 17:13:55 +01:00
Marta Lofstedt
67e3098703 mesa: enable enums for OES_geometry_shader
Enable GL_OES_geometry_shader enums for OpenGL ES 3.1.

V4: EXTRA tokens updated according to comments from Ilia Mirkin.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-22 17:13:55 +01:00
Marta Lofstedt
af5a14d1e0 glapi: add GL_OES_geometry_shader extension
Add xml definitions for the GL_OES_geometry_shader extension
and expose the extension for OpenGL ES 3.1.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-01-22 17:13:55 +01:00
Chad Versace
fbc87ce4be isl/tests: Remove copy-paste assertion 2016-01-22 07:18:04 -08:00
Chad Versace
63d999b762 isl/tests: Fix build
isl_device_init() acquired a new param for bit6 swizzling.
2016-01-22 07:17:57 -08:00
Marek Olšák
a9d5842ec0 radeonsi: add ETC2 support for Stoney
Tested and working.
2016-01-22 15:36:14 +01:00
Marek Olšák
6f428328d3 radeonsi: implement SAMPLEPOS system value without a constant buffer load
We always get per-sample input position.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
2b66bc87d4 winsys/amdgpu: compute num_good_compute_units correctly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
0d8e4f958f gallium/radeon: rename max_compute_units -> num_good_compute_units
radeon sets this correctly, but not amdgpu

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
99dfeb01bd radeonsi: disable SPI color outputs the shader doesn't write
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
f6360de8c0 radeonsi: use all SPI color formats
because not using SPI_SHADER_32_ABGR doubles fill rate.

We should also get optimal performance if alpha isn't needed or blending
isn't enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
933e3c4145 radeonsi: use 32_AR for alpha-to-coverage without a color buffer
This avoids the fp16 packing instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
f1f0158837 radeonsi: add shader conversion code for all SPI color formats
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
e28b8530b9 radeonsi: set CB_SHADER_MASK according to SPI color formats
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
8667a1aea2 radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc
This does change the behavior slightly:
  If a shader writes COLOR[i] and that color buffer isn't bound,
  the shader will export MRT_NULL instead and discard the IR tree that
  calculates the output. The only exception is alpha-to-coverage, which
  requires an alpha export.

v2: - update a comment about 16BPC
    - account for MRTZ when when fixing alpha-test/kill

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák
0446ea9d08 radeonsi: don't enable blending if colormask == 0
most likely useless, but doesn't hurt

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Ilia Mirkin
dac2964f3e glsl: always compute proper varying type, irrespective of varying packing
Normally there's a producer and consumer, and the producer var gets
picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed
version.

In the SSO case, however, there is no producer. So we picked the arrayed
GS variable, and as a result, used more slots than we should. More
critically, these slots would also no longer line up with the producer's
calculation. To fix this, we need to fix up the type of the variable
based on stage no matter what.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2016-01-22 08:48:27 -05:00
Emil Velikov
54702c2fa1 egl/dri2: expose srgb configs when KHR_gl_colorspace is available
Otherwise the user has no way of using it, and we'll try to access the
linear one.

v2:
 - Bail out when KHR_gl_colorspace is missing and srgb is set (Marek)

Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
2016-01-22 11:55:54 +00:00
Emil Velikov
f29a772a7e targets/dri: android: use WHOLE static libraries
By using whole static libraries the android buildsystem provides
whole-archive (alike) solution. This means that we don't need to worry
about the order of the static libraries and any reverse, recursive or
circular dependencies that they have between one another.

Without this the linker will discard any unused hunks of one library
and we'll end up with unresolved symbols as those are required by
another static library. This issue has become more prominent with the
introduction of pipe-loader.

Whole static libraries has been used in i915/i965 for a very long
time, so we might do the same.

v2:
 - Better commit message (Ilia)
 - Keep external dependencies as [normal] static libs (Mauro)

Cc: mesa-stable@lists.freedesktop.org
Cc: Mauro Rossi <issor.oruam@gmail.com>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-22 11:55:34 +00:00
Emil Velikov
72fda2b710 i915: correctly parse/set the context flags
With an earlier commit we've spit the flags parsing to a separate
function, but forgot to update all the dri modules to use it.

Noticed when we've enabled KHR_debug for every dri module - fdo#93048

Fixes: 38366c0c6e "dri_util: Don't assume __DRIcontext->driverPrivate
is a gl_context"
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2016-01-22 11:54:01 +00:00
Iago Toral Quiroga
ab0c7c0829 glsl/lower_instructions: fix regression in dldexp_to_arith
The commit b4e198f47f changed the offset and bits parameters of the
bitfield insert operation from scalars to vectors. However, the lowering
of ldexp on doubles operates on each vector component and emits scalar
code (since it has to deal with the lower and upper 32-bit chunks of
each double component), so it needs its bits and offset parameters to
be scalars.

Fixes fp64 regression (crash) in:
spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-22 08:14:11 +01:00
Francisco Jerez
2e54381622 anv/batch_chain: Fix patching up of block pool relocations on Gen8+.
Relocations are 64 bits on Gen8+.  Most CTS tests that send
non-trivial work to the GPU would fail when run from a single deqp-vk
invocation because they were effectively relying on reloc presumed
offsets to be wrong so the kernel would come and apply relocations
correctly.
2016-01-21 16:30:44 -08:00
Jason Ekstrand
13aaf90048 nir/spirv: Ignore cull distance 2016-01-21 16:20:39 -08:00
Jason Ekstrand
13858a1c1a nir/lower_system_values: Use the correct invication id for CS 2016-01-21 16:20:39 -08:00
Jason Ekstrand
d8c0e0805b nir/spirv: Properly assign locations to split structures 2016-01-21 16:20:39 -08:00
Jason Ekstrand
514507825c nir/spirv: Improve handling of variable loads and copies
Before we were asuming that a deref would either be something in a block or
something that we could pass off to NIR directly.  However, it is possible
that someone would choose to load/store/copy a split structure all in one
go.  We need to be able to handle that.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
7e5e64c8a9 nir/spirv: Make vectors a proper array time with an array_element
This makes dealing with single-component derefs easier
2016-01-21 16:20:39 -08:00
Jason Ekstrand
a8af0f536c nir/spirv: Rework access chains a bit to allow for literals
This makes them much easier to construct because you can also just specify
a literal number and it doesn't have to be a valid SPIR-V id.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
5d9a6fd526 vtn/variables: Compact local loads/stores into one function
This is similar to what we did for block loads/stores.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
b298743d7b nir/spirv: Add an actual variable struct to spirv_to_nir
This allows us, among other things, to do structure splitting on-the-fly to
more correctly handle input/output structs.
2016-01-21 16:20:39 -08:00