Commit graph

105646 commits

Author SHA1 Message Date
Timothy Arceri
9dd737bb02 nir: add glsl_type_is_integer() helper
Fixes: 1c9c42d16b ("nir: add varying component packing helpers")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-11-12 15:38:56 +11:00
Francisco Jerez
552642066f intel/fs: Prevent emission of IR instructions not aligned to their own execution size.
This can occur during payload setup of SIMD-split send message
instructions, which can lead to the emission of header setup
instructions with a non-zero channel group and fixed SIMD width.  Such
instructions could end up using undefined channel enable signals
except they don't care since they're always marked force_writemask_all.

Not known to affect correctness of any workload at this point, but it
would be trivial to back-port to stable if something comes up.

Reported-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>
2018-11-09 19:39:22 -08:00
Timothy Arceri
590fcb50e7 st/mesa: make use of nir_link_constant_varyings()
Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 161464 -> 161368 (-0.06 %)
VGPRS: 86904 -> 86292 (-0.70 %)
Spilled SGPRs: 296 -> 314 (6.08 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3618596 -> 3573852 (-1.24 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 26189 -> 26276 (0.33 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-10 11:41:00 +11:00
Timothy Arceri
d40dd05553 nir: add new linking opt nir_link_constant_varyings()
This pass moves constant outputs to the consuming shader stage
where possible.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-11-10 11:41:00 +11:00
Andre Heider
414470854d st/nine: clean up thead shutdown sequence a bit
Just break out of the loop instead, it does the same thing.

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:27 +01:00
Andre Heider
123bf9cbe7 st/nine: plug thread related leaks
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:27 +01:00
Andre Heider
10598c9667 st/nine: fix stack corruption due to ABI mismatch
This fixes various crashes and hangs when using nine's 'thread_submit'
feature.

On 64bit, the thread function's data argument would just be NULL.
On 32bit, the data argument would be garbage depending on the compiler
flags (in my case -march>=core2).

Fixes: f3fa7e3068 ("st/nine: Use WINE thread for threadpool")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2018-11-09 22:37:26 +01:00
Marek Olšák
d2b2364313 radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET only
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
4bec5025ac gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
9dc776f3f2 radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2
and add has_dcc_constant_encode.
2018-11-09 14:55:04 -05:00
Marek Olšák
832ab883e2 radeonsi: use better DCC clear codes
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-09 14:55:04 -05:00
Marek Olšák
d059eae269 ac/surface: remove the overallocation workaround for Vega12
not needed anymore (probably since the tile_swizzle fix)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:55:04 -05:00
Lionel Landwerlin
959e2a5aeb intel/aub_read: remove useless breaks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-09 18:17:30 +00:00
Erik Faye-Lund
b55af392d9 Revert "mesa: expose NV_conditional_render on GLES"
This reverts commit 5213be9fab.
2018-11-09 17:39:25 +01:00
Erik Faye-Lund
cf8b271cbe Revert "mesa/main: fixup make check after NV_conditional_render for gles"
This reverts commit cccd7a253f.
2018-11-09 17:39:22 +01:00
Erik Faye-Lund
cccd7a253f mesa/main: fixup make check after NV_conditional_render for gles
It seems I missed some details when exposing NV_conditional_render
on GLES; this fixes up "make check".

Fixes: 5213be9fab ("mesa: expose NV_conditional_render on GLES")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-09 16:47:34 +01:00
Nicolai Hähnle
8c97abc066 radv: include LLVM IR in the VK_AMD_shader_info "disassembly"
Helpful for debugging compiler backend problems: this allows us to
easily retrieve the LLVM IR from RenderDoc.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:54:37 +01:00
Erik Faye-Lund
5213be9fab mesa: expose NV_conditional_render on GLES
The extension spec has been updated to include GLES 2 support, so let's
enable it there.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-09 13:03:00 +01:00
Iago Toral Quiroga
35baee5dce nir/constant_folding: fix incorrect bit-size check
nir_alu_type_get_type_size takes a type as parameter and we were
passing a bit-size instead, which did what we wanted by accident,
since a bit-size of zero matches nir_type_invalid, which has a
size of 0 too.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-09 08:22:15 +01:00
Iago Toral Quiroga
6c418dfa42 intel/compiler: fix node interference of simd16 instructions
SIMD16 instructions need to have additional interferences to prevent
source / destination hazards when the source and destination registers
are off by one register.

While we already have code to handle this, it was only running for SIMD16
dispatches, however, we can have SIDM16 instructions in a SIMD8 dispatch.
An example of this are pull constant loads since commit b56fa830c6,
but there are more cases.

This fixes a number of CTS test failures found in work-in-progress
tests that were hitting this situation for 16-wide pull constants
in a SIMD8 program.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-11-09 08:22:08 +01:00
Roland Scheidegger
a3c898dc97 gallivm: fix improper clamping of vertex index when fetching gs inputs
Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).

Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.

To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-11-09 00:53:03 +01:00
Aditya Swarup
a5c39ed974 i965: Lift restriction in external textures for EGLImage support
Fixes Skqp's unitTest_EGLImageTest test.

For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.

While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent
to the test because of this restriction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-11-08 12:33:06 -08:00
Ian Romanick
c5a4c26450 glsl: Add pragma to disable all warnings
Use #pragma warning(off) and #pragma warning(on) to disable or enable
all warnings.  This is a big hammer.  If we ever need a smaller hammer,
we can enhance this functionality.

There is one lame thing about this.  Because we parse everything, create
an AST, then convert the AST to GLSL IR, we have to treat the #pragma
like a statment.  This means that you can't do something like

'    void
'    #pragma warning(off)
'    __foo
'    #pragma warning(on)
'    (float param0);

Fixing that would, as far as I can tell, require a huge amount of work.

I did try just handling the #pragma during parsing (like we do for
state for the whole shader.

v2: Fix the #pragma lines in the commit message that git-commit ate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-08 11:00:00 -08:00
Ian Romanick
011abfc963 glsl: Add warning tests for identifiers with __
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-11-08 10:59:53 -08:00
Jason Ekstrand
d28bc35ece intel/fs: Add an assert to optimize_frontfacing_ternary
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
bcc6aab065 anv: Use nir_src_is_const and friends in lowering code
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
52145070c0 intel/analyze_ubo_ranges: Use nir_src_is_const and friends
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
1413512b4c intel/vec4: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:25 -06:00
Jason Ekstrand
61e15348c4 nir: Add a read_mask helper for ALU instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:22 -06:00
Jason Ekstrand
344cfe6980 intel/fs: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:20 -06:00
Jason Ekstrand
6b2918709a intel/fs,vec4: Clean up a repeated pattern with SSBOs
Everywhere we handle SSBO intrinsics, we have exactly the same pattern
for computing the index so we may as well make a helper for it.  We also
add a get_nir_src_imm to vec4 and use it for SSBO offsets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-11-08 10:09:06 -06:00
Samuel Pitoiset
c472ad82e4 radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK
HTILE is supported on these chips, not sure how I missed that.
This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used.

Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-08 11:20:03 +01:00
Samuel Pitoiset
f425d9ee74 radv: use LOAD_CONTEXT_REG when loading fast clear values
This avoids syncing the Micro Engine. This is only supported
for VI+ currently. There is probably a way for using
LOAD_CONTEXT_REG on previous chips but that could be done later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:45 +01:00
Samuel Pitoiset
0dcd99c687 radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+
Inclusive and exclusives scan are missing because older chips
don't have llvm.amdgcn.update.dpp.

This fixes crashes with dEQP-VK.subgroups.arithmetic.*.

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:41 +01:00
Adam Jackson
16f1023037 glx: Demand success from CreateContext requests (v2)
GLXCreate{,New}Context, like most X resource creation requests, does not
emit a reply and therefore is emitted into the X stream asynchronously.
However, unlike most resource creation requests, the GLXContext we
return is a handle to library state instead of an XID. So if context
creation fails for any reason - say, the server doesn't support indirect
contexts - then we will fail in strange places for strange reasons.

We could make every GLX entrypoint robust against half-created contexts,
or we could just verify that context creation worked. Reuse the
__glXIsDirect code to do this, as a cheap way of verifying that the
XID is real.

glXCreateContextAttribsARB solves this by using the _checked version of
the xcb command, so effectively this change makes the classic context
creation paths as robust as CreateContextAttribs.

v2: Better use of Bool, check that error != NULL first (Olivier Fourdan)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-11-07 12:38:05 -05:00
Karol Herbst
f7fae7f64e gm107/ir: fix compile time warning in getTEXSMask
In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)':
warning: control reaches end of non-void function [-Wreturn-type]

Reported-by: Moiman@freenode
Fixes: f821e80213
       "gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-07 17:48:58 +01:00
Michel Dänzer
32b0eb51a3 winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimport
It only behaves any different from amdgpu_bo_handle_type_kms with
libdrm 2.4.93, and it breaks if an older version is picked up.

Bugzilla: https://bugs.freedesktop.org/108096
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-07 17:37:47 +01:00
Lionel Landwerlin
792dde66f2 intel/dump_gpu: add platform option
Got tired of remembering the PCI ids.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-07 11:27:41 +00:00
Lionel Landwerlin
e262cc0353 intel/dump_gpu: move output option together
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-07 11:27:38 +00:00
Samuel Pitoiset
0a0aa2ba6c radv: disable conditional rendering for vkCmdCopyQueryPoolResults()
VK_EXT_conditional_rendering says that copy commands should not be
affected by conditional rendering.

Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 11:31:36 +01:00
Samuel Pitoiset
1e7c3379e1 radv: allocate enough space in CS when copying query results with compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 11:31:34 +01:00
Timothy Arceri
9aa3c1915e ac/nir_to_llvm: fix b2f for f64
Fixes: d7e0d47b9d ("nir: Add a bunch of b2[if] optimizations")

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 16:35:07 +11:00
Karol Herbst
f821e80213 gm107/ir: use scalar tex instructions where possible
TEXS, TLD4 and TLD4S are variants of tex instructions which are more
scalar, which gives RA more freedom and is less likely to insert silly
MOVs to satisfy quad registers.

shader-db changes:
total instructions in shared programs : 7687265 -> 7614782 (-0.94%)
total gprs used in shared programs    : 803620 -> 798045 (-0.69%)
total shared used in shared programs  : 639636 -> 639636 (0.00%)
total local used in shared programs   : 24648 -> 24648 (0.00%)
total bytes used in shared programs   : 82103400 -> 81330696 (-0.94%)

                local     shared        gpr       inst      bytes
    helped           0           0        3648       10647       10647
      hurt           0           0         464         205         205

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
edd6c41751 nv50/ir: add scalar field to TexInstructions
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
8d825f78fc nv50/ra: add condenseDef overloads for partial condenses
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Karol Herbst
a4550de434 nv50/ir: print color masks of tex instructions
v2: print the mask for TXG as well
    make the mask to be printed more mask like

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06 19:57:05 +01:00
Jason Ekstrand
610061838a vulkan: Update the XML and headers to 1.1.91
The biggest change here is the rename of VK_NVX_ray_tracing to
VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-06 12:21:19 -06:00
Gert Wollny
c171d76b94 r600: Add support for EXT_texture_sRGB_R8
Enables on R600 and makes pass:
  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*

v2: remove chunk for dri/radeon (Emil)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-11-06 18:49:02 +01:00
Lionel Landwerlin
421fa01d64 anv/android: mark gralloc allocated BOs as external
Allocating through Gralloc implies buffers are going to be used
outside the driver. We have special MOCS settings for external BOs and
we probably want to use them here too.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a1220e7311 ("anv/android: Set the BO flags in bo_cache_import (v2)")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-11-06 15:28:07 +00:00
Lionel Landwerlin
b43f955037 anv: stub internal android code
This reduces the amount of #ifdef ANDROID we'll have to have inside
the driver. Potentially offering better coverage of the android
extensions.

v2: Move anv_android.h include before anv_entrypoints.h (Tapani)
    Fix autotools android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-11-06 15:28:07 +00:00