Commit graph

101040 commits

Author SHA1 Message Date
Jason Ekstrand
ce47999cee Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"
This reverts commit 4e1bbb000c.  It turns
out that some DXVK apps due to some implementation detail of DXVK or
other create and destroy instances in an interleaved way.  Freeing the
glsl_type memory without being a bit more careful causes use-after-free
issues.  Looks like we need to try again.
2019-03-27 11:24:58 -05:00
Tomeu Vizoso
b817d00278 panfrost: Wait for last job to finish in force_flush_fragment
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-27 17:03:34 +01:00
Tomeu Vizoso
53ab812230 panfrost: Pass the context BOs to the kernel so they aren't unmapped while in use
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-27 17:03:34 +01:00
Tomeu Vizoso
b0f67c066f panfrost: Also tell the kernel about the checksum_slab
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-27 17:03:34 +01:00
Tomeu Vizoso
95748f6483 panfrost: Set the GEM handle for AFBC buffers
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-27 17:03:34 +01:00
Tomeu Vizoso
02081edfaf panfrost: Fix sscanf format options
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-27 17:03:34 +01:00
Alexandros Frantzis
3bccf70211 virgl: Fake MSAA when max samples is 1
When the host is running on softpipe/llvmpipe the maximum number of
samples for multisampling is 1. GL 3.0 requires at least 4 samples, and
softpipe/llvmpipe get around this by enabling PIPE_CAP_FAKE_SW_MSAA.

This patch mimics softpipe/llvmpipe behavior in virgl by enabling the
same PIPE_CAP_FAKE_SW_MSAA workaround when the max sample count reported
by the host is 1. This change allows virgl on a softpipe/llvmpipe host
to advertise support for GL 3.0 and beyond.

Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-03-27 15:46:14 +02:00
Samuel Pitoiset
d6a07732c9 ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-27 14:45:52 +01:00
Samuel Pitoiset
bea540173c spirv: propagate the access flag for store and load derefs
It was only propagated when UBO/SSBO access are lowered to offsets.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: <Jason Ekstrand jason@jlekstrand.net>
2019-03-27 09:57:30 +01:00
Samuel Pitoiset
4d0b03c83d nir: add nir_{load,store}_deref_with_access() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: <Jason Ekstrand jason@jlekstrand.net>
2019-03-27 09:57:27 +01:00
Timothy Arceri
d163780f81 spirv: make use of the select control support in nir
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
2019-03-27 02:39:12 +00:00
Timothy Arceri
e76ae39ae2 nir: add support for user defined select control
This will allow us to make use of the selection control support in
spirv and the GL support provided by EXT_control_flow_attributes.

Note this only supports if-statements as we dont support switches
in NIR.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
2019-03-27 02:39:12 +00:00
Timothy Arceri
24037ff228 spirv: make use of the loop control support in nir
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
2019-03-27 02:39:12 +00:00
Timothy Arceri
b56451f82c nir: add support for user defined loop control
This will allow us to make use of the loop control support in
spirv and the GL support provided by EXT_control_flow_attributes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
2019-03-27 02:39:12 +00:00
Alyssa Rosenzweig
6170814c42 panfrost: Preliminary work for mipmaps
This patch refactors a substantial amount of code in preparation for
mipmaps. In particular, we know have a correct slice abstraction based
on offsets; cpu/gpu are no longer arbitrary pointers. We additionally
shuffle around other code to accompany these changes and cleanup how
tiled textures are handled, while drawing some attention to the blit
code.

Mipmaps are still disabled at this point, as autogeneration is not yet
implemented; enabling as-is would cause regressions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-27 02:11:24 +00:00
Alyssa Rosenzweig
04a72391f3 panfrost/midgard: fpow is a two-part operation
In fact, the native "fpow" instruction only does half of it; more work
is needed for the actual instruction. For now, just lower.

Fixes: 1ea42894c ("panfrost/midgard: Implement fpow")

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig
12d1d99fee panfrost/midgard: Handle i2b constant
Fixes
dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.int_to_bool_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig
7b78af8e00 panfrost/midgard: Expand fge lowering to more types
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig
b8739c24ee panfrost/midgard: Add ult/ule ops
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig
f277bd3c22 panfrost: Stub out ES3 caps/callbacks
Although this is not functional (and the command stream side is not
aiming for ES3 right now), this is enough to run dEQP-GLES3 shader
tests with the version override directive; this is useful, as some ES3
shader feature can occur in ES2 class shaders due to lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:58 +00:00
Alyssa Rosenzweig
89989e653e panfrost/midgard: Cleanup midgard_nir_algebraic.py
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:37 +00:00
Alyssa Rosenzweig
effe6fb08d panfrost/midgard: Lower source modifiers for ints
On Midgard, float ops support standard source modifiers (abs/neg) and
destination modifiers (sat/pos/round). Integer ops do not support these,
however. To cope, we use native NIR source modifiers for floats, but
lower them away to iabs/ineg for integers, implementing those ops
simultaneously to avoid regressions.

Fixes the integer tests in
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:36 +00:00
Alyssa Rosenzweig
3208c9d9a2 panfrost/midgard: Implement b2i; improve b2f/f2b
Fixes
dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.bool_to_int_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:27 +00:00
Alyssa Rosenzweig
5b95fef493 panfrost/midgard: Lower i2b32
Fixes
dEQP-GLES2.functional.shader.conversions.scalar_to_scalar.int_to_bool_vertex

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:26 +00:00
Alyssa Rosenzweig
ae43b8faa7 panfrost/midgard: Lower f2b32 to fne
Fixes
dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_bvec2_x_vertex

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:24 +00:00
Alyssa Rosenzweig
3fb884259b panfrost/midgard: Lower bool_to_int32
Fixes dEQP-GLES2.functional.shaders.linkage.varying_type_vec2 (among
many others).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:22 +00:00
Alyssa Rosenzweig
53664108c2 panfrost/midgard: Map more bany/ball opcodes
Some of these are not yet fully functional due to related bugs, but this
the correct op mapping. The native ball/bany opcodes act on vec4's
unconditionally. That said, both ball and bany have the nice property
that duplicating an argument does not affect their output, so the
default "hanging swizzles" allow us to implement 2/3-component opcodes
correctly, implicitly lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:20 +00:00
Alyssa Rosenzweig
88b2a6b451 panfrost/midgard: Add more ball/bany, iabs ops
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:18 +00:00
Alyssa Rosenzweig
72cd677bac panfrost/midgard: Schedule ball/bany to vectors
Though they output scalars, they need a vector unit to make sense.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:17 +00:00
Alyssa Rosenzweig
89fdbb6707 panfrost/midgard: Add fcsel_i opcode
Whereas a normal fcsel acts on a boolean input in r31.w, the fcsel_i
variant acts on an integer input in r31.w, which can be preloaded with
an instruction like imov (with the appropriate negate flag on the
source).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:15 +00:00
Alyssa Rosenzweig
121417ef1d panfrost: Implement scissor test
This preliminary implementation should handle some basic cases. Future
work should scissor the FRAGMENT job as well for efficiency.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:14 +00:00
Alyssa Rosenzweig
bd9446e719 panfrost: Fix viewports
Our viewport code hardcoded a number of wrong assumptions, which sort of
sometimes worked but was definitely wrong (and broke most of dEQP). This
corrects the logic, accounting for flipped-Y framebuffers, which
fixes... most of dEQP.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:10 +00:00
Alyssa Rosenzweig
9da4603fb6 panfrost/midgard: Fix b2f32 swizzle for vectors
Fixes issues in most of dEQP-GLES2.functional.shaders.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-26 23:35:08 +00:00
Dave Airlie
e77013fb7f softpipe: fix clears to only clear specified color buffers.
This fixes piglit clearbuffer-mixed-format

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-27 07:53:32 +10:00
Dave Airlie
7f7c9425a8 draw/vs: partly fix basevertex/vertex id
This gets the basevertex from the draw depending on whether
it's an indexed or non-indexed draw.

We still fail a transform feedback test for vertex id, as
the vertex id actually an index id, and isn't getting translated
properly to a vertex id, suggestions on how/where to fix that welcome.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-27 07:52:28 +10:00
Nicolai Hähnle
e16ac33f37 amd/surface: provide firstMipIdInTail for metadata surface calculations
This field was added in a recent addrlib update, and while there
currently seems to be no issue with skipping it, we will have to
set it correctly in the future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-03-26 10:00:55 +01:00
Bas Nieuwenhuizen
82075e3c42 ac/nir: Return frag_coord as integer.
To preserve the invariant that nir ssa defs are integers or pointers
in LLVM.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-03-26 09:41:15 +01:00
Kristian H. Kristensen
c7c432738a freedreno/ir3: Fix operand order for DSX/DSY
Most cat5 instructions are constructed using ir3_SAM, which uses
regs[1] for the (sampler, tex) src. Not DSX/DSY though, so we look up
src1 and src2 differently for those two.

Fixes: 1dffb089 ("freedreno/ir3: fix sam.s2en encoding")
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-03-25 18:36:48 -07:00
Kristian H. Kristensen
a752422bd4 freedreno/ir3: Track whether shader needs derivatives
In 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") we
started counting number of samplers based on the uniform vars instead
of number of cat5 instructions.  We used the number of samplers to
determine whether to enable derivatives, but when we only use
derivatives and no samplers, that now breaks.  Track whether we need
derivatives explicitly and use that to enable the state.

Fixes: 1088b788 ("freedreno/ir3: find # of samplers from uniform vars")
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-03-25 18:36:48 -07:00
Andre Heider
12f11e6fe6 st/nine: enable csmt per default on iris
iris is thread safe, enable csmt for a ~5% performace boost.

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2019-03-25 22:21:19 +01:00
Jason Ekstrand
8ed583fe52 spirv: Handle the NonUniformEXT decoration 2019-03-25 16:12:09 -05:00
Jason Ekstrand
e50ab2c0f2 nir: Add access flags to deref and SSBO atomics
We will need them for a new ACCESS_NON_UNIFORM flag that's about to be
added in the next commit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-25 16:12:09 -05:00
Jason Ekstrand
40074ebf74 nir: Add texture sources and intrinsics for bindless
On Intel, we have both bindless and bindful and we'd like to use them at
the same time if we can so we need to be able to distinguish at the NIR
level between the two.  This also fixes nir_lower_tex to properly handle
bindless in its tex_texture_size and get_texture_lod helpers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-25 16:12:09 -05:00
Danylo Piliaiev
e0db0c74b9 intel/fs: Make alpha test work with MRT and sample mask
Fix the order of src0_alpha and sample mask in fb payload.
From SKL PRM Volume 7, "Data Payload Register Order
for Render Target Write Messages":
 Type   S0A  oM  sZ  oS  M2     M3       M4
 SIMD8   1   1   0   0   s0A    oM       R
 SIMD16  1   1   0   0   1/0s0A 3/2s0A   oM

It also fixes working of alpha to coverage with sample mask
on GEN6 since now they are in correct order.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by:  Francisco Jerez <currojerez@riseup.net>
2019-03-25 13:54:55 -07:00
Danylo Piliaiev
c8abe03f3b i965,iris,anv: Make alpha to coverage work with sample mask
From "Alpha Coverage" section of SKL PRM Volume 7:
 "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in
  hardware, regardless of the state setting for this feature."

From OpenGL spec 4.6, "15.2 Shader Execution":
 "The built-in integer array gl_SampleMask can be used to change
 the sample coverage for a fragment from within the shader."

From OpenGL spec 4.6, "17.3.1 Alpha To Coverage":
 "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value
  is generated where each bit is determined by the alpha value at the
  corresponding sample location. The temporary coverage value is then
  ANDed with the fragment coverage value to generate a new fragment
  coverage value."

Similar wording could be found in Vulkan spec 1.1.100
"25.6. Multisample Coverage"

Thus we need to compute alpha to coverage dithering manually in shader
and replace sample mask store with the bitwise-AND of sample mask and
alpha to coverage dithering.

The following formula is used to compute final sample mask:
  m = int(16.0 * clamp(src0_alpha, 0.0, 1.0))
  dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) |
     0x0808 * (m & 2) | 0x0100 * (m & 1)
  sample_mask = sample_mask & dither_mask
Credits to Francisco Jerez <currojerez@riseup.net> for creating it.

It gives a number of ones proportional to the alpha for 2, 4, 8 or 16
least significant bits of the result.

GEN6 hardware does not have issue with simultaneous usage of sample mask
and alpha to coverage however due to the wrong sending order of oMask
and src0_alpha it is still affected by it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-03-25 13:54:55 -07:00
Jason Ekstrand
3bd5457641 nir: Add a lowering pass for non-uniform resource access
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-25 15:00:36 -05:00
Jason Ekstrand
39da1deb49 nir/lower_io: Add a bounds-checked 64-bit global address format
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-25 14:40:54 -05:00
Dave Airlie
551950cacd draw/gs: fix point size outputs from geometry shader.
If the geom shader emits a point size we failed to find it here,
use the correct API to look it up.

Fixes:
tests/spec/glsl-1.50/execution/geometry/point-size-out.shader_test

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-26 05:17:06 +10:00
Dave Airlie
d3836510d2 draw: bail instead of assert on instance count (v2)
With indirect rendering it's fine to set the instance count
parameter to 0, and expect the rendering to be ignored.

Fixes assert in KHR-GLES31.core.compute_shader.pipeline-gen-draw-commands
on softpipe

v2: return earlier before changing fpstate

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-26 05:16:56 +10:00
Leo Liu
382401aab7 vl/dri3: remove the wait before getting back buffer
The wait here is unnecessary since we got a pool of back buffers,
and the wait for swap buffer will happen before the present pixmap,
at the same time the previous back buffer will be put back to pool
for reuse after the check for PresentIdleNotify event

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2019-03-25 12:20:31 -04:00