Commit graph

32650 commits

Author SHA1 Message Date
Nicolai Hähnle
704ddbcdf6 radeonsi: set MIP_POINT_PRECLAMP to 0
This fixes a bug with nearest ("point") mip selection when the fractional
part of max_lod is in (0.5,1). In this case, the spec mandates that
we still select the mip level ceil(max_lod) in the clamping case. However,
MIP_POINT_PRECLAMP will clamp before the mip selection, which is wrong.

Supposedly this setting was originally copied from the closed Vulkan
driver, but as far as I can tell, closed Vulkan was actually changed back
recently :)

Fixes dEQP-GLES3.functional.texture.mipmap.2d.max_lod.{nearest,linear}_nearest

Fixes: f7420ef5b4 ("radeonsi: enable some sampler fields to match the closed driver")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 15:43:13 +02:00
Nicolai Hähnle
87f7c7bd65 radeonsi: fix array textures layer coordinate
Like for cube map (array) gather, we need to round to nearest on <= VI.

Fixes tests in dEQP-GLES3.functional.shaders.texture_functions.texture.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-20 15:42:58 +02:00
Christian Gmeiner
62a8ca22cd etnaviv: move sw query defines to etnaviv_query_sw.h
Also add new define ETNA_SW_QUERY_BASE.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-20 14:25:41 +02:00
Christian Gmeiner
a3d79946e5 etnaviv: move sw get_driver_query_info(..)
This change makes etna_get_driver_query_info(..) more generic
and puts the knowledge of supported queries directly besides
the implementation.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-20 14:25:13 +02:00
Eric Anholt
3752ad28f2 broadcom/vc4: Fix use-after-free when deleting a program.
By leaving the compiled shader in the context's stage state, the next
compile of a new FS would look in the old compiled FS for figuring out
whether to set various dirty flags for the VS compile.  Clear out the
pointer when deleting the program, and make sure that we always mark the
state as dirty if the previous program had been lost.  Fixes valgrind
warnings on glsl-max-varyings.

Fixes: 2350569a78 ("vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far.")
2017-09-18 20:17:25 -07:00
Eric Anholt
4db9ad9893 broadcom/vc4: Fix crashes since the gallium blitter reworks.
Even if we're not clearing color, the blitter has started dereferencing
the color value.
2017-09-18 16:16:00 -07:00
Eric Anholt
9940fb4205 broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.
The blitter will bind just the depth buffer, which flushes the current job
if we had both a color and depth/stencil.  If the clear was doing partial
depth/stencil (quad-based) and color (tile-based), we'd go on to try to
set up the rest of the tile clear in the now flushed job.

Instead, move the partial clear up before we start setting up the job for
the current FBO state, and re-fetch the job if we're continuing on to a
tile-based clear.  Fixes valgrind failures in fbo-depthtex.

Fixes: 9421a6065c ("vc4: Fix fallback to quad clears of depth in GLX.")
2017-09-18 16:16:00 -07:00
Eric Anholt
d88a75182d broadcom/vc4: Fix use-after-free for flushing when writing to a texture.
I was trying to continue the hash table loop, not the inner loop.  This
tended to work out, because we would have *just* freed the job struct.
Fixes some valgrind failures in fbo-depthtex.

Fixes: f597ac3966 ("vc4: Implement job shuffling")
2017-09-18 16:15:58 -07:00
Eric Anholt
6e3d7a5916 ttn: Fix out-of-bounds accesses since the always-2D-constants change.
Only one of the three checks for dim was updated, so we would try to set a
UBO buffer index source value on a nir_load_uniform, and wouldn't actually
declare non-UBO uniforms.

Fixes: 37dd8e8dee ("gallium: all drivers should accept two-dimensional constant buffer indexing")
Tested-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 16:14:27 -07:00
Marek Olšák
a5b764cfea radeonsi: reallocate if a non-sharable textures is being shared
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 17:47:49 +02:00
Marek Olšák
7b616f7b71 radeonsi: PIPE_BIND_SHARED should allow inter-process sharing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 17:47:49 +02:00
Nicolai Hähnle
f0233ac82d freedreno: compile fix
Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Reported-by: Jan Vesely <jan.vesely@rutgers.edu>
2017-09-18 17:39:20 +02:00
Jan Vesely
30741187c1 clover: add missing include to compat.h
Fixes build issues with llvm-3.6
Fixes: 3115687f9b (clover: Fix build after
LLVM r313390)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-18 16:32:09 +01:00
Jan Vesely
fdf0f1db22 clover: Query and export half precision support
v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
    has_halfs -> has_halves

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-09-18 10:45:02 -04:00
Jan Vesely
7b2c5547c3 gallium: Add PIPE_SHADER_CAP_FP16
Denotes native half precision float operations capability
v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
    fix indentation

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 10:45:02 -04:00
Benedikt Schemmer
c302f8fa7c nvc0: fix compile error
Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Benedikt Schemmer <ben@besd.de>
Previously-pointed-out-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 15:31:35 +02:00
Nicolai Hähnle
7a62f8621a radeonsi: allow out-of-order rasterization in commutative blending cases
We do not enable this by default for additive blending, since it slightly
breaks OpenGL invariance guarantees due to non-determinism.

Still, there may be some applications can benefit from white-listing
via the radeonsi_commutative_blend_add drirc setting without any real
visible artifacts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:20 +02:00
Nicolai Hähnle
8c56c45cd4 radeonsi: add drirc option "radeonsi_assume_no_z_fights"
This option enables a performance optimization where typical non-blending
draws with depth buffer may be rasterized out-of-order (on VI+, multi-SE
chips).

This optimization can lead to incorrect results when an applications
renders multiple objects with the same Z value at the same pixel, so we
will never enable it by default. But there may be applications that could
benefit from white-listing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:19 +02:00
Nicolai Hähnle
aab134cfa5 radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs
This does not take commutative blending into account yet.

R600_DEBUG=nooutoforder disables it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:19 +02:00
Nicolai Hähnle
66d03d0e3e gallium/radeon: pass old_(perfect_)enable to set_occlusion_query_state
The callee can derive the current enable state itself.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:19 +02:00
Nicolai Hähnle
3f6b3d9db7 gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE
To be able to properly distinguish between GL_ANY_SAMPLES_PASSED
and GL_ANY_SAMPLES_PASSED_CONSERVATIVE.

This patch goes through all drivers, having them treat the two
query types identically, except:

1. radeon incorrectly enabled conservative mode on
   PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only
   on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE.
2. st/mesa uses the new query type.

Fixes dEQP-GLES31.functional.fbo.no_attachments.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
6772452e4c amd/common: remove has_ds_bpermute argument from ac_build_ddxy
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
3db86d86ed amd/common: add chip_class to ac_llvm_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
e0af3bed2c amd/common: round cube array slice in ac_prepare_cube_coords
The NIR-to-LLVM pass already does this; now the same fix covers
radeonsi as well.

Fixes various tests of
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
6fb0c1013b radeonsi: workaround for gather4 on integer cube maps
This is the same workaround that radv already applied in commit
3ece76f03d ("radv/ac: gather4 cube workaround integer").

Fixes dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i/ui.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:17 +02:00
Jan Vesely
3115687f9b clover: Fix build after LLVM r313390
v2: pass llvm context reference instead of a pointer

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-09-15 21:39:54 -04:00
Gurkirpal Singh
6a8aa11c20 st/omx_bellagio: Rename state tracker and option
Changes --enable-omx option to --enable-omx-bellagio

Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com>
Reviewed-and-Tested-by: Julien Isorce <julien.iso...@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-15 14:28:36 +02:00
Dave Airlie
1b163238f5 r600: add .gitignore for egd_tables.h 2017-09-15 13:55:01 +10:00
Timothy Arceri
a70a401f52 radeonsi: enable STD430 packing of UBOs by default
Before this change we were defaulting to STD140 which is slightly
less efficient at packing arrays.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
c96e45ebf0 gallium: introduce PIPE_CAP_LOAD_CONSTBUF
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
b4401cc104 radeonsi: make use of LOAD for UBOs
v2: always set can_speculate and allow_smem to true

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
6fa60b5e40 gallium: add CONSTBUF type to tgsi_file_type
This will be use to distinguish between load types when using
the TGSI_OPCODE_LOAD opcode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:54 +10:00
Dave Airlie
b6f6ead198 virgl: drop const dimensions on first block.
The virgl protocol version of tgsi doesn't handle this yet,
transform it back to the old ways.

Thanks to Nicolai Hähnle <nicolai.haehnle@amd.com>
for also writing nearly the same patch.

Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D
Tested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-15 10:33:14 +10:00
Samuel Pitoiset
f0d09d9012 radeonsi: move si_get_wave_info() to AMD common code
This will allow us to use it from radv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Eric Engestrom
396d2dbce4 swr: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-14 09:36:01 +01:00
Denis Pauk
74d2456491 gallium/{r600, radeonsi}: Fix segfault with color format (v2)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552

v2: Patch cleanup proposed by Nicolai Hähnle.
    * deleted changes in si_translate_texformat.

Cc: Nicolai Hähnle <nhaehnle@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-09-14 00:59:24 +02:00
Nicolai Hähnle
e4af4433fc radeonsi: hard-code pixel center for interpolateAtSample without multisample buffers
The GLSL rules for interpolateAtSample are unfortunate:

   "Returns the value of the input interpolant variable at
    the location of sample number sample. If
    multisample buffers are not available, the input
    variable will be evaluated at the center of the pixel.
    If sample sample does not exist, the position used to
    interpolate the input variable is undefined."

This fix will fallback to monolithic shader compilation when
interpolateAtSample is used without multisampling.

One alternative would be to always upload 16 sample positions,
filling the buffer up with repetition when the actual number of
samples is less, and then ANDing the sample ID with 0xf. However,
that punishes all well-behaving users of interpolateAtSample,
when in reality, only conformance tests should be affected by
the issue.

Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:45 +02:00
Nicolai Hähnle
92c4277990 radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog
gl_SampleMaskIn is supposed to contain set bits only for the samples that
are covered by the current fragment shader invocation, but the VGPR
initialization hardware loads the set of all bits that are covered at the
current pixel.

Fixes various tests in
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:41 +02:00
Nicolai Hähnle
792724a337 radeonsi: remove SET_PREDICATION workaround on newer firmware
We need to keep the workaround for older firmware, though.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:08 +02:00
Nicolai Hähnle
b8c6e88848 amd/common: get ME/PFP/CE firmware feature versions as well
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:06 +02:00
Nicolai Hähnle
8d8f1ef573 radeonsi: rename variable to clarify its meaning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Nicolai Hähnle
48b3364b5b radeonsi: make si_init_shader_selector_async static
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Nicolai Hähnle
7e4344151f radeonsi: fix segfault in descriptor dumping
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Nicolai Hähnle
81f398dcb1 ddebug: write out final driver log messages with GALLIUM_DDEBUG=always
If the last operation happens to be a non-draw, such as a
transfer_map that triggers a decompress blit, there may be
interesting messages left in the driver log.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Tim Rowley
000e2958f5 swr/rast: Fetch compile state changes
Add InstanceStrideEnable field and rename InstanceDataStepRate to
InstanceAdvancementState in INPUT_ELEMENT_DESC structure.

Add stubs for handling InstanceStrideEnable in FetchJit::JitLoadVertices()
and FetchJit::JitGatherVertices() and assert if they are triggered.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:54 -05:00
Tim Rowley
ead0dfe31e swr/rast: adjust linux cpu topology identification code
Make more robust to handle strange strange configurations like a vmware
exported 4-way numa X 1-core configuration.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:47 -05:00
Tim Rowley
1ccf9ad280 swr/rast: Missed conversion to SIMD_T
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:41 -05:00
Tim Rowley
c0ce5c4422 swr/rast: whitespace changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:35 -05:00
Tim Rowley
6b9e801832 swr/rast: add graph write to jit debug putput
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:30 -05:00
Tim Rowley
6f0fcec07a swr/rast: Migrate memory pointers to gfxptr_t type
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:24 -05:00