Commit graph

118017 commits

Author SHA1 Message Date
Eric Anholt
424d5e4e11 turnip: Disable timestamp queries for now.
They're not implemented, and not critical to bring up immediately.  Avoids
failures in the CTS when nothing gets written to the query.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 10:05:59 -08:00
Jonathan Marek
080c92e7d4 freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
98d7125b36 freedreno/perfcntrs/fdperf: add missing a20x compatible
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
24cde37e8d freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
baab4017b9 freedreno/perfcntrs: add a2xx MH counters
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
0d0c8a9e82 freedreno/registers: add missing MH perfcounter enum for a2xx
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Michel Dänzer
a3b3d3bfcc gitlab-ci: Put HTML summary in artifacts for failed piglit jobs
This will make it easier to look at details of failed / skipped tests.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 10:20:31 +01:00
Michel Dänzer
07c1346113 gitlab-ci: Stop storing piglit test results as JUnit
Since we're not reporting test results as JUnit anymore, we can use the
default JSON format.

This affects how test results are summarized, update the reference files
accordingly.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 10:19:22 +01:00
Michel Dänzer
c9cdb7cef0 gitlab-ci: Stop reporting piglit test results via JUnit
It was basically useless in this form, and processing the JUnit data in
the GitLab backend was pretty expensive.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 10:18:33 +01:00
Iago Toral Quiroga
18a09e788d v3d: fix indirect BO allocation for uniforms
We were always ensuring a minimum size of 4 bytes for uniforms
for the case where we don't have any, to account for hardware pre-fetching
of the uniform stream, however, pre-fetching could also lead to to out
of bounds reads when have read the last uniform in the stream, so we
probably want to have the extra 4 bytes to prevent the kernel from
observing invalid memory accesses when the uniform stream sits right at
the end of a page.

This seems to fix MMU exceptions reported with a Linux 5.4 kernel.

Credit goes to Phil Elwell for identifying the problem and narrowing
it down to memory accesses in the uniform stream.

Reported-by: Phil Elwell <phil@raspberrypi.org>
Tested-by: Phil Elwell <phil@raspberrypi.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 08:43:13 +01:00
Samuel Pitoiset
a24f1c8f7f radv: enable VK_KHR_shader_subgroup_extended_types on GFX10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:42:44 +01:00
Samuel Pitoiset
0812dbd403 ac: add 8-bit and 16-bit supports to ac_build_permlane16()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:42:42 +01:00
Samuel Pitoiset
c9aa843961 radv/gfx10: fix implementation of exclusive scans
This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.

Fixes: 227c29a80d ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:39:26 +01:00
Samuel Pitoiset
86a5fbfd4a radv: fix enabling sample shading with SampleID/SamplePosition
When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.

Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:22:54 +01:00
Jonathan Marek
62ff90cc5e turnip: fix integer render targets
Add missing required bits.  Fixes at least:

dEQP-VK.pipeline.render_to_image.dedicated_allocation.1d.small.r16g16_sint_d24_unorm_s8_uint
dEQP-VK.pipeline.render_to_image.dedicated_allocation.2d.mipmap.r16g16_sint_d24_unorm_s8_uint
dEQP-VK.renderpass.dedicated_allocation.attachment.4.401
dEQP-VK.renderpass2.suballocation.formats.r16_uint.load.draw
dEQP-VK.synchronization.op.single_queue.barrier.write_draw_read_copy_image_to_buffer.image_128x128_r16_uint

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 16:01:19 -08:00
Jason Ekstrand
a8965c076b anv: Push constants are relative to dynamic state on IVB
Fixes: aecde2351 "anv: Pre-compute push ranges for graphics pipelines"
Closes: #2136
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 22:15:54 +00:00
Dylan Baker
a24d6fbae6 meson: Add -Werror=gnu-empty-initializer to MSVC compat args
Only clang has this argument (at least as of clang 8 and gcc 9), which
errors when using the gcc empty initializer syntax in C:

```C
struct foo f = {};
```

GCC has a warning for this, but only when using -Wpedantic, which is a
lot of noise to lose useful warnings in.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-26 12:48:11 -08:00
Dylan Baker
25e58e3718 gallium/auxiliary: Fix uses of gnu struct = {} extension
Most of these will never actually be compiled by windows, but in the
interest of being able to make using struct foo = {}; an error and
avoiding breaking windows removing a handful of safe uses seems like a
good trade off.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-26 12:48:11 -08:00
Marek Olšák
ed1ff99da7 st/mesa: add st_variant base class to simplify code for shader variants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
b8772a559a st/mesa: don't use ** in the st_nir_link_shaders signature
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
adbba2142d st/mesa: simplify looping over linked shaders when linking NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
8567e06046 st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
e8f0a39d45 st/mesa: don't call ProgramStringNotify in glsl_to_nir
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
5a714531f7 st/mesa: don't use redundant stp->state.ir.nir
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
6cf011fcc8 st/mesa: don't serialize all streamout state if there are no SO outputs
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Kenneth Graunke
3fdf2bb313 iris: Disable VF cache partial address workaround on Gen11+
The vertex cache uses the full 48-bit address on Gen11+.  See the
documentation for 3DSTATE_VERTEX_BUFFERS, which describes the
workaround and lists it as pre-Icelake.

Interestingly, the docs don't mention index buffers as needing a
workaround at all.  So either we've been overzealous, or the docs
never got updated to record that.  Which begs the question of whether
the issue there was fixed, if there was one...

Cuts 40% of the PIPE_CONTROLs from Civilization VI's benchmark; appears
that it improves performance by about 1-2% on Icelake 8x8 (not frequency
locked).
2019-11-26 12:13:34 -08:00
Rob Clark
8d9f5a28e3 freedreno: switch to layout helper
The slices table and most of the other layout fields in the
freedreno_resource moves into fdl_layout.

v2: Changes by anholt to not have duplicate fields, which was introducing
    a surprising behavior change in resource layout (using the
    level_linear helper before the setup of the shadowed fields)

Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:08 +00:00
Eric Anholt
997b8d4749 freedreno/a6xx: Log the tiling mode in resource layout debug.
This was important for figuring out what went wrong with the layout
refactor.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
2e62a622e7 freedreno: Convert the slice struct to the new resource header.
This gets the worst of the sed required for shared resource layout out of
the way.  The texture layout comment is dropped now that we're referencing
the shared header, which has a more complete description.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
930432577f freedreno: Introduce a resource layout header.
This will be used for sharing resource layout code between freedreno and
tu.  Mostly copied from a commit by Rob, with a new location and the slice
struct renamed for consistency.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
2ec420b264 freedreno: Introduce a fd_resource_tile_mode() helper.
Multiple places were doing the same thing to get the tile mode of a level,
so refactor it out.  This will make the shared resource helper transition
cleaner.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
6b09227ede freedreno: Introduce a fd_resource_layer_stride() helper.
This factors out a bit of duplicated code, but will also make the shared
resource layout transition process clearer.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Rob Clark
9e9a26c768 freedreno: use rsc->slice accessor everywhere
This will make it easier to extract the slice table out into a layout
helper.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
d845dca0f5 nir: Make algebraic backtrack and reprocess after a replacement.
The algebraic pass was exhibiting O(n^2) behavior in
dEQP-GLES2.functional.uniform_api.random.3 and
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 (along with
other code-generated tests, and likely real-world loop-unroll cases).
In the process of using fmul(b2f(x), b2f(x)) -> b2f(iand(x, y)) to
transform:

result = b2f(a == b);
result *= b2f(c == d);
...
result *= b2f(z == w);

->

temp = (a == b)
temp = temp && (c == d)
...
temp = temp && (z == w)
result = b2f(temp);

nir_opt_algebraic, proceeding bottom-to-top, would match and convert
the top-most fmul(b2f(), b2f()) case each time, leaving the new b2f to
be matched by the next fmul down on the next time algebraic got run by
the optimization loop.

Back in 2016 in 7be8d07732 ("nir: Do opt_algebraic in reverse
order."), Matt changed algebraic to go bottom-to-top so that we would
match the biggest patterns first.  This helped his cases, but I
believe introduced this failure mode.  Instead of reverting that, now
that we've got the automaton, we can update the automaton's state
recursively and just re-process any instructions whose state has
changed (indicating that they might match new things).  There's a
small chance that the state will hash to the same value and miss out
on this round of algebraic, but this seems to be good enough to fix
dEQP.

Effects with NIR_VALIDATE=0 (improvement is better with validation enabled):

Intel shader-db runtime -0.954712% +/- 0.333844% (n=44/46, obvious throttling
  outliers removed)
dEQP-GLES2.functional.uniform_api.random.3 runtime
  -65.3512% +/- 4.22369% (n=21, was 1.4s)
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime
  -68.8066% +/- 6.49523% (was 4.8s)

v2: Use two worklists, suggested by @cwabbott, to cut out a bunch of
    tricky code.  Runtime of uniform_api.random.3 down -0.790299% +/-
    0.244213% compred to v1.
v3: Re-add the nir_instr_remove() that I accidentally dropped in v2,
    fixing infinite loops.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-26 10:13:46 -08:00
Eric Anholt
90ad6304bf nir: Refactor algebraic's block walk
My motivation was to clarify the changes in the following commit, but
incidentally, it reduces runtime of
dEQP-GLES2.functional.uniform_api.random.3 (an algebraic-heavy
testcase) by -5.39524% +/- 2.21179% (n=15)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-26 10:13:40 -08:00
Connor Abbott
305d1300f9 nir: Maintain the algebraic automaton's state as we work.
In order to have nir_opt_algebraic be able to do further algebraic
work on the output of a replacement, we need to maintain the
automaton's state.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 10:13:19 -08:00
Jonathan Marek
2da4a58ed9 etnaviv: support 3d/array/integer formats in texture descriptors
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-26 19:07:04 +01:00
Jonathan Marek
7806e058c9 etnaviv: blt: fix partial ZS clears with TS
If not all bits are cleared, then BLT needs to be given the current clear
value and not the new one.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-26 19:04:51 +01:00
Daniel Schürmann
7cd548d352 aco: don't value-number instructions from within a loop with ones after the loop.
Fixes:
Wolfenstein:Youngblood (w/o shader_ballot)
dEQP-VK.descriptor_indexing.combined_image_sampler_in_loop_with_lod

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-26 14:39:27 +00:00
Rhys Perry
46420dd294 aco: set dlc/glc correctly for image loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-11-26 14:39:27 +00:00
Rhys Perry
37843e454e aco: allow constant offsets for global/scratch instructions on GFX10
I don't think the bug applies for global/scratch instructions and
load_barycentric_at_sample selection expects this feature to work.

Fixes various dEQP-VK.pipeline.multisample_interpolation.* tests on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-11-26 14:39:27 +00:00
Bas Nieuwenhuizen
02375b8436 radv: Enable VK_KHR_buffer_device_address.
Still no capture/replay or multi device support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-26 11:59:52 +00:00
Samuel Pitoiset
34dd4251e2 radv: fix reporting subgroup size with VK_KHR_pipeline_executable_properties
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-26 10:48:48 +01:00
Bas Nieuwenhuizen
25bc9102d8 radv: Allocate cmdbuffer space for buffer marker write.
Fixes: 946193ae00 "radv: add support for VK_AMD_buffer_marker"
Reviewed-by:  Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-26 09:35:02 +00:00
Gert Wollny
e41958e344 r600: Disable eight bit three channel formats
Commit 0899bf55 made some deqp-gles3 tests related to RGB8 PBOs fail
on R600 because it exposed PIPE_FORMAT_R8G8B8_UNORM and R600 doesn't
propely handle this. Disabling this format also for buffers fixes the
issue.

In addition, disabling also the related RGB8 integer formats for buffers
fixes some deqp-gles3 tests:

  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb8ui_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_cube
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_3d
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_3d

Fixes: 0899bf55
  st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM

Closes #2118

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 09:28:52 +01:00
Samuel Pitoiset
f6770b9726 ac/llvm: fix warning in ac_build_canonicalize()
../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’:
../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 4567 |  return ac_build_intrinsic(ctx, intr, type, params, 1,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 4568 |       AC_FUNC_ATTR_READNONE);
      |       ~~~~~~~~~~~~~~~~~~~~~~
../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-26 08:35:10 +01:00
Tapani Pälli
5d58fea660 mapi: add GetInteger64vEXT with EXT_disjoint_timer_query
From EXT_disjoint_timer_query spec:

   "Interaction: This extension adds GetInteger64vEXT if
    OpenGL ES 3.0 is not supported"

See https://github.com/KhronosGroup/OpenGL-Registry/issues/326.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-26 07:41:24 +02:00
Jason Ekstrand
200a3301e2 vulkan: Update the XML and headers to 1.1.129
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 02:48:42 +00:00
Jason Ekstrand
854859fefa anv/entrypoints: Better handle promoted extensions
In the case of promoted extensions we can end up with an entrypoint that
we support being an alias of an entrypoint we do not support.  For
instance, if an extension gets promoted from EXT to KHR, the EXT entry-
points may be aliases of the KHR ones.  We want to leave everything as
EXT until we get around to advertising the KHR so that we don't break
things when we update the XML and headers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 02:48:42 +00:00
Jason Ekstrand
121551bfdb vulkan/enum_to_str: Handle out-of-order aliases
The current code can only handle enum aliases if the original enum is
declared first followed by the alias as we walk the XML in a linear
fashion.  This commit allows us to handle aliases where the alias
declaration comes before the thing it's aliasing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 02:48:42 +00:00