Commit graph

118156 commits

Author SHA1 Message Date
Vasily Khoruzhick
8c12f4e5f2 lima: enable tiling
Now that we have tiled format modifier merged into linux we can enable tiling.

That should improve overall performance and also workaround broken mipmapping
for linear textures since now we prefer tiled textures.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-04 08:20:56 -08:00
Tapani Pälli
272ef5d39a glsl: additional interface redeclaration check for SSO programs
Patch adds additional linker check for SSO programs to make sure they
are redeclaring built-in blocks as required by the desktop spec.

This fixes following Piglit tests:
   arb_separate_shader_objects/linker/pervertex-*

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-12-04 15:27:41 +00:00
Tapani Pälli
2d26cc077d gitlab-ci: bump piglit checkout commit
Commit also updates the Piglit quick_gl.txt, list modifications happened
due to following Piglit commits: c248bf201,c acff58ca, 5603e2e60.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-12-04 15:27:41 +00:00
Rhys Perry
3e67aa2e4e nir/load_store_vectorize: fix combining stores with aliasing loads between
v2: add test

Fixes: ce9205c03b ('nir: add a load/store vectorization pass')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v2)
2019-12-04 12:21:40 +00:00
Timur Kristóf
637c5a1dd9 aco/wave32: Fix reductions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
21db083504 aco/wave32: Allow setting the subgroup ballot size to 64-bit.
Previously, it would only work when the ballot size was set to the
lane mask. This patch makes is possible to set the ballot size
to either 32-bit or 64-bit for wave32 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
ed815d503e aco/wave32: Use wave_size for barrier intrinsic.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
b8f2edb452 aco/wave32: Fix load_local_invocation_index to support wave32.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
e0bcefc3a0 aco/wave32: Use lane mask regclass for exec/vcc.
Currently all usages of exec and vcc are hardcoded to use s2 regclass.
This commit makes it possible to use s1 in wave32 mode and
s2 in wave64 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
b4efe179ed aco/wave32: Add wave size specific opcodes to aco_builder.
Several places in ACO we use SOP1 or SOP2 instructions to operate over the
exec mask or VCC, and these need to be adapted to the new size in wave32
mode.

This commit adds a way to deal with this problem in aco_builder: the caller
can specify a wave size specific opcode and the builder will translate that
to the correct opcode based on the current wave size.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
c44af6cbc7 aco/wave32: Introduce emit_mbcnt which takes wave size into account.
This is relevant because in wave32 mode the v_mbcnt_hi_u32_b32
instruction is superfluous.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
07754a9c9e aco/wave32: Replace hardcoded numbers in spiller with wave size.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
c0dbf42a03 aco/wave32: Change uniform bool optimization to work with wave32.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
dd9dad731b aco: Optimize load_subgroup_id to one bit field extract instruction.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
753670e902 aco: Remove lower_linear_bool_phi, it is not needed anymore.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
0d2d672020 aco: Remove superfluous argument from emit_boolean_logic.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
9a43d26b74 aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Michel Dänzer
5585b8eadd gitlab-ci: Run piglit glslparser & quick_shader tests separately
And only use --process-isolation false for the quick_gl tests.

This will hopefully avoid variance in the test results that we've been
seeing lately. But even if it doesn't, it should at least help narrow
down the cause of the variance.

Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-04 10:36:33 +01:00
Lionel Landwerlin
ddacd3d43b intel/perf: fix improper pointer access
This expression was unused by the macro, probably why it didn't
register in the compilation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
8c0b058263 intel/perf: simplify the processing of OA reports
This is a more accurate description of what happens in processing the
OA reports.

Previously we only had a somewhat difficult to parse state machine
tracking the context ID.

What we really only need to do to decide if the delta between 2
reports (r0 & r1) should be accumulated in the query result is :

   * whether the r0 is tagged with the context ID relevant to us

   * if r0 is not tagged with our context ID and r1 is: does r0 have a
     invalid context id? If not then we're in a case where i915 has
     resubmitted the same context for execution through the execlist
     submission port

v2: Update comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
b364e920bf intel/perf: take into account that reports read can be fairly old
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minute). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
9d0a5c817c intel/perf: set read buffer len to 0 to identify empty buffer
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.

We were using an 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
acea59dbf8 intel/perf: fix invalid hw_id in query results
Accumulation happens between 2 reports, it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and that we have a valid value
to put in there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Pierre-Eric Pelloux-Prayer
a7bbebcfb9 radeonsi: display cs blit count for AMD_DEBUG=testdma
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-04 09:08:28 +01:00
Pierre-Eric Pelloux-Prayer
082d1c1686 radeonsi: implement sdma for GFX9
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-04 09:08:28 +01:00
Samuel Pitoiset
4cacba0c86 radv/gfx10: fix the vertex order for triangle strips emitted by a GS
My fix wasn't totally correct as pointed out by Marek.
Ported from RadeonSI.

Fixes: deafe4cc58 ("radv/gfx10: fix primitive indices orientation for NGG GS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:28:57 +01:00
Samuel Pitoiset
dac6bd29ae radv: simplify a check in radv_fixup_vertex_input_fetches()
The number of loaded channels should always be > 0 now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:04:05 +01:00
Samuel Pitoiset
3b51259f06 radv: remove dead shader input/output variables
No pipeline-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:04:05 +01:00
Jason Ekstrand
0604768ae4 iris: Stop setting up fake params
In d1c4e64a69, we added a parameter to tell the back-end compiler to
ignore the param array and just push however many constants you ask it
to push.  Iris doesn't want to push anything so it gives a bogus number
of parameters and trusts the back-end compiler to dead-code all of them.
Now that we can tell the back-end compiler to stop re-arranging things,
delete the hack and enable the new simpler code path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 04:52:20 +00:00
Dave Airlie
713636766d gallium/scons: fix graw-xlib build on OSX.
Fixes: 44a6b0107b (gallivm: add nir->llvm translation (v2))

Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-12-04 13:24:44 +10:00
Dave Airlie
3263c9824e llvmpipe: enable texcoord semantics
To make NIR transitioning easier, move the driver to using
texcoord semantics.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 12:08:14 +10:00
Jason Ekstrand
178a2946c0 anv: Respect the always_flush_cache driconf option
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-03 17:10:51 -06:00
Krzysztof Raszkowski
07adc47460 gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format.
Reject the new formats in swr to prevent crashes because it doesn't
know how to handle the new formats.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-12-03 16:51:24 +00:00
Rob Clark
b31637c453 gitlab-ci: disable junit results for deqp
They don't seem to be hugely useful, and seem to be bogging down gitlab.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-03 08:46:39 -08:00
Jason Ekstrand
b1f37688ba anv: Set up SBE_SWIZ properly for gl_Viewport
gl_Viewport is also in the VUE header so we need to whack the read
offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that
case as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-03 16:20:50 +00:00
Michel Dänzer
0c88d5952a gitlab-ci: Update to current ci-templates master
Fixes skopeo copy failures.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-03 16:03:31 +01:00
Samuel Pitoiset
f63a3132e8 ac/llvm: fix atomic var operations if source isn't a deref
Fixes some CTS regressions.

Fixes: e61a826f39 ("ac/llvm: fix pointer type for global atomics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-03 09:41:33 +01:00
Neil Armstrong
dde734030b Add support for T820 CI Jobs
Tomeu: - Small rebase fixups

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 06:44:08 +01:00
Dave Airlie
502548a09c gallivm/llvmpipe: add support for front facing in sysval.
This wires up the front facing value as a sysval, I'd like to
remove the other facing code but I'd need to confirm VMware
don't use it first.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-03 15:29:04 +10:00
Dave Airlie
f52cdaa517 llvmpipe/images: handle undefined atomic without crashing
just return 0 for unbound atomic operations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-03 15:29:04 +10:00
Alyssa Rosenzweig
71dd52e056 panfrost: Remove blend shader hack
This is no longer used.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
c707b4d0f9 gitlab-ci: Test Panfrost on T720 GPUs
Now that the Mali T720 GPU is supoprted at the same level as the T760,
test it on PINE64 H64 boards.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
6d05e38a96 gitlab-ci: Remove non-default skips from Panfrost
During the past months, Panfrost has matured considerably and several
tests stopped being flaky or failing at all.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
b655be7252 panfrost: White list the Mali T720
Support for this GPU is equal now to that of T760, so whitelist it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
8555bffafd pan/midgard: Splatter on fragment out
Make sure that the fragment is complete when writing it out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
ab81a23d36 panfrost: Simplify shader patching
We need to always upload anyway.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
6ddaa5558a panfrost: Simplify draw_flags
Fixes dEQP-GLES3.functional.primitive_restart.*. Note the 0x18000 value
is accidentally somehow enabling primitive restart for some reason.
I'm not sure where this value came from but let's not.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
9fb0904712 panfrost: Implement pan_tiler for non-hierarchy GPUs
The algorithm is as described. Nothing fancy here, just need to add some
new code paths depending on which model we're running on.

Tomeu:
- Also disable tiling when !hierarchy and !vertex_count
- Avoid creating polygon lists smaller than the minimum when
  vertex_count > 0 but tile size smaller than 16 byte
- Take into account tile size when calculating polygon list size for
  !hierarchy
- Allow 0-sized tiles in a single dimension

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
63cd5b8198 panfrost: Add information about T720 tiling
We've figured out most of the big pieces, and though it looks faintly
like other Midgards, it's much simpler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
6887ff4e79 panfrost: Add quirks system to cmdstream
Similarly to how it's already done in the compiler, add a way to express
differences between GPU models that need to be taken into account when
assembling the cmdstream.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00