Commit graph

117900 commits

Author SHA1 Message Date
Marek Olšák
35655865cb nir/serialize: pack instructions better
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
4fe1d7822b util/blob: add 8-bit and 16-bit reads and writes
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Eric Anholt
59b489f44b ci: Use a tag from the parallel-deqp-runner repo.
If the repo continues development, we don't want to accidentally pick
up potentially breaking changes on our next container rebuild.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 15:37:04 -08:00
Rob Clark
215866523b gitlab-ci/freedreno/a6xx: remove most of the flakes
xfb + lines/points still flakes too frequently (and the problem isn't
even related to xfb), but we can add the rest back into this mix now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-22 13:48:29 -08:00
Rob Clark
9f422cbe1c gitlab-ci/deqp: generate junit results
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
415d565d96 gitlab-ci/deqp: generate xml results for fails/flakes
Extract .qpa for the individual unexpected results and flakes, and
translate to xml, preserved with the artifacts.  This allows easy
browsing of the test logs for fails/flakes, for easier debugging.

The # of logs to preserve is capped at 50 to avoid saving 100s of
megabytes of logs in case someone pushes a change that breaks
everything.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
8af7551a9e gitlab-ci: bump arm test container
To pick up updated cts_runner and netcat for the flake reporting.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
fdaf777076 gitlab-ci/deqp: detect and report flakes
If there are a small number of fails, re-run to determine if they are
flakes, and optionally (if `$FLAKES_CHANNEL` configured) report the
flakes.

This way flakes don't interfere with developers working on other
drivers, but get logged so that the developers working on the flaking
driver can monitor the situation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
cc6484f164 gitlab-ci/deqp: preserve caselists for blocks with fails
Bump cts_runner to pick up the change to preserve .qpa and caselist .txt
files for blocks of tests that contain fails, and preserve the caselist
files.  To reproduce fails that depend on order of running tests, these
are useful.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
59ed90fc74 gitlab-ci/deqp: preserve full list of unexpected results
The log only shows the first 50, but preserve the full list for easier
browsing.

(Also move return of exit code to end which makes later patches in the
series easier)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
5fa397a0d9 gitlab-ci: update deqp build so we can generate xml
Update the deqp build to preserve testlog-to-xml and stylesheets, so
deqp runner can extract .qpa for failed/flaked tests, and convert to
xml.  With this, will be able to browse output from failed tests
directly from the artifacts.

The main motiviation is to give better visibility into what happens with
flaked tests, when it is difficult/impossible to reproduce the flake
locally (ie. when it happens once out of N million tests).  But this
should also make it easier to debug regressions that a MR triggers,
especially when it is on hw that you don't have.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Markus Wick
dba903ed0b drirc: Enable glthread for dolphin/citra/yuzu.
Dolphin: 75 fps -> 88 fps - Super Mario Galaxy
Citra:   81 fps -> 91 fps - A Link Between Worlds
Yuzu:    21 fps -> 27 fps - Super Mario Odyssey

Dolphin still has many syncs because of glFenceSync and glClientWaitSync.
Moving them to the dispatcher thread might yield another speedup.

Yuzu uses a compatible profile by default. This benchmark used the variable
MESA_GL_VERSION_OVERRIDE=4.5FC to overwrite this behavior.

This profilation was done on a mobile i7-8550U CPU with i965.

Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-22 15:29:29 -05:00
Markus Wick
f4c61d422d mesa/glthread: Implement ARB_multi_bind.
Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-22 15:29:07 -05:00
Rhys Perry
517728477c aco: fix waitcnts for barriers at block ends
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: d1b9deee ('aco: improve waitcnt insertion around loops')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-22 19:56:31 +00:00
Zebediah Figura
a3c8bc10aa Revert "draw: revert using correct order for prim decomposition."
This reverts commit f97b731c82.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-22 20:37:42 +01:00
Kenneth Graunke
acd36e488d iris: Change keybox parenting
For temporary lookups, just allocate out of the NULL ralloc context,
so we don't have to edit the linked list of ralloc children to add it
and then immediately remove it again.

When uploading a new shader, allocate the keybox off the shader, so
if we delete the shader the keybox also goes away.  Less manual cleanup.
2019-11-22 09:50:59 -08:00
Ian Romanick
ca353285cb nir/range_analysis: Make sure the table validation only occurs once
All of the tables are static const, so they only need to be validated
once.  As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass.  Even with
the help of the previous commit, this does not always occur.

-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Ian Romanick
ccefce46cb nir/range-analysis: Add pragmas to help loop unrolling
I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the look ups
of static const arrays with now constant indices, and then elmininate
all the actuall assertions.  It seems none of this happens even at -O3.

Adding the pragmas helps encourage loop unrolling at some optimization
levels.  I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.

-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5

v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Danylo Piliaiev
25a00b449f glsl: Add varyings to "zero-init of uninitialized vars" workaround
Varyings are similar to already handled cases. And "glsl_zero_init"
name of the workaround already looks like it should include varyings.

The issue was observed in GiMark subtest from GpuTest.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 15:25:56 +00:00
Alyssa Rosenzweig
4c43b354c3 pan/midgard: Use lower_tex_without_implicit_lod
Just a bit of cleanup. lower_tex can do this lowering for us, which
should also eliminate some special cases (one less thing to fix if we
ever need texturing in tess/geom/etc, perhaps?)

Closes #2133

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 08:38:57 -05:00
Christian Gmeiner
47c7c4263c etnaviv: use a more self-explanatory param name
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-22 10:47:13 +00:00
Christian Gmeiner
a949fa9d5d etnaviv: drop not used config_out function param
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-22 10:47:13 +00:00
Samuel Pitoiset
6f7ec6ee39 gitlab-ci: reduce the number of scons build
It seems overkill to me to build scons 7x for every pipeline.
Scons is now build with the oldest llvm version in scons-old-llvm
and with the newest llvm version in scons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 10:39:21 +01:00
Alyssa Rosenzweig
2e14fe6490 panfrost: Add lcra.c to Android.mk
This was forgotten.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
bda2bb31b1 pan/midgard: Enable LOD lowering only on buggy chips
T720 and earlier need this workaround, so check the quirk before
lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
68c2c7962a pan/midgard: Describe quirk MIDGARD_BROKEN_LOD
Corresponds to errata #10471, applies to T6xx and T720. Fixed in T760.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
d32d4acf68 pan/midgard: Add LOD bias/clamp lowering
We fetch the info with the new intrinsic and lower with ALU ops for txl
instructions, which seemingly correspond to "TEXGRD" instructions (what
we call textureLod).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
4e07e7b232 pan/midgard: Implement load_sampler_lod_paramaters_pan
We can stuff this information in as parametrized system values, like we
currently do texture size and SSBO addresses.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
deaebc82a7 nir: Add load_sampler_lod_paramaters_pan intrinsic
This loads in the <min_lod, max_lod, lod_bias> settings for a given
sampler, which is necessary for lowering clamps/biases on certain
Midgard chips.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Markus Wick
b1156ecdf2 mapi/glapi: Generate sizeof() helpers instead of fixed sizes.
Generating a source code with a fixed size leads to issues with plattform dependent types.
We either hard code 4 or 8 bytes there, and both are wrong on the other plattform.
So this patch solves this issue by generating eg sizeof(GLsizeiptr), which is valid both
on 32 and on 64 bit plattforms.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-11-21 22:52:55 -05:00
Ian Romanick
e51eda99df intel/fs: Disable conditional discard optimization on Gen4 and Gen5
The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk.  Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated.  This results in a sequence like:

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
(+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD     g8<8,8,1>UD

instead of

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
        or(16) g4<1>UD g4<8,8,1>UD     g8<8,8,1>UD
(+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD     1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
Fixes: 0ba9497e66 ("intel/fs: Improve discard_if code generation")

Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs)   min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel)   min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs)   min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel)   min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.
2019-11-21 16:40:50 -08:00
Dylan Baker
bba44ef176 docs: update calendar, add news item and link release notes for 19.2.6 2019-11-21 16:34:00 -08:00
Dylan Baker
3531d74e82 docs: Add SHA256 sum for 19.2.6 2019-11-21 16:32:35 -08:00
Dylan Baker
f8070577a4 docs: Add release notes for 19.2.6 2019-11-21 16:32:34 -08:00
Marek Olšák
0b1452ffdd nir/serialize: do ctx = {0} instead of manual initializations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Marek Olšák
ff71fae440 nir: strip as we serialize to remove the nir_shader_clone call
Serializing stripped NIR is faster now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Christian Gmeiner
8acaab1aa7 etnaviv: add drm-shim
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:56:04 +00:00
Eric Engestrom
609a6ae23e vk_util: drop duplicate formats in vk_format_map[]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:52:40 +00:00
Jonathan Marek
773d640efa turnip: implement UBWC
This enables UBWC for everything except 3D textures.

It breaks many image_to_image copies but those aren't important and it can
be worked around later (image_to_image copy needs to be done in two steps,
decode from the source format and then encode to the destination format).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:21:57 +00:00
Jonathan Marek
91fd83d142 freedreno/regs: update UBWC related bits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:21:57 +00:00
Vinson Lee
6613a4a029 swr: Fix build with llvm-10.0.
Fix build error after llvm-10.0 commit 1dfede3122ee ("Move
CodeGenFileType enum to Support/CodeGen.h").

../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp: In member function ‘void JitManager::DumpAsm(llvm::Function*, const char*)’:
../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp:428:45: error: ‘CGFT_AssemblyFile’ is not a member of ‘llvm::TargetMachine’
             *pMPasses, filestream, nullptr, TargetMachine::CGFT_AssemblyFile);
                                             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-21 13:20:08 -08:00
Rhys Perry
29d131d619 aco: fix copy+paste error
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-21 20:28:57 +00:00
Rhys Perry
d1b9deeea8 aco: improve waitcnt insertion around loops
Do this by repeating processing of loops until no progress is made.

Totals from affected shaders:
SGPRS: 162576 -> 162576 (0.00 %)
VGPRS: 145228 -> 145228 (0.00 %)
Spilled SGPRs: 668 -> 668 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 15778640 -> 15771336 (-0.05 %) bytes
LDS: 146 -> 146 (0.00 %) blocks
Max Waves: 6087 -> 6087 (0.00 %)

v2: use block_kind_loop_header/block_kind_loop_exit to repeat at the end
    of loops instead of at each continue

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-21 20:28:57 +00:00
Rob Clark
1a8c49d76c freedreno/perfctrs/fdperf: periodically restore counters
When GPU is idle and suspends, the currently selected countables
will all reset to the first one.  So periodically restore the selected
countables.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
5a13507164 freedreno/perfcntrs: add fdperf
Port from the envytools tree, but converted to use the .c tables for
describing the perfcounter groups/countables, rather than using rnndec
to get this at runtime from the register xml.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
b2338a5b00 freedreno/perfcntrs/a6xx: remove RBBM counters
Currently this are getting blocked by the kernel.. these counters don't
seem to be the most useful ones, and to use them we'd have to somehow
probe the kernel by submitting cmdstream to write the selector regs and
see if that triggers a GPU fault.  So let's just skip them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
6a517b3079 freedreno/perfctrs/a2xx: move CP to be first group
fdperf expects this, to find the ALWAYS_COUNT counter

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
e35c4e6ad2 freedreno/perfcntrs: add accessor to get per-gen tables
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
b21f03ae7e freedreno/perfcntrs: move to shared location
This should eventually be useful for VK_KHR_performance_query as well.
And in the more near term, for fdperf.

Attempt to not break android build is best-effort and untested.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
6727114cba freedreno/perfcntrs: remove gallium dependencies
Prep work to move to a shared location.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:02 +00:00