Commit graph

157226 commits

Author SHA1 Message Date
Mike Blumenkrantz
4830cc77cb nir/lower_point_size: apply point size clamping
point size min/max values are provided through the state vars, so ensure
these are always applied in order to respect ARB_point_parameters
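As a rough illustration of the clamping described above, the sketch below limits a shader-written point size to a [min, max] range. The function and parameter names are hypothetical, and this models only the arithmetic, not the actual NIR pass.

```python
def clamp_point_size(size: float, size_min: float, size_max: float) -> float:
    """Clamp a point size to [size_min, size_max] (hypothetical helper)."""
    # Apply the upper bound first, then the lower bound.
    return max(size_min, min(size, size_max))


print(clamp_point_size(0.25, 1.0, 64.0))   # below the minimum
print(clamp_point_size(128.0, 1.0, 64.0))  # above the maximum
print(clamp_point_size(8.0, 1.0, 64.0))    # already in range
```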

cc: mesa-stable

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Italo Nicola
42a1264951 virgl: overpropagate precise flags
As it turns out, MOVs weren't the only instructions that blocked precise
flags propagation in the transition to nir-to-tgsi.
This commit fixes some rendering regressions caused by a4a34cd3.

Fixes: a4a34cd3

Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17144>
2022-06-22 12:58:58 +00:00
Jason Volk
e1488d9374 radeon: Support shared memory user pointers.
The RADEON_GEM_USERPTR_ANONONLY flag is hardcoded here which excludes
shared memory pages. DRM is actually capable of supporting shared file-
backed memory, but only if it's read-only. This mutability intent has to
be conveyed through the stack, so a flags argument is added to the winsys
regime to pass RADEON_FLAG_READ_ONLY.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16115>
2022-06-22 12:23:02 +00:00
Marcin Ślusarz
f871aa10a1 intel/compiler: assert that base is 0 for [load|store]_shared intrins
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
2022-06-22 10:32:13 +00:00
Timur Kristóf
e5970fe22a nir/lower_task_shader: don't use base index for shared memory intrinsics
Intel backend doesn't handle them very well.

Fixes: 8aff8d3dd4 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
2022-06-22 10:32:13 +00:00
Marcin Ślusarz
49b8fffeed nir/lower_task_shader: insert barrier before/after shared memory read/write
Fixes: 8aff8d3dd4 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
2022-06-22 10:32:13 +00:00
Connor Abbott
c601ba332b ir3/sched: Fix could_sched() determination
This needs to be accurate so that when we split and then schedule a new
a0.x/a1.x/p0.x write we will eventually make progress. It wasn't taking
the kill_path into account which could create an infinite loop as we
keep scheduling writes whose uses are blocked because they are memory
instructions not on the kill_path.

Closes: #6413
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16635>
2022-06-22 10:09:13 +00:00
Danylo Piliaiev
a8671b2182 meson/tu: Don't compile libdrm paths if KGSL is selected
Even if there is libdrm we shouldn't use it if KGSL is selected.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
2022-06-22 11:52:36 +03:00
Danylo Piliaiev
6ad7be1b36 meson/pps: Check if libdrm exists to compile pps
For Turnip with KGSL we may have perfetto enabled but we don't
have libdrm.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
2022-06-22 11:52:36 +03:00
Danylo Piliaiev
ee6a0c675b meson: Define _GNU_SOURCE for android host system
Otherwise sched_getaffinity isn't defined and util_cpu_detect_once
fails to compile.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
2022-06-22 11:52:36 +03:00
Samuel Pitoiset
ad3d6d9c6e radv/llvm: always emit a null export even if the FS doesn't discard
Even with a noop FS, the color blend state can still be non-zero, and
then SPI color related registers won't be 0 and this would hang.

Fixes: bdf3797aeb ("ac,radeonsi: don't export null from PS if it has no effect on gfx10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17169>
2022-06-22 08:31:30 +02:00
Pavel Asyutchenko
17645cb29c llvmpipe: enable PIPE_CAP_FBFETCH_ZS
Support for it was added in previous commits.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko
ccaa7920ef llvmpipe: implement FB fetch for depth/stencil
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko
0ba3e797ee llvmpipe: simplify early/late zs tests selection
This does not change selection logic.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko
443ef18f0c llvmpipe: enable per-sample shading when FB fetch is used
This matches specifications of both color and ZS fetch extensions.

Cc: mesa-stable
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko
8788b17596 nir_to_tgsi: Don't count ZS fbfetch vars as outputs
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko
959b748038 glsl: add language support for GL_ARM_shader_framebuffer_fetch_depth_stencil
This extension adds the built-in variables gl_LastFragDepthARM and gl_LastFragStencilARM,
which can be implemented almost the same way as gl_LastFragData from the color fetch extension.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko
41f22a1823 gallium: add PIPE_CAP_FBFETCH_ZS and expose extension
st/mesa will expose GL_ARM_shader_framebuffer_fetch_depth_stencil
if this new capability is supported by the driver.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Dave Airlie
68e8940114 glx/drisw: use xcb instead of X to query connection
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Dave Airlie
d3e723fb77 wsi/x11: add xcb_put_image support for larger transfers.
This was noticed as a problem in the EGL code, so fix up wsi as well.

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Dave Airlie
c5dbb1139c egl/x11: add missing put_image cookie cleanups
These might not be required, but this is consistent with the wsi code.

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Dave Airlie
e6082ac62e egl/x11: split large put image requests to avoid server destroy
wezterm in fullscreen 4k was exceeding the xcb max request size
on the put image with llvmpipe. This fixes it by sending sub-images.
The Xlib put image used in glx does this internally, but the xcb one
does not, so just do it in sections here.
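The idea of sending an image in sections can be sketched as follows. This is only a minimal model, assuming the image is split on whole rows and ignoring any per-request header overhead; `split_rows`, `stride`, and `max_bytes` are hypothetical names, not taken from the Mesa code.

```python
def split_rows(height: int, stride: int, max_bytes: int):
    """Yield (first_row, row_count) chunks whose payload fits in max_bytes.

    stride is the number of bytes per image row (assumed constant).
    """
    rows_per_chunk = max_bytes // stride
    if rows_per_chunk == 0:
        raise ValueError("a single row exceeds the request size limit")
    y = 0
    while y < height:
        rows = min(rows_per_chunk, height - y)
        yield y, rows
        y += rows


# A 5-row image with 100-byte rows and a 250-byte limit needs three requests.
for first_row, row_count in split_rows(5, 100, 250):
    print(first_row, row_count)
```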

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Mike Blumenkrantz
e8fc5cca90 zink: fix dual_src_blend driconf workaround
not sure when this broke but it broke

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17156>
2022-06-22 03:14:18 +00:00
Mike Blumenkrantz
ea005c9e04 glx/drisw: invalidate drawables upon binding context if flush extension exists
this forces surface resize as expected

cc: mesa-stable

fixes #6706

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>
2022-06-22 02:18:37 +00:00
Mike Blumenkrantz
23b63e536e glx/drisw: store the flush extension to the screen
cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>
2022-06-22 02:18:37 +00:00
Guilherme Gallo
cee1c4fc7f ci/lava: Filter out undesired messages
Some LAVA jobs emit lots of messages "Listened to connection for
namespace 'common' for up to 1s" in a row at the end of the logs, making
it difficult to see the result of the test script.

This commit removes those lines until a proper solution is deployed on
the LAVA side.
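A filter of this kind might look like the sketch below. It is an illustration only: `NOISE` and `filter_log` are hypothetical names, and the real script may match the message differently.

```python
# The noisy message quoted in the commit description.
NOISE = "Listened to connection for namespace 'common' for up to 1s"


def filter_log(lines):
    """Return the log lines that do not contain the noisy message."""
    return [line for line in lines if NOISE not in line]


log = [
    "test_a: pass",
    "Listened to connection for namespace 'common' for up to 1s",
    "Listened to connection for namespace 'common' for up to 1s",
    "hwci: mesa: pass",
]
print("\n".join(filter_log(log)))
```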

Closes: #6116

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17151>
2022-06-22 01:48:16 +00:00
Jason Ekstrand
64d074879b vulkan/wsi: Use HAVE_LIBDRM to detect DRM instead of !_WIN32
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17170>
2022-06-22 01:15:20 +00:00
Jordan Justen
a7127fbc4c intel/tools: Print memory info in intel_dev_info
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Jordan Justen
eaf2a35a76 iris/bufmgr: Use memory info from devinfo
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Jordan Justen
1505f94397 anv: Use memory info from devinfo
Rework:
 * Jordan: Drop regions.valid (Lionel implemented a fallback)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Lionel Landwerlin
4289c9ec13 intel/dev: add a fallback when memory regions are not available
We have this in Anv and it could be reused in Iris for integrated
memory system.

Rework:
 * Jordan: Drop regions.valid (Lionel implemented a fallback)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Lionel Landwerlin
4e727297e8 intel/dev: add a helper to update memory info
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Jordan Justen
4aecfbf0f4 intel/dev: Add devinfo::mem to store i915 regions information
Reworks:
 * Lionel: Change check on memory region valid to vram size
 * Jordan: Drop regions.valid (Lionel implemented a fallback)
 * Jordan: Rename devinfo::regions to devinfo::mem.
 * Jordan: Add devinfo::mem::use_class_instance
 * Add mesa_logw for lmem requiring regions. (s-b Lionel)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Alyssa Rosenzweig
1222c86e34 panfrost: Bump ESSL_FEATURE_LEVEL on Valhall
This advertises ARB_gpu_shader5 on Valhall, which should be working now. On the
GLES3.1 side, this notably adds support for sample variables and dynamic offsets
for texture gathers, both of which should now be working.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
74460a5d75 panfrost: Enable CAP_INDIRECT_TEMP_ADDR on Valhall
For parity with Bifrost. Apparently this pattern is sufficiently obscure that
the shader-db results on Mali-G57 are mostly noise.

total instructions in shared programs: 2675116 -> 2674820 (-0.01%)
instructions in affected programs: 4336 -> 4040 (-6.83%)
helped: 8
HURT: 1
helped stats (abs) min: 1.0 max: 52.0 x̄: 37.88 x̃: 49
helped stats (rel) min: 0.46% max: 8.20% x̄: 5.97% x̃: 7.56%
HURT stats (abs)   min: 7.0 max: 7.0 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 5.98% max: 5.98% x̄: 5.98% x̃: 5.98%
95% mean confidence interval for instructions value: -52.90 -12.88
95% mean confidence interval for instructions %-change: -8.48% -0.81%
Instructions are helped.

total cvt in shared programs: 14127.08 -> 14126.53 (<.01%)
cvt in affected programs: 33.84 -> 33.30 (-1.62%)
helped: 10
HURT: 1
helped stats (abs) min: 0.015625 max: 0.125 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.71% max: 2.93% x̄: 1.76% x̃: 1.78%
HURT stats (abs)   min: 0.09375 max: 0.09375 x̄: 0.09 x̃: 0
HURT stats (rel)   min: 7.89% max: 7.89% x̄: 7.89% x̃: 7.89%
95% mean confidence interval for cvt value: -0.09 -0.01
95% mean confidence interval for cvt %-change: -2.89% 1.13%
Inconclusive result (%-change mean confidence interval includes 0).

total sfu in shared programs: 7572 -> 7555.69 (-0.22%)
sfu in affected programs: 37.19 -> 20.88 (-43.87%)
helped: 6
HURT: 3
helped stats (abs) min: 2.75 max: 2.75 x̄: 2.75 x̃: 2
helped stats (rel) min: 47.31% max: 48.89% x̄: 48.63% x̃: 48.89%
HURT stats (abs)   min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 5.56% max: 6.25% x̄: 5.79% x̃: 5.56%
95% mean confidence interval for sfu value: -2.89 -0.73
95% mean confidence interval for sfu %-change: -51.41% -9.57%
Sfu are helped.

total quadwords in shared programs: 1450040 -> 1449896 (<.01%)
quadwords in affected programs: 1992 -> 1848 (-7.23%)
helped: 6
HURT: 0
helped stats (abs) min: 24.0 max: 24.0 x̄: 24.00 x̃: 24
helped stats (rel) min: 6.82% max: 7.50% x̄: 7.24% x̃: 7.32%
95% mean confidence interval for quadwords value: -24.00 -24.00
95% mean confidence interval for quadwords %-change: -7.48% -6.99%
Quadwords are helped.
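The percentages in the report above are relative changes of the "after" value with respect to the "before" value. The sketch below reproduces two of them from the quoted totals; `pct_change` is a hypothetical helper, not part of shader-db.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# instructions in affected programs: 4336 -> 4040
print(f"{pct_change(4336, 4040):.2f}%")
# quadwords in affected programs: 1992 -> 1848
print(f"{pct_change(1992, 1848):.2f}%")
```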

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
7d84bb00dc panfrost: Enable more FP16 caps on Valhall
This brings the FP16 capabilities of Valhall to parity with Bifrost.
Supporting FP16 constant buffers in particular reduces ALU in a ton of GLES
shaders, so that's a nice win. FP16 derivatives get vectorized which is a big
win where that applies, but they are considerably less common.

The lost shaders are from enabling PIPE_SHADER_CAP_FP16_CONST_BUFFERS (these
shaders compile on Midgard but not on Bifrost). The shaders in question declare
the same uniform in linked vertex and fragment shaders with different
precisions. This is contrary to the GLSL ES specification, which states
precisions must match for default uniforms of linked shaders. All the lost
shaders are in 8 Ball Pool and Hill Climb Racing. As those are proprietary
games, if that becomes a problem in the future, drirc is the solution.

total instructions in shared programs: 2697897 -> 2674595 (-0.86%)
instructions in affected programs: 1019922 -> 996620 (-2.28%)
helped: 4838
HURT: 2599
helped stats (abs) min: 1.0 max: 52.0 x̄: 7.13 x̃: 5
helped stats (rel) min: 0.16% max: 46.51% x̄: 8.04% x̃: 5.33%
HURT stats (abs)   min: 1.0 max: 36.0 x̄: 4.30 x̃: 3
HURT stats (rel)   min: 0.17% max: 133.33% x̄: 10.53% x̃: 3.85%
95% mean confidence interval for instructions value: -3.32 -2.95
95% mean confidence interval for instructions %-change: -1.89% -1.22%
Instructions are helped.

total cycles in shared programs: 141764.61 -> 140602.88 (-0.82%)
cycles in affected programs: 5728.22 -> 4566.48 (-20.28%)
helped: 665
HURT: 89
helped stats (abs) min: 0.015625 max: 15.0 x̄: 1.75 x̃: 0
helped stats (rel) min: 0.30% max: 61.54% x̄: 11.17% x̃: 4.62%
HURT stats (abs)   min: 0.015625 max: 0.265625 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.30% max: 66.67% x̄: 6.77% x̃: 1.94%
95% mean confidence interval for cycles value: -1.77 -1.31
95% mean confidence interval for cycles %-change: -10.11% -7.99%
Cycles are helped.

total fma in shared programs: 22577.56 -> 22575.91 (<.01%)
fma in affected programs: 2422.78 -> 2421.12 (-0.07%)
helped: 533
HURT: 653
helped stats (abs) min: 0.015625 max: 0.0625 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.30% max: 50.00% x̄: 8.25% x̃: 1.35%
HURT stats (abs)   min: 0.015625 max: 0.125 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 0.19% max: 100.00% x̄: 4.53% x̃: 2.08%
95% mean confidence interval for fma value: -0.00 0.00
95% mean confidence interval for fma %-change: -1.98% -0.44%
Inconclusive result (value mean confidence interval includes 0).

total cvt in shared programs: 14460.95 -> 14122.50 (-2.34%)
cvt in affected programs: 6159.02 -> 5820.56 (-5.50%)
helped: 4827
HURT: 2577
helped stats (abs) min: 0.015625 max: 0.796875 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.20% max: 81.82% x̄: 17.78% x̃: 12.90%
HURT stats (abs)   min: 0.015625 max: 0.546875 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.00% max: 600.00% x̄: 43.66% x̃: 13.04%
95% mean confidence interval for cvt value: -0.05 -0.04
95% mean confidence interval for cvt %-change: 2.28% 4.93%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total sfu in shared programs: 7593.56 -> 7571.06 (-0.30%)
sfu in affected programs: 357.19 -> 334.69 (-6.30%)
helped: 149
HURT: 1
helped stats (abs) min: 0.0625 max: 0.25 x̄: 0.15 x̃: 0
helped stats (rel) min: 5.26% max: 36.36% x̄: 6.79% x̃: 5.56%
HURT stats (abs)   min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 3.57% max: 3.57% x̄: 3.57% x̃: 3.57%
95% mean confidence interval for sfu value: -0.16 -0.14
95% mean confidence interval for sfu %-change: -7.51% -5.93%
Sfu are helped.

total v in shared programs: 8722.62 -> 8722.31 (<.01%)
v in affected programs: 1.62 -> 1.31 (-19.23%)
helped: 2
HURT: 0

total ls in shared programs: 129666 -> 128494 (-0.90%)
ls in affected programs: 4163 -> 2991 (-28.15%)
helped: 192
HURT: 0
helped stats (abs) min: 1.0 max: 15.0 x̄: 6.10 x̃: 5
helped stats (rel) min: 4.35% max: 75.00% x̄: 30.23% x̃: 26.32%
95% mean confidence interval for ls value: -6.67 -5.54
95% mean confidence interval for ls %-change: -32.67% -27.79%
Ls are helped.

total quadwords in shared programs: 1461496 -> 1449768 (-0.80%)
quadwords in affected programs: 273592 -> 261864 (-4.29%)
helped: 1992
HURT: 687
helped stats (abs) min: 8.0 max: 24.0 x̄: 8.76 x̃: 8
helped stats (rel) min: 1.43% max: 50.00% x̄: 16.30% x̃: 11.11%
HURT stats (abs)   min: 8.0 max: 16.0 x̄: 8.31 x̃: 8
HURT stats (rel)   min: 1.92% max: 100.00% x̄: 36.39% x̃: 25.00%
95% mean confidence interval for quadwords value: -4.67 -4.08
95% mean confidence interval for quadwords %-change: -3.95% -1.62%
Quadwords are helped.

total threads in shared programs: 53496 -> 53551 (0.10%)
threads in affected programs: 112 -> 167 (49.11%)
helped: 74
HURT: 19
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.42 0.76
95% mean confidence interval for threads %-change: 56.83% 81.88%
Threads are helped.

total loops in shared programs: 128 -> 127 (-0.78%)
loops in affected programs: 1 -> 0
helped: 1
HURT: 0

total fills in shared programs: 684 -> 672 (-1.75%)
fills in affected programs: 160 -> 148 (-7.50%)
helped: 2
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
3fedf22b60 pan/bi: Tune lower_vars_to_scratch
Increase the threshold to lower indirect indexing of arrays to scratch memory
all the way up to 256 bytes, which was the lowest power-of-two threshold for
which enabling the pass on Mali-G57 was a win in shaderdb.

It's difficult to tell what threshold is optimal here. The shader-db stats are
based on a rough cycle model that assumes a 16:1 ratio between CVT and
load/store on Valhall, and a 24:1 ratio between arithmetic and load/store on
Bifrost. Those ratios are at most rules of thumb, as the number of cycles
required by a load/store instruction will vary tremendously based on caching and
the memory controller. However, they may well be lower bounds (if those are the
upper bounds on instruction issuing in the Mali shader cores). As such, a large
threshold seems well motivated.

shader-db results on Mali-G52 follow, results on Mali-G57 were similar. Note the
shader that's hurt for spills/fills is *helped* for load/store overall.

cycles helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%))
ldst helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%))

total instructions in shared programs: 2415410 -> 2415372 (<.01%)
instructions in affected programs: 1041 -> 1003 (-3.65%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 31.0 x̄: 12.67 x̃: 5
helped stats (rel) min: 2.08% max: 6.02% x̄: 3.90% x̃: 3.60%

total tuples in shared programs: 1928558 -> 1928527 (<.01%)
tuples in affected programs: 826 -> 795 (-3.75%)
helped: 2
HURT: 1
helped stats (abs) min: 6.0 max: 26.0 x̄: 16.00 x̃: 16
helped stats (rel) min: 3.72% max: 9.68% x̄: 6.70% x̃: 6.70%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.54% max: 1.54% x̄: 1.54% x̃: 1.54%

total clauses in shared programs: 355013 -> 354981 (<.01%)
clauses in affected programs: 220 -> 188 (-14.55%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 27.0 x̄: 10.67 x̃: 3
helped stats (rel) min: 13.99% max: 21.43% x̄: 16.93% x̃: 15.38%

total cycles in shared programs: 166610.27 -> 166574.90 (-0.02%)
cycles in affected programs: 138 -> 102.62 (-25.63%)
helped: 3
HURT: 0
helped stats (abs) min: 0.4583330000000001 max: 31.0 x̄: 11.79 x̃: 3
helped stats (rel) min: 15.28% max: 65.28% x̄: 34.86% x̃: 24.03%

total arith in shared programs: 73690.13 -> 73690.58 (<.01%)
arith in affected programs: 29.71 -> 30.17 (1.54%)
helped: 1
HURT: 2
helped stats (abs) min: 0.0833339999999998 max: 0.0833339999999998 x̄: 0.08 x̃: 0
helped stats (rel) min: 3.85% max: 3.85% x̄: 3.85% x̃: 3.85%
HURT stats (abs)   min: 0.125 max: 0.4166659999999993 x̄: 0.27 x̃: 0
HURT stats (rel)   min: 1.66% max: 5.17% x̄: 3.42% x̃: 3.42%

total ldst in shared programs: 135611 -> 135571 (-0.03%)
ldst in affected programs: 138 -> 98 (-28.99%)
helped: 3
HURT: 0
helped stats (abs) min: 3.0 max: 31.0 x̄: 13.33 x̃: 6
helped stats (rel) min: 24.03% max: 100.00% x̄: 74.68% x̃: 100.00%

total quadwords in shared programs: 1674599 -> 1674523 (<.01%)
quadwords in affected programs: 838 -> 762 (-9.07%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 65.0 x̄: 25.33 x̃: 9
helped stats (rel) min: 3.39% max: 15.00% x̄: 9.14% x̃: 9.04%

total spills in shared programs: 37 -> 40 (8.11%)
spills in affected programs: 17 -> 20 (17.65%)
helped: 0
HURT: 1

total fills in shared programs: 190 -> 196 (3.16%)
fills in affected programs: 34 -> 40 (17.65%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
fd021a618f pan/va: Replace MKVEC.v4i8 with MKVEC.v2i8
This is the instruction that the hardware actually supports. Do the rename, use
the more specific, accurate model in the IR, and rework the Valhall texturing
code to emit MKVEC.v2i8 instead of MKVEC.v4i8.

Will fix:

   dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.*

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
c570693c19 pan/va: Pack MKVEC.v2i8 byte lanes
They are in a different place, but the encoding is otherwise as usual. This will
be required for texture gathers with dynamic offsets.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
10301885ab pan/bi: Constant fold MKVEC.v2i8
Constant MKVEC.v2i8 will be generated during texturing on Valhall, just like
constant MKVEC.v4i8 is currently generated.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
2833d0472a pan/bi: Model MKVEC.v2i8
Valhall does not have Bifrost's 4-source MKVEC.v4i8. Instead, it has a (somewhat
limited) 3-source MKVEC.v2i8. The full MKVEC.v4i8 may be lowered to a pair of
MKVEC.v2i8 instructions.

For good code quality on both Bifrost and Valhall, we need to model both
instructions in their full generality.
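The lowering can be sketched with a simplified model. The semantics below are an assumption made for illustration, not taken from the hardware documentation: MKVEC.v2i8 is modelled as packing two byte sources into the low 16 bits and taking the upper 16 bits from the low half of its third source. Under that model, one v4i8 becomes two nested v2i8 operations.

```python
def mkvec_v2i8(a: int, b: int, upper: int) -> int:
    """Assumed model: bytes a and b form the low half, upper fills the high half."""
    return ((upper & 0xFFFF) << 16) | ((b & 0xFF) << 8) | (a & 0xFF)


def mkvec_v4i8(a: int, b: int, c: int, d: int) -> int:
    """Reference 4-source pack: a is the lowest byte, d the highest."""
    return ((d & 0xFF) << 24) | ((c & 0xFF) << 16) | ((b & 0xFF) << 8) | (a & 0xFF)


def lower_v4i8(a: int, b: int, c: int, d: int) -> int:
    """Express the 4-source pack as a pair of 3-source packs."""
    high = mkvec_v2i8(c, d, 0)
    return mkvec_v2i8(a, b, high)


print(hex(lower_v4i8(0x11, 0x22, 0x33, 0x44)))
```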

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
6792b15971 pan/bi: Remove FRSCALE from IR
It's just LDEXP in different clothing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
21bedd2c97 pan/va: Rename RSCALE to LDEXP
This avoids needless variation from Bifrost. While at it, fix the opcode
definition: there are no abs/neg/swizzle modifiers on the signed integer source,
and there's no clamp. However, there are round and infinity modes, like on
Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
0da28ee2c7 pan/va: Implement sample positions FAU packing
This will fix:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
9dd0bc92b5 pan/va: Lower FADD_RSCALE.f32 to FMA_RSCALE.f32
We generate FADD_RSCALE.f32 in our sample variables implementations. Valhall
doesn't have a dedicated FADD_RSCALE.f32 implementation, it should be aliased to
FMA_RSCALE.f32. Handle that alias in isel lowering. This will fix:

   dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.*
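The alias can be illustrated numerically. The model below is a simplification and an assumption: it treats FMA_RSCALE as (a * b + c) scaled by a power of two and ignores rounding, clamping, and special modes. The function names are hypothetical.

```python
import math


def fma_rscale(a: float, b: float, c: float, scale: int) -> float:
    """Simplified model: (a * b + c) * 2**scale."""
    return math.ldexp(a * b + c, scale)


def fadd_rscale(a: float, b: float, scale: int) -> float:
    """FADD_RSCALE written via the FMA form with a multiplier of 1.0."""
    return fma_rscale(a, 1.0, b, scale)


print(fadd_rscale(1.5, 2.5, 3))
```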

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
1a882ecdab pan/bi: Align accesses with packed TLS
When lowering vars to scratch, we need to be careful with alignment on Valhall,
where packed TLS access must not straddle a 16-byte boundary. Fixes regressions
when enabling indirect access to temps on Valhall.
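The alignment constraint can be sketched as a simple check. This is a minimal sketch, assuming byte offsets and sizes; `straddles_16b` and `align_up` are hypothetical helpers, not names from the compiler.

```python
def straddles_16b(offset: int, size: int) -> bool:
    """True if the access [offset, offset + size) crosses a 16-byte boundary."""
    return offset // 16 != (offset + size - 1) // 16


def align_up(offset: int, alignment: int) -> int:
    """Round offset up to a multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


offset, size = 12, 8
if straddles_16b(offset, size):
    # Moving the access to the next 16-byte boundary avoids the straddle.
    offset = align_up(offset, 16)
print(offset)
```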

Fixes: 6761dbf891 ("panfrost: Use packed TLS on Valhall")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
5ee1179c94 pan/bi: Fix LD_BUFFER.i16 definition
This was missing the message, breaking UBO-to-push and who-knows-what-else, when
enabling fp16 const buffers.

Fixes: 3dc2095b07 ("pan/bi: Model LD_BUFFER instructions")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig
40accfd3b7 pan/va: Unit test va_mark_last
This pass is super easy to unit test, so we have no excuse not to test
thoroughly. va_mark_last only inserts annotations in a shader without any
annotations, so our test cases are simply annotated shaders. The CASE macro just
has to compare the case against the case with the annotations stripped and added
back with va_mark_last.

In retrospect, I should have used that technique for the flow control insertion
tests too.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig
4b7e337b45 pan/va: Mark last register reads
On Valhall, register reads may be marked as "last" [1]. Setting the last flag
promises the hardware that the value of the register is no longer required. This
may enable hardware optimizations. In particular, it may permit the hardware to
avoid register file writes if a write to the marked register is still in the
forwarding buffer. This may improve power efficiency.

In principle, this is trivial: run liveness analysis and mark killed sources,
like we would in an SSA-based register allocator. In practice, there are a few
wrinkles to avoid hazards around staging registers and 64-bit register pairs,
requiring some additional data flow analysis and fix ups. However, nothing here
is particularly "hard", and all the ideas are already in use for the Bifrost
scheduler and the Bifrost/Valhall scoreboard analyses.

[1] In Mesa's compiler, this is called discard for historical reasons.
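The trivial part of the idea, liveness plus marking killed sources, can be sketched for a single straight-line block. This is only an illustration under simplifying assumptions: values are in SSA form (each name is written once), there is no control flow, and none of the staging register or 64-bit pair hazards are handled. `mark_last` and the tuple encoding are hypothetical.

```python
def mark_last(instrs, live_out=frozenset()):
    """Return, per instruction, a list of booleans marking "last" reads.

    instrs is a list of (dest, [sources]) tuples in program order.
    A source read is "last" if its value is not live after the instruction.
    """
    live = set(live_out)
    marks = [None] * len(instrs)
    # Walk backwards so that `live` holds the values needed after instruction i.
    for i in range(len(instrs) - 1, -1, -1):
        dest, srcs = instrs[i]
        marks[i] = [s not in live for s in srcs]
        live.discard(dest)  # the definition ends the live range going backwards
        live.update(srcs)   # sources are live before the instruction
    return marks


program = [
    ("a", []),          # a = load
    ("b", ["a"]),       # b = f(a), a is read again later
    ("c", ["a", "b"]),  # c = g(a, b), both reads are the last ones
]
print(mark_last(program, live_out={"c"}))
```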

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig
d4377e1255 pan/va: Use validate_register_pair for BLEND pack
Instead of open-coding. Noticed by inspection.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00