140962 commits

Author SHA1 Message Date
Alyssa Rosenzweig
d4d3328b95 panfrost: Enable 16-bit support on Bifrost
Remove the PAN_MESA_DEBUG=fp16 flag that was hiding it.

Skip two buggy dEQP tests. See linked discussion. We'll need to make
sure this gets sorted out before submitting conformance, but I don't see
a test with a fix in the pipeline as a valid reason to hold back valid
code.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
793d18b79b pan/bi: Enable mediump BLEND lowering
Other lowerings will wait until we iron out various missing features.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
d3ba26be37 pan/bi: Garbage collect bifrost_nir.h
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
ff36e40145 pan/bi: Copyprop constants
Needed for constant folding to be effective. But don't copyprop into
instructions already reading from FAU, since that will just end up adding
more moves!

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
1049bb4374 pan/bi: Fix int<-->float size converts
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
0f6e45f8b2 pan/bi: Enable NIR vectorization
We don't vectorize transcendentals, since those are scalar only in
hardware. Also don't vectorize a few places where impedance mismatches
between NIR and the hardware make handling vectors infeasible for now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
646e03c451 pan/bi: Temporarily switch back to 0/~0 bools
Keeps things simpler while debugging vectorization woes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
8db4166c58 pan/bi: Handle make_vec with 1-bit bools
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
7793c9ab02 pan/bi: Adapt branching for 1-bit bools
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
3d78cc5876 pan/bi: Change swizzled scalars to identity
Allows packing for things like IADD.v2s16

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
b7a757b2f7 panfrost: Fix typo handling blend types
This was right in my head.

Fixes: 93a176b6cf ("panfrost: Key blend shaders to the input types")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Alyssa Rosenzweig
54046d61f8 pan/mdg: Model blend shader interference
Backport of 4439757db2 ("pan/bi: Use the interference mechanism
to describe blend shader reg use") to Midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
2021-05-06 23:26:21 +00:00
Adam Jackson
90cbab7cae mesa: s/malloc/calloc/ to silence a warning
gcc 11 warns:

[846/1506] Compiling C object src/mesa/libmesa_common.a.p/main_shaderapi.c.o
In function ‘shader_source’,
    inlined from ‘_mesa_ShaderSource_no_error’ at ../src/mesa/main/shaderapi.c:2137:4:
../src/mesa/main/shaderapi.c:2095:25: warning: ‘*offsets_10 + _130’ may be used uninitialized [-Wmaybe-uninitialized]
 2095 |    totalLength = offsets[count - 1] + 2;

I can't really see how it's getting to that conclusion, but allocating
`offsets` with calloc is both natural to do here and guarantees
initialization.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10671>
2021-05-06 21:25:47 +00:00
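To illustrate the s/malloc/calloc/ change in 90cbab7cae above, here is a minimal sketch of allocating a per-string offsets array with calloc so that every element is zero before anything reads it. The helper name `build_offsets` and its parameters are hypothetical; this is not the code in shaderapi.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a calloc'd array where offsets[i] is the
 * running total of the lengths of strings[0..i]. The caller frees the
 * result. Returns NULL on allocation failure. */
static int *build_offsets(const char *const *strings, int count)
{
    assert(count > 0);

    /* calloc zero-fills the allocation, so no element can be read
     * uninitialized, even if the fill loop below were to exit early. */
    int *offsets = calloc((size_t)count, sizeof(*offsets));
    if (offsets == NULL)
        return NULL;

    for (int i = 0; i < count; i++) {
        int len = (int)strlen(strings[i]);
        offsets[i] = (i > 0 ? offsets[i - 1] : 0) + len;
    }
    return offsets;
}
```

With malloc the same code would behave identically whenever the loop runs to completion; calloc just makes the initialization guarantee visible to both the reader and the compiler.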
Adam Jackson
4770d6c01d format/fxt1: Clean up fxt1_variance's argument list
gcc 11 warns:

../src/util/format/u_format_fxt1.c:940:22: warning: ‘fxt1_variance.constprop’ accessing 128 bytes in a region of size 64 [-Wstringop-overflow=]
  940 |    int32_t maxVarR = fxt1_variance(NULL, &input[N_TEXELS / 2], n_comp);

But, suspiciously, if you inline fxt1_variance the warning goes away.
What's happening is that the 2nd arg is uint8_t[N_TEXELS][MAX_COMP], so
it looks like we're passing too small of an array in since gcc knows
that `input` is also [N_TEXELS][MAX_COMP]. Fair enough. Fix the
signature to reflect what's actually going on, and remove some unused
arguments while we're at it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10671>
2021-05-06 21:25:47 +00:00
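The warning discussed in 4770d6c01d above comes from a parameter declared as a full `[N_TEXELS][MAX_COMP]` array while the caller passes a pointer into the middle of one. A hedged sketch of one way a signature can describe that honestly is shown below: take a pointer to rows plus an explicit count. The function `channel_sum` is hypothetical, and the constant values are assumptions chosen so that the full array is 128 bytes, as in the warning text; this is not the actual fxt1_variance signature.

```c
#include <assert.h>
#include <stdint.h>

#define N_TEXELS 32 /* assumed value for illustration */
#define MAX_COMP 4  /* assumed value for illustration */

/* Hypothetical reduction over `n` texels starting at `texels`.
 * The parameter is a pointer to rows of MAX_COMP bytes, so the callee
 * promises nothing about how many rows exist beyond `n`. */
static int32_t channel_sum(uint8_t (*texels)[MAX_COMP], int n, int comp)
{
    assert(n >= 0 && comp >= 0 && comp < MAX_COMP);

    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += texels[i][comp];
    return sum;
}
```

A caller holding `uint8_t input[N_TEXELS][MAX_COMP]` can then pass `&input[N_TEXELS / 2]` with a count of `N_TEXELS / 2`, and the declared access size matches what is really read.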
Samuel Pitoiset
d96507b73f radv: advertise VK_EXT_extended_dynamic_state2
This only implements dynamic primitive restart enable, depth bias
enable and rasterizer discard enable. I leave logic op and patch
control points for later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10340>
2021-05-06 20:58:59 +00:00
Samuel Pitoiset
dd19bf9d7d radv: implement dynamic rasterizer discard enable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10340>
2021-05-06 20:58:59 +00:00
Samuel Pitoiset
c40d7fadc3 radv: implement dynamic primitive restart enable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10340>
2021-05-06 20:58:59 +00:00
Samuel Pitoiset
f2933e9872 radv: implement dynamic depth bias enable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10340>
2021-05-06 20:58:58 +00:00
Samuel Pitoiset
44e7bcf942 radv: declare new dynamic states for VK_EXT_extended_dynamic_state2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10340>
2021-05-06 20:58:58 +00:00
Samuel Pitoiset
c4a639238e radv: declare VK_EXT_extended_dynamic_state2 but leave it disabled
To declare new prototypes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10340>
2021-05-06 20:58:58 +00:00
Emma Anholt
ae0c4e987e ci/freedreno: Add another daily dose of a530 flakes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10660>
2021-05-06 19:03:23 +00:00
Alyssa Rosenzweig
1378c67bcf panfrost/blend: Inline blend constants
If we're going to key them in NIR, we might as well get the benefit of
constant folding them too.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10652>
2021-05-06 16:40:50 +00:00
Alyssa Rosenzweig
ba39367b96 pan/mdg: Enable nir_opt_{move, sink}
I felt bad about the last patch regressing Midgard perf, so here's some
moar Midgard perf for you ^^

total instructions in shared programs: 97089 -> 97036 (-0.05%)
instructions in affected programs: 5230 -> 5177 (-1.01%)
helped: 53
HURT: 31
helped stats (abs) min: 1 max: 17 x̄: 4.40 x̃: 6
helped stats (rel) min: 0.61% max: 12.24% x̄: 7.74% x̃: 11.54%
HURT stats (abs)   min: 1 max: 8 x̄: 5.81 x̃: 8
HURT stats (rel)   min: 1.08% max: 13.79% x̄: 9.69% x̃: 11.11%
95% mean confidence interval for instructions value: -1.89 0.63
95% mean confidence interval for instructions %-change: -3.41% 0.80%
Inconclusive result (value mean confidence interval includes 0).

total bundles in shared programs: 45612 -> 45507 (-0.23%)
bundles in affected programs: 17331 -> 17226 (-0.61%)
helped: 139
HURT: 166
helped stats (abs) min: 1 max: 21 x̄: 3.76 x̃: 2
helped stats (rel) min: 0.85% max: 18.37% x̄: 6.38% x̃: 4.55%
HURT stats (abs)   min: 1 max: 10 x̄: 2.51 x̃: 1
HURT stats (rel)   min: 0.79% max: 31.25% x̄: 7.54% x̃: 4.55%
95% mean confidence interval for bundles value: -0.90 0.21
95% mean confidence interval for bundles %-change: 0.05% 2.34%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 77275 -> 76952 (-0.42%)
quadwords in affected programs: 32314 -> 31991 (-1.00%)
helped: 142
HURT: 179
helped stats (abs) min: 1 max: 28 x̄: 4.38 x̃: 2
helped stats (rel) min: 0.34% max: 13.79% x̄: 4.29% x̃: 2.78%
HURT stats (abs)   min: 1 max: 6 x̄: 1.67 x̃: 2
HURT stats (rel)   min: 0.44% max: 16.67% x̄: 2.93% x̃: 2.63%
95% mean confidence interval for quadwords value: -1.56 -0.45
95% mean confidence interval for quadwords %-change: -0.78% 0.25%
Inconclusive result (%-change mean confidence interval includes 0).

total registers in shared programs: 7081 -> 6771 (-4.38%)
registers in affected programs: 2217 -> 1907 (-13.98%)
helped: 193
HURT: 75
helped stats (abs) min: 1 max: 6 x̄: 2.04 x̃: 1
helped stats (rel) min: 6.25% max: 62.50% x̄: 24.32% x̃: 20.00%
HURT stats (abs)   min: 1 max: 3 x̄: 1.11 x̃: 1
HURT stats (rel)   min: 7.14% max: 50.00% x̄: 17.17% x̃: 14.29%
95% mean confidence interval for registers value: -1.37 -0.94
95% mean confidence interval for registers %-change: -15.53% -9.89%
Registers are helped.

total threads in shared programs: 5036 -> 5152 (2.30%)
threads in affected programs: 185 -> 301 (62.70%)
helped: 93
HURT: 19
helped stats (abs) min: 1 max: 2 x̄: 1.49 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1 max: 2 x̄: 1.21 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.82 1.25
95% mean confidence interval for threads %-change: 63.96% 85.14%
Threads are helped.

total loops in shared programs: 19 -> 19 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 2 -> 0
spills in affected programs: 2 -> 0
helped: 1
HURT: 0

total fills in shared programs: 15 -> 0
fills in affected programs: 15 -> 0
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10652>
2021-05-06 16:40:50 +00:00
Alyssa Rosenzweig
ad6e53da5c pan/mdg: Remove midgard_opt_copy_prop_reg
This is known broken code, and the fix is highly nontrivial. It isn't
doing terribly much for perf, so just rip off the band-aid. Prevents a
dEQP regression, and likely fixes bugs in real apps too.

total instructions in shared programs: 96640 -> 97089 (0.46%)
instructions in affected programs: 27831 -> 28280 (1.61%)
helped: 33
HURT: 301
helped stats (abs) min: 1 max: 6 x̄: 3.64 x̃: 5
helped stats (rel) min: 1.96% max: 10.00% x̄: 6.48% x̃: 7.94%
HURT stats (abs)   min: 1 max: 18 x̄: 1.89 x̃: 1
HURT stats (rel)   min: 0.46% max: 15.00% x̄: 3.17% x̃: 2.38%
95% mean confidence interval for instructions value: 1.09 1.59
95% mean confidence interval for instructions %-change: 1.80% 2.63%
Instructions are HURT.

total bundles in shared programs: 45615 -> 45612 (<.01%)
bundles in affected programs: 11257 -> 11254 (-0.03%)
helped: 121
HURT: 146
helped stats (abs) min: 1 max: 7 x̄: 2.34 x̃: 1
helped stats (rel) min: 1.22% max: 23.33% x̄: 7.85% x̃: 5.26%
HURT stats (abs)   min: 1 max: 17 x̄: 1.92 x̃: 2
HURT stats (rel)   min: 0.42% max: 25.00% x̄: 5.17% x̃: 3.85%
95% mean confidence interval for bundles value: -0.34 0.31
95% mean confidence interval for bundles %-change: -1.69% 0.23%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 76662 -> 77275 (0.80%)
quadwords in affected programs: 20148 -> 20761 (3.04%)
helped: 28
HURT: 275
helped stats (abs) min: 1 max: 4 x̄: 1.54 x̃: 1
helped stats (rel) min: 0.43% max: 25.00% x̄: 4.89% x̃: 2.50%
HURT stats (abs)   min: 1 max: 12 x̄: 2.39 x̃: 2
HURT stats (rel)   min: 0.51% max: 28.57% x̄: 5.18% x̃: 4.26%
95% mean confidence interval for quadwords value: 1.80 2.25
95% mean confidence interval for quadwords %-change: 3.64% 4.86%
Quadwords are HURT.

total registers in shared programs: 7078 -> 7081 (0.04%)
registers in affected programs: 1028 -> 1031 (0.29%)
helped: 62
HURT: 70
helped stats (abs) min: 1 max: 2 x̄: 1.11 x̃: 1
helped stats (rel) min: 8.33% max: 50.00% x̄: 15.03% x̃: 12.50%
HURT stats (abs)   min: 1 max: 2 x̄: 1.03 x̃: 1
HURT stats (rel)   min: 8.33% max: 66.67% x̄: 20.13% x̃: 11.25%
95% mean confidence interval for registers value: -0.17 0.21
95% mean confidence interval for registers %-change: -0.14% 7.38%
Inconclusive result (value mean confidence interval includes 0).

total threads in shared programs: 5032 -> 5036 (0.08%)
threads in affected programs: 31 -> 35 (12.90%)
helped: 12
HURT: 6
helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1 max: 2 x̄: 1.50 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -0.43 0.87
95% mean confidence interval for threads %-change: 13.82% 86.18%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 19 -> 19 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 0 -> 2
spills in affected programs: 0 -> 2
helped: 0
HURT: 1

total fills in shared programs: 0 -> 15
fills in affected programs: 0 -> 15
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10652>
2021-05-06 16:40:50 +00:00
Alyssa Rosenzweig
4d9c0a32e7 pan/mdg: Use _output_ type for outmod printing
Fixes incorrect outmods printed for conversions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10652>
2021-05-06 16:40:50 +00:00
Danylo Piliaiev
b60c46b2b2 docs: mark off VK_KHR_vulkan_memory_model for turnip
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10669>
2021-05-06 16:15:29 +00:00
Mike Blumenkrantz
4dc17b898b lavapipe: don't access pipeline blend state when it should be ignored
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
636a3903be lavapipe: don't access pipeline dsa state when it should be ignored
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
6bacd2a325 lavapipe: don't access pipeline viewport state when it should be ignored
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
11261d2189 lavapipe: ignore tess pipeline info if no tess shaders in pipeline
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
4d60a646b0 lavapipe: don't unnecessarily flag dsa states for updating
these force a new dsa state to be created and bound, which isn't necessary
if the same value is being reset

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
2b1711c8fd lavapipe: zero out the blend state info and flag for updating on null blend state
this still needs to be updated if there's no pipeline info available

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
63df2f736d lavapipe: zero out the dsa state info and flag for updating on null dsa state
this still needs to be updated if there's no pipeline info available

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:32 +00:00
Mike Blumenkrantz
788121158a lavapipe: update more states on null multisample pipeline info
these all need to be unset to ensure expected functionality

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:31 +00:00
Mike Blumenkrantz
7a955d1501 lavapipe: flag renderpasses as having color/zs attachments
it's useful to track this info for reuse

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:31 +00:00
Mike Blumenkrantz
49f93a4c5e lavapipe: set events to the unsignalled state on creation
this is otherwise uninitialized and not compliant with spec

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:31 +00:00
Mike Blumenkrantz
e3e4ff0b84 lavapipe: do not read sampler descriptor info during update if layout has immutables
this is illegal

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:31 +00:00
Mike Blumenkrantz
4b28ed0d7b lavapipe: handle buffer sizes better in CmdBindTransformFeedbackBuffersEXT
according to spec, the pSizes array member is only used if the array is non-null
and the value is not VK_WHOLE_SIZE; otherwise the size is calculated as the
buffer size minus the offset

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10625>
2021-05-06 14:59:31 +00:00
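The size rule described in 4b28ed0d7b above can be sketched as a small helper. This is an illustration only: `xfb_binding_size` and `WHOLE_SIZE` are hypothetical stand-ins (the latter mirrors the value of VK_WHOLE_SIZE), and the real lavapipe code works on Vulkan types rather than bare integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_WHOLE_SIZE, which Vulkan defines as ~0ULL. */
#define WHOLE_SIZE (~0ULL)

/* Hypothetical helper: size to bind for binding `i`.
 * `sizes` may be NULL, mirroring an optional pSizes array. */
static uint64_t xfb_binding_size(uint64_t buffer_size, uint64_t offset,
                                 const uint64_t *sizes, size_t i)
{
    assert(offset <= buffer_size);

    /* Use the caller's size only if an array was given and the entry
     * is not the "whole buffer" sentinel. */
    if (sizes != NULL && sizes[i] != WHOLE_SIZE)
        return sizes[i];

    /* Otherwise bind everything from the offset to the end. */
    return buffer_size - offset;
}
```

For a 256-byte buffer bound at offset 64, a NULL array or a WHOLE_SIZE entry both yield 192 bytes, while an explicit entry of 16 yields 16.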
Icecream95
dbdd4bd9e9 pan/bi: Add two tuples to a clause when needed with NOSCHED
Fixes SuperTuxKart with BIFROST_MESA_DEBUG=nosched.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10661>
2021-05-06 14:45:13 +00:00
Icecream95
e241ca6e9c panfrost: Always write reloaded tiles when making CRC data valid
If CRC data is currently invalid and the current batch will make it
valid, write even clean tiles to make sure CRC data is updated.

Fixes: 8ba2f9f698 ("panfrost: Create a blitter library to replace the existing preload helpers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10566>
2021-05-06 13:27:46 +00:00
Icecream95
1c58614cee panfrost: Make pan_select_crc_rt a non-static function
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10566>
2021-05-06 13:27:46 +00:00
Mike Blumenkrantz
37545418cd nir: add nir_isub_imm
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10654>
2021-05-06 13:01:03 +00:00
Mike Blumenkrantz
6db24986ca gallium/inlines: remove atomic set from pipe_reference_init()
when an object is initialized with this, it should not be visible to any
other threads or contexts, so there should be no need to use an atomic set here

at the time of this commit, there are only two callers in the tree which pass
values != 1:
* zink uses a calculated number for framebuffer refcount on init (this is fine)
* aux/pb passes 0 on init (this is fine)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10650>
2021-05-06 12:36:24 +00:00
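The reasoning in 6db24986ca above (an object that is still private to its creator does not need an atomic store for its initial refcount) can be sketched with C11 atomics. This is an assumption-laden illustration: `struct reference` and the three functions are hypothetical and use `<stdatomic.h>`, not Mesa's own atomic helpers or the real pipe_reference layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcount, not Mesa's struct pipe_reference. */
struct reference {
    atomic_int count;
};

/* Initialization: the object is not yet visible to other threads, so
 * atomic_init (which is not itself an atomic operation) is enough. */
static void reference_init(struct reference *ref, int count)
{
    assert(count >= 0);
    atomic_init(&ref->count, count);
}

/* Once the object is shared, every update must be atomic. */
static void reference_get(struct reference *ref)
{
    atomic_fetch_add(&ref->count, 1);
}

/* Returns true when the last reference was dropped. */
static bool reference_put(struct reference *ref)
{
    int old = atomic_fetch_sub(&ref->count, 1);
    assert(old > 0);
    return old == 1;
}
```

The sketch accepts any non-negative initial count, which matches the commit's note that some callers pass values other than 1.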
Boris Brezillon
6cac9c748e Revert "gallium/util: Fix depth/stencil blit shaders"
This reverts commit 7ca72f1726.
Unlike what's stated in this commit, the depth or stencil components
have to be replicated on all channels, as specified in the
"Texture Sampling and Texture Formats" section of the TGSI doc
(docs/gallium/tgsi.rst).

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10649>
2021-05-06 12:09:38 +00:00
Mike Blumenkrantz
567bdf2e8f zink: clamp zs samplers to XXXX swizzle for all non-zero/one swizzles
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10649>
2021-05-06 12:09:38 +00:00
Erik Faye-Lund
77f3dd85a2 zink: do not ask glsl-compiler to unroll
We don't really need loops unrolled, so let's just disable this. This is
generally recommended for NIR drivers, but we can do even better and not
unroll in NIR either. And since we don't set
nir_shader_compiler_options::max_unroll_iterations, we're already there.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10487>
2021-05-06 11:17:29 +00:00
Erik Faye-Lund
c18ff60087 lavapipe: emit correct textures_used for texture-arrays
When we lower a texture-lookup with a dynamic index, we need to mark the
entire array as used, because we don't know better.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10487>
2021-05-06 11:17:29 +00:00
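The rule stated in c18ff60087 above (a dynamically indexed lookup marks the whole array as used) can be sketched with a 32-bit usage mask. The helper `mark_textures_used`, its parameters, and the 32-slot limit are assumptions for illustration, not lavapipe's actual lowering code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns `used` with the bits for one texture
 * lookup set. The array occupies slots [base, base + array_size). */
static uint32_t mark_textures_used(uint32_t used, unsigned base,
                                   unsigned array_size, bool dynamic_index,
                                   unsigned const_index)
{
    assert(array_size >= 1 && base + array_size <= 32);

    if (dynamic_index) {
        /* The index is unknown at compile time, so any element may be
         * sampled: mark every slot of the array. The array_size == 32
         * case avoids an undefined 32-bit shift. */
        uint32_t mask = (array_size == 32)
                            ? UINT32_MAX
                            : ((UINT32_C(1) << array_size) - 1);
        return used | (mask << base);
    }

    /* Constant index: only that one element is used. */
    assert(const_index < array_size);
    return used | (UINT32_C(1) << (base + const_index));
}
```

For an array of three textures starting at slot 2, a dynamic index sets bits 2 to 4 (0x1C), while a constant index of 1 sets only bit 3 (0x08).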
Iago Toral Quiroga
ca9e0871fb v3d: enable NIR loop unrolling
The GL driver was getting loop unrolling from the GLSL compiler frontend,
but NIR unrolling is more sophisticated, so prefer that.

The only caveat is that loop unrolling is implemented in the Mesa state
tracker, so our backend won't have a chance to undo the optimization if
it causes us to lower thread count or spill. We therefore choose to be a
bit more conservative with the configuration than what we were doing with
GLSL.

Shader-db results follow. Increase in instruction counts is expected due
to additional unrolling. We lose threads in very few shaders, but we
make up for this with the additional unrolling and reduced spilling. We
also managed to get 3 more shaders to compile successfully.

total instructions in shared programs: 13416427 -> 13461431 (0.34%)
instructions in affected programs: 96936 -> 141940 (46.43%)
helped: 58
HURT: 216
Instructions are HURT.

total threads in shared programs: 410626 -> 410598 (<.01%)
threads in affected programs: 56 -> 28 (-50.00%)
helped: 0
HURT: 14
Threads are HURT.

total loops in shared programs: 2121 -> 1708 (-19.47%)
loops in affected programs: 468 -> 55 (-88.25%)
helped: 446
HURT: 47
Loops are helped.

total uniforms in shared programs: 3676567 -> 3691185 (0.40%)
uniforms in affected programs: 25304 -> 39922 (57.77%)
helped: 23
HURT: 199
Uniforms are HURT.

total spills in shared programs: 5902 -> 5727 (-2.97%)
spills in affected programs: 285 -> 110 (-61.40%)
helped: 19
HURT: 0

total fills in shared programs: 13308 -> 13121 (-1.41%)
fills in affected programs: 301 -> 114 (-62.13%)
helped: 19
HURT: 0

total sfu-stalls in shared programs: 31860 -> 32856 (3.13%)
sfu-stalls in affected programs: 1692 -> 2688 (58.87%)
helped: 25
HURT: 196
Sfu-stalls are HURT.

total inst-and-stalls in shared programs: 13448287 -> 13494287 (0.34%)
inst-and-stalls in affected programs: 98404 -> 144404 (46.75%)
helped: 57
HURT: 217
Inst-and-stalls are HURT.

total nops in shared programs: 329276 -> 329551 (0.08%)
nops in affected programs: 2189 -> 2464 (12.56%)
helped: 58
HURT: 181
Nops are HURT.

LOST:   0
GAINED: 3

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:42 +02:00
Iago Toral Quiroga
c11e479852 broadcom/compiler: specify maximum thread count in compile strategies
Once we have exhausted compile strategies at 4 threads and we start
enabling lower thread counts, there is no point in starting compiles
with 4 threads for them: we know these will fail, so let's start at
2 in these cases.

This also has another nice implication: if the driver compiles at 4
threads and fails to register allocate, we were allowing it to try
with 2 threads, but this would only retry the register allocation
process and would not really recompile the shader with 2 threads. This
is not optimal, because at 2 threads we have more TMU fifo space for
each thread and we can do more TMU pipelining, so we were missing that
opportunity.

This improves performance in Sponza by ~1.5% and also seems to help
UE4 slightly.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:06 +02:00
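The idea in c11e479852 above, that each strategy carries its own maximum thread count and is compiled at that count from the start, can be sketched with a toy model. Everything here is hypothetical: the `struct strategy` fields, `pick_strategy`, and the 64-entry register file are illustration only and do not reflect the real broadcom compiler structures or hardware numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy register file shared by all threads (assumed size). */
#define TOY_REGISTER_FILE 64u

/* Hypothetical strategy description. */
struct strategy {
    const char *name;
    unsigned max_threads; /* thread count this strategy compiles at */
};

/* Toy model: a compile "succeeds" when the shader's register demand
 * fits the per-thread budget at the strategy's own thread count.
 * Returns the index of the first strategy that succeeds, or -1. */
static int pick_strategy(const struct strategy *strategies, size_t count,
                         unsigned regs_needed)
{
    assert(strategies != NULL);

    for (size_t i = 0; i < count; i++) {
        assert(strategies[i].max_threads > 0);
        /* Halving the thread count doubles the per-thread budget. */
        unsigned budget = TOY_REGISTER_FILE / strategies[i].max_threads;
        if (regs_needed <= budget)
            return (int)i;
    }
    return -1;
}
```

In this model a strategy listed with 2 threads is never attempted at 4 threads first, which is the wasted work the commit removes.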
Iago Toral Quiroga
d19ce36ff2 broadcom/compiler: refactor compile strategies
Until now, if we couldn't compile at 4 threads we would lower thread count
with optimizations disabled. However, lowering thread count doubles the
number of registers available per thread, which alone is already a big
relief for register pressure. It therefore makes sense to enable
optimizations when we do that, and progressively disable them until we
enable spilling as a last resort.

This can slightly improve performance for some applications. Sponza,
for example, gets a ~1.5% boost. I see several UE4 shaders that also get
compiled to better code at 2 threads with this, but it is more difficult
to assess how much this improves performance in practice due to the large
variance in frame times that we observe with UE4 demos.

Also, if a compiler strategy disables an optimization that did not make
any progress in the previous compile attempt, we would end up re-compiling
the exact same shader code and failing again. This patch keeps track of
which strategies won't make progress and skips them in that case to save
some CPU time during shader compiles.

Care should be taken to ensure that we try to compile with the default
NIR scheduler at minimum thread count at least once though, so a specific
strategy for this is added, to prevent the scenario where no optimizations
are used and we skip directly to the fallback scheduler if the default
strategy fails at 4 threads.

Similarly, we now also explicitly specify which strategies are allowed to do
TMU spills and make sure we take this into account when deciding to skip
strategies. This prevents the case where no optimizations are used in a
shader and we skip directly to the fallback scheduler after failing
compilation at 2 threads with the default NIR scheduler but without trying
to spill first.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:06 +02:00
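The skipping rule described in d19ce36ff2 above can be sketched as a predicate over two consecutive strategies. This is one reading of the commit message, not the driver's code: the struct fields, the `OPT_A`/`OPT_B` flags, and `should_skip` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical optimization flags. */
enum {
    OPT_A = 1u << 0,
    OPT_B = 1u << 1,
};

/* Hypothetical strategy description. */
struct strategy {
    const char *name;
    unsigned max_threads;   /* thread count to compile at */
    unsigned disabled_opts; /* bitmask of optimizations turned off */
    bool tmu_spilling;      /* whether TMU spills are allowed */
};

/* `progress_mask` holds the optimizations that changed the shader in
 * the previous (failed) attempt. Returns true when trying `next` would
 * just recompile the same code as `prev`. */
static bool should_skip(const struct strategy *prev,
                        const struct strategy *next,
                        unsigned progress_mask)
{
    assert(prev != NULL && next != NULL);

    /* A different thread count changes the register budget, so the
     * attempt is always worth making. */
    if (next->max_threads != prev->max_threads)
        return false;

    /* Newly allowing spills can turn a failure into a success. */
    if (next->tmu_spilling && !prev->tmu_spilling)
        return false;

    /* Otherwise the attempt only differs if it disables an
     * optimization that actually made progress last time. */
    unsigned newly_disabled = next->disabled_opts & ~prev->disabled_opts;
    return (newly_disabled & progress_mask) == 0;
}
```

The two early returns correspond to the commit's caveats: a lower thread count and permission to spill are each tried at least once, even when no optimization made progress.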