Commit graph

143813 commits

Author SHA1 Message Date
Rhys Perry
2e56e23420 aco: make optimize_postRA() work across blocks
fossil-db (Sienna Cichlid):
Totals from 46 (0.03% of 150170) affected shaders:
CodeSize: 103672 -> 103488 (-0.18%)
Instrs: 21968 -> 21922 (-0.21%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
1d894a8c85 aco: move a bunch of helpers into aco_ir.h/aco_ir.cpp
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
3db3196379 aco: add can_use_DPP() and convert_to_DPP()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
a9562fd0d6 aco: fix validation of DPP v_cndmask_b32/v_addc_co_u32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Emma Anholt
6494b08407 i915g: clang-format fixup.
I really need to get clang-format into CI so I can stop doing fixups.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
2021-08-19 17:28:14 +00:00
Emma Anholt
c38cb5d4d8 i915g: Add comments explaining various xfails.
I haven't gone through every test (particularly the ones I think are
loop-unrolling or instruction-count-related), but this gives a
better picture of what's going on in this driver.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
2021-08-19 17:28:14 +00:00
Emma Anholt
ab2645b54c i915g: Clear some xfails that are now skips.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
2021-08-19 17:28:14 +00:00
Emma Anholt
e00a749759 i915g: Reduce ARB_fp max tex indirections to match i915c.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
2021-08-19 17:28:14 +00:00
Emma Anholt
8ebd0f8317 i915g: Correct PIPE_SHADER_CAP_MAX_TEMPS.
This is the value that i915c reported, too, and is required for ARB_fp.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
2021-08-19 17:28:14 +00:00
Emma Anholt
da3f20a3ab i915g: Fix polygon offset by telling draw the Z format.
This is what initializes the MRD for draw's polygon offset calculations.

Closes: #4976
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
2021-08-19 17:28:14 +00:00
Boyuan Zhang
8e5e70bb3d frontends/va: add num_temporal_layers check
Fixes: 51935d59

The temporal_id check is valid only if num_temporal_layers is set (>0).
When num_temporal_layers is 0, we shouldn't check temporal_id or return
an error.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12463>
2021-08-19 14:25:09 +00:00
Boyuan Zhang
4081516b3e radeon/vcn: set min value for num_temporal_layers
Fixes: 51935d59

In the case where num_temporal_layers is not set (0), set it using the
minimum value 1, otherwise the rate control settings will be missing.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12463>
2021-08-19 14:25:09 +00:00
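The two num_temporal_layers commits above (8e5e70bb3d and 4081516b3e) can be read together as the following minimal C sketch. The function names and the `temporal_id < num_temporal_layers` bound are assumptions made for illustration; they are not taken from the Mesa sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: only validate temporal_id when a layer count was
 * actually provided (> 0). The upper-bound comparison is an assumption. */
bool temporal_id_ok(uint32_t num_temporal_layers, uint32_t temporal_id)
{
    if (num_temporal_layers == 0)
        return true; /* not set: nothing to check against */
    return temporal_id < num_temporal_layers;
}

/* Hypothetical helper: fall back to the minimum of one layer when the
 * count is unset, so rate control settings are still emitted. */
uint32_t effective_temporal_layers(uint32_t num_temporal_layers)
{
    return num_temporal_layers == 0 ? 1 : num_temporal_layers;
}
```

With an unset count, `temporal_id_ok(0, 3)` accepts the request, while `effective_temporal_layers(0)` yields 1 for the encoder side.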
Daniel Schürmann
59f2c85845 nir: return false for loops in contains_other_jump()
This allows more loops to be unwrapped.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12473>
2021-08-19 13:51:17 +00:00
Simon Ser
8de086e12f v3d: implement resource_get_param
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.

A tiny helper function is introduced to compute the modifier of a
resource.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb223639 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
2021-08-19 13:12:51 +00:00
Simon Ser
b1fbceac6f vc4: implement resource_get_param
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.

A tiny helper function is introduced to compute the modifier of a
resource.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb223639 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
2021-08-19 13:12:51 +00:00
Simon Ser
99fc6f7271 panfrost: implement resource_get_param
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c092947df ("panfrost: fail in get_handle(TYPE_KMS) without a scanout resource")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
2021-08-19 13:12:51 +00:00
Simon Ser
b5919b0b10 etnaviv: add stride, offset and modifier to resource_get_param
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, extend the resource_get_param hook to allow users to fetch
this information without WINSYS_HANDLE_TYPE_KMS.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 9da901d2b2 ("etnaviv: fail in get_handle(TYPE_KMS) without a scanout resource")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
2021-08-19 13:12:51 +00:00
Erik Faye-Lund
63529782d3 gallium/nir/tgsi: initialize file_max for inputs
When this was rewritten to support Vulkan, we stopped initializing
file_max to -1 in the case of no inputs. This causes the draw module
to go down a needlessly pessimistic case, printing an error while we're
at it.

Fixes: 42b5cfdbd2 ("gallivm/nir: fix vulkan vertex inputs")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12440>
2021-08-19 10:51:01 +00:00
Erik Faye-Lund
4674698008 gallium/nir/tgsi: fixup indentation
This was using mixed tabs and spaces, let's fix that before we start
modifying the code.

Fixes: 42b5cfdbd2 ("gallivm/nir: fix vulkan vertex inputs")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12440>
2021-08-19 10:51:01 +00:00
Danylo Piliaiev
bb4db22ff4 turnip: apply workaround for depth bounds test without depth test
On some GPUs, when:
- depth bounds test is enabled
- depth test is disabled
- depth attachment uses UBWC in sysmem mode
the GPU hangs. As a workaround we should enable the z test, which is what
the blob does for a630. Since we enable the z test, we should make it
always pass.

The blob doesn't emit this workaround on a650 and a660. Untested on a640.

Fixes:
 dEQP-VK.pipeline.extended_dynamic_state.two_draws_static.depth_bounds_test_disable
 dEQP-VK.pipeline.extended_dynamic_state.two_draws_dynamic.depth_bounds_test_disable
 dEQP-VK.dynamic_state.ds_state.depth_bounds_1

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12407>
2021-08-19 10:25:58 +00:00
Danylo Piliaiev
7faee1430a freedreno: rename Z_TEST_ENABLE->Z_READ_ENABLE, Z_ENABLE->Z_TEST_ENABLE
This makes their interaction with Z_BOUNDS_ENABLE more understandable.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12407>
2021-08-19 10:25:58 +00:00
Erik Faye-Lund
d37aa08f61 draw: fix stippling of fractional lines
The OpenGL 4.6 specification, section 14.5.2.1 (Line Stipple) says:

> The masking is achieved using three parameters: the 16-bit line
> stipple p, the line repeat count r, and an integer stipple counter s.

This makes it pretty clear that the stipple counter shouldn't carry
fractional parts. But we also don't really do anything useful with the
fractional part anyway, apart from skewing the third or later line-segments.

Properly carrying over the fractional parts as the Vulkan specification
allows for rectangular lines is trickier than this and would require us
to use a shorter output-line at the start of the following
line-segments.

But for now, let's just do what the OpenGL specification describes and
what the Vulkan specification allows.

This, combined with the following patch for the vulkan CTS makes the
last two rasterization-tests pass for me:

https://github.com/KhronosGroup/VK-GL-CTS/pull/279

Fixes the "spec/!opengl 1.1/linestipple/line strip" piglit-test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12327>
2021-08-19 09:44:16 +00:00
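As a rough illustration of the stipple rule quoted in d37aa08f61, the OpenGL specification draws a fragment when bit `floor(s / r) mod 16` of the pattern `p` is set, with `s` an integer counter. The sketch below is not the draw module code; the function names are hypothetical, and truncating the segment length when advancing the counter is an assumption about how the fractional part is discarded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the fragment at integer stipple counter `counter`
 * passes the stipple test. `repeat` must be at least 1 (OpenGL clamps
 * the repeat count to [1, 256]). */
bool stipple_bit_set(uint16_t pattern, unsigned repeat, unsigned counter)
{
    unsigned bit = (counter / repeat) % 16u;
    return ((pattern >> bit) & 1u) != 0;
}

/* Hypothetical helper: advance the counter past a line segment while
 * keeping it an integer. Truncation is an assumption for illustration. */
unsigned advance_stipple_counter(unsigned counter, float segment_length)
{
    return counter + (unsigned)segment_length;
}
```

Because the counter stays integral, a segment of length 2.75 advances it by 2 in this sketch, and no fractional remainder is carried into later segments.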
Marcin Ślusarz
a3d400a016 turnip: use nir_shader_instructions_pass in tu_lower_io
No functional changes.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>
2021-08-19 08:15:41 +00:00
Marcin Ślusarz
8892d276d2 r600: preserve all metadata when passes don't make progress
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>
2021-08-19 08:15:41 +00:00
Marcin Ślusarz
956d6461ef r600: use nir_shader_instructions_pass in r600_nir_lower_atomics
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>
2021-08-19 08:15:41 +00:00
Marcin Ślusarz
e2917ef9ef freedreno/ir3: use nir_metadata_none instead of its value
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>
2021-08-19 08:15:41 +00:00
Samuel Pitoiset
ab35a63dea radv: do not allocate the FCE predicate for images that use comp-to-single
Images that support comp-to-single don't have to be fast-cleared at
all, so the predicate is unnecessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12323>
2021-08-19 07:50:50 +00:00
Samuel Pitoiset
ef546cf96f radv: remove useless check about the FCE predicate offset
radv_update_fce_metadata() already prevents that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12323>
2021-08-19 07:50:50 +00:00
Samuel Pitoiset
dc58b0112f radv: determine if an image support comp-to-single at creation time
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12323>
2021-08-19 07:50:49 +00:00
Juan A. Suarez Romero
c65e2eed32 broadcom/ci: use deqp-runner suites for gles
Glue together all the GLES related jobs using the suites feature.

This allows us to reduce the total number of devices required, moving
some of them to help with other jobs and leaving the rest free for other
pipelines in parallel.

Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12453>
2021-08-19 07:31:07 +00:00
Marcin Ślusarz
e3b4c77ed3 glsl: refactor code to avoid static analyzer noise
Clang analyzer thinks struct_base_offset can be used uninitialized
because it doesn't know that glsl_type_is_struct_or_ifc returns
the same value for the same type.

Refactor the code to make it clear what is going on. As a side effect
this should be faster because glsl_get_length and
glsl_type_is_struct_or_ifc will be called only once (they are not
inline functions).

This is an alternative approach to
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12399.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12403>
2021-08-19 06:59:01 +00:00
Qiang Yu
e6790d4a31 nir/inline_uniforms: support loop
Support inlining uniforms in loops so that the loops can be unrolled.
Nested loops/ifs are also supported.

Some examples:

    for (i = 0; i < count; i++)
        ...

uniform "count" will be inlined. Note that this does not guarantee
the loop will be unrolled (e.g. when count = 1000).

    for (i = 0; i < count; i++)
        for (j = init; j < 10; j++)
            if (type == 2)
                ...

uniform "count", "init" and "type" will be inlined.

The pass is intentionally not too aggressive about adding uniforms,
to avoid false positives while still supporting the most
common usage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
3c93ebbae5 nir/loop_analyze: skip unsupported induction variable early
Instead of failing in the trip count calculation, don't mark this
kind of variable as an induction variable in the first place.

Don't bother having uniform inlining deal with this kind of variable
either.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
0b9639c35d nir/loop_analyze: record induction variables for each loop
To be used by the uniform inlining lowering pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
c86ec09d11 nir/loop_analyze: move nir_is_supported_terminator_condition() to header
To be shared with uniform inline.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
a406fff78a nir/inline_uniforms: support vector uniform
Collect per-component dependencies and lower a vector uniform
load to scalar if any component needs to be inlined.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
9d796b21ac nir/inline_uniforms: add uniforms in condition atomically
Unless all uniforms in the condition can be inlined, we can't
lower the if/loop. So we roll back the added uniforms when one
of the uniforms in an if condition fails to be added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
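The all-or-nothing behaviour described in 9d796b21ac can be sketched as a save-and-restore of the list length. This is only an illustration, not the NIR pass: the struct, the function name, and the fixed capacity `MAX_INLINED` are hypothetical, and a full list is just one assumed reason an addition might fail.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_INLINED 4 /* hypothetical capacity for this sketch */

struct uniform_list_sketch {
    unsigned offsets[MAX_INLINED];
    size_t count;
};

/* Add every uniform used by one condition, or none of them. If any
 * single addition fails, the list is rolled back to its prior length. */
bool add_condition_uniforms(struct uniform_list_sketch *list,
                            const unsigned *offsets, size_t n)
{
    size_t saved = list->count;

    for (size_t i = 0; i < n; i++) {
        if (list->count == MAX_INLINED) {
            list->count = saved; /* roll back partial additions */
            return false;
        }
        list->offsets[list->count++] = offsets[i];
    }
    return true;
}
```

Restoring only the count is enough here because entries past `count` are never read; a condition that fails to fit leaves earlier, fully added conditions untouched.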
Ilia Mirkin
bce19b3a77 mesa: don't return errors for gl_* GetFragData* queries
There is nothing in the spec about this. BindFragDataLocation* is
supposed to return an error, but not Get.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5221
Fixes: 59012c3133 ("mesa: Implement glGetFragDataLocation")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12333>
2021-08-19 01:52:46 +00:00
Alyssa Rosenzweig
07cc5fd893 panfrost: Add unit tests for non-dithered clears
Would have exposed the bug fixed in the previous commit. This is gnarly
stuff, let's not regress it.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>
2021-08-18 23:32:40 +00:00
Alyssa Rosenzweig
22538b89b3 panfrost: Handle non-dithered clear colours
In b9c095cc2c ("panfrost: Rewrite the clear colour packing code"),
packing of clear colours was corrected to use the tilebuffer's
fractional bits, fixing dithering of the clear colour with formats like
RGB565. Unfortunately, that commit did so unconditionally. If the
framebuffer is dithered, but dithering is disabled at the time of
the clear, we would incorrectly dither the clear.

This is a regression, as the old (broken) code passed the relevant CTS
test. What's the catch? Depending on dither state, there are two
formulas to pack tilebuffer colours. We need to handle both. Fixes
KHR-GLES31.core.draw_buffers_indexed.color_masks.

Fixes: b9c095cc2c ("panfrost: Rewrite the clear colour packing code")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>
2021-08-18 23:32:40 +00:00
Alyssa Rosenzweig
1b710d4a96 panfrost: Add dither state to the clear colour tests
There is a dependence on dithering state about which I was previously
unaware. All these test cases were with dithering enabled, so mark that
down.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>
2021-08-18 23:32:40 +00:00
Alejandro Piñeiro
a4cb756e4d broadcom/qpu: use and expand version info at opcode description
Right now the opcode_desc struct, used to define data for all the
operations to pack/unpack, includes a version field. In theory that
could be used to check if we are retrieving an opcode valid for our hw
version, or to get the correct opcode if a given one changed across hw
versions, or just the same if it didn't change.

In practice that field was not used. So for example, if by mistake we
asked for an opcode defined at version 41, while being on version 33
hardware, we would still get that opcode description.

This commit fixes that, and while we are here we expand the functionality
to allow defining version ranges, in case a given opcode number
and its description are only valid for a given range.

v2 (from Iago feedback):
   * Fixed some comment typos
   * Simplified filtering opcode method
   * Rename filtering opcode method

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>
2021-08-19 01:08:14 +02:00
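The version-range filtering described in a4cb756e4d might look roughly like the sketch below. The struct layout, the field names `first_ver`/`last_ver`, the convention that 0 means "unbounded", and both function names are assumptions for illustration; they are not copied from the broadcom/qpu sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct opcode_desc_sketch {
    uint8_t opcode;
    const char *name;
    uint8_t first_ver; /* 0 means no lower bound (assumed convention) */
    uint8_t last_ver;  /* 0 means no upper bound (assumed convention) */
};

/* True when `ver` falls inside the description's version range. */
bool opcode_desc_matches_ver(const struct opcode_desc_sketch *desc,
                             uint8_t ver)
{
    if (desc->first_ver != 0 && ver < desc->first_ver)
        return false;
    if (desc->last_ver != 0 && ver > desc->last_ver)
        return false;
    return true;
}

/* Return the first description with this opcode number that is valid
 * for the given hardware version, or NULL when there is none. */
const struct opcode_desc_sketch *
lookup_opcode_sketch(const struct opcode_desc_sketch *table, size_t count,
                     uint8_t opcode, uint8_t ver)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].opcode == opcode &&
            opcode_desc_matches_ver(&table[i], ver))
            return &table[i];
    }
    return NULL;
}
```

With this shape, the same opcode number can map to different descriptions on different hardware versions, and asking for a version-41-only opcode on version 33 yields no match instead of a wrong description.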
Alejandro Piñeiro
8a5f2228db broadcom/qpu: add new lookup opcode description helper
Right now there is a helper to get the opcode description from a
packed instruction, used on unpack related instructions. This commit
adds a helper that refactors the equivalent that is already in use on
pack related instructions.

Right now the helper is small, but we plan to extend it on following
commits in order to use the opcode description version field.

To avoid any possible confusion we rename the existing lookup helper.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>
2021-08-19 01:08:02 +02:00
Alejandro Piñeiro
ff74acabf5 broadcom/qpu: update/remove comments
   * Remove one about waddr 6 being reserved, when at some point it
     became NOP

   * Fix one comment about reserved signals on v41 map, as 24 and 25
     are in fact defined. This seems a C&P issue (see v40 map).

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>
2021-08-19 01:07:35 +02:00
Emma Anholt
fdf47acdc7 ci/freedreno: Flake the rest of the pbuffer/window dEQP-EGL tests.
I had at least 3 of these in my logs, I see no reason not to fill out the
rest at this point.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12458>
2021-08-18 22:47:12 +00:00
Emma Anholt
0d023aaaf5 ci/freedreno: Mark a new flaky SSBO length test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12458>
2021-08-18 22:47:12 +00:00
Ian Romanick
5ce3bfcdf3 intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76.

With the previous optimizations in place, this change seems to improve
the quality of the generated code.  Comparing a couple Vulkan CTS tests
on Skylake had the following results.

dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends

dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
f0a8a9816a nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.

v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
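To make the packing in f0a8a9816a concrete, here is a small C sketch of combining four 8-bit components into one 32-bit value. The function name is hypothetical, and placing the first component in the low byte is an assumption about the component order.

```c
#include <stdint.h>

/* Pack four 8-bit components into one 32-bit word, first component in
 * the lowest byte (assumed order). The casts avoid shifting a promoted
 * signed int into the sign bit. */
uint32_t pack_32_4x8_sketch(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
    return (uint32_t)x |
           ((uint32_t)y << 8) |
           ((uint32_t)z << 16) |
           ((uint32_t)w << 24);
}
```

Expressed as a single operation, a backend can emit this more directly than a generic chain of per-component conversions, shifts, and ORs.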
Ian Romanick
89f639c0ca nir/algebraic: Remove spurious conversions from inside logic ops
Not only does this eliminate a bunch of unnecessary type converting
MOVs, but it can also enable some SWAR.  The
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag test does
something about like:

    c = a.x ^ b.x;
    d = a.y ^ b.y;
    e = a.z ^ b.z;

After this change, it looks more like:

    uint t = i8vec3AsUint(a) ^ i8vec3AsUint(b);
    c = extract_u8(t, 0);
    d = extract_u8(t, 1);
    e = extract_u8(t, 2);

On Ice Lake, this results in:

SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 31 instructions. 1 loops. 2844 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
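The SWAR rewrite shown in 89f639c0ca relies on XOR having no carries between byte lanes, so one 32-bit XOR gives the same result as three 8-bit XORs. The C sketch below checks that equivalence; all function names are hypothetical stand-ins for the pseudo-code in the commit message.

```c
#include <stdint.h>

/* Place three 8-bit components in the low three bytes of a word. */
uint32_t pack_u8x3_sketch(uint8_t x, uint8_t y, uint8_t z)
{
    return (uint32_t)x | ((uint32_t)y << 8) | ((uint32_t)z << 16);
}

/* Extract byte `i` (0 = lowest) from a packed word. */
uint8_t extract_u8_sketch(uint32_t v, unsigned i)
{
    return (uint8_t)(v >> (8u * i));
}

/* One 32-bit XOR covers all packed lanes at once, because XOR never
 * propagates bits from one lane into another. */
uint32_t xor_u8x3_swar(uint32_t a, uint32_t b)
{
    return a ^ b;
}
```

The same trick holds for AND and OR, but not for operations such as addition, where a carry out of one byte would spill into the next lane.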
Ian Romanick
7c83aa0518 intel/fs: Emit better code for u2u of extract
Emitting the instructions one by one results in two MOV instructions
that won't be propagated.  By handling both instructions at once, a
single MOV is emitted.  For example, on Ice Lake this helps
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends

Without "intel/fs: Allow copy propagation between MOVs of mixed sizes,"
the improvement is still 8 instructions, but there are more instructions
to begin with:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00