Commit graph

148059 commits

Author SHA1 Message Date
Alyssa Rosenzweig
f9a01af4f3 pan/bi: Allow selecting from an 8-bit vec8
The word offset is already handled by the above code, so there's no need to
further restrict the swizzle. This pattern can come up with OpenCL.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
65961848b1 pan/bi: Remove bogus assert for pack_32_2x16
The following IR is valid NIR:

   vec1 16 ssa_0 = ...
   vec1 32 ssa_1 = pack_32_2x16 ssa_0.xx

In this case, pack_32_2x16 takes in a two component vector, but the source
itself ssa_0 has only a single component. This is fine due to the shuffle, but
will fail the assert. Remove the assert and all is well.

Fixes test_relational.shuffle_copy.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
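To illustrate why the IR in 65961848b1 is valid, here is a minimal sketch of the packing semantics in plain C. It assumes the usual NIR convention that the first component lands in the low 16 bits; the helper `pack_32_2x16_xx` is a hypothetical name used only to model the `.xx` swizzle, not a Mesa function.

```c
#include <stdint.h>

/* Pack two 16-bit components into one 32-bit word:
 * x goes in the low half, y in the high half. */
static uint32_t pack_32_2x16(uint16_t x, uint16_t y)
{
    return (uint32_t)x | ((uint32_t)y << 16);
}

/* Hypothetical helper: the .xx swizzle reads the single source
 * component twice, so a one-component source still supplies both halves. */
static uint32_t pack_32_2x16_xx(uint16_t x)
{
    return pack_32_2x16(x, x);
}
```

The swizzle supplies both operands, so the number of components in the source value itself does not need to be two.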
Alyssa Rosenzweig
5689a932e8 pan/bi: Lower f2i8, f2u8
These need a simple two-instruction lowering regardless of the size of float
involved. Fixes integer_ops.integer_divideAssign

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
82b912f844 pan/bi: Lower 8-bit min/max to bcsel+comparison
We don't have an 8-bit CSEL, so this is the best we can do. It's easier to write
the lowering as an algebraic rule since we don't need to do anything clever.
Fixes integer_ops.integer_clamp.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
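The lowering in 82b912f844 can be sketched per lane in plain C. This is an illustration of the min/max-to-select idea only, not the algebraic rule itself; the function names are hypothetical and the real rule operates on NIR, not on C integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* bcsel(cond, a, b): pick a when cond is true, otherwise b. */
static int8_t bcsel_i8(bool cond, int8_t a, int8_t b)
{
    return cond ? a : b;
}

/* min and max expressed as one comparison plus one select. */
static int8_t imin8(int8_t a, int8_t b) { return bcsel_i8(a < b, a, b); }
static int8_t imax8(int8_t a, int8_t b) { return bcsel_i8(b < a, a, b); }

/* clamp built from the two, as a test like integer_clamp would exercise. */
static int8_t iclamp8(int8_t x, int8_t lo, int8_t hi)
{
    return imin8(imax8(x, lo), hi);
}
```

Each min or max costs two operations instead of one, which matches the "best we can do" framing of the commit message.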
Alyssa Rosenzweig
4ee56ecd9c pan/va: Add 8-bit integer max assembler case
This needs to be lowered to a two instruction sequence because there is no
CSEL.v4s8.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
31a5eb6165 pan/bi: Add HADD.v4s8.rhadd packing test cases
To confirm the XML is right.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
decc24b18b pan/va: Pack .rhadd bit
Fixes integer_ops.integer_rhadd.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
42a474daac pan/bi: Handle uhadd, urhadd opcodes
Fixes integer_ops.integer_hadd.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
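For reference, the semantics that uhadd and urhadd are expected to have can be sketched in C using the well-known overflow-free identities. This is an assumption-labelled illustration for the 8-bit unsigned case; it says nothing about how the hardware HADD instruction or the compiler implements them.

```c
#include <stdint.h>

/* Halving add, rounding down: floor((a + b) / 2),
 * computed without needing a wider intermediate. */
static uint8_t uhadd8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a & b) + ((a ^ b) >> 1));
}

/* Rounding halving add: floor((a + b + 1) / 2). */
static uint8_t urhadd8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a | b) - ((a ^ b) >> 1));
}
```

The two differ only when `a + b` is odd, which is why a single `.rhadd` modifier bit is enough to distinguish them in an encoding.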
Alyssa Rosenzweig
c717c28d87 pan/va: Fix v4s8 form of R2 opcodes
The XML had a typo which was copypasted (incorrectly) into various instructions.
Fixes a pile of integer_ops subtests.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
48ba7f8627 pan/va: Pack IADD.sat bit
Fixes 32-bit portion of integer_ops integer_add_sat.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
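As a reminder of what the `.sat` bit is meant to select, here is a small C sketch of a saturating 32-bit add. It covers the signed case only and is an illustration of the expected result, not of the instruction encoding; `iadd_sat32` is a hypothetical name.

```c
#include <stdint.h>

/* Signed 32-bit add that clamps to the representable range
 * instead of wrapping around on overflow. */
static int32_t iadd_sat32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}
```

If the saturate bit is dropped during packing, the hardware performs a plain wrapping add, and only inputs near the range limits reveal the difference.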
Alyssa Rosenzweig
77fcb4b291 pan/bi: Strip negate when lowering swizzles
When we lower swizzles, we move source modifiers (except for the swizzle) after
the swizzle operation. In particular, we change the order of composition for
negates and abs. However, copying the source will copy the modifiers unless we
specifically strip the extra modifiers. That's harmless in practice on Bifrost,
which doesn't check for extraneous modifiers, but is incorrect IR and trips an
assertion in the Valhall packing code.

Fixes test_relations.relational_bitselect.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
377bf3a5a4 pan/bi: Lower swizzles for 8-bit shifts
Fixes integer_ops.integer_ctz.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
2e1b02e6a3 pan/bi: Test some 8-bit swizzle lowering
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
d76c48103f pan/bi: Lower some 8-bit swizzles
Fixes the 8-bit portion of OpenCL's integer_ops.integer_clz test case.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
d471b386c1 pan/bi: Unit test swizzle lowering
We're about to extend this pass to support 8-bit swizzles. That will be a
nontrivial change, so let's get some testing for what's already in the pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
1370c27728 pan/va: Fix missing swizzle on CLZ.v2u16
Fixes 16-bit portion of integer_clz.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
bdab1f9ce9 panfrost: Assume launch_grid parameters always change
This is only a theoretical bug fix because, for now, we always reemit
everything. But this aligns launch_grid with draw_vbo and makes the intention
explicit; both seem like good things to me.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
b261a18550 panfrost: Honour flush-to-zero controls on Valhall
Fixes math_bruteforce.atan2 and contractions tests.

For OpenCL, we want to flush fp32 and preserve fp16, applying to both inputs and
outputs so F16_TO_F32 acts as preserve, which implements CL spec text:

> Denormalized numbers for the half data type which may be generated when
converting a float to a half using vstore_half and converting a half to a float
using vload_half cannot be flushed to zero

Note that our libclc builds flush denorms and rusticl does not advertise denorms
so we're expected to flush to zero. rusticl correctly sets the desired float
controls, we just have to match to the hardware requirements.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
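To make "flush to zero" concrete for fp32, the following C sketch shows the effect on a single value. On the hardware this is selected through float controls rather than executed as code, so this is only a software model under that assumption; `flush_denorm_f32` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Software model of fp32 flush-to-zero: subnormal values become a
 * zero of the same sign; normal values, infinities and NaNs pass through. */
static float flush_denorm_f32(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);
    /* Exponent field all zero: the value is zero or subnormal. */
    if ((bits & 0x7F800000u) == 0)
        bits &= 0x80000000u; /* keep only the sign bit */
    memcpy(&x, &bits, sizeof bits);
    return x;
}
```

Under the policy described above, the same treatment is not applied to fp16 values, which must be preserved.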
Alyssa Rosenzweig
9333428ca2 panfrost: Advertise PIPE_CAP_INT64
nir_lower_int64 should be able to chew through everything anyway. Fixes
compilers.feature_macro (with LLVM 15).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
b27589b5d4 panfrost: Bump PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS
Bump to 2048, the minimum maximum for image support in the full profile of
OpenCL. The relevant hardware limit is 65536 so we have plenty of clearance.

Fixes api.get_image1d_array_info.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
ff29ff5fad panfrost: Upload default sampler for txf
In NIR, txf does not take a sampler. However, in the hardware it does take a
sampler. If there is no sampler bound and we use txf, the hardware will read
back all-0's due to bounds checking. As a workaround, bind a trivial sampler and
use that.

As-is this workaround is Valhall specific, making use of an extra resource
table. I'm punting on generalizing back to Bifrost until I can discuss the issue
in more depth with Jason and Karol and figure out the right fix.

Fixes api.image_properties_query.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
6d180c84fb panfrost: Allow compiling MESA_SHADER_KERNEL
Required for Rusticl.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Alyssa Rosenzweig
185b3e2d7e panfrost: Default pipe->clear_texture impl
For rusticl.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
2022-09-19 17:22:58 +00:00
Jason Ekstrand
8f4af4d700 nir/load_libclc: Don't add generic variants that already exist
At some point in the future, adding generic variants to libclc will
hopefully no longer be needed.  At that point, we don't want the NIR
code adding duplicates.  Check if the generic version already exists
and, if it does, don't re-add it.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18675>
2022-09-19 16:52:17 +00:00
Jason Ekstrand
2aa9eb497d nir: Add a helper for finding a function by name
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18675>
2022-09-19 16:52:17 +00:00
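The pattern behind 8f4af4d700 and 2aa9eb497d (look a function up by name, and only add it when the lookup fails) can be sketched as follows. The `struct func` and `struct shader` types and both function names are hypothetical stand-ins, not the NIR data structures or the helper's real signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a shader's function list. */
struct func {
    const char *name;
    struct func *next;
};

struct shader {
    struct func *functions;
};

/* Return the function called `name`, or NULL if there is none. */
static struct func *find_function_by_name(struct shader *s, const char *name)
{
    for (struct func *f = s->functions; f != NULL; f = f->next) {
        if (strcmp(f->name, name) == 0)
            return f;
    }
    return NULL;
}

/* Link `f` into the shader unless a function with the same name is
 * already present. Returns whichever function ends up in the shader. */
static struct func *add_function_once(struct shader *s, struct func *f)
{
    struct func *existing = find_function_by_name(s, f->name);

    if (existing != NULL)
        return existing;

    f->next = s->functions;
    s->functions = f;
    return f;
}
```

With this shape, a library that later starts shipping the generic variants itself simply makes the "add" step a no-op.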
Jason Ekstrand
0a06abbb91 spirv: Don't use libclc for wait_group_events
v2: Drop old code (Karol)

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18675>
2022-09-19 16:52:17 +00:00
Vinson Lee
093b19b09a egl/dri2: Fix missing return with dri2_egl_error_unlock.
Fix defect reported by Coverity Scan.

Double unlock (LOCK)
double_unlock: dri2_egl_error_unlock unlocks dri2_dpy->lock while it is unlocked.

Fixes: f1efe037df ("egl/dri2: Add display lock")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18655>
2022-09-19 16:24:08 +00:00
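The class of defect fixed in 093b19b09a can be reproduced with a toy lock. The sketch below is loosely modelled on an "unlock and report error" helper; the types, names and signatures are all hypothetical and do not mirror the EGL code.

```c
#include <stdbool.h>

/* Toy lock that counts releases performed while already unlocked. */
struct toy_lock {
    bool locked;
    int double_unlocks;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    l->locked = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    if (!l->locked)
        l->double_unlocks++;
    l->locked = false;
}

/* Error helper: releases the lock and reports failure.
 * The caller is expected to return its result immediately. */
static bool error_unlock(struct toy_lock *l)
{
    toy_lock_release(l);
    return false;
}

static bool do_work(struct toy_lock *l, bool fail)
{
    toy_lock_acquire(l);

    if (fail)
        return error_unlock(l); /* without `return`, control would fall
                                 * through to the second release below */

    toy_lock_release(l);
    return true;
}
```

Dropping the `return` in front of `error_unlock(l)` makes the failure path release the lock twice, which is the double unlock a static analyzer reports.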
Alyssa Rosenzweig
a1faab0b90 agx: Convert and clamp array indices in NIR
..Rather than at backend IR translation time. This is considerably
simpler because we can use the txs lowering instead of special casing
array sizes. Unfortunately it generates worse code, but that gap should
close once nir_opt_preamble is wired in.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18652>
2022-09-19 16:14:24 +00:00
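What "convert and clamp" means for an array index can be sketched in C. The rounding mode (add 0.5 and truncate) is an assumption made for illustration, as is the function name; in the driver the layer count would come from the lowered txs query rather than a parameter.

```c
#include <stdint.h>

/* Convert a float layer coordinate to an integer index and clamp it
 * to [0, layers - 1]. Assumes layers >= 1. */
static uint32_t clamp_array_index(float coord, uint32_t layers)
{
    uint32_t max_index = layers - 1;
    float rounded;

    if (!(coord > 0.0f)) /* negative values and NaN map to layer 0 */
        return 0;

    rounded = coord + 0.5f; /* assumed rounding: nearest, ties upward */
    if (rounded >= (float)max_index)
        return max_index;

    return (uint32_t)rounded;
}
```

Doing this per texture instruction costs a few extra ALU operations, which is consistent with the commit's note that the generated code gets worse.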
Alyssa Rosenzweig
1304f4578d panfrost: Adapt emit_shared_memory for indirect dispatch
Indirect dispatch does not actually require any dynamic memory allocation, even
with shared memory. We just need to set wls_instances to some (mostly arbitrary)
value, statically allocate memory based on that, and let the hardware throttle
workgroups to fit if needed.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18661>
2022-09-19 15:18:40 +00:00
Alyssa Rosenzweig
79b66a28cd rusticl: Build Panfrost
We want OpenCL, too!

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18663>
2022-09-19 14:50:09 +00:00
James Park
b7d4897df9 meson,amd: Remove Windows libelf wrap
Functionality isn't worth the maintenance cost.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18478>
2022-09-19 12:51:12 +00:00
Illia Polishchuk
74658b01d2 driconf/Intel: Add lower_depth_range_rate option workaround for Homerun Clash misrendering issue
Intel rounds floating-point Z interpolation differently than
other Mesa GPUs.
For example, gl_Position.z = 0.0 is interpolated to
gl_FragCoord.z = 0.5 on all GPUs.

gl_Position.z = -0.00000001 is interpolated to
gl_FragCoord.z = 0.4999999702 on Intel
and rounded to gl_FragCoord.z = 0.5 on other GPUs.

Games using the LEQUAL depth func will fail the depth test on Intel
and pass it on other GPUs in such a case.

This workaround lowers the translated depth range, so that
several gl_FragCoord.z values with an extremely small difference
are translated to the same UINT16/UINT24/UINT32
value of an integer depth buffer.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7199

Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18412>
2022-09-19 10:08:48 +00:00
Marcin Ślusarz
dedd8affd8 anv: fix emission of primitive replication packet for mesh stage
anv_pipeline_get_last_vue_prog_data (used by emit_3dstate_primitive_replication)
doesn't work for mesh stage.

Fixes: ae57628dd5 ("anv: Drop anv_pipeline::use_primitive_replication")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18495>
2022-09-19 09:44:00 +00:00
Dave Airlie
9452e5e03a lavapipe: fix 3d depth stencil image clearing.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18665>
2022-09-19 17:26:57 +10:00
Mike Blumenkrantz
73797c2f46 zink: use screen interfaces for pipeline barriers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18628>
2022-09-19 01:42:28 +00:00
Mike Blumenkrantz
8c4aaa154a zink: add screen interfaces for pipeline barriers
this will enable direct calling of the right function without the overhead
of having conditionals in the barrier functions themselves

eventually, the '2' variants will be widely enough deployed that
this can be deleted

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18628>
2022-09-19 01:42:28 +00:00
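The idea in 8c4aaa154a, choosing the barrier implementation once and calling it through the screen, can be sketched with function pointers. Everything here is hypothetical: the struct layout, the `have_barrier2` flag and the two stub implementations, which return a number only so the chosen variant can be observed.

```c
#include <stdbool.h>

struct screen;
typedef int (*barrier_fn)(struct screen *);

/* Hypothetical screen object: the barrier entry point is picked once. */
struct screen {
    bool have_barrier2; /* whether the '2' variants are available */
    barrier_fn image_barrier;
};

/* Stub implementations standing in for the two code paths. */
static int image_barrier_v1(struct screen *s) { (void)s; return 1; }
static int image_barrier_v2(struct screen *s) { (void)s; return 2; }

/* Decide at screen creation, so callers never branch per barrier. */
static void screen_init_barriers(struct screen *s)
{
    s->image_barrier = s->have_barrier2 ? image_barrier_v2
                                        : image_barrier_v1;
}
```

Callers then write `screen->image_barrier(screen)` unconditionally, and removing the older path later only touches the init function.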
Mike Blumenkrantz
5a78fe4445 zink: add functions for using '2' variants of pipeline barriers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18628>
2022-09-19 01:42:28 +00:00
Mike Blumenkrantz
9b0b8cad60 zink: add have_vulkan13 to device info
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18628>
2022-09-19 01:42:28 +00:00
Mike Blumenkrantz
95ea41dff9 zink: rewrite clears on fb bind if only the format has changed
in some apps (hl2), there's a weird sequence like:
* bind attachment with srgb view
* clear
* bind attachment with base format
* draw

rewriting the clear color like this avoids unnecessarily triggering
a renderpass

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18627>
2022-09-19 01:00:43 +00:00
Mike Blumenkrantz
13a19ad90c zink: make void clears more robust
void clears are intended to be the first clear applied to a surface,
so ensure that these don't clobber any scissored clears

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18627>
2022-09-19 01:00:43 +00:00
Mike Blumenkrantz
d7c64ffcb8 zink: split up get_clear_data()
make the array extension part reusable

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18627>
2022-09-19 01:00:43 +00:00
Mike Blumenkrantz
11a5297ef5 zink: don't add void clears if a full clear already exists
this otherwise may clobber other clears or add unnecessary duplicates

Fixes: 7ea7d0687b ("zink: inject a 0,0,0,1 clear for RGBX formats")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18627>
2022-09-19 01:00:43 +00:00
David Heidelberg
f380a2d63e ci/intel: drop glmark2 terrain trace
See: https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/-/merge_requests/50

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18633>
2022-09-18 18:51:14 +00:00
David Heidelberg
ce05ed1866 ci/panfrost: drop glmark2 terrain trace
See: https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/-/merge_requests/50

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18633>
2022-09-18 18:51:14 +00:00
David Heidelberg
f4eea9ebc2 ci/radeonsi: drop glmark2 terrain trace
See: https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/-/merge_requests/50

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18633>
2022-09-18 18:51:14 +00:00
Alyssa Rosenzweig
f06809cdca panfrost: Evict the BO cache when allocation fails
If memory allocation fails, we look for a suitable sized BO in the BO cache and
wait until we can use its memory. That usually works, but there's a case when it
can fail despite sufficient memory in the system: BOs in the BO cache
contributing to memory pressure but none of them being of sufficient size. This
case is not just theoretical: it's seen in the OpenCL
test_non_uniform_work_group, which puts the system under considerable memory
pressure with an unusual allocation pattern.

To handle this case, try evicting *everything* from the BO cache and stalling
in order to allocate, if the above attempts failed. Fixes the following error:

   DRM_IOCTL_PANFROST_CREATE_BO failed: No space left on device

on the aforementioned OpenCL test.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18579>
2022-09-18 18:34:21 +00:00
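The failure mode and the fallback in f06809cdca can be modelled with a toy allocator. This is a simplified sketch under stated assumptions: cached BOs count against a fixed capacity, there is no waiting or stalling, and the order of attempts is reduced to three steps. None of the names are Panfrost functions.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 8

/* Toy device: `capacity` bytes in total; cached BOs still occupy memory. */
struct toy_dev {
    size_t capacity;
    size_t in_use;             /* bytes held by live BOs */
    size_t cache[CACHE_SLOTS]; /* sizes of cached BOs, 0 = empty slot */
};

static size_t cached_bytes(const struct toy_dev *d)
{
    size_t total = 0;

    for (int i = 0; i < CACHE_SLOTS; i++)
        total += d->cache[i];
    return total;
}

/* Stand-in for the kernel allocation; fails when memory is exhausted. */
static bool kernel_alloc(struct toy_dev *d, size_t size)
{
    if (d->in_use + cached_bytes(d) + size > d->capacity)
        return false;
    d->in_use += size;
    return true;
}

/* Reuse a cached BO of at least `size` bytes, if there is one. */
static bool cache_fetch(struct toy_dev *d, size_t size)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (d->cache[i] != 0 && d->cache[i] >= size) {
            d->in_use += d->cache[i];
            d->cache[i] = 0;
            return true;
        }
    }
    return false;
}

static void cache_evict_all(struct toy_dev *d)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        d->cache[i] = 0;
}

/* Allocate, fall back to the cache, and as a last resort evict
 * everything from the cache and try the allocation again. */
static bool bo_create(struct toy_dev *d, size_t size)
{
    if (kernel_alloc(d, size))
        return true;
    if (cache_fetch(d, size))
        return true;

    cache_evict_all(d);
    return kernel_alloc(d, size);
}
```

With three cached 30-byte BOs on a 100-byte device, a 40-byte request fails the first two steps even though enough memory exists in total; only the eviction step lets it succeed.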
Pavel Ondračka
a1b55fde93 r300: fix register rewrite when converting rgb instructions to alpha
Example from dEQP-GLES2.functional.shaders.indexing.tmp_array.float_dynamic_write_dynamic_loop_read_fragment

Fragment Program: after 'pair translate'
  0: src0.xyz = input[0], src1.xyz = const[5]
     MAD temp[0].xyz, src0.xxx, src1.Hyz, src0.000
  1: src0.xyz = const[1], src1.xyz = const[6]
     MAD temp[1].xyz, src0.xxx, src0.111, -src1.x1z
  2: src0.xyz = temp[1]
     CMP temp[1].xyz, src0.000, src0.111, src0.xyz
  3: src0.xyz = temp[0], src1.xyz = input[0], src2.xyz = temp[1]
     CMP temp[2].x, src0.x__, src1.x__, -src2.y__
  4: src0.xyz = input[0], src1.xyz = temp[0], src2.xyz = temp[1]
     CMP temp[3].x, src0.x__, src1.x__, -src2.z__
  5: src0.xyz = temp[1]
     MAX temp[4].x, src0.x__, src0.z__
  6: src0.xyz = temp[0], src1.xyz = input[0], src2.xyz = temp[4]
     CMP temp[4].x, src0.x__, src1.x__, -src2.x__
  7: src0.xyz = temp[3], src1.xyz = input[0], src2.xyz = temp[1]
     CMP temp[3].x, src0.x__, src1.x__, -src2.x__
  8: src0.xyz = input[0], src1.xyz = temp[2], src2.xyz = temp[1]
     CMP temp[2].x, src0.x__, src1.x__, -src2.x__
  9: src0.xyz = temp[1]
     MAD temp[1].x, src0.x__, src0.y__, src0.000
 10: src0.xyz = input[0], src1.xyz = temp[0], src2.xyz = temp[1]
     CMP temp[1].x, src0.x__, src1.x__, -src2.x__
 11: src0.xyz = const[2], src1.xyz = const[6]
     MAD temp[5].xyz, src0.xxx, src0.111, -src1.x1z
 12: src0.xyz = temp[5]
     CMP temp[5].xyz, src0.000, src0.111, src0.xyz
 13: src0.xyz = temp[0], src1.xyz = temp[2], src2.xyz = temp[5]
     CMP temp[6].x, src0.y__, src1.x__, -src2.y__
 14: src0.xyz = temp[3], src1.xyz = temp[0], src2.xyz = temp[5]
     CMP temp[7].x, src0.x__, src1.y__, -src2.z__
 15: src0.xyz = temp[5]
     MAX temp[8].x, src0.x__, src0.z__
 16: src0.xyz = temp[0], src1.xyz = temp[4], src2.xyz = temp[8]
     CMP temp[4].x, src0.y__, src1.x__, -src2.x__
 17: src0.xyz = temp[7], src1.xyz = temp[3], src2.xyz = temp[5]
     CMP temp[3].x, src0.x__, src1.x__, -src2.x__
 18: src0.xyz = temp[2], src1.xyz = temp[6], src2.xyz = temp[5]
     CMP temp[2].x, src0.x__, src1.x__, -src2.x__
....

This will be pair scheduled to:
Fragment Program: after 'pair scheduling'
  0: src0.xyz = input[0], src1.xyz = const[5]       // original inst 0
     MAD temp[0].xyz, src0.xxx, src1.Hyz, src0.000
  1: src0.xyz = const[1], src1.xyz = const[6]       // original inst 1
     MAD temp[1].xyz, src0.xxx, src0.111, -src1.x1z
  2: src0.xyz = const[2], src1.xyz = const[6]       // original inst 11
     MAD temp[5].xyz, src0.xxx, src0.111, -src1.x1z
  3: src0.xyz = temp[1]                             // original inst 2
     CMP temp[1].xyz, src0.000, src0.111, src0.xyz
  4: src0.xyz = temp[1], src1.xyz = temp[0], src2.xyz = input[0]
     MAX temp[4].x, src0.x__, src0.z__              // original inst 5
     CMP temp[2].w, src1.x, src2.x, -src0.y         // original inst 3
  5: src0.xyz = input[0], src1.xyz = temp[0], src2.xyz = temp[1]
     CMP temp[3].w, src0.x, src1.x, -src2.z         // original inst 4
  6: src0.xyz = temp[5], src0.w = temp[2], src1.xyz = input[0], src2.xyz = temp[1]
     CMP temp[5].xyz, src0.000, src0.111, src0.xyz  // original inst 12
     CMP temp[5].w, src1.x, src0.w, -src2.x         // original inst 8
  7: src0.xyz = temp[0], src0.w = temp[5], src1.xyz = temp[2], src2.xyz = temp[5]
     CMP temp[6].x, src0.y__, src0.w__, -src2.y__   // original inst 13
  8: src0.xyz = temp[5], src0.w = temp[3], src1.xyz = input[0], src2.xyz = temp[1]
     MAX temp[8].x, src0.x__, src0.z__              // original inst 15
     CMP temp[5].w, src0.w, src1.x, -src2.x         // original inst 7
  9: src0.xyz = temp[3], src0.w = temp[5], src1.xyz = temp[0], src2.xyz = temp[5]
     CMP temp[7].x, src0.w__, src1.y__, -src2.z__   // original inst 14
 10: src0.xyz = temp[2], src0.w = temp[5], src1.xyz = temp[6], src2.xyz = temp[5]
     CMP temp[2].x, src0.w__, src1.x__, -src2.x__   // original inst 18
 11: src0.xyz = temp[7], src0.w = temp[5], src1.xyz = temp[3], src2.xyz = temp[5]
     CMP temp[3].x, src0.x__, src0.w__, -src2.x__   // original inst 17
....

The problem is that instruction 11 (which was instruction 17 before the scheduling) now reads
the wrong source for src0. It initially used the result of instruction 8 (now scheduled as 6),
but now it reads from instruction 8 (corresponding to instruction 7 before the scheduling).

The bug is quite subtle and needs a few conditions to reproduce:
- there is a loop, therefore we skip the register rename
  pass and hence don't have the ssa-like form,
- there are at least two rgb instructions writing the same register
  and both are convertible to alpha instructions,
- there is an excess of rgb instructions, so that the conversion actually
  happens.

What happens is that, while scheduling instructions, the scheduler
recognizes there are no alpha instructions to pair the rgb ones with
and converts some to alpha. It primarily tries to use the same register
and just reuse the alpha channel.

Why does it happen? We are tracking the usage of registers in the block
being scheduled and when we rewrite something we move the users tracked
by the reg_value structures to the new register. The problem is that when
we do this, the current code expects that the code is in the ssa-like
form. Here it is not (because of the loop) and when we convert the
original instruction 3, we move the dependency information about
temp[2].x to temp[2].w. When we later convert instruction 8, which also
writes temp[2].x, the original dependency info is gone, and when we copy
that to the new reg (temp[5].w), we just set it to NULL. That means we
effectively don't mark it as used, and later wrongly use it again when
we look for the next empty register.

Fix this by not deleting the original dependency info. We can't reuse the
reg now, but it doesn't matter, because the regalloc later can sort it out.
There are no changes in the shader-db.

Fixes: dEQP-GLES2.functional.shaders.indexing.tmp_array.float_dynamic_write_dynamic_loop_read_fragment
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6508

Reviewed-by: Filip Gawin <filip@gawin.net>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18621>
2022-09-18 18:25:08 +00:00
Vinson Lee
bbd549205c pan/bi: Fix memory leaks.
Fix defects reported by Coverity Scan.

Resource leak (RESOURCE_LEAK)
leaked_storage: Variable used going out of scope leaks the storage it points to.
leaked_storage: Variable multiple_uses going out of scope leaks the storage it points to.

Fixes: 8fb415fee2 ("pan/bi: Reduce some moves when going out-of-SSA")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18653>
2022-09-18 10:18:04 -07:00
Alyssa Rosenzweig
bcd75a13e0 asahi: Identify shared memory layouts
Somehow maps to the tile size. Not sure about the details yet.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig
b8b3c9fa2a asahi: Identify pixel stride
Number of bytes in a pixel in the tilebuffer, does not depend on the
tile size.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00