514 commits

Author SHA1 Message Date
Eric Engestrom
79af30768d meson: inline inc_common
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00
Rob Clark
36aed70b59 util/ra: spiff out select_reg_callback
Add a parameter so the callback can know which node it is selecting a
register for.  And remove the graph parameter, as it is unused by
existing users, and somewhat unnecessary (ie. the callback data could
be used instead).

And add a comment so $future_me remembers how this works.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
2020-03-10 16:01:39 +00:00
Eric Anholt
12cf484d02 v3d: Ask the state tracker to lower image accesses off of derefs.
This saves a bunch of hassle in handling derefs in the backend, and would
be needed for reasonable handling of dynamic indexing of image arrays.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
2020-02-24 18:25:02 +00:00
Jose Maria Casanova Crespo
68bb26af63 broadcom: Fix implicit declaration of ffs for Android build
Include util/bitscan.h to ensure ffs is available when there is no
glibc like in Android.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1983
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2554>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2554>
2020-02-06 18:31:13 +01:00
Eric Anholt
8d07d66180 glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT.
This means you can directly use format utils on it without having to have
your own GL enum to number-of-components switch statement (or whatever) in
your vulkan backend.

Thanks to imirkin for fixing up the nouveau driver (and a couple of core
details).

This fixes the computed qualifiers for EXT_shader_image_load_store's
non-integer sizeNxM qualifiers, which we don't have tests for.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3d)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
2020-02-05 10:31:14 -08:00
Anthony Pesch
f77369086c util/hash_table: update users to use new optimal integer hash functions
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Jason Ekstrand
d3737002ee nir/lower_atomics_to_ssbo: Also lower barriers
This is more correct for a pass which is supposed to completely lower
away atomic counters.  It also lets us stop supporting atomic counter
barriers in most of the drivers.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
e40b11bbcb nir: Rename nir_intrinsic_barrier to control_barrier
This is a more explicit name now that we don't want it to be doing any
memory barrier stuff for us.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
60097cc840 nir: Add a new memory_barrier_tcs_patch intrinsic
Right now, it's implemented as a no-op for everyone.  For most drivers,
it's a switch case in the NIR -> whatever which just breaks.  For ir3,
they already have code to delete tessellation barriers so we just add a
case to also delete memory_barrier_tcs_patch.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Iago Toral Quiroga
6c7a2b69f8 v3d: handle writes to gl_Layer from geometry shaders
When geometry shaders write a value to gl_Layer that doesn't correspond to
an existing layer in the target framebuffer the rendering behavior is
undefined according to the spec, however, there are CTS tests that trigger
this scenario on purpose, probably to ensure that nothing terrible happens.

For V3D, this situation is problematic because the binner uses the layer
index to select the offset to write into the tile state data, and we only
allocate tile state for MAX2(num_layers, 1), so we want to make sure we
don't produce values that would lead to out of bounds writes. The simulator
has an assert to catch this. Although we haven't observed issues on actual
hardware, it is probably best to play it safe.
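
One way to guard against such out-of-range layer indices can be sketched as follows. This is a minimal illustration, not the code from this commit: the name `sanitize_gs_layer` and the choice to fall back to layer 0 are assumptions. Only the `MAX2(num_layers, 1)` allocation size comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the driver's actual code. */
static uint32_t
sanitize_gs_layer(int32_t layer, uint32_t num_layers)
{
        /* Tile state is allocated for MAX2(num_layers, 1) layers. */
        const uint32_t allocated = num_layers > 1 ? num_layers : 1;

        /* Assumption: out-of-range (including negative) values fall
         * back to layer 0, which always has tile state allocated. */
        const uint32_t result =
                (layer >= 0 && (uint32_t)layer < allocated) ? (uint32_t)layer : 0;

        assert(result < allocated);
        return result;
}
```

Whatever value is chosen for the fallback, the point is that the index used to address tile state never exceeds what was allocated.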

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a6b318ef52 v3d: predicate geometry shader outputs inside non-uniform control flow
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a07d70c54b v3d: we always have at least one output segment
If we program an output size of 0 the simulator asserts. This was
not a problem until now because our VS would always have to
emit fixed function outputs, however, now that it can be paired
with a GS we can end up with a VS shader that no longer emits
any outputs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
76fc8c8bb1 v3d: compute appropriate VPM memory configuration for geometry shader workloads
Geometry shaders can output many vertices and thus have higher VPM memory
pressure as a result. It is possible that too wide geometry shader dispatches
exceed the maximum available VPM output allocated, in which case we need
to reduce the dispatch width until we can fit the VPM memory requirements.
Supported dispatch widths for geometry shaders are 16, 8, 4, 1.

There is a limit in the number of VPM output sectors that can be used by a
geometry shader that we can meet by lowering the dispatch width at compile
time, however, at draw time we need to revisit this number and, together with
other elements that can contribute to total VPM memory requirements, decide
on a configuration that can fit the program into the available VPM memory.
Ideally, we also want to aim for not using more than half of the available
memory so that we can run a pair of bin and render programs in parallel.
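
The compile-time part of this, reducing the dispatch width until the output fits, can be sketched roughly as below. The function name and the assumption that the VPM requirement scales linearly with the dispatch width are illustrative only. The list of supported widths (16, 8, 4, 1) is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: pick the widest supported geometry shader
 * dispatch width whose VPM output requirement fits in max_sectors.
 * Assumes the requirement is sectors_per_lane * width, which is a
 * simplification. Returns 0 if not even 1-way dispatch fits. */
static uint32_t
pick_gs_dispatch_width(uint32_t sectors_per_lane, uint32_t max_sectors)
{
        static const uint32_t widths[] = { 16, 8, 4, 1 };

        assert(sectors_per_lane > 0);

        for (size_t i = 0; i < sizeof(widths) / sizeof(widths[0]); i++) {
                if (sectors_per_lane * widths[i] <= max_sectors)
                        return widths[i];
        }
        return 0;
}
```

Under this model, aiming for half of the available memory at draw time would amount to calling the same function with `max_sectors` halved.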

v2: fixed language in comment and typo in commit log. (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
76f4c83815 v3d: add 1-way SIMD packing definition
According to the documentation, the 1-way dispatch width is only supported
with geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
4f5fbd6490 v3d: implement geometry shader instancing
v2:
 - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
0934bd4460 v3d: fix packet descriptions for geometry and tessellation shaders
Every code address starts at bit 3 (addresses must be 64-bit aligned),
with the first 3 bits used to specify threading and NaN propagation
parameters for the shader program.

We generally skip "reserved" bits, however, doing this when the
reserved field is the last in a struct and it is large enough can
make us compute incorrect (smaller) struct sizes which can
lead to corrupt CLs. In particular, the "Tess/Geom Common Params"
struct has a reserved field at the end that is 8-bit, so if we
don't include this we compute a packet size that is 1 byte smaller
than it should be, making the next packet we emit start 1 byte
earlier and therefore leading to incorrect CL data from that point
forward.
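
The size problem can be illustrated with a small sketch. The field table below is hypothetical (a 24-bit payload followed by an 8-bit reserved field) and `packet_size_bytes` is not a real genxml function. It only shows how skipping a trailing reserved field shrinks the computed size by one byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one packet field. */
struct packet_field {
        uint32_t start_bit;
        uint32_t size_bits;
        bool reserved;
};

/* Compute the packet size from the highest bit covered by any field.
 * If skip_reserved is set, reserved fields are ignored, which gives a
 * too-small size when the last field of the packet is reserved. */
static uint32_t
packet_size_bytes(const struct packet_field *fields, size_t count,
                  bool skip_reserved)
{
        uint32_t end_bit = 0;

        assert(fields != NULL || count == 0);

        for (size_t i = 0; i < count; i++) {
                if (skip_reserved && fields[i].reserved)
                        continue;

                const uint32_t end = fields[i].start_bit + fields[i].size_bits;
                if (end > end_bit)
                        end_bit = end;
        }

        /* Round up to whole bytes. */
        return (end_bit + 7) / 8;
}
```

With the two-field layout described above, ignoring reserved fields yields 3 bytes while counting them yields 4, so every packet emitted afterwards would start one byte too early.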

The name of one of the fields was not correct.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
5d578c27ce v3d: add initial compiler plumbing for geometry shaders
Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.
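
As a rough sketch of that layout, with offsets counted in 32-bit words: the helper names are made up for illustration, and the assumption that every vertex occupies a fixed number of words packed back to back is mine, not something the message states.

```c
#include <assert.h>
#include <stdint.h>

/* Word 0 holds the general header. */
#define GS_GENERAL_HEADER_WORDS 1u

/* Offset (in 32-bit words) of the header slot for a given vertex. */
static uint32_t
gs_vertex_header_offset(uint32_t vertex)
{
        return GS_GENERAL_HEADER_WORDS + vertex;
}

/* Offset (in 32-bit words) of the data for a given vertex. Assumes a
 * fixed words_per_vertex and contiguous packing after the headers. */
static uint32_t
gs_vertex_data_offset(uint32_t max_vertices, uint32_t vertex,
                      uint32_t words_per_vertex)
{
        assert(vertex < max_vertices);
        return GS_GENERAL_HEADER_WORDS + max_vertices +
               vertex * words_per_vertex;
}
```

For a shader that can emit up to 4 vertices, the headers sit in words 1 to 4 and the data for vertex 0 starts at word 5 under these assumptions.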

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO lowering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
f63750accf v3d: remove unused variable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
52cbef0039 v3d: enable debug options for geometry shader dumps
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
d6b0786a38 v3d: add debug assert
While lowering vpm outputs we look for the NIR variables matching
particular store output instructions and we expect to find a match,
so assert on that.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
6e68f74395 v3d: add missing plumbing for VPM load instructions
We will need to use LDVPMG_IN specifically to read VPM inputs
in geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Dylan Baker
d0eebda990 meson/broadcom: libbroadcom_cle also needs zlib
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-11 00:49:44 +00:00
Dylan Baker
85a9698ac3 meson/broadcom: libbroadcom_cle needs expat headers
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-10 10:48:38 -08:00
Eric Anholt
8afab607ac nir: Add a scheduler pass to reduce maximum register pressure.
This is similar to a scheduler I've written for vc4 and i965, but this
time written at the NIR level so that hopefully it's reusable.  A notable
new feature it has is Goodman/Hsu's heuristic of "once we've started
processing the uses of a value, prioritize processing the rest of their
uses", which should help avoid the heuristic otherwise making such
systematically bad choices around getting texture results consumed.

Results for v3d:

total instructions in shared programs: 6497588 -> 6518242 (0.32%)
total threads in shared programs: 154000 -> 152828 (-0.76%)
total uniforms in shared programs: 2119629 -> 2068681 (-2.40%)
total spills in shared programs: 4984 -> 472 (-90.53%)
total fills in shared programs: 6418 -> 1546 (-75.91%)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v2)

v2: Use the DAG datastructure, fold in the scheduling-for-parallelism
    patch, include SSA defs in live values so we can switch to bottom-up
    if we want.
v3: Squash in improvements from Alejandro Piñeiro for getting V3D to
    successfully register allocate on GLES3.1 dEQP.  Make sure that
    discards don't move after store_output.  Comment spelling fix.
2019-11-25 21:12:21 +00:00
Alejandro Piñeiro
b4bc59e37e v3d: adds an extra MOV for any sig.ld*
Specifically when we are in non-uniform control flow, as we would need
to set the condition for the last instruction. If (for example) an
image atomic load stores its value directly in a NIR register,
last_inst would be a nop, and setting the condition on it would fail.

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

v2: (Changes suggested by Eric Anholt)
   * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of
     them have the same restriction.
   * Update comment explaining why we add a MOV in that case
   * Tweak commit message.

v3:
   * Drop extra set of parens (Eric)
   * Add missing ld signal to is_ld_signal to fix shader-db regression.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-20 11:21:16 +01:00
Jose Maria Casanova Crespo
d983055184 v3d: Fix predication with atomic image operations
Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-20 11:20:55 +01:00
Eric Anholt
882ca6dfb0 util: Move gallium's PIPE_FORMAT utils to /util/format/
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium.  Since u_format used
util_copy_rect(), I moved that in there, too.

I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.

Closes: #1905
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 10:47:20 -08:00
Iago Toral Quiroga
e7e501efce v3d: rename vertex shader key (num)_fs_inputs fields
Until now this made sense because we always paired vertex shaders
with fragment shaders, but as soon as we implement geometry and
tessellation shaders that will no longer be the case, so rename
this to (num_)used_outputs.

v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-31 08:46:35 +00:00
Timothy Arceri
7f106a2b5d util: rename list_empty() to list_is_empty()
This makes it clear that it's a boolean test and not an action
(eg. "empty the list").

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Eric Engestrom
0581a86753 v3d: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:10:31 +00:00
Erik Faye-Lund
65328bd32d Revert "v3d: do not report alpha-test as supported"
This reverts commit 9d0523b569.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:55 +02:00
Rhys Perry
8b98d0954e nir/lower_idiv: add new llvm-based path
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Iago Toral Quiroga
2d8b51ea4d broadcom: document known hardware issues for L2T flush command
Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Iago Toral Quiroga
46182fc1da v3d: add new flag dirty TMU cache at v3d_compiler
The flag is set for any TMU write, both for spills and for general TMU
accesses. It is then used as part of v3d_emit_gl_shader_state later.

v2: add a new flag at v3d_compiler instead of dirtying the flag
    at v3dx if there is any spill (change suggested by Eric, added by
    Alejandro)

v3: set this for anything that is not a load and do it also in
    v3d40_vir_emit_image_load_store (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Erik Faye-Lund
9d0523b569 v3d: do not report alpha-test as supported
This triggers lowering in the state-tracker, which makes things a bit
simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
71c0dcf266 nir: support feeding state to nir_lower_clip_[vg]s
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
eb3047c094 nir: support lowering clipdist to arrays
This allows us to make sure clipdist is emitted as a scalar array rather
than two vec4s. This matches SPIR-V semantics, and will be useful for
Zink.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
6d7e02e37d nir: allow passing alpha-ref state to lowering-code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Marek Olšák
cebc38ff60 nir: add nir_shader_compiler_options::lower_to_scalar
This will replace PIPE_SHADER_CAP_SCALAR_ISA.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:18 -04:00
Eric Anholt
ca1aa5d225 v3d: Enable the late algebraic optimizations to get real subs.
This worked better than my original v3d-local pass for just subs, and is a
huge win over not producing subs.

total instructions in shared programs: 6408469 -> 6167932 (-3.75%)
total threads in shared programs: 153784 -> 154104 (0.21%)
total uniforms in shared programs: 2157078 -> 1905823 (-11.65%)
total max-temps in shared programs: 904546 -> 895796 (-0.97%)
total spills in shared programs: 4959 -> 4993 (0.69%)
total fills in shared programs: 6558 -> 6670 (1.71%)
total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%)
total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Kenneth Graunke
a733423da5 broadcom/genxml: Stop manually scrubbing 'α' -> "alpha"
'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-23 20:24:54 +00:00
Vasily Khoruzhick
9367d2ca37 nir: allow specifying filter callback in lower_alu_to_scalar
A set of opcodes doesn't have enough flexibility in certain cases. E.g.
the Utgard PP has a vector conditional select operation, but the condition
is always scalar. Lowering all the vector selects to scalar increases the
instruction count, so we need a way to filter only those ops that can't be
handled in hardware.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-06 01:51:28 +00:00
Jose Maria Casanova Crespo
a5df0fa0b1 v3d: writes to magic registers aren't RF writes after THREND
Shaders must not attempt to write to the register files in the last
three instructions, but that doesn't include the magic registers:

nop                  ; nop               ; thrsw; ldtmu.- *** ERROR ***
nop                  ; nop
nop                  ; nop
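
The relaxed rule can be sketched as a check over the final three instructions. The struct and function names here are hypothetical and far simpler than the real QPU validator. The sketch only encodes the distinction made above: a write to a magic register is not a register file write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what one instruction writes. */
struct inst_write {
        bool has_write;      /* the instruction writes some destination */
        bool dest_is_magic;  /* the destination is a magic register */
};

/* Returns true if none of the last three instructions writes to the
 * physical register file. Writes to magic registers are allowed. */
static bool
last_three_avoid_rf_writes(const struct inst_write *insts, size_t count)
{
        assert(insts != NULL || count == 0);

        const size_t start = count > 3 ? count - 3 : 0;

        for (size_t i = start; i < count; i++) {
                if (insts[i].has_write && !insts[i].dest_is_magic)
                        return false;
        }
        return true;
}
```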

v2: Simplify validation rules. (Eric Anholt)
v3: Adjust validation even more. (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-05 22:54:13 +01:00
Connor Abbott
dcc64fcfed nir: Fix num_ssbos when lowering atomic counters
Otherwise it's impossible to know the maximum SSBO index for both
internal TGSI shaders from TTN (which don't have any notion of atomic
counters and no offset) as well as shaders from GLSL.

I fixed everything I could find while grepping for num_ssbos and
num_abos, which hopefully is everything (iris was the only user I could
find that uses it in a meaningful way).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-03 15:54:54 +02:00
Jason Ekstrand
c9a4793de8 v3d: Use the correct opcodes for signed image min/max
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-21 17:19:55 +00:00
Jason Ekstrand
951cf94521 nir: Add explicit signs to image min/max intrinsics
This better matches all the other atomic intrinsics such as those for
SSBOs and shared variables where the sign is part of the intrinsic
opcode.  Both generators (GLSL and SPIR-V) know the sign from the type
of the image variable or handle.  In SPIR-V, signed min/max are separate
opcodes from unsigned.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-21 17:19:55 +00:00
Iago Toral Quiroga
3539bd63dd v3d: clamp gl_PointSize to a minimum of 1.0
The OpenGL ES spec requires that the value of gl_PointSize is clamped
to an implementation-dependent range matching what is advertised by
GL_ALIASED_POINT_SIZE_RANGE. For V3D this is [1.0, 512.0], but the
hardware won't clamp to the minimum side of the range and won't render
points with a size strictly smaller than 1.0 either, so we need to
clamp manually. For points larger than the maximum size of the range
the hardware clamps automatically.
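
In scalar terms the manual clamp amounts to the following. This is only a sketch of the arithmetic; the actual change operates on compiler IR rather than on a C float, and the names are illustrative.

```c
#include <assert.h>

/* Lower bound of the point size range quoted above for V3D. */
#define POINT_SIZE_MIN 1.0f

/* Clamp only the lower bound. The upper bound is left alone because
 * the hardware is described as clamping large sizes by itself. */
static float
clamp_point_size_min(float size)
{
        const float clamped = size < POINT_SIZE_MIN ? POINT_SIZE_MIN : size;

        assert(clamped >= POINT_SIZE_MIN || clamped != clamped /* NaN */);
        return clamped;
}
```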

Fixes piglit test:
spec/!opengl 2.0/vs-point_size-zero

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 09:44:54 +02:00
Iago Toral Quiroga
62e0ca3064 v3d: line length style fixes
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:38:19 +02:00
Iago Toral Quiroga
99e9809cab v3d: honor the write mask on store operations
v2:
  - Fix incremental update of the const offset when we need to emit a sequence
    with more than one write because of the writemask.
  - Do not move the tmu write emission to a separate helper.

v3:
  - Get the store writemask before the loop, use ffs to get the first component
    to write and clear writemask bits as we process the components (Eric).
  - Simplified the code that figured out the number of components for the TMU
    config based on the number of tmu writes for stores and atomics.

v4:
  - Code clean-ups (Eric).
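
The writemask handling described in v3 can be sketched as a loop that splits the mask into runs of consecutive components, one write sequence per run. The types and names are hypothetical and the sketch leaves out the actual TMU emission. It uses POSIX `ffs()` from `<strings.h>` and assumes at most four components.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h>

/* One write sequence: a run of consecutive enabled components. */
struct write_seq {
        uint32_t first_component;
        uint32_t count;
};

/* Split writemask into runs of consecutive set bits. seqs must have
 * room for at least 2 entries (the maximum for 4 components).
 * Returns the number of sequences written. */
static size_t
split_writemask(uint32_t writemask, uint32_t num_components,
                struct write_seq *seqs)
{
        size_t n = 0;

        assert(num_components >= 1 && num_components <= 4);
        writemask &= (1u << num_components) - 1;

        while (writemask != 0) {
                /* ffs() is 1-based, so subtract 1 for the bit index. */
                const uint32_t first = (uint32_t)ffs((int)writemask) - 1;
                uint32_t i = first;

                /* Consume components until the first gap in the mask,
                 * clearing bits as they are processed. */
                while (i < num_components && (writemask & (1u << i))) {
                        writemask &= ~(1u << i);
                        i++;
                }

                seqs[n].first_component = first;
                seqs[n].count = i - first;
                n++;
        }
        return n;
}
```

A mask of 0xb (components x, y and w) produces two sequences, one starting at component 0 with two writes and one starting at component 3 with a single write. In a sketch like this, the offset of each sequence would be advanced by the size of the skipped components.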

Fixes:
KHR-GLES31.core.shader_image_load_store.advanced-cast-cs
KHR-GLES31.core.shader_image_load_store.advanced-cast-fs
KHR-GLES31.core.shader_storage_buffer_object.advanced-switchBuffers-cs
KHR-GLES31.core.shader_storage_buffer_object.advanced-switchPrograms-cs
KHR-GLES31.core.shader_storage_buffer_object.basic-operations-case1-cs

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:38:19 +02:00
Iago Toral Quiroga
3d65d2a488 v3d: refactor ntq_emit_tmu_general() slightly
When we implement write masks on store operations we might need to
emit multiple write sequences for a given store intrinsic. To make
that easier, let's split the emission of the tmud instructions to
their own block after we are done with the code that only needs to
run once no matter how many write sequences we need to emit.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:38:19 +02:00