This enables gl_Layer/gl_ViewportIndex when the ext is enabled, as well
as adding the new gl_ViewportMask[] array and viewport_relative layout
qualifier for gl_Layer.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4529>
I had missed that LDC actually uses vec4 units for its offset. This
means that we have to create a new instruction, and lower it in
ir3_nir_lower_io_offsets, similar to the existing SSBO instructions.
Unfortunately we can't assume that loads are always vec4-aligned, so we
have to use the alignment information that NIR gives us. Unfortunately,
it's currently woefully inadequate, and will have to be fixed to give us
good codegen in the future.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>
These macros produced a lot of errors with ubsan preventing us from
expanding the ubsan coverage on CIs.
C++ spec has such clause:
"If the prvalue of type "pointer to cv1 B" points to a B that is
actually a subobject of an object of type D, the resulting pointer
points to the enclosing object of type D. Otherwise, the result
of the cast is undefined."
Ubsan error example:
../src/compiler/glsl/builtin_functions.cpp:4945:4: runtime error: downcast of address 0x559b926abb50 which does not point to an object of type 'ir_instruction'
0x559b926abb50: note: object has invalid vptr
9b 55 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 58 ba 6a 92 9b 55 00 00 01 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x559b914dbe1a in call ../src/compiler/glsl/builtin_functions.cpp:4945
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4129>
Add new builtin parameters that are used to keep track of the group
size. This will be used to implement ARB_compute_variable_group_size.
The compiler will use the maximum group size supported to pick a
suitable SIMD variant. A later improvement will be to keep all SIMD
variants (like FS) so the driver can select the best one at dispatch
time.
When variable workgroup size is used, the small workgroup optimization
is disabled as it we can't prove at compile time that the barriers
won't be needed.
Extracted from original i965 patch with additional changes by
Caio Marcelo de Oliveira Filho.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
nir_cf_{extract,reinsert}() can't stitch a block together
if the block we are extracting ends in a jump but other jumps
nested in further ifs should be fine to move.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4477>
If a nir_variable is tagged with per_view, it must be an array with
size corresponding to the number of views. For slot-tracking, it is
considered to take just the slot for a single element -- drivers will
take care of expanding this appropriately.
This will be used to implement the ability of having per-view position
in a vertex shader in Intel platforms.
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
We've had alignment parameters on these operations for a while but a
bunch of places weren't setting them. That should be resolved now so we
can start validating that they're always set.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4441>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4441>
This commit completely rewrites the way we extract a structured CFG from
SPIR-V. The new approach is different in a few ways:
1. It does a breadth-first search instead of depth-first. This means
that we've visited the merge node for a construct before we visit
any of the nodes inside the construct. This makes it easier to
validate things like loop and switch nesting.
2. We record more information in the CFG. Earlier commits added a
parent pointer to vtn_cf_node but we now record all of the merge and
other special blocks for each CFG node. This lets us validate
things more precisely.
3. It makes heavy use of merge blocks for walking the CFG. Previously,
we sort of used them as hints for trying to guess the CFG structure
but things got dicey whenever a merge was missing. We had some
heuristics for how to handle short-circuiting if statements but it
was a bunch of special cases.
Now, we make them a fundamental part of walking the CFG. When we
encounter a control-flow construct, we add the body components of
the construct to the BFS work list and then jump to the merge block
if one exists to continue scanning the current CFG nesting level.
If no merge block exists, we assume that means that control-flow
never re-converges in a normal way and that the only way to get back
to normality is with a direct jump such as a loop break or continue.
This should make things far more robust when trying to deal with the
more creative placement (or lack thereof) of merge instructions.
Reviewed-by: Alan Baker <alanbaker@google.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
These were clearly copied and pasted from SSBOs. The shared atomics
don't have an SSBO index so their offset is src0 and data is src1.
Fixes: ce9205c03b "nir: add a load/store vectorization pass"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
The use of vector.end()[-1] seems to generate warnings in Coverity about
not allowing a negative argument to a parameter. The intention with the
code snippet is just to access the last element of the vector. The
vector.back() call acheives the same thing, is clearer and will
hopefully fix the Coverity warning.
I’m not exactly sure why Coverity thinks the array index can’t be
negative. cplusplus.com says that vector::end() returns a random access
iterator and that the type of the array index operator argument to that
should be the difference type for the container. It then also says that
difference_type for a vector is "a signed integral type".
Reviewed-by: Eric Anholt <eric@anholt.net>
The algorithm we use for resolving parallel copy instructions plays this
little shell game with the values. The reason for this is that it lets
us handle cases where, for instance we have a -> b and b -> a and we
need to use a temporary to do a swap. One result of this algorithm is
that it tends to emit a lot of mov chains which are typcially really bad
for GPUs where a mov is far from free. For instance, it's likely to
turn this:
r16 = ssa_0; r17 = ssa_0; r18 = ssa_0; r15 = ssa_0
into this:
r15 = mov ssa_0
r18 = mov r15
r17 = mov r18
r16 = mov r17
which, if it's the only thing in a block (this is common for phis) is
impossible for a scheduler to fix because of the dependencies and you
end up with significant stalling. If, on the other hand, we only do the
chaining in the actual case where we need to free up a so that it can be
used as a destination, we can emit this:
r15 = mov ssa_0
r18 = mov ssa_0
r17 = mov ssa_0
r16 = mov ssa_0
which is far nicer to the scheduler. On Intel, our copy propagation
pass will undo the chain for us so this has no shader-db impact.
However, for less intelligent back-ends, it's probably a lot better.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4412>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4412>
The placement of new shader_info.tess members unnecessarily wastes
space by interspersing 64bit members between bitfields.
Fixes: f1dd81ae10 ("nir: Collect if shader uses cross-invocation or indirect I/O.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
If the shader is not a tesselation shader, then writing to the tess
member of the shaderinfo union will overwrite other members and crash.
Closes: #2722
Fixes: f1dd81ae10 ("nir: Collect if shader uses cross-invocation or indirect I/O.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
If the expression tree that is being replaced has a unary operation at
its root, set the cursor (location where new instructions are inserted)
at the source instruction instead.
This doesn't do much now because there are very few patterns that have a
unary operation as the root. Almost all of the patterns that do have a
unary operation as the root have inot. All of the shaders that are
affected by this commit have expression trees with an inot at the root.
This change prevents some significant, spurious caused by the next
commit. There is further explanation in the large comment added in
the code.
I also considered a couple other options that may still be worth exploring.
1. Add some mark-up to the search pattern to denote where new
instructions should be added. I considered using "@" to denote the
cursor location. For example,
(('fneg', ('fadd@', a, b)), ...)
2. To prevent other kinds of unintended code motion, add the ability to
name expressions in the search pattern so that they can be reused in
the replacement. For example,
(('bcsel', ('ige', ('find_lsb=b', a), 0), ('find_lsb', a), -1), b),
An alternative would be to add some kind of CSE at the time of
inserting the replacements. Create a new instruction, then check to
see if it already exists. That option might be better overall.
Over the years I know Matt has heard me complain, "I added a pattern
that just deleted an instruction, but it added a bunch of spills!" This
was always in large, complex shaders that are very hard to analyze. I
always blamed these cases on the scheduler being dumb. I am now very
suspicious that unintended code motion was the real problem.
All Gen4+ Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611405 -> 17611333 (<.01%)
instructions in affected programs: 18613 -> 18541 (-0.39%)
helped: 41
HURT: 13
helped stats (abs) min: 1 max: 18 x̄: 4.46 x̃: 4
helped stats (rel) min: 0.27% max: 5.68% x̄: 1.29% x̃: 1.34%
HURT stats (abs) min: 1 max: 20 x̄: 8.54 x̃: 7
HURT stats (rel) min: 0.30% max: 4.20% x̄: 2.15% x̃: 2.38%
95% mean confidence interval for instructions value: -3.29 0.63
95% mean confidence interval for instructions %-change: -0.95% 0.02%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 338366118 -> 338365223 (<.01%)
cycles in affected programs: 257889 -> 256994 (-0.35%)
helped: 42
HURT: 15
helped stats (abs) min: 2 max: 120 x̄: 39.38 x̃: 34
helped stats (rel) min: 0.04% max: 2.55% x̄: 0.86% x̃: 0.76%
HURT stats (abs) min: 6 max: 204 x̄: 50.60 x̃: 34
HURT stats (rel) min: 0.11% max: 4.75% x̄: 1.12% x̃: 0.56%
95% mean confidence interval for cycles value: -30.39 -1.02
95% mean confidence interval for cycles %-change: -0.66% -0.02%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
Because the types etc. are required to logically match, we can just
copy-propagate the guts of the vtn_value. This was causing issues with
some new CTS tests that are doing an OpCopyObject of a sampler which is
a special-cased type in spirv_to_nir. Of course, this is only a partial
solution. Ideally, we've got a bit of work to do to make all the
composite stuff able to handle all types including images, sampler, and
combined image/samplers but this gets some CTS tests passing.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>