This has the downside of putting block successor validation in two
places that are a bit further apart. However, handling them as a
special case makes the code more confusing than needed. At least two
different people have not noticed that we don't have jump instruction
validation in the last week or two and added it. Being able to search
for validate_jump_instr is useful.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
For 16-bit bank LDS (ie. Kabini/Stoney) we need a slightly different
path. It's completely untested though because I don't have these
chips but according to vkpipeline-db the generated assembly seems fine.
Note that 16-bit I/O is currently only exposed on GFX9+ for both
compiler backends.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
This adds a separate emission path in the assembly for the 16-bit
interp instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
16-bit interp instructions are considered VINTRP by the compiler
but they are emitted as VOP3 by the assembler.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
We only have to adjust some assertions to allow storing/loading
16-bit values.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
On EGL 1.4, one had to check for the existence of EGL_EXT_platform_base
before querying the eglGetPlatformDisplayEXT() and
eglCreatePlatformWindowSurfaceEXT() symbols, to then use them if the
EGL_EXT_platform_* extension for the given platform was exposed.
Since EGL 1.5, the platform functionality was made core, which means we
can obtain the symbols unconditionally, but we can't know the EGL
version before having created a display, at which point we've already
done a platform selection by passing an EGLNativeDisplay. The
EGL_KHR_platform_* extensions thus are used by clients to know whether
it's safe or not to dlsym() the EGL 1.5 symbols.
This commit adds those extensions when the given platform is enabled.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5052>
We create a hash table mapping GPU va's to mmap structures, such that
searching for a mapped address is effectively O(1) rather than O(N) to
the number of mapped entries as with the previous linked list approach.
This is a memory-time tradeoff, but the speed-up is tracing is notable.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5099>
For cases where instructions have a src and/or dst type, validate that
it matches the src/dst register types. And for cases where there are
different opcodes for half vs full, validate that the opcode matches.
Now that we maintain this properly throughout the stages of the ir, we
can drop the fixups from the RA pass.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
Add some helpers to properly maintain src/dst types, and in the cases
where opcode depends on src or dst type, maintain that as well.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
When we start doing cp iteratively, we hit the case that we've already
`cmps.s.*` into a `cmps.s.ne p0.x, ...`.. when we try to do that again
we can invert the logic condition. So check specifically the condition
to prevent this.
TODO we could maybe be more clever about this to combine conditions.
But why isn't that happening in nir? For example, see
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.bool
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
There can be multiple (for ex.) f32f16's from a single source, in
particular appearing in different blocks. We need to update all uses
of the src which had conversion folded in, not all the uses of the
individual cov. Also, to avoid invalidating the ssa use info that was
gathered at the beginning of the pass, don't actually eliminate the
cov, but instead change it to a simple mov that the cp pass can gobble
up.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
We have to fixup the meta:split half flag, because `ir3_split_dest()` is
called before we fixup the dest type. But we should fixup both the
split src and dest, as well as the thing it is splitting.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
If we're inserting a mov to resolve a conflict between meta:collect's
(ie. for .zyx type swizzles, etc), we should use the correct precision.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
It does pick up a few more cf/cp opportunities, according to sharder-db.
But don't think it will be measurable.
But this will allow some future simplification to cp by pulling out it's
internal iteration.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
For a6xx, since we use same VBO state for binning and VS, we need to
preserve potentially unused inputs. This needs to be done before DCE.
So move it before we add earlier DCE passes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
Or do the easy thing and claim we always changed something. It is kinda
hard and not worth the effort to determine for real.
Also rip out unused error handling. This pass should never fail. And
we weren't even actually checking the return.
And while we're at it, switch over to taking the 'struct ir3 ir*`
instead of ctx, to standardize with the other passes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
Later when we do this pass iteratively, we can drop some of the internal
iteration and just rely on this pass getting run until there is no more
progress.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
In a later patch, this will get folded into an IR3_PASS() macro, at
least for most passes. But to do that, it is better to standardize
on printing the ir3 after the pass.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
SymbolInfoTy was modified in LLVM 11. It is also in MCDisassembler.h now
and we don't have to duplicate it anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5060>
This was wrong, if anything it should be sorted by device_location, and NIR usually
provides this.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5085>