We'll want to unify this with main copy prop (and extend to varyings),
but that'll take more care to handle some special cases, so leave it as
a stub pass for now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This extends copy propagation to respect output modifiers for ALU
instructions, as well as potentially fixing some bugs related to looping
(all dEQP loop tests pass).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
OpenGL 4.6 Spec:
"5.3.3 Rules
.......
Note: “Updates” via rendering or transform feedback
are treated consistently with updates via GL commands.
Once EndTransformFeedback has been issued, any subsequent
command in the same context that uses the results of the
transform feedback operation will see the results."
v2: removed a wrong comment
( Kenneth Graunke <kenneth@whitecape.org> )
v3: - flush+dirty depends on buffers usage history
- removed an old hack
( Kenneth Graunke <kenneth@whitecape.org> )
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110404
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Just enable it during init_render_context on Gen10+, and move the
Gen9 state tracking into iris_genx_state so it only exists on Gen9.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Needed to track context rolls caused by streamout and ACQUIRE_MEM.
ACQUIRE_MEM can occur outside of draw calls.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110355
v2: squashed patches and done more rework
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
DRI driver loadable modules are always installed with
install_megadriver.py with names ending with '.so', irrespective of
platform.
Force the name the loadable module is built with to match, so
install_megadriver.py doesn't spin trying to remove non-existent
symlinks.
Fixes: c77acc3c "meson: remove meson-created megadrivers symlinks"
Force the driver thread to sync immediately with a compiler thread (but
compilation still happens in a separate thread).
This can be useful to simplify debugging compiler issues.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Enabling this option will create ddebug-style dumps for the aux context,
except that instead of intercepting the pipe_context layer
we just dump the IB contents on flush.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Due to asynchronous execution, it's not clear which of the draws the state
may refer to.
This also works around an issue encountered with radeonsi where dumping
the driver state itself caused a hang.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Move the definition of radeonsi_clear_db_cache_before_clear there,
as well as radeonsi_enable_nir.
This removes the AMD_DEBUG=nir option.
We currently still have two places for options: the driconf machinery
and AMD_DEBUG/R600_DEBUG. If we are to have a single place for options,
then the driconf machinery should be preferred since it's more flexible.
The only downside of the driconf machinery was that adding new options
was quite inconvenient. With this change, a simple boolean option can
be added with a single line of code, same as for AMD_DEBUG.
One technical limitation of this particular implementation is that while
almost all driconf features are available, the translation machinery doesn't
pick up the description strings for options added in si_debvug_options. In
practice, translations haven't been provided anyway, and this is intended
for developer options, so I'm not too worried. It could always be added
later if anybody really cares.
v2:
- use bool instead of uint8_t for options
- si_debug_options.inc -> si_debug_options.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
need_cs_space may clear the buffer list.
Fixes: 951d60f8cd "radeonsi: delay adding BOs at the beginning of IBs until the first draw"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The CTS fails on
dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.*vertex
when they are enabled, due to the VS being run for both bin and render. I
think this behavior is expected to be valid, but I can't find text in
atomic counters or SSBO specs saying so (the closed I found was in
shader_image_load_store). Just disable it for now, since the closed
source driver doesn't expose vertex atomic counters/SSBOs either.
this automatically enables preemption on gen10 where it is disabled by
default but still available
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
this is basically just porting the following two commits to gallium:
d8b50e152a5c454661c6resolveskwg/mesa#49
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
We create two new helpers, iris_flush_bits_for_history, and
iris_dirty_for_history, then use them in the existing function.
The first accumulates flush bits based on res->bind_history, but doesn't
actually perform a flush. This allows us to accumulate flush bits by
looping over multiple resources, but ultimately emit a single flush for
all of them.
The latter flags dirty bits without flushing, which again allows us to
handle multiple resources, but also is more convenient when writing from
the CPU where we don't need a flush (as in commit 4d12236072).
This inserts a handle for the flink name and a handle the correct
gem handle for the bo.
v2: fix handles/names confusion (Lepton Wu)
v3: set flink name correctly (Lepton Wu)
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
This code is outdated and unused; now that the compiler is mature,
there's no point keeping it around in-tree (or at all).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This code is stable and can live upstream independently while the rest
of the Bifrost stack comes up.
v2: Added a verbose flag to hide away some of the more verbose features
that nobody really needs
[The Bifrost disassembler is written by Connor Abbott, Lyude Paul, and
Ryan Houdek.]
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
We create an all-encompassing opcode table for handling name and
properties, removing a number of ad hoc opcode tables which became
brittle and quickly out of date. While we're at it, we fix some
incorrect opcodes relating to ball/bany, and move a small function out
to midgard_compile.c. Together these changes should allow compilation
without warnings, along with helping the codebase health considerably.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Most copy prop should occur at the NIR level, but we generate a fair
number of moves implicitly ourselves, etc... long story short, it's a
net win to also do simple copy prop + DCE on the MIR. As a bonus, this
fixes the weird imov precision bug once and for good, I think.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
For floating point ops, these bits determine the "negate?" and "abs?"
modifiers. For integer ops, it turns out they control how sign/zero
extension work, useful for mixing types.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
In the future, we might want to switch to a table-based approach, but
for now, at least have it current.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
We reshuffle the existing "dead move elimination" pass into a generic
dead code elimination layer, fixing bugs incurred with looping in the
process.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
The bug this worked around is no longer applicable, it seems -- remove
the hack that breaks more than it fixes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
The hardware needs this lowered anyway; for now, might as well use
mesa's default lowering for pure conformance reasons.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>