Now that all the other kinds were added, all transforms to SEND will
come from non-BASE kinds, so we don't need overallocate for BASE
instructions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
This kind of instruction doesn't have a special struct but
will still be always allocated so that it can be lowered to SEND.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Fixed the types in brw_inst::bits so the struct is packed correctly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Move all the SEND specific fields from brw_inst into brw_send_inst.
This new instruction kind will contain all variants of SENDs plus the
virtual opcodes that were already relying on those SEND fields.
Use the `as_send()` helper to go from a brw_inst into the brw_send_inst
when applicable. Some of the code was changed to use the brw_send_inst
type directly.
Until other kinds are added, all the instructions are allocated the same
amount of space as brw_send_inst. This ensures that all
brw_transform_inst() calls are still valid. This will change after
a few patches so that BASE instructions can use less memory.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Prepare code for supporting subclasses of brw_inst for certain
specialized kinds of instructions. This will allow
- Move certain fields from brw_inst to the specialized one,
reducing its size and making it easy to understand what applies
to which instruction;
- Move certain control sources into the specialized inst type, which
currently take a full brw_reg to encode small integers. Reducing
the overall sources we walk and care also might help the code
in general.
Next commits will add the new instruction kinds.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
The new function takes care of changing an instruction opcode and sources,
which will allow later patches to tweak how allocations are done in
those cases. Like the instruction allocation, this also takes a shader
(or a builder, for it to get a shader).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Flatten all the work being done into brw_new_inst() and
brw_clone_inst() and allocate both the instruction and
the sources in one swoop.
For now we still keep a pointer to the array instead of
declaring an array as last element to still allow growing
the array -- which is used by the compiler in a few places.
This commit removes the constructors for brw_inst, the idea
is that the instructions are managed by the brw_shader, so
we always go through it for new ones.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
In the few cases we have to _increase_ the number of sources, the new
code will not attempt to recollect the memory, i.e. it delays freeing
the old smaller one source array. For the instructions that may need
this (when making a SEND into a SEND_GATHER), this is not expected to
happen more than once.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Add and use brw_new_inst() and brw_clone_inst() and do not use stack
allocated brw_insts. The builder was changed to not use the temporary
ones either.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
The problem with the current code is that there is a disconnect between :
- the virtual register size allocated
- the dispatch size
- the size_written value
Only the last 2 are in sync and this confuses the spiller that only
looks at the destination register allocation & dispatch size to figure
out how much to spill.
The solution in this change is to make BROADCAST more like
MOV_INDIRECT, so that you can do a BROADCAST(8) that actually reads a
SIMD32 register. We put the size of the register read into src2.
Now the spiller sees correct read/write sizes just looking at the
destination register & dispatch size.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 662339a2ff ("brw/build: Use SIMD8 temporaries in emit_uniformize")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13614
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36564>
The logical send lowering already resolves sources when constructing
the send payload, so prior to that lowering, we don't need to apply
any special restrictions here.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
The logical send lowering already resolves sources when constructing
the send payload, so prior to that lowering, we don't need to apply
any special restrictions here.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
The goal here was to avoid propagating source modifiers, unusual
regions, and other things that couldn't be used as a send source.
A few patches ago ("brw: Properly resolve non-sendable sources in a few
logical opcodes") we fixed the logical send lowering to handle these
by resolving them when constructing the send payload. So now prior
to lowering, we don't need to treat these opcodes specially.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
This introduces enums for SHADER_OPCODE_SEND[_GATHER] sources, similar
similar to what we've done for most of the newer logical opcodes. This
allows us to use actual names for sources rather than remembering their
order, or leaving ourselves comments like /* ex_desc */ all over. It
will also make it easier to add or reorder sources in the future.
While we're at it, we also standardize on the number of sources.
Previously, we allowed SHADER_OPCODE_SEND to have either 3 (monosend) or
4 (split send) sources, but this is mostly for haphazard historical
reasons. We now specify all sources every time, eliminating the need
for careful inst->source checks before accessing the last source.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
The is_send() helper is just a wrapper around inst->is_send_from_grf()
now, so we can combine the two. Trim the name from is_send_from_grf()
to is_send(), as it's shorter, and also matches is_math().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
The GLSL spec says (among other things):
"When a volatile variable is read, its value must be re-fetched from
the underlying memory, even if the shader invocation performing the
read had previously fetched its value from the same memory. When a
volatile variable is written, its value must be written to the
underlying memory, even if the compiler can conclusively determine
that its value will be overwritten by a subsequent write."
The SPIR-V spec says (among other things):
"Accesses to volatile memory cannot be eliminated, duplicated, or
combined with other accesses."
So in this commit we make sure that both writes and reads marked as
volatile can't be affected by CSE.
v2: Reorder patches in the series.
Credits-to: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Iván Briano <ivan.briano@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>
See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in
And this causes build errors when building for C23:
-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
123 | #define unreachable(str) \
| ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
456 | #define unreachable() (__builtin_unreachable ())
| ^~~~~~~~~~~
-----------------------------------------------------------------------
So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.
Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.
This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.
All the instances of the macro, including the definition, were updated
with the following command line:
git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
while read file; \
do \
sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
done && \
sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
load_reg is something like load_payload except it has a single
source. It copies the entire source to the destination. Its purpose is
to convert a non-SSA VGRF into an SSA value. This copy is marked as
volatile so that it will act as a scheduling barrier.
v2: Fix some typos in the commit message. Eliminate the
brw_builder::LOAD_REG overload that returns a brw_inst*. This is
unlikely to ever be used. Add some checks to brw_validate. All
suggested by Caio.
v3: Force the source and destination types of the LOAD_REG to by
integer. This will (eventually) simplify the creating of unit tests for
the pass that adds LOAD_REG instructions.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
Now that the brw_ip_ranges analysis is being used, there's no
need to track start_ip/end_ips in the blocks as they are mutate. And
also no need to call adjust_block_ips at the end of some passes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
The problem occurs with a series of instructions build the subgroup
invocation value :
mov(8) g23<1>UW 0x76543210V
add(8) g23.8<1>UW g23<8,8,1>UW 0x0008UW
add(16) g23.16<1>UW g23<16,16,1>UW 0x0010UW
Our register spilling code operates on physical registers (64B on
Xe2+) and using the brw_inst::is_partial_write() helper only considers
32B registers. So the spiller doesn't see that the add(16) instruction
is doing a partial write and ends up discarding the previous value.
You can reproduce the issue by running a test like :
INTEL_DEBUG=spill_fs ./deqp-vk -n dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.constant.uint8_uint8.buffer.rowmajor.linear
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: aa494cbacf ("brw: align spilling offsets to physical register sizes")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33642>
I'm not aware of any workloads that will be impacted by this change,
but let's keep our list of control flow instructions complete. A
shader-db run on MTL tells me nothing changes.
v2: "The scheduler relies on HALT not being considered control flow to
be able to move code past HALT instructions. Doing this would prevent
such optimization from happening and would reduce performance
dramatically in some cases." - Francisco.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>
Make the "block after DO" more stable so that adding instructions after
a DO doesn't require repairing the CFG. Use a new SHADER_OPCODE_FLOW
instruction that is a placeholder representing "go to the next block"
and disappears at code generation.
For some context, there are a few facts about how CFG currently works
- Blocks are assumed to not be empty;
- DO is always by itself in a block, i.e. starts and ends a block;
- There are no empty blocks;
- Predicated WHILE and CONTINUE will link to the "block after DO";
- When nesting loops, it is possible that the "block after DO" is
another "DO".
Reasons and further explanations for those are in the brw_cfg.c comments.
What makes this new change useful is that a pass might want to add
instructions between two DO instructions. When that happens, a new
block must be created and any predicated WHILE and CONTINUE must be
repaired.
So, instead of requiring a repair (which has proven to be tricky in
the past), this change adds a block that can be "virtually" empty but
allow instructions to be added without further changes.
One alternative design would be allowing empty blocks, that would be
a deeper change since the blocks are currently assumed to be not empty
in various places. We'll save that for when other changes are made to
the CFG.
The problem described happens in brw_opt_combine_constants, and a
different patch will clean that up.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>
FS_OPCODE_INTERPOLATE_AT_{SAMPLE,SHARED_OFFSET} never have a mlen set.
They are lowered to SHADER_OPCODE_SEND in logical send lowering, at
which point they acquire an mlen, but cease to be those opcodes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
We already have logical pixel interpolator messages that get lowered
to send messages. We can just add an extra boolean source to those
opcodes rather than sticking a opcode-specific boolean in the generic
fs_inst data structure.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
This was originally turned into a separate struct for reuse between vec4
and fs backends, that's not needed anymore.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33334>