Every driver uses the nir_lower_system_values path now.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18327>
v2: drop the hardcoded inst->mlen=1 (Rohan)
v3: Move back to LOAD/STORE messages (limited to SIMD16 for LSC)
v4: Also use 4 GRFs transpose loads for fills (Curro)
v5: Reduce amount of needed register to build per lane offsets (Curro)
Drop some now useless SIMD32 code
Unify unspill code
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>
With the following SEND instruction :
send(1) nullUD nullUD g0UD 0x4200c504 a0.1<0>UD
This instruction although valid but somewhat nonsensical (SEND message
to write at offset contained in NULL register), triggers an error in
the validator.
The restriction is that we cannot have overlapping sources. The
validator not checking the type of register incorrectly thinks that
the null register (offset 0) is the same as g0.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>
We're now able to load up to 8 GRFs in one send.
v2: Switch to use transpose + vector of up to 64 (Thanks Curro!)
v3: Increase parallelism by not reusing the same register for push
constant offset (Curro)
v4: Drop dead ADD() instruction (Curro)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>
Make it clearer we are dealing with multiple patches,
works better in constrast with SINGLE_PATCH.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
For Gen11 and prior, the dispatch mode for TCS was SINGLE_PATCH, and
this debug setting could be used to change it to 8_PATCH (falling back
to SINGLE_PATCH when shader couldn't be in the multi dispatch mode).
However after talking to Ken, seems this debug setting is not really
worth keeping around, so removing it.
For Gen12+ the only option is 8_PATCH, so it was always using that
dispatch mode as before.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
We did not handle the operation with data size < 4. It works fine on
all other messages (global/shared). The initial commit was just too
restrictive.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1e242785c3 ("intel/fs: Implement load/store_scratch on XeHP")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16964>
The selection of the internal opcode to deal with load_scratch is
incorrect.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c643979228 ("intel/fs: Choose memory message type based on bit size")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16964>
Warning messages:
../src/intel/compiler/test_eu_compact.cpp:238:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
../src/intel/compiler/test_eu_compact.cpp:256:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18203>
This will make convenient later to keep track of the urb
handles directly in a Task thread payload struct (to be part of
fs_visitor).
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18188>
Call it now from a fs_visitor member functions instead of the static
ones. This will make convenient later to keep track of the urb
handles directly in a Task thread payload struct (to be part of
fs_visitor).
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18188>
Right now even the simplest mesh test (func.mesh.basic.mesh from crucible) fails like this:
ASSERT: Scalar MESH validation failed!
load_payload(16) vgrf11+0.0:F, vgrf8:D
../../src/intel/compiler/brw_fs_validate.cpp:61: inst->dst.offset / REG_SIZE + regs_written(inst) <= alloc.sizes[inst->dst.nr]
Because we try to load 8 regs with LOAD_PAYLOAD in SIMD16 mode.
Fixes: 349a040f68 ("intel/fs: Make logical URB write instructions more like other logical instructions")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18075>
This backend lowering code has been dead since the removal of i965 -
nothing in the current source tree ever sets the flag.
This is handled by iris_setup_uniforms() and crocus_setup_uniforms().
Variable group size does not appear to be a feature in anv.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18055>
In the early days of NIR, you had to prod at inst->const_index[]
directly, but a long while back, we added handy accessor functions
that let you use the actual name of the thing you want instead of
memorizing the exact order of parameters.
Also rewrite a comment I had a hard time parsing.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18067>
source_root function is deprecated in Meson version 0.56.0, so let's use
instead a current_source_dir() function, available in all Meson
versions. This also allows to deduplicate some code by declaring
commonly used string at the top meson.build file.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17974>
The old version worked fine for SIMD16 instructions but SIMD8
instructions where the destination spans two registers have the same
problem.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17908>
That way we can look at the SBT entries for debug purposes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17908>
Bspec 53421 says:
"A URB fence memory is typically performed prior the thread
exit message, so that the next thread dispatch that reads
that URB memory will see it."
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>
Found by code inspection.
There's an assert later checking that we haven't overflown
this array, so this change probably doesn't matter for any
workload.
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>
This allows the software scoreboarding pass, scheduler, and so on
to handle the individual instructions and handle them, rather than
trusting in the generator to do scoreboarding correctly when expanding
the virtual instruction to multiple actual instructions.
By using SHADER_OPCODE_READ_SR_REG, we also correctly handle the
software scoreboarding workaround when reading DMask/VMask.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17530>
When the compiler pads a data structure, the padded bytes will not be
initialized. Shader keys are compared with memcmp and unitialized
bytes within the structure breaks this mechanism.
Explicitly pad the structures with members, so the compiler is forced
to initialize them. Add a warning to indicate if a change to
alignment in any of the data structures requires additional padding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
Fixes the following EU validation error:
ERROR: Header must be present for all URB messages.
The message header is ignored for URB fence messages, so I doubt that
this actually matters in practice. But we should probably mark it as
present, because you have to send something, and according to the
documentation, there is a message header, it's just ignored.
Fixes: e6a9501aa2 ("intel/fs: Add the URB fence message")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
When this rule started causing issues, I looked it up in the
documentation, and found the rule for 64-bit destinations and
integer DWord multiplication, but there was no mention of floating
point destinations, as the text in brackets suggested. The actual
restriction text had been updated, so this led to some confusion
where I thought the conditions had been changed in newer docs.
However, what's actually going on is that there are two separate
conditions, each listed in separate rows of the table. One lists
64-bit destinations or integer DWord multiplication, and the other
mentions floating-point destinations. In both cases, the actual
restrictions are identical, so we handle them together in the code.
Try to update the comment to avoid future confusion.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
Recently, we started using <1;1,0> register regions for consecutive
channels, rather than the <8;8,1> we've traditionally used, as the
<1;1,0> encoding can be compacted on XeHP. Since then, one of the
EU validator rules has been flagging tons of instructions as errors:
mov(16) g114<1>F g112<1,1,0>UD { align1 1H I@2 compacted };
ERROR: Register Regioning patterns where register data bit locations are changed between source and destination are not supported except for broadcast of a scalar.
Our code for this restriction checked three things:
#1: vstride != width * hstride ||
#2: src_stride != dst_stride ||
#3: subreg != dst_subreg
Destination regions are always linear (no replicated values, nor
any overlapping components), as they only have hstride. Rule #1 is
requiring that the source region be linear as well. Rules #2-3 are
straightforward: the subregister must match (for the first channel to
line up), and the source/destination strides must match (for any
subsequent channels to line up).
Unfortunately, rules #1-2 weren't working when horizontal stride was 0.
In that case, regions are linear if width == 1, and the stride between
consecutive channels is given by vertical stride instead.
So we adjust our src_stride calculation from
src_stride = hstride * type_size;
to:
src_stride = (hstride ? hstride : vstride) * type_size;
and adjust rule #1 to allow hstride == 0 as long as width == 1.
While here, we also update the text of the rule to match the latest
documentation, which apparently clarifies that it's the location of
the LSB of the channel which matters.
Fixes: 3f50dde8b3 ("intel/eu: Teach EU validator about FP/DP pipeline regioning restrictions.")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
When the EU validator encountered an error, it would add an annotation
to the disassembly. Unfortunately, the code to insert an error assumed
that the next instruction would start at (offset + sizeof(brw_inst)),
which is not true if the instruction with an error is compacted.
This could lead to cascading disassembly errors, where we started trying
to decode the next instruction at the wrong offset, and getting lots of
scary looking output:
ERROR: Register Regioning patterns where [...]
(-f0.1.any16h) illegal(*** invalid execution size value 6 ) { align1 $7.src atomic };
(+f0.1.any16h) illegal.sat(*** invalid execution size value 6 ) { align1 $9.src AccWrEnable };
illegal(*** invalid execution size value 6 ) { align1 $11.src };
(+f0.1) illegal.sat(*** invalid execution size value 6 ) { align1 F@2 AccWrEnable };
(+f0.1) illegal.sat(*** invalid execution size value 6 ) { align1 F@2 AccWrEnable };
(+f0.1) illegal.sat(*** invalid execution size value 6 ) { align1 $15.src AccWrEnable };
illegal(*** invalid execution size value 6 ) { align1 $15.src };
(+f0.1) illegal.sat.g.f0.1(*** invalid execution size value 6 ) { align1 $13.src AccWrEnable };
Only the first instruction was actually wrong - the rest are just a
result of starting the disassembler at the wrong offset. Trash ensues!
To fix this, just pass the instruction size in a few layers so we can
record the next offset properly.
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>