This allows us to not generate 64-bit iadd3 on Intel but continue
generating it for NVIDIA.
No shader-db or fossil-db changes.
v2: Add nir_lower_iadd3_64 flag so we can continue to generate 64-bit
iadd3 on NVIDIA platforms.
v3: s/bit_size == 64/s == 64/. This cut-and-paste bug prevented any of
the optimizations from ever occuring.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>
Running intel_clc as part of the build doesn't need to issue this
warning.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29445>
Gfx11-12 can support SLM block loads via OWord Block Load messages
(notably, the aligned version, not the unaligned version).
A while back we deleted the SHADER_OPCODE_OWORD_BLOCK_READ opcode.
Rather than bring it back, we continue using UNALIGNED_OWORD_BLOCK_READ
for SLM block access (like we do for SSBOs) but switch it over to the
aligned variant when lowering logical sends. We do ensure the alignment
is at least 16B, however. This is ugly, but it's probably not worth
bringing back a whole extra opcode for a legacy HDC block load quirk.
References: BSpec 47652 and 1689
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9960
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29429>
This functions were inlined in a header and duplicated between brw and
elk.
That would be enough reasons to move to a C file but next patches
will add more code to support Xe2 platforms, what would cause more
code to be inlined, duplicating even more code and increasing lib
size.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
Another architecture register that requires some care before reading.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 49ee3ae9e8 ("intel/compiler: Lower FIND_[LAST_]LIVE_CHANNEL in IR on Gfx8+")
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29319>
Fixes failures of CTS tests that currently end up emitting 64-bit
integer ADDs with saturation, which isn't supported by the hardware.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>
Note that some previously-supported 64-bit integer operations have
been removed from the hardware, so we need to instruct NIR to lower
them.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>
Rework:
* Francisco Jerez: Specify the src1 length value in the correct
units. Don't break earlier platforms.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>
So that the driver can decode the printf buffer.
We're not going to use the NIR data directly from the driver
(Iris/Anv) because the late compile steps might want to add more
printfs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
We'll use the delta for an upcoming internal printf mechanism, where
the PARAM_IDX will be the base printf reloc identifier and the BASE
will be the string id.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
No shader-db or fossil-db changes on any Intel platform. In this MR, the
only benefit of these changes is to convert some "-a > 0" CSEL
comparisons to "a < 0" for improved readability.
v2: Add integer CSEL support
v3: Use fs_inst::resize_sources and brw_type_is_sint. Both suggested by
Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
No shader-db or fossil-db changes on any Intel platform. This ends up
begin helpful in "intel/brw: Use range analysis to optimize fsign."
v2: Add integer CSEL support
v3: Massive simplification (-20 lines!) of constant propagation
logic. Suggested by Ken. Add missing CSEL case in supports_src_as_imm.
Noticed by Ken.
v4: While MAD can mix F and HF sources on some platforms, CSEL
cannot. Found by skqp on TGL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
Gfx9 can only have F, but newer GPUs can have F, HF, *D, or *W. The
source and destination types must still match in size.
v2: Simplify the float vs integer logic. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
This bit from the comment should have been a big red flag:
There are currently zero instances of fsign(double(x))*IMM in
shader-db or any test suite, so it is hard to care at this time.
The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.
Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.
Ironically, the vec4 implementation is correct. The odds of encountering
an application that is performace limited by dsign performance in vertex
processing stages on Ivy Bridge or Haswell is infinitesimal.
No shader-db changes on any Intel platform.
v2: Delete 's' in emit_fsign as it is now unused.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
This bit from the comment should have been a big red flag:
There are currently zero instances of fsign(double(x))*IMM in
shader-db or any test suite, so it is hard to care at this time.
The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.
Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
Some commas were being skipped, according to history as an attempt
to elide BAD_FILEs, but we still print them, so be consistent. Also
for instructions without any sources, the trailing comma was always
being printed. Fix that too.
Example of instruction output before the change
halt_target(8) (null):UD,
send(8) (mlen: 1) (EOT) (null):UD, 0u, 0u, g126:UD(null):UD NoMask
and after it
halt_target(8) (null):UD
send(8) (mlen: 1) (EOT) (null):UD, 0u, 0u, g126:UD, (null):UD NoMask
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114>
It was the default to show register pressure for each instruction,
but it gets in the way of cleaner diffs before/after an optimization pass.
Add INTEL_DEBUG=reg-pressure option to show it again.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114>
The sequential IP cause noise when diffing before/after a pass that
either add or remove instructions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114>
Running
KHR-GL46.sparse_texture_clamp_tests.SparseTextureClampLookupColor test
with Zink on Anv we run into an assert :
assert(inst->mlen <= MAX_SAMPLER_MESSAGE_SIZE * reg_unit(devinfo));
Turns out we've not covered all the cases in the SIMD lowering.
It's a bit of a shame to have both files reproduce the same logic.
Will try to think of a better way to extract the layout of the a send
message but that'll be a much bigger rework.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29118>
For CMP/CMPN, use src0 type if destination is null otherwise get the
src0 type register with destination register size.
This fixes dEQP-VK.glsl.builtin_var.frontfacing.* tests cases on Xe2+.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28679>
Fixes fs-uint-to-float-of-extract-int8.shader_test and
fs-uint-to-float-of-extract-int16.shader_test added by piglit!883.
v2: Expand the comment explaining the potential problem. Suggested by
Caio.
Fixes: e6022281f2 ("intel/elk: Rename files to use elk prefix")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891>
Fixes fs-uint-to-float-of-extract-int8.shader_test and
fs-uint-to-float-of-extract-int16.shader_test added by piglit!883.
No shader-db or fossil-db changes on any Intel platform.
v2: Expand the comment explaining the potential problem. Suggested by
Caio.
Fixes: 29ce110be6 ("i965/fs: Remove extract virtual opcodes.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891>