I was doing math manually in a lowering pass for converting a division to
a ushr, and this will let the pass be expressed more naturally.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6378>
We always want to demote the y to the bit size of the ssa def, but also
want to sanity check that our input and our masking is big enough.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6378>
Previously, when the instruction had 3 operands, this would cause
possible corruption because of writing to sdwa->sel[2].
This was noticed thanks to GCC 10's -Wstringop-overflow warning.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6436>
Drivers may be able to optimize cases where blending is enabled but the
destination colour is not used. This helps detect that case.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6407>
It should reliably report the faulty shader but the faulty instruction
is inacurate, especially for memory violations because it's reported
when the addr is processed. It will be improved by emitting more
wait-states.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
When TRAP_PRESENT is not enabled, all traps and exceptions are ignored.
Only EXCP_EN.mem_viol is currently supported because the other
exceptions have to be tested/validated first.
EXCP_EN.mem_viol is used to detect any sort of invalid memory
access like VM fault. When a memory violation is reported, the
hw jumps to the trap handler.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
A trap handler is used to handle shader exceptions like memory
violations, divide by zero etc. The trap handler shader code will
help to identify the faulty shader/instruction and to report
more information for better debugging.
This has only been tested on GFX8, though it should work on GFX6-GFX7.
It seems we need a different implemenation for GFX9+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
Similar to the GS copy shader except that NIR is unused because
the shader is written directly using ACO IR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
The shader is written by hands with assigned registers, so most of
the pass are unnecessary.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
It's way easier to write a trap handler shader using ACO IR
instead of writing disassembly by hand + clrxasm + copy&paste.
This trap handler is quite simple for now, it just loads a
buffer descriptor from the TMA BO, it saves ttmp0-1 which
contain various info about the faulty instruction, and it
stores some hw registers about the wave/trap status.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
To fix a validation error when loading the scalar tma buffer
descriptor because it's not a temp but a fixed reg (tma_lo/tma_hi).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
The TBA/TMA scalar registers are only available on GFX6-GFX8.
On GFX9+, TBA/TMA addr are stored in hardware registers and
the number of TTMP scalar registers is thus increased by 4.
Just keep in mind that tba_lo is actually ttmp0. Best would
be to support ttmp registers in RA but that's more complicated.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
Microsoft's DXIL is based on LLVM, which doesn't have an integer ABS
opcode, but instead needs it lowered to NEG + MAX. We need to do this
with an option, to prevent an already existing optimization rule from
undoing this.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5211>
This file exports a variable that is then used in a python script,
but it can never be executed by itself, so having a shebang here
makes no sense.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6166>
writeSpirv() already takes care of that, and calling it twice seems to
duplicate functions and cause problems when processing execution modes.
Fixes: 2043c5f37c "clover/llvm: Add functions for compiling from..."
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6306>
Both of these are clover-only caps. We don't really support clover and,
even if we did, the number of address bits is wrong and we definitely
don't support the CL path for images.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
v2: create variables only once
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
If no options are provided, existing intrinsics are used.
If the lowering pass indicates there should be offsets used for global
invocation ID or work group ID, then those instructions are lowered to
include the offset.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
The actual variable -> intrinsic lowering stays where it is, but
ops which convert one intrinsic to be implemented in terms of
another have moved.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
For D3D12, we don't want to lower this, as there's a dedicated global-id
system-value that might be faster to use, depending on the hardware.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
New intrinsics are added for global invocation IDs and work group IDs to
deal with offsets in both. The only one of these that needs a system value
is global invocation offset, for CL's get_global_offset().
Note that CL requires very large work group sizes, so these intrinsics
are modified to be able to use 64bit values, for 64bit SPIR-V.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
Occasionally something goes weird in the network and a group of chezas
will produce streams of these errors during the tftp process, eventually
timing out after 60 minutes in the job. By the time we notice, the next
jobs seem to go through fine, so watch for them and try rebooting the
cheza to see if that gets our jobs to pass again.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
If we get this error, we can just try rebooting again and see if it comes
up then. The POWER_GOOD failures are clustered in time, but it's better
to retry a few times in a row in one job (which has its own 60min timeout)
than to spuriously fail someone's pipeline.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
This one uses python threads to move some of our logic from shell
pipelines to python, and opens the door to doing better serial output
tracking in the future (the SerialBuffer.lines() method)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
Copied from virglrenderer. Some in-development features are guarded by
VIRGL_RENDERER_UNSTABLE_APIS and they should not be used without knowing
the consequences.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6235>
On some malloc implementation, malloc doesn't always align to 16
bytes even on 64 bits system. To make sure ralloc_header always
starts at the wanted alignment, just force the size to be aligned at
the alignment of ralloc_header. This fixes crashed on instruction
like "movaps %xmm0,0x10(%rax)" which requires aligned memory access.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6314>
With commit 2122b902b8, u_index_translator can return U_TRANSLATE_MEMCPY
for 8-bits indices, and in this case we need to call the translation function
instead of a simple passthrough to the device.
Fixes piglit spec@nv_primitive_restart tests.
Fixes: 2122b902b8 "gallium/indices: don't expand prim-type for 8-bit indices"
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6414>
When we initialize the buffer surface, do not map the existing storage
with DONTBLOCK, leave it as a synchronized map.
This patch also sets the surface rebind flag after it is bound to a
new buffer and sets the surface buffer pointer accordingly.
This fixes display corruption issue seen with running steam.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6415>
Bump up the size of the bitfields for gl_register_file type for MSVC.
Also add ASSERT_BITFIELD_SIZE check where this bitfield is used.
Fixes spec@arb_shader_atomic_counter_ops tests in MSVC.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6417>
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>