On Bifrost, we have to return the blend return offset in the compiled
shader info and that means we need to be able to index into an array by
blend target deep inside the compiler. Instead of assuming bound blend
targets and subtracting BIR_FAU_BLEND_0 from fau_idx, add a separate
blend_target to bi_instr and use that. This way what we return will be
based on the nir_io_semantics::location, regardless of where the actual
blend descriptor comes from.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
When doing sample comparisons on shadow images, the compare value
should be clamped to [0,1].
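As a rough illustration of the clamp (not the driver code), the helper
below is a hypothetical name, and mapping NaN to 0 is an assumption made
only for this sketch:

```c
#include <assert.h>

/* Hypothetical helper: clamp a shadow compare value to [0, 1]. */
static float clamp_compare_value(float ref)
{
   float result;

   if (!(ref > 0.0f))
      result = 0.0f; /* negative, zero, or NaN (assumed to map to 0) */
   else if (ref > 1.0f)
      result = 1.0f;
   else
      result = ref;

   assert(result >= 0.0f && result <= 1.0f);
   return result;
}
```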
Fix:
dEQP-VK.glsl.texture_functions.texture.samplercubearrayshadow_fragment
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39327>
This uses the vk-meta framework, so it feels more like it belongs here.
While we're at it, rename the function as well.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39304>
Since the move to MEMORY_*_LOGICAL the result value was being ignored, so
change to use that.
Since the conversion to use new registers, some issues were introduced:
- Even with `has_64bit_int`, ADD with a 64-bit immediate value is not
  supported;
- `dst_high` was not being filled if there was no overflow;
- Only `dst_low` was returned.
Found when writing some new code involving large block loads.
Fixes: b79e85a93f ("brw: always use new registers for load address increments")
Fixes: b55f77161d ("intel/brw: Switch to emitting MEMORY_*_LOGICAL opcodes")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39282>
This catches a number of bugs in the current NIR algebraic optimizations
or opcode implementations (as fixed in this series, or documented in the
XFAIL tests), and should prevent many future bugs from landing.
This required bumping the test timeout, because s390x is very slow to
emulate in CI.
Closes: #3338
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
This way as a pattern author/editor you can immediately see whether it's
getting test coverage and if there are known issues with the pattern.
This will also give us clear outcomes from testing as we fix failing
patterns.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
nir_algebraic_pattern_test can validate shaders with the following
structure:
%0 = @provide(base = 0)
...
%N = @provide(base = input_count)
// multiple equivalent expressions
a = ...
b = ...
valid = ieq(a, b)
@use(valid)
Expressions are evaluated by emulating the shader using
nir_eval_const_opcode.
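The idea can be sketched outside of NIR as follows. This is only an
analogy: plain C functions stand in for the two expressions, a fixed
input list stands in for the @provide values, and all names here are
hypothetical. The real test evaluates NIR through
nir_eval_const_opcode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*expr_fn)(uint32_t x, uint32_t y);

/* Candidate pattern: (x + y) - y -> x. Holds with wrapping arithmetic. */
static uint32_t search_expr(uint32_t x, uint32_t y) { return (x + y) - y; }
static uint32_t replace_expr(uint32_t x, uint32_t y) { (void)y; return x; }

/* Broken pattern: (x + 1 > x) -> true. Fails when x + 1 wraps to 0. */
static uint32_t bad_search_expr(uint32_t x, uint32_t y)
{
   (void)y;
   return x + 1 > x;
}
static uint32_t bad_replace_expr(uint32_t x, uint32_t y)
{
   (void)x;
   (void)y;
   return 1;
}

/* Evaluate both expressions on every input pair and compare them, the
 * equivalent of checking valid = ieq(a, b) for each set of inputs. */
static bool exprs_agree(expr_fn a, expr_fn b,
                        const uint32_t *inputs, size_t count)
{
   assert(inputs != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      for (size_t j = 0; j < count; j++) {
         if (a(inputs[i], inputs[j]) != b(inputs[i], inputs[j]))
            return false;
      }
   }
   return true;
}
```

Including boundary values such as 0xffffffff in the inputs is what
exposes the broken pattern in this sketch.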
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
Same trick we do for nir_imul evaluation -- do the multiply in unsigned to
get defined behavior from C. Fixes UBSan failures with
nir_opt_algebraic_pattern_tests.
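A minimal sketch of the trick for the 32-bit case, with a hypothetical
helper name. It assumes a two's-complement target where uint32_t is not
promoted to a wider signed type:

```c
#include <stdint.h>

/* Hypothetical helper: signed 32-bit multiply with wrapping semantics. */
static int32_t mul_wrap_i32(int32_t a, int32_t b)
{
   /* Unsigned multiplication wraps modulo 2^32, whereas signed
    * overflow would be undefined behavior. */
   uint32_t result = (uint32_t)a * (uint32_t)b;

   /* Converting back to signed is implementation-defined, not
    * undefined. Two's-complement targets yield the wrapped value. */
   return (int32_t)result;
}
```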
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
We all know that (int)0xff << 24 is fine, but UBSan doesn't like it.
These were triggered by nir_opt_algebraic_pattern_tests.
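For illustration, one common way to avoid the signed shift is to make
the operand unsigned before shifting. The helper name below is
hypothetical and this is not necessarily how each site was fixed:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask selecting one byte of a 32-bit word. */
static uint32_t byte_mask(unsigned byte_index)
{
   assert(byte_index < 4);

   /* UINT32_C(0xff) is unsigned, so shifting into bit 31 is well
    * defined. (int)0xff << 24 would overflow a 32-bit int. */
   return UINT32_C(0xff) << (8 * byte_index);
}
```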
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
These opcodes are generated inside NIR algebraic when the shift is
constant, but this will help us do automated algebraic pattern testing
with arbitrary inputs that are unaware of the opcode's restrictions.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
This is unused by any callers currently, but will be useful for nir
algebraic pattern testing, and as a way to turn our comments in
nir_opcodes.py into actual C code. For now, always returns false.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
We know what's at the four bits starting at 11:0; that's the uniform
count. But we don't know what's at 12:0, so let's update the start
address here.
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38493>
When calculating the relative offset for a branch, the pco_first_igrp
function is used to find the first instruction of a block.
However, if the block is empty, the function does not return NULL as its
description implies, but returns a pointer to the list head, which is not
a valid node. Using this leads to a garbage relative offset being
calculated, which leads to unexpected behaviour.
The fix is to add a check for the list being empty and return NULL (the
same issue also exists in pco_last_igrp). This leads to the calling
function, pco_cf_node_offset, searching for the next non-empty block,
which is the expected behaviour.
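The general shape of the problem can be sketched with a standalone
circular list that uses the head as a sentinel. This is not the Mesa
list implementation or the pco code, and all names below are
hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct node {
   struct node *prev, *next;
};

/* An empty list is a head that points at itself. */
static void list_init(struct node *head)
{
   head->prev = head->next = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
   assert(n != head);
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}

/* Without the emptiness check, head->next would be the head itself,
 * which is a sentinel rather than a valid element. */
static struct node *list_first_or_null(struct node *head)
{
   return head->next == head ? NULL : head->next;
}

static struct node *list_last_or_null(struct node *head)
{
   return head->prev == head ? NULL : head->prev;
}
```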
Fix deqp:
dEQP-VK.graphicsfuzz.cov-two-nested-loops-switch-case-matrix-array-increment
dEQP-VK.graphicsfuzz.stable-binarysearch-tree-false-if-discard-loop
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39287>
The previous approach does ensure that all entries are zero'd, but that
may not be clear to the reader (i.e., me). Using `{ 0 }` is clearer.
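A small sketch of the `{ 0 }` form, using a made-up struct that is not
the one touched by this change. The first member is initialized
explicitly and C zero-initializes every remaining member and element:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry type, for illustration only. */
struct entry {
   int id;
   float weight;
   void *data;
};

static bool entries_start_zeroed(void)
{
   struct entry entries[8] = { 0 };

   for (size_t i = 0; i < sizeof(entries) / sizeof(entries[0]); i++) {
      if (entries[i].id != 0 || entries[i].weight != 0.0f ||
          entries[i].data != NULL)
         return false;
   }
   return true;
}
```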
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39245>
Add a WRITE_DATA packet before and after the FENCE_WAIT_MULTI packet.
The last number written to the WRITE_DATA packet buffer shows whether
the CP firmware got past the FENCE_WAIT_MULTI packet or not.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39206>
If set, creates a file in /tmp named
mesa_<process_name>_<pid>_XXXXXX.log and logs all errors, warnings, etc.
there rather than to stderr. The XXXXXX is replaced with alphanumeric
characters, so each run of the app is guaranteed to create a new log
file.
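One way to get such a unique name is mkstemps(), a glibc/BSD extension
that is not part of POSIX and that fills in the XXXXXX while keeping a
fixed suffix. Whether this change uses it is an assumption, and the
helper name and signature are hypothetical:

```c
#define _DEFAULT_SOURCE /* exposes mkstemps() on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a unique log file and return its file
 * descriptor, or -1 on failure. path_out receives the final name. */
static int open_unique_log(const char *process_name,
                           char *path_out, size_t path_size)
{
   assert(process_name != NULL && path_out != NULL);

   int len = snprintf(path_out, path_size, "/tmp/mesa_%s_%ld_XXXXXX.log",
                      process_name, (long)getpid());
   if (len < 0 || (size_t)len >= path_size)
      return -1; /* name did not fit in the buffer */

   /* The suffix ".log" is 4 characters long. */
   return mkstemps(path_out, 4);
}
```

The caller owns the returned descriptor and is expected to close it.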
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39206>
This gives us the infrastructure that allows us to slowly migrate
pieces of blorp shaders from NIR to OpenCL, which, IMHO, are much
easier to read. We can't fully migrate everything due to all the
conditional building we do with these shaders, but I'm sure we'll find
opportunities to replace some NIR with OpenCL eventually.
The conversion of blorp_check_in_bounds() serves as the first example.
I also plan to have the shaders from the new indirect copy extension
be OpenCL shaders (mixed with some NIR as well), so having this patch
merged now will reduce the diff for the extension later.
Thanks to Alyssa Rosenzweig for her help here.
v2:
- Use SPDX (Alyssa).
- Use nir_trim_vector() (Alyssa).
- Adjust CL variable declaration (Alyssa).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
We have two distinct code paths sharing blorp_params->wm_inputs for
different purposes: the code from blorp_blit.c and the code from
blorp_clear.c. While blorp_blit.c uses most of the parameters (all
except clear_color), blorp_clear.c only uses clear_color and
bounds_rect. Split the parameters in two structs: one for blits and
the other for clears.
This not only helps save some space in the shader inputs, but it also
organizes things so it's clearer which parameters are used by what.
In addition, my plan is to later add struct blorp_wm_inputs_indirect,
which won't share anything that the others use, and would otherwise
grow the struct even more.
This change would reduce the size of struct blorp_wm_inputs from 96 to
80, but we have to add padding due to the assertion that compares it
to cs_prog_data->push.cross_thread.size. Still good, though.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>