Suggested by @gurchetansingh.
Android's Soong build system treats several compiler warnings as errors
by default: https://android.googlesource.com/platform/build/soong/+/27f57506/cc/config/global.go/#218
To catch these issues in Mesa, introduce `soong_compat_c_args`
and `soong_compat_cpp_args` with the following flags treated as errors:
-D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS
-Werror=date-time
-Werror=gnu-alignof-expression
-Werror=ignored-qualifiers
-Werror=implicit-fallthrough
-Werror=int-conversion
-Werror=missing-prototypes
-Werror=pragma-pack
-Werror=pragma-pack-suspicious-include
-Werror=sizeof-array-div
-Werror=string-plus-int
-Werror=unreachable-code-loop-increment
These compatibility flags are added to the meson configurations
for ANV, Gfxstream, Lavapipe, PanVK, Turnip, and Venus.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
This struct was initially packed to fit in a slot in NIR intrinsics
indices. Nowadays NIR supports larger indices and cooperative matrix
has extensions that allow it to go beyond the existing limit. This
patch changes the struct to be larger and remove the manual bit packing.
The hash table change is to use the specialized version for u64 keys
that's available in src/util.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41691>
jay is trying to use the fragment shader dispatch mask for helper
invocation lowering, but it was using load_sample_mask_in for that
(now load_coverage_mask_intel). But this isn't the MSAA coverage
mask, the two are different payload fields.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
Constructing the render target store payload is more complex than we can
reasonably handle at the NIR level. The main reason is that samplemask
and stencil are packed 16-bit and 8-bit parameters, respectively, which
are intermixed with other values that are 32-bit. In SIMD32 mode, the
packed sub-32-bit values take up fewer registers than normal values.
Currently we also don't specialize the NIR for each FS dispatch width,
and we can't construct the message descriptor without knowing it.
So, we alter nir_intrinsic_store_render_target_intel to take each of
the expected parameters - colour, depth, stencil, samplemask,
src0_alpha, and discard predicate. We construct the payloads and
descriptors in the backend.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
Adding the zero constants have a minor impact on stats due to some unlucky
interactions with nir_opt_cse, opt_instr_sched_prepass and assign_regs.
Totals from 61 (0.01% of 1212873) affected shaders:
CodeSize: 1044720 -> 1047472 (+0.26%); split: -0.00%, +0.27%
Static cycle count: 1198932 -> 1198490 (-0.04%); split: -0.07%, +0.04%
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39384>
The nir_instrs_equal normalizes the some indices but hash_intrinsic
wasn't normalizing them. Reorganize the code so both do it using the
same helper.
Fixes: b2bc57551a ("nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics")
Assisted-by: Pi coding agent (GPT-5.5)
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41606>
We skip emitting ffma_weak here, because otherwise we'd require a lowering
loop with opt_algebraic and lower_double_ops and this way it's also
cheaper.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
to strengthen it and fix a rare case of incorrect behavior.
This is the incorrect behavior that's fixed:
if (invocation_id == 0)
patch_output[0] = invocation_id;
else if (invocation_id == 1)
patch_output[1] = invocation_id;
else
patch_output[2] = invocation_id;
The stored SSA defs are the same, but each invocation writes a different
value to different outputs, so they are not duplicated, but the code
incorrectly treated them as duplicated and removed [1] and [2].
The requirement for correctness is that if an output is duplicated,
it must have stores in the same blocks as the output it duplicates.
To fix the incorrect behavior, awareness of blocks is added to the algorithm,
which leads to the following new cases that are optimized:
1. Different blocks store different SSA defs, but equal in each block:
if (..) {
output[0] = a;
output[1] = a; // eliminated
} else {
output[0] = b;
output[1] = b; // eliminated
}
2. Different GS emit sections store different SSA defs, but equal in each
section:
output[0] = a;
output[1] = a; // eliminated
emit_vertex;
output[0] = b;
output[1] = b; // eliminated
emit_vertex;
3. A weird case that could be duplicated but is left alone due to
the counterexample above:
if (..)
output[0] = a;
else
output[1] = a;
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41329>
Intel uses nir_lower_bit_size to convert 8-bit integer values to 16-bit
for most instructions. By constant folding u2u8 or i2i8 through a bcsel,
this lowering is undone.
Fixes assertion failure in fossils/parallel-rdp/small_subgroup.foz.
fossilize-replay: src/intel/compiler/brw/brw_from_nir.cpp:852: void brw_from_nir_emit_alu(nir_to_brw_state&, nir_alu_instr*, bool): Assertion `brw_type_size_bytes(op[i].type) > 1' failed.
v2: Reject all integer conversions. Suggested by Daniel Schürmann.
Fixes: f4812dc11d ("nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41412>
nir_intrinsic_index_size() expects a nir_intrinsic_index_flag, not
the position in the intrinsic's index list. This could cause
part of a multi-slot index to be ignored.
Fixes: b2bc57551a ("nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41593>
nir_def_init sets divergent = true, this means for something like
reduce(reduce(convergent)) we previously only optimized the inner
reduce.
No fossil changes at the moment, but I hit this when trying to
optimize shared memory to subgroup operations.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41542>
This was fundamentally broken for workgroup sizes >= 8x8.
This fixes new VKCTS coverage
dEQP-VK.glsl.texture_functions.texture.*_compute, and also few tests
from the vkd3d-proton testsuite (note that quad derivatives is
currently disabled for < GFX12).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41483>
This splits the nir_move_to_top_input_loads option into 2 options. The latter
option is mainly for at_offset/at_sample loads. Then it updates most places to
use only the first option.
The rationale is that moving at_sample loads makes Control (game) shaders
worse, as per the code comment.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41167>
This makes adding workgroup scope easier, this just creates the
split_box and moves things into it and adds some helpers.
This also rewrites some loops from r/c into i which calc r/c
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41500>