The properly terminated regex automatically detects this case now.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
For mesh/task shaders, the thread payload provides a local invocation
index, but it's always linear so it doesn't give the correct value when
quad derivatives are in use.
The lowering pass where all of this is done correctly for compute
shaders assumes load_local_invocation_index will be lowered in the
backend for mesh/task, calculates the values for the quads correctly but
then avoid replacing the original intrinsic and we remain with the wrong
results.
Add an intel specific intrinsic and always lower the generic one to that
(or whatever else was calculated) to avoid ambiguities and fix the value
for quad derivatives.
Fixes future CTS tests using mesh/task shaders under:
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.*
Fixes: d89bfb1ff7 ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>
By default, load/store vectorization uses nir_round_up_components()
to round up loads and possibly writemasked stores to the next valid
NIR vector width. However, some backends may not support load/stores
at all sizes. For example, older Intel supports only power-of-two
vector widths. Newer Intel also supports vec2 and vec3, but not
vec5/6/7. By providing a callback, backends can request promotion
to their next supported memory load/store vector width.
The existing "should we vectorize?" callback should continue to return
false for unsupported vector widths (i.e. beyond the maximum supported).
With this new callback, they do not need to say "no" to vectorization
that would normally produce an unsupported count (e.g. vec5/6/7) but
instead request that the component count be rounded up appropriately.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
This adds a new option, round_up_store_components, which rounds up the
number of components for stores that support writemasking to the next
valid vector size. For example, vec4+vec2 stores would round up from
6 components (which wouldn't be supported) to a full supportable vec8
store, relying on writemasking to ensure the correct pieces are written.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
URB intrinsics are simply memory load/stores to a special memory region,
so it's pretty reasonable to handle these in the memory vectorizer. We
treat emit_vertex_* intrinsics as a barrier for shader outputs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
This makes it easier for NIR passes to distinguish between inputs and
outputs without having to reason about which URB handle source was
passed to the intrinsic. It probably also makes it a bit easier for
humans to read the NIR too.
v2: Don't add memory mode to store intrinsics. It's always output.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
We were initializing to a nir_const_value of undefined (in practice on x86
builds, a pointer value), with .b set to 0. Those values would get dumped
in the annotated shader disassembly at the end of a test where all inputs
where unexpectedly skipped, producing very surprising output.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
The mul is 24-bit sign-extended, so in simplifying we should retain that.
If nothing else, this keeps us on the happy path of mul24s.
I didn't fix the other broken pattern, since it's not really part of this
MR.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
We were enumerating enough for a single component, but not all the
combinations. This helps show that our fdots fail pretty consistently.
And triggers more skipping from the fany_equal16s thanks to varied inputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
As I introduced another layer of iteration for signed zero testing, the
former logic got unwieldy. In fact, it was already unwieldy enough that I
forgot to clear all_skipped when the assert failed, allowing a failing
test to be marked UNSUPPORTED instead of XFAIL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
Add support for advanced blending (VK_EXT_blend_operation_advanced and
GL_KHR_blend_equation_advanced), enabling around 40 advanced blend modes
including multiply, screen, overlay, HSL modes (hue, saturation, color,
luminosity), Porter-Duff modes, and extended modes like lineardodge
and vividlight.
Advanced blending slots into the existing blending logic alongside logic
operations and standard blending. The implementation supports both
premultiplied and non-premultiplied alpha for source and destination, and
provides three overlap modes (uncorrelated, conjoint, disjoint).
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38929>
Move the blend equation helper functions (blend_multiply, blend_screen,
blend_overlay, etc.) from gl_nir_lower_blend_equation_advanced.c to a
new shared header file nir_blend_equation_advanced_helper.h.
These helpers implement the mathematical blend operations defined by
KHR_blend_equation_advanced and will be reused by the new NIR lowering
pass for VK_EXT_blend_operation_advanced.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38929>