This feature causes piles of vertex shader timeouts in the CTS which makes CTS
testing extremely unreliable on my min-spec M1. Since we only really have it as
a tickbox for Proton, hide it except for Proton - at least for now.
This eliminates our known flakes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39111>
no, I don't know how this worked before.
fixes
KHR-GL45.conditional_render_inverted.functional
KHR-GL46.transform_feedback_overflow_query_ARB.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39111>
order render targets by alignment, eliminating gaps. this is the same trick we
use in RA.
dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.3
now allocates as 32x32 instead of 32x16.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39111>
Per current OpenGL spec: When a cube map texture is sampled, the (s t r)
texture coordinates are treated as a direction vector (rx ry rz)
emanating from the center of a cube. The q coordinate is ignored.
Older spec had slightly different wording that allowed projecting the
coordinates, but to be consistent with GLSL/SPIRV and the expectations
of the real-world applications it's better to ignore the projector.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14235
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39087>
Applying the projector after the normalization breaks the coordinates, so
apply it early. Usually it's not even necessary for the cubemaps anyway,
but ARB_fragment_program and TGSI allow it.
Fixes: 52e71809 ("nir: Add a cubemap normalizing pass")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39087>
Will make it easier to adapt to gen8 changes.
Note that with the change we no longer emit RB_VIEWPORT_ZCLAMP_MIN/MAX
if viewport_count==0. But this should not be a valid case, so no
functional change is expected.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39052>
r600 doesn't use LLVM anymore. Remove the remaining print of the version
number and the dependencies in the build system.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39071>
NIR shaders is the default for the r600 driver now. There is no point in
an option to enable this as experimental support.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39071>
Subgroup ops make divergence information useless for our purpose,
we would need workgroup divergence.
The game affected here has control flow dependent on vote_any,
so it's possible that a wave only executes the code after culling/reordering
invocations.
That means we can't reuse the maybe undefined value from before culling.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14459
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39060>
This broke a number of dEQP tests, for reasons that are unclear -- in
looking at cffdumps, the const setup looks appropriate, just changing from
e.g:
VS_CTRL_REG1(CONSTLEN=2)
FS_CTRL_REG1(CONSTLEN=1)
LOAD_STATE6(VERT, DIRECT, DST_OFF=2, NUM_UNIT=2)
LOAD_STATE6(FRAG, DIRECT, DST_OFF=0, NUM_UNIT=2)
to:
VS_CTRL_REG1(CONSTLEN=2)
FS_CTRL_REG1(CONSTLEN=1)
LOAD_STATE6(VERT, DIRECT, DST_OFF=2, NUM_UNIT=2)
LOAD_STATE6(FRAG, INDIRECT, DST_OFF=0, NUM_UNIT=2)
Fixes: 203ac73374 ("gallium: set prefer_real_buffer_in_constbuf0 for all drivers using tc")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39070>
We don't need one bit per bitsize per instruction if only one actually
matters in the end.
First step towards moving NIR in the direction of full float_controls2
only.
Also rename this from fp_fast_math, because that name implied that 0 is
the no fast math mode, while the opposite was the case.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>
Performance counters are too different between generations and it's
less error prone to define them separately for each generations.
I'm starting with GFX12 first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
We had our own pipeline cache field, we need to use the common field
instead for VK_KHR_pipeline_binary.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39095>
This patch adds compute_w_to_host_r barrier and also throw QueueWaitIdle
just to make sure all operations are completed before we fetch the BVH
data on the host.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39079>
The entire range of the allocated bo up to bo->size will be used, even if
alloc_size was way less, so to track the growth precisely for load factoring,
the allocated_batch_size should increase by bo->size.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38703>
Although a specific size is requested, the entire range of the returned bo up
to bo->size may end up being used by anv_batch_chain, spamming memcheck errors.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38703>
Implement parsing and encoding of special floating-point values for
both 20-bit (f20) and 16-bit (f16) immediate formats:
inf:f20 - Positive infinity (imm_val=0x7f800, imm_type=0)
-inf:f20 - Negative infinity (imm_val=0xff800, imm_type=0)
nan:f20 - Quiet NaN (imm_val=0x7fc00, imm_type=0)
-nan:f20 - Negative NaN (imm_val=0xffc00, imm_type=0)
inf:f16 - Positive infinity (imm_val=0x7c00, imm_type=3)
-inf:f16 - Negative infinity (imm_val=0xfc00, imm_type=3)
nan:f16 - Quiet NaN (imm_val=0x7fff, imm_type=3)
-nan:f16 - Negative NaN (imm_val=0xffff, imm_type=3)
The f20 format stores the upper 20 bits of an IEEE 754 single-precision
float. The f16 format stores the 16-bit half-float value directly.
This enables round-trip assembly of shaders containing these special
values, which can appear in GPU command streams captured from the
proprietary driver.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
The previous code incorrectly treated f16 immediates as truncated f32
values (bits >> 12). The f16 immediate format (imm_type=3) expects a
16-bit IEEE 754 half-precision float, not the upper 20 bits of an f32.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
The dual16 mode is now fully encoded in the disassembly output through
type suffixes (:f16 vs :f20) and register naming (th vs t), making the
explicit dual16 mode parameter redundant.
Previously, the parser needed a dual16_mode flag to determine whether
float immediates should use imm_type=0 (f32) or imm_type=3 (f16). Now
that the disassembler emits explicit :f16/:f20 suffixes, the parser can
determine the correct encoding directly from the input text.
This simplifies the API by removing the dual16_mode parameter from:
- isa_parse_str()
- isa_parse_file()
- Internal parsing functions
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
The assembler and disassembler now use explicit type suffixes for all
immediate values to ensure correct round-trip encoding:
- :f20 - 20-bit float (upper 20 bits of IEEE 754 single precision)
- :f16 - 16-bit half-float
- :s20 - 20-bit signed integer (two's complement)
- :u20 - 20-bit unsigned integer
Previously, the parser did not distinguish between signed and unsigned
integers, causing incorrect encoding. The signed format uses 20-bit
two's complement where bit 19 is the sign bit and maps to AMODE[0] in
the instruction encoding.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
When a parsing test fails, output the actual error message to aid
debugging. This makes it immediately clear why parsing failed instead
of just showing that the test didn't succeed.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>