order render targets by alignment, eliminating gaps. this is the same trick we
use in RA.
dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.3
now allocates as 32x32 instead of 32x16.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39111>
The polling period of the monitoring loop is a constant hardcoded and defined
to an arbitrary default number. This is related to the output reporting to the
user the progress in the pipeline running, so it shows this tool is alive when
it can be generating a huge number of output lines. Transforming it to a
definable variable allows the possibility to experiment with different values,
because of the balance between when a job can be triggered and the verbose
output produced.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39105>
The behavior of crnm when the target resolves to be one single job is special;
the job traces of this single job are dumped as the output of this monitoring
tool. There may be user cases where the wished behavior of this tool would be
equivalent to when the target resolves on multiple jobs.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39105>
After the migration to use the rich library, the execution within a GitLab job
gets dirty: coloring disappeared, the number of columns defaults to 80, and
job name padding fails with a broken link in the lines.
This is only a specialization when the tool detects it is running within a
job, without changing the behavior in the usual usage of this tool.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39105>
Per current OpenGL spec: When a cube map texture is sampled, the (s t r)
texture coordinates are treated as a direction vector (rx ry rz)
emanating from the center of a cube. The q coordinate is ignored.
Older spec had slightly different wording that allowed projecting the
coordinates, but to be consistent with GLSL/SPIRV and the expectations
of the real-world applications it's better to ignore the projector.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14235
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39087>
Applying the projector after the normalization breaks the coordinates, so
apply it early. Usually it's not even necessary for the cubemaps anyway,
but ARB_fragment_program and TGSI allow it.
Fixes: 52e71809 ("nir: Add a cubemap normalizing pass")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39087>
Will make it easier to adapt to gen8 changes.
Note that with the change we no longer emit RB_VIEWPORT_ZCLAMP_MIN/MAX
if viewport_count==0. But this should not be a valid case, so no
functional change is expected.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39052>
r600 doesn't use LLVM anymore. Remove the remaining print of the version
number and the dependencies in the build system.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39071>
NIR shaders is the default for the r600 driver now. There is no point in
an option to enable this as experimental support.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39071>
Subgroup ops make divergence information useless for our purpose,
we would need workgroup divergence.
The game affected here has control flow dependent on vote_any,
so it's possible that a wave only executes the code after culling/reordering
invocations.
That means we can't reuse the maybe undefined value from before culling.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14459
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39060>
This broke a number of dEQP tests, for reasons that are unclear -- in
looking at cffdumps, the const setup looks appropriate, just changing from
e.g:
VS_CTRL_REG1(CONSTLEN=2)
FS_CTRL_REG1(CONSTLEN=1)
LOAD_STATE6(VERT, DIRECT, DST_OFF=2, NUM_UNIT=2)
LOAD_STATE6(FRAG, DIRECT, DST_OFF=0, NUM_UNIT=2)
to:
VS_CTRL_REG1(CONSTLEN=2)
FS_CTRL_REG1(CONSTLEN=1)
LOAD_STATE6(VERT, DIRECT, DST_OFF=2, NUM_UNIT=2)
LOAD_STATE6(FRAG, INDIRECT, DST_OFF=0, NUM_UNIT=2)
Fixes: 203ac73374 ("gallium: set prefer_real_buffer_in_constbuf0 for all drivers using tc")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39070>
We don't need one bit per bitsize per instruction if only one actually
matters in the end.
First step towards moving NIR in the direction of full float_controls2
only.
Also rename this from fp_fast_math, because that name implied that 0 is
the no fast math mode, while the opposite was the case.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>
Performance counters are too different between generations and it's
less error prone to define them separately for each generations.
I'm starting with GFX12 first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
We had our own pipeline cache field, we need to use the common field
instead for VK_KHR_pipeline_binary.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39095>
Instead of an exception that completely stops the run'n'monitor process, when
a job with a specific status can't be enabled, just log it with a warning.
Don't attempt to enable it again unless it changes the status.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39049>
The counter about how many times a job has been tried to enable, can host the
information about the job status when these attempts happen. This groups the
attempts to separate unrelated differences between job statuses.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39049>
This patch adds compute_w_to_host_r barrier and also throw QueueWaitIdle
just to make sure all operations are completed before we fetch the BVH
data on the host.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39079>
The entire range of the allocated bo up to bo->size will be used, even if
alloc_size was way less, so to track the growth precisely for load factoring,
the allocated_batch_size should increase by bo->size.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38703>
Although a specific size is requested, the entire range of the returned bo up
to bo->size may end up being used by anv_batch_chain, spamming memcheck errors.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38703>
Implement parsing and encoding of special floating-point values for
both 20-bit (f20) and 16-bit (f16) immediate formats:
inf:f20 - Positive infinity (imm_val=0x7f800, imm_type=0)
-inf:f20 - Negative infinity (imm_val=0xff800, imm_type=0)
nan:f20 - Quiet NaN (imm_val=0x7fc00, imm_type=0)
-nan:f20 - Negative NaN (imm_val=0xffc00, imm_type=0)
inf:f16 - Positive infinity (imm_val=0x7c00, imm_type=3)
-inf:f16 - Negative infinity (imm_val=0xfc00, imm_type=3)
nan:f16 - Quiet NaN (imm_val=0x7fff, imm_type=3)
-nan:f16 - Negative NaN (imm_val=0xffff, imm_type=3)
The f20 format stores the upper 20 bits of an IEEE 754 single-precision
float. The f16 format stores the 16-bit half-float value directly.
This enables round-trip assembly of shaders containing these special
values, which can appear in GPU command streams captured from the
proprietary driver.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
The previous code incorrectly treated f16 immediates as truncated f32
values (bits >> 12). The f16 immediate format (imm_type=3) expects a
16-bit IEEE 754 half-precision float, not the upper 20 bits of an f32.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
The dual16 mode is now fully encoded in the disassembly output through
type suffixes (:f16 vs :f20) and register naming (th vs t), making the
explicit dual16 mode parameter redundant.
Previously, the parser needed a dual16_mode flag to determine whether
float immediates should use imm_type=0 (f32) or imm_type=3 (f16). Now
that the disassembler emits explicit :f16/:f20 suffixes, the parser can
determine the correct encoding directly from the input text.
This simplifies the API by removing the dual16_mode parameter from:
- isa_parse_str()
- isa_parse_file()
- Internal parsing functions
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
The assembler and disassembler now use explicit type suffixes for all
immediate values to ensure correct round-trip encoding:
- :f20 - 20-bit float (upper 20 bits of IEEE 754 single precision)
- :f16 - 16-bit half-float
- :s20 - 20-bit signed integer (two's complement)
- :u20 - 20-bit unsigned integer
Previously, the parser did not distinguish between signed and unsigned
integers, causing incorrect encoding. The signed format uses 20-bit
two's complement where bit 19 is the sign bit and maps to AMODE[0] in
the instruction encoding.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>
When a parsing test fails, output the actual error message to aid
debugging. This makes it immediately clear why parsing failed instead
of just showing that the test didn't succeed.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39016>