We now use `parse_debug_string` to parse debug strings from files, which
may have newlines in them. This change ensures that newline characters
are ignored during parsing, a similar change was made to
`parse_enable_string` for consistency.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32906>
Currently the `nolrz` TU_DEBUG options are only checked during
device creation and image creation, respectively. This means that if
the options are enabled after the device/image is created, LRZ
will still be used. Similarly, if the options are disabled after the
device/image is created, LRZ will still be disabled.
This change moves the checks to the point where the LRZ is actually
used, allowing for runtime toggling of LRZ.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32906>
This adds a new environment variable, TU_DEBUG_FILE, which can be used to
enable/disable various debug options at runtime via writing to a file. This
is useful for switching between different debug options (such as toggling
between SYSMEM/GMEM) without needing to restart the application.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32906>
debian-testing is the critical path: the shortest possible job to build
exactly what we need to execute on hardware, and nothing else.
debian-build-testing exists to give us better coverage at the expense of
running longer.
Since the only jobs using r300 and Nine, and the only jobs using NVK,
are in post-merge stages which are manually triggered, move these builds
to debian-build-testing. This makes the critical path to those a little
longer, but we do get to make it shorter for everyone else just running
regular Marge jobs.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33287>
The division between color and d/s resolve groups appears to be
important. Not doing so causes corruption in some of games
when forcing gmem, e.g. in Arma, War Thunder.
Fixes: 25b73dff5a
("tu/a7xx: use concurrent resolve groups")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33169>
This trades 1 VALU (v_cmpx) instruction in GS to 2 SALU,
and removes a VALU->SALU dependency for the branch that stores
attributes.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
Both the VS/TES and GS lowering passes have grown a lot over time,
and therefore the C file has become unwieldy. Mitigate that by
moving the GS lowering out to a separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
Create separate branches for output processing and exports.
Normally, emit attribute ring stores at the end of the shader,
but with the attribute ring wait bug, insert them between the
primitive and position export branches.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
This is the part of the code that processes things related to
a single vertex, mainly loading the outputs from LDS and
performing some adjustments on them.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
Normally, emit attribute ring stores at the end of the shader.
When the attribute ring wait bug is present, insert them between
the primitive and position exports.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
Same idea as the VS/TES and GS lowering:
Make shader compilation decisions based on the features of the
current GPU instead of ad-hoc deciding according to GFX level.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
While theoretically all GFX11+ GPUs have an attribute ring, it is
nicer to have this property instead of deciding ad-hoc based on
the GFX level.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
The intention is to have all the HW features affecting
shader compilation in one place, instead of ad-hoc decisions
in the code based on the GFX level and chip class.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
There is a possibility that some waves in an NGG workgroup
don't have any input vertices, only primitives. When these
waves store the primitive ID as a per-primitive attribute,
they will need to wait for those stores before the primitive
export, because the other waves can't wait for them.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
This case is unlikely but possible. We forgot to handle it here,
because it was originally handled by the backend compiler.
On GFX10 chips that have issues with 0 vertices and primitives
exported, this will always export at least 1 vertex and primitive.
This could theoretically fix some hangs on Navi 10, although we are not aware of a specific issue caused by this problem.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218>
After optimization happen, if the sources are still in one or two
contigous spans for some reason (e.g. some data read from memory
now being written), it is beneficial to just use regular SEND
and avoid having to set the ARF scalar instruction.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32410>
Add an optimization pass to turn regular SENDs into SEND_GATHERs.
This allows the payload to be "broken" into smaller pieces that
can be further optimized, which _may_ result in
- less register pressure (no need to contiguous space), and
- less instructions (no need to MOV to such space).
For debugging, the INTEL_DEBUG=no-send-gather option skips this
optimization, and reporting how many opportunities were missed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32410>